You may obtain incorrect results and misleading labels when you use the Analysis ToolPak t-Test in Excel 2003 (829252)

MORE INFORMATION

Typically, you apply the Paired Two Sample t-Test (sometimes named the Matched Pairs t-Test) when you have Before and After measurements on the same experiments for a treatment. For example, you might measure a subject's weight before and after a 30-day diet plan.

Typically, you get rid of data on any subject whose Before measurement or After measurement is missing. Incomplete data on a subject makes information about that subject useless. Unfortunately, this Analysis ToolPak tool behaves differently than the typical practice. First, this Analysis ToolPak tool counts the number of subjects with Before measurements and the number of subjects with After measurements. If these totals are different, you receive an error message and this Analysis ToolPak tool does not continue. Therefore, for example, if there are 49 subjects who all have both Before and After measurements and a fiftieth subject who has only a Before measurement, the Analysis ToolPak tool does not do the analysis.

If the number of subjects that are missing Before data equals the number of subjects that are missing After data and this number is positive, the tool performs an inappropriate analysis. For example, assume that there are 50 subjects. Subject A is missing a Before measurement, and Subject B is missing an After measurement, and the other 48 subjects have no missing data. The tool counts 49 Before measurements and 49 After measurements; the tool acts as if there were 49 subjects with no missing data. This violates your intent of eliminating any subject who is missing a Before value or an After value. The number of subjects in this example should be 48, not 49. Therefore, the tool uses an incorrect number of degrees of freedom. Additionally, because the tool discards neither Subject A's After measurement nor Subject B's Before measurement, these two measurements are included in calculations of sample means that are used in the t-statistic. Therefore, these calculated sample means are inappropriate.

In summary, it is inappropriate to use the tool when there is missing data, because the tool either will not compute or it will compute with inappropriate formulas. The latter case occurs when the number of subjects with missing Before data equals the number of subjects with missing After data.

The example in the "Example Use" section of this article illustrates these problems and also points out confusing labels in the tool's output. The "Workaround" section of this article suggests a workaround in a case where you cannot verify the absence of missing data before you use the tool.

Example Use

To illustrate the problem of missing data, follow these steps:

Create a blank Excel worksheet, and then copy the following table.

Experiment 1		Experiment 2		Experiment 3		Experiment 3 modified to remove
before	after	before	after	before	after	subjects with missing data
200	170	200	170	200	170	200	170
190	180	190	180	190	180	190	180
180	175	180	175	180	175	180	175
170	175	170	175	170	175	170	175
160	165	160	165	160	165	160	165
150	140	150	140	150	140	150	140
140	130	140	130		130	130	125
130	125	130	125	130	125	120	125
120	125	120	125	120	125	110	100
110	100	110	100	110	100
100	100	100		100

Behavior of 2-tailed t-test
=TTEST(A3:A13, B3:B13,2,1)		=TTEST(C3:C13, D3:D13, 2, 1)		=TTEST(E3:E13, F3:F13, 2, 1)
		=TTEST(C3:C12, D3:D12, 2, 1)		=TTEST(G3:G11, H3:H11, 2, 1)

ATP Tool for Experiment 1:
t-Test: Paired Two Sample for Means

	Variable 1	Variable 2
Mean	150	144.090909090909
Variance	1100	914.090909090909
Observations	11	11
Pearson correlation	0.952384533866487
Hypothesized mean difference	0
df	10
t Stat	1.92092590483801
P(T<=t) one-tail	0.0418403929085198
t Critical one-tail	1.81246110219722
P(T<=t) two-tail	0.0836807858170396
t Critical two-tail	2.22813884242587

ATP Tool for Experiment 2:
Will not compute because of unequal numbers of datapoints

ATP Tool for Experiment 3:
t-Test: Paired Two Sample for Means

	Variable 1	Variable 2
Mean	151	148.5
Variance	1210	778.055555555556
Observations	10	10
Pearson correlation	0.936537537274845
Hypothesized mean difference	0
df	9
t Stat	0.141327169509421
P(T<=t) one-tail	0.445362157564494
t Critical one-tail	1.83311292255007
P(T<=t) two-tail	0.890724315128988
t Critical two-tail	2.26215715817358

Select cell A1 in your blank Excel worksheet, and then click Paste on the Edit menu so that the entries in the table fill cells A1:I52 in your worksheet.
Click the Paste Options button, and then click Match Destination Formatting. With the pasted range still selected, point to Column on the Format menu, and then click AutoFit Selection.

You can use the worksheet to compare results of the TTEST function of Excel with the Analysis ToolPak tool. Experiment 1 shows complete data on eleven subjects. The value of TTEST in cell A16 is the probability that the t-statistic will be greater than the observed value, assuming a t-distribution with ten degrees of freedom. This value, 0.837, is also shown in the tool's output in cell B32. The label in cell A32 should read "P(T >= |t|) two-tail", not "P(T<=t) two-tail", but the numeric answer is correct because there is no missing data in Experiment 1. Similarly, the label in cell A30 should read "P(T >= |t|) one-tail", not "P(T<=t) one-tail". The "t Critical" cutoffs are correct. They correspond to the default significance level, 0.05, and they use the correct number of degrees of freedom, 10.

Experiment 2 has one missing After measurement on one subject and no other missing data. The tool refuses to compute. The values of TTEST in cells A16 and A17 are the same. In cell A16, the data cell range C3:D13 is used; this includes the last subject, the only one with missing data. In cell A17, the data cell range C3:D12 is used; this corresponds to an experiment with the first ten subjects and no missing data. The fact that the results are the same indicates that when TTEST is called in cell A16, TTEST appropriately discards the subject with missing data.

Experiment 3 has one missing Before measurement and one missing After measurement on two different subjects. Experiment 3 modified shows the nine remaining subjects with no missing data. The TTEST results in cells E16 and E17 are the same. In cell E16, TTEST is called on the Experiment 3 data in cells E3:F13. In cell E17, TTEST is called on the Experiment 3 modified data in cells G3:H11. The results are the same because TTEST appropriately discards the seventh and eleventh subjects in Experiment 3, the two with missing data. If you examine the tool's output for Experiment 3, the number of Before and After observations in cells B44 and C44 is ten in each case. It is easy to verify that SUM(E3:E13) is 1510 and SUM(F3:F13) is 1485; because there are 10 observations in each range, the respective means are 151 and 148.5, shown in cells B42 and C42. Therefore, the tool has not discarded any subjects and has included the After measurement for the seventh subject and the Before measurement for the eleventh subject in its analysis. The number of degrees of freedom in cell B47 is inappropriate, because there should have been nine subjects and eight df. This makes for incorrect entries of cutoff values in cells B50 and B52 (in addition to misleading labels for those entries in cells A50 and A52.)

Results in All Versions of Excel

Unfortunately, this tool was not corrected for Excel 2003.

Suggested Workaround for All Versions of Excel

You can get rid of subjects with missing data before you use the tool. However, you may not want to edit an Excel worksheet in this way. One procedure for eliminating subjects with missing data is shown in the following steps. To get rid of subjects with missing data, follow these steps:

Copy the two data ranges to a new area of your worksheet.
Scan the data upward from the common bottom of the two ranges.
1. If the bottom row contains missing data, clear the bottom row. This reduces the range of data. Go to step 3.
2. Identify row r above the bottom row, but closest to the bottom with missing data.
  1. Copy all data below row r.
  2. Select row r, and then paste copied data into it.
  3. Clear the bottom row of data (which will now duplicate the next to last row of data). This reduces the range of data.
Repeat step 2 until no missing data remains.

Note You can use the tool if you can guarantee that there are no missing observations.

You can duplicate much but not all of the tool's output without transforming the data. You cannot find appropriate values for Mean, Variance, and Observations without a lot of effort. The tool finds inappropriate values by examining the Before and After data separately. The tool's df is the common value of Observations minus one; therefore, it is also inappropriate if there are missing data. You cannot find t Stat without a lot of effort, because you have to examine Before and After data at the same time.

However, you can find Pearson Correlation by applying PEARSON or CORREL to the two data ranges. Both of these Excel functions handle missing data appropriately. Also, you can find the one-tail and two-tail t probabilities associated with the data by calling the TTEST function of Excel, which handles missing data appropriately. For the one-tailed and two-tailed probabilities in experiment 3, you might call TTEST(E3:E13, F3:F13, 1, 1) and TTEST(E3:E13, F3:F13, 2, 1) respectively. You could also verify that results of these functions agree with those of the tool in Experiment 1, where the tool behaves appropriately because there are no missing data. The corresponding calls for Experiment 1 are TTEST(A3:A13, B3:B13, 1, 1) and TTEST(A3:A13, B3:B13, 2, 1) respectively.

For the critical cutoffs, you must establish the number of degrees of freedom. In experiments 1, 2, and 3, the correct numbers of degrees of freedom are ten, nine, and eight respectively. These numbers are always one less than the number of useful subjects in your data without missing Before or After measurements. For experiment 3, for example, you could enter in cell J3, =IF(OR(ISBLANK(E3), ISBLANK(F3)), 0, 1), then fill down this formula into cells J4:J13 and find df by entering in cell J14: =SUM(J3:J13) - 1.

After you establish df, you can use the TINV function of Excel. With significance level 0.05, the calls for Experiments 1, 2, and 3 would be TINV(0.05, 10), TINV(0.05, 9), and TINV(0.05, 8) respectively. These would return the "t Critical two-tail" values. To get the "t Critical one-tail values", you would use the analogous calls with the significance level doubled, such as TINV(0.10, 10), TINV(0.10, 9), and TINV(0.10, 8) respectively.

Conclusions

Do not use the Analysis ToolPak t-Test Paired Two Sample for Means tool unless you can make sure that there are no missing data points. This article describes suggestions for duplicating most of the tool's functionality by using Excel functions, instead of the Analysis ToolPak.

The tool also provides misleading "P(T<=t)" labels. This article describes the correct interpretations.