Description of the ZTEST function in Excel 2003 (828296)

MORE INFORMATION

The ZTEST(array, µ0, sigma) function is used for hypothesis testing when observations in the sample are assumed to come from a Normal distribution with a known standard deviation, sigma. If the null hypothesis is such that the mean of this Normal distribution is µ0, then for a sample of size n, under this null hypothesis, the sample mean has a Normal distribution with mean µ0 and standard deviation sigma/SQRT(n). ZTEST returns the probability that a typical random variable with this distribution will take on a value higher than the observed sample mean corresponding to the sample in array.

If sigma is omitted, then ZTEST will use STDEV(array) in its place.

Syntax

ZTEST(array, µ0, sigma)

Note The Array contains numeric data, µ0 is a real number and sigma (if included) is a positive number. If sigma is omitted, STDEV(array) is used. In this case, the data in the array must be such that STDEV(array) is positive. This is a reasonably innocuous supposition that is satisfied unless each entry in the array has the same value.

Example Usage

To test how the ZTEST function works, assume that Intelligence Quotient (IQ) scores follow a Normal distribution with a standard deviation of 15, and that you review the IQs for a sample of nine students in your local school.

To illustrate this, follow these steps:

Create a blank Excel worksheet, and then copy the following table:

array	mu0	sigma
110	100	15
115
120
95
110
105
90
105
125

for one-sided test:			for two-sided test
=AVERAGE(A2:A10)
=ZTEST(A2:A10,B2,C2)			=2*MIN(ZTEST(A2:A10,B2,C2),1 - ZTEST(A2:A10,B2,C2))
=1 - NORMSDIST((AVERAGE(A2:A10) - B2)/(C2/SQRT(COUNT(A2:A10))))

with sigma assumed unknown:
=STDEV(A2:A10)
=ZTEST(A2:A10,B2)			=2*MIN(ZTEST(A2:A10,B2), 1 - ZTEST(A2:A10,B2))
=1 - NORMSDIST((AVERAGE(A2:A10) - B2)/(STDEV(A2:A10)/SQRT(COUNT(A2:A10))))

Select cell A1 in your blank Excel worksheet, and then click Paste on the Edit menu so that the entries in the table below fill cells A1:D20 in your worksheet.
After you paste the table in your new Excel worksheet, click the Paste Options button, and then click Match Destination Formatting. With the pasted range still selected, point to Column on the Format menu, and then click AutoFit Selection.

The IQ data for the nine students are given in A2:A10, µ0 is in B2, and sigma is in C2. First, assuming that you know (and will therefore use) the value of sigma, the value returned by ZTEST for the one-sided (or one-tailed) test is in cell A14. The average IQ value, shown in A13 is 108.333. ZTEST confirms that if the true mean of the underlying Normal distribution from which all nine student IQs were drawn is 100 and the true standard deviation is 15, a sample mean higher than 108.333 would occur with probability 0.0478. Therefore, if before drawing the sample, you had established a significance level of 0.05 for this one-tailed test, we would reject the null hypothesis (in favor of the one-sided alternative that the true mean of the underlying distribution from which our sample was drawn is greater than 100).

Cell A15 shows how ZTEST is computed when you assume that sigma is known. If you call ZTEST(array, µ0, sigma), ZTEST returns

1 - NORMSDIST((AVERAGE(array) - µ0)/(sigma/SQRT(n)))

where n is the number of observations in the sample.

Cells A17:A20 provide results for the case where sigma is assumed to be not known (and the optional third argument to ZTEST is omitted). In this case, Excel calculates STDEV(array) shown in A18, and it uses this quantity instead of sigma. Cell A19 shows the results of ZTEST with the third argument omitted. Notice that the probability under the null hypothesis of a higher value of the sample mean is about one fourth of the value in the case where sigma is known. This occurs because the sample standard deviation was 11.18, lower than 15. Dividing these quantities by SQRT(sample size) = SQRT(9) = 3 gives 3.73 and 5, respectively. The observed sample mean is 8.333 units above the hypothesized µ0 = 100 and 8.333/3.73 is about 2.23, while 8.333/5 is 1.67. The probability that a standard Normal random variable exceeds 2.23 is about one fourth the probability that it exceeds 1.67.

ZTEST is designed to give a one-tailed result, the probability that a sample mean higher than observed would occur under the null hypothesis. ZTEST can be adapted to give a two-tailed result, the probability that a sample mean further from µ0 in either direction than the observed sample mean would occur under the null hypothesis. To answer this question, you can use the following formula involving ZTEST when sigma is assumed known:

= 2 * MIN(ZTEST(array, µ0 , sigma); 1 - ZTEST(array, µ0 , sigma))

You can still do a two-tailed test when sigma is assumed unknown by using:

= 2 * MIN(ZTEST(array, µ0); 1 - ZTEST(array, µ0))

Cells D14 and D19 show these two results.

Results in Earlier Versions of Excel

Code for ZTEST was not changed for Excel 2003. However, because ZTEST effectively calls NORMSDIST in computing its result, the accuracy of ZTEST relies primarily on the accuracy of NORMSDIST. The accuracy of NORMSDIST has been improved in Excel 2003. In earlier versions of Excel, a single computational procedure was used for all values of z. Results were essentially accurate to seven decimal places, more than sufficient for most practical examples.

For additional information about NORMSDIST, click the following article number to view the article in the Microsoft Knowledge Base:

827369 Excel Statistical Functions: NORMSDIST

Results in Excel 2003

The procedure for NORMSDIST in Excel 2003 uses two different computational procedures depending on the value of z. The first is for z between -5 and +5. The second is for z values in the extreme left or right tails, below -5 or above +5. Accuracy was improved for all values because, over the range of z values where each was used, these two methods were both superior to the single method used in previous versions of Excel. Typical accuracy is now 14 to 15 decimal places.

Conclusions

There are rare occasions where you might demand accuracy in ZTEST better than seven decimal places. On such occasions, the Excel 2003 version of NORMSDIST will cause ZTEST to give superior performance. For all other computations involving ZTEST, you should not notice a difference between Excel 2003 and earlier versions of Excel.

When you conduct a hypothesis test, you should establish a significance level in advance of the test. Typical significance levels are 0.05, 0.01, or 0.001. After the data is gathered, you can run ZTEST to determine whether the null hypothesis should be rejected because the value of ZTEST is below the significance level cutoff. It is hard to imagine that more than seven decimal places of accuracy in ZTEST would be required for this determination unless you started with an extremely small significance level, such as 0.00000001.

For more information about ZTEST, click Microsoft Excel Help on the Help menu, type ztest in the Search for box in the Assistance pane, and then click Start searching to view the topic.