MORE INFORMATION
The CONFIDENCE(alpha, sigma, n) function returns a value
that you can use to construct a confidence interval for a population mean. The
confidence interval is a range of values that are centered at a known sample
mean. Observations in the sample are assumed to come from a normal distribution
with known standard deviation, sigma, and the number of observations in the
sample is n.
Syntax
CONFIDENCE(alpha,sigma,n)
Parameters: Alpha is a probability and 0 < alpha < 1. Sigma
is a positive number, and n is a positive integer that corresponds to the
sample size.
Typically, alpha is a small probability, such as 0.05.
Example Usage
Assume that Intelligence Quotient (IQ) scores follow a normal
distribution with standard deviation 15. You test IQs for a sample of 50
students in your local school and obtain a sample mean of 105. You want to
compute a 95% confidence interval for the population mean. A 95% or 0.95
confidence interval corresponds to alpha = 1 - 0.95 = 0.05.
To
illustrate the CONFIDENCE function, create a blank Excel worksheet, copy the
table below, and then select cell A1 in your blank Excel worksheet. On the
Edit menu, click
Paste so that the entries in
the table below fill cells A1:B7 in your worksheet.
alpha | 0.05 |
stdev | 15 |
n | 50 |
sample mean | 105 |
| |
=CONFIDENCE(B1,B2,B3) | |
=NORMSINV(1 - B1/2)*B2/SQRT(B3) | |
Note After you paste this table into your new Excel worksheet, click
the
Paste Options button, and then click
Match
Destination Formatting. With the pasted range still selected, on the
Format menu, point to
Column, and then click
AutoFit Selection.
Cell A6 shows the value of
CONFIDENCE. Cell A7 shows the same value because a call to CONFIDENCE(alpha,
sigma, n) returns the result of computing:
NORMSINV(1 - alpha/2) * sigma / SQRT(n)
No changes were made directly to CONFIDENCE, but NORMSINV was
improved between versions of Excel that are earlier than Excel 2002 and Excel
2002, and then again, improvements were made between Excel 2002 and Excel 2003.
Therefore, CONFIDENCE may return different (and improved) results in these
later versions because CONFIDENCE relies on NORMSINV.
This does not
mean that you should lose confidence in CONFIDENCE for earlier versions.
Inaccuracies in NORMSINV generally occurred for values of its argument very
close to 0 or very close to 1. In practice, alpha is generally set to 0.05,
0.01, or maybe 0.001. Values of alpha have to be much smaller than that, for
example 0.0000001, before round off errors in NORMSINV are likely to be
noticed.
Note See the article on NORMSINV for a discussion of computational
differences in NORMSINV for versions of Excel that are earlier than Excel 2002,
for Excel 2002, and for Excel 2003.
For additional
information, click the following article number to view the article in the
Microsoft Knowledge Base:
826772
Excel Statistical Functions: NORMSINV
Interpretation of the Results of CONFIDENCE
The Excel Help file for CONFIDENCE has been rewritten for Excel
2003 because all earlier versions of the Help file gave misleading advice on
interpreting results. The example states, "Suppose we observe that, in our
sample of 50 commuters, the average length of travel to work is 30 minutes with
a population standard deviation of 2.5. We can be 95 percent confident that the
population mean is in the interval 30 +/- 0.692951" where 0.692951 is the value
returned by CONFIDENCE(0.05, 2.5, 50). For the same example the conclusion
reads, "the average length of travel to work equals 30 ± 0.692951 minutes, or
29.3 to 30.7 minutes." Presumably, this is also a statement about the
population mean falling within the interval [30 - 0.692951, 30 + 0.692951] with
probability 0.95.
Before conducting the experiment that yielded the
data for this example, a classical statistician (as opposed to a Bayesian
statistician) can make no statement about the probability distribution of the
population mean. Instead, a classical statistician deals with hypothesis
testing. For example, a classical statistician may want to conduct a two-sided
hypothesis test that is based on the supposition of a normal distribution with
known standard deviation (such as 2.5), a particular pre-selected value of the
population mean, µ0, and a pre-selected significance level (such as 0.05). The
test's result would be based on the value of the observed sample mean (for
example 30) and the null hypothesis that the population mean is µ0 would be
rejected at a significance level 0.05 if the observed sample mean was too far
from µ0 in either direction. If the null hypothesis is rejected, the
interpretation is that a sample mean that far or further from µ0 would occur by
chance less than 5% of the time under the supposition that µ0 is the true
population mean. After conducting this test, a classical statistician still
cannot make any statement about the probability distribution of the population
mean.
A Bayesian statistician, on the other hand, would start with an
assumed probability distribution for the population mean (named an a priori
distribution), would gather experimental evidence in the same way as the
classical statistician, and would use this evidence to revise her or his
probability distribution for the population mean and thereby obtain an a
posteriori distribution. Excel provides no statistical functions that would
help a Bayesian statistician in this endeavor. Excel's statistical functions
are all intended for classical statisticians.
Confidence intervals are
related to Hypothesis Tests. Given the experimental evidence, a confidence
interval makes a concise statement about the values of the hypothesized population mean µ0 that would yield acceptance of the null hypothesis that the
population mean is µ0 and the values of µ0 that would yield rejection of the
null hypothesis that the population mean is µ0. A classical statistician cannot
make any statement about the chance that the population mean falls in any
specific interval, because she or he never makes a priori assumptions about
this probability distribution and such assumptions would be required if one
were to use experimental evidence to revise them.
Explore the
relationship between hypothesis tests and confidence intervals by using the
example at the beginning of this section. With the relationship between
CONFIDENCE and NORMSINV stated in the last section, you have:
CONFIDENCE(0.05, 2.5, 50) = NORMSINV(1 - 0.05/2) * 2.5 / SQRT(50) = 0.692951
Because the sample mean is 30, the confidence interval is 30 +/-
0.692951.
Now consider a two-sided hypothesis test with the
significance level 0.05 as described earlier that assumes a normal distribution
with standard deviation 2.5, a sample size of 50 and a specific hypothesized
population mean, µ0. If this is the true population mean, then the sample mean
will come from a normal distribution with population mean µ0 and standard
deviation, 2.5/SQRT(50). This distribution is symmetrical about µ0 and you
would want to reject the null hypothesis if ABS(sample mean - µ0) > some
cutoff value. The cutoff value would be such that if µ0 were the true
population mean, a value of sample mean - µ0 higher than this cutoff or a value
of µ0 - sample mean higher than this cutoff would each occur with probability
0.05/2. This cutoff value is
NORMSINV(1 - 0.05/2) * 2.5/SQRT(50) = CONFIDENCE(0.05, 2.5, 50) = 0. 692951
So reject the null hypothesis (population mean = µ0) if
sample mean - µ0 > 0. 692951
-or-
0 - sample mean > 0. 692951
Because sample mean = 30 in our example these two statements
become
30 - µ0 > 0. 692951
-or-
µ0 - 30 > 0. 692951
and rewriting them so that only µ0 appears on the left
µ0 < 30 - 0. 692951
-or-
µ0 > 30 + 0. 692951
These are exactly the values of µ0 that are not in the confidence
interval [30 - 0.692951, 30 + 0.692951]. Therefore the confidence interval [30
- 0.692951, 30 + 0.692951] contains those values of µ0 where the null
hypothesis that the population mean is µ0 would not be rejected given the
sample evidence. For values of µ0 outside this interval, the null hypothesis
that the population mean is µ0 would be rejected given the sample evidence.
Conclusions
Inaccuracies in earlier versions of Excel generally occur for
extremely small or extremely large values of p in NORMSINV(p). CONFIDENCE is
evaluated by calling NORMSINV(p) so accuracy of NORMSINV is a potential concern
for users of CONFIDENCE. However, values of p used in practice are not likely
to be extreme enough to cause significant round off errors in NORMSINV and
performance of CONFIDENCE should not be a concern to users of any version of
Excel.
Most of this article has focused on interpreting the results
of CONFIDENCE, in other words, "what is the meaning of a confidence interval?"
Confidence intervals are frequently misunderstood. Unfortunately, Excel Help
files in all versions of Excel that are earlier than Excel 2003 have
contributed to this misunderstanding. The Excel 2003 Help file has been
improved.
For more
information about CONFIDENCE, click
Microsoft Excel Help on
the
Help menu, type
confidence in the
Search for box in the Assistance pane, and then click
Start searching to view the topic.