Excel statistical functions: CRITBINOM (828117)



The information in this article applies to:

  • Microsoft Office Excel 2003
  • Microsoft Excel 2004 for Mac

SUMMARY

This article describes the CRITBINOM function in Excel 2003, illustrates how the function is used, and compares the results of the function for Excel 2003 with the results of CRITBINOM in earlier versions of Excel.

Microsoft Excel 2004 for Mac information

The statistical functions in Excel 2004 for Mac were updated by using the same algorithms that were used to update the statistical functions in Microsoft Office Excel 2003. Any information in this article that describes how a function works or how a function was modified for Excel 2003 also applies to Excel 2004 for Mac.

MORE INFORMATION

The CRITBINOM(n, p, alpha) function examines for various x the cumulative probability of x successes in n independent Bernoulli trials. Each trial has the associated probability p of success. CRITBINOM returns the smallest value of x for which this cumulative probability is greater than or equal to alpha. This cumulative probability is greater than or equal to alpha.

Syntax

CRITBINOM(n, p, alpha)

Parameter n is a positive integer, 0 < p < 1, and 0 <= alpha <= 1 (although the cases where alpha = 0 and alpha = 1 are of limited interest).

Example usage

In baseball, you can make the simplifying supposition that a ".300 hitter" receives a hit (success) with probability 0.300 each time he comes to bat (each trial) and that successive times at bat are independent Bernoulli trials. In 100 trials, 30 successes is the batter's expected number of hits.

You can use CRITBINOM to answer the following questions:
  • The smallest number x of successes where the batter has at least a 0.25 chance of getting x or fewer successes; x would be less than 30.
  • (Small variation) The smallest number of successes x' where the batter has at least a 0.25 chance of not getting at least x' successes; x' would also be less than 30.
  • The smallest number y of successes so that the chance of getting y or more successes is less than 0.10; y would be greater than 30.
To illustrate the CRITBINOM function, create a blank Excel worksheet, copy the table below, and then select cell A1 in your blank Excel worksheet. On the Edit menu, click Paste so that the entries in the table below fill cells A1:B21 in your worksheet.

number of trials, n100
success probability, p0.3
alpha
0.05=CRITBINOM($B$1,$B$2,A4)
0.1=CRITBINOM($B$1,$B$2,A5)
0.2=CRITBINOM($B$1,$B$2,A6)
0.25=CRITBINOM($B$1,$B$2,A7)
0.3=CRITBINOM($B$1,$B$2,A8)
0.4=CRITBINOM($B$1,$B$2,A9)
0.5=CRITBINOM($B$1,$B$2,A10)
0.6=CRITBINOM($B$1,$B$2,A11)
0.7=CRITBINOM($B$1,$B$2,A12)
0.75=CRITBINOM($B$1,$B$2,A13)
0.8=CRITBINOM($B$1,$B$2,A14)
0.9=CRITBINOM($B$1,$B$2,A15)
0.95=CRITBINOM($B$1,$B$2,A16)
CRITBINOMcorrect value
=CRITBINOM(1030,0.5,0.16704)499
=CRITBINOM(1030,0.5,0.1831)500
=CRITBINOM(1030,0.5,0.51242)515


Note After you paste this table into your new Excel worksheet, click the Paste Options button, and then click Match Destination Formatting. With the pasted range still selected, on the Format menu, point to Column, and then click AutoFit Selection.

Cells B4:B16 show values of CRITBINOM for respective alpha values in A4:A16. For question 1, CRITBINOM(100, 0.3, 0.25) in cell B7 is 27 so the probability of getting 27 or fewer successes is greater than or equal to 0.25 whereas the probability of getting 26 or fewer successes is less than 0.25. So, 27 is the smallest number of successes x such that the batter has at least a 0.25 chance of getting x or fewer successes.

Question 2 is just a small twist on question 1. Because there is at least a 0.25 chance of 27 or fewer successes, there is at most a 0.75 chance of getting 28 or more successes, and therefore at least a 0.25 chance of not getting 28 or more successes.

For question 3, CRITBINOM(100, 0.3, 0.9) is 36. So there is at least a 0.9 chance of getting 36 or fewer successes. Therefore, there is at most a 0.1 chance of getting 37 or more successes. Also, 37 is the smallest number that satisfies this condition.

Questions 2 and 3 illustrate the fact that many questions about the cumulative Binomial distribution require finding a CRITBINOM value. CRITBINOM makes a statement like "the smallest x that gives a cumulative probability associated with x that is greater than or equal to alpha is x = 27." This statement might then be used indirectly to answer the question. For example, if P(number of successes <= 27) >= alpha, then P(number of successes >= 28) < 1 - alpha.

Cells A18:B21 illustrate performance of CRITBINOM. Correct values were found by using Knusel's software, ELV, that he used as the basis for computations in his paper cited below. If you have a version of Excel that is earlier than Excel 2003, the entries in A20 and A21 will be #NUM!; if you have Excel 2003, entries in A19:A21 and B19:B21 will agree.

Results in earlier versions of Excel

CRITBINOM depends on BINOMDIST. Knusel (see note 1) documented instances where BINOMDIST does not return a numeric answer and yields #NUM! instead because of a numeric overflow. When numeric answers are returned by BINOMDIST they are correct. #NUM! is returned only when the number of trials is greater than or equal to 1030; there are no computational problems when n < 1030. The article on BINOMDIST discusses this more thoroughly.

Note 1: Knusel, L. On the Accuracy of Statistical Distributions in Microsoft Excel 97, Computational Statistics and Data Analysis, 26, 375-377, 1998.

For additional information about BINOMDIST, click the following article number to view the article in the Microsoft Knowledge Base:

827459 Excel statistical functions: BINOMDIST

Knusel did not report on CRITBINOM in his paper, but as expected, CRITBINOM gives the same problems but again only when n >= 1030.

Results in Excel 2003

The approach to improvements in Excel 2003 is exactly the same as with BINOMDIST: use existing pre-Excel 2003 code if n < 1030 and switch to an alternative plan if n >= 1030. The remainder of the discussion in this section deals with only the case where n >= 1030. The alternative plan is built in the same way as for BINOMDIST: find the modal value m (m = approximately n*p), assign an unscaled probability of 1 to m, find unscaled probabilities of m+1, m+2, m+3, ... stopping when such probabilities become infinitesimal, find unscaled probabilities of m-1, m-2, m-3, ... stopping when such probabilities become infinitesimal. Finally, scale the appropriate probabilities.

CRITBINOM is a kind of inverse function for a discrete distribution that is similar to NORMSINV for the continuous standard normal distribution. NORMSINV is computed through a search process that frequently calls NORMSDIST as it homes in on the result. Because BINOMDIST is computationally expensive, you want to avoid a similar process for CRITBINOM that would repeatedly call BINOMDIST. The procedure establishes an initial guess, executes the code below (similar to a single call to BINOMDIST), and then "tweaks" (adjusts) the guess to arrive at a final answer.

First, find a guess by using a normal approximation to the Binomial distribution. You can assume a normal distribution with the same mean and standard deviation as the Binomial, namely mean n*p and standard deviation SQRT(n*p*(1-p)). This approximation should be reasonably accurate as long as n*p*(1-p) > 30, say; and this will be the case with n >= 1030 unless p is very close to 0 or very close to 1. Because you require only an approximate value, use a quick approximation to NORMSINV instead of calling NORMSINV itself. The approximation comes from 26.2.23 in Abramowitz, M. and I. A. Stegun, Handbook of Mathematical Functions, Dover, 1972, p. 933.

If alpha <= 0.5, define t = SQRT(LN(1/(alpha^2))) then NORMSINV(alpha) is approximated by NApprox = -t + (2.515517 + 0.802853*t + 0.010328*t*t)/(1 + 1.432788*t + 0.189269*t*t + 0.001308*t*t*t)

If alpha > 0.5, define t = SQRT(LN(1/((1 - alpha)^2))) then NORMSINV(alpha) is approximated by NApprox = t - (2.515517 + 0.802853*t + 0.010328*t*t)/(1 + 1.432788*t + 0.189269*t*t + 0.001308*t*t*t)

The initial guessed value of CRITBINOM is

Guess = FLOOR(n*p + NApprox*SQRT(n*p*(1-p)))

You have to make sure that this is not outside the range [0,n] so run:

If (Guess < 0) then Guess = 0;
If (Guess > n) then Guess = n;

The pseudo code below is adapted from that for BINOMDIST (in the article on BINOMDIST) to find both the cumulative and non-cumulative probabilities for Guess. In other words, when this pseudo code terminates you will have computed BINOMDIST(Guess, n, p, TRUE) and BINOMDIST(Guess, n, p, FALSE).

Step 0: (Initialization). Initialize TotalUnscaledProbability, UnscaledPGuess (for non-cumulative probability), and UnscaledCumPGuess (for cumulative probability) all to 0. Initialize the constant EssentiallyZero to a very small number, such as 10^(-12).

Step 1: find n*p and round down to the nearest whole number, m. The most likely number of successes in n trials is either m or m+1.

TotalUnscaledProbability = TotalUnscaledProbability + 1;
If (m == Guess) then UnscaledPGuess = UnscaledPGuess + 1;
If (m <= Guess) then UnscaledCumPGuess = UnscaledCumPGuess + 1;

Step 2: (Calculate unscaled probabilities for k > m):

PreviousValue = 1;
Done = FALSE;
k = m + 1;
While (not Done && k <= n)
  {
	CurrentValue = PreviousValue * (n - k + 1) * p / (k * (1 - p));
	TotalUnscaledProbability = TotalUnscaledProbability + CurrentValue;
	If (k == Guess) then UnscaledPGuess = UnscaledPGuess + CurrentValue;
	If (k <= Guess) then UnscaledCumPGuess = UnscaledCumPGuess + 
		CurrentValue;
	If (CurrentValue <= EssentiallyZero) then Done = TRUE;
	PreviousValue = CurrentValue;
	k = k+1;
  }
end While;

Step 3: (Calculate unscaled probabilities for k < m):

PreviousValue = 1;
Done = FALSE;
k = m - 1;
While (not Done && k >= 0)
  {
	CurrentValue = PreviousValue * k+1 * (1-p) / ((n - k) * p);
	TotalUnscaledProbability = TotalUnscaledProbability + CurrentValue;
	If (k == Guess) then UnscaledPGuess = UnscaledPGuess + CurrentValue;
	If (k <= Guess) then UnscaledCumPGuess = UnscaledCumPGuess + 
		CurrentValue;
	If (CurrentValue <= EssentiallyZero) then Done = TRUE;
	PreviousValue = CurrentValue;
	k = k-1;
  }
end While;

Step 4: (Combine unscaled results)

PGuess = UnscaledPGuess/TotalUnscaledProbability;
CumPGuess = UnscaledCumPGuess/TotalUnscaledProbability;

Step 5: "Tweak" (adjust) the result. At this point you have values for Guess, PGuess, and CumPGuess. Let CumPGuessMinus1 be the cumulative probability of Guess - 1. Then Guess is the correct answer if CumPGuessMinus1 < alpha but CumPGuess >= alpha. If both of these are < alpha then increase Guess by 1; if both of these are >= alpha, then decrease Guess by 1. This is a simple process. First

CumPGuessMinus1 = CumPGuess - PGuess;

If you have to increase Guess then

PGuessPlus1 = PGuess * (n - Guess) * p / Guess / (1 - p);
CumPGuessMinus1 = CumPGuess;
CumPGuess = CumPGuess + PGuessPlus1;
PGuess = PGuessPlus1;
Guess = Guess + 1;

If you have to decrease Guess then

PGuessMinus1 = PGuess * Guess * (1 - p) / (n - Guess + 1) / p;
CumPGuess = CumPGuessMinus1;
CumPGuessMinus1 = CumPGuessMinus1 - PGuess;
PGuess = PGuessMinus1;
Guess = Guess - 1;

You might have to increase Guess more than one time or you might have to decrease Guess more than one time, but you would never have to both increase Guess at least one time and decrease Guess at least one time.

CRITBINOM has been thoroughly tested for accuracy. However, only casual anecdotal testing has been done to investigate how close the initial Guess is to the correct answer and how many times the Guess has to be increased or decreased. The normal approximation generally provides an excellent value of Guess; in our limited casual tests, we never had to increase or decrease the initial Guess by more than 2.

Conclusions

Inaccuracies in earlier versions of Excel occur only when the number of trials is greater than or equal to 1030. In such cases, CRITBINOM returns #NUM! in earlier versions of Excel. This issue occurs because one term in a sequence of terms to be multiplied together when you evaluate BINOMDIST overflows. This issue has been corrected in Excel 2003 by implementing an alternative procedure that is described earlier when such an overflow would otherwise occur.

Five functions that exhibit similar behavior in earlier versions of Excel are BINOMDIST, CRITBINOM, HYPGEOMDIST, NEGBINOMDIST, and POISSON. These functions always return correct numeric results or #NUM! or #DIV/0!. Again, problems occur because of overflow (or underflow). The conditions where these problems will occur are easily identified, and an alternative plan algorithm in the same spirit as the one for BINOMDIST (or CRITBINOM) is implemented in Excel 2003 to return correct answers in cases where earlier versions return #NUM! or #DIV/0!.

For more information about CRITBINOM, click Microsoft Excel Help on the Help menu, type critbinom in the Search for box in the Assistance pane, and then click Start searching to view the topic.

Modification Type:MinorLast Reviewed:1/11/2006
Keywords:kbinfo KB828117 kbAudEndUser