MORE INFORMATION
The articles on VAR and VARP describe how a two-pass
procedure has replaced the "calculator formula" for computing these measures.
With infinite precision arithmetic, both procedures return the same results and
the calculator formula takes half the computation time. However, with finite
precision arithmetic, the potential for round off errors becomes a factor. The
result of using the two-pass procedure is less susceptibility to round off
errors in extreme situations in Excel 2003. However, you may not notice the
effect of these improvements because for most data sets, there are not enough
significant digits in the data to cause different results from the calculator
formula and the two-pass procedure. VAR and VARP are always more accurate in
Excel 2003, but VAR and VARP differences between Excel 2003 and earlier
versions of Excel are likely to be noticeable only when data values contain
many significant digits and yet have small variation.
Microsoft
recommends that you read the article about VAR because it explains
modifications in the computational procedure and it also provides a worksheet
for more extensive experimentation with adding a power of ten to data
values. For additional information about VAR, click the following
article number to view the article in the Microsoft Knowledge Base:
826112
Excel Statistical Functions: VAR
The intent of the improvements was to also
improve separate code for computing VAR for summarizing data in pivot tables.
This code has been improved for part of the pivot table, but problems remain
unresolved for other parts. The numeric example later in this article helps clarify the
situation.
Create a blank Microsoft Office Excel worksheet, copy the table that follows later in this section, and
then select cell A1 in your blank Excel worksheet. On the
Edit
menu, click
Paste so that the entries in the table fill
cells A1:I28 in your worksheet.
| Group | Trial | Value | Modified
Value | Power of 10 | | | | |
| 1 | 1 | 1 | =C2+10^$E$2 | 8 | | | | |
| 1 | 1 | 2 | =C3+10^$E$2 | | All
Excel versions: | | | |
| 1 | 1 | 3 | =C4+10^$E$2 | | Var
of Value | Trial | | |
| 1 | 2 | 4 | =C5+10^$E$2 | | Group | 1 | 2 | Grand
Total |
| 1 | 2 | 5 | =C6+10^$E$2 | | 1 | 1 | 1 | 3.5 |
| 1 | 2 | 6 | =C7+10^$E$2 | | 2 | 4 | 4 | 14 |
| 2 | 1 | 2 | =C8+10^$E$2 | | 3 | 1 | 1 | 3.5 |
| 2 | 1 | 4 | =C9+10^$E$2 | | Grand
Total | 2.5 | 6.25 | 8.35294117647059 |
| 2 | 1 | 6 | =C10+10^$E$2 | | | | | |
| 2 | 2 | 8 | =C11+10^$E$2 | | Excel
2002 values with E2 set to 8: | | | |
| 2 | 2 | 10 | =C12+10^$E$2 | | Var
of Modified Value | Trial | | |
| 2 | 2 | 12 | =C13+10^$E$2 | | Group | 1 | 2 | Grand
Total |
| 3 | 1 | 3 | =C14+10^$E$2 | | 1 | 0 | 0 | 4.8 |
| 3 | 1 | 4 | =C15+10^$E$2 | | 2 | 4 | 4 | 14.4 |
| 3 | 1 | 5 | =C16+10^$E$2 | | 3 | 0 | 0 | 1.6 |
| 3 | 2 | 6 | =C17+10^$E$2 | | Grand
Total | 4 | 6 | 7.52941176470588 |
| 3 | 2 | 7 | =C18+10^$E$2 | | | | | |
| 3 | 2 | 8 | =C19+10^$E$2 | | Office
Excel 2003 values with E2 set to 8: | | | |
| | | | | Var of Modified
Value | Trial | | |
| | | Values with E2 set to
8: | | Group | 1 | 2 | Grand Total |
| | | Excel 2002 | Office Excel
2003 | 1 | 1 | 1 | 4.8 |
| all | | =VAR(C2:C19) | 7.52941176470588 | 8.35294117647059 | 2 | 4 | 4 | 28.7 |
| group
1 | | =VAR(C2:C7) | 4.8 | 3.5 | 3 | 1 | 1 | 1.6 |
| group
2 | | =VAR(C8:C13) | 14.4 | 14 | Grand
Total | 7500000450000010 | 0 | 7058824164705910 |
| group
3 | | =VAR(C14:C19) | 1.6 | 3.5 | | | | |
| trial 1 | | =VAR(C2:C4, C8:C10,
C14:C16) | 4 | 2.5 | | | | |
| trial 2 | | =VAR(C5:C7, C11:C13,
C17:C19) | 6 | 6.25 | | | | |
Note After you paste this table into your new Excel worksheet, click
Paste Options, and then click
Match
Destination Formatting. With the pasted range still selected, point to
Column on the
Format menu, and then click
AutoFit Selection.
Data for pivot tables are in cells
A1:D19. Columns F through I show values that are obtained in three pivot
tables. The first shows VAR of Value (column C). The other two pivot table
results are the same. By using this worksheet, you can create Modified Values
(column D) by adding the power of 10 in cell E2 to each Value. Adding a
constant to each data value will not affect variances. The second pivot table
demonstrates that all variances are incorrect in Microsoft Excel 2002 and earlier
versions if the data is modified by adding the large constant 10^8 to each
value. This produces data with many significant digits but small variance. Such
extreme data is unlikely to occur in practice, and you are unlikely to be
affected by this degree of numeric imprecision. This experiment is designed to
magnify round off errors. All results in this table, while incorrect, are
consistent with results from computing various VARs by using the Excel VAR
function in cells D23:D28.
The third pivot table shows Excel 2003
results. The good news is that VAR has been correctly upgraded for the interior
of the table, cells G22:H24. Values in these six cells agree with corresponding
values in the first pivot table as they should. However, something yet
unresolved is significantly wrong in the Grand Total row and Grand Total column.
Cells E23:E28 show results from computing various VARs by using the Excel 2003
VAR function for Modified Values with cell E2 set to 8. These results agree
with entries in the Grand Total row and Grand Total column of the first pivot
table as they should.
Unfortunately, this author recommends
that you ignore values in the Grand Total row and Grand Total column when you
use VAR to summarize data in an Excel 2003 pivot table. For earlier versions of
Excel, all entries in such pivot tables are likely to be correct provided that
data values are not so extreme as to cause significant round off problems. To
examine the accuracy of any VAR in the table, identify the range of data, Range,
of whose VAR you want. For earlier versions of Excel, the correct value is
obtained by computing DEVSQ(Range) / (COUNT(Range) - 1). For Excel 2003,
the correct value comes from VAR(Range).
The same comments and same
performance issues apply to VARP. To verify the accuracy of any VARP in the
table in any version, identify the range of data, Range, of whose VARP that
you want. For earlier versions of Excel, the correct value is obtained by
computing DEVSQ(Range) / COUNT(Range). For Excel 2003, the correct value comes
from VARP(Range).
Results in Earlier Versions of Excel
Use of the calculator formula in earlier versions of Excel makes
VAR and VARP, STDEV, STDEVP more susceptible to round off errors. However, round off errors are very rarely significant enough that a typical user will notice them.
These errors are likely to occur only in extreme situations, particularly when
data contains many significant digits but small variations.
Results in Excel 2003
An improved procedure that involves two passes through the data is
correctly implemented for the interior of a pivot table. On the first pass, the
sample mean is calculated; on the second pass, the sum of squared deviations
about this sample mean is calculated. This sum is then divided by the number of
observations minus 1 for VAR or by the number of observations for VARP. This
procedure is generally recommended over the calculator formula to minimize the
risk of round off errors.
Unfortunately, this has not been correctly
implemented for the Grand Total row and Grand Total column. The author suggests
an alternative mechanism for obtaining correct values of these VAR or VARPs.
Conclusions
You will occasionally see differences when you compare the values
of VAR or VARP that are calculated with earlier versions of Excel to values
that are calculated with Excel 2003. In these cases, Excel 2003 versions will
always be more accurate.
If Excel 2003 is used to summarize data with
VAR or VARP in a pivot table, you can rest assured that results in the interior
of the table are equally accurate or more accurate than in earlier versions of
Excel. Excel 2003 users can ignore results in the Grand Total row and Grand
Total column until computational issues with these entries are
resolved.