Regression Analysis and Best Fit Lines (XE0124) (103839)



The information in this article applies to:

  • Microsoft Excel 97 for Windows
  • Microsoft Excel 98 Macintosh Edition

This article was previously published under Q103839

SUMMARY

The Application Note "Regression Analysis and Best Fit Lines" (XE0124), discusses how to use Microsoft Excel functions to perform simple, multiple, and polynomial regression analysis. It contains examples of how to use LINEST(), LOGEST(), TREND(), and GROWTH() to describe a best fit line or curve and to make predictions about your data. It also outlines some of the new statistical functions and tools available with Microsoft Excel versions 4.0 and later.

The following file is available for download from the Microsoft Download Center:
For more information about how to download Microsoft Support files, click the following article number to view the article in the Microsoft Knowledge Base:

119591 How to obtain Microsoft support files from online services

Microsoft scanned this file for viruses. Microsoft used the most current virus-detection software that was available on the date that the file was posted. The file is stored on security-enhanced servers that help prevent any unauthorized changes to the file.

If you are unable to access the source(s) listed above, you can have this Application Note mailed or faxed to you by calling Microsoft Product Support Services Monday through Friday, 6:00 A.M. to 6:00 P.M. Pacific time at (425) 635-7070. If you are outside the United States, contact the Microsoft subsidiary for your area. To locate your subsidiary, see the Microsoft World Wide Offices Web site at:

Text of Xe0124

The following is the full text of the Application Note. (Note that some graphics and formatting may be missing from this version of the document. For best results, download the Microsoft Word version of this document.)

                                 OVERVIEW
                                 ========

This Application Note discusses how to use Microsoft Excel functions to
perform simple, multiple, and polynomial regression analysis. It contains
examples of how to use LINEST, LOGEST, TREND, and GROWTH to describe a
best-fit line or curve and to make predictions about your data. It also
outlines some of the built-in statistical functions and tools available
with Microsoft Excel.

   General Information
      To Use the Analysis ToolPak
      Choosing the Best Function
         When Your Data Is Linear
         When Your Data Is Exponential
         When Your Data Is Curvilinear

   Simple Regression Analysis
      Describing a Best-fit Line
         Finding the Slope and the Y Intercept
      Calculating a Best-fit Line
         Using TREND
         Using LINEST
      Plotting the Best-fit Line
         Plotting the Trendline Automatically
      Predicting Future Values
         Using FORECAST
         Using TREND
         Using LINEST

   Multiple Regression Analysis
      Predicting Y-Values
         Using TREND
         Using LINEST

   Polynomial Regression Analysis
      Calculating a Polynomial Curve
      Charting a Polynomial Curve

   Using Regression Statistics
         Using LINEST/LOGEST for Regression Statistics
         Using R2 to Test Regression Model Accuracy

GENERAL INFORMATION
===================

Regression is a statistical method used to predict values based on
relationships in existing data. By analyzing how a single dependent
variable (y) is affected by the values of one or more independent variables
(x), you can predict what y will be given x. You can use this information
to fit a line or a curve to your existing data and to forecast future
values. The LINEST, TREND, LOGEST, and GROWTH functions are the primary
functions you will use to perform regression analysis in Microsoft Excel.

While this Application Note focuses primarily on the functions that can be
used in Microsoft Excel versions 3.0 and later, Microsoft Excel versions
4.0 and later offer several new functions and tools that you can use to
perform regression analysis and to create best-fit lines. When one of these
new functions can be used to perform a task described in this Application
Note, the function will be noted in the appropriate section. The following
table lists some of these new functions.

                                        Use this
   To do this                           function
   ----------------------------------------------

   Return the correlation coefficient   CORREL
   for two arrays of cells

   Return a single predicted y-value    FORECAST
   based on a linear regression of
   known x and y ranges

   Return the y intercept of the linear INTERCEPT
   regression line

   Calculate R2, the coefficient of     RSQ
   Determination

   Return the slope of the linear       SLOPE
   regression line

   Return the standard error of the     STEYX
   Regression

   Table 1--Regression Analysis Functions in
   Microsoft Excel Versions 4.0 and later.

In addition, the Analysis ToolPak add-in provides a special set of
analysis tools, including tools to accomplish the following tasks.

                                            Use this
   To do this                               analysis tool
   -------------------------------------------------------

   Predict a value based on the forecast    Exponential
   for the prior period, adjusted for the   Smoothing
   error in that prior forecast

   Project values in the forecast period    Moving Average
   based on the average value of the
   variable over a specific number of
   preceding periods

   Perform linear regression analysis and   Regression
   return statistics and plots as specified

               Table 2--Analysis ToolPak Add-in Features

To Use the Analysis ToolPak
---------------------------

In Microsoft Excel 5.0 and later:

1. On the Tools menu, click Data Analysis.

2. If the Data Analysis command is not available, click Add-Ins on the
   Tools menu. In the Add-Ins dialog box, click to select the Analysis
   ToolPak check box.

NOTE: If the Analysis ToolPak add-in is not listed, run the Setup program,
choose Add/Remove, and select the Add-ins option for Microsoft Excel.

In Microsoft Excel 4.0:

1. On the Options menu, click Analysis Tools.

2. If the Analysis Tools command is not available, click Add-In on the
   Options menu. In the Add-Ins dialog box, click Add. Click Analysis.xla
   in the Library\Analysis folder.

3. In the Data Analysis dialog box, choose the tool that you want to use,
   such as Exponential Smoothing. For help on how to use a particular
   analysis tool, click Help in the dialog box for the tool.

CHOOSING THE BEST FUNCTION
==========================

Whether you are performing simple regression (one x variable), multiple
regression (two or more x variables), or polynomial regression (one x
variable raised to different powers), you will get the most accurate
results if the function that you choose to regress your data is based on
the patterns in your existing data.

When Your Data Is Linear
------------------------

Your data is linear if the rate of change in your data is even to such an
extent that when you plot it in a chart, the pattern in your data points
resembles a line. If your data is linear, use the linear regression
functions, LINEST and TREND. Both functions use the "least squares" method
to calculate a straight line that best fits your data. LINEST returns
information about the line, such as its slope and y intercept, and TREND
returns predicted values along the line.

In Microsoft Excel versions 4.0 and later, the Regression tool (in the
Analysis ToolPak add-in) performs linear regression, returns regression
statistics, calculates best-fit lines, and creates best-fit line charts.

When Your Data Is Exponential
-----------------------------

Your data is exponential if the rate of change in your data, when plotted
on a chart, resembles a curve that rises or falls at an increasingly higher
rate. If your data is exponential, use the logarithmic regression
functions, LOGEST and GROWTH. LOGEST calculates an exponential curve that
best-fits your data and, like LINEST, returns information about the curve.
Like TREND, GROWTH returns predicted values along the curve.

When Your Data Is Curvilinear
-----------------------------

To most accurately predict values when the pattern in your data is neither
linear nor exponential, use polynomial regression in conjunction with the
TREND function to calculate a best-fit curve. For example, use this method
if, when you plot your data in a chart, it resembles a curve for which the
rate of change is not dramatic or if your data fluctuates in such a way
that no linear or curved pattern can be identified.

SIMPLE REGRESSION ANALYSIS
==========================

Your regression analysis is "simple" if you have only one independent x
variable for each dependent y variable. For example, assume you are
analyzing the sales figures for the first six months of operation for
Wingtip Toys, a company that specializes in the design and manufacture of
toys.

   NOTE: The following examples primarily use the LINEST and TREND
   functions. Wherever these two functions are discussed, LOGEST and
   GROWTH can be substituted if your data is exponentially curved and
   if a curve fit would be more accurate than a straight line.

In the following sample data, the values in the Month column are the
independent x variables and the values in the Sales column are the
dependent y variables. Based on this data, you can describe, calcu late,
and plot a best-fit line, and you can then predict future sales figures.
Because the data is linear, you will use the LINEST and TREND functions to
perform the regression analysis.

The Regression tool in Microsoft Excel version 4.0 and later performs each
of these tasks automatically.

For additional information on calculating regression, see the following
references.

   Version of
   Microsoft Excel   Reference
   ------------------------------------------------------------

   97, 98            In Help, search for "Regression, Analysis"
   7.0               In Help, search for "Regression"
   5.0               In Help, search for "Regression "
   4.0               User's Guide 2, pages 41-45

Because this tool performs linear regression, if your data resembles
an exponential curve, use LOGEST and GROWTH.

Following are the sales figures for Wingtip Toys and the corresponding
months in both table and chart form.

         A       B
   1   Month   Sales
   2        1  $4,200
   3        2  $6,100
   4        3  $7,300
   5        4  $7,300
   6        5  $8,700
   7        6  $10,500

         Table 3--Sample Data (Sales Figures for Wingtip Toys)

DESCRIBING A BEST-FIT LINE
==========================

The equation of a straight line is y=mx+b, where m is the slope and b is
the y intercept. LINEST returns the slope (m) and y intercept (b) values
that describe the line derived from your existing data.

Microsoft Excel versions 4.0 and later provide specific SLOPE and INTERCEPT
functions for calculating the slope and the y intercept when your data is
linear.

For additional information, see the following references.

   Version of
   Microsoft Excel   Reference
   ------------------------------------------------------------

   97, 98            In Help, search for "slope" or "intercept"
   7.0               In Help, search for "slope" or "intercept"
   5.0               In Help, search for "slope" or "intercept"
   4.0               Function Reference Guide, pages 405-406

   NOTE: If your data is exponentially curved, use LOGEST to return
   the slope (m) and y intercept (b) values that describe the curve.
   The equation used by LOGEST is y=b*m^x.

Finding the Slope and the Y Intercept
-------------------------------------

To calculate the values of the slope (m) and y intercept (b), use the
procedure appropriate for your version of Microsoft Excel.

Microsoft Excel 4.0 and later:

1. Use the data in Table 3--Sample Data (Sales Figures for Wingtip Toys).

2. To find the slope, select cell E2 and type the following formula:

      =SLOPE(B2:B7,A2:A7)

The slope of the line for this data is 1122.857.

3. To find the y intercept, select cell F2, and type the following
   formula:

      =INTERCEPT(B2:B7,A2:A7)

The point at which the line crosses the y axis is 3420.

Microsoft Excel 3.0:

1. Using the data in "Table 3--Sample Data (Sales Figures for Wingtip
   Toys)," select cells E2:F2.

2. Type the following formula:

      =LINEST(B2:B7,A2:A7)

   NOTE: Because the function returns data to more than one cell, you
   must enter the formula as an array by pressing CTRL+SHIFT+ENTER in
   Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel
   for the Macintosh.

The first argument in the LINEST function is the array containing the known
y-values (which in this example are the Sales numbers). The second argument
is the array containing the known x-values (in this case, the Month
numbers).

   NOTE: LINEST also takes other optional arguments that are not
   necessary for this example.

The result 1122.857, in E2, is the slope, and the result 3420, in cell
F2, is where the line crosses the y-axis (y intercept).

        E        F
   1  Slope      Y intercept
   2  1122.857    3420

   Table 4--Example of Slope Intercept Values

CALCULATING A BEST-FIT LINE
===========================

If your data is linear, use TREND or LINEST to calculate your best-fit
line. In Microsoft Excel versions 4.0 and later, you can also use the
FORECAST function (forecast is mainly useful for finding a data point based
on existing data, but can it also be used as a substitute for the TREND
function). If your data fits an exponential curve, use LOGEST or GROWTH.

Using TREND
-----------

The TREND function is the easiest and most efficient function for
calculating the points along a best-fit line. To simultaneously
calculate all the values on the best-fit line, do the following:

1. Using the data in Table 3 of this Application Note, select cells C2:C7
   and type the following formula:

      =TREND(B2:B7,A2:A7)

   NOTE: Because the function returns data to more than one cell, you
   must enter the formula as an array by pressing CTRL+SHIFT+ENTER in
   Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel
   for the Macintosh.

The first argument in the TREND function is the array containing the known
y-values and the second argument is the array containing the known x-
values.

   NOTE: TREND also takes other optional arguments that aren't
   necessary for this example.

Using LINEST
------------

You can also use the slope (m) and the y intercept (b) values returned by
LINEST to find data points on the best-fit line by substituting the y-
values or the x-values into the equation for a line (y=mx+b). By plugging
each month number into this formula, you can calculate all the data points
for your best-fit line. With the slope(m) value (1122.857) in cell E2 and
the intercept (b) value (3420) in cell F2, do the following to generate the
points on your best-fit line:

1. Select cell D2 and enter the following formula (because the formula
   only references single cells, it is not necessary to enter this formula
   as an array):

      =($E$2*A2)+$F$2

2. Select cells D2:D7.

3. Use the method appropriate for you version of Microsoft Excel.

    - In Microsoft Excel 5.0 and later, point to Fill on the Edit menu,
      and click Down.

    - In Microsoft Excel 4.0 and earlier, click Fill Down on the Edit
      menu.

The values returned are the y-values for your best-fit line.

The following table shows the results of the values returned when you
use TREND and LINEST.

        A        B          C             D           E           F
   -----------------------------------------------------------------------
   1  Month    Sales   Predicted Y   Predicted Y    Slope    Y Intercept
                          TREND         LINEST
   -----------------------------------------------------------------------
   2       1    $4,200       $4,543        $4,543  1122.857          3420
   3       2    $6,100       $5,666        $5,666
   4       3    $7,300       $6,789        $6,789
   5       4    $7,300       $7,911        $7,911
   6       5    $8,700       $9,031        $9,031
   7       6   $10,500      $10,157       $10,157

  Table 5--Results of the Values Returned When You Use TREND and LINEST.

   NOTE: The returned values for TREND and LINEST are identical.

PLOTTING THE BEST-FIT LINE
==========================

Once you have calculated the values on your best-fit line, you can add that
line to your existing chart by copying cells C1:C7 and pasting them into
your existing chart.

The resulting chart will have a straight line (best-fit line) running
through your original data.

   NOTE: When you display the points of a best-fit line against the
   original data in a chart, in most cases you will get the best results
   by using an xy (scatter) chart. If you use a line chart, the x-values
   will be treated as labels rather than as values, and curved lines may
   result.

Plotting the Trendline Automatically
------------------------------------

In Microsoft Excel versions 5.0 and later, you can insert a trendline
directly into your chart without having to first calculate the points or
copy and past those points into the chart. Do the following to
automatically insert a trendline in a chart:

1. Double-click the chart to activate it.

2. Select the series for which you want to plot a trendline.

3. Use the procedure appropriate for your version of Microsoft Excel:

   If you are using Microsoft Excel 97 or 98, click Add Trendline on the
   Chart menu. In the Trendline dialog box, click the Type tab, and then
   select the type of trend or regression you want to plot. Click OK.

   If you are using Microsoft Excel 5.0 or 7.0, click Trendline on the
   Insert menu.

PREDICTING FUTURE VALUES
========================

In addition to returning values along the line fitted to your existing
data, you can use TREND and LINEST to predict future values. In Microsoft
Excel 4.0 and later, you can also use the FORECAST function to predict
future values.

Using the Wingtip Toys example, suppose you want to calculate sales figures
for months 7, 8, and 9. The following examples show how to accomplish this
using the FORECAST, TREND, and LINEST functions, respectively.

To predict values for months 7, 8, and 9, first enter the month
numbers for which you want predicted sales figures and then use the
FORECAST function to calculate the values.

Using FORECAST
--------------

1. In cells A8:A10, type 7, 8, and 9, respectively.

2. Select cells B8:B10.

3. Type the following formula:

      =FORECAST(A8:A10,B2:B7,A2:A7)

   NOTE: Because the function returns data to more than one cell, you
   must enter the formula as an array by pressing CTRL+SHIFT+ENTER in
   Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel
   for the Macintosh.

The first argument in the FORECAST function is the array containing
the new x-values for which you want to derive predicted y-values.

The resulting values in cells B8:B10 are the predicted sales for the
next three months.

         A       B
   1   Month   Sales
   2     1     $4,200
   3     2     $6,100
   4     3     $7,300
   5     4     $7,300
   6     5     $8,700
   7     6     $10,500
   8     7     $11,280
   9     8     $12,403
   10    9     $13,526

   Table 6--Sample Data Using TREND to Predict Future Values Using TREND

1. In cells A8:A10, type 7, 8, and 9, respectively.

2. Select cells B8:B10.

3. Type the following formula:

      =TREND(B2:B7,A2:A7,A8:A10)

   NOTE: Because the function returns data to more than one cell, you must
   enter the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft
   Excel for Windows or COMMAND+RETURN in Microsoft Excel for the
   Macintosh.

The third argument in the TREND function is the array containing the
new x-values for which you want to derive predicted y-values.

The resulting values in cells B8:B10 are the predicted sales for the
next three months.

         A       B
   1   Month   Sales
   2     1     $4,200
   3     2     $6,100
   4     3     $7,300
   5     4     $7,300
   6     5     $8,700
   7     6    $10,500
   8     7    $11,280
   9     8    $12,403
   10    9    $13,526

       Table 7--Sample Data Using TREND to Predict Future Values

   NOTE: In Microsoft Excel versions 4.0 and later, you can use the
   AutoFill feature to predict future values. Using the data in the
   previous table, if you wanted to predict sales for months 7, 8, and 9,
   you would select cells B2:B7, select the AutoFill handle in the lower-
   right corner of the selected area, and drag down three additional
   cells. (The AutoFill method is by far the easiest method to use for
   predicting values; however, if you use the formulas, it is easier to
   tell which values are derived and which values are static). The
   functions give you more power and flexibility than the AutoFill feature
   does.

   CAUTION: In addition to returning predicted values for months 7, 8, and
   9, the data in cells B2:B7 will be overwritten with the values that
   represent the best-fit line. If you do not want your original data to
   be overwritten, copy it to a separate area on your worksheet and then
   use AutoFill.

Using LINEST
------------

To obtain the new y-values, you can also substitute the slope and y
intercept values that you derived with the LINEST function (these results
are on page 5) and the new x-values (7, 8, and 9) into the formula, y=mx+b.
See Using LINEST in the "Calculating a Best-fit Line" section for step-by-
step instructions on how to do this.

                       MULTIPLE REGRESSION ANALYSIS
                       ============================

When you have two or more independent x variables for each y variable, the
regression analysis is considered multiple. For example, you could predict
a child's weight given his or her age and height. Assume you've collected
the following data

        A      B      C
   1   Age   Height  Weight
   2       3     32     35
   3       5     40     40
   4       6     39     43
   5      10     50     70

        Table 8--Sample Data: Age, Height, and Weight of Child

where the values under Weight (C2:C5) represent the dependent y variables
and the values under Age and Height (A2:B5) represent the independent x
variables.

PREDICTING Y-VALUES
===================

You can use either the TREND or the LINEST function to analyze the
relationship of the age and height to weight, and you can make
predictions based on the results of this analysis.

In Microsoft Excel 4.0 and later, the Regression tool can also be used
to predict y-values in a multiple regression model.

   NOTE: Do not use the FORECAST function because it only works for
   simple regression.

For additional information on predicting values, see the following
references.

   Version
   Of Microsoft Excel   Reference
   ---------------------------------------------------------------
   97, 98               In Help, search for "Multiple, Regression"
   7.0                  In Help, search for "Multiple Regression"
   5.0                  In Help, search for Multiple Regression"
   4.0                  User's Guide 2, pages 41-45

Using TREND
-----------

Using TREND to Predict a Child's Weight:

This example uses the data in Table 8--Sample Data: Age, Height, and
Weight of Child.

To use TREND to predict the weight of a 9-year-old, 45-inch child, do
the following:

1. In cells A6 and B6, type 9 and 45, respectively.

2. Select cell C6 and type the following formula:

      =TREND(C2:C5,A2:B5,A6:B6)

   Because the function returns data to more than one cell, you must enter
   the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft Excel
   for Windows or COMMAND+RETURN in Microsoft Excel for the Macintosh.

The result of the formula, 63.42, is the predicted weight.

Using LINEST
------------

To predict a y-value with LINEST, you must first calculate the slopes for
each x variable and find the y intercept. Because a slope is returned for
each x variable, when you use the LINEST function, you must first select a
range of cells that consist of a single row and a single column plus an
additional column for each x variable in your data table. In this example,
because you have two x variables, you will need to select a range of three
cells, three columns wide by one row tall.

Using LINEST to Calculate a Child's Weight:

This example uses the data in Table 8--Sample Data: Age, Height, and
Weight of Child.

To calculate the slopes and the y intercepts, select cells A7:C7 and
type the following formula:

   =LINEST(C2:C5,A2:B5)

NOTE: Because the function returns data to more than one cell, you must
enter the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft
Excel for Windows or COMMAND+RETURN in Microsoft Excel for the Macintosh.

The following values will be returned

        A     B      C
   7  -0.32  5.98     24

where -0.32 is the slope for the second x variable (height), 5.98 is
the slope for the first x variable (age), and 24 is the y intercept.

   NOTE: The slopes are in reverse order: the first slope value
   corresponds to the second x variable and the second slope value
   corresponds to the first x variable. The LINEST function always
   returns the slopes in reverse order when more than one x variable
   is involved.

You can use the slope values and the y intercept value to make
predictions based on your data. Using the formula,
y=(m1*x1)+(m2*x2)+(mn*xn)+b, you can predict the weight of a 9-year
old, 45-inch child:

    =(9*5.98)+(45*-0.32)+24

The result of the formula, 63.42, is the predicted weight.

Similar to LINEST, the Regression tool in Microsoft Excel versions 4.0 and
later returns the slope values and the y intercept value. As described
previously, you can plug these values into the formula, y=mx+b, to predict
y.

   NOTE: Because the x variables are independent, there may not be a
   good graphical representation for a multiple regression model. Each
   x-value can be plotted with its corresponding y-value, but the
   individual lines may be completely unrelated, and, therefore, may
   be meaningless.

                      POLYNOMIAL REGRESSION ANALYSIS
                      ==============================

When your data is neither exponentially curved nor consistently
linear, use the polynomial method of regression. When you plot a best-
fit curve calculated with polynomial regression, the curve will rise
and fall with the data.

CALCULATING A POLYNOMIAL CURVE
==============================

To calculate a polynomial curve, the dependent y variable is regressed
against the independent x variable raised to different powers. To
illustrate this process, take the following example. Fitting a straight
line to the following data would not accurately predict the sales for any
given month.

         A       B
   1   Month   Sales
   2        1  $4,200
   3        2  $1,600
   4        3  $5,120
   5        4  $4,500
   6        5  $5,400
   7        6  $1,460

   Table 9--Six-Month Sales Figures Wingtip Toys

In this case, you will get the best results by setting up the following
polynomial regression model.

       A    B    C      D      E        F
   10  X   X^2  X^3    X^4   Sales    Trend
   11   1     1    1      1  $4,200   $4,089
   12   2     4    8     16  $1,600   $2,154
   13   3     9   27     81  $5,120   $4,011
   14   4    16   64    256  $4,500   $5,609
   15   5    25  125    625  $5,400   $4,846
   16   6    36  216   1296  $1,460   $1,571

   Table 10--Polynomial Regression Model for Wingtip Sales Figures

The values in cells A11:A16 are the month numbers copied from cells A2:A7
of Table 9--Six-Month Sales Figures Wingtip Toys. The values in cells
B11:D16 are the original x variables raised to the second, third, and
fourth powers, respectively. To obtain these values, do the following:

1. Select cell B11 and enter the formula:

      =A11^2

2. Select cell C11 and enter the formula:

      =A11^3

3. Select cell D11 and enter the formula:

      =A11^4

4. Select cells B11:D16.

5. Use the method appropriate for your version of Microsoft Excel.

    - In Microsoft Excel version 5.0 or later, point to Fill on the Edit
      menu, and click Down.

    - In Microsoft Excel version 4.0 or earlier, click Fill Down on the
      Edit menu.

The values in E11:E16 are the sales figures copied from B2:B7. To derive
the trend values in column F, select cells F11:F16 and type the following
formula:

      =TREND(E11:E16,A11:D16)

   NOTE: Because the function returns data to more than one cell, you
   must enter the formula as an array by pressing CTRL+SHIFT+ENTER in
   Microsoft Excel for Windows or COMMAND+RETURN in Microsoft Excel
   for the Macintosh.

CHARTING A POLYNOMIAL CURVE
===========================

To add the TREND results to this chart, select cells F10:F16, and then
copy and paste them into your existing chart.

                        USING REGRESSION STATISTICS
                        ===========================

The LINEST and LOGEST functions can return additional regression
statistics that can be helpful in using and evaluating your regression
model. If you are using Microsoft Excel 4.0 or later and have linear
data, you can use the Regression Tool from the Analysis ToolPak add-
in. This tool will automatically return all the regression statistics.
If your data resembles an exponential curve, use LOGEST to return
accurate regression statistics.

USING LINEST/LOGEST FOR REGRESSION STATISTICS
=============================================

To return the additional statistics using LINEST or LOGEST, you must
select a range that includes five rows and a single column plus an
additional column for each x variable in your data. In addition, the
stats argument, which is the fourth argument in both of these
functions, must be set to TRUE.

The following table lists the ages, weights, and heights of a number
of children.

        A      B      C
   1   Age   Height  Weight
   2       3     32     35
   3       5     40     40
   4       6     39     43
   5      10     50     70

        Table 11--Sample Data: Age, Height, and Weight of Child

To return the additional regression statistics using the data from "Table
11--Sample Data: Age, Height, and Weight of Child" use the following steps:

1. Select cells D1:F5.

NOTE: This range consists of five rows and a single column plus two
additional columns (one for each x variable).

2. Type the following formula:

      =LINEST(C2:C5,A2:B5,,TRUE)

Because the function returns data to more than one cell, you must enter
the formula as an array by pressing CTRL+SHIFT+ENTER in Microsoft Excel
for Windows or COMMAND+RETURN in Microsoft Excel for the Macintosh.

The resulting data should resemble the data in the following table.

          D          E         F
   1  -0.32      5.98      24
   2  2.243569   5.647619  57.68449
   3  .950813    6.024948  #N/A
   4  9.665289   1         #N/A
   5  701.7      36.3      #N/A

   Table 12--Regression Statistics

The first row of the statistics contains the slope for the height, the
slope for the age, and the y intercept. The second row contains the
standard error of the slopes and of the y intercept. The third row contains
R2 and the standard error for the y estimate. The fourth row contains the F
statistic and degrees of freedom. And, the fifth row contains the
regression sum of squares and the residual sum of squares.

USING R2 TO TEST REGRESSION MODEL ACCURACY
==========================================

A particularly useful statistic returned is the coefficient of
determination called R2. In Microsoft Excel versions 4.0 and later, you can
also use the RSQ function to find R2. This R2 indicator ranges in value
from 0 to 1 and reveals how closely the estimated y-values correlate to
your actual y-values. The closer R2 is to 1, the more perfect the
correlation-this correlation indicates that the regression equation is very
useful in accurately predicting a y-value. On the other hand, the closer R2
is to 0, the less helpful it will be in predicting a y-value.

In the previous example, the value for R2 returned by LINEST is .95, an
excellent correlation. This indicates that, based on the collected data,
the LINEST model can be used to make extremely accurate predictions of a
child's weight given a specific age and height.

The disk and software contained on it, including any accompanying
documentation (the "Software"), are provided to you at no additional
charge. Microsoft Corporation owns all rights, title, and interest in and
to the Software. The user assumes the entire risk as to the accuracy and
the use of the Software.

                                    ###

COPYRIGHT NOTICE. Copyright (c) 1992-1997 Microsoft Corporation. Microsoft
and/or its suppliers, One Microsoft Way, Redmond, Washington 98052-6399
U.S.A. All rights reserved.

TRADEMARKS. Microsoft, Windows, Windows NT, MSN, The Microsoft Network
and/or other Microsoft products referenced herein are either trademarks or
registered trademarks of Microsoft. Other product and company names
mentioned herein may be the trademarks of their respective owners.

The names of companies, products, people, characters and/or data mentioned
herein are fictitious and are in no way intended to represent any real
individual, company, product or event, unless otherwise noted.

NO WARRANTY. THE SOFTWARE IS PROVIDED "AS-IS," WITHOUT WARRANTY OF ANY
KIND, AND ANY USE OF THIS SOFTWARE PRODUCT IS AT YOUR OWN RISK. TO THE
MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, MICROSOFT AND ITS SUPPLIERS
DISCLAIM ALL WARRANTIES AND CONDITIONS, EITHER EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES AND CONDITIONS OF
MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE, TITLE, AND NON-
INFRINGEMENT, WITH REGARD TO THE SOFTWARE.

LIMITATION OF LIABILITY. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW,
IN NO EVENT SHALL MICROSOFT OR ITS SUPPLIERS BE LIABLE FOR ANY SPECIAL,
INCIDENTAL, INDIRECT, OR CONSEQUENTIAL DAMAGES WHATSOEVER (INCLUDING,
WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS PROFITS, BUSINESS
INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR ANY OTHER PECUNIARY LOSS)
ARISING OUT OF THE USE OF OR INABILITY TO USE THE SOFTWARE, EVEN IF
MICROSOFT HAS BEEN ADVISED OF THE POSSIBLITY OF SUCH DAMAGES. BECAUSE SOME
STATES AND JURISDICTIONS DO NOT ALLOW THE EXCLUSION OR LIMITATION OF
LIABILITY FOR CONSEQUENTIAL OR INCIDENTAL DAMAGES, THE ABOVE LIMITATION MAY
NOT APPLY. MICROSOFT'S ENTIRE LIABILITY AND YOUR EXCLUSIVE REMEDY UNDER
THIS EULA SHALL NOT EXCEED FIVE DOLLARS (US$5.00).

The following conditions also apply to your use of the Software:

The Software may be copied and distributed internally only, subject to the
following conditions: All text must be copied without modification and all
pages must be included; If software is included, all files on the disk(s)
|must be copied without modification [the MS-DOS(R) utility diskcopy is
appropriate for this purpose]; All components of this Software must be
distributed together; and This Software may not be distributed to any third
party.

If you are not a Microsoft Premier customer, Microsoft shall not provide
technical support for this Software.

The Software is provided with RESTRICTED RIGHTS. Use, duplication, or
disclosure by the Government is subject to restrictions set forth in
subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer
Software clause at DFARS 252.227-7013 or subparagraphs (c)(1) and (2) of
the Commercial Computer Software-Restricted Rights at 48 CFR 52.227- 19, as
applicable. Manufacturer is Microsoft Corporation, One Microsoft Way,
Redmond, WA 98052-6399. Any transfer of the Software must be accompanied by
this statement and may only be transferred if first approved by Microsoft.

You agree that you will not export or re-export the Software to any
country, person, entity or end user subject to U.S.A. export restrictions,
and you are responsible for complying with all applicable U.S. and local
export laws in connection with the use of this Software. You warrant and
represent that neither the U.S.A. Bureau of Export Administration nor any
other federal agency has suspended, revoked or denied you export
privileges.

This EULA is governed by the laws of the State of Washington, U.S.A.
				

Modification Type:MajorLast Reviewed:6/8/2005
Keywords:kbdownload kbappnote kbfile kbGraphxLink kbhowto KB103839