TechSkills 2: Regression
In this TechSkills module, we take a given data set and fit an appropriate function to the data. The formal process of fitting a function to a given data set is called regression. Regression is a formal component of statistics. We start with fitting the simplest type of function, a linear function.
Linear regression is the process of fitting a straight line to a data set. A linear function has the form y = ax + b. Hence, the task of linear regression is to determine the values of a and b that give the straight line that best fits the given data in the least-squares sense. As an example, we consider a data set from Sullivan and Sullivan, Precalculus Enhanced with Graphing Utilities, Second Edition, page 112, Example 3.
| Fertilizer, X (lbs/100 ft^2) | 0 | 0 | 5 | 5 | 10 | 10 | 15 | 15 | 20 | 20 | 25 | 25 |
| Yield, Y (bushels) | 4 | 6 | 10 | 7 | 12 | 10 | 15 | 17 | 18 | 21 | 23 | 22 |
Our first task is to enter the data set into the Casio. Fortunately, the Casio makes this a simple task. Turn the calculator on, display the MAIN MENU, and select the STAT menu. A spreadsheet is now displayed. Enter the fertilizer amounts in List 1 and the corresponding yields in List 2, using the thumb pad to navigate from List 1 to List 2. To fill List 1, key in the fertilizer amounts in order (0, 0, 5, 5, ..., 25), confirming each value with the entry key. Then move to List 2 and key in the yields in the same way (4, 6, 10, 7, ..., 22). Note that a number is not actually placed into the list until you press the entry key.
After the data set has been entered and checked for accuracy, we are ready to see a plot of the data. Press F1 to access the GRPH menu, then press F1 again to select GPH1, statistical graph 1. A plot of the data set appears.
Notice that it looks reasonable to fit a straight line through the displayed data points. A suite of regression types appears at the bottom of the screen.
Select X, the Casio's symbol for linear regression. The calculator then displays the LinearReg screen with the values of a and b in the linear form y = ax + b for the straight line of best fit. Here, to three decimal places, the line of best fit is y = 0.717x + 4.786.
Select COPY to copy the linear regression equation into the Graph Func editor.
Confirm to store the regression equation as function Y1 in the Graph Func menu and return to the LinearReg screen; otherwise, cancel to return to the LinearReg screen without storing it.
Select DRAW. The Casio then displays the linear regression line superimposed on the data.
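The coefficients reported on the LinearReg screen can be checked away from the calculator. The following is a minimal sketch in Python, assuming the calculator computes an ordinary least-squares fit; the function name `linear_fit` and the variable names are illustrative choices, not anything taken from the Casio.

```python
# Ordinary least-squares fit of y = a*x + b, using only the standard library.
# Data: the fertilizer/yield table given earlier in this module.
fertilizer = [0, 0, 5, 5, 10, 10, 15, 15, 20, 20, 25, 25]
crop_yield = [4, 6, 10, 7, 12, 10, 15, 17, 18, 21, 23, 22]


def linear_fit(xs, ys):
    """Return (a, b) minimizing the sum of squared residuals of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and the x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx               # slope
    b = mean_y - a * mean_x     # intercept: the line passes through the means
    return a, b


a, b = linear_fit(fertilizer, crop_yield)
print(f"y = {a:.3f}x + {b:.3f}")  # prints: y = 0.717x + 4.786
```

The printed equation agrees with the calculator's result to three decimal places.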
Of course there are other functional relationships besides linear. The Casio supports a handy suite of choices, including linear, quadratic, cubic, quartic, logarithmic, exponential, power, sinusoidal and logistic.
To illustrate obtaining a function from data, we utilize a data set for the growth of the brewers' yeast, Saccharomyces cerevisiae, obtained by the biologist Tor Carlson in 1913. Here, time is measured in hours and population in biomass units.
First enter the data set. Return to the spreadsheet, and then return to the main spreadsheet menu. To clear the old data, press F6 to access more options, select DEL-A (delete all), and then select YES. Use the thumb pad to move to another column with data and delete its entries in the same way. After the old data are completely cleared, enter the new data set.
After the data are entered and checked, we are ready to plot them. If GRPH is not displayed as a menu item, press F6 to show more menu options until GRPH is the F1 menu item. Then press F1 to select GRPH, and press F1 again to select GPH1. A plot of the data set appears. Use V-Window to adjust the viewing window; for example, Xmin: -1, max: 8, scale: 1, Ymin: 0, max: 275, scale: 50 produce the graph at right.
We are now ready for the regression. The problem now is, which function should we choose for the regression? No technology can tell us what function to select. Accordingly, some basic background is fundamental. Here, we want to model the growth of a biological organism. Hence, we appeal to the basic law of biological growth:
Organisms grow exponentially until acted on by some outside force.
In accordance with this basic law, we select exponential regression to give us the exponential function of best fit. Press F6 to display more options until Exp is displayed over a function key, then press that key to select Exp, exponential regression. The ExpReg screen appears. Copy the exponential equation of best fit into the Graph Func screen, then return to the ExpReg screen. Select DRAW to view the exponential function overlaid on the data set.
Comment: The simplicity with which one can enter a data set, view a plot, select a function for regression, and display the graph of the function superimposed on the data set is a major asset of the Casio CFX-9850.
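One common way to compute an exponential function of best fit is to take logarithms, which turns the problem into a linear regression. The sketch below assumes the model form y = a*e^(bx) and a least-squares fit on the pairs (x, ln y); the calculator's internal method may differ. The data are synthetic values generated from a known exponential purely to check the code. They are not Carlson's yeast measurements.

```python
import math


def exponential_fit(xs, ys):
    """Fit y = a*exp(b*x) by least squares on (x, ln y); requires every y > 0."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (v - mean_log) for x, v in zip(xs, logs))
    b = sxl / sxx                          # slope of the line ln y = ln a + b*x
    a = math.exp(mean_log - b * mean_x)    # intercept ln a, mapped back to a
    return a, b


# Synthetic check data from y = 10*exp(0.5*x); hypothetical values only.
hours = [0, 1, 2, 3, 4]
biomass = [10 * math.exp(0.5 * t) for t in hours]

a, b = exponential_fit(hours, biomass)
print(f"y = {a:.2f} * e^({b:.2f}x)")
```

Because the fit is carried out on ln y, it minimizes squared errors in the logarithms rather than in the original population values, so small populations carry relatively more weight than they would in a direct fit.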
For illustration, we also conduct a quadratic regression, which produces a quadratic function of best fit for the data set. Return to the spreadsheet, then return to the spreadsheet's main menu. Press F1 to select GRPH, then press F1 again to select GPH1. The Casio displays a plot of the data set.
Select X^2, the Casio's symbol here for a quadratic function. The parameters (values of the coefficients) a, b, c of the quadratic function of best fit are displayed in a QuadReg screen. Here, to two decimal places, the quadratic of best fit is y = 6.10x^2 - 9.28x + 16.43.
Select DRAW. The quadratic function of best fit is now overlaid on a plot of the data set.
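A quadratic of best fit can be found by solving the three least-squares normal equations for a, b, and c. The following is a minimal sketch, assuming an ordinary least-squares fit of y = ax^2 + bx + c; the helper names `solve_linear_system` and `quadratic_fit` are hypothetical, and the check data are synthetic points on a known parabola rather than the yeast data.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's lists are not modified.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def quadratic_fit(xs, ys):
    """Return (a, b, c) for the least-squares fit of y = a*x^2 + b*x + c."""
    # Power sums s[k] = sum(x^k) and moments t[k] = sum(x^k * y).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    normal_matrix = [
        [s[4], s[3], s[2]],
        [s[3], s[2], s[1]],
        [s[2], s[1], s[0]],
    ]
    a, b, c = solve_linear_system(normal_matrix, [t[2], t[1], t[0]])
    return a, b, c


# Synthetic check data lying exactly on y = 2x^2 - 3x + 1; hypothetical values.
xs = [0, 1, 2, 3, 4, 5]
ys = [2 * x ** 2 - 3 * x + 1 for x in xs]

a, b, c = quadratic_fit(xs, ys)
print(f"y = {a:.2f}x^2 + {b:.2f}x + {c:.2f}")
```

Since the check points lie exactly on a parabola, the fit should recover its coefficients up to rounding error, which makes the sketch easy to verify before trying it on measured data.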
One models a cyclic phenomenon using the sine function. The following data set represents the average monthly temperature (°F) for Eureka, California.
To create a sine function to model the data, we must convert the months to numbers in order to establish the domain of the function. We set January = 0, February = 1, March = 2, ..., December = 11.
Now access the STAT menu and clear any old data from the spreadsheet. Enter 0, 1, 2, ..., 11 in List 1. Then enter the temperatures in order in List 2.
After the data are entered and checked, we are ready to plot them. Press F6 to select more menu options, then press F1 to select GRPH. Press F1 to select GPH1. A plot of the data set appears.
We are now ready for the regression. Press F6 to display more options until Sin appears over a function key, then press that key to select Sin. The Casio displays the SinReg information for the sinusoidal curve y = a sin(bx + c) + d. Here, to two decimal places, the sine function of best fit is y = 4.85 sin(0.56x - 2.19) + 52.87.
Select DRAW to display the graph of the sine function of best fit superimposed on the data set.
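The four SinReg coefficients each have a direct interpretation: a is the amplitude, 2*pi/b is the period, and d is the midline. The short Python sketch below works these out from the rounded coefficients reported above; the function name `model` and the variable names are illustrative. It does not perform the regression itself.

```python
import math

# Coefficients reported above for y = a*sin(b*x + c) + d (two decimal places).
a, b, c, d = 4.85, 0.56, -2.19, 52.87


def model(x):
    """Evaluate the fitted sinusoid at month index x (January = 0)."""
    return a * math.sin(b * x + c) + d


period = 2 * math.pi / b   # months per full cycle of the model
warmest = d + a            # maximum value the model can take
coolest = d - a            # minimum value the model can take

print(f"period  = {period:.2f} months")
print(f"range   = {coolest:.2f} to {warmest:.2f} degrees F")
print(f"January = {model(0):.2f} degrees F")
```

The period works out to about 11.2 months rather than exactly 12, because the regression estimates b from the data instead of fixing the cycle to the calendar year.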
The Casio has a rich suite of functions for fitting to data. However, the calculator cannot choose for you which function to use. That is why intelligent use of a sophisticated graphing calculator like the Casio must incorporate an understanding of the basic mathematics and science at work in a regression. At the same time, because of the improved technology brought by the graphing calculator, one can explore the mathematics and science at a deeper level of learning. While the Casio facilitates the calculations, it also provides the user with a tool for greater understanding and appreciation of the problem being worked.
Charles M. Biles, Ph.D.
Department of Mathematics
Humboldt State University
Arcata, CA 95521-8299
I extend my appreciation to Casio, Inc., for its professor assistance program. You can visit the Casio web site at http://www.casio.com.