TechSkills 2: Regression
In this TechSkills module, we take a given data set and fit an appropriate function to the data. The formal process of fitting a function to a given data set is called regression. Regression is a formal component of statistics. We start with fitting the simplest type of function, a linear function.
Linear regression is the process of fitting a straight line to a data set. A linear function has the form y = ax + b. Hence, the task of linear regression is to determine the values of a and b that give the straight line that best fits the given data. For example, we consider a data set from Sullivan and Sullivan, Precalculus: Enhanced with Graphing Utilities, Second Edition, page 112, Example 3.
| Fertilizer, X (lbs/100 ft²) | 0 | 0 | 5 | 5 | 10 | 10 | 15 | 15 | 20 | 20 | 25 | 25 |
| Yield, Y (bushels) | 4 | 6 | 10 | 7 | 12 | 10 | 15 | 17 | 18 | 21 | 23 | 22 |
Our first task is to enter the data set into the Casio. Fortunately, the Casio makes this a simple task. Turn the calculator on, then press the key to display the MAIN MENU. Press to select the STAT menu. A spreadsheet is now displayed. Enter the fertilizer amounts in List 1 and the corresponding yields in List 2. Use the thumb pad to navigate from List 1 to List 2.
After the data set has been entered and checked for accuracy, we are ready to see a plot of the data. Press the silver key to access the GRAPH menu. Press to select 1:S-Gph1, statistical graph 1. A graph of the data set appears.
Notice that it looks reasonable to fit a straight line through the displayed data points.
Press to select CALC. Press to select 2:Linear. The screen now displays the values for the linear form y = ax + b for the straight line of best fit. Here, to three decimal places, the linear equation of best fit is y = 0.717x + 4.786.
Press to select COPY to copy the linear regression equation into the Graph Func editor.
Press to return to the LinearReg screen.
Press to select DRAW. The Casio then displays the linear regression line superimposed on the data.
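As a cross-check on the calculator output, the same least-squares line can be computed directly from the fertilizer table. The following is a minimal Python sketch, not the Casio's internal routine; the function name linear_fit and the variable names are illustrative choices. It uses the standard closed-form formulas a = Sxy/Sxx and b = (mean of y) - a(mean of x).

```python
# Fertilizer (lbs/100 ft^2) and yield (bushels) from the table above.
fertilizer = [0, 0, 5, 5, 10, 10, 15, 15, 20, 20, 25, 25]
crop_yield = [4, 6, 10, 7, 12, 10, 15, 17, 18, 21, 23, 22]


def linear_fit(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    a = s_xy / s_xx
    b = y_mean - a * x_mean
    return a, b


a, b = linear_fit(fertilizer, crop_yield)
print(f"y = {a:.3f}x + {b:.3f}")  # prints "y = 0.717x + 4.786"
```

The printed coefficients agree with the LinearReg screen to three decimal places.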
Of course there are other functional relationships besides linear. The Casio supports a handy suite of choices.
To illustrate obtaining a function from data, we utilize a data set for the growth of the brewers' yeast, Saccharomyces cerevisiae, obtained by the biologist Tor Carlson in 1913.
First enter the data set. Press to return to the spreadsheet. To clear the old data, press the silver key to select DEL-A (delete all), then press . Use the thumb pad to move to another column with data and delete the entries by pressing the silver key, then . After the old data are completely cleared, enter the new data set.
After the data are entered and checked, we are ready to view a scatter plot. Press the silver key to select GRAPH. Press or to select the highlighted 1:S-Gph1. A plot of the data set appears.
We are now ready for the regression. Press the silver key to select CALC. The question now is which function to choose for the regression. No technology can tell us what function to select, so some background knowledge of the phenomenon is essential. Here, we want to model the growth of a biological organism. Hence, we appeal to the basic law of biological growth:
Organisms grow exponentially until acted on by some outside force.
Accordingly, we select exponential regression to give us the exponential function of best fit. Use the thumb pad to highlight 8:Exp, then press . Press the silver key to copy the exponential equation of best fit into the Graph Func screen, then press to return to the ExpReg screen. To two decimal places, the exponential function of best fit is y = 10.98e^(0.46x).
Press the silver key to view the exponential function superimposed on the data set. The simplicity with which one can enter a data set, view a plot, select a function for regression, and display the graph of the function superimposed on the data set is a major asset of the Casio Algebra FX2.0.
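One common way to obtain an exponential function of best fit is to take logarithms, since y = ae^(bx) becomes the straight line ln y = ln a + bx, and then apply ordinary linear least squares. The Python sketch below illustrates that approach; whether the Casio computes ExpReg this way internally is an assumption. The function name exponential_fit is illustrative, and the sample points are synthetic values generated from the fitted curve quoted above, not Carlson's measurements.

```python
import math


def exponential_fit(xs, ys):
    """Return (a, b) for y = a*exp(b*x), fitted by least squares on ln(y)."""
    logs = [math.log(y) for y in ys]  # requires every y > 0
    n = len(xs)
    x_mean = sum(xs) / n
    log_mean = sum(logs) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xl = sum((x - x_mean) * (l - log_mean) for x, l in zip(xs, logs))
    b = s_xl / s_xx                      # slope of the line in (x, ln y)
    a = math.exp(log_mean - b * x_mean)  # intercept, transformed back
    return a, b


# Synthetic points generated from the fitted curve quoted in the text;
# these are NOT the original yeast measurements.
hours = list(range(8))
amounts = [10.98 * math.exp(0.46 * t) for t in hours]

a, b = exponential_fit(hours, amounts)
print(f"y = {a:.2f}e^({b:.2f}x)")
```

Because the sample points lie exactly on the curve, the sketch recovers the quoted coefficients; with real measurements the fit minimizes the squared error in ln y rather than in y.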
For illustration, we also conduct a quadratic regression that displays a quadratic function of best fit for the data. Press to return to the spreadsheet. Press the silver key to select GRPH. Press or to select 1:S-Gph1. A plot of the data set is displayed.
Press the silver key to select CALC. Press to select 4:Quad. The parameters (values for the coefficients) a, b, c for the quadratic function of best fit are displayed in the QuadReg screen. Here, to two decimal places, the quadratic function of best fit is y = 6.10x² - 9.28x + 16.43.
Press the silver key to select DRAW. The quadratic equation of best fit is now overlaid on a plot of the data set.
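A quadratic function of best fit can be found by solving the three normal equations of least squares for a, b, and c. The Python sketch below is one way to do this without any libraries; it is an illustration of the technique, not the Casio's own algorithm, and the names solve_linear_system and quadratic_fit are hypothetical. The sample points are synthetic values generated from the quadratic quoted above, not the original yeast data.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix, so row operations act on both sides at once.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def quadratic_fit(xs, ys):
    """Return (a, b, c) for the least-squares parabola y = a*x^2 + b*x + c."""
    # Power sums s[k] = sum(x^k) and moments t[k] = sum(x^k * y).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    normal_matrix = [
        [s[4], s[3], s[2]],
        [s[3], s[2], s[1]],
        [s[2], s[1], s[0]],
    ]
    a, b, c = solve_linear_system(normal_matrix, [t[2], t[1], t[0]])
    return a, b, c


# Synthetic points generated from the quadratic quoted in the text;
# these are NOT the original measurements.
xs = list(range(8))
ys = [6.10 * x ** 2 - 9.28 * x + 16.43 for x in xs]

a, b, c = quadratic_fit(xs, ys)
print(f"y = {a:.2f}x^2 + ({b:.2f})x + {c:.2f}")
```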
One models a cyclic phenomenon using the sine function. The data set at left represents the average monthly temperature (°F) for Eureka, California, obtained from the following web site:
To create a sine function to model the data, we must convert the months to numbers in order to establish the domain of the function. We set January=0, February=1, March=2, ..., December=11. Access the STAT menu and clear any old data from the spreadsheet. Enter 0, 1, 2, ..., 11 in List 1. Then enter the temperatures in order in List 2.
After the data are entered and checked, we are ready to view a scatter plot. Press the silver key to select GRAPH. Press or to select the highlighted 1:S-Gph1. A plot of the data set appears.
We are now ready for the regression. Press the silver key to select CALC. Use the thumb pad to go up to highlight A:Sin, then press . The Casio displays the SinReg information for the sinusoidal curve y = a sin(bx + c) + d. Here, to two decimal places, the sine function of best fit is y = 4.85 sin(0.56x - 2.19) + 52.87.
Press the silver key to display the data set overlaid with a graph of the sine curve of best fit.
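The four SinReg parameters have direct interpretations: a is the amplitude, d is the midline (the average level), 2π/b is the period, and c shifts the curve horizontally. The short Python sketch below evaluates the fitted curve quoted above and derives these quantities; the names sine_model, period, and peak_month are illustrative and not part of the Casio output.

```python
import math

# Coefficients of the fitted curve quoted in the text, y = a*sin(b*x + c) + d.
a, b, c, d = 4.85, 0.56, -2.19, 52.87


def sine_model(x):
    """Modeled temperature (deg F) for month x, where January = 0."""
    return a * math.sin(b * x + c) + d


period = 2 * math.pi / b             # months per full cycle
peak_month = (math.pi / 2 - c) / b   # first x where sin(b*x + c) equals 1

print(f"period      = {period:.2f} months")
print(f"peak at x   = {peak_month:.2f}")
print(f"model range = {d - a:.2f} to {d + a:.2f} deg F")

# Modeled temperature for each month, January (0) through December (11).
for month in range(12):
    print(month, round(sine_model(month), 1))
```

Note that the fitted period is about 11.2 months rather than exactly 12, because the regression treats b as a free parameter instead of fixing it at 2π/12.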
Charles M. Biles, Ph.D.
Department of Mathematics
Humboldt State University
Arcata, CA 95521-8299
Website: http://www.humboldt.edu/~cmb2
I extend my appreciation to Casio, Inc., for its professor assistance program. You can visit the Casio web site at http://www.casio.com.