Saturday, September 30, 2006

Chapter 3 Linear relationships

Now you've done it all! Can you

identify bivariate data?

graph response and explanatory variables?

differentiate points in a scatterplot by a categorical variable?

describe data represented in a scatterplot?

find the least-squares regression line?

compare and contrast the concepts of regression, correlation, and association?

explain what the correlation coefficient tells us?

explain what the coefficient of determination tells us?

use a predictor line (LSRL) to predict the value of y for a given x?

use a predictor line to calculate and interpret residuals?

calculate EVERYTHING using the formulas in the text?

explain the vocabulary?

identify the key topics from this chapter?

write good questions for a test?

teach someone else how to work these problems?

squeeze the maximum information from real data using linear methods appropriately?

prove (in writing) that you understand and can apply the concepts of this chapter?



What have I left out?

CU @ CiCi's.

10 comments:

Mrs.L said...

You have it! There are good examples in the text.

Look at the pictures (graphs).

Mrs.L said...

Seventh period posed an intriguing problem: If you work problem 3.58 using husband's height to predict wife's height you get a different answer. Is this the cause of divorce today? Can we blame it on regression?


We looked at the equations: The regression equation can be rearranged to look like this. . .

the predicted y minus the average y =
(the slope) * (the difference between the actual x and the average x).

Now, that makes sense, because the slope tells you the rate of increase in the response per unit of growth in the explanatory variable, and there is NO difference at the point (x-bar, y-bar), which the line always passes through.

If you solve this for x to get the inverse, you get something that looks a lot like this equation, except instead of r you are using 1/r. That seems wrong, but it would give us the relationship we expected, where x maps to y and y maps to x uniquely.

Our confusion arises because the two models (x regressed on y and y regressed on x) are not actually inverses of each other. Each regression equation is designed to minimize the sum of the squares of the vertical distances between the actual y-values and the predicted y-values. The inverse would then minimize the sum of the squares of the horizontal distances. That's not what we want in a regression equation.

Long story short: It does matter whether you "regress" the ys on the xs or vice versa. I will try to be clear in my wording on the test. The r (correlation coefficient) WILL remain the same, however. It does not change. It is locked in. You don't use the inverse of the r. It is immutable. (Look that one up.)
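Here is a quick sketch of that in Python (the data are made up for illustration). It shows that the two regressions are not inverses of each other, even though r comes out the same both ways:

```python
# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

def lsrl(x, y):
    """Least-squares slope and intercept for y regressed on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    return b, ybar - b * xbar

b_yx, a_yx = lsrl(xs, ys)   # regress y on x: y-hat = a + b*x
b_xy, a_xy = lsrl(ys, xs)   # regress x on y: x-hat = a + b*y

# If the two lines were inverses, b_xy would equal 1/b_yx. It doesn't
# (unless r is exactly 1 or -1); in fact b_yx * b_xy = r-squared.
print(b_yx, b_xy, 1 / b_yx)   # 0.8, 0.8, 1.25
```

Each regression minimizes its own squared distances (vertical for y on x, horizontal for x on y), which is why the slopes don't simply flip.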

One point to Gryffindor. Good work, Team Jones.

Mrs.L said...

Bivariate, hmmmmm. Bi= TWO
variate = variables

(x,y)

Mrs.L said...

Could you create a scatterplot of bivariate data using the TI-83/4 and categorical data?


Hmmm

food for thought.

nkhat said...

Could someone explain the purpose of r^2 in simple terms?

Mrs.L said...

Hey, guys. Class was super tonight!

Let's see. Residuals are the actual y-values minus the expecteds, that is

y - Y1(L1) usually.

A pattern in the residuals means BAD FIT.

R-squared represents the proportion of the variation in y (from the average of the y-values) that we could have predicted.


x-sub-i and y-sub-i are just like x1 and x2, y1 and y2 that you used in algebra to find the slope of a line. The -sub-i part just means "let's start with the first ones, process them, move on to the next ones, continue the process, and keep doing it until we've used all of the values in our sample."

For instance, the SUM of (x-sub-i), divided by n, is the average of the x-values: add up each x-value and divide by n.
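For anyone who wants the arithmetic spelled out, here is a small Python sketch (data made up for illustration) computing the residuals and r-squared by hand:

```python
# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

n = len(xs)
xbar = sum(xs) / n    # SUM of x-sub-i, divided by n
ybar = sum(ys) / n

# Least-squares slope and intercept.
b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
    / sum((x - xbar) ** 2 for x in xs)
a = ybar - b * xbar

yhat = [a + b * x for x in xs]                    # predicted y-values
residuals = [y - yh for y, yh in zip(ys, yhat)]  # actual minus predicted

ss_resid = sum(e ** 2 for e in residuals)
ss_total = sum((y - ybar) ** 2 for y in ys)
r_squared = 1 - ss_resid / ss_total   # proportion of variation predicted
print(r_squared)                      # 0.64
```

Notice the residuals always sum to zero for a least-squares line, and a big r-squared means small residuals relative to the spread of the y-values.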


You guys are doing a great job.

Billy, just use the original y-values (not L1, probably L2) and you're golden. In class we used Y1 instead of typing in the formula for the predictor because we had already stored the formula there so we didn't have to round off.

Mrs.L said...

FPI - yes, the results will be the same, because the values of r, sx, sy, x-bar, and y-bar are all used in finding the LSRL from the points.

CO--because the LSRL minimizes the sum of the squares of the residuals in the y-direction, the residuals are

actual y minus predicted y.


On problem 2 on the quiz, look at the units. r is a DIMENSIONLESS value.
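A sketch of both points in Python (data made up for illustration): the LSRL built from the summary statistics matches the one built from the points, and r has no units because each factor in it is standardized.

```python
import statistics as st

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

n = len(xs)
xbar, ybar = st.mean(xs), st.mean(ys)
sx, sy = st.stdev(xs), st.stdev(ys)

# r from the definition: average product of the standardized values.
# Each (x - xbar)/sx is unit-free, which is why r is dimensionless.
r = sum((x - xbar) / sx * (y - ybar) / sy for x, y in zip(xs, ys)) / (n - 1)

b = r * sy / sx        # slope from the summary statistics
a = ybar - b * xbar    # the LSRL passes through (x-bar, y-bar)
print(r, b, a)         # 0.8, 0.8, 0.6
```

The same five summary numbers determine the whole line, which is why starting from them or from the raw points gives identical results.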

nkhat said...

Could someone explain the purpose of r^2 in simple terms?

Mrs.L said...

Re: r-squared see previous post. It is the proportion of the variation in y that we could have figured out using our formula for y-hat.

RE: 3.55 You know if the response variable is high or low by looking at the graph. If the point is above the regression line, then the response value is high. Otherwise, it is low. To predict, plug the x-value into the formula for y-hat.
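In Python that check might look like this (the line's coefficients and the observed point are made up for illustration):

```python
# Hypothetical LSRL for illustration: y-hat = 0.6 + 0.8x
def y_hat(x):
    return 0.6 + 0.8 * x

x_obs, y_obs = 4, 5.0         # a made-up observed point
predicted = y_hat(x_obs)      # plug the x-value into the formula for y-hat
residual = y_obs - predicted  # actual y minus predicted y
is_high = residual > 0        # above the line means the response is high
print(predicted, residual, is_high)
```

A positive residual puts the point above the line (response is high); a negative one puts it below (response is low).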

Mrs.L said...

That's why all of you Mileses are my favorite students.

Have a great weekend. Enjoy the beautiful fall weather.

Go collect some data.