We couldn't possibly cover all of the first semester's content in two days of review, so let's point out a few topics that no one has asked about:
Correlation coefficient r and coefficient of determination r-squared. What special insight does the value of r-squared give you about the relationship between x and y?
Why can't we just take the square root of r-squared to get the value of r?
Slope of an LSRL = the estimated increase (or decrease) in the response variable for every one-unit increase in the explanatory variable.
The easiest way to get it is r*sy/sx where sy is the sample standard deviation of y and sx is the sample standard deviation of x.
While we're talking about sample standard deviations. . . the formula is SQRT(variance of the variable), so the sample standard deviation of x would be
SQRT[(sum of (Xi - Xbar)^2 for all values of X)/(n-1)], where n is the sample size.
Don't panic if you can't read that--just look up the formula in the text.
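If the formulas are easier to read as code, here is a minimal sketch of the sample standard deviation, r, r-squared, and the slope r*sy/sx. The data values and the function names (`sample_sd`, `correlation`, `lsrl_slope`) are made up for illustration; they are not from the text.

```python
import math

def sample_sd(values):
    """Sample standard deviation: sqrt of sum of squared deviations over (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def correlation(xs, ys):
    """Correlation r: the average product of standardized x and y, dividing by (n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_x = sample_sd(xs)
    s_y = sample_sd(ys)
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

def lsrl_slope(xs, ys):
    """Slope of the least-squares regression line, computed as r * sy / sx."""
    return correlation(xs, ys) * sample_sd(ys) / sample_sd(xs)

# Hypothetical data with a downward trend (not from the textbook).
xs = [1, 2, 3, 4, 5]
ys = [10, 8, 7, 4, 1]

r = correlation(xs, ys)
r_squared = r ** 2
slope = lsrl_slope(xs, ys)

print("r =", round(r, 4))
print("r-squared =", round(r_squared, 4))
print("slope =", round(slope, 4))
print("sqrt(r-squared) =", round(math.sqrt(r_squared), 4))
```

With this data r is negative, but the square root of r-squared comes back positive. Squaring throws away the sign, so you need the direction of the scatterplot (or the sign of the slope) to recover r.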
What does it mean to be resistant to outliers? Give examples of measures which are resistant. Give examples of some which are not.
What are the benefits of different types of graphs (box and whisker, stem and leaf, histogram)?
How do you know if a set of data is approximately normally distributed? Look it up.
Why do we block?
Why do we experiment?
What makes an experiment special?
What are the characteristics of a well-designed experiment?
Why do people sometimes need double-blind experiments?
What is the placebo effect?
How do you know if two characteristics are independent?
Monday, December 19, 2005
Thursday, December 15, 2005
Things to think about when you should be studying
The icon used for the command SAVE in Microsoft's Office applications is a 3.5" diskette. Now that diskettes are nearly obsolete, when will they change the icon and what will they change it to?
Tuesday, December 13, 2005
Chapter 8 Binomial and Geometric Probabilities
What are the differences between **having two kids and counting x=the number of girls** and **having kids until you get a girl**? What is the random variable x in the second case? What are the means [expected values] of the random variable x for each of these scenarios? What is the standard deviation of x in the first case? How could you simulate each of these scenarios?
How are these distributions similar? How are they different?
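One possible way to simulate the two scenarios, sketched in Python rather than on the calculator. It assumes each child is a girl with probability 0.5, independently of the others; that assumption, the function names, and the seed are mine, not the textbook's.

```python
import random

P_GIRL = 0.5  # assumption: each birth is a girl with probability 0.5, independently

def girls_in_two_kids(rng):
    """Binomial setting: a fixed number of trials (2), count the successes."""
    return sum(1 for _ in range(2) if rng.random() < P_GIRL)

def kids_until_first_girl(rng):
    """Geometric setting: keep going until the first success, count the trials."""
    count = 1
    while rng.random() >= P_GIRL:
        count += 1
    return count

def simulate(trials=100_000, seed=2005):
    """Return the average of x over many repetitions of each scenario."""
    rng = random.Random(seed)
    binomial_draws = [girls_in_two_kids(rng) for _ in range(trials)]
    geometric_draws = [kids_until_first_girl(rng) for _ in range(trials)]
    return sum(binomial_draws) / trials, sum(geometric_draws) / trials

binomial_mean, geometric_mean = simulate()
print("average number of girls in two kids:", binomial_mean)
print("average number of kids until the first girl:", geometric_mean)
```

Compare the simulated averages with the formulas: np for the binomial case and 1/p for the geometric case. The more repetitions you run, the closer they should get.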
Tuesday, November 29, 2005
Chapter 7 - Random Variables
How do you distinguish between a discrete random variable and a continuous random variable?
Compare and contrast probability histograms and density curves.
If X is discretely distributed for the integers {1, 2, 3} and P(X=1) does not equal P(X=3), does the expected value of X have to be an integer? Why or why not? Does the mode have to be an integer? Why or why not? Does the expected value of a distribution have to be a value of x from your distribution (for instance, does the average number of pips on one die rolled have to be 1, 2, 3, 4, 5, or 6)? Does the mode have to be an observed value of x? Why or why not?
How does the Law of Large Numbers relate to the Kid-sino lab on November 18th?
The mean of the sum is the sum of the means.
The variance of the sum is the sum of the variances (if the variables are independent).
The variance of the difference is the SUM of the variances (if the variables are independent).
Why?
The variance of 2X is 4 times the variance of X.
The variance of (X + Y) is the variance of X plus the variance of Y (if the variables are independent).
Why are these different formulas? Or are they?
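If you would like to check these rules before you explain them, here is a small sketch that works out the variances exactly for fair six-sided dice. The dice setup and the helper name `variance` are my own choices for illustration; X and Y are two separate (independent) dice.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # one fair six-sided die

def variance(outcomes):
    """Variance of a random variable whose listed outcomes are equally likely."""
    n = len(outcomes)
    mean = Fraction(sum(outcomes), n)
    return sum((Fraction(v) - mean) ** 2 for v in outcomes) / n

one_die = list(FACES)
pairs = list(product(FACES, FACES))  # all 36 equally likely (X, Y) rolls

var_x = variance(one_die)                          # one die
var_2x = variance([2 * x for x in one_die])        # double the SAME die
var_sum = variance([x + y for x, y in pairs])      # add two independent dice
var_diff = variance([x - y for x, y in pairs])     # subtract two independent dice

print("Var(X)     =", var_x)
print("Var(2X)    =", var_2x)
print("Var(X + Y) =", var_sum)
print("Var(X - Y) =", var_diff)
```

Notice that doubling one die is not the same as adding two independent dice: 2X is X + X, and X is certainly not independent of itself.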
Have a super day.
Thursday, November 17, 2005
Monday, November 07, 2005
Chapter 6 - Probability
Alas, here's your chance to finally learn to like probability. We'll be covering the important stuff and giving you the opportunity to extend your understanding through an optional challenge. The test will be on Thursday, November 17. On Friday, November 18th we will have our annual casino day. We would appreciate adult help on this day, especially from parents who have some experience watching chips pass back to the "house." If you want to design a casino game of chance where you will be the "house" and the students will play against you, see Mrs. L this week.
Please be safe on Tuesday. Good luck to the GHP interviewees. See you all on Wednesday.
Sunday, October 30, 2005
Chapter 5 - Experimental design, sampling, simulation
The test will be Monday, November 8. Be thinking about what experiment or data collection activity you can perform during your lunch on Wednesday at the honor card event.
Thursday, October 13, 2005
The Simpson's Paradox project
You are supposed to work in SMALL groups to pick an example of Simpson's Paradox and present it to the class. You are strongly encouraged to use technology in your presentation, but it is not actually required.
The website of the example we used in class on Thursday is
http://www.cawtech.freeserve.co.uk/simpsons.2.html . An electronic version of the megasearch list is available on Classhomework.com.
Everyone in your group must contribute. Your presentation must include an explanation provided by the weakest member of your group!
Electronic documents can be sent to me at Forensicslime at aol.com or brought in on CD or flash drive. If you have a question about whether it will work on my machine, send it by 5:00 Tuesday evening. I'll get back to you.
Once you pick your example, post your selection here and identify your class period so no other group in your class picks the same example. First come, first served (and no one is fobbed off with a bad example). :)
CiCi you on Sunday!
Thursday, October 06, 2005
On to Chapter 4
You can post your thoughts--linear and nonlinear--here. See you in the morning if you have questions!
Go Braves.
Friday, September 30, 2005
Chapter 14 concepts
You will fully understand the confidence intervals and hypothesis tests after you have mastered Ch 10-Ch 13, but you can follow the patterns and use the tools for inference about the models you generated in Ch 3.
First, the confidence interval for the slope of the regression line is just a band around the estimate you created in Ch 3. With your value of b as the center, you extend the interval t* standard errors (SEb) in each direction: b ± t* times SEb. The value of t* comes from the t-table using the confidence level C and df = n-2. Usually, the value of SEb will be provided for you in the computer or calculator output.
The second tool is the hypothesis test. Look at the Minitab output on page 763. The values of b and SEb are given. The concept is that you divide the difference between the observed slope (b) and the default slope (0) by the standard deviation associated with the slope (SEb, the Std dev of ddays) and call the result the t-value. This is just like the (x-mu)/sigma calculation for z, except we have to use the t-table again. Find the probability (p) that we would get a t-value at least as extreme as the one we got and interpret. Remember how we found out how unlikely a z score of -2.08 was in problem 2.50? The interpretation is like that.
If p is small (usually anything less than .05), reject the null hypothesis that the slope is zero. You then have evidence that the true regression line has a slope not equal to zero. Yippeeeee.
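To see where the numbers in the computer output come from, here is a rough sketch that computes b, SEb, the t-value, and the confidence interval by hand. The data are made up (they are not the degree-days data from page 763), the function name `slope_inference` is hypothetical, and t* is typed in from the t-table (3.182 for 95% confidence with df = 5 - 2 = 3) rather than computed.

```python
import math

def slope_inference(xs, ys, t_star):
    """Return (b, se_b, t_value, confidence_interval) for the LSRL slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    b = sxy / sxx              # slope of the least-squares line
    a = y_bar - b * x_bar      # intercept

    # s = standard deviation of the residuals, using df = n - 2
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))

    se_b = s / math.sqrt(sxx)  # standard error of the slope
    t_value = (b - 0) / se_b   # observed slope minus the default slope of 0
    margin = t_star * se_b
    return b, se_b, t_value, (b - margin, b + margin)

# Hypothetical data, n = 5, so df = 3.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
T_STAR = 3.182  # from the t-table: 95% confidence, df = 3

b, se_b, t_value, (low, high) = slope_inference(xs, ys, T_STAR)
print("b =", round(b, 4))
print("SEb =", round(se_b, 4))
print("t =", round(t_value, 2))
print("95% CI for the slope:", (round(low, 4), round(high, 4)))
```

The p-value still comes from the t-table or your calculator, using the t-value and df = n-2. If the confidence interval does not contain 0, that agrees with rejecting a slope of zero at the matching significance level.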
Sunday, September 25, 2005
Chapter 2 Test Questions
If you still have questions about the Chapter 2 test, please post your questions here. Kindly summarize the problem because different classes had different questions.
If you know the answer to someone else's question, please jump in and answer it!
You guys make me proud. . . every day.
Friday, September 23, 2005
Chapter 3 test is still Thursday
We just won't include the chapter 14 parts. Post your questions here. Do more exercises so you'll have good questions to ask on Wednesday. Please be safe. YKYMFS
I will be at CiCi's on Sunday.
Mrs. L
Saturday, September 10, 2005
Homework Help Hotline- post Q & A here
Post your question or post help for a classmate here. Don't be shy!
Wednesday, September 07, 2005
Flip50
Your assignment was to do problem 2.19 AND to determine what portion of your data fell in the "middle" three columns, then the "middle" 5 columns. Re-run the program a few times and record your new results. Post them to this site as a comment. Do you see a trend?
Friday, September 02, 2005
Assigned reading
Did you understand all of the probability and statistics in your book? Please identify the book you read and the concepts you need to know more about.
Please sign your posting with your first name and the class period. No last names, please!
Moneyball readers: Check this site for another take on your topic.
http://www.nytimes.com/2005/08/28/weekinreview/28leon.html