Linner Statistics: 2006

Friday, December 15, 2006

Reviewing for the 1st Semester Final

Heard in passing today:

Hang up. Log off. Study.

OK, if you log off you can't blog with the APSTAT crowd.

*** What are the most important concepts from each chapter?
*** What are the parts that you still don't understand?

HW: Write down the three most important concepts from each chapter, 1-7. Due Monday.

*** CiCi's on Sunday.
*** Review sessions Monday 8-9PM for 5th and 6th periods, Tuesday 8-9 PM for 1st and 7th periods.

*** I have a meeting Monday morning. I will get to the MU as soon as possible.

*** Yes, we are having classes in 1st and 7th periods on Tuesday!

Post away, my statty friends.

Thursday, December 07, 2006

Chapter 7 Random Variables

This chapter prepares us to work with distributions of random variables and to find their measures of center and spread.

E[X] = the sum of (x * P(x) for all values of x).

The rules for means are straightforward. The expected value of a random variable, E[X], is the mean, commonly called mu. The mean of the sum of random variables is the sum of the means. The mean of the difference of random variables is the difference of the means. Multiplying by a constant scales the mean: E[aX] = a*E[X]. The expected value of a constant is just that constant.

Really complex example: E[aX + bY + c] = a*E[X] + b*E[Y] + c.

When you work with measures of spread you have to be more careful! You cannot add standard deviations. You must work with their squares--the variances.

Var[X] = the sum of ((x-mu)^2 * P(x) for each value of x)

= E[X^2] - (E[X])^2

The Var[aX + b] = a^2 * Var[X]. The constant, b, does not vary, so it contributes NOTHING to the variance.
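Here is a minimal sketch in Python that checks these rules numerically. The distribution is made up for illustration (it is not from the book), and the helper names mean, variance, and transform are hypothetical.

```python
# Hypothetical discrete distribution: value -> probability (made up for illustration).
dist = {0: 0.2, 1: 0.5, 2: 0.3}

def mean(d):
    """E[X] = sum of x * P(x)."""
    return sum(x * p for x, p in d.items())

def variance(d):
    """Var[X] = sum of (x - mu)^2 * P(x)."""
    mu = mean(d)
    return sum((x - mu) ** 2 * p for x, p in d.items())

def transform(d, a, b):
    """Distribution of aX + b: same probabilities, scaled and shifted values (a != 0)."""
    return {a * x + b: p for x, p in d.items()}

a, b = 3, 10
shifted = transform(dist, a, b)

print(mean(dist), variance(dist))
print(mean(shifted), a * mean(dist) + b)            # these two should agree
print(variance(shifted), a ** 2 * variance(dist))   # and so should these
```

The constant b moves every value by the same amount, so it shows up in the mean but drops out of the variance.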

Now, IF X AND Y ARE INDEPENDENT (THAT'S A BIG IF!!!!!!), then Var(X + Y) = Var(X) + Var(Y). If they are NOT independent, then there is some covariance factor which could be increasing or decreasing the variance. The covariance concept is beyond the scope of this course.

That covariance thing is why we can't calculate the variance of the sum of the math and verbal portions of the SAT directly. We know that these scores are not independent.

Examples from class:

X={1, 11}, Y={-4, 20}, X+Y={-3, 7, 21, 31}

Find the variance of each set and look for a pattern.

Here's another:
X={1, 15}, Y={-4, 44}, X+Y={-3, 11, 45, 59}
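If you would rather let a computer do the arithmetic, here is a small Python sketch. It assumes each value in a set is equally likely and that every x is paired with every y, which is how the X+Y sets above were built. The function name all_sums is my own.

```python
from itertools import product
from statistics import pvariance

def all_sums(xs, ys):
    """Every x + y, pairing each x with each y (equally likely, independent)."""
    return [x + y for x, y in product(xs, ys)]

for xs, ys in [([1, 11], [-4, 20]), ([1, 15], [-4, 44])]:
    sums = all_sums(xs, ys)
    # Population variances of X, Y, and X + Y, side by side for comparison.
    print(sorted(sums), pvariance(xs), pvariance(ys), pvariance(sums))
```

Take the square root of each variance as well and see what the three standard deviations remind you of.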

Can you create two sets which, when added together, have a variance of 100?

Monday, November 13, 2006

Chapter 6 Probability

GO TO THE CLASSHOMEWORK SITE TO PICK UP DAILY HOMEWORK ASSIGNMENTS>>>>>>>>>>>>>>>>>>

There's a Homework link on the menu to the right that will take you there!>>>>>>>>>

Essential questions:
Is probability a fixed number or something developed through many, many repetitions?

How can a probability model help us to make decisions?

6.1 Randomness

The chapter starts off with some philosophical and theoretical concepts that you probably didn't consider when you first took probability and developed that deep, underlying appreciation for all things probabilistic.

What is probability? There are two positions on this subject. First, you have the experts who believe that there is an intrinsic probability associated with a random phenomenon. For instance, the actual probability of flipping that quarter in your pocket and getting heads is some fixed number between 0 and 1. All the observations we get from flipping that coin ba-zillions of times will only point us in the direction of the true probability of a success.

There was an important piece of information in that last bit: a probability p must fall between 0 and 1 inclusive. [0<= p <=1]

Then you have the other camp: the experts who claim that all the possible flips of the coin define the probability of success for that coin. Of course we can never observe ALL possible flips of a coin, because every second it is not being flipped is a waste of a flip! This is consistent with the authors' approach to this chapter of the book.

OK, I guess that there would be exceptions. You wait eagerly while the conveyor belt at the U.S. Mint carries a bright shiny quarter to you. It falls off the production line into your hands, you flip it in the air where it glistens and falls, heads up, to the floor of the Mint. A bulldozer appears out of nowhere and smashes the quarter into a mangled silver mess of metal. The probability of getting heads on that coin WAS 100%. That information doesn't do us much good now.

Back to the chapter.

First things first. Just because there are two possible outcomes to a random phenomenon does not mean that you have a 50% chance of a success.

The authors define probability as the proportion of times the outcome would occur in a very long string of repetitions.

Independence means that one trial is not going to influence the outcome of any other trial. If the outcomes are determined by some non-random influence, then it is not a random phenomenon.

But you know that you've seen problems that dealt with events that are not truly random--like whether or not a student took the SAT prep class. The way that this non-random event is turned into a random event is by asking what the likelihood of randomly drawing a student who HAD taken the SAT prep class was. If it's not random, then we don't have a probability distribution.

So, what good is running a computer simulation? You just have to give it the answer--the probability of a success--and in the long run it would tell you that you had that percent of successes! That's if you're lucky. In the shorter run the computer can help us measure how likely or unlikely a particular outcome from a random event would be, given that the true probability of a success was some number p.

6.2 Probability models

Sample space is the list of all possible outcomes. For instance {heads, tails} or {H, T} is the set of all possible outcomes from one flip of our trusty quarter. For two flips the outcomes could be order-based {HH, HT, TH, TT} or summarized {2H, 1H1T, 2T}, depending on what you are trying to count. Note that all the outcomes in the summarized set are NOT equally likely. Likewise, the outcomes from adding together the number of pips from the roll of two dice would give the sample space {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} with probabilities 1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, 1/36 respectively.
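The two-dice probabilities can be rebuilt by listing all 36 equally likely ordered rolls and tallying the sums. This is only a sketch in Python; the variable names are my own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (die1, die2) outcomes, tallied by their sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
dice_dist = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

for total in dice_dist:
    print(total, f"{counts[total]}/36")

print(sum(dice_dist.values()))  # the probabilities should add to 1
```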

When we relate the probabilities to discrete outcomes we create a probability distribution. This is often represented in a table. If the data are continuous, there will be a function that describes the probability density.

Some basic concepts from counting theory and probability:

tree diagram- you can map out what happens at each stage of a multi-step random process. Multiplying probabilities as you move out along the branches of the tree will yield the joint probabilities. The sum of all of the joint probabilities will be 1.

multiplication principle- to find the number of ways joint, independent events can be combined, find the product of the number of ways each step can be performed. For instance, 3 shirts with 5 pants means 15 different combinations if we don't care what goes with what.

replacement- if you put the selected object back into the pool of objects to draw from. If you pick a card, record it, put it back, and draw again, the make-up of the deck does not change from draw to draw. That is WITH replacement. If you kept the card out of the deck, then the remaining cards do not have the same distribution as the original deck. That is WITHOUT replacement. RandInt is with replacement: sorting L2 and L1 by rand numbers in L2 is without replacement.

event- an outcome or a set of outcomes of the random procedure or phenomenon.

We use the expression P(A) to mean the probability that A occurs.

0 <= P(A) <= 1, like we agreed before.

Since the sample space S contains all of the mutually-exclusive, exhaustive possible outcomes of the phenomenon, P(S) = 1.

The probability that an event A does not happen is 1-P(A). There are many symbols for this event, the complement of A. Since they do not translate well to the html format, write them down in class so you will recognize them.

If two events A and B cannot both happen together, they are called disjoint and the probability of at least one of the events happening = P(A or B) = P(A) + P(B). If they actually had some overlap, like drawing a queen and drawing a spade, then you would have to subtract out the probability of the overlap (the Queen of spades!). This more general formula is P(A or B) = P(A) + P(B) - P(both A and B).

You can use a Venn diagram to represent these relationships and make the procedures clearer.
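The queen-or-spade example can also be counted directly. The sketch below assumes a standard 52-card deck and one random draw. It compares a straight count of "queen or spade" with the general addition formula. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event):
    """Probability that one randomly drawn card satisfies `event`."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_queen(card):
    return card[0] == "Q"

def is_spade(card):
    return card[1] == "spades"

# Count the cards that are a queen or a spade (or both) directly...
p_either = prob(lambda c: is_queen(c) or is_spade(c))
# ...and compare with P(A) + P(B) - P(both A and B).
p_formula = prob(is_queen) + prob(is_spade) - prob(lambda c: is_queen(c) and is_spade(c))

print(p_either, p_formula)
```

Without the subtraction, the Queen of spades would be counted twice.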

If all outcomes are equally likely, then the probability of any one happening is 1/(the number of outcomes). This is like our treatment of problem 6 in the book and the activity with the numbers 1-2-3 in class. Each outcome from problem 6, {H H H H}, {H H H T}, {H H T H}, {H T H H}, {T H H H}, {H H T T}, etc., is equally likely. We know that there are 2 X 2 X 2 X 2 = 16 possible outcomes (by the multiplication principle). Then the likelihood of any specific event happening is 1/16. If we know that there are 4 ways to get exactly one tail, we can combine these probabilities to get the probability that any one of those four outcomes happens, P(exactly one tail), = P( {H H H T} or {H H T H} or {H T H H}or {T H H H})=4/16.
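As a check on the counting, this short Python sketch lists every ordered outcome of four flips and picks out the ones with exactly one tail. It assumes a fair coin, so each ordered outcome is equally likely.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=4))                # 2 * 2 * 2 * 2 ordered outcomes
one_tail = [o for o in outcomes if o.count("T") == 1]   # exactly one tail

p_one_tail = Fraction(len(one_tail), len(outcomes))
print(len(outcomes), len(one_tail), p_one_tail)
```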

It is usually wise to write all the probabilities with a common denominator so you can check that the sum is 1.

Independent events revisited: In Chapter 4 we looked at the probability rules surrounding independent events. If two events are independent, then one event happening does not affect the probability that the other happens.

For instance these are NOT independent:

A = being a Lassiter student
and
B = owning a piece of Lassiter spiritwear

If you are a Lassiter student, then you are FAR more likely than other people to own Lassiter apparel. If you do not own Lassiter apparel, then you have a much higher likelihood of going to Kell.

C= being a Lassiter student
and
D= not being a Lassiter student

If one of these is true, then the other CANNOT be true. Therefore, the occurrence of one SEVERELY impacts the likelihood of the other. These are mutually-exclusive. Mutually-exclusive events (each with a nonzero probability) are NEVER independent of each other.

It takes something special to be independent. The two most common ways to prove independence are

(1) to check that the product of the marginal probabilities equals the joint probability. If P(A) * P(B) = P(both A and B), then events A and B are independent. This is really an if-and-only-if statement.

and

(2) to check that the marginal probability equals the conditional probability. If P(A) = P(A|B), then B's happening does not affect A's likelihood and A and B are independent.

Remember: P(A) * P(B) = P(A&B) WHEN A and B are independent.

Also, coins, dice, cards, and the Roulette wheel have no memory. They don't care what their last outcome was: every trial is independent.

6.3 More

The 5 basic rules of probability are recorded on page 341.

Can you answer. . .

what is a union of events?

how do you compute the probability of at least one of some collection of events happening?

how is the addition rule modified when there is overlap between the events?

how is the addition rule modified when there is overlap among three events?

what does conditional probability mean? Can you interpret a Venn diagram to calculate a conditional probability? Can you CONSTRUCT one??????

what is the intersection of two events?

when is the probability of BOTH of two events equal to the product of their respective probabilities? when is it NOT?

can you read tree values with probabilities to calculate marginal, joint, and conditional probabilities? Can you CONSTRUCT one?????

It looks like we need better understanding of INDEPENDENCE.

Problem 6.36 If the probability that the woman in the study was over 65 years old was (.365 + .190) = .555 and the probability that she had the tests done was (.321 + .365) = .686, then the probability that she was over 65 and had the tests done should be .555*.686 if the AGE and TEST status are independent. We multiply these two marginal probabilities together and get .38073. We look in the table and see that the actual probability of being over 65 and having the tests done is .365. Because these are NOT the same, we conclude that there was some connection between the two characteristics, AGE and TEST status.
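The same comparison can be written as a few lines of Python. The numbers are the ones quoted in the paragraph above. The tolerance used to decide "equal" is my own arbitrary choice to allow for rounding in the table, not something from the book.

```python
import math

# Values quoted in the discussion of problem 6.36.
p_over_65 = 0.365 + 0.190   # marginal probability of being over 65
p_tested = 0.321 + 0.365    # marginal probability of having the tests done
p_both = 0.365              # joint probability read from the table

product_of_marginals = p_over_65 * p_tested

# Independent only if the product of the marginals matches the joint probability.
# abs_tol is an assumed allowance for rounding, chosen for illustration.
independent = math.isclose(product_of_marginals, p_both, abs_tol=0.0005)

print(round(product_of_marginals, 5), p_both, independent)
```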

Problem 6.42 d. You have to demonstrate that the two characteristics are not mathematically independent. Find P(widowed), P(65+), P(widowed)*P(65+), and P(widowed AND 65+). If P(widowed)*P(65+) = P(widowed AND 65+), then the two characteristics are independent. You HAVE TO show the numbers. You have to show that they are equal--or not.

To find each piece of the puzzle:
P(widowed) = row total # widowed/total of all women

P(65+) = column total # of 65+/total of all women

P(widowed and 65+) = number from the body of the table where widowed and 65+ intersect/total of all women.

6.44 P(W) = 856/1626, P(W|Pr) = 30/74. Because these are not equal, we know that the characteristics Female and Professional are not independent. We prove it by comparing P(W)*P(Pr) to P(W and Pr). (856/1626)(74/1626) does not equal 30/1626. Therefore, they are not independent.

6.46 P(Male) = 24,457/(24,457+6027), P(F) = (15802+2367)/(24457+6027), P(F|Male) = 15802/24457, P(F|Female) = 2367/6027

"Among those who. . .males are more likely than females . . ."

6.47 P(all three)=5%, P(Coffee only) = 20%, P(coffee and tea only) = 10%, P(tea only) = 5%, P(Tea and cola only) = 5%, P(cola only) = 15%, P(Coffee and cola only) = 20%. So, what is the probability that a randomly-selected adult drinks none of the above?

6.48 P(B|A) = .32 = P(A&B)/P(A) = P(A&B)/.46

Solve for P(A&B).

6.49 P(R|F) = .8 = P(R&F)/P(F) = P(R&F)/.4

Solve for P(R&F)
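Once you have solved 6.48 and 6.49 by hand, a two-line check like this one can confirm the arithmetic. It is only a sketch. The function name joint_from_conditional is made up, and all it does is rearrange the conditional-probability formula used above.

```python
def joint_from_conditional(p_cond, p_given):
    """P(A and B) = P(B | A) * P(A): the conditional-probability formula, rearranged."""
    return p_cond * p_given

print(joint_from_conditional(0.32, 0.46))  # problem 6.48: P(B|A) = .32, P(A) = .46
print(joint_from_conditional(0.8, 0.4))    # problem 6.49: P(R|F) = .8,  P(F) = .4
```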

Interesting site: http://www.paly.net/~sfriedla/apstatistics/

Wednesday, November 01, 2006

Chapter 5 Sampling, experiments, and simulation

Essential questions:

Can the data we collected be generalized to the population?

How can the survey or experiment be designed to accomplish our goals?

How can we confirm our suspicions using simulation?

-----------------------------

Running list of key concepts from class:
Survey
Census
Simple Random Sample (SRS)
Systematic Random Sample
Stratified Random Sample takes samples from all strata
Convenience Sampling
Table of Random Digits

Cluster Sampling takes a sample from a few clusters
Multi-stage sampling is a complex form of cluster sampling
Probability Sample like the computer lottery at LHS
Bias when method favors certain outcome(s)
Undercoverage when systematically omits part of population from inclusion
Non-Response when they refuse to participate
Sampling Frame is the list from which the sample is drawn

Experiments:
observational studies
experiments
experimental units/subjects
treatment
factor
level

control
comparison of several treatments
placebo effect results in bias
reduces the effect of lurking variables (confounding and bias)
could include blocking (not required) *BLOCKING reduces the variability within the group, so effects of the treatments can be more easily recognized.
control group
matched pairs design is smallest block

randomization
matching of characteristics does not work
requires real randomization, not just haphazard guesswork
makes the effect of any uncontrollable lurking variables affect all groups equally, thereby also reducing bias
When the problem asks for the experimental design, it requires that you describe how you will randomly allocate experimental units/subjects to treatment groups. Two key points to remember: you CAN'T randomly assign subjects to blocks, because the characteristic you are blocking for is not random, AND this is not a SRS.

replication
allows you to generalize your data to your population
makes the experiment more sensitive to differences among treatments, instead of just random variation between the groups. The compiled or averaged results from a larger group of subjects should more precisely represent the actual, underlying truths of the relationship than results from smaller numbers of subjects. Of course, there is a cost trade-off.

simulation
use table of random digits or random number generator
CLEARLY identify what specific random outcomes represent, such as
The digits 0-4 represent a vote for Adams, 5 & 6 are a vote for Jefferson, 7-9 will be a vote for Roosevelt. Take one random digit at a time, comparing the result to our mapping above, until we have identified 100 votes and the corresponding candidates.

or . . . in cases where you CAN'T reuse a number . . . "Assign each child a unique number 01-47. Take two digits at a time from the TORD (table of random digits), recording the names of the students as we select their number, throwing out any number greater than 47 or those which have already been used."
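Both mappings above can be mimicked in Python, with a random number generator standing in for the table of random digits. This is a rough sketch. The seed, the function names, and the choice to draw 5 children are all assumptions made for illustration.

```python
import random

rng = random.Random(5)  # hypothetical seed, standing in for a starting row in the table

def candidate(digit):
    """Mapping from the text: 0-4 Adams, 5-6 Jefferson, 7-9 Roosevelt."""
    if digit <= 4:
        return "Adams"
    if digit <= 6:
        return "Jefferson"
    return "Roosevelt"

# Digits CAN repeat here: take one digit per vote until we have 100 votes.
votes = [candidate(rng.randrange(10)) for _ in range(100)]

# Numbers CANNOT repeat here: two digits at a time, tossing out 00,
# anything greater than 47, and any number already used.
def draw_children(how_many):
    chosen = []
    while len(chosen) < how_many:
        number = rng.randrange(100)   # a two-digit block, 00-99
        if 1 <= number <= 47 and number not in chosen:
            chosen.append(number)
    return chosen

print({name: votes.count(name) for name in ("Adams", "Jefferson", "Roosevelt")})
print(draw_children(5))
```

Notice that the code has to spell out the same things your written answer does: the mapping, the toss-out rules, and when to stop.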

When a question asks you to describe or explain, there should be a description or explanation in your answer. Just providing a mapping is not sufficient.

When it asks you for the sampling or experimental DESIGN, an explanation of how you are going to select your random units must follow. You must describe how you will assign the digits to the outcomes, how you will take the digits from the TORD, what "toss out" rules you need for duplicates or numbers that have no correspondences, and when you will stop. You have to explain it all. You will need to write.

Some common calculator stuff: Rand(100) selects 100 random numbers between 0 and 1 where repetition is HIGHLY unlikely.

RandInt(5,29,31) selects 31 random integers from the range [5,29] and allows repeats.

SortA(L2,L1) sorts both L2 and L1 in the ascending order of L2.
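For anyone working on a computer instead of the calculator, here is a rough Python analogue of those three commands. The Python calls are standard library functions. The correspondence to the calculator commands is my own reading of the descriptions above, and the names in L1 are made up.

```python
import random

rng = random.Random(29)  # arbitrary seed so the run is repeatable

# Like Rand(100): 100 random decimals between 0 and 1.
decimals = [rng.random() for _ in range(100)]

# Like RandInt(5,29,31): 31 integers from 5 to 29 inclusive, repeats allowed.
integers = [rng.randint(5, 29) for _ in range(31)]

# Like SortA(L2,L1): sort L2 ascending and carry L1 along with it.
L1 = ["Ann", "Bo", "Cy", "Di", "Ed"]   # hypothetical labels
L2 = [rng.random() for _ in L1]        # one random key per label
pairs = sorted(zip(L2, L1))
L2_sorted = [key for key, _ in pairs]
L1_shuffled = [name for _, name in pairs]

print(L1_shuffled)  # a random ordering of L1 with no repeats
```

With only 25 possible integers and 31 draws, the RandInt list must contain repeats. Sorting by random keys only rearranges L1, which is why it acts like sampling without replacement.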

Watch this space for more key words.

Saturday, October 07, 2006

Chapter 4 Nonlinear relationships

Read through the list of goals at the end of the chapter frequently.

In this chapter you will work with bivariate quantitative data and relationships between two categorical variables. For the quantitative part, you will learn to "straighten" x-y data, that is to use a transformation function to create a new relationship between f(x) and g(y) that is approximately linear. Find the least-squares relationship between the transformed data, then find the inverse of the original transformation function to transform the model into a curve which passes through your original x-y data. It's pretty cool to accomplish this and magnificently powerful math.

The second part, the categorical part, covers conditional and marginal probabilities. For instance, break the class into m/f and soph/jun/sen identifiers. Each person falls into exactly one of the gender groups and exactly one of the class year groups. Overall, what is the likelihood that a randomly-selected person is in a particular class? What is the probability that they are a particular gender? If they are a girl, then what is the probability that they are a senior? If they are a senior, what is the probability that they are a guy? Also, if guys do better than girls in 1st period and guys do better than girls in 5th period, how could the combination of the two classes indicate that girls are doing better than boys?
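That last question describes what is often called Simpson's paradox. The sketch below uses invented pass counts (they are not real class data, and "doing better" is taken to mean a higher pass rate) to show one way it can happen: the guys win in each period, yet the girls win when the periods are combined.

```python
# Hypothetical (passed, enrolled) counts, made up to show the effect.
results = {
    "1st period": {"guys": (18, 20), "girls": (68, 80)},
    "5th period": {"guys": (24, 80), "girls": (5, 20)},
}

def rate(passed, enrolled):
    return passed / enrolled

# Pass rate for each group within each period.
for period, groups in results.items():
    print(period, {g: rate(*counts) for g, counts in groups.items()})

# Pass rate for each group with the two periods lumped together.
combined = {}
for g in ("guys", "girls"):
    passed = sum(results[p][g][0] for p in results)
    enrolled = sum(results[p][g][1] for p in results)
    combined[g] = rate(passed, enrolled)

print("combined", combined)
```

The trick is in the enrollments: most of the girls are in the period where everyone scores well, and most of the guys are in the period where everyone struggles.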

Dress appropriately for the weather and for doing activities that involve sitting on the floor this week. See you 10/8 at CiCi's???

Saturday, September 30, 2006

Chapter 3 Linear relationships

Now you've done it all! Can you

identify bivariate data?

graph response and explanatory variables?

differentiate in a scatterplot for a categorical variable?

describe data represented in a scatterplot?

find the least-squares regression line?

compare and contrast the concepts of regression, correlation, association?

explain what the correlation coefficient tells us?

explain what the coefficient of determination tells us?

use a predictor line (LSRL) to predict the value of y for a given x?

use a predictor line to calculate and interpret residuals?

calculate EVERYTHING using the formulas in the text?

explain the vocabulary?

identify the key topics from this chapter?

write good questions for a test?

teach someone else how to work these problems?

squeeze the maximum information from real data using linear methods appropriately?

prove (in writing) that you understand and can apply the concepts of this chapter?

What have I left out?

CU @ CiCi's.

Wednesday, September 20, 2006

Chapter 2 thoughts

Essential question: Why is the Normal distribution so special?

Here's a teaser for you--if you know that 20% of the data in a normally-distributed population fall below the x-value 306 and 80% fall below the value 772, can you find the mean and standard deviation of the population???
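Try it by hand first. When you want to check your work, one possible approach is sketched below in Python: each percentile gives an equation x = mu + z*sigma, and two equations are enough to solve for the two unknowns. The variable names are my own.

```python
from statistics import NormalDist

x_low, x_high = 306, 772              # 20th and 80th percentiles from the teaser
z_low = NormalDist().inv_cdf(0.20)    # z-score with 20% of the area below it
z_high = NormalDist().inv_cdf(0.80)   # z-score with 80% of the area below it

# Two equations, x = mu + z * sigma, solved for the two unknowns.
sigma = (x_high - x_low) / (z_high - z_low)
mu = x_low - z_low * sigma

print(round(mu, 1), round(sigma, 1))
```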

I will be in the classroom as early as possible on Thursday, but I have a parent meeting at 7:45. Please use your other resources well. Don't forget the book! The chapter summaries are great tools. Do the practice quiz online. Dream up questions that I might ask.

Have a slice of pizza for me.

Monday, September 11, 2006

Pre-test discussion

Ask and answer the most pressing questions here. You WILL need to establish an identity to post. Please DO NOT use your first and last names.

Have you taken the practice quiz yet? The link is on the right edge of the main screen. Spread the word to your friends and neighbors.

Good luck.

Thursday, August 31, 2006

Straightening data

Here's the plan we developed in class:

Load your (x,y) data into L1 and L2.
Look at them.

If they are not already straight, figure out if the ideal model would pass through the x-axis, the y-axis, or both. To save some time, try this to straighten your data:

Case 1: if the ideal model would cross the y-axis, take the ln of the observed y values. Case 2: if the model would cross the x-axis, take the ln of the x values, too.

THEN
Case 1: Run the linear regression on the original x and the ln y. Change your stat plot to show the (x, ln y) points with the linear regression equation. If this is a good fit, then the residuals will be scattered. Correct the linear regression equation to reflect that the y-values were really ln y. Solve the fixed equation for y.

Case 2: Run the linear regression on the ln x and the ln y. Change your stat plot to show the (ln x, ln y) points with the linear regression equation. If this is a good fit, then the residuals will be scattered. Correct the linear regression equation to reflect that the y-values were really ln y and the x-values were really ln x. Solve the corrected equation for y.

To check your results, put the new equation for y into the y= register to graph. Change your statplot to show the original data (x,y), probably in L1 and L2. The curve you generated should pass neatly through the data.

Note that this method finds the line which minimizes the sum of the squared residuals from the STRAIGHTENED data, not the squares of the residuals from the curved fit.
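The Case 1 steps can be sketched off the calculator in Python. The data here are invented and generated from a known exponential curve so the result can be checked. The least_squares helper is hand-written for this example, not a calculator command.

```python
import math

def least_squares(xs, ys):
    """Slope and intercept of the least-squares line through (xs, ys)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical data, generated from y = 2 * e^(0.5 x) so the answer is known.
L1 = [0, 1, 2, 3, 4, 5]
L2 = [2 * math.exp(0.5 * x) for x in L1]

# Case 1: regress ln y on x, giving ln y = a + b x.
b, a = least_squares(L1, [math.log(y) for y in L2])

# Undo the transformation by solving for y: y = e^a * e^(b x).
def model(x):
    return math.exp(a) * math.exp(b * x)

print(round(math.exp(a), 4), round(b, 4))
```

For Case 2 the only changes would be to take the ln of the x values as well before the regression, and to solve ln y = a + b ln x for y at the end.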

Oh yeah, go Braves!

Tuesday, August 15, 2006

Welcome to the new school year

How is a permutation different from a combination? How are they similar?

Sunday, July 02, 2006

Scores from the exam and a Must-see-video

I'd love to hear how you did on all of your AP exams and how your summer is going, but PLEASE do not use first and last names on the BLOG. You can send private emails to my school email if you want.

You guys are great!

I hope that you are done with your summer assignments so you can enjoy the last month! :)

You HAVE TO check out this video:
http://video.google.com/videoplay?docid=5243677894327730537

Let me know how you like it.

Mrs. L

Friday, April 28, 2006

AP Central link

The website you will go to is
http://apcentral.collegeboard.com/exam/0,3060,152-0-0-0,00.html

If you have never been there before, you should register as a student so you can look at info about all of your tests. Also, there is a section on tips for students by Daren Starnes in the Stat part that you should read.

C U @ C CCCCCCCCCC.

Thursday, March 23, 2006

Chapter 13 Chi-square tests

There are three different tests in this chapter, but only two distinct methods.

The first method is what you used in class to determine whether your sample was reasonably consistent with the hypothesized proportions by color of Goldfish, Froot Loops, or Smarties. You determine the expected counts by multiplying the hypothesized proportion by the total of objects. The number of degrees of freedom is the number of categories minus one. This was a Chi-square goodness of fit test.
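Here is a small Python sketch of the goodness of fit arithmetic. The color counts and the hypothesized proportions are invented for illustration. The statistic is the usual sum of (observed - expected)^2 / expected over the categories.

```python
# Hypothetical counts of colors in one bag, and made-up hypothesized proportions.
observed = {"red": 28, "orange": 22, "yellow": 30, "green": 20}
hypothesized = {"red": 0.25, "orange": 0.25, "yellow": 0.25, "green": 0.25}

total = sum(observed.values())

# Expected count = hypothesized proportion * total number of objects.
expected = {color: hypothesized[color] * total for color in observed}

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
degrees_of_freedom = len(observed) - 1   # number of categories minus one

print(expected, round(chi_square, 3), degrees_of_freedom)
```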

The next method you will use is the Chi-square test of homogeneity. This is used when you have two populations that you are comparing to see if they have a common distribution by the categorical variable. You base this decision on your sample comparison. Using the same methods, you can perform a Chi-square test of independence. This is used to determine whether a sample described in a two-way table by two different characteristics demonstrates independence between the two variables or if there appears to be a connection. In these cases, you have to multiply the row total by the column total and divide by the table total to get the expected count for each cell. You will use (r-1)*(c-1) for the number of degrees of freedom. This is the number of cells you would have to fill in (if you knew all of the totals) before the rest of the cells' values are determined.
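The expected-count rule for a two-way table can be sketched in Python as well. The table below is hypothetical. The calculation is the same whether you call the test homogeneity or independence.

```python
# Hypothetical two-way table of counts: rows are one variable, columns the other.
table = [
    [20, 30, 10],
    [30, 20, 40],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / table total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(table, expected)
    for obs, exp in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)   # (r-1)*(c-1)

print(expected, round(chi_square, 3), degrees_of_freedom)
```

The expected counts in each row add back up to that row's total, which is a handy check on the arithmetic.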

You have now seen every topic on the Barron's guide and on the AP exam syllabus. We're almost there!

Thursday, March 09, 2006

Chapter 12 Inferences about proportions

BINOMIAL CONNECTION:
The methods of this chapter are based on the binomial distribution.

Let x be the number of successes in n trials.
If the conditions of a binomial setting hold,
then mu sub x = n*p and
sigma sub x = sqrt(n*p*(1-p)).

Now, because p-hat, the estimator for the population proportion, equals x/n,
mu sub p-hat = (mu sub x)/n = p, which means that p-hat is an unbiased estimator of p

and

sigma sub p-hat = (sigma sub x) / n.

Well, if you take that last part, (sigma sub x) / n, and substitute for sigma sub x,
you get sigma sub p-hat = sqrt(np(1-p)) / n

which can be rewritten as sqrt(p(1-p)/n).
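A quick numeric check of that algebra, as a Python sketch. The values of n and p are hypothetical and chosen only for illustration.

```python
import math

n, p = 50, 0.35   # hypothetical values chosen for illustration

mu_x = n * p                            # mean of the count x
sigma_x = math.sqrt(n * p * (1 - p))    # standard deviation of the count x

mu_p_hat = mu_x / n                     # works out to p itself
sigma_p_hat = sigma_x / n               # the count's spread divided by n

shortcut = math.sqrt(p * (1 - p) / n)   # the rewritten form

print(mu_p_hat, sigma_p_hat, shortcut)
```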

WHICH P DO I USE?

If you have a hypothesized p, you use that. For instance, if your previous study or some expert indicated that p = .35, then you use .35 in your hypothesis, the standard deviation for your hypothesis test, and calculations to find the minimum sample size for a margin of error.

You also use this value when checking assumptions np>10 and n(1-p)>10.

If you have only your sample proportion, then you use p-hat to estimate the standard deviation for confidence intervals and for checking conditions for CI: n*p-hat> 10 and n*(1 - p-hat) > 10.

If you have neither, then you must be finding the minimum sample size, so use the most conservative estimate: .5.
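The sample-size calculation can be sketched in Python. This assumes the usual margin of error formula, z* * sqrt(p(1-p)/n), solved for n and rounded up. The function name, the 3% margin, and the 95% confidence level are my own choices for illustration.

```python
import math
from statistics import NormalDist

def min_sample_size(margin, confidence=0.95, p=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star / margin) ** 2 * p * (1 - p))

print(min_sample_size(0.03))           # no estimate of p, so use .5
print(min_sample_size(0.03, p=0.35))   # with a hypothesized p of .35
```

Because p(1-p) is largest at p = .5, using .5 never gives a sample that is too small, which is why it is the conservative choice.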

2 proportion methods:

It helps A LOT to make a table of values as they showed in the book.

For confidence intervals, the methods are just as you imagined. You are developing a confidence interval for the difference between two proportions,
so use p-hat1 - p-hat2.

For the standard deviation, look to the variances. Add the variances of the two samples and take the sqrt. Among the conditions, compare the products n1*p-hat1, n1*(1 - p-hat1), n2*p-hat2, and n2*(1 - p-hat2) to 5. Each product must exceed 5.

For hypothesis tests, there is a nifty twist. Your null hypothesis probably stated that the two proportions were the same. Therefore, their standard deviations should be combined. Take (x1+x2)/(n1+n2) to calculate a new, stronger p-hat which you use for standard deviation calculations and checking conditions.

The standard deviation would be sqrt( p-hat(1-p-hat)/n1 + p-hat(1-p-hat)/n2), but that requires that you enter p-hat too many times. Rewritten, that formula is sqrt( p-hat * (1-p-hat) * (1/n1+1/n2)). It looks nicer in the book. Go there to read all about it.
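To see that the two forms of the formula agree, here is a short Python sketch with made-up counts. It also computes the unpooled standard deviation used for the confidence interval, for contrast. The variable names are hypothetical.

```python
import math

# Hypothetical counts of successes and sample sizes for the two groups.
x1, n1 = 45, 100
x2, n2 = 30, 100

p_hat1, p_hat2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)   # the combined p-hat used under the null hypothesis

# Hypothesis test: both forms of the pooled standard deviation.
se_long = math.sqrt(pooled * (1 - pooled) / n1 + pooled * (1 - pooled) / n2)
se_short = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

# Confidence interval: each sample keeps its own p-hat (no pooling).
se_ci = math.sqrt(p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2)

z = (p_hat1 - p_hat2) / se_short   # test statistic for Ho: p1 = p2

print(round(pooled, 4), round(se_short, 4), round(se_ci, 4), round(z, 3))
```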

There are super examples in this chapter.

Friday, March 03, 2006

Chapter 11

The test is Tuesday, March 7th. I will not be available before school to help. Prepare early!

Sunday. 2-4. You know where.

Saturday, February 25, 2006

Chapter 10 test results

I will return tests to interested students at CiCi's on Sunday or in class on Monday. No, I do not pay students to attend CiCi's.

Sunday, February 05, 2006

SCAD for statistical inference problems

Georgia teens are probably familiar with SCAD-the Savannah College of Art and Design. Remembering this acronym can help you to include all of the parts of an inference problem (and maximize your points!!!!).

First you will address the SET-UP. This is where you write down all of the information that you pulled out of the question. You will define the hypotheses or the type of confidence interval, the statistics, and any parameters disclosed. You need to define your random variable. Keep in mind that mu and p do not vary; they are fixed. The values of x-bar or p-hat that you get from samples will vary. Therefore, you define your random variable x-bar or p-hat IN WORDS and symbols. Continue by identifying x-bar as the average value of ____________ from samples of size ______ or p-hat as the sample proportion of ___________________ with samples of size ______ (inserting the words you used when you defined your random variable and sample size). Define mu and p as the measures for the entire population. You can't use a symbol until you define it.

In hypothesis testing, be sure to use the values of mu or p in your hypotheses, not the statistics. For instance, if you were testing the proportion of students who lurk on the blog instead of writing to it when the experts think the proportion is 80%, but you think it is higher, your hypotheses are Ho: p = .8 and Ha: p > .8. Also, you only have an equals statement in the null hypothesis Ho.

The second portion of the complete answer is the ASSUMPTION or condition check. Yes, I know this is out of order, but at least you'll remember to do it! Most kids lose points by forgetting to do this or doing it poorly. The resource page of assumptions and tests in the appendix of the Barron's study guide provides a great summary of the conditions you need to check. Pay particular attention to things like p or p-hat in the formulas. You have to use the right one to get satisfactory results. Don't just copy the items from the list and put check marks next to them. The readers know that you haven't actually done the check. Identify the reason why you did each test, like testing for 10n < population allows you to use simplified forms of the standard deviation. Know this. Write it down as your result after you plug in the values and test the conditions.

Of course the most satisfying part is the CALCULATION. Tell the reader what calculation you're doing, write out the formula, plug in the values, show how it is calculated, and write the numerical answer. Draw the picture. You can use your calculator to provide probabilities or z-values or t-values the same way you would use the standard normal or student's t distribution tables. Don't use them to magically provide the answer. You won't get any credit.

The final part is the DECISION. This would be the most important part to your employer. State your decision or explanation of the confidence interval in the terms of the problem, connecting your numerical answers, the probabilities involved, and the actual words the author used. This is not time for fancy paraphrasing or concerns about plagiarism. The authors want an answer to their problem--not the answer to some related and colorfully-worded problem. Give them the facts, the probabilities, and a clear decision.

A local professor and excellent AP Statistics tutor, Michael Roty, once told me that his memory hook for answering statistics problems is "What did you do? Why did you do it? What does it mean?" I think that this summarizes the expectations of the authors nicely. The SCAD structure should answer these questions.

Sunday, January 29, 2006

Chapter 10 The beginning of inference

What is the goal of this chapter? How can these methods be used?

On an unrelated note, visit http://www.infoplease.com/p/brainpop/basicprobability.html for a quick review of basic probability.

Wednesday, January 11, 2006

Chapter 9 Sampling distributions

What big ideas have you identified in this chapter?

CiCi's Sunday January 23rd. 2-4