Monday, November 13, 2006

Chapter 6 Probability

GO TO THE CLASSHOMEWORK SITE TO PICK UP DAILY HOMEWORK ASSIGNMENTS>>>>>>>>>>>>>>>>>>

There's a Homework link on the menu to the right that will take you there!>>>>>>>>>

Essential questions:
Is probability a fixed number or something developed through many, many repetitions?

How can a probability model help us to make decisions?

6.1 Randomness

The chapter starts off with some philosophical and theoretical concepts that you probably didn't consider when you first took probability and developed that deep, underlying appreciation for all things probabilistic.

What is probability? There are two positions on this subject. First, you have the experts who believe that there is an intrinsic probability associated with a random phenomenon. For instance, the actual probability of flipping that quarter in your pocket and getting heads is some fixed number between 0 and 1. All the observations we get from flipping that coin ba-zillions of times will only point us in the direction of the true probability of a success.

There was an important piece of information in that last bit: a probability p must fall between 0 and 1 inclusive. [0<= p <=1]

Then you have the other camp: the experts who claim that all the possible flips of the coin define the probability of success for that coin. Of course we can never observe ALL possible flips of a coin, because every second it is not being flipped is a waste of a flip! This second, long-run view is consistent with the authors' approach to this chapter of the book.

OK, I guess that there would be exceptions. You wait eagerly while the conveyor belt at the U.S. Mint carries a bright shiny quarter to you. It falls off the production line into your hands, you flip it in the air where it glistens and falls, heads up, to the floor of the Mint. A bulldozer appears out of nowhere and smashes the quarter into a mangled silver mess of metal. The probability of getting heads on that coin WAS 100%. That information doesn't do us much good now.

Back to the chapter.

First things first. Just because there are two possible outcomes to a random phenomenon does not mean that you have a 50% chance of a success.

The authors define probability as the proportion of times the outcome would occur in a very long string of repetitions.

Independence means that one trial is not going to influence the outcome of any other trial. If the outcomes are determined by some non-random influence, then it is not a random phenomenon.

But you know that you've seen problems that dealt with events that are not truly random--like whether or not a student took the SAT prep class. The way that this non-random event is turned into a random event is by asking what the likelihood of randomly drawing a student who HAD taken the SAT prep class was. If it's not random, then we don't have a probability distribution.

So, what good is running a computer simulation? You just have to give it the answer--the probability of a success--and in the long run it would tell you that you had that percent of successes! That's if you're lucky. In the shorter run the computer can help us measure how likely or unlikely a particular outcome from a random event would be, given that the true probability of a success was some number p.
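Here is a minimal sketch of that idea in Python (not something from the book; the function name, the seed, and the choice of p = 0.5 are just assumptions for illustration). We hand the computer the true p and watch how far the observed proportion wanders from it in short runs versus long runs.

```python
import random

def proportion_of_successes(p, n_trials, seed=None):
    """Simulate n_trials independent trials, each a success with
    probability p, and return the observed proportion of successes."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    successes = sum(1 for _ in range(n_trials) if rng.random() < p)
    return successes / n_trials

# Short runs bounce around; long runs should settle near p.
for n in (10, 100, 1_000, 100_000):
    print(n, proportion_of_successes(0.5, n, seed=2006))
```

Changing the seed gives a different string of flips, which is a handy way to see how much a short run can vary.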

6.2 Probability models

Sample space is the list of all possible outcomes. For instance {heads, tails} or {H, T} is the set of all possible outcomes from one flip of our trusty quarter. For two flips the outcomes could be order-based {HH, HT, TH, TT} or summarized {2H, 1H1T, 2T}, depending on what you are trying to count. Note that all the outcomes in the summarized set are NOT equally likely. Likewise, the outcomes from adding together the number of pips from the roll of two dice would give the sample space {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} with probabilities 1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, 1/36 respectively.
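If you want to see where those 36ths come from, a short Python sketch can list all 36 ordered rolls and count how many land on each sum. The function name is made up for this example, and it assumes two fair six-sided dice.

```python
from collections import Counter
from itertools import product

def two_dice_counts():
    """Count how many of the 36 equally likely ordered rolls give each sum."""
    rolls = product(range(1, 7), repeat=2)  # (1,1), (1,2), ..., (6,6)
    return Counter(a + b for a, b in rolls)

counts = two_dice_counts()
for total in sorted(counts):
    # Keep the common denominator of 36 so the column is easy to check.
    print(f"P(sum = {total}) = {counts[total]}/36")
print("total of the counts:", sum(counts.values()))
```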

When we relate the probabilities to discrete outcomes we create a probability distribution. This is often represented in a table. If the data are continuous, there will be a function that describes the probability density.

Some basic concepts from counting theory and probability:

tree diagram- you can map out what happens at each stage of a multi-step random process. Multiplying probabilities as you move out toward the branch of the tree will yield the joint probabilities. The sum of all of the joint probabilities will be 1.

multiplication principle- to find the number of ways joint, independent events can be combined, find the product of the number of ways each step can be performed. For instance, 3 shirts with 5 pants means 15 different combinations if we don't care what goes with what.

replacement- if you put the selected object back into the pool of objects to draw from. If you pick a card, record it, put it back, and draw again, the make-up of the deck does not change from draw to draw. That is WITH replacement. If you kept the card out of the deck, then the remaining cards do not have the same distribution as the original deck. That is WITHOUT replacement. RandInt is with replacement: sorting L2 and L1 by rand numbers in L2 is without replacement.

event- an outcome or a set of outcomes of the random procedure or phenomenon.
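Two of the ideas above, the multiplication principle and drawing with or without replacement, can be sketched in a few lines of Python. The item names, the 52 numbered "cards", and the seed are placeholders for illustration; `random.choices` draws WITH replacement and `random.sample` draws WITHOUT.

```python
import random
from itertools import product

# Multiplication principle: 3 shirts x 5 pants = 15 outfits.
shirts = ["shirt 1", "shirt 2", "shirt 3"]
pants = ["pants 1", "pants 2", "pants 3", "pants 4", "pants 5"]
outfits = list(product(shirts, pants))
print(len(outfits), "outfits")

# Replacement: number the cards 1-52 and draw five of them.
rng = random.Random(2006)  # seed chosen arbitrarily
deck = list(range(1, 53))
with_replacement = rng.choices(deck, k=5)    # a card can come up twice
without_replacement = rng.sample(deck, k=5)  # no card can repeat
print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```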

We use the expression P(A) to mean the probability that A occurs.

0 <= P(A) <= 1, like we agreed before.

Since the sample space S contains all of the mutually-exclusive, exhaustive possible outcomes of the phenomenon, P(S) = 1.

The probability that an event A does not happen is 1-P(A). There are many symbols for this event, the complement of A. Since they do not translate well to the html format, write them down in class so you will recognize them.

If two events A and B cannot both happen together, they are called disjoint and the probability of at least one of the events happening = P(A or B) = P(A) + P(B). If they actually had some overlap, like drawing a queen and drawing a spade, then you would have to subtract out the probability of the overlap (the Queen of spades!). This more general formula is P(A or B) = P(A) + P(B) - P(both A and B).
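The queen-or-spade example can be checked by brute force. This is only a sketch: it builds a standard 52-card deck, counts the cards in each event, and confirms that the general addition rule gives the same answer as counting the "queen or spade" cards directly. The helper names are invented for the example.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely (rank, suit) cards

def prob(event):
    """Probability of an event = (# cards in the event) / 52."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_queen(card):
    return card[0] == "Q"

def is_spade(card):
    return card[1] == "spades"

p_queen = prob(is_queen)                                  # 4/52
p_spade = prob(is_spade)                                  # 13/52
p_both = prob(lambda c: is_queen(c) and is_spade(c))      # 1/52
p_either = prob(lambda c: is_queen(c) or is_spade(c))     # counted directly

# General addition rule: P(A or B) = P(A) + P(B) - P(both A and B)
print(p_either == p_queen + p_spade - p_both)
```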

You can use a Venn diagram to represent these relationships and make the procedures clearer.

If all outcomes are equally likely, then the probability of any one happening is 1/(the number of outcomes). This is like our treatment of problem 6 in the book and the activity with the numbers 1-2-3 in class. Each outcome from problem 6, {H H H H}, {H H H T}, {H H T H}, {H T H H}, {T H H H}, {H H T T}, etc., is equally likely. We know that there are 2 X 2 X 2 X 2 = 16 possible outcomes (by the multiplication principle). Then the likelihood of any specific outcome happening is 1/16. If we know that there are 4 ways to get exactly one tail, we can add these probabilities (the outcomes are disjoint) to get the probability that any one of those four outcomes happens: P(exactly one tail) = P({H H H T} or {H H T H} or {H T H H} or {T H H H}) = 4/16.
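The counting in that four-flip example is easy to confirm with a short Python sketch (assuming a fair coin, so all 16 ordered outcomes are equally likely; the variable names are just for illustration).

```python
from fractions import Fraction
from itertools import product

# Every ordered result of four flips: ('H','H','H','H'), ('H','H','H','T'), ...
outcomes = list(product("HT", repeat=4))

# Keep only the outcomes that contain exactly one tail.
one_tail = [flips for flips in outcomes if flips.count("T") == 1]

p_one_tail = Fraction(len(one_tail), len(outcomes))
print(len(outcomes), "outcomes;", len(one_tail), "have exactly one tail")
print("P(exactly one tail) =", p_one_tail)
```

Note that `Fraction` reduces 4/16 to lowest terms when it prints.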

It is usually wise to write all the probabilities with a common denominator so you can check that the sum is 1.

Independent events revisited: In Chapter 4 we looked at the probability rules surrounding independent events. If two events are independent, then one event happening does not affect the probability that the other happens.

For instance these are NOT independent:

A = being a Lassiter student
and
B = owning a piece of Lassiter spiritwear

If you are a Lassiter student, then you are FAR more likely than other people to own Lassiter apparel. If you do not own Lassiter apparel, then you have a much higher likelihood of going to Kell.

C= being a Lassiter student
and
D= not being a Lassiter student

If one of these is true, then the other CANNOT be true. Therefore, the occurrence of one SEVERELY impacts the likelihood of the other. These are mutually-exclusive. Mutually-exclusive events are NEVER independent of each other (as long as each one has some chance of happening).


It takes something special to be independent. The two most common ways to prove independence are

(1) to check that the product of the marginal probabilities equals the joint probability. If P(A) * P(B) = P(both A and B), then events A and B are independent. This is really an if-and-only-if statement.

and

(2) to check that the marginal probability equals the conditional probability. If P(A) = P(A|B), then B's happening does not affect A's likelihood and A and B are independent.

Remember: P(A) * P(B) = P(A&B) WHEN A and B are independent.
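Both checks can be tried on a small sample space. The sketch below uses two dice and some made-up events (they are not from the book) to show one pair that passes the product test and one pair that fails it. The conditional probability helper assumes P(B) is greater than 0.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely (first, second) rolls

def prob(event):
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

def joint(event_a, event_b):
    """P(both A and B)."""
    return prob(lambda r: event_a(r) and event_b(r))

def conditional(event_a, event_b):
    """P(A|B) = P(A and B) / P(B); assumes P(B) > 0."""
    return joint(event_a, event_b) / prob(event_b)

def independent(event_a, event_b):
    """Check (1): does P(A) * P(B) equal P(both A and B)?"""
    return prob(event_a) * prob(event_b) == joint(event_a, event_b)

# Hypothetical events chosen for illustration.
def first_even(r):
    return r[0] % 2 == 0

def sum_is_7(r):
    return r[0] + r[1] == 7

def first_is_6(r):
    return r[0] == 6

def sum_at_least_10(r):
    return r[0] + r[1] >= 10

print(independent(first_even, sum_is_7))                     # check (1)
print(conditional(first_even, sum_is_7) == prob(first_even)) # check (2)
print(independent(first_is_6, sum_at_least_10))
```

Rolling a 6 first makes a big sum more likely, so the last pair should fail the test.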

Also, coins, dice, cards, and the Roulette wheel have no memory. They don't care what their last outcome was: every trial is independent.

6.3 More

The 5 basic rules of probability are recorded on page 341.

Can you answer. . .

what is a union of events?

how do you compute the probability of at least one of some collection of events happening?

how is the addition rule modified when there is overlap between the events?

how is the addition rule modified when there is overlap among three events?

what does conditional probability mean? Can you interpret a Venn diagram to calculate a conditional probability? Can you CONSTRUCT one??????

what is the intersection of two events?

when is the probability of BOTH of two events equal to the product of their respective probabilities? when is it NOT?

can you read tree values with probabilities to calculate marginal, joint, and conditional probabilities? Can you CONSTRUCT one?????



It looks like we need better understanding of INDEPENDENCE.

Problem 6.36 If the probability that the woman in the study was over 65 years old was (.365 + .190) = .555 and the probability that she had the tests done was (.321 + .365) = .686, then the probability that she was over 65 and had the tests done should be .555*.686 if the AGE and TEST status are independent. We multiply these two marginal probabilities together and get .38073. We look in the table and see that the actual probability of being over 65 and having the tests done is .365. Because these are NOT the same, we conclude that there was some connection between the two characteristics, AGE and TEST status.
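The arithmetic for 6.36 can be laid out step by step. This sketch uses only the table values quoted above; the variable names are my own, and `Decimal` is used so the decimals add and multiply exactly.

```python
from decimal import Decimal

# Values quoted from the table in Problem 6.36.
p_over65_and_tested = Decimal("0.365")
p_over65_not_tested = Decimal("0.190")
p_under65_and_tested = Decimal("0.321")

# Marginal probabilities: add across the row or column.
p_over65 = p_over65_and_tested + p_over65_not_tested   # .555
p_tested = p_under65_and_tested + p_over65_and_tested  # .686

# What the joint probability WOULD be if AGE and TEST were independent.
expected_if_independent = p_over65 * p_tested

print("P(65+) * P(tested) =", expected_if_independent)
print("P(65+ and tested)  =", p_over65_and_tested)
print("independent?", expected_if_independent == p_over65_and_tested)
```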

Problem 6.42 d. You have to demonstrate that the two characteristics are not mathematically independent. Find P(widowed), P(65+), P(widowed)*P(65+), and P(widowed AND 65+). If P(widowed)*P(65+) = P(widowed AND 65+), then the two characteristics are independent. You HAVE TO show the numbers. You have to show that they are equal--or not.

To find each piece of the puzzle:
P(widowed) = row total # widowed/total of all women

P(65+) = column total # of 65+/total of all women

P(widowed and 65+) = number from the body of the table where widowed and 65+ intersect/total of all women.


6.44 P(W) = 856/1626, P(W|Pr) = 30/74. Because these are not equal, we know that the characteristics Female and Professional are not independent. We prove it by comparing P(W)*P(Pr) to P(W and Pr). (856/1626)(74/1626) does not equal 30/1626. Therefore, they are not independent.

6.46 P(Male) = 24457/(24457+6027), P(F) = (15802+2367)/(24457+6027), P(F|Male) = 15802/24457, P(F|Female) = 2367/6027

"Among those who. . .males are more likely than females . . ."

6.47 P(all three)=5%, P(Coffee only) = 20%, P(coffee and tea only) = 10%, P(tea only) = 5%, P(Tea and cola only) = 5%, P(cola only) = 15%, P(Coffee and cola only) = 20%. So, what is the probability that a randomly-selected adult drinks none of the above?


6.48 P(B|A) = .32 = P(A&B)/P(A) = P(A&B)/.46

Solve for P(A&B).

6.49 P(R|F) = .8 = P(R&F)/P(F) = P(R&F)/.4

Solve for P(R&F)



Interesting site: http://www.paly.net/~sfriedla/apstatistics/

Wednesday, November 01, 2006

Chapter 5 Sampling, experiments, and simulation

Essential questions:

Can the data we collected be generalized to the population?

How can the survey or experiment be designed to accomplish our goals?

How can we confirm our suspicions using simulation?

-----------------------------

Running list of key concepts from class:
Survey
Census
Simple Random Sample (SRS)
Systematic Random Sample
Stratified Random Sample takes samples from all strata
Convenience Sampling
Table of Random Digits


Cluster Sampling takes a sample from a few clusters
Multi-stage sampling is a complex form of cluster sampling
Probability Sample like the computer lottery at LHS
Bias when method favors certain outcome(s)
Undercoverage when systematically omits part of population from inclusion
Non-Response when they refuse to participate
Sampling Frame is the list from which the sample is drawn


Experiments:
observational studies
experiments
experimental units/subjects
treatment
factor
level


control
comparison of several treatments
placebo effect results in bias
reduces the effect of lurking variables (confounding and bias)
could include blocking (not required) *BLOCKING reduces the variability within the group, so effects of the treatments can be more easily recognized.
control group
matched pairs design is smallest block

randomization
matching of characteristics does not work
requires real randomization, not just haphazard guesswork
makes the effect of any uncontrollable lurking variables affect all groups equally, thereby also reducing bias
When the problem asks for the experimental design, it requires that you describe how you will randomly allocate experimental units/subjects to treatment groups. Two key points to remember: you CAN'T randomly assign subjects to blocks, because the characteristic you are blocking for is not random, AND this is not a SRS.


replication
allows you to generalize your data to your population
makes the experiment more sensitive to differences among treatments, instead of just random variation between the groups. The compiled or averaged results from a larger group of subjects should more precisely represent the actual, underlying truths of the relationship than results from smaller numbers of subjects. Of course, there is a cost trade-off.


simulation
use table of random digits or random number generator
CLEARLY identify what specific random outcomes represent, such as
The digits 0-4 represent a vote for Adams, 5 & 6 are a vote for Jefferson, 7-9 will be a vote for Roosevelt. Take one random digit at a time, comparing the result to our mapping above, until we have identified 100 votes and the corresponding candidates.

or . . . in cases where you CAN'T reuse a number . . . "Assign each child a unique number 01-47. Take two digits at a time from the TORD (table of random digits), recording the names of the students as we select their number, throwing out any number greater than 47 or those which have already been used."
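The two mappings above can be sketched in Python, with a random number generator standing in for the table of random digits. The function names, the seed, and the number of children to select (5) are assumptions for illustration; the original only says to assign numbers 01-47. Note that the sketch also throws out 00, since no child has that number.

```python
import random
from collections import Counter

def random_digits(rng):
    """Stand-in for reading a table of random digits one digit at a time."""
    while True:
        yield rng.randrange(10)  # a digit 0-9, each equally likely

def simulate_votes(n_votes, rng):
    """0-4 -> Adams, 5-6 -> Jefferson, 7-9 -> Roosevelt; repeats are allowed."""
    digits = random_digits(rng)
    tally = Counter()
    for _ in range(n_votes):
        d = next(digits)
        if d <= 4:
            tally["Adams"] += 1
        elif d <= 6:
            tally["Jefferson"] += 1
        else:
            tally["Roosevelt"] += 1
    return tally

def select_children(n_children, how_many, rng):
    """Take two digits at a time; toss out 00, numbers above n_children,
    and numbers already used. Stop when how_many children are chosen."""
    digits = random_digits(rng)
    chosen = []
    while len(chosen) < how_many:
        number = 10 * next(digits) + next(digits)
        if 1 <= number <= n_children and number not in chosen:
            chosen.append(number)
    return chosen

rng = random.Random(2006)  # seed chosen arbitrarily
print(simulate_votes(100, rng))
print(select_children(47, 5, rng))
```

Notice that the code has to spell out the same things a written design does: the mapping, how digits are taken, the toss-out rules, and the stopping rule.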

When a question asks you to describe or explain, there should be a description or explanation in your answer. Just providing a mapping is not sufficient.


When it asks you for the sampling or experimental DESIGN, an explanation of how you are going to select your random units must follow. You must describe how you will assign the digits to the outcomes, how you will take the digits from the TORD, what "toss out" rules you need for duplicates or numbers that have no correspondences, and when you will stop. You have to explain it all. You will need to write.

Some common calculator stuff: Rand(100) selects 100 random numbers between 0 and 1 where repetition is HIGHLY unlikely.

RandInt(5,29,31) selects 31 random integers from the range [5,29] and allows repeats.

SortA(L2,L1) sorts both L2 and L1 in the ascending order of L2.
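For anyone who wants to try these away from the calculator, here is a rough Python analog of the three commands. It is a sketch, not a description of how the calculator works internally; the list contents, the seed, and the variable names are made up. The last part mirrors the trick of sorting L1 by random numbers in L2 to get a random order without replacement.

```python
import random

rng = random.Random(2006)  # seed chosen arbitrarily

# Like Rand(100): 100 random numbers between 0 and 1.
rand_100 = [rng.random() for _ in range(100)]

# Like RandInt(5,29,31): 31 random integers from 5 to 29 inclusive, repeats allowed.
rand_int = [rng.randint(5, 29) for _ in range(31)]

# Like SortA(L2,L1) with random numbers in L2: L1 gets carried along
# with L2, which puts L1 in a random order with no repeats.
L1 = list(range(1, 11))          # hypothetical ID numbers 1-10
L2 = [rng.random() for _ in L1]  # one random key per ID
pairs = sorted(zip(L2, L1))      # ascending order of L2
L2_sorted = [key for key, _ in pairs]
L1_shuffled = [value for _, value in pairs]

print(rand_int)
print(L1_shuffled)
```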



Watch this space for more key words.