Wednesday, November 14, 2007

Chapter 5 -- Producing Data

Be sure to check the comments for this post. Your fellow classmates ask the best questions! Have you taken the online quiz? Have you worked through a study guide? Have you prepared for Thursday's test? Have you finished the reading for Tuesday? Why would I ask you to read that book in the middle of Chapter 5???? [There must be a reason.] Why would you use simulation instead of actually testing the real thing in an experiment? What are the three essential principles of good experimental design? Why is each one important? What does bias have to do with all of this? What IS bias? What IS confounding? When do you block? What is the difference between stratifying and blocking?


The two sweet diagrams of experimental design are on page 272 (completely randomized design) and page 280 (block design).

Blocking is a form of control. When many of your experimental units share some pre-existing condition that may make their responses to the treatment vary tremendously WITHIN the treatment groups, you will have a hard time telling the results of the treatment groups apart. You would really prefer the differences in results BETWEEN groups to be big enough that you can make a decision about your comparison. To reduce this variability, you may choose to BLOCK by the nuisance variable (the pre-existing condition). Then you RANDOMLY allocate the experimental units in each block to the different treatments. If there are two treatments, then each block is randomly broken into two treatment groups. You proceed by running the experiment on each of the blocks individually.
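Here is a minimal Python sketch of that idea: split the units into blocks first, then randomize WITHIN each block. The function name `block_and_randomize`, the made-up student list, and the teacher labels are all assumptions for illustration, not anything from the textbook.

```python
import random

def block_and_randomize(units, block_of, treatments, seed=None):
    """Group units into blocks, then randomly split each block among the treatments."""
    rng = random.Random(seed)
    blocks = {}
    for unit in units:
        # The blocking variable is a pre-existing condition, so this step is NOT random.
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for label, members in blocks.items():
        shuffled = members[:]
        rng.shuffle(shuffled)  # the randomization happens inside each block
        # Deal the shuffled units out in turn so group sizes differ by at most one.
        for i, unit in enumerate(shuffled):
            assignment[unit] = (label, treatments[i % len(treatments)])
    return assignment

# Hypothetical data: 12 students, blocked by a made-up teacher label.
students = [(f"student{i}", "Teacher A" if i < 6 else "Teacher B") for i in range(12)]
plan = block_and_randomize(students, block_of=lambda s: s[1],
                           treatments=["T1", "T2"], seed=1)
for unit, (block, treatment) in sorted(plan.items()):
    print(unit[0], block, treatment)
```

Every block ends up with its own copy of each treatment group, which is what the block design diagram shows.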







Thursday and Friday (11/8-11/9) Finish writing up the experiment described below, using all the concepts of section 5.2. ALSO, answer the free response (FR) problems from 2001 and 2003 handed out in class. You definitely NEED to read the section of the book. These are NOT opinion problems.

The Chapter 5 test is Thursday, November 15. Freakonomics must be read by 11/13.


Wednesday night's (11/7) HW: 5.65 PLUS design an experiment to answer this question (at least 6-7 sentences!!).

Does the choice of presentation technology make a difference in student achievement in a geometry class?

Conditions: Geometry classes at Lassiter
four teachers teach geometry
some teachers have students write HW on the board
some teachers have students write HW answers on the overhead projector
some teachers put their official answer transparencies on the O/H.

Two document cameras are available to use (Google document camera if you haven't seen one!)

Students are already assigned to the classes.

How could we design this experiment to answer the question? What questions or clarifications do you have? Bring at least seven complete sentences of helpful guidelines for performing this study.



Friday night's HW: 5.63 and 5.64. Complete most of your Freakonomics assignment this weekend. When you get to the part where the authors belabor their unique name theory, you can consider your assignment completed. What was your favorite part? What connections did the authors make that you agree with? That you don't agree with?

Thursday night's HW: 5.60 and 5.61
Wednesday night's HW: 5.54, 5.55, 5.56 Be safe.

Tuesday night's HW: Complete both of the problems from the 2001 exam.

Example of using the TORD to simulate a bag of M&Ms with the OLD color distribution:

Old Distribution:
Brown 30%
Red 20%
Yellow 20%
Green 10%
Blue 10%
Orange 10%

Let's try this two ways. First, let's use two-digit numbers to simulate candies according to the following schedule.
01-30 Brown
31-50 Red
51-70 Yellow
71-80 Green
81-90 Blue
91-00 Orange (00 counts as 100)

There are no excluded numbers. If we draw the same number twice, use it again!

Using the following line from a table of random digits, simulate drawing 5 candies.

63996 32914

63>>>>Yellow
99>>>>Orange
63>>>>Yellow
29>>>>Brown
14>>>>Brown
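
If you want to check your hand work, the two-digit scheme can be sketched in Python. This is only an illustration; the names `TWO_DIGIT_CUTOFFS`, `color_for_pair`, and `simulate_two_digit` are made up for this sketch.

```python
# Upper end of each two-digit range, taken from the schedule above (00 counts as 100).
TWO_DIGIT_CUTOFFS = [(30, "Brown"), (50, "Red"), (70, "Yellow"),
                     (80, "Green"), (90, "Blue"), (100, "Orange")]

def color_for_pair(pair):
    """Translate one two-digit string such as '63' into a candy color."""
    n = int(pair)
    if n == 0:
        n = 100  # 00 is the last number in the 91-00 range
    for upper, color in TWO_DIGIT_CUTOFFS:
        if n <= upper:
            return color

def simulate_two_digit(tord_line, n_candies):
    """Read digits left to right, two at a time, ignoring the spaces in the table."""
    digits = tord_line.replace(" ", "")
    pairs = [digits[i:i + 2] for i in range(0, 2 * n_candies, 2)]
    return [(pair, color_for_pair(pair)) for pair in pairs]

for pair, color in simulate_two_digit("63996 32914", 5):
    print(pair, color)
```

No numbers are excluded and repeats are kept, so every pair read from the table becomes one candy.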

The second way requires only one digit. Let 1-3 represent Brown, 4-5 for Red, 6-7 for Yellow, 8 for Green, 9 for Blue, and 0 for Orange.

21833 70905
Using the first five digits of the TORD line above, you would get
2 Brown
1 Brown
8 Green
3 Brown
3 Brown
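
The one-digit scheme is even shorter as a sketch. Again, the names `ONE_DIGIT_COLORS` and `simulate_one_digit` are assumptions made up for this example.

```python
# One digit per candy: 1-3 Brown, 4-5 Red, 6-7 Yellow, 8 Green, 9 Blue, 0 Orange.
ONE_DIGIT_COLORS = {
    "1": "Brown", "2": "Brown", "3": "Brown",
    "4": "Red", "5": "Red",
    "6": "Yellow", "7": "Yellow",
    "8": "Green", "9": "Blue", "0": "Orange",
}

def simulate_one_digit(tord_line, n_candies):
    """Read the first n_candies digits of the line and translate each one."""
    digits = tord_line.replace(" ", "")[:n_candies]
    return [(d, ONE_DIGIT_COLORS[d]) for d in digits]

for digit, color in simulate_one_digit("21833 70905", 5):
    print(digit, color)
```

Each color gets the same share of the digits as it had of the two-digit numbers (3 out of 10 is the same as 30 out of 100), so both schemes model the same distribution.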

Link to interesting site about the Dewey-Truman polling error. Did you know who the third party candidate was who threw the wrench into the process? Strom Thurmond. Your parents will be impressed that you know this.

http://www.hannibal.net/stories/101998/Pollstersrecall.html

Interesting historical link about Tukey. Scroll to the middle to see his influence in predicting outcomes of elections.

http://www.amstat.org/about/statisticians/index.cfm?fuseaction=biosinfo&BioID=14

The two books I assigned for November are Freakonomics and State of Fear. Freakonomics discusses a lot of associations/correlations that promote critical thinking. State of Fear makes you enlightened consumers of research (even though it IS fiction). Many parents have probably already read one or both of these books. Last year's students (generally) loved them.


Wednesday, October 24
Take notes on the first section of the new chapter, especially new vocabulary.

For Monday and Tuesday of next week:

5.1-5.5, 5.8, 5.11, 5.17-5.18, 5.22, 5.23

Key concepts covered in class (alliteration, anyone??) today included

undercoverage
non-response bias
response bias
convenience sampling
voluntary response sample

and examples like the C-SPAN and American Idol calls, surveying the people sitting around you, the Dewey Defeats Truman mistake, answering with un-truths, failure to respond to surveys.

Can you match the concept to the example? Can you think of another example of each concept in action? Why does each of these result in data we cannot rely on?

10 comments:

Ileana said...

Okay, so I made this basic outline of questions that answer each of the 6 steps (in our little world of Lassiter) in probability simulations, and I just want to make sure I have it straight. It's very rough, but here goes:

1. What outcome does each number represent?
2. How many digits from Table B do I take at a time?
3. Do I exclude any numbers? (If so, which?)
4. What do I do with the valid numbers?
5. Are duplicates allowed?
6. How many times do I repeat the simulation?

Is that good?

Also... CiCi's this Sunday? (despite the long weekend and thus people's inclination to be on vacation?)

Mrs.L said...

That's better than good--it's wonderful!

Yes to CiCi's!

Katie said...

When are we supposed to have Freakonomics finished by?
Katie - 3rd

Mrs.L said...

The big discussion day for Freakonomics will be November 13th. Please don't wait until the last minute to find a copy of this book. (Check the local libraries, used booksellers, and your friends' bookshelves. Someone you know probably has a copy you can borrow.)

rossrip said...

I still don't understand what the treatments and the factors are in the experimental design: are the technologies the treatments, and are the factors both the technologies and the teachers? And what about the levels?????

Mrs.L said...

Good question! There are two diagrams in your book that will help.

The technology is definitely a factor, and the different types of technology are the levels of that factor. These could be thought of as dosages of the factor.

In our case, because we cannot randomly assign our students to teachers, the teacher is not an experimental factor.

Therefore, the three treatments are the three TECHNOLOGIES.

IF, HOWEVER, we added another factor that had two levels, maybe kids who presented their homework and those who didn't (randomly selected), then the factor PRESENTING has two levels, and there would be six different groups that a kid could randomly be assigned to:

Tech 1 + presenting
Tech 2 + presenting
Tech 3 + presenting
Tech 1 + not presenting
Tech 2 + not presenting
Tech 3 + not presenting

for a total of six treatments.
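
One way to see where the six comes from: every level of one factor is paired with every level of the other. A small Python sketch, using the placeholder labels from the list above:

```python
from itertools import product

technologies = ["Tech 1", "Tech 2", "Tech 3"]   # three levels of the TECHNOLOGY factor
presenting = ["presenting", "not presenting"]   # two levels of the PRESENTING factor

# Each treatment is one combination of a level from each factor: 3 x 2 = 6.
treatments = [f"{tech} + {p}" for p, tech in product(presenting, technologies)]

for treatment in treatments:
    print(treatment)
print(len(treatments), "treatments")
```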

About the teachers. . .
This is actually the characteristic that you would BLOCK on because (1) the teaching style would have an effect on your test results, and (2) students cannot be randomly assigned to teachers. Look in the text for the section on blocking. Notice that before the randomization occurs in the diagram, the participants are divided up based on a characteristic (gender) that is NOT random. After they are divided into blocks, each block is divided up randomly into treatment groups. There are only three treatments. Each block has as many groups as there are treatments. One treatment is applied to each group.

Important point: The participants WITHIN a block are supposed to be alike. They are supposed to be different from the participants in other blocks.

Another point: Blocking is not always required. If you have a completely randomized design, you do not have blocking. If you have blocking, you do not have a completely randomized design.

Unknown said...

Mrs. Linner, my dad is mad that I haven't brought home my progress report yet. I was wondering if you could possibly pass them out at school tomorrow. That would help me out a lot!!

Mrs.L said...

I'm sorry! We got so busy with statistics both Monday and Wednesday that I completely forgot!

If I don't remember, please remind me. Thank you, Jonathan.

Mrs.L said...

Question was raised. . .
Is a probability sample that example with the two dice and stuff?

A probability sample is chosen by chance, like a simple random sample, BUT the outcomes do not all have to have the same likelihood of being selected. Instead, as in the case of the M&Ms, each outcome has some other fixed, known likelihood. You assign numbers from the table of random digits (TORD) so that each outcome's share of the numbers matches its probability in the distribution. As an example, if you want the probability of drawing a defective ball bearing out of a box to be 5%, then you could use numbers 00-04 to represent BAD ball bearings and 05-99 to represent good ones. Simulate the drawing of n ball bearings. . .
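
A rough Python sketch of the ball bearing example. Since no line of the table is given here, this sketch lets Python's `random` module stand in for the TORD; that substitution, the seed, and the names `classify_bearing` and `simulate_bearings` are all assumptions for illustration.

```python
import random

def classify_bearing(pair):
    """00-04 (5 of the 100 two-digit numbers) are BAD; 05-99 are good."""
    return "BAD" if int(pair) <= 4 else "good"

def simulate_bearings(n, seed=None):
    """Draw n two-digit numbers and classify each one."""
    rng = random.Random(seed)
    # randrange(100) gives 0..99; format as two digits so 3 reads as '03'.
    pairs = [f"{rng.randrange(100):02d}" for _ in range(n)]
    return [(pair, classify_bearing(pair)) for pair in pairs]

draws = simulate_bearings(20, seed=2007)
for pair, kind in draws:
    print(pair, kind)
print(sum(kind == "BAD" for _, kind in draws), "bad out of", len(draws))
```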

You actually did the mechanics of this a lot when you wrote out the 6 simulation steps.

Mrs.L said...

Could you use a probability sample concept with the rolling of two dice? YES.

The probability of rolling a sum of 2 is 1/36,
3 is 2/36,
4 is 3/36, etc.

So you could use 01 = rolling a sum of 2
02, 03 = rolling a sum of 3
04, 05, 06 = rolling a sum of 4
etc.

Ignore the numbers 00 and those > 36.
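
To fill in the "etc.", here is a sketch that builds the whole assignment of 01-36 to the sums 2 through 12. The names `build_dice_table` and `sum_for` are made up for this example.

```python
def build_dice_table():
    """Give each sum as many of the numbers 01-36 as it has ways to be rolled."""
    table = {}
    next_number = 1
    for total in range(2, 13):
        ways = 6 - abs(total - 7)  # 1 way for a sum of 2, up to 6 ways for 7, back down to 1 for 12
        for _ in range(ways):
            table[next_number] = total
            next_number += 1
    return table

DICE_TABLE = build_dice_table()

def sum_for(pair):
    """Translate a two-digit string; 00 and anything above 36 is skipped (None)."""
    return DICE_TABLE.get(int(pair))

for pair in ["01", "03", "06", "21", "36", "00", "63"]:
    print(pair, sum_for(pair))
```

A result of None means "ignore this number and read the next pair from the table."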

* * * * * * * * * * * * * * * * * *
Six "steps" are described by Ileana in an earlier post quite beautifully.