Wednesday, February 21, 2007

Chapter 11 Inference for Distributions

To understand this chapter you have to understand the processes of Chapter 10.


The t-distribution is a lot like the normal (z) distribution. It is much more forgiving (look for the references in the book to robustness) than the normal and we use it mostly when we have only a sample to work from--no population standard deviation.

The formulas involving t start out a lot like the z formulas.

t-statistic = (x-bar - hypothesized mean)/(sample std dev / sqrt n)

and t-interval boundaries are x-bar +/- t* (sample std dev/sqrt n)

We use n-1 degrees of freedom because we "lost" one when we used x-bar to create the estimator s.

The sample std dev / sqrt n is called the standard error of the mean.
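
If you want to see these formulas in action outside the calculator, here is a rough sketch in Python (the data and the hypothesized mean of 4.0 are made up just to illustrate; scipy supplies the t critical value):

# Minimal sketch of the one-sample t formulas; the data are invented.
import math
from statistics import mean, stdev   # stdev divides by n-1, like Sx on the calculator
from scipy import stats

sample = [4.2, 5.1, 3.8, 4.9, 5.5, 4.4, 5.0, 4.7]   # hypothetical data
mu0 = 4.0                                            # hypothesized mean

n = len(sample)
xbar = mean(sample)
s = stdev(sample)
se = s / math.sqrt(n)                  # standard error of the mean

t_stat = (xbar - mu0) / se             # t-statistic, n-1 degrees of freedom
t_star = stats.t.ppf(0.975, df=n - 1)  # critical value for a 95% interval

lower, upper = xbar - t_star * se, xbar + t_star * se
print(t_stat, (lower, upper))

The same numbers should come out of T-Test and TInterval for the same data.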


The value we use for t*, in fact the line of the table we use when considering probabilities, is based on the number of degrees of freedom (df). You can't use a line with a df = some number if you don't have at least that number of degrees of freedom. It's kind of like buying stuff. If you don't have the money, you can't buy the product. Do you realize what this means??? If you have 990 degrees of freedom and the closest choices in the text are 100 and 1000, you are supposed to select the conservative number, the one you can afford, 100 df. Now, if you can get a closer number from your calculator, use it.

How can you get the value from your calculator? (1) Use the Inv T program or function. TI-84s with system 2.41 have it. If you have an '84, upgrade your system. If you have something else, get the program.
(2) Use the trick we demonstrated in class: Use T-INT with x-bar = 0, sx = sqrt of n, and n = n. The upper bound of the interval you generate is the estimate for t*.
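
If you want to check that trick somewhere besides the calculator, here is a small sketch in Python (scipy's t.ppf plays the role of invT; n = 15 and 95% confidence are just example choices):

# Sketch of t* two ways; n and the confidence level are example values.
import math
from scipy import stats

n = 15
conf = 0.95

# The invT idea: t* is the inverse t CDF evaluated at (1 + conf)/2.
t_star = stats.t.ppf((1 + conf) / 2, df=n - 1)

# Why the TInterval trick works: with x-bar = 0 and sx = sqrt(n),
# the margin of error is t* * sqrt(n)/sqrt(n) = t*, so the interval
# runs from -t* to +t* and its upper bound is exactly t*.
upper_bound = 0 + t_star * (math.sqrt(n) / math.sqrt(n))

print(t_star, upper_bound)   # same number both ways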


Paired t-test

This is a routine t-test that is done on matched-pairs data. When you can load the first data set into L1 and the second into L2 and the following two conditions hold, you are looking at a matched-pairs design. (1) Each row of the data has to be naturally linked, as in data coming from the same person--and a different person from the rest of the rows. The two lists are DEFINITELY NOT independent of each other. (2) The variable of interest is the difference between the two values, like L1 - L2. The null hypothesis is usually mu(of the differences) = 0.

To perform the test, just do the regular t-procedures on the column of differences. DF still equals n-1.
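
Here is a quick sketch of that idea in Python (the before/after numbers are invented; scipy's ttest_rel is the built-in paired version, included only to show it agrees):

# Paired t-test done as a one-sample test on the differences; data are invented.
from scipy import stats

before = [12.1, 11.4, 13.0, 12.8, 11.9, 12.5]   # think of this as L1
after  = [11.6, 11.0, 12.7, 12.9, 11.2, 12.0]   # and this as L2

diffs = [b - a for b, a in zip(before, after)]  # L1 - L2

# One-sample t-test on the differences, Ho: mu(differences) = 0, df = n - 1
t_stat, p_value = stats.ttest_1samp(diffs, popmean=0)

# The built-in paired test gives the same answer
t_paired, p_paired = stats.ttest_rel(before, after)
print(t_stat, p_value, t_paired, p_paired)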

If the two sets of data are two independent samples, that's something different...

Two-sample tests

Note: The t-statistic for the difference between two means IS NOT t-distributed, but it is pretty close under most conditions.

We use two-sample procedures when we are looking at two separate, independent samples and trying to make an inference about the difference between the two population means.

While most of the procedure is intuitive, the standard error and the number of degrees of freedom require a little explanation.

Std Error of the difference of the means:
Do you remember how we can't just add std deviations? And how the variance of the difference of two independent variables is the sum of the variances? Put those two facts together for this problem.

Find the variance of each sample mean--(s/sqrt(n))^2, which is s^2/n. Add the two together. Take the square root. In these formulas, s1 is the sample std dev for the first sample, n1 is the size of the first sample, etc.

Then the std error of the difference = sqrt( (s1^2/n1) + (s2^2/n2) ).
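
As a tiny check, here is that calculation in Python (the summary numbers are made up):

# Standard error of (x-bar1 - x-bar2); the summary statistics are invented.
import math

s1, n1 = 4.8, 25    # sample std dev and sample size, sample 1
s2, n2 = 5.6, 30    # sample std dev and sample size, sample 2

se_diff = math.sqrt(s1**2 / n1 + s2**2 / n2)
print(se_diff)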

Degrees of freedom:
For the number of degrees of freedom, either use the number that the calculator or the computer calculates for you, or use the more conservative choice: the smaller of n1-1 and n2-1.

Hypothesis:
Ho: mu1 = mu2, which is equivalent to Ho: mu1 - mu2 = 0

Other than these little changes, the procedures are similar to those you've already practiced.
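
Here is a rough sketch of the whole two-sample test in Python (the two groups are invented; equal_var=False is the unpooled version, and the last two lines show the conservative "smaller of n1-1 and n2-1" option):

# Two-sample (unpooled) t-test with both df choices; the data are invented.
from scipy import stats

group1 = [23.1, 25.4, 22.8, 26.0, 24.3, 25.1, 23.7]
group2 = [21.0, 22.5, 20.8, 23.1, 21.9, 22.2, 20.4, 21.5]

# Unpooled test; the software computes the fractional df for you.
t_stat, p_value = stats.ttest_ind(group1, group2, equal_var=False)

# Conservative alternative: df = smaller of n1 - 1 and n2 - 1.
df_conservative = min(len(group1), len(group2)) - 1
p_conservative = 2 * stats.t.sf(abs(t_stat), df=df_conservative)

print(t_stat, p_value, p_conservative)   # the conservative p-value is at least as large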



Pooled vs unpooled

This refers to situations when you believe that the variances of the two populations really are equal. In that case, combining the standard deviations from the two samples in a clever way (using a concept similar to our Law of Large Numbers: more data gives a better estimate) creates a single, stronger estimate of that ONE common standard deviation. This is pooling of variances.

Just because the means are (hypothesized to be) the same, we cannot assume that the variances are equal too.

We almost never pool variances of X-bar. You can generally leave your calculator set on UNPOOLED and forget about memorizing the formula. You can only pool variances if you are really sure that the variances are equal.
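
In Python the pooled/unpooled choice is just a flag, so you can see how little it usually matters (re-using invented data like the sketch above):

# Pooled vs unpooled two-sample t-tests; the data are invented.
from scipy import stats

group1 = [23.1, 25.4, 22.8, 26.0, 24.3, 25.1, 23.7]
group2 = [21.0, 22.5, 20.8, 23.1, 21.9, 22.2, 20.4, 21.5]

t_unpooled, p_unpooled = stats.ttest_ind(group1, group2, equal_var=False)  # unpooled: leave it here
t_pooled,   p_pooled   = stats.ttest_ind(group1, group2, equal_var=True)   # pooled

print(p_unpooled, p_pooled)  # close when the sample variances are similar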

4 comments:

Mrs.L said...

Have you tried to perform a two-sample t-test on your calculator?

ihavenoideawhatsgoingon said...

heyyy so how do you do #1 on the practice test?

girlinstat said...

when you wanna find the degrees of freedom on your calc. i know you use T-interval and the things like what ms linner said about x bar=0, etc. but what order do you put it in? my calc. keeps saying argument!

ihavenoideawhatsgoingon said...

how do you do #3 on the online practice test?