Chapter 6

Besides

Inferential statistics are based upon the idea of a null hypothesis and an alternate hypothesis.

For example, suppose you were arrested on some charge (say murder). In the US criminal justice system, the null hypothesis is that you are innocent of the crime. The state has the burden to show that this null hypothesis is not likely to be true (guilty beyond a reasonable doubt). If the state does that then the jury rejects the null hypothesis and accepts the alternate hypothesis, that is that you are guilty. Thus the state has to show that you are not innocent, in order to reject the null hypothesis. It is similar in statistics. Your statistical test must show that the two items are different.

In the trial, if the null hypothesis is not rejected, your innocence has not been proven. It is just that the state failed to support your guilt. You are never proven innocent in terms of the trial; the state simply failed to show your guilt. The same is true in statistics: if you fail to show that the two items are different, then you fail to reject the null hypothesis, but you have not proven that the null hypothesis is true. It is correct to say that you failed to reject your null hypothesis, but it is incorrect to say that you accepted your null hypothesis. The last statement implies that you have shown the null hypothesis to be correct, and that is not the case. If you reject your null hypothesis, then you must accept your alternate hypothesis that the two items are different. In the trial, if the jury rejected the null hypothesis of innocence, then they must accept the alternate hypothesis that you are guilty and you must be sent to jail.

Philosophically, it is important to understand these concepts associated with the null and alternate hypotheses. After looking at a statistical test, we will revisit these concepts and examine the types of errors that one can make with statistics.

This is the case when you have a sample and you wish to determine if this sample could have come from a population with a known mean.

Suppose that it is known that the mean life span of horses is 22 years and the population standard deviation (sigma) is 3.8 years. Your family has for years been developing a new breed of horse, and based upon a sample of 25 horses, the mean life span (sample mean) is 24.23 years. Does this new breed of horse have a life span that is different from that of horses in general? The null and alternate hypotheses are these.

H0: mu = 22

HA: mu =/ 22

Some symbols are impossible to put into HTML and thus I will use the following for these:

=/ for not equal to

<= for less than or equal to

>= for greater than or equal to

x^2 for x to the power of 2, x squared

mu for the population mean

sigma for the population standard deviation

To test this null hypothesis, we will state it a different way. What is the probability of drawing at random a sample of 25 horses whose mean life span deviates from 22 years by at least as much as 24.23 does? If this probability is low (<= 0.05), then we will reject our null hypothesis. The reason that I said "deviates" rather than "is larger" is that we can reject if Xbar is much smaller than 22 as well as if Xbar is much larger than 22.

The first step is to convert Xbar to a Z score (example 6.6, page 81)

Z = (Xbar - mu)/(sigma/square root of n)

Z = (24.23-22)/(3.8/square root of 25) = 2.23/0.76 = 2.93

Thus, we want the quantity P(Z > 2.93) + P(Z < -2.93). This includes Xbar being more deviant by being larger (Z > 2.93) and by being smaller (Z < -2.93).

P(Z > 2.93) + P(Z < -2.93) = 0.0017 + 0.0017 = 0.0034.
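The hand calculation above can be checked with a few lines of Python. This is only a sketch: the function names `z_score` and `upper_tail` are my own, and the tail area comes from the complementary error function instead of table B2.

```python
import math

def z_score(xbar, mu, sigma, n):
    """Standard normal deviate for a sample mean when sigma is known."""
    return (xbar - mu) / (sigma / math.sqrt(n))

def upper_tail(z):
    """P(Z > z) for the standard normal curve."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Values from the horse example in the text.
z = z_score(xbar=24.23, mu=22, sigma=3.8, n=25)

# Area in the right tail plus the equal area in the left tail (by symmetry).
p_two_tailed = upper_tail(abs(z)) + upper_tail(abs(z))

print(f"Z = {z:.2f}, two-tailed P = {p_two_tailed:.4f}")
```

Because the code does not round Z to two decimals before finding the area, the probability may differ from the table-based 0.0034 in the last digit.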

We used table B2, because we knew the population standard deviation and thus our standard deviate was distributed as a normal curve. The chance of getting an Xbar this deviant from a population whose mean is 22 and whose standard deviation is 3.8 is less than 0.05, thus we reject our null hypothesis that mu of the new breed is equal to 22. It is unlikely (P = 0.0034) that our sample of 25 horses came from a population whose mean is 22.

We are working with the standard normal curve. You might realize that, in general, we would reject any null hypothesis for which the combined area in the two tails beyond our calculated Z score is less than or equal to 0.05. As P(Z > 1.96) + P(Z < -1.96) = 0.025 + 0.025 = 0.05, the value 1.96 becomes a critical value: when we calculate our Z score, if |Z| >= 1.96, then we reject our null hypothesis. For example, if we had drawn a sample of 16 horses and the Xbar was 24, then our Z score would have been 2.11 and again we would have rejected our null hypothesis.
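The critical-value approach can be sketched the same way. Here the critical value is computed with `statistics.NormalDist` from the Python standard library and not read from a table; the helper name `two_tailed_reject` is an assumption of mine.

```python
import math
from statistics import NormalDist

ALPHA = 0.05

# Z value that leaves ALPHA/2 in each tail of the standard normal curve.
z_critical = NormalDist().inv_cdf(1 - ALPHA / 2)

def two_tailed_reject(xbar, mu, sigma, n, z_crit=z_critical):
    """Return the Z score and whether |Z| reaches the critical value."""
    z = (xbar - mu) / (sigma / math.sqrt(n))
    return z, abs(z) >= z_crit

# The 16-horse example from the text.
z16, reject16 = two_tailed_reject(xbar=24, mu=22, sigma=3.8, n=16)
print(f"critical value = {z_critical:.2f}, Z = {z16:.2f}, reject = {reject16}")
```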

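What we did above is called a two-tailed test; that is, we will reject our null hypothesis if Xbar is too large or too small. In a one-tailed test, we reject in one direction only, for example when we want to show that the new breed is longer lived.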

H0: mu <= 22

HA: mu > 22

Note that to show that the new breed of horse is longer lived, we must reject a null hypothesis that says that it is not longer lived. To do this, we need to know what value of Z would satisfy the equation P(Z > ??) = 0.05 (the area to reject is totally within the right tail). If we look at table B2, we can see that this value is 1.65; the true value for Z is between 1.64 and 1.65. Thus, we will reject our null hypothesis if our calculated Z score is >= 1.65. As our calculated Z score for an Xbar of 24.23 years is 2.93, we again reject our null hypothesis.

What if we turned our null hypothesis around? Now we want to show that the new breed comes from a population that has a shorter life span.

H0: mu >= 22

HA: mu < 22

In this case, to show that the new breed of horse has a shorter life span, we attempt to reject a null hypothesis that says that the new breed comes from a population whose mean is equal to or greater than 22. We want to reject the null hypothesis if the Z score falls in the left tail. As the normal curve is symmetrical, we will reject our null hypothesis if our calculated Z score is less than or equal to -1.65. As our calculated Z score is +2.93, we fail to reject our null hypothesis.

When doing two-tailed tests, you can use the absolute value of the Z score to test relative to a critical value. On the other hand, when doing one-tailed tests, it is important to understand in what tail the area to reject lies (left or right tail) and that the sign of the calculated Z score is also important.
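To see why the sign of Z matters in a one-tailed test, here is a small sketch. The function `one_tailed_reject` and its `tail` argument are hypothetical names I chose for illustration; the critical value is computed, so it is about 1.645 and not the rounded 1.65 from the table.

```python
from statistics import NormalDist

def one_tailed_reject(z, tail, alpha=0.05):
    """Decide a one-tailed Z test.

    tail="right" tests HA: mu > mu0 (reject when Z is large and positive).
    tail="left"  tests HA: mu < mu0 (reject when Z is large and negative).
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)  # about 1.645 for alpha = 0.05
    if tail == "right":
        return z >= z_crit
    if tail == "left":
        return z <= -z_crit
    raise ValueError("tail must be 'right' or 'left'")

z = 2.93  # calculated Z score from the horse example
print("longer lived? reject H0:", one_tailed_reject(z, "right"))
print("shorter lived? reject H0:", one_tailed_reject(z, "left"))
```

The same Z score leads to rejection in the right-tailed test and to a failure to reject in the left-tailed test, which is why the tail must be chosen before looking at the data.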

It is also important that you choose to do a one-tailed or two-tailed test before collecting the data and calculating the statistic. You probably noticed that one-tailed tests have lower critical values (1.65 versus 1.96), making it easier to reject a null hypothesis.

Conclusion about the null hypothesis from the statistical test:

truth about null hypothesis | fail to reject | reject |

true | correct | type I error |

false | type II error | correct |

From the table above (table 6.1, page 83), if the null hypothesis is true and we fail to reject it then we have made a correct decision and if the null hypothesis is false and we rejected it, then we have also made a correct decision.
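The "true" row of the table can be illustrated with a small simulation. This is a sketch under my own assumptions: life spans are drawn from a normal population in which the null hypothesis really is true (mu = 22, sigma = 3.8), and the number of trials and the random seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean

def type_one_error_rate(mu=22, sigma=3.8, n=25, alpha=0.05, trials=5000, seed=1):
    """Fraction of samples from a true-null population that a two-tailed Z test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    sem = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu) / sem
        if abs(z) >= z_crit:
            rejections += 1  # the null is true here, so this is a type I error
    return rejections / trials

rate = type_one_error_rate()
print(f"proportion of true null hypotheses rejected: {rate:.3f}")
```

The proportion should come out close to the level of significance used for the test, since every rejection in this simulation is a type I error.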

If the null hypothesis is correct and we reject it then we have made a type I error. The probability of making a type I error is alpha, the level of significance.

If the null hypothesis is false, but we fail to reject the null hypothesis, then we make a type II error. The probability of making a type II error is called beta.

If you increase the sample size, you can decrease beta while holding alpha constant. Thus increasing your sampling effort is an excellent (and often the only) way to reduce your chance of making a type II error. This is the reason that your major advisor or project leader may suggest that you need more data, especially if you have been unable to reject a null hypothesis. When you fail to reject a null hypothesis, you do not know if it is because the two populations are really equal or if you have been unable to reject because of type II error.
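The effect of sample size on beta can be sketched for the two-tailed Z test. To compute beta we have to assume a particular true mean; the value 24 used below is purely hypothetical, as is the function name `beta_two_tailed`.

```python
import math
from statistics import NormalDist

def beta_two_tailed(mu0, mu_true, sigma, n, alpha=0.05):
    """P(fail to reject H0: mu = mu0) when the population mean is really mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the true mean, the calculated Z is normal with this mean and SD 1.
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    # We fail to reject whenever -z_crit < Z < z_crit.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

# Hypothetical true mean of 24 years; alpha is held at 0.05 throughout.
for n in (10, 25, 50):
    beta = beta_two_tailed(mu0=22, mu_true=24, sigma=3.8, n=n)
    print(f"n = {n:2d}: beta = {beta:.3f}")
```

With alpha fixed, the printed beta should fall as n grows, which is the point of collecting more data.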

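The quantity (1 - beta) is called the power of the test, that is, the probability of rejecting the null hypothesis when it is in fact false.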

Previously, we had discussed two quantities (g

For

For

Other methods include a statistic called the

If you recall from our basic statistics that we calculated using Sigmastat, we got output concerning g

Skewness (g

Kurtosis (g

Read through chapter 7.14 to understand one-tailed testing of these hypotheses concerning normality.

In the output from SigmaStat, you also found a value for the K-S Distance (Kolmogorov-Smirnov D statistic). This value was 0.149. Notice also that the P value associated with this number is 0.037. As P is less than 0.05 (remember this is our level of significance), we reject our null hypothesis (H

In the vast majority of cases, we will not know what the population variance is but will have to estimate it from a sample. For example, in the above example with the horses, we assumed that the population standard deviation was 3.8; however, it is unlikely that we would have known this. Suppose instead that 3.8 is the sample standard deviation. Now to test the null hypothesis that mu of the new breed of horses equals 22, we would still calculate the quantity (Xbar - 22)/SEM (equation 7.1). However this quantity, as we noted earlier, does not follow a normal distribution but a t-distribution when sigma must be estimated from the sample. Thus we cannot use the standard normal curve to determine a critical value to reject the null hypothesis; instead we must use table B3, which gives values for the t-distribution, and the quantity given by equation 7.1 is a t value.

To look up a critical value for the t-distribution, we need to know the degrees of freedom, which is n - 1 (25 - 1 = 24). Table B3 lists alpha (the level of significance) across the top and as we are doing a two-tailed test, we will use the column with alpha(2) values. Along the left side are the degrees of freedom, the critical value is at the intersection of 24 degrees of freedom and alpha(2) = 0.05, which is 2.064.

Now when we calculate our t-value, if the absolute value (we are doing a two-tailed test) is greater than or equal to 2.064 we will reject our null hypothesis and if the absolute value of the calculated t-value is less than 2.064, then we fail to reject our null hypothesis.

t = (Xbar - mu)/(s/square root of n) = (24.23-22)/(3.8/square root of 25) = 2.23/0.76 = 2.93

Here s is the sample standard deviation, which replaces sigma in the formula for Z.

As our calculated t-value is greater than 2.064, we reject the null hypothesis that the mean longevity of our new breed of horses equals 22 years.
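The t-test decision can be sketched in Python as well. The critical value 2.064 is the table B3 value quoted in the text. The two-tailed probability is my own addition: it is found by numerically integrating the t density with Simpson's rule, and the helper names `t_pdf` and `two_tailed_p` are hypothetical.

```python
import math

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P for a t value; integrates the density from 0 to |t| (Simpson's rule)."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 2 * (0.5 - area_zero_to_t)

# Horse example, now treating 3.8 as the sample standard deviation s.
xbar, mu0, s, n = 24.23, 22, 3.8, 25
t_value = (xbar - mu0) / (s / math.sqrt(n))
df = n - 1

T_CRITICAL = 2.064  # table B3: alpha(2) = 0.05, 24 degrees of freedom
reject = abs(t_value) >= T_CRITICAL
p = two_tailed_p(t_value, df)

print(f"t = {t_value:.2f}, df = {df}, reject = {reject}, two-tailed P = {p:.4f}")
```

The same statistic gives a larger P under the t-distribution than under the normal curve, because the t-distribution has heavier tails when sigma is estimated from the sample.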

This is our first introduction to the use of the t-distribution for the testing of hypotheses. This is generally called a t-test and is one of the most common statistical tests used. It should be mentioned that one of the underlying assumptions of the test is that the observations come from a population whose distribution is normal. However a t-test is

Considerations about a one-tailed test for the mean are essentially the same as they were for the normal distribution (chapter 7.2).

Chapters 7.5 and 7.6 are important in that these chapters give you some insights into how the power of a t-test varies with sample size. I will not cover this nor expect you to know any of these formulas, but you should read through this section paying attention to the general concepts. The rest of the chapter will not be covered in this course.

Do problem 6.6 at the end of chapter 6 (page 90) and problems 7.1, 7.2, and 7.4 at the end of chapter 7 (pages 120 - 121).

Last updated on 15 September 2009.