False Positives and False Negatives

Two types of errors can occur when deciding whether or not the means of two data sets are different. One can conclude that there is a real difference between the means when, in fact, there is not. Scientists call this a false positive. It is also called a Type I error. If one instead concludes that there is not a real difference between the means when, in fact, there is a difference, this is called a false negative. This is also called a Type II error.

Suppose a scientist uses the t-test with the criterion that a difference is statistically significant if there is less than a 1 in 20 (5%) probability that a difference that large would arise by chance alone. If the means are in fact the same, the scientist will correctly conclude that there is no real difference between the means 95% of the time. However, the other 5% of the time the scientist will incorrectly conclude that there is a real difference between the means when, in fact, there is none, making a false positive (Type I) error.
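The 5% false positive rate described above can be checked with a small simulation. The sketch below is illustrative only. It assumes two groups of 10 normally distributed measurements drawn from the same population, uses a pooled-variance two-sample t statistic, and hard-codes the two-sided 5% critical value of Student's t for 18 degrees of freedom (about 2.101). The function names, group size, and number of trials are assumptions chosen for the example.

```python
import random
import statistics

# Assumption: two-sided 5% critical value of Student's t for 18 degrees
# of freedom (two groups of 10 measurements each).
T_CRIT_5PCT_DF18 = 2.101


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic; assumes equal group sizes."""
    n = len(a)
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    standard_error = (pooled_var * 2 / n) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def false_positive_rate(trials=4000, n=10, seed=1):
    """Fraction of trials declared 'significant' when the true means are equal."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Both groups come from the same population, so the null hypothesis is true.
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(a, b)) > T_CRIT_5PCT_DF18:
            hits += 1
    return hits / trials


rate = false_positive_rate()
print(f"False positive rate: {rate:.3f}")
```

Because both groups are drawn from the same population, every "significant" result counted here is a false positive, so the printed rate should land close to 0.05.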

In addition to determining the statistical significance of the conclusion, it is also important to know the power of the measurements. Power is the probability of correctly deciding that the means are different when they really are different. The lower the power of the measurements, the greater the probability of making a false negative error. The power depends on the size of the difference between the two means, as well as on the variability in the measurements: the greater the difference between the two means, and the smaller the variability, the easier it is to correctly conclude that the means are different. The power also depends on the criterion that the scientist uses to decide when the difference in the measured means is statistically significant. One can reduce the probability of making a Type I error by requiring that there be less than a 1 in 100 (1%) or 1 in 1000 (0.1%) probability that the difference is due to chance. The more stringent the criterion used, however, the lower the power and the greater the probability of incorrectly concluding that the means are the same when they are in fact different.
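The trade-off between the significance criterion and power can be illustrated numerically. The following is a minimal sketch, not a full t-test power calculation: it assumes a two-sided comparison of two equally sized groups and replaces the t distribution with a normal approximation. The function name `approx_power` and the example values (a true difference of 1.0, a standard deviation of 1.0, and 10 measurements per group) are assumptions made for illustration.

```python
from statistics import NormalDist


def approx_power(diff, sd, n, alpha):
    """Approximate power of a two-sided, two-sample test (normal approximation).

    diff  : true difference between the two means
    sd    : standard deviation of the measurements (same in both groups)
    n     : number of measurements in each group
    alpha : significance criterion (probability of a Type I error)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # True difference expressed in units of its standard error.
    shift = abs(diff) / (sd * (2 / n) ** 0.5)
    # Probability that the test statistic falls beyond either critical value.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


for alpha in (0.05, 0.01, 0.001):
    power = approx_power(diff=1.0, sd=1.0, n=10, alpha=alpha)
    print(f"alpha={alpha}: power={power:.2f}, "
          f"false negative probability={1 - power:.2f}")
```

With the other inputs held fixed, tightening the criterion from 5% to 1% to 0.1% lowers the computed power each time, which is the same as raising the probability of a false negative (Type II) error.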

The reliability of the conclusions tends to increase as the number of measurements increases, because more measurements make it easier to detect a real difference. Scientists designing measurement programs to determine the effects of underwater sound on marine animals have to carefully consider the number of measurements needed to reach meaningful conclusions. They also need to consider the relative importance of Type I and Type II errors when selecting the criterion used to determine whether or not the difference between the means is statistically significant. Scientists want to avoid concluding that an effect is real when it is not (a Type I error), but in some cases it may be more important to avoid concluding that there is no effect when in fact there is one (a Type II error).
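A rough planning calculation shows how the number of measurements enters. The sketch below uses the standard normal-approximation formula for comparing two means with equal group sizes; an exact t-test calculation would give slightly larger numbers. The function name `measurements_per_group`, the default target power of 80%, and the example differences are assumptions made for illustration, not recommendations for any particular study.

```python
from math import ceil
from statistics import NormalDist


def measurements_per_group(diff, sd, alpha=0.05, power=0.80):
    """Approximate number of measurements needed in each of two groups.

    Normal approximation for a two-sided test:
        n = 2 * ((z_alpha + z_power) * sd / diff) ** 2
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # controls Type I errors
    z_power = std_normal.inv_cdf(power)          # controls Type II errors
    return ceil(2 * ((z_alpha + z_power) * sd / abs(diff)) ** 2)


# Smaller true differences require many more measurements to detect.
for diff in (1.0, 0.5, 0.25):
    n = measurements_per_group(diff=diff, sd=1.0)
    print(f"difference={diff}: about {n} measurements per group")
```

Under these assumptions, halving the difference to be detected roughly quadruples the number of measurements required, and choosing a stricter significance criterion or a higher target power increases it further.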

The following table summarizes the possibilities. In the formal language of science, the null hypothesis being tested is that there is not a real difference between the means of the two data sets. The term “hypothesis” refers to a careful statement of a tentative or provisional conclusion to be tested.

| | Decision: Accept the Null Hypothesis | Decision: Reject the Null Hypothesis |
|---|---|---|
| A. The null hypothesis is really true. There is not a real difference between the means of the two groups. | 1. You accepted the null hypothesis when it is true. You concluded that there is no difference between the means of the two groups, which, in fact, is the case. You were correct. | 2. You rejected the null hypothesis when it is true. You concluded that there is a difference between the means of the two groups when, in fact, there is not a difference. You were incorrect and made a false positive (Type I) error. |
| B. The null hypothesis is really false. There is a real difference between the means of the two groups. | 3. You accepted the null hypothesis when it is false. You concluded that there is no difference between the means of the two groups when, in fact, there is a real difference. You were incorrect and made a false negative (Type II) error. | 4. You rejected the null hypothesis when it is false. You concluded that there is a difference between the means of the two groups, which, in fact, is the case. You were correct. |

(Table credit: Basic Statistics Web Site For Nova Southeastern University Educational Leadership Students: http://www.schoolofed.nova.edu/edl/secure/stats/)

References

  • "Basic Statistics Web Site For Nova Southeastern University Educational Leadership Students."
  • Trochim, William M. "The Research Methods Knowledge Base, 2nd Edition."

Additional Resources

  • Zales, C. R., and Colosi, J. C. 1998, "An Exercise Where Students Demonstrate the Meaning of 'Not Statistically Significantly Different'" The American Biology Teacher. 60 (8), 596–600.