Statistical vs. Biological Significance
The conclusion that there is a statistically significant difference indicates only that the difference is unlikely to have occurred by chance. It does not mean that the difference is necessarily large, important, or significant in the common meaning of the word. An example comes from measurements made to determine whether Surveillance Towed Array Sensor System Low Frequency Active (SURTASS LFA) sonar transmissions affect the singing of humpback whales near Hawaii. The following graphs show the distribution of humpback whale song length (in minutes) during control periods when no sounds were being played (top) and during experimental conditions when LFA sounds were being played (bottom).
The mean length of the whale songs was 29% greater during transmissions. Given the measurements that were made, a difference this large would be expected to arise by chance alone only 4.7% of the time if the transmissions had no effect, and the scientists doing the study therefore concluded that the result is statistically significant. However, these data come from measurements made on a small number of whales that were followed before, during, and after transmissions. Because the number of measurements is relatively small, the power of the measurements to detect a difference is low: the probability that the scientists could have made a false negative (Type II) error is 50%.
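To illustrate why a small number of measurements gives low power, the following Python sketch approximates the power of a two-sided comparison of two group means. The effect size of 0.5 standard deviations and the group sizes are hypothetical values chosen for illustration, not figures from the whale study, and the normal (z-test) approximation is an assumption rather than the method the scientists used.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size: true difference in means divided by the common
                 standard deviation (hypothetical input).
    n_per_group: number of measurements in each group.
    alpha:       significance threshold (false positive rate).
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the effect is real.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical effect of half a standard deviation, three sample sizes.
for n in (10, 30, 100):
    power = approx_power(0.5, n)
    print(f"n per group = {n:3d}  power = {power:.2f}  "
          f"false negative rate = {1 - power:.2f}")
```

The false negative (Type II) rate is one minus the power, so for a fixed effect size it shrinks as the number of measurements grows.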
Although the scientists concluded that the difference in the length of whale songs in the presence and absence of transmissions is statistically significant, the two graphs look similar and show that there is considerable variation in the length of humpback whale songs. The standard deviations of the distributions are considerably greater than the difference in the means. In other words, the response to the transmissions (the magnitude of the increase in song length) is well within the normal variation seen in the absence of the transmissions. The songs are sung exclusively by males and are thought to be displays to attract mates. Large changes in singing behavior might therefore have significant consequences for a humpback whale population. It seems unlikely, however, that changes in singing behavior that are well within the natural range of variability pose such a risk. The conclusion is that although the difference in song lengths is statistically significant, it is unlikely to be biologically significant.
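One common way to compare a difference in means with the spread of the data is a standardized effect size such as Cohen's d, the difference in means divided by the pooled standard deviation. The sketch below is a minimal illustration of that idea. The song lengths in it are made-up values, not data from the study, and the function name `cohens_d` is only an illustrative choice.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(control, treatment):
    """Difference in means expressed in units of the pooled standard deviation."""
    n1, n2 = len(control), len(treatment)
    s1, s2 = stdev(control), stdev(treatment)  # sample standard deviations
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Made-up song lengths in minutes (NOT measurements from the whale study).
control = [8.0, 12.5, 15.0, 9.5, 20.0, 11.0]
exposed = [10.0, 14.0, 19.5, 12.0, 24.0, 13.5]

d = cohens_d(control, exposed)
print(f"difference in means = {mean(exposed) - mean(control):.1f} min")
print(f"Cohen's d = {d:.2f}")
```

A value of d below 1 means that the shift in the mean is smaller than one standard deviation, which is the situation described above: the change lies within the normal spread of the measurements.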
Additional pages under Statistical Uncertainty:
- Measurement Errors are due to the physical limitations of the sensors and techniques used to make the measurements.
- Natural Variability is the range in values of naturally occurring parameters in biological and other natural systems.
- False Positives and False Negatives are errors associated with making a decision.
References
- Fristrup, K. M., Hatch, L. T., & Clark, C. W. (2003). Variation in humpback whale (Megaptera novaeangliae) song length in relation to low-frequency sound broadcasts. The Journal of the Acoustical Society of America, 113(6), 3411. https://doi.org/10.1121/1.1573637
- Miller, P. J. O., Biassoni, N., Samuels, A., & Tyack, P. L. (2000). Whale songs lengthen in response to sonar. Nature, 405(6789), 903–903. https://doi.org/10.1038/35016148
- Zales, C. R., & Colosi, J. C. (1998). An exercise where students demonstrate the meaning of “not statistically significantly different.” The American Biology Teacher, 60(8), 596–600. https://doi.org/10.2307/4450557