The P value (significance level) is possibly the most widely used, and also misused, quantity in data analysis. P has been heavily criticized on philosophical and theoretical grounds, especially from a Bayesian perspective. In contrast, a properly interpreted P has been strongly defended as a measure of evidence against the null hypothesis, H0. We discuss the meaning of P and null-hypothesis statistical testing, and present some key arguments concerning their use. P is the probability of observing data as extreme as, or more extreme than, the data actually observed, conditional on H0 being true. However, P is often mistakenly equated with the posterior probability that H0 is true conditional on the data, which can lead to exaggerated claims about the effect of a treatment, experimental factor or interaction. Fortunately, a lower bound for the posterior probability of H0 can be approximated using P and the prior probability that H0 is true. When one is completely uncertain about the truth of H0 before an experiment (i.e., when the prior probability of H0 is 0.5), the posterior probability of H0 is much higher than P, which means that strong evidence against H0 requires P values lower than the threshold typically accepted for statistical significance (e.g., P = 0.05). We support the continued use of a properly interpreted P as one component of a data analysis that emphasizes data visualization and estimation of effect sizes (treatment effects).
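The gap between P and the posterior probability of H0 can be illustrated numerically. The abstract does not say which approximation is used, so the sketch below assumes one widely cited calibration, the Sellke-Bayarri-Berger bound, in which the Bayes factor in favor of H0 is at least -e P ln(P) for P < 1/e. The function name `posterior_h0_lower_bound` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def posterior_h0_lower_bound(p, prior_h0=0.5):
    """Approximate lower bound on Pr(H0 | data) from a P value.

    Assumption: uses the -e * p * ln(p) calibration (Sellke, Bayarri and
    Berger), which may differ from the approximation the authors use.
    The bound is only valid for 0 < p < 1/e.
    """
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("p must satisfy 0 < p < 1/e for this bound")
    if not 0.0 < prior_h0 < 1.0:
        raise ValueError("prior_h0 must lie strictly between 0 and 1")

    # Lower bound on the Bayes factor in favor of H0.
    bayes_factor_bound = -math.e * p * math.log(p)

    # Posterior odds = prior odds * Bayes factor; convert odds to probability.
    prior_odds = prior_h0 / (1.0 - prior_h0)
    posterior_odds = prior_odds * bayes_factor_bound
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    for p_value in (0.05, 0.01, 0.001):
        bound = posterior_h0_lower_bound(p_value, prior_h0=0.5)
        print(f"P = {p_value}: Pr(H0 | data) >= {bound:.3f}")
```

Under this assumed calibration and a prior probability of 0.5, P = 0.05 corresponds to a posterior probability of H0 of at least about 0.29, and P = 0.01 to at least about 0.11, which is consistent with the statement that the posterior probability of H0 is much higher than P.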
All Science Journal Classification (ASJC) codes
- Agronomy and Crop Science
- Plant Science