What is P-Value Hacking

P-value hacking (also called p-hacking or data dredging) refers to the manipulation of data analysis by researchers so that statistically non-significant results appear significant. Specifically, researchers adjust choices such as how the data are explored, how the experiment is run, and which analytical techniques are applied until the desired p-value (usually below 0.05) is obtained. This practice increases the risk of mistaking chance variation in the data for a true effect.

Examples of P-Value Hacking

A well-known illustration is the xkcd comic "Significant", which is often cited as an example of p-value hacking.

In the comic, researchers initially hypothesized that eating jellybeans causes acne. A test of this hypothesis returned a p-value greater than 0.05, suggesting no significant association between jellybean consumption and acne.

The researchers didn't give up, however. They formulated a new hypothesis: perhaps jellybeans of a specific color cause acne. Running 20 separate tests, one for each color, they obtained p < 0.05 for green jellybeans, suggesting a potential link between eating green jellybeans and acne. This single result attracted media attention and was reported as news.
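To see how easily this can happen, here is a minimal simulation of the scenario, assuming no color has any real effect: each of 20 hypothetical "studies" compares acne rates between a jellybean group and a control group drawn from the same population. The sample size and acne rate are made-up values used purely for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_colors = 20        # number of jellybean colors tested separately
n_per_group = 200    # hypothetical sample size per group
p_acne = 0.3         # true acne rate, identical in both groups (the null is true)

spurious_hits = []
for color in range(n_colors):
    treated = rng.binomial(n_per_group, p_acne)   # acne cases among jellybean eaters
    control = rng.binomial(n_per_group, p_acne)   # acne cases among non-eaters
    # Chi-squared test on the 2x2 table of acne vs. no acne for the two groups
    table = [[treated, n_per_group - treated],
             [control, n_per_group - control]]
    _, p_value, _, _ = stats.chi2_contingency(table)
    if p_value < 0.05:
        spurious_hits.append((color, round(p_value, 3)))

print("Colors 'linked' to acne purely by chance:", spurious_hits)
```

Because 20 comparisons are made at the 5% level, runs of this simulation will fairly often flag one or more colors as "significant" even though no effect exists at all.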

Repeated Testing and the Risk of False Positives

Repeating tests increases the risk of false findings. In a single test, the probability of mistakenly rejecting the null hypothesis when it is true is controlled at the significance level, conventionally 5%. With repeated testing, however, the chance of making at least one such mistake accumulates.

In statistical hypothesis testing, when the null hypothesis is true, the p-value follows a uniform distribution on the interval [0, 1]. At a 5% significance level, there is therefore a 5% chance of incorrectly rejecting the null hypothesis even though it is correct. This mistake is known as a Type I error (α error).
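This can be checked with a short simulation: draw two samples from the same population many times, run a two-sample t-test each time, and count how often p falls below 0.05. The sample sizes below are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_simulations = 10_000
false_positives = 0

for _ in range(n_simulations):
    a = rng.normal(loc=0.0, scale=1.0, size=50)  # both samples come from
    b = rng.normal(loc=0.0, scale=1.0, size=50)  # the same population, so the null is true
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

print(f"Fraction of p < 0.05 under the null: {false_positives / n_simulations:.3f}")
# Should land close to 0.05, i.e. the α (Type I error) level
```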

For example, when 20 colors of jellybeans are each tested at the 5% level, and the tests are independent, the probability of mistakenly concluding that at least one color has an effect is:

1 − 0.95^20 ≈ 0.64

That is, by repeating the test 20 times, the risk of at least one Type I error rises from 5% to roughly 64%.
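The same arithmetic can be tabulated for different numbers of independent tests; the values below follow directly from the formula 1 − (1 − α)^k.

```python
alpha = 0.05
for k in (1, 5, 10, 20):
    risk = 1 - (1 - alpha) ** k
    print(f"{k:>2} independent tests: P(at least one false positive) = {risk:.2f}")
# 20 tests give roughly 0.64, matching the jellybean example above
```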

Preventive Measures for P-Value Hacking

Several preventive measures can be taken to counteract p-value hacking:

  • Pre-registration of Research Plans
    Clearly specifying the research plan and registering it publicly before the study begins reduces the room for p-value hacking, such as switching analytical methods after exploring the data.

  • Avoiding Dependence on P-Value Thresholds
    Rather than fixating on whether p < 0.05, reporting the effect size and other statistical indicators such as confidence intervals reduces the risk of overstating chance findings (see the sketch after this list).

  • Avoiding Selective Reporting of Analytical Results
    Reporting all analyses that were performed, not only those that reached significance, avoids the bias introduced by selective reporting.

  • Ensuring Data Transparency
    Sharing research data and analysis methods enables other researchers to verify reproducibility, thereby avoiding suspicions of p-value hacking.
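As a sketch of the second point above, here is one way to report an effect size (Cohen's d) and a confidence interval alongside the p-value, instead of relying on the p < 0.05 threshold alone. The data are synthetic, and the 0.2 "effect" is an arbitrary illustrative value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
group_a = rng.normal(loc=0.0, scale=1.0, size=80)
group_b = rng.normal(loc=0.2, scale=1.0, size=80)   # small hypothetical effect

t_stat, p_value = stats.ttest_ind(group_a, group_b)

# Cohen's d: difference in means divided by the pooled standard deviation
pooled_sd = np.sqrt((group_a.var(ddof=1) + group_b.var(ddof=1)) / 2)
cohens_d = (group_b.mean() - group_a.mean()) / pooled_sd

# Approximate 95% confidence interval for the difference in means
diff = group_b.mean() - group_a.mean()
se = np.sqrt(group_a.var(ddof=1) / len(group_a) + group_b.var(ddof=1) / len(group_b))
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se

print(f"p = {p_value:.3f}, Cohen's d = {cohens_d:.2f}, "
      f"95% CI for the mean difference: [{ci_low:.2f}, {ci_high:.2f}]")
```

Reporting the interval and the effect size together makes a "just significant" p-value much harder to oversell.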

References

https://www.explainxkcd.com/wiki/index.php/882:_Significant
https://scienceinthenewsroom.org/resources/statistical-p-hacking-explained/
https://embassy.science/wiki/Theme:6b584d4e-2c9d-4e27-b370-5fbdb983ab46
https://en.wikipedia.org/wiki/Data_dredging
https://datascience.stanford.edu/news/data-snooping
