Blog of the Hauptbibliothek

p-hacking – not a trivial offense

19 November 2019 | Martina Gosteli | No comments


Guest post by Dr. Eva Furrer, Center for Reproducible Science, UZH

p-hacking, also called data dredging or fishing for significance, is one of several questionable research practices that have been increasingly denounced and combated in recent years. Awareness of the issue, however, is not new; for example:

  • “The Meaning of “Significance” for Different Types of Research”, de Groot 1956, originally in Dutch, here translated by E.J. Wagenmakers.
  • “The scandal of poor medical research”, Altman 1994 in BMJ.
  • “Why Most Published Research Findings Are False”, Ioannidis 2005 in PLOS Medicine.

So, what is p-hacking? Searching for p-hacking on Wikipedia leads to the article on data dredging that starts like this:

“Data dredging (also data fishing, data snooping, data butchery, and p-hacking) is the misuse of data analysis to find patterns in data that can be presented as statistically significant, thus dramatically increasing and understating the risk of false positives. This is done by performing many statistical tests on the data and only reporting those that come back with significant results.”
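The mechanism described in the quote is easy to demonstrate. The following is a minimal sketch, not taken from the post, that simulates "performing many statistical tests and only reporting the significant ones": it tests 20 samples of pure noise against a true null hypothesis (using a simple z-test with known variance, an assumption made here for brevity) and counts how often at least one test comes back "significant" at the 5% level.

```python
import math
import random

random.seed(1)

def z_test_p(sample, mu=0.0, sigma=1.0):
    """Two-sided p-value for H0: mean == mu, with known sigma (simple z-test)."""
    n = len(sample)
    z = (sum(sample) / n - mu) / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability

def dredge(n_tests=20, n=30, alpha=0.05):
    """Run n_tests tests on pure noise; True if any looks 'significant'."""
    return any(
        z_test_p([random.gauss(0, 1) for _ in range(n)]) < alpha
        for _ in range(n_tests)
    )

# Fraction of dredging attempts that yield at least one false positive
runs = 2000
hits = sum(dredge() for _ in range(runs))
print(f"At least one 'significant' result in {hits / runs:.0%} of runs")
# theoretically 1 - 0.95**20, i.e. about 64%
```

Although every single test keeps its nominal 5% error rate, running twenty of them and reporting only the "hit" produces a spurious finding in roughly two out of three attempts.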

Striking examples of the misuse of data, and of the massive damage it can inflict on science and society, are the cases of Andrew Wakefield, Diederik Stapel and Brian Wansink. These were widely discussed in the media because they involve, to varying degrees, elements of outright fraud; such cases are in reality rather rare. But science in general appears to have room for improvement concerning reproducibility; see for example Baker 2016 in Nature.

Besides p-hacking there are related and similarly questionable practices:

  • HARKing (Hypothesizing After the Results are Known): describing in hindsight statistically significant results as the ones that were searched for in the first place
  • Optional stopping: continuing to collect data, testing as you go, until a significant result turns up
  • Selective reporting: publishing only statistically significant results, while non-significant ones end up in the file drawer
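Optional stopping is perhaps the least intuitively harmful practice on this list, so a small simulation (a sketch of my own, not from the post, again assuming a z-test with known variance) may help: under a true null hypothesis, an experimenter peeks at the p-value after every new observation from n = 10 up to n = 200 and stops as soon as it dips below 0.05.

```python
import math
import random

random.seed(2)

def optional_stopping(max_n=200, min_n=10, alpha=0.05):
    """Peek after every new observation; stop as soon as p < alpha.
    Returns True if a 'significant' result was declared under a true null."""
    total, n = 0.0, 0
    for _ in range(max_n):
        total += random.gauss(0, 1)  # data generated under H0: mean 0, sd 1
        n += 1
        if n >= min_n:
            # two-sided z-test p-value for the running sample mean
            p = math.erfc(abs(total / math.sqrt(n)) / math.sqrt(2))
            if p < alpha:
                return True
    return False

runs = 1000
fp = sum(optional_stopping() for _ in range(runs)) / runs
print(f"False-positive rate with peeking: {fp:.0%} (nominal alpha is 5%)")
```

Each individual look still has a 5% error rate, but because the experimenter gets many chances to stop on a random dip, the overall false-positive rate climbs far above the nominal 5%.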

Together, these practices bias the literature: too many false positive results are published and too few true negatives. In combination with low statistical power this leads to serious problems; see for example Button et al. 2013 in Nature Reviews Neuroscience. If you are a statistics amateur and want to understand the problem behind too many false positive results, watch this video by The Economist.

There are several promising approaches on how to overcome such prevalent reproducibility issues:

An entire series of articles with suggestions for improving scientific practice was published in 2014 in The Lancet: Research: increasing value, reducing waste. At UZH, the Center for Reproducible Science was founded in 2018 with the objective of raising awareness of good scientific practice and closing the corresponding gaps in education. The final goal, as suggested by the Lancet series, is to increase value and reduce waste at UZH through training, collaboration and methodological research.

Presentation by Dr. Eva Furrer

Filed under: Coffee Lectures | Good to know | Tips for Physicians & Health Professions | Tips for Researchers