this post was submitted on 09 Jul 2023
Science
Methods based on p-values and statistical significance are flawed: even when used correctly (e.g. with a stopping rule decided beforehand and the various "corrections" for number of data points, non-Gaussianity, and so on), one can get results that are "statistically non-significant" but clearly significant in every common-sense meaning of the word, and vice versa. There is a steady literature, with mathematical and logical proofs, dating back to the 1940s that points out the in-principle flaws of "statistical significance" and null-hypothesis testing. The editorial from the American Statistical Association gives an extensive list.
I'd like to add: I'm saying this not because I read it somewhere (I don't like unscientific "my football team is better than yours"-like discussions), but because I personally sat down and patiently went through the proofs and counterexamples, and the (almost non-existent) counter-proofs. That's what made me change methodology. This is something that many researchers using "statistical significance" have not done.
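To make the stopping-rule point concrete, here is a minimal sketch of a standard textbook coin-tossing example (it is not taken from the links in this thread, and the function names are made up for illustration). The same data, 9 heads and 3 tails, give a different one-sided p-value against a fair coin depending on whether the plan was "toss 12 times" or "toss until the third tail".

```python
from fractions import Fraction
from math import comb


def p_fixed_n(heads, n):
    """One-sided p-value for a fair coin when the number of tosses n
    was fixed in advance: P(at least `heads` heads in n tosses)."""
    return Fraction(sum(comb(n, k) for k in range(heads, n + 1)), 2 ** n)


def p_stop_at_tails(heads, tails):
    """One-sided p-value for a fair coin when tossing continued until
    `tails` tails had appeared: P(at least `heads` heads before that).

    That event is the same as seeing at most tails - 1 tails in the
    first heads + tails - 1 tosses.
    """
    m = heads + tails - 1
    return Fraction(sum(comb(m, k) for k in range(tails)), 2 ** m)


# Identical observations: 9 heads, 3 tails (hypothetical data).
p_binomial = p_fixed_n(9, 12)
p_negative_binomial = p_stop_at_tails(9, 3)

print(f"fixed 12 tosses:     p = {p_binomial} = {float(p_binomial):.4f}")
print(f"stop at third tail:  p = {p_negative_binomial} = {float(p_negative_binomial):.4f}")
# fixed 12 tosses:     p = 299/4096 = 0.0730
# stop at third tail:  p = 67/2048 = 0.0327
```

With the conventional 0.05 threshold, the first design is "non-significant" and the second is "significant", although the observed tosses are exactly the same. This only illustrates the dependence on the stopping rule; it is not a substitute for the proofs in the literature.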
This is interesting and something I've not heard of - can you recommend a starter link for someone with a basic stats background? I had some in undergrad, but this sounds like a topic that could get very tinfoil-hat-y if not searched correctly and with good context.
There's still a lot of debate around this topic. It's obviously difficult for people who have used these methods for the past 60 years to simply say "I've been using a flawed method for 60 years", although in the end that's how science works. Moreover, the problem is twofold: the method has built-in flaws, and on top of that it's often misused.
Some starters:
The official statement by the American Statistical Association
A follow-up editorial
Signatories for the dismissal of the method
Many papers explaining the built-in flaws, from this old 1935 paper and this old 1965 discussion, to more recent ones; for example this, or this, or this, or this, or this tutorial
This paper gives a good summary
Journals that don't accept "statistical significance" methods anymore: this or this
Several books, for example this one. I agree with the factual content of this book, but I don't like the authors' braggart way of writing. In their defence, though: it's the same braggart way of writing that R. A. Fisher, the father of "statistical significance", often had.
What's sad is that these discussions easily end in political or "football-team"-like debates. But the mathematical and logical proofs are there, for those who care to go and read them.
Thanks, I appreciate it - looks like I've got some bedtime reading for a while :)
My pleasure!
Ah, I thought you were talking about p-values - which is just a simple metric and gets a bad rep from being used for statistical significance. Statistical significance certainly is trash.
Yes, I'm talking about p-values. Statistical "significance" is based on p-values.