Careful with the S-word

Market researcher Tom Ewing offers some advice that applies equally well to statisticians — be careful when you use the word “significant” in its technical sense. Depending on the audience, it could lead to misunderstandings:

Non-researchers tend to misread “significant” as “important” or simply “big”. Which isn’t the case – it can be trivial or small, it’s just unlikely to be fluke or coincidence.

Researchers tend to read “significant” as “interesting”. Which isn’t the case either – even big results can be utterly banal, especially if they simply confirm something you could have guessed, or if they repeat information you already have.

It’s good advice in general, but with regard to the latter point we are given the following example:

Suppose we give 1,000 people an IQ test, and we ask if there is a significant difference between male and female scores. The mean score for males is 98 and the mean score for females is 100. We use an independent groups t-test and find that the difference is significant at the .001 level. The big question is, “So what?” The difference between 98 and 100 on an IQ test is a very small difference… so small, in fact, that it’s not even important.

Then why did the t-statistic come out significant? Because there was a large sample size. When you have a large sample size, very small differences will be detected as significant. This means that you are very sure that the difference is real (i.e., it didn’t happen by fluke). It doesn’t mean that the difference is large or important. If we had only given the IQ test to 25 people instead of 1,000, the two-point difference between males and females would not have been significant.
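The sample-size effect described in the example can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the example gives neither the group sizes nor the spread of the scores, so the even split (500 and 500, then 13 and 12) and the common standard deviation `ASSUMED_SD = 9.0` are hypothetical values, picked so that the 1,000-person case lands below the .001 level as the example says. The helper names are likewise made up for this sketch, and the p-value is obtained by simple numerical integration of the t density rather than by a statistics library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the t density mass between -|t| and |t|.

    The integral from 0 to |t| is approximated with Simpson's rule
    (steps must be even) and doubled by symmetry.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3
    return max(0.0, 1.0 - central_mass)

def pooled_t_test(mean_a, mean_b, sd, n_a, n_b):
    """Independent-groups t-test, assuming both groups share the same sd."""
    se = sd * math.sqrt(1 / n_a + 1 / n_b)   # standard error of the difference
    t = (mean_a - mean_b) / se
    df = n_a + n_b - 2
    return t, two_sided_p(t, df)

# Hypothetical spread of scores; NOT given in the original example.
ASSUMED_SD = 9.0

# Same 2-point difference (100 vs. 98), two different sample sizes.
t_large, p_large = pooled_t_test(100, 98, ASSUMED_SD, 500, 500)  # 1,000 people
t_small, p_small = pooled_t_test(100, 98, ASSUMED_SD, 13, 12)    # 25 people

print(f"n=1000: t={t_large:.2f}, p={p_large:.5f}")
print(f"n=25:   t={t_small:.2f}, p={p_small:.3f}")
```

The difference in means is identical in both calls; only the standard error changes, shrinking roughly with the square root of the sample size. Under these assumptions that is enough to move the same 2-point gap from nowhere near significant to significant at the .001 level. A larger assumed standard deviation would give a larger p-value for the same sample.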

Personally, I’m not so sure I’d dismiss that significant 2-point difference so lightly. Two points may not be a meaningful difference in terms of IQ tests, but I’m immediately led to wonder why a significant difference was observed at all. Was there a problem with the sampling that led to the men and women in the test being different in some way? Was there some kind of problem with the test that favored women over men? If you get a significant result you don’t expect, it’s well worth investigating why: you may find a surprising and, dare I say, significant problem with the way the experiment was conducted.

Blackbeard Blog: The Significance Problem (via @russhmeyer)
