Posts Tagged ‘t-statistic’

June 24, 2012

[Embedded TED video: http://video.ted.com/assets/player/swf/EmbedPlayer.swf]

Noticed:

  • It’s easier for me to grok statistical significance (p’s and t’s) from a scatterplot than magnitude (β’s).
  • Even though magnitude can be the most important thing, it’s “hidden” off to the left. Note to self: look off to the left more, and for longer.
  • But I’m set up to understand the correlation at the level of individual observations: which particular countries fit the pattern, and how closely.

Questions:

  • Minute __:__ Does each of the dimensions of social problems correlate individually, or is this only a mass effect of the combination?

If it’s true that raising marginal tax rates on the rich lowers crime rates without paying for any anti-crime programmes, that’s almost a free lunch.

UPDATE: Oh, hey, six months after I watched this and three days after I put up the story, I see Harvard Business Review has a story corroborating the same effect, though from the other direction: it points out how economists don’t look at the p’s and t’s on a regression table. I feel like I “mentally cross out” any lines with a low t value and then wonder about the F value on a regression with the “worthless” line removed.
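For what it’s worth, here is a toy sketch of that habit (entirely made-up data; statsmodels is my choice of tool, not anything from the HBR piece): fit a regression, spot the line with the low t, refit without it, and compare the overall F statistics.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)                    # a predictor that matters
x2 = rng.normal(size=n)                    # a predictor that barely does
y = 2.0 * x1 + 0.05 * x2 + rng.normal(size=n)

full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print(full.tvalues)                        # x2's t will usually be small

# "Mentally cross out" the low-t line, then actually refit without it
reduced = sm.OLS(y, sm.add_constant(x1)).fit()
print(full.fvalue, reduced.fvalue)         # overall F, with and without the weak term
```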

May 23, 2012

It takes ~20 observations to verify your first significant digit of the mean with confidence.

Do you know how many observations it takes to verify your first sig-fig of the variance? More like 1000. And that’s just to get one digit of accuracy! Higher moments (skew, kurtosis) are even worse.

That’s why I often laugh out loud when I read claims in the newspaper that rely on a particular value of the variance. Even in serious, published papers, I often see tables with estimates of the standard deviation reported to three decimal places, just because the software spat the numbers out that way. It gives a false sense of accuracy. It’s ridiculous.

Karen Kafadar
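A quick way to see the size of the gap Kafadar is describing (a minimal simulation sketch; the normal distribution and the particular mean and standard deviation are my own arbitrary choices): compare how tightly the sample mean and the sample variance are pinned down at n = 20 versus n = 1000.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 10.0, 2.0                  # true mean and sd (arbitrary)
reps = 10_000                          # replications of each experiment

for n in (20, 1000):
    samples = rng.normal(mu, sigma, size=(reps, n))
    means = samples.mean(axis=1)
    variances = samples.var(axis=1, ddof=1)
    # relative spread of each estimator across the replications
    print(n,
          np.std(means) / mu,              # ~5% at n=20: first digit of the mean is safe
          np.std(variances) / sigma**2)    # ~32% at n=20; needs n~1000 to reach ~5%
```

The mean’s relative wobble depends on the ratio σ/μ, so the “~20 observations” figure is itself only a rule of thumb; the point is the order-of-magnitude gap between the two estimators.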

May 10, 2011

Null hypothesis testing is voodoo.

Changes in the mental state of the experimenter should not affect the objective inference of the experiment. An argument for using Bayesian data analysis instead of H0 vs Ha.

Imagine you have a scintillating hypothesis about the effect of some different treatments on a metric dependent variable. You collect some data (carefully insulated from your hopes about differences between groups) and compute a t statistic for two of the groups. The computer program that tells you the value of t also tells you the value of p, which is the probability of getting a t at least that extreme by chance from the null hypothesis.

You want the p value to be less than 5%, so that you can reject the null hypothesis and declare that your observed effect is significant.
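In code terms (my own stand-in for “the computer program”; scipy here, not anything the author specifies), that step is a one-liner:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for two of the treatment groups (made-up numbers)
group_a = np.array([4.2, 5.1, 3.8, 4.9, 5.5, 4.4])
group_b = np.array([5.0, 6.2, 5.8, 6.1, 5.4, 6.6])

t, p = stats.ttest_ind(group_a, group_b)   # the program reports both t and p
print(t, p)                                # "significant" if p < 0.05
```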

What is wrong with that procedure? Notice the seemingly innocuous step from t to p. The p value, on which your entire claim to significance rests, is conjured by the computer program with an assumption about your intentions when you ran the experiment. The computer assumes you intended, in advance, to fix the sample sizes in the groups.

In a little more detail, and this is important to understand, the computer figures out the probability that your t value could have occurred from the null hypothesis if the intended experiment were replicated many, many times. The null hypothesis sets the two underlying populations as normal populations with identical means and variances. If your data happen to have six scores per group, then in every simulated replication of the experiment the computer randomly samples exactly six data values from each underlying population and computes the t value for that random sample. Usually t is nearly zero, because the sample comes from a null-hypothesis population in which there is zero difference between groups. By chance, however, the sample t value will sometimes land fairly far above or below zero.

The computer does a bizillion simulated replications of the experiment. The top panel of Figure 1 shows a histogram of the bizillion t values. According to the decision policy of NHST, we decide that the null hypothesis is rejectable by an actually observed value t_obs if the probability that the null hypothesis generates a value as extreme or more extreme is very small, say p < 0.05. The arrow in Figure 1 marks the critical value t_crit at which the probability of getting a more extreme t value is 5%. We reject the null hypothesis if |t_obs| > t_crit. In this case, when N = 6 is fixed for both groups, t_crit = 2.23. This is the critical value shown in standard textbook t tables for a two-tailed t-test with 10 degrees of freedom.
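That simulation is easy to reproduce as a sketch (my own code, not Kruschke’s; scipy assumed, with 100,000 replications standing in for “a bizillion”):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
reps = 100_000
n_per_group = 6

# Null hypothesis: both groups drawn from one normal population (equal mean and variance)
a = rng.normal(0.0, 1.0, size=(reps, n_per_group))
b = rng.normal(0.0, 1.0, size=(reps, n_per_group))
t_null = stats.ttest_ind(a, b, axis=1).statistic

# Two-tailed 5% critical value from the simulated null distribution
print(np.quantile(np.abs(t_null), 0.95))           # ~2.23
print(stats.t.ppf(0.975, df=2 * n_per_group - 2))  # textbook value: 2.228
```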

In computing p, the computer assumes that you did not intend to collect data for some time period and then stop; you did not intend to collect more or less data based on an analysis of the early results; you did not intend to have any lost data replaced by additional collection. Moreover, you did not intend to run any other conditions ever again, or compare your data with any other conditions. If you had any of these other intentions, or if the analyst believes you had any of these other intentions, the p value can change dramatically.
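As a sketch of that sensitivity (again my own toy code, not the author’s example): suppose the intention had been “collect data for a fixed time period,” so that the per-group sample size is itself random; the Poisson(6) choice below is arbitrary. The 5% cutoff then comes from a different null distribution than the fixed-N = 6 one above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
reps = 50_000

# Same null hypothesis, different intention: N per group is random (fixed-duration sampling)
t_null = np.empty(reps)
for i in range(reps):
    n = max(3, rng.poisson(6))                 # random sample size, floored at 3
    a = rng.normal(0.0, 1.0, size=n)
    b = rng.normal(0.0, 1.0, size=n)
    t_null[i] = stats.ttest_ind(a, b).statistic

# Compare this two-tailed 5% cutoff with the fixed-N value of about 2.23
print(np.quantile(np.abs(t_null), 0.95))
```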

AUTHOR: John Kruschke. The Road to Null Hypothesis Testing is Paved with Good Intentions.