Posts Tagged ‘robustness’

April 17, 2012

As nice as it is to be able to assume normality, … there are problems. The most obvious problem is that we could be wrong.

One … very nice thing … is that, in many situations, … [being wrong] won’t send us immediately to jail without passing “Go.” Under a … broad set of conditions … our assumption [could be wrong, yet we] get away with it. By this I mean that our answer may still be correct even if our assumption is false. This is what we mean when we speak of a [statistic] … being robust.

However, this still leaves at least two problems. In the first place, it is not hard to create reasonable data that violate a normality (or homogeneity of variance) assumption and have “true” answers that are quite different from the answer we would get by making a normality assumption. In other words, we can’t always get away with violating assumptions. Second, there are many situations where even with normality, we don’t know enough about the statistic we are using to draw the appropriate inferences.

One way to look at bootstrap procedures is as procedures for handling data when we are not willing to make assumptions about the parameters of the populations from which we sampled. The most that we are willing to assume (and it is an absolutely critical assumption) is that the data we have are a reasonable representation of the population from which they came. We then resample from the pool of data that we have, and draw inferences about the corresponding population and its parameters.

The second way to look at bootstrap procedures is to think of them as what we use when we don’t know enough.

David Howell
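
To make Howell's description concrete, here is a minimal sketch (in Python with numpy; the data values, seed, and function name are my own illustration, not from the post) of a percentile bootstrap for the difference between two group means. The only assumption it carries over is the critical one Howell names: that the sample is a reasonable stand-in for its population.

    import numpy as np

    rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

    def bootstrap_ci_mean_diff(a, b, n_boot=10_000, alpha=0.05):
        # Percentile bootstrap CI for the difference in group means.
        # Each group is resampled with replacement, treating the observed
        # data as the best available stand-in for its population.
        a, b = np.asarray(a, float), np.asarray(b, float)
        diffs = np.empty(n_boot)
        for i in range(n_boot):
            diffs[i] = (rng.choice(a, size=a.size, replace=True).mean()
                        - rng.choice(b, size=b.size, replace=True).mean())
        lo, hi = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
        return lo, hi

    # Hypothetical data: two small samples, no normality assumed.
    group1 = [4.1, 5.3, 6.0, 4.8, 7.2, 5.5]
    group2 = [3.2, 4.0, 5.1, 3.8, 4.4, 3.9]
    print(bootstrap_ci_mean_diff(group1, group2))

Nothing here relies on the shape of the underlying populations; the interval comes entirely from the resampled data.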

May 10, 2011

Null hypothesis testing is voodoo.

Changes in the mental state of the experimenter should not affect the objective inference of the experiment. This is an argument for using Bayesian data analysis instead of testing H0 against Ha.

Imagine you have a scintillating hypothesis about the effect of some different treatments on a metric dependent variable. You collect some data (carefully insulated from your hopes about differences between groups) and compute a t statistic for two of the groups. The computer program that tells you the value of t also tells you the value of p, the probability of getting a t at least that extreme by chance if the null hypothesis were true.

You want the p value to be less than 5%, so that you can reject the null hypothesis and declare that your observed effect is significant.

What is wrong with that procedure? Notice the seemingly innocuous step from t to p. The p value, on which your entire claim to significance rests, is conjured by the computer program with an assumption about your intentions when you ran the experiment. The computer assumes you intended, in advance, to fix the sample sizes in the groups.

In a little more detail, and this is important to understand, the computer figures out the probability that your t value could have occurred under the null hypothesis if the intended experiment were replicated many, many times. The null hypothesis sets the two underlying populations as normal populations with identical means and variances. If your data happen to have six scores per group, then, in every simulated replication of the experiment, the computer randomly samples exactly six data values from each underlying population and computes the t value for that random sample. Usually t is nearly zero, because the sample comes from a null hypothesis population in which there is zero difference between groups. By chance, however, the sample t value will sometimes fall fairly far above or below zero.

The computer does a bizillion simulated replications of the experiment. The top panel of Figure 1 shows a histogram of the bizillion t values. According to the decision policy of NHST, we decide that the null hypothesis is rejectable by an actually observed value t_obs if the probability that the null hypothesis generates a value as extreme or more extreme is very small, say p < 0.05. The arrow in Figure 1 marks the critical value t_crit at which the probability of getting a more extreme t value is 5%. We reject the null hypothesis if |t_obs| > t_crit. In this case, when N = 6 is fixed for both groups, t_crit = 2.23. This is the critical value shown in standard textbook t tables for a two-tailed t-test with 10 degrees of freedom.
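
Here is a small simulation sketch of exactly that procedure (in Python with numpy and scipy; the seed and the stand-in for "a bizillion" are arbitrary): draw six scores per group from one null population, compute t, repeat, and read off the cutoff that leaves 5% of the null t values beyond it.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n_per_group, n_reps = 6, 100_000  # a (small) "bizillion"

    # Every simulated replication: six scores per group, both groups drawn
    # from the same normal population (the null hypothesis).
    g1 = rng.normal(0.0, 1.0, size=(n_reps, n_per_group))
    g2 = rng.normal(0.0, 1.0, size=(n_reps, n_per_group))
    t_values = stats.ttest_ind(g1, g2, axis=1).statistic

    # The cutoff beyond which only 5% of null t values fall (two-tailed),
    # next to the textbook critical value for 10 degrees of freedom.
    print(np.percentile(np.abs(t_values), 95))  # ~2.23
    print(stats.t.ppf(0.975, df=10))            # 2.228...

The simulated cutoff lands right on the textbook t_crit = 2.23, because with fixed N = 6 per group the null t values follow the t distribution with 10 degrees of freedom.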

In computing p, the computer assumes that you did not intend to collect data for some time period and then stop; you did not intend to collect more or less data based on an analysis of the early results; you did not intend to have any lost data replaced by additional collection. Moreover, you did not intend to run any other conditions ever again, or compare your data with any other conditions. If you had any of these other intentions, or if the analyst believes you had any of these other intentions, the p value can change dramatically.
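
To see how much intentions can matter, here is a quick hypothetical illustration (sample sizes, peeking schedule, and replication count are all made up) of one of the intentions above: analyzing the early results and stopping as soon as p < .05. Each individual t-test is computed in the standard way, yet the overall rate of false alarms is no longer 5%.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)

    def false_positive_rate(n_max=30, n_min=6, peek_every=2, n_reps=5_000):
        # Fraction of null experiments declared "significant" when the
        # experimenter tests after every few subjects and stops at p < .05.
        hits = 0
        for _ in range(n_reps):
            g1 = rng.normal(0.0, 1.0, n_max)
            g2 = rng.normal(0.0, 1.0, n_max)
            for n in range(n_min, n_max + 1, peek_every):
                if stats.ttest_ind(g1[:n], g2[:n]).pvalue < 0.05:
                    hits += 1  # stop collecting and declare a discovery
                    break
        return hits / n_reps

    print(false_positive_rate())  # well above the nominal 0.05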

John Kruschke, The Road to Null Hypothesis Testing is Paved with Good Intentions

December 16, 2010

Antifragility

Nassim Taleb wants us all to go long vol — not just be able to withstand volatility, but to actively seek it out.

He can certainly bet that way (and does — though it’s not paying off), but it’s a bad idea to make society anti-fragile.

Let me define a few words describing potential responses to volatility:

  • fragile — Taleb means systems that break when catastrophic volatility is applied; he’s thinking of people who are deeply short volatility, or at least indirectly bet on stability
  • robust — like a bridge, or an earthquake-resistant building: built to withstand shocks
  • agile — able to adapt to shocks
  • anti-fragile — shock-loving; shock-seeking; volatility-loving; risk-avid

Taleb points out that there is no word for “the opposite of fragile”; only for “not fragile”. True.

But we really shouldn’t try to make the system break when there are no catastrophes. Imagine a bridge that shattered only, and always, when no cars drove on it. Or a building that toppled only, and always, when no earthquakes were shaking it. (Those would be anti-fragile things.)

It would be stupid to build things that way. The same goes for the financial system: we want to be prepared for bad times but also ready to capitalize on good times. A mouse that’s so afraid of cats that it never goes looking for food will die.

What makes anti-fragility an especially bad idea in finance is that people might try to sabotage, tweak, or influence the system to make their bet pay off. Let’s say some powerful crook is long volatility; that is, s/he only gets paid if some huge catastrophe happens within the next year. Maybe s/he will engineer a catastrophe. That could be truly terrible.