Monday, November 8, 2010

Just an Idea to Throw Out There

Bayesian Non-Parametric Models as the Appropriate Null Hypothesis.

EX: The Cascading Indian Buffet Process

I was doing some casual web surfing when I came across a set of slides Hanna Wallach made regarding a generative model for deep belief networks (link above). I always liked DBNs but felt that they were almost too general. Add enough layers, give them enough data, and they can do just about anything.

That's when something which is probably obvious to many people doing Bayesian non-parametrics finally occurred to me: the cascading Indian Buffet Process may constitute the Bayesian equivalent of a null hypothesis (at least for directed graphical models of this kind). After all, beyond the directed structure itself, these models have basically no assumptions built into them. Nonetheless they are quite complex, much more so than the standard null hypothesis of 'no relationship', which is almost surely false. Structured models appropriate to the data should at least be better than these assumption-free models. That makes it a sad statement when these models actually are the best performers: when the cascading IBP wins, we should probably conclude that we really don't understand squat about the mechanism generating the data.
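For readers who haven't met the building block, here is a minimal sketch of drawing a binary feature matrix from a single-layer IBP prior (the cascading version stacks such matrices into layers, each one wiring up the units of the layer below). This is illustrative code of my own, not anything from the slides:

```python
import numpy as np

def sample_ibp(num_customers, alpha, rng=None):
    """Draw a binary feature matrix Z from the Indian Buffet Process prior.

    Rows are 'customers' (data points); columns are 'dishes' (latent
    features). Customer n takes an existing dish k with probability
    m_k / n, where m_k is the number of earlier customers who took it,
    then samples Poisson(alpha / n) brand-new dishes.
    """
    rng = np.random.default_rng(rng)
    dish_counts = []   # m_k for each dish sampled so far
    rows = []          # set of dish indices taken by each customer
    for n in range(1, num_customers + 1):
        # Existing dishes: popular features are more likely to be reused.
        taken = {k for k, m in enumerate(dish_counts)
                 if rng.random() < m / n}
        # New dishes: later customers invent fewer of them.
        for _ in range(rng.poisson(alpha / n)):
            taken.add(len(dish_counts))
            dish_counts.append(0)
        for k in taken:
            dish_counts[k] += 1
        rows.append(taken)
    # Convert to a dense 0/1 matrix.
    Z = np.zeros((num_customers, len(dish_counts)), dtype=int)
    for i, taken in enumerate(rows):
        Z[i, list(taken)] = 1
    return Z
```

The point of the construction is exactly the 'too general' property above: the number of features is unbounded and grows with the data, so nothing about the problem domain is baked in.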

Anyway, just a thought... and a Chardonnay induced one at that.

Sunday, February 21, 2010

Stop saying that it's JUST noise

There is a comical trend these days of labeling data incompatible with one's theories or views as 'just noise' or as 'simply random fluctuations'. The suggestion implicit in statements such as these is that you should somehow discount that data point. This is confirmation bias at work, and it is utter folly. Just as every data point compatible with a particular hypothesis should increase your degree of belief in the theory, so too should every disconfirming data point decrease that degree of belief. Perhaps not by much, but it must be taken into account and given equal weight with all previous data points.
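In Bayesian terms this is just the symmetry of Bayes' rule: the same update machinery that raises your belief on confirming evidence must lower it on disconfirming evidence. A toy numeric sketch (the likelihood values are arbitrary, chosen only for illustration):

```python
def bayes_update(prior, p_data_given_h, p_data_given_not_h):
    """Posterior P(H | data) from the prior P(H) and the two likelihoods."""
    num = prior * p_data_given_h
    return num / (num + (1 - prior) * p_data_given_not_h)

belief = 0.5
# An observation more likely under H than under not-H raises belief...
belief = bayes_update(belief, 0.8, 0.4)
# ...and one less likely under H necessarily lowers it. There is no
# setting of the likelihoods that lets disconfirming data leave belief
# unchanged, short of declaring the data uninformative for everyone.
belief = bayes_update(belief, 0.3, 0.6)
```

Discounting the second observation while keeping the first is simply not a move the arithmetic permits.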

This view is hardly novel, but it is worth repeating. Now that I have you nodding in agreement (and likely questioning the depth of this post) there is one ever so slightly more subtle point I would like to politely make.


The noise is not just some random fluctuation that obscures whatever signal our theory predicts. No, noise is how we model ignorance. It may be that there is some fundamental limit to our knowledge about something (as in quantum mechanics), but, in general, the physics is in the fluctuations, and the things we should be trying to understand tomorrow are the stuff we were forced to label as noise today. Even in the presence of rational inference, calling something noise and suggesting we forget about it is the equivalent of declaring the scientific process finished, and that is something I hope we never do, regardless of how politically convenient it might be.

Q: What do you call a theory that is equally compatible with any data set?

Unscientific is the polite answer.

Time Magazine Then (June 17, 2009):
'Warming will make skiing, ice-skating and snowmobiling pastimes of the past in many areas of the Northeast, decimating the multibillion-dollar winter-sports industry.'

Time Magazine Now (Feb 10, 2010):
'Climate change could in fact make such massive snowstorms more common, even as the world continues to warm.'

National Geographic has an equally short memory.

National Geographic Then (2009):
'Droughts will become more common'

National Geographic Now (Feb. 2010):
'Scientists say global warming is the main culprit behind this month's eastern-U.S. snowstorms—and it could cause more heavy snowfalls in future winters.'

Not credible on any topic (other than pretty pictures).