Why do simple techniques work?

My past few posts have been driven by an underlying question that was pointedly raised in a discussion group I follow on LinkedIn (if you’re curious, it is a quant finance group that I follow due to my interest in autonomous agent design, and the question was posed by a hedge fund person with a Caltech PhD and a Wharton MBA):

I read Ernest Chan’s book on quantitative trading. He said that he tried a lot of complicated, advanced quantitative tools, and it turned out that he kept losing money. He eventually found that the simplest things often generated the best returns. From your experience, what do you think about the value of advanced econometric or statistical tools in developing quantitative strategies? Are these advanced tools (say, wavelet analysis, frequency-domain analysis, state space models, stochastic volatility, GMM, GARCH and its variations, advanced time series modeling and so on) more like alchemy in scientific camouflage, or do they really have some value? Stochastic differential equations might have some value in trading vol, but I am talking about quantitative trading of futures, equities and currencies here. Now, technical indicators, Kalman filters, cointegration, regression, PCA and factor analysis have been proven to be valuable in quantitative trading. I am not so sure about anything beyond these simple techniques.

This is not just a question about trading. The exact same question comes up over and over in robotics, and I have tried to address it in my published work.

My take on this issue is that before one invokes a sophisticated inference algorithm, one has to have a sensible way to describe the essence of the problem – you can only learn what you can succinctly describe and represent! All too often, when advanced methods do not work, it is because they are being used with very little understanding of what makes the problem hard. Often, there is a fundamental disconnect: the only people who truly understand the sophisticated tools are tool developers, who are more interested in applying their favourite tool(s) to any given problem than in really understanding a problem and asking what the simplest tool for it is. Moreover, how many people out there have a genuine feel for Hilbert spaces and infinite-dimensional estimation while also having the practical skills to solve problems in constrained ‘real world’ settings? Anyone with this rare combination would be ideally placed to solve the complex problems we are all interested in, whether using simple methods or more sophisticated ones (i.e., it is not just about tools, but about knowing when to use what and why). But such people are rare indeed.

5 thoughts on “Why do simple techniques work?”

  1. Hi Ram!

    If you’re starting with nothing, the simple tools provide an enormous improvement over what you already have. After that, more complicated tools can at best provide asymptotic improvements toward the optimum. Combine this with the fact that in many real-world prediction problems (e.g., trading) the Bayes-optimal solution is actually pretty poor compared to “perfect” performance (e.g., what a time traveler from the future could do), and advanced tools are often disappointing. (Although this doesn’t explain why they would actually lose money in financial prediction.)
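
    To make the gap concrete, here is a toy simulation (my own sketch, with entirely synthetic numbers): even the Bayes-optimal predictor of a mostly-noise return series captures only a small fraction of what a hindsight trader would earn.

        # Toy model: returns = small predictable signal + large unpredictable noise.
        # The Bayes-optimal forecast given the signal is the signal itself;
        # the "time traveler" sees the realized return.
        import numpy as np

        rng = np.random.default_rng(0)
        n = 100_000
        signal = rng.normal(0, 0.1, n)   # small predictable component
        noise = rng.normal(0, 1.0, n)    # dominant unpredictable component
        returns = signal + noise

        bayes_pnl = np.mean(np.sign(signal) * returns)   # trade on the signal
        hindsight_pnl = np.mean(np.abs(returns))         # trade on the realized move

        print(f"Bayes-optimal profit per trade: {bayes_pnl:.4f}")     # ~0.08
        print(f"Hindsight profit per trade:     {hindsight_pnl:.4f}")  # ~0.80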

    In addition, complicated tools usually have large and poorly understood parameter spaces. The researchers who originally published the work on those tools probably spent the better part of a Ph.D. program learning the black art of setting their parameters, but that art is rarely, if ever, passed along in any systematic way in the literature. In fact, the researchers themselves may not be aware of the depth of their own expertise until they see someone else using their techniques and say, “No, no! Those parameters don’t make any sense because your data has property X.” In many contexts time is quite literally equivalent to money, so it’s hardly surprising that, for example, Naive Bayes, with its one free parameter, does so well.
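
    As a concrete (synthetic) illustration of how small that parameter space is, here is a sketch using scikit-learn’s MultinomialNB, where the entire tuning surface is the single smoothing constant alpha:

        # Naive Bayes on synthetic "word count" data: the only knob is alpha.
        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.naive_bayes import MultinomialNB

        rng = np.random.default_rng(0)
        # Two classes of "documents" with different word rates over 4 "words".
        X0 = rng.poisson(lam=[3, 1, 1, 1], size=(200, 4))
        X1 = rng.poisson(lam=[1, 1, 1, 3], size=(200, 4))
        X = np.vstack([X0, X1])
        y = np.array([0] * 200 + [1] * 200)

        for alpha in (0.01, 0.1, 1.0, 10.0):
            acc = cross_val_score(MultinomialNB(alpha=alpha), X, y, cv=5).mean()
            print(f"alpha={alpha:>5}: cv accuracy = {acc:.3f}")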

    Often the real problem isn’t that the advanced techniques “don’t work”, it’s that the cost of figuring out how to make them work is (or seems) greater than the marginal gains over the simpler methods.

    On the other hand, I think that within certain communities there often arises the “cultural illusion” that techniques don’t work, because “so-and-so tried it, and it didn’t work.” But we all know how hard it is to really prove a negative result. How do you know if so-and-so really knew what he was doing? Maybe Ernest Chan didn’t know what he was doing? In the late 1960s, Minsky and Papert published a book (Perceptrons) showing that a single-layer perceptron cannot solve non-linearly separable classification tasks, and effectively killed neural network research for 15 years. Of course, it was later shown that multi-layer perceptrons can be trained to solve non-linearly separable problems.
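
    The XOR function is the canonical example; here is a small sketch of the contrast (my illustration, using scikit-learn):

        # A single-layer perceptron cannot separate XOR; a small MLP can.
        import numpy as np
        from sklearn.linear_model import Perceptron
        from sklearn.neural_network import MLPClassifier

        X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
        y = np.array([0, 1, 1, 0])  # XOR: not linearly separable

        linear = Perceptron(max_iter=1000).fit(X, y)
        mlp = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                            max_iter=5000, random_state=0).fit(X, y)

        print("single-layer:", linear.predict(X))  # cannot reproduce y
        print("multi-layer: ", mlp.predict(X))     # typically recovers [0 1 1 0]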

    I recently had a conversation with some algorithmic traders, and I asked whether they had tried any reinforcement learning methods for trading. They replied very dismissively that they had “tried that and it didn’t work”. I wasn’t really able to ask them exactly what they had tried, but it was clear to me that (a) they had no RL expertise at all, and (b) their experience had caused them to foreclose on the possibility of any success with RL in the future. Is this their failure, or a failure of the RL research community to publish their research in such a way that non-expert practitioners can get some benefit from it? (If such a thing is even possible.)
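
    For concreteness, here is a deliberately minimal tabular Q-learning sketch of the kind of thing one might try; everything below is a hypothetical toy market with mild momentum, not what those traders attempted.

        # Toy RL trading sketch: states = last price move (down/flat/up),
        # actions = position (short/flat/long), reward = position * next move.
        import numpy as np

        rng = np.random.default_rng(0)
        n_states, n_actions = 3, 3
        Q = np.zeros((n_states, n_actions))
        lr, gamma, epsilon = 0.1, 0.9, 0.1

        # Transition probabilities with mild momentum: moves tend to repeat.
        P = np.array([[0.5, 0.2, 0.3],
                      [0.3, 0.4, 0.3],
                      [0.3, 0.2, 0.5]])

        state = 1
        for _ in range(50_000):
            if rng.random() < epsilon:
                action = rng.integers(n_actions)   # explore
            else:
                action = int(Q[state].argmax())    # exploit
            next_state = rng.choice(n_states, p=P[state])
            reward = (action - 1) * (next_state - 1)  # position times move
            Q[state, action] += lr * (reward + gamma * Q[next_state].max()
                                      - Q[state, action])
            state = next_state

        # Expected policy: short after a down move, long after an up move.
        print("policy per state (0=short, 1=flat, 2=long):", Q.argmax(axis=1))

    Note that even this toy involves choices of exploration rate, learning rate, discount and state design, which is exactly the parameter-space problem above.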

    • Is this their failure, or a failure of the RL research community to publish their research in such a way that non-expert practitioners can get some benefit from it? (If such a thing is even possible.)

      My view is that it is very hard to publish in the research literature in the hope that non-expert practitioners will pick up on it. It sometimes happens, and I know that practitioners do pay attention to claims made in the research literature (e.g., my most downloaded paper – by a very large margin – is a trading agents paper based on my entry to the PLAT competition, which was a ‘side project’; my robotics work, the core research that I spent years on, hasn’t attracted anywhere near that kind of attention on the web).

      However, it is better for algorithms researchers to venture out and take the result far enough to make it count, preferably collaborating with practitioners. This not only makes the original research worthwhile, but also informs the researcher of the problems that actually matter – making the next research question that much more relevant.

  2. Jeff,

    I agree with the incremental benefit argument. We just submitted a trading-related paper to a workshop, intended as a first cut at a more elaborate conceptual idea, and we were surprised by how well the first pass already performs. In this particular case, the benefit of more complex methods would be to make performance more robust across a broader range of scenarios and, we expect, to gain a bit of quantitative improvement as well.

    The poorly understood parameter space issue is what seems to make the comparison tricky. In our own work, we find that it is nontrivial to scale from our simple first experiment to the more elaborate real one. I am quite sure there is improvement to be had, but there is a price in terms of modelling effort. Since I am an academic with an interest in such things, I’ll put in the effort. But I can easily see this becoming too steep a barrier.

    Also, just to clarify, Ernest Chan is a theoretical physics PhD with substantial experience in major investment banks. So, I suspect he really did have the skills to handle the advanced tools. His most recent book is based on disillusionment with his experience there. Of course, beyond that, it is hard to know whether or not he suffers from the same biases based on failures of partial experiments.

  3. My view on this: it’s not the methods but the underlying assumptions. Complicated methods call for various simplifications, and these may be spurious. So, while the analysis might be spot-on, the behavior of the market, other actors, extraneous factors and even the model of “randomness” chosen may be misunderstood. Simple techniques usually call for simplistic but empirically verified assumptions that are most likely to work.

  4. I never believe there is a “really simple” problem out there. Any research problem has a context that can involve many uncertain factors and unexpected disturbances. However, I also believe that some people have the talent, if I can call it a talent, to intuitively solve problems that can hardly be formulated formally and worked through in a strictly scientific way. The paragraph you quoted came from “a hedge fund person with a Caltech PhD and a Wharton MBA”; considering this person’s complex background and the topic he was trying to address, I personally think it is partly an “art” problem. I strongly agree with you that people should really understand the representation of the problem in question before employing the corresponding methodology. But from an engineering point of view, I am not sure it is even possible to know all the context and the root of the problem before you start to crack on it.
