I have been following Rick Bookstaber’s blog and there is a very interesting discussion there about robustness in the face of large scale uncertainty. As a conceptual issue, this has been on my mind for a while. In fact, this issue was one of the motivations behind some aspects of my doctoral thesis and I am actively trying to address this question in the context of my robotics and autonomous agents work. All this has made me realize that this is one of the big hard problems (however, somewhat under-appreciated) within the science of decision processes.
Bookstaber’s take on this issue is well put in this paper in the Journal of Theoretical Biology. In a nutshell, he argues that if an animal has to make decisions in a world where it can observe some states, s, but not some other states, t, (whose evolution it simply can not model, i.e., the concept of extended uncertainty) then the process of making an optimal decision in this setting would imply “averaging” over the extended uncertainty in a proper decision-theoretic sense. The main implication of this process is that the animal will adopt a set of coarse behaviours that will be sub-optimal with respect to many observable features in the restricted state space, s. There is a lot more detail in that paper, mainly focussed on theoretical biology issues of animal behaviour.
I find this particularly interesting because I too had been approaching the problem of learning robust behaviours by assuming that there are coarse behaviours (however, defined differently from how Bookstaber does it) for most tasks of real interest, such that fine behaviours with respect to specific quantitative optimality criteria are specializations of these coarse behaviours. Two questions arise within this program. How will you find concise descriptions of these coarse behaviours while learning from experience in a complex world? Many learning techniques, such as reinforcement learning, run into deep computational difficulties in this hierarchical setting. However, this is a well posed technical question already being addressed by many people, and I am actively working on as well. A somewhat more foundational conceptual question is – why do you think these types of coarse behaviours will exist in general? I am a control theorist who would ideally like to find solutions for these learning problems that are indepdent of the special domains (e.g., is it clear that coarse behaviours would exist in arbitrary decision processes such as, say, in autonomous trading), so this generality question is of interest to me. Bookstaber’s paper points towards the answer in a general decision-theoretic sense.
In an earlier post, I tried to make an argument that the point of “intelligence” is to be able to act robustly in an uncertain world. Now we see that optimal decisions in the face of extended uncertainty implies the existence of coarse behaviours for many common decision tasks. So, perhaps, an agent who is learning a complex behaviour in an uncertain world is better off structuring this process in a similar multi-level way…