I will be giving a talk, as part of the IPAB seminar series, through which I will try to further develop this framing of the problem which I expect our group to try to solve in the medium term. In a certain sense, my case will hark back to fairly well established techniques familiar to engineers but slowly lost with the coming of statistical methods into the robotics space. Also, many others are picking up on the underlying needs, e.g., this recent article in the MIT Technology Review gives a popular account of current sentiment among some in the AI community.
The program induction route to explainability and safety in autonomous systems
The confluence of advances in diverse areas including machine learning, large scale computing and reliable commoditised hardware has brought autonomous robots to the point where they are poised to be genuinely a part of our daily lives. Some of the application areas where this seems most imminent, e.g., autonomous vehicles, also bring with them stringent requirements regarding safety, explainability and trustworthiness. These needs seem to be at odds with the ways in which recent successes have been achieved, e.g., with end-to-end learning. In this talk, I will try to make a case for an approach to bridging this gap, through the use of programmatic representations that intermediate between opaque but efficient learning methods and other techniques for reasoning that benefit from ‘symbolic’ representations.
I will begin by framing the overall problem, drawing on some of the motivations of the DARPA Explainable AI programme (under the auspices of which we will be starting a new project shortly) and on extant ideas regarding safety and dynamical properties in the control theorists’ toolbox – also noting where new techniques have given rise to new demands.
Then, I will shift focus to results from one specific project, for Grounding and Learning Instances through Demonstration and Eye tracking (GLIDE), which serves as an illustration of the starting point from which we will proceed within the DARPA project. The problem here is to learn the mapping between abstract plan symbols and their physical instances in the environment, i.e., physical symbol grounding, starting from cross-modal input that combines high-level task descriptions (e.g., from a natural language instruction) with a detailed video or joint angles signal. This problem is formulated in terms of a probabilistic generative model and addressed using an algorithm for computationally feasible inference to associate traces of task demonstration to a sequence of fixations which we call fixation programs.
I will conclude with some remarks regarding ongoing work that explicitly addresses the task of learning structured programs, and using them for reasoning about risk analysis, exploration and other forms of introspection.
I found this article, and the associated discussions about what exactly is needed for a useful level of autonomy, to be really interesting: http://nyti.ms/1LRy9MF.
A point that immediately stands out is this: “Researchers in the fledgling field of autonomous vehicles say that one of the biggest challenges facing automated cars is blending them into a world in which humans don’t behave by the book.” Roboticists should of course realise that this is the real and complete problem – we can’t just complain about problem humans who do not ‘behave by the book’ – that is exactly the wrong way to approach the design of a usable product! Instead, we need to focus on how to make the autonomous system capable enough to learn and reason about the world – including other agents – despite their idiosyncrasies and irrationality! This really is the difference between the rote precision of old and genuinely robust autonomy of the future.
In our own small way, we have been approaching such issues with projects such as the following:
If you are a UK student looking to work on a PhD project in this area, look into this studentship opening: http://www.edinburgh-robotics.org/vacancy/studentship/571.
Interesting observation (by Kim Hill, social anthropologist, Arizona State University) that I came across while reading something today:
“Humans are not special because of their big brains. That’s not the reason we can build rocket ships — no individual can. We have rockets because 10,000 individuals can cooperate in producing the information.”
A very nice theme that has emerged from research on perception, during the past couple of decades, is the idea that the apparent complexity of a variety of things ranging from images to natural language may be described as synthesis from a much smaller basis. These kinds of generative models are the bread and butter of one of the major branches of statistical machine learning today. One also finds interesting biological evidence, e.g., the famous basis set of Olshausen and Field – based on the nice observation that the natural statistics of images in the world induce the lower dimensional representation.
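To make "synthesis from a much smaller basis" concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the fixed cosine basis is a stand-in chosen for simplicity (Olshausen and Field learn their basis from natural images, which this does not attempt), and the names cosine_basis, synthesize and analyze are hypothetical helpers made up for this example.

```python
import math

def cosine_basis(n):
    """Orthonormal cosine (DCT-II) basis: n atoms, each a list of n samples.

    Assumption: a fixed analytic basis stands in for a learned one.
    """
    atoms = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        atoms.append([scale * math.cos(math.pi * (i + 0.5) * k / n)
                      for i in range(n)])
    return atoms

def synthesize(coeffs, atoms):
    """Generative direction: signal = sum of coefficient * atom."""
    n = len(atoms[0])
    return [sum(c * atom[i] for c, atom in zip(coeffs, atoms))
            for i in range(n)]

def analyze(signal, atoms):
    """Because the basis is orthonormal, coefficients are inner products."""
    return [sum(s * a for s, a in zip(signal, atom)) for atom in atoms]

n = 64
atoms = cosine_basis(n)

# A 64-sample signal generated from only two active atoms.
sparse_code = [0.0] * n
sparse_code[3] = 2.0
sparse_code[10] = -1.0
signal = synthesize(sparse_code, atoms)

# Going back from the signal to the code recovers the two active atoms.
recovered = analyze(signal, atoms)
active = [k for k, c in enumerate(recovered) if abs(c) > 1e-9]
print("active atoms:", active)
```

The signal has 64 samples but is fully described by two coefficients. What makes this tractable is that "goodness" is simply reconstruction error, which is the well-posed criterion that the decision-making side lacks.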
Where is the action/decision equivalent of these? To be sure, people have applied all of the heavy machinery, ranging from Bayesian Reinforcement Learning to compression of value functions. But, somehow, the payoff from this seems ambiguous at best. We don’t really have robots that can go off and perform wildly different tasks using a nice clean basis of policies that are quickly composed.
One obvious distinction, that often doesn’t get discussed except by people with deep background, is that the perception problem has the singular advantage of admitting a relatively well posed notion of goodness/penalty, which makes statistics well posed too. I want to reconstruct some aspect of the image I see – this was the raison d’être of a whole variety of approximation theory people in applied math before the vision folks showed up asking for it! On the other hand, describing all possible decision making problems in a finite setting requires a much more elaborate notion of invariance. Could it be that the final answer to this question will demand more recent mathematical tools, perhaps even topology and the mathematics behind game theory?
I helped organize one of the events in the Edinburgh International Science Festival – a talk by Prof. Simon Tett entitled A gentle introduction to climate modelling. As one of the hosts of the event, I had dinner with the speaker and had the opportunity to discuss his views regarding what is hard about modelling these types of processes.
Clearly, the weather seems rather unpredictable and complex. So, what does it take to understand it well enough to be able to make long term predictions, such as for climate change related questions? Is data really the bottleneck as many people believe?
Simon pointed out that most of these models are based on a somewhat coarse discretization of the underlying process. For instance, fluid and heat transfer models might be based on cells that are something like 100 square miles across. Clearly, this is going to be ignoring a significant amount of fine structure – some part of which will undoubtedly have larger scale implications. Moreover, the simple solution of just throwing more processors at the problem doesn’t suffice because the problem scales poorly and there is significant sequential structure to these simulations. So, in his view, the hard questions all have to do with structuring models so that both the coarse and fine dynamics can be reasonably captured concisely. Clearly, uncertainty and probabilities play a key role. However, this is quite different from the naive approach of just collecting more and more data in order to refine the underlying distributions. The trick is to first structure the model well enough that as more and more data comes in we really do get a sequentially improved idea of what we really want to understand – the dynamical behaviour of this large system.
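A rough way to see why finer cells are expensive, and why extra processors only help so much, is to count the work in a toy problem. The sketch below is my own illustration rather than anything from the talk: it assumes a one-dimensional heat equation solved with the simplest explicit finite-difference scheme, whose standard stability limit ties the time step to the square of the cell size. Real climate codes use far more sophisticated schemes; the function name and parameter values are hypothetical.

```python
import math

def explicit_heat_cost(n_cells, length=1.0, alpha=1.0, t_end=0.1):
    """Work estimate for an explicit 1D heat-equation solve.

    Assumption: forward-time, centred-space differencing, which is only
    stable when dt <= dx**2 / (2 * alpha).
    Returns (sequential time steps, total cell updates).
    """
    dx = length / n_cells
    dt_max = dx * dx / (2.0 * alpha)      # largest stable time step
    n_steps = math.ceil(t_end / dt_max)   # steps must run one after another
    return n_steps, n_steps * n_cells

costs = {n: explicit_heat_cost(n) for n in (10, 20, 40)}
for n, (steps, updates) in costs.items():
    print(f"{n:3d} cells -> {steps:4d} sequential steps, {updates:6d} cell updates")
```

In this toy setting, halving the cell size roughly quadruples the number of time steps and multiplies the total work by about eight. The cell updates within one step can be spread over processors, but the steps themselves have to be taken in order, which is the sequential structure mentioned above.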
I have been following Rick Bookstaber’s blog and there is a very interesting discussion there about robustness in the face of large scale uncertainty. As a conceptual issue, this has been on my mind for a while. In fact, this issue was one of the motivations behind some aspects of my doctoral thesis and I am actively trying to address this question in the context of my robotics and autonomous agents work. All this has made me realize that this is one of the big hard problems (however, somewhat under-appreciated) within the science of decision processes.
Bookstaber’s take on this issue is well put in this paper in the Journal of Theoretical Biology. In a nutshell, he argues that if an animal has to make decisions in a world where it can observe some states, s, but not some other states, t (whose evolution it simply cannot model, i.e., the concept of extended uncertainty), then the process of making an optimal decision in this setting would imply “averaging” over the extended uncertainty in a proper decision-theoretic sense. The main implication of this process is that the animal will adopt a set of coarse behaviours that will be sub-optimal with respect to many observable features in the restricted state space, s. There is a lot more detail in that paper, mainly focussed on theoretical biology issues of animal behaviour.
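A toy version of this argument can be written down in a few lines. To be clear, this is my own sketch and not the model in the paper: the actions, payoffs and probabilities are made-up numbers, and I cheat by giving the hidden state t a known distribution so that the averaging can be computed explicitly.

```python
# Hypothetical setup: two observable situations, one hidden state.
ACTIONS = ["specialist_0", "specialist_1", "generalist"]
OBSERVED = [0, 1]
HIDDEN = {"normal": 0.7, "shock": 0.3}   # assumed belief over hidden state t

def payoff(action, s, t):
    """Made-up payoffs: specialists are tuned to s but fragile to t."""
    if action == "generalist":
        return 6.0                        # same payoff whatever happens
    if t == "shock":
        return -20.0                      # any specialist is badly hurt
    return 10.0 if action == f"specialist_{s}" else 0.0

def best_action(s, belief):
    """Pick the action with the highest payoff averaged over the belief."""
    def expected(action):
        return sum(p * payoff(action, s, t) for t, p in belief.items())
    return max(ACTIONS, key=expected)

# Ignoring extended uncertainty: act as if t were always "normal".
fine = {s: best_action(s, {"normal": 1.0}) for s in OBSERVED}

# Averaging over the hidden state before choosing.
coarse = {s: best_action(s, HIDDEN) for s in OBSERVED}

print("fine-tuned policy:", fine)
print("averaged policy:  ", coarse)
```

With these numbers the fine-tuned agent picks a different specialist for each observed s, while the averaging agent picks the generalist everywhere. That single rule looks sub-optimal whenever one only considers s (6 instead of 10 in normal times), yet it is the better choice once the hidden state is taken into account.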
I find this particularly interesting because I too had been approaching the problem of learning robust behaviours by assuming that there are coarse behaviours (however, defined differently from how Bookstaber does it) for most tasks of real interest, such that fine behaviours with respect to specific quantitative optimality criteria are specializations of these coarse behaviours. Two questions arise within this program. How will you find concise descriptions of these coarse behaviours while learning from experience in a complex world? Many learning techniques, such as reinforcement learning, run into deep computational difficulties in this hierarchical setting. However, this is a well posed technical question already being addressed by many people, and one that I am actively working on as well. A somewhat more foundational conceptual question is – why do you think these types of coarse behaviours will exist in general? I am a control theorist who would ideally like to find solutions for these learning problems that are independent of the special domains (e.g., is it clear that coarse behaviours would exist in arbitrary decision processes such as, say, in autonomous trading?), so this generality question is of interest to me. Bookstaber’s paper points towards the answer in a general decision-theoretic sense.
In an earlier post, I tried to make an argument that the point of “intelligence” is to be able to act robustly in an uncertain world. Now we see that optimal decisions in the face of extended uncertainty imply the existence of coarse behaviours for many common decision tasks. So, perhaps, an agent who is learning a complex behaviour in an uncertain world is better off structuring this process in a similar multi-level way…
For many years now, beginning with some questions that were part of my doctoral dissertation research, I have been curious about multi-level models that describe phenomena and strategies. A fundamental question that arises in this setting is regarding which direction (top-down/bottom-up) takes primacy.
A particular sense in which this directly touches upon my work is in the ability of unsupervised and semi-supervised learning methods to model “everything of interest” in a complex domain (e.g., robotics) so that any detailed analysis of the domain is rendered unnecessary. A claim that is often made is that the entire hierarchy will just emerge from the bottom-up. My own experience with difficult problems such as synthesizing complex humanoid robot behaviours makes me sceptical of the breadth of this claim. I find that, often, the easily available observables do not suffice and one needs to work hard to get the true description. However, I am equally sceptical of the chauvinistic view that the only way to solve problems is to model everything in the domain and dream up a clever strategy, or the defeatist view that the only way to solve the problem is to look at pre-existing solutions somewhere else and copy them. Instead, in my own work, I have searched for a middle ground where one seeks general principles on both ends of the spectrum and tries to tie them together efficiently.
Recently, while searching Google Scholar for some technical papers on multi-level control and learning, I came across an interesting philosophical paper (R.C. Bishop and H. Atmanspacher, Contextual emergence in the description of properties, Foundations of Physics 36(12):1753-1777, 2006) that makes the case that extracting deep organizational principles for a higher level from a purely bottom-up approach is, in a certain sense, a fundamentally ill-posed problem. Even in “basic” areas like theoretical physics one needs more context. Yet, all is not lost. What this really means is that there are some top-down contextual constraints (much weaker than arbitrary rigid prescriptions) that are necessary to make the two mesh together. You will probably have to at least skim the paper to get a better idea, but I think this speaks to the same issue I raise above and says something quite insightful.