One of our students, Daniel Angelov, has spent the past few months at Xerox PARC as an intern. He has been doing interesting things associated with the COGLE project I mentioned earlier, part of the DARPA XAI programme.
The folks at PARC have put up this interview on their blog, in which Daniel talks about his work and time there:
We have been awarded one of the projects under the DARPA Explainable AI programme, to be kicked off next week. Our project, entitled COGLE (Common Ground Learning and Explanation), will be coordinated by Xerox Palo Alto Research Centre, and I will be a PI leading technical efforts on the machine learning side of the architecture.
COGLE will be a highly interactive sense-making system for explaining the learned performance capabilities of an autonomous system and the history that produced that learning. COGLE will be initially developed using an autonomous Unmanned Aircraft System (UAS) test bed that uses reinforcement learning (RL) to improve its performance. COGLE will support user sensemaking of autonomous system decisions, enable users to understand autonomous system strengths and weaknesses, convey an understanding of how the system will behave in the future, and provide ways for the user to improve the UAS’s performance.
To do this, COGLE will:
- Provide specific interactions in sensemaking user interfaces that directly support modes of human explanation known to be effective and efficient in human learning and understanding.
- Support mapping (grounding) of human conceptualizations onto the RL representations and processes.
This area is increasingly discussed in the public sphere, in the context of the growing adoption of AI in daily life, e.g., see this article in the MIT Technology Review and this one in Nautilus, both referring directly to this DARPA programme. I look forward to contributing to this theme!
I will be giving a talk, as part of the IPAB seminar series, in which I will try to further develop this framing of the problem, which I expect our group to try to solve in the medium term. In a certain sense, my case will hark back to fairly well established techniques familiar to engineers but slowly lost with the coming of statistical methods into the robotics space. Also, many others are picking up on the underlying needs, e.g., this recent article in the MIT Technology Review gives a popular account of current sentiment among some in the AI community.
The program induction route to explainability and safety in autonomous systems
The confluence of advances in diverse areas including machine learning, large scale computing and reliable commoditised hardware has brought autonomous robots to the point where they are poised to be genuinely a part of our daily lives. Some of the application areas where this seems most imminent, e.g., autonomous vehicles, also bring with them stringent requirements regarding safety, explainability and trustworthiness. These needs seem to be at odds with the ways in which recent successes have been achieved, e.g., with end-to-end learning. In this talk, I will try to make a case for an approach to bridging this gap, through the use of programmatic representations that intermediate between opaque but efficient learning methods and other techniques for reasoning that benefit from ‘symbolic’ representations.
I will begin by framing the overall problem, drawing on some of the motivations of the DARPA Explainable AI programme (under the auspices of which we will be starting a new project shortly) and on extant ideas regarding safety and dynamical properties in the control theorists’ toolbox – also noting where new techniques have given rise to new demands.
Then, I will shift focus to results from one specific project, for Grounding and Learning Instances through Demonstration and Eye tracking (GLIDE), which serves as an illustration of the starting point from which we will proceed within the DARPA project. The problem here is to learn the mapping between abstract plan symbols and their physical instances in the environment, i.e., physical symbol grounding, starting from cross-modal input that provides the combination of high-level task descriptions (e.g., from a natural language instruction) and a detailed video or joint-angle signal. This problem is formulated in terms of a probabilistic generative model and addressed using an algorithm for computationally feasible inference to associate traces of task demonstration with a sequence of fixations which we call fixation programs.
I will conclude with some remarks regarding ongoing work that explicitly addresses the task of learning structured programs, and using them for reasoning about risk analysis, exploration and other forms of introspection.
My student, Emmanuel Kahembwe, is part of Team Edina – consisting of students and postdoctoral researchers from the School of Informatics at the University of Edinburgh – which is one of 12 teams competing for The Alexa Prize. The grand challenge is to build a socialbot that can converse coherently and engagingly with humans on popular topics for 20 minutes.
Let us wish them all the best. I am very curious to see what comes out of this competition!
It is often the case that one of the crucial factors underpinning wide scale adoption of any computational procedure is the availability of easy to use packages containing that procedure, usable by people who have only a passing familiarity with the innards.
Topological data analysis is one such toolbox, the inner workings of which have often seemed too hard to penetrate for anyone but the most mathematically well trained scientists (a very small population, indeed). Yet, the tool has many appealing characteristics, given how centrally relevant its core questions are to hierarchical representations, multi-scale modelling and so on.
In this context, it is nice to see this web based interface for persistent homology calculations, a key tool in the TDA toolbox:
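For readers who have not seen what such a calculation involves, here is a minimal Python sketch of the simplest case only: 0-dimensional persistent homology (connected components) of a point cloud, where every point is born at scale 0 and a component dies at the distance at which it merges into another. This is an illustration under stated assumptions, not the code behind any particular tool; the function name `zero_dim_persistence` and the sample points are hypothetical, and higher-dimensional features (loops, voids) need more machinery than shown here.

```python
import math
from itertools import combinations


def zero_dim_persistence(points):
    """Return (birth, death) pairs for connected components.

    Points are joined as the distance threshold grows; each merge of two
    components ends the life of one of them (union-find over sorted edges).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    pairs = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj          # two components merge at scale d
            pairs.append((0.0, d))   # one of them dies here
    if n:
        pairs.append((0.0, math.inf))  # the last component never dies
    return pairs


# Hypothetical sample: two well separated pairs of points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
diagram = zero_dim_persistence(points)
print(diagram)
```

In this sample, two components die early (at distance 1) and one dies late (at distance 9). The long-lived gap between those scales is the kind of multi-scale signal that persistence is meant to expose: the data looks like two clusters over a wide range of scales.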
I came across the following wonderful gems of tech history quotes in one of my recent readings:
The relational model is a particularly suitable structure for the truly casual user (i.e. a non-technical person who merely wishes to interrogate the database, for example a housewife who wants to make enquiries about this week’s best buys at the supermarket).
In the not too distant future the majority of computer users will probably be at this level.
(Date and Codd 1975; 95)
Casual users, especially if they were managers might want to ask a database questions that had never been asked before and had not been foreseen by any programmer.
As I was reading these, it occurred to me that many in my own area of robotics also often think in the same way. What would the proper abstractions look like in emerging areas such as robotics, mirroring the advances that have happened in the software space (e.g., contrast the above vision vs. Apple’s offerings today)? Our typical iPad-toting end user is going to speak neither SQL nor ROS, yet the goal of robotics is to let them flexibly and naturally program the machine – how can that actually be achieved?
- C.J. Date, E.F. Codd, The relational and network approaches: Comparison of the application programming interfaces, Proc. SIGFIDET (now SIGMOD) 1974.
- D Gugerli, Die Welt als Datenbank. Zur Relation von Softwareentwicklung, Abfragetechnik und Deutungsautonomie, Daten, Zurich, Berlin: Diaphanes, 2007.
This is the primary question shaping much of the public debate and discourse around the development of autonomous systems technologies, robots being the most visible and eye-catching example – occasionally quite scary when you see them being used, e.g., as carriers of weapons and bombs. As a roboticist who is often deeply ensconced in the technical side of developing capabilities, I find most public articles in the popular media to be ill-informed and hyperbolic. So, I was pleasantly surprised to read this McKinsey report. This is not quite popular media, but in the past I have found even some of these consulting company reports to be formulaic. The key point being made by the authors is that just because something can be automated does not mean it should be, or that it will be. In reality, a variety of different factors, including economic and social ones, will shape the path of these technologies – something all of us should pragmatically consider.
This key point is summarised in the following excerpt:
Technical feasibility is a necessary precondition for automation, but not a complete predictor that an activity will be automated. A second factor to consider is the cost of developing and deploying both the hardware and the software for automation. The cost of labor and related supply-and-demand dynamics represent a third factor: if workers are in abundant supply and significantly less expensive than automation, this could be a decisive argument against it. A fourth factor to consider is the benefits beyond labor substitution, including higher levels of output, better quality, and fewer errors. These are often larger than those of reducing labor costs. Regulatory and social-acceptance issues, such as the degree to which machines are acceptable in any particular setting, must also be weighed. A robot may, in theory, be able to replace some of the functions of a nurse, for example. But for now, the prospect that this might actually happen in a highly visible way could prove unpalatable for many patients, who expect human contact. The potential for automation to take hold in a sector or occupation reflects a subtle interplay between these factors and the trade-offs among them.