Design for Interpretability

Using environment design to improve the explicability of automated agents' behavior in shared environments.

Designing robots and other automated agents capable of generating interpretable behavior is a prerequisite for effective human-robot collaboration. Such robots need to generate behavior that aligns with human expectations and, when required, provide explanations to the humans in the loop. However, exhibiting such behavior in arbitrary environments can be quite expensive for a robot, and in some cases the robot may not be able to exhibit the expected behavior at all. In structured environments (like warehouses and restaurants), it may be possible to design the environment so as to boost the interpretability of the robot’s behavior or to shape the human’s expectations of that behavior. In our IROS 2020 paper, we investigate the opportunities and limitations of environment design as a tool to promote a type of interpretable behavior known in the literature as explicable behavior.

We formulate a novel environment design framework that considers design over multiple tasks and over a time horizon. In addition, we explore the longitudinal aspect of explicable behavior and the trade-off that arises between the cost of design and the cost of generating explicable behavior over a time horizon.




Consider a restaurant with a robot server (left figure). Let G1 and G2 represent the robot’s possible goals of serving the two booths: it travels between the kitchen and the two booths. The observers are the customers at the restaurant. Given the position of the kitchen, the observers may have expectations about the route the robot takes. However, unbeknownst to the observers, the robot cannot pass between the two tables and can only take the route around them. Therefore, the path marked in red is the cheapest path for the robot, but the observers expect the robot to take the path marked in green.

In this environment, there is no way for the robot to behave as the humans expect. Environment design provides us with alternatives. For example, the designer could choose to build two barriers, as shown in the right figure. With these barriers in place, the humans would expect the robot to follow the path highlighted in green. However, whether it is preferable to modify the environment or to bear the impact of inexplicable behavior depends on the cost of changing the environment versus the cost of the inexplicability caused by the behavior. In our paper, we explore the details of this trade-off.
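The trade-off can be made concrete with a small sketch. The code below is only an illustration under a simple assumption that is not taken from the paper: the total cost of an option is a one-time design cost plus a per-execution inexplicability cost multiplied by the number of executions. The option names and all numbers are hypothetical.

```python
# Illustrative sketch only: an additive cost model (assumed, not from the paper).
# Each option maps to (one_time_design_cost, inexplicability_cost_per_run).
# All names and numbers below are hypothetical.

def total_cost(design_cost, inexplicability_per_run, runs):
    """One-time design cost plus the inexplicability cost accumulated over `runs`."""
    return design_cost + inexplicability_per_run * runs


def better_option(options, runs):
    """Return the name of the option with the lowest total cost over `runs`."""
    return min(options, key=lambda name: total_cost(*options[name], runs))


options = {
    "no_design": (0.0, 4.0),      # leave the restaurant as is; every run looks inexplicable
    "two_barriers": (10.0, 0.0),  # pay for the barriers once; runs then match expectations
}

short_horizon_choice = better_option(options, runs=1)
long_horizon_choice = better_option(options, runs=5)
```

With these made-up numbers, leaving the environment unchanged is cheaper for a single run, while building the barriers pays off once the robot repeats the task often enough. This is the longitudinal aspect of the trade-off in miniature.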



Illustration of the longitudinal impact on explicability. Prob denotes the probability of executing each task. For each task, the reward is determined by the inexplicability score of that task. The probability of receiving this reward is γ times the probability of executing that task. Additionally, with probability (1 − γ) the human ignores the inexplicability of a task, and the associated reward corresponds to an inexplicability score of 0.
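As a rough sketch of the caption above, the expected inexplicability of a single step can be computed as follows. The function name, task names, scores, and the value of γ are hypothetical and chosen only for illustration. The sketch covers one step as described in the caption, not the full model from the paper.

```python
# Illustrative sketch of the single-step expectation described in the caption.
# Task names, scores, and gamma are hypothetical values.

def expected_inexplicability(task_probs, ie_scores, gamma):
    """Expected inexplicability score for one step.

    With probability gamma * P(task) the human registers the task's
    inexplicability score; with probability (1 - gamma) the human ignores it,
    which contributes a score of 0.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if abs(sum(task_probs.values()) - 1.0) > 1e-9:
        raise ValueError("task probabilities must sum to 1")
    attended = sum(gamma * p * ie_scores[task] for task, p in task_probs.items())
    ignored = (1.0 - gamma) * 0.0  # kept explicit to mirror the caption
    return attended + ignored


task_probs = {"G1": 0.5, "G2": 0.5}  # probability of executing each task
ie_scores = {"G1": 4.0, "G2": 0.0}   # inexplicability score of each task
step_value = expected_inexplicability(task_probs, ie_scores, gamma=0.8)
```

Under these assumed values only G1 contributes, so the expectation is 0.8 × 0.5 × 4.0. Setting γ to 0 models a human who always ignores inexplicability, which drives the expectation to 0.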