Archive for February 27, 2013
Close physical interaction between robots and humans is a particularly challenging aspect of robot development. For successful interaction and cooperation, the robot must have the ability to adapt its behavior to the human counterpart. Based on our earlier work, we present and evaluate a computationally efficient machine learning algorithm that is well suited for such close-contact interaction scenarios. We show that this algorithm helps to improve the quality of the interaction between a robot and a human caregiver. To this end, we present two human-in-the-loop learning scenarios that are inspired by human parenting behavior, namely, an assisted standing-up task and an assisted walking task.
Human–Robot Interaction and Cooperation
Until recently, robotic systems mostly remained in the realm of industrial applications and academic research. However, in recent years, robotics technology has significantly matured and produced highly realistic android robots. As a result of this ongoing process, the application domains of robots have slowly expanded into domestic environments, offices, and other human-inhabited locations. In turn, interaction and cooperation between humans and robots has become an increasingly important and, at the same time, challenging aspect of robot development. Particularly challenging is the physical interaction and cooperation between humans and robots. For such interaction to be successful and meaningful, the following technical difficulties need to be addressed:
guaranteeing safety at all times
ensuring that the robot reacts appropriately to the force applied by the human interaction partner
improving the behavior of the robot using a machine learning algorithm in a physical human–robot interaction (PHRI).
In our previous research , we presented a PHRI scenario in which we addressed the above topics. Inspired by the parenting behavior observed in humans, a test subject was asked to physically assist a state-of-the-art robot in a standing-up task. In such a situation, both the human and the robot are required to adapt their behaviors to cooperatively complete the task. However, most machine learning scenarios to date do not address the question of how learning can be achieved for tightly coupled, physical interactions between a learning agent and a human partner. Building on the results in , we present an extended evaluation and discussion of such human-in-the-loop learning scenarios.
To realize learning and adaptation on the robot’s side, we employ a computationally efficient learning algorithm based on a dimensional reduction technique. In particular, after each trial, the human can judge whether the interaction was successful, then the judgment is used in a machine learning algorithm to apply a dimensional reduction technique and update the behavior of the robot. As learning progresses, the robot creates a behavioral model, which implicitly includes the actions of the human counterpart.
At the same time, refining the motions of the robot during a physical interaction requires the motions of the human to be improved, because the two motions influence each other. Hence, the human counterpart is part of the learning system and overall dynamics. To analyze the efficiency of the proposed learning algorithm and the effect of human habituation to the robot during such close-contact interactions, we perform a set of PHRI experiments. In addition to the assisted standing-up interaction scenario presented in , we also present and discuss the first results based on a novel interaction scenario. More specifically, we present an assisted walking task in which a human caregiver must assist a humanoid robot while walking.
We believe that human-in-the-loop learning scenarios, such as that presented herein, will be particularly interesting in the future because they can help to strengthen the mutual relationship between humans and robots. Ideally, this will lead to a higher acceptance of robotic agents in society.
Important aspects of PHRIs have been investigated in a perspective research project conducted by the European Network of Excellence (EURON) . The objective of the project was to present and discuss important requirements for safe and dependable robots involved in PHRIs. Initial approaches for achieving these requirements are currently being addressed in a follow-up research project called PHRIENDS (a PHRI that is dependable and safe). To reduce risks and fatalities in industrial manufacturing workplaces, the primary goal of the PHRIENDS project is to design robots that are intrinsically safe. This requires the development of new actuator concepts, safety measures, and control algorithms, which take the presence of human subjects into account. The results of this project are also relevant to applications outside the manufacturing industry. However, learning and adaptation between humans and robots is not the focus of the PHRIENDS project.
Khatib et al.  discussed the basic capabilities needed to enable robots to operate in human-populated environments. In particular, they discussed how mobile robots can calculate collision-free paths and manipulate surrounding objects. In their approach, they characterized free space using an elastic strip approach. However, the described robots were not expected to come into direct (physical) contact with the surrounding human subjects. The importance of direct physical interaction was highlighted in the haptic creature project , which investigated the role of affective touch in fostering the companionship between humans and robots. In an attempt to improve human–robot interaction, Kosuge et al. presented a robot that can dance with a human by adaptively changing the dance steps according to the force/moment applied to the robot . Amor et al.  used kinesthetic interactions to teach new behaviors to a small humanoid robot.
Furthermore, the behavior of the robot may be optimized with respect to a given criterion in simulation. In this learning scheme, the robot is a purely passive interaction partner and acts only after the learning process is complete. Similar approaches to teaching new skills have also been reported in  and  using different learning methods, i.e., continuous time-recurrent neural networks and Gaussian mixture models (GMMs), respectively. Odashima et al.  developed a robot that can come into direct physical contact with humans. This robot is intended for caregiving tasks such as carrying injured persons to a nearby physician.
The robot can also learn new behaviors and assistive tasks by observing human experts as they perform these tasks. However, this learning does not take place during interactions but rather in offline sessions using immersive virtual environments. In , Evrard et al. present a humanoid robot with the ability to perform a collaborative manipulation task together with a human operator. In a teaching phase, the robot is first teleoperated using a force-feedback device. The recorded forces and positions are then used to learn a controller for the collaborative task. The main hypothesis underlying this approach is that the intentions of the human interaction partner can be guessed from haptic cues. In , physical interactions between a robot’s hand and the hand of a human are modeled by recording their distances. The distances are then encoded in a hidden Markov model (HMM), which in turn is used to synthesize similar hand contacts. A recent survey on modern approaches to physical and tactile human–robot interaction can be found in .
In this article, we present experiments with a flexible joint robot that is involved in close physical interaction with a human caregiver. In contrast to the above research, both human and robot play an active role in the interaction to learn and adapt their behaviors to their partner so as to achieve a common goal. This tight coupling of robot and human learning and coadaptation is a unique feature and is the primary contribution of the present study. We assume that it is important to focus on the active role in the interaction because the forces generated during the active behavior of the robot influence the behavior of the human, which in turn influences the passive behavior of the robot. In addition, these active and passive roles cannot be clearly separated because the robot and the human influence each other when they are in physical contact.
Physical Interaction Learning Approach
The goal of interaction learning is to improve the cooperation of humans and robots while they are working to achieve a common goal. Figure 1 shows an overview of the learning scheme used in this article. After an initial physical interaction between a human and a robot, the human is given the chance to evaluate the behavior of the robot. More precisely, the human can judge whether the interaction was a success or failure (binary evaluation). The feedback can be provided in various ways, such as through touch or through a simple graphical user interface. Once the evaluation information is collected by the robot system, it is stored in a database in the memory. The memory collects information about recent successful interactions and manages the data for the subsequent learning step. This allows us to optimize the set of training examples used for learning to improve learning quality. Figure 1 shows the human-in-the-loop learning system considered in this article, where the behavior of the human influences the behavior of the robot and, simultaneously, the behavior of the robot influences the behavior of the human. Furthermore, the behavior of the robot changes as learning progresses, which in turn influences the behavior of the human and its physical support. This system demonstrates one of the applications of a tightly coupled physical interaction.
Figure 1. (a) Overview of the physical interaction learning approach. After physical interaction, the human judges whether the interaction was successful. This information is stored in the robot’s memory and used for later learning. (b) Flexible-joint humanoid robot used in the experiments in this study. (Photos courtesy of ERATO Asada Project.)
After a number of interactions, the learning system queries the memory for a new set of training data. The data are then projected onto a low-dimensional manifold using dimensional reduction techniques. There are three justifications for this step. First, dimensional reduction allows a reduction of the space in which learning takes place. Thus, the learning can be much faster and more efficient. In addition, dimensional reduction generally helps to detect meaningful low-dimensional structures in high-dimensional inputs. Second, dimensional reduction allows us to visualize and understand the adaptation taking place during interaction. This is particularly helpful for later review and analysis purposes. Finally, dimensional reduction reduces the negative influence of outliers on learning. The inputs to the dimensional reduction step are high-dimensional state vectors describing the postures of the robot during the interaction.
The output is a low-dimensional posture space. Once the state vectors are projected onto a low-dimensional manifold, we group the resulting points into sets according to the action performed in that state. Thus, we obtain for each possible action a set of states in which the corresponding action should be triggered. For each action, a GMM is learned. The model encodes a probability density function of the learned state vectors. The ideal number of Gaussian mixtures is estimated using the Bayesian information criterion (BIC) .
By computing the likelihood of a given state vector p in a GMM of action A, we can estimate how likely it is that the robot should perform action A when in posture p. The learned models are then used during the next physical interaction trial to determine the actions of the robot. Here, each new posture is projected into the low-dimensional posture space. Then, the likelihood of the projected point for each GMM is computed. Following a maximumlikelihood rationale, the action corresponding to the GMM with the highest likelihood is executed by the robot.
With each iteration of the above learning loop, the robot adapts its model more and more toward successful interactions. The result is a smoother and easier cooperative behaviour between the human and the robot.
The robot used in this study is called the child–robot with biomimetic body, or CB2
. The robot has the following features.
Its height is 130 cm, and its mass is approximately 33 kg.
The degree of freedom (DOF) is 56.
The supplied air pressure is 0.6 MPa.
The efficient torque of the knee is theoretically 28.6 N∙m.
All joints, apart from the joints used to move the eyes and eyelids, are driven by pneumatic actuators.
All joints, apart from the joints used to move the fingers, are equipped with potentiometers.
The joints have low mechanical impedance because of the compressibility of air. The joints can also be made to be completely passive if the system discontinues air compression during robot motion. This helps the robot to perform passive motions during physical interaction and helps to ensure the safety of the human partner. This is in contrast to most other robots, in which the joints are driven by electricmotors with decelerators. The flexible actuators enable the joints to produce seemingly smooth motions, even when the input signal changes drastically.
This feature of the CB2
robot is used to realize complex
motions using the simple control
architecture  depicted in Figure 2.
More specifically, full body motions
of the robot are realized by switching
between a set of successive desired
postures. Furthermore, the flexible
actuators enable motions generated
by this simple control architecture to
be adaptively changed in response to
an applied force from the human
partner. Each posture is described by
a posture vector x, with each entry
of the vector denoting the
angular value of a particular
joint. A low-level
controller is implemented
by the proportional-integral-
control of angular values.
Each time the desired
posture is switched drastically,
large drive torques are generated, resulting in an
active force being applied to the human caregiver. As the
posture of the robot approaches the desired posture,
the passive motion gradually becomes the dominant
motion of the robot because the amount of error in the
angular control gradually becomes smaller.
Figure 2. Control architecture of the CB2
Figure 3 shows how the examined standing-up task is realized using the proposed control architecture. The behavior is realized by switching between three desired postures. At first glance, the specifications of the robot motion appear to be extremely simple. However, the switching times are highly dependent on the human interaction. More specifically, the switching times depend on the anatomy and skills of the human. This means that the robot has to adapt the switching times to the characteristics of its partner during the period of interaction. In addition, it must be noted that this motion cannot be performed by the robot if a human does not assist in its execution.
Figure 3. The three desired postures used in the standing-up task of the experiment. The learning task is to determine the ideal switching conditions between the desired postures. (Photo courtesy of ERATO Asada Project.)