Charlotte M. Morrison
Dr. Ayan Dutta | College of Computing, Engineering and Construction | School of Computing
We study the problem of automated object manipulation using the two arms of a Baxter robot. The robot uses a novel multi-agent reinforcement learning strategy to learn how to complete a task without any prior experience. It stores its interactions with the environment and uses these experiences to build a policy that guides its future actions. Each of Baxter’s arms is modeled as an independent agent that can move and learn separately from the other. Each arm learns its own policy (i.e., a mapping from environment states to robot actions) for how best to move in order to complete a collaborative, two-arm task (e.g., pushing an item or pick-and-place). Each agent is trained with the standard Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, which learns from stored experiences that record how well the agent’s past actions advanced it toward completing the task. TD3 uses an actor network, which takes the state (joint angles) as input and outputs an action (joint movements), and twin critic networks, which evaluate the quality of that action. The agents’ actions are coordinated through a game theory-based distributed coordination strategy. The resulting framework yields a policy that selects a good set of actions for each arm to execute. Finally, Baxter uses this policy to complete tasks with both arms working in collaboration. To the best of our knowledge, this work is the first to use a game theory-based strategy for dual-arm manipulation learning.
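To illustrate the role of the twin critics mentioned above, the following is a minimal sketch of how standard TD3 forms its learning target: the target action is perturbed with clipped noise, and the smaller of the two critic estimates is used to limit overestimation. This is not the authors' implementation. The function names, the discount factor, and the noise and action bounds are assumptions chosen for illustration, and the critic outputs are passed in as plain numbers rather than computed by neural networks.

```python
from typing import List, Sequence


def clip(value: float, low: float, high: float) -> float:
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def smoothed_target_action(
    action: Sequence[float],
    noise: Sequence[float],
    noise_clip: float,
    low: float,
    high: float,
) -> List[float]:
    """Target policy smoothing: add clipped noise to the target actor's
    action, then clamp the result to the valid action range.
    (Hypothetical helper; bounds are illustrative assumptions.)"""
    return [
        clip(a + clip(n, -noise_clip, noise_clip), low, high)
        for a, n in zip(action, noise)
    ]


def td3_target(
    reward: float,
    done: bool,
    q1_next: float,
    q2_next: float,
    gamma: float = 0.99,
) -> float:
    """Clipped double-Q target: bootstrap from the smaller of the two
    target-critic estimates; do not bootstrap past a terminal state."""
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * min(q1_next, q2_next)
```

In this sketch, each arm's agent would call `smoothed_target_action` on the output of its target actor for the next state, evaluate both target critics on the result, and regress both critics toward `td3_target`. How the game theory-based coordination step combines the two agents' actions is not specified here and is left out of the sketch.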