University of Warwick
Publications service & WRAP
Reinforcement learning for robotic manipulation using simulated locomotion demonstrations


Kilinc, Ozsel and Montana, Giovanni (2022) Reinforcement learning for robotic manipulation using simulated locomotion demonstrations. Machine Learning, 111, pp. 465-486. doi:10.1007/s10994-021-06116-1. ISSN 2632-2153.

PDF: WRAP-Reinforcement-learning-robotic-manipulation-locomotion-demonstrations-2021.pdf - Published Version
Available under License Creative Commons Attribution 4.0.
Official URL: http://dx.doi.org/10.1007/s10994-021-06116-1

Abstract

Mastering robotic manipulation skills through reinforcement learning (RL) typically requires the design of shaped reward functions. Recent developments in this area have demonstrated that using sparse rewards, i.e. rewarding the agent only when the task has been successfully completed, can lead to better policies. However, state-action space exploration is more difficult in this case. Recent RL approaches to learning with sparse rewards have leveraged high-quality human demonstrations for the task, but these can be costly, time-consuming or even impossible to obtain. In this paper, we propose a novel and effective approach that does not require human demonstrations. We observe that every robotic manipulation task could be seen as involving a locomotion task from the perspective of the object being manipulated, i.e. the object could learn how to reach a target state on its own. In order to exploit this idea, we introduce a framework whereby an object locomotion policy is initially obtained using a realistic physics simulator. This policy is then used to generate auxiliary rewards, called simulated locomotion demonstration rewards (SLDRs), which enable us to learn the robot manipulation policy. The proposed approach has been evaluated on 13 tasks of increasing complexity, and achieves higher success rates and faster learning compared to alternative algorithms. SLDRs are especially beneficial for tasks like multi-object stacking and non-rigid object manipulation.
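The abstract describes combining a sparse task reward with an auxiliary reward derived from a pre-trained object locomotion policy. The sketch below is a minimal illustration of that general reward-shaping pattern, not the paper's exact formulation: the function names, the nearest-state distance measure, and the weighting coefficient `lam` are all assumptions introduced here for clarity.

```python
import numpy as np

def sparse_reward(object_state, goal, tol=0.05):
    """Sparse task reward: 1 only when the object has reached the goal."""
    return float(np.linalg.norm(object_state - goal) <= tol)

def sldr_reward(object_state, demo_trajectory, scale=1.0):
    """Illustrative auxiliary (SLDR-style) reward: negative distance from the
    object's current state to the nearest state on a trajectory produced by a
    simulated object-locomotion policy."""
    dists = np.linalg.norm(demo_trajectory - object_state, axis=1)
    return -scale * dists.min()

def shaped_reward(object_state, goal, demo_trajectory, lam=0.1):
    """Combine the sparse task reward with the weighted auxiliary reward."""
    return sparse_reward(object_state, goal) + lam * sldr_reward(object_state, demo_trajectory)

# Example: a straight-line demonstration trajectory toward the goal.
goal = np.array([1.0, 0.0])
demo = np.linspace([0.0, 0.0], [1.0, 0.0], 10)
```

Under this sketch, a manipulation agent receives a dense learning signal (the auxiliary term) that guides the object along states the locomotion policy deems reachable, while the sparse term still defines task success.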

Item Type: Journal Article
Subjects: Q Science > Q Science (General)
T Technology > TJ Mechanical engineering and machinery
Divisions: Faculty of Science, Engineering and Medicine > Engineering > WMG (Formerly the Warwick Manufacturing Group)
Library of Congress Subject Headings (LCSH): Reinforcement learning, Robotics, Robots -- Control systems
Journal or Publication Title: Machine Learning
Publisher: Springer
ISSN: 2632-2153
Official Date: February 2022
Dates:
  February 2022: Published
  24 November 2021: Available
  25 October 2021: Accepted
Volume: 111
Page Range: pp. 465-486
DOI: 10.1007/s10994-021-06116-1
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Open Access (Creative Commons)
Date of first compliant deposit: 22 December 2021
Date of first compliant Open Access: 23 December 2021

Email us: wrap@warwick.ac.uk