Project Overview

Project Type: Course
Course: Reinforcement Learning
Robot Used: Fetch + UR3e
Software Used: OpenAI Gym | Python | ROS
Date(s): Sep 2019 - Dec 2019

Project Abstract

  There is an ongoing research project at Northeastern University’s RIVeR Lab that applies robotics to sorting, processing, and handling raw seafood. Because seafood is particularly slippery, it is often more practical to slide a piece of seafood from point A to point B than to use traditional pick-and-place methods. Using reinforcement learning for this task avoids having to model the dynamics of a slippery piece of fish interacting with a surface in order to directly compute the pushing force required.
  The goal of this course project was to use this real-world problem as a basis for exploring the challenges of applying reinforcement learning algorithms to robotics. OpenAI Gym's existing environment, in which the Fetch robot slides pucks across a table to a random goal location, was used as a starting point because it closely emulates the problem of sliding seafood with a robotic arm.
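  For reference, the sketch below shows how this kind of sliding environment can be loaded and stepped with random actions; it assumes the 2019-era OpenAI Gym API with MuJoCo installed, and the environment id (FetchSlide-v1) may differ between gym versions.

    # Minimal sketch of interacting with the Fetch sliding environment in OpenAI Gym.
    # Assumes the pre-0.26 gym API; the exact environment id is illustrative.
    import gym

    env = gym.make("FetchSlide-v1")
    obs = env.reset()

    # Goal-based Fetch environments return a dict observation containing the robot
    # state, the puck's current (achieved) goal position, and the desired goal.
    print(obs.keys())  # dict_keys(['observation', 'achieved_goal', 'desired_goal'])

    for _ in range(100):
        action = env.action_space.sample()          # random end-effector action
        obs, reward, done, info = env.step(action)
        if done:
            obs = env.reset()
    env.close()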


Personal Contributions

  An on-policy reinforcement learning algorithm (Proximal Policy Optimization) and an off-policy algorithm (Hindsight Experience Replay) were used to train policies on the existing OpenAI Gym environment with the Fetch robot. A custom environment was also created to model the specific robotic equipment used in the labs at Northeastern by replacing the Fetch robot with a UR3e robotic arm, and the same two algorithms were trained on this custom environment for comparison. A Gazebo/ROS environment was also created to mirror the OpenAI Gym simulation.
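  As an illustration of the training setup (not necessarily the toolchain used in this project), the sketch below trains both kinds of algorithm on a goal-based sliding environment with Stable-Baselines3, where HER is applied as a replay buffer around an off-policy learner such as DDPG; library versions and the environment id are assumptions.

    # Illustrative training setup with Stable-Baselines3 (not the original toolchain):
    # PPO as the on-policy baseline, and DDPG + Hindsight Experience Replay off-policy.
    # Assumes a goal-based sliding environment registered as "FetchSlide-v1"; the
    # gym/gymnasium package and env id depend on installed versions.
    import gym
    from stable_baselines3 import PPO, DDPG, HerReplayBuffer

    env = gym.make("FetchSlide-v1")

    # On-policy: PPO learns only from freshly collected rollouts.
    ppo = PPO("MultiInputPolicy", env, verbose=1)
    ppo.learn(total_timesteps=1_000_000)

    # Off-policy: DDPG with HER, which relabels stored transitions with goals that
    # were actually achieved, so sparse rewards still provide a learning signal.
    her_ddpg = DDPG(
        "MultiInputPolicy",
        env,
        replay_buffer_class=HerReplayBuffer,
        replay_buffer_kwargs=dict(n_sampled_goal=4, goal_selection_strategy="future"),
        verbose=1,
    )
    her_ddpg.learn(total_timesteps=1_000_000)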
  As shown in the presentation above, meaningful learning was achieved with the Hindsight Experience Replay (HER) algorithm, while an on-policy algorithm such as Proximal Policy Optimization (PPO) proved poorly suited to this kind of complex environment with sparse rewards. The biggest takeaway from this work is how important the choices of algorithm and reward function are, and how tightly coupled those two choices are: a reward function that works well with one algorithm may work poorly with another.
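  To make the sparse-reward issue concrete, the sketch below contrasts the kind of sparse reward used by the Fetch tasks (a flat penalty until the puck is within a tolerance of the goal) with a dense, distance-shaped alternative; the threshold value here is illustrative.

    # Sparse goal-reaching reward (as in the Fetch tasks) versus a dense,
    # distance-shaped variant; the 5 cm threshold is illustrative.
    import numpy as np

    def sparse_reward(achieved_goal, desired_goal, threshold=0.05):
        # -1 everywhere except within the tolerance of the goal, where it is 0.
        dist = np.linalg.norm(achieved_goal - desired_goal)
        return 0.0 if dist < threshold else -1.0

    def dense_reward(achieved_goal, desired_goal):
        # Negative distance provides a learning signal at every step.
        return -np.linalg.norm(achieved_goal - desired_goal)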

Contact Me

Address: Boston, MA

Phone #: 781 812 8630

Email: joelynch523@gmail.com