Harry H. Zhang

I am a PhD student in the SPARK Lab of MIT LIDS. I am extremely fortunate to be advised by Prof. Luca Carlone.

Prior to MIT, I was an MS-Research student in the CMU Robotics Institute studying Artificial Intelligence and Robotics, advised by Prof. David Held.

Prior to CMU, I earned my B.S. (2017-2021) with Honors from UC Berkeley with a major in EECS and a minor in Mechanical Engineering. During my time at Berkeley, I did research under Prof. Ken Goldberg and Dr. Jeffrey Ichnowski in AUTOLab. I maintain and curate a popular deep reinforcement learning tutorial on my Github.

Email  /  CV  /  LinkedIn  /  Google Scholar  /  Github  /  Twitter

News and Updates

In reverse chronological order:

  • Sep. 2022: TAX-Pose accepted to CoRL, see you in New Zealand!
  • May 2022: I am joining Amazon this summer as an applied research scientist, working on 3D learning problems.
  • Apr. 2022: FlowBot3D accepted to RSS, see you in NYC!
  • Jan. 2022: We submitted the FlowBot3D paper to RSS.
  • Aug. 2021: Started grad school at CMU RI. Looking forward to Pittsburgh, PA.
  • May 2021: Won the Warren Y. Dere Award from UC Berkeley EECS.
  • Mar. 2021: Dynamic cable manipulation paper accepted to ICRA 2021.
  • Nov. 2020: Dynamic cable manipulation paper featured at Bay Area Robotics Symposium (BARS) hosted by Stanford.
  • Nov. 2020: Dynamic cable manipulation paper submitted to ICRA 2021.
  • Jun. 2020: Dex-Net AR got featured on VentureBeat.
  • Jun. 2020: Dex-Net AR got featured on Sohu.
  • Aug. 2019: AI4All Berkeley was a blast. We organized an AI crash course for under-represented HS/MS students. See summary here.

Research Interests

My current research focuses on practical problems that artificial intelligence faces in the real world. My interests lie at the intersection of robotics, computer vision, reinforcement learning, and control theory. I would like AI agents to gain a better understanding of the structure of the world in terms of perception, modeling, and manipulation. Specifically, I think about how robots can perceive the world in a way that facilitates downstream policy learning and planning. I firmly believe that cleverly designed learned representations of both visual input and action output could significantly improve downstream policy learning in robotic manipulation tasks. Quixotic though it may sound, I hope to use AI and robotics to change the world for the better. Here is the Personal Statement I used for my PhD applications.

Peer-Reviewed Publications
TAX-Pose: Task-Specific Cross-Pose Estimation for Robot Manipulation
Brian Okorn*, Chu Er Pan*, Harry Zhang*, Benjamin Eisner*, David Held
Accepted to Conference on Robot Learning (CoRL), 2022 (* indicates equal contribution)
Arxiv | Code | Video | Open Review

We conjecture that the task-specific pose relationship between relevant parts of interacting objects is a generalizable notion of a manipulation task that can transfer to new objects. We call this task-specific pose relationship "cross-pose". We propose a vision-based system that learns to estimate the cross-pose between two objects for a given manipulation task.

FlowBot3D: Learning 3D Articulation Flow to Manipulate Articulated Objects
Benjamin Eisner*, Harry Zhang*, David Held
Accepted to Robotics: Science and Systems (RSS), 2022 (* indicates equal contribution) - Long talk, Best Paper Award Finalist (Selection Rate 1.5%).
Arxiv | Code | Video | Berkeley CPAR Talk | MIT Technology Review China | Synced Review | Sohu | CMU Research Highlights

We explore a novel method to perceive and manipulate 3D articulated objects that generalizes to enable the robot to articulate unseen classes of objects.

AVPLUG: Approach Vector Planning for Unicontact Grasping amid Clutter
Yahav Avigal*, Vishal Satish*, Harry Zhang, Huang Huang, Michael Danielczuk, Jeffrey Ichnowski, Ken Goldberg
Accepted to Conference on Automation Science and Engineering (CASE), 2021.
Arxiv | Code | Video

We present AVPLUG (Approach Vector PLanning for Unicontact Grasping), an algorithm for efficiently finding the approach vector using an oct-tree occupancy model and Minkowski sum computation to maximize information gain.

Robots of the Lost Arc: Self-Supervised Learning to Dynamically Manipulate Fixed-Endpoint Cables
Harry Zhang, Jeffrey Ichnowski, Daniel Seita, Jonathan Wang, Huang Huang, Ken Goldberg
Accepted to International Conference on Robotics and Automation (ICRA), 2021
Arxiv | Code | Bay Area Robotics Symposium Coverage | ICRA 2022 Deformable Object Manipulation Workshop

We propose a self-supervised learning framework that enables a UR5 robot to perform three dynamic cable manipulation tasks. The framework finds a 3D apex point for the robot arm, which, together with a task-specific trajectory function, defines an arcing motion that dynamically manipulates the cable to perform tasks with varying obstacle and target locations.

Dex-Net AR: Distributed Deep Grasp Planning Using a Commodity Cellphone and Augmented Reality App
Harry Zhang, Jeffrey Ichnowski, Yahav Avigal, Joseph Gonzalez, Ion Stoica, Ken Goldberg
Accepted to International Conference on Robotics and Automation (ICRA), 2020
Arxiv | Code | Video | VentureBeat Coverage | Sohu Coverage (in Mandarin)

We present a distributed pipeline, Dex-Net AR, that allows point clouds to be uploaded to a server in our lab, cleaned, and evaluated by the Dex-Net grasp planner to generate a grasp axis that is returned and displayed as an overlay on the object.

Orienting Novel Objects using Self-Supervised Rotation Estimation
Shivin Devgon, Jeffrey Ichnowski, Ashwin Balakrishna, Harry Zhang, Ken Goldberg
Accepted to Conference on Automation Science and Engineering (CASE), 2020.
Arxiv | Code | Video

We present an algorithm to orient novel objects given depth images of the object in its current and desired orientations.

Self-Supervised Learning of Dynamic Planar Manipulation of Free-End Cables
Jonathan Wang*, Huang Huang*, Vincent Lim, Harry Zhang, Jeffrey Ichnowski, Daniel Seita, Yunliang Chen, Ken Goldberg
Preprint, in submission to International Conference on Robotics and Automation (ICRA), 2022.
Arxiv | Code | Video

We present an algorithm to train a robot to control free-end cables in a self-supervised fashion.

Safe Deep Model-Based Reinforcement Learning with Lyapunov Functions
Bobby Yan*, Harry Zhang*, Huang Huang*
Preprint, 2022.
Arxiv | Code | Video

We introduce and explore a novel method for adding safety constraints for model-based RL during training and policy learning.


10-725: Graduate Convex Optimization
16-385: Computer Vision

CS 189: Introduction to Machine Learning

EE 127: Introduction to Convex Optimization

CS 188: Introduction to Artificial Intelligence

CS 170: Algorithms

ME C231A: Model Predictive Control

Website template from Jon Barron