News and Updates
In reverse chronological order:
- Jan. 2024: Multi-model fitting paper accepted to ICRA, see you in Yokohama!
- Oct. 2023: DiffCLIP accepted to WACV, see you in Hawaii!
- Aug. 2023: FlowBot++ accepted to CoRL, see you in ATL!
- Sep. 2022: TAX-Pose accepted to CoRL, see you in New Zealand!
- May 2022: I am joining Amazon this summer as an applied research scientist, working on 3D learning problems.
- Apr. 2022: FlowBot3D accepted to RSS, see you in NYC!
- May 2021: Won the Warren Y. Dere Award from UC Berkeley EECS.
- Mar. 2021: Dynamic cable manipulation paper accepted to ICRA 2021.
- Nov. 2020: Dynamic cable manipulation paper featured at Bay Area Robotics Symposium (BARS) hosted by Stanford.
- Jun. 2020: Dex-Net AR was featured on VentureBeat.
- Jun. 2020: Dex-Net AR was featured on Sohu.
|
Research Interests
My current research focuses on trustworthy AI and autonomous systems. Specifically, I design algorithms for machines to learn representations for more robust real-world generalization and better certifiability. My research revolves around the theme of learning-based perception systems and robotic systems.
|
Peer-Reviewed Publications
|
|
Multi-Model 3D Registration: Finding Multiple Moving Objects in Cluttered Point Clouds
David Jin,
Sushrut Karmalkar,
Harry Zhang,
Luca Carlone
Accepted to IEEE International Conference on Robotics and Automation (ICRA), 2024.
Arxiv |
Code |
Video
We investigate a variation of the 3D registration
problem, named multi-model 3D registration. In the multi-model
registration problem, we are given two point clouds picturing a
set of objects at different poses (and possibly including points
belonging to the background) and we want to simultaneously
reconstruct how all objects moved between the two point clouds.
|
|
DiffCLIP: Leveraging Stable Diffusion for Language Grounded 3D Classification
Sitian Shen,
Zilin Zhu,
Linqian Fan,
Harry Zhang,
Xinxiao Wu
Accepted to IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024.
Arxiv |
Code |
Video
We propose DiffCLIP, a new pre-training framework that incorporates Stable Diffusion with ControlNet to minimize the domain gap in the visual branch. Additionally, a style-prompt generation module is introduced for few-shot tasks in the textual branch.
|
|
FlowBot++: Learning Generalized Articulated Objects Manipulation via Articulation Projection
Harry Zhang,
Benjamin Eisner,
David Held
Accepted to Conference on Robot Learning (CoRL), 2023.
Arxiv |
Code |
Video |
Open Review
We explore yet another novel method to perceive and manipulate 3D articulated objects that generalizes to enable the robot to articulate unseen classes of objects.
|
|
TAX-Pose: Task-Specific Cross-Pose Estimation for Robot Manipulation
Brian Okorn*,
Chu Er Pan*,
Harry Zhang*,
Benjamin Eisner*,
David Held
Accepted to Conference on Robot Learning (CoRL), 2022 (* indicates equal contribution)
Arxiv |
Code |
Video |
Open Review
We conjecture that the task-specific pose relationship between relevant parts of interacting objects is a generalizable notion of a manipulation task that can transfer to new objects. We call this task-specific pose relationship "cross-pose". We propose a vision-based system that learns to estimate the cross-pose between two objects for a given manipulation task.
|
|
FlowBot3D: Learning 3D Articulation Flow to Manipulate Articulated Objects
Benjamin Eisner*,
Harry Zhang*,
David Held
Accepted to Robotics: Science and Systems (RSS), 2022 (* indicates equal contribution) - Long talk, Best Paper Award Finalist (Selection Rate 1.5%).
Arxiv |
Code |
Video |
Berkeley CPAR Talk |
MIT Technology Review China |
Synced Review Sohu |
CMU Research Highlights
We explore a novel method to perceive and manipulate 3D articulated objects that generalizes to enable the robot to articulate unseen classes of objects.
|
|
AVPLUG: Approach Vector Planning for Unicontact Grasping amid Clutter
Yahav Avigal*,
Vishal Satish*,
Harry Zhang,
Huang Huang,
Michael Danielczuk,
Jeffrey Ichnowski,
Ken Goldberg
Accepted to Conference on Automation Science and Engineering (CASE), 2021.
Arxiv |
Code |
Video
We present AVPLUG (Approach Vector PLanning for Unicontact Grasping), an algorithm that efficiently finds the approach vector using an octree occupancy model and Minkowski sum computation to maximize information gain.
|
|
Robots of the Lost Arc: Self-Supervised Learning to Dynamically Manipulate Fixed-Endpoint Cables
Harry Zhang,
Jeffrey Ichnowski,
Daniel Seita,
Jonathan Wang,
Huang Huang,
Ken Goldberg
Accepted to International Conference on Robotics and Automation (ICRA), 2021
Arxiv |
Code |
Bay Area Robotics Symposium Coverage |
ICRA 2022 Deformable Object Manipulation Workshop
We propose a self-supervised learning framework that enables a UR5 robot to perform three dynamic manipulation tasks on fixed-endpoint cables. The framework finds a 3D apex point for the robot arm, which, together with a task-specific trajectory function, defines an arcing motion that dynamically manipulates the cable to perform tasks with varying obstacle and target locations.
|
|
Dex-Net AR: Distributed Deep Grasp Planning Using a Commodity Cellphone and Augmented Reality App
Harry Zhang,
Jeffrey Ichnowski,
Yahav Avigal,
Joseph Gonzalez,
Ion Stoica,
Ken Goldberg
Accepted to International Conference on Robotics and Automation (ICRA), 2020
Arxiv |
Code |
Video |
VentureBeat Coverage |
Sohu Coverage (in Mandarin)
We present a distributed pipeline, Dex-Net AR, that allows point clouds to be uploaded to a server in our lab, cleaned, and evaluated by the Dex-Net grasp planner to generate a grasp axis that is returned and displayed as an overlay on the object.
|
|
Orienting Novel Objects using Self-Supervised Rotation Estimation
Shivin Devgon,
Jeffrey Ichnowski,
Ashwin Balakrishna,
Harry Zhang,
Ken Goldberg
Accepted to Conference on Automation Science and Engineering (CASE), 2020.
Arxiv |
Code |
Video
We present an algorithm to orient novel objects given a depth image of the object in its current and desired orientation.
|
|
Self-Supervised Learning of Dynamic Planar Manipulation of Free-End Cables
Jonathan Wang*,
Huang Huang*,
Vincent Lim,
Harry Zhang,
Jeffrey Ichnowski,
Daniel Seita,
Yunliang Chen,
Ken Goldberg
Preprint, in submission to International Conference on Robotics and Automation (ICRA), 2022.
Arxiv |
Code |
Video
We present an algorithm to train a robot to control free-end cables in a self-supervised fashion.
|
|
Safe Deep Model-Based Reinforcement Learning with Lyapunov Functions
Bobby Yan*,
Harry Zhang*,
Huang Huang*
Preprint, 2022.
Arxiv |
Code |
Video
We introduce and explore a novel method for adding safety constraints to model-based RL during training and policy learning.
|
|
10-725: Graduate Convex Optimization
16-385: Computer Vision
|
|
CS 189: Introduction to Machine Learning
EE 127: Introduction to Convex Optimization
CS 188: Introduction to Artificial Intelligence
CS 170: Algorithms
ME C231A: Model Predictive Control
|
|