Haidong Zhu

I am a research scientist at Waymo Research. I received my Ph.D. in Computer Science from the University of Southern California (USC) in May 2024, where I had the privilege of working at the USC IRIS Computer Vision Lab, advised by Prof. Ram Nevatia. Before that, I received my bachelor's degree in Electronic Engineering from Tsinghua University in 2019.

My research interests lie in computer vision and deep learning, especially multimodal analysis for 3-D vision and biometric understanding.

Email / CV / GitHub / Google Scholar

Experience
Waymo LLC, New York, NY
Research Scientist • Jun. 2024 to Present
  • Developed an auto-labeling pipeline using large language models for semantic labeling of bird's-eye-view (BEV) data.
  • Implemented diffusion models for accurate road graph construction, improving accuracy over the baseline by 30%.
  • Integrated Semantic Neural Radiance Fields for efficient semantic segmentation in partially labeled scenarios.
Education
University of Southern California, Los Angeles, CA
Ph.D., Computer Science • Aug. 2019 to May 2024
Tsinghua University, Beijing, China
B.Eng., Electronic Information Science and Technology • Sep. 2015 to Jul. 2019
Internship and On-campus Experience
IRIS Computer Vision Lab, USC, Los Angeles, CA
Research/Teaching Assistant • Aug. 2019 to May 2024
Advisor: Prof. Ram Nevatia
  • Enhanced 3D reconstruction methods with implicit neural representations.
  • Investigated biometric identification through gait and 3D body shape analysis.
  • Developed multimodal sentiment analysis models using self-supervised learning.
Applied Sciences Group, Microsoft, Redmond, WA
Research Intern • May 2023 to Aug. 2023
Advisor: Dr. Tianyu Ding
  • Developed few-shot generalizable Neural Radiance Field (NeRF) architectures.
  • Implemented NeRF-based methods for interactive 3D scene editing.
Lab 126, Amazon, Bellevue, WA
Applied Scientist Intern • May 2022 to Aug. 2022
Advisor: Dr. Yuyin Sun
  • Designed a multimodal NeRF framework integrating RGB and depth data.
  • Enhanced robustness of 3D point cloud registration algorithms.
AI Lab, Bytedance Inc., Mountain View, CA
Research Intern • May 2021 to Aug. 2021
Advisor: Dr. Ye Yuan
  • Developed methods for single-image-based 3D human mesh reconstruction.
  • Created generative models for automatic clothing geometry and texture synthesis.
Preprint
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, and Luming Liang
Selected Manuscripts
CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering
Haidong Zhu*, Tianyu Ding*, Tianyi Chen, Ilya Zharkov, Ram Nevatia, and Luming Liang
Accepted to European Conference on Computer Vision (ECCV), 2024
SEAS: ShapE-Aligned Supervision for Person Re-Identification
Haidong Zhu, Pranav Budhwant, Zhaoheng Zheng, and Ram Nevatia
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 164-174, 2024
Large Language Models are Good Prompt Learners for Low-Shot Image Classification
Zhaoheng Zheng, Jingmin Wei, Xuefeng Hu, Haidong Zhu, and Ram Nevatia
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 28453-28462, 2024
ShARc: Shape and Appearance ReCognition for Person Identification In-the-Wild
Haidong Zhu, Wanrong Zheng, Zhaoheng Zheng, and Ram Nevatia
In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 6290-6300, 2024
CAILA: Concept-Aware Intra-Layer Adapters for Compositional Zero-Shot Learning
Zhaoheng Zheng, Haidong Zhu, and Ram Nevatia
In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 1721-1731, 2024
GaitSTR: Gait Recognition With Sequential Two-Stream Refinement
Wanrong Zheng*, Haidong Zhu*, Zhaoheng Zheng, and Ram Nevatia
In IEEE Transactions on Biometrics, Behavior, and Identity Science (TBIOM), 2024
GaitRef: Gait Recognition with Refined Sequential Skeletons
Haidong Zhu*, Wanrong Zheng*, Zhaoheng Zheng, and Ram Nevatia
In Proceedings of IEEE International Joint Conference on Biometrics (IJCB), 2023 (Oral)
Multimodal Neural Radiance Field
Haidong Zhu, Yuyin Sun, Chi Liu, Lu Xia, Jiajia Luo, Nan Qiao, Ram Nevatia, and Cheng-Hao Kuo
In IEEE International Conference on Robotics and Automation (ICRA), pp. 9393-9399, 2023
Gait Recognition Using 3-D Human Body Shape Inference
Haidong Zhu, Zhaoheng Zheng, and Ram Nevatia
In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 909-918, 2023
Self-supervised Learning for Sentiment Analysis via Image-text Matching
Haidong Zhu, Zhaoheng Zheng, Mohammad Soleymani, and Ram Nevatia
In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1710-1714, 2022
Utilizing Every Image Object for Semi-supervised Phrase Grounding
Haidong Zhu, Arka Sadhu, Zhaoheng Zheng, and Ram Nevatia
In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 2210-2219, 2021
Curriculum DeepSDF
Yueqi Duan*, Haidong Zhu*, He Wang, Li Yi, Ram Nevatia, and Leonidas J. Guibas
In European Conference on Computer Vision (ECCV), pp. 51-67, 2020
Pick-and-Learn: Automatic Quality Evaluation for Noisy-Labeled Image Segmentation
Haidong Zhu, Jialin Shi, and Ji Wu
In Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), LNCS 11769, pp. 576-584, 2019
Biologically-Constrained Graphs for Global Connectomics Reconstruction
Brian Matejek, Daniel Haehn, Haidong Zhu, Donglai Wei, Toufiq Parag, and Hanspeter Pfister
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2089-2098, 2019
Workshop, Survey and System Manuscripts
AG-ReID 2023: Aerial-Ground Person Re-identification Challenge Results
Kien Nguyen, Clinton Fookes, Sridha Sridharan, Feng Liu, Xiaoming Liu, Arun Ross, Dana Michalski, Huy Nguyen, Debayan Deb, Mahak Kothari, Manisha Saini, Dawei Du, Scott McCloskey, Gabriel Bertocco, Fernanda Andaló, Terrance E Boult, Anderson Rocha, Haidong Zhu, Zhaoheng Zheng, Ram Nevatia, Zaigham Randhawa, Sinan Sabri, Gianfranco Doretto
In Proceedings of IEEE International Joint Conference on Biometrics (IJCB), 2023
CAT-NeRF: Constancy-Aware Tx2Former for Dynamic Body Modeling
Haidong Zhu, Zhaoheng Zheng, Wanrong Zheng, and Ram Nevatia
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 6618-6627, 2023
GAIA at SM-KBP 2020 - A Dockerized Multi-media Multi-lingual Knowledge Extraction, Clustering, Temporal Tracking and Hypothesis Generation System
Manling Li, Ying Lin, Tuan Manh Lai, Xiaoman Pan, Haoyang Wen, Sha Li, Zhenhailong Wang, Pengfei Yu, Lifu Huang, Di Lu, Qingyun Wang, Haoran Zhang, Qi Zeng, Chi Han, Zixuan Zhang, Yujia Qin, Xiaodan Hu, Nikolaus Parulian, Daniel Campos, Heng Ji, Brian Chen, Xudong Lin, Alireza Zareian, Amith Ananthram, Emily Allaway, Shih-Fu Chang, Kathleen McKeown, Yixiang Yao, Michael Spector, Mitchell DeHaven, Daniel Napierski, Marjorie Freedman, Pedro Szekely, Haidong Zhu, Ram Nevatia, Yang Bai, Yifan Wang, Ali Sadeghian, Haodi Ma, Daisy Zhe Wang
In Proceedings of the Thirteenth Text Analysis Conference (TAC), 2020
CPARR: Category-based Proposal Analysis for Referring Relationships
Chuanzi He, Haidong Zhu, Jiyang Gao, Kan Chen, and Ram Nevatia
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 4074-4083, 2020
Thesis
Shape-assisted Multimodal Person Re-Identification
• Doctor of Philosophy, University of Southern California, 2024.
• Advisor: Prof. Ram Nevatia
Deep Learning Based Target Delineation System
• Bachelor of Engineering, Tsinghua University, 2019.
• Advisor: Prof. Ji Wu
Selected Projects
Structural Relational Reasoning for Point Clouds
• Introduced a structural relational network (SRN) for reasoning.
• Improved the results on public point cloud datasets.
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 949-958, 2019.
Online Big Data Face Recognition System
• Real-time face recognition with data from the Internet.
• Big data management policy for renewing the database.
• Predicting relationships between the people in an image.
Visual-audio Similarity Evaluation System
• Evaluating similarity between given audio and visual fragments.
• Sequence feature extraction and similarity evaluation.
Competition & Lecture Management System
• Lecture management system with WeChat and website versions.
• Organizing information according to users' habits and needs.

Website source from Jon Barron and Xingyuan Sun.