Homepage - Haidong Zhu

Haidong Zhu

I am a research scientist at Waymo Research. I received my Ph.D. in Computer Science from the University of Southern California (USC) in May 2024. During my Ph.D., I had the privilege of working at the USC IRIS Computer Vision Lab, advised by Prof. Ram Nevatia. Before that, I completed my bachelor's degree in Electronic Engineering at Tsinghua University in 2019.

My research interests lie in computer vision and deep learning, especially multimodal analysis for 3-D vision and biometric understanding.

Email / CV / GitHub / Google Scholar

Experience

Waymo LLC, New York, NY
Research Scientist • Jun. 2024 to Present

Developed an auto-labeling pipeline using large language models for semantic labeling of bird's-eye-view (BEV) data.
Implemented diffusion models for accurate road graph construction, improving accuracy over the baseline by 30%.
Integrated Semantic Neural Radiance Fields for efficient semantic segmentation in partially labeled scenarios.

Education

University of Southern California, Los Angeles, CA
Ph.D., Computer Science • Aug. 2019 to May 2024

Tsinghua University, Beijing, China
B.Eng., Electronic Information Science and Technology • Sep. 2015 to Jul. 2019

Internship and On-campus Experience

IRIS Computer Vision Lab, USC, Los Angeles, CA
Research/Teaching Assistant • Aug. 2019 to May 2024
Advisor: Prof. Ram Nevatia

Enhanced 3D reconstruction methods with implicit neural representations.
Investigated biometric identification through gait and 3D body shape analysis.
Developed multimodal sentiment analysis models using self-supervised learning.

Applied Sciences Group, Microsoft, Redmond, WA
Research Intern • May 2023 to Aug. 2023
Advisor: Dr. Tianyu Ding

Developed few-shot generalizable Neural Radiance Field (NeRF) architectures.
Implemented NeRF-based methods for interactive 3D scene editing.

Lab 126, Amazon, Bellevue, WA
Applied Scientist Intern • May 2022 to Aug. 2022
Advisor: Dr. Yuyin Sun

Designed a multimodal NeRF framework integrating RGB and depth data.
Enhanced robustness of 3D point cloud registration algorithms.

AI Lab, Bytedance Inc., Mountain View, CA
Research Intern • May 2021 to Aug. 2021
Advisor: Dr. Ye Yuan

Developed methods for single-image-based 3D human mesh reconstruction.
Created generative models for automatic clothing geometry and texture synthesis.

Preprint

The Efficiency Spectrum of Large Language Models: An Algorithmic Survey

Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, and Luming Liang

[Paper] [Project]

Selected Manuscripts

	CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering Haidong Zhu, Tianyu Ding, Tianyi Chen, Ilya Zharkov, Ram Nevatia, and Luming Liang Accepted to European Conference on Computer Vision (ECCV), 2024 [Paper] [Supp] [Project] [Code]
	SEAS: ShapE-Aligned Supervision for Person Re-Identification Haidong Zhu, Pranav Budhwant, Zhaoheng Zheng, and Ram Nevatia In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 164-174, 2024 [Paper] [Supp]
	Large Language Models are Good Prompt Learners for Low-Shot Image Classification Zhaoheng Zheng, Jingmin Wei, Xuefeng Hu, Haidong Zhu, and Ram Nevatia In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 28453-28462, 2024 [Paper] [Supp] [Code]
	ShARc: Shape and Appearance ReCognition for Person Identification In-the-Wild Haidong Zhu, Wanrong Zheng, Zhaoheng Zheng, and Ram Nevatia In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 6290-6300, 2024 [Paper]
	CAILA: Concept-Aware Intra-Layer Adapters for Compositional Zero-Shot Learning Zhaoheng Zheng, Haidong Zhu, and Ram Nevatia In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 1721-1731, 2024 [Paper] [Code]
	GaitSTR: Gait Recognition With Sequential Two-Stream Refinement Wanrong Zheng, Haidong Zhu, Zhaoheng Zheng, and Ram Nevatia In IEEE Transactions on Biometrics, Behavior, and Identity Science (TBIOM), 2024. [Paper]
	GaitRef: Gait Recognition with Refined Sequential Skeletons Haidong Zhu, Wanrong Zheng, Zhaoheng Zheng, and Ram Nevatia In Proceedings of IEEE International Joint Conference on Biometrics (IJCB), 2023 (Oral) [Paper] [Code]
	Multimodal Neural Radiance Field Haidong Zhu, Yuyin Sun, Chi Liu, Lu Xia, Jiajia Luo, Nan Qiao, Ram Nevatia, and Cheng-Hao Kuo In IEEE International Conference on Robotics and Automation (ICRA), pp. 9393-9399, 2023 [Paper]
	Gait Recognition Using 3-D Human Body Shape Inference Haidong Zhu, Zhaoheng Zheng, and Ram Nevatia In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 909-918, 2023 [Paper] [Supp]
	Self-supervised Learning for Sentiment Analysis via Image-text Matching Haidong Zhu, Zhaoheng Zheng, Mohammad Soleymani, and Ram Nevatia In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1710-1714, 2022 [Paper]
	Utilizing Every Image Object for Semi-supervised Phrase Grounding Haidong Zhu, Arka Sadhu, Zhaoheng Zheng, and Ram Nevatia In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 2210-2219, 2021 [Paper]
	Curriculum DeepSDF Yueqi Duan, Haidong Zhu, He Wang, Li Yi, Ram Nevatia, and Leonidas J. Guibas In European Conference on Computer Vision (ECCV), pp. 51-67, 2020 [Paper] [Code]
	Pick-and-Learn: Automatic Quality Evaluation for Noisy-Labeled Image Segmentation Haidong Zhu, Jialin Shi, and Ji Wu In Proceedings of the International Conf. on Medical Image Computing and Computer Assisted Intervention (MICCAI), LNCS 11769, pp. 576-584, 2019. [Paper]
	Biologically-Constrained Graphs for Global Connectomics Reconstruction Brian Matejek, Daniel Haehn, Haidong Zhu, Donglai Wei, Toufiq Parag, and Hanspeter Pfister In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2089-2098, 2019. [Paper] [Supp] [Code]

Workshop, Survey and System Manuscripts

	AG-ReID 2023: Aerial-Ground Person Re-identification Challenge Results Kien Nguyen, Clinton Fookes, Sridha Sridharan, Feng Liu, Xiaoming Liu, Arun Ross, Dana Michalski, Huy Nguyen, Debayan Deb, Mahak Kothari, Manisha Saini, Dawei Du, Scott McCloskey, Gabriel Bertocco, Fernanda Andaló, Terrance E Boult, Anderson Rocha, Haidong Zhu, Zhaoheng Zheng, Ram Nevatia, Zaigham Randhawa, Sinan Sabri, Gianfranco Doretto In Proceedings of IEEE International Joint Conference on Biometrics (IJCB), 2023 [Paper]
	CAT-NeRF: Constancy-Aware Tx²Former for Dynamic Body Modeling Haidong Zhu, Zhaoheng Zheng, Wanrong Zheng, and Ram Nevatia In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 6618-6627, 2023 [Paper] [Code] [Example Videos]
	GAIA at SM-KBP 2020 - A Dockerized Multi-media Multi-lingual Knowledge Extraction, Clustering, Temporal Tracking and Hypothesis Generation System Manling Li, Ying Lin, Tuan Manh Lai, Xiaoman Pan, Haoyang Wen, Sha Li, Zhenhailong Wang, Pengfei Yu, Lifu Huang, Di Lu, Qingyun Wang, Haoran Zhang, Qi Zeng, Chi Han, Zixuan Zhang, Yujia Qin, Xiaodan Hu, Nikolaus Parulian, Daniel Campos, Heng Ji, Brian Chen, Xudong Lin, Alireza Zareian, Amith Ananthram, Emily Allaway, Shih-Fu Chang, Kathleen McKeown, Yixiang Yao, Michael Spector, Mitchell DeHaven, Daniel Napierski, Marjorie Freedman, Pedro Szekely, Haidong Zhu, Ram Nevatia, Yang Bai, Yifan Wang, Ali Sadeghian, Haodi Ma, Daisy Zhe Wang In Proceedings of the Thirteenth Text Analysis Conference (TAC), 2020 [Paper]
	CPARR: Category-based Proposal Analysis for Referring Relationships Chuanzi He, Haidong Zhu, Jiyang Gao, Kan Chen, and Ram Nevatia In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 4074-4083, 2020. [Paper]

Thesis

	Shape-assisted Multimodal Person Re-Identification • Doctor of Philosophy, University of Southern California, 2024. • Advisor: Prof. Ram Nevatia [Thesis] [Defense]
	Deep Learning Based Target Delineation System • Bachelor of Engineering, Tsinghua University, 2019. • Advisor: Prof. Ji Wu [Thesis] [Proposal] [Midterm] [Defense]

Selected Projects

	Structural Relational Reasoning for Point Clouds • Introduced a structural relational network (SRN) for reasoning. • Improved results on public point cloud datasets. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 949-958, 2019. [Paper]
	Online Big Data Face Recognition System • Real-time face recognition with data from the Internet. • Big data management policy for renewing the database. • Predicting relationships between the people in an image. [Code]
	Visual-audio Similarity Evaluation System • Evaluating similarity between given audio and visual fragments. • Sequence feature extraction and similarity evaluation. [Code]
	Competition & Lecture Management System • Lecture management system with WeChat and website versions. • Organizing information according to users' habits and needs. [Code]

Website source from Jon Barron and Xingyuan Sun.