Yubo ZHANG

About Me

I’m Yubo Zhang, a Ph.D. candidate in the Department of Computer Science, UNC-Chapel Hill, working with Prof. Stephen Pizer. My research interests lie in deep learning approaches for computer vision and medical image analysis. During my first two years at UNC, I worked with Prof. Mohit Bansal on language-and-vision grounding tasks.

I received my bachelor’s degree from the Department of Automation, Tsinghua University, in 2018. During my undergraduate studies, I spent the summer of 2017 at the University of Southern California working with Prof. C.-C. Jay Kuo on medical image processing.

News

Experience


Research

Real-time 3D reconstruction of colonoscopic surfaces

  • Real-time 3D reconstruction using deep learning (depth estimation) and SLAM methods, applied to colonoscopy videos. Project website.
  • Deep learning for image enhancement, applied to colonoscopy video frames.
  • Geometric algorithms for evaluating the reconstructed colonoscopic surfaces.
  1. Shuxian Wang*, Yubo Zhang*, Sarah K. McGill, Julian G. Rosenman, Jan-Michael Frahm, Soumyadip Sengupta, Stephen M. Pizer. “A Surface-normal Based Neural Framework for Colonoscopy Reconstruction,” in IPMI 2023.
  2. Yubo Zhang, Jan-Michael Frahm, Samuel Ehrenstein, Sarah K. McGill, Julian G. Rosenman, Shuxian Wang, Stephen M. Pizer. “ColDE: A Depth Estimation Framework for Colonoscopy Reconstruction,” arXiv preprint arXiv:2111.10371 (2021).
  3. Yubo Zhang, Shuxian Wang, Ruibin Ma, Sarah K. McGill, Julian Rosenman, Stephen Pizer. “Lighting Enhancement Aids Reconstruction of Colonoscopic Surfaces,” in IPMI 2021.
  4. Ruibin Ma, Rui Wang, Yubo Zhang, Stephen Pizer, Sarah K. McGill, Julian Rosenman. “RNNSLAM: Reconstructing the 3D Colon to Visualize Missing Regions during a Colonoscopy,” in Medical Image Analysis 72 (2021): 102100.
  5. Ruibin Ma, Sarah K. McGill, Rui Wang, Julian Rosenman, Jan-Michael Frahm, Yubo Zhang, Stephen Pizer. “Colon10K: A Benchmark for Place Recognition in Colonoscopy,” in ISBI 2021.

Vision-and-language

Vision-and-language grounding problems, such as vision-and-language navigation (VLN) and visual question answering (VQA), with a focus on improving the generalizability and interpretability of multimodal neural models.

  1. Yubo Zhang*, Hao Tan*, Mohit Bansal. “Diagnosing the Environment Bias in Vision-and-Language Navigation,” in IJCAI 2020.
  2. Yubo Zhang, Feiyang Niu, Qing Ping, Govind Thattai. “A Multi-level Alignment Training Scheme for Video-and-Language Grounding,” ICDM 2022, FOMO-VL Workshop.

MRI image super-resolution and segmentation

MRI image super-resolution and segmentation using deep neural networks.

Information theory in synthetic biological processes

Wavelet transform in cable fault detection