Cho-Ying Wu

Senior Research Scientist @ Bosch Research

Formerly @ Google Pixel Camera Team

PhD from the CS Department of the University of Southern California, advised by Prof. Ulrich Neumann. Before that, I obtained my MS degree from the Graduate Institute of Communication Engineering at National Taiwan University. I earned a double-major degree in Electrical Engineering and Law from National Taiwan University.

I also passed the Attorney of Higher Examination in Taiwan in 2016. This is equivalent to the bar exam in the United States.

Internships:

Argo AI, 2019

Amazon, 2020

Facebook, 2021

NVIDIA Toronto AI Lab, 2022

Publications

No Calibration, No Depth, No Problem: Cross-Sensor View Synthesis with 3D Consistency
Cho-Ying Wu, Zixun Huang, Xinyu Huang, Liu Ren
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026
[paper] [project page]
This work synthesizes view-aligned RGB-X pairs (thermal, NIR, SAR, normal maps ...) from either raw sensor sequences or style maps to facilitate multi-modality learning. It proposes a match-densify-consolidate framework that proceeds from cross-modal image matching, through guided densification, to consolidation in 3DGS.

Pantheon360: Taming Digital Twin Generation via 3D-Aware 360° Video Diffusion
Ting-Hsuan Chen, Ying-Huan Chen, Tao Tu, Jie-Ying Lee, Cho-Ying Wu, Fangzhou Lin, Hengyuan Zhang, David Paz, Xinyu Huang, Yuliang Guo, Yu-Lun Liu, Yue Wang, Liu Ren
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026
Pantheon360 generates controllable 360° videos from sparse panoramic inputs and user-defined trajectories, using a video diffusion model to produce photorealistic outputs. It drastically simplifies digital twin creation while preserving geometric consistency and visual quality.

3DGEER: 3D Gaussian Rendering Made Exact and Efficient for Generic Cameras
Zixun Huang, Cho-Ying Wu, Yuliang Guo, Xinyu Huang, Liu Ren
International Conference on Learning Representations (ICLR) 2026
(Top 1% score)
[paper] [project page] [openreview] [video]
An efficient volumetric Gaussian rendering method redesigned from first principles. Through exact closed-form Gaussian ray integration and a novel particle bounding design, we eliminate projective approximation error, support large-FoV cameras, and preserve both exactness and speed in differentiable rendering.

Online Language Splatting
Saimouli Katragadda, Cho-Ying Wu, Yuliang Guo, Xinyu Huang, Guoquan Huang, Liu Ren
International Conference on Computer Vision (ICCV) 2025
[paper] [project page] [code] [video]
Online SLAM that integrates dense CLIP features into Gaussian Splatting mapping. The strategy attains high speed and strong open-vocabulary capability, and retains the base performance for visual synthesis and camera trajectory estimation via modality disentanglement.

InSpaceType: Dataset and Benchmark for Reconsidering Cross-Space Type Performance in Indoor Monocular Depth
Cho-Ying Wu, Quankai Gao, Chin-Cheng Hsu, Te-Lin Wu, Jing-Wen Chen, Ulrich Neumann
The British Machine Vision Conference (BMVC) 2024, acceptance rate ~26.4%
Conference on Robot Learning 2023 (CoRL), Workshop
[paper] [project page] [code] [data]
This work introduces a dataset and benchmark that reconsider an important but usually overlooked factor: space type. We analyze ten SOTA models and four popular training datasets in detail and unveil their potential biases.

See our project page to download the datasets!

Meta-Optimization for Higher Model Generalizability in Single-Image Depth Prediction
Cho-Ying Wu, Yiqi Zhong, Junying Wang, Ulrich Neumann
International Conference on Intelligent Robots and Systems (IROS), 2024
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2023, Workshop
[paper] [code]
This work studies popular monocular depth estimation from the perspective of the learning scheme. We formulate our meta-learning-based method around a novel fine-grained task concept to address the low-affinity issue in single images. We show performance gains from simply changing the learning scheme.

Toward Practical Monocular Indoor Depth Estimation
Cho-Ying Wu, Jialiang Wang, Michael Hall, Ulrich Neumann, Shuochen Su
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[paper] [project page] [code] [data] [video] [poster]
Practical indoor depth estimation: no depth annotation, efficient training data collection, high generalizability, and accurate, real-time depth sensing.

See our project page to download the largest datasets for indoor stereo!

Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Cho-Ying Wu, Chin-Cheng Hsu, Ulrich Neumann
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[paper] [code] [project page] [video] [poster]
An analysis of the statistical correlation between voices and 3D faces. Unlike previous work using 2D representations that include background or hairstyle variations, our 3D approach better validates the correlation between voices and geometry.

See our project page for an explanation of the correlation between face geometry and voice!

Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry
Cho-Ying Wu, Qiangeng Xu, Ulrich Neumann
IEEE International Conference on 3D Vision (3DV), 2021
[paper] [code] [project page] [video] [poster]
This work attains the state of the art on 3D facial geometry prediction, including 3D facial alignment, face orientation estimation, and 3D face modeling.

Check out our code for SOTA performance on 3D facial alignment and face pose estimation!

Scene Completeness-Aware Lidar Depth Completion for Driving Scenario
Cho-Ying Wu, Ulrich Neumann
IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP), 2021
[paper] [code] [project page] [1-min demo] [long version video] [poster] [slides]
This work is the first to address the scene-completeness issue in depth completion. We obtain both structured and accurate scene depth.

Geometry-Aware Instance Segmentation with Disparity Maps
Cho-Ying Wu, Xiaoyan Hu, Michael Happold, Qiangeng Xu, Ulrich Neumann
IEEE Conference on Computer Vision and Pattern Recognition Workshop Scalability in Autonomous Driving (CVPRw), 2020
[paper] [project page] [code] [video]
The first outdoor instance segmentation method that uses disparity maps. Building on Mask R-CNN, we show that multi-modal geometric information can improve performance.

Grid-GCN for Fast and Scalable Point Cloud Learning
Qiangeng Xu, Xudong Sun, Cho-Ying Wu, Panqu Wang, Ulrich Neumann
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
[paper] [code]

Deep RGB-D Canonical Correlation Analysis for Sparse Depth Completion
Cho-Ying Wu*, Yiqi Zhong*, Suya You, Ulrich Neumann (*Equal Contribution)
Conference on Neural Information Processing Systems (NeurIPS), 2019
[paper] [code] [Youtube video] [poster]
We study deep canonical correlation analysis for multi-modal fusion in depth completion and attain SOTA performance when only a few sparse measurements are available.

Occluded Face Recognition Using Low-rank Regression with Generalized Gradient Direction
Cho-Ying Wu, Jian-Jiun Ding
Pattern Recognition (PR), vol. 80, pp. 256–268, 2018
[paper] [code]
A robust and efficient occluded face recognition framework that attains SOTA performance using a sparse and low-rank model.

Occlusion Pattern-based Dictionary For Robust Face Recognition
Cho-Ying Wu, Jian-Jiun Ding
IEEE International Conference on Multimedia & Expo (ICME), 2016
[paper]

Academic Activities

Reviewer

Regularly reviewing for conferences: CVPR, ECCV, ICCV, AAAI, NeurIPS, ICLR, ICML, ACCV, ICIP