- Machine Perception
  - Efficient long-term video understanding [CMU-DIVA]
  - Large Multimodal Model [NeurIPS'22, NeurIPS'22]
  - Learning from 3D simulation [ForkingPaths], [SimAug], [CARLA Sim]
- Forecasting
  - Action Anticipation & Human Intention Prediction [Next-prediction]
  - Trajectory Prediction [Multiverse]
  - Spatial-temporal Forecasting
- Navigation
  - Point/Image/Object Goal Navigation
  - Social Navigation
- Manipulation
  - Mobile Manipulation
  - Whole Body Control
  - ChatGPT + Mobile Manipulation @Jacobi.ai
- Pedestrian Trajectory Prediction
- Efficient Action Detection
- 3D Event Reconstruction
- Shooter Localization System (Best Demo Award at CBMI 2019)
- Multimodal Question Answering