Multimodal Embodied AI and Robotic Systems Lab, Nanyang Technological University, Singapore

Research

Our research has been published in top-tier venues across AI (TPAMI, IJCV, NeurIPS, ICLR, CVPR, ICCV, ECCV), robotics (ICRA, IROS), and interdisciplinary fields (Patterns by Cell Press, CACIE, BIB), reflecting our commitment to advancing science through impactful and innovative contributions.

Highlighted

2026

REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?
Chenxi Jiang, Chuhao Zhou, Jianfei Yang
ICLR 2026  ·  22 Jan 2026  ·  arXiv:2505.10872
RM-RL: Role-Model Reinforcement Learning for Precise Robot Manipulation
Xiangyu Chen, Chuhao Zhou, Yuxi Liu, Jianfei Yang
ICRA 2026  ·  22 Jan 2026  ·  arXiv:2510.15189
Real-world RL framework that uses a role-model strategy to label online samples for offline supervised replay, achieving millimeter-level precision in robot manipulation.

2025

IoT-LLM: a framework for enhancing Large Language Model reasoning from real-world sensor data
Tuo An, Yunjiao Zhou, Han Zou, Jianfei Yang
Patterns, Cell Press  ·  10 Oct 2025  ·  arXiv:2410.02429
A retrieval-augmented generation (RAG) framework that enhances LLM reasoning over real-world IoT sensor data across diverse tasks.

All

2026

M4Human: A Large-Scale Multimodal mmWave Radar Benchmark for Human Mesh Reconstruction
Junqiao Fan, Yunjiao Zhou, Yizhuo Yang, Xinyuan Cui, Jiarui Zhang, Lihua Xie, Jianfei Yang, Chris Xiaoxuan Lu, Fangqiang Ding
CVPR 2026  ·  23 Feb 2026  ·  arXiv:2512.12378
The largest multimodal mmWave radar benchmark (661K frames) for high-fidelity human mesh reconstruction.
When Robots Should Say "I Don't Know": Benchmarking Abstention in Embodied Question Answering
Tao Wu, Chuhao Zhou, Guangyu Zhao, Haozhi Cao, Yewen Pu, Jianfei Yang
CVPR 2026  ·  23 Feb 2026  ·  arXiv:2512.04597
First benchmark for evaluating the abstention ability of embodied agents when facing ambiguous human queries.
$\mathbf{M^3A}$ Policy: Mutable Material Manipulation Augmentation Policy through Photometric Re-rendering
Jiayi Li, Yuxuan Hu, Haoran Geng, Xiangyu Chen, Chuhao Zhou, Ziteng Cui, Jianfei Yang
CVPR 2026 (Findings)  ·  23 Feb 2026  ·  arXiv:2512.01446
A photometric re-rendering framework for material-generalized robotic manipulation via computational photography.
RF-MatID: Dataset and Benchmark for Radio Frequency Material Identification
Xinyan Chen, Qinchun Li, Ruiqin Ma, Jiaqi Bai, Li Yi, Jianfei Yang
ICLR 2026  ·  22 Jan 2026  ·  arXiv:2601.20377
First open-source large-scale wide-band RF dataset and benchmark for fine-grained material identification.
REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?
Chenxi Jiang, Chuhao Zhou, Jianfei Yang
ICLR 2026  ·  22 Jan 2026  ·  arXiv:2505.10872
RM-RL: Role-Model Reinforcement Learning for Precise Robot Manipulation
Xiangyu Chen, Chuhao Zhou, Yuxi Liu, Jianfei Yang
ICRA 2026  ·  22 Jan 2026  ·  arXiv:2510.15189
Real-world RL framework that uses a role-model strategy to label online samples for offline supervised replay, achieving millimeter-level precision in robot manipulation.

2025

mmPred: Radar-based Human Motion Prediction in the Dark
Junqiao Fan, Haocong Rao, Jiarui Zhang, Jianfei Yang, Lihua Xie
AAAI 2026  ·  01 Dec 2025  ·  arXiv:2512.00345
First diffusion-based framework for radar-based human motion prediction, robust under adverse environments.
Zero-Shot Open-Vocabulary Human Motion Grounding with Test-Time Training
Yunjiao Zhou, Xinyan Chen, Junlang Qian, Lihua Xie, Jianfei Yang
AAAI 2026  ·  01 Nov 2025  ·  arXiv:2511.15379
A zero-shot framework for open-vocabulary human motion grounding via LLM-guided decomposition and soft masking optimization.
Mask2IV: Interaction-Centric Video Generation via Mask Trajectories
Gen Li, Bo Zhao, Jianfei Yang, Laura Sevilla-Lara
AAAI 2026  ·  01 Nov 2025  ·  arXiv:2510.03135
A two-stage framework for interaction-centric video generation that predicts mask trajectories for both actors and objects.
IoT-LLM: a framework for enhancing Large Language Model reasoning from real-world sensor data
Tuo An, Yunjiao Zhou, Han Zou, Jianfei Yang
Patterns, Cell Press  ·  10 Oct 2025  ·  arXiv:2410.02429
A retrieval-augmented generation (RAG) framework that enhances LLM reasoning over real-world IoT sensor data across diverse tasks.
Interactive Test-Time Adaptation with Reliable Spatial-Temporal Voxels for Multi-Modal Segmentation
Haozhi Cao, Yuecong Xu, Pengyu Yin, Xingyu Ji, Shenghai Yuan, Jianfei Yang, Lihua Xie
ECCV 2024  ·  07 Oct 2025  ·  arXiv:2403.06461

2024

Video Unsupervised Domain Adaptation with Deep Learning: A Comprehensive Survey
Yuecong Xu, Haozhi Cao, Lihua Xie, Xiao-li Li, Zhenghua Chen, Jianfei Yang
ACM Computing Surveys  ·  01 Oct 2024  ·  doi:10.1145/3679010
MoPA: Multi-Modal Prior Aided Domain Adaptation for 3D Semantic Segmentation
Haozhi Cao, Yuecong Xu, Jianfei Yang, Pengyu Yin, Shenghai Yuan, Lihua Xie
ICRA 2024  ·  13 May 2024  ·  doi:10.1109/ICRA57147.2024.10610316
Can We Evaluate Domain Adaptation Models Without Target-Domain Labels?
Jianfei Yang, Hanjie Qian, Yuecong Xu, Kai Wang, Lihua Xie
ICLR 2024  ·  01 Jan 2024  ·  arXiv:2305.18712

2023

Going Deeper into Recognizing Actions in Dark Environments: A Comprehensive Benchmark Study
Yuecong Xu, Haozhi Cao, Jianxiong Yin, Zhenghua Chen, Xiaoli Li, Zhengguo Li, Qianwen Xu, Jianfei Yang
International Journal of Computer Vision  ·  08 Nov 2023  ·  doi:10.1007/s11263-023-01932-5
Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation
Haozhi Cao, Yuecong Xu, Jianfei Yang, Pengyu Yin, Shenghai Yuan, Lihua Xie
ICCV 2023  ·  01 Oct 2023  ·  doi:10.1109/ICCV51070.2023.01724
MetaFi++: WiFi-Enabled Transformer-Based Human Pose Estimation for Metaverse Avatar Simulation
Yunjiao Zhou, He Huang, Shenghai Yuan, Han Zou, Lihua Xie, Jianfei Yang
IEEE Internet of Things Journal (IoT-J)  ·  15 Aug 2023  ·  doi:10.1109/JIOT.2023.3262940
MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing
Jianfei Yang, He Huang, Yunjiao Zhou, Xinyan Chen, Yuecong Xu, Shenghai Yuan, Han Zou, Chris Xiaoxuan Lu, Lihua Xie
NeurIPS 2023 Datasets and Benchmarks Track  ·  11 May 2023  ·  arXiv:2305.10345
The first multi-modal non-intrusive 4D human dataset, spanning RGB, depth, LiDAR, mmWave radar, and WiFi.
SenseFi: A library and benchmark on deep-learning-empowered WiFi human sensing
Jianfei Yang, Xinyan Chen, Han Zou, Chris Xiaoxuan Lu, Dazhuo Wang, Sumei Sun, Lihua Xie
Patterns, Cell Press  ·  10 Mar 2023  ·  doi:10.1016/j.patter.2023.100703
The first open-source library and benchmark for deep-learning-empowered WiFi sensing.
Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors
Jianfei Yang, Xiangyu Peng, Kai Wang, Zheng Zhu, Jiashi Feng, Lihua Xie, Yang You
ICLR 2023  ·  01 Jan 2023  ·  arXiv:2205.14467