News
[2024-02] One paper accepted by CVPR, thanks to all co-authors!
[2024-01] One paper accepted by ICLR, thanks to all co-authors!
[2023-12] Started visiting the Human Sensing Lab @ CMU!
[2023-10] One paper accepted by Pattern Recognition, thanks to all co-authors!
[2022-08] We wrote an article about recent advances in audio-visual learning! [website]
[2022-05] Gave a talk @ 2022 BAAI Conference. Please find slides here!
[2022-03] Two papers accepted by CVPR 2022, thanks to all co-authors!
[2021-12] One paper accepted by TPAMI, thanks to all co-authors!
[2021-06] Graduated from University of Electronic Science and Technology of China (UESTC)!
|
Services
Conference Reviewer: CVPR 2022-2024, ECCV 2022, ICCV 2023, AAAI 2023-2024
Journal Reviewer: TMM, TPAMI, TCSVT
|
Publications (* equal contribution)
|
Enhancing Multi-modal Cooperation via Sample-level Modality Valuation
Yake Wei, Ruoxuan Feng, Zihe Wang, Di Hu
CVPR, 2024
arXiv / code
Observe and improve the fine-grained cooperation between modalities at the sample level.
|
Quantifying and Enhancing Multi-modal Robustness with Modality Preference
Zequn Yang, Yake Wei, Ce Liang, Di Hu
ICLR, 2024
arXiv / code
Analyze essential components for multi-modal robustness and delve into the limitations imposed by modality preference.
|
Geometric-inspired graph-based Incomplete Multi-view Clustering
Zequn Yang, Han Zhang, Yake Wei, Zheng Wang, Feiping Nie, Di Hu
Pattern Recognition
paper / code
Conduct geometric analyses to mitigate missing views in weight aggregation.
|
Balanced Multimodal Learning via On-the-fly Gradient Modulation
Xiaokang Peng*, Yake Wei*, Andong Deng, Dong Wang, Di Hu
CVPR, 2022 (Oral Presentation)
arXiv / code
Alleviate optimization imbalance in multi-modal learning via on-the-fly gradient modulation.
|
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li*, Yake Wei*, Yapeng Tian*, Chenliang Xu, Ji-Rong Wen, Di Hu
CVPR, 2022 (Oral Presentation)
arXiv / project page
Explore Audio-Visual Question Answering and propose the MUSIC-AVQA dataset.
|
Class-aware Sounding Objects Localization via Audiovisual Correspondence
Di Hu, Yake Wei, Rui Qian, Weiyao Lin, Ruihua Song, Ji-Rong Wen
TPAMI
arXiv / project page
Discriminative localization of sounding objects.
|