Yake Wei (卫雅珂)

I am a third-year PhD student at the Gaoling School of Artificial Intelligence, Renmin University of China. I am advised by Prof. Di Hu. I am currently visiting the Human Sensing Lab @ CMU. My research interests focus on multi-modal learning.

I received my bachelor's degree in Computer Science and Technology from the University of Electronic Science and Technology of China (UESTC). I had a wonderful time with my friends in Chengdu, China, from 2017 to 2021.

Email / Google Scholar / GitHub

News

[2024-02] One paper accepted by CVPR, thanks to all co-authors!

[2024-01] One paper accepted by ICLR, thanks to all co-authors!

[2023-12] Started visiting the Human Sensing Lab @ CMU!

[2023-10] One paper accepted by Pattern Recognition, thanks to all co-authors!

[2022-08] We wrote an article about recent advances in audio-visual learning! [website]

[2022-05] Gave a talk @ 2022 BAAI Conference. Please find slides here!

[2022-03] Two papers accepted by CVPR 2022, thanks to all co-authors!

[2021-12] One paper accepted by TPAMI, thanks to all co-authors!

[2021-06] Graduated from the University of Electronic Science and Technology of China (UESTC)!

Services

Conference Reviewer: CVPR 2022-2024, ECCV 2022, ICCV 2023, AAAI 2023-2024

Journal Reviewer: TMM, TPAMI, TCSVT

Preprint
Learning in Audio-visual Context: A Review, Analysis, and New Perspective

Yake Wei, Di Hu, Yapeng Tian, Xuelong Li

Under review
arXiv / website / awesome list

A systematic survey of the audio-visual learning field.

Publications (* equal contribution)
Enhancing Multi-modal Cooperation via Sample-level Modality Valuation

Yake Wei, Ruoxuan Feng, Zihe Wang, Di Hu

CVPR, 2024
arXiv / code

Observe and improve the fine-grained cooperation between modalities at the sample level.

Quantifying and Enhancing Multi-modal Robustness with Modality Preference

Zequn Yang, Yake Wei, Ce Liang, Di Hu

ICLR, 2024
arXiv / code

Analyze essential components for multi-modal robustness and delve into the limitations imposed by modality preference.

Geometric-inspired graph-based Incomplete Multi-view Clustering

Zequn Yang, Han Zhang, Yake Wei, Zheng Wang, Feiping Nie, Di Hu

Pattern Recognition
paper / code

Conduct geometric analyses to mitigate missing views in weight aggregation.

Balanced Multimodal Learning via On-the-fly Gradient Modulation

Xiaokang Peng*, Yake Wei*, Andong Deng, Dong Wang, Di Hu

CVPR, 2022   (Oral Presentation)
arXiv / code

Alleviate optimization imbalance in multi-modal learning via on-the-fly gradient modulation.

Learning to Answer Questions in Dynamic Audio-Visual Scenarios

Guangyao Li*, Yake Wei*, Yapeng Tian*, Chenliang Xu, Ji-Rong Wen, Di Hu

CVPR, 2022   (Oral Presentation)
arXiv / project page

Explore audio-visual question answering and propose the MUSIC-AVQA dataset.

Class-aware Sounding Objects Localization via Audiovisual Correspondence

Di Hu, Yake Wei, Rui Qian, Weiyao Lin, Ruihua Song, Ji-Rong Wen

TPAMI
arXiv / project page

Discriminative sounding objects localization.



Updated in Feb. 2024
Thanks to Jon Barron for this amazing template.