Yihan Wu 0008

Person information

affiliation: Renmin University of China, Beijing, China

2020 – today

2025
[c14]
persistent URL: https://dblp.org/rec/conf/aaai/WuLPWS025
Yihan Wu, Yichen Lu, Yifan Peng, Xihua Wang, Ruihua Song, Shinji Watanabe:
Enhancing Audiovisual Speech Recognition Through Bifocal Preference Optimization. AAAI 2025: 25516-25524
[c13]
persistent URL: https://dblp.org/rec/conf/cvpr/WangSL0LW0XW25
Xihua Wang, Ruihua Song, Chongxuan Li, Xin Cheng, Boyuan Li, Yihan Wu, Yuyue Wang, Hongteng Xu, Yunfeng Wang:
Animate and Sound an Image. CVPR 2025: 23369-23378
[c12]
persistent URL: https://dblp.org/rec/conf/icassp/0008WW0S25
Xin Cheng, Xihua Wang, Yihan Wu, Yuyue Wang, Ruihua Song:
LoVA: Long-form Video-to-Audio Generation. ICASSP 2025: 1-5
[c11]
persistent URL: https://dblp.org/rec/conf/interspeech/WuLCSCS025
Yihan Wu, Yichen Lu, Yijing Chen, Jiaqi Song, William Chen, Ruihua Song, Shinji Watanabe:
GALAXY: A Large-Scale Open-Domain Dataset for Multimodal Learning. INTERSPEECH 2025
[c10]
persistent URL: https://dblp.org/rec/conf/mmasia/0003000TS25
Yuyue Wang, Xin Cheng, Yihan Wu, Xihua Wang, Jinchuan Tian, Ruihua Song:
A Visual Speech Language Model for Visual Text-to-Speech Task. MMAsia 2025: 66:1-66:8
[i15]
persistent URL: https://dblp.org/rec/journals/corr/abs-2509-24773
Xin Cheng, Yuyue Wang, Xihua Wang, Yihan Wu, Kaisi Guan, Yijing Chen, Peng Zhang, Xiaojiang Liu, Meng Cao, Ruihua Song:
VSSFlow: Unifying Video-conditioned Sound and Speech Generation via Joint Learning. CoRR abs/2509.24773 (2025)
[i14]
persistent URL: https://dblp.org/rec/journals/corr/abs-2511-22229
Yuyue Wang, Xin Cheng, Yihan Wu, Xihua Wang, Jinchuan Tian, Ruihua Song:
VSpeechLM: A Visual Speech Language Model for Visual Text-to-Speech Task. CoRR abs/2511.22229 (2025)
[i13]
persistent URL: https://dblp.org/rec/journals/corr/abs-2512-09841
Yijing Chen, Yihan Wu, Kaisi Guan, Yuchen Ren, Yuyue Wang, Ruihua Song, Liyun Ru:
ChronusOmni: Improving Time Awareness of Omni Large Language Models. CoRR abs/2512.09841 (2025)
2024
[c9]
persistent URL: https://dblp.org/rec/conf/mm/Wang0WS0CXS24
Xihua Wang, Yuyue Wang, Yihan Wu, Ruihua Song, Xu Tan, Zehua Chen, Hongteng Xu, Guodong Sui:
TiVA: Time-Aligned Video-to-Audio Generation. ACM Multimedia 2024: 573-582
[c8]
persistent URL: https://dblp.org/rec/conf/slt/WuPLCSW24
Yihan Wu, Yifan Peng, Yichen Lu, Xuankai Chang, Ruihua Song, Shinji Watanabe:
Robust Audiovisual Speech Recognition Models with Mixture-of-Experts. SLT 2024: 43-48
[c7]
persistent URL: https://dblp.org/rec/conf/slt/ShiTWJYMCWTBAZDSWLRJSW24
Jiatong Shi, Jinchuan Tian, Yihan Wu, Jee-Weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharthi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander H. Liu, Bhiksha Raj, Qin Jin, Ruihua Song, Shinji Watanabe:
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech. SLT 2024: 562-569
[c6]
persistent URL: https://dblp.org/rec/conf/www/WuSCJCY24
Yihan Wu, Ruihua Song, Xu Chen, Hao Jiang, Zhao Cao, Jin Yu:
Understanding Human Preferences: Towards More Personalized Video to Text Generation. WWW 2024: 3952-3963
[i12]
persistent URL: https://dblp.org/rec/journals/corr/abs-2401-18045
Yihan Wu, Soumi Maiti, Yifan Peng, Wangyou Zhang, Chenda Li, Yuyue Wang, Xihua Wang, Shinji Watanabe, Ruihua Song:
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition. CoRR abs/2401.18045 (2024)
[i11]
persistent URL: https://dblp.org/rec/journals/corr/abs-2406-19853
Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ze-Feng Gao, Yueguo Chen, Weizheng Lu, Ji-Rong Wen:
YuLan: An Open-source Large Language Model. CoRR abs/2406.19853 (2024)
[i10]
persistent URL: https://dblp.org/rec/journals/corr/abs-2409-12370
Yihan Wu, Yifan Peng, Yichen Lu, Xuankai Chang, Ruihua Song, Shinji Watanabe:
Robust Audiovisual Speech Recognition Models with Mixture-of-Experts. CoRR abs/2409.12370 (2024)
[i9]
persistent URL: https://dblp.org/rec/journals/corr/abs-2409-15157
Xin Cheng, Xihua Wang, Yihan Wu, Yuyue Wang, Ruihua Song:
LoVA: Long-form Video-to-Audio Generation. CoRR abs/2409.15157 (2024)
[i8]
persistent URL: https://dblp.org/rec/journals/corr/abs-2409-15897
Jiatong Shi, Jinchuan Tian, Yihan Wu, Jee-weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharthi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander H. Liu, Bhiksha Raj, Qin Jin, Ruihua Song, Shinji Watanabe:
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech. CoRR abs/2409.15897 (2024)
[i7]
persistent URL: https://dblp.org/rec/journals/corr/abs-2412-19005
Yihan Wu, Yichen Lu, Yifan Peng, Xihua Wang, Ruihua Song, Shinji Watanabe:
Enhancing Audiovisual Speech Recognition through Bifocal Preference Optimization. CoRR abs/2412.19005 (2024)
2023
[c5]
persistent URL: https://dblp.org/rec/conf/aaai/WuG0Z0S0ZM023
Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian:
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing. AAAI 2023: 13772-13779
[c4]
persistent URL: https://dblp.org/rec/conf/icassp/GuoLWZT23
Zhifang Guo, Yichong Leng, Yihan Wu, Sheng Zhao, Xu Tan:
PromptTTS: Controllable Text-to-Speech with Text Descriptions. ICASSP 2023: 1-5
[c3]
persistent URL: https://dblp.org/rec/conf/interspeech/WangXWS23
Yuyue Wang, Huan Xiao, Yihan Wu, Ruihua Song:
ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios. INTERSPEECH 2023: 4828-4832
[i6]
persistent URL: https://dblp.org/rec/journals/corr/abs-2305-12200
Yuyue Wang, Huan Xiao, Yihan Wu, Ruihua Song:
ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios. CoRR abs/2305.12200 (2023)
2022
[c2]
persistent URL: https://dblp.org/rec/conf/interspeech/Wu00HZSQL22
Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu:
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios. INTERSPEECH 2022: 2568-2572
[c1]
persistent URL: https://dblp.org/rec/conf/interspeech/WuWZHSN22
Yihan Wu, Xi Wang, Shaofei Zhang, Lei He, Ruihua Song, Jian-Yun Nie:
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis. INTERSPEECH 2022: 5503-5507
[i5]
persistent URL: https://dblp.org/rec/journals/corr/abs-2204-00436
Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu:
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios. CoRR abs/2204.00436 (2022)
[i4]
persistent URL: https://dblp.org/rec/journals/corr/abs-2206-12559
Yihan Wu, Xi Wang, Shaofei Zhang, Lei He, Ruihua Song, Jian-Yun Nie:
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis. CoRR abs/2206.12559 (2022)
[i3]
persistent URL: https://dblp.org/rec/journals/corr/abs-2211-12171
Zhifang Guo, Yichong Leng, Yihan Wu, Sheng Zhao, Xu Tan:
PromptTTS: Controllable Text-to-Speech with Text Descriptions. CoRR abs/2211.12171 (2022)
[i2]
persistent URL: https://dblp.org/rec/journals/corr/abs-2211-16934
Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian:
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing. CoRR abs/2211.16934 (2022)
[i1]
persistent URL: https://dblp.org/rec/journals/corr/abs-2212-14518
Zehua Chen, Yihan Wu, Yichong Leng, Jiawei Chen, Haohe Liu, Xu Tan, Yang Cui, Ke Wang, Lei He, Sheng Zhao, Jiang Bian, Danilo P. Mandic:
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech. CoRR abs/2212.14518 (2022)
