


Guanrou Yang
2020 – today
- 2026
[i15]Ziyang Ma, Guanrou Yang, Wenxi Chen, Zhifu Gao, Yexing Du, Xiquan Li, Zhisheng Zheng, Haina Zhu, Jianheng Zhuo, Zheshu Song, Ruiyang Xu, Tianrui Wang, Yifan Yang, Yanqiao Zhu, Zhikang Niu, Liumeng Xue, Yinghao Ma, Ruibin Yuan, Shiliang Zhang, Kai Yu, Eng Siong Chng, Xie Chen:
SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing. CoRR abs/2601.09385 (2026)
- 2025
[c9]Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen:
Speech Recognition Meets Large Language Model: Benchmarking, Models, and Exploration. AAAI 2025: 24840-24848
[c8]Guanrou Yang, Fan Yu, Ziyang Ma, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen:
Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap. ICASSP 2025: 1-5
[c7]Wenrui Liu, Qian Chen, Wen Wang, Guanrou Yang, Weiqin Li, Minghui Fang, Jialong Zuo, Xiaoda Yang, Tao Jin, Jin Xu, Zemin Liu, Yafeng Chen, Jionghao Bai, Zhifang Guo:
Speech Token Prediction via Compressed-to-fine Language Modeling for Speech Generation. ACM Multimedia 2025: 10632-10641
[c6]Guanrou Yang, Chen Yang, Qian Chen, Ziyang Ma, Wenxi Chen, Wen Wang, Tianrui Wang, Yifan Yang, Zhikang Niu, Wenrui Liu, Fan Yu, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen:
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting. ACM Multimedia 2025: 10748-10757
[i14]Qian Chen, Yafeng Chen, Yanni Chen, Mengzhe Chen, Yingda Chen, Chong Deng, Zhihao Du, Ruize Gao, Changfeng Gao, Zhifu Gao, Yabin Li, Xiang Lv, Jiaqing Liu, Haoneng Luo, Bin Ma, Chongjia Ni, Xian Shi, Jialong Tang, Hui Wang, Hao Wang, Wen Wang, Yuxuan Wang, Yunlan Xu, Fan Yu, Zhijie Yan, Yexin Yang, Baosong Yang, Xian Yang, Guanrou Yang, Tianyu Zhao, Qinglin Zhang, Shiliang Zhang, Nan Zhao, Pei Zhang, Chong Zhang, Jinren Zhou:
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction. CoRR abs/2501.06282 (2025)
[i13]Guanrou Yang, Chen Yang, Qian Chen, Ziyang Ma, Wenxi Chen, Wen Wang, Tianrui Wang, Yifan Yang, Zhikang Niu, Wenrui Liu, Fan Yu, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen:
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting. CoRR abs/2504.12867 (2025)
[i12]Ziyang Ma, Yinghao Ma, Yanqiao Zhu, Chen Yang, Yi-Wen Chao, Ruiyang Xu, Wenxi Chen, Yuanzhe Chen, Zhuo Chen, Jian Cong, Kai Li, Keliang Li, Siyou Li, Xinfeng Li, Xiquan Li, Zheng Lian, Yuzhe Liang, Minghao Liu, Zhikang Niu, Tianrui Wang, Yuping Wang, Yuxuan Wang, Yihao Wu, Guanrou Yang, Jianwei Yu, Ruibin Yuan, Zhisheng Zheng, Ziya Zhou, Haina Zhu, Wei Xue, Emmanouil Benetos, Kai Yu, Chng Eng Siong, Xie Chen:
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix. CoRR abs/2505.13032 (2025)
[i11]Zhihao Du, Changfeng Gao, Yuxuan Wang, Fan Yu, Tianyu Zhao, Hao Wang, Xiang Lv, Hui Wang, Chongjia Ni, Xian Shi, Keyu An, Guanrou Yang, Yabin Li, Yanni Chen, Zhifu Gao, Qian Chen, Yue Gu, Mengzhe Chen, Yafeng Chen, Shiliang Zhang, Wen Wang, Jieping Ye:
CosyVoice 3: Towards In-the-wild Speech Generation via Scaling-up and Post-training. CoRR abs/2505.17589 (2025)
[i10]Tianrui Wang, Haoyu Wang, Meng Ge, Cheng Gong, Chunyu Qiang, Ziyang Ma, Zikang Huang, Guanrou Yang, Xiaobao Wang, Eng Siong Chng, Xie Chen, Longbiao Wang, Jianwu Dang:
Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis. CoRR abs/2509.24629 (2025)
[i9]Yakun Song, Xiaobin Zhuang, Jiawei Chen, Zhikang Niu, Guanrou Yang, Chenpeng Du, Dongya Jia, Zhuo Chen, Yuping Wang, Yuxuan Wang, Xie Chen:
DiSTAR: Diffusion over a Scalable Token Autoregressive Representation for Speech Generation. CoRR abs/2510.12210 (2025)
[i8]Wenming Tu, Guanrou Yang, Ruiqi Yan, Wenxi Chen, Ziyang Ma, Yipeng Kang, Kai Yu, Xie Chen, Zilong Zheng:
UltraVoice: Scaling Fine-Grained Style-Controlled Speech Conversations for Spoken Dialogue Models. CoRR abs/2510.22588 (2025)
- 2024
[c5]Yakun Song, Zhuo Chen, Xiaofei Wang, Ziyang Ma, Guanrou Yang, Xie Chen:
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers. INTERSPEECH 2024
[c4]Guanrou Yang, Ziyang Ma, Fan Yu, Zhifu Gao, Shiliang Zhang, Xie Chen:
MaLa-ASR: Multimedia-Assisted LLM-Based ASR. INTERSPEECH 2024
[c3]Guanrou Yang, Ziyang Ma, Zhifu Gao, Shiliang Zhang, Xie Chen:
CTC-Assisted LLM-Based Contextual ASR. SLT 2024: 126-131
[i7]Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen:
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity. CoRR abs/2402.08846 (2024)
[i6]Guanrou Yang, Ziyang Ma, Fan Yu, Zhifu Gao, Shiliang Zhang, Xie Chen:
MaLa-ASR: Multimedia-Assisted LLM-Based ASR. CoRR abs/2406.05839 (2024)
[i5]Yakun Song, Zhuo Chen, Xiaofei Wang, Ziyang Ma, Guanrou Yang, Xie Chen:
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers. CoRR abs/2406.15752 (2024)
[i4]Guanrou Yang, Fan Yu, Ziyang Ma, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen:
Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap. CoRR abs/2410.16726 (2024)
[i3]Guanrou Yang, Ziyang Ma, Zhifu Gao, Shiliang Zhang, Xie Chen:
CTC-Assisted LLM-Based Contextual ASR. CoRR abs/2411.06437 (2024)
- 2023
[c2]Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen:
Fast-Hubert: an Efficient Training Framework for Self-Supervised Speech Representation Learning. ASRU 2023: 1-7
[c1]Ziyang Ma, Zhisheng Zheng, Guanrou Yang, Yu Wang, Chao Zhang, Xie Chen:
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation. INTERSPEECH 2023: 1269-1273
[i2]Ziyang Ma, Zhisheng Zheng, Guanrou Yang, Yu Wang, Chao Zhang, Xie Chen:
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation. CoRR abs/2306.08920 (2023)
[i1]Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen:
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning. CoRR abs/2309.13860 (2023)
last updated on 2026-02-26 00:18 CET by the dblp team
all metadata released as open data under CC0 1.0 license







