


Shan Yang 0001
Person information
- affiliation: Tencent AI Lab, Beijing, China
- affiliation (PhD): Northwestern Polytechnical University, School of Computer Science, Xi'an, China
Other persons with the same name
- Shan Yang — disambiguation page
- Shan Yang 0002 — Zhejiang University of Finance and Economics, School of Information Technology and Artificial Intelligence, Hangzhou, China
2020 – today
- 2025
[c33]Yuanyuan Wang, Hangting Chen, Dongchao Yang, Weiqin Li, Dan Luo, Guangzhi Li, Shan Yang, Zhiyong Wu, Helen Meng, Xixin Wu:
UniSep: Universal Target Audio Separation with Language Models at Scale. ICME 2025: 1-6
[c32]Yong Ren, Chenxing Li, Le Xu, Hao Gu, Duzhen Zhang, Yujie Chen, Manjie Xu, Ruibo Fu, Shan Yang, Dong Yu:
Hearing from Silence: Reasoning Audio Descriptions from Silent Videos via Vision-Language Model. INTERSPEECH 2025
[c31]Le Xu, Chenxing Li, Yong Ren, Yujie Chen, Yu Gu, Ruibo Fu, Shan Yang, Dong Yu:
Mitigating Audiovisual Mismatch in Visual-Guide Audio Captioning. INTERSPEECH 2025
[c30]Guanjie Huang, Danny H. K. Tsang, Shan Yang, Guangzhi Lei, Li Liu:
Cued-Agent: A Collaborative Multi-Agent System for Automatic Cued Speech Recognition. ACM Multimedia 2025: 8313-8321
[c29]Yan Rong, Jinting Wang, Guangzhi Lei, Shan Yang, Li Liu:
AudioGenie: A Training-Free Multi-Agent Framework for Diverse Multimodality-to-Multiaudio Generation. ACM Multimedia 2025: 8872-8881
[i29]Hanzhao Li, Yuke Li, Xinsheng Wang, Jingbin Hu, Qicong Xie, Shan Yang, Lei Xie:
FleSpeech: Flexibly Controllable Speech Generation with Various Prompts. CoRR abs/2501.04644 (2025)
[i28]Yuanyuan Wang, Hangting Chen, Dongchao Yang, Weiqin Li, Dan Luo, Guangzhi Li, Shan Yang, Zhiyong Wu, Helen Meng, Xixin Wu:
UniSep: Universal Target Audio Separation with Language Models at Scale. CoRR abs/2503.23762 (2025)
[i27]Yong Ren, Chenxing Li, Le Xu, Hao Gu, Duzhen Zhang, Yujie Chen, Manjie Xu, Ruibo Fu, Shan Yang, Dong Yu:
Hearing from Silence: Reasoning Audio Descriptions from Silent Videos via Vision-Language Model. CoRR abs/2505.13062 (2025)
[i26]Le Xu, Chenxing Li, Yong Ren, Yujie Chen, Yu Gu, Ruibo Fu, Shan Yang, Dong Yu:
Mitigating Audiovisual Mismatch in Visual-Guide Audio Captioning. CoRR abs/2505.22045 (2025)
[i25]Yan Rong, Jinting Wang, Shan Yang, Guangzhi Lei, Li Liu:
AudioGenie: A Training-Free Multi-Agent Framework for Diverse Multimodality-to-Multiaudio Generation. CoRR abs/2505.22053 (2025)
[i24]Jinting Wang, Shan Yang, Li Liu:
UniCUE: Unified Recognition and Generation Framework for Chinese Cued Speech Video-to-Speech Generation. CoRR abs/2506.04134 (2025)
[i23]Guanjie Huang, Danny H. K. Tsang, Shan Yang, Guangzhi Lei, Li Liu:
Cued-Agent: A Collaborative Multi-Agent System for Automatic Cued Speech Recognition. CoRR abs/2508.00391 (2025)
[i22]Tianxin Xie, Shan Yang, Chenxing Li, Dong Yu, Li Liu:
EmoSteer-TTS: Fine-Grained and Training-Free Emotion-Controllable Text-to-Speech via Activation Steering. CoRR abs/2508.03543 (2025)
[i21]Tianxin Xie, Wentao Lei, Guanjie Huang, Pengfei Zhang, Kai Jiang, Chunhui Zhang, Fengji Ma, Haoyu He, Han Zhang, Jiangshan He, Jinting Wang, Linghan Fang, Lufei Gao, Orkesh Ablet, Peihua Zhang, Ruolin Hu, Shengyu Li, Weilin Lin, Xiaoyang Feng, Xinyue Yang, Yan Rong, Yanyun Wang, Zihang Shao, Zelin Zhao, Chenxing Li, Shan Yang, Wenfu Wang, Meng Yu, Dong Yu, Li Liu:
PhyAVBench: A Challenging Audio Physics-Sensitivity Benchmark for Physically Grounded Text-to-Audio-Video Generation. CoRR abs/2512.23994 (2025)
- 2023
[c28]Yi Lei, Shan Yang, Xinsheng Wang, Qicong Xie, Jixun Yao, Lei Xie, Dan Su:
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis. AAAI 2023: 13025-13033
[c27]Wei Xiao, Wenzhe Liu, Meng Wang, Shan Yang, Yupeng Shi, Yuyong Kang, Dan Su, Shidong Shang, Dong Yu:
Multi-mode Neural Speech Coding Based on Deep Generative Networks. INTERSPEECH 2023: 819-823
[i20]Wenzhe Liu, Wei Xiao, Meng Wang, Shan Yang, Yupeng Shi, Yuyong Kang, Dan Su, Shidong Shang, Dong Yu:
A High Fidelity and Low Complexity Neural Audio Coding. CoRR abs/2310.10992 (2023)
- 2022
[j7]Yi Lei, Shan Yang, Xinfa Zhu, Lei Xie, Dan Su:
Cross-Speaker Emotion Transfer Through Information Perturbation in Emotional Speech Synthesis. IEEE Signal Process. Lett. 29: 1948-1952 (2022)
[j6]Yi Lei, Shan Yang, Xinsheng Wang, Lei Xie:
MsEmoTTS: Multi-Scale Emotion Transfer, Prediction, and Control for Emotional Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 30: 853-864 (2022)
[c26]Songxiang Liu, Shan Yang, Dan Su, Dong Yu:
Referee: Towards Reference-Free Cross-Speaker Style Transfer with Low-Quality Data for Expressive Speech Synthesis. ICASSP 2022: 6307-6311
[c25]Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng:
VCVTS: Multi-Speaker Video-to-Speech Synthesis Via Cross-Modal Knowledge Transfer from Voice Conversion. ICASSP 2022: 7252-7256
[c24]Liumeng Xue, Shan Yang, Na Hu, Dan Su, Lei Xie:
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers. INTERSPEECH 2022: 2548-2552
[c23]Yi Lei, Shan Yang, Jian Cong, Lei Xie, Dan Su:
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion. INTERSPEECH 2022: 2563-2567
[c22]Qicong Xie, Shan Yang, Yi Lei, Lei Xie, Dan Su:
End-to-End Voice Conversion with Information Perturbation. ISCSLP 2022: 91-95
[i19]Yi Lei, Shan Yang, Xinsheng Wang, Lei Xie:
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis. CoRR abs/2201.06460 (2022)
[i18]Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng:
VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion. CoRR abs/2202.09081 (2022)
[i17]Qicong Xie, Shan Yang, Yi Lei, Lei Xie, Dan Su:
End-to-End Voice Conversion with Information Perturbation. CoRR abs/2206.07569 (2022)
[i16]Liumeng Xue, Shan Yang, Na Hu, Dan Su, Lei Xie:
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers. CoRR abs/2207.00756 (2022)
[i15]Yi Lei, Shan Yang, Jian Cong, Lei Xie, Dan Su:
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion. CoRR abs/2207.01832 (2022)
[i14]Yi Lei, Shan Yang, Xinsheng Wang, Qicong Xie, Jixun Yao, Lei Xie, Dan Su:
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis. CoRR abs/2212.01546 (2022)
- 2021
[j5]Xiaochun An, Frank K. Soong, Shan Yang, Lei Xie:
Effective and direct control of neural TTS prosody by removing interactions between different attributes. Neural Networks 143: 250-260 (2021)
[c21]Yi Chen, Shan Yang, Na Hu, Lei Xie, Dan Su:
TeNC: Low Bit-Rate Speech Coding with VQ-VAE and GAN. ICMI Companion 2021: 126-130
[c20]Jian Cong, Shan Yang, Lei Xie, Dan Su:
Glow-WaveGAN: Learning Speech Representations from GAN-Based Variational Auto-Encoder for High Fidelity Flow-Based Speech Synthesis. Interspeech 2021: 2182-2186
[c19]Jian Cong, Shan Yang, Na Hu, Guangzhi Li, Lei Xie, Dan Su:
Controllable Context-Aware Conversational Speech Synthesis. Interspeech 2021: 4658-4662
[c18]Tao Li, Shan Yang, Liumeng Xue, Lei Xie:
Controllable Emotion Transfer For End-to-End Speech Synthesis. ISCSLP 2021: 1-5
[c17]Zhichao Wang, Wenshuo Ge, Xiong Wang, Shan Yang, Wendong Gan, Haitao Chen, Hai Li, Lei Xie, Xiulin Li:
Accent and Speaker Disentanglement in Many-to-many Voice Conversion. ISCSLP 2021: 1-5
[c16]Yi Lei, Shan Yang, Lei Xie:
Fine-Grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis. SLT 2021: 423-430
[c15]Geng Yang, Shan Yang, Kai Liu, Peng Fang, Wei Chen, Lei Xie:
Multi-Band Melgan: Faster Waveform Generation For High-Quality Text-To-Speech. SLT 2021: 492-498
[c14]Heyang Xue, Shan Yang, Yi Lei, Lei Xie, Xiulin Li:
Learn2Sing: Target Speaker Singing Voice Synthesis by Learning from a Singing Teacher. SLT 2021: 522-529
[i13]Jian Cong, Shan Yang, Na Hu, Guangzhi Li, Lei Xie, Dan Su:
Controllable Context-aware Conversational Speech Synthesis. CoRR abs/2106.10828 (2021)
[i12]Jian Cong, Shan Yang, Lei Xie, Dan Su:
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis. CoRR abs/2106.10831 (2021)
[i11]Songxiang Liu, Shan Yang, Dan Su, Dong Yu:
Referee: Towards reference-free cross-speaker style transfer with low-quality data for expressive speech synthesis. CoRR abs/2109.03439 (2021)
- 2020
[j4]Shan Yang, Heng Lu, Shiyin Kang, Liumeng Xue, Jinba Xiao, Dan Su, Lei Xie, Dong Yu:
On the localness modeling for the self-attention based end-to-end speech synthesis. Neural Networks 125: 121-130 (2020)
[j3]Shan Yang, Yuxuan Wang, Lei Xie:
Adversarial Feature Learning and Unsupervised Clustering Based Speech Synthesis for Found Data With Acoustic and Textual Noise. IEEE Signal Process. Lett. 27: 1730-1734 (2020)
[c13]Xiaohai Tian, Zhichao Wang, Shan Yang, Xinyong Zhou, Hongqiang Du, Yi Zhou, Mingyang Zhang, Kun Zhou, Berrak Sisman, Lei Xie, Haizhou Li:
The NUS & NWPU system for Voice Conversion Challenge 2020. Blizzard Challenge / Voice Conversion Challenge 2020
[c12]Jian Cong, Shan Yang, Lei Xie, Guoqiao Yu, Guanglu Wan:
Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training. INTERSPEECH 2020: 811-815
[c11]Fengyu Yang, Shan Yang, Qinghua Wu, Yujun Wang, Lei Xie:
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis. INTERSPEECH 2020: 3436-3440
[i10]Shan Yang, Yuxuan Wang, Lei Xie:
Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise. CoRR abs/2004.13595 (2020)
[i9]Geng Yang, Shan Yang, Kai Liu, Peng Fang, Wei Chen, Lei Xie:
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech. CoRR abs/2005.05106 (2020)
[i8]Fengyu Yang, Shan Yang, Qinghua Wu, Yujun Wang, Lei Xie:
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis. CoRR abs/2008.00613 (2020)
[i7]Jian Cong, Shan Yang, Lei Xie, Guoqiao Yu, Guanglu Wan:
Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training. CoRR abs/2008.04265 (2020)
[i6]Heyang Xue, Shan Yang, Yi Lei, Lei Xie, Xiulin Li:
Learn2Sing: Target Speaker Singing Voice Synthesis by learning from a Singing Teacher. CoRR abs/2011.08467 (2020)
[i5]Yi Lei, Shan Yang, Lei Xie:
Fine-grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis. CoRR abs/2011.08477 (2020)
[i4]Zhichao Wang, Wenshuo Ge, Xiong Wang, Shan Yang, Wendong Gan, Haitao Chen, Hai Li, Lei Xie, Xiulin Li:
Accent and Speaker Disentanglement in Many-to-many Voice Conversion. CoRR abs/2011.08609 (2020)
[i3]Tao Li, Shan Yang, Liumeng Xue, Lei Xie:
Controllable Emotion Transfer For End-to-End Speech Synthesis. CoRR abs/2011.08679 (2020)
[i2]Haohan Guo, Heng Lu, Na Hu, Chunlei Zhang, Shan Yang, Lei Xie, Dan Su, Dong Yu:
Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training. CoRR abs/2012.01837 (2020)
2010 – 2019
- 2019
[j2]Xiaolian Zhu, Yuchao Zhang, Shan Yang, Liumeng Xue, Lei Xie:
Pre-Alignment Guided Attention for Improving Training Efficiency and Model Stability in End-to-End Speech Synthesis. IEEE Access 7: 65955-65964 (2019)
[c10]Xiaochun An, Yuxuan Wang, Shan Yang, Zejun Ma, Lei Xie:
Learning Hierarchical Representations for Expressive Speaking Style in End-to-End Speech Synthesis. ASRU 2019: 184-191
[c9]Xiaolian Zhu, Shan Yang, Geng Yang, Lei Xie:
Controlling Emotion Strength with Relative Attribute for End-to-End Speech Synthesis. ASRU 2019: 192-199
[c8]Fengyu Yang, Shan Yang, Pengcheng Zhu, Pengju Yan, Lei Xie:
Improving Mandarin End-to-End Speech Synthesis by Self-Attention and Learnable Gaussian Bias. ASRU 2019: 208-213
[c7]Shan Yang, Wenshuo Ge, Fengyu Yang, Xinyong Zhou, Fanbo Meng, Kai Liu, Lei Xie:
SZ-NPU Team's Entry to Blizzard Challenge 2019. Blizzard Challenge 2019
[c6]Shan Yang, Heng Lu, Shiyin Kang, Lei Xie, Dong Yu:
Enhancing Hybrid Self-attention Structure with Relative-position-aware Bias for Speech Synthesis. ICASSP 2019: 6910-6914
- 2018
[c5]Jinba Xiao, Shan Yang, Mingyang Zhang, Berrak Sisman, Dongyan Huang, Lei Xie, Minghui Dong, Haizhou Li:
The I2R-NWPU-NUS Text-to-Speech System for Blizzard Challenge 2018. Blizzard Challenge 2018
- 2017
[c4]Shan Yang, Lei Xie, Xiao Chen, Xiaoyan Lou, Xuan Zhu, Dongyan Huang, Haizhou Li:
Statistical parametric speech synthesis using generative adversarial networks under a multi-task learning framework. ASRU 2017: 685-691
[c3]Yanfeng Lu, Zhengchen Zhang, Chenyu Yang, Huaiping Ming, Xiaolian Zhu, Yuchao Zhang, Shan Yang, Dongyan Huang, Lei Xie, Minghui Dong:
The I2R-NWPU Text-to-Speech System for Blizzard Challenge 2017. Blizzard Challenge 2017
[i1]Shan Yang, Lei Xie, Xiao Chen, Xiaoyan Lou, Xuan Zhu, Dongyan Huang, Haizhou Li:
Statistical Parametric Speech Synthesis Using Generative Adversarial Networks Under A Multi-task Learning Framework. CoRR abs/1707.01670 (2017)
- 2016
[j1]Bo Fan, Lei Xie, Shan Yang, Lijuan Wang, Frank K. Soong:
A deep bidirectional LSTM approach for video-realistic talking head. Multim. Tools Appl. 75(9): 5287-5309 (2016)
[c2]Shan Yang, Zhizheng Wu, Lei Xie:
On the training of DNN-based average voice model for speech synthesis. APSIPA 2016: 1-6
[c1]Zhengchen Zhang, Mei Li, Yuchao Zhang, Weini Zhang, Yang Liu, Shan Yang, Yanfeng Lu, Van Tung Pham, Lei Xie, Minghui Dong:
The I2R-NWPU-NTU Text-to-Speech System at Blizzard Challenge 2016. Blizzard Challenge 2016