Lei Xie 0001
Person information
- affiliation: Northwestern Polytechnical University, School of Computer Science, Xi'an, China
- affiliation (2006 - 2007): The Chinese University of Hong Kong, Department of Systems Engineering and Engineering Management, Hong Kong
- affiliation (2004 - 2006): City University of Hong Kong, School of Creative Media, Hong Kong
- affiliation (PhD 2004): Northwestern Polytechnical University, Xi'an, China
- affiliation (2001 - 2002): Vrije Universiteit Brussel, Department of Electronics and Information Processing, Belgium
Other persons with the same name
- Lei Xie — disambiguation page
- Lei Xie 0002 — Xi'an Jiaotong University, China
- Lei Xie 0003 — Zhejiang University, College of Information Science and Electronic Engineering, Hangzhou, China
- Lei Xie 0004 — Nanjing University, State Key Laboratory for Novel Software Technology, China
- Lei Xie 0005 — Delft University of Technology, Laboratory of Computer Engineering, The Netherlands
- Lei Xie 0006 — City University of New York, Department of Computer Science, Hunter College, NY, USA (and 1 more)
- Lei Xie 0007 — Zhejiang University, State Key Laboratory of Industrial Control Technology, Hangzhou, China (and 2 more)
- Lei Xie 0008 — Air Force Engineering University, Institute of Aeronautics Engineering, Department of weapon science and technology, China
- Lei Xie 0009 — Hong Kong University of Science and Technology, Department of Electronic and Computer Science, Hong Kong (and 1 more)
2020 – today
- 2024
- [j74] Li Zhang, Ning Jiang, Qing Wang, Yue Li, Quan Lu, Lei Xie:
Whisper-SV: Adapting Whisper for low-data-resource speaker verification. Speech Commun. 163: 103103 (2024)
- [j73] Bingshen Mu, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie:
MMGER: Multi-Modal and Multi-Granularity Generative Error Correction With LLM for Joint Accent and Speech Recognition. IEEE Signal Process. Lett. 31: 1940-1944 (2024)
- [j72] Runduo Han, Weiming Xu, Zihan Zhang, Mingshuai Liu, Lei Xie:
Distil-DCCRN: A Small-Footprint DCCRN Leveraging Feature-Based Knowledge Distillation in Speech Enhancement. IEEE Signal Process. Lett. 31: 2075-2079 (2024)
- [j71] Zhichao Wang, Yuanzhe Chen, Xinsheng Wang, Lei Xie, Yuping Wang:
StreamVoice+: Evolving Into End-to-End Streaming Zero-Shot Voice Conversion. IEEE Signal Process. Lett. 31: 3000-3004 (2024)
- [j70] Qijie Shao, Pengcheng Guo, Jinghao Yan, Pengfei Hu, Lei Xie:
Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 32: 459-470 (2024)
- [j69] Xinfa Zhu, Yi Lei, Tao Li, Yongmao Zhang, Hongbin Zhou, Heng Lu, Lei Xie:
METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1506-1518 (2024)
- [j68] Kun Wei, Bei Li, Hang Lv, Quan Lu, Ning Jiang, Lei Xie:
Conversational Speech Recognition by Learning Audio-Textual Cross-Modal Contextual Representation. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2432-2444 (2024)
- [j67] Zhichao Wang, Liumeng Xue, Qiuqiang Kong, Lei Xie, Yuanzhe Chen, Qiao Tian, Yuping Wang:
Multi-Level Temporal-Channel Speaker Retrieval for Zero-Shot Voice Conversion. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2926-2937 (2024)
- [j66] Jixun Yao, Qing Wang, Pengcheng Guo, Ziqian Ning, Lei Xie:
Distinctive and Natural Speaker Anonymization via Singular Value Transformation-Assisted Matrix. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2944-2956 (2024)
- [j65] Tao Li, Zhichao Wang, Xinfa Zhu, Jian Cong, Qiao Tian, Yuping Wang, Lei Xie:
U-Style: Cascading U-Nets With Multi-Level Speaker and Style Modeling for Zero-Shot Voice Cloning. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4026-4035 (2024)
- [c277] Zhichao Wang, Yuanzhe Chen, Xinsheng Wang, Lei Xie, Yuping Wang:
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion. ACL (1) 2024: 7328-7338
- [c276] Zihan Zhang, Jiayao Sun, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie:
Bs-Plcnet: Band-Split Packet Loss Concealment Network with Multi-Task Learning Framework and Multi-Discriminators. ICASSP Workshops 2024: 23-24
- [c275] Runduo Han, Xiaopeng Yan, Weiming Xu, Pengcheng Guo, Jiayao Sun, He Wang, Quan Lu, Ning Jiang, Lei Xie:
An Audio-Quality-Based Multi-Strategy Approach For Target Speaker Extraction in the Misp 2023 Challenge. ICASSP Workshops 2024: 27-28
- [c274] Mingshuai Liu, Zhuangqi Chen, Xiaopeng Yan, Yuanjun Lv, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie:
Rad-Net: A Repairing and Denoising Network for Speech Signal Improvement. ICASSP Workshops 2024: 49-50
- [c273] He Wang, Pengcheng Guo, Yue Li, Ao Zhang, Jiayao Sun, Lei Xie, Wei Chen, Pan Zhou, Hui Bu, Xin Xu, Binbin Zhang, Zhuo Chen, Jian Wu, Longbiao Wang, Eng Siong Chng, Sun Li:
ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge. ICASSP Workshops 2024: 63-64
- [c272] He Wang, Pengcheng Guo, Pan Zhou, Lei Xie:
MLCA-AVSR: Multi-Layer Cross Attention Fusion Based Audio-Visual Speech Recognition. ICASSP 2024: 8150-8154
- [c271] Jixun Yao, Yuguang Yang, Yi Lei, Ziqian Ning, Yanni Hu, Yu Pan, Jingjing Yin, Hongbin Zhou, Heng Lu, Lei Xie:
Promptvc: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts. ICASSP 2024: 10571-10575
- [c270] Ziqian Ning, Yuepeng Jiang, Pengcheng Zhu, Shuai Wang, Jixun Yao, Lei Xie, Mengxiao Bi:
Dualvc 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion. ICASSP 2024: 11106-11110
- [c269] Bingshen Mu, Pengcheng Guo, Dake Guo, Pan Zhou, Wei Chen, Lei Xie:
Automatic Channel Selection and Spatial Feature Integration for Multi-Channel Speech Recognition Across Various Array Topologies. ICASSP 2024: 11396-11400
- [c268] Ziqian Wang, Xinfa Zhu, Zihan Zhang, Yuanjun Lv, Ning Jiang, Guoqing Zhao, Lei Xie:
SELM: Speech Enhancement using Discrete Tokens and Language Models. ICASSP 2024: 11561-11565
- [c267] He Wang, Pengcheng Guo, Xucheng Wan, Huan Zhou, Lei Xie:
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder. ICME Workshops 2024: 1-6
- [c266] Hongfei Xue, Qijie Shao, Kaixun Huang, Peikun Chen, Jie Liu, Lei Xie:
SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition. ICME 2024: 1-6
- [c265] Xinfa Zhu, Yuke Li, Yi Lei, Ning Jiang, Guoqing Zhao, Lei Xie:
Boosting Multi-Speaker Expressive Speech Synthesis with Semi-Supervised Contrastive Learning. ICME 2024: 1-6
- [c264] Xinfa Zhu, Wenjie Tian, Xinsheng Wang, Lei He, Yujia Xiao, Xi Wang, Xu Tan, Sheng Zhao, Lei Xie:
UniStyle: Unified Style Modeling for Speaking Style Captioning and Stylistic Speech Synthesis. ACM Multimedia 2024: 7513-7522
- [c263] Zhixian Zhao, Haifeng Chen, Xi Li, Dongmei Jiang, Lei Xie:
Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment. MRAC@MM 2024: 67-71
- [i195] Hongfei Xue, Yuhao Liang, Bingshen Mu, Shiliang Zhang, Mengzhe Chen, Qian Chen, Lei Xie:
E-chat: Emotion-sensitive Spoken Dialogue System with Large Language Models. CoRR abs/2401.00475 (2024)
- [i194] He Wang, Pengcheng Guo, Pan Zhou, Lei Xie:
MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition. CoRR abs/2401.03424 (2024)
- [i193] He Wang, Pengcheng Guo, Yue Li, Ao Zhang, Jiayao Sun, Lei Xie, Wei Chen, Pan Zhou, Hui Bu, Xin Xu, Binbin Zhang, Zhuo Chen, Jian Wu, Longbiao Wang, Eng Siong Chng, Sun Li:
ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge. CoRR abs/2401.03473 (2024)
- [i192] Zihan Zhang, Jiayao Sun, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie:
BS-PLCNet: Band-split Packet Loss Concealment Network with Multi-task Learning Framework and Multi-discriminators. CoRR abs/2401.03687 (2024)
- [i191] Runduo Han, Xiaopeng Yan, Weiming Xu, Pengcheng Guo, Jiayao Sun, He Wang, Quan Lu, Ning Jiang, Lei Xie:
An audio-quality-based multi-strategy approach for target speaker extraction in the MISP 2023 Challenge. CoRR abs/2401.03697 (2024)
- [i190] Mingshuai Liu, Zhuangqi Chen, Xiaopeng Yan, Yuanjun Lv, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie:
RaD-Net: A Repairing and Denoising Network for Speech Signal Improvement. CoRR abs/2401.04389 (2024)
- [i189] He Wang, Pengcheng Guo, Wei Chen, Pan Zhou, Lei Xie:
The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in CNVSRC 2023. CoRR abs/2401.06788 (2024)
- [i188] He Wang, Pengcheng Guo, Xucheng Wan, Huan Zhou, Lei Xie:
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder. CoRR abs/2404.05466 (2024)
- [i187] Xuelong Geng, Tianyi Xu, Kun Wei, Bingshen Mu, Hongfei Xue, He Wang, Yangze Li, Pengcheng Guo, Yuhang Dai, Longhao Li, Mingchen Shao, Lei Xie:
Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets. CoRR abs/2405.02132 (2024)
- [i186] Bingshen Mu, Yangze Li, Qijie Shao, Kun Wei, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie:
MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition. CoRR abs/2405.03152 (2024)
- [i185] Yuepeng Jiang, Tao Li, Fengyu Yang, Lei Xie, Meng Meng, Yujun Wang:
Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling. CoRR abs/2406.05681 (2024)
- [i184] Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin, Binbin Zhang, Jun Du, Jia Bin, Ming Li:
AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection. CoRR abs/2406.07256 (2024)
- [i183] Mingshuai Liu, Zhuangqi Chen, Xiaopeng Yan, Yuanjun Lv, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie:
RaD-Net 2: A causal two-stage repairing and denoising speech enhancement network with knowledge distillation and complex axial self-attention. CoRR abs/2406.07498 (2024)
- [i182] Yuanjun Lv, Hai Li, Ying Yan, Junhui Liu, Danming Xie, Lei Xie:
FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter. CoRR abs/2406.08196 (2024)
- [i181] Linhan Ma, Xinfa Zhu, Yuanjun Lv, Zhichao Wang, Ziqian Wang, Wendi He, Hongbin Zhou, Lei Xie:
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy. CoRR abs/2406.09844 (2024)
- [i180] Peikun Chen, Sining Sun, Changhao Shan, Qing Yang, Lei Xie:
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study. CoRR abs/2406.18862 (2024)
- [i179] Li Zhang, Ning Jiang, Qing Wang, Yue Li, Quan Lu, Lei Xie:
Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification. CoRR abs/2407.10048 (2024)
- [i178] He Wang, Lei Xie:
The NPU-ASLP System Description for Visual Speech Recognition in CNVSRC 2024. CoRR abs/2408.02369 (2024)
- [i177] Runduo Han, Weiming Xu, Zihan Zhang, Mingshuai Liu, Lei Xie:
Distil-DCCRN: A Small-footprint DCCRN Leveraging Feature-based Knowledge Distillation in Speech Enhancement. CoRR abs/2408.04267 (2024)
- [i176] Yangze Li, Xiong Wang, Songjun Cao, Yike Zhang, Long Ma, Lei Xie:
A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition. CoRR abs/2408.09491 (2024)
- [i175] Tianyi Xu, Kaixun Huang, Pengcheng Guo, Yu Zhou, Longtao Huang, Hui Xue, Lei Xie:
Towards Rehearsal-Free Multilingual ASR: A LoRA-based Case Study on Whisper. CoRR abs/2408.10680 (2024)
- [i174] Ziqian Ning, Shuai Wang, Yuepeng Jiang, Jixun Yao, Lei He, Shifeng Pan, Jie Ding, Lei Xie:
Drop the beat! Freestyler for Accompaniment Conditioned Rapping Voice Generation. CoRR abs/2408.15474 (2024)
- [i173] Hongfei Xue, Rong Gong, Mingchen Shao, Xin Xu, Lezhi Wang, Lei Xie, Hui Bu, Jiaming Zhou, Yong Qin, Jun Du, Ming Li, Binbin Zhang, Bin Jia:
Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge. CoRR abs/2409.05430 (2024)
- [i172] Shuiyun Liu, Yuxiang Kong, Pengcheng Guo, Weiji Zhuang, Peng Gao, Yujun Wang, Lei Xie:
Optimizing Dysarthria Wake-Up Word Spotting: An End-to-End Approach for SLT 2024 LRDWWS Challenge. CoRR abs/2409.10076 (2024)
- [i171] Hongfei Xue, Wei Ren, Xuelong Geng, Kun Wei, Longhao Li, Qijie Shao, Linju Yang, Kai Diao, Lei Xie:
Ideal-LLM: Integrating Dual Encoders and Language-Adapted LLM for Multilingual Speech-to-Text. CoRR abs/2409.11214 (2024)
- [i170] Yuguang Yang, Yu Pan, Jixun Yao, Xiang Zhang, Jianhao Ye, Hongbin Zhou, Lei Xie, Lei Ma, Jianjun Zhao:
Takin-VC: Zero-shot Voice Conversion via Jointly Hybrid Content and Memory-Augmented Context-Aware Timbre Modeling. CoRR abs/2410.01350 (2024)
- [i169] Nikita Kuzmin, Hieu-Thi Luong, Jixun Yao, Lei Xie, Kong Aik Lee, Eng Siong Chng:
NTU-NPU System for Voice Privacy 2024 Challenge. CoRR abs/2410.02371 (2024)
- [i168] Dake Guo, Jixun Yao, Xinfa Zhu, Kangxiang Xia, Zhao Guo, Ziyu Zhang, Yao Wang, Jie Liu, Lei Xie:
The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge. CoRR abs/2410.23815 (2024)
- [i167] Kangxiang Xia, Dake Guo, Jixun Yao, Liumeng Xue, Hanzhao Li, Shuai Wang, Zhao Guo, Lei Xie, Qingqing Zhang, Lei Luo, Minghui Dong, Peng Sun:
The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings. CoRR abs/2411.00064 (2024)
- [i166] Xiong Wang, Yangze Li, Chaoyou Fu, Yunhang Shen, Lei Xie, Ke Li, Xing Sun, Long Ma:
Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM. CoRR abs/2411.00774 (2024)
- 2023
- [j64] Xiang Hao, Chenglin Xu, Lei Xie:
Neural speech enhancement with unsupervised pre-training and mixture training. Neural Networks 158: 216-227 (2023)
- [j63] Zhenglei Wei, Huan Zhou, Fei Cen, Lei Xie, Wenqiang Zhu, Peng Zhang, Qinzhi Hao:
A novel evolutionary algorithm inspired from triangle search and its applications on parameters identification of photovoltaic models. Soft Comput. 27(20): 14835-14860 (2023)
- [j62] Zhichao Wang, Yuanzhe Chen, Lei Xie, Qiao Tian, Yuping Wang:
LM-VC: Zero-Shot Voice Conversion via Speech Generation Based on Language Models. IEEE Signal Process. Lett. 30: 1157-1161 (2023)
- [j61] Tao Li, Chenxu Hu, Jian Cong, Xinfa Zhu, Jingbei Li, Qiao Tian, Yuping Wang, Lei Xie:
DiCLET-TTS: Diffusion Model Based Cross-Lingual Emotion Transfer for Text-to-Speech - A Study Between English and Mandarin. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3418-3430 (2023)
- [j60] Qing Wang, Jixun Yao, Li Zhang, Pengcheng Guo, Lei Xie:
Timbre-Reserved Adversarial Attack in Speaker Identification. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3848-3858 (2023)
- [j59] Zhichao Wang, Xinsheng Wang, Qicong Xie, Tao Li, Lei Xie, Qiao Tian, Yuping Wang:
MSM-VC: High-Fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-Scale Style Modeling. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3883-3895 (2023)
- [j58] Junwen Xiong, Yu Zhou, Peng Zhang, Lei Xie, Wei Huang, Yufei Zha:
Look&listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement. IEEE Trans. Multim. 25: 5800-5812 (2023)
- [j57] Xinsheng Wang, Qicong Xie, Jihua Zhu, Lei Xie, Odette Scharenborg:
AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Persons. IEEE Trans. Multim. 25: 6717-6728 (2023)
- [c262] Yi Lei, Shan Yang, Xinsheng Wang, Qicong Xie, Jixun Yao, Lei Xie, Dan Su:
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis. AAAI 2023: 13025-13033
- [c261] Wenjiang Chi, Xiaoqin Feng, Liumeng Xue, Yunlin Chen, Lei Xie, Zhifei Li:
Multi-granularity Semantic and Acoustic Stress Prediction for Expressive TTS. APSIPA ASC 2023: 2409-2415
- [c260] Peikun Chen, Fan Yu, Yuhao Liang, Hongfei Xue, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie:
BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition. ASRU 2023: 1-7
- [c259] Dake Guo, Xinfa Zhu, Liumeng Xue, Tao Li, Yuanjun Lv, Yuepeng Jiang, Lei Xie:
HIGNN-TTS: Hierarchical Prosody Modeling With Graph Neural Networks for Expressive Long-Form TTS. ASRU 2023: 1-7
- [c258] Kaixun Huang, Ao Zhang, Binbin Zhang, Tianyi Xu, Xingchen Song, Lei Xie:
Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition. ASRU 2023: 1-8
- [c257] Yangze Li, Fan Yu, Yuhao Liang, Pengcheng Guo, Mohan Shi, Zhihao Du, Shiliang Zhang, Lei Xie:
Sa-Paraformer: Non-Autoregressive End-To-End Speaker-Attributed ASR. ASRU 2023: 1-7
- [c256] Yuke Li, Xinfa Zhu, Yi Lei, Hai Li, Junhui Liu, Danming Xie, Lei Xie:
Zero-Shot Emotion Transfer for Cross-Lingual Speech Synthesis. ASRU 2023: 1-8
- [c255] Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu:
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR. ASRU 2023: 1-8
- [c254] Yuanjun Lv, Jixun Yao, Peikun Chen, Hongbin Zhou, Heng Lu, Lei Xie:
Salt: Distinguishable Speaker Anonymization Through Latent Space Transformation. ASRU 2023: 1-8
- [c253] Ziqian Ning, Yuepeng Jiang, Zhichao Wang, Bin Zhang, Lei Xie:
Vits-Based Singing Voice Conversion Leveraging Whisper and Multi-Scale F0 Modeling. ASRU 2023: 1-8
- [c252] Weiming Xu, Zhouxuan Chen, Zhili Tan, Shubo Lv, Runduo Han, Wenjiang Zhou, Weifeng Zhao, Lei Xie:
MBTFNET: Multi-Band Temporal-Frequency Neural Network for Singing Voice Enhancement. ASRU 2023: 1-8
- [c251] Yongmao Zhang, Guanghou Liu, Yi Lei, Yunlin Chen, Hao Yin, Lei Xie, Zhifei Li:
Promptspeaker: Speaker Generation Based on Text Descriptions. ASRU 2023: 1-7
- [c250] Zihan Zhang, Jiayao Sun, Xianjun Xia, Ziqian Wang, Xiaopeng Yan, Yijian Xiao, Lei Xie:
An Exploration of Task-Decoupling on Two-Stage Neural Post Filter for Real-Time Personalized Acoustic Echo Cancellation. ASRU 2023: 1-7
- [c249] Ao Zhang, Pan Zhou, Kaixun Huang, Yong Zou, Ming Liu, Lei Xie:
U2-KWS: Unified Two-Pass Open-Vocabulary Keyword Spotting with Keyword Bias. ASRU 2023: 1-8
- [c248] Yuepeng Jiang, Kun Song, Fengyu Yang, Lei Xie, Meng Meng, Yu Ji, Yujun Wang:
The Xiaomi-ASLP Text-to-speech System for Blizzard Challenge 2023. Blizzard Challenge 2023
- [c247] Ziqian Wang, Qing Wang, Jixun Yao, Lei Xie:
The NPU-ASLP System for Deepfake Algorithm Recognition in ADD 2023 Challenge. DADA@IJCAI 2023: 64-69
- [c246] Mingshuai Liu, Shubo Lv, Zihan Zhang, Runduo Han, Xiang Hao, Xianjun Xia, Li Chen, Yijian Xiao, Lei Xie:
Two-Stage Neural Network for ICASSP 2023 Speech Signal Improvement Challenge. ICASSP 2023: 1-2
- [c245] Ziqian Ning, Qicong Xie, Pengcheng Zhu, Zhichao Wang, Liumeng Xue, Jixun Yao, Lei Xie, Mengxiao Bi:
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features. ICASSP 2023: 1-5
- [c244] Kun Song, Yongmao Zhang, Yi Lei, Jian Cong, Hanzhao Li, Lei Xie, Gang He, Jinfeng Bai:
DSPGAN: A Gan-Based Universal Vocoder for High-Fidelity TTS by Time-Frequency Domain Supervision from DSP. ICASSP 2023: 1-5
- [c243] Zhichao Wang, Xinsheng Wang, Lei Xie, Yuanzhe Chen, Qiao Tian, Yuping Wang:
Delivering Speaking Style in Low-Resource Voice Conversion with Multi-Factor Constraints. ICASSP 2023: 1-5
- [c242] Jie Wang, Menglong Xu, Jingyong Hou, Binbin Zhang, Xiao-Lei Zhang, Lei Xie, Fuping Pan:
Wekws: A Production First Small-Footprint End-to-End Keyword Spotting Toolkit. ICASSP 2023: 1-5
- [c241] Xiaopeng Yan, Yindi Yang, Zhihao Guo, Liangliang Peng, Lei Xie:
The NPU-Elevoc Personalized Speech Enhancement System for Icassp2023 DNS Challenge. ICASSP 2023: 1-2
- [c240] Jixun Yao, Yi Lei, Qing Wang, Pengcheng Guo, Ziqian Ning, Lei Xie, Hai Li, Junhui Liu, Danming Xie:
Preserving Background Sound in Noise-Robust Voice Conversion Via Multi-Task Learning. ICASSP 2023: 1-5
- [c239] Jixun Yao, Qing Wang, Yi Lei, Pengcheng Guo, Lei Xie, Namin Wang, Jie Liu:
Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling. ICASSP 2023: 1-5
- [c238] Ao Zhang, He Wang, Pengcheng Guo, Yihui Fu, Lei Xie, Yingying Gao, Shilei Zhang, Junlan Feng:
VE-KWS: Visual Modality Enhanced End-to-End Keyword Spotting. ICASSP 2023: 1-5
- [c237] Li Zhang, Qing Wang, Hongji Wang, Yue Li, Wei Rao, Yannan Wang, Lei Xie:
Distance-Based Weight Transfer for Fine-Tuning From Near-Field to Far-Field Speaker Verification. ICASSP 2023: 1-5
- [c236] Zihan Zhang, Shimin Zhang, Mingshuai Liu, Yanhong Leng, Zhe Han, Li Chen, Lei Xie:
Two-Step Band-Split Neural Network Approach For Full-Band Residual Echo Suppression. ICASSP 2023: 1-2
- [c235] Xinfa Zhu, Yi Lei, Kun Song, Yongmao Zhang, Tao Li, Lei Xie:
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling. ICASSP 2023: 1-5
- [c234] Kun Song, Yi Ren, Yi Lei, Chunfeng Wang, Kun Wei, Lei Xie, Xiang Yin, Zejun Ma:
StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation. INTERSPEECH 2023: 42-46
- [c233] Hongfei Xue, Qijie Shao, Peikun Chen, Pengcheng Guo, Lei Xie, Jie Liu:
TranUSR: Phoneme-to-word Transcoder Based Unified Speech Representation Learning for Cross-lingual Speech Recognition. INTERSPEECH 2023: 216-220
- [c232] Shubo Lv, Xiong Wang, Sining Sun, Long Ma, Lei Xie:
DCCRN-KWS: An Audio Bias Based Model for Noise Robust Small-Footprint Keyword Spotting. INTERSPEECH 2023: 929-933
- [c231] Tianyi Xu, Zhanheng Yang, Kaixun Huang, Pengcheng Guo, Ao Zhang, Biao Li, Changru Chen, Chao Li, Lei Xie:
Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition. INTERSPEECH 2023: 1668-1672
- [c230] Ziqian Ning, Yuepeng Jiang, Pengcheng Zhu, Jixun Yao, Shuai Wang, Lei Xie, Mengxiao Bi:
DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding. INTERSPEECH 2023: 2063-2067
- [c229] Zhanheng Yang, Sining Sun, Xiong Wang, Yike Zhang, Long Ma, Lei Xie:
Two Stage Contextual Word Filtering for Context Bias in Unified Streaming and Non-streaming Transducer. INTERSPEECH 2023: 3257-3261
- [c228] Yuhao Liang, Fan Yu, Yangze Li, Pengcheng Guo, Shiliang Zhang, Qian Chen, Lei Xie:
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR. INTERSPEECH 2023: 3487-3491
- [c227] Qing Wang, Jixun Yao, Ziqian Wang, Pengcheng Guo, Lei Xie:
Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification. INTERSPEECH 2023: 3994-3998
- [c226] Yongmao Zhang, Heyang Xue, Hanzhao Li, Lei Xie, Tingwei Guo, Ruixiong Zhang, Caixia Gong:
VISinger2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer. INTERSPEECH 2023: 4444-4448
- [c225] Guanghou Liu, Yongmao Zhang, Yi Lei, Yunlin Chen, Rui Wang, Lei Xie, Zhifei Li:
PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions. INTERSPEECH 2023: 4888-4892
- [c224] Kaixun Huang, Ao Zhang, Zhanheng Yang, Pengcheng Guo, Bingshen Mu, Tianyi Xu, Lei Xie:
Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network. INTERSPEECH 2023: 4933-4937
- [c223] Kun Song, Yi Lei, Peikun Chen, Yiqing Cao, Kun Wei, Yongmao Zhang, Lei Xie, Ning Jiang, Guoqing Zhao:
The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task. IWSLT@ACL 2023: 311-320
- [i165] Zhanheng Yang, Sining Sun, Xiong Wang, Yike Zhang, Long Ma, Lei Xie:
Two Stage Contextual Word Filtering for Context bias in Unified Streaming and Non-streaming Transducer. CoRR abs/2301.06735 (2023)
- [i164] Ao Zhang, He Wang, Pengcheng Guo, Yihui Fu, Lei Xie, Yingying Gao, Shilei Zhang, Junlan Feng:
VE-KWS: Visual Modality Enhanced End-to-End Keyword Spotting. CoRR abs/2302.13523 (2023)
- [i163] Li Zhang, Qing Wang, Hongji Wang, Yue Li, Wei Rao, Yannan Wang, Lei Xie:
Distance-based Weight Transfer from Near-field to Far-field Speaker Verification. CoRR abs/2303.00264 (2023)
- [i162] Mingshuai Liu, Shubo Lv, Zihan Zhang, Runduo Han, Xiang Hao, Xianjun Xia, Li Chen, Yijian Xiao, Lei Xie:
Two-stage Neural Network for ICASSP 2023 Speech Signal Improvement Challenge. CoRR abs/2303.07621 (2023)
- [i161] Zhichao Wang, Liumeng Xue, Qiuqiang Kong, Lei Xie, Yuanzhe Chen, Qiao Tian, Yuping Wang:
Multi-level Temporal-channel Speaker Retrieval for Robust Zero-shot Voice Conversion. CoRR abs/2305.07204 (2023)
- [i160]