Pengyuan Zhang
2020 – today
- 2025
- [j41]Shengchang Xiao, Xueshuai Zhang, Pengyuan Zhang, Yonghong Yan:
Semi-supervised sound event detection with dynamic convolution and confidence-aware mean teacher. Digit. Signal Process. 156: 104794 (2025)
- 2024
- [j40]Jiakun Shen, Xueshuai Zhang, Yu Lu, Pengfei Ye, Pengyuan Zhang, Yonghong Yan:
Novel audio characteristic-dependent feature extraction and data augmentation methods for cough-based respiratory disease classification. Comput. Biol. Medicine 179: 108843 (2024) - [j39]Jiahao Yang, Shuo Feng, Wenkai Zhang, Ming Zhang, Jun Zhou, Pengyuan Zhang:
An efficient loss function and deep learning approach for ranking stock returns in the absence of prior knowledge. Inf. Process. Manag. 61(1): 103579 (2024) - [j38]Zhenduo Zhao, Zhuo Li, Xueshuai Zhang, Wenchao Wang, Pengyuan Zhang:
Prototype Division for Self-Supervised Speaker Verification. IEEE Signal Process. Lett. 31: 880-884 (2024) - [j37]Yuxiang Zhang, Zhuo Li, Jingze Lu, Wenchao Wang, Pengyuan Zhang:
Synthetic Speech Detection Based on the Temporal Consistency of Speaker Features. IEEE Signal Process. Lett. 31: 944-948 (2024) - [j36]Han Zhu, Gaofeng Cheng, Jindong Wang, Wenxin Hou, Pengyuan Zhang, Yonghong Yan:
Boosting Cross-Domain Speech Recognition With Self-Supervision. IEEE ACM Trans. Audio Speech Lang. Process. 32: 471-485 (2024) - [j35]Yifan Chen, Gaofeng Cheng, Runyan Yang, Pengyuan Zhang, Yonghong Yan:
Interrelate Training and Clustering for Online Speaker Diarization. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1352-1364 (2024) - [j34]Chengxin Chen, Pengyuan Zhang:
Modality-collaborative Transformer with Hybrid Feature Reconstruction for Robust Emotion Recognition. ACM Trans. Multim. Comput. Commun. Appl. 20(5): 146:1-146:23 (2024) - [c94]Pengyuan Zhang, Baojiang Cui:
Network Scanning Detection Based on Spatiotemporal Behavior. EIDWT 2024: 110-117 - [c93]Xing Bai, Jun Zhou, Pengyuan Zhang, Ruipeng Hao:
Make Audio Solely Drive Lip in Talking Face Video Synthesis. ICANN (3) 2024: 349-360 - [c92]Aolin Hu, Xueshuai Zhang, Shaoxing Zhang, Pengyuan Zhang, Yu Lu, Pengfei Ye, Qingwei Zhao, Yonghong Yan:
Snore Sound Features Based on Percussive Enhancing and Positional Encoding Combined with Multi-Task Learning for OSAHS Detection. ICASSP 2024: 901-905 - [c91]Jiakun Shen, Xueshuai Zhang, Pengyuan Zhang, Yonghong Yan, Qingwei Zhao, Ta Li, Yanfen Tang, Shaoxing Zhang:
One-Epoch Training with Single Test Sample in Test Time for Better Generalization of Cough-Based COVID-19 Detection Model. ICASSP 2024: 931-935 - [c90]Jingze Lu, Yuxiang Zhang, Wenchao Wang, Zengqiang Shang, Pengyuan Zhang:
One-Class Knowledge Distillation for Spoofing Speech Detection. ICASSP 2024: 11251-11255 - [c89]Yuxiang Zhang, Jingze Lu, Zengqiang Shang, Wenchao Wang, Pengyuan Zhang:
Improving Short Utterance Anti-Spoofing with AASIST2. ICASSP 2024: 11636-11640 - [i46]Chengxin Chen, Pengyuan Zhang:
TRNet: Two-level Refinement Network leveraging Speech Enhancement for Noise Robust Speech Emotion Recognition. CoRR abs/2404.12979 (2024) - [i45]Haorui He, Zengqiang Shang, Chaoren Wang, Xuyuan Li, Yicheng Gu, Hua Hua, Liwei Liu, Chen Yang, Jiaqi Li, Peiyang Shi, Yuancheng Wang, Kai Chen, Pengyuan Zhang, Zhizheng Wu:
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation. CoRR abs/2407.05361 (2024)
- 2023
- [j33]Yukun Liu, Ta Li, Pengyuan Zhang, Yonghong Yan:
SFA: Searching faster architectures for end-to-end automatic speech recognition models. Comput. Speech Lang. 81: 101500 (2023) - [j32]Zhuo Li, Runqiu Xiao, Hangting Chen, Zhenduo Zhao, Wenchao Wang, Pengyuan Zhang:
How to make embeddings suitable for PLDA. Comput. Speech Lang. 81: 101523 (2023) - [j31]Jiahao Yang, Wenkai Zhang, Xuejun Zhang, Jun Zhou, Pengyuan Zhang:
Enhancing stock movement prediction with market index and curriculum learning. Expert Syst. Appl. 213(Part): 118800 (2023) - [j30]Feng Dang, Hangting Chen, Qi Hu, Pengyuan Zhang, Yonghong Yan:
First coarse, fine afterward: A lightweight two-stage complex approach for monaural speech enhancement. Speech Commun. 146: 32-44 (2023) - [j29]Yi Yang, Qi Hu, Qingwei Zhao, Pengyuan Zhang:
So-DAS: A Two-Step Soft-Direction-Aware Speech Separation Framework. IEEE Signal Process. Lett. 30: 344-348 (2023) - [j28]Han Zhu, Dongji Gao, Gaofeng Cheng, Daniel Povey, Pengyuan Zhang, Yonghong Yan:
Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3320-3330 (2023) - [j27]Yuxiang Zhang, Zhuo Li, Jingze Lu, Hua Hua, Wenchao Wang, Pengyuan Zhang:
The Impact of Silence on Speech Anti-Spoofing. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3374-3389 (2023) - [c88]Zengqiang Shang, Xuyuan Li, Peiyang Shi, Hua Hua, Pengyuan Zhang:
The IOA-ThinkIT system for Blizzard Challenge 2023. Blizzard Challenge 2023 - [c87]Hua Hua, Jingze Lu, Peiyang Shi, Zengqiang Shang, Yuxiang Zhang, Xuyuan Li, Pengyuan Zhang:
Description of a Multi-Stage Audio Spoofing System in ADD Challenge 2023. DADA@IJCAI 2023: 49-57 - [c86]Yuxiang Zhang, Jingze Lu, Zhuo Li, Zengqiang Shang, Wenchao Wang, Pengyuan Zhang:
Improving the Robustness of Deepfake Audio Detection through Confidence Calibration. DADA@IJCAI 2023: 70-75 - [c85]Jingze Lu, Yuxiang Zhang, Zhuo Li, Zengqiang Shang, Wenchao Wang, Pengyuan Zhang:
Detecting Unknown Speech Spoofing Algorithms with Nearest Neighbors. DADA@IJCAI 2023: 89-94 - [c84]Jiakun Shen, Xueshuai Zhang, Pengyuan Zhang, Yonghong Yan, Shaoxing Zhang, Zhihua Huang, Yanfen Tang, Yu Wang, Fujie Zhang, Aijun Sun:
Piecewise Position Encoding in Convolutional Neural Network for Cough-Based COVID-19 Detection. ICASSP 2023: 1-5 - [c83]Shengchang Xiao, Xueshuai Zhang, Pengyuan Zhang:
Multi-Dimensional Frequency Dynamic Convolution with Confident Mean Teacher for Sound Event Detection. ICASSP 2023: 1-5 - [c82]Zhenduo Zhao, Zhuo Li, Wenchao Wang, Pengyuan Zhang:
PCF: ECAPA-TDNN with Progressive Channel Fusion for Speaker Verification. ICASSP 2023: 1-5 - [i44]Feng Dang, Qi Hu, Pengyuan Zhang:
THLNet: two-stage heterogeneous lightweight network for monaural speech enhancement. CoRR abs/2301.07939 (2023) - [i43]Shengchang Xiao, Xueshuai Zhang, Pengyuan Zhang:
Multi-dimensional frequency dynamic convolution with confident mean teacher for sound event detection. CoRR abs/2302.09256 (2023) - [i42]Changfeng Gao, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Speech Corpora Divergence Based Unsupervised Data Selection for ASR. CoRR abs/2302.13222 (2023) - [i41]Zhenduo Zhao, Zhuo Li, Wenchao Wang, Pengyuan Zhang:
PCF: ECAPA-TDNN with Progressive Channel Fusion for Speaker Verification. CoRR abs/2303.00204 (2023) - [i40]Feng Dang, Qi Hu, Pengyuan Zhang, Yonghong Yan:
ForkNet: Simultaneous Time and Time-Frequency Domain Modeling for Speech Enhancement. CoRR abs/2305.08292 (2023) - [i39]Zhenduo Zhao, Zhuo Li, Wenchao Wang, Pengyuan Zhang:
The HCCL system for VoxCeleb Speaker Recognition Challenge 2022. CoRR abs/2305.12642 (2023) - [i38]Zhuo Li, Jingze Lu, Zhenduo Zhao, Wenchao Wang, Pengyuan Zhang:
Progressive Sub-Graph Clustering Algorithm for Semi-Supervised Domain Adaptation Speaker Verification. CoRR abs/2305.12703 (2023) - [i37]Han Zhu, Dongji Gao, Gaofeng Cheng, Daniel Povey, Pengyuan Zhang, Yonghong Yan:
Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition. CoRR abs/2308.06547 (2023) - [i36]Xuyuan Li, Zengqiang Shang, Jian Liu, Hua Hua, Peiyang Shi, Pengyuan Zhang:
Expressive paragraph text-to-speech synthesis with multi-step variational autoencoder. CoRR abs/2308.13365 (2023) - [i35]Yuxiang Zhang, Jingze Lu, Zengqiang Shang, Wenchao Wang, Pengyuan Zhang:
Improving Short Utterance Anti-Spoofing with AASIST2. CoRR abs/2309.08279 (2023) - [i34]Jingze Lu, Yuxiang Zhang, Wenchao Wang, Zengqiang Shang, Pengyuan Zhang:
One-Class Knowledge Distillation for Spoofing Speech Detection. CoRR abs/2309.08285 (2023) - [i33]Yuxiang Zhang, Zhuo Li, Jingze Lu, Hua Hua, Wenchao Wang, Pengyuan Zhang:
The Impact of Silence on Speech Anti-Spoofing. CoRR abs/2309.11827 (2023) - [i32]Yuxiang Zhang, Zhuo Li, Jingze Lu, Wenchao Wang, Pengyuan Zhang:
Synthetic Speech Detection Based on Temporal Consistency and Distribution of Speaker Features. CoRR abs/2309.16954 (2023) - [i31]Chengxin Chen, Pengyuan Zhang:
DSNet: Disentangled Siamese Network with Neutral Calibration for Speech Emotion Recognition. CoRR abs/2312.15593 (2023) - [i30]Chengxin Chen, Pengyuan Zhang:
Modality-Collaborative Transformer with Hybrid Feature Reconstruction for Robust Emotion Recognition. CoRR abs/2312.15848 (2023)
- 2022
- [j26]Yuzhuo Liu, Hangting Chen, Qingwei Zhao, Pengyuan Zhang:
Master-Teacher-Student: A Weakly Labelled Semi-Supervised Framework for Audio Tagging and Sound Event Detection. IEICE Trans. Inf. Syst. 105-D(4): 828-831 (2022) - [j25]Zhaoqi Li, Ta Li, Qingwei Zhao, Pengyuan Zhang:
Label-Adversarial Jointly Trained Acoustic Word Embedding. IEICE Trans. Inf. Syst. 105-D(8): 1501-1505 (2022) - [j24]Zheying Huang, Ji Xu, Qingwei Zhao, Pengyuan Zhang:
A Two-Fold Cross-Validation Training Framework Combined with Meta-Learning for Code-Switching Speech Recognition. IEICE Trans. Inf. Syst. 105-D(9): 1639-1642 (2022) - [j23]Mengxi Liu, Pengyuan Zhang, Qian Shi, Mengwei Liu:
An Adversarial Domain Adaptation Framework With KL-Constraint for Remote Sensing Land Cover Classification. IEEE Geosci. Remote. Sens. Lett. 19: 1-5 (2022) - [j22]Runyan Yang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
An E2E-ASR-Based Iteratively-Trained Timestamp Estimator. IEEE Signal Process. Lett. 29: 1654-1658 (2022) - [j21]Changfeng Gao, Gaofeng Cheng, Ta Li, Pengyuan Zhang, Yonghong Yan:
Self-Supervised Pre-Training for Attention-Based Encoder-Decoder ASR Model. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1763-1774 (2022) - [c81]Feng Dang, Hangting Chen, Pengyuan Zhang:
DPT-FSNet: Dual-Path Transformer Based Full-Band and Sub-Band Fusion Network for Speech Enhancement. ICASSP 2022: 6857-6861 - [c80]Keqi Deng, Songjun Cao, Yike Zhang, Long Ma, Gaofeng Cheng, Ji Xu, Pengyuan Zhang:
Improving CTC-Based Speech Recognition Via Knowledge Transferring from Pre-Trained Language Models. ICASSP 2022: 8517-8521 - [c79]Keqi Deng, Zehui Yang, Shinji Watanabe, Yosuke Higuchi, Gaofeng Cheng, Pengyuan Zhang:
Improving Non-Autoregressive End-to-End Speech Recognition with Pre-Trained Acoustic and Language Models. ICASSP 2022: 8522-8526 - [c78]Hangting Chen, Yi Yang, Feng Dang, Pengyuan Zhang:
Beam-Guided TasNet: An Iterative Speech Separation Framework with Multi-Channel Output. INTERSPEECH 2022: 866-870 - [c77]Yukun Liu, Ta Li, Pengyuan Zhang, Yonghong Yan:
NAS-SCAE: Searching Compact Attention-based Encoders For End-to-end Automatic Speech Recognition. INTERSPEECH 2022: 1011-1015 - [c76]Yifan Chen, Yifan Guo, Qingxuan Li, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Interrelate Training and Searching: A Unified Online Clustering Framework for Speaker Diarization. INTERSPEECH 2022: 1456-1460 - [c75]Zehui Yang, Yifan Chen, Lei Luo, Runyan Yang, Lingxuan Ye, Gaofeng Cheng, Ji Xu, Yaohui Jin, Qingqing Zhang, Pengyuan Zhang, Lei Xie, Yonghong Yan:
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset. INTERSPEECH 2022: 1736-1740 - [c74]Xueshuai Zhang, Jiakun Shen, Jun Zhou, Pengyuan Zhang, Yonghong Yan, Zhihua Huang, Yanfen Tang, Yu Wang, Fujie Zhang, Shaoxing Zhang, Aijun Sun:
Robust Cough Feature Extraction and Classification Method for COVID-19 Cough Detection Based on Vocalization Characteristics. INTERSPEECH 2022: 2168-2172 - [c73]Han Zhu, Jindong Wang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Decoupled Federated Learning for ASR with Non-IID Data. INTERSPEECH 2022: 2628-2632 - [c72]Lingxuan Ye, Gaofeng Cheng, Runyan Yang, Zehui Yang, Sanli Tian, Pengyuan Zhang, Yonghong Yan:
Improving Recognition of Out-of-vocabulary Words in E2E Code-switching ASR by Fusing Speech Generation Methods. INTERSPEECH 2022: 3163-3167 - [c71]Yuxiang Zhang, Zhuo Li, Wenchao Wang, Pengyuan Zhang:
SASV Based on Pre-trained ASV System and Integrated Scoring Module. INTERSPEECH 2022: 4376-4380 - [c70]Chengxin Chen, Pengyuan Zhang:
CTA-RNN: Channel and Temporal-wise Attention RNN leveraging Pre-trained ASR Embeddings for Speech Emotion Recognition. INTERSPEECH 2022: 4730-4734 - [c69]Han Zhu, Li Wang, Gaofeng Cheng, Jindong Wang, Pengyuan Zhang, Yonghong Yan:
Wav2vec-S: Semi-Supervised Pre-Training for Low-Resource ASR. INTERSPEECH 2022: 4870-4874 - [c68]Qingxuan Li, Han Zhu, Liuping Luo, Gaofeng Cheng, Pengyuan Zhang, Jiasong Sun, Yonghong Yan:
Sequence Distribution Matching for Unsupervised Domain Adaptation in ASR. ISCSLP 2022: 21-25 - [c67]Peiyang Shi, Zengqiang Shang, Pengyuan Zhang:
A Mandarin Prosodic Boundary Prediction Model Based on Multi-Source Semi-Supervision. ISCSLP 2022: 285-289 - [c66]Gaofeng Cheng, Yifan Chen, Runyan Yang, Qingxuan Li, Zehui Yang, Lingxuan Ye, Pengyuan Zhang, Qingqing Zhang, Lei Xie, Yanmin Qian, Kong Aik Lee, Yonghong Yan:
The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines. ISCSLP 2022: 488-492 - [c65]Shuhao Deng, Chengfei Li, Jinfeng Bai, Qingqing Zhang, Wei-Qiang Zhang, Runyan Yang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Summary On The ISCSLP 2022 Chinese-English Code-Switching ASR Challenge. ISCSLP 2022: 527-531 - [c64]Chengxin Chen, Pengyuan Zhang:
Integrating Cross-modal Interactions via Latent Representation Shift for Multi-modal Humor Detection. MuSe @ ACM Multimedia 2022: 23-28 - [c63]Yuxiang Zhang, Jingze Lu, Xingming Wang, Zhuo Li, Runqiu Xiao, Wenchao Wang, Ming Li, Pengyuan Zhang:
Deepfake Detection System for the ADD Challenge Track 3.2 Based on Score Fusion. DDAM@MM 2022: 43-52 - [c62]Jingze Lu, Zhuo Li, Yuxiang Zhang, Wenchao Wang, Pengyuan Zhang:
Acoustic or Pattern? Speech Spoofing Countermeasure based on Image Pre-training Models. DDAM@MM 2022: 77-84 - [c61]Hua Hua, Ziyi Chen, Yuxiang Zhang, Ming Li, Pengyuan Zhang:
Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion. DDAM@MM 2022: 93-100 - [c60]Jianhua Tao, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi:
DDAM '22: 1st International Workshop on Deepfake Detection for Audio Multimedia. ACM Multimedia 2022: 7405-7406 - [c59]Pengjian Yang, Jun Wang, Guangyu Zhong, Pengyuan Zhang, Lai Zhang, Fan Liang, Jianxin Yang:
An IBC Reference Block Enhancement Model Based on GAN for Screen Content Video Coding. MMM (2) 2022: 15-26 - [c58]Jingze Lu, Yuxiang Zhang, Wenchao Wang, Pengyuan Zhang:
Robust Cross-SubBand Countermeasure Against Replay Attacks. Odyssey 2022: 126-132 - [e1]Jianhua Tao, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang:
DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, Lisboa, Portugal, 14 October 2022. ACM 2022, ISBN 978-1-4503-9496-3 - [i29]Keqi Deng, Zehui Yang, Shinji Watanabe, Yosuke Higuchi, Gaofeng Cheng, Pengyuan Zhang:
Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models. CoRR abs/2201.10103 (2022) - [i28]Ziyi Chen, Hua Hua, Yuxiang Zhang, Ming Li, Pengyuan Zhang:
The HCCL-DKU system for fake audio generation task of the 2022 ICASSP ADD Challenge. CoRR abs/2201.12567 (2022) - [i27]Keqi Deng, Songjun Cao, Yike Zhang, Long Ma, Gaofeng Cheng, Ji Xu, Pengyuan Zhang:
Improving CTC-based speech recognition via knowledge transferring from pre-trained language models. CoRR abs/2203.03582 (2022) - [i26]Zehui Yang, Yifan Chen, Lei Luo, Runyan Yang, Lingxuan Ye, Gaofeng Cheng, Ji Xu, Yaohui Jin, Qingqing Zhang, Pengyuan Zhang, Lei Xie, Yonghong Yan:
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset. CoRR abs/2203.16844 (2022) - [i25]Chengxin Chen, Pengyuan Zhang:
CTA-RNN: Channel and Temporal-wise Attention RNN Leveraging Pre-trained ASR Embeddings for Speech Emotion Recognition. CoRR abs/2203.17023 (2022) - [i24]Zhuo Li, Runqiu Xiao, Zihan Zhang, Zhenduo Zhao, Wenchao Wang, Pengyuan Zhang:
Back-ends Selection for Deep Speaker Embeddings. CoRR abs/2204.11403 (2022) - [i23]Chengxin Chen, Meng Wang, Pengyuan Zhang:
Audio-Visual Scene Classification Using A Transfer Learning Based Joint Optimization Strategy. CoRR abs/2204.11420 (2022) - [i22]Ziyi Chen, Haoran Miao, Pengyuan Zhang:
Streaming non-autoregressive model for any-to-many voice conversion. CoRR abs/2206.07288 (2022) - [i21]Han Zhu, Jindong Wang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Decoupled Federated Learning for ASR with Non-IID Data. CoRR abs/2206.09102 (2022) - [i20]Han Zhu, Gaofeng Cheng, Jindong Wang, Wenxin Hou, Pengyuan Zhang, Yonghong Yan:
Boosting Cross-Domain Speech Recognition with Self-Supervision. CoRR abs/2206.09783 (2022) - [i19]Yifan Chen, Yifan Guo, Qingxuan Li, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Interrelate Training and Searching: A Unified Online Clustering Framework for Speaker Diarization. CoRR abs/2206.13760 (2022) - [i18]Gaofeng Cheng, Yifan Chen, Runyan Yang, Qingxuan Li, Zehui Yang, Lingxuan Ye, Pengyuan Zhang, Qingqing Zhang, Lei Xie, Yanmin Qian, Kong Aik Lee, Yonghong Yan:
The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines. CoRR abs/2208.08042 (2022) - [i17]Shuhao Deng, Chengfei Li, Jinfeng Bai, Qingqing Zhang, Wei-Qiang Zhang, Runyan Yang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Summary on the ISCSLP 2022 Chinese-English Code-Switching ASR Challenge. CoRR abs/2210.06091 (2022) - [i16]Yuxiang Zhang, Jingze Lu, Xingming Wang, Zhuo Li, Runqiu Xiao, Wenchao Wang, Ming Li, Pengyuan Zhang:
Deepfake Detection System for the ADD Challenge Track 3.2 Based on Score Fusion. CoRR abs/2210.06818 (2022)
- 2021
- [j20]Dongni Hu, Chengxin Chen, Pengyuan Zhang, Junfeng Li, Yonghong Yan, Qingwei Zhao:
A Two-Stage Attention Based Modality Fusion Framework for Multi-Modal Speech Emotion Recognition. IEICE Trans. Inf. Syst. 104-D(8): 1391-1394 (2021) - [j19]Xiaoxiao Miao, Ian McLoughlin, Wenchao Wang, Pengyuan Zhang:
D-MONA: A dilated mixed-order non-local attention network for speaker and language recognition. Neural Networks 139: 201-211 (2021) - [j18]Hangting Chen, Pengyuan Zhang:
A dual-stream deep attractor network with multi-domain learning for speech dereverberation and separation. Neural Networks 141: 238-248 (2021) - [j17]Danyang Liu, Ji Xu, Pengyuan Zhang, Yonghong Yan:
A unified system for multilingual speech recognition and language identification. Speech Commun. 127: 17-28 (2021) - [j16]Runyan Yang, Gaofeng Cheng, Haoran Miao, Ta Li, Pengyuan Zhang, Yonghong Yan:
Keyword Search Using Attention-Based End-to-End ASR and Frame-Synchronous Phoneme Alignments. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3202-3215 (2021) - [c57]Yifan Guo, Yifan Chen, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Far-Field Speech Recognition Based on Complex-Valued Neural Networks and Inter-Frame Similarity Difference Method. ASRU 2021: 1003-1010 - [c56]Zengqiang Shang, Ziyi Chen, Haozhe Zhang, Pengyuan Zhang:
The IOA-ThinkIT system for Blizzard Challenge 2021. Blizzard Challenge 2021 - [c55]Zuozhen Liu, Ta Li, Pengyuan Zhang:
RNN-T Based Open-Vocabulary Keyword Spotting in Mandarin with Multi-Level Detection. ICASSP 2021: 5649-5653 - [c54]Keqi Deng, Gaofeng Cheng, Haoran Miao, Pengyuan Zhang, Yonghong Yan:
History Utterance Embedding Transformer LM for Speech Recognition. ICASSP 2021: 5914-5918 - [c53]Changfeng Gao, Gaofeng Cheng, Runyan Yang, Han Zhu, Pengyuan Zhang, Yonghong Yan:
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Text Data. ICASSP 2021: 6543-6547 - [c52]Zengqiang Shang, Haozhe Zhang, Ziyi Chen, Bolin Zhou, Pengyuan Zhang:
The ThinkIT System for ICASSP2021 M2VoC Challenge. ICASSP 2021: 8593-8597 - [c51]Yuzhuo Liu, Hangting Chen, Yun Wang, Pengyuan Zhang:
Power Pooling: An Adaptive Pooling Function for Weakly Labelled Sound Event Detection. IJCNN 2021: 1-7 - [c50]Ziyi Chen, Pengyuan Zhang:
TVQVC: Transformer Based Vector Quantized Variational Autoencoder with CTC Loss for Voice Conversion. Interspeech 2021: 826-830 - [c49]Zengqiang Shang, Zhihua Huang, Haozhe Zhang, Pengyuan Zhang, Yonghong Yan:
Incorporating Cross-Speaker Style Transfer for Multi-Language Text-to-Speech. Interspeech 2021: 1619-1623 - [c48]Feng Dang, Pengyuan Zhang, Hangting Chen:
Improved Speech Enhancement Using a Complex-Domain GAN with Fused Time-Domain and Time-Frequency Domain Constraints. Interspeech 2021: 2721-2725 - [c47]Haozhe Zhang, Zhihua Huang, Zengqiang Shang, Pengyuan Zhang, Yonghong Yan:
LinearSpeech: Parallel Text-to-Speech with Linear Complexity. Interspeech 2021: 4129-4133 - [c46]Yuxiang Zhang, Wenchao Wang, Pengyuan Zhang:
The Effect of Silence and Dual-Band Fusion in Anti-Spoofing System. Interspeech 2021: 4279-4283 - [c45]Runqiu Xiao, Xiaoxiao Miao, Wenchao Wang, Pengyuan Zhang, Bin Cai, Liuping Luo:
Adaptive Margin Circle Loss for Speaker Verification. Interspeech 2021: 4618-4622 - [c44]Jiakun Shen, Xueshuai Zhang, Wenchao Wang, Zhihua Huang, Pengyuan Zhang, Yonghong Yan:
Cough-based COVID-19 Detection with Multi-band Long-Short Term Memory and Convolutional Neural Networks. ISAIMS 2021: 209-215 - [c43]Changfeng Gao, Gaofeng Cheng, Jun Zhou, Pengyuan Zhang, Yonghong Yan:
Non-autoregressive Deliberation-Attention based End-to-End ASR. ISCSLP 2021: 1-5 - [c42]Zheying Huang, Peng Li, Ji Xu, Pengyuan Zhang, Yonghong Yan:
Context-dependent Label Smoothing Regularization for Attention-based End-to-End Code-Switching Speech Recognition. ISCSLP 2021: 1-5 - [i15]Yukun Liu, Ta Li, Pengyuan Zhang, Yonghong Yan:
Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search. CoRR abs/2104.05390 (2021) - [i14]Feng Dang, Hangting Chen, Pengyuan Zhang:
DPT-FSNet: Dual-path Transformer Based Full-band and Sub-band Fusion Network for Speech Enhancement. CoRR abs/2104.13002 (2021) - [i13]Han Zhu, Li Wang, Ying Hou, Jindong Wang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Wav2vec-S: Semi-Supervised Pre-Training for Speech Recognition. CoRR abs/2110.04484 (2021) - [i12]Changfeng Gao, Gaofeng Cheng, Yifan Guo, Qingwei Zhao, Pengyuan Zhang:
Data Augmentation based Consistency Contrastive Pre-training for Automatic Speech Recognition. CoRR abs/2112.12522 (2021)
- 2020
- [j15]Danyang Liu, Ji Xu, Pengyuan Zhang:
End-to-End Multilingual Speech Recognition System with Language Supervision Training. IEICE Trans. Inf. Syst. 103-D(6): 1427-1430 (2020) - [j14]Qian Shi, Mengxi Liu, Xiaoping Liu, Penghua Liu, Pengyuan Zhang, Jinxing Yang, Xia Li:
Domain Adaption for Fine-Grained Urban Village Extraction From Satellite Images. IEEE Geosci. Remote. Sens. Lett. 17(8): 1430-1434 (2020) - [j13]Haoran Miao, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Online Hybrid CTC/Attention End-to-End Automatic Speech Recognition Architecture. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1452-1465 (2020) - [c41]Haoran Miao, Gaofeng Cheng, Changfeng Gao, Pengyuan Zhang, Yonghong Yan:
Transformer-Based Online CTC/Attention End-To-End Speech Recognition Architecture. ICASSP 2020: 6084-6088 - [c40]Yue Fan, Jiawen Kang, Lantian Li, Kaicheng Li, Haolin Chen, Sitong Cheng, Pengyuan Zhang, Ziya Zhou, Yunqi Cai, Dong Wang:
CN-Celeb: A Challenging Chinese Speaker Recognition Dataset. ICASSP 2020: 7604-7608 - [c39]Xuejun Zhang, Yujiang Li, Pengyuan Zhang, Yonghong Yan:
Lingual-Agnostic Meta-Learning for Low-Resource Part-of-Speech Tagging. ICIT 2020: 35-39 - [c38]Hangting Chen, Pengyuan Zhang, Qian Shi, Zuozhen Liu:
Improved Guided Source Separation Integrated with a Strong Back-End for the CHiME-6 Dinner Party Scenario. INTERSPEECH 2020: 334-338 - [c37]Xueshuai Zhang, Wenchao Wang, Pengyuan Zhang:
Speaker Diarization System Based on DPCA Algorithm for Fearless Steps Challenge Phase-2. INTERSPEECH 2020: 2602-2606 - [c36]Han Zhu, Jiangjiang Zhao, Yuling Ren, Li Wang, Pengyuan Zhang:
Domain Adaptation Using Class Similarity for Robust Speech Recognition. INTERSPEECH 2020: 4367-4371 - [i11]Haoran Miao, Gaofeng Cheng, Changfeng Gao, Pengyuan Zhang, Yonghong Yan:
Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture. CoRR abs/2001.08290 (2020) - [i10]Yuzhuo Liu, Hangting Chen, Pengyuan Zhang:
Power Pooling Operators and Confidence Learning for Semi-Supervised Sound Event Detection. CoRR abs/2005.11459 (2020) - [i9]Hangting Chen, Zuozhen Liu, Zongming Liu, Pengyuan Zhang:
ACGAN-based Data Augmentation Integrated with Long-term Scalogram for Acoustic Scene Classification. CoRR abs/2005.13146 (2020) - [i8]Hangting Chen, Pengyuan Zhang:
Exploring the time-domain deep attractor network with two-stream architectures in a reverberant environment. CoRR abs/2007.00272 (2020) - [i7]Yuzhuo Liu, Hangting Chen, Yun Wang, Pengyuan Zhang:
Power pooling: An adaptive pooling function for weakly labelled sound event detection. CoRR abs/2010.09985 (2020) - [i6]Han Zhu, Li Wang, Pengyuan Zhang, Yonghong Yan:
Multi-Accent Adaptation based on Gate Mechanism. CoRR abs/2011.02774 (2020) - [i5]Han Zhu, Jiangjiang Zhao, Yuling Ren, Li Wang, Pengyuan Zhang:
Domain Adaptation Using Class Similarity for Robust Speech Recognition. CoRR abs/2011.02782 (2020)
2010 – 2019
- 2019
- [j12]Yongping Zhang, Pengyuan Zhang, Fei Tao, Yang Liu, Ying Zuo:
Consensus aware manufacturing service collaboration optimization under blockchain based Industrial Internet platform. Comput. Ind. Eng. 135: 1025-1035 (2019) - [j11]Danyang Liu, Ji Xu, Pengyuan Zhang, Yonghong Yan:
Investigation of knowledge transfer approaches to improve the acoustic modeling of Vietnamese ASR system. IEEE CAA J. Autom. Sinica 6(5): 1187-1195 (2019) - [j10]Shengyu Yao, Ruohua Zhou, Pengyuan Zhang:
Speaker-Phonetic I-Vector Modeling for Text-Dependent Speaker Verification with Random Digit Strings. IEICE Trans. Inf. Syst. 102-D(2): 346-354 (2019) - [j9]Gaofeng Cheng, Pengyuan Zhang, Ji Xu:
Automatic Speech Recognition System with Output-Gate Projected Gated Recurrent Unit. IEICE Trans. Inf. Syst. 102-D(2): 355-363 (2019) - [j8]Shiyue Zhang, Dali Chen, Shixin Liu, Pengyuan Zhang, Wei Zhao:
Aluminum alloy microstructural segmentation method based on simple noniterative clustering and adaptive density-based spatial clustering of applications with noise. J. Electronic Imaging 28(3): 033035 (2019) - [j7]Dali Chen, Pengyuan Zhang, Shixin Liu, Yangquan Chen, Wei Zhao:
Aluminum alloy microstructural segmentation in micrograph with hierarchical parameter transfer learning method. J. Electronic Imaging 28(5): 053018 (2019) - [j6]Yike Zhang, Pengyuan Zhang, Yonghong Yan:
Tailoring an Interpretable Neural Language Model. IEEE ACM Trans. Audio Speech Lang. Process. 27(7): 1164-1178 (2019) - [j5]Yongping Zhang, Fei Tao, Yang Liu, Pengyuan Zhang, Ying Cheng, Ying Zuo:
Long/Short-Term Utility Aware Optimal Selection of Manufacturing Service Composition Toward Industrial Internet Platforms. IEEE Trans. Ind. Informatics 15(6): 3712-3722 (2019) - [c35]Wenjing Wei, Ge Zhan, Xun Wang, Pengyuan Zhang, Yonghong Yan:
A Novel Method for Automatic Heart Murmur Diagnosis Using Phonocardiogram. AIAM (ACM) 2019: 37:1-37:6 - [c34]Lu Huang, Gaofeng Cheng, Pengyuan Zhang, Yi Yang, Shumin Xu, Jiasong Sun:
Utterance-level Permutation Invariant Training with Latency-controlled BLSTM for Single-channel Multi-talker Speech Separation. APSIPA 2019: 1256-1261 - [c33]Ruimin Wang, Chunhui Lu, Xiaoyang Hao, Bolin Zhou, Zengqiang Shang, Pengyuan Zhang:
The IOA-ThinkIT system for Blizzard Challenge 2019. Blizzard Challenge 2019 - [c32]Hangting Chen, Pengyuan Zhang, Yonghong Yan:
An Audio Scene Classification Framework with Embedded Filters and a DCT-based Temporal Module. ICASSP 2019: 835-839 - [c31]Chunhui Lu, Pengyuan Zhang, Yonghong Yan:
Self-attention Based Prosodic Boundary Prediction for Chinese Speech Synthesis. ICASSP 2019: 7035-7039 - [c30]Long Wu, Hangting Chen, Li Wang, Pengyuan Zhang, Yonghong Yan:
Speaker-Invariant Feature-Mapping for Distant Speech Recognition via Adversarial Teacher-Student Learning. INTERSPEECH 2019: 431-435 - [c29]Han Zhu, Li Wang, Pengyuan Zhang, Yonghong Yan:
Multi-Accent Adaptation Based on Gate Mechanism. INTERSPEECH 2019: 744-748 - [c28]Haoran Miao, Gaofeng Cheng, Pengyuan Zhang, Ta Li, Yonghong Yan:
Online Hybrid CTC/Attention Architecture for End-to-End Speech Recognition. INTERSPEECH 2019: 2623-2627 - [c27]Wenjie Li, Pengyuan Zhang, Yonghong Yan:
Target Speaker Recovery and Recognition Network with Average x-Vector and Global Training. INTERSPEECH 2019: 3233-3237 - [c26]Chang Liu, Zhen Zhang, Pengyuan Zhang, Yonghong Yan:
Character-Aware Sub-Word Level Language Modeling for Uyghur and Turkish ASR. INTERSPEECH 2019: 3495-3499 - [c25]Sifan Wu, Fei Li, Pengyuan Zhang:
Weighted Feature Fusion Based Emotional Recognition for Variable-length Speech using DNN. IWCMC 2019: 674-679 - [i4]Hangting Chen, Zuozhen Liu, Zongming Liu, Pengyuan Zhang, Yonghong Yan:
Integrating the Data Augmentation Scheme with Various Classifiers for Acoustic Scene Modeling. CoRR abs/1907.06639 (2019) - [i3]Yue Fan, Jiawen Kang, Lantian Li, Kaicheng Li, Haolin Chen, Sitong Cheng, Pengyuan Zhang, Ziya Zhou, Yunqi Cai, Dong Wang:
CN-CELEB: a challenging Chinese speaker recognition dataset. CoRR abs/1911.01799 (2019) - [i2]Lu Huang, Gaofeng Cheng, Pengyuan Zhang, Yi Yang, Shumin Xu, Jiasong Sun:
Utterance-level Permutation Invariant Training with Latency-controlled BLSTM for Single-channel Multi-talker Speech Separation. CoRR abs/1912.11613 (2019)
- 2018
- [j4]Yu Zhang, Pengyuan Zhang, Qingwei Zhao:
Improve Multichannel Speech Recognition with Temporal and Spatial Information. IEICE Trans. Inf. Syst. 101-D(7): 1963-1967 (2018) - [c24]Yu Zhang, Wenjie Li, Pengyuan Zhang, Yonghong Yan:
Improving Multichannel Speech Recognition with Generalized Cross Correlation Inputs and Multitask Learning. ICASSP 2018: 5704-5708 - [c23]Wenjie Li, Gaofeng Cheng, Fengpei Ge, Pengyuan Zhang, Yonghong Yan:
Investigation on the Combination of Batch Normalization and Dropout in BLSTM-based Acoustic Modeling for ASR. INTERSPEECH 2018: 2888-2892 - [c22]Hangting Chen, Pengyuan Zhang, Haichuan Bai, Qingsheng Yuan, Xiuguo Bao, Yonghong Yan:
Deep Convolutional Neural Network with Scalogram for Audio Scene Modeling. INTERSPEECH 2018: 3304-3308 - [c21]Yike Zhang, Pengyuan Zhang, Yonghong Yan:
Improving Language Modeling with an Adversarial Critic for Automatic Speech Recognition. INTERSPEECH 2018: 3348-3352 - [c20]Danyang Liu, Xinxin Wan, Ji Xu, Pengyuan Zhang:
Multilingual Speech Recognition Training and Adaptation with Language-Specific Gate Units. ISCSLP 2018: 86-90 - [c19]Long Wu, Li Wang, Pengyuan Zhang, Ta Li, Yonghong Yan:
Space-Time Residual LSTM Architechture for Distant Speech Recognition. ISCSLP 2018: 379-383 - [c18]Chang Liu, Yike Zhang, Pengyuan Zhang, Yaofeng Wang:
Evaluating Modeling Units and Sub-word Features in Language Models for Turkish ASR. ISCSLP 2018: 414-418 - [c17]Wenjie Li, Yu Zhang, Pengyuan Zhang, Fengpei Ge:
Multichannel ASR with Knowledge Distillation and Generalized Cross Correlation Feature. SLT 2018: 463-469
- 2017
- [c16]Yike Zhang, Pengyuan Zhang, Qingwei Zhao, Yonghong Yan, Zhenjiang Dong, Xia Jia:
An improved lexicon generation method for mandarin speech recognition. ICNC-FSKD 2017: 661-665 - [c15]Ge Zhang, Pengyuan Zhang, Jielin Pan, Yonghong Yan:
Fast variable-frame-rate decoding of speech recognition based on deep neural networks. ICNC-FSKD 2017: 821-825 - [c14]Yu Zhang, Pengyuan Zhang, Yonghong Yan:
Attention-Based LSTM with Multi-Task Learning for Distant Speech Recognition. INTERSPEECH 2017: 3857-3861
- 2016
- [j3]Xuyang Wang, Pengyuan Zhang, Qingwei Zhao, Jielin Pan, Yonghong Yan:
Improved End-to-End Speech Recognition Using Adaptive Per-Dimensional Learning Rate Methods. IEICE Trans. Inf. Syst. 99-D(10): 2550-2553 (2016) - [c13]Yike Zhang, Pengyuan Zhang, Ta Li, Yonghong Yan:
An unsupervised vocabulary selection technique for Chinese automatic speech recognition. SLT 2016: 420-425 - 2015
- [c12]Pengyuan Zhang, Jianping Li, Enming Dong, Qi Liu:
A Method of Link Prediction Based on Betweenness. CSoNet 2015: 228-235 - [c11]Qi Liu, Jianping Li, Zheng Xie, Pengyuan Zhang:
An improvement of link prediction by combining local information and betweenness. ICNC 2015: 456-461 - [c10]Pengyuan Zhang, Jianping Li, Qi Liu, Zheng Xie:
A bi-scale method of link prediction. ICNC 2015: 1040-1044 - [i1]Xiaofei Wang, Chao Wu, Pengyuan Zhang, Ziteng Wang, Yong Liu, Xu Li, Qiang Fu, Yonghong Yan:
Noise Robust IOA/CAS Speech Separation and Recognition System For The Third 'CHIME' Challenge. CoRR abs/1509.06103 (2015) - 2014
- [c9]Yulan Liu, Pengyuan Zhang, Thomas Hain:
Using neural network front-ends on far field multiple microphones based speech recognition. ICASSP 2014: 5542-5546 - [c8]Xuyang Wang, Ta Li, Pengyuan Zhang, Jielin Pan, Yonghong Yan:
Enhanced Out of Vocabulary Word Detection Using Local Acoustic Information. IIH-MSP 2014: 594-597 - [c7]Pengyuan Zhang, Yulan Liu, Thomas Hain:
Semi-supervised DNN training in meeting recognition. SLT 2014: 141-146 - 2012
- [j2]Chuanxu Wang, Pengyuan Zhang:
Optimization of Spoken Term Detection System. J. Appl. Math. 2012: 548341:1-548341:8 (2012) - 2010
- [j1]Yanqing Sun, Yu Zhou, Qingwei Zhao, Pengyuan Zhang, Fuping Pan, Yonghong Yan:
Enhancing the Robustness of the Posterior-Based Confidence Measures Using Entropy Information for Speech Recognition. IEICE Trans. Inf. Syst. 93-D(9): 2431-2439 (2010)
2000 – 2009
- 2007
- [c6]Pengyuan Zhang, Qingwei Zhao, Yonghong Yan:
A Spoken Dialogue System Based on Keyword Spotting Technology. HCI (3) 2007: 253-261 - [c5]Pengyuan Zhang, Jian Shao, Qingwei Zhao, Yonghong Yan:
Keyword Spotting Based on Syllable Confusion Network. ICNC (2) 2007: 656-659 - [c4]Zhaojie Liu, Jian Shao, Pengyuan Zhang, Qingwei Zhao, Yonghong Yan, Ji Feng:
Real Context Model for Tone Recognition in Mandarin Conversational Telephone Speech. ICNC (2) 2007: 696-699 - [c3]Jian Shao, Qingwei Zhao, Pengyuan Zhang, Zhaojie Liu, Yonghong Yan:
A fast fuzzy keyword spotting algorithm based on syllable confusion network. INTERSPEECH 2007: 2405-2408 - 2006
- [c2]Jian Shao, Pengyuan Zhang, Jiang Han, Jun Yang, Yonghong Yan:
Syllable Based Audio Search Using Confusion Network Arc as Indexing Unit. ISCSLP 2006 - [c1]Pengyuan Zhang, Jian Shao, Jiang Han, Zhaojie Liu, Yonghong Yan:
Keyword Spotting Based on Phoneme Confusion Matrix. ISCSLP 2006
last updated on 2024-11-04 20:44 CET by the dblp team
all metadata released as open data under CC0 1.0 license