


Jianhua Tao 0001
Person information
- unicode name: 陶建华
- affiliation: Tsinghua University, Department of Automation, Beijing, China
- affiliation: University of Chinese Academy of Sciences, School of Artificial Intelligence, Beijing, China
- affiliation (PhD 2001): Tsinghua University, Beijing, China
Other persons with the same name
- Jianhua Tao 0002 — Guangzhou University, School of Mechanical and Electrical Engineering, China
2020 – today
- 2025
- [j98]Xinxin Zheng, Feihu Che, Jianhua Tao:
A Comprehensive Survey of Few-shot Information Networks. Mach. Intell. Res. 22(1): 60-78 (2025) - [j97]Sicheng Zhao, Jing Jiang, Wenbo Tang, Jiankun Zhu, Hui Chen, Pengfei Xu, Björn W. Schuller, Jianhua Tao, Hongxun Yao, Guiguang Ding:
Multi-source multi-modal domain adaptation. Inf. Fusion 117: 102862 (2025) - [j96]Cunhang Fan, Hongyu Zhang, Qinke Ni, Jingjing Zhang, Jianhua Tao, Jian Zhou, Jiangyan Yi, Zhao Lv, Xiaopei Wu:
Seeing helps hearing: A multi-modal dataset and a mamba-based dual branch parallel network for auditory attention decoding. Inf. Fusion 118: 102946 (2025) - [j95]Cunhang Fan, Kang Zhu, Jianhua Tao, Guofeng Yi, Jun Xue, Zhao Lv:
Multi-Level Contrastive Learning: Hierarchical Alleviation of Heterogeneity in Multimodal Sentiment Analysis. IEEE Trans. Affect. Comput. 16(1): 207-222 (2025) - [j94]Licai Sun, Zheng Lian, Kexin Wang, Yu He, Mingyu Xu, Haiyang Sun, Bin Liu, Jianhua Tao:
SVFAP: Self-Supervised Video Facial Affect Perceiver. IEEE Trans. Affect. Comput. 16(1): 405-422 (2025) - [c297]Yujie Chen, Jiangyan Yi, Cunhang Fan, Jianhua Tao, Yong Ren, Siding Zeng, Chu Yuan Zhang, Xinrui Yan, Hao Gu, Jun Xue, Chenglong Wang, Zhao Lv, Xiaohui Zhang:
Region-Based Optimization in Continual Learning for Audio Deepfake Detection. AAAI 2025: 23651-23659 - [c296]Cunhang Fan, Enrui Liu, Andong Li, Jianhua Tao, Jian Zhou, Jiahao Li, Chengshi Zheng, Zhao Lv:
BSDB-Net: Band-Split Dual-Branch Network with Selective State Spaces Mechanism for Monaural Speech Enhancement. AAAI 2025: 23850-23858 - [c295]Shuai Zhang, Jiangyan Yi, Zhengqi Wen, Jianhua Tao, Feihu Che, Jinyang Wu, Ruibo Fu:
Code-switching Mediated Sentence-level Semantic Learning. AAAI 2025: 25913-25921 - [i129]Kaiying Yan, Moyang Liu, Yukun Liu, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Xuefei Liu, Guanjun Li:
MTPareto: A MultiModal Targeted Pareto Framework for Fake News Detection. CoRR abs/2501.06764 (2025) - [i128]Zheng Lian, Haoyu Chen, Lan Chen, Haiyang Sun, Licai Sun, Yong Ren, Zebang Cheng, Bin Liu, Rui Liu, Xiaojiang Peng, Jiangyan Yi, Jianhua Tao:
AffectGPT: A New Dataset, Model, and Benchmark for Emotion Understanding with Multimodal Large Language Models. CoRR abs/2501.16566 (2025) - [i127]Mingkuan Feng, Jinyang Wu, Shuai Zhang, Pengpeng Shao, Ruihan Jin, Zhengqi Wen, Jianhua Tao, Feihu Che:
DReSS: Data-driven Regularized Structured Streamlining for Large Language Models. CoRR abs/2501.17905 (2025) - [i126]Jinyang Wu, Mingkuan Feng, Shuai Zhang, Ruihan Jin, Feihu Che, Zengqi Wen, Jianhua Tao:
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking. CoRR abs/2502.02339 (2025) - [i125]Zhengxian Yang, Shi Pan, Shengqi Wang, Haoxiang Wang, Li Lin, Guanjun Li, Zhengqi Wen, Borong Lin, Jianhua Tao, Tao Yu:
ImViD: Immersive Volumetric Videos for Enhanced VR Engagement. CoRR abs/2503.14359 (2025)
- 2024
- [j93]Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen, Chu Yuan Zhang:
Emotion selectable end-to-end text-based speech editing. Artif. Intell. 329: 104076 (2024) - [j92]Cunhang Fan, Heng Xie, Jianhua Tao, Yongwei Li, Guanxiong Pei, Taihao Li, Zhao Lv:
ICaps-ResLSTM: Improved capsule network and residual LSTM for EEG emotion recognition. Biomed. Signal Process. Control. 87(Part B): 105422 (2024) - [j91]Pengpeng Shao, Jianhua Tao:
Multi-level graph contrastive learning. Neurocomputing 570: 127101 (2024) - [j90]Zheng Lian, Licai Sun, Haiyang Sun, Kang Chen, Zhuofan Wen, Hao Gu, Bin Liu, Jianhua Tao:
GPT-4V with emotion: A zero-shot benchmark for Generalized Emotion Recognition. Inf. Fusion 108: 102367 (2024) - [j89]Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao:
HiCMAE: Hierarchical Contrastive Masked Autoencoder for self-supervised Audio-Visual Emotion Recognition. Inf. Fusion 108: 102382 (2024) - [j88]Guofeng Yi, Cunhang Fan, Kang Zhu, Zhao Lv, Shan Liang, Zhengqi Wen, Guanxiong Pei, Taihao Li, Jianhua Tao:
VLP2MSA: Expanding vision-language pre-training to multimodal sentiment analysis. Knowl. Based Syst. 283: 111136 (2024) - [j87]Pengpeng Shao, Yang Wen, Jianhua Tao:
Bayesian hypernetwork collaborates with time-difference evolutional network for temporal knowledge prediction. Neural Networks 175: 106146 (2024) - [j86]Cunhang Fan, Jun Xue, Jianhua Tao, Jiangyan Yi, Chenglong Wang, Chengshi Zheng, Zhao Lv:
Spatial reconstructed local attention Res2Net with F0 subband for fake speech detection. Neural Networks 175: 106320 (2024) - [j85]Feihu Che, Jianhua Tao:
M2ixKG: Mixing for harder negative samples in knowledge graph. Neural Networks 177: 106358 (2024) - [j84]Cunhang Fan, Hongyu Zhang, Wei Huang, Jun Xue, Jianhua Tao, Jiangyan Yi, Zhao Lv, Xiaopei Wu:
DGSD: Dynamical graph self-distillation for EEG-based auditory spatial attention detection. Neural Networks 179: 106580 (2024) - [j83]Nayu Liu, Kaiwen Wei, Yong Yang, Jianhua Tao, Xian Sun, Fanglong Yao, Hongfeng Yu, Li Jin, Zhao Lv, Cunhang Fan:
Multimodal Cross-Lingual Summarization for Videos: A Revisit in Knowledge Distillation Induced Triple-Stage Training Method. IEEE Trans. Pattern Anal. Mach. Intell. 46(12): 10697-10714 (2024) - [j82]Jiangyan Yi, Chenglong Wang, Jianhua Tao, Chuyuan Zhang, Cunhang Fan, Zhengkun Tian, Haoxin Ma, Ruibo Fu:
SceneFake: An initial dataset and benchmarks for scene fake audio detection. Pattern Recognit. 152: 110468 (2024) - [j81]Haoxin Ma, Jiangyan Yi, Chenglong Wang, Xinrui Yan, Jianhua Tao, Tao Wang, Shiming Wang, Ruibo Fu:
CFAD: A Chinese dataset for fake audio detection. Speech Commun. 164: 103122 (2024) - [j80]Na Guo, Jianguo Wei, Yongwei Li, Wenhuan Lu, Jianhua Tao:
Zero-shot voice conversion based on feature disentanglement. Speech Commun. 165: 103143 (2024) - [j79]Mingyue Niu, Jianhua Tao, Yongwei Li, Yong Qin, Ya Li:
WavDepressionNet: Automatic Depression Level Prediction via Raw Speech Signals. IEEE Trans. Affect. Comput. 15(1): 285-296 (2024) - [j78]Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao:
Efficient Multimodal Transformer With Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis. IEEE Trans. Affect. Comput. 15(1): 309-325 (2024) - [j77]Cunhang Fan, Mingming Ding, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Zhao Lv:
Dual-Branch Knowledge Distillation for Noise-Robust Synthetic Speech Detection. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2453-2466 (2024) - [j76]Mingyue Niu, Ya Li, Jianhua Tao, Xiuzhuang Zhou, Björn W. Schuller:
DepressionMLP: A Multi-Layer Perceptron Architecture for Automatic Depression Level Prediction via Facial Keypoints and Action Units. IEEE Trans. Circuits Syst. Video Technol. 34(9): 8924-8938 (2024) - [j75]Zheng Lian, Bin Liu, Jianhua Tao:
PIRNet: Personality-Enhanced Iterative Refinement Network for Emotion Recognition in Conversation. IEEE Trans. Neural Networks Learn. Syst. 35(2): 2863-2874 (2024) - [c294]Cunhang Fan, Yujie Chen, Jun Xue, Yonghui Kong, Jianhua Tao, Zhao Lv:
Progressive Distillation Based on Masked Generation Feature Method for Knowledge Graph Completion. AAAI 2024: 8380-8388 - [c293]Xiaohui Zhang, Jiangyan Yi, Chenglong Wang, Chu Yuan Zhang, Siding Zeng, Jianhua Tao:
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection. AAAI 2024: 19569-19577 - [c292]Sicheng Zhao, Jianhua Tao, Guiguang Ding:
Open-world Domain Adaptation and Generalization. ACM TUR-C 2024 - [c291]Chu Yuan Zhang, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Xinrui Yan:
Distinguishing Neural Speech Synthesis Models Through Fingerprints in Speech Waveforms. CCL 2024: 259-273 - [c290]Yan Zhao, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Yongfeng Dong:
EmoFake: An Initial Dataset for Emotion Fake Audio Detection. CCL 2024: 419-433 - [c289]Hao Gu, Jiangyan Yi, Zheng Lian, Jianhua Tao, Xinrui Yan:
NLoPT: N-gram Enhanced Low-Rank Task Adaptive Pre-training for Efficient Language Model Adaption. LREC/COLING 2024: 12259-12270 - [c288]Shuaihu Han, Guohua Yang, Dawei Zhang, Jianhua Tao, Feihu Che:
Multi-stage Vs Single-Stage: A Local Information Focused Approach for Overlapping Event Extraction. ICANN (7) 2024: 277-291 - [c287]Chenglong Wang, Jiayi He, Jiangyan Yi, Jianhua Tao, Chu Yuan Zhang, Xiaohui Zhang:
Multi-Scale Permutation Entropy for Audio Deepfake Detection. ICASSP 2024: 1406-1410 - [c286]Mingyu Xu, Zheng Lian, Bin Liu, Zerui Chen, Jianhua Tao:
Pseudo Labels Regularization for Imbalanced Partial-Label Learning. ICASSP 2024: 6305-6309 - [c285]Yong Ren, Tao Wang, Jiangyan Yi, Le Xu, Jianhua Tao, Chu Yuan Zhang, Junzuo Zhou:
Fewer-Token Neural Speech Codec with Time-Invariant Codes. ICASSP 2024: 12737-12741 - [c284]Kang Zhu, Cunhang Fan, Jianhua Tao, Jun Xue, Heng Xie, Xuefei Liu, Yongwei Li, Zhengqi Wen, Zhao Lv:
Dual-View Multimodal Interaction in Multimodal Sentiment Analysis. ICME 2024: 1-6 - [c283]Shuaihu Han, Guohua Yang, Dawei Zhang, Jianhua Tao:
What Comes Next and Why? A Staged Encoder-Decoder Architecture for Script Event Prediction. IJCNN 2024: 1-9 - [c282]Xiaoyang Li, Guohua Yang, Dawei Zhang, Jianhua Tao:
APC: Predict Global Representation From Local Observation In Multi-Agent Reinforcement Learning. IJCNN 2024: 1-8 - [c281]Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Yukun Liu, Guanjun Li, Xin Qi, Yi Lu, Xuefei Liu, Yongwei Li:
A Noval Feature via Color Quantisation for Fake Audio Detection. ISCSLP 2024: 1-5 - [c280]Moyang Liu, Yukun Liu, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Xuefei Liu, Guanjun Li:
Exploring the Role of Audio in Multimodal Misinformation Detection. ISCSLP 2024: 204-208 - [c279]Xin Qi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Shuchen Shi, Yi Lu, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Guanjun Li, Xuefei Liu, Yongwei Li:
EELE: Exploring Efficient and Extensible LoRA Integration in Emotional Text-to-Speech. ISCSLP 2024: 294-298 - [c278]Kang Zhu, Xuefei Liu, Heng Xie, Cong Cai, Ruibo Fu, Guanjun Li, Zhengqi Wen, Jianhua Tao, Cunhang Fan, Zhao Lv, Le Wang, Hao Lin:
Transferring Personality Knowledge to Multimodal Sentiment Analysis. ISCSLP 2024: 431-435 - [c277]Hanzhe Xu, Xuefei Liu, Cong Cai, Kang Zhu, Jizhou Cui, Ruibo Fu, Heng Xie, Jianhua Tao, Zhengqi Wen, Ziping Zhao, Guanjun Li, Le Wang, Hao Lin:
Temporal Shift for Personality Recognition with Pre-Trained Representations. ISCSLP 2024: 446-450 - [c276]Xiaoke Qi, Hao Gu, Jiangyan Yi, Jianhua Tao, Yong Ren, Jiayi He, Siding Zeng:
MADD: A Multi-Lingual Multi-Speaker Audio Deepfake Detection Dataset. ISCSLP 2024: 466-470 - [c275]Xinrui Yan, Jiangyan Yi, Jianhua Tao, Yujie Chen, Hao Gu, Guanjun Li, Junzuo Zhou, Yong Ren, Tao Xu:
Reject Threshold Adaptation for Open-Set Model Attribution of Deepfake Audio. ISCSLP 2024: 476-480 - [c274]Yuankun Xie, Chenxu Xiong, Xiaopeng Wang, Zhiyong Wang, Yi Lu, Xin Qi, Ruibo Fu, Yukun Liu, Zhengqi Wen, Jianhua Tao, Guanjun Li, Long Ye:
Does Current Deepfake Audio Detection Model Effectively Detect ALM-Based Deepfake Audio? ISCSLP 2024: 481-485 - [c273]Jizhou Cui, Xuefei Liu, Yongwei Li, Xiaoying Xu, Ruibo Fu, Jianhua Tao, Zhengqi Wen, Yukun Liu, Guanjun Li, Le Wang, Hao Lin:
Unlocking the Power of Emotions: Enhancing Personality Trait Recognition Through Utilization of Emotional Cues. ISCSLP 2024: 566-570 - [c272]Cunhang Fan, Jingjing Zhang, Hongyu Zhang, Wang Xiang, Jianhua Tao, Xinhui Li, Jiangyan Yi, Dianbo Sui, Zhao Lv:
MSFNet: Multi-Scale Fusion Network for Brain-Controlled Speaker Extraction. ACM Multimedia 2024: 1652-1661 - [c271]Hao Gu, Jiangyan Yi, Chenglong Wang, Yong Ren, Jianhua Tao, Xinrui Yan, Yujie Chen, Xiaohui Zhang:
Utilizing Speaker Profiles for Impersonation Audio Detection. ACM Multimedia 2024: 1961-1970 - [c270]Sicheng Zhao, Guoli Jia, Xiaopeng Hong, Yanyan Zhao, Jianhua Tao:
Label-Efficient Emotion and Sentiment Analysis. ACM Multimedia 2024: 11300-11301 - [c269]Zheng Lian, Bin Liu, Rui Liu, Kele Xu, Erik Cambria, Guoying Zhao, Björn W. Schuller, Jianhua Tao:
MRAC'24 Track 2: 2nd International Workshop on Multimodal and Responsible Affective Computing. MRAC@MM 2024: 39-40 - [c268]Zheng Lian, Haiyang Sun, Licai Sun, Zhuofan Wen, Siyuan Zhang, Shun Chen, Hao Gu, Jinming Zhao, Ziyang Ma, Xie Chen, Jiangyan Yi, Rui Liu, Kele Xu, Bin Liu, Erik Cambria, Guoying Zhao, Björn W. Schuller, Jianhua Tao:
MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition. MRAC@MM 2024: 41-48 - [c267]Zhuofan Wen, Hailiang Yao, Shun Chen, Haiyang Sun, Mingyu Xu, Licai Sun, Zheng Lian, Bin Liu, Fengyu Zhang, Siyuan Zhang, Jianhua Tao:
Social Perception Prediction for MuSe 2024: Joint Learning of Multiple Perceptions. MuSe@ACM Multimedia 2024: 52-59 - [c266]Shun Chen, Hailiang Yao, Mingyu Xu, Zhuofan Wen, Haiyang Sun, Licai Sun, Zheng Lian, Bin Liu, Fengyu Zhang, Siyuan Zhang, Jianhua Tao:
DPP: A Dual-Phase Processing Method for Cross-Cultural Humor Detection. MuSe@ACM Multimedia 2024: 70-78 - [c265]Yonghui Kong, Cunhang Fan, Yujie Chen, Shuai Zhang, Zhao Lv, Jianhua Tao:
Bilateral Masking with prompt for Knowledge Graph Completion. NAACL-HLT (Findings) 2024: 240-249 - [c264]Sheng Yan, Cunhang Fan, Hongyu Zhang, Xiaoke Yang, Jianhua Tao, Zhao Lv:
DARNet: Dual Attention Refinement Network with Spatiotemporal Construction for Auditory Attention Detection. NeurIPS 2024 - [e8]Yanmin Qian, Qin Jin, Zhijian Ou, Zhenhua Ling, Zhiyong Wu, Ya Li, Lei Xie, Jianhua Tao:
14th IEEE International Symposium on Chinese Spoken Language Processing, ISCSLP 2024, Beijing, China, November 7-10, 2024. IEEE 2024, ISBN 979-8-3315-1682-6 - [e7]Jianhua Tao, Shreya Ghosh, Zheng Lian, Zhixi Cai, Björn W. Schuller, Abhinav Dhall, Guoying Zhao, Dimitrios Kollias, Erik Cambria, Roland Goecke, Tom Gedeon:
Proceedings of the 2nd International Workshop on Multimodal and Responsible Affective Computing, MRAC 2024, Melbourne VIC, Australia, 28 October 2024 - 1 November 2024. ACM 2024, ISBN 979-8-4007-1203-6 - [i124]Licai Sun, Zheng Lian, Kexin Wang, Yu He, Mingyu Xu, Haiyang Sun, Bin Liu, Jianhua Tao:
SVFAP: Self-supervised Video Facial Affect Perceiver. CoRR abs/2401.00416 (2024) - [i123]Zheng Lian, Licai Sun, Yong Ren, Hao Gu, Haiyang Sun, Lan Chen, Bin Liu, Jianhua Tao:
MERBench: A Unified Evaluation Benchmark for Multimodal Emotion Recognition. CoRR abs/2401.03429 (2024) - [i122]Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao:
HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition. CoRR abs/2401.05698 (2024) - [i121]Cunhang Fan, Yujie Chen, Jun Xue, Yonghui Kong, Jianhua Tao, Zhao Lv:
Progressive Distillation Based on Masked Generation Feature Method for Knowledge Graph Completion. CoRR abs/2401.12997 (2024) - [i120]Kang Chen, Zheng Lian, Haiyang Sun, Bin Liu, Jianhua Tao:
Can Deception Detection Go Deeper? Dataset, Evaluation, and Benchmark for Deception Reasoning. CoRR abs/2402.11432 (2024) - [i119]Zhuofan Wen, Fengyu Zhang, Siyuan Zhang, Haiyang Sun, Mingyu Xu, Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao:
Multimodal Fusion with Pre-Trained Model Features in Affective Behaviour Analysis In-the-wild. CoRR abs/2403.15044 (2024) - [i118]Xinxin Zheng, Feihu Che, Jinyang Wu, Shuai Zhang, Shuai Nie, Kang Liu, Jianhua Tao:
KS-LLM: Knowledge Selection of Large Language Models with Evidence Document for Question Answering. CoRR abs/2404.15660 (2024) - [i117]Zheng Lian, Haiyang Sun, Licai Sun, Zhuofan Wen, Siyuan Zhang, Shun Chen, Hao Gu, Jinming Zhao, Ziyang Ma, Xie Chen, Jiangyan Yi, Rui Liu, Kele Xu, Bin Liu, Erik Cambria, Guoying Zhao, Björn W. Schuller, Jianhua Tao:
MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition. CoRR abs/2404.17113 (2024) - [i116]Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun:
The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio. CoRR abs/2405.04880 (2024) - [i115]Jinyang Wu, Feihu Che, Xinxin Zheng, Shuai Zhang, Ruihan Jin, Shuai Nie, Pengpeng Shao, Jianhua Tao:
Can large language models understand uncommon meanings of common words? CoRR abs/2405.05741 (2024) - [i114]Xiaohui Zhang, Jiangyan Yi, Jianhua Tao:
EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark. CoRR abs/2405.08596 (2024) - [i113]Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Yuankun Xie, Yukun Liu, Xiaopeng Wang, Xuefei Liu, Yongwei Li, Jianhua Tao, Yi Lu, Xin Qi, Shuchen Shi:
Generalized Fake Audio Detection via Deep Stable Learning. CoRR abs/2406.03237 (2024) - [i112]Yuankun Xie, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Xiaopeng Wang, Haonan Cheng, Long Ye, Jianhua Tao:
Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy. CoRR abs/2406.03240 (2024) - [i111]Xiaopeng Wang, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Yuankun Xie, Yukun Liu, Jianhua Tao, Xuefei Liu, Yongwei Li, Xin Qi, Yi Lu, Shuchen Shi:
Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection. CoRR abs/2406.03247 (2024) - [i110]Shuchen Shi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Yi Lu, Xin Qi, Xuefei Liu, Yukun Liu, Yongwei Li, Zhiyong Wang, Xiaopeng Wang:
PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation. CoRR abs/2406.04683 (2024) - [i109]Junzuo Zhou, Jiangyan Yi, Tao Wang, Jianhua Tao, Ye Bai, Chu Yuan Zhang, Yong Ren, Zhengqi Wen:
TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking. CoRR abs/2406.04840 (2024) - [i108]Yujie Chen, Jiangyan Yi, Jun Xue, Chenglong Wang, Xiaohui Zhang, Shunbo Dong, Siding Zeng, Jianhua Tao, Zhao Lv, Cunhang Fan:
RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection. CoRR abs/2406.06086 (2024) - [i107]Yi Lu, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Zhiyong Wang, Xin Qi, Xuefei Liu, Yongwei Li, Yukun Liu, Xiaopeng Wang, Shuchen Shi:
Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio. CoRR abs/2406.08112 (2024) - [i106]Ruibo Fu, Shuchen Shi, Hongming Guo, Tao Wang, Chunyu Qiang, Zhengqi Wen, Jianhua Tao, Xin Qi, Yi Lu, Xiaopeng Wang, Zhiyong Wang, Yukun Liu, Xuefei Liu, Shuai Zhang, Guanjun Li:
MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation. CoRR abs/2406.10591 (2024) - [i105]Ruihan Jin, Ruibo Fu, Zhengqi Wen, Shuai Zhang, Yukun Liu, Jianhua Tao:
Fake News Detection and Manipulation Reasoning via Large Vision-Language Models. CoRR abs/2407.02042 (2024) - [i104]Ruibo Fu, Xin Qi, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Zhiyong Wang, Yi Lu, Xiaopeng Wang, Shuchen Shi, Yukun Liu, Xuefei Liu, Shuai Zhang:
ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation. CoRR abs/2407.05421 (2024) - [i103]Zheng Lian, Haiyang Sun, Licai Sun, Jiangyan Yi, Bin Liu, Jianhua Tao:
AffectGPT: Dataset and Framework for Explainable Multimodal Emotion Recognition. CoRR abs/2407.07653 (2024) - [i102]Siding Zeng, Jiangyan Yi, Jianhua Tao, Yujie Chen, Shan Liang, Yong Ren, Xiaohui Zhang:
An Unsupervised Domain Adaptation Method for Locating Manipulated Region in partially fake Audio. CoRR abs/2407.08239 (2024) - [i101]Cong Cai, Shan Liang, Xuefei Liu, Kang Zhu, Zhengqi Wen, Jianhua Tao, Heng Xie, Jizhou Cui, Yiming Ma, Zhenhua Cheng, Hanzhe Xu, Ruibo Fu, Bin Liu, Yongwei Li:
MDPE: A Multimodal Deception Dataset with Personality and Emotional Characteristics. CoRR abs/2407.12274 (2024) - [i100]Jiangyan Yi, Chu Yuan Zhang, Jianhua Tao, Chenglong Wang, Xinrui Yan, Yong Ren, Hao Gu, Junzuo Zhou:
ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild. CoRR abs/2408.04967 (2024) - [i99]Chunyu Qiang, Wang Geng, Yi Zhao, Ruibo Fu, Tao Wang, Cheng Gong, Tianrui Wang, Qiuyu Liu, Jiangyan Yi, Zhengqi Wen, Chen Zhang, Hao Che, Longbiao Wang, Jiangwu Dang, Jianhua Tao:
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing. CoRR abs/2408.05758 (2024) - [i98]Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Yukun Liu, Guanjun Li, Xin Qi, Yi Lu, Xuefei Liu, Yongwei Li:
A Noval Feature via Color Quantisation for Fake Audio Detection. CoRR abs/2408.10849 (2024) - [i97]Xin Qi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Shuchen Shi, Yi Lu, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Guanjun Li, Xuefei Liu, Yongwei Li:
EELE: Exploring Efficient and Extensible LoRA Integration in Emotional Text-to-Speech. CoRR abs/2408.10852 (2024) - [i96]Yuankun Xie, Chenxu Xiong, Xiaopeng Wang, Zhiyong Wang, Yi Lu, Xin Qi, Ruibo Fu, Yukun Liu, Zhengqi Wen, Jianhua Tao, Guanjun Li, Long Ye:
Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio? CoRR abs/2408.10853 (2024) - [i95]Moyang Liu, Yukun Liu, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Xuefei Liu, Guanjun Li:
Exploring the Role of Audio in Multimodal Misinformation Detection. CoRR abs/2408.12558 (2024) - [i94]Jinyang Wu, Feihu Che, Chuyuan Zhang, Jianhua Tao, Shuai Zhang, Pengpeng Shao:
Pandora's Box or Aladdin's Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models. CoRR abs/2408.13533 (2024) - [i93]Hao Gu, Jiangyan Yi, Chenglong Wang, Yong Ren, Jianhua Tao, Xinrui Yan, Yujie Chen, Xiaohui Zhang:
Utilizing Speaker Profiles for Impersonation Audio Detection. CoRR abs/2408.17009 (2024) - [i92]Chenxu Xiong, Ruibo Fu, Shuchen Shi, Zhengqi Wen, Jianhua Tao, Tao Wang, Chenxing Li, Chunyu Qiang, Yuankun Xie, Xin Qi, Guanjun Li, Zizheng Yang:
Text Prompt is Not Enough: Sound Event Enhanced Prompt Adapter for Target Style Audio Generation. CoRR abs/2409.09381 (2024) - [i91]Xin Qi, Ruibo Fu, Zhengqi Wen, Tao Wang, Chunyu Qiang, Jianhua Tao, Chenxing Li, Yi Lu, Shuchen Shi, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Xuefei Liu, Guanjun Li:
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech. CoRR abs/2409.11835 (2024) - [i90]Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Xiaopeng Wang, Yuankun Xie, Xin Qi, Shuchen Shi, Yi Lu, Yukun Liu, Chenxing Li, Xuefei Liu, Guanjun Li:
Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0. CoRR abs/2409.11909 (2024) - [i89]Junzuo Zhou, Jiangyan Yi, Yong Ren, Jianhua Tao, Tao Wang, Chu Yuan Zhang:
WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification. CoRR abs/2409.12121 (2024) - [i88]Zheng Lian, Haiyang Sun, Licai Sun, Lan Chen, Haoyu Chen, Hao Gu, Zhuofan Wen, Shun Chen, Siyuan Zhang, Hailiang Yao, Mingyu Xu, Kang Chen, Bin Liu, Rui Liu, Shan Liang, Ya Li, Jiangyan Yi, Jianhua Tao:
Open-vocabulary Multimodal Emotion Recognition: Dataset, Metric, and Benchmark. CoRR abs/2410.01495 (2024) - [i87]Sheng Yan, Cunhang Fan, Hongyu Zhang, Xiaoke Yang, Jianhua Tao, Zhao Lv:
DARNet: Dual Attention Refinement Network with Spatiotemporal Construction for Auditory Attention Detection. CoRR abs/2410.11181 (2024) - [i86]Haojie Zhang, Zhihao Liang, Ruibo Fu, Zhengqi Wen, Xuefei Liu, Chenxing Li, Jianhua Tao, Yaling Liang:
LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis. CoRR abs/2411.16748 (2024) - [i85]Jinyang Wu, Mingkuan Feng, Shuai Zhang, Feihu Che, Zengqi Wen, Jianhua Tao:
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS. CoRR abs/2411.18478 (2024) - [i84]Xinrui Yan, Jiangyan Yi, Jianhua Tao, Yujie Chen, Hao Gu, Guanjun Li, Junzuo Zhou, Yong Ren, Tao Xu:
Reject Threshold Adaptation for Open-Set Model Attribution of Deepfake Audio. CoRR abs/2412.01425 (2024) - [i83]Yujie Chen, Jiangyan Yi, Cunhang Fan, Jianhua Tao, Yong Ren, Siding Zeng, Chu Yuan Zhang, Xinrui Yan, Hao Gu, Jun Xue, Chenglong Wang, Zhao Lv, Xiaohui Zhang:
Region-Based Optimization in Continual Learning for Audio Deepfake Detection. CoRR abs/2412.11551 (2024) - [i82]Cunhang Fan, Enrui Liu, Andong Li, Jianhua Tao, Jian Zhou, Jiahao Li, Chengshi Zheng, Zhao Lv:
BSDB-Net: Band-Split Dual-Branch Network with Selective State Spaces Mechanism for Monaural Speech Enhancement. CoRR abs/2412.19099 (2024)
- 2023
- [j74]Pengpeng Shao, Jiayi He, Guanjun Li, Dawei Zhang, Jianhua Tao:
Hierarchical graph attention network for temporal knowledge graph reasoning. Neurocomputing 550: 126390 (2023) - [j73]Zepeng Huai, Dawei Zhang, Guohua Yang, Jianhua Tao:
Spatial-temporal knowledge graph network for event prediction. Neurocomputing 553: 126557 (2023) - [j72]Pengpeng Shao, Tong Liu, Feihu Che, Dawei Zhang, Jianhua Tao:
Adaptive pseudo-Siamese policy network for temporal knowledge prediction. Neural Networks 160: 192-201 (2023) - [j71]Zheng Lian, Lan Chen, Licai Sun, Bin Liu, Jianhua Tao:
GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation. IEEE Trans. Pattern Anal. Mach. Intell. 45(7): 8419-8432 (2023) - [j70]Andreas Triantafyllopoulos, Björn W. Schuller, Gökçe Iymen, Tevfik Metin Sezgin, Xiangheng He, Zijiang Yang, Panagiotis Tzirakis, Shuo Liu, Silvan Mertes, Elisabeth André, Ruibo Fu, Jianhua Tao:
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era. Proc. IEEE 111(10): 1355-1381 (2023) - [j69]Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengkun Tian, Cunhang Fan:
Transfer knowledge for punctuation prediction via adversarial training. Speech Commun. 149: 1-10 (2023) - [j68]Mingyue Niu, Jianhua Tao, Bin Liu, Jian Huang, Zheng Lian:
Multimodal Spatiotemporal Representation for Automatic Depression Level Detection. IEEE Trans. Affect. Comput. 14(1): 294-307 (2023) - [j67]Mingyue Niu, Ziping Zhao, Jianhua Tao, Ya Li, Björn W. Schuller:
Dual Attention and Element Recalibration Networks for Automatic Depression Level Prediction. IEEE Trans. Affect. Comput. 14(3): 1954-1965 (2023) - [j66]Zheng Lian, Bin Liu, Jianhua Tao:
SMIN: Semi-Supervised Multi-Modal Interaction Network for Conversational Emotion Recognition. IEEE Trans. Affect. Comput. 14(3): 2415-2429 (2023) - [j65]Jiangyan Yi, Jianhua Tao, Ruibo Fu, Tao Wang, Chu Yuan Zhang, Chenglong Wang:
Adversarial Multi-Task Learning for Mandarin Prosodic Boundary Prediction With Multi-Modal Embeddings. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2963-2973 (2023) - [c263]Junjie Chen, Yongwei Li, Ziping Zhao, Xuefei Liu, Zhengqi Wen, Jianhua Tao:
Hybrid Multi-Task Learning for End-To-End Multimodal Emotion Recognition. APSIPA ASC 2023: 1966-1971 - [c262]Yi Lu, Ruibo Fu, Xin Qi, Zhengqi Wen, Jianhua Tao, Jiangyan Yi, Tao Wang, Yong Ren, Chuyuan Zhang, Chenyu Yang, Wenling Shi:
The VIBVG Speech Synthesis System for Blizzard Challenge 2023. Blizzard Challenge 2023 - [c261]Xiaohui Zhang, Mangui Liang, Zhengkun Tian, Jiangyan Yi, Jianhua Tao:
TST: Time-Sparse Transducer for Automatic Speech Recognition. CICAI (2) 2023: 68-80 - [c260]Xiaohui Zhang, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Le Xu, Ruibo Fu:
Adaptive Fake Audio Detection with Low-Rank Model Squeezing. DADA@IJCAI 2023: 95-100 - [c259]Chenglong Wang, Jiangyan Yi, Xiaohui Zhang, Jianhua Tao, Xinrui Yan, Le Xu, Ruibo Fu:
Low-rank Adaptation Method for Wav2vec2-based Fake Audio Detection. DADA@IJCAI 2023: 101-106 - [c258]Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li:
ADD 2023: the Second Audio Deepfake Detection Challenge. DADA@IJCAI 2023: 125-130 - [c257]Guanjun Li, Wei Xue, Wenju Liu, Jiangyan Yi, Jianhua Tao:
GCC-Speaker: Target Speaker Localization with Optimal Speaker-Dependent Weighting in Multi-Speaker Scenarios. ICASSP 2023: 1-5 - [c256]Jinlong Xue, Yayue Deng, Fengping Wang, Ya Li, Yingming Gao, Jianhua Tao, Jianqing Sun, Jiaen Liang:
M2-CTTS: End-to-End Multi-Scale Multi-Modal Conversational Text-to-Speech Synthesis. ICASSP 2023: 1-5 - [c255]Xiaohui Zhang, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Chu Yuan Zhang:
Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection. ICML 2023: 41819-41831 - [c254]Zepeng Huai, Guohua Yang, Jianhua Tao, Dawei Zhang:
Learning Item Attributes and User Interests for Knowledge Graph Enhanced Recommendation. ICONIP (4) 2023: 284-297 - [c253]Ruiteng Zhang, Jianguo Wei, Xugang Lu, Yongwei Li, Junhai Xu, Di Jin, Jianhua Tao:
SOT: Self-supervised Learning-Assisted Optimal Transport for Unsupervised Adaptive Speech Emotion Recognition. INTERSPEECH 2023: 1858-1862 - [c252]Chenglong Wang, Jiangyan Yi, Jianhua Tao, Chu Yuan Zhang, Shuai Zhang, Ruibo Fu, Xun Chen:
TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection. INTERSPEECH 2023: 3137-3141 - [c251]Haiyang Sun, Zheng Lian, Bin Liu, Ying Li, Jianhua Tao, Licai Sun, Cong Cai, Meng Wang, Yuan Cheng:
EmotionNAS: Two-stream Neural Architecture Search for Speech Emotion Recognition. INTERSPEECH 2023: 3597-3601 - [c250]Chenglong Wang, Jiangyan Yi, Jianhua Tao, Chu Yuan Zhang, Shuai Zhang, Xun Chen:
Detection of Cross-Dataset Fake Audio Based on Prosodic and Pronunciation Features. INTERSPEECH 2023: 3844-3848 - [c249]Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao:
MAE-DFER: Efficient Masked Autoencoder for Self-supervised Dynamic Facial Expression Recognition. ACM Multimedia 2023: 6110-6121 - [c248]Ke Xu, Kang Chen, Licai Sun, Zheng Lian, Bin Liu, Gong Chen, Haiyang Sun, Mingyu Xu, Jianhua Tao:
Integrating VideoMAE based model and Optical Flow for Micro- and Macro-expression Spotting. ACM Multimedia 2023: 9576-9580 - [c247]Zheng Lian, Haiyang Sun, Licai Sun, Kang Chen, Mingyu Xu, Kexin Wang, Ke Xu, Yu He, Ying Li, Jinming Zhao, Ye Liu, Bin Liu, Jiangyan Yi, Meng Wang, Erik Cambria, Guoying Zhao, Björn W. Schuller, Jianhua Tao:
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning. ACM Multimedia 2023: 9610-9614 - [c246]Zheng Lian, Erik Cambria, Guoying Zhao, Björn W. Schuller, Jianhua Tao:
MRAC'23: 1st International Workshop on Multimodal and Responsible Affective Computing. ACM Multimedia 2023: 9713-9714 - [c245]Guofeng Yi, Yuguang Yang, Yu Pan, Yuhang Cao, Jixun Yao, Xiang Lv, Cunhang Fan, Zhao Lv, Jianhua Tao, Shan Liang, Heng Lu:
Exploring the Power of Cross-Contextual Large Language Model in Mimic Emotion Prediction. MuSe@ACM Multimedia 2023: 19-26 - [c244]Heng Xie, Jizhou Cui, Yuhang Cao, Junjie Chen, Jianhua Tao, Cunhang Fan, Xuefei Liu, Zhengqi Wen, Heng Lu, Yuguang Yang, Zhao Lv, Yongwei Li:
Multimodal Cross-Lingual Features and Weight Fusion for Cross-Cultural Humor Detection. MuSe@ACM Multimedia 2023: 51-57 - [c243]Haiyang Sun, Zhuofan Wen, Mingyu Xu, Zheng Lian, Licai Sun, Bin Liu, Jianhua Tao:
Exclusive Modeling for MuSe-Personalisation Challenge. MuSe@ACM Multimedia 2023: 73-80 - [c242]Mingyu Xu, Zheng Lian, Lei Feng, Bin Liu, Jianhua Tao:
ALIM: Adjusting Label Importance Mechanism for Noisy Partial Label Learning. NeurIPS 2023 - [c241]Mingyu Xu, Zheng Lian, Bin Liu, Jianhua Tao:
VRA: Variational Rectified Activation for Out-of-distribution Detection. NeurIPS 2023 - [e6]Jianhua Tao, Haizhou Li, Jiangyan Yi, Cunhang Fan:
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32th International Joint Conference on Artificial Intelligence (IJCAI 2023), Macao, China, August 19, 2023. CEUR Workshop Proceedings 3597, CEUR-WS.org 2023 - [i81]Haogeng Liu, Tao Wang, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Jianhua Tao:
UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion. CoRR abs/2301.03801 (2023) - [i80]Mingyu Xu, Zheng Lian, Lei Feng, Bin Liu, Jianhua Tao:
DALI: Dynamically Adjusted Label Importance for Noisy Partial Label Learning. CoRR abs/2301.12077 (2023) - [i79]Zheng Lian, Haiyang Sun, Licai Sun, Jinming Zhao, Ye Liu, Bin Liu, Jiangyan Yi, Meng Wang, Erik Cambria, Guoying Zhao, Björn W. Schuller, Jianhua Tao:
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning. CoRR abs/2304.08981 (2023) - [i78]Jinlong Xue, Yayue Deng, Fengping Wang, Ya Li, Yingming Gao, Jianhua Tao, Jianqing Sun, Jiaen Liang:
M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis. CoRR abs/2305.02269 (2023) - [i77]Chenglong Wang, Jiangyan Yi, Jianhua Tao, Chu Yuan Zhang, Shuai Zhang, Xun Chen:
Detection of Cross-Dataset Fake Audio Based on Prosodic and Pronunciation Features. CoRR abs/2305.13700 (2023) - [i76]Chenglong Wang, Jiangyan Yi, Jianhua Tao, Chuyuan Zhang, Shuai Zhang, Ruibo Fu, Xun Chen:
TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection. CoRR abs/2305.13701 (2023) - [i75]Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li:
ADD 2023: the Second Audio Deepfake Detection Challenge. CoRR abs/2305.13774 (2023) - [i74]Xiaohui Zhang, Jiangyan Yi, Jianhua Tao, Chenlong Wang, Le Xu, Ruibo Fu:
Adaptive Fake Audio Detection with Low-Rank Model Squeezing. CoRR abs/2306.04956 (2023) - [i73]Chenglong Wang, Jiangyan Yi, Xiaohui Zhang, Jianhua Tao, Le Xu, Ruibo Fu:
Low-rank Adaptation Method for Wav2vec2-based Fake Audio Detection. CoRR abs/2306.05617 (2023) - [i72]Haogeng Liu, Tao Wang, Jie Cao, Ran He, Jianhua Tao:
Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion. CoRR abs/2306.05708 (2023) - [i71]Zheng Lian, Licai Sun, Mingyu Xu, Haiyang Sun, Ke Xu, Zhuofan Wen, Shun Chen, Bin Liu, Jianhua Tao:
Explainable Multimodal Emotion Reasoning. CoRR abs/2306.15401 (2023) - [i70]Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao:
MAE-DFER: Efficient Masked Autoencoder for Self-supervised Dynamic Facial Expression Recognition. CoRR abs/2307.02227 (2023) - [i69]Xiaohui Zhang, Mangui Liang, Zhengkun Tian, Jiangyan Yi, Jianhua Tao:
TST: Time-Sparse Transducer for Automatic Speech Recognition. CoRR abs/2307.08323 (2023) - [i68]Xiaohui Zhang, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Chuyuan Zhang:
Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection. CoRR abs/2308.03300 (2023) - [i67]Cunhang Fan, Jun Xue, Jianhua Tao, Jiangyan Yi, Chenglong Wang, Chengshi Zheng, Zhao Lv:
Spatial Reconstructed Local Attention Res2Net with F0 Subband for Fake Speech Detection. CoRR abs/2308.09944 (2023) - [i66]Jiangyan Yi, Chenglong Wang, Jianhua Tao, Xiaohui Zhang, Chu Yuan Zhang, Yan Zhao:
Audio Deepfake Detection: A Survey. CoRR abs/2308.14970 (2023) - [i65]Chu Yuan Zhang, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Xinrui Yan:
Distinguishing Neural Speech Synthesis Models Through Fingerprints in Speech Waveforms. CoRR abs/2309.06780 (2023) - [i64]Cunhang Fan, Hongyu Zhang, Wei Huang, Jun Xue, Jianhua Tao, Jiangyan Yi, Zhao Lv, Xiaopei Wu:
DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection. CoRR abs/2309.07147 (2023) - [i63]Le Xu, Jiangyan Yi, Jianhua Tao, Tao Wang, Yong Ren, Rongxiu Zhong:
Controllable Residual Speaker Representation for Voice Conversion. CoRR abs/2309.08166 (2023) - [i62]Yong Ren, Tao Wang, Jiangyan Yi, Le Xu, Jianhua Tao, Chuyuan Zhang, Junzuo Zhou:
Fewer-token Neural Speech Codec with Time-invariant Codes. CoRR abs/2310.00014 (2023) - [i61]Cunhang Fan, Mingming Ding, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Zhao Lv:
Learning to Behave Like Clean Speech: Dual-Branch Knowledge Distillation for Noise-Robust Fake Audio Detection. CoRR abs/2310.08869 (2023) - [i60]Zheng Lian, Licai Sun, Haiyang Sun, Kang Chen, Zhuofan Wen, Hao Gu, Shun Chen, Bin Liu, Jianhua Tao:
GPT-4V with Emotion: A Zero-shot Benchmark for Multimodal Emotion Understanding. CoRR abs/2312.04293 (2023) - [i59]Xiaohui Zhang, Jiangyan Yi, Chenglong Wang, Chuyuan Zhang, Siding Zeng, Jianhua Tao:
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection. CoRR abs/2312.09651 (2023) - [i58]Haiyang Sun, Zheng Lian, Licai Sun, Bin Liu, Jianhua Tao:
RMNAS: A Multimodal Neural Architecture Search Framework For Robust Multimodal Sentiment Analysis. CoRR abs/2312.15583 (2023) - 2022
- [j64]Ke Xu, Bin Liu, Jianhua Tao, Zhao Lv, Cunhang Fan, Leichao Song:
AHRNN: Attention-Based Hybrid Robust Neural Network for emotion recognition. Cogn. Comput. Syst. 4(1): 85-95 (2022) - [j63]Björn W. Schuller, Yonina C. Eldar, Maja Pantic, Shrikanth Narayanan, Tuomas Virtanen, Jianhua Tao:
Editorial: Intelligent Signal Analysis for Contagious Virus Diseases. IEEE J. Sel. Top. Signal Process. 16(2): 159-163 (2022) - [j62]Haishuai Wang, Guangyu Tao, Jiali Ma, Shangru Jia, Lianhua Chi, Hong Yang, Ziping Zhao, Jianhua Tao:
Predicting the Epidemics Trend of COVID-19 Using Epidemiological-Based Generative Adversarial Networks. IEEE J. Sel. Top. Signal Process. 16(2): 276-288 (2022) - [j61]Pengpeng Shao, Dawei Zhang, Guohua Yang, Jianhua Tao, Feihu Che, Tong Liu:
Tucker decomposition-based temporal knowledge graph completion. Knowl. Based Syst. 238: 107841 (2022) - [j60]Wenhuan Lu, Xinyue Zhao, Na Guo, Yongwei Li, Jianguo Wei, Jianhua Tao, Jianwu Dang:
One-shot emotional voice conversion based on feature separation. Speech Commun. 143: 1-9 (2022) - [j59]Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
Hybrid Autoregressive and Non-Autoregressive Transformer Models for Speech Recognition. IEEE Signal Process. Lett. 29: 762-766 (2022) - [j58]Xiao Sun, Jingyuan Li, Jianhua Tao:
Emotional Conversation Generation Orientated Syntactically Constrained Bidirectional-Asynchronous Framework. IEEE Trans. Affect. Comput. 13(1): 187-198 (2022) - [j57]Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen:
NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation. IEEE ACM Trans. Audio Speech Lang. Process. 30: 865-878 (2022) - [j56]Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen:
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing. IEEE ACM Trans. Audio Speech Lang. Process. 30: 2241-2254 (2022) - [j55]Mingyue Niu, Ziping Zhao, Jianhua Tao, Ya Li, Björn W. Schuller:
Selective Element and Two Orders Vectorization Networks for Automatic Depression Severity Diagnosis via Facial Changes. IEEE Trans. Circuits Syst. Video Technol. 32(11): 8065-8077 (2022) - [c240]Chao Shen, Jianhua Tao, Peng Li, Zhao Lv, Guohua Yang:
Joint Event Extraction Based on CNN-BiGRU and Attention Mechanism. CACML 2022: 492-497 - [c239]Haiyang Sun, Zheng Lian, Bin Liu, Jianhua Tao, Licai Sun, Cong Cai, Yu He:
Two-Aspect Information Interaction Model for ABAW4 Multi-task Challenge. ECCV Workshops (6) 2022: 173-180 - [c238]Tao Wang, Jiangyan Yi, Liqun Deng, Ruibo Fu, Jianhua Tao, Zhengqi Wen:
Context-Aware Mask Prediction Network for End-to-End Text-Based Speech Editing. ICASSP 2022: 6082-6086 - [c237]Ya Li, Mingyue Niu, Ziping Zhao, Jianhua Tao:
Automatic Depression Level Assessment from Speech By Long-Term Global Information Embedding. ICASSP 2022: 8507-8511 - [c236]Cong Cai, Bin Liu, Jianhua Tao, Zhengkun Tian, Jiahao Lu, Kexin Wang:
End-to-End Network Based on Transformer for Automatic Detection of Covid-19. ICASSP 2022: 9082-9086 - [c235]Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li:
ADD 2022: the first Audio Deep Synthesis Detection Challenge. ICASSP 2022: 9216-9220 - [c234]Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Jianhua Tao, Yu Ting Yeung, Liqun Deng:
Reducing multilingual context confusion for end-to-end code-switching automatic speech recognition. INTERSPEECH 2022: 3894-3898 - [c233]Jiahui Pan, Shuai Nie, Hui Zhang, Shulin He, Kanghao Zhang, Shan Liang, Xueliang Zhang, Jianhua Tao:
Speaker recognition-assisted robust audio deepfake detection. INTERSPEECH 2022: 4202-4206 - [c232]Shuai Nie, Shan Liang, Zhanlei Yang, Longshuai Xiao, Wenju Liu, Jianhua Tao:
Masking-based Neural Beamformer for Multichannel Speech Enhancement. ISCSLP 2022: 125-129 - [c231]Jiahao Lu, Bin Liu, Zheng Lian, Cong Cai, Jianhua Tao, Ziping Zhao:
Prediction of Depression Severity Based on Transformer Encoder and CNN Model. ISCSLP 2022: 339-343 - [c230]Jun Xue, Cunhang Fan, Zhao Lv, Jianhua Tao, Jiangyan Yi, Chengshi Zheng, Zhengqi Wen, Minmin Yuan, Shegang Shao:
Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features. DDAM@MM 2022: 19-26 - [c229]Chenglong Wang, Jiangyan Yi, Jianhua Tao, Haiyang Sun, Xun Chen, Zhengkun Tian, Haoxin Ma, Cunhang Fan, Ruibo Fu:
Fully Automated End-to-End Fake Audio Detection. DDAM@MM 2022: 27-33 - [c228]Tao Wang, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Jianhua Tao:
Singing-Tacotron: Global Duration Control Attention and Dynamic Filter for End-to-end Singing Voice Synthesis. DDAM@MM 2022: 53-59 - [c227]Yu He, Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao, Meng Wang, Yuan Cheng:
Multimodal Temporal Attention in Sentiment Analysis. MuSe @ ACM Multimedia 2022: 61-66 - [c226]Xinrui Yan, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Haoxin Ma, Tao Wang, Shiming Wang, Ruibo Fu:
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio. DDAM@MM 2022: 61-68 - [c225]Kexin Wang, Zheng Lian, Licai Sun, Bin Liu, Jianhua Tao, Yin Fan:
Emotional Reaction Analysis based on Multi-Label Graph Convolutional Networks and Dynamic Facial Expression Recognition Transformer. MuSe @ ACM Multimedia 2022: 75-80 - [c224]Jianhua Tao, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi:
DDAM '22: 1st International Workshop on Deepfake Detection for Audio Multimedia. ACM Multimedia 2022: 7405-7406 - [e5]Jianhua Tao, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Lian, Pengyuan Zhang:
DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, Lisboa, Portugal, 14 October 2022. ACM 2022, ISBN 978-1-4503-9496-3 [contents] - [i57]Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Jianhua Tao, Yu Ting Yeung, Liqun Deng:
Reducing language context confusion for end-to-end code-switching automatic speech recognition. CoRR abs/2201.12155 (2022) - [i56]Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen:
Singing-Tacotron: Global duration control attention and dynamic filter for End-to-end singing voice synthesis. CoRR abs/2202.07907 (2022) - [i55]Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li, Zheng Lian, Bin Liu:
ADD 2022: the First Audio Deep Synthesis Detection Challenge. CoRR abs/2202.08433 (2022) - [i54]Feihu Che, Guohua Yang, Pengpeng Shao, Dawei Zhang, Jianhua Tao:
MixKG: Mixing for harder negative samples in knowledge graph. CoRR abs/2202.09606 (2022) - [i53]Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen:
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing. CoRR abs/2202.09950 (2022) - [i52]Zheng Lian, Lan Chen, Licai Sun, Bin Liu, Jianhua Tao:
GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation. CoRR abs/2203.02177 (2022) - [i51]Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen:
NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation. CoRR abs/2203.02678 (2022) - [i50]Haiyang Sun, Zheng Lian, Bin Liu, Ying Li, Licai Sun, Cong Cai, Jianhua Tao, Meng Wang, Yuan Cheng:
EmotionNAS: Two-stream Architecture Search for Speech Emotion Recognition. CoRR abs/2203.13617 (2022) - [i49]Pengpeng Shao, Tong Liu, Feihu Che, Dawei Zhang, Jianhua Tao:
Adaptive Pseudo-Siamese Policy Network for Temporal Knowledge Prediction. CoRR abs/2204.12036 (2022) - [i48]Haiyang Sun, Zheng Lian, Bin Liu, Jianhua Tao, Licai Sun, Cong Cai:
Two-Aspect Information Fusion Model For ABAW4 Multi-task Challenge. CoRR abs/2207.11389 (2022) - [i47]Jun Xue, Cunhang Fan, Zhao Lv, Jianhua Tao, Jiangyan Yi, Chengshi Zheng, Zhengqi Wen, Minmin Yuan, Shegang Shao:
Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features. CoRR abs/2208.01214 (2022) - [i46]Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao:
Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis. CoRR abs/2208.07589 (2022) - [i45]Chenglong Wang, Jiangyan Yi, Jianhua Tao, Haiyang Sun, Xun Chen, Zhengkun Tian, Haoxin Ma, Cunhang Fan, Ruibo Fu:
Fully Automated End-to-End Fake Audio Detection. CoRR abs/2208.09618 (2022) - [i44]Xinrui Yan, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Haoxin Ma, Tao Wang, Shiming Wang, Ruibo Fu:
An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio. CoRR abs/2208.09646 (2022) - [i43]Xinrui Yan, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Haoxin Ma, Zhengkun Tian, Ruibo Fu:
System Fingerprints Detection for DeepFake Audio: An Initial Dataset and Investigation. CoRR abs/2208.10489 (2022) - [i42]Andreas Triantafyllopoulos, Björn W. Schuller, Gökçe Iymen, Tevfik Metin Sezgin, Xiangheng He, Zijiang Yang, Panagiotis Tzirakis, Shuo Liu, Silvan Mertes, Elisabeth André, Ruibo Fu, Jianhua Tao:
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era. CoRR abs/2210.03538 (2022) - [i41]Chunyu Qiang, Jianhua Tao, Ruibo Fu, Zhengqi Wen, Jiangyan Yi, Tao Wang, Shiming Wang:
Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS. CoRR abs/2210.11429 (2022) - [i40]Zheng Lian, Mingyu Xu, Lan Chen, Licai Sun, Bin Liu, Jianhua Tao:
ARNet: Automatic Refinement Network for Noisy Partial Label Learning. CoRR abs/2211.04774 (2022) - [i39]Yan Zhao, Jiangyan Yi, Jianhua Tao, Chenglong Wang, Chu Yuan Zhang, Tao Wang, Yongfeng Dong:
EmoFake: An Initial Dataset for Emotion Fake Audio Detection. CoRR abs/2211.05363 (2022) - [i38]Jiangyan Yi, Chenglong Wang, Jianhua Tao, Zhengkun Tian, Cunhang Fan, Haoxin Ma, Ruibo Fu:
SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection. CoRR abs/2211.06073 (2022) - [i37]Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen, Chu Yuan Zhang:
Emotion Selectable End-to-End Text-based Speech Editing. CoRR abs/2212.10191 (2022) - 2021
- [j54]Jianhua Tao, Jian Huang, Ya Li, Zheng Lian, Mingyue Niu:
Correction to: Semi-supervised Ladder Networks for Speech Emotion Recognition. Int. J. Autom. Comput. 18(4): 680 (2021) - [j53]Mingyue Niu, Bin Liu, Jianhua Tao, Qifei Li:
A time-frequency channel attention and vectorization network for automatic depression level prediction. Neurocomputing 450: 208-218 (2021) - [j52]Zheng Lian, Bin Liu, Jianhua Tao:
DECN: Dialogical emotion correction network for conversational emotion recognition. Neurocomputing 454: 483-495 (2021) - [j51]Feihu Che, Guohua Yang, Dawei Zhang, Jianhua Tao, Tong Liu:
Self-supervised graph representation learning via bootstrapping. Neurocomputing 456: 88-96 (2021) - [j50]Feihu Che, Jianhua Tao, Guohua Yang, Tong Liu, Dawei Zhang:
Multi-aspect self-supervised learning for heterogeneous information network. Knowl. Based Syst. 233: 107474 (2021) - [j49]Ziping Zhao, Qifei Li, Zixing Zhang, Nicholas Cummins, Haishuai Wang, Jianhua Tao, Björn W. Schuller:
Combining a parallel 2D CNN with a self-attention Dilated Residual Network for CTC-based discrete speech emotion recognition. Neural Networks 141: 52-60 (2021) - [j48]Shan Liang, Guanjun Li, Shuai Nie, Zhanlei Yang, Wenju Liu, Jianhua Tao:
Exploiting the directional coherence function for multichannel source extraction. Speech Commun. 128: 1-14 (2021) - [j47]Björn W. Schuller, Rosalind W. Picard, Elisabeth André, Jonathan Gratch, Jianhua Tao:
Intelligent Signal Processing for Affective Computing [From the Guest Editors]. IEEE Signal Process. Mag. 38(6): 9-11 (2021) - [j46]Jing Han, Zixing Zhang, Cecilia Mascolo, Elisabeth André, Jianhua Tao, Ziping Zhao, Björn W. Schuller:
Deep Learning for Mobile Mental Health: Challenges and recent advances. IEEE Signal Process. Mag. 38(6): 96-105 (2021) - [j45]Cunhang Fan, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Bin Liu, Zhengqi Wen:
Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 29: 198-209 (2021) - [j44]Zheng Lian, Bin Liu, Jianhua Tao:
CTNet: Conversational Transformer Network for Emotion Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 29: 985-1000 (2021) - [j43]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Zhengkun Tian, Shuai Zhang:
Integrating Knowledge Into End-to-End Speech Recognition From External Text-Only Data. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1340-1351 (2021) - [j42]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang:
Fast End-to-End Speech Recognition Via Non-Autoregressive Models and Cross-Modal Knowledge Transferring From BERT. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1897-1911 (2021) - [j41]Yongwei Li, Jianhua Tao, Donna Erickson, Bin Liu, Masato Akagi:
$F_0$-Noise-Robust Glottal Source and Vocal Tract Analysis Based on ARX-LF Model. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3375-3383 (2021) - [j40]Xiao Sun, Zhengmeng Pei, Chen Zhang, Guoqiang Li, Jianhua Tao:
Design and Analysis of a Human-Machine Interaction System for Researching Human's Dynamic Emotion. IEEE Trans. Syst. Man Cybern. Syst. 51(10): 6111-6121 (2021) - [j39]Hang Pan, Lun Xie, Zhiliang Wang, Bin Liu, Minghao Yang, Jianhua Tao:
Review of micro-expression spotting and recognition in video sequences. Virtual Real. Intell. Hardw. 3(1): 1-17 (2021) - [j38]Ziping Zhao, Zhongtian Bao, Zixing Zhang, Nicholas Cummins, Shihuang Sun, Haishuai Wang, Jianhua Tao, Björn W. Schuller:
Self-attention transfer networks for speech emotion recognition. Virtual Real. Intell. Hardw. 3(1): 43-54 (2021) - [j37]Jian Huang, Bin Liu, Jianhua Tao:
Learning long-term temporal contexts using skip RNN for continuous emotion recognition. Virtual Real. Intell. Hardw. 3(1): 55-64 (2021) - [j36]Jianhua Tao:
Emotion recognition for human-computer interaction. Virtual Real. Intell. Hardw. 3(1): iii-iv (2021) - [c223]Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
One In A Hundred: Selecting the Best Predicted Sequence from Numerous Candidates for Speech Recognition. APSIPA ASC 2021: 454-459 - [c222]Mingyue Niu, Jianhua Tao, Bin Liu:
Multi-Scale and Multi-Region Facial Discriminative Representation for Automatic Depression Level Prediction. ICASSP 2021: 1325-1329 - [c221]Licai Sun, Bin Liu, Jianhua Tao, Zheng Lian:
Multimodal Cross- and Self-Attention Network for Speech Emotion Recognition. ICASSP 2021: 4275-4279 - [c220]Shiming Wang, Zhenhua Ling, Ruibo Fu, Jiangyan Yi, Jianhua Tao:
Patnet : A Phoneme-Level Autoregressive Transformer Network for Speech Synthesis. ICASSP 2021: 5684-5688 - [c219]Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Ye Bai, Jianhua Tao, Zhengqi Wen:
Decoupling Pronunciation and Language for End-to-End Code-Switching Automatic Speech Recognition. ICASSP 2021: 6249-6253 - [c218]Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Tao Wang, Chunyu Qiang:
Bi-Level Style and Prosody Decoupling Modeling for Personalized End-to-End Speech Synthesis. ICASSP 2021: 6568-6572 - [c217]Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Chunyu Qiang, Shiming Wang:
Prosody and Voice Factorization for Few-Shot Speaker Adaptation in the Challenge M2voc 2021. ICASSP 2021: 8603-8607 - [c216]Hao Zhang, Bin Liu, Jianhua Tao, Zhao Lv:
Facial Micro-Expression Recognition Based on Multi-Scale Temporal and Spatial Features. ICMI Companion 2021: 80-84 - [c215]Dongyan Huang, Björn W. Schuller, Jianhua Tao, Lei Xie, Jie Yang:
ASMMC21: The 6th International Workshop on Affective Social Multimedia Computing. ICMI 2021: 864-867 - [c214]Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Ye Bai, Jianhua Tao, Xuefei Liu, Zhengqi Wen:
End-to-End Spelling Correction Conditioned on Acoustic Feature for Code-Switching Speech Recognition. Interspeech 2021: 266-270 - [c213]Haoxin Ma, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengkun Tian, Chenglong Wang:
Continual Learning for Fake Audio Detection. Interspeech 2021: 886-890 - [c212]Jiangyan Yi, Ye Bai, Jianhua Tao, Haoxin Ma, Zhengkun Tian, Chenglong Wang, Tao Wang, Ruibo Fu:
Half-Truth: A Partially Fake Audio Detection Dataset. Interspeech 2021: 1654-1658 - [c211]Cong Cai, Mingyue Niu, Bin Liu, Jianhua Tao, Xuefei Liu:
TDCA-Net: Time-Domain Channel Attention Network for Depression Detection. Interspeech 2021: 2511-2515 - [c210]Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization. Interspeech 2021: 4034-4038 - [c209]Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen, Leichao Song:
Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning. ISCSLP 2021: 1-5 - [c208]Zheng Lian, Rongxiu Zhong, Zhengqi Wen, Bin Liu, Jianhua Tao:
Towards Fine-Grained Prosody Control for Voice Conversion. ISCSLP 2021: 1-5 - [c207]Chunyu Qiang, Jianhua Tao, Ruibo Fu, Zhengqi Wen, Jiangyan Yi, Tao Wang, Shiming Wang:
Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS. ISCSLP 2021: 1-5 - [c206]Chenglong Wang, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengkun Tian:
Hierarchically Attending Time-Frequency and Channel Features for Improving Speaker Verification. ISCSLP 2021: 1-5 - [c205]Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Jianhua Tao, Ye Bai:
Rnn-transducer With Language Bias For End-to-end Mandarin-English Code-switching Speech Recognition. ISCSLP 2021: 1-5 - [c204]Licai Sun, Mingyu Xu, Zheng Lian, Bin Liu, Jianhua Tao, Meng Wang, Yuan Cheng:
Multimodal Emotion Recognition and Sentiment Analysis via Attention Enhanced Recurrent Model. MuSe @ ACM Multimedia 2021: 15-20 - [c203]Cong Cai, Yu He, Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao, Mingyu Xu, Kexin Wang:
Multimodal Sentiment Analysis based on Recurrent Neural Network and Multimodal Attention. MuSe @ ACM Multimedia 2021: 61-67 - [c202]Xuefei Liu, Jianhua Tao, Yurong Han, Chenglong Wang, Xueying Zheng, Zhengqi Wen:
Which Phonemes Will Distinguish the Different Regions Within the Same Dialect? O-COCOSDA 2021: 152-157 - [i36]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang:
Fast End-to-End Speech Recognition via a Non-Autoregressive Model and Cross-Modal Knowledge Transferring from BERT. CoRR abs/2102.07594 (2021) - [i35]Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen, Xuefei Liu:
TSNAT: Two-Step Non-Autoregressvie Transformer Models for Speech Recognition. CoRR abs/2104.01522 (2021) - [i34]Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization. CoRR abs/2104.02882 (2021) - [i33]Jiangyan Yi, Ye Bai, Jianhua Tao, Zhengkun Tian, Chenglong Wang, Tao Wang, Ruibo Fu:
Half-Truth: A Partially Fake Audio Detection Dataset. CoRR abs/2104.03617 (2021) - [i32]Haoxin Ma, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengkun Tian, Chenglong Wang:
Continual Learning for Fake Audio Detection. CoRR abs/2104.07286 (2021) - [i31]Pengpeng Shao, Tong Liu, Dawei Zhang, Jianhua Tao, Feihu Che, Guohua Yang:
Multi-Level Graph Contrastive Learning. CoRR abs/2107.02639 (2021) - [i30]Zepeng Huai, Jianhua Tao, Feihu Che, Guohua Yang, Dawei Zhang:
Knowledge graph enhanced recommender system. CoRR abs/2112.09425 (2021) - 2020
- [j35]Zheng Lian, Ya Li, Jianhua Tao, Jian Huang, Mingyue Niu:
Expression Analysis Based on Face Regions in Real-world Conditions. Int. J. Autom. Comput. 17(1): 96-107 (2020) - [j34]Xiao Sun, Jia Li, Xing Wei, Changliang Li, Jianhua Tao:
Emotional editing constraint conversation content generation based on reinforcement learning. Inf. Fusion 56: 70-80 (2020) - [j33]Ziping Zhao, Zhongtian Bao, Zixing Zhang, Jun Deng, Nicholas Cummins, Haishuai Wang, Jianhua Tao, Björn W. Schuller:
Automatic Assessment of Depression From Speech via a Hierarchical Attention Transfer Network and Attention Autoencoders. IEEE J. Sel. Top. Signal Process. 14(2): 423-434 (2020) - [j32]Bocheng Zhao, Jianhua Tao, Minghao Yang, Zhengkun Tian, Cunhang Fan, Ye Bai:
Deep imitator: Handwriting calligraphy imitation via deep attention networks. Pattern Recognit. 104: 107080 (2020) - [j31]Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen, Xuefei Liu:
End-to-End Post-Filter for Speech Separation With Deep Attention Fusion Features. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1303-1314 (2020) - [j30]Xiao Sun, Jia Li, Xing Wei, Changliang Li, Jianhua Tao:
Emotional Conversation Generation Based on a Bayesian Deep Neural Network. ACM Trans. Inf. Syst. 38(1): 8:1-8:24 (2020) - [j29]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Cunhang Fan:
A Public Chinese Dataset for Language Model Adaptation. J. Signal Process. Syst. 92(8): 839-851 (2020) - [c201]Feihu Che, Dawei Zhang, Jianhua Tao, Mingyue Niu, Bocheng Zhao:
ParamE: Regarding Neural Network Parameters as Relation Embeddings for Knowledge Graph Completion. AAAI 2020: 2774-2781 - [c200]Wenxiang She, Zhao Lv, Jianhua Tao, Bin Liu, Mingyue Niu:
Micro-Expression Recognition Based on Multiple Aggregation Networks. APSIPA 2020: 1043-1047 - [c199]Tao Wang, Jianhua Tao, Ruibo Fu, Zhengqi Wen, Chunyu Qiang:
The NLPR Speech Synthesis entry for Blizzard Challenge 2020. Blizzard Challenge / Voice Conversion Challenge 2020 - [c198]Zheng Lian, Jianhua Tao, Zhengqi Wen, Rongxiu Zhong:
CASIA Voice Conversion System for the Voice Conversion Challenge 2020. Blizzard Challenge / Voice Conversion Challenge 2020 - [c197]Jian Huang, Jianhua Tao, Bin Liu, Zheng Lian, Mingyue Niu:
Multimodal Transformer Fusion for Continuous Emotion Recognition. ICASSP 2020: 3507-3511 - [c196]Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Tao Wang:
Focusing on Attention: Prosody Transfer and Adaptative Optimization Strategy for Multi-Speaker End-to-End Speech Synthesis. ICASSP 2020: 6709-6713 - [c195]Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
Synchronous Transformers for end-to-end Speech Recognition. ICASSP 2020: 7884-7888 - [c194]Ke Xu, Bin Liu, Jianhua Tao, Zhao Lv, Qifei Li, Cunhang Fan:
AMINN: Attention-Based Multi-Information Neural Network for Emotion Recognition. ICCPR 2020: 56-62 - [c193]Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang, Zhanlei Yang, Rongjun Li:
Context-Dependent Domain Adversarial Neural Network for Multimodal Emotion Recognition. INTERSPEECH 2020: 394-398 - [c192]Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Ye Bai, Cunhang Fan:
Focal Loss for Punctuation Prediction. INTERSPEECH 2020: 721-725 - [c191]Tao Wang, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Rongxiu Zhong:
Spoken Content and Voice Factorization for Few-Shot Speaker Adaptation. INTERSPEECH 2020: 796-800 - [c190]Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang, Zhanlei Yang, Rongjun Li:
Conversational Emotion Recognition Using Self-Attention Mechanisms and Graph Neural Networks. INTERSPEECH 2020: 2347-2351 - [c189]Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Chunyu Qiang, Tao Wang:
Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis. INTERSPEECH 2020: 2937-2941 - [c188]Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen:
Gated Recurrent Fusion of Spatial and Spectral Features for Multi-Channel Speech Separation with Deep Embedding Representations. INTERSPEECH 2020: 3321-3325 - [c187]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang:
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition. INTERSPEECH 2020: 3381-3385 - [c186]Tao Wang, Xuefei Liu, Jianhua Tao, Jiangyan Yi, Ruibo Fu, Zhengqi Wen:
Non-Autoregressive End-to-End TTS with Coarse-to-Fine Decoding. INTERSPEECH 2020: 3984-3988 - [c185]Tao Wang, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Chunyu Qiang:
Bi-Level Speaker Supervision for One-Shot Speech Synthesis. INTERSPEECH 2020: 3989-3993 - [c184]Jian Huang, Jianhua Tao, Bin Liu, Zheng Lian:
Learning Utterance-Level Representations with Label Smoothing for Speech Emotion Recognition. INTERSPEECH 2020: 4079-4083 - [c183]Yongwei Li, Jianhua Tao, Bin Liu, Donna Erickson, Masato Akagi:
Comparison of Glottal Source Parameter Values in Emotional Vowels. INTERSPEECH 2020: 4103-4107 - [c182]Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen:
Joint Training for Simultaneous Speech Denoising and Dereverberation with Deep Embedding Representations. INTERSPEECH 2020: 4536-4540 - [c181]Ruibo Fu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Tao Wang, Chunyu Qiang:
Dynamic Speaker Representations Adjustment and Decoder Factorization for Speaker Adaptation in End-to-End Speech Synthesis. INTERSPEECH 2020: 4701-4705 - [c180]Zheng Lian, Zhengqi Wen, Xinyong Zhou, Songbai Pu, Shengkai Zhang, Jianhua Tao:
ARVC: An Auto-Regressive Voice Conversion System Without Parallel Training Data. INTERSPEECH 2020: 4706-4710 - [c179]Ziping Zhao, Qifei Li, Nicholas Cummins, Bin Liu, Haishuai Wang, Jianhua Tao, Björn W. Schuller:
Hybrid Network Feature Extraction for Depression Assessment from Speech. INTERSPEECH 2020: 4956-4960 - [c178]Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen:
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition. INTERSPEECH 2020: 5026-5030 - [c177]Licai Sun, Zheng Lian, Jianhua Tao, Bin Liu, Mingyue Niu:
Multi-modal Continuous Dimensional Emotion Recognition Using Recurrent Neural Network and Self-Attention Mechanism. MuSe @ ACM Multimedia 2020: 27-34 - [i29]Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen:
Spatial and spectral deep attention fusion for multi-channel speech separation using deep embedding features. CoRR abs/2002.01626 (2020) - [i28]Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Jianhua Tao, Ye Bai:
Rnn-transducer with language bias for end-to-end Mandarin-English code-switching speech recognition. CoRR abs/2002.08126 (2020) - [i27]Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen, Xuefei Liu:
Deep Attention Fusion Feature for Speech Separation with End-to-End Post-filter Method. CoRR abs/2003.07544 (2020) - [i26]Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengkun Tian, Cunhang Fan:
Adversarial Transfer Learning for Punctuation Restoration. CoRR abs/2004.00248 (2020) - [i25]Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen:
Simultaneous Denoising and Dereverberation Using Deep Embedding Features. CoRR abs/2004.02420 (2020) - [i24]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang:
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition. CoRR abs/2005.04862 (2020) - [i23]Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen:
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition. CoRR abs/2005.07903 (2020) - [i22]Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Ye Bai, Jianhua Tao, Zhengqi Wen:
Decoupling Pronunciation and Language for End-to-end Code-switching Automatic Speech Recognition. CoRR abs/2010.14798 (2020) - [i21]Cunhang Fan, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Bin Liu, Zhengqi Wen:
Gated Recurrent Fusion with Joint Training Framework for Robust End-to-End Speech Recognition. CoRR abs/2011.04249 (2020) - [i20]Feihu Che, Guohua Yang, Dawei Zhang, Jianhua Tao, Pengpeng Shao, Tong Liu:
Self-supervised Graph Representation Learning via Bootstrapping. CoRR abs/2011.05126 (2020) - [i19]Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen, Leichao Song:
Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning. CoRR abs/2011.05591 (2020) - [i18]Pengpeng Shao, Guohua Yang, Dawei Zhang, Jianhua Tao, Feihu Che, Tong Liu:
Tucker decomposition-based Temporal Knowledge Graph Completion. CoRR abs/2011.07751 (2020)
2010 – 2019
- 2019
- [j28]Jianhua Tao, Jian Huang, Ya Li, Zheng Lian, Mingyue Niu:
Semi-supervised Ladder Networks for Speech Emotion Recognition. Int. J. Autom. Comput. 16(4): 437-448 (2019) - [j27]Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Ye Bai:
Language-Adversarial Transfer Learning for Low-Resource Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 27(3): 621-630 (2019) - [j26]Yibin Zheng, Jianhua Tao, Zhengqi Wen, Jiangyan Yi:
Forward-Backward Decoding Sequence for Regularizing End-to-End TTS. IEEE ACM Trans. Audio Speech Lang. Process. 27(12): 2067-2079 (2019) - [j25]Minghao Yang, Jianhua Tao:
Data fusion methods in multimodal human computer dialog. Virtual Real. Intell. Hardw. 1(1): 21-38 (2019) - [c176]Mingyue Niu, Jianhua Tao, Bin Liu:
Local Second-Order Gradient Cross Pattern for Automatic Depression Detection. ACII Workshops 2019: 128-132 - [c175]Jian Huang, Jianhua Tao, Bin Liu, Zhen Lian, Mingyue Niu:
Efficient Modeling of Long Temporal Contexts for Continuous Emotion Recognition. ACII 2019: 185-191 - [c174]Jiangyan Yi, Jianhua Tao:
Distilling Knowledge for Distant Speech Recognition via Parallel Data. APSIPA 2019: 170-175 - [c173]Jiangyan Yi, Jianhua Tao:
Batch Normalization based Unsupervised Speaker Adaptation for Acoustic Models. APSIPA 2019: 176-180 - [c172]Qiuxian Zhang, Jiangyan Yi, Jianhua Tao, Mingliang Gu, Yong Ma:
Focal Loss for End-to-end Short Utterances Chinese Dialect Identification. APSIPA 2019: 397-401 - [c171]Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen, Ye Bai:
Noise Prior Knowledge Learning for Speech Enhancement via Gated Convolutional Generative Adversarial Network. APSIPA 2019: 662-666 - [c170]Haoxin Ma, Ye Bai, Jiangyan Yi, Jianhua Tao:
Hypersphere Embedding and Additive Margin for Query-by-example Keyword Spotting. APSIPA 2019: 868-872 - [c169]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Bin Liu:
Voice Activity Detection Based on Time-Delay Neural Networks. APSIPA 2019: 1173-1178 - [c168]Jianhua Tao, Ruibo Fu, Zhengqi Wen:
The NLPR Speech Synthesis entry for Blizzard Challenge 2019. Blizzard Challenge 2019 - [c167]Mingyue Niu, Jianhua Tao, Ya Li, Jian Huang, Zheng Lian:
Discriminative Video Representation with Temporal Order for Micro-expression Recognition. ICASSP 2019: 2112-2116 - [c166]Bocheng Zhao, Minghao Yang, Jianhua Tao:
Drawing Order Recovery for Handwriting Chinese Characters. ICASSP 2019: 3227-3231 - [c165]Jiangyan Yi, Jianhua Tao, Ye Bai:
Language-invariant Bottleneck Features from Adversarial End-to-end Acoustic Models for Low Resource Speech Recognition. ICASSP 2019: 6071-6075 - [c164]Ruibo Fu, Jianhua Tao, Zhengqi Wen, Yibin Zheng:
Phoneme Dependent Speaker Embedding and Model Factorization for Multi-speaker Speech Synthesis and Adaptation. ICASSP 2019: 6930-6934 - [c163]Jiangyan Yi, Jianhua Tao:
Self-attention Based Model for Punctuation Prediction Using Word and Speech Embeddings. ICASSP 2019: 7270-7274 - [c162]Yibin Zheng, Xi Wang, Lei He, Shifeng Pan, Frank K. Soong, Zhengqi Wen, Jianhua Tao:
Forward-Backward Decoding for Regularizing End-to-End TTS. INTERSPEECH 2019: 1283-1287 - [c161]Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang:
Conversational Emotion Analysis via Attention Mechanisms. INTERSPEECH 2019: 1936-1940 - [c160]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Zhengkun Tian, Chenghao Zhao, Cunhang Fan:
A Time Delay Neural Network with Shared Weight Self-Attention for Small-Footprint Keyword Spotting. INTERSPEECH 2019: 2190-2194 - [c159]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen:
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition. INTERSPEECH 2019: 3795-3799 - [c158]Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang:
Unsupervised Representation Learning with Future Observation Prediction for Speech Emotion Recognition. INTERSPEECH 2019: 3840-3844 - [c157]Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengqi Wen:
Self-Attention Transducers for End-to-End Speech Recognition. INTERSPEECH 2019: 4395-4399 - [c156]Mingyue Niu, Jianhua Tao, Bin Liu, Cunhang Fan:
Automatic Depression Level Detection via ℓp-Norm Pooling. INTERSPEECH 2019: 4559-4563 - [c155]Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen:
Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features. INTERSPEECH 2019: 4599-4603 - [i17]Jia Li, Xiao Sun, Xing Wei, Changliang Li, Jianhua Tao:
Reinforcement Learning Based Emotional Editing Constraint Conversation Generation. CoRR abs/1904.08061 (2019) - [i16]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen:
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition. CoRR abs/1907.06017 (2019) - [i15]Yibin Zheng, Xi Wang, Lei He, Shifeng Pan, Frank K. Soong, Zhengqi Wen, Jianhua Tao:
Forward-Backward Decoding for Regularizing End-to-End TTS. CoRR abs/1907.09006 (2019) - [i14]Cunhang Fan, Bin Liu, Jianhua Tao, Jiangyan Yi, Zhengqi Wen:
Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features. CoRR abs/1907.09884 (2019) - [i13]Zhengkun Tian, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengqi Wen:
Self-Attention Transducers for End-to-End Speech Recognition. CoRR abs/1909.13037 (2019) - [i12]Zheng Lian, Ya Li, Jianhua Tao, Jian Huang:
Speech Emotion Recognition via Contrastive Loss under Siamese Networks. CoRR abs/1910.11174 (2019) - [i11]Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang:
Conversational Emotion Analysis via Attention Mechanisms. CoRR abs/1910.11263 (2019) - [i10]Zheng Lian, Jianhua Tao, Zhengqi Wen, Bin Liu, Yibin Zheng, Rongxiu Zhong:
Towards Fine-Grained Prosody Control for Voice Conversion. CoRR abs/1910.11269 (2019) - [i9]Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang:
Unsupervised Representation Learning with Future Observation Prediction for Speech Emotion Recognition. CoRR abs/1910.13806 (2019) - [i8]Zheng Lian, Jianhua Tao, Bin Liu, Jian Huang:
Domain adversarial learning for emotion recognition. CoRR abs/1910.13807 (2019) - [i7]Zheng Lian, Ya Li, Jianhua Tao, Jian Huang, Mingyue Niu:
Expression Analysis Based on Face Regions in Read-world Conditions. CoRR abs/1911.05188 (2019) - [i6]Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang:
Integrating Whole Context to Sequence-to-sequence Speech Recognition. CoRR abs/1912.01777 (2019) - [i5]Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
Synchronous Transformers for End-to-End Speech Recognition. CoRR abs/1912.02958 (2019) - 2018
- [j24]Shuai Nie, Shan Liang, Wenju Liu, Xueliang Zhang, Jianhua Tao:
Deep Learning Based Speech Separation via NMF-Style Reconstructions. IEEE ACM Trans. Audio Speech Lang. Process. 26(11): 2043-2055 (2018) - [j23]Jiangyan Yi, Zhengqi Wen, Jianhua Tao, Hao Ni, Bin Liu:
CTC Regularized Model Adaptation for Improving LSTM RNN Based Multi-Accent Mandarin Speech Recognition. J. Signal Process. Syst. 90(7): 985-997 (2018) - [j22]Zhengqi Wen, Kehuang Li, Zhen Huang, Chin-Hui Lee, Jianhua Tao:
Improving Deep Neural Network Based Speech Synthesis through Contextual Feature Parametrization and Multi-Task Learning. J. Signal Process. Syst. 90(7): 1025-1037 (2018) - [j21]Yibin Zheng, Ya Li, Zhengqi Wen, Bin Liu, Jianhua Tao:
Investigating Deep Neural Network Adaptation for Generating Exclamatory and Interrogative Speech in Mandarin. J. Signal Process. Syst. 90(7): 1039-1052 (2018) - [c154]Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Ye Bai:
Adversarial Multilingual Training for Low-Resource Speech Recognition. ICASSP 2018: 4899-4903 - [c153]Jian Huang, Ya Li, Jianhua Tao, Zheng Lian, Jiangyan Yi:
End-to-End Continuous Emotion Recognition from Video Using 3D Convlstm Networks. ICASSP 2018: 6837-6841 - [c152]Hui Zhou, Minghao Yang, Hang Pan, Renjun Tang, Baohua Qiang, Jinlong Chen, Jianhua Tao:
Architecture and Parameter Analysis to Convolutional Neural Network for Hand Tracking. ICCCS (6) 2018: 429-439 - [c151]Bocheng Zhao, Minghao Yang, Jianhua Tao:
Pen Tip Motion Prediction for Handwriting Drawing Order Recovery using Deep Neural Network. ICPR 2018: 704-709 - [c150]Minghao Yang, Na Sheng Ruo Yang, Ke Zhang, Jianhua Tao:
Self-Talk: Responses to Users' Opinions and Challenges in Human Computer Dialog. ICPR 2018: 2839-2844 - [c149]Minghao Yang, Dawei Zhang, Jianhua Tao:
Reducing Tongue Shape Dimensionality from Hundreds of Available Resources Using Autoencoder. ICPR 2018: 2875-2880 - [c148]Yibin Zheng, Jianhua Tao, Zhengqi Wen, Ya Li:
BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End. INTERSPEECH 2018: 47-51 - [c147]Xiaoke Qi, Jianhua Tao:
Sparsity-Constrained Weight Mapping for Head-Related Transfer Functions Individualization from Anthropometric Features. INTERSPEECH 2018: 841-845 - [c146]Ruibo Fu, Jianhua Tao, Yibin Zheng, Zhengqi Wen:
Transfer Learning Based Progressive Neural Networks for Acoustic Modeling in Statistical Parametric Speech Synthesis. INTERSPEECH 2018: 907-911 - [c145]Yibin Zheng, Jianhua Tao, Zhengqi Wen, Ruibo Fu:
On the Application and Compression of Deep Time Delay Neural Network for Embedded Statistical Parametric Speech Synthesis. INTERSPEECH 2018: 922-926 - [c144]Ruibo Fu, Jianhua Tao, Yibin Zheng, Zhengqi Wen:
Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer. INTERSPEECH 2018: 2514-2518 - [c143]Shuai Nie, Shan Liang, Bin Liu, Yaping Zhang, Wenju Liu, Jianhua Tao:
Deep Noise Tracking Network: A Hybrid Signal Processing/Deep Learning Approach to Speech Enhancement. INTERSPEECH 2018: 3219-3223 - [c142]Jian Huang, Ya Li, Jianhua Tao, Zhen Lian:
Speech Emotion Recognition from Variable-Length Inputs with Triplet Loss Function. INTERSPEECH 2018: 3673-3677 - [c141]Bin Liu, Jianhua Tao, Yibin Zheng:
A Novel Unified Framework for Speech Enhancement and Bandwidth Extension Based on Jointly Trained Neural Networks. ISCSLP 2018: 11-15 - [c140]Cunhang Fan, Bin Liu, Jianhua Tao, Zhengqi Wen, Jiangyan Yi, Ye Bai:
Utterance-level Permutation Invariant Training with Discriminative Learning for Single Channel Speech Separation. ISCSLP 2018: 26-30 - [c139]Ye Bai, Jianhua Tao, Jiangyan Yi, Zhengqi Wen, Cunhang Fan:
CLMAD: A Chinese Language Model Adaptation Dataset. ISCSLP 2018: 275-279 - [c138]Jian Huang, Ya Li, Jianhua Tao, Zheng Lian, Mingyue Niu, Minghao Yang:
Multimodal Continuous Emotion Recognition with Data Augmentation Using Recurrent Neural Networks. AVEC@MM 2018: 57-64 - [c137]Jian Huang, Ya Li, Jianhua Tao, Zheng Lian, Mingyue Niu, Minghao Yang:
Deep Learning for Continuous Multiple Time Series Annotations. AVEC@MM 2018: 91-98 - [c136]Dong-Yan Huang, Sicheng Zhao, Björn W. Schuller, Hongxun Yao, Jianhua Tao, Min Xu, Lei Xie, Qingming Huang, Jie Yang:
ASMMC-MMAC 2018: The Joint Workshop of 4th the Workshop on Affective Social Multimedia Computing and first Multi-Modal Affective Computing of Large-Scale Multimedia Data Workshop. ACM Multimedia 2018: 2120-2121 - [i4]Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Bin Liu:
Distilling Knowledge Using Parallel Data for Far-field Speech Recognition. CoRR abs/1802.06941 (2018) - [i3]Xiao Sun, Jingyuan Li, Jianhua Tao:
Emotional Conversation Generation Orientated Syntactically Constrained Bidirectional-Asynchronous Framework. CoRR abs/1806.07000 (2018) - [i2]Zheng Lian, Ya Li, Jianhua Tao, Jian Huang:
Investigation of Multimodal Features, Classifiers and Fusion Methods for Emotion Recognition. CoRR abs/1809.06225 (2018) - 2017
- [j20]Ya Li, Jianhua Tao, Linlin Chao, Wei Bao, Yazhu Liu:
CHEAVD: a Chinese natural emotional audio-visual database. J. Ambient Intell. Humaniz. Comput. 8(6): 913-924 (2017) - [j19]Ya Li, Jianhua Tao, Wei Lai, Xiaoying Xu:
Quantitative intonation modeling of interrogative sentences for Mandarin speech synthesis. Speech Commun. 89: 92-102 (2017) - [c135]Jianhua Tao, Ruibo Fu, Yibin Zheng, Zhengqi Wen, Ya Li, Biu Liu:
The NLPR Speech Synthesis entry for Blizzard Challenge 2017. Blizzard Challenge 2017 - [c134]Bin Liu, Jianhua Tao, Dawei Zhang, Yibin Zheng:
A novel pitch extraction based on jointly trained deep BLSTM Recurrent Neural Networks with bottleneck features. ICASSP 2017: 336-340 - [c133]Yibin Zheng, Jianhua Tao, Zhengqi Wen, Ya Li, Bin Liu:
Investigating Efficient Feature Representation Methods and Training Objective for BLSTM-Based Phone Duration Prediction. INTERSPEECH 2017: 784-788 - [c132]Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Ya Li:
Distilling Knowledge from an Ensemble of Models for Punctuation Prediction. INTERSPEECH 2017: 2779-2783 - [c131]Xiaoke Qi, Jianhua Tao:
A Domain Knowledge-Assisted Nonlinear Model for Head-Related Transfer Functions Based on Bottleneck Deep Neural Network. INTERSPEECH 2017: 3058-3062 - [c130]Jian Huang, Ya Li, Jianhua Tao, Zheng Lian, Zhengqi Wen, Minghao Yang, Jiangyan Yi:
Continuous Multimodal Emotion Prediction Based on Long Short Term Memory Recurrent Neural Network. AVEC@ACM Multimedia 2017: 11-18 - [c129]Jianhua Tao, Tingjun Wu, Dacheng Li, Yongyan Lu, Wenqiang Wu:
Research on modeling and machining algorithm of multi-shear and multi-punch CNC transverse shear line. CIS/RAM 2017: 161-166 - [c128]Bocheng Zhao, Minghao Yang, Hang Pan, Qingjie Zhu, Jianhua Tao:
Nonrigid point matching of Chinese characters for robot writing. ROBIO 2017: 762-767 - 2016
- [j18]Minghao Yang, Jinlin Jiang, Jianhua Tao, Kaihui Mu, Hao Li:
Emotional head motion predicting from prosodic and linguistic features. Multim. Tools Appl. 75(9): 5125-5146 (2016) - [j17]Minghui Dong, Jianhua Tao, Man-Wai Mak:
Guest Editorial: Advances in Machine Learning for Speech Processing. J. Signal Process. Syst. 82(2): 137-140 (2016) - [j16]Bin Liu, Jianhua Tao, Zhengqi Wen, Fuyuan Mo:
Speech Enhancement Based on Analysis-Synthesis Framework with Improved Parameter Domain Enhancement. J. Signal Process. Syst. 82(2): 141-150 (2016) - [j15]Hao Che, Ya Li, Jianhua Tao, Zhengqi Wen:
Investigating Effect of Rich Syntactic Features on Mandarin Prosodic Boundaries Prediction. J. Signal Process. Syst. 82(2): 263-271 (2016) - [c127]Zhengqi Wen, Kehuang Li, Jianhua Tao, Chin-Hui Lee:
Deep neural network based voice conversion with a large synthesized parallel corpus. APSIPA 2016: 1-5 - [c126]Jiangyan Yi, Hao Ni, Zhengqi Wen, Jianhua Tao:
Improving BLSTM RNN based Mandarin speech recognition using accent dependent bottleneck features. APSIPA 2016: 1-5 - [c125]Jianhua Tao, Yibin Zheng, Zhengqi Wen, Ya Li, Biu Liu:
BLSTM Guided Unit Selection Synthesis System for Blizzard Challenge 2016. Blizzard Challenge 2016 - [c124]Hao Ni, Jiangyan Yi, Zhengqi Wen, Jianhua Tao:
Recurrent Neural Network Based Language Model Adaptation for Accent Mandarin Speech. CCPR (2) 2016: 607-617 - [c123]Ya Li, Jianhua Tao, Björn W. Schuller, Shiguang Shan, Dongmei Jiang, Jia Jia:
MEC 2016: The Multimodal Emotion Recognition Challenge of CCPR 2016. CCPR (2) 2016: 667-678 - [c122]Dawei Zhang, Minghao Yang, Jianhua Tao, Yang Wang, Bin Liu, Danish Bukhari:
Extraction of tongue contour in real-time magnetic resonance imaging sequences. ICASSP 2016: 937-941 - [c121]Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li, Zhengqi Wen:
Long short term memory recurrent neural network based encoding method for emotion recognition in video. ICASSP 2016: 2752-2756 - [c120]Xiaoke Qi, Jianhua Tao:
A Sparse Spherical Harmonic-Based Model in Subbands for Head-Related Transfer Functions. INTERSPEECH 2016: 540-544 - [c119]Zhengqi Wen, Ya Li, Jianhua Tao:
The Parameterized Phoneme Identity Feature as a Continuous Real-Valued Vector for Neural Network Based Speech Synthesis. INTERSPEECH 2016: 2248-2252 - [c118]Yibin Zheng, Ya Li, Zhengqi Wen, Xingguang Ding, Jianhua Tao:
Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach. INTERSPEECH 2016: 3201-3205 - [c117]Bin Liu, Jianhua Tao:
A Novel Research to Artificial Bandwidth Extension Based on Deep BLSTM Recurrent Neural Networks and Exemplar-Based Sparse Representation. INTERSPEECH 2016: 3778-3782 - [c116]Ye Bai, Jiangyan Yi, Hao Ni, Zhengqi Wen, Bin Liu, Ya Li, Jianhua Tao:
End-to-end keywords spotting based on connectionist temporal classification for Mandarin. ISCSLP 2016: 1-5 - [c115]Hao Ni, Jiangyan Yi, Zhengqi Wen, Bin Liu, Jianhua Tao:
Improving accented Mandarin speech recognition by using recurrent neural network based language model adaptation. ISCSLP 2016: 1-5 - [c114]Zhengqi Wen, Kehuang Li, Zhen Huang, Jianhua Tao, Chin-Hui Lee:
Learning auxiliary categorical information for speech synthesis based on deep and recurrent neural networks. ISCSLP 2016: 1-5 - [c113]Jiangyan Yi, Hao Ni, Zhengqi Wen, Bin Liu, Jianhua Tao:
CTC regularized model adaptation for improving LSTM RNN based multi-accent Mandarin speech recognition. ISCSLP 2016: 1-5 - [c112]Yibin Zheng, Ya Li, Zhengqi Wen, Bin Liu, Jianhua Tao:
Text-based sentential stress prediction using continuous lexical embedding for Mandarin speech synthesis. ISCSLP 2016: 1-5 - [c111]Yibin Zheng, Ya Li, Zhengqi Wen, Bin Liu, Jianhua Tao:
Investigating deep neural network adaptation for generating exclamatory and interrogative speech in Mandarin. ISCSLP 2016: 1-5 - [c110]Renjun Tang, Ke Zhang, Shenruoyang Na, Minghao Yang, Hui Zhou, Qingjie Zhu, Yongsong Zhan, Jianhua Tao:
Football News Generation from Chinese Live Webcast Script. NLPCC/ICCPOL 2016: 778-789 - [i1]Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li, Zhengqi Wen:
Audio Visual Emotion Recognition with Temporal Alignment and Perception Attention. CoRR abs/1603.08321 (2016) - 2015
- [j14]Minghao Yang, Jianhua Tao, Linlin Chao, Hao Li, Dawei Zhang, Hao Che, Tingli Gao, Bin Liu:
User behavior fusion in dialog management with multi-modal history cues. Multim. Tools Appl. 74(22): 10025-10051 (2015) - [j13]Ya Li, Jianhua Tao, Keikichi Hirose, Xiaoying Xu, Wei Lai:
Hierarchical stress modeling and generation in mandarin for expressive Text-to-Speech. Speech Commun. 72: 59-73 (2015) - [j12]Sujing Wang, Wen-Jing Yan, Xiaobai Li, Guoying Zhao, Chunguang Zhou, Xiaolan Fu, Minghao Yang, Jianhua Tao:
Micro-Expression Recognition Using Color Spaces. IEEE Trans. Image Process. 24(12): 6034-6047 (2015) - [c109]Ya Li, Linlin Chao, Yazhu Liu, Wei Bao, Jianhua Tao:
From simulated speech to natural speech, what are the robust features for emotion recognition? ACII 2015: 368-373 - [c108]Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li:
Multi task sequence learning for depression scale prediction from video. ACII 2015: 526-531 - [c107]Ya Li, Nick Campbell, Jianhua Tao:
Voice quality: Not only about "you" but also about "your interlocutor". ICASSP 2015: 4739-4743 - [c106]Hao Li, Jianhua Tao, Minghao Yang, Bin Liu:
Estimate articulatory MRI series from acoustic signal using deep architecture. ICASSP 2015: 4854-4858 - [c105]Hao Li, Jianhua Tao, Yang Wang:
Evaluation of linear regression for speaker adaptation in HMM-based articulatory movements estimation. ICASSP 2015: 4944-4948 - [c104]Xiaoke Qi, Lu Wang, Kaishun Wu, Jianhua Tao:
Exploring smart pilot for partial packet recovery in super dense wireless networks. ICC Workshops 2015: 2145-2150 - [c103]Yang Wang, Minghao Yang, Zhengqi Wen, Jianhua Tao:
Combining extreme learning machine and decision tree for duration prediction in HMM based speech synthesis. INTERSPEECH 2015: 2197-2201 - [c102]Bin Liu, Jianhua Tao, Zhengqi Wen, Ya Li, Danish Bukhari:
A novel method of artificial bandwidth extension using deep architecture. INTERSPEECH 2015: 2598-2602 - [c101]Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li, Zhengqi Wen:
Long Short Term Memory Recurrent Neural Network based Multimodal Dimensional Emotion Recognition. AVEC@ACM Multimedia 2015: 65-72 - 2014
- [j11]Jianhua Tao, Keikichi Hirose, Keiichi Tokuda, Alan W. Black, Simon King:
Introduction to the Issue on Statistical Parametric Speech Synthesis. IEEE J. Sel. Top. Signal Process. 8(2): 170-172 (2014) - [j10]Shigeru Katagiri, Atsushi Nakamura, Tülay Adali, Jianhua Tao, Jan Larsen, Tieniu Tan:
Guest Editorial: Machine Learning for Signal Processing. J. Signal Process. Syst. 74(3): 281-283 (2014) - [j9]Zhengqi Wen, Jianhua Tao, Shifeng Pan, Yang Wang:
Pitch-Scaled Spectrum Based Excitation Model for HMM-based Speech Synthesis. J. Signal Process. Syst. 74(3): 423-435 (2014) - [c100]Ran Zhang, Jianhua Tao, Ya Li, Zhengqi Wen:
A novel hybrid mandarin speech synthesis system using different base units for model training and concatenation. ICASSP 2014: 295-299 - [c99]Hao Li, Minghao Yang, Jianhua Tao:
Tongue shape conversion with non-parallel training data. ICASSP 2014: 2549-2553 - [c98]Hao Che, Jianhua Tao, Ya Li:
Improving Mandarin prosodic boundary prediction with rich syntactic features. INTERSPEECH 2014: 46-50 - [c97]Ran Zhang, Zhengqi Wen, Jianhua Tao, Ya Li, Bing Liu, Xiaoyan Lou:
A hierarchical viterbi algorithm for Mandarin hybrid speech synthesis system. INTERSPEECH 2014: 795-799 - [c96]Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li:
Improving generation performance of speech emotion recognition by denoising autoencoders. ISCSLP 2014: 341-344 - [c95]Xin Xu, Ya Li, Xiaoying Xu, Zhengqi Wen, Hao Che, Shanfeng Liu, Jianhua Tao:
Survey on discriminative feature selection for speech emotion recognition. ISCSLP 2014: 345-349 - [c94]Xiaoying Xu, Huimin Wang, Ya Li, Wei Lai, Jianhua Tao:
The expression of emotions by text and speech. ISCSLP 2014: 353 - [c93]Wei Bao, Ya Li, Mingliang Gu, Jianhua Tao, Linlin Chao, Shanfeng Liu:
Combining prosodic and spectral features for Mandarin intonation recognition. ISCSLP 2014: 497-500 - [c92]Hao Che, Zhengqi Wen, Ya Li, Jianhua Tao:
Investigating effect of rich syntactic features on Mandarin prosodic phrase boundaries prediction. ISCSLP 2014: 501-505 - [c91]Shanfeng Liu, Zhengqi Wen, Ya Li, Jianhua Tao, Bin Liu:
Context features based pre-selection and weight prediction in concatenation speech synthesis system. ISCSLP 2014: 506-510 - [c90]Yang Wang, Jianhua Tao:
Evaluation of parameter generation using high order dynamic features and long span windows for HMM based speech synthesis. ISCSLP 2014: 516-520 - [c89]Bin Liu, Jianhua Tao, Fuyuan Mo, Ya Li, Zhengqi Wen, Shanfeng Liu:
Efficient voice activity detection algorithm based on sub-band temporal envelope and sub-band long-term signal variability. ISCSLP 2014: 531-535 - [c88]Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li, Zhengqi Wen:
Multi-scale Temporal Modeling for Dimensional Emotion Recognition in Video. AVEC@MM 2014: 11-18 - [c87]Wei Lai, Xiaoying Xu, Ya Li, Hao Che, Shanfeng Liu, Jianhua Tao:
Phonological influences on the realization of final lowering evidence from dialogue Chinese Mandarin. O-COCOSDA 2014: 1-6 - [e4]Minghui Dong, Jianhua Tao, Haizhou Li, Thomas Fang Zheng, Yanfeng Lu:
The 9th International Symposium on Chinese Spoken Language Processing, Singapore, September 12-14, 2014. IEEE 2014, ISBN 978-1-4799-4220-6 - 2013
- [c86]Linlin Chao, Jianhua Tao, Minghao Yang, Ya Li:
Bayesian Inference Based Temporal Modeling for Naturalistic Affective Expression Classification. ACII 2013: 173-178 - [c85]Yang Wang, Jianhua Tao, Minghao Yang, Ya Li:
Extended Decision Tree with or Relationship for HMM-Based Speech Synthesis. ACPR 2013: 225-229 - [c84]Linlin Chao, Jianhua Tao, Minghao Yang:
Combining emotional history through multimodal fusion methods. APSIPA 2013: 1-4 - [c83]Xiaoying Xu, Jianhua Tao, Ya Li:
On Constructing a Chinese Task-Oriental Subjectivity Lexicon. CLSW 2013: 546-554 - [c82]Minghao Yang, Jianhua Tao, Dawei Zhang:
Extraction of tongue contour in X-ray videos. ICASSP 2013: 1094-1098 - [c81]Hao Li, Minghao Yang, Jianhua Tao:
Speaker-independent lips and tongue visualization of vowels. ICASSP 2013: 8106-8110 - [c80]Hao Che, Jianhua Tao:
Stress predicition for Mandarin text-to-speech system using discourse context feature. O-COCOSDA/CASLRE 2013: 1-5 - [c79]Ran Zhang, Jianhua Tao, Ya Li, Zhengqi Wen:
A novel unit selection method for concatenation speech system using similarity measure. O-COCOSDA/CASLRE 2013: 1-5 - 2012
- [j8]Julien Epps, Roddy Cowie, Shrikanth S. Narayanan, Björn W. Schuller, Jianhua Tao:
Emotion and mental state recognition from speech. EURASIP J. Adv. Signal Process. 2012: 15 (2012) - [j7]Minghao Yang, Jianhua Tao, Kaihui Mu, Ya Li, Jianfeng Che:
A multimodal approach of generating 3D human-like talking agent. J. Multimodal User Interfaces 5(1-2): 61-68 (2012) - [c78]Minghao Yang, Jianhua Tao, Hao Li, Kaihui Mu:
Multimodal emotion estimation and emotional synthesize for interaction virtual agent. CCIS 2012: 191-196 - [c77]Zhengqi Wen, Hideki Kawahara, Jianhua Tao:
Pitch-Scaled Analysis based Residual Reconstruction for Speech Analysis and Synthesis. INTERSPEECH 2012: 374-377 - [c76]Zhengqi Wen, Jianhua Tao:
Amplitude Spectrum based Excitation Model for HMM-based Speech Synthesis. INTERSPEECH 2012: 1428-1431 - [c75]Zhengqi Wen, Jianhua Tao, Hao Che:
Statistical modification based post-filtering technique for HMM-based speech synthesis. ISCSLP 2012: 146-149 - 2011
- [j6]Jianhua Tao, Shifeng Pan, Minghao Yang, Ya Li, Kaihui Mu, Jianfeng Che:
Utterance independent bimodal emotion recognition in spontaneous communication. EURASIP J. Adv. Signal Process. 2011: 4 (2011) - [c74]Shifeng Pan, Jianhua Tao, Ya Li:
The CASIA Audio Emotion Recognition Method for Audio/Visual Emotion Challenge 2011. ACII (2) 2011: 388-395 - [c73]Shifeng Pan, Yoshihiko Nankaku, Keiichi Tokuda, Jianhua Tao:
Global variance modeling on frequency domain delta LSP for HMM-based speech synthesis. ICASSP 2011: 4716-4719 - [c72]Xiaoying Xu, Ya Li, Jianhua Tao, Yingchao Lu:
The Stability Analysis of Disyllabic Stress in Mandarin Speech. ICPhS 2011: 2181-2184 - [c71]Zhengqi Wen, Jianhua Tao:
Inverse Filtering Based Harmonic Plus Noise Excitation Model for HMM-Based Speech Synthesis. INTERSPEECH 2011: 1805-1808 - [c70]Ya Li, Jianhua Tao, Xiaoying Xu:
Hierarchical Stress Modeling in Mandarin Text-to-Speech. INTERSPEECH 2011: 2013-2016 - [c69]Qiong Hu, Jianhua Tao, Shifeng Pan, Chunyu Zhao:
HMM-based Tianjin Dialect speech synthesis using bilateral question Set. MLSP 2011: 1-4 - [c68]Kaihui Mu, Jianhua Tao, Minghao Yang:
Animating a Chinese interactive virtual character. MLSP 2011: 1-5 - [c67]Tieniu Tan, Shigeru Katagiri, Jianhua Tao, Atsushi Nakamura, Jan Larsen:
Preface. MLSP 2011: 1 - [c66]Zhengqi Wen, Jianhua Tao:
An excitation model based on inverse filtering for speech analysis and synthesis. MLSP 2011: 1-5 - [c65]Minghao Yang, Jianhua Tao, Lihui Shi, Kaihui Mu, Jianfeng Che:
An outlier rejection scheme for optical flow tracking. MLSP 2011: 1-4 - 2010
- [j5]Jianhua Tao, Meng Zhang, Jani Nurminen, Jilei Tian, Xia Wang:
Supervisory Data Alignment for Text-Independent Voice Conversion. IEEE Trans. Speech Audio Process. 18(5): 932-943 (2010) - [c64]Peter Khooshabeh, Jonathan Gratch, Lixing Huang, Jianhua Tao:
Does culture affect the perception of emotion in virtual faces? APGV 2010: 165 - [c63]Jianhua Tao, Shifeng Pan, Ya Li, Zhengqi Wen, Yang Wang:
The WISTON Text to Speech System for Blizzard Challenge 2010. Blizzard Challenge 2010 - [c62]Kaihui Mu, Jianhua Tao, Jianfeng Che, Minghao Yang:
Mood avatar: automatic text-driven head motion synthesis. ICMI-MLMI 2010: 37:1-37:4 - [c61]Shifeng Pan, Meng Zhang, Jianhua Tao:
A novel hybrid approach for Mandarin speech synthesis. INTERSPEECH 2010: 182-185 - [c60]Ya Li, Jianhua Tao, Meng Zhang, Shifeng Pan, Xiaoying Xu:
Text-based unstressed syllable prediction in Mandarin. INTERSPEECH 2010: 1752-1755 - [c59]Xiaoying Xu, Jianhua Tao, Ling Zhang, Yingchao Lu:
The duration analysis of the checked tones in Cantonese speech. ISCSLP 2010: 440-445 - [c58]Jianhua Tao:
HMM based speech synthesis with Global Variance Training method. IUCS 2010: 47 - [c57]Kaihui Mu, Jianhua Tao, Jianfeng Che, Minghao Yang:
Real-time speech-driven lip synchronization. IUCS 2010: 378-382
2000 – 2009
- 2009
- [j4]Jianhua Tao, Le Xin, Panrong Yin:
Realistic Visual Speech Synthesis Based on Hybrid Concatenation Method. IEEE Trans. Speech Audio Process. 17(3): 469-477 (2009) - [c56]Jianhua Tao, Aijun Li, Shifeng Pan:
A multiple perception model on emotional speech. ACII 2009: 1-6 - [c55]Xiaoying Xu, Aijun Li, Liping Hu, Jianhua Tao:
Categorizing terms' subjectivity and polarity manually for opinion mining in Chinese. ACII 2009: 1-6 - [c54]Jianhua Tao, Ya Li, Shifeng Pan, Meng Zhang, Hongjun Sun, Zhengqi Wen:
The WISTON Text-to-Speech System for Blizzard Challenge 2009. Blizzard Challenge 2009 - [c53]Lishan Ma, Dekui Yuan, Jianhua Tao, Guoli Yang, Yong Sun:
Prediction of Ground Water Level Based on DE-BP Neutral Network. ESIAT (1) 2009: 258-261 - [c52]Huibin Jia, Jianhua Tao:
Prosody modeling for mandarin exclamatory speech. ICME 2009: 890-893 - [c51]Hongjun Sun, Jianhua Tao, Huibin Jia:
Dimension reducing of LSF parameters based on radial basis function neural network. INTERSPEECH 2009: 1103-1106 - [p2]Jianhua Tao, Aijun Li:
Emotional Speech Generation by Using Statistic Prosody Conversion Methods. Affective Information Processing 2009: 127-141 - [p1]Jianhua Tao, Panrong Yin, Le Xin:
Face Animation Based on Large Audiovisual Database. Affective Information Processing 2009: 181-200 - [e3]Jianhua Tao, Tieniu Tan:
Affective Information Processing. Springer 2009, ISBN 978-1-84800-305-7 - 2008
- [c50]Jianhua Tao, Jian Yu, Lixing Huang, Fangzhou Liu, Huibin Jia, Meng Zhang:
The WISTON Text to Speech System for Blizzard 2008. Blizzard Challenge 2008 - [c49]Meng Zhang, Jianhua Tao, Jilei Tian, Xia Wang:
Text-independent voice conversion based on state mapped codebook. ICASSP 2008: 4605-4608 - [c48]Fangzhou Liu, Qin Shi, Jianhua Tao:
Tree-guided transformation-based homograph disambiguation in Mandarin TTS system. ICASSP 2008: 4657-4660 - [c47]Mingyu You, Guo-Zheng Li, Luonan Chen, Jianhua Tao:
A Novel Classifier Based on Enhanced Lipschitz Embedding for Speech Emotion Recognition. ICIC (1) 2008: 482-490 - [c46]Meng Zhang, Jianhua Tao, Huibin Jia, Xia Wang:
Improving HMM Based Speech Synthesis by Reducing Over-Smoothing Problems. ISCSLP 2008: 17-20 - [c45]Yi Zhang, Jianhua Tao:
Prosody Modification on Mixed-Language Speech Synthesis. ISCSLP 2008: 253-256 - [c44]Fangzhou Liu, Huibin Jia, Jianhua Tao:
A Maximum Entropy Based Hierarchical Model for Automatic Prosodic Boundary Labeling in Mandarin. ISCSLP 2008: 257-260 - [e2]Helen M. Meng, Hui Jiang, Jianhua Tao, Ren-Hua Wang:
6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008, 16-19 December, 2008, Kunming, China. IEEE 2008, ISBN 978-1-4244-2942-4 - 2007
- [j3]Dekui Yuan, Binliang Lin, Roger A. Falconer, Jianhua Tao:
Development of an integrated model for assessing the impact of diffuse and point source pollution on coastal waters. Environ. Model. Softw. 22(6): 871-879 (2007) - [j2]Mingyu You, Chun Chen, Jiajun Bu, Jia Liu, Jianhua Tao:
Manifolds Based Emotion Recognition in Speech. Int. J. Comput. Linguistics Chin. Lang. Process. 12(1) (2007) - [c43]Panrong Yin, Liyue Zhao, Lixing Huang, Jianhua Tao:
Expressive Face Animation Synthesis Based on Dynamic Mapping Method. ACII 2007: 1-11 - [c42]Marc Schröder, Laurence Devillers, Kostas Karpouzis, Jean-Claude Martin, Catherine Pelachaud, Christian Peter, Hannes Pirker, Björn W. Schuller, Jianhua Tao, Ian Wilson:
What Should a Generic Emotion Markup Language Be Able to Represent? ACII 2007: 440-451 - [c41]Lixing Huang, Le Xin, Liyue Zhao, Jianhua Tao:
Combining Audio and Video by Dominance in Bimodal Emotion Recognition. ACII 2007: 729-730 - [c40]Jian Yu, Meng Zhang, Jianhua Tao, Xia Wang:
A Novel HMM-Based TTS System using Both Continuous HMMS and Discrete HMMS. ICASSP (4) 2007: 709-712 - [c39]Jia Liu, Chun Chen, Jiajun Bu, Mingyu You, Jianhua Tao:
Speech Emotion Recognition Based on a Fusion of All-Class and Pairwise-Class Feature Selection. International Conference on Computational Science (1) 2007: 168-175 - [c38]Le Xin, Jianhua Tao, Tieniu Tan:
Dynamic Audio-Visual Mapping using Fused Hidden Markov Model Inversion Method. ICIP (3) 2007: 293-296 - [c37]Jia Liu, Chun Chen, Jiajun Bu, Mingyu You, Jianhua Tao:
Speech Emotion Recognition using an Enhanced Co-Training Algorithm. ICME 2007: 999-1002 - [c36]Jian Yu, Lixing Huang, Jianhua Tao, Xia Wang:
Modeling incompletion phenomenon in Mandarin dialog prosody. INTERSPEECH 2007: 462-465 - 2006
- [j1]Jianhua Tao, Yongguo Kang, Aijun Li:
Prosody conversion from neutral speech to emotional speech. IEEE Trans. Speech Audio Process. 14(4): 1145-1154 (2006) - [c35]Yongguo Kang, Jianhua Tao, Bo Xu:
Applying Pitch Target Model to Convert F0 Contour for Expressive Mandarin Speech Synthesis. ICASSP (1) 2006: 733-736 - [c34]Jian Yu, Wanzhi Zhang, Jianhua Tao:
A New Pitch Generation Model Based on Internal Dependence of Pitch Contour for Mandarin TTS System. ICASSP (1) 2006: 741-744 - [c33]Mingyu You, Chun Chen, Jiajun Bu, Jia Liu, Jianhua Tao:
Emotion Recognition from Noisy Speech. ICME 2006: 1653-1656 - [c32]Mingyu You, Chun Chen, Jiajun Bu, Jia Liu, Jianhua Tao:
Emotional Speech Analysis on Nonlinear Manifold. ICPR (3) 2006: 91-94 - [c31]Honghui Dong, Jianhua Tao, Bo Xu:
Prosodic Word Prediction Using a Maximum Entropy Approach. ISCSLP (Selected Papers) 2006: 169-178 - [c30]Jianhua Tao, Jian Yu, Yongguo Kang:
Nonlinear Emotional Prosody Generation and Annotation. ISCSLP (Selected Papers) 2006: 189-199 - [c29]Jian Yu, Jianhua Tao, Xia Wang:
Pitch Prediction for Mandarin TTS with Mutual Prosodic Constraint. ISCSLP 2006 - 2005
- [c28]Yongguo Kang, Zhiwei Shuang, Jianhua Tao, Wei Zhang, Bo Xu:
A Hybrid GMM and Codebook Mapping Method for Spectral Conversion. ACII 2005: 303-310 - [c27]Jianhua Tao, Yongguo Kang:
Features Importance Analysis for Emotional Speech Classification. ACII 2005: 449-457 - [c26]Panrong Yin, Jianhua Tao:
Dynamic Mapping Method Based Speech Driven Face Animation System. ACII 2005: 755-763 - [c25]Jianhua Tao, Tieniu Tan:
Affective Computing: A Review. ACII 2005: 981-995 - [c24]Yonglin Li, Jianhua Tao:
Personalized Facial Animation Based on 3D Model Fitting from Two Orthogonal Face Images. ACII 2005: 996-1003 - [c23]Le Xin, Qiang Wang, Jianhua Tao, Xiaoou Tang, Tieniu Tan, Harry Shum:
Automatic 3D Face Modeling from Video. ICCV 2005: 1193-1199 - [c22]Honghui Dong, Jianhua Tao, Bo Xu:
Chinese prosodic phrasing with a constraint-based approach. INTERSPEECH 2005: 3241-3244 - [e1]Jianhua Tao, Tieniu Tan, Rosalind W. Picard:
Affective Computing and Intelligent Interaction, First International Conference, ACII 2005, Beijing, China, October 22-24, 2005, Proceedings. Lecture Notes in Computer Science 3784, Springer 2005, ISBN 3-540-29621-2 - 2004
- [c21]Jianhua Tao, Tieniu Tan:
Emotional Chinese talking head system. ICMI 2004: 273-280 - [c20]Bo Xu, Jianhua Tao, Yongguo Kang:
A new multicomponent AM-FM demodulation with predicting frequency boundaries and its application to formant estimation. INTERSPEECH 2004: 1105-1108 - [c19]Jianhua Tao:
Context based emotion detection from text input. INTERSPEECH 2004: 1337-1340 - [c18]Honghui Dong, Jianhua Tao, Bo Xu:
Grapheme-to-phoneme conversion in Chinese TTS system. ISCSLP 2004: 165-168 - [c17]Jianhua Tao:
Rhythm correlation of speech synthesis system. ISCSLP 2004: 221-224 - [c16]Jianhua Tao, Yongguo Kang:
Multi-source based acoustic model for speech synthesis. SSW 2004: 167-172 - [c15]Jianhua Tao:
Acoustic and Linguistic Information Based Chinese Prosodic Boundary Labelling. TSD 2004: 489-496 - [c14]Jianhua Tao:
F0 Prediction Model of Speech Synthesis Based on Template and Statistical Method. TSD 2004: 497-504 - 2003
- [c13]Sheng Zhao, Jianhua Tao, DanLing Jiang:
Chinese prosodic phrasing with extended features. ICASSP (1) 2003: 492-495 - [c12]Jianhua Tao, Xing Ni:
Auditive learning based Chinese F0 prediction. ICASSP (1) 2003: 500-503 - [c11]Jianhua Tao, Xing Ni:
Auditive learning based Chinese F0 prediction. ICME 2003: 213-216 - [c10]Jianhua Tao:
Emotion control of Chinese speech synthesis in natural environment. INTERSPEECH 2003: 2349-2352 - 2002
- [c9]Sheng Zhao, Jianhua Tao, Lianhong Cai:
Learning Rules for Chinese Prosodic Phrase Prediction. SIGHAN@COLING 2002 - [c8]Dan-Ning Jiang, Lie Lu, Hong-Jiang Zhang, Jianhua Tao, Lian-Hong Cai:
Music type classification by spectral contrast feature. ICME (1) 2002: 113-116 - [c7]Jianhua Tao, Lianhong Cai:
Clustering and feature learning based F0 prediction for Chinese speech synthesis. INTERSPEECH 2002: 2097-2100 - [c6]Sheng Zhao, Jianhua Tao, Lianhong Cai:
Prosodic phrasing with inductive learning. INTERSPEECH 2002: 2417-2420 - [c5]Jianhua Tao, Sheng Zhao, Lian-Hong Cai:
Automatic stress prediction of Chinese speech synthesis. ISCSLP 2002 - [c4]Dan-Ning Jiang, Jianhua Tao, Lian-Hong Cai:
Voice quality analysis under the pitch effect. ISCSLP 2002 - 2000
- [c3]Achim F. Müller, Jianhua Tao, Rüdiger Hoffmann:
Data-driven importance analysis of linguistic and phonetic information. INTERSPEECH 2000: 66-69 - [c2]Achim F. Müller, Jianhua Tao, Rüdiger Hoffmann:
Data-driven importance analysis of linguistic and phonetic information. INTERSPEECH 2000: 75-78
1990 – 1999
- 1998
- [c1]Jianhua Tao, Lian-Hong Cai, Yu-Zuo Zhong:
The Statistical Model of Chinese Word Contours Based on Fuzzy. ISCSLP 1998
last updated on 2025-04-20 23:54 CEST by the dblp team
all metadata released as open data under CC0 1.0 license