Haizhou Li 0001
李海洲
Person information
- unicode name: 李海洲
- affiliation: Chinese University of Hong Kong (Shenzhen), China
- affiliation: National University of Singapore, Department of Electrical and Computer Engineering, Singapore
- affiliation (2006 - 2016): Nanyang Technological University, Singapore
- affiliation (2003 - 2016): Institute for Infocomm Research, A*STAR, Singapore
- affiliation (2011): University of New South Wales, Sydney, Australia
- affiliation (2009): University of Eastern Finland, Kuopio, Finland
- affiliation (PhD 1990): South China University of Technology, Guangzhou, China
Other persons with the same name
- Haizhou Li
- Haizhou Li 0002 — Blaise Pascal University, Clermont-Ferrand, France
- Haizhou Li 0003 — City University of Hong Kong, Department of Computer Science, Hong Kong
2020 – today
- 2024
- [j184]Qianhui Liu, Meng Ge, Haizhou Li:
Intelligent event-based lip reading word classification with spiking neural networks using spatio-temporal attention features and triplet loss. Inf. Sci. 675: 120660 (2024)
- [j183]Jiaqi Yan, Qianhui Liu, Malu Zhang, Lang Feng, De Ma, Haizhou Li, Gang Pan:
Efficient spiking neural network design via neural architecture search. Neural Networks 173: 106172 (2024)
- [j182]Xinyi Chen, Qu Yang, Jibin Wu, Haizhou Li, Kay Chen Tan:
A Hybrid Neural Coding Approach for Pattern Recognition With Spiking Neural Networks. IEEE Trans. Pattern Anal. Mach. Intell. 46(5): 3064-3078 (2024)
- [j181]Shuai Wang, Zhengyang Chen, Bing Han, Hongji Wang, Chengdong Liang, Binbin Zhang, Xu Xiang, Wen Ding, Johan Rohdin, Anna Silnova, Yanmin Qian, Haizhou Li:
Advancing speaker embedding learning: Wespeaker toolkit for research and production. Speech Commun. 162: 103104 (2024)
- [j180]Jingru Lin, Meng Ge, Wupeng Wang, Haizhou Li, Mengling Feng:
Selective HuBERT: Self-Supervised Pre-Training for Target Speaker in Clean and Mixture Speech. IEEE Signal Process. Lett. 31: 1014-1018 (2024)
- [j179]Duo Ma, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li:
Text-Guided HuBERT: Self-Supervised Speech Pre-Training via Generative Adversarial Networks. IEEE Signal Process. Lett. 31: 2055-2059 (2024)
- [j178]Xiaoxue Gao, Zexin Li, Yiming Chen, Cong Liu, Haizhou Li:
Transferable Adversarial Attacks Against ASR. IEEE Signal Process. Lett. 31: 2200-2204 (2024)
- [j177]Qu Yang, Malu Zhang, Jibin Wu, Kay Chen Tan, Haizhou Li:
LC-TTFS: Toward Lossless Network Conversion for Spiking Neural Networks With TTFS Coding. IEEE Trans. Cogn. Dev. Syst. 16(5): 1626-1639 (2024)
- [j176]Siqi Cai, Ran Zhang, Malu Zhang, Jibin Wu, Haizhou Li:
EEG-Based Auditory Attention Detection With Spiking Graph Convolutional Network. IEEE Trans. Cogn. Dev. Syst. 16(5): 1698-1706 (2024)
- [j175]Koichiro Yoshino, Yun-Nung Chen, Paul A. Crook, Satwik Kottur, Jinchao Li, Behnam Hedayatnia, Seungwhan Moon, Zhengcong Fei, Zekang Li, Jinchao Zhang, Yang Feng, Jie Zhou, Seokhwan Kim, Yang Liu, Di Jin, Alexandros Papangelis, Karthik Gopalakrishnan, Dilek Hakkani-Tur, Babak Damavandi, Alborz Geramifard, Chiori Hori, Ankit Shah, Chen Zhang, Haizhou Li, João Sedoc, Luis F. D'Haro, Rafael E. Banchs, Alexander Rudnicky:
Overview of the Tenth Dialog System Technology Challenge: DSTC10. IEEE ACM Trans. Audio Speech Lang. Process. 32: 765-778 (2024)
- [j174]Lei Liu, Li Liu, Haizhou Li:
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1559-1572 (2024)
- [j173]Xuehao Zhou, Mingyang Zhang, Yi Zhou, Zhizheng Wu, Haizhou Li:
Accented Text-to-Speech Synthesis With Limited Data. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1699-1711 (2024)
- [j172]Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li:
Controllable Accented Text-to-Speech Synthesis With Fine and Coarse-Grained Intensity Rendering. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2188-2201 (2024)
- [j171]Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li:
Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2324-2337 (2024)
- [j170]Congcong Sun, Hui Tian, Peng Tian, Haizhou Li, Zhenxing Qian:
Multi-Agent Deep Learning for the Detection of Multiple Speech Steganography Methods. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2957-2972 (2024)
- [j169]Mingyang Zhang, Yi Zhou, Yi Ren, Chen Zhang, Xiang Yin, Haizhou Li:
RefXVC: Cross-Lingual Voice Conversion With Enhanced Reference Leveraging. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4146-4156 (2024)
- [j168]Wupeng Wang, Zexu Pan, Xinke Li, Shuai Wang, Haizhou Li:
Speech Separation With Pretrained Frontend to Minimize Domain Mismatch. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4184-4198 (2024)
- [j167]Zexu Pan, Marvin Borsdorf, Siqi Cai, Tanja Schultz, Haizhou Li:
NeuroHeed: Neuro-Steered Speaker Extraction Using EEG Signals. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4456-4470 (2024)
- [j166]Yicheng Gu, Xueyao Zhang, Liumeng Xue, Haizhou Li, Zhizheng Wu:
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoders. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4569-4579 (2024)
- [j165]Siqi Cai, Tanja Schultz, Haizhou Li:
Brain Topology Modeling With EEG-Graphs for Auditory Spatial Attention Detection. IEEE Trans. Biomed. Eng. 71(1): 171-182 (2024)
- [j164]Miao Liu, Jing Wang, Xinyuan Qian, Haizhou Li:
Audio-Visual Temporal Forgery Detection Using Embedding-Level Fusion and Multi-Dimensional Contrastive Loss. IEEE Trans. Circuits Syst. Video Technol. 34(8): 6937-6948 (2024)
- [j163]Zhenyu Weng, Huiping Zhuang, Fulin Luo, Haizhou Li, Zhiping Lin:
Few-Shot Contrastive Transfer Learning With Pretrained Model for Masked Face Verification. IEEE Trans. Multim. 26: 3871-3883 (2024)
- [j162]Xinyuan Qian, Wei Xue, Qiquan Zhang, Ruijie Tao, Haizhou Li:
Deep Cross-Modal Retrieval Between Spatial Image and Acoustic Speech. IEEE Trans. Multim. 26: 4480-4489 (2024)
- [j161]Ruihang Ji, Shuzhi Sam Ge, Kai Zhao, Haizhou Li:
Event-Triggered Tracking Control for Nonlinear Systems With Prescribed Performance. IEEE Trans. Syst. Man Cybern. Syst. 54(6): 3547-3557 (2024)
- [c722]Shimin Zhang, Qu Yang, Chenxiang Ma, Jibin Wu, Haizhou Li, Kay Chen Tan:
TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling. AAAI 2024: 16838-16847
- [c721]Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li:
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling. AAAI 2024: 18698-18706
- [c720]Jiadong Wang, Zexu Pan, Malu Zhang, Robby T. Tan, Haizhou Li:
Restoring Speaking Lips from Occlusion for Audio-Visual Speech Recognition. AAAI 2024: 19144-19152
- [c719]Chen Zhang, Luis Fernando D'Haro, Yiming Chen, Malu Zhang, Haizhou Li:
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators. AAAI 2024: 19515-19524
- [c718]Yiming Chen, Chen Zhang, Danqing Luo, Luis Fernando D'Haro, Robby T. Tan, Haizhou Li:
Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models. ACL (Findings) 2024: 1359-1375
- [c717]Feng Jiang, Weihao Liu, Xiaomin Chu, Peifeng Li, Qiaoming Zhu, Haizhou Li:
Advancing Topic Segmentation and Outline Generation in Chinese Texts: The Paragraph-level Topic Representation, Corpus, and Benchmark. LREC/COLING 2024: 495-506
- [c716]Danqing Luo, Chen Zhang, Yan Zhang, Haizhou Li:
CrossTune: Black-Box Few-Shot Classification with Label Enhancement. LREC/COLING 2024: 4185-4197
- [c715]Yaxin Fan, Feng Jiang, Peifeng Li, Haizhou Li:
Uncovering the Potential of ChatGPT for Discourse Analysis in Dialogue: An Empirical Study. LREC/COLING 2024: 16998-17010
- [c714]Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li:
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models. EMNLP (Findings) 2024: 8926-8946
- [c713]Yiming Chen, Xianghu Yue, Xiaoxue Gao, Chen Zhang, Luis Fernando D'Haro, Robby T. Tan, Haizhou Li:
Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models. EMNLP (Findings) 2024: 10917-10930
- [c712]Jiabao Pan, Yan Zhang, Chen Zhang, Zuozhu Liu, Hongwei Wang, Haizhou Li:
DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models. EMNLP 2024: 14686-14695
- [c711]Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li:
SVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks. ICASSP 2024: 221-225
- [c710]Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li:
Spiking-Leaf: A Learnable Auditory Front-End for Spiking Neural Networks. ICASSP 2024: 226-230
- [c709]Qiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li:
An Empirical Study on the Impact of Positional Encoding in Transformer-Based Monaural Speech Enhancement. ICASSP 2024: 1001-1005
- [c708]Siqi Cai, Ran Zhang, Haizhou Li:
Robust Decoding of the Auditory Attention from EEG Recordings Through Graph Convolutional Networks. ICASSP 2024: 2320-2324
- [c707]Yu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li:
LOCSELECT: Target Speaker Localization with an Auditory Selective Hearing Mechanism. ICASSP 2024: 8696-8700
- [c706]Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis. ICASSP 2024: 10601-10605
- [c705]Junjie Li, Ruijie Tao, Zexu Pan, Meng Ge, Shuai Wang, Haizhou Li:
Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-Talker Speech. ICASSP 2024: 10666-10670
- [c704]Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li:
Leveraging in-the-wild Data for Effective Self-supervised Pretraining in Speaker Recognition. ICASSP 2024: 10901-10905
- [c703]Yidi Jiang, Zhengyang Chen, Ruijie Tao, Liqun Deng, Yanmin Qian, Haizhou Li:
Prompt-Driven Target Speech Diarization. ICASSP 2024: 11086-11090
- [c702]Yi Ma, Kong Aik Lee, Ville Hautamäki, Meng Ge, Haizhou Li:
Gradient Weighting for Speaker Verification in Extremely Low Signal-to-Noise Ratio. ICASSP 2024: 11311-11315
- [c701]Qianhui Liu, Jiaqi Yan, Malu Zhang, Gang Pan, Haizhou Li:
LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization. IJCAI 2024: 3097-3105
- [c700]Yang Wang, Haiyang Mei, Qirui Bao, Ziqi Wei, Mike Zheng Shou, Haizhou Li, Bo Dong, Xin Yang:
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition. IJCAI 2024: 3160-3168
- [c699]Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng:
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy. IJCNN 2024: 1-8
- [c698]Xianghu Yue, Xueyi Zhang, Yiming Chen, Chengwei Zhang, Mingrui Lao, Huiping Zhuang, Xinyuan Qian, Haizhou Li:
MMAL: Multi-Modal Analytic Learning for Exemplar-Free Audio-Visual Class Incremental Tasks. ACM Multimedia 2024: 2428-2437
- [c697]Weizhi Liu, Yue Li, Dongdong Lin, Hui Tian, Haizhou Li:
GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis. ACM Multimedia 2024: 3294-3302
- [c696]Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li:
Generative Expressive Conversational Speech Synthesis. ACM Multimedia 2024: 4187-4196
- [c695]Miao Liu, Jing Wang, Xinyuan Qian, Haizhou Li:
ListenFormer: Responsive Listening Head Generation with Non-autoregressive Transformers. ACM Multimedia 2024: 7094-7103
- [c694]Ruijie Tao, Zhan Shi, Yidi Jiang, Duc-Tuan Truong, Eng Siong Chng, Massimo Alioto, Haizhou Li:
Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization. ACM Multimedia 2024: 11342-11347
- [c693]Chuang Li, Yan Zhang, Min-Yen Kan, Haizhou Li:
UNO-DST: Leveraging Unlabelled Data in Zero-Shot Dialogue State Tracking. NAACL-HLT (Findings) 2024: 2972-2983
- [c692]Xidong Wang, Guiming Chen, Dingjie Song, Zhiyi Zhang, Zhihong Chen, Qingying Xiao, Junying Chen, Feng Jiang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li:
CMB: A Comprehensive Medical Benchmark in Chinese. NAACL-HLT 2024: 6184-6205
- [c691]Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Mosen Alharthi, Bang An, Juncai He, Ziche Liu, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu:
AceGPT, Localizing Large Language Models in Arabic. NAACL-HLT 2024: 8139-8163
- [c690]Kun Zhou, Berrak Sisman, Carlos Busso, Bin Ma, Haizhou Li:
Mixed-EVC: Mixed Emotion Synthesis and Control in Voice Conversion. Odyssey 2024: 180-186
- [c689]Hongli Yang, Xinyi Chen, Junjie Li, Hao Huang, Siqi Cai, Haizhou Li:
Listen to the Speaker in Your Gaze. CIS-RAM 2024: 380-385
- [c688]Ganjun Liu, Xiaohui Hou, Meng Ge, Tao Zhang, Haizhou Li:
A Non-Intrusive Approach to Assessing Dysarthria Severity: Advancing Clinical Diagnosis. WWW (Companion Volume) 2024: 1134-1137
- [i217]Yi Ma, Kong Aik Lee, Ville Hautamäki, Meng Ge, Haizhou Li:
Gradient weighting for speaker verification in extremely low Signal-to-Noise Ratio. CoRR abs/2401.02626 (2024)
- [i216]Feng Jiang, Kuang Wang, Haizhou Li:
Bridging Research and Readers: A Multi-Modal Automated Academic Papers Interpretation System. CoRR abs/2401.09150 (2024)
- [i215]Qiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li:
An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement. CoRR abs/2401.09686 (2024)
- [i214]Xianghu Yue, Xiaohai Tian, Malu Zhang, Zhizheng Wu, Haizhou Li:
CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing. CoRR abs/2401.12264 (2024)
- [i213]Qianhui Liu, Jiaqi Yan, Malu Zhang, Gang Pan, Haizhou Li:
LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization. CoRR abs/2401.14652 (2024)
- [i212]Lei Liu, Li Liu, Haizhou Li:
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition. CoRR abs/2401.17604 (2024)
- [i211]Wenjie Wei, Malu Zhang, Jilin Zhang, Ammar Belatreche, Jibin Wu, Zijing Xu, Xuerui Qiu, Hong Chen, Yang Yang, Haizhou Li:
Event-Driven Learning for Spiking Neural Networks. CoRR abs/2403.00270 (2024)
- [i210]Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Fine-Grained Quantitative Emotion Editing for Speech Generation. CoRR abs/2403.02002 (2024)
- [i209]Xidong Wang, Nuo Chen, Junyin Chen, Yan Hu, Yidong Wang, Xiangbo Wu, Anningzhe Gao, Xiang Wan, Haizhou Li, Benyou Wang:
Apollo: An Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People. CoRR abs/2403.03640 (2024)
- [i208]Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li:
sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks. CoRR abs/2403.05772 (2024)
- [i207]Danqing Luo, Chen Zhang, Yan Zhang, Haizhou Li:
CrossTune: Black-Box Few-Shot Classification with Label Enhancement. CoRR abs/2403.12468 (2024)
- [i206]Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng:
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy. CoRR abs/2403.16078 (2024)
- [i205]Yicheng Gu, Xueyao Zhang, Liumeng Xue, Haizhou Li, Zhizheng Wu:
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder. CoRR abs/2404.17161 (2024)
- [i204]Ruijie Tao, Xinyuan Qian, Yidi Jiang, Junjie Li, Jiadong Wang, Haizhou Li:
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention. CoRR abs/2404.18501 (2024)
- [i203]Chuang Li, Yang Deng, Hengchang Hu, Min-Yen Kan, Haizhou Li:
Incorporating External Knowledge and Goal Guidance for LLM-based Conversational Recommender Systems. CoRR abs/2405.01868 (2024)
- [i202]Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis. CoRR abs/2405.09171 (2024)
- [i201]Xiangyu Zhang, Qiquan Zhang, Hexin Liu, Tianyi Xiao, Xinyuan Qian, Beena Ahmed, Eliathamby Ambikairajah, Haizhou Li, Julien Epps:
Mamba in Speech: Towards an Alternative to Self-Attention. CoRR abs/2405.12609 (2024)
- [i200]Yiming Chen, Chen Zhang, Danqing Luo, Luis Fernando D'Haro, Robby T. Tan, Haizhou Li:
Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models. CoRR abs/2405.14646 (2024)
- [i199]Jiahui Xu, Feng Jiang, Anningzhe Gao, Haizhou Li:
Unsupervised Mutual Learning of Dialogue Discourse Parsing and Topic Segmentation. CoRR abs/2405.19799 (2024)
- [i198]Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li:
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models. CoRR abs/2405.20215 (2024)
- [i197]Tianchi Liu, Lin Zhang, Rohan Kumar Das, Yi Ma, Ruijie Tao, Haizhou Li:
How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio? CoRR abs/2406.02483 (2024)
- [i196]Zhijun Liu, Shuai Wang, Sho Inoue, Qibing Bai, Haizhou Li:
Autoregressive Diffusion Transformer for Text-to-Speech Synthesis. CoRR abs/2406.05551 (2024)
- [i195]Yidi Jiang, Ruijie Tao, Zhengyang Chen, Yanmin Qian, Haizhou Li:
Target Speech Diarization with Multimodal Prompts. CoRR abs/2406.07198 (2024)
- [i194]Xuehao Zhou, Mingyang Zhang, Yi Zhou, Zhiwu Li, Haizhou Li:
Multi-Scale Accent Modeling with Disentangling for Multi-Speaker Multi-Accent TTS Synthesis. CoRR abs/2406.10844 (2024)
- [i193]Zeyang Song, Qianhui Liu, Qu Yang, Yizhou Peng, Haizhou Li:
ED-sKWS: Early-Decision Spiking Neural Networks for Rapid,and Energy-Efficient Keyword Spotting. CoRR abs/2406.12726 (2024)
- [i192]Junyi Ao, Yuancheng Wang, Xiaohai Tian, Dekun Chen, Jun Zhang, Lu Lu, Yuxuan Wang, Haizhou Li, Zhizheng Wu:
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words. CoRR abs/2406.13340 (2024)
- [i191]Ziche Liu, Rui Ke, Feng Jiang, Haizhou Li:
Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models. CoRR abs/2406.14115 (2024)
- [i190]Jiabao Pan, Yan Zhang, Chen Zhang, Zuozhu Liu, Hongwei Wang, Haizhou Li:
DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models. CoRR abs/2407.01009 (2024)
- [i189]Rui Liu, Haolin Zuo, Zheng Lian, Xiaofen Xing, Björn W. Schuller, Haizhou Li:
Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset. CoRR abs/2407.02751 (2024)
- [i188]Yang Wang, Haiyang Mei, Qirui Bao, Ziqi Wei, Mike Zheng Shou, Haizhou Li, Bo Dong, Xin Yang:
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition. CoRR abs/2407.09521 (2024)
- [i187]Weizhi Liu, Yue Li, Dongdong Lin, Hui Tian, Haizhou Li:
GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis. CoRR abs/2407.10471 (2024)
- [i186]Shuai Wang, Zhengyang Chen, Kong Aik Lee, Yanmin Qian, Haizhou Li:
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning. CoRR abs/2407.15188 (2024)
- [i185]Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li:
Generative Expressive Conversational Speech Synthesis. CoRR abs/2407.21491 (2024)
- [i184]Qianhui Liu, Jiadong Wang, Yang Wang, Xin Yang, Gang Pan, Haizhou Li:
Human-Inspired Audio-Visual Speech Recognition: Spike Activity, Cueing Interaction and Causal Processing. CoRR abs/2408.16564 (2024)
- [i183]Dashanka De Silva, Siqi Cai, Saurav Pahuja, Tanja Schultz, Haizhou Li:
NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention. CoRR abs/2409.02489 (2024)
- [i182]Xinyuan Qian, Xianghu Yue, Jiadong Wang, Huiping Zhuang, Haizhou Li:
Analytic Class Incremental Learning for Sound Source Localization with Privacy Protection. CoRR abs/2409.07224 (2024)
- [i181]Zhijun Liu, Shuai Wang, Pengcheng Zhu, Mengxiao Bi, Haizhou Li:
E1 TTS: Simple and Fast Non-Autoregressive TTS. CoRR abs/2409.09351 (2024)
- [i180]Sho Inoue, Shuai Wang, Wanxing Wang, Pengcheng Zhu, Mengxiao Bi, Haizhou Li:
MacST: Multi-Accent Speech Synthesis via Text Transliteration for Accent Conversion. CoRR abs/2409.09352 (2024)
- [i179]Junjie Li, Ke Zhang, Shuai Wang, Haizhou Li, Man-Wai Mak, Kong Aik Lee:
On the effectiveness of enrollment speech augmentation for Target Speaker Extraction. CoRR abs/2409.09589 (2024)
- [i178]Chen Zhang, Dading Chong, Feng Jiang, Chengguang Tang, Anningzhe Gao, Guohua Tang, Haizhou Li:
Aligning Language Models Using Follow-up Likelihood as Reward Signal. CoRR abs/2409.13948 (2024)
- [i177]Shuai Wang, Pengcheng Zhu, Haizhou Li:
M-Vec: Matryoshka Speaker Embeddings with Flexible Dimensions. CoRR abs/2409.15782 (2024)
- [i176]Shuai Wang, Ke Zhang, Shaoxiong Lin, Junjie Li, Xuefei Wang, Meng Ge, Jianwei Yu, Yanmin Qian, Haizhou Li:
WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction. CoRR abs/2409.15799 (2024)
- [i175]Yiming Chen, Xianghu Yue, Xiaoxue Gao, Chen Zhang, Luis Fernando D'Haro, Robby T. Tan, Haizhou Li:
Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models. CoRR abs/2409.18680 (2024)
- [i174]Rui Liu, Jiatian Xi, Ziyue Jiang, Haizhou Li:
FluentEditor+: Text-based Speech Editing by Modeling Local Hierarchical Acoustic Smoothness and Global Prosody Consistency. CoRR abs/2410.03719 (2024)
- [i173]Rui Liu, Zhenqi Jia, Jie Yang, Yifan Hu, Haizhou Li:
Emphasis Rendering for Conversational Text-to-Speech with Multi-modal Multi-scale Context Modeling. CoRR abs/2410.09524 (2024)
- [i172]Fan Bu, Yuhao Zhang, Xidong Wang, Benyou Wang, Qun Liu, Haizhou Li:
Roadmap towards Superhuman Speech Understanding using Large Language Models. CoRR abs/2410.13268 (2024)
- [i171]Zihao Cheng, Li Zhou, Feng Jiang, Benyou Wang, Haizhou Li:
Beyond Binary: Towards Fine-Grained LLM-Generated Text Detection via Role Recognition and Involvement Measurement. CoRR abs/2410.14259 (2024)
- [i170]Ke Zhang, Junjie Li, Shuai Wang, Yangjie Wei, Yi Wang, Yannan Wang, Haizhou Li:
Multi-Level Speaker Representation for Target Speaker Extraction. CoRR abs/2410.16059 (2024)
- [i169]Yiming Chen, Xianghu Yue, Chen Zhang, Xiaoxue Gao, Robby T. Tan, Haizhou Li:
VoiceBench: Benchmarking LLM-Based Voice Assistants. CoRR abs/2410.17196 (2024)
- 2023
- [j160]Tao Luo, Weng-Fai Wong, Rick Siow Mong Goh, Anh Tuan Do, Zhixian Chen, Haizhou Li, Wenyu Jiang, Weiyun Yau:
Achieving Green AI with Energy-Efficient Deep Learning Using Neuromorphic Computing. Commun. ACM 66(7): 52-57 (2023)
- [j159]Buddhi Wickramasinghe, Eliathamby Ambikairajah, Vidhyasaharan Sethu, Julien Epps, Haizhou Li, Ting Dang:
DNN controlled adaptive front-end for replay attack detection systems. Speech Commun. 154: 102973 (2023)
- [j158]Tingting Wang, Zexu Pan, Meng Ge, Zhen Yang, Haizhou Li:
Time-Domain Speech Separation Networks With Graph Encoding Auxiliary. IEEE Signal Process. Lett. 30: 110-114 (2023)
- [j157]Yi Zhou, Zhizheng Wu, Mingyang Zhang, Xiaohai Tian, Haizhou Li:
TTS-Guided Training for Accent Conversion Without Parallel Data. IEEE Signal Process. Lett. 30: 533-537 (2023)
- [j156]