Guangzhi Sun
2020 – today
- 2024
- [i28] Nineli Lashkarashvili, Wen Wu, Guangzhi Sun, Philip C. Woodland:
Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation. CoRR abs/2402.11747 (2024)
- [i27] Xiaoliang Luo, Akilles Rechardt, Guangzhi Sun, Kevin K. Nejad, Felipe Yáñez, Bati Yilmaz, Kangjoo Lee, Alexandra O. Cohen, Valentina Borghesani, Anton Pashkov, Daniele Marinazzo, Jonathan Nicholas, Alessandro Salatiello, Ilia Sucholutsky, Pasquale Minervini, Sepehr Razavi, Roberta Rocca, Elkhan Yusifov, Tereza Okalova, Nianlong Gu, Martin Ferianc, Mikail Khona, Kaustubh R. Patil, Pui-Shee Lee, Rui Mata, Nicholas E. Myers, Jennifer K. Bizley, Sebastian Musslick, Isil Poyraz Bilgin, Guiomar Niso, Justin M. Ales, Michael Gaebler, N. Apurva Ratan Murty, Leyla Loued-Khenissi, Anna Behler, Chloe M. Hall, Jessica Dafflon, Sherry Dongqi Bao, Bradley C. Love:
Large language models surpass human experts in predicting neuroscience results. CoRR abs/2403.03230 (2024)
- [i26] Zhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang:
M3AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset. CoRR abs/2403.14168 (2024)
- 2023
- [j2] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Minimising Biasing Word Errors for Contextual ASR With the Tree-Constrained Pointer Generator. IEEE ACM Trans. Audio Speech Lang. Process. 31: 345-354 (2023)
- [c11] Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao:
TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for Pytorch. ASRU 2023: 1-9
- [c10] Evonne P. C. Lee, Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Spectral Clustering-Aware Learning of Embeddings for Speaker Diarisation. ICASSP 2023: 1-5
- [c9] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
End-to-End Spoken Language Understanding with Tree-Constrained Pointer Generator. ICASSP 2023: 1-5
- [i25] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator. CoRR abs/2305.18824 (2023)
- [i24] Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland:
Can Contextual Biasing Remain Effective with Whisper and GPT-2? CoRR abs/2306.01942 (2023)
- [i23] Guangzhi Sun, Chao Zhang, Ivan Vulic, Pawel Budzianowski, Philip C. Woodland:
Knowledge-Aware Audio-Grounded Generative Slot Filling for Limited Annotated Data. CoRR abs/2307.01764 (2023)
- [i22] Yang Li, Cheng Yu, Guangzhi Sun, Weiqin Zu, Zheng Tian, Ying Wen, Wei Pan, Chao Zhang, Jun Wang, Yang Yang, Fanglei Sun:
Cross-Utterance Conditioned VAE for Speech Generation. CoRR abs/2309.04156 (2023)
- [i21] Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng:
Enhancing Quantised End-to-End ASR Models via Personalisation. CoRR abs/2309.09136 (2023)
- [i20] Shutong Feng, Guangzhi Sun, Nurul Lubis, Chao Zhang, Milica Gasic:
Affect Recognition in Conversations Using Large Language Models. CoRR abs/2309.12881 (2023)
- [i19] Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Connecting Speech Encoder and Large Language Model for ASR. CoRR abs/2309.13963 (2023)
- [i18] Theodor Nguyen, Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland:
Conditional Diffusion Model for Target Speaker Extraction. CoRR abs/2310.04791 (2023)
- [i17] Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models. CoRR abs/2310.05863 (2023)
- [i16] Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
SALMONN: Towards Generic Hearing Abilities for Large Language Models. CoRR abs/2310.13289 (2023)
- [i15] Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao, Robin Scheibler, Samuele Cornell, Sean Kim, Stavros Petridis:
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch. CoRR abs/2310.17864 (2023)
- [i14] Guangzhi Sun, Shutong Feng, Dongcheng Jiang, Chao Zhang, Milica Gasic, Philip C. Woodland:
Speech-based Slot Filling using Large Language Models. CoRR abs/2311.07418 (2023)
- 2022
- [c8] Yang Li, Cheng Yu, Guangzhi Sun, Hua Jiang, Fanglei Sun, Weiqin Zu, Ying Wen, Yang Yang, Jun Wang:
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech. ACL (1) 2022: 391-400
- [c7] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition. INTERSPEECH 2022: 2043-2047
- [i13] Yang Li, Cheng Yu, Guangzhi Sun, Hua Jiang, Fanglei Sun, Weiqin Zu, Ying Wen, Yang Yang, Jun Wang:
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech. CoRR abs/2205.04120 (2022)
- [i12] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator. CoRR abs/2205.09058 (2022)
- [i11] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition. CoRR abs/2207.00857 (2022)
- [i10] Evonne P. C. Lee, Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation. CoRR abs/2210.13576 (2022)
- [i9] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
End-to-end Spoken Language Understanding with Tree-constrained Pointer Generator. CoRR abs/2210.16554 (2022)
- 2021
- [j1] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Combination of deep speaker embeddings for diarisation. Neural Networks 141: 372-384 (2021)
- [c6] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Tree-Constrained Pointer Generator for End-to-End Contextual Speech Recognition. ASRU 2021: 780-787
- [c5] Guangzhi Sun, D. Liu, Chao Zhang, Philip C. Woodland:
Content-Aware Speaker Embeddings for Speaker Diarisation. ICASSP 2021: 7168-7172
- [c4] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Transformer Language Models with LSTM-Based Cross-Utterance Information Representation. ICASSP 2021: 7363-7367
- [i8] Guangzhi Sun, D. Liu, Chao Zhang, Philip C. Woodland:
Content-Aware Speaker Embeddings for Speaker Diarisation. CoRR abs/2102.06467 (2021)
- [i7] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Transformer Language Models with LSTM-based Cross-utterance Information Representation. CoRR abs/2102.06474 (2021)
- [i6] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Tree-constrained Pointer Generator for End-to-end Contextual Speech Recognition. CoRR abs/2109.00627 (2021)
- 2020
- [c3] Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu:
Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis. ICASSP 2020: 6264-6268
- [c2] Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior. ICASSP 2020: 6699-6703
- [i5] Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu:
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis. CoRR abs/2002.03785 (2020)
- [i4] Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior. CoRR abs/2002.03788 (2020)
- [i3] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Cross-Utterance Language Models with Acoustic Error Sampling. CoRR abs/2009.01008 (2020)
- [i2] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Combination of Deep Speaker Embeddings for Diarisation. CoRR abs/2010.12025 (2020)
2010 – 2019
- 2019
- [c1] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Speaker Diarisation Using 2D Self-attentive Combination of Embeddings. ICASSP 2019: 5801-5805
- [i1] Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Speaker diarisation using 2D self-attentive combination of embeddings. CoRR abs/1902.03190 (2019)
last updated on 2024-04-11 20:12 CEST by the dblp team
all metadata released as open data under CC0 1.0 license