


Jialong Zuo
2020 – today
- 2025
[j1]Jiahao Hong, Jialong Zuo, Chuchu Han, Ruochen Zheng, Ming Tian, Changxin Gao, Nong Sang:
Spatial cascaded clustering and weighted memory for unsupervised person re-identification. Image Vis. Comput. 156: 105478 (2025)
[c23]Jialong Zuo, Ying Nie, Tianyu Guo, Huaxin Zhang, Jiahao Hong, Nong Sang, Changxin Gao, Kai Han:
L-Man: A Large Multi-modal Model Unifying Human-centric Tasks. AAAI 2025: 11095-11103
[c22]Shengpeng Ji, Ziyue Jiang, Jialong Zuo, Minghui Fang, Yifu Chen, Tao Jin, Zhou Zhao:
Speech Watermarking with Discrete Intermediate Representations. AAAI 2025: 24239-24247
[c21]Shengpeng Ji, Qian Chen, Wen Wang, Jialong Zuo, Minghui Fang, Ziyue Jiang, Hai Huang, Zehan Wang, Xize Cheng, Siqi Zheng, Zhou Zhao:
ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control. ACL (1) 2025: 6966-6981
[c20]Shengpeng Ji, Minghui Fang, Jialong Zuo, Ziyue Jiang, Dingdong Wang, Hanting Wang, Hai Huang, Zhou Zhao:
Language-Codec: Bridging Discrete Codec Representations and Speech Language Models. ACL (1) 2025: 13332-13345
[c19]Minghui Fang, Shengpeng Ji, Jialong Zuo, Hai Huang, Yan Xia, Jieming Zhu, Xize Cheng, Xiaoda Yang, Wenrui Liu, Gang Wang, Zhenhua Dong, Zhou Zhao:
CART: A Generative Cross-Modal Retrieval Framework With Coarse-To-Fine Semantic Modeling. ACL (1) 2025: 15120-15133
[c18]Jialong Zuo, Shengpeng Ji, Minghui Fang, Mingze Li, Ziyue Jiang, Xize Cheng, Xiaoda Yang, Feiyang Chen, Xinyu Duan, Zhou Zhao:
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching. ACL (1) 2025: 16203-16217
[c17]Wenrui Liu, Jionghao Bai, Xize Cheng, Jialong Zuo, Ziyue Jiang, Shengpeng Ji, Minghui Fang, Xiaoda Yang, Qian Yang, Zhou Zhao:
VoxpopuliTTS: a large-scale multilingual TTS corpus for zero-shot speech generation. COLING 2025: 10293-10297
[c16]Huaxin Zhang, Xiaohao Xu, Xiang Wang, Jialong Zuo, Xiaonan Huang, Changxin Gao, Shanjun Zhang, Li Yu, Nong Sang:
Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity. CVPR 2025: 13843-13853
[c15]Jialong Zuo, Shengpeng Ji, Minghui Fang, Ziyue Jiang, Xize Cheng, Qian Yang, Wenrui Liu, Guangyan Zhang, Zehai Tu, Yiwen Guo, Zhou Zhao:
Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model. ICASSP 2025: 1-5
[c14]Xize Cheng, Siqi Zheng, Zehan Wang, Minghui Fang, Ziang Zhang, Rongjie Huang, Shengpeng Ji, Jialong Zuo, Tao Jin, Zhou Zhao:
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup. ICLR 2025
[c13]Shengpeng Ji, Ziyue Jiang, Wen Wang, Yifu Chen, Minghui Fang, Jialong Zuo, Qian Yang, Xize Cheng, Zehan Wang, Ruiqi Li, Ziang Zhang, Xiaoda Yang, Rongjie Huang, Yidi Jiang, Qian Chen, Siqi Zheng, Zhou Zhao:
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling. ICLR 2025
[c12]Minghui Fang, Shengpeng Ji, Jialong Zuo, Xize Cheng, Wenrui Liu, Xiaoda Yang, Ruofan Hu, Jieming Zhu, Zhou Zhao:
GTA: Towards Generative Text-To-Audio Retrieval via Multi-Scale Tokenizer. INTERSPEECH 2025
[c11]Kaixuan Luan, Xiaoda Yang, Shile Cai, Ruofan Hu, Minghui Fang, Wenrui Liu, Jialong Zuo, Jiaqi Duan, Yuhang Ma, Junyu Lu:
MelRe: Vision-Based Mel-Spectrogram Restoration. INTERSPEECH 2025
[c10]Wenrui Liu, Qian Chen, Wen Wang, Guanrou Yang, Weiqin Li, Minghui Fang, Jialong Zuo, Xiaoda Yang, Tao Jin, Jin Xu, Zemin Liu, Yafeng Chen, Jionghao Bai, Zhifang Guo:
Speech Token Prediction via Compressed-to-fine Language Modeling for Speech Generation. ACM Multimedia 2025: 10632-10641
[i28]Jialong Zuo, Shengpeng Ji, Minghui Fang, Ziyue Jiang, Xize Cheng, Qian Yang, Wenrui Liu, Guangyan Zhang, Zehai Tu, Yiwen Guo, Zhou Zhao:
Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model. CoRR abs/2502.05471 (2025)
[i27]Ziyue Jiang, Yi Ren, Ruiqi Li, Shengpeng Ji, Zhenhui Ye, Chen Zhang, Jionghao Bai, Xiaoda Yang, Jialong Zuo, Yu Zhang, Rui Liu, Xiang Yin, Zhou Zhao:
Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis. CoRR abs/2502.18924 (2025)
[i26]Shengpeng Ji, Tianle Liang, Yangzhuo Li, Jialong Zuo, Minghui Fang, Jinzheng He, Yifu Chen, Zhengqing Liu, Ziyue Jiang, Xize Cheng, Siqi Zheng, Jin Xu, Junyang Lin, Zhou Zhao:
WavReward: Spoken Dialogue Models With Generalist Reward Evaluators. CoRR abs/2505.09558 (2025)
[i25]Jialong Zuo, Shengpeng Ji, Minghui Fang, Mingze Li, Ziyue Jiang, Xize Cheng, Xiaoda Yang, Feiyang Chen, Xinyu Duan, Zhou Zhao:
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching. CoRR abs/2506.01014 (2025)
[i24]Jialong Zuo, Yongtai Deng, Mengdan Tan, Rui Jin, Dongyue Wu, Nong Sang, Liang Pan, Changxin Gao:
ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model. CoRR abs/2506.09385 (2025)
[i23]Dongyue Wu, Zilin Guo, Jialong Zuo, Nong Sang, Changxin Gao:
Partial Forward Blocking: A Novel Data Pruning Paradigm for Lossless Training Acceleration. CoRR abs/2506.23674 (2025)
[i22]Jialong Zuo, Guangyan Zhang, Minghui Fang, Shengpeng Ji, Xiaoqi Jiao, Jingyu Li, Yiwen Guo, Zhou Zhao:
Entropy-based Coarse and Compressed Semantic Speech Representation Learning. CoRR abs/2509.00503 (2025)
[i21]Jialong Zuo, Yongtai Deng, Lingdong Kong, Jingkang Yang, Rui Jin, Yiwei Zhang, Nong Sang, Liang Pan, Ziwei Liu, Changxin Gao:
VideoLucy: Deep Memory Backtracking for Long Video Understanding. CoRR abs/2510.12422 (2025)
[i20]Wenti Yin, Huaxin Zhang, Xiang Wang, Yuqing Lu, Yicheng Zhang, Bingquan Gong, Jialong Zuo, Li Yu, Changxin Gao, Nong Sang:
Learning to Tell Apart: Weakly Supervised Video Anomaly Detection via Disentangled Semantic Alignment. CoRR abs/2511.10334 (2025)
[i19]Ao Liang, Lingdong Kong, Tianyi Yan, Hongsi Liu, Wesley Yang, Ziqi Huang, Wei Yin, Jialong Zuo, Yixuan Hu, Dekai Zhu, Dongyue Lu, Youquan Liu, Guangfeng Jiang, Linfeng Li, Xiangtai Li, Long Zhuo, Lai Xing Ng, Benoit R. Cottereau, Changxin Gao, Liang Pan, Wei Tsang Ooi, Ziwei Liu:
WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World. CoRR abs/2512.10958 (2025)
[i18]Jialong Zuo, Haoyou Deng, Hanyu Zhou, Jiaxin Zhu, Yicheng Zhang, Yiwei Zhang, Yongxin Yan, Kaixing Huang, Weisen Chen, Yongtai Deng, Rui Jin, Nong Sang, Changxin Gao:
Is Nano Banana Pro a Low-Level Vision All-Rounder? A Comprehensive Evaluation on 14 Tasks and 40 Datasets. CoRR abs/2512.15110 (2025)
- 2024
[c9]Shengpeng Ji, Ziyue Jiang, Hanting Wang, Jialong Zuo, Zhou Zhao:
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech. ACL (1) 2024: 13588-13600
[c8]Jialong Zuo, Hanyu Zhou, Ying Nie, Feng Zhang, Tianyu Guo, Nong Sang, Yunhe Wang, Changxin Gao:
UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity. CVPR 2024: 22010-22019
[c7]Xiaoda Yang, Xize Cheng, Jiaqi Duan, Hongshun Qiu, Minjie Hong, Minghui Fang, Shengpeng Ji, Jialong Zuo, Zhiqing Hong, Zhimeng Zhang, Tao Jin:
AudioVSR: Enhancing Video Speech Recognition with Audio Data. EMNLP 2024: 15352-15361
[c6]Shengpeng Ji, Jialong Zuo, Minghui Fang, Ziyue Jiang, Feiyang Chen, Xinyu Duan, Baoxing Huai, Zhou Zhao:
TextrolSpeech: A Text Style Control Speech Corpus with Codec Language Text-to-Speech Models. ICASSP 2024: 10301-10305
[c5]Qian Yang, Jialong Zuo, Zhe Su, Ziyue Jiang, Mingze Li, Zhou Zhao, Feiyang Chen, Zhefeng Wang, Baoxing Huai:
MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis. INTERSPEECH 2024
[c4]Xiaoda Yang, Xize Cheng, Dongjie Fu, Minghui Fang, Jialong Zuo, Shengpeng Ji, Zhou Zhao, Tao Jin:
SyncTalklip: Highly Synchronized Lip-Readable Speaker Generation with Multi-Task Learning. ACM Multimedia 2024: 8149-8158
[c3]Jialong Zuo, Jiahao Hong, Feng Zhang, Changqian Yu, Hanyu Zhou, Changxin Gao, Nong Sang, Jingdong Wang:
PLIP: Language-Image Pre-training for Person Representation Learning. NeurIPS 2024
[c2]Jialong Zuo, Ying Nie, Hanyu Zhou, Huaxin Zhang, Haoyu Wang, Tianyu Guo, Nong Sang, Changxin Gao:
Cross-video Identity Correlating for Person Re-identification Pre-training. NeurIPS 2024
[i17]Shengpeng Ji, Ziyue Jiang, Hanting Wang, Jialong Zuo, Zhou Zhao:
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech. CoRR abs/2402.09378 (2024)
[i16]Shengpeng Ji, Minghui Fang, Ziyue Jiang, Rongjie Huang, Jialong Zuo, Shulei Wang, Zhou Zhao:
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models. CoRR abs/2402.12208 (2024)
[i15]Jiahao Hong, Jialong Zuo, Chuchu Han, Ruochen Zheng, Ming Tian, Changxin Gao, Nong Sang:
Spatial Cascaded Clustering and Weighted Memory for Unsupervised Person Re-identification. CoRR abs/2403.00261 (2024)
[i14]Shengpeng Ji, Jialong Zuo, Minghui Fang, Siqi Zheng, Qian Chen, Wen Wang, Ziyue Jiang, Hai Huang, Xize Cheng, Rongjie Huang, Zhou Zhao:
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec. CoRR abs/2406.01205 (2024)
[i13]Huaxin Zhang, Xiaohao Xu, Xiang Wang, Jialong Zuo, Chuchu Han, Xiaonan Huang, Changxin Gao, Yuehuan Wang, Nong Sang:
Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM. CoRR abs/2406.12235 (2024)
[i12]Minghui Fang, Shengpeng Ji, Jialong Zuo, Hai Huang, Yan Xia, Jieming Zhu, Xize Cheng, Xiaoda Yang, Wenrui Liu, Gang Wang, Zhenhua Dong, Zhou Zhao:
ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling. CoRR abs/2406.17507 (2024)
[i11]Qian Yang, Jialong Zuo, Zhe Su, Ziyue Jiang, Mingze Li, Zhou Zhao, Feiyang Chen, Zhefeng Wang, Baoxing Huai:
MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis. CoRR abs/2407.14006 (2024)
[i10]Shengpeng Ji, Ziyue Jiang, Xize Cheng, Yifu Chen, Minghui Fang, Jialong Zuo, Qian Yang, Ruiqi Li, Ziang Zhang, Xiaoda Yang, Rongjie Huang, Yidi Jiang, Qian Chen, Siqi Zheng, Wen Wang, Zhou Zhao:
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling. CoRR abs/2408.16532 (2024)
[i9]Jialong Zuo, Ying Nie, Hanyu Zhou, Huaxin Zhang, Haoyu Wang, Tianyu Guo, Nong Sang, Changxin Gao:
Cross-video Identity Correlating for Person Re-identification Pre-training. CoRR abs/2409.18569 (2024)
[i8]Xize Cheng, Siqi Zheng, Zehan Wang, Minghui Fang, Ziang Zhang, Rongjie Huang, Ziyang Ma, Shengpeng Ji, Jialong Zuo, Tao Jin, Zhou Zhao:
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup. CoRR abs/2410.21269 (2024)
[i7]Shengpeng Ji, Yifu Chen, Minghui Fang, Jialong Zuo, Jingyu Lu, Hanting Wang, Ziyue Jiang, Long Zhou, Shujie Liu, Xize Cheng, Xiaoda Yang, Zehan Wang, Qian Yang, Jian Li, Yidi Jiang, Jingzhen He, Yunfei Chu, Jin Xu, Zhou Zhao:
WavChat: A Survey of Spoken Dialogue Models. CoRR abs/2411.13577 (2024)
[i6]Huaxin Zhang, Xiaohao Xu, Xiang Wang, Jialong Zuo, Xiaonan Huang, Changxin Gao, Shanjun Zhang, Li Yu, Nong Sang:
Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity. CoRR abs/2412.06171 (2024)
[i5]Shengpeng Ji, Ziyue Jiang, Jialong Zuo, Minghui Fang, Yifu Chen, Tao Jin, Zhou Zhao:
Speech Watermarking with Discrete Intermediate Representations. CoRR abs/2412.13917 (2024)
- 2023
[c1]Ziyue Jiang, Qian Yang, Jialong Zuo, Zhenhui Ye, Rongjie Huang, Yi Ren, Zhou Zhao:
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models. ACL (Findings) 2023: 11655-11671
[i4]Jialong Zuo, Changqian Yu, Nong Sang, Changxin Gao:
PLIP: Language-Image Pre-training for Person Representation Learning. CoRR abs/2305.08386 (2023)
[i3]Ziyue Jiang, Qian Yang, Jialong Zuo, Zhenhui Ye, Rongjie Huang, Yi Ren, Zhou Zhao:
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models. CoRR abs/2305.13612 (2023)
[i2]Shengpeng Ji, Jialong Zuo, Minghui Fang, Ziyue Jiang, Feiyang Chen, Xinyu Duan, Baoxing Huai, Zhou Zhao:
TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models. CoRR abs/2308.14430 (2023)
[i1]Jialong Zuo, Hanyu Zhou, Ying Nie, Feng Zhang, Tianyu Guo, Nong Sang, Yunhe Wang, Changxin Gao:
UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity. CoRR abs/2312.03441 (2023)
last updated on 2026-02-13 00:46 CET by the dblp team
all metadata released as open data under CC0 1.0 license







