


Kaihang Pan
2020 – today
- 2025
[j2]Dong Chen, Kaihang Pan, Guangyu Dai, Guoming Wang, Yueting Zhuang, Siliang Tang, Mingliang Xu:
Improving Vision Anomaly Detection With the Guidance of Language Modality. IEEE Trans. Multim. 27: 1410-1419 (2025)
[c11]Haiyi Qiu, Minghe Gao, Long Qian, Kaihang Pan, Qifan Yu, Juncheng Li, Wenjie Wang, Siliang Tang, Yueting Zhuang, Tat-Seng Chua:
STEP: Enhancing Video-LLMs' Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training. CVPR 2025: 3284-3294
[c10]Qifan Yu, Wei Chow, Zhongqi Yue, Kaihang Pan, Yang Wu, Xiaoyang Wan, Juncheng Li, Siliang Tang, Hanwang Zhang, Yueting Zhuang:
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea. CVPR 2025: 26125-26135
[c9]Kaihang Pan, Wang Lin, Zhongqi Yue, Tenglong Ao, Liyu Jia, Wei Zhao, Juncheng Li, Siliang Tang, Hanwang Zhang:
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens. CVPR 2025: 26136-26146
[c8]Hao Fei, Yuan Zhou, Juncheng Li, Xiangtai Li, Qingshan Xu, Bobo Li, Shengqiong Wu, Yaoting Wang, Junbao Zhou, Jiahao Meng, Qingyu Shi, Zhiyuan Zhou, Liangtao Shi, Minghe Gao, Daoan Zhang, Zhiqi Ge, Siliang Tang, Kaihang Pan, Yaobo Ye, Haobo Yuan, Tao Zhang, Weiming Wu, Tianjie Ju, Zixiang Meng, Shilin Xu, Liyu Jia, Wentao Hu, Meng Luo, Jiebo Luo, Tat-Seng Chua, Shuicheng Yan, Hanwang Zhang:
On Path to Multimodal Generalist: General-Level and General-Bench. ICML 2025
[c7]Wendong Bu, Yang Wu, Qifan Yu, Minghe Gao, Bingchen Miao, Zhenkui Zhang, Kaihang Pan, Liyunfei, Mengze Li, Wei Ji, Juncheng Li, Siliang Tang, Yueting Zhuang:
What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities. ICML 2025
[i20]Kaihang Pan, Wang Lin, Zhongqi Yue, Tenglong Ao, Liyu Jia, Wei Zhao, Juncheng Li, Siliang Tang, Hanwang Zhang:
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens. CoRR abs/2504.14666 (2025)
[i19]Wang Lin, Liyu Jia, Wentao Hu, Kaihang Pan, Zhongqi Yue, Wei Zhao, Jingyuan Chen, Fei Wu, Hanwang Zhang:
Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning. CoRR abs/2504.15932 (2025)
[i18]Hao Fei, Yuan Zhou, Juncheng Li, Xiangtai Li, Qingshan Xu, Bobo Li, Shengqiong Wu, Yaoting Wang, Junbao Zhou, Jiahao Meng, Qingyu Shi, Zhiyuan Zhou, Liangtao Shi, Minghe Gao, Daoan Zhang, Zhiqi Ge, Weiming Wu, Siliang Tang, Kaihang Pan, Yaobo Ye, Haobo Yuan, Tao Zhang, Tianjie Ju, Zixiang Meng, Shilin Xu, Liyu Jia, Wentao Hu, Meng Luo, Jiebo Luo, Tat-Seng Chua, Shuicheng Yan, Hanwang Zhang:
On Path to Multimodal Generalist: General-Level and General-Bench. CoRR abs/2505.04620 (2025)
[i17]Bohan Wang, Zhongqi Yue, Fengda Zhang, Shuo Chen, Li'an Bi, Junzhe Zhang, Xue Song, Kennard Yanting Chan, Jiachun Pan, Weijia Wu, Mingze Zhou, Wang Lin, Kaihang Pan, Saining Zhang, Liyu Jia, Wentao Hu, Wei Zhao, Hanwang Zhang:
Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning. CoRR abs/2505.07538 (2025)
[i16]Kaihang Pan, Yang Wu, Wendong Bu, Kai Shen, Juncheng Li, Yingting Wang, Yunfei Li, Siliang Tang, Jun Xiao, Fei Wu, Hang Zhao, Yueting Zhuang:
Unlocking Aha Moments via Reinforcement Learning: Advancing Collaborative Visual Comprehension and Generation. CoRR abs/2506.01480 (2025)
[i15]Kaihang Pan, Wendong Bu, Yuruo Wu, Yang Wu, Kai Shen, Yunfei Li, Hang Zhao, Juncheng Li, Siliang Tang, Yueting Zhuang:
FocusDiff: Advancing Fine-Grained Text-Image Alignment for Autoregressive Visual Generation through RL. CoRR abs/2506.05501 (2025)
[i14]Wendong Bu, Yang Wu, Qifan Yu, Minghe Gao, Bingchen Miao, Zhenkui Zhang, Kaihang Pan, Yunfei Li, Mengze Li, Wei Ji, Juncheng Li, Siliang Tang, Yueting Zhuang:
What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities. CoRR abs/2506.08933 (2025)
[i13]Zhaoyu Fan, Kaihang Pan, Mingze Zhou, Bosheng Qin, Juncheng Li, Shengyu Zhang, Wenqiao Zhang, Siliang Tang, Fei Wu, Yueting Zhuang:
Towards Meta-Cognitive Knowledge Editing for Multimodal LLMs. CoRR abs/2509.05714 (2025)
[i12]Kaihang Pan, Weile Chen, Haiyi Qiu, Qifan Yu, Wendong Bu, Zehan Wang, Yun Zhu, Juncheng Li, Siliang Tang:
WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing. CoRR abs/2512.00387 (2025)
[i11]Wendong Bu, Kaihang Pan, Yuze Lin, Jiacheng Li, Kai Shen, Wenqiao Zhang, Juncheng Li, Jun Xiao, Siliang Tang:
OmniMoGen: Unifying Human Motion Generation via Learning from Interleaved Text-Motion Instructions. CoRR abs/2512.19159 (2025)
- 2024
[j1]Jianhao Guo, Siliang Tang, Juncheng Li, Kaihang Pan, Lingfei Wu:
RustGraph: Robust Anomaly Detection in Dynamic Graphs by Jointly Learning Structural-Temporal Dependency. IEEE Trans. Knowl. Data Eng. 36(7): 3472-3485 (2024)
[c6]Juncheng Li, Kaihang Pan, Zhiqi Ge, Minghe Gao, Wei Ji, Wenqiao Zhang, Tat-Seng Chua, Siliang Tang, Hanwang Zhang, Yueting Zhuang:
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions. ICLR 2024
[c5]Kaihang Pan, Siliang Tang, Juncheng Li, Zhaoyu Fan, Wei Chow, Shuicheng Yan, Tat-Seng Chua, Yueting Zhuang, Hanwang Zhang:
Auto-Encoding Morph-Tokens for Multimodal LLM. ICML 2024
[c4]Wei Chow, Juncheng Li, Qifan Yu, Kaihang Pan, Hao Fei, Zhiqi Ge, Shuai Yang, Siliang Tang, Hanwang Zhang, Qianru Sun:
Unified Generative and Discriminative Training for Multi-modal Large Language Models. NeurIPS 2024
[c3]Kaihang Pan, Zhaoyu Fan, Juncheng Li, Qifan Yu, Hao Fei, Siliang Tang, Richang Hong, Hanwang Zhang, Qianru Sun:
Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration. NeurIPS 2024
[c2]Kaihang Pan, Juncheng Li, Wenjie Wang, Hao Fei, Hongye Song, Wei Ji, Jun Lin, Xiaozhong Liu, Tat-Seng Chua, Siliang Tang:
I3: Intent-Introspective Retrieval Conditioned on Instructions. SIGIR 2024: 1839-1849
[i10]Kaihang Pan, Siliang Tang, Juncheng Li, Zhaoyu Fan, Wei Chow, Shuicheng Yan, Tat-Seng Chua, Yueting Zhuang, Hanwang Zhang:
Auto-Encoding Morph-Tokens for Multimodal LLM. CoRR abs/2405.01926 (2024)
[i9]Kaihang Pan, Zhaoyu Fan, Juncheng Li, Qifan Yu, Hao Fei, Siliang Tang, Richang Hong, Hanwang Zhang, Qianru Sun:
Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration. CoRR abs/2409.19872 (2024)
[i8]Wei Chow, Juncheng Li, Qifan Yu, Kaihang Pan, Hao Fei, Zhiqi Ge, Shuai Yang, Siliang Tang, Hanwang Zhang, Qianru Sun:
Unified Generative and Discriminative Training for Multi-modal Large Language Models. CoRR abs/2411.00304 (2024)
[i7]Qifan Yu, Wei Chow, Zhongqi Yue, Kaihang Pan, Yang Wu, Xiaoyang Wan, Juncheng Li, Siliang Tang, Hanwang Zhang, Yueting Zhuang:
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea. CoRR abs/2411.15738 (2024)
[i6]Haiyi Qiu, Minghe Gao, Long Qian, Kaihang Pan, Qifan Yu, Juncheng Li, Wenjie Wang, Siliang Tang, Yueting Zhuang, Tat-Seng Chua:
STEP: Enhancing Video-LLMs' Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training. CoRR abs/2412.00161 (2024)
[i5]Zhiqi Ge, Juncheng Li, Xinglei Pang, Minghe Gao, Kaihang Pan, Wang Lin, Hao Fei, Wenqiao Zhang, Siliang Tang, Yueting Zhuang:
Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining. CoRR abs/2412.10342 (2024)
- 2023
[c1]Kaihang Pan, Juncheng Li, Hongye Song, Jun Lin, Xiaozhong Liu, Siliang Tang:
Self-supervised Meta-Prompt Learning with Meta-Gradient Regularization for Few-shot Generalization. EMNLP (Findings) 2023: 1059-1077
[i4]Kaihang Pan, Juncheng Li, Hongye Song, Jun Lin, Xiaozhong Liu, Siliang Tang:
Meta-augmented Prompt Tuning for Better Few-shot Learning. CoRR abs/2303.12314 (2023)
[i3]Juncheng Li, Kaihang Pan, Zhiqi Ge, Minghe Gao, Hanwang Zhang, Wei Ji, Wenqiao Zhang, Tat-Seng Chua, Siliang Tang, Yueting Zhuang:
Empowering Vision-Language Models to Follow Interleaved Vision-Language Instructions. CoRR abs/2308.04152 (2023)
[i2]Kaihang Pan, Juncheng Li, Hongye Song, Hao Fei, Wei Ji, Shuo Zhang, Jun Lin, Xiaozhong Liu, Siliang Tang:
ControlRetriever: Harnessing the Power of Instructions for Controllable Retrieval. CoRR abs/2308.10025 (2023)
[i1]Dong Chen, Kaihang Pan, Guoming Wang, Yueting Zhuang, Siliang Tang:
Improving Vision Anomaly Detection with the Guidance of Language Modality. CoRR abs/2310.02821 (2023)
last updated on 2026-02-05 23:40 CET by the dblp team
all metadata released as open data under CC0 1.0 license