Zhaokai Wang

2020 – today

2025
[j4]
persistent URL: https://dblp.org/rec/journals/pami/WangZYLLTDGLQD25
Zhaokai Wang, Xizhou Zhu, Xue Yang, Gen Luo, Hao Li, Changyao Tian, Wenhan Dou, Junqi Ge, Lewei Lu, Yu Qiao, Jifeng Dai:
Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding. IEEE Trans. Pattern Anal. Mach. Intell. 47(11): 10142-10159 (2025)
[c14]
persistent URL: https://dblp.org/rec/conf/acl/HuXYWX0YTZZLXWX25
Xueyu Hu, Tao Xiong, Biao Yi, Zishu Wei, Ruixuan Xiao, Yurun Chen, Jiasheng Ye, Meiling Tao, Xiangxin Zhou, Ziyu Zhao, Yuhuai Li, Shengze Xu, Shenzhi Wang, Xinchen Xu, Shuofei Qiao, Zhaokai Wang, Kun Kuang, Tieyong Zeng, Liang Wang, Jiwei Li, Yuchen Eleanor Jiang, Wangchunshu Zhou, Guoyin Wang, Keting Yin, Zhou Zhao, Hongxia Yang, Fan Wu, Shengyu Zhang, Fei Wu:
OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use. ACL (1) 2025: 7436-7465
[c13]
persistent URL: https://dblp.org/rec/conf/cvpr/LuoYDWLD0Z25
Gen Luo, Xue Yang, Wenhan Dou, Zhaokai Wang, Jiawen Liu, Jifeng Dai, Yu Qiao, Xizhou Zhu:
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training. CVPR 2025: 24960-24971
[c12]
persistent URL: https://dblp.org/rec/conf/cvpr/LiTSZWZDWLLD25
Hao Li, Changyao Tian, Jie Shao, Xizhou Zhu, Zhaokai Wang, Jinguo Zhu, Wenhan Dou, Xiao-Gang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai:
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding. CVPR 2025: 29767-29779
[c11]
persistent URL: https://dblp.org/rec/conf/emnlp/TangQWZWMWZZZ25
Yihong Tang, Ao Qu, Zhaokai Wang, Dingyi Zhuang, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao:
Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Spatial Reasoning. EMNLP (Findings) 2025: 4083-4103
[c10]
persistent URL: https://dblp.org/rec/conf/ismir/WangBZHYTHL25
Zhaokai Wang, Chenxi Bao, Le Zhuo, Jingrui Han, Yang Yue, Yihong Tang, Victor Shea-Jay Huang, Yue Liao:
A Survey on Vision-to-Music Generation: Methods, Datasets, Evaluation, and Challenges. ISMIR 2025: 223-234
[i19]
persistent URL: https://dblp.org/rec/journals/corr/abs-2501-07783
Zhaokai Wang, Xizhou Zhu, Xue Yang, Gen Luo, Hao Li, Changyao Tian, Wenhan Dou, Junqi Ge, Lewei Lu, Yu Qiao, Jifeng Dai:
Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding. CoRR abs/2501.07783 (2025)
[i18]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-07050
Victor Shea-Jay Huang, Le Zhuo, Yi Xin, Zhaokai Wang, Peng Gao, Hongsheng Li:
TIDE : Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation. CoRR abs/2503.07050 (2025)
[i17]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-21254
Zhaokai Wang, Chenxi Bao, Le Zhuo, Jingrui Han, Yang Yue, Yihong Tang, Victor Shea-Jay Huang, Yue Liao:
Vision-to-Music Generation: A Survey. CoRR abs/2503.21254 (2025)
[i16]
persistent URL: https://dblp.org/rec/journals/corr/abs-2504-13837
Yang Yue, Zhiqi Chen, Rui Lu, Andrew Zhao, Zhaokai Wang, Yang Yue, Shiji Song, Gao Huang:
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? CoRR abs/2504.13837 (2025)
[i15]
persistent URL: https://dblp.org/rec/journals/corr/abs-2507-12566
Gen Luo, Wenhan Dou, Wenhao Li, Zhaokai Wang, Xue Yang, Changyao Tian, Hao Li, Weiyun Wang, Wenhai Wang, Xizhou Zhu, Yu Qiao, Jifeng Dai:
Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models. CoRR abs/2507.12566 (2025)
[i14]
persistent URL: https://dblp.org/rec/journals/corr/abs-2508-04482
Xueyu Hu, Tao Xiong, Biao Yi, Zishu Wei, Ruixuan Xiao, Yurun Chen, Jiasheng Ye, Meiling Tao, Xiangxin Zhou, Ziyu Zhao, Yuhuai Li, Shengze Xu, Shenzhi Wang, Xinchen Xu, Shuofei Qiao, Zhaokai Wang, Kun Kuang, Tieyong Zeng, Liang Wang, Jiwei Li, Yuchen Eleanor Jiang, Wangchunshu Zhou, Guoyin Wang, Keting Yin, Zhou Zhao, Hongxia Yang, Fan Wu, Shengyu Zhang, Fei Wu:
OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use. CoRR abs/2508.04482 (2025)
[i13]
persistent URL: https://dblp.org/rec/journals/corr/abs-2508-18265
Weiyun Wang, Zhangwei Gao, Lixin Gu, Hengjun Pu, Long Cui, Xingguang Wei, Zhaoyang Liu, Linglin Jing, Shenglong Ye, Jie Shao, Zhaokai Wang, Zhe Chen, Hongjie Zhang, Ganlin Yang, Haomin Wang, Qi Wei, Jinhui Yin, Wenhao Li, Erfei Cui, Guanzhou Chen, Zichen Ding, Changyao Tian, Zhenyu Wu, JingJing Xie, Zehao Li, Bowen Yang, Yuchen Duan, Xuehui Wang, Zhi Hou, Haoran Hao, Tianyi Zhang, Songze Li, Xiangyu Zhao, Haodong Duan, Nianchen Deng, Bin Fu, Yinan He, Yi Wang, Conghui He, Botian Shi, Junjun He, Yingtong Xiong, Han Lv, Lijun Wu, Wenqi Shao, Kaipeng Zhang, Huipeng Deng, Biqing Qi, Jiaye Ge, Qipeng Guo, Wenwei Zhang, Songyang Zhang, Maosong Cao, Junyao Lin, Kexian Tang, Jianfei Gao, Haian Huang, Yuzhe Gu, Chengqi Lyu, Huanze Tang, Rui Wang, Haijun Lv, Wanli Ouyang, Limin Wang, Min Dou, Xizhou Zhu, Tong Lu, Dahua Lin, Jifeng Dai, Weijie Su, Bowen Zhou, Kai Chen, Yu Qiao, Wenhai Wang, Gen Luo:
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency. CoRR abs/2508.18265 (2025)
[i12]
persistent URL: https://dblp.org/rec/journals/corr/abs-2509-14232
Zhaokai Wang, Penghao Yin, Xiangyu Zhao, Changyao Tian, Yu Qiao, Wenhai Wang, Jifeng Dai, Gen Luo:
GenExam: A Multidisciplinary Text-to-Image Exam. CoRR abs/2509.14232 (2025)
[i11]
persistent URL: https://dblp.org/rec/journals/corr/abs-2510-12126
Zhenxin Lei, Zhangwei Gao, Changyao Tian, Erfei Cui, Guanzhou Chen, Danni Yang, Yuchen Duan, Zhaokai Wang, Wenhao Li, Weiyun Wang, Xiangyu Zhao, Jiayi Ji, Yu Qiao, Wenhai Wang, Gen Luo:
MetaCaptioner: Towards Generalist Visual Captioning with Open-source Suites. CoRR abs/2510.12126 (2025)
2024
[j3]
persistent URL: https://dblp.org/rec/journals/staeors/TangJZDWZZG24
Feixiang Tang, Yifei Ji, Yongsheng Zhang, Zhen Dong, Zhaokai Wang, Qingjun Zhang, Bingji Zhao, Heli Gao:
Drifting Ionospheric Scintillation Simulation for L-Band Geosynchronous SAR. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 17: 842-854 (2024)
[c9]
persistent URL: https://dblp.org/rec/conf/cvpr/Li0WZZQWLLD24
Hao Li, Xue Yang, Zhaokai Wang, Xizhou Zhu, Jie Zhou, Yu Qiao, Xiaogang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai:
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft. CVPR 2024: 16426-16435
[c8]
persistent URL: https://dblp.org/rec/conf/emnlp/TangWQYWZKHGZZM24
Yihong Tang, Zhaokai Wang, Ao Qu, Yihao Yan, Zhaofeng Wu, Dingyi Zhuang, Jushi Kai, Kebing Hou, Xiaotong Guo, Jinhua Zhao, Zhan Zhao, Wei Ma:
ItiNera: Integrating Spatial Optimization with Large Language Models for Open-domain Urban Itinerary Planning. EMNLP (Industry Track) 2024: 1413-1432
[c7]
persistent URL: https://dblp.org/rec/conf/nips/Zhu0W0DGL0D24
Xizhou Zhu, Xue Yang, Zhaokai Wang, Hao Li, Wenhan Dou, Junqi Ge, Lewei Lu, Yu Qiao, Jifeng Dai:
Parameter-Inverted Image Pyramid Networks. NeurIPS 2024
[i10]
persistent URL: https://dblp.org/rec/journals/corr/abs-2402-07204
Yihong Tang, Zhaokai Wang, Ao Qu, Yihao Yan, Kebing Hou, Dingyi Zhuang, Xiaotong Guo, Jinhua Zhao, Zhan Zhao, Wei Ma:
Synergizing Spatial Optimization with Large Language Models for Open-Domain Urban Itinerary Planning. CoRR abs/2402.07204 (2024)
[i9]
persistent URL: https://dblp.org/rec/journals/corr/abs-2406-04330
Xizhou Zhu, Xue Yang, Zhaokai Wang, Hao Li, Wenhan Dou, Junqi Ge, Lewei Lu, Yu Qiao, Jifeng Dai:
Parameter-Inverted Image Pyramid Networks. CoRR abs/2406.04330 (2024)
[i8]
persistent URL: https://dblp.org/rec/journals/corr/abs-2410-08202
Gen Luo, Xue Yang, Wenhan Dou, Zhaokai Wang, Jifeng Dai, Yu Qiao, Xizhou Zhu:
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training. CoRR abs/2410.08202 (2024)
[i7]
persistent URL: https://dblp.org/rec/journals/corr/abs-2410-16162
Yihong Tang, Ao Qu, Zhaokai Wang, Dingyi Zhuang, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao:
Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Composite Spatial Reasoning. CoRR abs/2410.16162 (2024)
[i6]
persistent URL: https://dblp.org/rec/journals/corr/abs-2412-09428
Baisen Wang, Le Zhuo, Zhaokai Wang, Chenxi Bao, Wu Chengjing, Xuecheng Nie, Jiao Dai, Jizhong Han, Yue Liao, Si Liu:
Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation. CoRR abs/2412.09428 (2024)
[i5]
persistent URL: https://dblp.org/rec/journals/corr/abs-2412-09604
Hao Li, Changyao Tian, Jie Shao, Xizhou Zhu, Zhaokai Wang, Jinguo Zhu, Wenhan Dou, Xiaogang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai:
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding. CoRR abs/2412.09604 (2024)
2023
[c6]
persistent URL: https://dblp.org/rec/conf/iccv/ZhuoWWLBPHZFL23
Le Zhuo, Zhaokai Wang, Baisen Wang, Yue Liao, Chenxi Bao, Stanley Peng, Songhao Han, Aixi Zhang, Fei Fang, Si Liu:
Video Background Music Generation: Dataset, Method and Evaluation. ICCV 2023: 15591-15601
[i4]
persistent URL: https://dblp.org/rec/journals/corr/abs-2312-09238
Hao Li, Xue Yang, Zhaokai Wang, Xizhou Zhu, Jie Zhou, Yu Qiao, Xiaogang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai:
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft. CoRR abs/2312.09238 (2023)
2022
[i3]
persistent URL: https://dblp.org/rec/journals/corr/abs-2211-11248
Le Zhuo, Zhaokai Wang, Baisen Wang, Yue Liao, Stanley Peng, Chenxi Bao, Miao Lu, Xiaobo Li, Si Liu:
Video Background Music Generation: Dataset, Method and Evaluation. CoRR abs/2211.11248 (2022)
2021
[c5]
persistent URL: https://dblp.org/rec/conf/aaai/WangBWL21
Zhaokai Wang, Renda Bao, Qi Wu, Si Liu:
Confidence-aware Non-repetitive Multimodal Transformers for TextCaps. AAAI 2021: 2835-2843
[c4]
persistent URL: https://dblp.org/rec/conf/mm/DiJ0WZHLY21
Shangzhe Di, Zeren Jiang, Si Liu, Zhaokai Wang, Leyan Zhu, Zexin He, Hongming Liu, Shuicheng Yan:
Video Background Music Generation with Controllable Music Transformer. ACM Multimedia 2021: 2037-2045
[i2]
persistent URL: https://dblp.org/rec/journals/corr/abs-2111-08380
Shangzhe Di, Zeren Jiang, Si Liu, Zhaokai Wang, Leyan Zhu, Zexin He, Hongming Liu, Shuicheng Yan:
Video Background Music Generation with Controllable Music Transformer. CoRR abs/2111.08380 (2021)
2020
[i1]
persistent URL: https://dblp.org/rec/journals/corr/abs-2012-03662
Zhaokai Wang, Renda Bao, Qi Wu, Si Liu:
Confidence-aware Non-repetitive Multimodal Transformers for TextCaps. CoRR abs/2012.03662 (2020)

2010 – 2019

2019
[j2]
persistent URL: https://dblp.org/rec/journals/access/SuXRGLWX19
Shubin Su, Limin Xiao, Li Ruan, Fei Gu, Shupan Li, Zhaokai Wang, Rongbin Xu:
An Efficient Density-Based Local Outlier Detection Approach for Scattered Data. IEEE Access 7: 1006-1020 (2019)
[j1]
persistent URL: https://dblp.org/rec/journals/jvcir/YanXZXRWZ19
Baicheng Yan, Limin Xiao, Hang Zhang, Daliang Xu, Li Ruan, Zhaokai Wang, Yiyang Zhang:
An adaptive template matching-based single object tracking algorithm with parallel acceleration. J. Vis. Commun. Image Represent. 64 (2019)
[c3]
persistent URL: https://dblp.org/rec/conf/cluster/YanZXHW19
Baicheng Yan, Yi Zhou, Limin Xiao, Jiantong Huo, Zhaokai Wang:
LogGOPSC: A Parallel Computation Model Extending Network Contention into LogGOPS. CLUSTER 2019: 1-2
[c2]
persistent URL: https://dblp.org/rec/conf/ijcnn/WangXXSLS19
Zhaokai Wang, Limin Xiao, Rongbin Xu, Shubin Su, Shupan Li, Yao Song:
Deeper Monocular Depth Prediction via Long and Short Skip Connection. IJCNN 2019: 1-7
[c1]
persistent URL: https://dblp.org/rec/conf/ispa/SuXRXLWHL19
Shubin Su, Limin Xiao, Li Ruan, Rongbin Xu, Shupan Li, Zhaokai Wang, Qigong He, Wei Li:
ADCMO: An Anomaly Detection Approach Based on Local Outlier Factor for Continuously Monitored Object. ISPA/BDCloud/SocialCom/SustainCom 2019: 865-874
