


Limin Wang 0002
王利民
Person information
- affiliation: Nanjing University, State Key Laboratory for Novel Software Technology, China
- affiliation (former): ETH Zurich, Computer Vision Laboratory, Switzerland
- affiliation (former): Chinese University of Hong Kong, Department of Information Engineering, China
- affiliation (former): Chinese Academy of Sciences, Shenzhen Institutes of Advanced Technology, China
- unicode name: 王利民
Other persons with the same name
- Limin Wang — disambiguation page
- Limin Wang 0001 — London School of Economics, UK
- Limin Wang 0003 — Hainan Normal University, School of Mathematics and Statistics, Haikou, China
- Limin Wang 0004 — Chongqing Jiaotong University, School of Economic and Management, China
- Limin Wang 0005 — Ministry of Agriculture, Key Laboratory of Agri-informatics, Beijing, China (and 1 more)
- Limin Wang 0006 — Chinese Academy of Sciences, State Key Laboratory of Multiphase Complex Systems, Beijing, China
- Limin Wang 0007 — Jilin University, Changchun, China
2020 – today
- 2025
- [j34]Chen Xu, Yuhan Zhu, Haocheng Shen, Boheng Chen, Yixuan Liao, Xiaoxin Chen, Limin Wang:
Progressive Visual Prompt Learning with Contrastive Feature Re-formation. Int. J. Comput. Vis. 133(2): 511-526 (2025) - [i143]Xinhao Li, Yi Wang, Jiashuo Yu, Xiangyu Zeng, Yuhan Zhu, Haian Huang, Jianfei Gao, Kunchang Li, Yinan He, Chenting Wang, Yu Qiao, Yali Wang, Limin Wang:
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling. CoRR abs/2501.00574 (2025) - 2024
- [j33]Fengyuan Shi, Weilin Huang, Limin Wang:
End-to-end dense video grounding via parallel regression. Comput. Vis. Image Underst. 242: 103980 (2024) - [j32]Jun Tu, Gangshan Wu, Limin Wang:
Dual Graph Networks for Pose Estimation in Crowded Scenes. Int. J. Comput. Vis. 132(3): 633-653 (2024) - [j31]Liang Zhao, Yao Teng, Limin Wang:
Logit Normalization for Long-Tail Object Detection. Int. J. Comput. Vis. 132(6): 2114-2134 (2024) - [j30]Jintao Lin, Zhaoyang Liu, Wenhai Wang, Wayne Wu, Limin Wang:
VLG: General Video Recognition with Web Textual Knowledge. Int. J. Comput. Vis. 132(10): 4792-4817 (2024) - [j29]Fengyuan Shi, Ruopeng Gao, Weilin Huang, Limin Wang:
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding. IEEE Trans. Pattern Anal. Mach. Intell. 46(2): 1181-1198 (2024) - [j28]Haisong Liu, Tao Lu, Yihui Xu, Jia Liu, Limin Wang:
Learning Optical Flow and Scene Flow With Bidirectional Camera-LiDAR Fusion. IEEE Trans. Pattern Anal. Mach. Intell. 46(4): 2378-2395 (2024) - [j27]Yutao Cui, Cheng Jiang, Gangshan Wu, Limin Wang:
MixFormer: End-to-End Tracking With Iterative Mixed Attention. IEEE Trans. Pattern Anal. Mach. Intell. 46(6): 4129-4146 (2024) - [j26]Tao Wu, Mengqi Cao, Ziteng Gao, Gangshan Wu, Limin Wang:
STMixer: A One-Stage Sparse Action Detector. IEEE Trans. Pattern Anal. Mach. Intell. 46(10): 6842-6857 (2024) - [j25]Yixuan Li, Zhenzhi Wang, Zhifeng Li, Limin Wang:
Sparse Action Tube Detection. IEEE Trans. Image Process. 33: 1740-1752 (2024) - [j24]Yuer Ma, Yi Liu, Limin Wang, Wenxiong Kang, Yu Qiao, Yali Wang:
Dual Masked Modeling for Weakly-Supervised Temporal Boundary Discovery. IEEE Trans. Multim. 26: 5694-5704 (2024) - [c110]Fengyuan Shi, Jiaxi Gu, Hang Xu, Songcen Xu, Wei Zhang, Limin Wang:
BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models. CVPR 2024: 7393-7402 - [c109]Zhiyu Zhao, Bingkun Huang, Sen Xing, Gangshan Wu, Yu Qiao, Limin Wang:
Asymmetric Masked Distillation for Pre-Training Small Foundation Models. CVPR 2024: 18516-18526 - [c108]Tao Wu, Runyu He, Gangshan Wu, Limin Wang:
SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos. CVPR 2024: 18537-18546 - [c107]Yuhan Zhu, Guozhen Zhang, Jing Tan, Gangshan Wu, Limin Wang:
Dual DETRs for Multi-Label Temporal Action Detection. CVPR 2024: 18559-18569 - [c106]Min Yang, Huan Gao, Ping Guo, Limin Wang:
Adapting Short-Term Transformers for Action Detection in Untrimmed Videos. CVPR 2024: 18570-18579 - [c105]Chunxu Liu, Guozhen Zhang, Rui Zhao, Limin Wang:
Sparse Global Matching for Video Frame Interpolation with Large Motion. CVPR 2024: 19125-19134 - [c104]Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, Bo Dai:
Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering. CVPR 2024: 20654-20664 - [c103]Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu:
VBench: Comprehensive Benchmark Suite for Video Generative Models. CVPR 2024: 21807-21818 - [c102]Yifei Huang, Guo Chen, Jilan Xu, Mingfang Zhang, Lijin Yang, Baoqi Pei, Hongjie Zhang, Lu Dong, Yali Wang, Limin Wang, Yu Qiao:
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World. CVPR 2024: 22072-22086 - [c101]Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Yi Liu, Zun Wang, Jilan Xu, Guo Chen, Ping Luo, Limin Wang, Yu Qiao:
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark. CVPR 2024: 22195-22206 - [c100]Haisong Liu, Yang Chen, Haiguang Wang, Zetong Yang, Tianyu Li, Jia Zeng, Li Chen, Hongyang Li, Limin Wang:
Fully Sparse 3D Occupancy Prediction. ECCV (25) 2024: 54-71 - [c99]Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, Limin Wang, Yu Qiao:
VideoMamba: State Space Model for Efficient Video Understanding. ECCV (26) 2024: 237-255 - [c98]Chen Xu, Tianhui Song, Weixin Feng, Xubin Li, Tiezheng Ge, Bo Zheng, Limin Wang:
Accelerating Image Generation with Sub-path Linear Approximation Model. ECCV (53) 2024: 323-339 - [c97]Yutao Cui, Xiaotong Zhao, Guozhen Zhang, Shengming Cao, Kai Ma, Limin Wang:
StableDrag: Stable Dragging for Point-Based Image Editing. ECCV (58) 2024: 340-356 - [c96]Yi Wang, Kunchang Li, Xinhao Li, Jiashuo Yu, Yinan He, Guo Chen, Baoqi Pei, Rongkun Zheng, Zun Wang, Yansong Shi, Tianxiang Jiang, Songze Li, Jilan Xu, Hongjie Zhang, Yifei Huang, Yu Qiao, Yali Wang, Limin Wang:
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding. ECCV (85) 2024: 396-416 - [c95]Xinhao Li, Yuhan Zhu, Limin Wang:
ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video. ECCV (83) 2024: 425-443 - [c94]Ziteng Gao, Zhan Tong, Limin Wang, Mike Zheng Shou:
SparseFormer: Sparse Visual Recognition via Limited Latent Tokens. ICLR 2024 - [c93]Yi Wang, Yinan He, Yizhuo Li, Kunchang Li, Jiashuo Yu, Xin Ma, Xinhao Li, Guo Chen, Xinyuan Chen, Yaohui Wang, Ping Luo, Ziwei Liu, Yali Wang, Limin Wang, Yu Qiao:
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation. ICLR 2024 - [c92]Qingsong Zhao, Yi Wang, Jilan Xu, Yinan He, Zifan Song, Limin Wang, Yu Qiao, Cairong Zhao:
Does Video-Text Pretraining Help Open-Vocabulary Online Action Detection? NeurIPS 2024 - [i142]Chaochao Lu, Chen Qian, Guodong Zheng, Hongxing Fan, Hongzhi Gao, Jie Zhang, Jing Shao, Jingyi Deng, Jinlan Fu, Kexin Huang, Kunchang Li, Lijun Li, Limin Wang, Lu Sheng, Meiqi Chen, Ming Zhang, Qibing Ren, Sirui Chen, Tao Gui, Wanli Ouyang, Yali Wang, Yan Teng, Yaru Wang, Yi Wang, Yinan He, Yingchun Wang, Yixu Wang, Yongting Zhang, Yu Qiao, Yujiong Shen, Yurong Mou, Yuxi Chen, Zaibin Zhang, Zhelun Shi, Zhenfei Yin, Zhipin Wang:
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities. CoRR abs/2401.15071 (2024) - [i141]Yutao Cui, Xiaotong Zhao, Guozhen Zhang, Shengming Cao, Kai Ma, Limin Wang:
StableDrag: Stable Dragging for Point-based Image Editing. CoRR abs/2403.04437 (2024) - [i140]Jiange Yang, Bei Liu, Jianlong Fu, Bocheng Pan, Gangshan Wu, Limin Wang:
Spatiotemporal Predictive Pre-training for Robotic Motor Control. CoRR abs/2403.05304 (2024) - [i139]Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, Limin Wang, Yu Qiao:
VideoMamba: State Space Model for Efficient Video Understanding. CoRR abs/2403.06977 (2024) - [i138]Guo Chen, Yifei Huang, Jilan Xu, Baoqi Pei, Zhe Chen, Zhiqi Li, Jiahao Wang, Kunchang Li, Tong Lu, Limin Wang:
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding. CoRR abs/2403.09626 (2024) - [i137]Yi Wang, Kunchang Li, Xinhao Li, Jiashuo Yu, Yinan He, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, Songze Li, Hongjie Zhang, Yifei Huang, Yu Qiao, Yali Wang, Limin Wang:
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding. CoRR abs/2403.15377 (2024) - [i136]Yifei Huang, Guo Chen, Jilan Xu, Mingfang Zhang, Lijin Yang, Baoqi Pei, Hongjie Zhang, Lu Dong, Yali Wang, Limin Wang, Yu Qiao:
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World. CoRR abs/2403.16182 (2024) - [i135]Ruopeng Gao, Yijun Zhang, Limin Wang:
Multiple Object Tracking as ID Prediction. CoRR abs/2403.16848 (2024) - [i134]Yuhan Zhu, Guozhen Zhang, Jing Tan, Gangshan Wu, Limin Wang:
Dual DETRs for Multi-Label Temporal Action Detection. CoRR abs/2404.00653 (2024) - [i133]Tao Wu, Runyu He, Gangshan Wu, Limin Wang:
SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos. CoRR abs/2404.04565 (2024) - [i132]Tao Wu, Mengqi Cao, Ziteng Gao, Gangshan Wu, Limin Wang:
STMixer: A One-Stage Sparse Action Detector. CoRR abs/2404.09842 (2024) - [i131]Chen Xu, Tianhui Song, Weixin Feng, Xubin Li, Tiezheng Ge, Bo Zheng, Limin Wang:
Accelerating Image Generation with Sub-path Linear Approximation Model. CoRR abs/2404.13903 (2024) - [i130]Tao Wu, Shuqiu Ge, Jie Qin, Gangshan Wu, Limin Wang:
Open-Vocabulary Spatio-Temporal Action Detection. CoRR abs/2405.10832 (2024) - [i129]Qingyun Li, Zhe Chen, Weiyun Wang, Wenhai Wang, Shenglong Ye, Zhenjiang Jin, Guanzhou Chen, Yinan He, Zhangwei Gao, Erfei Cui, Jiashuo Yu, Hao Tian, Jiasheng Zhou, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Zhenxiang Li, Pei Chu, Yi Wang, Min Dou, Changyao Tian, Xizhou Zhu, Lewei Lu, Yushi Chen, Junjun He, Zhongying Tu, Tong Lu, Yali Wang, Limin Wang, Dahua Lin, Yu Qiao, Botian Shi, Conghui He, Jifeng Dai:
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text. CoRR abs/2406.08418 (2024) - [i128]Baoqi Pei, Guo Chen, Jilan Xu, Yuping He, Yicheng Liu, Kanghua Pan, Yifei Huang, Yali Wang, Tong Lu, Limin Wang, Yu Qiao:
EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation. CoRR abs/2406.18070 (2024) - [i127]Guozhen Zhang, Chunxu Liu, Yutao Cui, Xiaotong Zhao, Kai Ma, Limin Wang:
VFIMamba: Video Frame Interpolation with State Space Models. CoRR abs/2407.02315 (2024) - [i126]Xinhao Li, Zhenpeng Huang, Jing Wang, Kunchang Li, Limin Wang:
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model. CoRR abs/2407.06491 (2024) - [i125]Yisen Wang, Yao Teng, Limin Wang:
CycleHOI: Improving Human-Object Interaction Detection with Cycle Consistency of Detection and Generation. CoRR abs/2407.11433 (2024) - [i124]Yuhan Zhu, Guozhen Zhang, Chen Xu, Haocheng Shen, Xiaoxin Chen, Gangshan Wu, Limin Wang:
Efficient Test-Time Prompt Tuning for Vision-Language Models. CoRR abs/2408.05775 (2024) - [i123]Guozhen Zhang, Jingyu Liu, Shengming Cao, Xiaotong Zhao, Kevin Zhao, Kai Ma, Limin Wang:
Dynamic and Compressive Adaptation of Transformers From Images to Videos. CoRR abs/2408.06840 (2024) - [i122]Xiangyu Zeng, Kunchang Li, Chenting Wang, Xinhao Li, Tianxiang Jiang, Ziang Yan, Songze Li, Yansong Shi, Zhengrong Yue, Yi Wang, Yali Wang, Yu Qiao, Limin Wang:
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning. CoRR abs/2410.19702 (2024) - [i121]Shuai Wang, Zexian Li, Tianhui Song, Xubin Li, Tiezheng Ge, Bo Zheng, Limin Wang:
FlowDCN: Exploring DCN-like Architectures for Fast Image Generation with Arbitrary Resolution. CoRR abs/2410.22655 (2024) - [i120]Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu:
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models. CoRR abs/2411.13503 (2024) - [i119]Jun Zhang, Desen Meng, Ji Qi, Zhenpeng Huang, Tao Wu, Limin Wang:
p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay. CoRR abs/2412.04449 (2024) - [i118]Zun Wang, Jialu Li, Yicong Hong, Songze Li, Kunchang Li, Shoubin Yu, Yi Wang, Yu Qiao, Yali Wang, Mohit Bansal, Limin Wang:
Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel. CoRR abs/2412.08467 (2024) - [i117]Guo Chen, Yicheng Liu, Yifei Huang, Yuping He, Baoqi Pei, Jilan Xu, Yali Wang, Tong Lu, Limin Wang:
CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding. CoRR abs/2412.12075 (2024) - [i116]Ziang Yan, Zhilin Li, Yinan He, Chenting Wang, Kunchang Li, Xinhao Li, Xiangyu Zeng, Zilei Wang, Yali Wang, Yu Qiao, Limin Wang, Yi Wang:
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment. CoRR abs/2412.19326 (2024) - [i115]Yifei Huang, Jilan Xu, Baoqi Pei, Yuping He, Guo Chen, Lijin Yang, Xinyuan Chen, Yaohui Wang, Zheng Nie, Jinyao Liu, Guoshun Fan, Dechen Lin, Fang Fang, Kunpeng Li, Chang Yuan, Yali Wang, Yu Qiao, Limin Wang:
Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model. CoRR abs/2412.21080 (2024) - 2023
- [j23]Min Yang, Guo Chen, Yin-Dong Zheng, Tong Lu, Limin Wang:
BasicTAD: An astounding RGB-Only baseline for temporal action detection. Comput. Vis. Image Underst. 232: 103692 (2023) - [j22]Zuxian Huang, Gangshan Wu, Limin Wang:
Webly-supervised semantic segmentation via curriculum learning. Comput. Vis. Image Underst. 236: 103810 (2023) - [j21]Ziteng Gao, Limin Wang, Gangshan Wu:
LIP: Local Importance-Based Pooling. Int. J. Comput. Vis. 131(1): 363-384 (2023) - [j20]Jing Tan, Yuhong Wang, Gangshan Wu, Limin Wang:
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection. IEEE Trans. Pattern Anal. Mach. Intell. 45(10): 12506-12520 (2023) - [j19]Yating Tian, Hongwen Zhang, Yebin Liu, Limin Wang:
Recovering 3D Human Mesh From Monocular Images: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 45(12): 15406-15425 (2023) - [j18]Tao Lu, Chunxu Liu, Youxin Chen, Gangshan Wu, Limin Wang:
APP-Net: Auxiliary-Point-Based Push and Pull Operations for Efficient Point Cloud Recognition. IEEE Trans. Image Process. 32: 6500-6513 (2023) - [c91]Jiange Yang, Sheng Guo, Gangshan Wu, Limin Wang:
CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets. AAAI 2023: 3145-3154 - [c90]Tao Lu, Xiang Ding, Haisong Liu, Gangshan Wu, Limin Wang:
LinK: Linear Kernel for LiDAR-based 3D Perception. CVPR 2023: 1105-1115 - [c89]Guozhen Zhang, Yuhan Zhu, Haonan Wang, Youxin Chen, Gangshan Wu, Limin Wang:
Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation. CVPR 2023: 5682-5692 - [c88]Limin Wang, Bingkun Huang, Zhiyu Zhao, Zhan Tong, Yinan He, Yi Wang, Yali Wang, Yu Qiao:
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking. CVPR 2023: 14549-14560 - [c87]Tao Wu, Mengqi Cao, Ziteng Gao, Gangshan Wu, Limin Wang:
STMixer: A One-Stage Sparse Action Detector. CVPR 2023: 14720-14729 - [c86]Hanlin Wang, Yilu Wu, Sheng Guo, Limin Wang:
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos. CVPR 2023: 14836-14845 - [c85]Miao Cheng, Limin Wang:
Graph Routes From Local and Global Entrances. ICBDT 2023: 314-318 - [c84]Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Limin Wang, Yu Qiao:
UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding. ICCV 2023: 1632-1643 - [c83]Shuai Wang, Yao Teng, Limin Wang:
Deep Equilibrium Object Detection. ICCV 2023: 6273-6283 - [c82]Yao Teng, Haisong Liu, Sheng Guo, Limin Wang:
StageInteractor: Query-based Object Detector with Cross-stage Interaction. ICCV 2023: 6554-6565 - [c81]Ruopeng Gao, Limin Wang:
MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking. ICCV 2023: 9867-9876 - [c80]Yutao Cui, Chenkai Zeng, Xiaoyu Zhao, Yichun Yang, Gangshan Wu, Limin Wang:
SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes. ICCV 2023: 9887-9897 - [c79]Lei Chen, Zhan Tong, Yibing Song, Gangshan Wu, Limin Wang:
Efficient Video Action Detection with Token Dropout and Context Refinement. ICCV 2023: 10354-10365 - [c78]Bingkun Huang, Zhiyu Zhao, Guozhen Zhang, Yu Qiao, Limin Wang:
MGMAE: Motion Guided Masking for Video Masked Autoencoding. ICCV 2023: 13447-13458 - [c77]Jiahao Wang, Guo Chen, Yifei Huang, Limin Wang, Tong Lu:
Memory-and-Anticipation Transformer for Online Action Understanding. ICCV 2023: 13778-13789 - [c76]Haisong Liu, Yao Teng, Tao Lu, Haiguang Wang, Limin Wang:
SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos. ICCV 2023: 18534-18544 - [c75]Kunchang Li, Yali Wang, Yizhuo Li, Yi Wang, Yinan He, Limin Wang, Yu Qiao:
Unmasked Teacher: Towards Training-Efficient Video Foundation Models. ICCV 2023: 19891-19903 - [c74]Matej Kristan, Jirí Matas, Martin Danelljan, Michael Felsberg, Hyung Jin Chang, Luka Cehovin Zajc, Alan Lukezic, Ondrej Drbohlav, Zhongqun Zhang, Khanh-Tung Tran, Xuan-Son Vu, Johanna Björklund, Christoph Mayer, Yushan Zhang, Lei Ke, Jie Zhao, Gustavo Fernández, Noor Al-Shakarji, Dong An, Michael Arens, Stefan Becker, Goutam Bhat, Sebastian Bullinger, Antoni B. Chan, Shijie Chang, Hanyuan Chen, Xin Chen, Yan Chen, Zhenyu Chen, Yangming Cheng, Yutao Cui, Chunyuan Deng, Jiahua Dong, Matteo Dunnhofer, Wei Feng, Jianlong Fu, Jie Gao, Ruize Han, Zeqi Hao, Jun-Yan He, Keji He, Zhenyu He, Xiantao Hu, Kaer Huang, Yuqing Huang, Yi Jiang, Ben Kang, Jin-Peng Lan, Hyungjun Lee, Chenyang Li, Jiahao Li, Ning Li, Wangkai Li, Xiaodi Li, Xin Li, Pengyu Liu, Yue Liu, Huchuan Lu, Bin Luo, Ping Luo, Yinchao Ma, Deshui Miao, Christian Micheloni, Kannappan Palaniappan, Hancheol Park, Matthieu Paul, Houwen Peng, Zekun Qian, Gani Rahmon, Norbert Scherer-Negenborn, Pengcheng Shao, Wooksu Shin, Elham Soltani Kazemi, Tianhui Song, Rainer Stiefelhagen, Rui Sun, Chuanming Tang, Zhangyong Tang, Imad Eddine Toubal, Jack Valmadre, Joost van de Weijer, Luc Van Gool, Jash Vira, Stéphane Vujasinovic, Cheng Wan, Jia Wan, Dong Wang, Fei Wang, Feifan Wang, He Wang, Limin Wang, Song Wang, Yaowei Wang, Zhepeng Wang, Gangshan Wu, Jiannan Wu, Qiangqiang Wu, Xiaojun Wu, Anqi Xiao, Jinxia Xie, Chenlong Xu, Min Xu, Tianyang Xu, Yuanyou Xu, Bin Yan, Dawei Yang, Ming-Hsuan Yang, Tianyu Yang, Yi Yang, Zongxin Yang, Xuanwu Yin, Fisher Yu, Hongyuan Yu, Qianjin Yu, Weichen Yu, Yongsheng Yuan, Zehuan Yuan, Jianlin Zhang, Lu Zhang, Tianzhu Zhang, Guodongfang Zhao, Shaochuan Zhao, Yaozong Zheng, Bineng Zhong, Jiawen Zhu, Xuefeng Zhu, Yueting Zhuang, ChengAo Zong, Kunlong Zuo:
The First Visual Object Tracking Segmentation VOTS2023 Challenge Results. ICCV (Workshops) 2023: 1788-1810 - [c73]Haoyue Cheng, Zhaoyang Liu, Wayne Wu, Limin Wang:
Filter-Recovery Network for Multi-Speaker Audio-Visual Speech Separation. ICLR 2023 - [c72]Yue Feng, Zhengye Zhang, Rong Quan, Limin Wang, Jie Qin:
RefineTAD: Learning Proposal-free Refinement for Temporal Action Detection. ACM Multimedia 2023: 135-143 - [c71]Hongjie Zhang, Yi Liu, Yali Wang, Limin Wang, Yu Qiao:
Learning Discriminative Feature Representation for Open Set Action Recognition. ACM Multimedia 2023: 7696-7705 - [c70]Yutao Cui, Tianhui Song, Gangshan Wu, Limin Wang:
MixFormerV2: Efficient Fully Transformer Tracking. NeurIPS 2023 - [c69]Keqiang Sun, Junting Pan, Yuying Ge, Hao Li, Haodong Duan, Xiaoshi Wu, Renrui Zhang, Aojun Zhou, Zipeng Qin, Yi Wang, Jifeng Dai, Yu Qiao, Limin Wang, Hongsheng Li:
JourneyDB: A Benchmark for Generative Image Understanding. NeurIPS 2023 - [i114]Yutao Cui, Cheng Jiang, Gangshan Wu, Limin Wang:
MixFormer: End-to-End Tracking with Iterative Mixed Attention. CoRR abs/2302.02814 (2023) - [i113]Jiange Yang, Sheng Guo, Gangshan Wu, Limin Wang:
CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets. CoRR abs/2302.06148 (2023) - [i112]Guozhen Zhang, Yuhan Zhu, Haonan Wang, Youxin Chen, Gangshan Wu, Limin Wang:
Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation. CoRR abs/2303.00440 (2023) - [i111]Haisong Liu, Tao Lu, Yihui Xu, Jia Liu, Limin Wang:
Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion. CoRR abs/2303.12017 (2023) - [i110]Hanlin Wang, Yilu Wu, Sheng Guo, Limin Wang:
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos. CoRR abs/2303.14676 (2023) - [i109]Tao Wu, Mengqi Cao, Ziteng Gao, Gangshan Wu, Limin Wang:
STMixer: A One-Stage Sparse Action Detector. CoRR abs/2303.15879 (2023) - [i108]Kunchang Li, Yali Wang, Yizhuo Li, Yi Wang, Yinan He, Limin Wang, Yu Qiao:
Unmasked Teacher: Towards Training-Efficient Video Foundation Models. CoRR abs/2303.16058 (2023) - [i107]Tao Lu, Xiang Ding, Haisong Liu, Gangshan Wu, Limin Wang:
LinK: Linear Kernel for LiDAR-based 3D Perception. CoRR abs/2303.16094 (2023) - [i106]Lei Chen, Zhan Tong, Yibing Song, Gangshan Wu, Limin Wang:
CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection. CoRR abs/2303.16118 (2023) - [i105]Limin Wang, Bingkun Huang, Zhiyu Zhao, Zhan Tong, Yinan He, Yi Wang, Yali Wang, Yu Qiao:
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking. CoRR abs/2303.16727 (2023) - [i104]Ziteng Gao, Zhan Tong, Limin Wang, Mike Zheng Shou:
SparseFormer: Sparse Visual Recognition via Limited Latent Tokens. CoRR abs/2304.03768 (2023) - [i103]Yao Teng, Haisong Liu, Sheng Guo, Limin Wang:
StageInteractor: Query-based Object Detector with Cross-stage Interaction. CoRR abs/2304.04978 (2023) - [i102]Yutao Cui, Chenkai Zeng, Xiaoyu Zhao, Yichun Yang, Gangshan Wu, Limin Wang:
SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes. CoRR abs/2304.05170 (2023) - [i101]Chen Xu, Haocheng Shen, Fengyuan Shi, Boheng Chen, Yixuan Liao, Xiaoxin Chen, Limin Wang:
Progressive Visual Prompt Learning with Contrastive Feature Re-formation. CoRR abs/2304.08386 (2023) - [i100]Lei Chen, Zhan Tong, Yibing Song, Gangshan Wu, Limin Wang:
Efficient Video Action Detection with Token Dropout and Context Refinement. CoRR abs/2304.08451 (2023) - [i99]Zhaoyang Liu, Yinan He, Wenhai Wang, Weiyun Wang, Yi Wang, Shoufa Chen, Qinglong Zhang, Zeqiang Lai, Yang Yang, Qingyun Li, Jiashuo Yu, Kunchang Li, Zhe Chen, Xue Yang, Xizhou Zhu, Yali Wang, Limin Wang, Ping Luo, Jifeng Dai, Yu Qiao:
InternGPT: Solving Vision-Centric Tasks by Interacting with Chatbots Beyond Language. CoRR abs/2305.05662 (2023) - [i98]Kunchang Li, Yinan He, Yi Wang, Yizhuo Li, Wenhai Wang, Ping Luo, Yali Wang, Limin Wang, Yu Qiao:
VideoChat: Chat-Centric Video Understanding. CoRR abs/2305.06355 (2023) - [i97]Guo Chen, Yin-Dong Zheng, Jiahao Wang, Jilan Xu, Yifei Huang, Junting Pan, Yi Wang, Yali Wang, Yu Qiao, Tong Lu, Limin Wang:
VideoLLM: Modeling Video Sequence with Large Language Models. CoRR abs/2305.13292 (2023) - [i96]Yutao Cui, Tianhui Song, Gangshan Wu, Limin Wang:
MixFormerV2: Efficient Fully Transformer Tracking. CoRR abs/2305.15896 (2023) - [i95]Chuhao Jin, Wenhui Tan, Jiange Yang, Bei Liu, Ruihua Song, Limin Wang, Jianlong Fu:
AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation. CoRR abs/2305.18898 (2023) - [i94]