


Ping Luo 0002
羅平
Person information
- unicode name: 羅平
- affiliation: University of Hong Kong, Department of Computer Science, Hong Kong
- affiliation (PhD 2014): Chinese University of Hong Kong, Department of Information Engineering, Hong Kong
- affiliation (former): Sun Yat-Sen University, School of Software, Guangzhou, China
- affiliation (former): Lotus Hill Institute, China
Other persons with the same name
- Ping Luo: disambiguation page
- Ping Luo 0001: Chinese Academy of Sciences, Institute of Computing Technology, Beijing, China (and 1 more)
- Ping Luo 0003: University of Saskatchewan, Division of Biomedical Engineering, Saskatoon, SK, Canada
- Ping Luo 0004: Tsinghua University, Key Laboratory for Information System Security, Beijing, China (and 1 more)
- Ping Luo 0005: University of Electronic Science and Technology of China, Institute of Electronic and Information Engineering, State Key Laboratory of Electronic Thin Films and Integrated Devices, China
- Ping Luo 0006: Guangzhou University, School of Economics and Statistics, China
2020 – today
- 2025
- [j39]Hao Zhang, Ping Luo, Te Qi, Yan Xu, Tieyong Zeng:
Adaptive Superpixel-Guided Non-Homogeneous Image Dehazing. IEEE Signal Process. Lett. 32: 591-595 (2025) - [j38]Hao Zhang, Wenqi Shao, Hong Liu, Yongqiang Ma, Ping Luo, Yu Qiao, Nanning Zheng, Kaipeng Zhang:
B-AVIBench: Toward Evaluating the Robustness of Large Vision-Language Model on Black-Box Adversarial Visual-Instructions. IEEE Trans. Inf. Forensics Secur. 20: 1434-1446 (2025)
- 2024
- [j37]Xiaokang Chen, Mingyu Ding, Xiaodi Wang, Ying Xin, Shentong Mo, Yunhao Wang, Shumin Han, Ping Luo, Gang Zeng, Jingdong Wang:
Context Autoencoder for Self-supervised Representation Learning. Int. J. Comput. Vis. 132(1): 208-223 (2024) - [j36]Weijia Wu, Yuanqiang Cai, Chunhua Shen, Debing Zhang, Ying Fu, Hong Zhou, Ping Luo:
End-to-End Video Text Spotting with Transformer. Int. J. Comput. Vis. 132(9): 4019-4035 (2024) - [j35]Hao Zhang, Lumin Xu, Shenqi Lai, Wenqi Shao, Nanning Zheng, Ping Luo, Yu Qiao, Kaipeng Zhang:
Open-Vocabulary Animal Keypoint Detection with Semantic-Feature Matching. Int. J. Comput. Vis. 132(12): 5741-5758 (2024) - [j34]Jian Ding, Enze Xie, Hang Xu, Chenhan Jiang, Zhenguo Li, Ping Luo, Gui-Song Xia:
Deeply Unsupervised Patch Re-Identification for Pre-Training Object Detectors. IEEE Trans. Pattern Anal. Mach. Intell. 46(3): 1348-1361 (2024) - [j33]Wang Zeng, Sheng Jin, Lumin Xu, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang:
TCFormer: Visual Recognition via Token Clustering Transformer. IEEE Trans. Pattern Anal. Mach. Intell. 46(12): 9521-9535 (2024) - [j32]Junjie Wang, Qichao Zhang, Yao Mu, Dong Li, Dongbin Zhao, Yuzheng Zhuang, Ping Luo, Bin Wang, Jianye Hao:
Prototypical Context-Aware Dynamics for Generalization in Visual Control With Model-Based Reinforcement Learning. IEEE Trans. Ind. Informatics 20(9): 10717-10727 (2024) - [j31]Chongjian Ge, Yibing Song, Chao Ma, Yuankai Qi, Ping Luo:
Rethinking Attentive Object Detection via Neural Attention Learning. IEEE Trans. Image Process. 33: 1726-1739 (2024) - [j30]Zhouxia Wang, Jiawei Zhang, Xintao Wang, Tianshui Chen, Ying Shan, Wenping Wang, Ping Luo:
Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos. IEEE Trans. Image Process. 33: 5676-5687 (2024) - [j29]Zeyu Gao, Yao Mu, Chen Chen, Jingliang Duan, Ping Luo, Yanfeng Lu, Shengbo Eben Li:
Enhance Sample Efficiency and Robustness of End-to-End Urban Autonomous Driving via Semantic Masked World Model. IEEE Trans. Intell. Transp. Syst. 25(10): 13067-13079 (2024) - [j28]Chaofan Tao, Rui Lin, Quan Chen, Zhaoyang Zhang, Ping Luo, Ngai Wong:
FAT: Frequency-Aware Transformation for Bridging Full-Precision and Low-Precision Deep Representations. IEEE Trans. Neural Networks Learn. Syst. 35(2): 2640-2654 (2024) - [j27]Ping Luo, Jieren Cheng, Neal Xiong, Zhenhao Liu, Jie Wu:
FedVeca: Federated Vectorized Averaging on Non-IID Data With Adaptive Bi-Directional Global Objective. IEEE Trans. Parallel Distributed Syst. 35(11): 2102-2113 (2024) - [c206]Tianqi Wang, Sukmin Kim, Wenxuan Ji, Enze Xie, Chongjian Ge, Junsong Chen, Zhenguo Li, Ping Luo:
DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving. AAAI 2024: 5599-5606 - [c205]Chengyue Wu, Yukang Gan, Yixiao Ge, Zeyu Lu, Jiahao Wang, Ye Feng, Ying Shan, Ping Luo:
LLaMA Pro: Progressive LLaMA with Block Expansion. ACL (1) 2024: 6518-6537 - [c204]Fanqing Meng, Wenqi Shao, Quanfeng Lu, Peng Gao, Kaipeng Zhang, Yu Qiao, Ping Luo:
ChartAssistant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning. ACL (Findings) 2024: 7775-7803 - [c203]Mengkang Hu, Haoyu Dong, Ping Luo, Shi Han, Dongmei Zhang:
KET-QA: A Dataset for Knowledge Enhanced Table Question Answering. LREC/COLING 2024: 9705-9719 - [c202]Lirui Zhao, Yue Yang, Kaipeng Zhang, Wenqi Shao, Yuxin Zhang, Yu Qiao, Ping Luo, Rongrong Ji:
DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model. CVPR 2024: 6390-6399 - [c201]Shoufa Chen, Mengmeng Xu, Jiawei Ren, Yuren Cong, Sen He, Yanping Xie, Animesh Sinha, Ping Luo, Tao Xiang, Juan-Manuel Pérez-Rúa:
GenTron: Diffusion Transformers for Image and Video Generation. CVPR 2024: 6441-6451 - [c200]Jiazhi Yang, Shenyuan Gao, Yihang Qiu, Li Chen, Tianyu Li, Bo Dai, Kashyap Chitta, Penghao Wu, Jia Zeng, Ping Luo, Jun Zhang, Andreas Geiger, Yu Qiao, Hongyang Li:
Generalized Predictive Model for Autonomous Driving. CVPR 2024: 14662-14672 - [c199]Zhixuan Liang, Yao Mu, Hengbo Ma, Masayoshi Tomizuka, Mingyu Ding, Ping Luo:
SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution. CVPR 2024: 16467-16476 - [c198]Yutao Hu, Tianbin Li, Quanfeng Lu, Wenqi Shao, Junjun He, Yu Qiao, Ping Luo:
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM. CVPR 2024: 22170-22183 - [c197]Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai:
Intern VL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks. CVPR 2024: 24185-24198 - [c196]Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, Zhenguo Li:
PIXART-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation. ECCV (32) 2024: 74-91 - [c195]Ruijie Yao, Sheng Jin, Lumin Xu, Wang Zeng, Wentao Liu, Chen Qian, Ping Luo, Ji Wu:
GKGNet: Group K-Nearest Neighbor Based Graph Convolutional Network for Multi-label Image Recognition. ECCV (18) 2024: 91-107 - [c194]Sheng Jin, Shuhuai Li, Tong Li, Wentao Liu, Chen Qian, Ping Luo:
You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-person Multi-task Human-Centric Perception. ECCV (18) 2024: 126-146 - [c193]Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Jens Beißwenger, Ping Luo, Andreas Geiger, Hongyang Li:
DriveLM: Driving with Graph Visual Question Answering. ECCV (52) 2024: 256-274 - [c192]Jianhao Li, Tianyu Sun, Zhongdao Wang, Enze Xie, Bailan Feng, Hongbo Zhang, Ze Yuan, Ke Xu, Jiaheng Liu, Ping Luo:
Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts. ECCV (84) 2024: 407-423 - [c191]Yi Zhang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu:
When Pedestrian Detection Meets Multi-modal Learning: Generalist Model and Benchmark Dataset. ECCV (48) 2024: 430-448 - [c190]Sheng Jin, Ruijie Yao, Lumin Xu, Wentao Liu, Chen Qian, Ji Wu, Ping Luo:
UniFS: Universal Few-Shot Instance Perception with Point Representations. ECCV (29) 2024: 464-483 - [c189]Yue Yang, Kaipeng Zhang, Yuying Ge, Wenqi Shao, Zeyue Xue, Yu Qiao, Ping Luo:
Align, Adapt and Inject: Audio-Guided Image Generation, Editing and Stylization. ICASSP 2024: 3475-3479 - [c188]Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Zhongdao Wang, James T. Kwok, Ping Luo, Huchuan Lu, Zhenguo Li:
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis. ICLR 2024 - [c187]Mengkang Hu, Yao Mu, Xinmiao Yu, Mingyu Ding, Shiguang Wu, Wenqi Shao, Qiguang Chen, Bin Wang, Yu Qiao, Ping Luo:
Tree-Planner: Efficient Close-loop Task Planning with Large Language Models. ICLR 2024 - [c186]Yuanfeng Ji, Chongjian Ge, Weikai Kong, Enze Xie, Zhengying Liu, Zhenguo Li, Ping Luo:
Large Language Models as Automated Aligners for benchmarking Vision-Language Models. ICLR 2024 - [c185]Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding:
VDT: General-purpose Video Diffusion Transformers via Mask Modeling. ICLR 2024 - [c184]Wenqi Shao, Mengzhao Chen, Zhaoyang Zhang, Peng Xu, Lirui Zhao, Zhiqian Li, Kaipeng Zhang, Peng Gao, Yu Qiao, Ping Luo:
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models. ICLR 2024 - [c183]Haopeng Sun, Lumin Xu, Sheng Jin, Ping Luo, Chen Qian, Wentao Liu:
PROGRAM: PROtotype GRAph Model based Pseudo-Label Learning for Test-Time Adaptation. ICLR 2024 - [c182]Yi Wang, Yinan He, Yizhuo Li, Kunchang Li, Jiashuo Yu, Xin Ma, Xinhao Li, Guo Chen, Xinyuan Chen, Yaohui Wang, Ping Luo, Ziwei Liu, Yali Wang, Limin Wang, Yu Qiao:
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation. ICLR 2024 - [c181]Peng Xu, Wenqi Shao, Mengzhao Chen, Shitao Tang, Kaipeng Zhang, Peng Gao, Fengwei An, Yu Qiao, Ping Luo:
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation. ICLR 2024 - [c180]Yao Mu, Junting Chen, Qinglong Zhang, Shoufa Chen, Qiaojun Yu, Chongjian Ge, Runjian Chen, Zhixuan Liang, Mengkang Hu, Chaofan Tao, Peize Sun, Haibao Yu, Chao Yang, Wenqi Shao, Wenhai Wang, Jifeng Dai, Yu Qiao, Mingyu Ding, Ping Luo:
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis. ICML 2024 - [c179]Yue Yang, Yuqi Lin, Hong Liu, Wenqi Shao, Runjian Chen, Hailong Shang, Yu Wang, Yu Qiao, Kaipeng Zhang, Ping Luo:
Position: Towards Implicit Prompt For Text-To-Image Models. ICML 2024 - [c178]Kaining Ying, Fanqing Meng, Jin Wang, Zhiqian Li, Han Lin, Yue Yang, Hao Zhang, Wenbo Zhang, Yuqi Lin, Shuo Liu, Jiayi Lei, Quanfeng Lu, Runjian Chen, Peng Xu, Renrui Zhang, Haozhe Zhang, Peng Gao, Yali Wang, Yu Qiao, Ping Luo, Kaipeng Zhang, Wenqi Shao:
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI. ICML 2024 - [c177]Chuanhao Li, Zhen Li, Chenchen Jing, Shuo Liu, Wenqi Shao, Yuwei Wu, Ping Luo, Yu Qiao, Kaipeng Zhang:
SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up-to-Date Internet Knowledge. NeurIPS 2024 - [c176]Shuo Liu, Kaining Ying, Hao Zhang, Yue Yang, Yuqi Lin, Tianle Zhang, Chuanhao Li, Yu Qiao, Ping Luo, Wenqi Shao, Kaipeng Zhang:
ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Ablation Capability for Large Vision-Language Models. NeurIPS 2024 - [c175]Weiyun Wang, Shuibo Zhang, Yiming Ren, Yuchen Duan, Tiantong Li, Shuo Liu, Mengkang Hu, Zhe Chen, Kaipeng Zhang, Lewei Lu, Xizhou Zhu, Ping Luo, Yu Qiao, Jifeng Dai, Wenqi Shao, Wenhai Wang:
Needle In A Multimodal Haystack. NeurIPS 2024 - [c174]Jiannan Wu, Muyan Zhong, Sen Xing, Zeqiang Lai, Zhaoyang Liu, Zhe Chen, Wenhai Wang, Xizhou Zhu, Lewei Lu, Tong Lu, Ping Luo, Yu Qiao, Jifeng Dai:
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks. NeurIPS 2024 - [c173]Tianle Zhang, Langtian Ma, Yuchen Yan, Yuchen Zhang, Yue Yang, Ziyao Guo, Wenqi Shao, Kai Wang, Yang You, Yu Qiao, Ping Luo, Kaipeng Zhang:
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality. NeurIPS 2024 - [c172]Jie Zhu, Yixiong Chen, Mingyu Ding, Ping Luo, Leye Wang, Jingdong Wang:
MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts. NeurIPS 2024 - [c171]Jia Zeng, Qingwen Bu, Bangjun Wang, Wenke Xia, Li Chen, Hao Dong, Haoming Song, Dong Wang, Di Hu, Ping Luo, Heming Cui, Bin Zhao, Xuelong Li, Yu Qiao, Hongyang Li:
Learning Manipulation by Predicting Interaction. Robotics: Science and Systems 2024 - [c170]Anran Liu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Zhiyang Dou, Hao-Xiang Guo, Ping Luo, Wenping Wang:
Part123: Part-aware 3D Reconstruction from a Single-view Image. SIGGRAPH (Conference Paper Track) 2024: 24 - [c169]Zhouxia Wang, Ziyang Yuan, Xintao Wang, Yaowei Li, Tianshui Chen, Menghan Xia, Ping Luo, Ying Shan:
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation. SIGGRAPH (Conference Paper Track) 2024: 114 - [i276]Fanqing Meng, Wenqi Shao, Quanfeng Lu, Peng Gao, Kaipeng Zhang, Yu Qiao, Ping Luo:
ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning. CoRR abs/2401.02384 (2024) - [i275]Chengyue Wu, Yukang Gan, Yixiao Ge, Zeyu Lu, Jiahao Wang, Ye Feng, Ping Luo, Ying Shan:
LLaMA Pro: Progressive LLaMA with Block Expansion. CoRR abs/2401.02415 (2024) - [i274]Junsong Chen, Yue Wu, Simian Luo, Enze Xie, Sayak Paul, Ping Luo, Hang Zhao, Zhenguo Li:
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models. CoRR abs/2401.05252 (2024) - [i273]Yutao Hu, Tianbin Li, Quanfeng Lu, Wenqi Shao, Junjun He, Yu Qiao, Ping Luo:
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM. CoRR abs/2402.09181 (2024) - [i272]Junting Chen, Yao Mu, Qiaojun Yu, Tianming Wei, Silang Wu, Zhecheng Yuan, Zhixuan Liang, Chao Yang, Kaipeng Zhang, Wenqi Shao, Yu Qiao, Huazhe Xu, Mingyu Ding, Ping Luo:
RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation. CoRR abs/2402.14623 (2024) - [i271]Zekang Yang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu:
AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks. CoRR abs/2402.15351 (2024) - [i270]Yao Mu, Junting Chen, Qinglong Zhang, Shoufa Chen, Qiaojun Yu, Chongjian Ge, Runjian Chen, Zhixuan Liang, Mengkang Hu, Chaofan Tao, Peize Sun, Haibao Yu, Chao Yang, Wenqi Shao, Wenhai Wang, Jifeng Dai, Yu Qiao, Mingyu Ding, Ping Luo:
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis. CoRR abs/2402.16117 (2024) - [i269]Peng Xu, Wenqi Shao, Mengzhao Chen, Shitao Tang, Kaipeng Zhang, Peng Gao, Fengwei An, Yu Qiao, Ping Luo:
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation. CoRR abs/2402.16880 (2024) - [i268]Yue Yang, Yuqi Lin, Hong Liu, Wenqi Shao, Runjian Chen, Hailong Shang, Yu Wang, Yu Qiao, Kaipeng Zhang, Ping Luo:
Towards Implicit Prompt For Text-To-Image Models. CoRR abs/2403.02118 (2024) - [i267]Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, Zhenguo Li:
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation. CoRR abs/2403.04692 (2024) - [i266]Hao Zhang, Wenqi Shao, Hong Liu, Yongqiang Ma, Ping Luo, Yu Qiao, Kaipeng Zhang:
AVIBench: Towards Evaluating the Robustness of Large Vision-Language Model on Adversarial Visual-Instructions. CoRR abs/2403.09346 (2024) - [i265]Jiazhi Yang, Shenyuan Gao, Yihang Qiu, Li Chen, Tianyu Li, Bo Dai, Kashyap Chitta, Penghao Wu, Jia Zeng, Ping Luo, Jun Zhang, Andreas Geiger, Yu Qiao, Hongyang Li:
Generalized Predictive Model for Autonomous Driving. CoRR abs/2403.09630 (2024) - [i264]Tianqi Wang, Enze Xie, Ruihang Chu, Zhenguo Li, Ping Luo:
DriveCoT: Integrating Chain-of-Thought Reasoning with End-to-End Driving. CoRR abs/2403.16996 (2024) - [i263]Shilong Zhang, Lianghua Huang, Xi Chen, Yifei Zhang, Zhi-Fan Wu, Yutong Feng, Wei Wang, Yujun Shen, Yu Liu, Ping Luo:
FlashFace: Human Image Personalization with High-fidelity Identity Preservation. CoRR abs/2403.17008 (2024) - [i262]Shuo Liu, Kaining Ying, Hao Zhang, Yue Yang, Yuqi Lin, Tianle Zhang, Chuanhao Li, Yu Qiao, Ping Luo, Wenqi Shao, Kaipeng Zhang:
ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Capability for Large Vision-Language Models. CoRR abs/2403.20194 (2024) - [i261]Haibao Yu, Wenxian Yang, Jiaru Zhong, Zhenwei Yang, Siqi Fan, Ping Luo, Zaiqing Nie:
End-to-End Autonomous Driving through V2X Cooperation. CoRR abs/2404.00717 (2024) - [i260]Lirui Zhao, Yue Yang, Kaipeng Zhang, Wenqi Shao, Yuxin Zhang, Yu Qiao, Ping Luo, Rongrong Ji:
DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model. CoRR abs/2404.01342 (2024) - [i259]Jiahao Wang, Wenqi Shao, Mengzhao Chen, Chengyue Wu, Yong Liu, Kaipeng Zhang, Songyang Zhang, Kai Chen, Ping Luo:
Adapting LLaMA Decoder to Vision Transformer. CoRR abs/2404.06773 (2024) - [i258]Kaining Ying, Fanqing Meng, Jin Wang, Zhiqian Li, Han Lin, Yue Yang, Hao Zhang, Wenbo Zhang, Yuqi Lin, Shuo Liu, Jiayi Lei, Quanfeng Lu, Runjian Chen, Peng Xu, Renrui Zhang, Haozhe Zhang, Peng Gao, Yali Wang, Yu Qiao, Ping Luo, Kaipeng Zhang, Wenqi Shao:
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI. CoRR abs/2404.16006 (2024) - [i257]Sheng Jin, Ruijie Yao, Lumin Xu, Wentao Liu, Chen Qian, Ji Wu, Ping Luo:
UniFS: Universal Few-shot Instance Perception with Point Representations. CoRR abs/2404.19401 (2024) - [i256]Yao Lai, Jinxin Liu, David Z. Pan, Ping Luo:
Scalable and Effective Arithmetic Tree Generation for Adder and Multiplier Designs. CoRR abs/2405.06758 (2024) - [i255]Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo:
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots. CoRR abs/2405.07990 (2024) - [i254]Mengkang Hu, Haoyu Dong, Ping Luo, Shi Han, Dongmei Zhang:
KET-QA: A Dataset for Knowledge Enhanced Table Question Answering. CoRR abs/2405.08099 (2024) - [i253]Chuanhao Li, Zhen Li, Chenchen Jing, Shuo Liu, Wenqi Shao, Yuwei Wu, Ping Luo, Yu Qiao, Kaipeng Zhang:
UDKAG: Augmenting Large Vision-Language Models with Up-to-Date Knowledge. CoRR abs/2405.14554 (2024) - [i252]Yao Lai, Sungyoung Lee, Guojin Chen, Souradip Poddar, Mengkang Hu, David Z. Pan, Ping Luo:
AnalogCoder: Analog Circuit Design via Training-Free Code Generation. CoRR abs/2405.14918 (2024) - [i251]Anran Liu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Zhiyang Dou, Hao-Xiang Guo, Ping Luo, Wenping Wang:
Part123: Part-aware 3D Reconstruction from a Single-view Image. CoRR abs/2405.16888 (2024) - [i250]Jia Zeng, Qingwen Bu, Bangjun Wang, Wenke Xia, Li Chen, Hao Dong, Haoming Song, Dong Wang, Di Hu, Ping Luo, Heming Cui, Bin Zhao, Xuelong Li, Yu Qiao, Hongyang Li:
Learning Manipulation by Predicting Interaction. CoRR abs/2406.00439 (2024) - [i249]Peize Sun, Yi Jiang, Shoufa Chen, Shilong Zhang, Bingyue Peng, Ping Luo, Zehuan Yuan:
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation. CoRR abs/2406.06525 (2024) - [i248]Weiyun Wang, Shuibo Zhang, Yiming Ren, Yuchen Duan, Tiantong Li, Shuo Liu, Mengkang Hu, Zhe Chen, Kaipeng Zhang, Lewei Lu, Xizhou Zhu, Ping Luo, Yu Qiao, Jifeng Dai, Wenqi Shao, Wenhai Wang:
Needle In A Multimodal Haystack. CoRR abs/2406.07230 (2024) - [i247]Jiannan Wu, Muyan Zhong, Sen Xing, Zeqiang Lai, Zhaoyang Liu, Wenhai Wang, Zhe Chen, Xizhou Zhu, Lewei Lu, Tong Lu, Ping Luo, Yu Qiao, Jifeng Dai:
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks. CoRR abs/2406.08394 (2024) - [i246]Quanfeng Lu, Wenqi Shao, Zitao Liu, Fanqing Meng, Boxuan Li, Botong Chen, Siyuan Huang, Kaipeng Zhang, Yu Qiao, Ping Luo:
GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices. CoRR abs/2406.08451 (2024) - [i245]Tianle Zhang, Langtian Ma, Yuchen Yan, Yuchen Zhang, Kai Wang, Yue Yang, Ziyao Guo, Wenqi Shao, Yang You, Yu Qiao, Ping Luo, Kaipeng Zhang:
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality. CoRR abs/2406.08845 (2024) - [i244]Zeyu Gao, Yao Mu, Jinye Qu, Mengkang Hu, Lingyue Guo, Ping Luo, Yanfeng Lu:
DAG-Plan: Generating Directed Acyclic Dependency Graphs for Dual-Arm Cooperative Planning. CoRR abs/2406.09953 (2024) - [i243]Fanqing Meng, Wenqi Shao, Lixin Luo, Yahong Wang, Yiran Chen, Quanfeng Lu, Yue Yang, Tianshuo Yang, Kaipeng Zhang, Yu Qiao, Ping Luo:
PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models. CoRR abs/2406.11802 (2024) - [i242]Yatai Ji, Shilong Zhang, Jie Wu, Peize Sun, Weifeng Chen, Xuefeng Xiao, Sidi Yang, Yujiu Yang, Ping Luo:
IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model. CoRR abs/2407.07577 (2024) - [i241]Yi Zhang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu:
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset. CoRR abs/2407.10125 (2024) - [i240]Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Peng Gao, Kaipeng Zhang, Yu Qiao, Ping Luo:
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models. CoRR abs/2407.11062 (2024) - [i239]Wang Zeng, Sheng Jin, Lumin Xu, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang:
TCFormer: Visual Recognition via Token Clustering Transformer. CoRR abs/2407.11321 (2024) - [i238]Jianhao Li, Tianyu Sun, Zhongdao Wang, Enze Xie, Bailan Feng, Hongbo Zhang, Ze Yuan, Ke Xu, Jiaheng Liu, Ping Luo:
Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts. CoRR abs/2407.11382 (2024) - [i237]Chaofan Tao, Qian Liu, Longxu Dou, Niklas Muennighoff, Zhongwei Wan, Ping Luo, Min Lin, Ngai Wong:
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies. CoRR abs/2407.13623 (2024) - [i236]Lirui Zhao, Tianshuo Yang, Wenqi Shao, Yuxin Zhang, Yu Qiao, Ping Luo, Kaipeng Zhang, Rongrong Ji:
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model. CoRR abs/2407.16982 (2024) - [i235]Fanqing Meng, Jin Wang, Chuanhao Li, Quanfeng Lu, Hao Tian, Jiaqi Liao, Xizhou Zhu, Jifeng Dai, Yu Qiao, Ping Luo, Kaipeng Zhang, Wenqi Shao:
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models. CoRR abs/2408.02718 (2024) - [i234]Mengkang Hu, Tianxing Chen, Qiguang Chen, Yao Mu, Wenqi Shao, Ping Luo:
HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model. CoRR abs/2408.09559 (2024) - [i233]Yangyang Xu, Wenqi Shao, Yong Du, Haiming Zhu, Yang Zhou, Ping Luo, Shengfeng He:
Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing. CoRR abs/2408.13395 (2024) - [i232]Yao Mu, Tianxing Chen, Shijia Peng, Zanxin Chen, Zeyu Gao, Yude Zou, Lunkai Lin, Zhiqiang Xie, Ping Luo:
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins (early version). CoRR abs/2409.02920 (2024) - [i231]Qingwen Bu, Jia Zeng, Li Chen, Yanchao Yang, Guyue Zhou, Junchi Yan, Ping Luo, Heming Cui, Yi Ma, Hongyang Li:
Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation. CoRR abs/2409.09016 (2024) - [i230]Xi Wang, Tianxing Chen, Qiaojun Yu, Tianling Xu, Zanxin Chen, Yiting Fu, Cewu Lu, Yao Mu, Ping Luo:
Articulated Object Manipulation using Online Axis Estimation with SAM2-Based Tracking. CoRR abs/2409.16287 (2024) - [i229]Hao Zhang, Yongqiang Ma, Wenqi Shao, Ping Luo, Nanning Zheng, Kaipeng Zhang:
HRVMamba: High-Resolution Visual State Space Model for Dense Prediction. CoRR abs/2410.03174 (2024) - [i228]Mengzhao Chen, Yi Liu, Jiahao Wang, Yi Bin, Wenqi Shao, Ping Luo:
PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs. CoRR abs/2410.05265 (2024) - [i227]Fanqing Meng, Jiaqi Liao, Xinyu Tan, Wenqi Shao, Quanfeng Lu, Kaipeng Zhang, Yu Cheng, Dianqi Li, Yu Qiao, Ping Luo:
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation. CoRR abs/2410.05363 (2024) - [i226]Peng Xu, Wenqi Shao, Mingyu Ding, Ping Luo:
DCP: Learning Accelerator Dataflow for Neural Network via Propagation. CoRR abs/2410.06553 (2024) - [i225]Yue Yang, Shuibai Zhang, Wenqi Shao, Kaipeng Zhang, Yi Bin, Yu Wang, Ping Luo:
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping. CoRR abs/2410.08695 (2024) - [i224]Zhouxia Wang, Jiawei Zhang, Xintao Wang, Tianshui Chen, Ying Shan, Wenping Wang, Ping Luo:
Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos. CoRR abs/2410.11828 (2024) - [i223]Chengyue Wu, Xiaokang Chen, Zhiyu Wu, Yiyang Ma, Xingchao Liu, Zizheng Pan, Wen Liu, Zhenda Xie, Xingkai Yu, Chong Ruan, Ping Luo:
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation. CoRR abs/2410.13848 (2024) - [i222]