


ICCV 2023: Paris, France
- IEEE/CVF International Conference on Computer Vision, ICCV 2023, Paris, France, October 1-6, 2023. IEEE 2023, ISBN 979-8-3503-0718-4
- Xinyang Liu, Yijin Li, Yanbin Teng, Hujun Bao, Guofeng Zhang, Yinda Zhang, Zhaopeng Cui:
Multi-Modal Neural Radiance Field for Monocular Dense SLAM with a Light-Weight ToF Sensor. 1-11 - Chandan Yeshwanth, Yueh-Cheng Liu, Matthias Nießner, Angela Dai:
ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes. 12-22 - Jiachen Lu, Hongyang Li, Renyuan Peng, Feng Wen, Xinyue Cai, Wei Zhang, Hang Xu, Li Zhang:
Translating Images to Road Network: A Non-Autoregressive Sequence-to-Sequence Approach. 23-33 - Ruojin Cai, Joseph Tung, Qianqian Wang, Hadar Averbuch-Elor, Bharath Hariharan, Noah Snavely:
Doppelgangers: Learning to Disambiguate Images of Similar Structures. 34-44 - Jinjie Mai, Abdullah Hamdi, Silvio Giancola, Chen Zhao, Bernard Ghanem:
EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries. 45-57 - Wenqiang Xu, Wenxin Du, Han Xue, Yutong Li, Ruolin Ye, Yanfeng Wang, Cewu Lu:
ClothPose: A Real-world Benchmark for Visual Analysis of Garment Pose via An Indirect Recording Solution. 58-68 - Zijie Jiang, Masatoshi Okutomi:
EMR-MSF: Self-Supervised Recurrent Monocular Scene Flow Exploiting Ego-Motion Rigidity. 69-78 - Ruofan Liang, Huiting Chen, Chunlin Li, Fan Chen, Selvakumar Panneer, Nandita Vijaykumar:
ENVIDR: Implicit Differentiable Renderer with Neural Environment Lighting. 79-89 - Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, Huan Zhang, Pin-Yu Chen, Shiyu Chang, Zhangyang Wang, Sijia Liu:
Robust Mixture-of-Expert Training for Convolutional Neural Networks. 90-101 - Dong Lu, Zhiqiang Wang, Teng Wang, Weili Guan, Hongchang Gao, Feng Zheng:
Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models. 102-111 - Hritik Bansal, Fan Yin, Nishad Singhi, Aditya Grover, Yu Yang, Kai-Wei Chang:
CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning. 112-123 - Md Farhamdur Reza, Ali Rahmati, Tianfu Wu, Huaiyu Dai:
CGBA: Curvature-aware Geometric Black-box Attack. 124-133 - Minjong Lee, Dongwoo Kim:
Robust Evaluation of Diffusion-Based Adversarial Purification. 134-144 - Yao Ge, Yun Li, Keji Han, Junyi Zhu, Xianzhong Long:
Advancing Example Exploitation Can Alleviate Critical Challenges in Adversarial Training. 145-154 - Zixuan Zhu, Rui Wang, Cong Zou, Lihua Jing:
The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data. 155-164 - Indranil Sur, Karan Sikka, Matthew Walmer, Kaushik Koneripalli, Anirban Roy, Xiao Lin, Ajay Divakaran, Susmit Jha:
TIJO: Trigger Inversion with Joint Optimization for Defending Multimodal Backdoored Models. 165-175 - Yangru Huang, Peixi Peng, Yifan Zhao, Yunpeng Zhai, Haoran Xu, Yonghong Tian:
Simoun: Synergizing Interactive Motion-appearance Understanding for Vision-based Reinforcement Learning. 176-185 - Yiming Li, Qi Fang, Jiamu Bai, Siheng Chen, Felix Juefei-Xu, Chen Feng:
Among Us: Adversarially Robust Collaborative Perception by Consensus. 186-195 - Cristiano Saltori, Aljosa Osep, Elisa Ricci, Laura Leal-Taixé:
Walking Your LiDOG: A Journey Through Multiple Domains for LiDAR Semantic Segmentation. 196-206 - Yunpeng Zhai, Peixi Peng, Yifan Zhao, Yangru Huang, Yonghong Tian:
Stabilizing Visual Reinforcement Learning via Asymmetric Interactive Cooperation. 207-216 - Yuanzhi Liang, Xiaohan Wang, Linchao Zhu, Yi Yang:
MAAL: Multimodality-Aware Autoencoder-based Affordance Learning for 3D Articulated Objects. 217-227 - Lingdong Kong, Youquan Liu, Runnan Chen, Yuexin Ma, Xinge Zhu, Yikang Li, Yuenan Hou, Yu Qiao, Ziwei Liu:
Rethinking Range View Representation for LiDAR Segmentation. 228-240 - Haitao Lin, Yanwei Fu, Xiangyang Xue:
PourIt!: Weakly-supervised Liquid Perception from a Single Image for Visual Closed-Loop Robotic Pouring. 241-251 - Arthur Moreau, Nathan Piasco, Moussâb Bennehar, Dzmitry Tsishkou, Bogdan Stanciulescu, Arnaud de La Fortelle:
CROSSFIRE: Camera Relocalization On Self-Supervised Features from an Implicit Representation. 252-262 - Hyesong Choi, Hunsang Lee, Seongwon Jeong, Dongbo Min:
Environment Agnostic Representation for Visual Reinforcement learning. 263-273 - Qiongjie Cui, Huaijiang Sun, Jianfeng Lu, Weiqing Li, Bin Li, Hongwei Yi, Haofan Wang:
Test-time Personalizable Forecasting of 3D Human Poses. 274-283 - Hao Xiang, Runsheng Xu, Jiaqi Ma:
HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative Perception with Vision Transformer. 284-295 - Antoine Mercier, Ruan Erasmus, Yashesh Savani, Manik Dhingra, Fatih Porikli, Guillaume Berger:
Efficient neural supersampling on a novel gaming dataset. 296-306 - Hong-Wing Pang, Binh-Son Hua, Sai-Kit Yeung:
Locally Stylized Neural Radiance Fields. 307-316 - Dongqing Wang, Tong Zhang, Sabine Süsstrunk:
NEMTO: Neural Environment Matting for Novel View and Relighting Synthesis of Transparent Objects. 317-327 - Xiaoyang Kang, Tao Yang, Wenqi Ouyang, Peiran Ren, Lingzhi Li, Xuansong Xie:
DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders. 328-338 - Weicai Ye, Shuo Chen, Chong Bao, Hujun Bao, Marc Pollefeys, Zhaopeng Cui, Guofeng Zhang:
IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis. 339-351 - Jiayi Liu, Ali Mahdavi-Amiri, Manolis Savva:
PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects. 352-363 - Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu:
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model. 364-373 - Maham Tanveer, Yizhi Wang, Ali Mahdavi-Amiri, Hao Zhang:
DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion. 374-384 - Yi-Ling Qiao, Alexander Gao, Yiran Xu, Yue Feng, Jia-Bin Huang, Ming C. Lin:
Dynamic Mesh-Aware Radiance Fields. 385-396 - Wenzhang Sun, Yunlong Che, Yandong Guo, Han Huang:
Neural Reconstruction of Relightable Human Model from Monocular Video. 397-407 - Alexander Mai, Dor Verbin, Falko Kuester, Sara Fridovich-Keil:
Neural Microfacet Fields for Inverse Rendering. 408-418 - Ishit Mehta, Manmohan Chandraker, Ravi Ramamoorthi:
A Theory of Topological Derivatives for Inverse Rendering of Geometry. 419-429 - Etai Sella, Gal Fiebelman, Peter Hedman, Hadar Averbuch-Elor:
Vox-E: Text-guided Voxel Editing of 3D Objects. 430-440 - Chenxin Li, Brandon Y. Feng, Zhiwen Fan, Panwang Pan, Zhangyang Wang:
StegaNeRF: Embedding Invisible Information within Neural Radiance Fields. 441-453 - Liu He, Daniel G. Aliaga:
GlobalMapper: Arbitrary-Shaped Urban Layout Generation. 454-464 - Fan Lu, Yan Xu, Guang Chen, Hongsheng Li, Kwan-Yee Lin, Changjun Jiang:
Urban Radiance Field Representation with Deformable Neural Mesh Primitives. 465-476 - Barbara Roessle, Matthias Nießner:
End2End Multi-View Feature Matching with Differentiable Pose Optimization. 477-487 - Chen Geng, Hong-Xing Yu, Sharon Zhang, Maneesh Agrawala, Jiajun Wu:
Tree-Structured Shading Decomposition. 488-498 - Dominique Piché-Meunier, Yannick Hold-Geoffroy, Jianming Zhang, Jean-François Lalonde:
Lens Parameter Estimation for Realistic Depth of Field Modeling. 499-508 - Chongyang Zhong, Lei Hu, Zihao Zhang, Shihong Xia:
AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism. 509-519 - Manuel Ladron de Guevara, Yannick Hold-Geoffroy, Jose Echevarria, Cameron Smith, Yijun Li, Daichi Ito:
Cross-modal Latent Space Alignment for Image to Avatar Translation. 520-529 - Yibo Yang, Stephan Mandt:
Computationally-Efficient Neural Image Compression with Shallow Decoders. 530-540 - Salwa K. Al Khatib, Mohamed El Amine Boudjoghra, Jean Lahoud, Fahad Shahbaz Khan:
3D Instance Segmentation via Enhanced Spatial and Semantic Supervision. 541-550 - Zhijie Deng, Yucen Luo:
Learning Neural Eigenfunctions for Unsupervised Semantic Segmentation. 551-561 - Weiguang Zhao, Yuyao Yan, Chaolong Yang, Jianan Ye, Xi Yang, Kaizhu Huang:
Divide and Conquer: 3D Point Cloud Instance Segmentation With Point-Wise Binarization. 562-571 - Wentong Li, Yuqian Yuan, Song Wang, Jianke Zhu, Jianshu Li, Jian Liu, Lei Zhang:
Point2Mask: Point-supervised Panoptic Segmentation via Optimal Transport. 572-581 - Sina Gholamian, Ali Vahdat:
Handwritten and Printed Text Segmentation: A Signature Case Study. 582-592 - Sihyeon Kim, Juyeon Ko, Minseok Joo, Juhan Cha, Jaewon Lee, Hyunwoo J. Kim:
Semantic-Aware Implicit Template Learning via Part Deformation Consistency. 593-603 - Yunze Liu, Junyu Chen, Zekai Zhang, Jingwei Huang, Li Yi:
LeaF: Learning Frames for 4D Point Cloud Sequence Understanding. 604-613 - Sanghyun Jo, In-Jae Yu, Kyungsu Kim:
MARS: Model-agnostic Biased Object Removal without Additional Supervision for Weakly-Supervised Semantic Segmentation. 614-623 - Zelin Peng, Guanchun Wang, Lingxi Xie, Dongsheng Jiang, Wei Shen, Qi Tian:
USAGE: A Unified Seed Area Generation Paradigm for Weakly Supervised Semantic Segmentation. 624-634 - Maksym Bekuzarov, Ariana Bermudez, Joon-Young Lee, Hao Li:
XMem++: Production-level Video Segmentation From Few Annotated Frames. 635-644 - Maolin Gao, Paul Roetzer, Marvin Eisenberger, Zorah Lähner, Michael Möller, Daniel Cremers, Florian Bernard:
ΣIGMA: Scale-Invariant Global Sparse Shape Matching. 645-654 - Qianxiong Xu, Wenting Zhao, Guosheng Lin, Cheng Long:
Self-Calibrated Cross Attention Network for Few-Shot Segmentation. 655-665 - Kehan Li, Yian Zhao, Zhennan Wang, Zesen Cheng, Peng Jin, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen:
Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation. 666-676 - Sunghwan Kim, Dae-Hwan Kim, Hoseong Kim:
Texture Learning Domain Randomization for Domain Generalized Segmentation. 677-687 - Tiankang Su, Huihui Song, Dong Liu, Bo Liu, Qingshan Liu:
Unsupervised Video Object Segmentation with Online Adversarial Self-Tuning. 688-698 - Jun Chen, Deyao Zhu, Guocheng Qian, Bernard Ghanem, Zhicheng Yan, Chenchen Zhu, Fanyi Xiao, Sean Chang Culatana, Mohamed Elhoseiny:
Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only. 699-710 - Nazir Nayal, Misra Yavuz, João F. Henriques, Fatma Güney:
RbA: Segmenting Unknown Regions Rejected by All. 711-722 - Sriram Ravindran, Debraj Basu:
Sempart: Self-supervised Multi-resolution Partitioning of Image Semantics. 723-733 - Sadra Safadoust, Fatma Güney:
Multi-Object Discovery by Low-Dimensional Object Motion. 734-744 - Enxu Li, Sergio Casas, Raquel Urtasun:
MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory. 745-754 - Changwei Wang, Rongtao Xu, Shibiao Xu, Weiliang Meng, Xiaopeng Zhang:
Treating Pseudo-labels Generation as Image Matting for Weakly Supervised Semantic Segmentation. 755-765 - Rui Yang, Lin Song, Yixiao Ge, Xiu Li:
BoxSnake: Polygonal Instance Segmentation with Box Supervision. 766-776 - Quan Tang, Bowen Zhang, Jiajun Liu, Fagui Liu, Yifan Liu:
Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation. 777-786 - Yichen Liu, Benran Hu, Junkai Huang, Yu-Wing Tai, Chi-Keung Tang:
Instance Neural Radiance Field. 787-796 - Kunyang Han, Yong Liu, Jun Hao Liew, Henghui Ding, Jiajun Liu, Yitong Wang, Yansong Tang, Yujiu Yang, Jiashi Feng, Yao Zhao, Yunchao Wei:
Global Knowledge Calibration for Fast Open-Vocabulary Segmentation. 797-807 - Duo Peng, Ping Hu, Qiuhong Ke, Jun Liu:
Diffusion-based Image Translation with Label Guidance for Domain Adaptive Semantic Segmentation. 808-820 - Yuhe Liu, Chuanjian Liu, Kai Han, Quan Tang, Zengchang Qin:
Boosting Semantic Segmentation from the Perspective of Explicit Class Embeddings. 821-831 - Hala Lamdouar, Weidi Xie, Andrew Zisserman:
The Making and Breaking of Camouflage. 832-842 - Zekang Zhang, Guangyu Gao, Jianbo Jiao, Chi Harold Liu, Yunchao Wei:
CoinSeg: Contrast Inter- and Intra- Class Representations for Incremental Segmentation. 843-853 - Xueyi Liu, Bin Wang, He Wang, Li Yi:
Few-Shot Physically-Aware Articulated Mesh Generation via Hierarchical Deformation. 854-864 - Fenggen Yu, Yiming Qian, Francisca Gil-Ureta, Brian Jackson, Eric P. Bennett, Hao Zhang:
HAL3D: Hierarchical Active Learning for Fine-Grained 3D Part Labeling. 865-875 - Tianyi Shi, Xiaohuan Ding, Liang Zhang, Xin Yang:
FreeCOS: Self-Supervised Learning from Fractals and Unlabeled Images for Curvilinear Object Segmentation. 876-886 - Xin Xu, Tianyi Xiong, Zheng Ding, Zhuowen Tu:
MasQCLIP for Open-Vocabulary Universal Image Segmentation. 887-898 - Kaining Ying, Qing Zhong, Weian Mao, Zhenhua Wang, Hao Chen, Lin Yuanbo Wu, Yifan Liu, Chengxiang Fan, Yunzhi Zhuge, Chunhua Shen:
CTVIS: Consistent Training for Online Video Instance Segmentation. 899-908 - Ting Chen, Lala Li, Saurabh Saxena, Geoffrey E. Hinton, David J. Fleet:
A Generalist Framework for Panoptic Segmentation of Images and Videos. 909-919 - Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian:
Spectrum-guided Multi-granularity Referring Video Object Segmentation. 920-930 - Changqi Wang, Haoyu Xie, Yuhui Yuan, Chong Fu, Xiangyu Yue:
Space Engage: Collaborative Space Supervision for Contrastive-based Semi-Supervised Semantic Segmentation. 931-942 - Hoyoung Kim, Minhyeon Oh, Sehyun Hwang, Suha Kwak, Jungseul Ok:
Adaptive Superpixel for Active Learning in Semantic Segmentation. 943-953 - Yuxin Mao, Jing Zhang, Mochu Xiang, Yiran Zhong, Yuchao Dai:
Multimodal Variational Auto-encoder based Audio-Visual Segmentation. 954-965 - Yichen Yuan, Yifan Wang, Lijun Wang, Xiaoqi Zhao, Huchuan Lu, Yu Wang, Weibo Su, Lei Zhang:
Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation. 966-976 - Cheng-Kun Yang, Min-Hung Chen, Yung-Yu Chuang, Yen-Yu Lin:
2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision. 977-987 - Mischa Dombrowski, Hadrien Reynaud, Matthew Baugh, Bernhard Kainz:
Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models. 988-998 - Muzhi Zhu, Hengtao Li, Hao Chen, Chengxiang Fan, Weian Mao, Chenchen Jing, Yifan Liu, Chunhua Shen:
SegPrompt: Boosting Open-world Segmentation via Category-level Prompt Learning. 999-1008 - Boyang Li, Yingqian Wang, Longguang Wang, Fei Zhang, Ting Liu, Zaiping Lin, Wei An, Yulan Guo:
Monte Carlo Linear Clustering with Single-Point Supervision is Enough for Infrared Small Target Detection. 1009-1019 - Hao Zhang, Feng Li, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianwei Yang, Lei Zhang:
A Simple Framework for Open-Vocabulary Segmentation and Detection. 1020-1031 - Zongwei Wu, Danda Pani Paudel, Deng-Ping Fan, Jingjing Wang, Shuo Wang, Cédric Demonceaux, Radu Timofte, Luc Van Gool:
Source-free Depth for Object Pop-out. 1032-1042 - Amit Kumar Rana, Sabarinath Mahadevan, Alexander Hermans, Bastian Leibe:
DynaMITe: Dynamic Query Bootstrapping for Multi-object Interactive Segmentation Transformer. 1043-1052 - Junzhang Chen, Xiangzhi Bai:
Atmospheric Transmission and Thermal Inertia Induced Blind Road Segmentation with a Large-Scale Dataset TBRSD. 1053-1063 - Yuxi Wang, Jian Liang, Jun Xiao, Shuqi Mei, Yuran Yang, Zhaoxiang Zhang:
Informative Data Mining for One-shot Cross-Domain Semantic Segmentation. 1064-1074 - Shan Wang, Chuong Nguyen, Jiawei Liu, Kaihao Zhang, Wenhan Luo, Yanhao Zhang, Sundaram Muthu, Fahira Afzal Maken, Hongdong Li:
Homography Guided Temporal Fusion for Road Line and Marking Segmentation. 1075-1085 - Cong Han, Yujie Zhong, Dengjie Li, Kai Han, Lin Ma:
Open-Vocabulary Semantic Segmentation with Decoupled One-Pass Network. 1086-1096 - Junlong Li, Bingyao Yu, Yongming Rao, Jie Zhou, Jiwen Lu:
TCOVIS: Temporally Consistent Online Video Instance Segmentation. 1097-1107 - Liyi Chen, Chenyang Lei, Ruihuang Li, Shuai Li, Zhaoxiang Zhang, Lei Zhang:
FPR: False Positive Rectification for Weakly Supervised Semantic Segmentation. 1108-1118 - Lukas Zbinden, Lars Doorenbos, Theodoros Pissas, Adrian Thomas Huber, Raphael Sznitman, Pablo Márquez-Neila:
Stochastic Segmentation with Conditional Categorical Diffusion Models. 1119-1129 - Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang:
SegGPT: Towards Segmenting Everything In Context. 1130-1140 - Xi Chen, Shuang Li, Ser-Nam Lim, Antonio Torralba, Hengshuang Zhao:
Open-vocabulary Panoptic Segmentation with Embedding Modulation. 1141-1150 - Yuyuan Liu, Choubo Ding, Yu Tian, Guansong Pang, Vasileios Belagiannis, Ian D. Reid, Gustavo Carneiro:
Residual Pattern Learning for Pixel-wise Out-of-Distribution Detection in Semantic Segmentation. 1151-1161 - Pitchaporn Rewatbowornwong, Nattanat Chatthee, Ekapol Chuangsuwanich, Supasorn Suwajanakorn:
Zero-guidance Segmentation Using Zero Segment Labels. 1162-1172 - Jiawei Liu, Changkun Ye, Shan Wang, Ruikai Cui, Jing Zhang, Kaihao Zhang, Nick Barnes:
Model Calibration in Dense Classification with Adaptive Label Perturbation. 1173-1184 - Jie Ma, Chuan Wang, Yang Liu, Liang Lin, Guanbin Li:
Enhanced Soft Label for Semi-Supervised Semantic Segmentation. 1185-1195 - Kaixin Cai, Pengzhen Ren, Yi Zhu, Hang Xu, Jianzhuang Liu, Changlin Li, Guangrun Wang, Xiaodan Liang:
MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation. 1196-1205 - Weijia Wu, Yuzhong Zhao, Mike Zheng Shou, Hong Zhou, Chunhua Shen:
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models. 1206-1217 - Rui Sun, Yuan Wang, Huayu Mai, Tianzhu Zhang, Feng Wu:
Alignment Before Aggregation: Trajectory Memory Retrieval Network for Video Object Segmentation. 1218-1228 - Peixia Li, Pulak Purkait, Thalaiyasingam Ajanthan, Majid Abdolshah, Ravi Garg, Hisham Husain, Chenchen Xu, Stephen Gould, Wanli Ouyang, Anton van den Hengel:
Semi-Supervised Semantic Segmentation under Label Noise via Diverse Learning Groups. 1229-1238 - Cody Simons, Dripta S. Raychaudhuri, Sk Miraj Ahmed, Suya You, Konstantinos Karydis, Amit K. Roy-Chowdhury:
SUMMIT: Source-Free Adaptation of Uni-Modal Models to Multi-Modal Targets. 1239-1249 - Yu-Hsing Hsieh, Guan-Sheng Chen, Shun-Xian Cai, Ting-Yun Wei, Huei-Fang Yang, Chu-Song Chen:
Class-incremental Continual Learning for Instance Segmentation with Image-level Weak Supervision. 1250-1261 - Jianxiong Gao, Xuelin Qian, Yikai Wang, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu:
Coarse-to-Fine Amodal Segmentation with Shape Prior. 1262-1271 - Ke Fan, Jingshi Lei, Xuelin Qian, Miaopeng Yu, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu:
Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-centric Representation. 1272-1281 - Tao Zhang, Xingye Tian, Yu Wu, Shunping Ji, Xuebo Wang, Yuan Zhang, Pengfei Wan:
DVIS: Decoupled Video Instance Segmentation Framework. 1282-1291 - Ayça Takmaz, Jonas Schult, Irem Kaftan, Mertcan Akçay, Bastian Leibe, Robert W. Sumner, Francis Engelmann, Siyu Tang:
3D Segmentation of Humans in Point Clouds with Synthetic Data. 1292-1304 - Shijie Lian, Hua Li, Runmin Cong, Suqi Li, Wei Zhang, Sam Kwong:
WaterMask: Instance Segmentation for Underwater Imagery. 1305-1315 - Ho Kei Cheng, Seoung Wug Oh, Brian L. Price, Alexander G. Schwing, Joon-Young Lee:
Tracking Anything with Decoupled Video Segmentation. 1316-1326 - Chenming Li, Daoan Zhang, Wenjian Huang, Jianguo Zhang:
Cross Contrasting Feature Perturbation for Domain Generalization. 1327-1337 - Lei Fan, Bo Liu, Haoxiang Li, Ying Wu, Gang Hua:
Flexible Visual Recognition by Evidential Modeling of Confusion and Ignorance. 1338-1347 - Rabab Abdelfattah, Qing Guo, Xiaoguang Li, Xiaofeng Wang, Song Wang:
CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification. 1348-1357 - Jongyoun Noh, Hyekang Park, Junghyup Lee, Bumsub Ham:
RankMixup: Ranking-Based Mixup Training for Network Calibration. 1358-1368 - Yang Lu, Yiliang Zhang, Bo Han, Yiu-Ming Cheung, Hanzi Wang:
Label-Noise Learning with Intrinsically Long-Tailed Data. 1369-1378 - Xingyu Liu, Sanping Zhou, Le Wang, Gang Hua:
Parallel Attention Interaction Network for Few-Shot Skeleton-based Action Recognition. 1379-1388 - Jiangning Zhang, Xiangtai Li, Jian Li, Liang Liu, Zhucun Xue, Boshen Zhang, Zhengkai Jiang, Tianxin Huang, Yabiao Wang, Chengjie Wang:
Rethinking Mobile Block for Efficient Attention-based Models. 1389-1400 - Dongjun Lee, Seokwon Song, Jihee Suh, Joonmyeong Choi, Sanghyeok Lee, Hyunwoo J. Kim:
Read-only Prompt Optimization for Vision-Language Few-shot Learning. 1401-1411 - Zhongzhan Huang, Mingfu Liang, Jinghui Qin, Shanshan Zhong, Liang Lin:
Understanding Self-attention Mechanism via Dynamical System Perspective. 1412-1422 - Wenqiao Zhang, Changshuo Liu, Lingze Zeng, Beng Chin Ooi, Siliang Tang, Yueting Zhuang:
Learning in Imperfect Environment: Multi-Label Classification with Long-Tailed Distribution and Partial Labels. 1423-1432 - Shunxin Wang, Raymond N. J. Veldhuis, Christoph Brune, Nicola Strisciuglio:
What do neural networks learn in image classification? A frequency shortcut perspective. 1433-1442 - Tong Liang, Jim Davis:
Inducing Neural Collapse to a Fixed Hierarchy-Aware Frame for Reducing Mistake Severity. 1443-1452 - Reza Averly, Wei-Lun Chao:
Unified Out-Of-Distribution Detection: A Model-Specific Perspective. 1453-1463 - Myeongho Jeon, Myungjoo Kang, Joonseok Lee:
A Unified Framework for Robustness on Diverse Sampling Errors. 1464-1472 - Xuelin Zhu, Jian Liu, Weijia Liu, Jiawei Ge, Bo Liu, Jiuxin Cao:
Scene-Aware Label Graph Learning for Multi-Label Image Classification. 1473-1482 - Xiaobo Xia, Jiankang Deng, Wei Bao, Yuxuan Du, Bo Han, Shiguang Shan, Tongliang Liu:
Holistic Label Correction for Noisy Multi-Label Classification. 1483-1493 - Guiping Cao, Shengda Luo, Wenjian Huang, Xiangyuan Lan, Dongmei Jiang, Yaowei Wang, Jianguo Zhang:
Strip-MLP: Efficient Token Interaction for Vision MLP. 1494-1504 - Ke Xu, Lei Han, Ye Tian, Shangshang Yang, Xingyi Zhang:
EQ-Net: Elastic Quantization Neural Networks. 1505-1514 - Renrong Shao, Wei Zhang, Jianhua Yin, Jun Wang:
Data-free Knowledge Distillation for Fine-grained Visual Categorization. 1515-1525 - Xilin He, Qinliang Lin, Cheng Luo, Weicheng Xie, Siyang Song, Feng Liu, Linlin Shen:
Shift from Texture-bias to Shape-bias: Edge Deformation-based Augmentation for Robust Object Recognition. 1526-1535 - Isack Lee, Eungi Lee, Seok Bong Yoo:
Latent-OFER: Detect, Mask, and Reconstruct with Latent Vectors for Occluded Facial Expression Recognition. 1536-1546 - Nan Zhou, Jiaxin Chen, Di Huang:
DR-Tune: Improving Fine-tuning of Pretrained Visual Models by Distribution Regularization with Semantic Calibration. 1547-1556 - Jaewoo Park, Jacky Chen Long Chai, Jaeho Yoon, Andrew Beng Jin Teoh:
Understanding the Feature Norm for Out-of-Distribution Detection. 1557-1567 - Ruoyi Du, Wenqing Yu, Heqing Wang, Ting-En Lin, Dongliang Chang, Zhanyu Ma:
Multi-View Active Fine-Grained Visual Recognition. 1568-1578 - Ruiyuan Gao, Chenchen Zhao, Lanqing Hong, Qiang Xu:
DiffGuard: Semantic Mismatch-Guided Out-of-Distribution Detection using Pre-trained Diffusion Models. 1579-1589 - Yurong Guo, Ruoyi Du, Yuan Dong, Timothy M. Hospedales, Yi-Zhe Song, Zhanyu Ma:
Task-aware Adaptive Learning for Cross-domain Few-shot Learning. 1590-1599 - Qidong Huang, Xiaoyi Dong, Dongdong Chen, Yinpeng Chen, Lu Yuan, Gang Hua, Weiming Zhang, Nenghai Yu:
Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting. 1600-1610 - Shouwen Wang, Qian Wan, Xiang Xiang, Zhigang Zeng:
Saliency Regularization for Self-Training with Partial Annotations. 1611-1620 - Lanyun Zhu, Tianrun Chen, Jianxiong Yin, Simon See, Jun Liu:
Learning Gabor Texture Features for Fine-Grained Recognition. 1621-1631 - Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Limin Wang, Yu Qiao:
UniFormerV2: Unlocking the Potential of Image ViTs for Video Understanding. 1632-1643 - Ziyi Zhang, Weikai Chen, Chaowei Fang, Zhen Li, Lechao Chen, Liang Lin, Guanbin Li:
RankMatch: Fostering Confidence and Consistency in Learning with Noisy Labels. 1644-1654 - Yanan Wu, Zhixiang Chi, Yang Wang, Songhe Feng:
MetaGCD: Learning to Continually Learn in Generalized Category Discovery. 1655-1665 - Zhiqiang Shen:
FerKD: Surgical Label Adaptation for Efficient Distillation. 1666-1675 - Chengxin Liu, Hao Lu, Zhiguo Cao, Tongliang Liu:
Point-Query Quadtree for Crowd Counting, Localization, and More. 1676-1685 - Jaewoo Park, Yoon Gyo Jung, Andrew Beng Jin Teoh:
Nearest Neighbor Guidance for Out-of-Distribution Detection. 1686-1695 - HyunJae Lee, Heon Song, Hyeonsoo Lee, Gihyeon Lee, Suyeong Park, Donggeun Yoo:
Bayesian Optimization Meets Self-Distillation. 1696-1705 - Yu-Ming Tang, Yi-Xing Peng, Wei-Shi Zheng:
When Prompt-based Incremental Learning Does Not Meet Strong Pretraining. 1706-1716 - Chengkai Hou, Jieyu Zhang, Tianyi Zhou:
When to Learn What: Model-Adaptive Data Augmentation Curriculum. 1717-1728 - Florent Chiaroni, Jose Dolz, Imtiaz Masud Ziko, Amar Mitiche, Ismail Ben Ayed:
Parametric Information Maximization for Generalized Category Discovery. 1729-1739 - Jiazheng Xing, Mengmeng Wang, Yudi Ruan, Bofan Chen, Yaowei Guo, Boyu Mu, Guang Dai, Jingdong Wang, Yong Liu:
Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching. 1740-1750 - Liang Chen, Yong Zhang, Yibing Song, Anton van den Hengel, Lingqiao Liu:
Domain Generalization via Rationale Invariance. 1751-1760 - Ziqing Wang, Yuetong Fang, Jiahang Cao, Qiang Zhang, Zhongrui Wang, Renjing Xu:
Masked Spiking Transformer. 1761-1771 - Wuxuan Shi, Mang Ye:
Prototype Reminiscence and Augmented Asymmetric Knowledge Aggregation for Non-Exemplar Class-Incremental Learning. 1772-1781 - Yun Li, Zhe Liu, Saurav Jha, Lina Yao:
Distilled Reverse Attention Network for Open-world Compositional Zero-Shot Learning. 1782-1791 - Shuo He, Guowu Yang, Lei Feng:
Candidate-aware Selective Disambiguation Based On Normalized Entropy for Instance-dependent Partial-label Learning. 1792-1801 - Hualiang Wang, Yi Li, Huifeng Yao, Xiaomeng Li:
CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No. 1802-1812 - Benzhi Wang, Yang Yang, Jinlin Wu, Guo-Jun Qi, Zhen Lei:
Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search. 1813-1822 - Chanho Ahn, Kikyung Kim, Ji-Won Baek, Jongin Lim, Seungju Han:
Sample-wise Label Confidence Incorporation for Learning with Noisy Labels. 1823-1832 - Xiaobo Xia, Bo Han, Yibing Zhan, Jun Yu, Mingming Gong, Chen Gong, Tongliang Liu:
Combating Noisy Labels with Sample Selection by Mining High-Discrepancy Examples. 1833-1843 - Pingyu Wu, Wei Zhai, Yang Cao, Jiebo Luo, Zheng-Jun Zha:
Spatial-Aware Token for Weakly Supervised Object Localization. 1844-1854 - Sriram Balasubramanian, Soheil Feizi:
Towards Improved Input Masking for Convolutional Neural Networks. 1855-1865 - Robert van der Klis, Stephan Alaniz, Massimiliano Mancini, Cássio Fraga Dantas, Dino Ienco, Zeynep Akata, Diego Marcos:
PDiscoNet: Semantically consistent part discovery for fine-grained recognition. 1866-1876 - Divyansh Srivastava, Tuomas P. Oikarinen, Tsui-Wei Weng:
Corrupting Neuron Explanations of Deep Visual Features. 1877-1886 - Dawid Rymarczyk, Joost van de Weijer, Bartosz Zielinski, Bartlomiej Twardowski:
ICICLE: Interpretable Class Incremental Continual Learning. 1887-1898 - Uddeshya Upadhyay, Shyamgopal Karthik, Massimiliano Mancini, Zeynep Akata:
ProbVLM: Probabilistic Adapter for Frozen Vison-Language Models. 1899-1910 - Julia Hornauer, Adrian Holzbock, Vasileios Belagiannis:
Out-of-Distribution Detection for Monocular Depth Estimation. 1911-1921 - Sukrut Rao, Moritz Böhle, Amin Parchami-Araghi, Bernt Schiele:
Studying How to Efficiently and Effectively Guide Models with Explanations. 1922-1933 - Amil Dravid, Yossi Gandelsman, Alexei A. Efros, Assaf Shocher:
Rosetta Neurons: Mining the Common Units in a Model Zoo. 1934-1943 - Nanne van Noord:
Prototype-based Dataset Comparison. 1944-1954 - Haozhe Liu, Mingchen Zhuge, Bing Li, Yuhui Wang, Francesco Faccio, Bernard Ghanem, Jürgen Schmidhuber:
Learning to Identify Critical States for Reinforcement Learning from Videos. 1955-1965 - Alexandros Stergiou, Nikos Deligiannis:
Leaping Into Memories: Space-Time Deep Feature Synthesis. 1966-1976 - Yifei Zhang, Siyi Gu, Yuyang Gao, Bo Pan, Xiaofeng Yang, Liang Zhao:
MAGI: Multi-Annotated Explanation-Guided Learning. 1977-1987 - Wei Huang, Xingyu Zhao, Gaojie Jin, Xiaowei Huang:
SAFARI: Versatile and Efficient Evaluations for Robustness of Interpretability. 1988-1998 - Hang Li, Jindong Gu, Rajat Koner, Sahand Sharifzadeh, Volker Tresp:
Do DALL-E and Flamingo Understand Each Other? 1999-2010 - Qihan Huang, Mengqi Xue, Wenqi Huang, Haofei Zhang, Jie Song, Yongcheng Jing, Mingli Song:
Evaluation and Improvement of Interpretability for Self-Explainable Part-Prototype Networks. 2011-2020 - Jingwei Zhang, Farzan Farnia:
MoreauGrad: Sparse and Robust Interpretation of Neural Networks via Moreau Envelope. 2021-2030 - Kelu Yao, Jin Wang, Boyu Diao, Chao Li:
Towards Understanding the Generalization of Deepfake Detectors from a Game-Theoretical View. 2031-2041 - Xue Wang, Zhibo Wang, Haiqin Weng, Hengchang Guo, Zhifei Zhang, Lu Jin, Tao Wei, Kui Ren:
Counterfactual-based Saliency Map: Towards Visual Contrastive Explanations for Neural Networks. 2042-2051 - Giyoung Jeon, Haedong Jeong, Jaesik Choi:
Beyond Single Path Integrated Gradients for Reliable Input Attribution via Randomized Path Sampling. 2052-2061 - Chong Wang, Yuyuan Liu, Yuanhong Chen, Fengbei Liu, Yu Tian, Davis J. McCarthy, Helen Frazer, Gustavo Carneiro:
Learning Support and Trivial Prototypes for Interpretable Image Classification. 2062-2072 - Oren Barkan, Yehonatan Elisha, Yuval Asher, Amit Eshel, Noam Koenigstein:
Visual Explanations via Iterated Integrated Attributions. 2073-2084 - Nan Liu, Yilun Du, Shuang Li, Joshua B. Tenenbaum, Antonio Torralba:
Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models. 2085-2095 - Xiaoshi Wu, Keqiang Sun, Feng Zhu, Rui Zhao, Hongsheng Li:
Human Preference Score: Better Aligning Text-to-image Models with Human Preference. 2096-2105 - Elad Levi, Eli Brosh, Mykola Mykhailych, Meir Perez:
DLT: Conditioned layout generation with Joint Discrete-Continuous Diffusion Layout Transformer. 2106-2115 - Thanh Van Le, Hao Phung, Thuan Hoang Nguyen, Quan Dao, Ngoc N. Tran, Anh Tuan Tran:
Anti-DreamBooth: Protecting users from personalized text-to-image synthesis. 2116-2127 - Michal J. Tyszkiewicz, Pascal Fua, Eduard Trulls:
GECCO: Geometrically-Conditioned Point Diffusion Models. 2128-2138 - Shengqu Cai, Eric Ryan Chan, Songyou Peng, Mohamad Shahbazi, Anton Obukhov, Luc Van Gool, Gordon Wetzstein:
DiffDreamer: Towards Consistent Unsupervised Single-view Scene Extrapolation with Conditional Diffusion Models. 2139-2150 - Korrawe Karunratanakul, Konpat Preechakul, Supasorn Suwajanakorn, Siyu Tang:
Guided Motion Diffusion for Controllable Human Motion Synthesis. 2151-2162 - Yanzhao Zheng, Yunzhou Shi, Yuhao Cui, Zhongzhou Zhao, Zhiling Luo, Wei Zhou:
COOP: Decoupling and Coupling of Whole-Body Grasping Pose Generation. 2163-2173 - Guillaume Couairon, Marlène Careil, Matthieu Cord, Stéphane Lathuilière, Jakob Verbeek:
Zero-shot spatial layout conditioning for text-to-image diffusion models. 2174-2183 - Aibek Alanov, Vadim Titov, Maksim Nakhodnov, Dmitry P. Vetrov:
StyleDomain: Efficient and Lightweight Parameterizations of StyleGAN for One-shot and Few-shot Domain Adaptation. 2184-2194 - Jianfeng Xiang, Jiaolong Yang, Yu Deng, Xin Tong:
GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds. 2195-2205 - Alexander C. Li, Mihir Prabhudesai, Shivam Duggal, Ellis Brown, Deepak Pathak:
Your Diffusion Model is Secretly a Zero-Shot Classifier. 2206-2217 - Jiali Cui, Ying Nian Wu, Tian Han:
Learning Hierarchical Features with Joint Latent Space Energy-Based Prior. 2218-2227 - Liang Xu, Ziyang Song, Dongliang Wang, Jing Su, Zhicheng Fang, Chenjing Ding, Weihao Gan, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng, Wei Wu:
ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation. 2228-2238 - Ruoshi Liu, Chengzhi Mao, Purva Tendulkar, Hao Wang, Carl Vondrick:
Landscape Learning for Neural Network Inversion. 2239-2250 - Martin Nicolas Everaert, Marco Bocchio, Sami Arpa, Sabine Süsstrunk, Radhakrishna Achanta:
Diffusion in Style. 2251-2261 - Gene Chou, Yuval Bahat, Felix Heide:
Diffusion-SDF: Conditional Generative Modeling of Signed Distance Functions. 2262-2272 - Xuanmeng Zhang, Jianfeng Zhang, Rohan Chacko, Hongyi Xu, Guoxian Song, Yi Yang, Jiashi Feng:
GETAvatar: Generative Textured Meshes for Animatable Human Avatars. 2273-2282 - Aishwarya Agarwal, Srikrishna Karanam, K. J. Joseph, Apoorv Saxena, Koustava Goswami, Balaji Vasan Srinivasan:
A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis. 2283-2293 - Shilin Lu, Yanzhu Liu, Adams Wai-Kin Kong:
TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition. 2294-2305 - Yijun Qian, Jack Urbanek, Alexander G. Hauptmann, Jungdam Won:
Breaking The Limits of Text-conditioned 3D Motion Synthesis with Elaborative Descriptions. 2306-2316 - Germán Barquero, Sergio Escalera, Cristina Palmero:
BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction. 2317-2327 - Amir Hertz, Kfir Aberman, Daniel Cohen-Or:
Delta Denoising Score. 2328-2337 - Xingyu Chen, Yu Deng, Baoyuan Wang:
Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation. 2338-2348 - Amit Raj, Srinivas Kaza, Ben Poole, Michael Niemeyer, Nataniel Ruiz, Ben Mildenhall, Shiran Zada, Kfir Aberman, Michael Rubinstein, Jonathan T. Barron, Yuanzhen Li, Varun Jampani:
DreamBooth3D: Subject-Driven Text-to-3D Generation. 2349-2359 - Shuang Song, Yuanbang Liang, Jing Wu, Yu-Kun Lai, Yipeng Qin:
Feature Proliferation - the "Cancer" in StyleGAN and its Treatments. 2360-2370 - Berkay Kicanaoglu, Pablo Garrido, Gaurav Bharaj:
Unsupervised Facial Performance Editing via Vector-Quantized StyleGAN Representations. 2371-2382 - Jianfeng Xiang, Jiaolong Yang, Binbin Huang, Xin Tong:
3D-aware Image Generation using 2D Diffusion Models. 2383-2393 - Ganghun Lee, Minji Kim, Yunsu Lee, Minsu Lee, Byoung-Tak Zhang:
Neural Collage Transfer: Artistic Reconstruction via Material Manipulation. 2394-2405 - Teng Hu, Jiangning Zhang, Liang Liu, Ran Yi, Siqi Kou, Haokun Zhu, Xu Chen, Yabiao Wang, Chengjie Wang, Lizhuang Ma:
Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption. 2406-2415 - Hansheng Chen, Jiatao Gu, Anpei Chen, Wei Tian, Zhuowen Tu, Lingjie Liu, Hao Su:
Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction. 2416-2425 - Rohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, David Bau:
Erasing Concepts from Diffusion Models. 2426-2436 - Ziyang Yuan, Yiming Zhu, Yu Li, Hongyu Liu, Chun Yuan:
Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding. 2437-2447 - Seunggyu Chang, Gihoon Kim, Hayeon Kim:
HairNeRF: Geometry-Aware Image Synthesis for Hairstyle Transfer. 2448-2458 - Yuanze Lin, Chen Wei, Huiyu Wang, Alan L. Yuille, Cihang Xie:
SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-training. 2459-2469 - Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Xiangyang Ji, Chang Liu, Li Yuan, Jie Chen:
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model. 2470-2481 - Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin:
Explore and Tell: Embodied Visual Captioning in 3D Environments. 2482-2491 - Xuanlin Li, Yunhao Fang, Minghua Liu, Zhan Ling, Zhuowen Tu, Hao Su:
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability. 2492-2503 - Xu Yang, Zhangzikang Li, Haiyang Xu, Hanwang Zhang, Qinghao Ye, Chenliang Li, Ming Yan, Yu Zhang, Fei Huang, Songfang Huang:
Learning Trajectory-Word Alignments for Video-Language Tasks. 2504-2514 - Dizhan Xue, Shengsheng Qian, Changsheng Xu:
Variational Causal Inference Network for Explanatory Visual Question Answering. 2515-2525 - Moon Ye-Bin, Jisoo Kim, Hongyeob Kim, Kilho Son, Tae-Hyun Oh:
TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation. 2526-2537 - Jiannan Wu, Yi Jiang, Bin Yan, Huchuan Lu, Zehuan Yuan, Ping Luo:
Segment Every Reference Object in Spatial and Temporal Spaces. 2538-2550 - Juncheng Li, Minghe Gao, Longhui Wei, Siliang Tang, Wenqiao Zhang, Mengze Li, Wei Ji, Qi Tian, Tat-Seng Chua, Yueting Zhuang:
Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models. 2551-2562 - Bumsoo Kim, Yeonsik Jo, Jinhyung Kim, Seung-Hwan Kim:
Misalign, Contrast then Distill: Rethinking Misalignments in Language-Image Pretraining. 2563-2572 - Yifeng Zhang, Shi Chen, Qi Zhao:
Toward Multi-Granularity Decision-Making: Explicit Visual Reasoning with Hierarchical Knowledge. 2573-2583 - Junyu Bi, Daixuan Cheng, Ping Yao, Bochen Pang, Yuefeng Zhan, Chuanguang Yang, Yujing Wang, Hao Sun, Weiwei Deng, Qi Zhang:
VL-Match: Enhancing Vision-Language Pretraining with Token-Level and Instance-Level Matching. 2584-2593 - Ioana Croitoru, Simion-Vlad Bogolin, Samuel Albanie, Yang Liu, Zhaowen Wang, Seunghyun Yoon, Franck Dernoncourt, Hailin Jin, Trung Bui:
Moment Detection in Long Tutorial Videos. 2594-2604 - Xiangyang Zhu, Renrui Zhang, Bowei He, Aojun Zhou, Dong Wang, Bin Zhao, Peng Gao:
Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement. 2605-2615 - Nitzan Bitton Guetta, Yonatan Bitton, Jack Hessel, Ludwig Schmidt, Yuval Elovici, Gabriel Stanovsky, Roy Schwartz:
Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images. 2616-2627 - Yixuan Wu, Zhao Zhang, Chi Xie, Feng Zhu, Rui Zhao:
Advancing Referring Expression Segmentation Beyond Single Image. 2628-2638 - Xiangyang Zhu, Renrui Zhang, Bowei He, Ziyu Guo, Ziyao Zeng, Zipeng Qin, Shanghang Zhang, Peng Gao:
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning. 2639-2650 - Weizhen He, Weijie Chen, Binbin Chen, Shicai Yang, Di Xie, Luojun Lin, Donglian Qi, Yueting Zhuang:
Unsupervised Prompt Tuning for Text-Driven Object Detection. 2651-2661 - Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao:
Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding. 2662-2671 - Sophia Gu, Christopher Clark, Aniruddha Kembhavi:
I can't believe there's no images! : Learning Visual Tasks Using Only Language Supervision. 2672-2683 - Guanghui Li, Mingqi Gao, Heng Liu, Xiantong Zhen, Feng Zheng:
Learning Cross-Modal Affinity for Referring Video Object Segmentation Targeting Limited Samples. 2684-2693 - Henghui Ding, Chang Liu, Shuting He, Xudong Jiang, Chen Change Loy:
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions. 2694-2703 - Chun-Mei Feng, Kai Yu, Yong Liu, Salman Khan, Wangmeng Zuo:
Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning. 2704-2714 - Xi Tian, Yong-Liang Yang, Qi Wu:
ShapeScaffolder: Structure-Aware 3D Shape Generation from Text. 2715-2724 - Vishaal Udandarao, Ankush Gupta, Samuel Albanie:
SuS-X: Training-Free Name-Only Transfer of Vision-Language Models. 2725-2736 - Yiwei Ma, Haowei Wang, Xiaoqing Zhang, Guannan Jiang, Xiaoshuai Sun, Weilin Zhuang, Jiayi Ji, Rongrong Ji:
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance. 2737-2748 - Dongming Wu, Tiancai Wang, Yuang Zhang, Xiangyu Zhang, Jianbing Shen:
OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation. 2749-2758 - Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, Huiqiang Jiang, Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang:
Attentive Mask CLIP. 2759-2769 - Jiangtong Li, Li Niu, Liqing Zhang:
Knowledge Proxy Intervention for Deconfounded Video Question Answering. 2770-2781 - Kevin Qinghong Lin, Pengchuan Zhang, Joya Chen, Shraman Pramanick, Difei Gao, Alex Jinpeng Wang, Rui Yan, Mike Zheng Shou:
UniVTG: Towards Unified Video-Language Temporal Grounding. 2782-2792 - Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang:
Self-supervised Cross-view Representation Reconstruction for Change Captioning. 2793-2803 - Ziyang Wang, Yi-Lin Sung, Feng Cheng, Gedas Bertasius, Mohit Bansal:
Unified Coarse-to-Fine Alignment for Video-Text Retrieval. 2804-2815 - Yang Liu, Jiahua Zhang, Qingchao Chen, Yuxin Peng:
Confidence-aware Pseudo-label Learning for Weakly Supervised Visual Grounding. 2816-2826 - Chengyang Zhao, Yikang Shen, Zhenfang Chen, Mingyu Ding, Chuang Gan:
TextPSG: Panoptic Scene Graph Generation from Textual Descriptions. 2827-2838 - Wei Lin, Leonid Karlinsky, Nina Shvetsova, Horst Possegger, Mateusz Kozinski, Rameswar Panda, Rogério Feris, Hilde Kuehne, Horst Bischof:
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge. 2839-2850 - Yaowei Li, Bang Yang, Xuxin Cheng, Zhihong Zhu, Hongxiang Li, Yuexian Zou:
Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation. 2851-2862 - Devaansh Gupta, Siddhant Kharbanda, Jiawei Zhou, Wanhua Li, Hanspeter Pfister, Donglai Wei:
CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation. 2863-2874 - Morris Alper, Hadar Averbuch-Elor:
Learning Human-Human Interactions in Images from Weak Textual Supervision. 2875-2887 - Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Songfang Huang:
BUS : Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization. 2888-2898 - Ziyu Zhu, Xiaojian Ma, Yixin Chen, Zhidong Deng, Siyuan Huang, Qing Li:
3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment. 2899-2909 - Kaicheng Yang, Jiankang Deng, Xiang An, Jiawei Li, Ziyong Feng, Jia Guo, Jing Yang, Tongliang Liu:
ALIP: Adaptive Language-Image Pre-training with Synthetic Caption. 2910-2919 - Cheng Shi, Sibei Yang:
LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models. 2920-2929 - Wooyoung Kang, Jonghwan Mun, Sungjun Lee, Byungseok Roh:
Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning. 2930-2940 - Zi Qian, Xin Wang, Xuguang Duan, Pengda Qin, Yuhong Li, Wenwu Zhu:
Decouple Before Interact: Multi-Modal Prompt Learning for Continual Visual Question Answering. 2941-2950 - Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A. Smith, Jiebo Luo:
PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3. 2951-2963 - Yu Wu, Yana Wei, Haozhe Wang, Yongfei Liu, Sibei Yang, Xuming He:
Grounded Image Text Matching with Mismatched Relation Reasoning. 2964-2975 - Mohamed Ashraf Abdelsalam, Samrudhdhi B. Rangrej, Isma Hadji, Nikita Dvornik, Konstantinos G. Derpanis, Afsaneh Fazly:
GePSAn: Generative Procedure Step Anticipation in Cooking Videos. 2976-2985 - Chan Hee Song, Brian M. Sadler, Jiaman Wu, Wei-Lun Chao, Clayton Washington, Yu Su:
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models. 2986-2997 - Zi-Yuan Hu, Yanyang Li, Michael R. Lyu, Liwei Wang:
VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control. 2998-3008 - Manuele Barraco, Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara:
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning. 3009-3019 - Jaemin Cho, Abhay Zala, Mohit Bansal:
DALL-EVAL: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models. 3020-3031 - Yicong Hong, Yang Zhou, Ruiyi Zhang, Franck Dernoncourt, Trung Bui, Stephen Gould, Hao Tan:
Learning Navigational Visual Representations with Semantic Map Supervision. 3032-3044 - Jiajin Tang, Ge Zheng, Jingyi Yu, Sibei Yang:
CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection. 3045-3055 - Nan Xi, Jingjing Meng, Junsong Yuan:
Open Set Video HOI detection from Action-centric Chain-of-Look Prompting. 3056-3066 - An Yan, Yu Wang, Yiwu Zhong, Chengyu Dong, Zexue He, Yujie Lu, William Yang Wang, Jingbo Shang, Julian J. McAuley:
Learning Concise and Descriptive Attributes for Visual Recognition. 3067-3077 - Dohwan Ko, Ji Soo Lee, Miso Choi, Jaewon Chu, Jihwan Park, Hyunwoo J. Kim:
Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models. 3078-3089 - Thomas Mensink, Jasper R. R. Uijlings, Lluís Castrejón, Arushi Goel, Felipe Cadar, Howard Zhou, Fei Sha, André Araújo, Vittorio Ferrari:
Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories. 3090-3101 - Daechul Ahn, Daneul Kim, Gwangmo Song, Seung Hwan Kim, Honglak Lee, Dongyeop Kang, Jonghyun Choi:
Story Visualization by Online Text Augmentation with Context Memory. 3102-3112 - Junjie Fei, Teng Wang, Jinrui Zhang, Zhenyu He, Chengjie Wang, Feng Zheng:
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning. 3113-3123 - Alex Jinpeng Wang, Kevin Qinghong Lin, David Junhao Zhang, Stan Weixian Lei, Mike Zheng Shou:
Too Large; Data Reduction for Vision-Language Pre-Training. 3124-3134 - Weihan Wang, Zhen Yang, Bin Xu, Juanzi Li, Yankui Sun:
ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation. 3135-3146 - Roni Paiss, Ariel Ephrat, Omer Tov, Shiran Zada, Inbar Mosseri, Michal Irani, Tali Dekel:
Teaching CLIP to Count to Ten. 3147-3157 - Junsheng Zhou, Baorui Ma, Shujuan Li, Yu-Shen Liu, Zhizhong Han:
Learning a More Continuous Zero Level Set in Unsigned Distance Fields through Level Set Projection. 3158-3169 - Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, Mukund Varma T., Yi Wang, Zhangyang Wang:
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts. 3170-3181 - Yixuan Li, Lihan Jiang, Linning Xu, Yuanbo Xiangli, Zhenzhi Wang, Dahua Lin, Bo Dai:
MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond. 3182-3192 - Aron Schmied, Tobias Fischer, Martin Danelljan, Marc Pollefeys, Fisher Yu:
R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras. 3193-3203 - Yuan Li, Zhi-Hao Lin, David A. Forsyth, Jia-Bin Huang, Shenlong Wang:
ClimateNeRF: Extreme Weather Synthesis in Neural Radiance Field. 3204-3215 - Tiange Xiang, Adam Sun, Jiajun Wu, Ehsan Adeli, Li Fei-Fei:
Rendering Humans from Object-Occluded Monocular Videos. 3216-3227 - Yuanbo Xiangli, Linning Xu, Xingang Pan, Nanxuan Zhao, Bo Dai, Dahua Lin:
AssetField: Assets Mining and Reconfiguration in Ground Feature Plane Representation. 3228-3238 - Yingfei Liu, Junjie Yan, Fan Jia, Shuailin Li, Aqi Gao, Tiancai Wang, Xiangyu Zhang:
PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images. 3239-3249 - Takuhiro Kaneko:
MIMO-NeRF: Fast Neural Rendering with Multi-input Multi-output Neural Radiance Fields. 3250-3260 - Zelin Gao, Weichen Dai, Yu Zhang:
Adaptive Positional Encoding for Bundle-Adjusting Neural Radiance Fields. 3261-3271 - Yiming Wang, Qin Han, Marc Habermann, Kostas Daniilidis, Christian Theobalt, Lingjie Liu:
NeuS2: Fast Learning of Neural Implicit Surfaces for Multi-view Reconstruction. 3272-3283 - Qitong Wang, Long Zhao, Liangzhe Yuan, Ting Liu, Xi Peng:
Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition. 3284-3294 - Junpeng Jing, Jiankun Li, Pengfei Xiong, Jiangyu Liu, Shuaicheng Liu, Yichen Guo, Xin Deng, Mai Xu, Lai Jiang, Leonid Sigal:
Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching. 3295-3304 - Martin Bråtelund, Felix Rydell:
Compatibility of Fundamental Matrices for Complete Viewing Graphs. 3305-3313 - Pin Tang, Hai-Ming Xu, Chao Ma:
ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation. 3314-3324 - Jinqing Zhang, Yanan Zhang, Qingjie Liu, Yunhong Wang:
SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection. 3325-3334 - Ziying Song, Haiyue Wei, Lin Bai, Lei Yang, Caiyan Jia:
GraphAlign: Enhancing Accurate Feature Alignment by Graph matching for Multi-Modal 3D Object Detection. 3335-3346 - Mikhail Terekhov, Viktor Larsson:
Tangent Sampson Error: Fast Approximate Two-view Reprojection Error for Central Camera Models. 3347-3355 - Gilles Puy, Alexandre Boulch, Renaud Marlet:
Using a Waffle Iron for Automotive Point Cloud Semantic Segmentation. 3356-3366 - Levente Hajder, Lajos Lóczi, Daniel Barath:
Fast Globally Optimal Surface Normal from an Affine Correspondence. 3367-3378 - Marcel C. Bühler, Kripasindhu Sarkar, Tanmay Shah, Gengyan Li, Daoye Wang, Leonhard Helminger, Sergio Orts-Escolano, Dmitry Lagun, Otmar Hilliges, Thabo Beeler, Abhimitra Meka:
Preface: A Data-driven Volumetric Prior for Few-shot Ultra High-resolution Face Synthesis. 3379-3390 - Brent Yi, Weijia Zeng, Sam Buchanan, Yi Ma:
Canonical Factors for Hybrid Neural Fields. 3391-3403 - Haobo Jiang, Zheng Dang, Shuo Gu, Jin Xie, Mathieu Salzmann, Jian Yang:
Center-Based Decoupled Point Cloud Registration for 6D Object Pose Estimation. 3404-3414 - Annika Hagemann, Moritz Knorr, Christoph Stiller:
Deep geometry-aware camera self-calibration from video. 3415-3425 - Nathaniel Burgdorfer, Philippos Mordohai:
V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints. 3426-3435 - Yuxiang Cai, Yifan Zhu, Haiwei Zhang, Bo Ren:
Consistent Depth Prediction for Transparent Object Reconstruction from RGB-D Camera. 3436-3445 - Sungwon Hwang, Junha Hyung, Daejin Kim, Min-Jung Kim, Jaegul Choo:
FaceCLIPNeRF: Text-driven 3D Face Manipulation using Deformable Neural Radiance Fields. 3446-3456 - Xiufeng Xie, Riccardo Gherardi, Zhihong Pan, Stephen Huang:
HollowNeRF: Pruning Hashgrid-Based NeRFs with Trainable Collision Mitigation. 3457-3467 - Jae-Hyeok Lee, Dae-Shik Kim:
ICE-NeRF: Interactive Color Editing of NeRFs via Decomposition-Aware Weight Optimization. 3468-3478 - Zhijian Huang, Sihao Lin, Guiyu Liu, Mukun Luo, Chaoqiang Ye, Hang Xu, Xiaojun Chang, Xiaodan Liang:
FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration. 3479-3488 - Aarrushi Shandilya, Benjamin Attal, Christian Richardt, James Tompkin, Matthew O'Toole:
Neural Fields for Structured Lighting. 3489-3499 - Tao Xie, Ke Wang, Siyi Lu, Yukun Zhang, Kun Dai, Xiaoyu Li, Jie Xu, Li Wang, Lijun Zhao, Xinyu Zhang, Ruifeng Li:
CO-Net: Learning Multiple Point Cloud Tasks at Once with A Cohesive Network. 3500-3510 - Jiahui Zhang, Fangneng Zhan, Yingchen Yu, Kunhao Liu, Rongliang Wu, Xiaoqin Zhang, Ling Shao, Shijian Lu:
Pose-Free Neural Radiance Fields via Implicit Pose Regularization. 3511-3520 - Xiao Pan, Zongxin Yang, Jianxin Ma, Chang Zhou, Yi Yang:
TransHuman: A Transformer-based Human Representation for Generalizable Neural Human Rendering. 3521-3532 - Haoyu Wu, Alexandros Graikos, Dimitris Samaras:
S-VolSDF: Sparse Multi-View Stereo Regularization of Neural Implicit Surfaces. 3533-3545 - Chaoran Tian, Weihong Pan, Zimo Wang, Mao Mao, Guofeng Zhang, Hujun Bao, Ping Tan, Zhaopeng Cui:
DPS-Net: Deep Polarimetric Stereo Depth Estimation. 3546-3556 - Changyong Shu, Jiajun Deng, Fisher Yu, Yifan Liu:
3DPPE: 3D Point Positional Encoding for Transformer-based Multi-Camera 3D Object Detection. 3557-3566 - Qi Ma, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool:
Deformable Neural Radiance Fields using RGB and Event Cameras. 3567-3577 - Jingyang Zhang, Yao Yao, Shiwei Li, Jingbo Liu, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan:
NeILF++: Inter-Reflectable Light Fields for Geometry and Material Estimation. 3578-3587 - Chunlin Ren, Qingshan Xu, Shikun Zhang, Jiaqi Yang:
Hierarchical Prior Mining for Non-local Multi-View Stereo. 3588-3597 - Shihao Wang, Yingfei Liu, Tiancai Wang, Ying Li, Xiangyu Zhang:
Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection. 3598-3608 - Sara Rojas, Jesus Zarzar, Juan C. Pérez, Artsiom Sanakoyeu, Ali K. Thabet, Albert Pumarola, Bernard Ghanem:
Re-ReND: Real-time Rendering of NeRFs across Devices. 3609-3618 - Xiaoyang Huang, Yi Zhang, Kai Chen, Teng Li, Wenjun Zhang, Bingbing Ni:
Learning Shape Primitives via Implicit Convexity Regularization. 3619-3628 - Ruihong Yin, Sezer Karaoglu, Theo Gevers:
Geometry-guided Feature Learning and Fusion for Indoor Scene Reconstruction. 3629-3638 - Zhiwei Zhang, Zhizhong Zhang, Qian Yu, Ran Yi, Yuan Xie, Lizhuang Ma:
LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and Semantic-Aware Alignment. 3639-3648 - Wenjie Ding, Limeng Qiao, Xi Qiu, Chi Zhang:
PivotNet: Vectorized Pivot Learning for End-to-end HD Map Construction. 3649-3659 - Ming Qian, Jincheng Xiong, Gui-Song Xia, Nan Xue:
Sat2Density: Faithful Density Learning from Satellite-Ground Image Pairs. 3660-3669 - Xin Lai, Yuhui Yuan, Ruihang Chu, Yukang Chen, Han Hu, Jiaya Jia:
Mask-Attention-Free Transformer for 3D Instance Segmentation. 3670-3680 - Xiaoyong Lu, Yaping Yan, Tong Wei, Songlin Du:
Scene-Aware Feature Matching. 3681-3690 - Zhuoxiao Chen, Yadan Luo, Zheng Wang, Mahsa Baktashmotlagh, Zi Huang:
Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling. 3691-3703 - Youmin Zhang, Fabio Tosi, Stefano Mattoccia, Matteo Poggi:
GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction. 3704-3714 - Valter Piedade, Pedro Miraldo:
BANSAC: A dynamic BAyesian Network for adaptive SAmple Consensus. 3715-3724 - Felix Rydell, Elima Shehu, Angélica Torres:
Theoretical and Numerical Analysis of 3D Reconstruction Using Point and Line Incidences. 3725-3734 - Haozhe Lin, Zequn Chen, Jinzhi Zhang, Bing Bai, Yu Wang, Ruqi Huang, Lu Fang:
RealGraph: A Multiview Dataset for 4D Real-world Context Graph Generation. 3735-3745 - Kaiqiang Xiong, Rui Peng, Zhe Zhang, Tianxing Feng, Jianbo Jiao, Feng Gao, Ronggang Wang:
CL-MVSNet: Unsupervised Multi-view Stereo with Dual-level Contrastive Learning. 3746-3757 - Zhuofan Zong, Dongzhi Jiang, Guanglu Song, Zeyue Xue, Jingyong Su, Hongsheng Li, Yu Liu:
Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction. 3758-3767 - Zitian Wang, Zehao Huang, Jiahui Fu, Naiyan Wang, Si Liu:
Object as Query: Lifting any 2D Object Detector to 3D Detection. 3768-3777 - Ming Nie, Yujing Xue, Chunwei Wang, Chaoqiang Ye, Hang Xu, Xinge Zhu, Qingqiu Huang, Michael Bi Mi, Xinchao Wang, Li Zhang:
PARTNER: Level up the Polar Representation for LiDAR 3D Object Detection. 3778-3790 - Chuxin Wang, Wenfei Yang, Tianzhu Zhang:
Not Every Side Is Equal: Localization Uncertainty Estimation for Semi-Supervised 3D Object Detection. 3791-3801 - Yifan Zhang, Zhen Dong, Huanrui Yang, Ming Lu, Cheng-Ching Tseng, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang:
QD-BEV : Quantization-aware View-guided Distillation for Multi-view 3D Object Detection. 3802-3812 - Lvmin Zhang, Anyi Rao, Maneesh Agrawala:
Adding Conditional Control to Text-to-Image Diffusion Models. 3813-3824 - Liwen Wu, Rui Zhu, Mustafa B. Yaldiz, Yinhao Zhu, Hong Cai, Janarbek Matai, Fatih Porikli, Tzu-Mao Li, Manmohan Chandraker, Ravi Ramamoorthi:
Factorized Inverse Path Tracing for Efficient and Accurate Material-Lighting Estimation. 3825-3835 - Jianren Wang, Sudeep Dasari, Mohan Kumar Srirama, Shubham Tulsiani, Abhinav Gupta:
Manipulate by Seeing: Creating Manipulation Controllers from Pre-Trained Representations. 3836-3845 - Chengliang Zhong, Yuhang Zheng, Yupeng Zheng, Hao Zhao, Li Yi, Xiaodong Mu, Ling Wang, Pengfei Li, Guyue Zhou, Chao Yang, Xinliang Zhang, Jian Zhao:
3D Implicit Transporter for Temporally Consistent Keypoint Discovery. 3846-3857 - Nathan Mankovich, Tolga Birdal:
Chordal Averaging on Flag Manifolds and Its Applications. 3858-3867 - Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang:
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning. 3868-3879 - Zhiyu Huang, Haochen Liu, Chen Lv:
GameFormer: Game-theoretic Modeling and Learning of Transformer-based Interactive Prediction and Planning for Autonomous Driving. 3880-3890 - Gengshan Yang, Shuo Yang, John Z. Zhang, Zachary Manchester, Deva Ramanan:
PPR: Physically Plausible Reconstruction from Monocular Videos. 3891-3901 - Wenjia Wang, Yongtao Ge, Haiyi Mei, Zhongang Cai, Qingping Sun, Yanjun Wang, Chunhua Shen, Lei Yang, Taku Komura:
Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction. 3902-3912 - Hyekang Park, Jongyoun Noh, Youngmin Oh, Donghyeon Baek, Bumsub Ham:
ACLS: Adaptive and Conditional Label Smoothing for Network Calibration. 3913-3922 - Jun Luo, Matías Mendieta, Chen Chen, Shandong Wu:
PGFed: Personalize Each Client's Global Objective for Federated Learning. 3923-3933 - Angelina Wang, Olga Russakovsky:
Overwriting Pretrained Bias with Finetuning Data. 3934-3945 - Cheng Zhang, Xuanbai Chen, Siqi Chai, Chen Henry Wu, Dmitry Lagun, Thabo Beeler, Fernando De la Torre:
ITI-Gen: Inclusive Text-to-Image Generation. 3946-3957 - Robin Hesse, Simone Schaub-Meyer, Stefan Roth:
FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods. 3958-3968 - Bo Dai, Linge Wang, Baoxiong Jia, Zeyu Zhang, Song-Chun Zhu, Chi Zhang, Yixin Zhu:
X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events. 3969-3979 - Irena Gao, Gabriel Ilharco, Scott M. Lundberg, Marco Túlio Ribeiro:
Adaptive Testing of Computer Vision Models. 3980-3991 - Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloé Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross B. Girshick:
Segment Anything. 3992-4003 - Perrine Chassat, Juhyun Park, Nicolas J.-B. Brunel:
Shape Analysis of Euclidean Curves under Frenet-Serret Framework. 4004-4013 - Shyam Nandan Rai, Fabio Cermelli, Dario Fontanel, Carlo Masone, Barbara Caputo:
Unmasking Anomalies in Road-Scene Segmentation. 4014-4023 - Lu Qi, Jason Kuen, Tiancheng Shen, Jiuxiang Gu, Wenbo Li, Weidong Guo, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang:
High Quality Entity Segmentation. 4024-4033 - Haochen Wang, Xiaolong Jiang, Xu Tang, Yao Hu, Cilin Yan, Weidi Xie, Shuai Wang, Efstratios Gavves:
Towards Open-Vocabulary Video Instance Segmentation. 4034-4043 - Yutao Hu, Qixiong Wang, Wenqi Shao, Enze Xie, Zhenguo Li, Jungong Han, Ping Luo:
Beyond One-to-One: Rethinking the Referring Image Segmentation. 4044-4054 - Wenhao Tang, Sheng Huang, Xiaoxian Zhang, Fengtao Zhou, Yi Zhang, Bo Liu:
Multiple Instance Learning Framework with Masked Hard Instance Mining for Whole Slide Image Classification. 4055-4064 - Colorado J. Reed, Ritwik Gupta, Shufan Li, Sarah Brockman, Christopher Funk, Brian Clipp, Kurt Keutzer, Salvatore Candido, Matt Uyttendaele, Trevor Darrell:
Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning. 4065-4076 - Pandeng Li, Chen-Wei Xie, Liming Zhao, Hongtao Xie, Jiannan Ge, Yun Zheng, Deli Zhao, Yongdong Zhang:
Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval. 4077-4087 - Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Bin Luo, Jun-Yan He, Jin-Peng Lan, Yifeng Geng, Xuansong Xie:
Towards Deeply Unified Depth-aware Panoptic Segmentation with Bi-directional Guidance Learning. 4088-4098 - Liulei Li, Wenguan Wang, Yang Yi:
LogicSeg: Parsing Visual Semantics with Neural Logic Learning and Reasoning. 4099-4110 - Kamal Gupta, Varun Jampani, Carlos Esteves, Abhinav Shrivastava, Ameesh Makadia, Noah Snavely, Abhishek Kar:
ASIC: Aligning Sparse in-the-wild Image Collections. 4111-4122 - Yael Vinker, Yuval Alaluf, Daniel Cohen-Or, Ariel Shamir:
CLIPascene: Scene Sketching with Different Types and Levels of Abstraction. 4123-4133 - Koutilya PNVR, Bharat Singh, Pallabi Ghosh, Behjat Siddiquie, David Jacobs:
LD-ZNet: A Latent Diffusion Approach for Text-Based Image Segmentation. 4134-4145 - Tianshi Cao, Karsten Kreis, Sanja Fidler, Nicholas Sharp, Kangxue Yin:
TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models. 4146-4158 - Zhang Chen, Zhong Li, Liangchen Song, Lele Chen, Jingyi Yu, Junsong Yuan, Yi Xu:
NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions. 4159-4171 - William Peebles, Saining Xie:
Scalable Diffusion Models with Transformers. 4172-4182 - Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Zhengzhe Liu, Xiaojuan Qi:
Texture Generation on 3D Meshes with Point-UV Diffusion. 4183-4193 - Eric R. Chan, Koki Nagano, Matthew A. Chan, Alexander W. Bergman, Jeong Joon Park, Axel Levy, Miika Aittala, Shalini De Mello, Tero Karras, Gordon Wetzstein:
Generative Novel View Synthesis with 3D-Aware Diffusion Models. 4194-4206 - Enze Xie, Lewei Yao, Han Shi, Zhili Liu, Daquan Zhou, Zhaoqiang Liu, Jiawei Li, Zhenguo Li:
DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning. 4207-4216 - Kyle Sargent, Jing Yu Koh, Han Zhang, Huiwen Chang, Charles Herrmann, Pratul P. Srinivasan, Jiajun Wu, Deqing Sun:
VQ3D: Learning a 3D-Aware Generative Model on ImageNet. 4217-4227 - Wenhang Ge, Tao Hu, Haoyu Zhao, Shu Liu, Ying-Cong Chen:
Ref-NeuS: Ambiguity-Reduced Neural Implicit Surface Learning for Multi-View Reconstruction with Reflection. 4228-4237 - Kushagra Pandey, Stephan Mandt:
A Complete Recipe for Diffusion Generative Models. 4238-4249 - Yiqi Zhong, Luming Liang, Ilya Zharkov, Ulrich Neumann:
MMVP: Motion-Matrix-based Video Prediction. 4250-4260 - Tomer Stolik, Itai Lang, Shai Avidan:
SAGA: Spectral Adversarial Geometric Attack on 3D Meshes. 4261-4271 - Qiufan Ji, Lin Wang, Cong Shi, Shengshan Hu, Yingying Chen, Lichao Sun:
Benchmarking and Analyzing Robust Point Cloud Recognition: Bag of Tricks for Defending Adversarial Examples. 4272-4281 - Naufal Suryanto, Yongsu Kim, Harashta Tatimma Larasati, Hyoeun Kang, Thi-Thu-Huong Le, Yoonyoung Hong, Hunmin Yang, Se-Yoon Oh, Howon Kim:
ACTIVE: Towards Highly Transferable 3D Physical Camouflage for Universal and Robust Vehicle Evasion. 4282-4291 - Peifei Zhu, Genki Osada, Hirokatsu Kataoka, Tsubasa Takahashi:
Frequency-aware GAN for Adversarial Manipulation Generation. 4292-4301 - Heeseon Kim, Minji Son, Minbeom Kim, Myung-Joon Kwon, Changick Kim:
Breaking Temporal Consistency: Generating Video Universal Adversarial Perturbations Using Image Models. 4302-4311 - Han Fang, Jiyi Zhang, Yupeng Qiu, Jiayang Liu, Ke Xu, Chengfang Fang, Ee-Chien Chang:
Tracing the Origin of Adversarial Attack for Forensic Investigation and Deterrence. 4312-4321 - Ziqi Zhou, Shengshan Hu, Ruizhi Zhao, Qian Wang, Leo Yu Zhang, Junhui Hou, Hai Jin:
Downstream-agnostic Adversarial Examples. 4322-4332 - Zhigang Su, Dawei Zhou, Nannan Wang, Decheng Liu, Zhen Wang, Xinbo Gao:
Hiding Visual Information via Obfuscating Adversarial Perturbations. 4333-4343 - Changjiang Li, Ren Pang, Zhaohan Xi, Tianyu Du, Shouling Ji, Yuan Yao, Ting Wang:
An Embarrassingly Simple Backdoor Attack on Self-supervised Learning. 4344-4355 - Kaixun Jiang, Zhaoyu Chen, Hao Huang, Jiafeng Wang, Dingkang Yang, Bo Li, Yan Wang, Wenqiang Zhang:
Efficient Decision-based Black-box Patch Attacks on Video Recognition. 4356-4366 - Satoshi Suzuki, Shin'ya Yamaguchi, Shoichiro Takeda, Sekitoshi Kanai, Naoki Makishima, Atsushi Ando, Ryo Masumura:
Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff. 4367-4378 - Qingwen Bu, Dong Huang, Heming Cui:
Towards Building More Robust Models with Frequency Bias. 4379-4388 - Ningfei Wang, Yunpeng Luo, Takami Sato, Kaidi Xu, Qi Alfred Chen:
Does Physical Adversarial Example Really Matter to Autonomous Driving? Towards System-Level Effect of Adversarial Object Evasion Attack. 4389-4400 - Kaijie Zhu, Xixu Hu, Jindong Wang, Xing Xie, Ge Yang:
Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning. 4401-4411 - Xuannan Liu, Yaoyao Zhong, Yuhang Zhang, Lixiong Qin, Weihong Deng:
Enhancing Generalization of Universal Adversarial Perturbation through Gradient Aggregation. 4412-4421 - Xingxing Wei, Yao Huang, Yitong Sun, Jie Yu:
Unified Adversarial Patch for Cross-modal Attacks in the Physical World. 4422-4431 - Donghua Wang, Wen Yao, Tingsong Jiang, Chao Li, Xiaoqian Chen:
RFLA: A Stealthy Reflected Light Adversarial Attack in the Physical World. 4432-4442 - Mingli Zhu, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu:
Enhancing Fine-Tuning based Backdoor Defense with Sharpness-Aware Minimization. 4443-4454 - Ka-Chun Shum, Hong-Wing Pang, Binh-Son Hua, Duc Thanh Nguyen, Sai-Kit Yeung:
Conditional 360-degree Image Synthesis for Immersive Indoor Scene Decoration. 4455-4465 - Bin Chen, Jia-Li Yin, Shukai Chen, Bohao Chen, Ximeng Liu:
An Adaptive Model Ensemble Adversarial Attack for Boosting Adversarial Transferability. 4466-4475 - Byung-Kwan Lee, Junho Kim, Yong Man Ro:
Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning. 4476-4486 - Yaguan Qian, Shuke He, Chenyu Zhao, Jiaqiang Sha, Wei Wang, Bin Wang:
LEA2: A Lightweight Ensemble Adversarial Attack via Non-overlapping Vulnerable Frequency Regions. 4487-4498 - Yulin Jin, Xiaoyu Zhang, Jian Lou, Xu Ma, Zilong Wang, Xiaofeng Chen:
Explaining Adversarial Robustness of Neural Networks from Clustering Effect Perspective. 4499-4508 - Ruyi Ding, Shijin Duan, Xiaolin Xu, Yunsi Fei:
VertexSerum: Poisoning Graph Neural Networks for Link Inference. 4509-4518 - Thibault Maho, Seyed-Mohsen Moosavi-Dezfooli, Teddy Furon:
How to choose your best allies for a transferable attack? 4519-4528 - Dongyoon Yang, Insung Kong, Yongdai Kim:
Enhancing Adversarial Robustness in Low-Label Regime via Adaptively Weighted Regularization and Knowledge Distillation. 4529-4538 - Xinquan Chen, Xitong Gao, Juanjuan Zhao, Kejiang Ye, Cheng-Zhong Xu:
AdvDiffuser: Natural Adversarial Example Synthesis with Diffusion Models. 4539-4549 - Tao Zhou, Qi Ye, Wenhan Luo, Kaihao Zhang, Zhiguo Shi, Jiming Chen:
F&F Attack: Adversarial Attack against Multiple Object Trackers by Inducing False Negatives and False Positives. 4550-4560 - Lukas Struppek, Dominik Hintersdorf, Kristian Kersting:
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis. 4561-4573 - Zhengzhi Lu, He Wang, Ziyi Chang, Guoan Yang, Hubert P. H. Shum:
Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient. 4574-4583 - Xiaosen Wang, Zeliang Zhang, Jianping Zhang:
Structure Invariant Transformation for better Adversarial Transferability. 4584-4596 - Min Liu, Alberto L. Sangiovanni-Vincentelli, Xiangyu Yue:
Beating Backdoor Attack at Its Own Game. 4597-4606 - Wenshuo Ma, Yidong Li, Xiaofeng Jia, Wei Xu:
Transferable Adversarial Attack for Both Vision Transformers and Convolutional Networks via Momentum Integrated Gradients. 4607-4616 - Nabeel Hingun, Chawin Sitawarin, Jerry Li, David A. Wagner:
REAP: A Large-Scale Realistic Adversarial Patch Benchmark. 4617-4628 - Siquan Huang, Yijiang Li, Chong Chen, Leyu Shi, Ying Gao:
Multi-metrics adaptively identifies backdoors in Federated learning. 4629-4639 - Zhuoer Xu, Zhangxuan Gu, Jianping Zhang, Shiwen Cui, Changhua Meng, Weiqiang Wang:
Backpropagation Path Search On Adversarial Transferability. 4640-4650 - Teresa Yeo, Oguzhan Fatih Kar, Zahra Sodagar, Amir Zamir:
Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback. 4651-4664 - Jianshuo Dong, Han Qiu, Yiming Li, Tianwei Zhang, Yuanjie Li, Zeqi Lai, Chao Zhang, Shu-Tao Xia:
One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training. 4665-4675 - Junfeng Guo, Ang Li, Lixu Wang, Cong Liu:
PolicyCleanse: Backdoor Detection and Mitigation for Competitive Reinforcement Learning. 4676-4685 - Shouwei Ruan, Yinpeng Dong, Hang Su, Jianteng Peng, Ning Chen, Xingxing Wei:
Towards Viewpoint-Invariant Visual Recognition via Adversarial Training. 4686-4696 - Mengnan Zhao, Lihe Zhang, Yuqiu Kong, Baocai Yin:
Fast Adversarial Training with Smooth Convergence. 4697-4706 - Virat Shejwalkar, Lingjuan Lyu, Amir Houmansadr:
The Perils of Learning From Unlabeled Data: Backdoor Attacks on Semi-supervised Learning. 4707-4717 - Hegui Zhu, Yuchen Ren, Xiaoyan Sui, Lianping Yang, Wuming Jiang:
Boosting Adversarial Transferability via Gradient Relevance Attack. 4718-4727 - Guanhao Gan, Yiming Li, Dongxian Wu, Shu-Tao Xia:
Towards Robust Model Watermark via Reducing Parametric Vulnerability. 4728-4738 - Yiran Liu, Xin Feng, Yunlong Wang, Wu Yang, Di Ming:
TRM-UAP: Enhancing the Transferability of Data-Free Universal Adversarial Perturbation via Truncated Ratio Maximization. 4739-4748 - Guangnian Wan, Haitao Du, Xuejing Yuan, Jun Yang, Meiling Chen, Jie Xu:
Enhancing Privacy Preservation in Federated Learning via Learning Rate Perturbation. 4749-4758 - Jie Zhang, Chen Chen, Weiming Zhuang, Lingjuan Lyu:
TARGET: Federated Class-Continual Learning via Exemplar-Free Distillation. 4759-4770 - Sriram Yenamandra, Pratik Ramesh, Viraj Prabhu, Judy Hoffman:
FACTS: First Amplify Correlations and Then Slice to Discover Bias. 4771-4781 - Yutong Wu, Xingshuo Han, Han Qiu, Tianwei Zhang:
Computation and Data Efficient Backdoor Attacks. 4782-4791 - Yaopei Zeng, Lei Liu, Li Liu, Li Shen, Shaoguo Liu, Baoyuan Wu:
Global Balanced Experts for Federated Long-Tailed Learning. 4792-4802 - Qucheng Peng, Ce Zheng, Chen Chen:
Source-free Domain Adaptive Human Pose Estimation. 4803-4813 - Nicole Meister, Dora Zhao, Angelina Wang, Vikram V. Ramaswamy, Ruth Fong, Olga Russakovsky:
Gender Artifacts in Visual Datasets. 4814-4825 - Haokun Chen, Ahmed Frikha, Denis Krompass, Jindong Gu, Volker Tresp:
FRAug: Tackling Federated Learning with Non-IID Features via Representation Augmentation. 4826-4836 - Zahra Ghodsi, Mojan Javaheripi, Nojan Sheybani, Xinqiao Zhang, Ke Huang, Farinaz Koushanfar:
zPROBE: Zero Peek Robustness Checks for Federated Learning. 4837-4847 - Myeongseob Ko, Ming Jin, Chenguang Wang, Ruoxi Jia:
Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study. 4848-4858 - Chen Yang, Meilu Zhu, Yifan Liu, Yixuan Yuan:
FedPD: Federated Open Set Recognition with Parameter Disentanglement. 4859-4868 - Junxu Liu, Mingsheng Xue, Jian Lou, Xiaoyu Zhang, Li Xiong, Zhan Qin:
MUter: Machine Unlearning on Adversarially Trained Models. 4869-4879 - William Thong, Przemyslaw Joniak, Alice Xiang:
Beyond Skin Tone: A Multidimensional Measure of Apparent Skin Color. 4880-4890 - Jannik Brinkmann, Paul Swoboda, Christian Bartelt:
A Multidimensional Analysis of Social Biases in Vision Transformers. 4891-4900 - Jiaxuan Li, Duc Minh Vo, Hideki Nakayama:
Partition-and-Debias: Agnostic Biases Mitigation via A Mixture of Biases-Specific Experts. 4901-4911 - Dongyao Zhu, Yanbo Fang, Bowen Lei, Yiqun Xie, Dongkuan Xu, Jie Zhang, Ruqi Zhang:
Rethinking Data Distillation: Do Not Overlook Calibration. 4912-4922 - Rémi Nahon, Van-Tam Nguyen, Enzo Tartaglione:
Mining bias-target Alignment from Voronoi Cells. 4923-4932 - Ming-Chang Chiu, Pin-Yu Chen, Xuezhe Ma:
Better May Not Be Fairer: A Study on Subgroup Discrepancy in Image Classification. 4933-4943 - Hao Fang, Bin Chen, Xuan Wang, Zhi Wang, Shu-Tao Xia:
GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization. 4944-4953 - Hao Liang, Pietro Perona, Guha Balakrishnan:
Benchmarking Algorithmic Bias in Face Recognition: An Experimental Approach Using Synthetic Faces and Human Evaluation. 4954-4964 - Guangyu Sun, Matías Mendieta, Jun Luo, Shandong Wu, Chen Chen:
FedPerfix: Towards Partial Model Personalization of Vision Transformers in Federated Learning. 4965-4975 - Sungwon Han, Sungwon Park, Fangzhao Wu, Sundong Kim, Bin Zhu, Xing Xie, Meeyoung Cha:
Towards Attack-tolerant Federated Learning via Critical Parameter Analysis. 4976-4985 - Ziheng Huang, Boheng Li, Yan Cai, Run Wang, Shangwei Guo, Liming Fang, Jing Chen, Lina Wang:
What can Discriminator do? Towards Box-free Ownership Verification of Generative Adversarial Networks. 4986-4996 - Xiuwen Fang, Mang Ye, Xiyuan Yang:
Robust Heterogeneous Federated Learning under Data Corruption. 4997-5007 - Yuhao Zhou, Mingjia Shi, Yuanxi Li, Yanan Sun, Qing Ye, Jiancheng Lv:
Communication-efficient Federated Learning with Single-Step Synthetic Features Compressor for Faster Convergence. 5008-5017 - Jianqing Zhang, Yang Hua, Hao Wang, Tao Song, Zhengui Xue, Ruhui Ma, Jian Cao, Haibing Guan:
GPFL: Simultaneously Learning Global and Personalized Feature Information for Personalized Federated Learning. 5018-5028 - Wenxuan Zeng, Meng Li, Wenjie Xiong, Tong Tong, Wen-Jie Lu, Jin Tan, Runsheng Wang, Ru Huang:
MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention. 5029-5040 - Jan Hendrik Metzen, Robin Hutmacher, N. Grace Hua, Valentyn Boreiko, Dan Zhang:
Identification of Systematic Errors of Image Classifiers on Rare Subgroups. 5041-5050 - Nadiya Shvai, Arcadi Llanza Carmona, Amir Nakib:
Adaptive Image Anonymization in the Context of Image Classification with Neural Networks. 5051-5060 - Saeed Vahidian, Sreevatsank Kadaveru, Woonjoon Baek, Weijia Wang, Vyacheslav Kungurtsev, Chen Chen, Mubarak Shah, Bill Lin:
When Do Curricula Work in Federated Learning? 5061-5071 - Haotian Wang, Haoang Chi, Wenjing Yang, Zhipeng Lin, Mingyang Geng, Long Lan, Jing Zhang, Dacheng Tao:
Domain Specified Optimization for Deployment Authorization. 5072-5082 - Ming Li, Xiangyu Xu, Hehe Fan, Pan Zhou, Jun Liu, Jia-Wei Liu, Jiahe Li, Jussi Keppo, Mike Zheng Shou, Shuicheng Yan:
STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition. 5083-5092 - Yuke Zhang, Dake Chen, Souvik Kundu, Chenghao Li, Peter A. Beerel:
SAL-ViT: Towards Latency Efficient Private Inference on ViT using Selective Attention Search with a Learnable Softmax Approximation. 5093-5102 - Chi Zhang, Xiaoman Zhang, Ekanut Sotthiwat, Yanyu Xu, Ping Liu, Liangli Zhen, Yong Liu:
Generative Gradient Inversion via Over-Parameterized Networks in Federated Learning. 5103-5112 - Abhipsa Basu, R. Venkatesh Babu, Danish Pruthi:
Inspecting the Geographical Representativeness of Images from Text-to-Image Models. 5113-5124 - Yunqian Wen, Bo Liu, Jingyi Cao, Rong Xie, Li Song:
Divide and Conquer: a Two-Step Method for High Quality Face De-identification with Model Explainability. 5125-5134 - Yizhe Li, Yu-Lin Tsai, Chia-Mu Yu, Pin-Yu Chen, Xuebin Ren:
Exploring the Benefits of Visual Prompting in Differential Privacy. 5135-5144 - Lei Zhang, Zhibo Wang, Xiaowei Dong, Yunhe Feng, Xiaoyi Pang, Zhifei Zhang, Kui Ren:
Towards Fairness-aware Adversarial Network Pruning. 5145-5154 - Hongwu Peng, Shaoyi Huang, Tong Zhou, Yukui Luo, Chenghong Wang, Zigeng Wang, Jiahui Zhao, Xi Xie, Ang Li, Tony Geng, Kaleel Mahmood, Wujie Wen, Xiaolin Xu, Caiwen Ding:
AutoReP: Automatic ReLU Replacement for Fast Private Network Inference. 5155-5165 - Xingxuan Zhang, Renzhe Xu, Han Yu, Yancheng Dong, Pengfei Tian, Peng Cui:
Flatness-Aware Minimization for Domain Generalization. 5166-5179 - Jingwei Sun, Ziyue Xu, Dong Yang, Vishwesh Nath, Wenqi Li, Can Zhao, Daguang Xu, Yiran Chen, Holger R. Roth:
Communication-Efficient Vertical Federated Learning with Limited Overlapping Samples. 5180-5189 - Gorjan Radevski, Dusan Grujicic, Matthew B. Blaschko, Marie-Francine Moens, Tinne Tuytelaars:
Multimodal Distillation for Egocentric Action Recognition. 5190-5201 - Peri Akiva, Jing Huang, Kevin J. Liang, Rama Kovvuri, Xingyu Chen, Matt Feiszli, Kristin J. Dana, Tal Hassner:
Self-Supervised Object Detection from Egocentric Videos. 5202-5214 - Lorenzo Mur-Labadia, Josechu J. Guerrero, Ruben Martinez-Cantin:
Multi-label affordance mapping from egocentric vision. 5215-5226 - Huiyu Wang, Mitesh Kumar Singh, Lorenzo Torresani:
Ego-Only: Egocentric Action Detection without Exocentric Transferring. 5227-5238 - Boxiao Pan, Bokui Shen, Davis Rempe, Despoina Paschalidou, Kaichun Mo, Yanchao Yang, Leonidas J. Guibas:
COPILOT: Human-Environment Collision Prediction and Localization from Egocentric Videos. 5239-5249 - Yue Xu, Yong-Lu Li, Zhemin Huang, Michael Xu Liu, Cewu Lu, Yu-Wing Tai, Chi-Keung Tang:
EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding. 5250-5261 - Shraman Pramanick, Yale Song, Sayan Nag, Kevin Qinghong Lin, Hardik Shah, Mike Zheng Shou, Rama Chellappa, Pengchuan Zhang:
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone. 5262-5274 - Yiye Chen, Yunzhi Lin, Ruinian Xu, Patricio A. Vela:
WDiscOOD: Out-of-Distribution Detection via Whitened Linear Discriminant Analysis. 5275-5284 - Yandong Wen, Weiyang Liu, Yao Feng, Bhiksha Raj, Rita Singh, Adrian Weller, Michael J. Black, Bernhard Schölkopf:
Pairwise Similarity Learning is SimPLE. 5285-5295 - Zexi Li, Xinyi Shang, Rui He, Tao Lin, Chao Wu:
No Fear of Classifier Biases: Neural Collapse Inspired Federated Learning with Synthetic and Fixed Classifier. 5296-5306 - Jeffrey Gu, Kuan-Chieh Wang, Serena Yeung:
Generalizable Neural Fields as Partially Observed Neural Processes. 5307-5316 - Fabian Mentzer, Eirikur Agustsson, Michael Tschannen:
M2T: Masking Transformers Twice for Faster Decoding. 5317-5326 - Bill Psomas, Ioannis Kakogeorgiou, Konstantinos Karantzalos, Yannis Avrithis:
Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit? 5327-5337 - Yuan Liu, Songyang Zhang, Jiacheng Chen, Zhaohui Yu, Kai Chen, Dahua Lin:
Improving Pixel-based MIM by Reducing Wasted Modeling Capability. 5338-5349 - Kechun Liu, Yitong Jiang, Inchang Choi, Jinwei Gu:
Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration. 5350-5360 - Ruchika Chavhan, Henry Gouk, Da Li, Timothy M. Hospedales:
Quality Diversity for Visual Pre-Training. 5361-5371 - Chengkai Hou, Jieyu Zhang, Haonan Wang, Tianyi Zhou:
Subclass-balancing Contrastive Learning for Long-tailed Recognition. 5372-5384 - Sotiris Anagnostidis, Aurélien Lucchi, Thomas Hofmann:
Mastering Spatial Graph Prediction of Road Networks. 5385-5395 - Max van Spengler, Erwin Berkhout, Pascal Mettes:
Poincaré ResNet. 5396-5405 - Xiaotong Li, Zixuan Hu, Yixiao Ge, Ying Shan, Ling-Yu Duan:
Exploring Model Transferability through the Lens of Potential Energy. 5406-5415 - Yixuan Wei, Han Hu, Zhenda Xie, Ze Liu, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo:
Improving CLIP Fine-tuning Performance. 5416-5426 - Tianjiao Ding, Shengbang Tong, Kwan Ho Ryan Chan, Xili Dai, Yi Ma, Benjamin D. Haeffele:
Unsupervised Manifold Linearizing and Clustering. 5427-5438 - Yeti Ziya Gürbüz, Ozan Sener, A. Aydin Alatan:
Generalized Sum Pooling for Metric Learning. 5439-5450 - Ke Liu, Feng Liu, Haishuai Wang, Ning Ma, Jiajun Bu, Bo Han:
Partition Speeds Up Learning Implicit Neural Representations Based on Exponential-Increase Hypothesis. 5451-5460 - Mannat Singh, Quentin Duval, Kalyan Vasudev Alwala, Haoqi Fan, Vaibhav Aggarwal, Aaron Adcock, Armand Joulin, Piotr Dollár, Christoph Feichtenhofer, Ross B. Girshick, Rohit Girdhar, Ishan Misra:
The effectiveness of MAE pre-pretraining for billion-scale pretraining. 5461-5471 - Han Xiao, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu:
Token-Label Alignment for Vision Transformers. 5472-5481 - Nishant Jain, Harkirat S. Behl, Yogesh Singh Rawat, Vibhav Vineet:
Efficiently Robustify Pre-Trained Models. 5482-5492 - Tao Xie, Kun Dai, Siyi Lu, Ke Wang, Zhiqiang Jiang, Jinghan Gao, Dedong Liu, Jie Xu, Lijun Zhao, Ruifeng Li:
OFVL-MS: Once for Visual Localization across Multiple Indoor Scenes. 5493-5503 - Cheng Yan, Shiyu Zhang, Yang Liu, Guansong Pang, Wenjun Wang:
Feature Prediction Diffusion Model for Video Anomaly Detection. 5504-5514 - Chia-Hao Chen, Ying-Tian Liu, Zhifei Zhang, Yuan-Chen Guo, Song-Hai Zhang:
Joint Implicit Neural Representation for High-fidelity and Compact Vector Fonts. 5515-5525 - Zijian Wang, Yadan Luo, Liang Zheng, Zi Huang, Mahsa Baktashmotlagh:
How Far Pre-trained Models Are from Neural Collapse on the Target Dataset Informs their Transferability. 5526-5535 - Chengkun Wang, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu:
OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions. 5536-5547 - Kanchana Ranasinghe, Brandon McKinzie, Sachin Ravi, Yinfei Yang, Alexander Toshev, Jonathon Shlens:
Perceptual Grouping in Contrastive Vision-Language Models. 5548-5561 - Bingyin Zhao, Zhiding Yu, Shiyi Lan, Yutao Cheng, Anima Anandkumar, Yingjie Lao, José M. Álvarez:
Fully Attentional Networks with Self-emerging Token Labeling. 5562-5572 - Xudong Tian, Zhizhong Zhang, Xin Tan, Jun Liu, Chengjie Wang, Yanyun Qu, Guannan Jiang, Yuan Xie:
Instance and Category Supervision are Alternate Learners for Continual Learning. 5573-5582 - Hong Yan, Yang Liu, Yushen Wei, Zhen Li, Guanbin Li, Liang Lin:
SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training. 5583-5595 - David Fan, Jue Wang, Shuai Liao, Yi Zhu, Vimal Bhat, Hector J. Santos-Villalobos, Rohith MV, Xinyu Li:
Motion-Guided Masking for Spatiotemporal Representation Learning. 5596-5606 - Enneng Yang, Li Shen, Zhenyi Wang, Shiwei Liu, Guibing Guo, Xingwei Wang:
Data Augmented Flatness-aware Gradient Projection for Continual Learning. 5607-5616 - Ziyi Wang, Xumin Yu, Yongming Rao, Jie Zhou, Jiwen Lu:
Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models. 5617-5627 - Yefei He, Zhenyu Lou, Luoming Zhang, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang:
BiViT: Extremely Compressed Binary Vision Transformers. 5628-5640 - Sepehr Sameni, Simon Jenni, Paolo Favaro:
Spatio-Temporal Crop Aggregation for Video Representation Learning. 5641-5651 - Hanjae Kim, Jiyoung Lee, Seongheon Park, Kwanghoon Sohn:
Hierarchical Visual Primitive Experts for Compositional Zero-Shot Learning. 5652-5662 - Shengjiang Quan, Masahiro Hirano, Yuji Yamakawa:
Semantic Information in Contrastive Learning. 5663-5673 - Xuehan Bai, Yan Li, Yanhua Cheng, Wenjie Yang, Quan Chen, Han Li:
Cross-Domain Product Representation Learning for Rich-Content E-Commerce. 5674-5683 - Haoyang Cheng, Haitao Wen, Xiaoliang Zhang, Heqian Qiu, Lanxiao Wang, Hongliang Li:
Contrastive Continuity on Augmentation Stability Rehearsal for Continual Self-Supervised Learning. 5684-5694 - Mehmet Kerim Yucel, Ramazan Gokberk Cinbis, Pinar Duygulu:
HybridAugment++: Unified Frequency Spectra Perturbations for Model Robustness. 5695-5705 - Wenliang Zhao, Yongming Rao, Zuyan Liu, Benlin Liu, Jie Zhou, Jiwen Lu:
Unleashing Text-to-Image Diffusion Models for Visual Perception. 5706-5716 - Abhishek Aich, Samuel Schulter, Amit K. Roy-Chowdhury, Manmohan Chandraker, Yumin Suh:
Efficient Controllable Multi-Task Architectures. 5717-5728 - Ruihan Xu, Haokui Zhang, Wenze Hu, Shiliang Zhang, Xiaoyu Wang:
ParCNetV2: Oversized Kernel with Enhanced Attention*. 5729-5739 - Zihao Sun, Yu Sun, Longxing Yang, Shun Lu, Jilin Mei, Wenxiao Zhao, Yu Hu:
Unleashing the Power of Gradient Signal-to-Noise Ratio for Zero-Shot NAS. 5740-5750 - Fudong Lin, Summer Crawford, Kaleb Guillot, Yihe Zhang, Yan Chen, Xu Yuan, Li Chen, Shelby Williams, Robert Minvielle, Xiangming Xiao, Drew Gholson, Nicolas Ashwell, Tri Setiyono, Brenda Tubana, Lu Peng, Magdy A. Bayoumi, Nian-Feng Tzeng:
MMST-ViT: Climate Change-aware Crop Yield Prediction via Multi-Modal Spatial-Temporal Vision Transformer. 5751-5761 - Pavan Kumar Anasosalu Vasu, James Gabriel, Jeff Zhu, Oncel Tuzel, Anurag Ranjan:
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization. 5762-5772 - Sudong Cai:
IIEU: Rethinking Neural Feature Activation from Decision-Making. 5773-5783 - Nam Hyeon-Woo, Kim Yu-Ji, Byeongho Heo, Dongyoon Han, Seong Joon Oh, Tae-Hyun Oh:
Scratching Visual Transformer's Back with Uniform Attention. 5784-5795 - Xudong Wang, Li Lyna Zhang, Jiahang Xu, Quanlu Zhang, Yujing Wang, Yuqing Yang, Ningxin Zheng, Ting Cao, Mao Yang:
SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference. 5796-5805 - Chen Tang, Li Lyna Zhang, Huiqiang Jiang, Jiahang Xu, Ting Cao, Quanlu Zhang, Yuqing Yang, Zhi Wang, Mao Yang:
ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices. 5806-5817 - Jongbin Ryu, Dongyoon Han, Jongwoo Lim:
Gramian Attention Heads are Strong yet Efficient Vision Learners. 5818-5828 - Yulin Wang, Yang Yue, Rui Lu, Tianjiao Liu, Zhao Zhong, Shiji Song, Gao Huang:
EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones. 5829-5841 - Jinhong Wang, Yi Cheng, Jintai Chen, Tingting Chen, Danny Chen, Jian Wu:
Ord2Seq: Regarding Ordinal Regression as Label Sequence Prediction. 5842-5852 - Shipeng Bai, Jun Chen, Xintian Shen, Yixuan Qian, Yong Liu:
Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning. 5853-5862 - Runyi Yu, Zhennan Wang, Yinhuai Wang, Kehan Li, Chang Liu, Haoyi Duan, Xiangyang Ji, Jie Chen:
LaPE: Layer-adaptive Position Embedding for Vision Transformers with Independent Layer Normalization. 5863-5873 - Anurag Roy, Vinay Kumar Verma, Sravan Voonna, Kripabandhu Ghosh, Saptarshi Ghosh, Abir Das:
Exemplar-Free Continual Transformer with Convolutions. 5874-5884 - Yongjie Chen, Hongmin Liu, Haoran Yin, Bin Fan:
Building Vision Transformers with Hierarchy Aware Feature Aggregation. 5885-5895 - Mingyang Zhang, Xinyi Yu, Haodong Zhao, Linlin Ou:
ShiftNAS: Improving One-shot NAS via Probability Shift. 5896-5905 - Akshaya Athwale, Arman Afrasiyabi, Justin Lagüe, Ichrak Shili, Ola Ahmad, Jean-François Lalonde:
DarSwin: Distortion Aware Radial Swin Transformer. 5906-5915 - Xiaoxing Wang, Xiangxiang Chu, Yuda Fan, Zhexi Zhang, Bo Zhang, Xiaokang Yang, Junchi Yan:
ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation. 5916-5926 - Yixing Xu, Chao Li, Dong Li, Xiao Sheng, Fan Jiang, Lu Tian, Ashish Sirasao:
FDViT: Improve the Hierarchical Architecture of Vision Transformer. 5927-5937 - Dongchen Han, Xuran Pan, Yizeng Han, Shiji Song, Gao Huang:
FLatten Transformer: Vision Transformer using Focused Linear Attention. 5938-5948 - Xiangxiang Chu, Shun Lu, Xudong Li, Bo Zhang:
MixPath: A Unified Approach for One-shot Neural Architecture Search. 5949-5958 - Jingtao Wang, Zengjie Song, Yuxi Wang, Jun Xiao, Yuran Yang, Shuqi Mei, Zhaoxiang Zhang:
SSF: Accelerating Training of Spiking Neural Networks with Stabilized Spiking Flow. 5959-5968 - Yizeng Han, Dongchen Han, Zeyu Liu, Yulin Wang, Xuran Pan, Yifan Pu, Chao Deng, Junlan Feng, Shiji Song, Gao Huang:
Dynamic Perceiver for Efficient Visual Recognition. 5969-5979 - Sucheng Ren, Xingyi Yang, Songhua Liu, Xinchao Wang:
SG-Former: Self-guided Transformer with Evolving Token Reallocation. 5980-5991 - Weifeng Lin, Ziheng Wu, Jiayu Chen, Jun Huang, Lianwen Jin:
Scale-Aware Modulation Meet Transformer. 5992-6003 - Wenze Liu, Hao Lu, Hongtao Fu, Zhiguo Cao:
Learning to Upsample by Learning to Sample. 6004-6014 - Yansong Peng, Yueyi Zhang, Zhiwei Xiong, Xiaoyan Sun, Feng Wu:
GET: Group Event Transformer for Event-Based Vision. 6015-6025 - Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Zheng-Jun Zha, Yan Lu, Baining Guo:
Adaptive Frequency Filters As Efficient Global Token Mixers. 6026-6036 - Haokui Zhang, Wenze Hu, Xiaoyu Wang:
Fcaformer: Forward Cross Attention in Hybrid Vision Transformer. 6037-6046 - Yaolei Qi, Yuting He, Xiaoming Qi, Yuan Zhang, Guanyu Yang:
Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation. 6047-6056 - Seyedalireza Khoshsirat, Chandra Kambhamettu:
Sentence Attention Blocks for Answer Grounding. 6057-6067 - Quang Hieu Vo, Linh-Tam Tran, Sung-Ho Bae, Lok-Won Kim, Choong Seon Hong:
MST-compression: Compressing and Accelerating Binary Neural Networks with Minimum Spanning Tree. 6068-6077 - Ilwi Yun, Chanyong Shin, Hyunku Lee, Hyuk-Jae Lee, Chae-Eun Rhee:
EGformer: Equirectangular Geometry-biased Transformer for 360 Depth Estimation. 6078-6089 - Guhnoo Yun, Juhan Yoo, Kijung Kim, Jeongho Lee, Dong Hwan Kim:
SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation. 6090-6101 - Jie Song, Zhengqi Xu, Sai Wu, Gang Chen, Mingli Song:
ModelGiF: Gradient Fields for Model Functional Distance. 6102-6112 - Gustavo Adolfo Vargas Hakim, David Osowiechi, Mehrdad Noori, Milad Cheraghalikhani, Ali Bahri, Ismail Ben Ayed, Christian Desrosiers:
ClusT3: Information Invariant Test-Time Training. 6113-6122 - Borui Zhao, Renjie Song, Jiajun Liang:
Cumulative Spatial Knowledge Distillation for Vision Transformers. 6123-6132 - Jong-Hyeon Baek, Daehyun Kim, Su-Min Choi, Hyo-Jun Lee, Hanul Kim, Yeong Jun Koh:
Luminance-aware Color Transform for Multiple Exposure Correction. 6133-6142 - Qingyan Meng, Mingqing Xiao, Shen Yan, Yisen Wang, Zhouchen Lin, Zhi-Quan Luo:
Towards Memory- and Time-Efficient Backpropagation for Training Spiking Neural Networks. 6143-6153 - Mateusz Michalkiewicz, Masoud Faraki, Xiang Yu, Manmohan Chandraker, Mahsa Baktashmotlagh:
Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters. 6154-6165 - Borui Zhao, Quan Cui, Renjie Song, Jiajun Liang:
DOT: A Distillation-Oriented Trainer. 6166-6175 - Yuhong Li, Jiajie Li, Cong Hao, Pan Li, Jinjun Xiong, Deming Chen:
Extensible and Efficient Proxy for Neural Architecture Search. 6176-6187 - Utkarsh Singhal, Carlos Esteves, Ameesh Makadia, Stella X. Yu:
Learning to Transform for Generalizable Instance-wise Invariance. 6188-6198 - Alexandre Kirchmeyer, Jia Deng:
Convolutional Networks with Oriented 1D Kernels. 6199-6209 - Yanghao Wang, Zhongqi Yue, Xian-Sheng Hua, Hanwang Zhang:
Random Boxes Are Open-world Object Detectors. 6210-6220 - Yuxin Fang, Shusheng Yang, Shijie Wang, Yixiao Ge, Ying Shan, Xinggang Wang:
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection. 6221-6230 - Qiming Xia, Jinhao Deng, Chenglu Wen, Hai Wu, Shaoshuai Shi, Xin Li, Cheng Wang:
CoIn: Contrastive Instance Feature Mining for Outdoor 3D Object Detection with Very Limited Annotations. 6231-6240 - Minying Zhang, Tianpeng Bu, Lulu Hu:
A Dynamic Dual-Processing Object Detection Framework Inspired by the Brain's Recognition Mechanism. 6241-6251 - Yilong Lv, Min Li, Yujie He, Zhuzhen He, Shaopeng Li, Aitao Yang:
Anchor-Intermediate Detector: Decoupling and Coupling Bounding Boxes for Accurate Object Detection. 6252-6261 - Declan McIntosh, Alexandra Branzan Albu:
Inter-Realization Channels: Unsupervised Anomaly Detection Beyond One-Class Classification. 6262-6272 - Shuai Wang, Yao Teng, Limin Wang:
Deep Equilibrium Object Detection. 6273-6283 - Jing Zhao, Li Sun, Qingli Li:
RecursiveDet: End-to-End Region-based Recursive Object Detection. 6284-6293 - Xiang Yuan, Gong Cheng, Kebing Yan, Qinghua Zeng, Junwei Han:
Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning. 6294-6304 - Shenghao Fu, Junkai Yan, Yipeng Gao, Xiaohua Xie, Wei-Shi Zheng:
ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation. 6305-6315 - Xiaofeng Mao, Yuefeng Chen, Yao Zhu, Da Chen, Hang Su, Rong Zhang, Hui Xue:
COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts. 6316-6327 - Yuzhong Zhao, Qixiang Ye, Weijia Wu, Chunhua Shen, Fang Wan:
Generative Prompt Model for Weakly Supervised Object Localization. 6328-6338 - Shanshan Lao, Guanglu Song, Boxiao Liu, Yu Liu, Yujiu Yang:
UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors. 6339-6349 - Jaehyeok Bae, Jae-Han Lee, Seyun Kim:
PNI: Industrial Anomaly Detection using Position and Neighborhood Information. 6350-6360 - Shanshan Lao, Guanglu Song, Boxiao Liu, Yu Liu, Yujiu Yang:
Masked Autoencoders Are Stronger Knowledge Distillers. 6361-6370 - Ziyu Li, Jingming Guo, Tongtong Cao, Bingbing Liu, Wankou Yang:
GPA-3D: Geometry-aware Prototype Alignment for Unsupervised Domain Adaptive 3D Object Detection from Point Clouds. 6371-6380 - Lingyu Xiao, Xiang Li, Sen Yang, Wankou Yang:
ADNet: Lane Shape Prediction via Anchor Decomposition. 6381-6390 - Qipeng Liu, Luojun Lin, Zhifeng Shen, Zhifeng Yang:
Periodically Exchange Teacher-Student for Source-Free Object Detection. 6391-6401 - Xinzhu Ma, Yongtao Wang, Yinmin Zhang, Zhiyi Xia, Yuan Meng, Zhihui Wang, Haojie Li, Wanli Ouyang:
Towards Fair and Comprehensive Comparisons for Image-Based 3D Object Detection. 6402-6412 - Xianpeng Liu, Ce Zheng, Kelvin Cheng, Nan Xue, Guo-Jun Qi, Tianfu Wu:
Monocular 3D Object Detection with Bounding Box Denoising in 3D by Perceiver. 6413-6423 - Hewei Guo, Liping Ren, Jingjing Fu, Yuwang Wang, Zhizheng Zhang, Cuiling Lan, Haoqian Wang, Xinwen Hou:
Template-guided Hierarchical Feature Restoration for Anomaly Detection. 6424-6435 - Yuting Wang, Velibor Ilic, Jiatong Li, Branislav Kisacanin, Vladimir Pavlovic:
ALWOD: Active Learning for Weakly-Supervised Object Detection. 6436-6446 - Hansol Kim, Youngjun Kwak, Minyoung Jung, Jinho Shin, Youngsung Kim, Changick Kim:
ProtoFL: Unsupervised Federated Learning via Prototypical Distillation. 6447-6456 - Ting Lei, Fabian Caba, Qingchao Chen, Hailin Jin, Yuxin Peng, Yang Liu:
Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory. 6457-6467 - Shilong Liu, Tianhe Ren, Jiayu Chen, Zhaoyang Zeng, Hao Zhang, Feng Li, Hongyang Li, Jun Huang, Hang Su, Jun Zhu, Lei Zhang:
Detection Transformer with Stable Matching. 6468-6477 - Liangqi Li, Jiaxu Miao, Dahu Shi, Wenming Tan, Ye Ren, Yi Yang, Shiliang Pu:
Distilling DETR with Visual-Linguistic Knowledge for Open-Vocabulary Object Detection. 6478-6487 - Tri Cao, Jiawen Zhu, Guansong Pang:
Anomaly Detection under Distribution Shift. 6488-6500 - Aritra Bhowmik, Yu Wang, Nora Baka, Martin R. Oswald, Cees G. M. Snoek:
Detecting Objects with Context-Likelihood Graphs and Graph Refinement. 6501-6510 - Yeonghwan Song, Seokwoo Jang, Dina Katabi, Jeany Son:
Unsupervised Object Localization with Representer Point Selection. 6511-6521 - Yutong Lin, Yuhui Yuan, Zheng Zhang, Chen Li, Nanning Zheng, Han Hu:
DETR Does Not Need Multi-Scale or Locality Design. 6522-6531 - Qiaoyi Su, Yuhong Chou, Yifan Hu, Jianing Li, Shijie Mei, Ziyang Zhang, Guoqi Li:
Deep Directly-Trained Spiking Neural Networks for Object Detection. 6532-6542 - David Schinagl, Georg Krispel, Christian Fruhwirth-Reisinger, Horst Possegger, Horst Bischof:
GACE: Geometry Aware Confidence Enhancement for Black-box 3D Object Detectors on LiDAR-Data. 6543-6553 - Yao Teng, Haisong Liu, Sheng Guo, Limin Wang:
StageInteractor: Query-based Object Detector with Cross-stage Interaction. 6554-6565 - Yifan Pu, Yiru Wang, Zhuofan Xia, Yizeng Han, Yulin Wang, Weihao Gan, Zidong Wang, Shiji Song, Gao Huang:
Adaptive Rotated Convolution for Rotated Object Detection. 6566-6577 - Manyuan Zhang, Guanglu Song, Yu Liu, Hongsheng Li:
Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection. 6578-6587 - Jiannan Wu, Yi Jiang, Bin Yan, Huchuan Lu, Zehuan Yuan, Ping Luo:
Exploring Transformers for Open-world Instance Segmentation. 6588-6598 - Xiaojun Tang, Junsong Fan, Chuanchen Luo, Zhaoxiang Zhang, Man Zhang, Zongyuan Yang:
DDG-Net: Discriminability-Driven Graph Network for Weakly-supervised Temporal Action Localization. 6599-6609 - Qiang Chen, Xiaokang Chen, Jian Wang, Shan Zhang, Kun Yao, Haocheng Feng, Junyu Han, Errui Ding, Gang Zeng, Jingdong Wang:
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment. 6610-6619 - Zhiwei Chen, Jinren Ding, Liujuan Cao, Yunhang Shen, Shengchuan Zhang, Guannan Jiang, Rongrong Ji:
Category-aware Allocation Transformer for Weakly Supervised Object Localization. 6620-6629 - Zhuangzhuang Chen, Jin Zhang, Zhuonan Lai, Guanming Zhu, Zun Liu, Jie Chen, Jianqiang Li:
The Devil is in the Crack Orientation: A New Perspective for Crack Detection. 6630-6640 - Yu Pei, Xian Zhao, Hao Li, Jingyuan Ma, Jingwei Zhang, Shiliang Pu:
Clusterformer: Cluster-based Transformer for 3D Object Detection in Point Clouds. 6641-6650 - Dehua Zheng, Wenhui Dong, Hailin Hu, Xinghao Chen, Yunhe Wang:
Less is More: Focus Attention for Efficient DETR. 6651-6660 - Hongyang Li, Hao Zhang, Zhaoyang Zeng, Shilong Liu, Feng Li, Tianhe Ren, Lei Zhang:
DFA3D: 3D Deformable Attention For 2D-to-3D Feature Lifting. 6661-6670 - Ke Zhu, Minghao Fu, Jianxin Wu:
Multi-Label Self-Supervised Learning with Scene Images. 6671-6680 - Mingqiao Ye, Lei Ke, Siyuan Li, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu:
Cascade-DETR: Delving into High-Quality Universal Object Detection. 6681-6691 - Yanjing Li, Sheng Xu, Mingbao Lin, Jihao Yin, Baochang Zhang, Xianbin Cao:
Representation Disparity-aware Distillation for 3D Object Detection. 6692-6701 - Khurram Azeem Hashmi, Goutham Kallempudi, Didier Stricker, Muhammad Zeshan Afzal:
FeatEnHancer: Enhancing Hierarchical Features for Object Detection and Beyond Under Low-Light Vision. 6702-6712 - Tao Ma, Xuemeng Yang, Hongbin Zhou, Xin Li, Botian Shi, Junjie Liu, Yuchen Yang, Zhizheng Liu, Liang He, Yu Qiao, Yikang Li, Hongsheng Li:
DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds. 6713-6724 - Zhuofan Zong, Guanglu Song, Yu Liu:
DETRs with Collaborative Hybrid Assignments Training. 6725-6735 - Jiong Wang, Huiming Zhang, Haiwen Hong, Xuan Jin, Yuan He, Hui Xue, Zhou Zhao:
Open-Vocabulary Object Detection With an Open Corpus. 6736-6746 - Saksham Suri, Sai Saketh Rambhatla, Rama Chellappa, Abhinav Shrivastava:
SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining. 6747-6758 - Xinyi Zhang, Naiqi Li, Jiawei Li, Tao Dai, Yong Jiang, Shu-Tao Xia:
Unsupervised Surface Anomaly Detection with Diffusion Probabilistic Model. 6759-6768 - Haiyang Wang, Hao Tang, Shaoshuai Shi, Aoxue Li, Zhenguo Li, Bernt Schiele, Liwei Wang:
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation. 6769-6779 - Xincheng Yao, Ruoqi Li, Zefeng Qian, Yan Luo, Chongyang Zhang:
Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image Anomaly Detection. 6780-6790 - Junkai Xu, Liang Peng, Haoran Chen, Hao Li, Wei Qian, Ke Li, Wenxiao Wang, Deng Cai:
MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection. 6791-6801 - Feng Liu, Xiaosong Zhang, Zhiliang Peng, Zonghao Guo, Fang Wan, Xiangyang Ji, Qixiang Ye:
Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection. 6802-6811 - Ziye Chen, Yu Liu, Mingming Gong, Bo Du, Guoqi Qian, Kate Smith-Miles:
Generating Dynamic Kernels via Transformers for Lane Detection. 6812-6821 - Lu Zhang, Chenbo Zhang, Jiajia Zhao, Jihong Guan, Shuigeng Zhou:
Meta-ZSDETR: Zero-shot DETR with Meta-learning. 6822-6831 - Di Wu, Pengfei Chen, Xuehui Yu, Guorong Li, Zhenjun Han, Jianbin Jiao:
Spatial Self-Distillation for Object Detection with Inaccurate Bounding Boxes. 6832-6842 - Ming Li, Jie Wu, Xionghui Wang, Chen Chen, Jie Qin, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan:
AlignDet: Aligning Pre-training and Fine-tuning in Object Detection. 6843-6853 - Zhengzhong Tu, Peyman Milanfar, Hossein Talebi:
MULLER: Multilayer Laplacian Resizer for Vision. 6854-6864 - Guodong Wang, Yunhong Wang, Jie Qin, Dongming Zhang, Xiuguo Bao, Di Huang:
Unilaterally Aggregated Contrastive Learning with Hierarchical Augmentation for Anomaly Detection. 6865-6874