


ICMR 2023: Thessaloniki, Greece
- Ioannis Kompatsiaris, Jiebo Luo, Nicu Sebe, Angela Yao, Vasileios Mezaris, Symeon Papadopoulos, Adrian Popescu, Zi Helen Huang:
Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, ICMR 2023, Thessaloniki, Greece, June 12-15, 2023. ACM 2023
Regular Long Papers
- Nitish Nag, Hyungik Oh, Mengfan Tang, Mingshu Shi, Ramesh C. Jain:
Integrative Multi-Modal Computing for Personal Health Navigation. 1-9
- Hugo Schindler, Adrian Popescu, Van-Khoa Nguyen, Jerome Deshayes-Chossart:
Raising User Awareness about the Consequences of Online Photo Sharing. 10-19
- Sven Schultze, Ani Withöft, Larbi Abdenebaoui, Susanne Boll:
Explaining Image Aesthetics Assessment: An Interactive Approach. 20-28
- Omar Adjali, Paul Grimal, Olivier Ferret, Sahar Ghannay, Hervé Le Borgne:
Explicit Knowledge Integration for Knowledge-Aware Visual Question Answering about Named Entities. 29-38
- Shuo Chen, Ying-Jun Du, Pascal Mettes, Cees G. M. Snoek:
Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation. 39-47
- Ying He, Gongqing Wu, Desheng Cai, Xuegang Hu:
Cross-View Sample-Enriched Graph Contrastive Learning Network for Personalized Micro-video Recommendation. 48-56
- Konstantin Schall, Kai Uwe Barthel, Nico Hezel, Klaus Jung:
Improving Image Encoders for General-Purpose Nearest Neighbor Search and Classification. 57-66
- Giacomo Nebbia, Adriana Kovashka:
Hypernymization of named entity-rich captions for grounding-based multi-modal pretraining. 67-75
- Yizhao Gao, Zhiwu Lu:
CMMT: Cross-Modal Meta-Transformer for Video-Text Retrieval. 76-84
- Jiazhi Guan, Hang Zhou, Zhizhi Guo, Tianshu Hu, Lirui Deng, Chengbin Quan, Meng Fang, Youjian Zhao:
Dual-Modality Co-Learning for Unveiling Deepfake in Spatio-Temporal Space. 85-94
- Jiaxin Deng, Dong Shen, Haojie Pan, Xiangyu Wu, Ximan Liu, Gaofeng Meng, Fan Yang, Tingting Gao, Ruiji Fu, Zhongyuan Wang:
A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset. 95-104
- Chiyu Zhang, Zaiyan Dai, Peng Cao, Jun Yang:
Edge Enhanced Image Style Transfer via Transformers. 105-114
- Juheon Hwang, Jiwoo Kang, Kyoungoh Lee, Sanghoon Lee:
Unlocking Potential of 3D-aware GAN for More Expressive Face Generation. 115-124
- Yuze Wang, Junyi Wang, Yansong Qu, Yue Qi:
RIP-NeRF: Learning Rotation-Invariant Point-based Neural Radiance Field for Fine-grained Editing and Compositing. 125-134
- Tiancong Cheng, Ying Zhang, Yifang Yin, Roger Zimmermann, Zhiwen Yu, Bin Guo:
A Multi-Teacher Assisted Knowledge Distillation Approach for Enhanced Face Image Authentication. 135-143
- Ying Zhang, Lilei Zheng, Vrizlynn L. L. Thing, Roger Zimmermann, Bin Guo, Zhiwen Yu:
FaceLivePlus: A Unified System for Face Liveness Detection and Face Verification. 144-152
- Bing Han, Jianshu Li, Wenqi Ren, Man Luo, Jian Liu, Xiaochun Cao:
SIGMA-DF: Single-Side Guided Meta-Learning for Deepfake Detection. 153-161
- Yizhe Zhu, Jialin Gao, Xi Zhou:
AVForensics: Audio-driven Deepfake Video Detection with Masking Strategy in Self-supervision. 162-171
- Marco Arazzi, Marco Cotogni, Antonino Nocera, Luca Virgili:
Predicting Tweet Engagement with Graph Neural Networks. 172-180
- Peiwang Tang, Qinghua Zhang, Xianchao Zhang:
A Recurrent Neural Network based Generative Adversarial Network for Long Multivariate Time Series Forecasting. 181-189
- Victoria Sherratt, Kevin Pimbblet, Nina Dethlefs:
Multi-channel Convolutional Neural Network for Precise Meme Classification. 190-198
- Yankun Wu, Yuta Nakashima, Noa Garcia:
Not Only Generative Art: Stable Diffusion for Content-Style Disentanglement in Art Analysis. 199-208
- Wen-Jiin Tsai, Yi-Cheng Tien:
Attention-based Video Virtual Try-On. 209-216
- Soyun Choi, Youjia Zhang, Sungeun Hong:
Intra-inter Modal Attention Blocks for RGB-D Semantic Segmentation. 217-225
- Cheng-Yu Fang, Xian-Feng Han:
Joint Geometric-Semantic Driven Character Line Drawing Generation. 226-233
- Zeqing Xia, Zhouhui Lian:
CurveSDF: Binary Image Vectorization Using Signed Distance Fields. 234-242
- Yusong Wang, Dongyuan Li, Kotaro Funakoshi, Manabu Okumura:
EMP: Emotion-guided Multi-modal Fusion and Contrastive Learning for Personality Traits Recognition. 243-252
- Zefan Zhang, Yi Ji, Chunping Liu:
Knowledge-Aware Causal Inference Network for Visual Dialog. 253-261
- Chun Zhang, Keyan Ren, Qingyun Bian, Yu Shi:
Less is More: Decoupled High-Semantic Encoding for Action Recognition. 262-271
- Ziwei Xiong, Han Wang:
Dual-Stream Multimodal Learning for Topic-Adaptive Video Highlight Detection. 272-279
- Ruilin Zhang, Haiyang Zheng, Hongpeng Wang:
TDEC: Deep Embedded Image Clustering with Transformer and Distribution Information. 280-288
- Beibei Zhang, Yaqun Fang, Fan Yu, Jia Bei, Tongwei Ren:
MMSF: A Multimodal Sentiment-Fused Method to Recognize Video Speaking Style. 289-297
- Guoxing Yang, Haoyu Lu, Zelong Sun, Zhiwu Lu:
Shot Retrieval and Assembly with Text Script for Video Montage Generation. 298-306
- Shenshen Li, Xing Xu, Fumin Shen, Yang Yang:
Multi-granularity Separation Network for Text-Based Person Retrieval with Bidirectional Refinement Regularization. 307-315
- Tiening Sun, Zhong Qian, Peifeng Li, Qiaoming Zhu:
Graph Interactive Network with Adaptive Gradient for Multi-Modal Rumor Detection. 316-324
- Harsh Sinha, Adriana Kovashka:
Towards Shape-regularized Learning for Mitigating Texture Bias in CNNs. 325-334
- Mingqi Chen, Feng Shuang, Shaodong Li, Xi Liu:
ASCS-Reinforcement Learning: A Cascaded Framework for Accurate 3D Hand Pose Estimation. 335-342
- Yangming Zhou, Yuzhou Yang, Qichao Ying, Zhenxing Qian, Xinpeng Zhang:
Multi-modal Fake News Detection on Social Media via Multi-grained Information Fusion. 343-352
- Mingjun Li, Shuo Xu, Feng Su:
Learning and Fusing Multi-Scale Representations for Accurate Arbitrary-Shaped Scene Text Recognition. 353-361
- Chunhong Cao, Huawei Fu, Gai Li, Mengyang Wang, Xieping Gao:
Modeling Functional Brain Networks with Multi-Head Attention-based Region-Enhancement for ADHD Classification. 362-369
- Chunhong Cao, Gai Li, Huawei Fu, Xingxing Li, Xieping Gao:
SPAE: Spatial Preservation-based Autoencoder for ADHD functional brain networks modelling. 370-377
- Bingchao Wu, Yangyuxuan Kang, Bei Guan, Yongji Wang:
We Are Not So Similar: Alleviating User Representation Collapse in Social Recommendation. 378-387
- Pengzhi Li, Yikang Ding, Linge Li, Jingwei Guan, Zhiheng Li:
Towards Practical Consistent Video Depth Estimation. 388-397
- Jiancheng Pan, Qing Ma, Cong Bai:
Reducing Semantic Confusion: Scene-aware Aggregation Network for Remote Sensing Cross-modal Retrieval. 398-406
- Jialin Tian, Xing Xu, Zuo Cao, Gong Zhang, Fumin Shen, Yang Yang:
Zero-shot Sketch-based Image Retrieval with Adaptive Balanced Discriminability and Generalizability. 407-415
- Liang Li, Weiwei Sun:
Label-wise Deep Semantic-Alignment Hashing for Cross-Modal Retrieval. 416-424
- Ying Li, Chunming Guan, Jiaquan Gao:
TsP-Tran: Two-Stage Pure Transformer for Multi-Label Image Retrieval. 425-433
- Maria Pegia, Björn Þór Jónsson, Anastasia Moumtzidou, Ilias Gialampoukidis, Stefanos Vrochidis, Ioannis Kompatsiaris:
MuseHash: Supervised Bayesian Hashing for Multimodal Image Representation. 434-442
- Siteng Huang, Qiyao Wei, Donglin Wang:
Reference-Limited Compositional Zero-Shot Learning. 443-451
- Haram Choi, Cheolwoong Na, Jinseop Kim, Jihoon Yang:
Exploration of Lightweight Single Image Denoising with Transformers and Truly Fair Training. 452-461
- Feng Zhao, Min Zhang, Tiancheng Huang, Donglin Wang:
TAGM: Task-Aware Graph Model for Few-shot Node Classification. 462-471
- Yutian Luo, Yizhao Gao, Zhiwu Lu:
Learning with Adaptive Knowledge for Continual Image-Text Modeling. 472-480
- Wenxiu Geng, Xiangxian Li, Yulong Bian:
A Dual-branch Enhanced Multi-task Learning Network for Multimodal Sentiment Analysis. 481-489
- Yu Zang, Zhe Xue, Shilong Ou, Yunfei Long, Hai Zhou, Junping Du:
FedPcf: An Integrated Federated Learning Framework with Multi-Level Prospective Correction Factor. 490-498
- Lina Sun, Yewen Li, Yumin Dong:
Learning From Expert: Vision-Language Knowledge Distillation for Unsupervised Cross-Modal Hashing Retrieval. 499-507
- Yaoqing Li, Sheng-Hua Zhong, Shuai Li, Yan Liu:
A Robust Deep Learning Enhanced Monocular SLAM System for Dynamic Environments. 508-515
- Yingnan Fu, Wenyuan Cai, Ming Gao, Aoying Zhou:
Symbol Location-Aware Network for Improving Handwritten Mathematical Expression Recognition. 516-524
Regular Short Papers
- Daichi Suzuki, Go Irie, Kiyoharu Aizawa:
Text-to-Image Fashion Retrieval with Fabric Textures. 525-529
- Panagiota Alexoudi, Ioannis Mademlis, Ioannis Pitas:
Escaping local minima in deep reinforcement learning for video summarization. 530-534
- Florian Spiess, Ralph Gasser, Silvan Heller, Heiko Schuldt, Luca Rossetto:
A Comparison of Video Browsing Performance between Desktop and Virtual Reality Interfaces. 535-539
- Zhexu Shen, Liang Yang, Zhihan Yang, Hongfei Lin:
More Than Simply Masking: Exploring Pre-training Strategies for Symbolic Music Understanding. 540-544
- Pu Ching, Hung-Kuo Chu, Min-Chun Hu:
SOFA: Style-based One-shot 3D Facial Animation Driven by 2D landmarks. 545-549
- Kun He, Changyu Li, Jie Shao:
Strong-Weak Cross-View Interaction Network for Stereo Image Super-Resolution. 550-554
- Jiabao Sheng, Saikit Lam, Zhe Li, Jiang Zhang, Xinzhi Teng, Yuanpeng Zhang, Jing Cai:
Multi-view Contrastive Learning with Additive Margin for Adaptive Nasopharyngeal Carcinoma Radiotherapy Prediction. 555-559
- Shuiying Liao, Yujuan Ding, P. Y. Mok:
Recommendation of Mix-and-Match Clothing by Modeling Indirect Personal Compatibility. 560-564
- Arun Zachariah, Praveen Rao:
Video Retrieval for Everyday Scenes With Common Objects. 565-570
- Tse-Yu Pan, Herman Prawiro, Jian-Wei Peng, Wen-Cheng Chen, Hung-Kuo Chu, Min-Chun Hu:
Offensive Tactics Recognition in Broadcast Basketball Videos Based on 2D Camera View Player Heatmaps. 571-575
- Meishan Liu, Meng Jian, Ge Shi, Ye Xiang, Lifang Wu:
Graph Contrastive Learning on Complementary Embedding for Recommendation. 576-580
- Sahar Tahmasebi, Sherzod Hakimov, Ralph Ewerth, Eric Müller-Budack:
Improving Generalization for Multimodal Fake News Detection. 581-585
- Christos Koutlis, Manos Schinas, Symeon Papadopoulos:
MemeFier: Dual-stage Modality Fusion for Image Meme Classification. 586-591
- Aristotelis Ballas, Christos Diou:
CNNs with Multi-Level Attention for Domain Generalization. 592-596
- Werner Bailer, Rahel Arnold, Vera Benz, Davide Coccomini, Anastasios Gkagkas, Gylfi Þór Guðmundsson, Silvan Heller, Björn Þór Jónsson, Jakub Lokoc, Nicola Messina, Nick Pantelidis, Jiaxin Wu:
Improving Query and Assessment Quality in Text-Based Interactive Video Retrieval Evaluation. 597-601
- Iacopo Ghinassi, Lin Wang, Chris Newell, Matthew Purver:
Multimodal Topic Segmentation of Podcast Shows with Pre-trained Neural Encoders. 602-606
- Georgios Orfanidis, Konstantinos Ioannidis, Anastasios Tefas, Stefanos Vrochidis, Ioannis Kompatsiaris:
Tweaking EfficientDet for frugal training. 607-611
- Mingyuan Ge, Yewen Li, Longfei Ma, Mingyong Li:
Deep Enhanced-Similarity Attention Cross-modal Hashing Learning. 612-616
- Kai Feng, Tao Liu, Heng Zhang, Zihao Meng, Zemin Miao:
TNOD: Transformer Network with Object Detection for Tag Recommendation. 617-621
- Tianqi Zhao, Ming Kong, Tian Liang, Qiang Zhu, Kun Kuang, Fei Wu:
CLAP: Contrastive Language-Audio Pre-training Model for Multi-modal Sentiment Analysis. 622-626
Brave New Ideas Paper
- David Alonso del Barrio, Daniel Gatica-Perez:
Framing the News: From Human Perception to Large Language Model Inferences. 627-635
Doctoral Symposium Paper
- Shenshen Li:
Dual-Path Semantic Construction Network for Composed Query-Based Image Retrieval. 636-639
Reproducibility Track Paper
- Mitchell Lee, Chris Lee, Sanjay Penmetsa, Min Chen, Mizuki Miyashita, Naatosi Fish, Bo Wu, Omar Shahbaz Khan:
Reproducibility Companion Paper: MeTILDA - Platform for Melodic Transcription in Language Documentation and Application. 640-643
Technical Demonstrations
- Kento Terauchi, Keiji Yanai:
CalorieCam360: Simultaneous Eating Action Recognition of Multiple People Using an Omnidirectional Camera. 644-648
- Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Fabrizio Falchi, Claudio Gennaro, Nicola Messina, Lucia Vadicamo, Claudio Vairo:
VISIONE: A Large-Scale Video Retrieval System with Advanced Search Functionalities. 649-653
- Kai Uwe Barthel, Nico Hezel, Konstantin Schall, Klaus Jung:
navigu.net: NAvigation in Visual Image Graphs gets User-friendly. 654-658
- Manos Schinas, Panagiotis Galopoulos, Symeon Papadopoulos:
MAAM: Media Asset Annotation and Management. 659-663
- Stefanos Stoikos, David Kauchak, Douglas Turnbull, Alexandra Papoutsaki:
Cross-Language Music Recommendation Exploration. 664-668
Keynote Talk Abstracts
- Nozha Boujemaa, Abdelrahman Hassan, Giorgi Kokaia, Pratyush Kumar Sinha:
How Responsible LLMs are beneficial to search and exploration in Retail industry. 669
- Jürgen Gall:
Efficient CNNs and Transformers for Video Understanding and Image Synthesis. 670
- Elisa Ricci:
Recognizing Actions in Videos under Domain Shift. 671
Tutorial Abstract
- Kai Uwe Barthel:
Algorithms for Generating and Evaluating Visually Sorted Grid Layouts. 672-673
Workshop Abstracts
- Guillaume Habault, Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Yuta Nakashima, Cathal Gurrin:
ICDAR'23: Intelligent Cross-Data Analysis and Retrieval. 674-675
- Luca Cuccovillo, Bogdan Ionescu, Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Adrian Popescu:
MAD '23 Workshop: Multimedia AI against Disinformation. 676-677
- Cathal Gurrin, Björn Þór Jónsson, Duc-Tien Dang-Nguyen, Graham Healy, Jakub Lokoc, Liting Zhou, Luca Rossetto, Minh-Triet Tran, Wolfgang Hürst, Werner Bailer, Klaus Schoeffmann:
Introduction to the Sixth Annual Lifelog Search Challenge, LSC'23. 678-679
