


ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 21
Volume 21, Number 1, January 2025
- Bogdan Ionescu, Ioannis Patras, Henning Müller, Alberto Del Bimbo: Introduction to the Special Issue on Realistic Synthetic Data: Generation, Learning, Evaluation. 1:1-1:7
- Adam Westerski, Wee Teck Fong: Synthetic Data for Object Detection with Neural Networks: State-of-the-Art Survey of Domain Randomisation Techniques. 2:1-2:20
- Bruno Vaz, Álvaro Figueira: GANs in the Panorama of Synthetic Data Generation Methods. 3:1-3:28
- Azeez Idris, Mohammed Khaleel, Wallapak Tavanapong, Piet C. de Groen: Synthesized Image Training Techniques: On Improving Model Performance Using Confusion. 4:1-4:24
- Wenmiao Hu, Yifang Yin, Ying Kiat Tan, An Tran, Hannes Kruppa, Roger Zimmermann: GAN-Assisted Road Segmentation from Satellite Imagery. 5:1-5:29
- Fabio Hellmann, Silvan Mertes, Mohamed Benouis, Alexander Hustinx, Tzung-Chien Hsieh, Cristina Conati, Peter M. Krawitz, Elisabeth André: GANonymization: A GAN-Based Face Anonymization Framework for Preserving Emotional Expressions. 6:1-6:27
- Kaifeng Zou, Sylvain Faisan, Boyang Yu, Sébastien Valette, Hyewon Seo: 4D Facial Expression Diffusion Model. 7:1-7:23
- Anjali T, Masilamani V.: Text-Guided Synthesis of Masked Face Images. 8:1-8:14
- Xin Huang, Dong Liang, Hongrui Cai, Yunfeng Bai, Juyong Zhang, Feng Tian, Jinyuan Jia: Double Reference Guided Interactive 2D and 3D Caricature Generation. 9:1-9:21
- Chaitra Desai, Sujay Benur, Ujwala Patil, Uma Mudenagudi: RSUIGM: Realistic Synthetic Underwater Image Generation with Image Formation Model. 10:1-10:22
- Roberto Amoroso, Davide Morelli, Marcella Cornia, Lorenzo Baraldi, Alberto Del Bimbo, Rita Cucchiara: Parents and Children: Distinguishing Multimodal Deepfakes from Natural Images. 11:1-11:23
- Pedro Celard, Eva Lorenzo Iglesias, José Manuel Sorribes-Fdez, Lourdes Borrajo, Adrián Seara Vieira: New Metrics and Dataset for Biological Development Video Generation. 12:1-12:23
- Lysa Gramoli, Julien Cumin, Jérémy Lacoche, Anthony Foulonneau, Bruno Arnaldi, Valérie Gouranton: Generating and Evaluating Data of Daily Activities with an Autonomous Agent in a Virtual Smart Home. 13:1-13:25
- Louis Airale, Xavier Alameda-Pineda, Stéphane Lathuilière, Dominique Vaufreydaz: Autoregressive GAN for Semantic Unconditional Head Motion Generation. 14:1-14:14
- Kerim Hodzic, Mirsad Cosovic, Sasa Mrdovic, Jason J. Quinlan, Darijo Raca: DashReStreamer: Framework for Creation of Impaired Video Clips under Realistic Network Conditions. 15:1-15:26
- Mihai Gabriel Constantin, Dan-Cristian Stanciu, Liviu-Daniel Stefan, Mihai Dogariu, Dan Mihailescu, George Ciobanu, Matt Bergeron, Winston Liu, Konstantin Belov, Octavian Radu, Bogdan Ionescu: Exploring Generative Adversarial Networks for Augmenting Network Intrusion Detection Tasks. 16:1-16:19
- Jialin Yang, Chunyu Lin, Lang Nie, Zisen Kong, Jiapeng Wang, Yao Zhao: Toward Oriented Fisheye Object Detection: Dataset and Baseline. 17:1-17:19
- Enji Liang, Kuiyuan Zhang, Zhongyun Hua, Xiaohua Jia: Multi-Scale Feature Attention Fusion for Image Splicing Forgery Detection. 18:1-18:20
- Qingxin Sheng, Chong Fu, Zhaonan Lin, Junxin Chen, Xingwei Wang, Chiu-Wing Sham: Content-Aware Selective Encryption for H.265/HEVC Using Deep Hashing Network and Steganography. 19:1-19:22
- Xu Cheng, Zichun Wang, Yan Jiang, Xingyu Liu, Hao Yu, Jingang Shi, Zitong Yu: Dual-Path Imbalanced Feature Compensation Network for Visible-Infrared Person Re-Identification. 20:1-20:24
- Pan Liao, Feng Yang, Di Wu, Bo Liu, Xingle Zhang, Shangjun Zhou: Enhanced Multi-Object Tracking: Inferring Motion States of Tracked Objects. 21:1-21:25
- Hong Zhang, Jiaxu Wan, Jing Zhang, Ding Yuan, Xuliang Li, Yifan Yang: P2FTrack: Multi-Object Tracking with Motion Prior and Feature Posterior. 22:1-22:22
- Loris Sauter, Ralph Gasser, Heiko Schuldt, Abraham Bernstein, Luca Rossetto: Performance Evaluation in Multimedia Retrieval. 23:1-23:23
- Linhua Kong, Yiming Wang, Dongxia Chang, Yao Zhao: Temporal-Enhanced Radar and Camera Fusion for Object Detection. 24:1-24:16
- Yuxiao Huang, Zhicong Huang, Jingwen Zhao, Haifeng Hu, Dihu Chen: AMVFNet: Attentive Multi-View Fusion Network for 3D Object Detection. 25:1-25:18
- Chao Wang, Zhongyuan Wang, Ruimin Hu, Xiaochen Wang, Wen Zhou: Optimal Illumination Distance Metrics for Person Re-Identification in Complex Lighting Conditions. 26:1-26:18
- Tinghui Wu, Shuhe Zhang, Dihu Chen, Haifeng Hu: Text-and-Image Learning Transformer for Cross-Modal Person Re-Identification. 27:1-27:18
- Xichu Ma, Varun Sharma, Min-Yen Kan, Wee Sun Lee, Ye Wang: KeYric: Unsupervised Keywords Extraction and Expansion from Music for Coherent Lyrics Generation. 28:1-28:28
- Huazhong Zhao, Lei Qi, Xin Geng: CLIP-DFGS: A Hard Sample Mining Method for CLIP in Generalizable Person Re-Identification. 29:1-29:20
- Xingchen Li, Jun Xiao, Guikun Chen, Yinfu Feng, Yi Yang, An-An Liu, Long Chen: Decomposed Prototype Learning for Few-Shot Scene Graph Generation. 30:1-30:24
- Zan Chen, Tao Wang, Jun Li, Wenlong Guo, Yuanjing Feng, Xueming Qian, Xingsong Hou: Discard Significant Bits of Compressed Sensing: A Robust Image Coding for Resource-Limited Contexts. 31:1-31:25
- Chongyu Liu, Dezhi Peng, Yuliang Liu, Lianwen Jin: CTRNet++: Dual-Path Learning with Local-Global Context Modeling for Scene Text Removal. 32:1-32:22
- Kai Han, Jin Wang, Yunhui Shi, Hanqin Cai, Nam Ling, Baocai Yin: WTDUN: Wavelet Tree-Structured Sampling and Deep Unfolding Network for Image Compressed Sensing. 33:1-33:22
- Guilin Lan, Ye-Qian Du, Zhouwang Yang: Robust Multimodal Representation under Uncertain Missing Modalities. 34:1-34:23
- Nayoung Kim, Jung-Kyung Lee, Je-Won Kang: Reference-based In-loop Filter with Robust Neural Feature Transfer for Video Coding. 35:1-35:24
- Kehua Guo, Xuyang Tan, Xiangyuan Zhu, Shaojun Guo, Zhipeng Xi: ATMNet: Adaptive Texture Migration Network for Guided Depth Super-Resolution. 36:1-36:21
- Federico Becattini, Xiaolin Chen, Andrea Puccia, Haokun Wen, Xuemeng Song, Liqiang Nie, Alberto Del Bimbo: Interactive Garment Recommendation with User in the Loop. 37:1-37:21
- Nan Wang, Qi Wang: Dynamic Weighted Gating for Enhanced Cross-Modal Interaction in Multimodal Sentiment Analysis. 38:1-38:19
- Yangchun Zhu, Yufei Zheng, Jiawei Liu, Yao Li, Zhengjun Zha: Noise-Resistance Learning via Multi-Granularity Consistency for Unsupervised Domain Adaptive Person Re-Identification. 39:1-39:23
- Wei Ji, Li Li, Hao Fei, Xiangyan Liu, Xun Yang, Juncheng Li, Roger Zimmermann: Toward Complex-query Referring Image Segmentation: A Novel Benchmark. 40:1-40:18
Volume 21, Number 2, February 2025
- Yushu Zhang, William Puech, Anderson Rocha, Rongxing Lu, Stefano Cresci, Roberto Di Pietro: Introduction to the Special Issue on Security and Privacy of Avatar in Metaverse. 41:1-41:3
- Fan Wang, Zhangjie Fu, Xiang Zhang: A Self-Defense Copyright Protection Scheme for NFT Image Art Based on Information Embedding. 42:1-42:23
- Jinwei Wang, Haihua Wang, Jiawei Zhang, Hao Wu, Xiangyang Luo, Bin Ma: Invisible Adversarial Watermarking: A Novel Security Mechanism for Enhancing Copyright Protection. 43:1-43:22
- Rui Zhai, Rongrong Ni, Yang Yu, Yao Zhao: FaceDefend: Copyright Protection to Prevent Face Embezzle. 44:1-44:19
- Hanqing Zhao, Wenbo Zhou, Dongdong Chen, Weiming Zhang, Ying Guo, Zhen Cheng, Pengfei Yan, Nenghai Yu: Audio-Visual Contrastive Pre-train for Face Forgery Detection. 45:1-45:16
- Long Tang, Dengpan Ye, Zhenhao Lu, Yunming Zhang, Chuanxi Chen: Feature Extraction Matters More: An Effective and Efficient Universal Deepfake Disruptor. 46:1-46:22
- Jian Zhang, Jiangqun Ni, Fan Nie, Jiwu Huang: Domain-invariant and Patch-discriminative Feature Learning for General Deepfake Detection. 47:1-47:19
- Dengyong Zhang, Wenjie Zhu, Xin Liao, Feifan Qi, Gaobo Yang, Xiangling Ding: Spatiotemporal Inconsistency Learning and Interactive Fusion for Deepfake Video Detection. 48:1-48:24
- Rui Yang, Rushi Lan, Zhenrong Deng, Xiaonan Luo, Xiyan Sun: Deepfake Video Detection Using Facial Feature Points and Ch-Transformer. 49:1-49:22
- Jianheng Tang, Kejia Fan, Wenjie Yin, Shihao Yang, Yajiang Huang, Anfeng Liu, Naixue Xiong, Mianxiong Dong, Tian Wang, Shaobo Zhang: A Quality-Aware and Obfuscation-Based Data Collection Scheme for Cyber-Physical Metaverse Systems. 50:1-50:23
- Xiaoxuan Han, Songlin Yang, Wei Wang, Ziwen He, Jing Dong: Exploiting Backdoors of Face Synthesis Detection with Natural Triggers. 51:1-51:24
- Jiuzhen Zeng, Laurence T. Yang, Chao Wang, Junjie Su, Xianjun Deng: A New Tensor Summary Statistic for Real-Time Detection of Stealthy Anomaly in Avatar Interaction. 52:1-52:23
- Letian Sha, Xiao Chen, Fu Xiao, Zhong Wang, Zhangbo Long, Qianyu Fan, Jiankuo Dong: VRVul-Discovery: BiLSTM-based Vulnerability Discovery for Virtual Reality Devices in Metaverse. 53:1-53:19
- Gui Xiao, Zhen Ling, Qunqun Fan, Xiangyu Xu, Wenjia Wu, Ding Ding, Chen Chen, Xinwen Fu: Pivot: Panoramic-Image-Based VR User Authentication against Side-Channel Attacks. 54:1-54:19
- Yalin Song, Wenbin Jiang, Xiuli Chai, Zhihua Gan, Mengyuan Zhou, Lei Chen: Cross-Attention Based Two-Branch Networks for Document Image Forgery Localization in the Metaverse. 55:1-55:24
- Yuanman Li, Lanhao Ye, Haokun Cao, Wei Wang, Zhongyun Hua: Cascaded Adaptive Graph Representation Learning for Image Copy-Move Forgery Detection. 56:1-56:24
- Cong Hu, Xiao-Zhong Wei, Xiaojun Wu: DIRformer: A Novel Image Restoration Approach Based on U-shaped Transformer and Diffusion Models. 57:1-57:23
- Yuyu Xu, Pingping Zhang, Minghui Chen, Qiudan Zhang, Wenhui Wu, Yun Zhang, Xu Wang: RGB-D Data Compression via Bi-Directional Cross-Modal Prior Transfer and Enhanced Entropy Modeling. 58:1-58:17
- Jiayu Yang, Yongqi Zhai, Wei Jiang, Chunhui Yang, Feng Gao, Ronggang Wang: Adaptive Prediction Structure for Learned Video Compression. 59:1-59:23
- Yifan Wang, Liang Feng, Fenglin Cai, Lusi Li, Rui Wu, Jie Li: TEC-CNN: Toward Efficient Compressing of Convolutional Neural Nets with Low-rank Tensor Decomposition. 60:1-60:23
- Chong-Yang Xiang, Xiao Wu, Jun-Yan He, Zhaoquan Yuan, Tingquan He: Person in Uniforms Re-Identification. 61:1-61:23
- Xiyao Liu, Cundian Yang, Jianbiao He, Hui Fang, Gerald Schaefer, Jian Zhang, Yuesheng Zhu, Shichao Zhang: Attack-Defending Contrastive Learning for Volumetric Medical Image Zero-Watermarking. 62:1-62:23
- Anqi Cao, Zhijing Wan, Xiao Wang, Wei Liu, Wei Wang, Zheng Wang, Xin Xu: Diversity-Representativeness Replay and Knowledge Alignment for Lifelong Vehicle Re-identification. 63:1-63:20
- Xiaonuo Dongye, Haiyan Jiang, Dongdong Weng, Zhenliang Zhang: Demonstrative Learning for Human-Agent Knowledge Transfer. 64:1-64:24
- Chengxin Zhao, Hefei Ling, Jialie Shen, Han Fang, Sijing Xie, Yaokun Fang, Zongyi Li, Ping Li: GSyncCode: Geometry Synchronous Hidden Code for One-step Photography Decoding. 65:1-65:21
- Xiaolin Chen, Xuemeng Song, Jianhui Zuo, Yinwei Wei, Liqiang Nie, Tat-Seng Chua: Domain-aware Multimodal Dialog Systems with Distribution-based User Characteristic Modeling. 66:1-66:22
- Chenghao Li, Lei Qi, Xin Geng: A SAM-guided Two-stream Lightweight Model for Anomaly Detection. 67:1-67:23
- Ji-Yan Wu, Kasun Gamlath, Archan Misra: Pr-Ge-Ne: Efficient Encoding of Pervasive Video Sensing Streams by Pruned Generative Networks. 68:1-68:22
- Wei Ji, Li Li, Zheqi Lv, Wenqiao Zhang, Mengze Li, Zhen Wan, Wenqiang Lei, Roger Zimmermann: Backpropagation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration. 69:1-69:17
- Heqi Peng, Yunhong Wang, Ruijie Yang, Beichen Li, Rui Wang, Yuanfang Guo: AED-PADA: Improving Generalizability of Adversarial Example Detection via Principal Adversarial Domain Adaptation. 70:1-70:24
- Ning Xu, Xiaowen Wang, Jing Liu, Lanjun Wang, Xuanya Li, Mengxiao Zhu, Yongdong Zhang, An-An Liu: Model Can Be Subtle: Two Important Mechanisms for Social Media Popularity Prediction. 71:1-71:20
Volume 21, Number 3, March 2025
- Jiapeng Wang, Zening Lin, Dayi Huang, Longfei Xiong, Lianwen Jin: LiLTv2: Language-substitutable Layout-image Transformer for Visual Information Extraction. 72:1-72:27
- Yili Jin, Jiahao Li, Bin Li, Yan Lu: Neural Image Compression with Regional Decoding. 73:1-73:18
- Xiaotian Wu, Xinjie Feng, Bing Chen, Ching-Nung Yang, Qing-Yu Peng, Wei Qi Yan: EVCS-DAS: Evolving Visual Cryptography Schemes for Dynamic Access Structures. 74:1-74:27
- Mohamed Zakariya Talhaoui, Zhelong Wang, Mohamed Amine Midoun, Abdelkarim Smaili, Mekkaoui Djamel Eddine, Mourad Lablack, Ke Zhang: Vulnerability Detection and Improvements of an Image Cryptosystem for Real-Time Visual Protection. 75:1-75:23
- Kai Xu, Lichun Wang, Shuang Li, Tong Gao, Baocai Yin: Scene Adaptive Context Modeling and Balanced Relation Prediction for Scene Graph Generation. 76:1-76:19
- Khouloud Samrouth, Pia El Housseini, Olivier Déforges: Siamese Network-Based Detection of Deepfake Impersonation Attacks with a Person of Interest Approach. 77:1-77:23
- Yiping Yang, Baiyun Cui, Yingming Li: A Multimodal Hierarchical Attentional Ordering Network. 78:1-78:20
- Haoxian Ruan, Zhihua Xu, Zhijing Yang, Yongyi Lu, Jinghui Qin, Tianshui Chen: Learning Semantic-aware Representation in Visual-Language Models for Multi-label Recognition with Partial Labels. 79:1-79:19
- Kun Yan, Zied Bouraoui, Fangyun Wei, Chang Xu, Ping Wang, Shoaib Jameel, Steven Schockaert: Modeling Multi-modal Cross-interaction for Multi-label Few-shot Image Classification Based on Local Feature Selection. 80:1-80:28
- Yajie Liu, Pu Ge, Guodong Wang, Qingjie Liu, Di Huang: Multi-Grained Contrastive Learning for Text-Supervised Open-Vocabulary Semantic Segmentation. 81:1-81:21
- Yipei Chen, Hua Yuan, Baojun Ma, Limin Wang, Yu Qian: Beyond Songs: Analyzing User Sentiment through Music Playlists and Multimodal Data. 82:1-82:24
- Yuzhen Niu, Yeyuan Xu, Yuezhou Li, Jiabang Zhang, Yuzhong Chen: Skeleton-Boundary-Guided Network for Camouflaged Object Detection. 83:1-83:21
- Xiaofeng Zhang, Zishan Xu, Hao Tang, Chaochen Gu, Wei Chen, Abdulmotaleb El-Saddik: Wakeup-Darkness: When Multimodal Meets Unsupervised Low-Light Image Enhancement. 84:1-84:25
- Jiahang Tu, Wei Ji, Hanbin Zhao, Chao Zhang, Roger Zimmermann, Hui Qian: DriveDiTFit: Fine-tuning Diffusion Transformers for Autonomous Driving Data Generation. 85:1-85:29
- Yifan Jiao, Chenglong Cai, Bing-Kun Bao: Unified Text-Image Space Alignment with Cross-Modal Prompting in CLIP for UDA. 86:1-86:20
- Feifei Kou, Bingwei Wang, Hai-Sheng Li, Chuangying Zhu, Lei Shi, Jiwei Zhang, Limei Qi: Potential Features Fusion Network for Multimodal Fake News Detection. 87:1-87:24
- Shihao Zou, Yuanlu Xu, Nikolaos Sarafianos, Federica Bogo, Tony Tung, Weixin Si, Li Cheng: Generating High-Fidelity Clothed Human Dynamics with Temporal Diffusion. 88:1-88:21
- Jiaxin Chen, Xin Liao, Zhenxing Qian, Zheng Qin: PRest-Net: Multi-domain Probability Estimation Network for Robust Image Forgery Detection. 89:1-89:20
- Qiang Li, Di Liu, Guang Zu, Sen Li, Hui Sun, Jianzhong Wang: Multigranularity Feature Aggregation and Cross-level Boundary Modeling for Temporal Action Detection. 90:1-90:24
- Lin Huang, Chuan Qin, Guorui Feng, Xiangyang Luo, Xinpeng Zhang: New Framework of Robust Image Encryption. 91:1-91:22
- Jiayue Chen, Xiaomeng Wang, Tong Xu, Shiwei Wu: Towards Scene-Centric Multi-Level Interest Mining for Video Recommendation. 92:1-92:24
- Xiusheng Lu, Yanbin Hao, Lechao Cheng, Sicheng Zhao, Yutao Liu, Mingli Song: Mixed Attention and Channel Shift Transformer for Efficient Action Recognition. 93:1-93:20
- Haifeng Zhao, Chi Zhang, Deyin Liu, Lin Yuanbo Wu: Deformation Field Fusion for Medical Image Registration. 94:1-94:17
- Lisong Ou, Zhixin Li: Multi-modal Sarcasm Detection on Social Media via Multi-Granularity Information Fusion. 95:1-95:23
- Ao Fu, Jiaqi Zhao, Yong Zhou, Wen-Liang Du, Rui Yao, Abdulmotaleb El-Saddik: Similarity Regulation and Calibration Alignment for Weakly Supervised Text-Based Person Re-Identification. 96:1-96:19
- Shaojun Zhu, Bincheng Zhu, Kaikai Chi, Jiefan Qiu, Hailong Shi, Xingyu Gao: Maximizing Long-Term Task Completion Ratio of UAV-Enabled Wirelessly Powered MEC Systems. 97:1-97:25
- Xuanqing Cao, Wengang Zhou, Qi Sun, Weilun Wang, Li Li, Houqiang Li: DISA: Disentangled Dual-Branch Framework for Affordance-Aware Human Insertion. 98:1-98:18
- Marco Mameli, Marina Paolanti, Adriano Mancini, Primo Zingaretti, Roberto Pierdicca: RenderGAN: Enhancing Real-time Rendering Efficiency with Deep Learning. 99:1-99:22
- Lv Tang, Xinfeng Zhang, Li Zhang: UVC: A Unified Deep Video Compression Framework. 100:1-100:23
- Shen Wang, Yu Wang, Renjie Qiao, Kejun Wu, Chia-Wen Lin, Chengtao Cai: Multi-Scale Dynamic Fusion for Visible-Infrared Person Re-Identification. 101:1-101:24
- Yucheng Li, Siwang Zhou, Deyan Tang, Liubo Ouyang, Jia Liu: GFPNet: Generalizable Face Privacy Network with Dynamic Defense Training. 102:1-102:22
Volume 21, Number 4, April 2025
- Dan Guo, Troy McDaniel, Shuhui Wang, Meng Wang: Introduction to the Special Issue on Deep Learning for Robust Human Body Language Understanding. 103:1-103:7
- Jian Zhang, Kaihao He, Ting Yu, Jun Yu, Zhenming Yuan: Semi-Supervised RGB-D Hand Gesture Recognition via Mutual Learning of Self-Supervised Models. 104:1-104:20
- Shengeng Tang, Feng Xue, Jingjing Wu, Shuo Wang, Richang Hong: Gloss-driven Conditional Diffusion Models for Sign Language Production. 105:1-105:17
- Kaixin Chen, Lin Zhang, Zhong Wang, Shengjie Zhao, Yicong Zhou: Skeleton-Aware Graph-Based Adversarial Networks for Human Pose Estimation from Sparse IMUs. 106:1-106:22
- Zhewei Tu, Xiangbo Shu, Peng Huang, Rui Yan, Zhenxing Liu, Jiachao Zhang: Leveraging Frame- and Feature-level Progressive Augmentation for Semi-supervised Action Recognition. 107:1-107:21
- Linhua Xiang, Zengfu Wang: Joint Mixing Data Augmentation for Skeleton-Based Action Recognition. 108:1-108:24
- Zenan Shi, Wenyu Liu, Haipeng Chen: Face Reconstruction-Based Generalized Deepfake Detection Model with Residual Outlook Attention. 109:1-109:19
- Peng He, Jun Yu, Chengjie Ge, Ye Yu, Wei Xu, Lei Wang, Tianyu Liu, Zhen Kan: Domain-Separated Bottleneck Attention Fusion Framework for Multimodal Emotion Recognition. 110:1-110:21
- Yan Gan, Chenxue Yang, Mao Ye, Renjie Huang, Deqiang Ouyang: Generative Adversarial Networks with Learnable Auxiliary Module for Image Synthesis. 111:1-111:21
- Wei Liu, Xin Xu, Hua Chang, Xin Yuan, Zheng Wang: Mix-Modality Person Re-Identification: A New and Practical Paradigm. 112:1-112:21
- Nianzi Li, Guijuan Zhang, Ping Du, Dianjie Lu: GP-HSI: Human-Scene Interaction with Geometric and Physical Constraints. 113:1-113:22
- Enyuan Zhao, Ning Song, Ze Zhang, Jie Nie, Xinyue Liang, Zhiqiang Wei: Language-guided Bias Generation Contrastive Strategy for Visual Question Answering. 114:1-114:21
- Kun Wang, Jiuxin Cao, Jiawei Ge, Chang Liu, Bo Liu: Dual-Domain Triple Contrast for Cross-Dataset Skeleton-Based Action Recognition. 115:1-115:23
- Runing Li, Jiangyan Dai, Qibing Qin, Chengduan Wang, Huihui Zhang, Yugen Yi: Texture and Structure-Guided Dual-Attention Mechanism for Image Inpainting. 116:1-116:25
- Nana Zhang, Min Xiong, Dandan Zhu, Kun Zhu, Guangtao Zhai, Xiaokang Yang: Audio-Visual Saliency Prediction Model with Implicit Neural Representation. 117:1-117:23
- Zhenqiang Zhang, Kun Li, Shengeng Tang, Yanyan Wei, Fei Wang, Jinxing Zhou, Dan Guo: Temporal Boundary Awareness Network for Repetitive Action Counting. 118:1-118:22
- Zicheng Zhang, Yingjie Zhou, Chunyi Li, Wei Sun, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai: MM-PCQA+: Advancing Multi-Modal Learning for Point Cloud Quality Assessment. 119:1-119:22
- Xiao Cui, Qi Sun, Min Wang, Li Li, Wengang Zhou, Houqiang Li: LayoutEnc: Leveraging Enhanced Layout Representations for Transformer-based Complex Scene Synthesis. 120:1-120:21
- Chintha Sri Pothu Raju, Rabul Hussain Laskar, Zulfiqar Ali, Ghulam Muhammad: Attention-based Fusion for Stroke Lesion Segmentation on Computed Tomography Perfusion Data. 121:1-121:23
- Qianxing Li, Dehui Kong, Jinghua Li, Dongpan Chen, Baocai Yin: Multi-Anchor Offset Representation Based Coarse-to-Fine Diffusion Model for Human Pose Estimation. 122:1-122:21
- Wasim Ahmad, Yan-Tsung Peng, Yuan-Hao Chang, Gaddisa Olani Ganfure, Sarwar Khan: CapST: Leveraging Capsule Networks and Temporal Attention for Accurate Model Attribution in Deep-fake Videos. 123:1-123:23
- Zekun Sun, Na Ruan: GANK: Dynamic Geometric and Appearance Features for Efficient and Robust Detection of Face Forgery. 124:1-124:24
- Hancheng Zhu, Li Yan, Yong Zhou, Rui Yao, Zhiwen Shao, Jiaqi Zhao, Leida Li: Image Cropping with Content and Composition Attribute-aware Global Relation Reasoning. 125:1-125:19
- Wenying Wen, Yu Ye, Ziye Yuan, Baolin Qiu, Dingli Hua: LFIZW-GRHFMR: Robust Zero-Watermarking with GRHFMR for Light Field Image. 126:1-126:17
- Fan Chen, Lingfeng Qu, Hadi Amirpour, Christian Timmerer, Hongjie He: Counterfeiting Attacks on an RDH-EI Scheme Based on Block-Permutation and Co-XOR. 127:1-127:25
- Shangrong Yang, Chunyu Lin, Kang Liao, Yao Zhao: FishFormer: Annulus Slicing-based Transformer for Fisheye Rectification. 128:1-128:16
- Jiahui Wang, Qin Xu, Bo Jiang, Bin Luo: Transductive Few-shot Learning via Joint Message Passing and Prototype-based Soft-label Propagation. 129:1-129:21
- Jie Wang, Tingfa Xu, Liqiang Song, Lihe Ding, Hui Li, Peng Jiang, Yuqi Han, Jianan Li: PAPooling: Graph-based Position Adaptive Aggregation of Local Geometry in Point Clouds. 130:1-130:18
- Tao Song, Kunlin Yang, Fan Meng, Xin Li, Handan Sun, Chenglizhao Chen: Tropical Cyclone Image Super-Resolution via Multimodality Fusion. 131:1-131:22
- Qianjiang Hu, Wei Hu: Dynamic Point Cloud Denoising via Gradient Fields. 132:1-132:24
- Jiannan Huang, Mengxue Qu, Longfei Li, Yunchao Wei: AdGPT: Explore Meaningful Advertising with ChatGPT. 133:1-133:23
Volume 21, Number 5, May 2025
- Chao Wen, Chen Wei, Yuhua Qian, Xiaodan Song, Xuemei Xie: Prompt-Based Invertible Mapping Alignment for Unsupervised Domain Adaptation. 134:1-134:17
- Jiacheng Deng, Dengpan Ye, Jizhi Li, Ziyi Liu, Long Tang, Yunming Zhang: The Interpretable and Transferable Adversarial Attack against Synthetic Speech Detectors. 135:1-135:19
- Jiawei Ge, Jiuxin Cao, Xiangmei Chen, Xuelin Zhu, Weijia Liu, Chang Liu, Kun Wang, Bo Liu: Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking. 136:1-136:21
- Mengyu Shi, Miao Wang, Yujun Zhang: RePC: A Novel Neural Video Quality Enhancement System Framework for ABR Streaming of VBR-encoded Videos. 137:1-137:22
- Rinyoichi Takezoe, Hao Chen, Gang Shen, Xuefei Lv, Yaowei Wang, Shiliang Zhang, Xiaoyu Wang: Context-Assisted Active Learning for Weakly Supervised Person Search. 138:1-138:20
- Yang Wang, Yixing Zhang, Xudie Ren, Yuxin Deng: MoDA: Mixture of Domain Adapters for Parameter-efficient Generalizable Person Re-identification. 139:1-139:19
- Jiebin Yan, Ziwen Tan, Jiale Rao, Lei Wu, Yifan Zuo, Yuming Fang: Computational Analysis of Degradation Modeling in Blind Panoramic Image Quality Assessment. 140:1-140:23
- Yuchao Feng, Mengjie Qin, Jiawei Jiang, Jintao Lai, Jianwei Zheng: Axial-shunted Spatial-temporal Conversation for Change Detection. 141:1-141:21
- Wei Jiang, Jiayu Yang, Yongqi Zhai, Feng Gao, Ronggang Wang: MLIC++: Linear Complexity Multi-Reference Entropy Modeling for Learned Image Compression. 142:1-142:25
- Xingjie Zhuang, Fengling Zhou, Zhixin Li: Multi-Modal Sarcasm Detection via Knowledge-Aware Focused Graph Convolutional Networks. 143:1-143:22
- Xu Liu, Na Xia, Jinxing Zhou, Zhangbin Li, Dan Guo: Towards Energy-efficient Audio-visual Classification via Multimodal Interactive Spiking Neural Network. 144:1-144:24
- Jiebin Yan, Kangcheng Wu, Junjie Chen, Ziwen Tan, Yuming Fang, Weide Liu: Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Flexible and Effective Paradigm. 145:1-145:19
- Xuecheng Hua, Ke Cheng, Gege Zhu, Hu Lu, Yuanquan Wang, Shitong Wang: Local-Aware Residual Attention Vision Transformer for Visible-Infrared Person Re-Identification. 146:1-146:24
- Taotao Jing, Haifeng Xia, Hongfu Liu, Zhengming Ding: Interpretable Novel Target Discovery through Open-Set Domain Adaptation. 147:1-147:24
- Dengyong Zhang, Runqi Lou, Jiaxin Chen, Xiangling Ding, Xin Liao, Gaobo Yang: Video Frame Interpolation via Fast Bidirectional 3D Correlation Volume. 148:1-148:22
- Yan Wang, Hong Xie, Jinyang He, Xiaoyu Shi, Mingsheng Shang: Cross-Domain Semantic Transfer for Domain Generalization. 149:1-149:24
- Kang Lin, Wei Zhou, Zhijie Zheng, Dihu Chen, Tao Su: Temporal and Semantic Correlation Network for Weakly-Supervised Temporal Action Localization. 150:1-150:23
- Zhaoda Ye, Xiangteng He, Yuxin Peng: RaT2IGen: Relation-aware Text-to-image Generation via Learnable Prompt. 151:1-151:19
- Mohan Zhou, Yalong Bai, Qing Yang, Tiejun Zhao: StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models. 152:1-152:22
- Dongjian Yu, Weiqing Min, Xin Jin, Qian Jiang, Ying Jin, Shuqiang Jiang: Diverse and High-Quality Food Image Generation from Only Food Names. 153:1-153:22
Volume 21, Number 6, June 2025
- Wei-Yen Hsu, Yu-Chieh Chen: Multi-Attribute Feature-Aware Network for Facial Expression Recognition. 154:1-154:20
- Linlin Fan, Mingliang Zhou, Xuekai Wei, Yong Feng, Tao Xiang, Bin Fang, Zhaowei Shang, Fan Jia, Xu Zhuang, Huayan Pu, Jun Luo: Sparse Reduced-Rank Fully Connected Layers with Its Applications in Detection and Classification. 155:1-155:23
- Davoud Fani, Seyed Ali-Asghar Beheshti Shirazi, Mohammed Ghanbari, Esmatollah Rezaei: On Temporal Smoothness of Video Reconstruction Quality in the DCVS via Non-Uniform Sampling. 156:1-156:26
- I-Chun Huang, Yuang Shi, Yuan-Chun Sun, Wei Tsang Ooi, Chun-Ying Huang, Cheng-Hsin Hsu: Composing Error Concealment Pipelines for Dynamic 3D Point Cloud Streaming. 157:1-157:28
- Jie Li, Zhixia Zhao, Qiyue Li, Zhixin Li, Pengyuan Zhou, Zhi Liu, Hao Zhou, Zhu Li: VPFormer: Leveraging Transformer with Voxel Integration for Viewport Prediction in Volumetric Video. 158:1-158:27
- Nina Willis, Abraham Bernstein, Luca Rossetto: Effects of Human Cognition-Inspired Task Presentation on Interactive Video Retrieval. 159:1-159:25
- Donglin Zhang, Chang-Xing Li, Mengke Li, Zhikai Hu: Discrete Elective Hashing with Incomplete Labels for Efficient Cross-Modal Retrieval. 160:1-160:20
- Bowen Sun, Guo Lu, Shibao Zheng: DiFace: Cross-Modal Face Recognition through Controlled Diffusion. 161:1-161:22
- Jiajia Tang, Binbin Ni, Feiwei Zhou, Dongjun Liu, Yu Ding, Yong Peng, Andrzej Cichocki, Qibin Zhao, Wanzeng Kong: Fine-grained Semantic Disentanglement Network for Multimodal Sarcasm Analysis. 162:1-162:22
- Peng Ren, Yunfeng Bai, Xiaoheng Li, Jinyuan Jia: Semantic-driven Cross-space Graph Interaction Network for Fine-grained 3D Point Cloud Understanding. 163:1-163:21
- Claudio Rota, Marco Buzzelli, Simone Bianco, Raimondo Schettini: Scalable Residual Laplacian Network for HEVC-compressed Video Restoration. 164:1-164:22
- Shuo Wang, Jinda Lu, Huixia Ben, Yanbin Hao, Xingyu Gao, Meng Wang: Interventional Feature Generation for Few-shot Learning. 165:1-165:21
- Lisi Wei, Libo Zhao, Xiaoli Zhang: MAINet: Modality-Aware Interaction Network for Medical Image Fusion. 166:1-166:23
- Yuxuan Zhou, Mingyang Li, Jingze Tong, Linlin Li, Zhiwei Yang: SD-Meta: The Software-Defined Network of Human-Centric Metaverse for Multi-Lead or Multi-Media Data in Spread Spectrum Communications. 167:1-167:23
- Wazib Ansar, Saptarsi Goswami, Amlan Chakrabarti, Basabi Chakraborty: TexIm FAST: Text-to-Image Encoding for Semantic Similarity Evaluation of Disproportionate Sequences. 168:1-168:23
- Qianqian Du, Hui Yin, Lang Nie, Yanting Liu, Jin Wan: EnIter: Enhancing Iterative Multi-View Depth Estimation with Universal Contextual Hints. 169:1-169:23
- Tong Wu, Jinhua Zhu, Wengang Zhou, Houqiang Li: RESIST: Rationale-Enhanced and Reward Model-Based End-to-End Social Influence Dialogue System. 170:1-170:23
- Yongxin Wang, Feng Dong, Zhen-Duo Chen, Xin Luo, Xin-Shun Xu: Domain-Aware Semantic Alignment Hashing for Large-Scale Zero-Shot Image Retrieval. 171:1-171:20
- Jia Cui, Jinchen Shen, Jialin Wei, Shiyu Liu, Zhaojia Ye, Shijian Luo, Zhen Qin: Community Transferrable Representation Learning for Image Style Classification. 172:1-172:20
- Qian Yin, Xinfeng Zhang, Ruoke Yan, Yuhuai Zhang, Shanshe Wang, Siwei Ma: Joint Structure-Texture Scan-Order for Point Cloud Attribute Compression Using Affine Transformation. 173:1-173:21
- Yuanzhou Huang, Songwei Pei, Rui Zeng: DQFormer: Transformer with Decoupled Query Augmentations for End-to-End Multi-Object Tracking. 174:1-174:23
- Jiahao Lyu, Jin Wei, Gangyan Zeng, Zeng Li, Enze Xie, Wei Wang, Can Ma, Yu Zhou: TextBlockV2: Towards Precise-Detection-Free Scene Text Spotting with Pre-trained Language Model. 175:1-175:21
- Yang-Hao Zhou, Heyan Huang, Cunhan Guo, Rong-Cheng Tu, Zeyu Xiao, Bo Wang, Xian-Ling Mao: ALOHA: Adapting Local Spatio-Temporal Context to Enhance the Audio-Visual Semantic Segmentation. 176:1-176:23
- Bing Yang, Xueqin Xiang, Wanzeng Kong, Jianhai Zhang, Jinliang Yao: Hybrid Feature Integrated Transformer for 3D Hand Reconstruction from a Single RGB Image. 177:1-177:19
- Weizhi Xian, Junyi Wang, Xuekai Wei, Jielu Yan, Yueting Huang, Kunyin Guo, Weijia Jia, Mingliang Zhou: DTSD: A Dual Teacher-Student-Based Discrimination Model for Anomaly Detection. 178:1-178:21
- Jili Chen, Qionghao Huang, Changqin Huang, Xiaodi Huang: Actual Cause-Guided Adaptive Gradient Scaling for Balanced Multimodal Sentiment Analysis. 179:1-179:24
- Bing Fan, Feng Ding, Guopu Zhu, Jiwu Huang, Sam Kwong, Pradeep K. Atrey, Siwei Lyu: Generating Higher-Quality Anti-Forensics DeepFakes with Adversarial Sharpening Mask. 180:1-180:18
- Zhiyuan Liu, Qi Zou, Xixia Xu, Yanting Pei: Multi-Person Pose Estimation with Feature Enhancement and Decoupling Based on Contrastive Learning. 181:1-181:23
- Dongjun Liu, Weichen Dai, Honggang Liu, Hangjie Yi, Wanzeng Kong: Brain-Machine Cross-Modal Alignment via Sample Relational Learning for Visual Classification. 182:1-182:21
- Seongmin Lee, Jiwoo Kang, Sanghoon Lee: 3D Facial Shape Similarity with Deep Perceptual Representations. 183:1-183:27
Volume 21, Number 7, July 2025
- Kajal Kansal, Yongkang Wong, Mohan S. Kankanhalli: Implications of Privacy Regulations on Video Surveillance Systems. 184:1-184:27
- Yu-Ao Wang, James She, Troy TianYu Lin, Kang Zhang: AI Visual Art History: An Art Movement with Expanded Artistic Horizon. 185:1-185:16
- Abdulmotaleb El-Saddik, Jamil Ahmad, Mustaqeem, Saad Abouzahir, Wail Gueaieb: Unleashing Creativity in the Metaverse: Generative AI and Multimodal Content. 186:1-186:43
- Abdelhak Bentaleb, May Lim, Sarra Hammoudi, Saad Harous, Roger Zimmermann: Solutions, Challenges, and Opportunities in Volumetric Video Streaming: An Architectural Perspective. 187:1-187:35
- Miaohui Wang, Runnan Huang, Wuyuan Xie, Zhan Ma, Siwei Ma: Compression Approaches for LiDAR Point Clouds and Beyond: A Survey. 188:1-188:31
- Zicheng Zhang, Yingjie Zhou, Chunyi Li, Baixuan Zhao, Xiaohong Liu, Guangtao Zhai: Quality Assessment in the Era of Large Models: A Survey. 189:1-189:31
- Haopeng Wang, Haiwei Dong, Abdulmotaleb El-Saddik: Immersive Multimedia Communication: State-of-the-Art on Extended Reality Streaming. 190:1-190:33
- Hao Wu, Maha Abdallah, Yuanfang Chi, Lehao Lin, Wei Cai: Web3 Multimedia Applications: Under the Impact of Decentralization. 191:1-191:38
- Ammar Rashed, Shervin Shirmohammadi, Ihab Amer, Mohamed Hefeeda: A Review of Player Engagement Estimation in Video Games: Challenges and Opportunities. 192:1-192:33
- Xin Wang, Ting Yu Tsai, Li Lin, Hui Guo, Shu Hu, Ming-Ching Chang, Pradeep K. Atrey, Siwei Lyu: Spotting the Fakes: A Deep Dive into GAN-Generated Face Detection. 193:1-193:24
- Xinjie Zhang, Tenggan Zhang, Lei Sun, Jinming Zhao, Qin Jin: Exploring Interpretability in Deep Learning for Affective Computing: A Comprehensive Review. 194:1-194:28
- Yuanding Zhou, Xinran Li, Cheng Xiong, Heng Yao, Chuan Qin: A Survey of Perceptual Hashing for Multimedia. 195:1-195:28
- Weiqing Min, Xingjian Hong, Yuxin Liu, Mingyu Huang, Ying Jin, Pengfei Zhou, Leyi Xu, Yilin Wang, Shuqiang Jiang, Yong Rui: Multimodal Food Learning. 196:1-196:28
- Lei Gao, Kai Liu, Zheng Guo, Ling Guan: Mathematics-Inspired Models: A Green and Interpretable Learning Paradigm for Multimedia Computing. 197:1-197:22
- Christian Timmerer, Hadi Amirpour, Farzad Tashtarian, Samira Afzal, Amr Rizk, Michael Zink, Hermann Hellwagner: HTTP Adaptive Streaming: A Review on Current Advances and Future Challenges. 198:1-198:27
- Shah Muhammad Imtiyaj Uddin, Rashedul Islam Sumon, Md Ariful Islam Mozumder, Md Kamran Hussin Chowdhury, Tagne Poupi Theodore Armand, Hee-Cheol Kim: Innovations and Challenges of AI in Film: A Methodological Framework for Future Exploration. 199:1-199:55
- Ahmed Telili

, Wassim Hamidouche
, Hadi Amirpour
, Sid Ahmed Fezza
, Christian Timmerer
, Luce Morin
:
Convex Hull Prediction Methods for Bitrate Ladder Construction: Design, Evaluation, and Comparison. 200:1-200:23 - Jiaqi Wang

, Ricky Yu-Kwong Kwok
, Edith C. H. Ngai
:
Towards Key Point Identification (KPI) for Lecture Videos: Approaches and Performance Evaluation. 201:1-201:23 - Longye Du

, Shuaiyu Deng
, Ying Li
, Jun Li
, Qi Tian
:
A Survey on Composed Image Retrieval. 202:1-202:27
- Monireh Vahdati, Fedwa Laamarti

, Abdulmotaleb El-Saddik
:
Meta-Review of Wearable Devices for Healthcare in the Metaverse. 203:1-203:36 - Xuan Shao

, Lin Zhang
, Tianjun Zhang
, Shengjie Zhao
:
Towards a Robust Visual-Inertial-Surround-View SLAM System for Autonomous Indoor Parking. 204:1-204:23 - Zongsheng Cao

, Qianqian Xu
, Zhiyong Yang
, Yuan He
, Xiaochun Cao
, Qingming Huang
:
GAHE: Geometry-Aware Embedding for Hyper-Relational Knowledge Graph Representation. 205:1-205:26 - Jiajie Fang

, Mengjuan Jiang
, Jiaqing Fan
, Bangjun Wang
, Fanzhang Li
:
Complementarily Learning Decoupled Category-Region-Aware Prototype for Few-Shot Classification. 206:1-206:22 - Zheng Liu

, Kunyu Yang
, Yu Weng
, Zheng He
, Xuan Liu
, Honghao Gao
:
SCAG: Semantic Co-occurring Attention Guided Alignment for Knowledge-based Visual Question Answering. 207:1-207:20 - Weiyu Wang

, Chunmei Qing
, Junpeng Tan
, Xiangmin Xu
:
Multi-view Panoramic Image Style Transfer with Multi-scale Attention and Global Sharing. 208:1-208:19 - Lu Zhang

, Rui Yao
, Yuhong Zhang
, Yong Zhou
, Fuyuan Hu
, Jiaqi Zhao
, Zhiwen Shao
:
Historical Object-Aware Prompt Learning for Universal Hyperspectral Object Tracking. 209:1-209:20 - Alain Aoun

, Mahmoud Masadeh
, Sofiène Tahar:
ML-based Load Value Approximator for Efficient Multimedia Processing. 210:1-210:18 - Fubin Guo

, Qi Wang
, Qingshan Wang
, Sheng Chen
:
Accurate Hand Modeling in Whole-Body Mesh Reconstruction Using Joint-Level Features and Kinematic-Aware Topology. 211:1-211:23 - Zhipeng Yu

, Zimeng Zhao
, Yanxi Du
, Yuzhou Zheng
, Binghui Zuo
, Yangang Wang
:
T2C: Text-guided 4D Cloth Generation. 212:1-212:19 - Yue Li

, Junru Li
, Chaoyi Lin
, Kai Zhang
, Li Zhang
, Franck Galpin
, Thierry Dumas
, Hongtao Wang
, Muhammed Coban
, Jacob Ström
, Du Liu
, Kenneth Andersson
:
Advanced Neural Network-Based Video Coding Technologies for Intra Prediction and In-Loop Filtering. 213:1-213:23
Volume 21, Number 8, August 2025
- Moncef Gabbouj

, Jin Li
, Haibo Hu
, Yang Xiang
:
Introduction to the Special Issue on AI Empowered Edge Computing for Multimedia Applications. 214:1-214:3 - Fengyin Li

, Hongzhe Liu
, Guangshun Li
, Yilei Wang
, Huiyu Zhou
, Shanshan Cao
, Tao Li
:
SeSMR: Secure and Efficient Session-Based Multimedia Recommendation in Edge Computing. 215:1-215:21 - Hongyi Qiu

, Ning Li
, Pengfei Li
, Ruitao Hou
, Yuting Zhang
, Yun Peng
:
Boundary Attention-Guided Sparse Feature Learning for Underwater Object Tracking in Edge Computing. 216:1-216:17 - Xin Nie

, Laurence T. Yang
, Zhe Li
, Fulan Fan
, Zecan Yang
:
Tensor-empowered Incomplete Multimodal Learning with Modality Reconstruction for Edge Intelligence. 217:1-217:20 - Lefeng Zhang

, Tianqing Zhu
, Ping Xiong
, Wanlei Zhou
:
The Price of Unlearning: Identifying Unlearning Risk in Edge Computing. 218:1-218:23 - Jie Wen

, Nan Jiang
, Lang Li
, Jie Zhou
, Yanpei Li
, Hualin Zhan
, Guang Kou
, Weihao Gu
, Jiahui Zhao
:
TA-Detector: A GNN-Based Anomaly Detector via Trust Relationship. 219:1-219:21 - Jiaxing Li

, Yu-an Tan
, Jie Yang
, Zhengdao Li
, Heng Ye
, Chenxiao Xia
, Yuanzhang Li
:
Unsupervised Adversarial Example Detection of Vision Transformers for Trustworthy Edge Computing. 220:1-220:19 - Li Tang, Haibo Hu

, Moncef Gabbouj
, Qingqing Ye, Yang Xiang
, Jin Li, Lang Li
:
A Survey on Securing Image-Centric Edge Intelligence. 221:1-221:35 - Pei-Gen Ye

, Wenfeng Wang
, Bing Mi
, Kongyang Chen
:
EdgeStreaming: Secure Computation Intelligence in Distributed Edge Networks for Streaming Analytics. 222:1-222:15 - Chang Tan

, Zhewei Liu
, Zhengdao Li
, Jingyu Jia
, Siyi Lv
, Tong Li
, Zheli Liu
:
EdgeSyn: Privacy-Preserving Data Publishing on Edge Network over Infinite Multimedia Data Stream. 223:1-223:16 - Jiaqi Sun

, Xianjun Deng
, Shenghao Liu
, Xiaoxuan Fan
, Yongling Huang
, Yuanyuan He
, Celimuge Wu
, James Jong Hyuk Park:
Contrastive Learning-Based Speech Spoofing Detection for Multimedia Security in Edge Intelligence. 224:1-224:21
- Yi Zheng

, Yong Zhou
, Fayao Liu
, Jiaqi Zhao
, Hancheng Zhu
, Wen-Liang Du
:
CCFL: Customized Client Federated Learning for Unsupervised Person Re-identification. 225:1-225:21 - Jianping Zhong

, Zhaobo Qi
, Kaiwen Duan
, Yuanrong Xu
, Weigang Zhang
, Qingming Huang
:
Multi-Modal 3D Object Detector with Object-Guided Fusion and Hierarchical Sample Selection. 226:1-226:23 - Bo Peng

, Jia Zhang
, Zhe Zhang
, Liying Xu
, Qingming Huang
, Tao Wang
, Jianjun Lei
:
Adaptive Multi-Exposure Image Correction via Joint Lightness and Structure Awareness. 227:1-227:15 - Tengfei Zheng

, Bo Wang
, Gen Li
, Yuxing Tang
, Qiang Dou
:
Efficient Privacy-Preserving Video Analytics via Share Transforming in Distributed Clouds. 228:1-228:29 - Nicolas M. Dibot

, Julien P. Renoult
, William Puech
:
Generation and Editing of Mandrill Faces: Application to Sex Editing and Assessment. 229:1-229:23 - Xuemei Zhou

, Irene Viola
, Evangelos Alexiou
, Jack Jansen
, Pablo César
:
Subjective and Objective Quality Assessment for Dynamic Point Cloud with Visual Attention in 6 DoF. 230:1-230:24 - Changtao Miao

, Qi Chu
, Zhentao Tan
, Zhenchao Jin
, Tao Gong
, Wanyi Zhuang
, Yue Wu
, Bin Liu
, Honggang Hu
, Nenghai Yu
:
Multi-spectral Class Center Network for Face Manipulation Localization. 231:1-231:24 - Cheng Xiong

, Chuan Qin
, Zhenxing Qian
, Xiaolong Li
, Xinpeng Zhang
:
Robust and Secure Hashing Towards Pirated Neural Network Model Detection. 232:1-232:20 - Zichen Zhu

, Stefano Petrangeli
, Viswanathan (Vishy) Swaminathan
, Sheng Wei
:
Offloading-based Power-Efficient Mobile VTuber Live Streaming. 233:1-233:21 - Longteng Kong

, Wanting Zhou
, Yongjian Huai
, Jie Qin
:
Multi-Scale Reconstruction and Relation Decomposition Modeling for Group Activity Recognition. 234:1-234:23 - Wenjie Zhang

, Jun Yin
, Peng Yu
, Yibo Guo
, Xiaoheng Jiang
, Shaohui Jin
, Ming Liang Xu
:
Echo Depth Estimation via Attention-based Hierarchical Multi-scale Feature Fusion Network. 235:1-235:20 - Hoyoung Kim

, Azimbek Khudoyberdiev
, Shubhangi S. R. Garnaik
, Arani Bhattacharya
, Jihoon Ryoo
:
CLOUD-CODEC: A New Way of Storing Traffic Camera Footage at Scale. 236:1-236:28 - Xianhong Wen

, Sheng Ren
, Bin Hu
, Xiangyuan Zhu
, Tianyu Chen
, Kehua Guo
:
Adaptive Alignment Contrastive Learning of Degradation Prediction for Blind Image Super-Resolution. 237:1-237:22 - Jiahe Wang

, Xizhan Gao
, Sijie Niu
, Hui Zhao
, Guang Feng
:
DIRL: Learning Discriminative ID-Related Representations for Video Visible-Infrared Person ReID. 238:1-238:16 - Songran Zhou

, Tao Wu
, Xuewei Li
, Xiubo Liang
, Naye Ji
, Xi Li
:
GAMA-Pose: Graph-Aware Multi-Representation Aggregation for 3D Human Pose Estimation. 239:1-239:24 - Guibiao Liao

, Jiankun Li
, Zhenyu Bao
, Xiaoqing Ye
, Qing Li
, Kanglin Liu
:
CLIP-GS: CLIP-Informed Gaussian Splatting for View-Consistent 3D Indoor Semantic Understanding. 240:1-240:24 - Huijuan Guo

, Baoning Niu
, Ying Huang
, Xuefei Bai
, Fangpeng Lan
, Peng Zhao
:
Spread Spectrum Watermark in DC: A View from the Embedding Processing. 241:1-241:24 - Mohamed Zakariya Talhaoui

, Zhelong Wang
, Mohamed Amine Midoun
, Messaouda Trid
, Hamidaoui Meryem
, Abdelkarim Smaili
, Mekkaoui Djamel Eddine
, Mourad Lablack
:
A Real-Time Medical Image Encryption Algorithm Leveraging a Novel Hypersensitive Chaotic Map. 242:1-242:23 - Xin Zhang

, Hongzhi Feng
, M. Shamim Hossain
, Yinzhuo Chen
, Hongbo Wang
, Yuyu Yin
:
Scaled Background Swap: Video Augmentation for Action Quality Assessment with Background Debiasing. 243:1-243:18
- Lisi Wei

, Libo Zhao
, Xiaoli Zhang
:
Corrigendum: MAINet: Modality-Aware Interaction Network for Medical Image Fusion. C1:1
Volume 21, Number 9, September 2025
- Carsten Griwodz

, Mea Wang
, Roger Zimmermann
:
Introduction to the Special Issue on MMSys 2023 and NOSSDAV 2023. 244:1-244:4 - Na Li

, Zichen Zhu
, Sheng Wei
, Yao Liu
:
EVASR: Edge-Based Salience-Aware Super-Resolution for Enhanced Video Quality and Power Efficiency. 245:1-245:24 - Bruno Yuji Lino Kimura, Simone Ferlin

, Thomas William do Prado Paiva, Toktam Mahmoodi
, Anna Brunström
, Ozgu Alay
:
Evaluating Adaptive Video Streaming over Multipath QUIC with Shared Bottleneck Detection. 246:1-246:25 - Ila Gokarn

, Yigong Hu
, Tarek F. Abdelzaher
, Archan Misra
:
RA-MOSAIC: Resource Adaptive Edge AI Optimization over Spatially Multiplexed Video Streams. 247:1-247:25 - Jiaxi Li

, Jingwei Liao
, Bo Chen
, Anh Nguyen
, Aditi Tiwari
, Qian Zhou
, Zhisheng Yan
, Klara Nahrstedt
:
ST-360: Spatial-Temporal Filtering-Based Low-Latency 360-Degree Video Analytics Framework. 248:1-248:25 - Gabriel De Castro Araújo

, Henrique Domingues Garcia
, Mylène C. Q. Farias
, Ravi Prakash
, Marcelo M. Carvalho
:
A 360-degree Video Player for Dynamic Video Editing Applications. 249:1-249:23 - Michael Rudolph

, Stefan Schneegass
, Amr Rizk
:
Transcoding V-PCC Point Cloud Streams in Real-time. 250:1-250:22 - Hao Fang

, Haoyuan Zhao
, Feng Wang
, Yi Ching Chou
, Long Chen
, Jianxin Shi
, Jiangchuan Liu
:
Streaming Media over LEO Satellite Networking: A Measurement-Based Analysis and Optimization. 251:1-251:24 - Zoubida Ameur

, Claire-Hélène Demarty
, Olivier Le Meur
, Daniel Ménard
:
Style-FG: A Style-based Framework for Film Grain Analysis and Synthesis. 252:1-252:24 - Raphael Abreu

, Joel A. F. dos Santos
, Gheorghita Ghinea
, Débora C. Muchaluat-Saade
:
Assessing Usefulness, Ease of Use, and Recognition Performance of Semi-Automatic Mulsemedia Authoring. 253:1-253:19 - Silvia Rossi

, Irene Viola
, Laura Toni
, Pablo César
:
A Clustering Approach to Unveil User Similarities in 6 df Extended Reality Applications. 254:1-254:27 - Vijay John

, Yasutomo Kawanishi
:
Multimodal Cascaded Framework with Multimodal Latent Loss Functions Robust to Missing Modalities. 255:1-255:21 - Kuan-Yu Lee

, Ashutosh Singla
, Pablo César
, Cheng-Hsin Hsu
:
Adaptive Cloud VR Gaming Optimized by Gamer QoE Models. 256:1-256:24
- Yuqing Yang

, Anh Nguyen
, Zhisheng Yan
:
A Patch Can Disrupt Live Video Streaming: Physical Adversarial Attacks on Deep Learning Compression. 257:1-257:23 - Xiaoye Qu

, Qiyuan Chen
, Wei Wei
, Jiashuo Sun
, Daizong Liu
, Jianfeng Dong
:
Alleviating Hallucination in Large Vision-Language Models with Active Retrieval Augmentation. 258:1-258:22 - Bing Liu

, Wenjie Yang
, Mingming Liu
, Hao Liu
, Yong Zhou
, Peng Liu
:
Syntactic-Conditional Diffusion Networks for Controllable Image Captioning. 259:1-259:25 - Liyong Xu

, Yifan Jiao
, Bing-Kun Bao
:
Bool Prompt with Decomposition and Enhancement: Zero-Shot VQA Based on PVLMs. 260:1-260:21 - Pengyu Li

, Cheolkon Jung
:
MRFGNet: Multiscale Reference Frame Generation Network for VVC Inter-Coding. 261:1-261:20 - Guiyu Xia

, Zhedong Jin
, Dongdong Fang
, Yubao Sun
:
Source Information-Assisted UV-Space Transformation Network for Person Image Generation. 262:1-262:16 - Junle Liu

, Yun Zhang
, Zixi Guo
, Xiaoxia Huang
, Gangyi Jiang
:
Multiscale Feature Importance-Based Bit Allocation for End-to-End Feature Coding for Machines. 263:1-263:19 - Hefeng Ji

, Jing Xiao
, Jiefan Lin
, Jimin Liu
, Haoyong Yu
:
Intelligent Tumor Synthesis Based on Medical Image Knowledge for Liver Tumor Segmentation. 264:1-264:23 - Hao Ding

, Jing Sun
, Rui Long
, Xiaoping Jiang
, Hongling Shi
, Yuting Qin
, Zongze Li
, Jian-Jin Li
:
Visible-Infrared Person Re-Identification Based on Feature Decoupling and Refinement. 265:1-265:16 - Sanhita Pathak

, Vinay Kaushik
, Brejesh Lall
:
Garment Recycle Training and Conditional Garment-Person Outline Attention-Guided Virtual Tryon. 266:1-266:26 - Zishan Xu

, Xiaofeng Zhang
, Yuqing Yang
, Wei Chen
, Jueting Liu
, Tingting Xu
, Zehua Wang
, Abdulmotaleb El-Saddik
:
MuralAgent: Enhancing Ancient Mural Outpainting with RAG-Based Texts and Multimodal Integration. 267:1-267:17 - Hanzhang Wang

, Haoran Wang
, Zhongrui Yu
, Mingming Sun
, Junjun Jiang
, Xianming Liu
, Deming Zhai
:
FAST: Flexibly Controllable Arbitrary Style Transfer via Latent Diffusion Models. 268:1-268:20 - Zhichao Zhang

, Wei Sun
, XinYue Li, Jun Jia
, Xiongkuo Min
, Zicheng Zhang
, Chunyi Li
, Zijian Chen
, Puyi Wang
, Fengyu Sun
, Shangling Jui
, Guangtao Zhai
:
Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model. 269:1-269:24 - Bowen Huang

, Yanwei Zheng
, Chuanlin Lan
, Dongchen Sui
, Xinpeng Zhao
, Xiao Zhang
, Mengbai Xiao
, Dongxiao Yu
:
Action-Aware Visual-Textual Alignment for Long-Instruction Vision-and-Language Navigation. 270:1-270:22 - Chenyang Lu

, Zhikai Wei
, Huapeng Wu
, Le Sun
, Tianming Zhan
:
KANformer: Dual-Priors-Guided Low-Light Enhancement via KAN and Transformer. 271:1-271:20 - Xiang Guo

, Ruimin Hu
, Dongliang Zhu, Mei Wang
:
Uniform Light Transformer for Person Re-identification under Complex Illumination. 272:1-272:18 - Xin Liu

, Qiya Song
, Lin Xiao
, Chun Wang
, Xieping Gao
:
LPIC: Learnable Prompts and ID-guided Contrastive Learning for Multimodal Recommendation. 273:1-273:16
Volume 21, Number 10, October 2025
- Alex Falcon, Giuseppe Serra, Sergio Escalera, Michael Wray:

Introduction to the Special Issue on Text-Multimedia Retrieval: Retrieving Multimedia Data by Means of Natural Language. 274:1-274:4 - Shiping Ge, Zhiwei Jiang, Yafeng Yin, Cong Wang, Zifeng Cheng, Qing Gu:

Fine-Grained Alignment Network for Zero-Shot Cross-Modal Retrieval. 275:1-275:24 - Suyi Li, Chenyi Jiang, Shidong Wang, Yang Long, Zheng Zhang, Haofeng Zhang:

Contextual Interaction via Primitive-based Adversarial Training for Compositional Zero-shot Learning. 276:1-276:24 - Ying Li, Ding Yuxiang:

MoHGCN: Momentum Hypergraph Convolution Network for Cross-modal Retrieval. 277:1-277:21 - Suncheng Xiang, Jingsheng Gao, Mingye Xie, Mengyuan Guan, Jiacheng Ruan, Yuzhuo Fu:

Learning Visual-Semantic Embedding for Generalizable Person Re-Identification: A Unified Perspective. 278:1-278:17 - Renjie Pan, Hua Yang, Xiangyu Zhao:

ReAL: Improving Image-Text Retrieval with Authentic Negative Repository Learning. 279:1-279:22 - Shunxiang Zhang, Jiajia Liu, Yixuan Jiao, Yulei Zhang, Lei Chen, Kuanching Li:

A Multimodal Semantic Fusion Network with Cross-Modal Alignment for Multimodal Sentiment Analysis. 280:1-280:22 - Alex Ergasti, Tomaso Fontanini, Claudio Ferrari, Massimo Bertozzi, Andrea Prati:

MARS: Paying More Attention to Visual Attributes for Text-Based Person Search. 281:1-281:22 - Taichi Nishimura, Shota Nakada, Masayoshi Kondo:

Vision-Language Models Learn Super Images for Efficient Partially Relevant Video Retrieval. 282:1-282:22 - Liming Xu, Hanqi Li, Jie Shao, Xianhua Zeng, Weisheng Li:

Multi-scale Consistency Deep Lifelong Cross-modal Hashing. 283:1-283:23 - Liming Xu, Dengping Zhao, Hanqi Li, Xianhua Zeng, Bochuan Zheng:

Deep Differential Lifelong Cross-modal Hashing for Stream Medical Data Retrieval. 284:1-284:23 - Qun Zhang, Chao Yang, Bin Jiang, Bolin Zhang:

Multi-Grained Alignment with Knowledge Distillation for Partially Relevant Video Retrieval. 285:1-285:22 - Hongyi Zhu, Jia-Hong Huang, Yixian Shen, Stevan Rudinac, Evangelos Kanoulas:

Interactive Image Retrieval Meets Query Rewriting with Large Language and Vision Language Models. 286:1-286:23 - Sina Ehsani, Jian Liu:

Elevating Textual Question Answering with On-Demand Visual Augmentation. 287:1-287:25 - Diego Gragnaniello, Antonio Greco, Carlo Sansone, Bruno Vento:

Video Fire Recognition Using Zero-Shot Vision-Language Models Guided by a Task-Aware Object Detector. 288:1-288:24 - Nicola Messina, Jan Sedmidubský, Fabrizio Falchi, Tomás Rebok:

Joint-Dataset Learning and Cross-Consistent Regularization for Text-to-Motion Retrieval. 289:1-289:24
- Jianbo Song, Hong Zhang, Yachun Feng, Hanyang Liu, Yifan Yang:

Language-guided Visual Tracking: Comprehensive and Effective Multimodal Information Fusion. 290:1-290:23 - Divya Arora Bhayana, Om Prakash Verma:

Trans-Convo-Former Net for Hierarchical Prediction of Household Images. 291:1-291:21 - Xiaobo Hu, Youfang Lin, Jinwen Wang, Yue Liu, Shuo Wang, Hehe Fan, Kai Lv:

Learning Robust Representations via Bidirectional Transition for Visual Reinforcement Learning. 292:1-292:24 - Mingliang Zhou, Shuqi Han, Jun Luo, Xu Zhuang, Qin Mao, Zhengguo Li:

Transformer-Based and Structure-Aware Dual-Stream Network for Low-Light Image Enhancement. 293:1-293:24 - Yuan Cao, Dong Wang:

Dual-Branch Cross-Layer Information Flow Network for Camouflaged Object Detection in Complex Scenes. 294:1-294:19 - Haojie Li, Hao Chen, Yining Huang, Tianshui Chen, Shuangping Huang:

Enhancing Lip Dynamic Authenticity: Learning 3D Temporal Representations for Talking Head Synthesis. 295:1-295:21 - Zhili Zhou, Wensheng Zhang, Zhengdao Li, Huilin Ge, Bin Qiu, Fengjun Xiao, Yongfeng Huang:

Progressive Generative Steganography via High-Resolution Image Generation for Covert Communication. 296:1-296:23 - Jingtian Wang, Xiaolong Li, Bin Ma, Yao Zhao:

Boosting Transferability of Adversarial Examples with Spatio-Temporal Context. 297:1-297:22 - Xu Guo, Tong Zhang, Fuyun Wang, Xudong Wang, Xiaoya Zhang, Xin Liu, Zhen Cui:

MMHCL: Multi-Modal Hypergraph Contrastive Learning for Recommendation. 298:1-298:23 - Xiao Pan, Zongxin Yang, Shuai Bai, Yi Yang:

GD-NeRF: Generative Detail Compensation for One-shot Generalizable Neural Radiance Fields. 299:1-299:24 - Jiacheng Yao, Jing Zhang, Shuying Zhang, Li Zhuo:

Cross-Modal Tri-Semantic Correlation-CLIP for Short Video Homogenization Recognition. 300:1-300:23 - Zhiwen Shao, Yifan Cheng, Fan Zhang, Xuehuai Shi, Canlin Li, Lizhuang Ma, Dit-Yan Yeung:

Micro-Expression Recognition via Fine-Grained Dynamic Perception. 301:1-301:23 - Yue Liu, Zhangkai Ni, Peilin Chen, Shiqi Wang, Xinfeng Zhang, Hanli Wang, Sam Kwong:

EIN: Exposure-Induced Network for Single-Image HDR Reconstruction. 302:1-302:23 - Ali Ghorbanpour, Mohammad Amin Arab, Mohamed Hefeeda:

RDIAS: Robust and Decentralized Image Authentication System. 303:1-303:28
