


MMAsia 2024: Auckland, New Zealand
- Ruili Wang, Zhiyong Wang, Jiaying Liu, Alberto Del Bimbo, Jun Zhou, Anup Basu, Min Xu:

Proceedings of the 6th ACM International Conference on Multimedia in Asia, MMAsia 2024, Auckland, New Zealand, December 3-6, 2024. ACM 2024, ISBN 979-8-4007-1273-9
Full Papers
- Chen-Hsiu Huang, Ja-Ling Wu:
SLIC: Secure Learned Image Codec through Compressed Domain Watermarking to Defend Image Manipulation. 1:1-1:7
- Bingyang Cui, Yujie Zhang, Qi Yang, Yiling Xu:
MS-GeodesicPSIM: Predicting the Quality of Static Mesh with Texture Map via multi-scale Geodesic Patch Similarity. 2:1-2:7
- Haowei Lou, Hye-Young Paik, Wen Hu, Lina Yao:
StyleSpeech: Parameter-efficient Fine Tuning for Pre-trained Controllable Text-to-Speech. 3:1-3:7
- Jian Ma, Bin Zhu, Kun Li, Dima Damen:
Active Object Segmentation: A New Modality for Egocentric Action Recognition. 4:1-4:7
- Xiaoyi Han, Yanfei Wu, Nan Pu, Zunlei Feng, Qifei Zhang, Yijun Bei, Lechao Cheng:
Fire and Smoke Detection with Burning Intensity Representation. 5:1-5:8
- Xianbin Hu, Wei Wu, Zhu Li:
RandommaskFormer: Light Weight Remote Sensing Scene Classification with Masked Transformer. 6:1-6:7
- Tailin Yang, Wei Wu, Zhu Li, Rui Zhou:
Multi-Frame Sparse Convolutional Learning for Point Cloud Color Denoising. 7:1-7:7
- Yiran Chen, Haoran Liu, Mingzhe Liu, Yanhua Liu, Ruili Wang, Peng Li:
Moving Object Tracking based on Kernel and Random-coupled Neural Network. 8:1-8:6
- Zihuang Wu, Xinyu Xiong, Ying Chen, Siying Li, Hua Chen:
MoE-Polyp: Shifting More Attention to Small Polyp Segmentation via Mixture-of-Experts. 9:1
- Zhiyuan Wang, Cong Yang, Yulu Zhang, Zeyd Boukhers, Wei Sui, Yi Ji, Chunping Liu:
Transition in Focus of Prediction Tasks for Skeleton Graph Component Detection with Transformer. 10:1-10:7
- Chenqiu Zhao, Guanfang Dong, Anup Basu:
Accelerating Inference of Networks in the Frequency Domain. 11:1-11:7
- Qi Yang, Kaifa Yang, Yuke Xing, Yiling Xu, Zhu Li:
A Benchmark for Gaussian Splatting Compression and Quality Assessment Study. 12:1-12:8
- Zichuan Huang, Yifan Li, Shuai Yang, Jiaying Liu:
CoolColor: Text-guided COherent OLd film COLORization. 13:1-13:7
- Takara Taniguchi, Ryosuke Furuta:
Learning Gaussian Data Augmentation in Feature Space for One-shot Object Detection in Manga. 14:1-14:8
- Shuwei Peng, Xu Zhang, Aiwen Jiang, Changhong Liu, Jihua Ye:
Low-Light Image Enhancement via FourierTMamba: A Hybrid Frequency-Spatial Approach. 15:1-15:8
- Xin Li, Feng Xu, Yao Tong, Fan Liu, Yiwei Fang, Xin Lyu, Jun Zhou:
FreqFormer: A Frequency Transformer for Semantic Segmentation of Remote Sensing Images. 16:1-16:8
- Ze Kun Wang, Zhan Jun Si:
Adaptive Both homo- and hetero-Feature Integration for Multimodal Emotion Recognition. 17:1
- Ruikun Zhang, Hao Yang, Yan Yang, Ying Fu, Liyuan Pan:
LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset. 18:1
- Guan-Yu Wu, Wei-Ta Chu:
Incremental Few-Shot Object Detection by Leveraging External Information from Large Multimodal Models. 19:1-19:7
- Yuang Liu, Dacheng Liao, Mengshi Qi, Liang Liu, Huadong Ma:
RoboFormer: A Robust Multi-Modal Transformer for 3D Object Detection in Autonomous Driving. 20:1-20:7
- Fanyi Wang, Peng Liu, Haotian Hu, Dan Meng, Jingwen Su, Jinjin Xu, Yanhao Zhang, Xiaoming Ren, Zhiwang Zhang:
LoopAnimate: Loopable Salient Object Animation. 21:1-21:8
- Yuankang Pan, Zhaoquan Yuan, Xiao Wu, Zechao Li, Changsheng Xu:
TMM-CLIP: Task-guided Multi-Modal Alignment for Rehearsal-Free Class Incremental Learning. 22:1-22:7
- Long H. Nguyen, Nhat Truong Pham, Mustaqeem Khan, Alice Othmani, Abdulmotaleb El-Saddik:
HuBERT-CLAP: Contrastive Learning-Based Multimodal Emotion Recognition using Self-Alignment Approach. 23:1-23:6
- Tianchen Zhou, Jiateng Liu, Yue Jin, Li Yao:
MicroMamba: State Space Model with Partitioned Window Scan for Micro-Expression Recognition. 24:1-24:7
- Yang Yi, Dasith de Silva Edirimuni, Ye Zhu, Shang Gao, Zhiyong Wang, Antonio Robles-Kelly, Xuequan Lu:
Point Cloud Normal Estimation via Representation Learning on Height Maps. 25:1-25:7
- Jieqiong Zhou, Guoqing Zhang, Yuhui Zheng, Fuguo Zhang:
Local Feature-Emphasizing Transformer for Cloth-Changing Person Re-identification. 26:1
- Chao Tan, Sheng Li, Yang Cao, Zhao Ren, Tanja Schultz:
Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition. 27:1-27:7
- Faisal Ahmed, Justin Rozeboom, Hanran Song, Chenqiu Zhao, Anup Basu:
LMoW: A Latent Random Variable Model for Unconditional Human Motion Generation. 28:1-28:8
- Yujia Xu, Deyu Pan, Ling Ding:
A method for detecting hands off the steering wheel. 29:1-29:6
- Luhao Zhu, Xiangwei Kong, Runsen Li, Guodong Guo:
Where You See Is What You Know: A Visual-Semantic Conceptual Explainer. 30:1-30:7
- Shogo Yonezawa, Yukinobu Taniguchi, Go Irie:
Bivariate Mixup for 2D Contact Point Localization with Piezoelectric Microphone Array. 31:1-31:7
- Jinheng Zhou, Wu Liu, Guang Yang, He Zhao, Feiniu Yuan:
Prompting Industrial Anomaly Segment with Large Vision-Language Models. 32:1
- Yi-Chen Li, Chih-Fan Hsu, Jian-Kai Wang, Chung-Chi Tsai, Cheng-Hsin Hsu:
MAFS: Modality-Aware Federated Semi-Supervised Learning with Selective Data Sharing Specified by Individual Clients. 33:1-33:8
- Yiran Song, Qianyu Zhou, Kun Hu, Lizhuang Ma, Xuequan Lu:
CFRL: Coarse-Fine Decoupled Representation Learning For Long-Tailed Recognition. 34:1-34:7
- Tao Jiang, Feng Hou, Yi Wang:
Multimodal Energy Prompting for Video Salient Object Detection. 35:1-35:8
- Yingkai He, Zhen Zhang, Jing Xiao:
A Multi-scale Framework towards Human-Machine Friendly Remote Sensing Image Coding. 36:1-36:6
- Mingwei Cao, Fengna Wang, Dengdi Sun, Haifeng Zhao:
BCS-NeRF: Bundle Cross-Sensing Neural Radiance Fields. 37:1-37:8
- Zhenzhen Hu, Xin Guan, Jia Li, Zijie Song, Richang Hong:
Dual-Stream Keyframe Enhancement for Video Question Answering. 38:1-38:7
- Junchao Ge, Huafeng Li, Yafei Zhang:
Robust discriminative and modal-consistent feature learning for fine-grained sketch-based image retrieval. 39:1-39:8
- Ying Hu, Chenyi Zhuang, Pan Gao:
DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer. 40:1
- Mary Pilataki, Matthias Mauch, Simon Dixon:
Pitch-aware generative pretraining improves multi-pitch estimation with scarce data. 41:1-41:8
- Qianyu Li, Bingcai Chen, Jiaxing Tian, Ruolan Liu:
FA-UNext: A Feedback Attention-based MLP Network for Medical Image Segmentation. 42:1-42:7
- Jiaqi Chen, Yan Yang, Shizhuo Deng, Da Teng, Liyuan Pan:
SpikMamba: When SNN meets Mamba in Event-based Human Action Recognition. 43:1-43:8
- Tingting Yao, Yuan Gao, Zihao Feng, Qing Hu, Zhiyong Wang:
Underwater Image Enhancement via Domain Adaptive Transfer Learning and Hybrid Reinforcement Model. 44:1-44:7
- Yuhang Zhang, Cuixin Yang, Muxin Liao, Shishun Tian, Wenbin Zou, Chen Xu:
Layout Relationship Decoupling Framework for Multi-target Domain Adaptative Semantic Segmentation. 45:1-45:7
- Haoyuan Zhang, Xiangyu Zhu, Qu Tang, Zhaoxiang Zhang, Zhen Lei:
STODINE: Decompose video to Object-centric Spatial-Temporal Slots for physical reasoning. 46:1-46:7
- Ziqiang Liu, Gongwei Fang, Wentong Wang, Qiang Liu:
Multimodal Sign Language Knowledge Graph and Representation: Text Video KeyFrames and Motion Trajectories. 47:1-47:7
- JieYing Liu:
The Quantification of Emotional Expressions and Perceptions of Vocal Vibrato in Basic Emotion: Commercial Operatic Singing Recordings. 48:1-48:7
- Zhengjie Lu, Jinjia Peng, Huibing Wang, Qingxuan Shi, Bin Wang:
HSMnet: Hybrid Sampling and Matching Network for DETR-based Person Search. 49:1-49:7
- Trung Thanh Nguyen, Yasutomo Kawanishi, Takahiro Komamizu, Ichiro Ide:
Action Selection Learning for Multi-label Multi-view Action Recognition. 50:1
- Sangni Xu, Hao Xiong, Qiuxia Wu, Zhihui Wang, Shlomo Berkovsky, Zhiyong Wang:
Fast Online Adaptation of Visual SLAM via Variational Information Transfer and Preservation. 51:1-51:7
- Son Duy Dao, Hengcan Shi, Dinh Q. Phung, Jianfei Cai:
CA-OVS: Cluster and Adapt Mask Proposals for Open-Vocabulary Semantic Segmentation. 52:1-52:8
- Shuangping Chen, Huijin Wang, Shun Long, Jieyun Bai, Jianmei Jiang:
Ultrasound Video Segmentation of Pubic Symphysis and Fetal Head for Angle of Progression Measurement. 53:1-53:8
- Pu Li, Yibiao Zhao, Xiaobai Liu:
Policy-driven Auto-Augmentation with Distillment Rewards for Scene Text Recognition. 54:1-54:8
- Li Jiao, Lihong Cao, Tian Wang:
Prompt-based Continual Learning for Extending Pretrained CLIP Models' Knowledge. 55:1-55:8
- Xinhao Zhong, Siyu Jiao, Yao Zhao, Yunchao Wei:
Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection. 56:1-56:7
- Junjiang Liu, Dandan Sun, Hailun Xia, Jiangtao Bai, Xinyue Fan:
FeedMatch: Evolves for Semi-Supervised Multimedia Classification from Student Feedback. 57:1-57:6
- Huilin Chen:
MFNet: Mixed Feature Network for Enhancing Facial Emotion Recognition on the Small-Scale Dataset. 58:1-58:7
- Zhiyi Mo, Guangtong Zhang, Jian Nong, Bineng Zhong, Zhi Li:
Dual-stream Multi-modal Interactive Vision-language Tracking. 59:1-59:7
- Yangyuan Chen, Zhizhong Ma, Mingjing Wang, Mingzhe Liu:
Advancing Music Emotion Recognition: A Transformer Encoder-Based Approach. 60:1-60:5
- Jie Wang, Huilin Chen, Wandong Xue, Dongming Chen, Dongqi Wang:
A Multi-angle Text Recognition Algorithm. 61:1-61:7
- Zhiyuan Li, Dongnan Liu, Heng Wang, Chaoyi Zhang, Weidong Cai:
Exploring Annotation-free Image Captioning with Retrieval-augmented Pseudo Sentence Generation. 62:1-62:8
- Yuanyuan Shi, Yunan Li, Huizhou Chen, Siyu Liang, Qiguang Miao:
CISampler: Correlated Information Guided Frame Sampling for Gesture Recognition in Video. 63:1-63:8
- Meng Shen, Yake Wei, Jianxiong (Terry) Yin, Deepu Rajan, Di Hu, Simon See:
Enhancing Modality Representation and Alignment for Multimodal Cold-start Active Learning. 64:1-64:8
- Huajie Tan, Guoqing Xiang, Xiaodong Xie, Huizhu Jia:
Joint Frame-Level and Block-Level Rate-Perception Optimized Preprocessing for Video Coding. 65:1
- Yongjian Liu, Shunwei Zhang, Jinyu Xu, Jiachen Li, Yanchun Ma, Qing Xie:
Dlpp-Net: Degradation Location Prior Prediction Network for Image Restoration. 66:1-66:8
- Fengqi Li, Mengchao Guo, Renxuan Xiong, Donglei Yang, Yi Wang, Fengqiang Xu:
MSTMENet: Multi-Scale Spatio-Temporal Mapping and Evolution Network for Video Deraining. 67:1
- Yichen Ouyang, Jiayi Ye, Wenhao Chai, Dapeng Tao, Yibing Zhan, Gaoang Wang:
An Efficient Multi-prior Hybrid Approach for Consistent 3D Generation from Single Images. 68:1
- Minghui Wang, Zixu Wang, Hongbin Xu, Kun Hu, Zhiyong Wang, Wenxiong Kang:
T2QRM: Text-Driven Quadruped Robot Motion Generation. 69:1-69:7
- Yanming Chen, Ziyu Liu, Xiangjian He:
MambaVesselNet: A Hybrid CNN-Mamba Architecture for 3D Cerebrovascular Segmentation. 70:1-70:7
- Hongyu An, Xinfeng Zhang, Shijie Zhao, Li Zhang:
FATO: Frequency Attention Transformer for Omnidirectional Image Super-Resolution. 71:1-71:7
- Kai Zhang, Xia Yuan, Shuntong Chen, Di Hu, Chunxia Zhao:
Multi-Modality Semantic-Shared Cross-View Ground-to-Aerial Localization. 72:1-72:7
- Sicheng Liu, Lintao Wang, Xiaogang Zhu, Xuequan Lu, Zhiyong Wang, Kun Hu:
SITransformer: Shared Information-Guided Transformer for Extreme Multimodal Summarization. 73:1-73:7
- Song Huang, Ziming Zeng, Min Li, Jianping Wang:
Unified Multi-view Clustering based on Joint Multi-Structure Representation Learning. 74:1-74:7
- Cheng-Kang Tan, Wei-Ta Chu:
CS-HOI: Human Object Interaction Detection Enhanced by Common Sense. 75:1-75:7
- Zhiqian Dong, Sheng Yang, Peng Zhou:
Dual-Enhanced Disentangled Multi-View Clustering. 76:1-76:7
- Rim El Filali, Soufiane Jdaba, Ronghui Xie, Ran Shi, Tong Qiao, Pan Qiaodong, Ting Wu:
S2FB IoU: Improving Boundary-based Object-Centric Image Segmentation Quality Evaluation. 77:1-77:7
- Yu Wei, Yi Wang, Shijun Yan, Tianzhu Wang, Zhihan Wang, Weirong Sun, Yu Zhao, Xinwei Xue:
CSUNet: Contour-Sensitive Underwater Salient Object Detection. 78:1-78:7
- Yachao He, Li Liu, Huaxiang Zhang, Dongmei Liu, Hongzhen Li:
A Unified Contrastive Framework with Multi-Granularity Fusion for Text-to-Image Generation. 79:1-79:7
- Jingxuan Chen:
GGAvatar: Reconstructing Garment-Separated 3D Gaussian Splatting Avatars from Monocular Video. 80:1-80:7
- Zhaojun Guo, Junqiang Huang, Guobiao Li, Wanli Peng, Xinpeng Zhang, Zhenxing Qian, Sheng Li:
Emotion-Aware and Efficient Meme Sticker Dialogue Generation. 81:1
- Xiaocong Zhou, Fan Liu, Chuanyi Zhang, Feifan Li, Wenwen Cai, Jun Zhou:
Feature-weighted Multi-stage Bayesian Prototype for Few-shot Classification. 82:1-82:7
- Jiale Wang, Xueliang Liu, Yuling Su:
A Robust Few-shot Learning Framework via Dual-branch Adversarial Noise Pretraining. 83:1-83:8
- Zheng-Xian Keh, Lai-Kuan Wong, Yuen Peng Loh, Ke Gu, Weisi Lin:
KBY-Net: A Dual Learning Framework for Improving Object Detection in Rainy Weather Conditions. 84:1-84:7
- Muhammad Saad Shakeel, Kun Liu, Xiaochuan Liao, Wenxiong Kang:
MRGait: A Multi-range feature learning framework for Cross-View Gait Recognition. 85:1-85:7
- Shun Katada, Kazunori Komatani:
Personalized Sentiment Estimation Based on Recall and Resting Ratio of Frontal EEG. 86:1-86:7
- Wenyu Shao, Hongbo Liu:
TCFusion: A Three-branch Cross-domain Fusion Network for Infrared and Visible Images. 87:1
- Jianhua Zhao, Xue Jun Li, Peter Han Joo Chong:
HFS-HNeRV: High-Frequency Spectrum Hybrid Neural Representation for Videos. 88:1-88:7
- Zichen Zhu, Zhongze Tang, Amir Nassereldine, Jinjun Xiong, Sheng Wei:
OpenVideoWalls: an Open-Source System for Building Video Walls with Recycling Heterogeneous Displays. 89:1-89:7
- Yingkai He, Zhen Zhang, Liang Liao, Jing Xiao:
Latent Variables Coding for Perceptual Image Compression. 90:1-90:7
- Yuxin Yang, Pengfei Zhu, Mengshi Qi, Huadong Ma:
Following in the Footsteps: Predicting Human Trajectories Using Motion Pattern Memory. 91:1-91:7
- Liqun Shan, Rujun Zhang, Sai Venkatesh Chilukoti, Xingli Zhang, Insup Lee, Xiali Hei:
IdentityKD: Identity-wise Cross-modal Knowledge Distillation for Person Recognition via mmWave Radar Sensors. 92:1-92:7
- Xingang Wang, Mengyi Wang, Hai Cui, Yijia Zhang:
Efficient Low-Dimensional Representation Via Manifold Learning-Based Model for Multimodal Sentiment Analysis. 93:1-93:7
- Yixin Zhang, Yoko Yamakata, Keishi Tajima:
Adaptive Feature Inheritance and Thresholding for Ingredient Recognition in Multimedia Cooking Instructions. 94:1-94:7
- Fei Xiang, Hongbo Liu, Ruili Wang, Junjie Hou, Xingang Wang:
DCEPNet: Dual-Channel Emotional Perception Network for Speech Emotion Recognition. 95:1
- Dongming Chen, Mingshuo Nie, Zhengping Sun, Huilin Chen, Dongqi Wang:
An Information Cascade Prediction Algorithm Based on Time Series. 96:1
- Chengxi Lei, Satwinder Singh, Feng Hou, Ruili Wang:
Mix-fine-tune: An Alternate Fine-tuning Strategy for Domain Adaptation and Generalization of Low-resource ASR. 97:1-97:7
- Yuchong Sun, Bei Liu, Xu Chen, Ruihua Song, Jianlong Fu:
ViCo: Engaging Video Comment Generation with Human Preference Rewards. 98:1
- Zeyu Zhao, Nan Gao, Zhi Zeng, Guixuan Zhang, Jie Liu, Shuwu Zhang:
A Unified Editing Method for Co-Speech Gesture Generation via Diffusion Inversion. 99:1-99:7
- Raj Jaiswal, Avinash Anand, Rajiv Ratn Shah:
Advancing Multimodal LLMs: A Focus on Geometry Problem Solving Reasoning and Sequential Scoring. 100:1-100:7
- Haipeng Li, Guangcun Wei, Haochen Xu, Boyan Guo:
DocPointer: A parameter-efficient Pointer Network for Key Information Extraction. 101:1-101:7
- Yani Chen, Jiaxiang E, Kaiyu Nie, Xiaoxia Nie, Ruili Wang:
Development of a Chinese Synonym Library: Enhancing Clinical Terminology Standardization and Interoperability. 102:1-102:7
- Chen Wang, Feng Hou, Yi Wang, Ruili Wang:
Structured Bipartite Graph Ensemble Clustering. 103:1-103:7
- Yuan Gao, Feng Hou, Ruili Wang:
Incorporating Pre-ordering Representations for Low-resource Neural Machine Translation. 104:1-104:7
- Wenhao Gao, Zhenbo Song, Zhenyuan Zhang, Jianfeng Lu:
On the Robustness of Deep Face Inpainting: An Adversarial Perspective. 105:1-105:7
- Ying Qiao, Aoxuan Chen, Xiang Li, Jinfei Gao:
Variational Stochastic Multiple Auto-Encoder For Multimodal Recommendation. 106:1-106:7
- Jiazhen Zhang, Kun Li, Yanyan Wei, Fei Wang, Wei Qian, Jinxing Zhou, Dan Guo:
Repetitive Action Counting with Feature Interaction Enhancement and Adaptive Gate Fusion. 107:1-107:7
- Xiong Zeng, Min Jiang, Ronghua Huang:
Multi-stage Image Deraining based on Pre-trained Diffusion Model. 108:1-108:7
- Tomoya Sugihara, Shuntaro Masuda, Ling Xiao, Toshihiko Yamasaki:
Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video. 109:1
- Qingjin Wei, Xiaozhuo Li, Dinglu Liu, Zhiwu Liao:
MFTAnet: Two-step Aggregation Net of Multiscale Features for Pneumoconiosis Screening. 110:1-110:7
- Huijie Zhang, Xiaobai Liu:
Focal Diffusion Process for Object-Aware 3D LiDAR Generation. 111:1-111:7
- Longyun Dong, Yuanrong Xu, Jianping Zhong, Zhaobo Qi, Weigang Zhang:
Improving Sequential DeepFake Detection with Local information enhancement. 112:1
- Pingyi Huo, Ajay Narayanan Sridhar, Md Fahim Faysal Khan, Kiwan Maeng, Vijaykrishnan Narayanan:
QoS-Diff: Adaptive Auto-tuning Framework for Low-latency Diffusion Model Inference. 113:1-113:7
- Cui Xu, Laiyun Qing:
Point-Supervised Temporal Action Detection with Label Supplementation Based on Transformer. 114:1-114:7
- Xu Gu, Xihua Wang, Chuhao Jin, Ruihua Song:
ScaMo: Towards Text to Video Storyboard Generation Using Scale and Movement of Shots. 115:1-115:8
- Guohuan Gao, Gang Zhang, Xiangyang Xu:
ADP3D: Adaptive Point Selection for Efficient Multi-frame 3D Object Detection. 116:1-116:7
- Shanshan Yao, Tian Li:
Multi-domain Acoustic Feature Fusion for Speaker Recognition. 117:1-117:6
- Zhitong Zhu, Jing Yu, Keke Gai, Jiamin Zhuang, Gaopeng Gou, Gang Xiong:
Flexible Semantic Watermarking for Robust Diffusion Model Detection and Tracing. 118:1-118:7
- Shan Wan, Wu Liu, Yijun Liu, Feiniu Yuan, Chunli Meng:
Watermarking Vision-Language Models. 119:1
- Jinwei Li, Yongkang Cheng, Yonghe Zhang, Pengcheng Wang:
Hierarchical Part-Attention Networks for 3D Human Reconstruction. 120:1
Short Papers and Demo Papers
- Hairui Yang, Ning Wang, Zhihui Wang, Lei Wang:
Sketch-based 3D Model Retrieval with Cross-Modal Representation. 121:1-121:5
- Daidou Guo, Chuan Qin:
PCMark-NAS: Lightweight Print-Camera Resilient Watermarking Networks via Neural Architecture Search. 122:1-122:5
- Daidou Guo, Ching-Chun Chang, Cheng SenMao, Chuan Qin:
Highly Fault-Tolerant Discrete Lattice Information Coding Method for Screen-Shooting Scenarios. 123:1-123:5
- Yu Song, Xiaohui Yang, Rongping Huang, Haifeng Bai, Lili Yang:
CSCCap: Plugging Sparse Coding in Zero-Shot Image Captioning. 124:1-124:5
- Zuyi Pei, Baoli Sun, Zhihui Wang, Haojie Li:
Fine-grained Video Semantic Distillation for Video-Text Retrieval. 125:1-125:5
- Zihao Tang, Xinyi Wang, Mariano Cabezas, Arkiev D'Souza, Michael Barnett, Fernando Calamante, Weidong Cai, Chenyu Wang:
Fibre Population-guided Pre-training for 3D Spatial Super-Resolution on Multimodal Brain Diffusion MR Imaging. 126:1
- Mingzhe Zhang, Laura J. Ferris, Lin Yue, Miao Xu:
Emotionally Guided Symbolic Music Generation Using Diffusion Models: The AGE-DM Approach. 127:1-127:5
- Wei-Lun Huang, Shao-Hung Wu, Hung-Chang Huang, Min-Chun Hu, Tse-Yu Pan:
Description-Driven Audiovisual Embedding Space Learning for Enhanced Movie Understanding. 128:1-128:5
- Chenxi Niu, Ziyu Liu, Xiangjian He:
SS-FS CSA: Self-Supervised and Fully Supervised Integration for 3D Cerebrovascular Segmentation. 129:1-129:5
- Hao Zhang, Xingning Dong, Jinfei Gao, Liang Hao, Pei Shen, Tian Gan:
MBC-ATA: Maximum Binary Classification and Anchor-based Triplet Augmentation for Unbiased Scene Graph Generation. 130:1-130:5
- Tianqi Wei, Zhi Chen, Xin Yu:
Snap and Diagnose: An Advanced Multimodal Retrieval System for Identifying Plant Diseases in the Wild. 131:1-131:3















