


26th ICMI 2024: San Jose, Costa Rica
- Hayley Hung, Catharine Oertel, Mohammad Soleymani, Theodora Chaspari, Hamdi Dibeklioglu, Jainendra Shukla, Khiet P. Truong:
Proceedings of the 26th International Conference on Multimodal Interaction, ICMI 2024, San Jose, Costa Rica, November 4-8, 2024. ACM 2024, ISBN 979-8-4007-0462-8
Keynotes
- Heloisa Candello:
A human-centered approach to design multimodal conversational systems. 1
- Yaser Sheikh:
3D Calling with Codec Avatars. 2
- Catherine Pelachaud:
Greta, what else? Our research towards building socially interactive agents. 3
Main Track - Long and Short Papers
- Kelsey Turbeville, Jennarong Muengtaweepongsa, Samuel Stevens, Jason Moss, Amy Pon, Kyra Lee, Charu Mehra, Jenny Gutierrez Villalobos, Ranjitha Kumar:
LLM-powered Multimodal Insight Summarization for UX Testing. 4-11
- Nikola Kovacevic, Christian Holz, Markus Gross, Rafael Wampfler:
On Multimodal Emotion Recognition for Human-Chatbot Interaction in the Wild. 12-21
- Debasmita Ghose, Oz Gitelson, Brian Scassellati:
Integrating Multimodal Affective Signals for Stress Detection from Audio-Visual Data. 22-32
- Shu Zhong, Elia Gatti, Youngjun Cho, Marianna Obrist:
Feeling Textiles through AI: An exploration into Multimodal Language Models and Human Perception Alignment. 33-37
- Metehan Doyran, Albert Ali Salah, Ronald Poppe:
Decoding Contact: Automatic Estimation of Contact Signatures in Parent-Infant Free Play Interactions. 38-46
- Christopher Dawes, Jing Xue, Giada Brianza, Patricia Ivette Cornelio Martinez, Roberto A. Montano Murillo, Emanuela Maggioni, Marianna Obrist:
ScentHaptics: Augmenting the Haptic Experiences of Digital Mid-Air Textiles with Scent. 47-56
- Meng Chen Lee, Zhigang Deng:
Online Multimodal End-of-Turn Prediction for Three-party Conversations. 57-65
- Muneeb Imtiaz Ahmad, Abdullah Alzahrani, Sunbul M. Ahmad:
Detecting Deception in Natural Environments Using Incremental Transfer Learning. 66-75
- Florian Mathis, Brad A. Myers, Ben Lafreniere, Michael Glueck, David P. S. Marques:
MR-Driven Near-Future Realities: Previewing Everyday Life Real-World Experiences Using Mixed Reality. 76-85
- Ayane Tashiro, Mai Imamura, Shiro Kumano, Kazuhiro Otsuka:
Exploring Interlocutor Gaze Interactions in Conversations based on Functional Spectrum Analysis. 86-94
- Matilda Knierim, Sahil Jain, Murat Han Aydogan, Kenneth Mitra, Kush Desai, Akanksha Saran, Kim Baraka:
Leveraging Prosody as an Informative Teaching Signal for Agent Learning: Exploratory Studies and Algorithmic Implications. 95-123
- Tiantian Feng, Daniel Yang, Digbalay Bose, Shrikanth Narayanan:
Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing? 124-133
- Xuenan Li, Zhaoyang Xu:
The Impact of Auditory Warning Types and Emergency Obstacle Avoidance Takeover Scenarios on Takeover Behavior. 134-143
- Émilie Fabre, Katie Seaborn, Adrien Verhulst, Yuta Itoh, Jun Rekimoto:
Juicy Text: Onomatopoeia and Semantic Text Effects for Juicy Player Experiences. 144-153
- Isna Alfi Bustoni, Mark McGill, Stephen Anthony Brewster:
Exploring the Alteration and Masking of Everyday Noise Sounds using Auditory Augmented Reality. 154-163
- Micol Spitale, Fabio Catania, Francesca Panzeri:
Understanding Non-Verbal Irony Markers: Machine Learning Insights Versus Human Judgment. 164-172
- Wentao Yu, Dorothea Kolossa, Robert M. Nickel:
Generalization Boost in Bimodal Classification via Data Fusion Trained on Sparse Datasets. 173-181
- Steeven Villa, Yannick Weiss, Mei-Yi Lu, Moritz Ziarko, Albrecht Schmidt, Jasmin Niess:
Envisioning Futures: How the Modality of AI Recommendations Impacts Conversation Flow in AR-enhanced Dialogue. 182-193
- Maia Stiber, Dan Bohus, Sean Andrist:
"Uh, This One?": Leveraging Behavioral Signals for Detecting Confusion during Physical Tasks. 194-203
- Debaditya Shome, Nasim Montazeri Ghahjaverestan, Ali Etemad:
NapTune: Efficient Model Tuning for Mood Classification using Previous Night's Sleep Measures along with Wearable Time-series. 204-213
- Tanmay Srivastava, R. Michael Winters, Thomas M. Gable, Yu-Te Wang, Teresa LaScala, Ivan J. Tashev:
Whispering Wearables: Multimodal Approach to Silent Speech Recognition with Head-Worn Devices. 214-223
- Marius Funk, Shogo Okada, Elisabeth André:
Multilingual Dyadic Interaction Corpus NoXi+J: Toward Understanding Asian-European Non-verbal Cultural Characteristics and their Influences on Engagement. 224-233
- Qijun Cao, Junqi Zhang, Shengtao Fan, Jiaqi Rong, Menghao Qi, Zhuowen Duan, Peikun Zhao, Ling Liu, Zihao Zhou, Wenjie Chen:
NearFetch: Enhancing Touch-Based Mobile Interaction on Public Displays with an Embedded Programmable NFC Array. 234-243
- Babette Bühler, Efe Bozkir, Hannah Deininger, Patricia Goldberg, Peter Gerjets, Ulrich Trautwein, Enkelejda Kasneci:
Detecting Aware and Unaware Mind Wandering During Lecture Viewing: A Multimodal Machine Learning Approach Using Eye Tracking, Facial Videos and Physiological Data. 244-253
- Shaid Hasan, Mohammad Samin Yasar, Tariq Iqbal:
M2RL: A Multimodal Multi-Interface Dataset for Robot Learning from Human Demonstrations. 254-263
- Rui Zhang, Yixuan Li, Zihuang Wu, Yong Zhang, Jie Zhao, Yang Jiao:
SemanticTap: A Haptic Toolkit for Vibration Semantic Design of Smartphone. 264-273
- Esam Ghaleb, Bulat Khaertdinov, Wim T. J. L. Pouw, Marlou Rasenberg, Judith Holler, Asli Özyürek, Raquel Fernández:
Learning Co-Speech Gesture Representations in Dialogue through Contrastive Learning: An Intrinsic Evaluation. 274-283
- Alice Delbosc, Magalie Ochs, Nicolas Sabouret, Brian Ravenet, Stéphane Ayache:
Mitigation of gender bias in automatic facial non-verbal behaviors generation. 284-292
- Mehmet Akhoroz, Caglar Yildirim:
Poke Typing: Effects of Hand-Tracking Input and Key Representation on Mid-Air Text Entry Performance in Virtual Reality. 293-301
- Raquel Yupanqui, John Sohn, Yoojun Kim, Raquel Flores, Hanwool Lee, Jinwoo Kim, SangHyun Lee, Youngjib Ham, Chanam Lee, Theodora Chaspari:
A multimodal analysis of environmental stress experienced by older adults during outdoor walking trips: Implications for designing new intelligent technologies to enhance walkability in low-income Latino communities. 302-311
- Areej Buker, Alessandro Vinciarelli:
Emotion Recognition for Multimodal Recognition of Attachment in School-Age Children. 312-320
- Eva Fringi, Nesreen Alshubaily, Lorenzo Picinali, Stephen Anthony Brewster, Tanaya Guha, Alessandro Vinciarelli:
Is Distance a Modality? Multi-Label Learning for Speech-Based Joint Prediction of Attributed Traits and Perceived Distances in 3D Audio Immersive Environments. 321-330
- Daniel Alvarado-Chou, Yuen C. Law:
The Plausibility Paradox on Interactions with Complex Virtual Objects in Virtual Environments. 331-338
- Torsten Wörtwein, Nicholas B. Allen, Jeffrey F. Cohn, Louis-Philippe Morency:
SMURF: Statistical Modality Uniqueness and Redundancy Factorization. 339-349
- Muhittin Gokmen, Evangelos Sariyanidi, Lisa Yankowitz, Casey J. Zampella, Robert T. Schultz, Birkan Tunç:
Detecting Autism from Head Movements using Kinesics. 350-354
- Sixia Li, Kazumi Kumagai, Mihoko Otake-Matsuura, Shogo Okada:
Automatic mild cognitive impairment estimation from the group conversation of coimagination method. 355-360
- Zakariae Belmekki, David Antonio Gómez Jáuregui, Patrick Reuter, Jun Li, Jean-Claude Martin, Karl Jenkins, Nadine Couture:
Generating Facial Expression Sequences of Complex Emotions with Generative Adversarial Networks. 361-372
- Meisam Booshehri, Hendrik Buschmeier, Philipp Cimiano:
A Model of Factors Contributing to the Success of Dialogical Explanations. 373-381
- Divij Gupta, Anubhav Bhatti, Suraj Parmar, Chen Dan, Yuwei Liu, Bingjie Shen, San Lee:
Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting. 382-386
- Chenyao Diao, Stephanie Arévalo Arboleda, Alexander Raake:
Nonverbal Dynamics in Dyadic Videoconferencing Interaction: The Role of Video Resolution and Conversational Quality. 387-396
- Ehsanul Haque Nirjhar, Winfred Arthur, Theodora Chaspari:
Perception of Stress: A Comparative Multimodal Analysis of Time-Continuous Stress Ratings from Self and Observers. 397-406
- Megan Caruso, Rosy Southwell, Leanne M. Hirshfield, Sidney D'Mello:
Putting the "Brain" Back in the Eye-Mind Link: Aligning Eye Movements and Brain Activations During Naturalistic Reading. 407-417
- Abdulrahman Mohamed Selim, Omair Shahzad Bhatti, Michael Barz, Daniel Sonntag:
Perceived Text Relevance Estimation Using Scanpaths and GNNs. 418-427
- Daksitha Senel Withanage Don, Dominik Schiller, Tobias Hallmen, Silvan Mertes, Tobias Baur, Florian Lingenfelser, Mitho Müller, Lea Kaubisch, Corinna Reck, Elisabeth André:
Towards Automated Annotation of Infant-Caregiver Engagement Phases with Multimodal Foundation Models. 428-438
- Hiromu Otsubo, Alexander Marquardt, Melissa Steininger, Marvin Lehnort, Felix Dollack, Yutaro Hirao, Monica Perusquía-Hernández, Hideaki Uchiyama, Ernst Kruijff, Bernhard E. Riecke, Kiyoshi Kiyokawa:
First-Person Perspective Induces Stronger Feelings of Awe and Presence Compared to Third-Person Perspective in Virtual Reality. 439-448
- Olcay Türk, Stefan Lazarov, Yu Wang, Hendrik Buschmeier, Angela Grimminger, Petra Wagner:
Predictability of Understanding in Explanatory Interactions Based on Multimodal Cues. 449-458
- Mansi Sharma, Camilo Andrés Martínez Martínez, Benedikt Emanuel Wirth, Antonio Krüger, Philipp Müller:
Distinguishing Target and Non-Target Fixations with EEG and Eye Tracking in Realistic Visual Scenes. 459-468
- André Pereira, Lubos Marcinek, Jura Miniota, Sofia Thunberg, Erik Lagerstedt, Joakim Gustafson, Gabriel Skantze, Bahar Irfan:
Multimodal User Enjoyment Detection in Human-Robot Conversation: The Power of Large Language Models. 469-478
- Karen Rosero, Ali N. Salman, Rami R. Hallac, Carlos Busso:
Lip Abnormality Detection for Patients with Repaired Cleft Lip and Palate: A Lip Normalization Approach. 479-487
- Ali N. Salman, Ning Wang, Luz Martinez-Lucas, Andrea Vidal, Carlos Busso:
MSP-GEO Corpus: A Multimodal Database for Understanding Video-Learning Experience. 488-497
- Yash Prakash, Akshay Kolgar Nayak, Shoaib Mohammed Alyaan, Pathan Aseef Khan, Hae Na Lee, Vikas Ashok:
Improving Usability of Data Charts in Multimodal Documents for Low Vision Users. 498-507
- Pooja Prajod, Bhargavi Mahesh, Elisabeth André:
Stressor Type Matters! - Exploring Factors Influencing Cross-Dataset Generalizability of Physiological Stress Detection. 508-517
- Anubhav, Kantaro Fujiwara:
Across Trials vs Subjects vs Contexts: A Multi-Reservoir Computing Approach for EEG Variations in Emotion Recognition. 518-525
- Fatema Hasan, Yulong Li, James R. Foulds, Shimei Pan, Bishwaranjan Bhattacharjee:
DoubleDistillation: Enhancing LLMs for Informal Text Analysis using Multistage Knowledge Distillation from Speech and Text. 526-535
- Sydney Thompson, Alexander Lew, Yifan Li, Elizabeth Stanish, Alex Huang, Rohan Phanse, Marynel Vázquez:
Predicting Human Intent to Interact with a Public Robot: The People Approaching Robots Database (PAR-D). 536-545
- Maksim Siniukov, Yufeng Yin, Eli Fast, Yingshan Qi, Aarav Monga, Audrey Kim, Mohammad Soleymani:
SEMPI: A Database for Understanding Social Engagement in Video-Mediated Multiparty Interaction. 546-555
- Kalin Stefanov, Yukiko I. Nakano, Chisa Kobayashi, Ibuki Hoshina, Tatsuya Sakato, Fumio Nihei, Chihiro Takayama, Ryo Ishii, Masatsugu Tsujii:
Participation Role-Driven Engagement Estimation of ASD Individuals in Neurodiverse Group Discussions. 556-564
- Hung Le, Sixia Li, Candy Olivia Mawalim, Hung-Hsuan Huang, Chee Wee Leong, Shogo Okada:
Do We Need To Watch It All? Efficient Job Interview Video Processing with Differentiable Masking. 565-574
- Ashwin T. S., Gautam Biswas:
Relating Students Cognitive Processes and Learner-Centered Emotions: An Advanced Deep Learning Approach. 575-584
Blue Sky papers
- Bhaktipriya Radharapu, Harish Krishna:
RealSeal: Revolutionizing Media Authentication with Real-Time Realism Scoring. 585-590
- Radu-Daniel Vatavu:
AI as Modality in Human Augmentation: Toward New Forms of Multimodal Interaction with AI-Embodied Modalities. 591-595
- Sachin Pathiyan Cherumanal, Ujwal Gadiraju, Damiano Spina:
Everything We Hear: Towards Tackling Misinformation in Podcasts. 596-601
Doctoral Consortium
- Paul Raingeard de la Bletiere:
A musical Robot for People with Dementia. 602-606
- Vasundhara Joshi:
Enhancing Collaboration and Performance among EMS Students through Multimodal Learning Analytics. 607-611
- Zonghuan Li:
Towards Automatic Social Involvement Estimation. 612-616
- Ernesto Rivera-Alvarado:
Video Game Technologies Applied for Teaching Assembly Language Programming. 617-621
- Ivan Kondyurin:
Modelling Social Intentions in Complex Conversational Settings. 622-626
- Abdullah Alzahrani, Muneeb Imtiaz Ahmad:
Real-Time Trust Measurement in Human-Robot Interaction: Insights from Physiological Behaviours. 627-631
- Megan Caruso:
A Multimodal Understanding of the Eye-Mind Link. 632-636
- Anubhav:
Investigating Multi-Reservoir Computing for EEG-based Emotion Recognition. 637-641
- Shu Zhong:
Design Digital Multisensory Textile Experiences. 642-646
- Jayneel Vora:
Towards Trustworthy and Efficient Diffusion Models. 647-651
Grand Challenges: ERR
- Micol Spitale, Maria Teresa Parreira, Maia Stiber, Minja Axelsson, Neval Kara, Garima Kankariya, Chien-Ming Huang, Malte F. Jung, Wendy Ju, Hatice Gunes:
ERR@HRI 2024 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Interactions. 652-656
- Lennart Wachowiak, Peter Tisnikar, Andrew Coles, Gerard Canal, Oya Çeliktutan:
A Time Series Classification Pipeline for Detecting Interaction Ruptures in HRI Based on User Reactions. 657-665
- Pradip Pramanick, Silvia Rossi:
PRISCA at ERR@HRI 2024: Multimodal Representation Learning for Detecting Interaction Ruptures in HRI. 666-670
- Ruben Janssens, Eva Verhelst, Mathieu De Coster:
Predicting Errors and Failures in Human-Robot Interaction from Multi-Modal Temporal Data. 671-676
Grand Challenges: EVAC
- Fabien Ringeval, Björn W. Schuller, Gérard Bailly, Safaa Azzakhnini, Hippolyte Fournier:
EVAC 2024 - Empathic Virtual Agent Challenge: Appraisal-based Recognition of Affective States. 677-683
- Thomas Thebaud, Anna Favaro, Yaohan Guan, Yuchen Yang, Prabhav Singh, Jesús Villalba, Laureano Moro-Velázquez, Najim Dehak:
Multimodal Emotion Recognition Harnessing the Complementarity of Speech, Language, and Vision. 684-689
Workshop Summaries
- Radoslaw Niewiadomski, Ferran Altarriba Bertran, Christopher Dawes, Marianna Obrist, Maurizio Mancini:
First Multimodal Banquet: Exploring Innovative Technology for Commensality and Human-Food Interaction (CoFI2024). 690-693
- Youngwoo Yoon, Taras Kucherenko, Alice Delbosc, Rajmund Nagy, Teodor Nikolov, Gustav Eje Henter:
GENEA Workshop 2024: The 5th Workshop on Generation and Evaluation of Non-verbal Behaviour for Embodied Agents. 694-695
- Michael Barz, Roman Bednarik, Andreas Bulling, Cristina Conati, Daniel Sonntag:
HumanEYEze 2024: Workshop on Eye Tracking for Multimodal Human-Centric Computing. 696-697
- Hendrik Buschmeier, Teena Hassan, Stefan Kopp:
Multimodal Co-Construction of Explanations with XAI Workshop. 698-699
