


26th ICMI 2024: San Jose, Costa Rica
- Hayley Hung, Catharine Oertel, Mohammad Soleymani, Theodora Chaspari, Hamdi Dibeklioglu, Jainendra Shukla, Khiet P. Truong:
Proceedings of the 26th International Conference on Multimodal Interaction, ICMI 2024, San Jose, Costa Rica, November 4-8, 2024. ACM 2024, ISBN 979-8-4007-0462-8
Keynotes
- Heloisa Candello:
A human-centered approach to design multimodal conversational systems. 1
- Yaser Sheikh:
3D Calling with Codec Avatars. 2
- Catherine Pelachaud:
Greta, what else? Our research towards building socially interactive agents. 3
Main Track - Long and Short Papers
- Kelsey Turbeville, Jennarong Muengtaweepongsa, Samuel Stevens, Jason Moss, Amy Pon, Kyra Lee, Charu Mehra, Jenny Gutierrez Villalobos, Ranjitha Kumar:
LLM-powered Multimodal Insight Summarization for UX Testing. 4-11
- Nikola Kovacevic, Christian Holz, Markus Gross, Rafael Wampfler:
On Multimodal Emotion Recognition for Human-Chatbot Interaction in the Wild. 12-21
- Debasmita Ghose, Oz Gitelson, Brian Scassellati:
Integrating Multimodal Affective Signals for Stress Detection from Audio-Visual Data. 22-32
- Shu Zhong, Elia Gatti, Youngjun Cho, Marianna Obrist:
Feeling Textiles through AI: An exploration into Multimodal Language Models and Human Perception Alignment. 33-37
- Metehan Doyran, Albert Ali Salah, Ronald Poppe:
Decoding Contact: Automatic Estimation of Contact Signatures in Parent-Infant Free Play Interactions. 38-46
- Christopher Dawes, Jing Xue, Giada Brianza, Patricia Ivette Cornelio Martinez, Roberto A. Montano Murillo, Emanuela Maggioni, Marianna Obrist:
ScentHaptics: Augmenting the Haptic Experiences of Digital Mid-Air Textiles with Scent. 47-56
- Meng-Chen Lee, Zhigang Deng:
Online Multimodal End-of-Turn Prediction for Three-party Conversations. 57-65
- Muneeb Imtiaz Ahmad, Abdullah Alzahrani, Sunbul M. Ahmad:
Detecting Deception in Natural Environments Using Incremental Transfer Learning. 66-75
- Florian Mathis, Brad A. Myers, Ben Lafreniere, Michael Glueck, David P. S. Marques:
MR-Driven Near-Future Realities: Previewing Everyday Life Real-World Experiences Using Mixed Reality. 76-85
- Ayane Tashiro, Mai Imamura, Shiro Kumano, Kazuhiro Otsuka:
Exploring Interlocutor Gaze Interactions in Conversations based on Functional Spectrum Analysis. 86-94
- Matilda Knierim, Sahil Jain, Murat Han Aydogan, Kenneth Mitra, Kush Desai, Akanksha Saran, Kim Baraka:
Leveraging Prosody as an Informative Teaching Signal for Agent Learning: Exploratory Studies and Algorithmic Implications. 95-123
- Tiantian Feng, Daniel Yang, Digbalay Bose, Shrikanth Narayanan:
Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing? 124-133
- Xuenan Li, Zhaoyang Xu:
The Impact of Auditory Warning Types and Emergency Obstacle Avoidance Takeover Scenarios on Takeover Behavior. 134-143
- Émilie Fabre, Katie Seaborn, Adrien Verhulst, Yuta Itoh, Jun Rekimoto:
Juicy Text: Onomatopoeia and Semantic Text Effects for Juicy Player Experiences. 144-153
- Isna Alfi Bustoni, Mark McGill, Stephen Anthony Brewster:
Exploring the Alteration and Masking of Everyday Noise Sounds using Auditory Augmented Reality. 154-163
- Micol Spitale, Fabio Catania, Francesca Panzeri:
Understanding Non-Verbal Irony Markers: Machine Learning Insights Versus Human Judgment. 164-172
- Wentao Yu, Dorothea Kolossa, Robert M. Nickel:
Generalization Boost in Bimodal Classification via Data Fusion Trained on Sparse Datasets. 173-181
- Steeven Villa, Yannick Weiss, Mei-Yi Lu, Moritz Ziarko, Albrecht Schmidt, Jasmin Niess:
Envisioning Futures: How the Modality of AI Recommendations Impacts Conversation Flow in AR-enhanced Dialogue. 182-193
- Maia Stiber, Dan Bohus, Sean Andrist:
"Uh, This One?": Leveraging Behavioral Signals for Detecting Confusion during Physical Tasks. 194-203
- Debaditya Shome, Nasim Montazeri Ghahjaverestan, Ali Etemad:
NapTune: Efficient Model Tuning for Mood Classification using Previous Night's Sleep Measures along with Wearable Time-series. 204-213
- Tanmay Srivastava, R. Michael Winters, Thomas M. Gable, Yu-Te Wang, Teresa LaScala, Ivan J. Tashev:
Whispering Wearables: Multimodal Approach to Silent Speech Recognition with Head-Worn Devices. 214-223
- Marius Funk, Shogo Okada, Elisabeth André:
Multilingual Dyadic Interaction Corpus NoXi+J: Toward Understanding Asian-European Non-verbal Cultural Characteristics and their Influences on Engagement. 224-233
- Qijun Cao, Junqi Zhang, Shengtao Fan, Jiaqi Rong, Menghao Qi, Zhuowen Duan, Peikun Zhao, Ling Liu, Zihao Zhou, Wenjie Chen:
NearFetch: Enhancing Touch-Based Mobile Interaction on Public Displays with an Embedded Programmable NFC Array. 234-243
- Babette Bühler, Efe Bozkir, Hannah Deininger, Patricia Goldberg, Peter Gerjets, Ulrich Trautwein, Enkelejda Kasneci:
Detecting Aware and Unaware Mind Wandering During Lecture Viewing: A Multimodal Machine Learning Approach Using Eye Tracking, Facial Videos and Physiological Data. 244-253
- Shaid Hasan, Mohammad Samin Yasar, Tariq Iqbal:
M2RL: A Multimodal Multi-Interface Dataset for Robot Learning from Human Demonstrations. 254-263
- Rui Zhang, Yixuan Li, Zihuang Wu, Yong Zhang, Jie Zhao, Yang Jiao:
SemanticTap: A Haptic Toolkit for Vibration Semantic Design of Smartphone. 264-273
- Esam Ghaleb, Bulat Khaertdinov, Wim T. J. L. Pouw, Marlou Rasenberg, Judith Holler, Asli Özyürek, Raquel Fernández:
Learning Co-Speech Gesture Representations in Dialogue through Contrastive Learning: An Intrinsic Evaluation. 274-283
- Alice Delbosc, Magalie Ochs, Nicolas Sabouret, Brian Ravenet, Stéphane Ayache:
Mitigation of gender bias in automatic facial non-verbal behaviors generation. 284-292
- Mehmet Akhoroz, Caglar Yildirim:
Poke Typing: Effects of Hand-Tracking Input and Key Representation on Mid-Air Text Entry Performance in Virtual Reality. 293-301
- Raquel Yupanqui, John Sohn, Yoojun Kim, Raquel Flores, Hanwool Lee, Jinwoo Kim, SangHyun Lee, Youngjib Ham, Chanam Lee, Theodora Chaspari:
A multimodal analysis of environmental stress experienced by older adults during outdoor walking trips: Implications for designing new intelligent technologies to enhance walkability in low-income Latino communities. 302-311
- Areej Buker, Alessandro Vinciarelli:
Emotion Recognition for Multimodal Recognition of Attachment in School-Age Children. 312-320
- Eva Fringi, Nesreen Alshubaily, Lorenzo Picinali, Stephen Anthony Brewster, Tanaya Guha, Alessandro Vinciarelli:
Is Distance a Modality? Multi-Label Learning for Speech-Based Joint Prediction of Attributed Traits and Perceived Distances in 3D Audio Immersive Environments. 321-330
- Daniel Alvarado-Chou, Yuen C. Law:
The Plausibility Paradox on Interactions with Complex Virtual Objects in Virtual Environments. 331-338
- Torsten Wörtwein, Nicholas B. Allen, Jeffrey F. Cohn, Louis-Philippe Morency:
SMURF: Statistical Modality Uniqueness and Redundancy Factorization. 339-349
- Muhittin Gokmen, Evangelos Sariyanidi, Lisa Yankowitz, Casey J. Zampella, Robert T. Schultz, Birkan Tunç:
Detecting Autism from Head Movements using Kinesics. 350-354
- Sixia Li, Kazumi Kumagai, Mihoko Otake-Matsuura, Shogo Okada:
Automatic mild cognitive impairment estimation from the group conversation of coimagination method. 355-360
- Zakariae Belmekki, David Antonio Gómez Jáuregui, Patrick Reuter, Jun Li, Jean-Claude Martin, Karl Jenkins, Nadine Couture:
Generating Facial Expression Sequences of Complex Emotions with Generative Adversarial Networks. 361-372
- Meisam Booshehri, Hendrik Buschmeier, Philipp Cimiano:
A Model of Factors Contributing to the Success of Dialogical Explanations. 373-381
- Divij Gupta, Anubhav Bhatti, Suraj Parmar, Chen Dan, Yuwei Liu, Bingjie Shen, San Lee:
Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting. 382-386
- Chenyao Diao, Stephanie Arévalo Arboleda, Alexander Raake:
Nonverbal Dynamics in Dyadic Videoconferencing Interaction: The Role of Video Resolution and Conversational Quality. 387-396
- Ehsanul Haque Nirjhar, Winfred Arthur, Theodora Chaspari:
Perception of Stress: A Comparative Multimodal Analysis of Time-Continuous Stress Ratings from Self and Observers. 397-406
- Megan Caruso, Rosy Southwell, Leanne M. Hirshfield, Sidney D'Mello:
Putting the "Brain" Back in the Eye-Mind Link: Aligning Eye Movements and Brain Activations During Naturalistic Reading. 407-417
- Abdulrahman Mohamed Selim, Omair Shahzad Bhatti, Michael Barz, Daniel Sonntag:
Perceived Text Relevance Estimation Using Scanpaths and GNNs. 418-427
- Daksitha Senel Withanage Don, Dominik Schiller, Tobias Hallmen, Silvan Mertes, Tobias Baur, Florian Lingenfelser, Mitho Müller, Lea Kaubisch, Corinna Reck, Elisabeth André:
Towards Automated Annotation of Infant-Caregiver Engagement Phases with Multimodal Foundation Models. 428-438
- Hiromu Otsubo, Alexander Marquardt, Melissa Steininger, Marvin Lehnort, Felix Dollack, Yutaro Hirao, Monica Perusquía-Hernández, Hideaki Uchiyama, Ernst Kruijff, Bernhard E. Riecke, Kiyoshi Kiyokawa:
First-Person Perspective Induces Stronger Feelings of Awe and Presence Compared to Third-Person Perspective in Virtual Reality. 439-448
- Olcay Türk, Stefan Lazarov, Yu Wang, Hendrik Buschmeier, Angela Grimminger, Petra Wagner:
Predictability of Understanding in Explanatory Interactions Based on Multimodal Cues. 449-458
- Mansi Sharma, Camilo Andrés Martínez Martínez, Benedikt Emanuel Wirth, Antonio Krüger, Philipp Müller:
Distinguishing Target and Non-Target Fixations with EEG and Eye Tracking in Realistic Visual Scenes. 459-468
- André Pereira, Lubos Marcinek, Jura Miniota, Sofia Thunberg, Erik Lagerstedt, Joakim Gustafson, Gabriel Skantze, Bahar Irfan:
Multimodal User Enjoyment Detection in Human-Robot Conversation: The Power of Large Language Models. 469-478
- Karen Rosero, Ali N. Salman, Rami R. Hallac, Carlos Busso:
Lip Abnormality Detection for Patients with Repaired Cleft Lip and Palate: A Lip Normalization Approach. 479-487
- Ali N. Salman, Ning Wang, Luz Martinez-Lucas, Andrea Vidal, Carlos Busso:
MSP-GEO Corpus: A Multimodal Database for Understanding Video-Learning Experience. 488-497
- Yash Prakash, Akshay Kolgar Nayak, Shoaib Mohammed Alyaan, Pathan Aseef Khan, Hae Na Lee, Vikas Ashok:
Improving Usability of Data Charts in Multimodal Documents for Low Vision Users. 498-507
- Pooja Prajod, Bhargavi Mahesh, Elisabeth André:
Stressor Type Matters! - Exploring Factors Influencing Cross-Dataset Generalizability of Physiological Stress Detection. 508-517
- Anubhav, Kantaro Fujiwara:
Across Trials vs Subjects vs Contexts: A Multi-Reservoir Computing Approach for EEG Variations in Emotion Recognition. 518-525
- Fatema Hasan, Yulong Li, James R. Foulds, Shimei Pan, Bishwaranjan Bhattacharjee:
DoubleDistillation: Enhancing LLMs for Informal Text Analysis using Multistage Knowledge Distillation from Speech and Text. 526-535
- Sydney Thompson, Alexander Lew, Yifan Li, Elizabeth Stanish, Alex Huang, Rohan Phanse, Marynel Vázquez:
Predicting Human Intent to Interact with a Public Robot: The People Approaching Robots Database (PAR-D). 536-545
- Maksim Siniukov, Yufeng Yin, Eli Fast, Yingshan Qi, Aarav Monga, Audrey Kim, Mohammad Soleymani:
SEMPI: A Database for Understanding Social Engagement in Video-Mediated Multiparty Interaction. 546-555
- Kalin Stefanov, Yukiko I. Nakano, Chisa Kobayashi, Ibuki Hoshina, Tatsuya Sakato, Fumio Nihei, Chihiro Takayama, Ryo Ishii, Masatsugu Tsujii:
Participation Role-Driven Engagement Estimation of ASD Individuals in Neurodiverse Group Discussions. 556-564
- Hung Le, Sixia Li, Candy Olivia Mawalim, Hung-Hsuan Huang, Chee Wee Leong, Shogo Okada:
Do We Need To Watch It All? Efficient Job Interview Video Processing with Differentiable Masking. 565-574
- Ashwin T. S., Gautam Biswas:
Relating Students Cognitive Processes and Learner-Centered Emotions: An Advanced Deep Learning Approach. 575-584
Blue Sky papers
- Bhaktipriya Radharapu, Harish Krishna:
RealSeal: Revolutionizing Media Authentication with Real-Time Realism Scoring. 585-590
- Radu-Daniel Vatavu:
AI as Modality in Human Augmentation: Toward New Forms of Multimodal Interaction with AI-Embodied Modalities. 591-595
- Sachin Pathiyan Cherumanal, Ujwal Gadiraju, Damiano Spina:
Everything We Hear: Towards Tackling Misinformation in Podcasts. 596-601
Doctoral Consortium
- Paul Raingeard de la Bletiere:
A musical Robot for People with Dementia. 602-606
- Vasundhara Joshi:
Enhancing Collaboration and Performance among EMS Students through Multimodal Learning Analytics. 607-611
- Zonghuan Li:
Towards Automatic Social Involvement Estimation. 612-616
- Ernesto Rivera-Alvarado:
Video Game Technologies Applied for Teaching Assembly Language Programming. 617-621
- Ivan Kondyurin:
Modelling Social Intentions in Complex Conversational Settings. 622-626
- Abdullah Alzahrani, Muneeb Imtiaz Ahmad:
Real-Time Trust Measurement in Human-Robot Interaction: Insights from Physiological Behaviours. 627-631
- Megan Caruso:
A Multimodal Understanding of the Eye-Mind Link. 632-636
- Anubhav:
Investigating Multi-Reservoir Computing for EEG-based Emotion Recognition. 637-641
- Shu Zhong:
Design Digital Multisensory Textile Experiences. 642-646
- Jayneel Vora:
Towards Trustworthy and Efficient Diffusion Models. 647-651
Grand Challenges: ERR
- Micol Spitale, Maria Teresa Parreira, Maia Stiber, Minja Axelsson, Neval Kara, Garima Kankariya, Chien-Ming Huang, Malte F. Jung, Wendy Ju, Hatice Gunes:
ERR@HRI 2024 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Interactions. 652-656
- Lennart Wachowiak, Peter Tisnikar, Andrew Coles, Gerard Canal, Oya Çeliktutan:
A Time Series Classification Pipeline for Detecting Interaction Ruptures in HRI Based on User Reactions. 657-665
- Pradip Pramanick, Silvia Rossi:
PRISCA at ERR@HRI 2024: Multimodal Representation Learning for Detecting Interaction Ruptures in HRI. 666-670
- Ruben Janssens, Eva Verhelst, Mathieu De Coster:
Predicting Errors and Failures in Human-Robot Interaction from Multi-Modal Temporal Data. 671-676
Grand Challenges: EVAC
- Fabien Ringeval, Björn W. Schuller, Gérard Bailly, Safaa Azzakhnini, Hippolyte Fournier:
EVAC 2024 - Empathic Virtual Agent Challenge: Appraisal-based Recognition of Affective States. 677-683
- Thomas Thebaud, Anna Favaro, Yaohan Guan, Yuchen Yang, Prabhav Singh, Jesús Villalba, Laureano Moro-Velázquez, Najim Dehak:
Multimodal Emotion Recognition Harnessing the Complementarity of Speech, Language, and Vision. 684-689
Workshop Summaries
- Radoslaw Niewiadomski, Ferran Altarriba Bertran, Christopher Dawes, Marianna Obrist, Maurizio Mancini:
First Multimodal Banquet: Exploring Innovative Technology for Commensality and Human-Food Interaction (CoFI2024). 690-693
- Youngwoo Yoon, Taras Kucherenko, Alice Delbosc, Rajmund Nagy, Teodor Nikolov, Gustav Eje Henter:
GENEA Workshop 2024: The 5th Workshop on Generation and Evaluation of Non-verbal Behaviour for Embodied Agents. 694-695
- Michael Barz, Roman Bednarik, Andreas Bulling, Cristina Conati, Daniel Sonntag:
HumanEYEze 2024: Workshop on Eye Tracking for Multimodal Human-Centric Computing. 696-697
- Hendrik Buschmeier, Teena Hassan, Stefan Kopp:
Multimodal Co-Construction of Explanations with XAI Workshop. 698-699
