MUWS 2025 - The 4th International Workshop on Multimodal Human Understanding for the Web and Social Media
The 4th International Workshop on Multimodal Human Understanding for the Web and Social Media (MUWS), co-located with ACM Multimedia (ACM MM 2025).
Program
Date: October 28, 2025, in Dublin, Ireland
Timezone: Irish Standard Time
Room: Hyatt, Dean Swift 2
09:00 – 09:30 - Chairs’ Welcome
09:30 – 10:30 - Keynote Talk by Beate Schirrmacher “Factuality, Storytelling and the Quest for Authenticity in Journalism”
Abstract: In discussions of mis- and disinformation, facts are often pitted against narratives. However, even reliable and relevant information relies on narration. For instance, journalism does not simply report the facts; it constructs multimodal, transmedial, and serial narratives to explain recent events and engage audiences. How does the increased focus on emotionally charged storytelling impact the facts and information conveyed? This presentation examines the interplay between factual narration and storytelling in journalism, exploring how facts and stories inform one another. Drawing on intermedial theory, this kind of analysis explores truthfulness in communication as an interplay of perceived traces of events and communicational coherence (Elleström 2018). The analysis reveals how a set of shared basic features shapes journalistic news narration, but it also matters how these features are assembled, as the specific use of narrative and rhetorical patterns can either support or undermine factuality. For instance, in the fabricated feature stories of the former German reporter Claas Relotius, narrative coherence and claims of authenticity replace specificity. Furthermore, examples of audiovisual journalism highlight how the increased focus on letting images tell the stories, and on claims to authenticity, transforms journalistic narration. Rather than opposing facts and fiction, this approach offers a more nuanced understanding of the narrative challenges of news in a rapidly changing and diverse media landscape.
10:30 – 11:00 - Coffee Break
Session 1: Multimodal Understanding Through Impactful World Events
11:00 – 11:20 - “Analyzing Emotional Discourse in Multilingual Social Media: A Case Study of the Russo-Ukrainian Conflict” by Ahmed Taiye Mohammed et al.
11:20 – 11:35 - “Multilingual Evaluation of Image-Text Retrieval in Vision-Language Models: A Metric-Based Perspective” by Bodhisatta Maiti
11:35 – 12:30 - Keynote Talk by Ralph Ewerth “The Challenge of Multimodality in Analyzing Social Media News and Videos”
Abstract: Harmful content on the Web is frequently conveyed in a multimodal manner, i.e., by combining different “forms of expression” (modalities). This poses a major challenge for computational analysis methods, as the modalities can refer to each other and modify the overall meaning of the multimodal information, sometimes even changing the meaning completely (also called “meaning multiplication” in the literature). However, the complex interplay of information from different modalities is typically not explicitly modelled in state-of-the-art approaches for multimodal information analysis. In this talk, we present our approach to overcoming this drawback by introducing a computational model for the complex interplay of modalities (here text and image) and several approaches for multimodal news analysis. Finally, we outline challenges in the quite different domain of educational videos, in particular with regard to video-based learning.

12:30 – 13:30 - Lunch Break
Session 2: Human-Centred Multimodal Understanding
13:30 – 13:50 - “Analyzing the Visual Variety of Adjectives based on Clustering of Visual Features” by Yui Tanaka et al.
13:50 – 14:10 - “Revealing Label Noise in Multimodal Hateful Video Classification” by Shuonan Yang et al.
14:10 – 14:25 - “Video Analysis of Confusion and Understanding in Dyadic Explanations” by Jonas Paletschek et al.
14:25 – 14:45 - “Landmark Guided Visual Feature Extractor for Visual Speech Recognition with Limited Resources” by Lei Yang et al.
14:45 - Closing ceremony
Organizing Committee
- Sherzod Hakimov
- Marc A. Kastner
- David Semedo
- Eric Müller-Budack
- Takahiro Komamizu