Date of Award

Fall 5-8-2025

Degree Type

Dissertation/Thesis

Degree Name

MASTER OF SCIENCE IN COMPUTER SCIENCE

Department

COLLEGE OF COMPUTING AND SOFTWARE ENGINEERING

Committee Chair/First Advisor

Md Abdullah Al Hafiz Khan

Second Advisor

Kazi Aminul Islam

Third Advisor

Sanghoon Lee

Abstract

Multimodal intent recognition, a key research topic in human-computer interaction, aims to build precise models of human intent by fusing heterogeneous data streams such as speech, text, gestures, and facial expressions. However, existing multimodal methods require complex feature-extraction and fusion strategies, and current approaches exhibit notable limitations in complex scenarios, including high computational cost in feature extraction and difficulty bridging the semantic gap between modalities. Furthermore, real-world multimodal datasets often exhibit class imbalance and long-tail distributions, which further exacerbate the learning challenge. To address these challenges, we introduce MMIU, a unified framework for in-distribution (ID) classification and out-of-distribution (OOD) detection that synthesizes pseudo-OOD examples by convexly mixing in-distribution data and then learns multimodal representations at two levels. At the coarse level, it enforces a binary separation between ID and OOD; at the fine level, it refines ID-class boundaries by assigning confidence scores that reflect each sample's difficulty and by applying instance-level contrastive learning to pull similar examples together and push dissimilar ones apart. A human-in-the-loop active-learning module further allows experts to label challenging unlabeled samples during training, triggering iterative retraining and yielding more accurate, robust models. This research evaluates the feasibility and effectiveness of the proposed algorithm, reveals the technical challenges and theoretical significance of this research area, and offers new research paradigms and methodological insights for the academic community.
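
The pseudo-OOD synthesis step described above follows the general mixup-style idea of convexly combining in-distribution samples; the abstract does not give MMIU's exact formulation, so the following is a minimal sketch under that assumption, with the function name, Beta-distribution parameter, and tensor shapes all hypothetical.

    import torch

    def synthesize_pseudo_ood(x_a: torch.Tensor, x_b: torch.Tensor,
                              alpha: float = 2.0) -> torch.Tensor:
        """Convexly mix two batches of ID features (shape: batch x dim).

        Hypothetical sketch: x_a and x_b are feature batches drawn from
        different ID classes; mixing weights sampled near 0.5 place the
        result between class manifolds, serving as pseudo-OOD signal.
        """
        # One mixing coefficient per sample, broadcast across feature dim.
        lam = torch.distributions.Beta(alpha, alpha).sample((x_a.size(0), 1))
        return lam * x_a + (1.0 - lam) * x_b

In practice such pseudo-OOD samples would supply the negative class for the coarse-level binary ID/OOD separation the abstract describes, without requiring any real out-of-distribution data at training time.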

Available for download on Thursday, May 04, 2028
