Location
https://www.kennesaw.edu/ccse/events/computing-showcase/sp25-cday-program.php
Event Website
https://team-portal-plum.vercel.app/home
Document Type
Event
Start Date
April 15, 2025, 4:00 PM
Description
This project proposes “XR Agent”, a decoupled and efficient framework for developing AI-powered extended reality (XR) applications on head-mounted displays (HMDs). Leveraging multimodal artificial intelligence, including MediaPipe (Google’s open-source computer vision framework) for object segmentation, recognition, and pose estimation, multimodal large language models (MLLMs) such as Gemini, and Unity’s cross-platform XR development ecosystem, the framework aims to provide an extensible base system that enables rapid prototyping and deployment of intelligent XR applications. Currently deployed on the Meta Quest 3 platform, XR Agent explores novel human-computer interaction (HCI) paradigms, combining real-time sensor data processing, immersive visualization, and adaptive AI-driven logic. This work addresses the challenge of modularly integrating heterogeneous devices and AI models. The framework is also expected to prove valuable in use cases such as collaborative remote control, immersive training scenarios, and data collection for embodied AI.
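The description names a perception-plus-reasoning pipeline: MediaPipe for on-device computer vision, an MLLM such as Gemini for scene reasoning, and Unity for the XR front end. As a rough illustration of that pattern only, the Python sketch below runs MediaPipe pose estimation on a single camera frame and forwards the annotated frame to Gemini; it is not the project’s actual Unity/C# code, and the function name ask_agent, the model id, and the webcam stand-in for the HMD camera are all assumptions.

```python
# Hypothetical sketch of the perception -> MLLM loop the abstract describes.
# The real XR Agent framework runs inside Unity on the Meta Quest 3; this
# desktop sketch only illustrates the MediaPipe + Gemini pattern.
import cv2
import mediapipe as mp
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")           # placeholder credential
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model id
pose = mp.solutions.pose.Pose(static_image_mode=True)

def ask_agent(frame_bgr, question: str) -> str:
    """Estimate pose on one frame, then let the MLLM answer a question
    about the annotated scene (illustrative helper, not framework API)."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    result = pose.process(rgb)  # MediaPipe pose landmarks
    if result.pose_landmarks:
        mp.solutions.drawing_utils.draw_landmarks(
            rgb, result.pose_landmarks,
            mp.solutions.pose.POSE_CONNECTIONS)
    # Send the user's question plus the annotated frame to Gemini.
    reply = model.generate_content([question, Image.fromarray(rgb)])
    return reply.text

if __name__ == "__main__":
    # A webcam stands in for the HMD passthrough camera here.
    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    cap.release()
    if ok:
        print(ask_agent(frame, "Describe what the user is doing."))
```

In the deployed system, the same division of labor would apply: lightweight vision models handle real-time sensor data on the headset, while the MLLM supplies the adaptive, AI-driven logic.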
Included in
GRM-131 XR Agent (An MLLM-powered XR system)
https://digitalcommons.kennesaw.edu/cday/Spring_2025/Masters_Research/20