GRM-131 XR Agent (A MLLM powered XR system)

Yukang Shen

Location

https://www.kennesaw.edu/ccse/events/computing-showcase/sp25-cday-program.php

Event Website

https://team-portal-plum.vercel.app/home

Document Type

Event

Start Date

15-4-2025 4:00 PM

Description

This project proposes “XR Agent”, a decoupled and efficient framework for developing AI-powered extended reality (XR) applications on head-mounted displays (HMDs). Leveraging multimodal artificial intelligence, including MediaPipe (Google's open-source computer vision framework) for object segmentation, recognition, and pose estimation, and multimodal large language models (MLLMs) such as Gemini, together with Unity’s cross-platform XR development ecosystem, the framework aims to create an extensible base system that enables rapid prototyping and deployment of intelligent XR applications. Currently deployed on the Meta Quest 3 platform, XR Agent explores novel human-computer interaction (HCI) paradigms, combining real-time sensor data processing, immersive visualization, and adaptive AI-driven logic. This work addresses the challenges of modular integration across different kinds of devices and AI models. The framework is also expected to be valuable in use cases such as collaborative remote control, immersive training scenarios, and data collection for embodied AI.

Included in

Computer Sciences Commons

https://digitalcommons.kennesaw.edu/cday/Spring_2025/Masters_Research/20
