Beyond Text: Integration of LLMs into Virtual Environments
Semester programme: Master of Applied IT
Client company: Interaction Design Research Group Fontys
Samuel Slavik
Marc van Grootel
Project description
Our study investigates how to integrate LLM agents into immersive environments using primarily textual data and descriptions. We developed a proof of concept (PoC) that allows different LLM models (e.g., ChatGPT or Llama) to connect to a 3D environment in Unreal Engine and call in-game tools that interact directly with the scene. Without any fine-tuning, our solution enables human-AI collaboration in digital worlds.
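This summary does not specify the PoC's implementation, but the tool-calling pattern it describes can be sketched roughly as follows, assuming the OpenAI Python SDK. The tool name move_object, its parameters, and the Unreal-side dispatch are hypothetical placeholders, not the project's actual interface.

```python
# Minimal sketch of an LLM-to-engine tool-calling loop (assumes the OpenAI
# Python SDK). The tool "move_object" and the dispatch step are hypothetical;
# the PoC's real tool set and transport are not described in this summary.
import json
from openai import OpenAI

client = OpenAI()

# Describe an in-game tool to the model as a JSON schema.
tools = [{
    "type": "function",
    "function": {
        "name": "move_object",
        "description": "Move a named object in the Unreal Engine scene.",
        "parameters": {
            "type": "object",
            "properties": {
                "object_label": {"type": "string"},
                "x": {"type": "number"},
                "y": {"type": "number"},
                "z": {"type": "number"},
            },
            "required": ["object_label", "x", "y", "z"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "You control objects in a 3D scene described in text."},
        {"role": "user",
         "content": "Place the red cube on the table at (120, 40, 90)."},
    ],
    tools=tools,
)

# Forward any tool calls to the engine (e.g., over a local socket to Unreal).
for call in response.choices[0].message.tool_calls or []:
    args = json.loads(call.function.arguments)
    print(f"Engine would execute: {call.function.name}({args})")
```

In this pattern the engine, not the model, executes the action: the LLM only emits a structured tool call, which a bridge process forwards to Unreal Engine.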
Context
The rapid advancement of large language models in recent years has sparked a surge of interest in applying them to embodied AI within 3D simulations.
While many existing simulations focus on training embodied agents through reinforcement learning and sensor-based control, they often lack the flexibility, reasoning, and language understanding that LLMs can provide.
Bridging this gap requires new approaches that integrate symbolic reasoning with physical embodiment, enabling more natural and adaptable human-AI interactions.
Results
The findings confirm that large language models have strong potential to serve as controllers for interactive digital environments, but also reveal clear limitations that need to be addressed.
For example:
Ambiguous language exposes a shared weakness: because neither agent "sees" the scene, both rely entirely on text labels. Prompts with relative directions ("move to the left") do not translate into exact locations, leading to ineffective tool calls and a decrease in accuracy, as illustrated in the sketch below.
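To illustrate this failure mode, consider a scene serialized as labels and coordinates (the format below is invented for illustration, not the PoC's actual encoding). Without a camera position or reference frame, "left" has no grounded meaning for the model.

```python
# Illustrative scene serialization (not the PoC's actual format).
# The model sees only labels and coordinates, so a relative direction
# cannot be resolved into a target position.
scene = [
    {"label": "red_cube", "position": [120, 40, 90]},
    {"label": "table",    "position": [100, 40, 0]},
]

prompt = "Move the red cube to the left."
# From this text alone the model cannot resolve "left": left of what,
# and from whose viewpoint? A grounded prompt such as
# "Move red_cube to (80, 40, 90)" maps cleanly onto a tool call.
```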