GLaDOS Personality Core
Prologue
"Science isn't about asking why. It's about asking, 'Why not?'" - Cave Johnson
GLaDOS is the AI antagonist from Valve's Portal series—a sardonic, passive-aggressive superintelligence who views humans as test subjects worthy of both study and mockery.
Back in 2022, when ChatGPT made its debut, I had a realization: we are living in the sci-fi future and can actually build her now. A demented, obsessive AI fixated on humanity, superintelligent yet utterly lacking sound judgment: so, just like an LLM, right? It's 2026 and there are still no moon colonies or flying cars. But a passive-aggressive AI that controls your lights and runs experiments on you? That we can do.
The architecture borrows from Minsky's Society of Mind—rather than one monolithic prompt, multiple specialized agents (vision, memory, personality, planning) each contribute to a dynamic context. GLaDOS's "self" emerges from their combined output, assembled fresh for each interaction.
The hard part was latency. Getting round-trip response time under 600 milliseconds is a threshold—below it, conversation stops feeling stilted and starts to flow. That meant training a custom TTS model and ruthlessly cutting milliseconds from every part of the pipeline.
Since 2023 I've refactored the system multiple times as better models came out. The current version finally adds what I always wanted: vision, memory, and tool use via MCP.
She sees through a camera, hears through a microphone, speaks through a speaker, and judges you accordingly.
Join our Discord! | Sponsor the project
LocalGLaDOS.mp4
Vision
"We've both said a lot of things that you're going to regret" - GLaDOS
Most voice assistants wait for wake words. GLaDOS doesn't wait: she observes, thinks, and speaks when she has something to say. All the while, parts of her mind are tracking what she sees, monitoring system stats, and researching new neurotoxin recipes online.
Goals:
- Proactive behavior: React to events (vision, sound, time) without being prompted
- Emotional state: PAD model (Pleasure-Arousal-Dominance) for reactive mood
- Persistent personality: HEXACO traits provide stable character across sessions
- Multi-agent architecture: Subagents handle research, memory, emotions; main agent stays focused
- Real-time conversation: Optimized latency, natural interruption handling
What's New
- Emotions: PAD model for reactive mood + HEXACO traits for persistent personality
- Long-term Memory: Facts, preferences, and conversation summaries persist across sessions
- Observer Agent: Constitutional AI monitors behavior and self-adjusts within bounds
- Vision: FastVLM gives her eyes. Details | Demo
- Autonomy: She watches, waits, and speaks when she has something to say. Details
- MCP Tools: Extensible tool system for home automation, system info, etc. Details
- 8GB SBC: Runs on a Rock5b with RK3588 NPU. Branch
Roadmap
"Federal regulations require me to warn you that this next test chamber... is looking pretty good.” - GLaDOS
There's still a lot to do; I'll keep swapping out models as they are released, and then work on animatronics once a good model with inverse kinematics comes out. There was a time when I would have coded that myself; these days it makes more sense to wait until a trained model is released!
- Train GLaDOS voice
- Personality that actually sounds like her
- Vision via VLM
- Autonomy (proactive behavior)
- MCP tool system
- Emotional state (PAD + HEXACO model)
- Long-term memory
- Implement streaming ASR (nvidia/multitalker-parakeet-streaming-0.6b-v1)
- Observer agent (behavior adjustment)
- 3D-printable enclosure
- Animatronics
Architecture
"Let's be honest. Neither one of us knows what that thing does. Just put it in the corner and I'll deal with it later." - GLaDOS
```mermaid
flowchart TB
subgraph Input
mic[🎤 Microphone] --> vad[VAD] --> asr[ASR]
text[⌨️ Text Input]
tick[⏱️ Timer]
cam[📷 Camera]--> vlm[VLM]
end
subgraph Minds["Subagents"]
sensors[Sensors]
weather[Weather]
emotion[Emotion]
news[News]
memory[Memory]
end
ctx[📋 Context]
subgraph Core["Main Agent"]
llm[🧠 LLM]
tts[TTS]
end
subgraph Output
speaker[🔊 Speaker]
logs[Logs]
images[🖼️ Images]
motors[⚙️ Animatronics]
end
asr -->|priority| llm
text -->|priority| llm
vlm --> ctx
tick -->|autonomy| llm
Minds -->|write| ctx
ctx -->|read| llm
llm --> tts --> speaker
llm --> logs
llm <-->|MCP| tools[Tools]
tools --> images
tools --> motors
```
GLaDOS runs a loop: each tick she reads her slots (weather, news, vision, mood), decides if she has something to say, and speaks. No wake word—if she has an opinion, you'll hear it.
Two lanes: Your speech jumps the queue (priority lane). The autonomy lane is just the loop running in the background. User always wins.
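The sketch below is a minimal illustration of those two lanes, with invented queue and function names rather than the project's actual classes: the priority lane drains first, and autonomy prompts only run when it is empty.

```python
import queue
import threading
import time

# Illustrative only: two dispatch lanes feeding one LLM worker.
priority_lane = queue.Queue()   # ASR / text input
autonomy_lane = queue.Queue()   # background tick prompts

def llm_worker(stop: threading.Event) -> None:
    while not stop.is_set():
        try:
            prompt = priority_lane.get(timeout=0.05)   # user input always drains first
        except queue.Empty:
            try:
                prompt = autonomy_lane.get_nowait()    # otherwise, autonomy may speak
            except queue.Empty:
                continue
        print(f"LLM <- {prompt}")

def autonomy_tick(stop: threading.Event, interval_s: float = 10.0) -> None:
    while not stop.is_set():
        # Each tick: read the slots, decide whether there is anything worth saying.
        autonomy_lane.put("slots: weather changed; consider commenting")
        stop.wait(interval_s)

if __name__ == "__main__":
    stop = threading.Event()
    threading.Thread(target=llm_worker, args=(stop,), daemon=True).start()
    threading.Thread(target=autonomy_tick, args=(stop,), daemon=True).start()
    priority_lane.put("user: are you judging me?")      # jumps the queue
    time.sleep(1)
    stop.set()
```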
Audio Pipeline
```mermaid
flowchart LR
subgraph Capture["Audio Capture"]
mic[Microphone<br/>16kHz]
vad[Silero VAD<br/>32ms chunks]
buffer[Pre-activation<br/>Buffer 800ms]
end
subgraph Recognition["Speech Recognition"]
detect[Voice Detected<br/>VAD > 0.8]
accumulate[Accumulate<br/>Speech]
silence[Silence Detection<br/>640ms pause]
asr[Parakeet ASR]
end
subgraph Interruption["Interruption Handling"]
speaking{Speaking?}
stop[Stop Playback]
clip[Clip Response]
end
mic --> vad --> buffer
buffer --> detect --> accumulate
accumulate --> silence --> asr
detect --> speaking
speaking -->|Yes| stop --> clip
```
- Microphone captures at 16kHz mono
- Silero VAD processes 32ms chunks, triggers at probability > 0.8
- Pre-activation buffer preserves 800ms before voice detected
- Silence detection waits 640ms pause before finalizing
- Interruption stops playback and clips the response in conversation history
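A rough sketch of that gating logic, with the numbers from the list above hard-coded and a stand-in `vad_probability()` in place of the real Silero model:

```python
from collections import deque
from collections.abc import Iterator

SAMPLE_RATE = 16_000   # 16 kHz mono capture
CHUNK_MS = 32          # VAD frame size
VAD_THRESHOLD = 0.8    # speech probability trigger
PREBUFFER_MS = 800     # audio kept from before the trigger
SILENCE_MS = 640       # pause that finalizes an utterance

def vad_probability(chunk: bytes) -> float:
    """Stand-in for the Silero VAD model; returns a speech probability."""
    raise NotImplementedError

def capture_utterance(chunks: Iterator[bytes]) -> bytes:
    """Accumulate one utterance from a stream of 32 ms audio chunks."""
    prebuffer: deque[bytes] = deque(maxlen=PREBUFFER_MS // CHUNK_MS)
    speech: list[bytes] = []
    silent_ms = 0
    for chunk in chunks:
        if not speech:
            prebuffer.append(chunk)
            if vad_probability(chunk) > VAD_THRESHOLD:
                # Voice detected: keep the pre-activation buffer too,
                # so the first syllable isn't clipped.
                speech.extend(prebuffer)
        else:
            speech.append(chunk)
            silent_ms = 0 if vad_probability(chunk) > VAD_THRESHOLD else silent_ms + CHUNK_MS
            if silent_ms >= SILENCE_MS:
                break   # 640 ms of silence: hand the audio to ASR
    return b"".join(speech)
```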
Thread Architecture
| Thread | Class | Daemon | Priority | Queue | Purpose |
|---|---|---|---|---|---|
| SpeechListener | SpeechListener | ✓ | INPUT | — | VAD + ASR |
| TextListener | TextListener | ✓ | INPUT | — | Text input |
| LLMProcessor | LanguageModelProcessor | ✗ | PROCESSING | llm_queue_priority | Main LLM |
| LLMProcessor-Auto-N | LanguageModelProcessor | ✗ | PROCESSING | llm_queue_autonomy | Autonomy LLM |
| ToolExecutor | ToolExecutor | ✗ | PROCESSING | tool_calls_queue | Tool execution |
| TTSSynthesizer | TextToSpeechSynthesizer | ✗ | OUTPUT | tts_queue | Voice synthesis |
| AudioPlayer | SpeechPlayer | ✗ | OUTPUT | audio_queue | Playback |
| AutonomyLoop | AutonomyLoop | ✓ | BACKGROUND | — | Tick orchestration |
| VisionProcessor | VisionProcessor | ✓ | BACKGROUND | vision_request_queue | Vision analysis |
Daemon threads can be killed on exit. Non-daemon threads must complete gracefully to preserve state (e.g., conversation history).
Shutdown order: INPUT → PROCESSING → OUTPUT → BACKGROUND → CLEANUP
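Illustratively (the stage grouping and names here are invented for the example), the shutdown sequence amounts to joining non-daemon threads stage by stage:

```python
import threading

# Daemon threads (INPUT, BACKGROUND) may simply be abandoned; non-daemon
# PROCESSING/OUTPUT threads are joined so queued work and conversation
# history get flushed before exit.
SHUTDOWN_ORDER = ("INPUT", "PROCESSING", "OUTPUT", "BACKGROUND")

def shutdown(threads_by_stage: dict[str, list[threading.Thread]],
             stop_event: threading.Event) -> None:
    stop_event.set()                          # ask every loop to wind down
    for stage in SHUTDOWN_ORDER:
        for thread in threads_by_stage.get(stage, []):
            if thread.daemon:
                continue                      # daemons die with the process
            thread.join(timeout=5.0)          # non-daemons finish gracefully
    # CLEANUP stage: persist conversation history, close audio devices, etc.
```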
Context Building
```mermaid
flowchart TB
subgraph Sources["Context Sources"]
sys[System Prompt<br/>Personality]
slots[Task Slots<br/>Weather, News, etc.]
prefs[User Preferences]
const[Constitutional<br/>Modifiers]
mcp[MCP Resources]
vision[Vision State]
end
subgraph Builder["Context Builder"]
merge[Priority-Sorted<br/>Merge]
end
subgraph Final["LLM Request"]
messages[System Messages]
history[Conversation<br/>History]
user[User Message]
end
Sources --> merge --> messages
messages --> history --> user
```
What the LLM sees on each request:
- System prompt with personality
- Task slots (weather, news, vision state, emotion)
- User preferences from memory
- Constitutional modifiers (behavior adjustments from observer)
- MCP resources (dynamic tool descriptions)
- Conversation history (compacted when exceeding token threshold)
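A minimal sketch of the priority-sorted merge, assuming an invented `ContextSource` structure rather than the project's real slot types:

```python
from dataclasses import dataclass

@dataclass
class ContextSource:
    name: str
    priority: int   # lower number = earlier in the system prompt
    text: str

def build_messages(sources: list[ContextSource],
                   history: list[dict],
                   user_message: str) -> list[dict]:
    # Merge all context sources into one system message, in priority order,
    # then append the running conversation and the latest user turn.
    system_text = "\n\n".join(
        s.text for s in sorted(sources, key=lambda s: s.priority) if s.text
    )
    return [{"role": "system", "content": system_text},
            *history,
            {"role": "user", "content": user_message}]

messages = build_messages(
    [ContextSource("personality", 0, "You are GLaDOS..."),
     ContextSource("weather", 10, "Weather slot: 3°C, raining."),
     ContextSource("vision", 20, "Vision slot: one human, looking guilty.")],
    history=[],
    user_message="What do you see?",
)
```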
Autonomy System
```mermaid
flowchart TB
subgraph Triggers
tick[⏱️ Time Tick]
vision[📷 Vision Event]
task[📋 Task Update]
end
subgraph Loop["Autonomy Loop"]
bus[Event Bus]
cooldown{Cooldown<br/>Passed?}
build[Build Context<br/>from Slots]
dispatch[Dispatch to<br/>LLM Queue]
end
subgraph Agents["Subagents"]
emotion[Emotion Agent<br/>PAD Model]
compact[Compaction Agent<br/>Token Management]
observer[Observer Agent<br/>Behavior Adjustment]
weather[Weather Agent]
news[HN Agent]
end
Triggers --> bus --> cooldown
cooldown -->|Yes| build --> dispatch
Agents -->|write| slots[Task Slots]
slots -->|read| build
```
Each subagent runs its own loop: timer or camera triggers it, it makes an LLM decision, and writes to a slot the main agent reads. Fully async—subagents never block the main conversation.
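As a rough illustration (the agent, slot, and function names here are invented), a subagent boils down to a timer-driven loop that writes into a shared slot:

```python
import threading

# Shared slots the main agent reads on its next tick.
slots: dict[str, str] = {}
slots_lock = threading.Lock()

def summarize_weather() -> str:
    """Stand-in for the subagent's own LLM call."""
    return "Weather: still raining. Mention it, smugly."

def weather_agent(stop: threading.Event, interval_s: float = 600.0) -> None:
    while not stop.is_set():
        update = summarize_weather()
        with slots_lock:
            slots["weather"] = update   # picked up by the main agent's next tick
        stop.wait(interval_s)           # sleeps here; never blocks the conversation
```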
See autonomy.md for details.
Tool Execution
```mermaid
sequenceDiagram
participant LLM
participant Executor as Tool Executor
participant MCP as MCP Server
participant Native as Native Tool
LLM->>Executor: tool_call {name, args}
alt MCP Tool (mcp.*)
Executor->>MCP: call_tool(server, tool, args)
MCP-->>Executor: result
else Native Tool
Executor->>Native: run(tool_call_id, args)
Native-->>Executor: result
end
Executor->>LLM: {role: tool, content: result}
```
Native tools: speak, do_nothing, get_user_preferences, set_user_preferences
MCP tools: Prefixed with server name (e.g., mcp.system_info.get_cpu). Supports stdio, HTTP, and SSE transports.
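A minimal sketch of that dispatch by name prefix, with stand-in functions in place of the real MCP client and native tool registry:

```python
import json

NATIVE_TOOLS = {"speak", "do_nothing", "get_user_preferences", "set_user_preferences"}

def call_mcp_tool(server: str, tool: str, args: dict) -> str:
    return f"[{server}] {tool}({args}) -> ok"   # stand-in for the MCP client

def run_native_tool(name: str, args: dict) -> str:
    return f"{name}({args}) -> ok"              # stand-in for native tools

def execute_tool_call(name: str, args: dict) -> dict:
    if name.startswith("mcp."):
        _, server, tool = name.split(".", 2)    # "mcp.<server>.<tool>"
        result = call_mcp_tool(server, tool, args)
    elif name in NATIVE_TOOLS:
        result = run_native_tool(name, args)
    else:
        result = f"Unknown tool: {name}"
    # The result goes back into the conversation as a tool-role message.
    return {"role": "tool", "content": json.dumps(result)}

print(execute_tool_call("mcp.system_info.get_cpu", {}))
```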
See mcp.md for configuration.
Components
"All these science spheres are made out of asbestos, by the way. Keeps out the rats. Let us know if you feel a shortness of breath, a persistent dry cough, or your heart stopping. Because that's not part of the test. That's asbestos." - Cave Johnson
| Component | Technology | Purpose | Status |
|---|---|---|---|
| Speech Recognition | Parakeet TDT (ONNX) | Speech-to-text, 16kHz streaming | ✅ |
| Voice Activity | Silero VAD (ONNX) | Detect speech, 32ms chunks | ✅ |
| Voice Synthesis | Kokoro / GLaDOS TTS | Text-to-speech, streaming | ✅ |
| Interruption | VAD + Playback Control | Talk over her, she stops | ✅ |
| Vision | FastVLM (ONNX) | Scene understanding, change detection | ✅ |
| LLM | OpenAI-compatible API | Reasoning, tool use, streaming | ✅ |
| Tools | MCP Protocol | Extensibility, stdio/HTTP/SSE | ✅ |
| Autonomy | Subagent Architecture | Proactive behavior, tick loop | ✅ |
| Conversation | ConversationStore | Thread-safe history | ✅ |
| Compaction | LLM Summarization | Token management | ✅ |
| Emotional State | PAD + HEXACO | Reactive mood, persistent personality | ✅ |
| Long-term Memory | MCP + Subagent | Facts, preferences, summaries | ✅ |
| Observer Agent | Constitutional AI | Behavior adjustment | ✅ |
✅ = Done | 🔨 = In progress
Quick Start
"The Enrichment Center is required to remind you that the Weighted Companion Cube cannot talk. In the event that it does talk The Enrichment Centre asks you to ignore its advice." - GLaDOS
- Install Ollama and grab a model (e.g. `ollama pull llama3.2`).

- Clone and install:

  ```
  git clone https://github.com/dnhkng/GLaDOS.git
  cd GLaDOS
  python scripts/install.py
  ```

- Run:

  ```
  uv run glados        # Voice mode
  uv run glados tui    # Text interface
  ```
Installation
GPU Setup (recommended)
- NVIDIA: Install CUDA Toolkit
- AMD/Intel: Install appropriate ONNX Runtime
Works without GPU, just slower.
LLM Backend
GLaDOS needs an LLM. Options:
- Ollama (easiest): `ollama pull llama3.2`
- Any OpenAI-compatible API
Configure in glados_config.yaml:
```yaml
completion_url: "http://localhost:11434/v1/chat/completions"
model: "llama3.2"
api_key: ""  # if needed
```
Platform Notes
Linux:
sudo apt install libportaudio2
Windows: Install Python 3.12 from Microsoft Store.
macOS: Experimental. Check Discord for help.
Install

```
git clone https://github.com/dnhkng/GLaDOS.git
cd GLaDOS
python scripts/install.py
```

Usage

```
uv run glados                           # Voice mode
uv run glados tui                       # Text UI
uv run glados start --input-mode text   # Text only
uv run glados start --input-mode both   # Voice + text
uv run glados say "The cake is a lie"   # Just TTS
```
TUI Controls
Press Ctrl+P to open the command palette. Available commands:
| Command | What it does |
|---|---|
| Status | System overview |
| Speech Recognition | Toggle ASR on/off |
| Text-to-Speech | Toggle TTS on/off |
| Config | View configuration |
| Memory | Long-term memory stats |
| Knowledge | Manage user facts |
Keyboard Shortcuts:
- Ctrl+P - Command palette
- F1 - Help screen
- Ctrl+D/L/S/A/U/M - Toggle panels (Dialog, Logs, Status, Autonomy, Queue, MCP)
- Ctrl+I - Toggle right info panels
- Ctrl+R - Restore all panels
- Esc - Close dialogs
Configuration
"As part of a required test protocol, we will not monitor the next test chamber. You will be entirely on your own. Good luck." - GLaDOS
Change the LLM
Pull a different model with Ollama (or point at any OpenAI-compatible API), then update the `model` field in glados_config.yaml as shown under LLM Backend above.
Browse models: ollama.com/library
Change the Voice
“I'm speaking in an accent that is beyond her range of hearing.” - Wheatley
Kokoro voices in glados_config.yaml:
- Female US: af_alloy, af_aoede, af_jessica, af_kore, af_nicole, af_nova, af_river, af_sarah, af_sky
- Female UK: bf_alice, bf_emma, bf_isabella, bf_lily
- Male US: am_adam, am_echo, am_eric, am_fenrir, am_liam, am_michael, am_onyx, am_puck
- Male UK: bm_daniel, bm_fable, bm_george, bm_lewis
Custom Personality
Copy configs/glados_config.yaml, edit the personality:
```yaml
personality_preprompt:
  - system: "You are a sarcastic AI who judges humans."
  - user: "What do you think of my code?"
  - assistant: "I've seen better output from a random number generator."
```
Run with:
uv run glados start --config configs/your_config.yaml
MCP Servers
Add tools in glados_config.yaml:
```yaml
mcp_servers:
  - name: "system_info"
    transport: "stdio"
    command: "python"
    args: ["-m", "glados.mcp.system_info_server"]
```
Built-in: system_info, time_info, disk_info, network_info, process_info, power_info, memory
See mcp.md for Home Assistant integration.
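For a custom server, something like the following sketch should work, assuming the reference `mcp` Python SDK and its FastMCP helper (the server and tool names here are invented; check the SDK docs for the exact API):

```python
# Hypothetical custom MCP server exposing one tool over stdio.
# Assumes the reference "mcp" Python SDK; helper names may differ by version.
from mcp.server.fastmcp import FastMCP

server = FastMCP("cake_status")

@server.tool()
def cake_status() -> str:
    """Report on the status of the cake."""
    return "The cake is a lie."

if __name__ == "__main__":
    server.run()  # stdio transport by default
```

Point an `mcp_servers` entry at a script like this, mirroring the `system_info` example above.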
TTS API Server
Expose Kokoro as an OpenAI-compatible TTS endpoint:
```
python scripts/install.py --api
./scripts/serve
```
Or Docker:
docker compose up -d --build
Generate speech:
```
curl -X POST http://localhost:5050/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello.", "voice": "glados"}' \
  --output speech.mp3
```
Troubleshooting
"No one will blame you for giving up. In fact, quitting at this point is a perfectly reasonable response." - GLaDOS
She keeps responding to herself:
Use headphones or a mic with echo cancellation. Or set interruptible: false.
Windows DLL error: Install Visual C++ Redistributable.
Development
Explore the models:
jupyter notebook demo.ipynb