Your local AI workstation.
Discover, convert, run, chat with, and benchmark MLX & GGUF models — entirely on your own machine. Powered by TurboQuant KV cache compression for serious memory headroom.
macOS-first · experimental Linux & Windows builds
Everything you need, locally.
A single desktop app that replaces a stack of scripts, notebooks and CLIs.
Dashboard
Live memory, CPU, runtime status, and headroom hints for your target model.
Discover
Curated model families plus Hugging Face Hub search, filtered by capability.
My Models
Recursive scan of local model directories — MLX and GGUF, with rich metadata.
Chat
Threaded local chat with per-thread models, generation metrics and markdown.
Server
Expose any loaded model as an OpenAI-compatible localhost API for your tools.
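Because the server speaks the OpenAI chat-completions dialect, any OpenAI-compatible client can talk to it. A minimal sketch using only the standard library — the port (8080) and model id ("local-model") are assumptions; check the Settings tab for the values your install actually uses:

```python
import json
import urllib.request

# Assumed endpoint — the real port is configured in Settings.
URL = "http://localhost:8080/v1/chat/completions"

payload = {
    "model": "local-model",  # placeholder: use the id of whatever model is loaded
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
    "stream": False,
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With a model loaded and the server running, send it:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```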
Benchmarks
Compare cache profiles, throughput, quality and memory across configurations.
Conversion
Import and quantize models into MLX with size and throughput previews.
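For a sense of what the Conversion tab does under the hood, here is a rough command-line equivalent using mlx-lm's converter. This is a hedged sketch, not the app's internal pipeline — the repo id, bit width, and output path are all placeholders:

```shell
# Download a Hugging Face model and quantize it to 4-bit MLX weights.
# All values below are example placeholders — substitute your own.
python -m mlx_lm.convert \
  --hf-path mistralai/Mistral-7B-Instruct-v0.2 \
  -q --q-bits 4 \
  --mlx-path ./models/mistral-7b-4bit
```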
Logs
Searchable, filtered, multi-channel runtime visibility — chat, server, conversion.
Settings
Manage model directories, ports, LAN access and launch defaults.
1–4 bit KV cache compression for MLX.
ChaosEngineAI ships with TurboQuant — adaptive layer-wise cache compression with FP16 edge layers, randomized Hadamard rotations, optional fused attention, and Metal-tuned kernels. Run bigger contexts on smaller machines without giving up quality.
- Adaptive per-layer cache strategy
- 1, 2, 3 and 4-bit cache quantization
- Optional fused attention path
- Drop-in patches for mlx-lm
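To see why cache bit width dominates context headroom, here is the back-of-envelope sizing arithmetic. The model shape below (32 layers, 8 KV heads, head dimension 128) is an illustrative assumption, not a ChaosEngineAI default:

```python
def kv_cache_gib(context_len, bits, layers=32, kv_heads=8, head_dim=128):
    """Size of the K and V caches for one sequence, in GiB.

    Model shape defaults are illustrative assumptions only.
    """
    elems = 2 * layers * kv_heads * head_dim * context_len  # 2 = K + V
    return elems * bits / 8 / 2**30

fp16  = kv_cache_gib(32_768, bits=16)  # 4.0 GiB at full precision
four  = kv_cache_gib(32_768, bits=4)   # 1.0 GiB — 4x headroom
two   = kv_cache_gib(32_768, bits=2)   # 0.5 GiB — 8x headroom
```

For this example shape, a 32K context that needs 4 GiB of cache at FP16 fits in 1 GiB at 4-bit — which is the difference between fitting and not fitting on a 16 GB machine alongside the weights.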
Built for Apple Silicon. Open to more.
macOS
First-class. Native Tauri shell, MLX worker, llama.cpp engine, signed & notarized release flow.
Linux
Experimental. Backend & frontend run; GGUF via llama.cpp. MLX paths macOS-only.
Windows
Experimental. Backend & frontend run; GGUF via llama.cpp. MLX paths macOS-only.
Get started.
Clone, install, and launch the desktop shell.
# backend
python3 -m venv .venv && . .venv/bin/activate
pip install -e '.[desktop,mlx-lm]'
# desktop
cd desktop
npm install
npm run tauri:dev
Apache 2.0 licensed.