Architecture
MIRA is a FastAPI application with an event-driven architecture. PostgreSQL handles storage and vector search. Everything runs locally except the LLM calls.
System Requirements
| Resource | Requirement |
|---|---|
| RAM | 3GB total (embedding model and all services; excludes the LLM if one runs locally) |
| Disk | 10GB minimum |
| GPU | Not required (CPU-only PyTorch) |
| OS | Linux or macOS |
Core Stack
| Component | Purpose |
|---|---|
| PostgreSQL | Memory storage, vector search (pgvector) |
| Valkey | Redis-compatible caching |
| HashiCorp Vault | Secrets management |
| sentence-transformers | Local embeddings |
| spaCy | Entity extraction (NER) |
| APScheduler | Background job scheduling |
Model Downloads (One-Time)
- spaCy en_core_web_lg: ~800MB
- mdbr-leaf-ir-asym embedding model: ~300MB
- Playwright (optional): ~300MB
Provider Support
MIRA works with any OpenAI-compatible endpoint. Internally it follows Anthropic SDK conventions, but translation to other providers happens at the provider boundary, so there is no vendor lock-in.
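The translation layer can be pictured as a small adapter. This is an illustrative sketch, not MIRA's actual code: the function name and the flattening rules are assumptions, but the two message shapes (Anthropic's typed content blocks vs. OpenAI's flat `messages` list with a system role) follow the public APIs.

```python
# Hypothetical provider-translation sketch. Anthropic passes the system
# prompt as a separate argument and allows lists of typed content blocks;
# OpenAI-compatible endpoints expect a flat list with a "system" message.

def anthropic_to_openai(system: str, messages: list[dict]) -> list[dict]:
    out = [{"role": "system", "content": system}]
    for msg in messages:
        content = msg["content"]
        if isinstance(content, list):
            # Flatten Anthropic text blocks into a plain string.
            content = "".join(
                block["text"] for block in content if block.get("type") == "text"
            )
        out.append({"role": msg["role"], "content": content})
    return out
```

Because the translation lives at the boundary, the rest of the application can stay on one internal convention regardless of which endpoint is configured.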
Tested Models
- Claude Sonnet 4.5 (best results)
- Deepseek V3.2
- Qwen 3
- Ministral 3
- Acceptable results down to 4B-parameter models
What You Lose with Local Models
- Extended thinking disabled
- `cache_control` stripped
- Server-side code execution filtered out
- File uploads become text warnings
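The degradation pass above can be sketched as a filter over the outgoing request. This is an assumption about how such stripping might look, not MIRA's implementation; the field names (`thinking`, `cache_control`, `code_execution_tool_result`) mirror the Anthropic API.

```python
# Hypothetical sketch: remove Anthropic-only features before sending a
# request to a local / OpenAI-compatible model.

def strip_unsupported(params: dict) -> dict:
    params = dict(params)
    params.pop("thinking", None)  # extended thinking disabled
    for msg in params.get("messages", []):
        blocks = msg.get("content")
        if not isinstance(blocks, list):
            continue
        for block in blocks:
            block.pop("cache_control", None)  # prompt-caching hints stripped
        # Server-side code execution results filtered out of the history.
        msg["content"] = [
            b for b in blocks if b.get("type") != "code_execution_tool_result"
        ]
    return params
```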
Deployment
Deployment is a single cURL command; the deploy.sh script behind it is 2,000+ lines of production-grade automation.

```shell
curl -fsSL https://raw.githubusercontent.com/taylorsatula/mira-OSS/refs/heads/main/deploy.sh -o deploy.sh && chmod +x deploy.sh && ./deploy.sh
```
What the Script Handles
- Platform detection (Linux/macOS) with OS-specific service management
- Pre-flight validation: 10GB disk space, port availability (1993, 8200, 6379, 5432), existing installation detection
- Dependency installation with idempotency (skips what's already installed)
- Python venv creation and package installation
- Model downloads (~1.4GB total)
- HashiCorp Vault initialization: AppRole creation, policy setup, automatic unseal, credential storage
- PostgreSQL database and user creation
- Valkey setup
- API key configuration (interactive prompts or skip for later)
- Offline mode with Ollama fallback
- systemd service creation with auto-start on boot (Linux)
- Cleanup and script archival when complete
Run with `--loud` for verbose output. Fully unattended-capable.
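The pre-flight validation step can be illustrated in Python (the real deploy.sh does this in shell). The 10GB threshold and port list come from the description above; the mapping of 1993 to the API is an assumption, while 8200/6379/5432 are the standard Vault/Valkey/PostgreSQL defaults.

```python
# Illustrative pre-flight check: free disk space and port availability.
import shutil
import socket

REQUIRED_PORTS = [1993, 8200, 6379, 5432]  # API (assumed), Vault, Valkey, PostgreSQL
MIN_DISK_BYTES = 10 * 1024**3              # 10GB minimum

def preflight(path: str = "/") -> list[str]:
    problems = []
    if shutil.disk_usage(path).free < MIN_DISK_BYTES:
        problems.append("need at least 10GB free disk")
    for port in REQUIRED_PORTS:
        with socket.socket() as s:
            # connect_ex == 0 means something is already listening there.
            if s.connect_ex(("127.0.0.1", port)) == 0:
                problems.append(f"port {port} already in use")
    return problems
```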
Token Overhead
| Component | Tokens |
|---|---|
| System prompt | ~1,100-1,500 |
| Typical full context | ~8,300 |
| Cached portion on subsequent requests | ~3,300 |
Context contents are controlled via config limits (20 memories max, 5 summaries max).
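The limits above might be expressed as a small config object; the class and field names here are illustrative, not MIRA's actual config keys, though the values match the stated defaults.

```python
# Illustrative config sketch for the context-size limits described above.
from dataclasses import dataclass

@dataclass(frozen=True)
class ContextLimits:
    max_memories: int = 20   # memories injected into a prompt
    max_summaries: int = 5   # collapsed-segment summaries injected
```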
Event-Driven Architecture
MIRA uses a synchronous event bus. All handlers complete before publish() returns.
Characteristics
- 100% synchronous (no async/await)
- Single-threaded (handlers execute sequentially)
- Error-isolated (one handler failure doesn't block others)
- Ephemeral (no persistence, no replay)
Why synchronous? Guarantees ordering and eliminates race conditions. When TurnCompletedEvent fires, all cleanup completes before the next turn can begin. LLM calls dominate latency anyway, so trading parallelism for predictability is worth it.
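A minimal bus with these properties can be sketched as follows. The class and method names are illustrative, not MIRA's; what the sketch demonstrates is the combination described above: handlers run sequentially, `publish()` returns only after all of them finish, and one handler's exception is logged rather than allowed to block the rest.

```python
# Sketch of a synchronous, single-threaded, error-isolated event bus.
import logging
from collections import defaultdict
from typing import Callable

class EventBus:
    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event) -> None:
        # Handlers execute in subscription order; publish() only returns
        # once every handler has completed.
        for handler in self._handlers[type(event)]:
            try:
                handler(event)
            except Exception:
                # One failing handler must not block the others.
                logging.exception("handler failed for %s", type(event).__name__)
```

Because there is no thread pool and no async scheduling, an event's side effects are fully visible before the next event is published, which is exactly the ordering guarantee the design trades parallelism for.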
Event Hierarchy
```
ContinuumEvent (frozen dataclass - immutable)
├── MessageEvent
├── ToolEvent
├── WorkingMemoryEvent
│   ├── ComposeSystemPromptEvent
│   ├── SystemPromptComposedEvent
│   ├── UpdateTrinketEvent
│   ├── TrinketContentEvent
│   └── WorkingMemoryUpdatedEvent
└── ContinuumCheckpointEvent
    ├── TurnCompletedEvent
    ├── SegmentTimeoutEvent
    ├── SegmentCollapsedEvent
    ├── ManifestUpdatedEvent
    └── PointerSummariesCollapsingEvent
```
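A frozen-dataclass hierarchy like the one above can be sketched as follows. The class names come from the tree; the fields (`timestamp`, `role`, `content`, `turn_id`) are illustrative assumptions. `frozen=True` is what makes each event immutable once constructed.

```python
# Sketch of an immutable event hierarchy using frozen dataclasses.
import time
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ContinuumEvent:
    timestamp: float = field(default_factory=time.time)

@dataclass(frozen=True)
class MessageEvent(ContinuumEvent):
    role: str = "user"      # illustrative field
    content: str = ""       # illustrative field

@dataclass(frozen=True)
class ContinuumCheckpointEvent(ContinuumEvent):
    pass

@dataclass(frozen=True)
class TurnCompletedEvent(ContinuumCheckpointEvent):
    turn_id: str = ""       # illustrative field
```

Subscribers can match on any level of the hierarchy (e.g. all `ContinuumCheckpointEvent`s), and immutability means a handler can never corrupt an event for the handlers that run after it.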
Segment Collapse ("REM Sleep")
Every 5 minutes, APScheduler checks for inactive conversation segments. On timeout, the system loads segment messages, generates a summary and embedding, extracts tools used, submits to memory extraction via the Batch API, clears search results, and persists collapsed metadata.
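The collapse pipeline can be sketched as one function per timed-out segment; every name here is an assumption standing in for MIRA's internals, but the steps mirror the sequence above. In the real system, APScheduler runs a check every 5 minutes that invokes this for each inactive segment.

```python
# Hypothetical sketch of the segment-collapse ("REM sleep") pipeline.
def collapse_segment(segment_id, store, summarizer, embedder, batch_api):
    messages = store.load_messages(segment_id)            # load segment messages
    summary = summarizer(messages)                        # generate summary
    vector = embedder(summary)                            # generate embedding
    tools = sorted({m["tool"] for m in messages if m.get("tool")})  # tools used
    batch_api.submit_memory_extraction(messages)          # Batch API extraction
    store.clear_search_results(segment_id)                # clear search results
    store.persist_collapsed(segment_id, summary, vector, tools)  # persist metadata
```

Routing memory extraction through the Batch API keeps the expensive LLM work off the interactive path, so collapse never delays a live turn.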
Trinket System
Trinkets are modular prompt composition units. Each contributes content with its own cache policy.
Built-in Trinkets
- TimeManager
- ReminderManager
- ManifestTrinket
- ProactiveMemoryTrinket
- ToolGuidanceTrinket
- PunchclockTrinket
- DomaindocTrinket
- GetContextTrinket
- ToolLoaderTrinket
Each trinket has a standard lifecycle: registration with factory, optional event subscription, receiving update requests during prompt composition, generating content, storing in Valkey, and emitting content events.
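The lifecycle above can be sketched as a base class; the class name, method names, and Valkey key scheme here are assumptions, not MIRA's actual API, but the sequence matches the one described: subscribe at registration, generate on update, store in Valkey, emit a content event.

```python
# Illustrative trinket base class following the lifecycle described above.
class Trinket:
    """A modular prompt-composition unit with its own cache policy."""

    cache_policy = "ephemeral"  # illustrative policy name

    def __init__(self, bus, valkey) -> None:
        self.bus = bus
        self.valkey = valkey
        # Optional event subscription happens at registration time.
        self.bus.subscribe("UpdateTrinketEvent", self.on_update)

    def generate(self) -> str:
        raise NotImplementedError  # each trinket produces its own content

    def on_update(self, event) -> None:
        content = self.generate()
        self.valkey.set(f"trinket:{type(self).__name__}", content)  # store in Valkey
        self.bus.publish_content(type(self).__name__, content)      # emit content event
```

Because each trinket declares its own cache policy and writes its content independently, the prompt composer can rebuild only the sections that changed between turns.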