Architecture

MIRA is a FastAPI application with event-driven architecture. PostgreSQL handles storage and vector search. Everything runs locally except the LLM calls.

System Requirements

ResourceRequirement
RAM3GB total (including embedding model and all services, excluding LLM if running locally)
Disk10GB minimum
GPUNot required (CPU-only PyTorch)
OSLinux or macOS

Core Stack

ComponentPurpose
PostgreSQLMemory storage, vector search (pgvector)
ValkeyRedis-compatible caching
HashiCorp VaultSecrets management
sentence-transformersLocal embeddings
spaCyEntity extraction (NER)
APSchedulerBackground job scheduling

Model Downloads (One-Time)

Provider Support

MIRA works with any OpenAI-compatible endpoint. Internally follows Anthropic SDK conventions, but translation happens at the proper layer. No vendor lock-in.

Tested Models

What You Lose with Local Models

Deployment

Single cURL command. The deploy.sh script is 2000+ lines of production-grade automation.

curl -fsSL https://raw.githubusercontent.com/taylorsatula/mira-OSS/refs/heads/main/deploy.sh -o deploy.sh && chmod +x deploy.sh && ./deploy.sh

What the Script Handles

Run with --loud for verbose output. Fully unattended-capable.

Token Overhead

ComponentTokens
System prompt~1,100-1,500
Typical full context~8,300
Cached portion on subsequent requests~3,300

Content controlled via config limits (20 memories max, 5 summaries max).

Event-Driven Architecture

MIRA uses a synchronous event bus. All handlers complete before publish() returns.

Characteristics

Why synchronous? Guarantees ordering and eliminates race conditions. When TurnCompletedEvent fires, all cleanup completes before the next turn can begin. LLM calls dominate latency anyway, so trading parallelism for predictability is worth it.

Event Hierarchy
ContinuumEvent (frozen dataclass - immutable)
├── MessageEvent
├── ToolEvent
├── WorkingMemoryEvent
│   ├── ComposeSystemPromptEvent
│   ├── SystemPromptComposedEvent
│   ├── UpdateTrinketEvent
│   ├── TrinketContentEvent
│   └── WorkingMemoryUpdatedEvent
└── ContinuumCheckpointEvent
    ├── TurnCompletedEvent
    ├── SegmentTimeoutEvent
    ├── SegmentCollapsedEvent
    ├── ManifestUpdatedEvent
    └── PointerSummariesCollapsingEvent

Segment Collapse ("REM Sleep")

Every 5 minutes, APScheduler checks for inactive conversation segments. On timeout, the system loads segment messages, generates a summary and embedding, extracts tools used, submits to memory extraction via the Batch API, clears search results, and persists collapsed metadata.

Trinket System

Trinkets are modular prompt composition units. Each contributes content with its own cache policy.

Built-in Trinkets

Each trinket has a standard lifecycle: registration with factory, optional event subscription, receiving update requests during prompt composition, generating content, storing in Valkey, and emitting content events.