Memory System
MIRA's memory system is designed to maintain itself. Memories are extracted from conversations automatically, scored for importance, linked to each other, and allowed to decay naturally when they stop being useful.
The Problem
Most knowledge systems become graveyards. Notes pile up, never to be read again. Context windows have limits. You can't surface everything, and manually curating what stays relevant doesn't scale.
MIRA takes a different approach: memories earn their persistence through access and linking. The system handles curation automatically.
Graph-Based Architecture
Memories are nodes. Relationships are edges. Every link is stored in both directions for efficient traversal without joins.
When MIRA retrieves memories, it doesn't just find the best matches. It can walk the graph to discover related context that keyword search would miss.
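As a rough illustration, storing every link in both directions might look like the sketch below. The class and method names are hypothetical, not MIRA's actual API; the point is that a one-hop walk from any node needs no reverse-index lookup or join.

```python
from collections import defaultdict

class MemoryGraph:
    """Minimal sketch: memories as nodes, every link stored in both directions."""

    def __init__(self):
        # node_id -> set of (link_type, other_node_id)
        self.links = defaultdict(set)

    def add_link(self, src, dst, link_type):
        # Write the edge under both endpoints so traversal from either
        # side is a single dictionary lookup.
        self.links[src].add((link_type, dst))
        self.links[dst].add((link_type, src))

    def neighbors(self, node_id):
        return {other for _, other in self.links[node_id]}

g = MemoryGraph()
g.add_link("m1", "m2", "causes")
g.add_link("m2", "m3", "shares_entity:Alice")
# One hop from m2 reaches both m1 and m3
print(g.neighbors("m2"))
```

The cost is doubled storage per edge, traded for join-free traversal in either direction.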
Activity-Day Decay
Decay runs on activity days, not calendar days. If you take a two-week vacation, your memories don't rot. A memory accessed 10 activity-days ago feels equally fresh whether that took 2 weeks or 2 months of wall-clock time.
This means heavy users and light users experience equivalent "memory freshness" relative to their own usage patterns.
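A minimal sketch of the activity-day clock, assuming the system keeps a log of days on which the user was actually active (the log structure here is an assumption):

```python
from datetime import date

def activity_days_between(activity_log, start, end):
    """Count activity days between two dates: only days the user actually
    used the system advance the decay clock; idle calendar days do not."""
    return sum(1 for d in activity_log if start < d <= end)

# Hypothetical usage log with a vacation gap between Jan 3 and Jan 20
log = [date(2024, 1, d) for d in (1, 2, 3)] + [date(2024, 1, d) for d in (20, 21)]
# Only 2 activity days elapsed since Jan 3, despite 18 calendar days
print(activity_days_between(log, date(2024, 1, 3), date(2024, 1, 21)))  # → 2
```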
Design Principles
- Non-destructive by default: Supersession and splitting don't delete; consolidation archives
- Sparse links over dense links: Better to miss weak signals than add noise
- Heal-on-read: Dead links are cleaned during traversal, not proactively
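The heal-on-read principle can be sketched as follows (the data shapes are hypothetical): instead of a background sweep, dead links are dropped the moment a traversal encounters them.

```python
def traverse(graph, alive, node_id):
    """Heal-on-read sketch: while reading a node's links, silently drop
    edges that point at memories which no longer exist."""
    links = graph.get(node_id, set())
    dead = {other for other in links if other not in alive}
    links -= dead  # heal in place; no separate cleanup pass required
    return links

graph = {"m1": {"m2", "m3"}}
alive = {"m1", "m2"}          # m3 was removed elsewhere
print(traverse(graph, alive, "m1"))  # → {'m2'}
print(graph["m1"])                   # the dead link is gone after the read
```

The trade-off: links to deleted memories linger until something reads them, but no job ever has to scan the whole graph.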
Link Types
MIRA uses two categories of links: LLM-classified relationships that require semantic understanding, and automatic structural links that can be inferred cheaply.
LLM-Classified (Sparse, High-Value)
| Type | Meaning |
|---|---|
| conflicts | Mutually exclusive information |
| supersedes | Temporal update (new information replaces old) |
| causes | Direct causation |
| instance_of | Concrete example of a pattern |
| invalidated_by | Empirical evidence disproves the memory |
| motivated_by | Explains intent or reasoning |
| null | No meaningful relationship (the default when uncertain) |
Automatic Structural (Dense, Cheap)
| Type | Meaning |
|---|---|
| was_context_for | Memory was explicitly referenced during conversation |
| shares_entity:{Name} | Memories mention the same named entity (via spaCy NER) |
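Assuming extraction has already attached the spaCy-recognized entities to each memory, shares_entity links can be derived with a simple pairwise intersection (the function name and data shapes below are illustrative, not MIRA's internals):

```python
from itertools import combinations

def shared_entity_links(memories):
    """Emit a shares_entity:{Name} link for every pair of memories
    that mention the same named entity."""
    links = []
    for (id_a, ents_a), (id_b, ents_b) in combinations(memories.items(), 2):
        for name in ents_a & ents_b:
            links.append((id_a, f"shares_entity:{name}", id_b))
    return links

# Hypothetical memories with their pre-extracted entity sets
mems = {
    "m1": {"Alice", "Berlin"},
    "m2": {"Alice"},
    "m3": {"Berlin"},
}
print(shared_entity_links(mems))
# → [('m1', 'shares_entity:Alice', 'm2'), ('m1', 'shares_entity:Berlin', 'm3')]
```

This is why structural links can be dense and cheap: no LLM call is involved, only set intersection.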
Hybrid Search
MIRA combines BM25 (keyword matching) with vector similarity (semantic matching) using reciprocal rank fusion. The blend depends on intent.
| Intent | BM25 | Vector | Use Case |
|---|---|---|---|
| recall | 60% | 40% | Exact match preference |
| explore | 30% | 70% | Semantic preference |
| exact | 80% | 20% | Strong phrase match |
| general | 40% | 60% | Balanced (default) |
Self-Maintenance
Background jobs handle the memory lifecycle automatically.
| Job | Interval | Purpose |
|---|---|---|
| Extraction batch polling | 1 min | Check batch status for new memories |
| Relationship classification | 1 min | Process new links between memories |
| Failed extraction retry | 6 hours | Retry failed extractions |
| Refinement (split/trim) | 7 days | Break up bloated memories |
| Consolidation | 7 days | Merge similar memories |
| Temporal score recalculation | Daily | Update time-based scores |
| Entity garbage collection | Monthly | Clean orphaned entities |
Consolidation
When memories overlap too much, MIRA merges them. The process uses hub-based clustering to find candidates, then runs two-phase LLM verification (a reasoning model proposes, a fast model reviews). All links transfer to the new memory. Old memories are archived, not deleted.
Splitting
The opposite of consolidation. When a memory grows too verbose (length threshold exceeded, at least 7 days old, accessed at least 5 times), MIRA can split it into focused pieces. The original stays active; split memories coexist.
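The eligibility gate translates directly into code. Only the length threshold's concrete value is unspecified in the source, so the one below is a placeholder:

```python
def eligible_for_split(length, age_days, access_count, max_length=2000):
    """A memory qualifies for splitting when it exceeds the length
    threshold, is at least 7 days old, and has been accessed at least
    5 times. max_length=2000 is an assumed placeholder value."""
    return length > max_length and age_days >= 7 and access_count >= 5

print(eligible_for_split(2500, 10, 6))  # → True
print(eligible_for_split(2500, 3, 6))   # → False (too young)
```

The age and access requirements keep MIRA from splitting a memory before there is evidence it is both established and worth the LLM cost.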
Supersession
When new information explicitly updates old, MIRA creates a supersedes link. Superseded memories remain active but are marked as having a newer version. This creates a version history.
The Scoring Formula (for the math-curious)
The importance score is computed as:
```
importance_score = sigmoid(raw_score - 2.0)
where sigmoid(x) = 1.0 / (1.0 + exp(-x))
```
Raw score composition:
```
raw_score = (value_score + hub_score + entity_hub_score + mention_score + newness_boost)
            × recency_boost
            × temporal_multiplier
            × expiration_trailoff
```
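The two formulas combine into a single function. This is a direct transcription of the composition above, with argument names matching the components:

```python
import math

def importance_score(value_score, hub_score, entity_hub_score,
                     mention_score, newness_boost,
                     recency_boost, temporal_multiplier,
                     expiration_trailoff):
    """Additive components summed, multiplicative components applied,
    then squashed through sigmoid(raw - 2.0)."""
    raw = ((value_score + hub_score + entity_hub_score
            + mention_score + newness_boost)
           * recency_boost * temporal_multiplier * expiration_trailoff)
    return 1.0 / (1.0 + math.exp(-(raw - 2.0)))

# A raw score of exactly 2.0 lands on the sigmoid midpoint
print(importance_score(2.0, 0, 0, 0, 0, 1.0, 1.0, 1.0))  # → 0.5
```

The −2.0 offset means a memory needs a raw score above 2.0 before its importance crosses 0.5; the sigmoid keeps the final score in (0, 1) regardless of how large the raw score grows.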
Components
| Component | Formula | Purpose |
|---|---|---|
| Value Score | ln(1 + (effective_access / max(7, age_days)) / 0.02) × 0.8 | Access frequency normalized by age |
| Effective Access | access_count × 0.95^(days_since_last_access) | Decaying momentum (5%/activity-day) |
| Recency Boost | 1.0 / (1.0 + activity_days_since_access × 0.015) | ~67 activity-day half-life |
| Newness Boost | max(0, 2.0 - (age_days × 0.033)) | 15 activity-day grace period |
| Hub Score | Linear up to 10 links (×0.04 each), then diminishing returns | Inbound link importance |
| Temporal Multiplier | 2.0× (tomorrow) → 1.5× (week) → 1.2× (2 weeks) → 1.0× | Event proximity boost |
| Expiration Trailoff | Linear decay to 0 over 5 calendar days post-expiration | Graceful expiration |
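The access- and time-based components translate directly into code, with the constants taken verbatim from the table (the remaining components are step functions over link counts and event dates and are omitted here):

```python
import math

def effective_access(access_count, activity_days_since_access):
    # Decaying momentum: 5% decay per activity day since last access
    return access_count * 0.95 ** activity_days_since_access

def value_score(access_count, activity_days_since_access, age_days):
    # Access frequency normalized by age (floored at 7 days)
    eff = effective_access(access_count, activity_days_since_access)
    return math.log(1 + (eff / max(7, age_days)) / 0.02) * 0.8

def recency_boost(activity_days_since_access):
    # ~67 activity-day half-life: boost halves when 1/(1 + d*0.015) = 0.5
    return 1.0 / (1.0 + activity_days_since_access * 0.015)

def newness_boost(age_days):
    # Starts at 2.0, fades linearly to 0 over ~60 activity days;
    # the table describes a 15 activity-day grace period
    return max(0.0, 2.0 - age_days * 0.033)

# A brand-new, never-accessed memory still gets the full newness boost
print(newness_boost(0))       # → 2.0
print(recency_boost(0))       # → 1.0
```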