Engram

AI memory system for agents.

Engram is a long-term memory system for AI agents. It stores decisions, patterns, and fixes with semantic search, automatic deduplication, and self-learning.

Features

  • Hybrid search — vector (HNSW cosine) + full-text (FTS5 BM25)
  • HyDE (Hypothetical Document Embeddings) — an LLM generates a hypothetical memory, and the embedding is built from that hypothesis instead of the raw query
  • Automatic deduplication — cosine similarity > 0.95 at write time
  • Consolidation — preview → LLM analysis → apply (merge/delete/archive)
  • Q-Learning router — 4 levels: search strategy, LLM selection, contextualization, proactivity
  • Three learning loops — fast (Q-Learning per call), medium (trainer daily/weekly), deep (LoRA fine-tune)
  • Cross-project transfer — project-scoped search with a score multiplier; insights are project-agnostic
  • Graceful degradation — every API dependency has a local fallback
  • MCP integration — works with Claude Desktop, Claude Code, Cursor
  • CLI — full-featured command-line interface
  • TUI dashboard — terminal UI with 5 tabs and an init wizard
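
The write-time deduplication above can be sketched as a cosine-similarity check against stored embeddings. This is an illustrative sketch, not Engram's actual API: the function names and storage layout are assumptions, only the 0.95 threshold comes from the feature list.

```rust
// Sketch of write-time deduplication: a new memory is skipped when its
// embedding is closer than cosine 0.95 to any existing embedding.
// `is_duplicate` and the Vec-based store are hypothetical, for illustration.

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

const DEDUP_THRESHOLD: f32 = 0.95;

/// Returns true if `candidate` duplicates any stored embedding.
fn is_duplicate(candidate: &[f32], stored: &[Vec<f32>]) -> bool {
    stored.iter().any(|e| cosine(candidate, e) > DEDUP_THRESHOLD)
}
```

In the real system the nearest neighbor would come from the HNSW index rather than a linear scan.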

Three Learning Loops

Engram uses three levels of adaptation, each with different speed and depth:

  • Fast loop (Q-Learning) — updates the Q-table after every call. The router adjusts search strategy, model selection, and contextualization level based on feedback. Latency — milliseconds.
  • Medium loop (Trainer) — runs periodically (daily or weekly). Clusters memories, identifies temporal patterns, trains the classifier and ranking model. Results are saved as ONNX models.
  • Deep loop (LoRA fine-tune) — fine-tunes DistilGPT2 on user data using LoRA adapters. Replaces API calls for HyDE and routine operations with local text generation.
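
The fast loop's per-call update is standard tabular Q-learning. The sketch below shows the update rule; the state/action encoding (query type, search strategy) is a hypothetical simplification of Engram's four-level router.

```rust
// Minimal tabular Q-learning update, as used conceptually by the fast loop:
// Q(s,a) += alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a)).
// State/Action as plain strings is an assumption for illustration.
use std::collections::HashMap;

type State = String;  // e.g. classified query type: "debugging"
type Action = String; // e.g. chosen search strategy: "hybrid"

struct QTable {
    q: HashMap<(State, Action), f64>,
    alpha: f64, // learning rate
    gamma: f64, // discount factor
}

impl QTable {
    fn update(&mut self, s: State, a: Action, reward: f64, next: &State, actions: &[Action]) {
        // Best estimated value of the next state over the available actions.
        let max_next = actions
            .iter()
            .map(|a2| *self.q.get(&(next.clone(), a2.clone())).unwrap_or(&0.0))
            .fold(0.0_f64, f64::max);
        let entry = self.q.entry((s, a)).or_insert(0.0);
        *entry += self.alpha * (reward + self.gamma * max_next - *entry);
    }
}
```

Because the update touches a single table entry, it runs in microseconds, which is what keeps the fast loop's latency in the millisecond range end to end.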

Trained Models

The trainer produces three ONNX models stored in ~/.engram/models/:

Model                | Size    | Algorithm                   | Purpose
---------------------|---------|-----------------------------|--------
mode_classifier.onnx | 13 KB   | TF-IDF + LogisticRegression | Classifies query type (query/research/brainstorm/debugging) for the Q-Learning router
ranking_model.onnx   | 23 KB   | GradientBoosting            | Re-ranks search results by score, usage, recency, length, and tags
text_generator.onnx  | ~312 MB | DistilGPT2 + LoRA           | Local text generation replacing API calls for HyDE and routine operations

The first two models are trained during regular engram train runs. The text generator requires engram train --deep with PyTorch installed.
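
To make the ranking model's inputs concrete, here is a hand-weighted sketch of the signals it consumes. The real ranking_model.onnx is a trained GradientBoosting model; the struct, weights, and transforms below are made up purely to show which features feed the re-ranker.

```rust
// Illustrative feature vector for re-ranking. All weights are invented;
// the trained model learns these relationships instead.
struct Candidate {
    search_score: f32, // hybrid-search score
    usage_count: u32,  // how often the memory was recalled
    age_days: f32,     // recency signal
    length: usize,     // text length in characters
    tag_overlap: f32,  // fraction of query tags matched
}

fn rerank_score(c: &Candidate) -> f32 {
    0.5 * c.search_score
        + 0.2 * (1.0 + c.usage_count as f32).ln()   // diminishing returns on usage
        + 0.2 * (-c.age_days / 30.0).exp()          // exponential recency decay
        + 0.05 * (c.length as f32 / 1000.0).min(1.0)
        + 0.05 * c.tag_overlap
}
```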

Architecture

Engram uses a two-process model:

  • Rust core (engram-core) — long-lived unix socket server or self-contained CLI. Contains all business logic: storage, search, embeddings, routing, consolidation, ONNX inference.
  • TypeScript MCP server (@engram/mcp-server) — thin MCP ↔ JSON-RPC translator. Manages the Rust process lifecycle.

See Architecture for details.
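
The MCP server forwards tool calls to the Rust core as JSON-RPC over the unix socket. The helper below only illustrates the request shape; the method name and params are assumptions, not Engram's actual wire protocol.

```rust
// Builds a JSON-RPC 2.0 request line as the MCP-side translator might.
// "memory.search" below is a hypothetical method name for illustration.
fn jsonrpc_request(id: u64, method: &str, params: &str) -> String {
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"method\":\"{}\",\"params\":{}}}",
        id, method, params
    )
}
```

Keeping the TypeScript side to pure translation means all state and logic stay in one process, so the CLI, TUI, and MCP server all observe the same data.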

TUI Dashboard

engram-tui is a terminal dashboard built with ratatui. It connects to the engram server over its unix socket and reads the SQLite database directly.

Five tabs:

  • Status — database, server, and model health
  • Memories — browse, delete, and archive memories
  • Search — hybrid search through the server's unix socket
  • Q-Learning — router Q-table state
  • Models — trained ONNX model listing

On first launch, engram-tui offers an interactive init wizard similar to engram init.

cargo install engram-tui --locked
engram-tui

Further Reading

Documentation is also available on GitHub Pages.