Engram
Persistent memory for AI agents that have to remember across sessions.
Engram keeps an agent's memory in PostgreSQL: the facts it has learned, the raw
history of what happened, large documents split into traceable pieces, and the
links between them. On top of that store sits a recall() operator that routes each query by intent, retrieves exact citations, and traces its own reasoning.
Search that shows its work
Retrieval blends vector similarity, keyword matching, recency, and importance. You see each score, not one number you have to take on faith.
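As a rough illustration of what a blended score with a visible breakdown can look like, here is a minimal sketch. The weights, the `ScoreBreakdown` class, and the `rank` helper are hypothetical names chosen for this example; they are not Engram's API, and Engram's real weighting may differ.

```python
from dataclasses import dataclass

# Hypothetical weights for illustration only; not Engram's defaults.
WEIGHTS = {"vector": 0.5, "keyword": 0.2, "recency": 0.2, "importance": 0.1}


@dataclass(frozen=True)
class ScoreBreakdown:
    """Per-signal scores for one candidate memory, each in [0, 1]."""

    vector: float
    keyword: float
    recency: float
    importance: float

    @property
    def total(self) -> float:
        # Weighted sum of the four signals.
        return sum(WEIGHTS[name] * getattr(self, name) for name in WEIGHTS)


def rank(candidates: dict[str, ScoreBreakdown]) -> list[tuple[str, float]]:
    """Return (memory_id, total) pairs sorted from best to worst."""
    scored = [(memory_id, b.total) for memory_id, b in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = {
    "m1": ScoreBreakdown(vector=0.9, keyword=0.1, recency=0.2, importance=0.5),
    "m2": ScoreBreakdown(vector=0.6, keyword=0.9, recency=0.9, importance=0.5),
}
ranking = rank(candidates)
```

Keeping the four components on the result, rather than only the total, is what lets you see why a memory with weaker vector similarity can still outrank one with stronger similarity.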
Memory that survives restarts
Task runs, event logs, checkpoints, and background jobs hold the state of work that outlives a single conversation.
Recall you can debug
When an agent forgets something it should have known, trace_recall() tells you
whether the fact was stored, ranked, trimmed by the token budget, or quietly
superseded.
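The failure stages named above can be modelled with a toy pipeline. This is only a sketch of the idea, not how trace_recall() is implemented: the `Candidate` class, the `diagnose` function, the score threshold, and the greedy token budget are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    memory_id: str
    score: float
    tokens: int
    superseded_by: Optional[str] = None  # id of the memory that replaced this one


def diagnose(
    target_id: str,
    stored: list[Candidate],
    min_score: float,
    token_budget: int,
) -> str:
    """Report the stage at which a memory dropped out of a recall."""
    by_id = {c.memory_id: c for c in stored}
    target = by_id.get(target_id)
    if target is None:
        return "not stored"
    if target.superseded_by is not None:
        return "superseded"

    # Rank the active candidates that clear the (hypothetical) score threshold.
    ranked = sorted(
        (c for c in stored if c.superseded_by is None and c.score >= min_score),
        key=lambda c: c.score,
        reverse=True,
    )
    if target not in ranked:
        return "not ranked"

    # Fill the context greedily in rank order until the budget runs out.
    used = 0
    for candidate in ranked:
        if used + candidate.tokens > token_budget:
            break
        used += candidate.tokens
        if candidate.memory_id == target_id:
            return "returned"
    return "trimmed by token budget"


stored = [
    Candidate("a", score=0.9, tokens=60),
    Candidate("b", score=0.8, tokens=50),
    Candidate("c", score=0.1, tokens=10),
    Candidate("d", score=0.7, tokens=10, superseded_by="a"),
]
outcomes = {
    memory_id: diagnose(memory_id, stored, min_score=0.3, token_budget=100)
    for memory_id in ("a", "b", "c", "d", "z")
}
```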
[!WARNING]
Beta: Engram is at 0.3.0b1. It is rigorously tested, but the public API and the database schema are subject to change before 1.0. Always back up your data before you run a migration.
Documentation Guides
- Install Engram, boot Postgres, and build an end-to-end memory loop in 10 minutes.
- How facts, conflict resolution, semantic search, and the intelligent recall operator fit together.
- Resumability, the immutable event ledger, checkpoints, and exact-citation chunking.
- The complete public API, including type signatures and code examples.
- Environment variables, EngramSettings, search weight tuning, and provider extras.
- Process isolation, privacy boundaries, observability, and failure modes.
- 79.6% on BEAM 1M (ICLR 2026), 89.8% on LongMemEval-S (ICLR 2025), 85.7% on LoCoMo-10 (ACL 2024). All three runs use add_batch() (no LLM at ingest) and are reproducible with the scripts in benchmark/.
Install & Run
git clone https://github.com/ahammadnafiz/engram.git
cd engram
pip install -e ".[dev,examples,sentence-transformers]"
# Start the database
docker compose up -d postgres
# Configure for local embeddings
export ENGRAM_DATABASE_URL=postgresql://engram:engram_secret@localhost:5432/engram
export ENGRAM_EMBEDDING_PROVIDER=sentence-transformers
export ENGRAM_EMBEDDING_MODEL=all-MiniLM-L6-v2
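Before connecting, it can help to confirm that the three variables above are set and that the database URL parses. The following preflight check is a minimal sketch using only the standard library; `check_engram_env` and `REQUIRED_VARS` are hypothetical helpers for this example and are not part of Engram.

```python
from urllib.parse import urlparse

# The three variables exported above.
REQUIRED_VARS = (
    "ENGRAM_DATABASE_URL",
    "ENGRAM_EMBEDDING_PROVIDER",
    "ENGRAM_EMBEDDING_MODEL",
)


def check_engram_env(env: dict[str, str]) -> dict[str, str]:
    """Validate the settings and return a short summary of them."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")

    url = urlparse(env["ENGRAM_DATABASE_URL"])
    if url.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected URL scheme: {url.scheme!r}")

    return {
        "host": url.hostname or "",
        "port": str(url.port or 5432),
        "database": url.path.lstrip("/"),
        "embedding_model": env["ENGRAM_EMBEDDING_MODEL"],
    }


# Example values copied from the exports above.
example_env = {
    "ENGRAM_DATABASE_URL": "postgresql://engram:engram_secret@localhost:5432/engram",
    "ENGRAM_EMBEDDING_PROVIDER": "sentence-transformers",
    "ENGRAM_EMBEDDING_MODEL": "all-MiniLM-L6-v2",
}
summary = check_engram_env(example_env)
```

In a real shell session you would pass `dict(os.environ)` instead of the example dictionary.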
import asyncio

from engram import Engram


async def main() -> None:
    # 1. Connect and automatically apply schema migrations
    async with Engram(memory_policy="coding_agent") as engram:
        # 2. Store a durable, critical fact
        memory = await engram.add(
            "Repo constraint: never revert user changes without approval",
            agent_id="codex",
            user_id="nafiz",
        )

        # 3. Recall a source-backed answer, routed by the question's intent
        answer = await engram.recall(
            "what are my repository constraints?",
            agent_id="codex",
            user_id="nafiz",
        )

        print(memory.memory_type)  # -> "constraint"
        print(answer.answer_text)  # -> grounded answer built from stored memory


if __name__ == "__main__":
    asyncio.run(main())
This snippet stores a strict repo constraint, then asks Engram to answer a question about it. recall() classifies the question's intent, retrieves the matching memories, and composes a source-backed answer in answer.answer_text. recall() requires a configured LLM (set ENGRAM_LLM_PROVIDER); for LLM-free retrieval, use search() instead.
How It Works
Engram keeps two kinds of memory side by side.
| Plane | Tables | Primary Purpose |
|---|---|---|
| Fact Memory | agent_memory, memory_relations | Vector search, type filters, conflict resolution, and graph traversal. |
| Task Memory | agent_task_runs, agent_events, agent_checkpoints | Resuming workflows, exact audit history, and background semantic extraction. |
Engram.connect() automatically creates the database schema, ensures the pgvector and pg_trgm extensions exist, and sizes the vector columns to match your chosen embedding model.
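To make the bootstrap step concrete, here is a minimal sketch of the kind of SQL such a step could issue. It is not Engram's migration code: the `bootstrap_statements` helper and the `embedding` column name are assumptions for illustration. The extension names (`vector` for pgvector, `pg_trgm`) and the model dimensions are the standard ones.

```python
# Output dimensions of two common sentence-transformers models.
EMBEDDING_DIMS = {
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
}


def bootstrap_statements(
    model: str,
    table: str = "agent_memory",
    column: str = "embedding",  # hypothetical column name
) -> list[str]:
    """Build the SQL needed to prepare a database for the given model."""
    dim = EMBEDDING_DIMS[model]
    return [
        # pgvector registers itself under the extension name "vector".
        "CREATE EXTENSION IF NOT EXISTS vector;",
        "CREATE EXTENSION IF NOT EXISTS pg_trgm;",
        # The vector column width must equal the embedding dimension.
        f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} vector({dim});",
    ]


statements = bootstrap_statements("all-MiniLM-L6-v2")
```

The practical consequence is that switching to an embedding model with a different dimension means the stored vectors no longer fit the column, so existing memories would need to be re-embedded.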
Common Operations
| Objective | Recommended API |
|---|---|
| Store new facts | add(), add_batch(), add_conversation() |
| Intelligently fetch context | recall() |
| Query the raw event ledger | search_events() |
| Analyze why recall failed | trace_recall() |
| Extract facts asynchronously | run_memory_worker(), process_memory_jobs() |
| Ingest & cite a 50-page PDF | record_long_input(), build_long_input_context() |
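The last row depends on chunks that can be traced back to their exact position in the source. A minimal sketch of that idea follows, assuming fixed-size character windows with overlap; the `Chunk` class and `chunk_text` function are hypothetical and do not reflect how record_long_input() actually splits documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    index: int
    start: int  # inclusive character offset into the source
    end: int    # exclusive character offset into the source
    text: str


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[Chunk]:
    """Split text into overlapping windows that remember their offsets."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    chunks: list[Chunk] = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        chunks.append(Chunk(len(chunks), start, end, text[start:end]))
        if end == len(text):
            break
        # Step back by the overlap so neighbouring chunks share context.
        start = end - overlap
    return chunks


document = "Engram stores long inputs as anchored chunks. " * 3
chunks = chunk_text(document)

# A citation is verifiable when slicing the source reproduces the chunk text.
verified = all(document[c.start:c.end] == c.text for c in chunks)
```

Storing offsets alongside the text is what allows an answer to cite a span of the original document rather than a paraphrase of it.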
Benchmark Results
Engram is evaluated on three standard long-term memory benchmarks. All runs use on-device embeddings (all-MiniLM-L6-v2, free, no API cost at ingest) and add_batch(): raw episodic turns are stored verbatim, and all reasoning is deferred to query time via search() + recall() + get_lineage(). The same model (claude-sonnet-4-6) serves as both composer and judge, which is a known source of leniency bias worth disclosing. These are floor numbers: add_conversation() (full LLM extraction at ingest) is expected to score higher.
| Benchmark | Questions | Accuracy | Composer |
|---|---|---|---|
| LongMemEval-S (ICLR 2025) | 500 | 89.8% | claude-sonnet-4-6 |
| LoCoMo-10 (ACL 2024) | 1,540 | 85.7% | claude-sonnet-4-6 |
| BEAM 1M (ICLR 2026) | 700 | 79.6% | claude-sonnet-4-6 |
All three benchmark scripts are in benchmark/ and can be run against your own database. See Benchmarks for full per-type breakdowns, caveats, the ablation table, and reproduction commands.
Included Examples
Engram ships with working reference implementations in the examples/ directory.
| File | What it demonstrates |
|---|---|
| examples/basic_usage.py | A comprehensive tour of almost the entire API surface. |
| examples/chatbot.py | A real Gemini-backed terminal chatbot: the benchmark add_batch floor + 4-surface retrieval + composer pipeline, run live. |
| examples/long_input_usage.py | Securely ingesting a massive document and answering questions from anchored source chunks. |