How It Works
Five stages.
One Loop.
Before any foundation model fires, grāmatr runs its own inference layer — classifying intent, assembling context, and constructing a typed plan the model never has to derive itself. grāmatr reasons first. The model executes second. The five stages of the Loop are how that happens.
The Loop.
The Loop is how grāmatr's five pillars — classification, context delivery, governance, cost attribution, and compounding intelligence — run on every request: reasoning first, the model second. The five steps carry the five pillars: Classify is classification, Deliver is context delivery, Execute is governance, Shape is the cost-attribution and audit record, and Learn is compounding intelligence. Each completed request makes the next one smarter — routing accuracy improves, context compresses, and proven workflows promote into reusable organizational skills.
Classify
A contract — not a label — delivered in milliseconds before the model runs.
Deliver
The right context, assembled in real time — not dumped, not guessed. Same packet, every tool.
Execute
Disciplined phase steps and a mandatory plan gate — the agent works to a structure, not its own discretion.
Shape
Typed criteria set before work begins. Every PASS or FAIL recorded with evidence — not after the fact.
Learn
Every gated outcome trains the classifier. The cycle doesn't repeat at the same level — it compounds.
The five stages of the Loop.
Classify — A contract, not a label
Every request passes through grāmatr's inference layer before any frontier model runs. The patent-pending pre-classification architecture triages it in milliseconds.
The output is not a routing decision or a memory retrieval — it is a typed contract. That contract specifies:
- How much effort the request deserves
- Which intent is expressed
- Which capabilities apply
- What constraints are in force
- What quality criteria the response must satisfy
The frontier model receives that contract and starts working immediately, instead of burning thousands of tokens figuring out what was wanted.
"Context engineering is the delicate art and science of filling the context window with just the right information for the next step."
"Building with language models is becoming less about finding the right words for your prompts, and more about answering the broader question of 'what configuration of context is most likely to generate our model's desired behavior?'"
Classify runs in under 100 milliseconds. Every request. Every tool.
Deliver — Surgical context, every tool
Without the Loop, every turn pays four taxes:
- The full system prompt rides along on every turn
- The model spends reasoning tokens figuring out what context it needs
- It spends tool tokens searching for and fetching that context
- It pays again when the first fetch was wrong
With the Loop, the pre-classifier decides on every turn whether context is needed at all — and if it is, exactly what. Only that gets delivered, assembled in real time from a typed knowledge graph and routed to the model before it runs. Those four taxes are paid once per turn — and the savings compound across every turn of every session.
And the payload is portable. The same intelligence packet routes to Claude Code, Cursor, ChatGPT, Gemini, and every MCP-native tool. Start a project in one, hand it off to another, come back. Your context travels — and your investment is in the intelligence layer, not any single model.
Consistency across tools today. Freedom to adopt better tools tomorrow.
Execute — Disciplined work, not free-running agents
The Execute stage is where grāmatr makes agent behavior governable. The agent does not free-run — it executes through a typed phase template with a mandatory plan gate, so work proceeds in disciplined steps: OBSERVE, THINK, PLAN, stop for approval, BUILD, VERIFY, LEARN.
The model produces output the rest of the Loop can verify and learn from. Plan means stop: high-stakes actions present what they intend to do and wait for confirmation before executing. That is not a suggestion — it is architecture.
Shape — Typed gates, audited outcomes
Velocity without quality is a spike. grāmatr's Shape stage is where AI behavior gets shaped to your standard, not the model's default.
Every output runs against typed quality-gate criteria set before the work began — not after. The output either meets the gate or it does not ship. Every PASS or FAIL is recorded with evidence: the audit trail your procurement team and your AI-skeptics both ask for.
Every token in a gated output is attributed to the outcome it produced — not pooled, not estimated. That is how enterprise AI spend justifies itself to procurement and finance: cost traceable to specific work, not to a billing line.
This is the difference between an AI that produced something and an AI that produced something that meets your bar — at a cost you can account for. Shape turns both into a default, not an exception.
An audit trail isn't paperwork. Cost attribution isn't overhead. Together they are what makes AI velocity legible — to your team, to procurement, to finance, to anyone who needs to trust both the output and the spend.
Learn — The flywheel, and new skills from your work
Most AI tools reset every session. Across an enterprise, that means every practitioner rebuilds the same context every day — project decisions, architectural choices, standing constraints. The cost is invisible on any single turn. Across hundreds of seats and thousands of sessions, it is the primary driver of AI underperformance.
Across a deployment, every practitioner loses 10–30 minutes per session rebuilding context. At 50 seats, that is a workday of organizational productivity — every single day.
Learn changes that trajectory. Every gated outcome becomes a signal that feeds the classifier — and what that compounds into looks like this:
Every practitioner's AI immediately retains project context, team conventions, and organizational decisions across sessions — no daily context rebuild.
Routing accuracy improves as the classifier learns your organization's request patterns. High-effort requests stop being treated as instant lookups.
Reusable patterns emerge from how your teams work. Common workflows begin formalizing into organizational skills available across the deployment.
In the first enterprise deployment, the context delivered per request collapsed to a fraction of the original load. Not because context was removed — because the system learned the organization well enough to deliver precisely what each request actually needs.
That reduction is not compression. It is the measurable output of a system that learned.
Patterns become skills
Learn doesn't just sharpen the classifier — it grows new organizational capabilities.
a team solves a recurring problem
the system detects the pattern
the pattern is formalized into a specification
propagates to every seat in the deployment, without redeploying anything
In production, grāmatr's own engineering and deployment workflows were formalized this way — patterns extracted from real work, promoted into skills, and then executed consistently across every release cycle. The same mechanism is available to enterprise deployments: institutional knowledge that was previously locked in individual practitioners becomes governed, reusable infrastructure.
Enterprise admins control exactly which skills propagate and to whom. One team's discovery becomes the whole organization's advantage — with full governance over what is shared, who can access it, and what gets retired.
Your AI plans before it acts.
One of the most common concerns about AI tools: "What if it does something I did not ask for?" Autonomous agents that act without oversight create anxiety for good reason. A single misrouted action can waste hours or break things.
grāmatr addresses this with structured control gates. Every complex request follows a deliberate sequence:
Plan means stop. Your AI presents what it intends to do and waits for your confirmation before executing. This is not a suggestion — it is architecture. The system is designed so that high-stakes actions require human approval before they proceed.
Most AI tooling chases more autonomy. grāmatr is built for smarter autonomy — the AI knows when to act and when to ask.
Built on security.
Security is not a feature we added to grāmatr. It is the foundation everything else sits on.
Your data is yours
Interaction data is encrypted at rest at the storage layer, with database-level row isolation per user. This is an architectural control, not a policy decision.
Your interactions build organizational intelligence — isolated by row-level security and encrypted at rest. Team and enterprise sharing is governed by explicit admin controls.
Tiered directive governance
What makes grāmatr's security model different is that it extends to how intelligence itself propagates:
Your interactions build your intelligence. Encrypted at rest, isolated by row-level security at the database level.
Team admins decide which patterns get shared across the team. Everything else stays private. No data flows between users without explicit admin authorization.
Enterprise admins control what gets incorporated into organizational intelligence. Full authorization required. Governed by the five-level scope hierarchy.
In our analysis of the leading AI context tools as of March 2026:
| Tool | Tiered governance + row-level security + encryption at rest |
|---|---|
| grāmatr | Enforced at the database level |
| Mem0 | Not offered (as of March 2026) |
| Zep | Not offered |
| Letta | Not offered |
| LangMem | Not offered |
grāmatr enforces isolation at the database level — your intelligence is protected by architecture, not policy.
Works with your tools.
grāmatr connects to your AI tools through the Model Context Protocol (MCP) — the open standard that Anthropic describes as "a USB-C port for AI applications."
There is nothing proprietary to install. If your tool supports MCP, grāmatr works with it.
One baseURL. Either protocol.
For tools that speak the OpenAI or Anthropic API directly, integration is a single environment variable — point them at the grāmatr gateway and every request runs the Loop:
# OpenAI-compatible tools (Cursor, ChatGPT, VS Code)
export OPENAI_BASE_URL=https://gateway.gramatr.com/v1
export OPENAI_API_KEY=<your-grāmatr-key>
# Anthropic-compatible tools (Claude Code, Claude SDK)
export ANTHROPIC_BASE_URL=https://gateway.gramatr.com
export ANTHROPIC_API_KEY=<your-grāmatr-key>Beyond the gateway, a grāmatr plugin marketplace — in private preview today, with a public marketplace on the roadmap — adds tool-native support for the specific tools your teams already run.
For development environments like Claude Code and Cursor, grāmatr deploys as an MCP server configuration — no proprietary tooling required. For enterprise-wide rollout, grāmatr connects to every AI tool your teams use via MCP or REST, with admin-governed access controls and deployment telemetry from day one.
Frequently asked questions.
How is grāmatr different from Mem0 or Zep?
Mem0, Zep, and similar tools are memory layers — they store context and retrieve it when prompted. grāmatr is an intelligence layer. The difference: memory tools give your AI access to old information. grāmatr builds intelligence from interaction patterns — routing accuracy, context compression, skill extraction — without training on the content of your work. grāmatr also routes requests intelligently before they reach expensive models and creates new capabilities from your work patterns.
Does grāmatr work with ChatGPT, not just Claude?
Yes. grāmatr is built on MCP (Model Context Protocol), which is becoming the industry standard for AI tool integration. It works with Claude Code, Cursor, ChatGPT, Gemini, and browser-based AI tools. Your intelligence layer is not locked to any single model or platform. One intelligence layer, every tool — that is the design principle.
How is my data protected?
All data is encrypted at rest with row-level security enforced at the database level. Your intelligence is isolated by architecture, not policy. Team and enterprise features include admin-controlled governance with full scope hierarchy. Read more about enterprise security →
How does grāmatr integrate with an existing enterprise AI program?
grāmatr sits above the model layer as an intelligence infrastructure. It connects to every AI tool your teams use via MCP or REST API — Claude Code, Cursor, ChatGPT, Gemini, Codex, and any MCP-native platform. Deployment is additive: teams connect their existing tools to grāmatr without changing their workflows. The intelligence layer begins learning from the first interaction. See enterprise deployment options →
How quickly does enterprise deployment compound?
Measurably from day one — practitioners stop rebuilding context every session immediately. Within the first week, routing accuracy improves as the classifier learns your organization's request patterns. Within the first month, reusable skills begin emerging from team workflows. By quarter one, the drop in context delivered per request is typically observable in production telemetry. The compounding is architectural — it accelerates as more sessions contribute to the learning layer.
Deploy the intelligence layer your AI program is missing.
Every session your teams run makes the next one more accurate, more efficient, and more auditable. Institutional intelligence compounds from day one. Talk to Us
See enterprise deployment options, or review the proof.