How It Works

Five stages.
One Loop.

Real-time intelligent context engineering is not a feature. It is a disciplined five-stage Loop that runs before, during, and after every interaction with every AI tool you use — and that levels you up every time it cycles.

1. Classify
2. Deliver
3. Execute
4. Shape
5. Learn
↻ Loop closes — next cycle starts at the new level

The Loop.

Most AI tools treat context as a straight line: store something, retrieve it later. grāmatr works as a cycle — five stages that run on every request, with the last stage feeding the first. Every rotation makes the next one sharper.

1

Classify

Every request is classified in milliseconds before the model runs. The output is not a label; it is a contract describing how much effort the request deserves, what intent is being expressed, which capabilities apply, what hard constraints bind the work, and what quality criteria the answer must satisfy.

2

Deliver

The pre-classifier decides whether context is needed at all — and if it is, exactly what. Only that gets delivered, assembled in real time from a typed knowledge graph and routed to the model before it runs. Same payload, every tool: Claude, Codex, Cursor, Gemini.

3

Execute

The agent does not free-run. It executes through a typed phase template with a mandatory plan gate, so the work proceeds in disciplined steps rather than a single uncontrolled pass — and produces output the rest of the Loop can verify and learn from.

4

Shape

Every output runs against typed quality-gate criteria set before the work began. The output either meets the gate or it does not ship. Every PASS or FAIL is recorded with evidence: the audit trail your procurement team and your AI-skeptics both ask for.

5

Learn

Every gated outcome becomes a signal that feeds the classifier. The next request starts smarter than the last. Patterns get promoted into reusable skills. The cycle doesn't just repeat — it levels up.

The five stages of the Loop.

01

Classify — A contract, not a label

Most AI memory tools retrieve everything that might be relevant and dump it into the context window. The more memories you store, the more noise your AI has to sort through. Context gets longer. Responses get slower and less accurate.

grāmatr takes the opposite approach. Before every request reaches an expensive AI model, the patent-pending pre-classification architecture triages it in milliseconds. The output is not a label; it is a contract describing how much effort the request deserves, what intent is being expressed, which capabilities apply, what hard constraints bind the work, and what quality criteria the answer must satisfy.

The frontier model receives that contract and starts working immediately, instead of burning thousands of tokens figuring out what was wanted.
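To make the idea of a contract concrete, here is a minimal Python sketch. It is illustrative only: the `Contract` type, its field names, and the keyword-based `classify` function are assumptions made for this example, not grāmatr's actual schema or classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Hypothetical shape of a classification result (all names are assumptions)."""
    effort: str                        # e.g. "low", "medium", "high"
    intent: str                        # what the request is trying to achieve
    capabilities: tuple[str, ...]      # which capabilities apply
    constraints: tuple[str, ...]       # hard constraints the work must respect
    quality_criteria: tuple[str, ...]  # what a passing answer must satisfy


def classify(request: str) -> Contract:
    """Toy stand-in for a real classifier: simple keyword rules only."""
    text = request.lower()
    if "deploy" in text:
        return Contract(
            effort="high",
            intent="deploy",
            capabilities=("shell", "ci"),
            constraints=("require human approval before executing",),
            quality_criteria=("release is verified after deployment",),
        )
    return Contract(
        effort="low",
        intent="answer",
        capabilities=(),
        constraints=(),
        quality_criteria=("response addresses the question",),
    )


contract = classify("Deploy the site to production")
print(contract.effort, contract.intent)
```

The point of the sketch is the shape of the result: everything downstream can read effort, constraints, and quality criteria from one structured object instead of inferring them from the raw request.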

"Context engineering is the delicate art and science of filling the context window with just the right information for the next step."

Andrej Karpathy, former Tesla AI Director and OpenAI co-founder

"Building with language models is becoming less about finding the right words for your prompts, and more about answering the broader question of 'what configuration of context is most likely to generate our model's desired behavior?'"

Classify runs in under 100 milliseconds. Every request. Every tool.

02

Deliver — Surgical context, every tool

Without the Loop, every turn pays four taxes: the full system prompt rides along, the model spends reasoning tokens figuring out what context it needs, it spends tool tokens searching for and fetching that context, and it pays again when the first fetch was wrong.

With the Loop, the pre-classifier decides on every turn whether context is needed at all and, if it is, exactly what. Only that gets delivered, assembled in real time from a typed knowledge graph and routed to the model before it runs. The model no longer pays those four taxes on every turn, and the savings compound across every turn of every session.

And the payload is portable. The same intelligence packet routes to Claude Code, Cursor, ChatGPT, Gemini, and every MCP-native tool. Start a project in one, hand it off to another, come back. Your context travels — and your investment is in the intelligence layer, not any single model.

Consistency across tools today. Freedom to adopt better tools tomorrow.
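The selection step described in this stage can be sketched in a few lines of Python. This is a toy under stated assumptions: the `Node` type, the in-memory `GRAPH`, and the `deliver` function are hypothetical names invented for the example, and a real typed knowledge graph would be far richer than a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One typed entry in a toy knowledge graph (hypothetical structure)."""
    kind: str   # e.g. "preference", "project", "convention"
    topic: str
    text: str


GRAPH = [
    Node("convention", "python", "Use type hints on public functions."),
    Node("project", "website", "The site is built with a static generator."),
    Node("preference", "style", "Prefer short commit messages."),
]


def deliver(needs_context: bool, topics: set[str]) -> list[str]:
    """Return only the context the request needs, or nothing at all."""
    if not needs_context:
        return []
    return [node.text for node in GRAPH if node.topic in topics]


# A greeting needs no context; a Python question gets one targeted entry.
print(deliver(False, set()))
print(deliver(True, {"python"}))
```

Because the payload is plain text selected ahead of the model call, the same result can be handed to any tool, which is the portability property the section describes.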

03

Execute — Disciplined work, not free-running agents

One of the most common concerns about AI is: what if it does something I didn't ask for? Autonomous agents that act without oversight create anxiety for good reason. A single misrouted action can waste hours or break things.

grāmatr addresses this in Execute. The agent does not free-run. It executes through a typed phase template with a mandatory plan gate, so the work proceeds in disciplined steps — OBSERVE, THINK, PLAN, stop for approval, BUILD, VERIFY, LEARN — rather than a single uncontrolled pass.

The model produces output the rest of the Loop can verify and learn from. Plan means stop: high-stakes actions present what they intend to do and wait for confirmation before executing. That is not a suggestion — it is architecture.

See the full control-gate sequence below →

04

Shape — Typed gates, audited outcomes

Velocity without quality is a spike. grāmatr's Shape stage is where AI behavior gets shaped to your standard, not the model's default.

Every output runs against typed quality-gate criteria set before the work began — not after. The output either meets the gate or it does not ship. Every PASS or FAIL is recorded with evidence: the audit trail your procurement team and your AI-skeptics both ask for.

This is the difference between an AI that produced something and an AI that produced something that meets your bar. Shape turns the second into a default, not an exception.

An audit trail isn't paperwork. It is what makes velocity legible — to your team, to procurement, to anyone who needs to trust the output.
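A quality gate of this kind can be sketched as a list of named checks that is fixed before the work starts, plus a record of how each check came out. The `Criterion`, `GateRecord`, and `run_gate` names below are assumptions for illustration, not grāmatr's implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    """A named check defined before the work starts (hypothetical type)."""
    name: str
    check: Callable[[str], bool]


@dataclass(frozen=True)
class GateRecord:
    """Audit entry: overall verdict plus per-criterion evidence."""
    verdict: str               # "PASS" or "FAIL"
    evidence: dict[str, bool]  # criterion name -> whether it was met


def run_gate(output: str, criteria: list[Criterion]) -> GateRecord:
    """Evaluate every criterion and keep the individual results as evidence."""
    evidence = {c.name: c.check(output) for c in criteria}
    verdict = "PASS" if all(evidence.values()) else "FAIL"
    return GateRecord(verdict, evidence)


# Criteria are declared up front, before any output exists.
criteria = [
    Criterion("non_empty", lambda out: bool(out.strip())),
    Criterion("mentions_tests", lambda out: "test" in out.lower()),
]

record = run_gate("Added the parser and unit tests.", criteria)
print(record.verdict, record.evidence)
```

Keeping the per-criterion results, not just the verdict, is what turns a pass/fail flag into an audit trail someone else can inspect.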

05

Learn — The flywheel, and new skills from your work

Most AI tools start every session from zero. Developers lose 10 to 30 minutes each morning rebuilding context that existed yesterday — and over a work week, that adds up to hours of wasted effort.

Learn changes that trajectory. Every gated outcome becomes a signal that feeds the classifier — and what that compounds into looks like this:

Day 1

Your AI remembers your preferences, your project structure, and your instructions — across sessions, not just within one.

Week 1

Requests start getting routed more accurately. The classifier identifies what kind of work you do most and adjusts.

Month 1

Patterns in your decision-making emerge. Your AI anticipates your coding conventions, your communication style, and your workflow preferences without being told.

Month 3

The intelligence packet has shrunk from 40,000 tokens to 1,200. Not because anything was deleted. Because the system learned you well enough that it no longer needs to explain everything from scratch.

That 97% reduction is not compression. It is the measurable output of a system that learned.
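The 97% figure follows directly from the two token counts quoted above, as this short check shows:

```python
before_tokens = 40_000
after_tokens = 1_200

# Fraction of the original packet that is no longer sent.
reduction = (before_tokens - after_tokens) / before_tokens
print(f"{reduction:.0%}")  # prints "97%"
```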

Patterns become skills

Learn doesn't just sharpen the classifier — it grows new capabilities. You have a productive session where you solve a problem in a specific way. The system detects the pattern. That pattern gets formalized into a specification. The specification becomes a deployable skill — a repeatable workflow that you, your team, or your entire organization can use. New skills ship without redeploying anything.

Two skills currently in production were promoted from usage patterns this way. WriteWebsite was born from a single productive session building the NEXT90 website — and it's the skill that generated the website you're reading now. DeployGramatr was learned from the manual production deployment workflow Brian was running by hand, then formalized into a one-command skill that shipped 15 production releases in the breakthrough week of March 24–31, 2026.

When teams adopt grāmatr, admins control exactly which capabilities get shared and which stay private. One person's discovery becomes the whole team's advantage — with full governance over what propagates.
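The promotion path described above (pattern detected, then formalized, then shipped as a skill) could be approximated as follows. This is a sketch under assumptions: the `promotable` function, the pattern names, and especially the `min_passes` threshold are invented for the example. The source does not say how many observations are required, and notes that one skill came from a single session.

```python
from collections import Counter


def promotable(outcomes: list[tuple[str, str]], min_passes: int = 3) -> list[str]:
    """Return patterns whose PASS count reaches the (assumed) threshold."""
    passes = Counter(pattern for pattern, verdict in outcomes if verdict == "PASS")
    return sorted(p for p, n in passes.items() if n >= min_passes)


# Each entry is (pattern name, gate verdict) from a finished piece of work.
outcomes = [
    ("deploy-checklist", "PASS"),
    ("deploy-checklist", "PASS"),
    ("deploy-checklist", "PASS"),
    ("ad-hoc-refactor", "PASS"),
    ("ad-hoc-refactor", "FAIL"),
]
print(promotable(outcomes))
```

Counting only gated PASS outcomes ties promotion to work that met the bar, rather than to anything that merely happened often.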

Your AI plans before it acts.

One of the most common concerns about AI tools: "What if it does something I did not ask for?" Autonomous agents that act without oversight create anxiety for good reason. A single misrouted action can waste hours or break things.

grāmatr addresses this with structured control gates. Every complex request follows a deliberate sequence:

OBSERVE: Analyze the request, gather context
THINK: Evaluate options, identify risks
PLAN: Propose a course of action
STOP: Wait for human approval
BUILD: Execute the approved plan
VERIFY: Check the work against success criteria
LEARN: Feed outcomes back into the intelligence loop

Plan means stop. Your AI presents what it intends to do and waits for your confirmation before executing. This is not a suggestion — it is architecture. The system is designed so that high-stakes actions require human approval before they proceed.
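The sequence above can be sketched as a loop that refuses to enter BUILD without approval. The `run_phases` function and the `approve` callback are hypothetical names for this example. A real system would attach work to each phase, while this sketch only models the gate.

```python
from typing import Callable

PHASES = ["OBSERVE", "THINK", "PLAN", "BUILD", "VERIFY", "LEARN"]


def run_phases(approve: Callable[[str], bool], plan: str) -> list[str]:
    """Walk the phases in order; halt after PLAN unless the plan is approved."""
    completed = []
    for phase in PHASES:
        if phase == "BUILD" and not approve(plan):
            completed.append("STOPPED")
            break
        completed.append(phase)
    return completed


# Approval granted: every phase runs. Approval denied: nothing is built.
print(run_phases(lambda plan: True, "rename the config file"))
print(run_phases(lambda plan: False, "delete the production database"))
```

Placing the check in the control flow, rather than in a prompt instruction, is what makes the stop structural: there is no code path that reaches BUILD without a `True` from `approve`.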

The AI memory space is focused on more autonomy. grāmatr is focused on smarter autonomy — where the AI knows when to act and when to ask.

Built on security.

Security is not a feature we added to grāmatr. It is the foundation everything else sits on.

Your data is yours

Interaction data is encrypted at rest at the storage layer, with database-level row isolation per user. This is an architectural control, not a policy decision.

Your interactions make your AI smarter — isolated by row-level security and encrypted at rest. Team and enterprise sharing is governed by explicit admin controls.

Tiered directive governance

What makes grāmatr's security model different is that it extends to how intelligence itself propagates:

User level

Your interactions build your intelligence. Encrypted at rest, isolated by row-level security at the database level.

Team level

Team admins decide which patterns get shared across the team. Everything else stays private. No data flows between users without explicit admin authorization.

Enterprise level

Enterprise admins control what gets incorporated into organizational intelligence. Full authorization required. Governed by the five-level scope hierarchy.
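The propagation rule across these levels can be illustrated with a small in-memory model. Everything here is an assumption for the sake of the example: the `Scope`, `Pattern`, `authorize`, and `visible_to` names are hypothetical, only the three levels named above are modeled, and the real enforcement is described as happening at the database level rather than in application code.

```python
from dataclasses import dataclass
from enum import IntEnum


class Scope(IntEnum):
    """The three levels named above; any further levels are not modeled."""
    USER = 1
    TEAM = 2
    ENTERPRISE = 3


@dataclass
class Pattern:
    owner: str
    name: str
    approved_scope: Scope = Scope.USER  # private until an admin widens it


def authorize(pattern: Pattern, scope: Scope, is_admin: bool) -> None:
    """Only an admin may widen a pattern's scope (hypothetical rule)."""
    if not is_admin:
        raise PermissionError("admin authorization required")
    pattern.approved_scope = scope


def visible_to(pattern: Pattern, viewer: str, relationship: Scope) -> bool:
    """relationship is TEAM for a teammate, ENTERPRISE for a same-org colleague."""
    if viewer == pattern.owner:
        return True
    if relationship == Scope.USER:
        return False  # USER scope never exposes data to anyone else
    return pattern.approved_scope >= relationship


p = Pattern(owner="alice", name="deploy-checklist")
print(visible_to(p, "bob", Scope.TEAM))  # False: nothing shared yet
authorize(p, Scope.TEAM, is_admin=True)
print(visible_to(p, "bob", Scope.TEAM))  # True: a team admin approved sharing
```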

In our analysis of the leading AI context tools — Mem0, Zep, Letta, LangMem — as of March 2026, none offer tiered governance with row-level security and encryption at rest. grāmatr enforces isolation at the database level — your intelligence is protected by architecture, not policy.

Works with your tools.

grāmatr connects to your AI tools through the Model Context Protocol (MCP) — the open standard that Anthropic describes as "a USB-C port for AI applications."

There is nothing proprietary to install. If your tool supports MCP, grāmatr works with it.

Claude Code: Supported
Cursor: Supported
ChatGPT: Supported
Gemini: Supported
Web Apps: Supported

Setup takes minutes, not days. For developer tools like Claude Code and Cursor, it is an MCP server configuration. For ChatGPT, Gemini, and browser-based tools, grāmatr connects through its web interface. Either way, your intelligence layer starts learning from your first interaction.
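For readers unfamiliar with MCP server configuration: several MCP clients, including Claude Code and Cursor, read server definitions from a JSON file with a top-level `mcpServers` object. The Python snippet below prints an entry of that shape. The server name, command, and argument are placeholders invented for this example, not grāmatr's published values, and the file name and location vary by tool.

```python
import json

# Hypothetical server entry: the name, command, and argument below are
# placeholders for illustration, not grāmatr's published values.
config = {
    "mcpServers": {
        "example-intelligence-layer": {
            "command": "example-mcp-server",
            "args": ["--stdio"],
        }
    }
}

# Print the JSON that a tool's MCP configuration file would contain.
print(json.dumps(config, indent=2))
```

Check your tool's documentation for where this file lives and use the actual values from the grāmatr setup instructions.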

Frequently asked questions.

How is grāmatr different from Mem0 or Zep?

Mem0, Zep, and similar tools are memory layers — they store context and retrieve it when prompted. grāmatr is an intelligence layer. The difference: memory tools give your AI access to old information. grāmatr learns from every interaction and gets smarter over time. The 40,000-to-1,200 token reduction is not a storage optimization — it is the measurable result of a system that learned. grāmatr also routes requests intelligently before they reach expensive models, creates new capabilities from your work patterns, and carries your preferences across every AI tool you use.

Does grāmatr work with ChatGPT, not just Claude?

Yes. grāmatr is built on MCP (Model Context Protocol), which is becoming the industry standard for AI tool integration. It works with Claude Code, Cursor, ChatGPT, Gemini, and browser-based AI tools. Your intelligence layer is not locked to any single model or platform. One intelligence layer, every tool — that is the design principle.

How is my data protected?

All data is encrypted at rest with row-level security enforced at the database level. Your intelligence is isolated by architecture, not policy. Team and enterprise features include admin-controlled governance with full scope hierarchy. Read more about enterprise security →

How long until my AI actually gets smarter?

You will notice differences from day one: your AI remembers your preferences and project context across sessions immediately. Within the first week, routing accuracy improves as the system learns what kinds of requests you make. As usage increases, patterns in your workflow start shaping how your AI responds. The intelligence packet progressively shrinks, delivering better, more targeted results with less data.

Do I need to be a developer to use grāmatr?

No. grāmatr works with browser-based AI tools like ChatGPT and Gemini through its web interface — no command line required. If you use developer tools like Claude Code or Cursor, setup is an MCP server configuration that takes minutes. Either way, once connected, grāmatr works in the background. You interact with your AI tools exactly as you do now — they just get smarter over time. Get started →

Ready to make your AI smarter?

The flywheel starts turning the moment you connect.

Request Early Access