The productivity floor is about to move. Permanently.
Not a spike. Not a demo. A new sustained baseline — driven by real-time intelligent context engineering that classifies every request, delivers only the context it needs, and verifies every output against typed quality gates. Below: the mechanism, the team multiplier you should plan for, and a much higher demonstrated ceiling from a public twelve-month GitHub timeline.
Where most knowledge workers ship from today.
A solo practitioner, a six-person product team, a forty-person enterprise function — the shape is the same. Whether the output is code, research briefs, contracts, campaigns, analyses, decks, content, or customer responses, the AI underneath the work resets every session. Ten to thirty minutes at the start of every interaction goes to re-explaining context. The same conventions get re-pasted every Monday. The same corrections get re-made every Tuesday. The same near-misses ship through review because nothing learned from the last round.
That is the floor — not because the people are bad, not because the model is bad, but because the layer between them is doing nothing. The productivity tax of context-rebuilding is real, recurring, and quietly compounding in the wrong direction.
Industry baselines on engineering throughput are drawn from DORA / State of DevOps research. The same pattern shows up in legal, research, marketing, and operations workflows: every role that uses AI daily pays the same context tax.
One operator. Three eras. A standard team baseline for reference.
The bars are one operator's public weekly contributions over the trailing twelve months, colored by era. The bottom dashed line is the industry reference: what a typical disciplined 8-person dev team produces per week in total (commits + PRs + issues + reviews combined). The comparison gets fairer as the eras go on — the gray period is unstructured solo work, the orange period is structured platform-building, the blue period is the Loop enforcing discipline on every request. The story reads in two glances: how high the operator's bars went, and how the structure underneath them changed.
Solo work across multiple codebases, no intelligence-context platform underneath, no enforced structure. Output looks like a typical strong individual contributor: bursts of high activity, quiet stretches between, holiday troughs. Raw effort produced raw output. The 8-person-team reference line on the chart isn't a fair comparison at this stage — a team brings review, conventions, and continuity that solo-coder output doesn't have. The point of this phase isn't 'I out-coded a team.' The point is that nothing compounded across weeks, because there was nothing under the work doing the compounding.
The substrate underneath the Loop was being built: storage, retrieval, the classifier itself. The mechanism wasn't yet running on every request, but the work was structured around building it, which forced some of the discipline a real platform demands. The average across this window was ~210 contributions per week, and the lift from platform work is visible in the chart. Now the team-baseline reference starts to become a fair comparison: this is structured platform work, the kind a team produces.
The real-time intelligent context engineering layer came online on a single dated change: a merged pull request in a private repository, March 24, 2026. Weekly contributions since: 599, 776, 845, 1,098, 1,817, 1,852, 1,484, 893. The 8-week trailing average is ~1,170 contributions per week, roughly 14.6 times the ~80/wk reference for a typical disciplined 8-person team, sustained for eight straight weeks. The public contribution calendar shows it holding.
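The trailing average and its ratio to the team reference can be reproduced from the eight weekly counts quoted above. A minimal sketch in Python; the ~80/wk figure is the estimated team reference used on this page, and the variable names are illustrative.

```python
from statistics import mean

# Weekly contribution counts since the March 24, 2026 change, as quoted above.
weekly = [599, 776, 845, 1098, 1817, 1852, 1484, 893]

# Estimated weekly total for a typical 8-person team (the page's reference line).
TEAM_BASELINE_PER_WEEK = 80

trailing_avg = mean(weekly)
ratio = trailing_avg / TEAM_BASELINE_PER_WEEK

print(f"8-week average: {trailing_avg:.1f} contributions/week")
print(f"Ratio to team reference: {ratio:.1f}x")
```

Swapping in a different baseline estimate changes only the ratio, not the average, which comes straight from the public calendar.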
Why does the floor move and stay? Real-time intelligent context engineering, in five stages.
Routing without discipline is a spike. Discipline without speed is a meeting. The floor moves — and stays — because every request runs through the same five-stage Loop. The last stage, Learn, is what makes the cycle a flywheel rather than a one-shot.
Classify
Every request is pre-classified in milliseconds. The classifier produces a structured directive that tells the model how to approach the work before reasoning begins. The model starts immediately, instead of figuring out what kind of work the turn is while doing it.
Deliver context, just-in-time
Only the context this specific turn needs gets delivered — assembled in real time, routed to the model before the model runs. Without the Loop, every turn pays a context tax in reasoning tokens, fetch tokens, and retries when the first fetch was wrong. With the Loop, the model goes straight to the work. The savings compound across every turn of every session.
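The first two stages can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: the `Directive` type, the keyword rules, the routing table, and the context store are all hypothetical stand-ins for a real classifier and retrieval layer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Directive:
    """Structured result of classification (hypothetical shape)."""
    work_type: str
    context_keys: tuple  # names of the stored context this turn needs

# ASSUMPTION: a toy routing table mapping work type to required context.
ROUTES = {
    "code_review": ("team_conventions", "recent_review_notes"),
    "contract_draft": ("clause_library", "client_preferences"),
}

# ASSUMPTION: keyword rules stand in for a real classifier.
KEYWORDS = {
    "code_review": ("review", "pull request", "diff"),
    "contract_draft": ("contract", "clause", "nda"),
}

def classify(request: str) -> Directive:
    """Pick a work type before any reasoning happens."""
    text = request.lower()
    for work_type, words in KEYWORDS.items():
        if any(word in text for word in words):
            return Directive(work_type, ROUTES[work_type])
    return Directive("general", ())

def deliver_context(directive: Directive, store: dict) -> dict:
    """Return only the entries the directive asks for, nothing else."""
    return {k: store[k] for k in directive.context_keys if k in store}

store = {
    "team_conventions": "Use snake_case; tests are required.",
    "recent_review_notes": "Last week: watch for unchecked None.",
    "clause_library": "Standard indemnity and confidentiality clauses.",
}
directive = classify("Please review this pull request")
context = deliver_context(directive, store)
print(directive.work_type, sorted(context))
```

The point of the shape is that the model receives two entries here rather than the whole store, and an unrecognized request receives none.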
Execute
The agent does not free-run. It works through a structured phase template with enforced planning, so the work proceeds in disciplined steps rather than a single uncontrolled pass.
Shape
Every output runs against quality criteria set before the work began. The output either meets the gate or it does not ship. Every PASS or FAIL is recorded with evidence — the audit trail your procurement team and your AI-skeptics both ask for. This is where AI behavior gets shaped to your standard, not the model's default.
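One way to picture a gate that records PASS or FAIL with evidence is sketched below. The `Gate` and `GateResult` types and the two example criteria are hypothetical; a real deployment would define criteria that match its own standard.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass(frozen=True)
class Gate:
    """A named criterion, defined before the work starts (hypothetical)."""
    name: str
    check: Callable[[str], bool]

@dataclass(frozen=True)
class GateResult:
    """One audit-trail entry: which gate ran, the verdict, and why."""
    gate: str
    passed: bool
    evidence: str

def run_gates(output: str, gates: List[Gate]) -> Tuple[bool, List[GateResult]]:
    """Run every gate; the output ships only if all of them pass."""
    results = []
    for gate in gates:
        ok = gate.check(output)
        verdict = "PASS" if ok else "FAIL"
        results.append(
            GateResult(gate.name, ok, f"{gate.name}: {verdict} on {len(output)} chars")
        )
    return all(r.passed for r in results), results

# ASSUMPTION: two illustrative criteria, set before any output exists.
gates = [
    Gate("non_empty", lambda s: bool(s.strip())),
    Gate("max_500_chars", lambda s: len(s) <= 500),
]

ships, audit = run_gates("Summary: revenue rose 4% quarter over quarter.", gates)
for entry in audit:
    print(entry.evidence)
print("ships:", ships)
```

Every gate runs even after a failure, so the audit trail shows all the reasons an output was held back rather than only the first.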
Learn
Every gated outcome becomes labeled signal that feeds the classifier. The next request starts smarter than the last. The cycle doesn't just repeat — it levels up. Learn is what makes the Loop a flywheel instead of a one-shot, and it is why the new floor does not drift back down.
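A minimal sketch of the feedback step, assuming gated outcomes are stored as (work type, passed) pairs. The `OutcomeLog` class is hypothetical; how a real classifier consumes this signal is not described on this page.

```python
from collections import defaultdict
from typing import Optional

class OutcomeLog:
    """Accumulates gate verdicts per work type (hypothetical design)."""

    def __init__(self) -> None:
        # work_type -> [passes, total]
        self._counts = defaultdict(lambda: [0, 0])

    def record(self, work_type: str, passed: bool) -> None:
        """Store one labeled outcome from the gate stage."""
        counts = self._counts[work_type]
        counts[0] += int(passed)
        counts[1] += 1

    def pass_rate(self, work_type: str) -> Optional[float]:
        """Share of outputs that passed, or None if nothing is recorded."""
        passes, total = self._counts.get(work_type, (0, 0))
        return passes / total if total else None

log = OutcomeLog()
log.record("code_review", True)
log.record("code_review", False)
log.record("contract_draft", True)

print(log.pass_rate("code_review"))
print(log.pass_rate("unknown"))
```

A classifier could, for example, read a low pass rate as a cue to deliver more context for that work type on the next request; that policy is an assumption for illustration.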
Velocity alone is a spike. Velocity inside disciplined gates, with audited outcomes feeding the next decision, is a permanent level-up.
What a moved floor looks like — solo, team, or organization.
The honest pitch is not 10×. The honest pitch is that the floor rises and stays up. For a solo practitioner, that means an AI that stops resetting every morning and finishes the day's work measurably faster than yesterday. For a team, it means a 1.5× to 3× sustained throughput multiplier — reviews compound, conventions persist, new hires onboard at week one with the same context the team built over months.
At blended professional-services rates — the kind of math an enterprise sponsor will actually run — a 2× sustained floor across a ten-person AI-augmented team is millions of dollars per year in recovered billable capacity. Not a one-time productivity win. An ongoing operating-cost shift. The same multiplier applies whether the work output is code, briefs, contracts, campaigns, or customer responses.
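The arithmetic behind that claim can be sketched in a few lines. The billable hours and blended rate below are hypothetical inputs chosen only to show the calculation; they are not figures from this page, so substitute your own.

```python
TEAM_SIZE = 10    # from the text above
MULTIPLIER = 2.0  # the 2x sustained floor from the text above

# ASSUMPTIONS: illustrative inputs, not sourced figures.
BILLABLE_HOURS_PER_PERSON_PER_YEAR = 1_600
BLENDED_RATE_USD_PER_HOUR = 200

# Value of the team's current billable capacity.
baseline_value = (
    TEAM_SIZE * BILLABLE_HOURS_PER_PERSON_PER_YEAR * BLENDED_RATE_USD_PER_HOUR
)

# Extra capacity gained when throughput moves from 1x to MULTIPLIER.
recovered_value = baseline_value * (MULTIPLIER - 1)

print(f"Baseline capacity:  ${baseline_value:,.0f} per year")
print(f"Recovered capacity: ${recovered_value:,.0f} per year")
```

With these inputs the recovered capacity equals the baseline, because a 2× multiplier adds one full team's worth of output; at 1.5× it would be half that.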
And the reality is, it can go much higher.
The 1.5–3× multiplier is what a team should plan for. The ceiling — what the same Loop produces at the top of its range — is much higher. The evidence below comes from a software-platform context because that is where the operator's public artifacts live; the Loop applies identically to any work that leaves a paper trail. Some receipts are fully public; others are private artifacts available for review under enterprise due diligence.
Public receipts can be opened by anyone at the links above. Private artifacts are available for review under standard enterprise due diligence — contact us to schedule.
What does the chart show? Twelve months, one dated inflection, a new sustained level.
Public GitHub contributions for the first user of the Loop, trailing twelve months. The prior floor varied — bursts and quiet stretches, classic context-resets-every-session output. A single dated inflection on March 24, 2026 (PR #77, multi-stage classification) takes the weekly average from roughly 140 to roughly 1,170 — and it has not fallen back.
Public GitHub contributions — trailing 12 months, every contribution independently verifiable
Sources. Operator bars and per-era averages: public GitHub contribution calendar, May 12, 2025 – May 10, 2026 (53 weekly buckets, 15,840 contributions). The ~80/wk 8-person-team reference is an estimate from published industry benchmarks (DORA / State of DevOps, GitHub Octoverse), not a number we made up. Methodology section below explains what it does and does not capture.
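The "roughly 140" pre-inflection average can be cross-checked against the totals in the Sources note. This sketch assumes the eight post-change weekly counts quoted earlier on this page are the last eight of the 53 weekly buckets.

```python
TOTAL_CONTRIBUTIONS = 15_840  # trailing twelve months, per the Sources note
TOTAL_WEEKS = 53              # weekly buckets, per the Sources note

# Weekly counts since the March 24, 2026 change, as quoted on this page.
post_inflection = [599, 776, 845, 1098, 1817, 1852, 1484, 893]

# ASSUMPTION: everything outside those eight weeks is pre-inflection.
pre_total = TOTAL_CONTRIBUTIONS - sum(post_inflection)
pre_weeks = TOTAL_WEEKS - len(post_inflection)
pre_avg = pre_total / pre_weeks

print(f"Pre-inflection: {pre_total} contributions over {pre_weeks} weeks")
print(f"Pre-inflection average: {pre_avg:.0f} per week")
```

The result lands near the rounded figure quoted above, which suggests the totals and the weekly series are mutually consistent.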
The floor rises wherever the Loop runs.
The chart shows one operator because the data is public. The principle — real-time intelligent context engineering with typed quality gates — applies wherever AI work happens daily: solo, team, or enterprise; engineering, legal, research, marketing, operations, or anything else.
Individuals
Your AI stops resetting every morning. Your preferences persist. Your conventions persist. Your last week's decisions persist. The personal floor rises and stays — across every tool you use.
For Individuals →
Teams
Conventions persist across every team member. Reviews compound. New hires onboard at week one with the same context the team built over months. The team floor lifts and stays.
For Teams →
Enterprises
Audited outputs against typed quality gates. Token-savings as a verifiable metric. Compliance as code. The audit trail that procurement and legal both ask for.
For Enterprise →
Partners
Add real-time intelligent context engineering to your client deliverables. Revenue share, client-isolated tenancy, the same Loop your competitors cannot replicate.
For Partners →
What does GitHub not measure? The chart shows code velocity; the Loop accelerated everything else too.
Public contribution counts capture commits, pull requests, issues, and code reviews. They do not capture the rest of what the Loop produces — and the rest is where the enterprise case lives.
Code velocity is the measurable signal. The Loop produces the rest in the same hours.
How to read the numbers. How to verify them.
What "GitHub contributions" measures
GitHub contributions is a composite metric: commits to default branches, pull requests opened, issues opened, and code reviews submitted. It is broader than raw commit count. When the chart says "1,852 contributions," that includes all four activity types for the week.
The dated inflection
The week of March 22-28, 2026 is the visible inflection on the contribution calendar. The mechanism change behind it is a single merged pull request — the multi-stage classification pipeline — in a private repository. Before that PR, the Loop was being built. After it, the Loop was running. The trailing-eight-week average since then is ~1,170 contributions per week. The PR itself is available for review under enterprise due diligence; the calendar inflection is public.
The technical name of the mechanism
On this page the Loop is described as real-time intelligent context engineering. The underlying technical name in the patent application is pre-classification routing with intelligent contract delivery. The patent application is on file with the USPTO; the application number is held confidential and is available under standard enterprise due diligence. Classification completes in under 100ms per turn.
What the ~80/wk team reference is — and what it is not
The ~80/wk dashed reference line on the chart is an industry estimate for a typical disciplined 8-person dev team's total weekly contributions (commits + PRs + issues + reviews combined). It is calibrated from public benchmarks (DORA throughput tiers, GitHub Octoverse, State of DevOps) and intended as a directional anchor, not a precise team-by-team match. It does not capture every variable a particular organization's team produces — only the comparable signal: how much measurable code-side work a typical team ships per week.
Sourcing the 1.5–3× team multiplier
The 1.5×–3× sustained team-throughput multiplier cited on this site is derived from the operator-vs-team comparison visible on the chart above. Against the ~80/wk industry-baseline 8-person team, a single operator running the Loop has produced ~1,170/wk sustained over the most recent eight weeks, roughly 14.6× the team baseline. Even before the Loop ran on every request, the same operator, working on the platform substrate, averaged ~210/wk, roughly 2.6× a typical team. The 1.5×–3× claim is the conservative range a budgeted team should plan for: it sits well below the operator ceiling on the chart, and the pre-Loop platform-building phase alone (~2.6×) already falls inside it. We will publish quantitative team benchmarks once enough beta-customer team data is in to characterize the multiplier with confidence intervals.
Operator, team, and individual claims — read them separately
Three distinct numbers on this page, deliberately not conflated. (1) The operator ceiling (~1,170/wk post-inflection) is what one practitioner with the Loop running has actually produced: an existence proof. (2) The team multiplier (1.5×–3× sustained) is the conservative range a budgeted team should plan for above whatever its current baseline is. (3) The individual floor lift is qualitative on this page (your AI stops resetting, your preferences persist), and the personal-productivity literature suggests the gain scales with how heavily you use AI; quantitative individual benchmarks will be published once beta data characterizes them. Three different scopes, three different evidence bases.
How to verify — and what's behind the curtain
What's fully public, no login required: the trailing-12-month contribution calendar, the gramatr GitHub org (creation date, member list), the open-source brand-spec and brand-spec-validator repositories, and this site's /changelog. What's private but available under standard enterprise due diligence: the gramatr platform repository, individual PR contents, release-tag history, detailed git log, and operational repositories for internal infrastructure. Anyone can verify the chart; serious buyers can verify the rest under NDA.
Move your team's floor.
The floor is what your team should expect. The ceiling is what the same Loop has already done in public. The next move is yours.
First user, existence proof, public timeline: Brian Handrigan. The Loop is the product; the operator is one of the receipts.