NOOPS Weekly — Week of 14 April 2026

This was the week the harness thesis grew up.

Mark Pesce's Alpha and Harnesses paper — written for the University of Sydney — formalises what NOOPS has been tracking for months into a four-tier hierarchy: the router, the copilot, the dark factory, and the meta-harness. Each tier extracts more alpha from token expenditure than the last. Each has a shorter shelf life than the one below it, because the models absorb capability from the bottom up. The paper's verdict on the harness layer is both validating and sobering: "Harness alpha is real... but it is bleeding alpha, continuously depreciating." The businesses that thrive will ride the highest tier they can reach, extract maximum value while the window is open, and invest that value in the things the flywheel cannot mint — physical infrastructure, relationships, trust, regulatory position.

The same week that the harness thesis got its formal treatment, the open-weights tier crossed the frontier in a way that is difficult to dismiss. Qwen3.6-35B-A3B — a Mixture-of-Experts model with 3 billion active parameters, running on consumer hardware — approached Opus-class performance on agentic coding tasks. Simon Willison headlined it "Qwen beats Opus." It arrived the same day Anthropic released Opus 4.7. The gap between "good enough" and "best" is now compressing daily, not quarterly. Meanwhile, ASML's 83% share of the lithography market reminded everyone that the deepest infrastructure layers are immune to this compression — you cannot erode a €350M machine that weighs 180 tonnes with a better prompt.

Forty-four signals published across four days. The wiki saw fourteen thesis pages updated and gained three new entity pages. Five themes emerged.

The harness hierarchy: router, copilot, dark factory, meta-harness

The centrepiece of the week is Mark's paper, which takes the informal language NOOPS has been developing — harnesses, alpha, spoons, the token economy — and turns it into a structured analytical framework.

The hierarchy is clean. At the bottom, OpenRouter routes tokens to the cheapest provider. It adds no process, captures no productive alpha — only the informational alpha of watching where demand flows. Thin margin, high volume. Above it, Claude Code is the copilot: an interactive harness where the human and the model co-produce alpha, bounded by the human's capacity to direct and absorb the output. The practitioners who differentiate at this tier are not those who use the harness better — they are those who harness the harness, building CLAUDE.md files, custom skills, and prompt libraries that front-load process before the human even intervenes.
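Mechanically, the router tier is almost trivially simple, which is why its margin is thin. A minimal sketch of the idea in Python — the provider names and prices below are illustrative assumptions, not OpenRouter's actual catalogue or API:

```python
# Minimal sketch of the router tier: send each request to the cheapest
# provider that can serve it, and log where demand flows. Provider names
# and prices are invented for illustration.
from collections import Counter

PRICES_PER_MTOK = {  # hypothetical USD per million tokens
    "frontier-a": 15.00,
    "frontier-b": 10.00,
    "open-weights-c": 0.60,
}

demand_log = Counter()  # the router's only durable asset

def route(tokens_needed: int, capable: list[str]) -> tuple[str, float]:
    """Pick the cheapest capable provider; record the demand flow."""
    provider = min(capable, key=PRICES_PER_MTOK.__getitem__)
    demand_log[provider] += tokens_needed
    cost = tokens_needed / 1_000_000 * PRICES_PER_MTOK[provider]
    return provider, cost

provider, cost = route(2_000_000, ["frontier-b", "open-weights-c"])
print(provider, round(cost, 2))  # open-weights-c 1.2
```

Note that the only thing the router accumulates is `demand_log`: the informational alpha of watching where demand flows, exactly as the paper describes.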

Above the copilot sits the dark factory. Steve Yegge's Gas Town and the SkyPilot research-first agent methodology both deploy autonomous agent armies with minimal human involvement. SkyPilot's result is worth repeating: pointing Claude Code at llama.cpp with a research-first approach — where the agent reads arXiv papers and studies competing implementations before writing code — produced five kernel fusions that made CPU inference 15% faster on x86. A code-only agent would not have found these optimisations. The process that directs the tokens matters more than the volume of tokens consumed.
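The research-first pattern is a process shape more than a technology: a study phase accumulates context before any code is written. A toy sketch, where the agent interface, the stub, and the paper identifier are all hypothetical stand-ins rather than SkyPilot's actual harness:

```python
# Sketch of the research-first pattern: read papers and survey competing
# implementations first, then let the accumulated notes direct the coding
# tokens. Everything here is an illustrative stand-in.

class StubAgent:
    """Stand-in for a coding agent such as Claude Code."""
    def summarise(self, paper: str) -> str:
        return f"notes on {paper}"
    def survey(self, repo: str) -> str:
        return f"survey of implementations competing with {repo}"
    def write_patch(self, repo: str, context: str) -> str:
        return f"patch for {repo} grounded in {context.count(chr(10)) + 1} notes"

def research_first(agent: StubAgent, repo: str, papers: list[str]) -> str:
    notes = [agent.summarise(p) for p in papers]  # study phase: no code yet
    notes.append(agent.survey(repo))              # study rivals too
    return agent.write_patch(repo, "\n".join(notes))

print(research_first(StubAgent(), "llama.cpp", ["an-arxiv-paper"]))
```

The point of the shape is that `write_patch` never runs on an empty context, which is why a code-only agent misses optimisations a research-first one finds.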

And at the top: the meta-harness. Lee et al. at Stanford published Meta-Harness, a system that uses tokens to optimise the process that directs tokens. The results are not incremental: a 6× performance gap on the same benchmark by changing only the harness around a fixed model. On text classification, Meta-Harness outperformed the best hand-designed harness by 7.7 points while using 4× fewer context tokens. Mark's conclusion: "When the process itself becomes the target of token expenditure, alpha compounds."
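The mechanism is easy to caricature in a few lines: spend tokens evaluating harness variants, then direct the remaining budget through the best one. The variant knobs and the toy alpha surface below are invented for illustration; this is not the Stanford system:

```python
# Toy sketch of the meta-harness idea: search over harness configurations
# around a fixed model. The "alpha surface" is a made-up stand-in for
# actually running and scoring the model through each harness.

def run_harness(variant: dict) -> float:
    """Stand-in for one run through a harness; returns its alpha."""
    # invented surface: worked examples help, raw context dumps hurt
    return 0.5 + 0.1 * variant["examples"] - 0.02 * variant["context_pages"]

VARIANTS = [{"examples": k, "context_pages": p}
            for k in (0, 2, 4) for p in (1, 5, 20)]

def meta_harness() -> dict:
    """Tokens spent optimising the process that directs tokens."""
    return max(VARIANTS, key=run_harness)

print(meta_harness())  # {'examples': 4, 'context_pages': 1}
```

Even this caricature shows why the result pattern in the paper is plausible: the winning variant uses fewer context tokens, not more, because the search optimises the process rather than the volume.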

The hierarchy reads as a menu but behaves as a sequence. Each tier generates more alpha than the last, and each tier has a shorter shelf life. The Bitter Lesson predicts that general capability plus compute absorbs specialised engineering over time. Today's copilot advantage is measured in months. The dark factory's orchestration will be absorbed into the models. Even the meta-harness will be absorbed — and when it is, the model will improve itself without external harness engineering at all. As Mark wrote: "Building the best harness is a means, never an end."

Open weights cross the watershed — and Gresham's Law bites

The open-weights crossing that last week's weekly flagged as "aspiration becoming observable fact" now has its clearest data point yet. Qwen3.6-35B-A3B approaches frontier performance on agentic coding with a model that fits on a laptop with 8GB of RAM. Mark's reaction: "watershed crossing for open weights looking promising." The timing — the same day as Opus 4.7 — creates a natural experiment the market will watch closely.

Earlier in the week, the theoretical framework for this dynamic arrived. The NOOPS analysis of Gresham's Law applied to AI tokens argued that in domains where frontier and open-weights tokens generate equivalent alpha, the cheaper token wins. This is not a prediction — it is an economic law. "Good enough" tokens drive out great tokens, just as debased coinage drives out full-weight coinage in Gresham's original observation. The frontier's alpha premium compresses to the domains where capability differentials are still measurable: complex reasoning, finance, legal analysis.

The Good-Enough Token Mint signal expanded the argument: the best business on the spending side of the token economy is minting "good enough" tokens. Costs are far lower than frontier, demand is structurally larger, and Jevons' paradox operates at the mint level — cheaper tokens expand total demand rather than cannibalising it. The competitive pressure is fierce, and sustainable advantage is determined by input infrastructure, not model sophistication. Which points straight back to the infrastructure thesis.
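The Jevons claim has a standard textbook form: under constant-elasticity demand with elasticity above one, cutting the unit price grows total spend rather than shrinking it. A worked toy, with the elasticity and prices as assumptions:

```python
# Toy illustration of Jevons' paradox at the mint level: with demand
# elasticity > 1, a price cut expands total token spend. The elasticity
# value and prices are assumptions, not measurements.

def total_spend(price: float, elasticity: float, k: float = 1.0) -> float:
    quantity = k * price ** (-elasticity)  # constant-elasticity demand
    return price * quantity

before = total_spend(price=15.00, elasticity=1.5)
after = total_spend(price=0.60, elasticity=1.5)
print(after > before)            # True: cheaper tokens expand the market
print(round(after / before, 1))  # 5.0: a 25x price cut quintuples spend
```

With elasticity 1.5, spend scales as price to the power -0.5, so a 25x price cut multiplies total spend by the square root of 25. Cheaper tokens do not cannibalise the mint; they grow it.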

Gemma 2B on commodity x86 silicon now matching GPT-3.5, and Aisle's "system over model" approach finding zero-days with adequate intelligence, both reinforce the same conclusion from different angles: the harness and the system matter more than the model's benchmark score. When a budget model in a disciplined harness outperforms a frontier model in an undisciplined one, the Gresham dynamic is already operating.

Infrastructure declares itself at every layer

The infrastructure thesis continued to accumulate evidence faster than we could catalogue it.

At the deepest layer, ASML's 83% share of worldwide lithography machine sales is the smiling curve at its most extreme. This is more concentrated than TSMC's foundry share, more concentrated than Nvidia's GPU share. Each layer deeper in the semiconductor stack is more concentrated than the one above it. ASML's moat is not technical alone — it is physical. Each EUV system costs €200–350M, weighs 180 tonnes, and has no competing product from any other company on Earth.

At the regulatory layer, Australia is out front on two dimensions. The AEMC proposed grid-connection standards for data centres — requiring "ride-through" during voltage and frequency disturbances, aligned with Texas, Ireland, and Finland. Chair Anna Collyer: "Data centres aren't passive loads anymore; they're active grid participants." Simultaneously, Wired reported the US government will ask data centres to disclose power consumption — a measure Mark flagged as "SIGNAL — Australia led the way and now the US is following." Regulatory infrastructure is globalising, and Australia is setting the template.

At the financial layer, the Anthropic tender offer at US$800B and Bloomberg's report that Mythos is heading to UK financial institutions next week mark the frontier model companies' transition from technology vendors to institutional infrastructure. Pip White, Anthropic's EMEA North Head: "The engagement from CEOs in the UK has been strong." When institutional finance — one of the most conservative sectors for technology adoption — deploys frontier models directly rather than through mediated SaaS, the cultural-adoption thesis has crossed a threshold.

The NOOPS Blue Owl Capital analysis argued that the market is systematically misreading Blue Owl's 48% YTD decline as a signal about AI infrastructure risk. The decline reflects fear of AI disrupting the PE-owned software companies in Blue Owl's lending book — not weakness in the data centre infrastructure Blue Owl finances. The central irony: institutions are selling the physical substrate AI runs on to free capital for the AI IPOs that need that substrate. Selling the foundation to buy the building.

AWS Trainium capacity nearly sold out, helium identified as an overlooked infrastructure bottleneck, and the continuation of RAMageddon — now reaching Windows Surface devices as the thin client returns by stealth — all point in the same direction. The infrastructure layer is not overbuilt. It is rationed.

Paradigm casualties accumulate

The paradigm-shift thesis, tracked under Kuhn's framework of scientific revolutions, continued to find new casualties this week.

The pull request is going the way of the IDE. Sean Wang's Latent Space analysis, shared by John, argued that when agents generate code at volumes and velocities exceeding human review capacity, the PR becomes a bottleneck the new paradigm routes around. John's framing: "One of the foundational pieces of software engineering, in particular open source software, the pull request, is going the way of the IDE." The PR is not merely a tool — it is the mechanism by which distributed development communities coordinate trust, quality, and governance. What replaces it is an open question.

Snap's CEO named "small squads leveraging AI" in an SEC 8-K filing while laying off 1,000 employees (16% of workforce) and closing 300 roles. John noted the sentiment shift: "Two years ago such layoffs would send shareholders into a panic." The market now rewards AI-driven workforce reduction. Once "small squads leveraging AI" appears in 8-K filings, it becomes defensible for every other public company board considering the same move.

Ben Thompson turned on OpenAI — and the analytical framework broke with him. Microsoft renamed Copilot rather than rebuilding it, retreating from the brand rather than fixing the product. Steve Blank declared pre-watershed startups dead on arrival. Apollo flagged tech valuations back to pre-AI-boom levels. The grief cycle — anger, bargaining, depression — was visible across the discourse in a single day.

And Allbirds pivoted to AI, renaming itself NewBird AI and gaining 700% on a $21M market cap. A shoe company raising $50M on an AI announcement alone. John: "But even still down 95% on IPO." The AI pivot has become a survival strategy for companies being displaced by AI — the label itself is now worth more than many of the businesses it is being attached to.

Cybersecurity crystallises as infrastructure

This week cemented what NOOPS flagged on 14 April as "CYBER IS INFRA" — the first clearly defined soft infrastructure category.

The framing arrived from multiple directions simultaneously. Cybersecurity as proof of work: John's threshold argument crystallised the hardware-moment thesis — "Provided threshold conditions are met, more compute wins. We have met those threshold conditions now with Mythos." Once model capability crosses the threshold where scale matters, the cybersecurity contest becomes a compute contest. Aisle demonstrated that a "system over model" approach using adequate (not frontier) intelligence can find zero-day vulnerabilities through systematic coverage rather than model brilliance. And the GitHub Actions agent hijack showed that the attack surface has moved from training to runtime — agents, not models, are the vector.

OpenAI's Trusted Access for Cyber programme contrasts with Anthropic's closed Glasswing consortium. Two governance models for frontier cyber capability: open verification (OpenAI) versus closed consortium (Anthropic). Both agree on the premise: model capability in cybersecurity is now dangerous enough to require gated access. They disagree on the shape of the gate.

The investment implication: defensive cybersecurity infrastructure — runtime monitoring, agent sandboxing, identity verification, vulnerability remediation — is the next layer of the infrastructure stack to harden. The vendors who own that layer capture recurring value as long as frontier models keep advancing.

What we're watching next week

The harness is where the alpha is today. The infrastructure is where it lives tomorrow. The smart money is already making the transition.

John Allsopp & Mark Pesce — Sydney, 17 April 2026