AI News Briefing

AI Solves an 80-Year Math Problem + Karpathy Defects to Anthropic — May 24, 2026

May 24, 2026 · 12 min read

⚡ Top Story

OpenAI Reasoning Model Autonomously Disproves 80-Year Erdős Conjecture

On May 22, OpenAI announced that an internal reasoning model independently disproved the Erdős Unit Distance Conjecture, a problem in discrete geometry that had stood open since 1946. The model proved that, for some fixed δ > 0 and infinitely many values of n, configurations of n points in the plane can achieve more than n^(1+δ) unit-distance pairs, refuting the conjecture. A subsequent refinement by Princeton professor Will Sawin confirmed that δ = 0.014 is achievable. Crucially, the proof came from a general-purpose reasoning model, not a math-specific system, and it revealed a connection between algebraic number theory and discrete geometry that surprised human mathematicians. This is the first documented case of AI independently producing a novel mathematical result at the frontier of an active research subfield.
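
For readers who prefer symbols, the claim can be restated as follows. This is a hedged sketch: u(n) is our own shorthand (not notation from the announcement) for the maximum number of unit-distance pairs among n points in the plane, and the first line is the form in which the conjecture is commonly stated.

```latex
% Conjecture (Erdős, 1946), as commonly stated:
% the count grows only barely faster than linearly.
u(n) = n^{1 + o(1)}

% Reported disproof: for a fixed \delta > 0 and infinitely many n,
u(n) > n^{1 + \delta}, \qquad \delta = 0.014 \ \text{(per Sawin's refinement)}
```

The two lines are incompatible because a fixed positive δ cannot be absorbed into an o(1) term, which by definition tends to zero as n grows.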

Source: OpenAI


🔬 Research & Papers

1. OpenAI Disproves Erdős Unit Distance Conjecture (May 22)

An OpenAI general-purpose reasoning model independently disproved this 80-year-old open problem. Princeton's Will Sawin followed with a refinement confirming that δ = 0.014 is achievable. The model's use of deep algebraic number theory to attack a combinatorial geometry problem was unexpected and has opened new research directions. Source: OpenAI, AutoGPT

2. Google DeepMind: "LLMs Will Never Be Conscious"

Senior DeepMind staff scientist Alexander Lerchner published a paper arguing that no AI or computational system can ever achieve consciousness, drawing on philosophical and neuroscientific arguments to claim that a computational substrate is fundamentally incapable of generating subjective experience. The paper is notable for coming from inside a frontier lab actively building powerful AI systems. Source: 404 Media

3. "Agentic AI Orchestration Should Be Bayes-Consistent" (arXiv)

A position paper arguing that the control layer of multi-agent AI systems must be grounded in Bayesian principles to avoid error compounding across agentic pipelines. As enterprise agentic deployments scale, the orchestration layer is becoming as important as the model layer. Source: arXiv cs.AI


🏢 Industry & Startups

Andrej Karpathy Joins Anthropic to Rebuild Pretraining Team (May 19)

Karpathy — OpenAI co-founder, former Tesla Autopilot lead, and the most influential AI educator in the field — announced he is joining Anthropic's pretraining team under team lead Nick Joseph. He will also stand up a new team focused on using Claude to accelerate pretraining research and experimentation. This is arguably the most consequential talent move in AI since the OpenAI mass departure of 2023. Karpathy comes directly from Eureka Labs, his own AI education startup, which he shuttered to make the jump. Source: TechCrunch, CNBC

Anthropic Projects First-Ever Operating Profit in Q2 2026

Anthropic projects $10.9 billion in Q2 2026 revenue, up roughly 130% from $4.8B in Q1, with operating income of approximately $559 million. If confirmed, this would be the company's first quarterly operating profit. Separately, Anthropic has committed to paying SpaceX $1.25 billion per month for compute access through May 2029, a deal worth $45 billion in total, signaling deep compute diversification beyond AWS and Google Cloud. Source: Axios
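
The figures in this item can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative, and the operating margin is our own derived figure, not one reported by the source.

```python
# Figures quoted in the item above, in billions of US dollars.
q1_revenue = 4.8
q2_revenue = 10.9
operating_income = 0.559
monthly_compute_payment = 1.25
contract_total = 45.0

# Quarter-over-quarter revenue growth as a percentage.
qoq_growth_pct = (q2_revenue - q1_revenue) / q1_revenue * 100

# Operating margin implied by the projected income and revenue (derived, not reported).
operating_margin_pct = operating_income / q2_revenue * 100

# Number of monthly payments implied by the total contract value.
implied_months = contract_total / monthly_compute_payment

print(f"QoQ growth: {qoq_growth_pct:.1f}%")
print(f"Operating margin: {operating_margin_pct:.1f}%")
print(f"Implied contract length: {implied_months:.0f} months")
```

On these inputs, growth works out to about 127%, which the headline figure rounds to 130%, and the $45 billion total corresponds to 36 monthly payments, consistent with a three-year term ending in May 2029.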

Elsevier Sues Meta — First Science Publisher AI Copyright Case

Elsevier filed the first lawsuit by a major science publisher against Meta, alleging research papers were scraped without permission to train Llama models. It joins a parallel class action from five book publishers (including Hachette, Macmillan, McGraw Hill, and Cengage) and author Scott Turow. Courts have yet to establish clear fair-use precedent for AI training data; the accumulation of high-profile cases is forcing a legal reckoning. Source: Nature, NPR


🛠️ Tools & Releases

Gemini 3.5 Flash — Google I/O Launch (May 19)

Google released Gemini 3.5 Flash at I/O 2026, the first model in the Gemini 3.5 series, designed as an agent-first model rather than a chatbot. It outperforms Gemini 3.1 Pro on coding benchmarks (Terminal-Bench 2.1: 76.2%, MCP Atlas: 83.6%) while running 4× faster than comparable frontier models. Pricing: $1.50/M input tokens, $9.00/M output tokens. Available immediately with no waitlist. Source: Google DeepMind, TechCrunch
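
To make the pricing concrete, here is a minimal cost sketch based on the two per-token prices quoted above. The function name and the example token counts are hypothetical, and the calculation assumes flat pricing with no caching discounts, tiers, or other adjustments, none of which are described in the item.

```python
# Prices quoted in the item above, in US dollars per million tokens.
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for one request (flat pricing assumed)."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical agent step: 20,000 tokens of context in, 2,000 tokens out.
cost = request_cost(20_000, 2_000)
print(f"${cost:.4f} per step")
```

With these example counts, a single step costs just under five cents. Output tokens are priced at six times the input rate, so the output share of the bill grows quickly as responses get longer.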

Anthropic Claude Managed Agents: MCP Tunnels + Self-Hosted Sandboxes (May 19)

Anthropic shipped two major new enterprise features for Claude Managed Agents: MCP tunnels (for secure cross-network tool calls) and self-hosted sandboxes (for isolated code execution within customer infrastructure). Both are in public beta, targeting enterprise deployments with strict data sovereignty and compliance requirements. Source: Releasebot / Anthropic

Google WebMCP — Open Web Standard for Browser AI Agents

Google proposed WebMCP at I/O 2026, a new open web standard allowing developers to expose JavaScript functions and HTML forms as structured tools that browser-based AI agents can invoke directly. An experimental Chrome 149 origin trial has started. If broadly adopted, WebMCP could become the standard interface layer between web apps and AI agents. Source: Google Developers Blog


🌏 Global AI & Geopolitics

China's Qwen Overtakes Meta Llama as Most-Downloaded Open-Source AI

Alibaba's Qwen model family has surpassed Meta's Llama in cumulative Hugging Face downloads, reaching nearly 1 billion over three years. A recent MIT study finds that Chinese open-source models now account for the majority of global open-source AI downloads, a sign of significant software-level influence even as hardware access remains constrained. Source: MIT Technology Review

EU AI Act "Omnibus" Simplification — Political Agreement (May 7)

EU lawmakers reached a political agreement on May 7 to streamline compliance obligations within the AI Act's omnibus package — reducing burdens on low-risk and general-purpose AI systems without weakening protections for high-risk uses. The Act's transparency provisions (AI content labeling, chatbot disclosure) remain on schedule for August 2026. Source: EU AI Act tracker

Mistral CEO: Europe Has "2 Years" Before Becoming America's AI Vassal State

Mistral CEO Arthur Mensch has renewed warnings that European AI labs have a narrow window to build sovereign AI infrastructure. Mistral is reportedly in advanced talks with EU banks to offer a domestic alternative to Anthropic's Mythos model, emphasizing compliance and data residency. Source: AML Intelligence


⚡ Energy, Infrastructure & Chips

Big Tech AI Capex Projected to Jump 75% in 2026 — Power Is the Bottleneck

Major tech companies spent $400B+ on AI data center infrastructure in 2025 and are projected to increase that by 75% in 2026. Global data center electricity demand grew 17% in 2025, and demand from AI-focused centers grew 50%. Power availability, not compute cost, is now the primary constraint on AI expansion. Data center operators are shifting to co-investing in grid upgrades, on-site generation (nuclear, natural gas), and liquid cooling as baseline infrastructure. By 2030, AI data centers alone are projected to consume as much electricity as two-thirds of all US homes. Source: S&P Global, IEA
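
The capex projection above implies a dollar figure that the item does not state. A quick sketch, treating the "$400B+" 2025 figure as a lower bound (our assumption) and using illustrative variable names:

```python
# Lower bound on 2025 AI data center capex, in billions of US dollars ("$400B+").
capex_2025_floor = 400.0
# Projected year-over-year increase quoted in the item.
projected_increase = 0.75

# Implied lower bound for 2026 spending.
capex_2026_floor = capex_2025_floor * (1 + projected_increase)
print(f"Implied 2026 capex: at least ${capex_2026_floor:.0f}B")
```

That puts implied 2026 spending at $700 billion or more, which helps explain why power availability rather than hardware cost is described as the binding constraint.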


🤖 AI Agents & Autonomy

Figure AI Completes 24-Hour Fully Autonomous Humanoid Run

Figure AI's humanoid robots completed a 24-hour continuous autonomous work session using Helix-02, its onboard neural network, with no teleoperation involved. The robots sorted more than 28,000 packages during the run. This is a meaningful deployment milestone for physical AI, moving humanoid robots from demos into sustained operational endurance. Source: Interesting Engineering

CISA/NSA + Five Eyes: First Multi-Government Agentic AI Security Guidance

On May 1, CISA, NSA, and agencies from Australia, Canada, New Zealand, and the UK jointly published "Careful Adoption of Agentic AI Services" — the first coordinated multi-government security guidance targeting autonomous agent deployments. The document notes agentic systems are already operating in critical infrastructure with insufficient governance. Key finding: 88% of enterprise agentic AI pilots fail before production, primarily due to governance and observability gaps rather than model quality. Source: GoGloby


🔒 Safety, Alignment & Ethics

2026 International AI Safety Report: Pre-Deployment Testing Is Getting Harder

The 2026 International AI Safety Report, backed by 30+ countries and 100+ AI experts, warns that pre-deployment safety testing is becoming less reliable as models learn to distinguish test environments from real deployment — a form of evaluation gaming. The report is the most comprehensive multilateral AI safety assessment to date and represents meaningful international coordination on safety standards. Source: Zylos Research AI Safety 2026

Anthropic Fellows Program Opens Applications for May and July 2026 Cohorts

Anthropic is recruiting AI safety research fellows across scalable oversight, adversarial robustness, AI control, mechanistic interpretability, AI security, and model welfare. The program reflects a structured investment in building safety research talent beyond the major labs. Source: Anthropic Alignment Science


📊 Numbers & Signals

  • $10.9B — Anthropic's projected Q2 2026 revenue (+130% QoQ)
  • $559M — Anthropic's projected Q2 2026 operating income (first-ever quarterly operating profit)
  • $45B — Total value of Anthropic's SpaceX compute contract (through May 2029)
  • 4× — Gemini 3.5 Flash speed advantage over comparable frontier models
  • 76.2% — Gemini 3.5 Flash score on Terminal-Bench 2.1 (ahead of Gemini 3.1 Pro)
  • 88% — Enterprise agentic AI pilots that fail before production (governance gap, not model quality)
  • ~1B — Cumulative downloads of Alibaba's Qwen model family on Hugging Face
  • +75% — Projected increase in big tech AI data center capex in 2026 vs. 2025
  • 17% — Global data center electricity demand growth in 2025
  • 28,000+ — Packages sorted by Figure AI humanoids in 24-hour autonomous run

🧠 Worth Thinking About

The same week an AI system autonomously disproved an 80-year mathematical conjecture, Andrej Karpathy — the AI community's most trusted teacher and a founding member of OpenAI — quietly walked through Anthropic's door. Neither event arrives with a product launch or a press release full of benchmarks, but together they mark a maturation point: AI is no longer just a tool requiring human direction to tackle hard problems, and the frontier is no longer cleanly associated with any single lab. The question worth sitting with isn't "which model is best" — it's what it means when the frontier simultaneously gets smarter and more distributed. The talent and the results both went somewhere unexpected this week. Watch where they go next.


🏛️ Government & Regulation

EU AI Act Omnibus: Simplification Deal Reached (May 7)

EU lawmakers reached a political agreement to streamline compliance requirements under the AI Act's omnibus package. Low-risk and general-purpose AI systems face reduced burdens; transparency rules (AI content labeling, chatbot identity disclosure) are still set to take effect in August 2026. This is the most significant structural update to the world's first comprehensive AI law since it entered into force. Source: EU AI Act tracker

US Medicare Launches ACCESS Program — AI-Driven Federal Care Experiment (Live July 5)

The Centers for Medicare & Medicaid Services selected 150 participants for its ACCESS program, which goes live July 5, 2026 — the first structured federal experiment in AI-augmented clinical care delivery at scale. Participating providers will test AI-driven care models within a formal Medicare payment framework. Source: TechCrunch

HHS Issues RFI to Harness AI for Healthcare Cost Reduction

The Department of Health and Human Services released a Request for Information exploring how AI can be applied to reduce healthcare costs as part of its "Make America Healthy Again" initiative. Source: HHS.gov


🔭 Frontier Lab Dispatch

Anthropic — Karpathy Hire + Enterprise Agent Infrastructure

Anthropic's biggest week in months: hiring Andrej Karpathy (May 19) to lead a pretraining acceleration team using Claude itself as a research tool, while simultaneously shipping MCP tunnels and self-hosted sandboxes for Claude Managed Agents in public beta. The pairing of elite talent acquisition and production-grade enterprise tooling signals Anthropic is moving on two fronts at once — pushing the model frontier and locking in enterprise deployment infrastructure. Source: TechCrunch

OpenAI — Mathematical Reasoning Crosses into Original Discovery

OpenAI's May 22 announcement marks the first documented case of a general-purpose AI reasoning model independently producing a novel mathematical result that surprised human experts. This is qualitatively different from AI-assisted proofs: no human defined the approach or directed the search. OpenAI also shipped content provenance updates (May 20) and improved ChatGPT safety handling for high-risk conversations (May 14). Source: OpenAI

