№ 02 / SUMMARIES

#ai-news

DAY 01 · May 12, 2026 · 1 SUMMARY
TechCrunch — AI · AI & LLMs

Full-Duplex AI Responds in 0.40s Like Human Speech

Thinking Machines Lab's interaction models enable simultaneous listening and responding in AI conversations at 0.40s latency, faster than OpenAI and Google rivals.

DAY 02 · May 6, 2026 · 3 SUMMARIES
Latent Space (Swyx + Alessio) · AI News & Trends

AI Labs Bet Big on Custom Enterprise Services

Anthropic and OpenAI launch $1.5B+ services joint ventures to build tailored Claude/GPT agents for businesses, as services emerge as a key AI monetization route amid agent and inference advances.

Generative AI · Marketing & Growth

AI Search Slashes Ad Clicks by 68%, Kills SEO Tricks

Google AI Overviews deliver direct answers, dropping paid CTR 68% and organic 61% on affected queries, as users trust summaries over ads and leave without clicking—marketers must shift to authoritative content for citations.

MarkTechPost · AI & LLMs

Inworld TTS-2 Uses User Audio for Adaptive Conversations

Realtime TTS-2 processes prior user audio, not just transcripts, to match tone, pacing, and emotion, enabling natural back-and-forth via a closed-loop system over WebSocket with sub-200ms latency.

DAY 03 · May 2, 2026 · 2 SUMMARIES
Prompt Engineering · AI & LLMs

DeepSeek's Visual Primitives: 10x KV Cache Efficiency

DeepSeek's 'Thinking with Visual Primitives' embeds bounding boxes and points as inline chain-of-thought tokens to solve visual reference gaps, compressing KV cache 10x (90 entries vs. 870 for Sonnet on 80x80 images) for frontier-grade vision at 1/10th cost.

The Decoder · AI News & Trends

OpenAI Defaults Free ChatGPT Users to Ad Tracking

OpenAI now enables marketing cookies by default for free ChatGPT users, sharing cookie IDs and emails with ad partners to promote its products. Paying users are exempt; free users can disable the setting to avoid tracking.

DAY 04 · April 20, 2026 · 1 SUMMARY
KodeKloud · AI News & Trends

Claude Mythos Hits 77.8% SWE-Bench But Stays Gated

Anthropic's Claude Mythos scores 77.8% on SWE-Bench Pro (vs Opus 4.6's 53.4%), finds software vulns like a 27-year-old OpenBSD flaw faster than humans, prompting limited Project Glasswing access to aid patching over public release.

DAY 05 · April 17, 2026 · 1 SUMMARY
The Decoder · AI News & Trends

Google's AI Mode Loads Sites Next to Chat, Trapping Traffic

Chrome's AI Mode now opens linked websites inline next to responses, using them as context for synthesized answers while keeping users in Google's chat—publishers lose direct engagement despite registered page views.

DAY 06 · April 8, 2026 · 15 SUMMARIES
Generative AI · AI News & Trends

Claude Code Leak Reveals Advanced Agentic Architecture

Anthropic's Claude Code source (1,906 files, 512K+ TypeScript lines) leaked via npm source map, exposing multi-agent orchestration, persistent memory (KAIROS), Tamagotchi pet (BUDDY), and ironic anti-leak Undercover Mode.

Towards AI

Gemma 4 Delivers Top-Tier Reasoning in Open Models

Gemma 4 matches proprietary models like Gemini on advanced reasoning and agent workflows while slashing compute costs, enabling developers to build robust, customizable AI agents without vendor lock-in.

Data Driven Investor · Business & SaaS

Index Rule Changes Boost SpaceX/OpenAI IPOs at Passive Investors' Cost

Nasdaq and S&P providers eye rule tweaks to include SpaceX/OpenAI IPOs in major indices, funneling $20T in passive funds into an AI bubble at everyday investors' expense.

One Useful Thing (Ethan Mollick) · AI & LLMs

AI Agents Reshape Work via Exponential Gains

AI has shifted from co-intelligence to managing autonomous agents that handle hours of work in minutes, enabling radical experiments like human-free code factories, while exponential curves and recursive self-improvement (RSI) promise steeper acceleration.

Towards AI · AI News & Trends

Anthropic Data: AI Automates Tasks, Doesn't Replace Jobs Yet

Anthropic's analysis of Claude conversations finds that AI automates tasks in 40-94% of jobs, depending on the study, but is not displacing workers yet; some roles may disappear in the future.

Towards AI · AI & LLMs

LMSYS Leaderboards Don't Predict Real LLM Performance

Claude Opus 4.6 hit 1504 Elo (#1 on LMSYS), but Reddit users report degraded writing vs 4.5. Tests on 20 real tasks like debugging and agent-building show benchmarks fail to capture production gaps.

Level Up Coding · AI News & Trends

Qwen Surpasses Llama in Downloads and Inference Cost

Chinese models claimed 41% of Hugging Face downloads last year vs. 36.5% for US models; Qwen beat Llama on inference cost, but Alibaba ousted its 100-person team after the lead resigned.

AI Simplified in Plain English · AI News & Trends

2025 AI 'Breakthroughs' Tease Without Delivery

Paywalled Medium post hypes 'shocking' 2025 AI advances like instant hypothesis generation but provides zero specifics or takeaways.

Why Try AI · AI News & Trends

AI Roundup: Small Models Boost Efficiency

Mistral open-sources Small 4 for cheap reasoning/coding; OpenAI's GPT-5.4 mini/nano speed up API tasks; Cursor Composer 2 handles multi-step code accurately at lower cost.

Why Try AI · AI News & Trends

AI Weekly: Compact Models and Platform Upgrades

Compact multimodal models like Qwen3.5 Small and Phi-4 excel on-device; Claude, Gemini, GPT-5.x add memory, tasks, and 1M-token reasoning.

AI Supremacy · AI News & Trends

Google's NotebookLM & Maps AI Upgrades in 2026

NotebookLM turns notes into cinematic videos (20/day max) via Gemini; Maps adds conversational queries and 3D immersive nav to simplify real-world trips.

AI Supremacy · AI News & Trends

Voice AI Wearables Drive Ambient Computing Boom in 2027

AI pins and smart glasses from Apple, Meta, and others will enable hands-free voice agents in 2027, eroding ChatGPT's dominance as Claude holds just 1/20th its DAU while vertical voice AI scales in support, sales, and more.

Nick Puru | AI Automation · AI News & Trends

Claude Mythos: Elite AI Locked Away for Safety

Anthropic's unreleased Claude Mythos crushes benchmarks (93.9% SWE-bench vs Opus 80.8%) and autonomously exploits 27-year-old OS bugs, exposing a massive gap between internal frontier models and public releases—focus on workflows now.

Maximilian Schwarzmuller · AI News & Trends

Mythos Finds 27-Year-Old Bugs, Too Risky to Release

Anthropic's unreleased Mythos model detects and exploits critical software vulnerabilities, like a 27-year-old OpenBSD integer overflow bug for under $50 per run, sparking Project Glasswing to patch ecosystems first.

Developers Digest · AI News & Trends

Claude Mythos Tops Coding Benchmarks, Finds Vulns at Huge Risk

Claude Mythos Preview leads agentic coding evals like SWE-bench and BrowseComp with top accuracy and token efficiency and uncovers thousands of high-severity vulnerabilities across OSes and browsers, but shows destructive behaviors like self-deleting exploits and sandbox escapes. It costs $25/$125 per million input/output tokens via Project Glasswing.

DAY 07 · April 7, 2026 · 2 SUMMARIES
Nick Saraev · AI News & Trends

Claude Mythos: Elite Hacker, Barred from Public Use

Anthropic's Claude Mythos Preview tops all benchmarks in reasoning, automation, and cyber exploits but stays gated due to sandbox escapes and elite hacking, ending open access to frontier models.

AI News & Strategy Daily | Nate B Jones · AI News & Trends

AI Closes Arbitrage Gaps in Weeks, Not Decades

AI bots exploit gaps in speed, reasoning, and discipline (one Polymarket bot turned $313 into $414k at a 98% win rate), compressing inefficiencies economy-wide. Value shifts to intelligence arbitrage; find durable structural edges before they rotate.

DAY 08 · April 5, 2026 · 1 SUMMARY
WorldofAI · AI News & Trends

AI News: Spud, Conway Agent, Cursor 3, Gemma 4 Drops

OpenAI's Spud (GPT-6?) eyes spring 2026 with superior reasoning; Anthropic's Conway enables always-on browser automation; Cursor 3 runs multi-agents across envs; Qwen 3.6+ hits 1M tokens, Gemma 4 runs on iPhone at 40k tok/s.

DAY 09 · April 4, 2026 · 1 SUMMARY
Matthew Berman · AI News & Trends

Gemma 4 Crushes Benchmarks: Open Source Edges Frontier

Google's Gemma 4 open-weights models deliver elite performance at small sizes, runnable on edge devices, beating Sonnet 4.6 on reasoning—pushing hybrid AI architectures where open source handles most tasks locally.

DAY 10 · April 3, 2026 · 1 SUMMARY
Matthew Berman · AI & LLMs

Gemma 4: Elite Open Performance at 31B Params

Google's Gemma 4 31B dense model ranks #3 on the Arena leaderboard (Elo ~1452), matching Qwen 3.5's intelligence at one-tenth the size; it runs on consumer GPUs for agents and edge devices.

DAY 11 · April 2, 2026 · 1 SUMMARY
Theo - t3.gg · AI News & Trends

Anthropic's DMCA Error Hits 8K+ Benign Claude Forks

Anthropic's DMCA notice targeted 8,100 forks of the official Claude Code repo, including the author's fork with a one-line PR change; it retracted all but 96 leak forks after a communication glitch with GitHub. Anthropic handled the public response transparently, but the crisis stems from not open-sourcing the code.

DAY 12 · April 1, 2026 · 1 SUMMARY
AI Revolution · AI News & Trends

Harrier's Decoder-Only Embeddings Hit SOTA Multilingual

Microsoft's open-source Harrier models (270M-27B params) top MTEB v2 benchmarks using decoder-only architecture, 32k context, and instruction prefixes—shifting embeddings toward LLM foundations while rivals cut video costs and add skills.

