Summaries · #open-source

DAY 01 · Yesterday · MAY 12, 2026 · 1 SUMMARY

MarkTechPost · May 12, 2026

AntAngelMed: 103B MoE Medical LLM Matches 40B Dense at 7x Speed

AntAngelMed, a 103B-parameter open-source medical LLM, activates only 6.1B parameters through a 1/32 MoE design, rivals 40B dense models at 7x the efficiency, tops HealthBench and MedBench, and runs at 200+ tokens/s on an H20.

DAY 02 · Monday · MAY 11, 2026 · 1 SUMMARY

MarkTechPost · May 11, 2026

TwELL Delivers 20% LLM Speedups via GPU-Optimized Sparsity

Use a ReLU gate activation plus an L1 penalty (coefficient 2e-5) on hidden activations to induce 99.5% sparsity in feedforward layers; TwELL CUDA kernels then yield 20.5% inference and 21.9% training speedups on H100s with no accuracy loss.
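
As a rough illustration of the recipe in this summary, the sketch below applies a ReLU gate and adds an L1 term with coefficient 2e-5 to a loss. It is plain Python, not TwELL's code: the function names, the toy pre-activation values, and the placeholder task loss are all assumptions made for the example.

```python
# Illustrative sketch only: helper names and data are hypothetical,
# not taken from TwELL or its CUDA kernels.

def relu(xs):
    """Element-wise ReLU: non-positive pre-activations become exact zeros."""
    return [x if x > 0.0 else 0.0 for x in xs]

def l1_penalty(activations, coeff=2e-5):
    """L1 regularization term: coeff times the sum of absolute activations."""
    return coeff * sum(abs(a) for a in activations)

def sparsity(activations):
    """Fraction of activations that are exactly zero."""
    return sum(1 for a in activations if a == 0.0) / len(activations)

# Toy pre-activations for one token in a feedforward gate (assumed values).
pre_activations = [-1.5, 0.25, -0.1, 2.0, -3.0, -0.7, 0.0, -2.2]
gate = relu(pre_activations)

task_loss = 1.0  # placeholder for the usual language-modeling loss
total_loss = task_loss + l1_penalty(gate)

print(f"sparsity={sparsity(gate):.2f} total_loss={total_loss:.6f}")
```

The L1 term pushes more gate values to exactly zero during training; a sparse kernel can then skip the zeroed entries, which is where the reported speedups would come from.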

DAY 03 · Sunday · MAY 10, 2026 · 4 SUMMARIES

AI Engineer · AI Automation · May 10, 2026

Replay Logs Fail Agents: Use VM Snapshots Instead

Replay-based durability constrains agent code and produces ever-growing logs; instead, split state into context logs (durable in a database) and execution snapshots (14MB Firecracker VM snapshots, <1s save, 100ms restore) to support multi-day sessions.

WorldofAI · AI Automation · May 10, 2026

Hermes Desktop App Enables Easy Self-Evolving AI Agents

Hermes Agent runs 24/7 persistent, self-improving AI agents locally with long-term memory and closed learning loops; new Desktop App adds intuitive UI for setup, multi-agent management, and tools on Windows, macOS, Linux.

MarkTechPost · Software Engineering · May 10, 2026

Rust CUDA Kernels via Direct PTX Compilation

cuda-oxide lets you write safe Rust SIMT GPU kernels that compile directly to PTX through a custom rustc backend, with no C++ or DSL layer. Host and device code live in one .rs file, and cargo oxide build produces a binary plus a .ptx file.

Nate Herk | AI Automation · AI Automation · May 10, 2026

Build Hermes AI Agent: VPS Setup to Scaled Automations

Follow this step-by-step guide to deploy Hermes Agent on a VPS, integrate Telegram, create skills/crons, backup to GitHub, and scale multiple agents for proactive AI assistance.

DAY 04 · Saturday · MAY 9, 2026 · 3 SUMMARIES

Y Combinator · AI Automation · May 9, 2026

Trigger.dev: Async Infra Powers 90% AI Agents

Trigger.dev evolved from Zapier-for-devs background jobs into a reliable SDK for executing AI agents, hitting PMF with v3's hosted execution and checkpoint-resume primitives. The timing suited the agent era: agents now account for 90% of usage.

Better Stack · AI Automation · May 9, 2026

Symphony: Agents Autonomously Claim and Complete Tasks

OpenAI's Symphony uses issue trackers like Linear to let coding agents claim tasks, spin up isolated workspaces, and ping humans only for reviews, which addresses the 3-5 session supervision bottleneck. Install it by prompting an agent with a 2000+ line spec to build it.

MarkTechPost · Developer Productivity · May 9, 2026

Spec-Kit: Specs-First AI Coding for Reliable Production Code

GitHub's open-source Spec-Kit (90k+ stars) uses Spec-Driven Development to ground AI agents in structured specs, generating testable code that matches intent and addressing the 'vibe-coding' failures that appear when prototypes are pushed to production.

DAY 05 · Friday · MAY 8, 2026 · 2 SUMMARIES

AI Summaries (evaluation playlist) · AI Automation · May 8, 2026

Anthropic Open-Sources Wall St Analyst Agents

Anthropic released 10 end-to-end Claude agents mimicking Goldman Sachs analyst roles, with prompts, checklists, 11 licensed data connectors, and 7 vertical bundles, opening up workflows once locked behind $25k terminals and bank secrecy.

The PrimeTime · Software Engineering · May 8, 2026

Zig Rejects Bun's Fork Over LLM Policy and Flawed Speed Hack

Bun's Zig fork uses an LLM to get 4x faster debug builds via parallel analysis, but Zig rejects it over non-determinism risks and upstream incompatibility; Zig prioritizes careful engineering, bypassing LLVM for true 40s-to-0.5s speedups.

DAY 06 · Thursday · MAY 7, 2026 · 4 SUMMARIES

MarkTechPost · AI & LLMs · May 7, 2026

TokenSpeed Beats TensorRT-LLM 9-11% on Agentic Coding Inference

The open-source TokenSpeed engine optimizes agentic workloads with long contexts (>50K tokens) and multi-turn conversations, delivering 9% lower latency and 11% higher throughput than TensorRT-LLM at 70-100 TPS/user on NVIDIA B200.

AI Revolution · May 7, 2026

DeepSeek-TUI: Viral Open-Source Claude Code Rival

DeepSeek-TUI, a Rust-based terminal AI coding agent powered by DeepSeek V4's 1M-token context, hit 10k+ GitHub stars in days as a cheap, customizable alternative to Claude Code, built by a music/law student using AI-assisted coding.

AI News & Strategy Daily | Nate B Jones · May 7, 2026

OpenClaw's April Shift: Model-Swappable Agent Runtime

OpenClaw evolved from a viral demo into a durable agent runtime with task orchestration, mature memory, and channels, enabling workflows that swap models like Claude, Codex, or Gemma 4 to survive provider changes.

Sam Witteveen · AI & LLMs · May 7, 2026

IBM Granite Speech 4.1: 3 ASR Models for Accuracy, Features, Speed

IBM's 2B Granite Speech 4.1 suite offers three trade-offs: the base model leads the Open ASR Leaderboard (WER 5.33, RTF 231), Plus adds diarization and timestamps, and NAR reaches RTF 1820 on an H100 via transcript editing.

DAY 07 · MAY 5, 2026 · 4 SUMMARIES

Towards AI · AI & LLMs · May 5, 2026

637MB LLM Runs Offline on Base MacBook Air, Works Surprisingly Well

TinyLlama, a 637MB open-source LLM, runs instantly on a stock MacBook Air via Ollama with no internet, GPU, or API needed. It handles Node.js servers and casual chats effectively, lowering the bar for useful local AI.

AI Engineer · AI Automation · May 5, 2026

SIE: Dynamic Inference for Small Models on Shared GPUs

Open-source SIE engine from Superlinked enables hot-swapping small embedding models (e.g., Stella, ColBERT) on one GPU via LRU eviction, cutting costs and solving context rot in agents by preprocessing data.
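
The hot-swapping idea in this summary can be sketched as a small LRU cache of loaded models. This is a generic illustration, not SIE's implementation: the class name, the loader callback, the capacity of two, and the "reranker" model name are hypothetical.

```python
from collections import OrderedDict

class LRUModelCache:
    """Keeps at most `capacity` models resident; evicts the least recently used.

    Hypothetical sketch: `loader` stands in for whatever loads weights onto a GPU.
    """

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader
        self._models = OrderedDict()  # insertion order tracks recency of use
        self.evicted = []             # eviction history, kept for inspection

    def get(self, name):
        if name in self._models:
            # Cache hit: mark this model as most recently used.
            self._models.move_to_end(name)
            return self._models[name]
        if len(self._models) >= self.capacity:
            # Cache full: drop the least recently used model to free memory.
            old_name, _ = self._models.popitem(last=False)
            self.evicted.append(old_name)
        model = self.loader(name)
        self._models[name] = model
        return model

    def resident(self):
        """Names of loaded models, least recently used first."""
        return list(self._models)

# Stand-in loader that returns a string instead of real weights.
cache = LRUModelCache(capacity=2, loader=lambda name: f"<{name} weights>")
cache.get("stella")
cache.get("colbert")
cache.get("stella")    # refreshes stella, so colbert is now the oldest
cache.get("reranker")  # hypothetical third model; forces an eviction
print(cache.resident(), cache.evicted)
```

With a capacity of two, requesting a third model evicts whichever of the first two was used least recently, which is the trade that lets several small models share one GPU.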

WorldofAI · Design & Frontend · May 5, 2026

Open Design: Free Open-Source Claude Design Clone

Open Design replicates Claude Design's AI-powered UI generation locally for free, using any model or CLI agent, with 31 skills and 72 design systems for production-ready landing pages, decks, and prototypes.

Generative AI · AI Automation · May 5, 2026

Self-Host Vane + Ollama for Private AI Web Research

Install Vane in Docker on Windows 11 with local Ollama and Qwen3.5:9b to run citation-backed searches privately, bypassing cloud services like OpenAI.

DAY 08 · MAY 4, 2026 · 2 SUMMARIES

Level Up Coding · Software Engineering · May 4, 2026

North Korea Hit Axios NPM Maintainer, Exposing 100M Downloads

OpenAI detected the North Korean hackers, who did not breach OpenAI directly: they compromised Axios (100M weekly downloads) through a fake job offer sent to maintainer Jason Saayman on Microsoft Teams.

The Decoder · AI Automation · May 4, 2026

Symphony: Agents Autonomously Manage Tasks from Linear

OpenAI's Symphony spec lets Codex agents pull open tickets from Linear, work independently until completion, and self-file issues, boosting merged PRs 6x in 3 weeks by eliminating human micromanagement.

DAY 09 · MAY 3, 2026 · 3 SUMMARIES

AI Engineer · AI & LLMs · May 3, 2026

Tiny LLMs and On-Device Agents via LiteRT-LM on Edge Hardware

LiteRT-LM runs Gemma 2B/4B models at 1000+ tokens/sec on phones and delivers agent skills with function calling, while tiny 100-500M param models excel in fine-tuned in-app tasks like voice-to-action at 85-90% reliability.

DIY Smart Code · AI Automation · May 3, 2026

HyperFrames Wins for AI Agents: 7s Setup vs Remotion's 50s

HyperFrames delivers a 7-second time-to-first-video with zero build step and an Apache 2.0 license, beating Remotion's 50s React-heavy setup, which makes it well suited to AI agents generating videos from HTML prompts without coding skills.

Data and Beyond · AI Automation · May 3, 2026

Open-Source AI Auto-Tags PDFs for Accessibility

OpenDataLoader delivers production-ready, open-source PDF auto-tagging via heuristic or hybrid AI modes, reconstructing structure for screen readers and AI pipelines without proprietary tools.

DAY 10 · MAY 2, 2026 · 1 SUMMARY

Chase AI · AI & LLMs · May 2, 2026

10 New OSS Tools to Supercharge Claude Code

Recent open-source tools for Claude Code deliver wins like 5% token savings via caveman brevity, 71.5x fewer tokens with Graphify graphs, local design cloning, video processing, and self-healing browsers; check the repos for immediate productivity boosts.

DAY 11 · MAY 1, 2026 · 2 SUMMARIES

Chase AI · Design & Frontend · May 1, 2026

Open Design: GUI Claude Design Clone Without Usage Limits

Open Design, built on Huashu Design, replicates Claude Design's graphical interface for AI-generated prototypes and slide decks, integrates with any LLM CLI such as Claude Code to bypass Anthropic usage restrictions, and includes 31 skills plus 72 pre-built design systems.

MarkTechPost · AI & LLMs · May 1, 2026

Qwen-Scope SAEs Unlock Actionable LLM Internals

Qwen-Scope's open sparse autoencoders (SAEs) for 7 Qwen models decompose activations into interpretable features, supporting output steering, proxy benchmark analysis (ρ=0.85 correlation), toxicity classification (F1>0.90), and training fixes such as a 50% reduction in code-switching.

DAY 12 · APR 30, 2026 · 2 SUMMARIES

Better Stack · Developer Productivity · Apr 30, 2026

SiYuan: Refactor Notes Like Code Without Broken Links

SiYuan uses permanent block IDs for unbreakable references and built-in SQL databases, letting developers organize technical notes like structured codebases locally, outperforming Obsidian's file links and Notion's cloud lock-in.

AICodeKing · AI & LLMs · Apr 30, 2026

Gemma Chat: Offline Vibe Coding with Gemma 4 on Mac

Gemma Chat runs Google's Gemma 4 locally on Apple Silicon Macs via MLX for private, offline app building with live previews, file editing, and agentic tools, with no API keys or subscriptions needed.

DAY 13 · APR 29, 2026 · 1 SUMMARY

AI Summaries (evaluation playlist) · Apr 29, 2026

Pi's Self-Modifying Agents: Power and Perils

Mario Zechner built Pi, a minimalist self-modifying AI coder powering OpenClaw. He and Armin Ronacher praise its potential but warn that over-automation erodes code quality; human judgment remains key.
