Level Up Coding
Harness Engineering: Stack Rules, Skills & Agents for Reliable AI Dev
Harness Engineering builds reliable AI code generation by stacking Rules (guidelines), Skills (SOPs), Sub-Agents (roles), Workflows (handoffs), Scripts (gates), and MCP (external tools) into a verifiable system, demonstrated in a minimal Go CLI project.
HTML Replaces Markdown for Interactive AI Outputs
Prompt AI agents for single-file HTML instead of long Markdown reports to create navigable, editable, interactive artifacts that humans can actually use, review, share, and act on.
Claude Code's CI Auto-Fix Closes PR Review Loop at $25 Each
Anthropic's Code Review now auto-patches issues in open PRs via CI, removing the manual-fix step that followed its agent-verified, severity-ranked findings; an upgrade to the $15-25/PR tool that previously drew backlash.
Token Bucket Fails at Window Boundaries—Use Sliding Window
Token-bucket rate limiting let clients burst 40 requests across a minute boundary despite a 100/min limit; sliding-window counters prevent this by counting requests in the trailing N seconds, enforcing even distribution.
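The trailing-window counter the summary describes can be sketched as follows; the class and method names are illustrative, not from the article:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests in any trailing `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.hits = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # Evict timestamps that have aged out of the trailing window.
        while self.hits and now - self.hits[0] >= self.window:
            self.hits.popleft()
        if len(self.hits) < self.limit:
            self.hits.append(now)
            return True
        return False
```

Because the window slides with each request rather than resetting on a fixed boundary, a client can never exceed the limit in any 60-second span, which is exactly the boundary-burst case a fixed reset allows.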
Collaborative AI Writer: WebSockets + CRDT + Claude
Build multi-user real-time AI writing with FastAPI WebSockets for connections, CRDTs for conflict-free text sync, Claude streaming fanned to all users, and per-user token-bucket rate limiting to avoid bursts.
Meta Fired 1,100 AI Labelers After Union Vote Over Privacy
Meta terminated 1,100 data labelers earning $12-18/hr who handled sensitive user content for AI training, six weeks after they voted to unionize and before the union could form, despite Meta's claim that automation drove the cuts.
AWS KMS Envelope Encryption Secures Data at Scale
Encrypt data efficiently with AWS KMS envelope pattern: Use master keys to generate ephemeral AES-256 DEKs for fast local encryption/decryption, storing only encrypted DEKs alongside ciphertext for auditable, revocable access.
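The envelope pattern above can be sketched in a few lines. This is a toy: real deployments call AWS KMS `GenerateDataKey`/`Decrypt` and use AES-256-GCM, whereas here a SHA-256 counter-mode keystream stands in for the cipher and the function names are hypothetical; do not use this for actual encryption:

```python
import hashlib
import os

def keystream_xor(key: bytes, data: bytes) -> bytes:
    # Toy stream cipher (SHA-256 in counter mode) standing in for AES-256.
    out = bytearray()
    for block in range((len(data) + 31) // 32):
        pad = hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        chunk = data[block * 32:(block + 1) * 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)

def envelope_encrypt(master_key: bytes, plaintext: bytes):
    dek = os.urandom(32)                          # ephemeral data key (KMS GenerateDataKey)
    ciphertext = keystream_xor(dek, plaintext)    # fast local encryption with the DEK
    wrapped_dek = keystream_xor(master_key, dek)  # only the small DEK goes under the master key
    return wrapped_dek, ciphertext                # store both; never persist the plaintext DEK

def envelope_decrypt(master_key: bytes, wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    dek = keystream_xor(master_key, wrapped_dek)  # KMS Decrypt on the wrapped key
    return keystream_xor(dek, ciphertext)
```

The point of the pattern survives the toy cipher: bulk data is encrypted locally with a throwaway DEK, the master key only ever touches 32 bytes, and revoking or auditing access happens at the master-key level.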
Ditch Harmful Code: Software Isn't Morally Neutral
Reject the lie that 'it's just code'—some software builds digital slot machines and predatory debt traps, profiting from addiction and misery; evaluate projects by their real impact.
Skip Heavy Clean Architecture in Python Unless Scale Demands It
Over-applying clean architecture in Python FastAPI apps requires 7 changes for one field addition, killing velocity; Django's simple models need just 2 lines, proving less structure ships faster.
CLI Tools Like VHS for Reproducible Terminal Demos
Script terminal sessions in VHS .tape files for pixel-perfect GIFs/MP4s with custom fonts, speeds, and padding—instead of unreliable screen recordings.
9 Sections to Fix AI UI Inconsistency with DESIGN.md
AI agents build functional code but incoherent UIs; Google's DESIGN.md spec uses 9 markdown sections to enforce design system consistency across pages.
Manual Deployment Unlocks Foundry Hosted Agents
Deploy Foundry hosted agents by building container images in ACR, setting up Foundry Project with RBAC, creating via Azure SDK with env vars and resources (cpu=0.25, mem=0.5Gi), then assigning Azure AI User RBAC to Agent ID—avoids azd preview failures.
CUDA Matrix Transpose: Naive to Swizzled Optimization
Matrix transpose on GPU pits coalesced reads against writes; solve via shared memory tiling, then fix bank conflicts with padding or XOR swizzling, plus float4 vectorization for peak bandwidth.
Claude Code's 5-Layer Agent Kit Fixes Common Failures
Claude Code embeds a 5-layer architecture—CLAUDE.md memory, Skills expertise, Hooks guardrails, Subagents delegation, MCP tools—that most engineers overlook, preventing agent breakdowns from poor memory, modularity, or delegation.
Slash Claude Tokens with Graphify Graphs + Caveman
Graphify creates persistent codebase graphs to eliminate repeated repo scans by AI agents, while the Caveman skill cuts response tokens by up to 75% via caveman-style minimalism.
Ditch preferred_username for Azure AD Guest Auth
Using preferred_username as identity anchor worked for employees but failed silently for all B2B guests, causing 403 errors post-launch. Anchor on oid instead for reliable identification.
Standardize AI Android Coding on Ubuntu with Agent Kit
Install android-agent-project-kit once per repo to enforce shared Android standards across Claude, Codex, and Cursor agents, fixing inconsistencies in architecture, Compose patterns, tests, and PRs for predictable outputs.
Fix Prompt Fragility by Decomposing Agents into Microservices
Monolithic LLM prompts fail unpredictably from tiny changes because one model juggles routing, reasoning, validation, and more—decompose into sub-agents and nano models to shrink context 50-80%, cut costs 60-80%, and eliminate cascades.
North Korea Hit Axios NPM Maintainer, Exposing 100M Downloads
North Korean hackers, first detected by OpenAI, compromised Axios (100M weekly downloads) via a fake job offer sent to maintainer Jason Saayman over Microsoft Teams; OpenAI itself was not the target.
k-NN on Google Searches Builds Explorable Knowledge Graph
Embed 800 results from 100 Google queries, run cosine k-NN to reveal 42.2% cross-query connections—every document links to at least one from a different search in its top 8 neighbors.
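The cross-query metric in the summary can be illustrated with a small pure-Python sketch; the function name and the tiny synthetic vectors are my own, and a real run would use embedding vectors from the 800 results:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cross_query_rate(embeddings, query_ids, k=8):
    """Fraction of documents whose top-k cosine neighbors
    include at least one document from a different query."""
    n = len(embeddings)
    cross = 0
    for i in range(n):
        neighbors = sorted(
            ((cosine(embeddings[i], embeddings[j]), j) for j in range(n) if j != i),
            reverse=True,
        )[:k]
        if any(query_ids[j] != query_ids[i] for _, j in neighbors):
            cross += 1
    return cross / n
```

With real embeddings this is O(n²) brute force, which is fine at n=800; larger corpora would swap in an approximate nearest-neighbor index.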
Hermes Agent: Always-On Memory via Bounded Core Files
Hermes embeds persistent memory directly in the system prompt using MEMORY.md (2,200 chars max) for agent notes and USER.md (1,375 chars) for user profile, forcing curation and enabling prefix caching, with optional external providers for additive recall.
Claude Code Skills Fix LLM Memory Gaps
Claude Code Skills package domain knowledge, workflows, and instructions into auto-loading modules, eliminating repetitive context re-entry in every new session.
Scale Compose Nav with Nested Graphs and State Layers
For apps with 20-50 screens, use one root NavHost with nested feature graphs, centralized route objects, and layered state (nav args for IDs, ViewModels for data, composables for UI) to prevent navigation fragility.
Reward Queries to Fix RAG Agent Failures
LLM search agents fail from poor initial queries; SmartSearch uses process rewards to refine them, preventing bad retrievals like mistaking actor Kevin McCarthy (1914) for politician (1965).
AI Intelligence: Compression Over Scale
True intelligence compresses data into minimal algorithmic rules via MDL, not memorizes petabytes. A 76k-parameter model solves 20% of ARC puzzles at inference, outpacing trillion-parameter LLMs through neuro-symbolic code generation.
Resilient LLM Streaming: Jitter, Breakers, 90s Checks
After 50k AI page generations, boost streaming success from 92% to 99%+ by treating networks as foes: jittered backoff stops thundering herds, 90s health checks catch silent stalls, circuit breakers prevent self-DOS.
AI Coding Saves 30-35% on Boilerplate, Needs Human Guardrails
In production, AI tools like Cursor and Claude cut coding time 30-35% by generating boilerplate schemas, tests, and refactoring explanations—but fail on domain logic, deprecated APIs, and context, requiring explicit prompts, version checks, and manual edge-case tests.
Flink Treats Batch as Streaming for Unified Low-Latency Processing
Apache Flink processes unbounded streams and bounded batches with one engine using operators, state, windows, and exactly-once guarantees, eliminating dual codebases for real-time apps like recommendation engines handling millions of events.
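The unified batch/stream idea rests on windowed, keyed aggregation; a minimal sketch of tumbling-window semantics (not Flink's API, and the function name is my own) shows why the same logic handles a bounded batch or an unbounded stream:

```python
def tumbling_window_counts(events, window_ms):
    """Count events per (window_start, key) using tumbling event-time windows.

    `events` is an iterable of (timestamp_ms, key) pairs; the same fold
    works whether the iterable is a finite batch or an ongoing stream.
    """
    counts = {}
    for ts, key in events:
        window_start = ts - ts % window_ms  # assign event to its tumbling window
        counts[(window_start, key)] = counts.get((window_start, key), 0) + 1
    return counts
```

Flink adds what this sketch omits: managed state, watermarks for late events, and exactly-once checkpointing, but the per-window keyed fold is the shared core.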
Preprocessing Swings CNN Accuracy from 65% to 87% on CIFAR-10
Raw CIFAR-10 pixels yield 65% test accuracy; normalization/standardization lift to 69%; geometric augmentation maintains ~67%; photometric brightness/contrast crashes to 20%; combined pipeline with deeper CNN hits 87%.
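The normalization/standardization step from the pipeline above can be sketched for NHWC image batches; the function name is illustrative, and the statistics would be computed on the training set and reused at test time:

```python
import numpy as np

def standardize_per_channel(images: np.ndarray):
    """Standardize an NHWC batch per channel: zero mean, unit variance.

    Returns the standardized batch plus the (mean, std) to reuse on test data.
    """
    mean = images.mean(axis=(0, 1, 2), keepdims=True)  # one mean per channel
    std = images.std(axis=(0, 1, 2), keepdims=True)    # one std per channel
    return (images - mean) / (std + 1e-7), mean, std
```

Applying the training-set mean and std to validation/test images, rather than recomputing them, keeps the two distributions comparable.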
Ghosted After Take-Home? Turn It Into a GitHub Playground
Don't delete unused take-home code—publish it publicly on GitHub, iterate with new patterns, and transform it into a showcase that attracts contracts elsewhere.