#devops
Every summary, chronological. Filter by category, tag, or source from the rail.
Custom Elevated Sandbox Enables Safe Codex on Windows
OpenAI built a custom Windows sandbox for Codex using dedicated users, restricted tokens, firewall rules, and multi-binary setup to limit writes to workspace, block outbound network by default, and grant user-like reads without constant approvals.
CI/CD Breaks for Agents: Use Continuous Compute Loops
Traditional CI/CD chokes on thousands of agent PRs with cache thrash and merge bottlenecks; replace with intent-driven agent loops featuring inline validation, premerge reconciliation, and stateful continuous compute for sub-minute iterations.
MRC: Resilient Networking for 100K+ GPU AI Training
OpenAI's MRC protocol uses multi-plane topologies and packet spraying across hundreds of paths with SRv6 source routing to eliminate congestion, route around failures in microseconds, and connect 131k GPUs with just two switch tiers, enabling non-stop frontier model training.
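The spraying-plus-rerouting idea can be sketched in miniature. This is purely illustrative: the real protocol does this per packet in hardware using SRv6 source routes, while here a failed path is just dropped from a round-robin rotation.

```python
import itertools

class Sprayer:
    """Toy sketch of packet spraying: traffic rotates across many paths,
    and a failed path is simply removed from the rotation (illustrating
    the reroute-around-failure behavior described above)."""

    def __init__(self, paths):
        self.paths = list(paths)

    def fail(self, path):
        self.paths.remove(path)  # reroute: stop offering the dead path

    def spray(self, n_packets):
        cycle = itertools.cycle(self.paths)
        return [next(cycle) for _ in range(n_packets)]

s = Sprayer(paths=[f"path-{i}" for i in range(4)])
print(s.spray(8))   # each of the 4 paths carries 2 of 8 packets
s.fail("path-2")
print(s.spray(6))   # the remaining 3 paths absorb the load evenly
```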
OpenAI's Codex Controls: Sandbox, Rules, Telemetry
OpenAI deploys Codex coding agents with sandboxing for bounded execution, auto-approvals for low-risk actions, network/command restrictions, and OpenTelemetry logs to enable safe, auditable developer workflows without broad access.
AWS KMS Envelope Encryption Secures Data at Scale
Encrypt data efficiently with the AWS KMS envelope pattern: use master keys to generate ephemeral AES-256 DEKs for fast local encryption/decryption, storing only the encrypted DEKs alongside the ciphertext for auditable, revocable access.
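The pattern can be sketched locally with no AWS dependency. A toy XOR keystream stands in for AES-256, and a local `master_key` stands in for the KMS master key (which in reality never leaves KMS; boto3's `generate_data_key` and `decrypt` calls would replace the marked lines):

```python
import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (NOT AES; illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

# Stand-in for the KMS master key: in real AWS KMS this never leaves the service.
master_key = secrets.token_bytes(32)

def generate_data_key():
    """Mimics kms.generate_data_key: returns (plaintext DEK, wrapped DEK)."""
    dek = secrets.token_bytes(32)
    return dek, keystream_xor(master_key, dek)  # DEK wrapped under master key

# Encrypt: use the plaintext DEK locally, keep only the wrapped DEK.
dek, wrapped_dek = generate_data_key()
ciphertext = keystream_xor(dek, b"customer record")
del dek  # plaintext DEK is discarded after use

# Decrypt: unwrap the DEK (kms.decrypt in real life), then decrypt locally.
dek = keystream_xor(master_key, wrapped_dek)
plaintext = keystream_xor(dek, ciphertext)
print(plaintext)  # b'customer record'
```

Only `wrapped_dek` and `ciphertext` are persisted, so revoking or auditing access happens at the master key, not per object.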
MRC: OpenAI's Protocol for Resilient AI Training Networks
OpenAI's MRC extends RoCE with multipath spraying, microsecond failure recovery via SRv6, and multi-plane designs to deliver predictable performance in 131k-GPU clusters, using two-thirds fewer optics and three-fifths fewer switches than traditional setups.
MRC Enables 100k+ GPU Clusters with Resilient Multipath Networking
OpenAI's MRC protocol spreads packets across hundreds of paths for microsecond failure recovery, connecting 100,000+ GPUs via just 2 switch tiers—cutting power, cost, and downtime in AI training supercomputers.
Ditch preferred_username for Azure AD Guest Auth
Using preferred_username as identity anchor worked for employees but failed silently for all B2B guests, causing 403 errors post-launch. Anchor on oid instead for reliable identification.
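A minimal sketch of the fix, using hypothetical decoded token claims (the claim values are invented; the `#EXT#`-mangled UPN is the shape that breaks guest lookups):

```python
# Hypothetical claim payloads, already validated and decoded from Azure AD tokens.
employee_claims = {
    "preferred_username": "alice@contoso.com",
    "oid": "11111111-aaaa-4bbb-8ccc-222222222222",
}
guest_claims = {
    # B2B guests carry their home-tenant UPN, often with #EXT# mangling,
    # so it won't match any user record keyed on preferred_username.
    "preferred_username": "bob_fabrikam.com#EXT#@contoso.onmicrosoft.com",
    "oid": "33333333-dddd-4eee-8fff-444444444444",
}

def user_key(claims: dict) -> str:
    """Anchor identity on the immutable object ID, not the mutable UPN."""
    return claims["oid"]

print(user_key(guest_claims))  # a stable GUID, same shape for guests and employees
```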
SIE: Dynamic Inference for Small Models on Shared GPUs
Open-source SIE engine from Superlinked enables hot-swapping small embedding models (e.g., Stella, ColBERT) on one GPU via LRU eviction, cutting costs and solving context rot in agents by preprocessing data.
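The hot-swapping can be sketched as a plain LRU pool. This is illustrative, not SIE's actual API; `load_fn` stands in for loading model weights onto the GPU, and `capacity` for how many models fit in memory at once:

```python
from collections import OrderedDict

class ModelPool:
    """LRU pool: keeps at most `capacity` models 'loaded', evicting the
    least recently used one to make room for a newly requested model."""

    def __init__(self, capacity: int, load_fn):
        self.capacity = capacity
        self.load_fn = load_fn
        self.loaded = OrderedDict()

    def get(self, name: str):
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently used
        else:
            if len(self.loaded) >= self.capacity:
                evicted, _ = self.loaded.popitem(last=False)  # drop the LRU model
                print(f"evicting {evicted}")
            self.loaded[name] = self.load_fn(name)
        return self.loaded[name]

pool = ModelPool(capacity=2, load_fn=lambda name: f"<{name} weights>")
pool.get("stella")
pool.get("colbert")
pool.get("stella")       # refreshes stella's recency
pool.get("bge")          # evicts colbert, the least recently used
print(list(pool.loaded))  # ['stella', 'bge']
```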
Secure AI Agents via MCP Toolbox Custom Tools
MCP Toolbox prevents confused deputy attacks by letting developers pre-write constrained SQL tools with bound parameters, separating agent flexibility from app-controlled security for runtime agents.
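The bound-parameter idea can be illustrated with stdlib `sqlite3` (a sketch of the principle, not MCP Toolbox's actual API): the developer fixes the SQL shape ahead of time, and the agent may only supply values, which are bound rather than interpolated.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, "acme", 120.0), (2, "globex", 75.5)])

# Developer-authored tool: the query text is constant; agent input is a
# bound parameter and can never change what the query does.
ORDERS_BY_CUSTOMER = "SELECT id, total FROM orders WHERE customer = ?"

def orders_by_customer(customer: str):
    return con.execute(ORDERS_BY_CUSTOMER, (customer,)).fetchall()

print(orders_by_customer("acme"))             # [(1, 120.0)]
print(orders_by_customer("acme' OR '1'='1"))  # [] -- injection attempt is inert
```

Because the agent never composes SQL, a confused-deputy prompt can at worst query with a strange value, never escalate to a different query.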
Replace Cron with Temporal for Reliable Data Jobs
Cron offers no retries, overlap control, atomic writes, or observability. Temporal workflows add retries (3s initial interval, 2x backoff, max 8 attempts), atomic writes, unique output files per run ID, a SKIP overlap policy, and full execution history via the UI, surviving crashes because state lives in Temporal.
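In Temporal's Python SDK that policy would be declared as a `RetryPolicy`; here is just the arithmetic, showing the delay schedule the stated parameters (3s initial, 2x backoff, 8 max attempts) imply:

```python
def retry_delays(initial: float, backoff: float, max_attempts: int):
    """Delays (seconds) before each retry: 8 attempts total means 7 retries,
    each waiting `initial * backoff**i` seconds."""
    return [initial * backoff ** i for i in range(max_attempts - 1)]

delays = retry_delays(initial=3.0, backoff=2.0, max_attempts=8)
print(delays)  # [3.0, 6.0, 12.0, 24.0, 48.0, 96.0, 192.0]
```

A transient failure thus gets close to five and a half minutes of cumulative backoff before the run is marked failed.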
Self-Host Vane + Ollama for Private AI Web Research
Install Vane in Docker on Windows 11 with local Ollama and Qwen3.5:9b to run citation-backed searches privately, bypassing cloud services like OpenAI.
Proactive Synthetic Monitoring Catches DevOps Failures Early
Simulate user actions like logins, searches, and API calls to detect regressions, availability issues, and performance degradation before production traffic, integrating tests into CI/CD for consistent validation.
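A minimal synthetic-check runner, with stub probes standing in for real logins, searches, and API calls (a sketch; production monitors would issue real HTTP requests and emit metrics instead of dicts):

```python
import time

def synthetic_check(name, probe, max_latency_s=1.0):
    """Run one synthetic probe and classify the outcome: availability
    (exceptions), regressions (bad status), and performance (latency)."""
    start = time.monotonic()
    try:
        status = probe()  # stand-in for e.g. an HTTP request returning a status
    except Exception as exc:
        return {"check": name, "ok": False, "reason": f"error: {exc}"}
    latency = time.monotonic() - start
    if status != 200:
        return {"check": name, "ok": False, "reason": f"status {status}"}
    if latency > max_latency_s:
        return {"check": name, "ok": False, "reason": "too slow"}
    return {"check": name, "ok": True, "reason": "healthy"}

print(synthetic_check("login", lambda: 200))   # healthy
print(synthetic_check("search", lambda: 503))  # status 503
```

Running the same checks from CI/CD gives every deploy the validation described above before real traffic arrives.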
SageMaker Fine-Tuning: LoRA Beats QLoRA on Cost-Perf Balance
LoRA cuts trainable params by 96% vs full fine-tuning, balancing cost savings and accuracy on Llama2-7B/Mistral7B; QLoRA saves 8x memory but trains slower due to dequantization overhead.
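The parameter-count arithmetic behind that saving, for a single illustrative 4096x4096 projection (rank 8 and the layer shape are assumptions for illustration; the summary's 96% figure is model-wide, not per layer):

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter: A is (d_in x r), B is (r x d_out),
    so the adapter trains r * (d_in + d_out) weights instead of d_in * d_out."""
    return rank * (d_in + d_out)

# One 4096x4096 attention projection, roughly Llama2-7B-sized.
full = 4096 * 4096
lora = lora_params(4096, 4096, rank=8)
print(f"LoRA trains {lora / full:.2%} of this layer's weights")  # 0.39%
```

QLoRA keeps the same adapter math but stores the frozen base weights in 4-bit form, which is where both the 8x memory saving and the dequantization slowdown come from.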
Composable Specialists Beat Monoliths for Enterprise AI
Panel agrees enterprises need Granite 4.1's task-specific models and Bob's orchestration for cost control, with DiLoCo enabling distributed training to sidestep grid limits.
Bigtable Scales Petabytes for Real-Time NoSQL Workloads
Bigtable auto-scales to hundreds of petabytes and millions of ops/sec with low latency, powering Google Search/YouTube/Maps; ideal for time series, ML features, and streaming via Flink/Kafka integrations.
Scale PyTorch DDP Multi-Node on AWS EC2: Infra-First Guide
Multi-node DDP demands identical environments, data access, and open security groups across EC2 instances; use torchrun launcher with DDPManager for minimal code changes and reliable gradient sync via NCCL.
TPUs Dominate at Infrastructure Scale Over Per-Chip GPU Wins
Google's TPU v8t (training) and v8i (inference) lag Nvidia GPUs per chip but deliver superior performance at scale—9600-chip superpods hit 121 exaFLOPS FP4—via cube topology and Virgo networking, optimizing for AI's bandwidth-heavy workloads.
Batch Size Unlocks 1000x LLM Inference Efficiency
Reiner Pope deduces frontier LLM training and serving mechanics from roofline analysis, revealing batch size as the core driver of latency-cost tradeoffs, with optimal batches of ~2000 tokens amortizing weights for massive gains.
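A hedged roofline sketch of that tradeoff. The hardware and model numbers (70B params, ~1 PFLOP/s, 3 TB/s) are illustrative assumptions, not Pope's figures; the point is only the shape: per-token time is the max of a weight-loading term that shrinks with batch size and a compute term that does not.

```python
def time_per_token_s(params: float, flops: float, mem_bw: float, batch: int) -> float:
    """Roofline sketch: each step streams all weights once (fp16, 2 bytes/param,
    amortized over the batch) and does ~2*params FLOPs per token."""
    load = 2 * params / mem_bw / batch  # weight-read time, shared by the batch
    compute = 2 * params / flops        # per-token arithmetic time
    return max(load, compute)

P, F, BW = 70e9, 1e15, 3e12  # hypothetical 70B model on a ~1 PFLOP/s, 3 TB/s chip
for b in (1, 128, 2048):
    print(b, time_per_token_s(P, F, BW, b))
```

With these numbers the crossover from bandwidth-bound to compute-bound lands in the low thousands of tokens per batch, consistent with the ~2000-token optimum in the summary.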
Claude Code's DIY-Heavy Tech Stack Picks
Claude Code prefers custom/DIY solutions in 12 of 20 tooling categories but defaults to Vercel for JS deploys (100%), Stripe for payments (91%), Shadcn for UI (90%), and GitHub Actions for CI/CD (94%), revealing AI's influence on new dev stacks.
GitHub RCE via Single Git Push X-Stat Injection
Authenticated users exploited X-Stat field injection in GitHub's internal git protocol for RCE on GitHub.com and GHES using a standard git push, enabling access to millions of repos (CVE-2026-3854, High severity).
Scaffold AI Agent Prod Infra in 60s with Google Starter Pack
Google's Agent Starter Pack CLI generates full production-ready AI agent stack—FastAPI backend, Terraform IaC, CI/CD, Vertex AI eval, observability—in 60 seconds, cutting typical 3-9 month infra setup to minutes across 6 templates.
Gemma 4 Prod Stack: Model Armor, ADK Agents, Tracing
Deploy secure, observable Gemma 4 agents on Cloud Run using load balancers for Model Armor integration, ADK for model-agnostic agents with vLLM, and Prometheus/Cloud Trace for metrics like GPU util and latency.
Mount S3 Buckets as File Systems with AWS S3 Files
AWS S3 Files mounts buckets directly as file systems on EC2, containers, and Lambda—eliminating FUSE hacks and sync scripts for AI/ML workflows, but misconfigurations risk exposing, corrupting, or losing data.
Self-Host Gemma 4 on Cloud Run GPUs: Ollama vs vLLM
Deploy open Gemma 4 LLM on serverless Cloud Run GPUs two ways: Ollama bakes model into container for instant cold starts; vLLM mounts from GCS FUSE for model swaps without rebuilds. Full CI/CD via Cloud Build.
AI Drafts Code Fast But Misses Context and Silent Bugs
Fully delegating dev workflow to AI sped up drafting but caused production issues like hollow tests, context-blind pipelines, AI self-reviews, and 34% webhook drop from unmodeled behavioral changes. Humans must supply context, break review loops, and validate impacts.
Zero Leak Debt: Kill 100+ Leaked Secrets Platform-Wide
Leaked secrets from 2022 still process payments as 'leak debt'; ruthlessly audit across local dev, CI/CD, and production to reach zero static secrets that never leak, expire unexpectedly, or need manual rotation.
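An audit like this usually starts with pattern scanning across repos, CI logs, and configs. A toy sketch with two illustrative rules (real scanners such as gitleaks or trufflehog ship far richer rule sets, plus entropy checks):

```python
import re

# Illustrative patterns for two common credential shapes.
PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(r"(?i)api[_-]?key\s*[:=]\s*['\"]\w{16,}['\"]"),
}

def scan(text: str):
    """Return the names of every rule that matches the given text."""
    return [name for name, pat in PATTERNS.items() if pat.search(text)]

sample = 'aws_key = "AKIAABCDEFGHIJKLMNOP"\napi_key: "sk_live_abcdef1234567890"'
print(scan(sample))  # ['aws_access_key', 'generic_api_key']
```

Finding the secrets is the easy half; reaching zero leak debt also means replacing each hit with short-lived, automatically rotated credentials so nothing static is left to leak again.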
8 AI Agents Turn Terminal into Free Cyber Audit Lab
One command spawns 8 specialist AI agents in Claude Code to audit codebases for vulnerabilities across OWASP Top 10, CWE Top 25, and more—boosted Claude Ads score from 62/100 (C) to 90/100 after fixes.
Scaling TPUs on GKE for Massive AI Workloads
GKE treats TPU slices as atomic units for seamless scaling up to 9k+ chips, with flexible capacity like DWS Flex/Calendar and custom fallbacks for cost-efficient ML training/inference.
Self-Host Archon v3 on Hetzner VPS with Docker
Provision a Hetzner VPS, apply cloud-init YAML to auto-set up Archon v3 behind a Caddy HTTPS reverse proxy with a Postgres DB, then configure .env secrets and optional form auth for secure 24/7 access via a subdomain.
Showing 30 of 73