The stream in tweet form.
Dive Club@diveclub·
This is a conversation about shifting from traditional design software like Figma to an agent-based, local-first workflow using Codex. The guest demonstrates how he uses the tool to build custom internal apps, automate image generation, and manage his entire project lifecycle through local markdown files and AI agents.
IBM Technology@ibmtechnology·
This is a technical primer on how vLLM manages GPU memory to improve inference throughput. The video explains the mechanics of KV caching and paged attention, offering a few specific configuration tips for optimizing memory utilization, prefix caching, and chunked prefill in production environments.
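To make the memory pressure behind KV caching concrete, here is a rough back-of-the-envelope sketch. It estimates the cache size per token and the number of fixed-size blocks a paged allocator would hand out for one sequence. The model dimensions and the block size of 16 tokens are illustrative assumptions, not figures from the video.

```python
import math

def kv_bytes_per_token(num_layers, num_kv_heads, head_dim, dtype_bytes=2):
    """Bytes of KV cache one token occupies (dtype_bytes=2 assumes fp16/bf16)."""
    # Factor 2: one key vector and one value vector per layer and KV head.
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes

def blocks_needed(seq_len, block_size=16):
    """Fixed-size blocks a paged allocator needs for a sequence of seq_len tokens."""
    return math.ceil(seq_len / block_size)

# Hypothetical model shape, chosen only for the arithmetic.
per_token = kv_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)
blocks = blocks_needed(seq_len=1000)
unused_slots = blocks * 16 - 1000  # waste is confined to the last block

print(f"{per_token / 1024:.0f} KiB per token, {blocks} blocks, {unused_slots} unused slots")
```

Because memory is handed out block by block, the waste per sequence is bounded by one partly filled block rather than by a preallocated maximum length.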
AI Engineer@aiengineer·
This video outlines a structured rubric for designing AI agent skills to avoid "skill hell," focusing on four key areas: choosing between user-invoked and model-invoked triggers, organizing content into steps and reference material, using "leading words" for steering, and pruning unnecessary instructions. You can find the framework and examples in the writing-great-skills resource.
Google Cloud Tech@googlecloudtech·
A technical walkthrough of building a multi-agent system using Google's Agent Development Kit and the Model Context Protocol. The video demonstrates how to use ADK’s built-in session state, instruction templating, and three-tier security callbacks to manage agent routing and tool execution without writing custom orchestration logic.
Brian Casel@briancasel·
This video demonstrates a workflow for building a custom time-tracking and invoicing web app using Claude Code. The creator walks through his process of using a PRD-generation tool to define scope and data models before having the AI write the application code.
IBM Technology@ibmtechnology·
This is a high-level conceptual overview of how AI-assisted code review works, explaining the roles of static analysis, linters, and LLMs in a development pipeline. It works as a primer for managers or developers curious about the benefits and risks of integrating these tools into their existing workflows.
IBM Technology@ibmtechnology·
This is a high-level conceptual overview of how AI-assisted code review works, explaining the roles of static analysis, linters, and LLMs in a development pipeline. It serves as a general introduction to the topic rather than a hands-on tutorial or a review of specific AI Code Review tools.
MicroConf@microconf·
This talk is a practical guide for SaaS founders on how to execute price increases without alienating their user base. The speaker outlines a framework for choosing between adjusting price points, packaging, or value metrics, and provides a communication strategy for rolling out changes while managing customer expectations and internal team alignment.
AI Engineer@aiengineer·
This presentation argues for replacing expensive, latency-heavy frontier models with smaller, local models (SLMs) for specific production tasks. The speaker outlines a "prototype big, deploy small" workflow, demonstrating how to build a golden dataset from production traces to validate that a smaller model can handle your specific use case as effectively as a foundation model.
AI Engineer@aiengineer·
This video advocates for a "prototype big, deploy small" workflow, encouraging developers to use frontier models for initial testing before migrating to smaller, local models for production. The presenter demonstrates how to build a "golden dataset" from production traces to validate these smaller models using Arize Phoenix for evaluation.
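The validation step can be pictured as a simple pass-rate check against the golden dataset. This is a minimal sketch in plain Python, not the Arize Phoenix workflow shown in the video; the record layout (`input`, `expected`), the exact-match comparison, and the stub model are all assumptions made for illustration.

```python
def normalized_match(actual, expected):
    """Hypothetical comparison: case- and whitespace-insensitive equality."""
    return actual.strip().lower() == expected.strip().lower()

def pass_rate(golden, candidate_fn, match=normalized_match):
    """Fraction of golden examples the candidate model answers acceptably."""
    if not golden:
        raise ValueError("golden dataset is empty")
    hits = sum(match(candidate_fn(ex["input"]), ex["expected"]) for ex in golden)
    return hits / len(golden)

# Golden examples would come from production traces; these are made up.
golden = [
    {"input": "refund request", "expected": "billing"},
    {"input": "password reset", "expected": "account"},
    {"input": "app crashes on launch", "expected": "bug"},
    {"input": "cancel my plan", "expected": "billing"},
]

def small_model_stub(text):
    """Stand-in for a call to the smaller model under test."""
    return "Billing" if ("refund" in text or "cancel" in text) else "account"

rate = pass_rate(golden, small_model_stub)
print(f"pass rate: {rate:.2f}")
```

In practice the comparison function matters more than the loop: free-form outputs usually need a task-specific or model-graded check rather than string equality.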
AI Engineer@aiengineer·
This talk argues that the current approach to AI agents—stuffing tools and context into a single large model—is fundamentally flawed. The speaker advocates for a shift toward "domain-specific agents" that are modular and composable, rather than relying on monolithic systems or just adding more tools via the Model Context Protocol.
AI Engineer@aiengineer·
This talk argues that the current "all-in-one" agent architecture is flawed because it relies on stuffing too much context into a single model. Instead, the speaker advocates for building smaller, domain-specific agents that can be composed together, treating them like modular software components rather than monolithic assistants.
AI Engineer@aiengineer·
This talk argues that as AI lowers the barrier to writing code, the primary bottleneck in software development has shifted back to traditional business analysis. The speaker advocates for using established frameworks like story mapping and a "Value-Architecture-Design" (VAD) approach to ensure teams build high-impact tools rather than just automating existing inefficiencies.
AI Engineer@aiengineer·
This talk argues that as AI lowers the barrier to writing code, the primary bottleneck in software development has shifted back to traditional business analysis. The speaker advocates for using established techniques like story mapping and a "Value-Architecture-Design" (VAD) framework to define requirements before prompting AI, emphasizing that the goal is to build software that is actually used rather than just demoed.
AI Engineer@aiengineer·
This talk from Nishant Gupta outlines the shift from "prompt engineering" to "systems engineering" for production-grade AI agents. He argues that because LLMs are stochastic, developers must build a deterministic "control plane" around them to handle orchestration, observability, and safety, rather than letting agents interact directly with production infrastructure.
AI Engineer@aiengineer·
This talk outlines the shift from "prompt engineering" to "systems engineering" for production AI agents. Nishant Gupta argues that because LLMs are stochastic, you must wrap them in a deterministic control plane—using patterns like circuit breakers, policy-based tool validation, and observability—to prevent infrastructure-level failures like recursive retry loops.
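As one illustration of the circuit-breaker pattern mentioned here, the sketch below wraps a tool call so that repeated failures stop further attempts for a cooldown period instead of feeding a retry loop. The class name, thresholds, and injectable clock are assumptions for this example and are not taken from the talk.

```python
import time

class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without invoking the tool."""

class CircuitBreaker:
    def __init__(self, max_failures=3, cooldown_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock          # injectable so the cooldown can be tested
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                raise CircuitOpenError("circuit open; refusing call")
            # Cooldown elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0           # any success closes the circuit fully
        return result
```

An agent runtime would route every tool invocation through `breaker.call(tool, ...)` and treat `CircuitOpenError` as a signal to stop and report, not as another failure to retry.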
AI Engineer@aiengineer·
This video outlines a conceptual framework for automating the software development lifecycle of AI agents. The speakers propose using a multi-agent system to handle the specification, building, evaluation, and optimization of other agents, aiming to replace manual human oversight with an "eval-driven" loop.
AI Engineer@aiengineer·
This video is a conceptual pitch for Mutagent, a platform designed to automate the agent development lifecycle. The speakers outline an "eval-driven" workflow where an orchestrator agent manages the specification, building, testing, and monitoring of other AI agents to remove human bottlenecks from the iteration loop.
AI Engineer@aiengineer·
This talk outlines a workflow for using AI agents to synthesize bespoke software implementations from abstract specifications. The speaker argues that by using deterministic simulation environments to expose "forbidden" state information, agents can move upstream from writing code to designing robust distributed systems.
AI Engineer@aiengineer·
This talk outlines a methodology for using AI agents to synthesize bespoke software implementations from abstract specifications, rather than relying on general-purpose frameworks. The speaker details a multi-step workflow—moving from abstract spec to simulation to concrete implementation—using Resonate as a case study for building durable execution systems.
AI Engineer@aiengineer·
This talk outlines an architecture for automating ETL pipeline recovery using a combination of deterministic anomaly detection, Q-learning for action selection, and an external safety layer. The speaker emphasizes that the system's reliability stems from its modular design—separating observation from decision-making—rather than the RL model itself, and provides a public repository containing the synthetic benchmark and experimental scripts used to validate the approach.
AI Engineer@aiengineer·
This talk outlines a framework for automating ETL pipeline recovery using a combination of deterministic anomaly detection, Q-learning for action selection, and a hard-coded safety layer. The speaker demonstrates how this architecture—available in their GitHub repository—reduces mean time to recovery by offloading routine failures to an agent while escalating high-risk or novel cases to human engineers.
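The split between a learned action selector and a separate safety layer can be sketched as follows. This is a generic tabular Q-learning example, not the code from the speaker's repository; the action names, the high-risk set, and the state labels are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = ["retry", "restart_task", "rollback", "escalate"]  # hypothetical actions
HIGH_RISK = {"rollback"}                                     # hypothetical policy

class RecoveryAgent:
    """Tabular Q-learning over (anomaly state, recovery action) pairs."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def propose(self, state):
        # Epsilon-greedy: explore occasionally, otherwise pick the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update toward reward + discounted best next value.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

def safety_layer(state, action, known_states):
    """Rule-based check outside the agent: novel or high-risk cases go to a human."""
    if state not in known_states or action in HIGH_RISK:
        return "escalate"
    return action
```

Keeping `safety_layer` outside the agent means its guarantees hold regardless of what the Q-table has learned, which matches the talk's point that reliability comes from the modular design.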
AI Engineer@aiengineer·
This talk argues that trying to force LLMs to be deterministic is a dead end for production debugging. Instead, the speakers propose a "record and replay" pattern—similar to Mozilla rr—where you capture inputs and outputs at the boundaries of your agent’s nodes to reconstruct and step through failed execution traces without needing to re-run the model.
AI Engineer@aiengineer·
This talk argues that chasing "bitwise determinism" in LLM agents is a losing battle and proposes a "record and replay" architecture instead. The speakers demonstrate how to capture inputs and outputs at the boundary of each agent node—rather than the network layer—to allow engineers to step through failed production traces offline without re-triggering the model.
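A minimal sketch of recording at node boundaries might look like the following. In record mode each decorated node appends its inputs and output to a trace; in replay mode the node returns the recorded output without running, so a model call is never repeated. The `Recorder` class and its trace format are assumptions for illustration, not the speakers' implementation, and the sketch assumes node inputs and outputs are JSON-serializable.

```python
import json

class Recorder:
    def __init__(self, trace=None):
        self.replaying = trace is not None
        self.trace = list(trace) if trace else []
        self.cursor = 0  # position in the trace during replay

    def node(self, name):
        """Decorator marking a function as an agent node boundary."""
        def wrap(fn):
            def inner(*args):
                if self.replaying:
                    entry = self.trace[self.cursor]
                    self.cursor += 1
                    if entry["node"] != name or entry["input"] != list(args):
                        raise RuntimeError(f"trace diverged at node {name!r}")
                    return entry["output"]  # the real function is never called
                out = fn(*args)
                self.trace.append({"node": name, "input": list(args), "output": out})
                return out
            return inner
        return wrap

    def to_json(self):
        return json.dumps(self.trace)

def build_agent(rec, model_call):
    """Wire a one-node agent; model_call stands in for the LLM request."""
    @rec.node("answer")
    def answer(prompt):
        return model_call(prompt)
    return answer

# Record a run, then replay it with a model stub that must never be invoked.
live = Recorder()
recorded = build_agent(live, lambda p: p.upper())("why did it fail")

def never_called(prompt):
    raise AssertionError("model invoked during replay")

replay = Recorder(trace=json.loads(live.to_json()))
replayed = build_agent(replay, never_called)("why did it fail")
print(recorded == replayed)
```

The divergence check is what makes replay trustworthy: if the code under inspection sends a node different inputs than production did, the run stops instead of silently returning a stale output.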
Machine Learning Street Talk@machinelearningstreettal·
This is a wide-ranging technical conversation with Thomas Ahle about Normal Computing's efforts to automate chip design. The discussion moves from the practical challenges of building an open-source Verilog simulator to the theoretical hurdles of formal verification and the physics behind their thermodynamic computing hardware.
AI Engineer@aiengineer·
This talk argues that "voice-in, visuals-out" is a more practical UX pattern than full voice-to-voice because it leverages the human brain's visual processing power while staying within a more forgiving latency budget. The speaker outlines three technical requirements for building these responsive agents: using low-latency models like Haiku, triggering frequent, eager inference turns rather than waiting for silence, and aggressive use of prefix caching to keep response times under one second.
AI Engineer@aiengineer·
This talk outlines a practical strategy for building "voice-in, visuals-out" AI products by prioritizing latency over conversational perfection. The speaker argues that while voice-to-voice interfaces are currently hindered by extreme latency requirements, voice-to-visuals allows for a more forgiving one-second response window, which he achieves by using fast models like Claude Haiku, eager inference intervals, and aggressive prefix caching.
AI Engineer@aiengineer·
This presentation outlines a conceptual framework for enterprise fraud detection that replaces isolated document analysis with a graph-based approach. Varsha Shah explains how combining entity correlation, probabilistic risk modeling, and data normalization can identify patterns across disparate financial systems, though the talk remains high-level and does not provide implementation code or specific tool stacks.
AI Engineer@aiengineer·
This is a high-level architectural overview of a multi-component framework for enterprise fraud detection. The speaker explains how to move from isolated document validation to a connected system using graph-based entity correlation, probabilistic risk modeling, and cross-jurisdictional data normalization.
Google Cloud Tech@googlecloudtech·
A demonstration of Google Antigravity using voice prompts to orchestrate AI sub-agents. The presenters walk through a workflow where the tool generates a full-stack note-taking app by delegating tasks like frontend development, backend Go/SQLite setup, and testing to specialized agents.