#safety
Predicting AI Model Behavior via Deployment Simulation
OpenAI uses 'Deployment Simulation' (replaying real, de-identified user conversations against new models) to predict safety risks and undesired behaviors before public release, an approach that outperforms traditional synthetic evaluations.
OpenAI's Deployment Simulation for Agentic Coding Risk Assessment
OpenAI has introduced a deployment simulation framework that uses simulated tool calls to evaluate the safety and reliability of agentic coding systems before they are deployed in real-world environments.
Governance by Construction for Generalist Agents
The paper proposes 'Governance by Construction' as a paradigm for AI safety, shifting from post-hoc monitoring to embedding constraints directly into the agent's architecture and execution environment.
Scaling AI Content Provenance via C2PA and SynthID
OpenAI is adopting a multi-layered provenance strategy by combining C2PA metadata standards with Google's SynthID watermarking to ensure AI-generated content remains identifiable even after file transformations.