Large Language Models
50 articles

How we monitor internal coding agents for misalignment
How OpenAI uses chain-of-thought monitoring to study misalignment in internal coding agents—analyzing real-world deployments to detect risks and strengthen AI s...

OpenAI to acquire Astral
Accelerates Codex growth to power the next generation of Python developer tools

Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training
A developer successfully replicated David Ng's RYS method on consumer AMD GPUs, discovering that transformers possess discrete "reasoning circuits" consisting o...

OpenAI Japan announces Japan Teen Safety Blueprint to put teen safety first
OpenAI Japan announces the Japan Teen Safety Blueprint, introducing stronger age protections, parental controls, and well-being safeguards for teens using gener...

Introducing GPT-5.4 mini and nano
GPT-5.4 mini and nano are smaller, faster versions of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads.

Equipping workers with insights about compensation
New research shows Americans send nearly 3 million daily messages to ChatGPT asking about compensation and earnings, helping close the wage information gap.

Why Codex Security Doesn’t Include a SAST Report
A deep dive into why Codex Security doesn’t rely on traditional SAST, instead using AI-driven constraint reasoning and validation to find real vulnerabilities w...

Rakuten fixes issues twice as fast with Codex
Rakuten uses Codex, the coding agent from OpenAI, to ship software faster and safer, reducing MTTR 50%, automating CI/CD reviews, and delivering full-stack buil...

Designing AI agents to resist prompt injection
How ChatGPT defends against prompt injection and social engineering by constraining risky actions and protecting sensitive data in agent workflows.

Wayfair boosts catalog accuracy and support speed with OpenAI
Wayfair uses OpenAI models to improve ecommerce support and product catalog accuracy, automating ticket triage and enhancing millions of product attributes at s...

From model to agent: Equipping the Responses API with a computer environment
How OpenAI built an agent runtime using the Responses API, shell tool, and hosted containers to run secure, scalable agents with files, tools, and state.

Improving instruction hierarchy in frontier LLMs
IH-Challenge trains models to prioritize trusted instructions, improving instruction hierarchy, safety steerability, and resistance to prompt injection attacks.

New ways to learn math and science in ChatGPT
ChatGPT introduces interactive visual explanations for math and science, helping students explore formulas, variables, and concepts in real time.

OpenAI to acquire Promptfoo
OpenAI is acquiring Promptfoo, an AI security platform that helps enterprises identify and remediate vulnerabilities in AI systems during development.

How Descript enables multilingual video dubbing at scale
Descript uses OpenAI models to scale multilingual video dubbing, optimizing translations for both meaning and timing so dubbed speech sounds natural across lang...

Codex Security: now in research preview
Codex Security is an AI application security agent that analyzes project context to detect, validate, and patch complex vulnerabilities with higher confidence a...

How Balyasny Asset Management built an AI research engine for investing
See how Balyasny built an AI research system with GPT-5.4, rigorous model evaluation, and agent workflows to transform investment analysis at scale.

Introducing GPT-5.4
Introducing GPT-5.4, OpenAI’s most capable and efficient frontier model for professional work, with state-of-the-art coding, computer use, tool search, and...

GPT-5.4 Thinking System Card

Reasoning models struggle to control their chains of thought, and that’s good
OpenAI introduces CoT-Control and finds reasoning models struggle to control their chains of thought, reinforcing monitorability as an AI safety safeguard.

Ensuring AI use in education leads to opportunity
OpenAI shares new tools, certifications, and measurement resources to help schools and universities close AI capability gaps and expand opportunity.

Introducing ChatGPT for Excel and new financial data integrations
OpenAI introduces ChatGPT for Excel and new financial app integrations, powered by GPT-5.4 to accelerate modeling, research, and analysis in regulated environme...

Introducing the Adoption news channel
Practical insights and frameworks to turn AI progress into business advantage

The five AI value models driving business reinvention
Five AI value models show how leaders can sequence AI from workforce fluency to process reinvention and build durable business advantage.

Extending single-minus amplitudes to gravitons
A new preprint extends single-minus amplitudes to gravitons, with GPT-5.2 Pro helping derive and verify nonzero graviton tree amplitudes in quantum gravity.

How Axios uses AI to help deliver high-impact local journalism
Axios COO Allison Murphy explains how the company uses AI to support local reporters, streamline newsroom workflows, and deliver high-impact local journalism at...

Understanding AI and learning outcomes
OpenAI introduces the Learning Outcomes Measurement Suite to assess AI’s impact on student learning across diverse educational environments over time.

GPT-5.3 Instant: Smoother, more useful everyday conversations

GPT-5.3 Instant System Card

Our agreement with the Department of War
Details on OpenAI’s contract with the Department of War, outlining safety red lines, legal protections, and how AI systems will be deployed in classified enviro...

Joint Statement from OpenAI and Microsoft
Microsoft and OpenAI continue to work closely across research, engineering, and product development, building on years of deep collaboration and shared success.

Introducing the Stateful Runtime Environment for Agents in Amazon Bedrock
Stateful Runtime for Agents in Amazon Bedrock brings persistent orchestration, memory, and secure execution to multi-step AI workflows powered by OpenAI.

OpenAI and Amazon announce strategic partnership
OpenAI and Amazon announce a strategic partnership bringing OpenAI’s Frontier platform to AWS, expanding AI infrastructure, custom models, and enterprise AI age...

Scaling AI for everyone
Today we’re announcing $110B in new investment at a $730B pre-money valuation. This includes $30B from SoftBank, $30B from NVIDIA, and $50B from Amazon.

An update on our mental health-related work
OpenAI shares updates on its mental health safety work, including parental controls, trusted contacts, improved distress detection, and recent litigation develo...

Pacific Northwest National Laboratory and OpenAI partner to accelerate federal permitting
OpenAI and Pacific Northwest National Laboratory introduce DraftNEPABench, a new benchmark evaluating how AI coding agents can accelerate federal permitting—sho...

OpenAI Codex and Figma launch seamless code-to-design experience
OpenAI and Figma launch a new Codex integration that connects code and design, enabling teams to move between implementation and the Figma canvas to iterate and...

Disrupting malicious uses of AI | February 2026
Our latest threat report examines how malicious actors combine AI models with websites and social platforms—and what it means for detection and defense.

Arvind KC appointed Chief People Officer
OpenAI appoints Arvind KC as Chief People Officer to help scale the company, strengthen its culture, and lead how work evolves in the age of AI.

Why we no longer evaluate SWE-bench Verified
SWE-bench Verified is increasingly contaminated and mismeasures frontier coding progress. Our analysis shows flawed tests and training leakage. We recommend SWE...

OpenAI announces Frontier Alliance Partners
OpenAI announces Frontier Alliance Partners to help enterprises move from AI pilots to production with secure, scalable agent deployments.

Our First Proof submissions
We share our AI model’s proof attempts for the First Proof math challenge, testing research-grade reasoning on expert-level problems.

Advancing independent research on AI alignment
OpenAI commits $7.5M to The Alignment Project to fund independent AI alignment research, strengthening global efforts to address AGI safety and security risks.

Introducing OpenAI for India
OpenAI for India expands AI access across the country—building local infrastructure, powering enterprises, and advancing workforce skills.

Introducing EVMbench
OpenAI and Paradigm introduce EVMbench, a benchmark evaluating AI agents’ ability to detect, patch, and exploit high-severity smart contract vulnerabilities.

GPT-5.2 derives a new result in theoretical physics
A new preprint shows GPT-5.2 proposing a new formula for a gluon amplitude, later formally proved and verified by OpenAI and academic collaborators.

Introducing Lockdown Mode and Elevated Risk labels in ChatGPT
Introducing Lockdown Mode and Elevated Risk labels in ChatGPT to help organizations defend against prompt injection and AI-driven data exfiltration.

Scaling social science research
GABRIEL is a new open-source toolkit from OpenAI that uses GPT to turn qualitative text and images into quantitative data, helping social scientists analyze res...

Beyond rate limits: scaling access to Codex and Sora
How OpenAI built a real-time access system combining rate limits, usage tracking, and credits to power continuous access to Sora and Codex.

Introducing GPT-5.3-Codex-Spark
Introducing GPT-5.3-Codex-Spark—our first real-time coding model. 15x faster generation, 128k context, now in research preview for ChatGPT Pro users.