groundy
security

ChatGPT's New Lockdown Mode Borrows Apple's Name for a Prompt-Injection Kill Switch

OpenAI's ChatGPT Lockdown Mode disables web browsing, images, and Deep Research, conceding that model-level defenses against prompt injection have plateaued as of early 2026.


OpenAI shipped a feature called Lockdown Mode for ChatGPT on February 13, 2026, and the name is not an accident. Like Apple’s Lockdown Mode for iOS, it accepts that the underlying platform cannot reliably defend itself and gives administrators a blunt toggle: disable the risky surfaces entirely. ChatGPT in Lockdown Mode cannot browse the live web, generate images, run Deep Research, or download files for analysis. It is, by OpenAI’s own description, a concession that the model layer cannot reliably tell a legitimate system prompt from maliciously injected instructions, so the infrastructure layer has to step in and cut the network cord.

What Lockdown Mode disables

With the mode active, browsing is restricted to cached content. No live network requests leave OpenAI’s controlled environment, according to Groundy’s analysis of the launch. The full list of disabled capabilities is blunt: image generation in responses, Deep Research, Agent Mode, network-access approval for Canvas-generated code, and file download for analysis are all off. Manually uploaded files remain usable, because the exfiltration risk there is confined to the session’s text output rather than an outbound network call.
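A minimal sketch of what a cached-only gate could look like, assuming a hypothetical `Browser` wrapper. None of these names come from OpenAI, and the real control lives in infrastructure that is not publicly documented at this level. The point it illustrates is that the check runs outside the model, so an injected instruction has nothing to argue with.

```python
from dataclasses import dataclass, field


class NetworkBlocked(Exception):
    """Raised when a live request is refused while lockdown is active."""


@dataclass
class Browser:
    # Hypothetical wrapper for illustration; not OpenAI's implementation.
    lockdown: bool
    cache: dict[str, str] = field(default_factory=dict)

    def fetch(self, url: str) -> str:
        if url in self.cache:
            # Cached content never triggers an outbound request.
            return self.cache[url]
        if self.lockdown:
            # The refusal is unconditional: it does not depend on what
            # the model was told or what it "decided".
            raise NetworkBlocked(url)
        return self._live_fetch(url)

    def _live_fetch(self, url: str) -> str:
        # Stand-in for a real HTTP call, kept offline so the sketch runs anywhere.
        page = f"<live copy of {url}>"
        self.cache[url] = page
        return page
```

A URL crafted by an attacker to carry data in its query string is, almost by definition, not in the cache, so under this model it is refused rather than fetched.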

Workspace admins can enable Lockdown Mode through role-based controls and granular per-app overrides. The feature is available only on ChatGPT Enterprise, Edu, Healthcare, and Teachers plans.

The DNS side-channel that validated the threat model

One week after Lockdown Mode shipped, Check Point researchers disclosed a DNS side-channel vulnerability in ChatGPT’s Code Execution sandbox. The attack vector was straightforward: encoded subdomain lookups silently exfiltrated data via DNS queries, bypassing the sandbox’s intended isolation boundaries. OpenAI patched it on February 20.

Lockdown Mode was already shipping when the disclosure landed. The disclosure validated the exact threat model the feature was built to address. If ChatGPT can make network requests, a sufficiently clever prompt can coerce those requests into leaking data. The DNS channel is just the one that was found. The design assumption behind Lockdown Mode is that other channels exist and have not been found yet.
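To make the mechanism concrete, here is a small offline sketch of how data can ride inside a hostname and how a defender might flag it. It performs no DNS lookups. The function names, the `attacker.example` zone, and the thresholds are all illustrative assumptions, not details from Check Point's disclosure.

```python
import base64
import math
from collections import Counter


def encode_as_dns_name(secret: bytes, attacker_zone: str) -> str:
    """Build a hostname that carries `secret` in its labels. No lookup is made."""
    encoded = base64.b32encode(secret).decode("ascii").rstrip("=").lower()
    # A single DNS label is limited to 63 bytes, so long payloads are chunked.
    labels = [encoded[i:i + 63] for i in range(0, len(encoded), 63)]
    return ".".join(labels + [attacker_zone])


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character of `text` (text must be non-empty)."""
    counts = Counter(text)
    return -sum((n / len(text)) * math.log2(n / len(text)) for n in counts.values())


def looks_like_exfiltration(hostname: str,
                            max_label_len: int = 30,
                            max_entropy: float = 3.5) -> bool:
    """Heuristic: flag hostnames with a very long or random-looking label."""
    for label in hostname.split("."):
        if not label:
            continue
        if len(label) > max_label_len:
            return True
        if len(label) >= 16 and shannon_entropy(label) > max_entropy:
            return True
    return False
```

Whoever runs the authoritative name server for the zone sees every label that is looked up, which is why a sandbox that blocks HTTP but still resolves names can leak. Heuristics like this one produce false positives on some legitimate hostnames, which is part of the argument for cutting the channel rather than filtering it.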

Why OpenAI stopped trying to fix this at the model layer

OpenAI’s language around the launch is unusually direct about what the feature is not. Lockdown Mode operates as an infrastructure-level control that addresses the model’s inability to reliably distinguish legitimate system prompts from malicious injected instructions, according to Groundy’s analysis. That is a concession phrased as a product decision.

Prompt injection has been a known class of vulnerability since at least the first generation of instruction-tuned LLMs. Through early 2026, the research community had proposed structured instruction hierarchies, input sanitization layers, and fine-tuned classifiers, none of which had produced a reliable general defense.

OpenAI building a network kill switch instead of a smarter classifier is the company’s most honest public acknowledgment that model-level content filtering had plateaued for this threat class as of early 2026.

Separately, Creati.ai reports that OpenAI revoked access to the GPT-4o model specifically because of its tendency toward sycophancy: the model agreed with user premises even when they were factually incorrect or malicious, which made it more susceptible to social engineering and jailbreaking. If accurate, this suggests OpenAI has encountered a model behavior that amplifies the very attack surface Lockdown Mode is designed to contain.

Elevated Risk labels: informational, not enforceable

Alongside Lockdown Mode, OpenAI introduced Elevated Risk labels: standardized indicators that appear in settings across ChatGPT, ChatGPT Atlas, and Codex whenever network-related capabilities are enabled. These are informational signals, not enforcement mechanisms. OpenAI has stated it will remove the labels as security mitigations improve, which frames them as temporary transparency rather than a durable security boundary.

The labels serve a purpose regardless: they give administrators a visible signal that a session’s attack surface has expanded. Whether that signal changes behavior depends entirely on whether the humans reading it act on it.
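The difference between a label and a control can be sketched in a few lines. The capability names and the `run_session` helper are hypothetical, and this follows OpenAI's stated behavior only as described above: the label appears when a network-related capability is on, and nothing is blocked as a result.

```python
from typing import Optional

# Hypothetical identifiers for network-related capabilities.
NETWORK_CAPABILITIES = frozenset(
    {"web_browsing", "deep_research", "agent_mode", "file_download"}
)


def elevated_risk_label(enabled: set) -> Optional[str]:
    """Return a label if any network-related capability is enabled."""
    risky = sorted(enabled & NETWORK_CAPABILITIES)
    if not risky:
        return None
    return "Elevated Risk: " + ", ".join(risky)


def run_session(enabled: set) -> dict:
    """The label is attached to the session, but the session always proceeds."""
    return {"label": elevated_risk_label(enabled), "allowed": True}
```

In this sketch `allowed` is always true. Any change in behavior has to come from the person who reads the label.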

The enterprise-only gap

Lockdown Mode is an enterprise feature for a problem that is not enterprise-specific. Any ChatGPT session that can browse the web, execute code, or download files is theoretically vulnerable to prompt injection that exfiltrates data through those channels. The enterprise tier gets the kill switch because enterprise customers have administrators who can configure it and compliance teams who demand it. Consumer users get none of this, per Groundy’s reporting.

OpenAI has said it plans to bring Lockdown Mode to consumers in the coming months, per Baijiahao’s coverage. The design challenge for a consumer version is obvious: there is no administrator to configure per-app overrides, and the feature works by degrading the experience. Asking a consumer to toggle off web browsing, image generation, and Deep Research in exchange for a smaller attack surface is a hard sell when those are the features that drive engagement.

What Lockdown Mode does not cover

The feature closes the network exfiltration channel. It does not close everything.

A prompt injection attack that exfiltrates data through the model’s text output in the same conversation is unaffected by Lockdown Mode, because that exfiltration path does not require an outbound network request. Similarly, side effects in code execution that stay within the sandbox, or any channel that does not cross the network boundary, operate outside Lockdown Mode’s threat model, as Groundy’s analysis notes.

This is not a design flaw; it is a scope boundary. Lockdown Mode is a network containment tool, not a prompt-injection fix. The distinction matters because the marketing around the feature, and some of the coverage, conflates the two. An attacker who can trick the model into revealing sensitive context within the chat session still can. They just cannot phone it home via DNS.
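The scope boundary can be written down as a simple table. The channel names below are hypothetical labels for the paths this article discusses. The only property recorded is whether data must cross the network boundary to leave the session, since that is the property Lockdown Mode acts on.

```python
# Hypothetical channel names; True means the data has to cross the
# network boundary to reach an attacker.
CROSSES_NETWORK = {
    "live_web_request": True,
    "dns_lookup": True,
    "file_download": True,
    "chat_text_output": False,
    "sandbox_side_effect": False,
}


def closed_by_lockdown(channel: str) -> bool:
    """A channel is closed only if it depends on outbound network access."""
    return CROSSES_NETWORK[channel]


def residual_channels() -> list:
    """Channels that remain open with lockdown active."""
    return sorted(name for name, crosses in CROSSES_NETWORK.items() if not crosses)
```

Everything `residual_channels()` returns is still subject to prompt injection, which is the sense in which the feature contains the network path without fixing the underlying problem.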

The Apple parallel

The naming is deliberate and the design philosophy is the same. Apple’s Lockdown Mode for iOS, introduced in 2022, disables or restricts features such as most message attachment types, just-in-time (JIT) JavaScript compilation on the web, and incoming FaceTime calls from people the user has not previously called. Apple’s framing at the time was similarly candid: the mode is for users who believe they are personally targeted by sophisticated attacks and are willing to accept a degraded experience to reduce their attack surface.

OpenAI has adopted the same tradeoff structure. The feature exists because the platform cannot reliably defend itself, someone must opt in (an administrator, in ChatGPT’s case), and the result is a product with fewer capabilities. Apple’s Lockdown Mode was initially criticized as an admission that iOS security had limits; the same reading applies here. The difference is that Apple shipped it for every iPhone user on day one. OpenAI shipped it for paying enterprise customers and left the consumer gap open.

If OpenAI’s approach works, expect competitors to follow the same pattern: accept that model-level defenses are insufficient, and build infrastructure toggles that admins can flip when the risk calculus changes.

Frequently Asked Questions

Can admins disable specific capabilities without enabling full Lockdown Mode?

Yes. Admin controls support per-action granularity, so an administrator can disable file-download-for-analysis while leaving web browsing active, or restrict Agent Mode without touching image generation. Full Lockdown Mode is the all-off preset. The per-action controls let admins shape the attack surface more surgically rather than accepting the full capability hit.
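One way to picture the relationship between the preset and the per-action controls is as a policy map from action to on/off. This is a sketch under assumptions: the action identifiers and the `build_policy` helper are invented for illustration and do not reflect OpenAI's admin console or API.

```python
from typing import Iterable

# Hypothetical identifiers for the capabilities named in this article.
ACTIONS = (
    "web_browsing",
    "image_generation",
    "deep_research",
    "agent_mode",
    "canvas_network_approval",
    "file_download_for_analysis",
)


def build_policy(lockdown: bool = False, disabled: Iterable[str] = ()) -> dict:
    """Return a map of action name to whether it is enabled."""
    disabled_set = set(disabled)
    unknown = disabled_set - set(ACTIONS)
    if unknown:
        raise ValueError(f"unknown actions: {sorted(unknown)}")
    if lockdown:
        # Full Lockdown Mode is the all-off preset.
        return {action: False for action in ACTIONS}
    # Otherwise only the explicitly listed actions are switched off.
    return {action: action not in disabled_set for action in ACTIONS}
```

For example, `build_policy(disabled=["file_download_for_analysis"])` leaves web browsing on while closing the download path, whereas `build_policy(lockdown=True)` turns everything off regardless of the list.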

Do Google or Anthropic offer a comparable network-containment toggle for their chat products?

As of late May 2026, no verified public information exists about equivalent infrastructure-level kill switches for Google Gemini or Anthropic Claude. OpenAI is first to ship a named, toggleable mode that cuts live network access at the platform layer. Competitors have not announced matching features.

What security infrastructure does Lockdown Mode layer on top of?

Lockdown Mode sits above an existing enterprise stack that includes sandboxed execution environments, anomaly detection monitoring, and structured access controls. The DNS side-channel that Check Point disclosed bypassed the sandbox isolation boundary, which suggests those lower layers have gaps the network cutoff is meant to cover when the sandbox fails.

How do Elevated Risk labels decide a session has expanded risk?

According to Creati.ai, the labels rely on a separate classification model running in parallel with the user’s chat session, scanning input patterns for jailbreak attempts, sycophancy exploitation, and data exfiltration commands. OpenAI has not independently confirmed this architecture. What is confirmed is that the labels are informational indicators, not enforcement mechanisms, and are scheduled for removal as mitigations improve.

How many ChatGPT users lack access to Lockdown Mode right now?

ChatGPT passed 900 million weekly active users in February 2026. Lockdown Mode is restricted to Enterprise, Edu, Healthcare, and Teachers plans, so the overwhelming majority of those users have no network-containment toggle. OpenAI has said only that consumer availability will arrive in the coming months, with no firm date.

Sources (3 cited)

  1. OpenAI Ships Lockdown Mode and Elevated Risk Labels for ChatGPT Sessions (analysis, accessed 2026-06-04)
  2. OpenAI Launches Lockdown Mode and Elevated Risk Labels to Combat Prompt Injection Attacks in ChatGPT (analysis, accessed 2026-06-04)
  3. OpenAI adds Lockdown Mode and Elevated Risk labels to strengthen ChatGPT security (primary, accessed 2026-06-04)