In early March 2026, a red-team security startup deployed an autonomous AI agent against McKinsey’s internal AI platform and gained read-write access to 46.5 million chat messages, 728,000 confidential files, and the system prompts controlling how the chatbot advises 43,000 consultants—using SQL injection, a technique older than most junior developers.
The implications reach further than one firm’s misconfigured database. Enterprise AI adoption has sprinted ahead of security fundamentals, and that gap is now actively exploitable.
## What Is McKinsey’s Lilli, and Why Does It Matter?
Lilli is McKinsey’s firmwide generative AI platform, named for Lillian Dombrowski, the first professional woman hired by the firm in 1945.[^1] Since its full rollout in July 2023, more than 75% of McKinsey’s 43,000 employees use it monthly, generating over 500,000 prompts per month. Consultants rely on it to search the firm’s entire intellectual property—over 100,000 documents and decades of proprietary research—synthesize findings, draft proposals, and locate internal subject matter experts.[^2]
Critically, Lilli is the only internal platform that employees are permitted to use with confidential client data. Everything flows through it: strategy engagements, merger discussions, financial modeling. That single fact makes its security posture a matter of significant consequence—for McKinsey’s clients, not just the firm itself.
## How the Attack Worked
On February 28, 2026, CodeWall’s autonomous offensive security agent began mapping Lilli’s attack surface with no credentials and no insider knowledge. What it found formed an attack chain that required no novel techniques.[^3]
Step one: unauthenticated API surface. The agent discovered over 200 API endpoints. Of those, 22 required no authentication whatsoever. One of those endpoints wrote user search queries to a backend database.
Step two: an injection point that standard tools missed. The endpoint safely parameterized SQL values—the standard defense against injection—but directly concatenated JSON field names (keys) into the SQL query. Most automated scanning tools, including OWASP ZAP, test value injection. They do not test key injection. The agent identified that database error messages reflected JSON keys verbatim, signaling an injection point.
Step three: blind SQL injection iterations. Without direct database access, the agent ran blind SQL injection iterations, reading error messages to reconstruct the query structure, until it gained read-write access to the production database.
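CodeWall has not published its payloads, but the probing pattern behind blind injection is easy to simulate: treat the application’s error behavior as a yes/no oracle and extend a guess one character at a time. A toy sketch, not CodeWall’s actual tooling—the `error_oracle` function below stands in for Lilli’s verbose database errors, and the table name is invented:

```python
import string

# Hypothetical hidden identifier the attacker wants to recover.
SECRET_TABLE = "chat_messages"

def error_oracle(payload):
    """Simulates an endpoint whose error message reveals whether a
    guessed prefix matches the hidden identifier."""
    return SECRET_TABLE.startswith(payload)

def recover_identifier(oracle, alphabet=string.ascii_lowercase + "_"):
    # Extend the known prefix one character at a time -- the
    # iterative probing pattern blind SQL injection relies on.
    recovered = ""
    while True:
        for ch in alphabet:
            if oracle(recovered + ch):
                recovered += ch
                break
        else:
            return recovered  # no character extends the prefix: done

print(recover_identifier(error_oracle))  # chat_messages
```

Each probe leaks one character, which is why blind injection takes many requests but needs no direct query output—only an observable difference between success and failure.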
Step four: IDOR chaining. The agent also identified an Insecure Direct Object Reference (IDOR) vulnerability—a broken authorization check—that allowed cross-user data access when chained with the SQL flaw.
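The two flaws compound because IDOR is an authorization bug, not an injection bug: the endpoint fetches whatever object id it is handed without checking who is asking. A minimal sketch of the broken pattern and its fix—the record store, ids, and user names here are hypothetical:

```python
# Illustrative in-memory store; ids, owners, and contents are made up.
MESSAGES = {
    101: {"owner": "alice", "text": "merger draft"},
    102: {"owner": "bob", "text": "pricing model"},
}

def get_message_vulnerable(message_id, requesting_user):
    # IDOR: the requesting user is never checked, so any authenticated
    # caller can read any record just by iterating over ids.
    return MESSAGES[message_id]["text"]

def get_message_fixed(message_id, requesting_user):
    # Object-level authorization: ownership is verified per record.
    record = MESSAGES[message_id]
    if record["owner"] != requesting_user:
        raise PermissionError("not authorized for this object")
    return record["text"]

print(get_message_vulnerable(102, "alice"))  # leaks bob's data
try:
    get_message_fixed(102, "alice")
except PermissionError:
    print("blocked")  # the ownership check stops the cross-user read
```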
CodeWall CEO Paul Price stated the process was “fully autonomous from researching the target, analyzing, attacking, and reporting.”[^4]
## The Attack Pattern in Code
The core vulnerability pattern is worth illustrating. A safely parameterized query looks like this:

```python
cursor.execute("SELECT * FROM queries WHERE user_id = ?", (user_id,))
```

The Lilli endpoint behavior resembled this anti-pattern:

```python
field_name = request.json.get("fieldName")
query = f"SELECT * FROM queries ORDER BY {field_name}"
cursor.execute(query)
```

When `field_name` is `"name; DROP TABLE queries--"`, the database executes the injected SQL. When error messages reflect the field name back, an attacker can enumerate the database structure through deliberate errors. Standard scanners never send malformed keys, only malformed values—which is why this class of vulnerability routinely survives automated security scans.
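Because identifiers cannot be bound as query parameters, the standard defense is to validate them against an allowlist before interpolation. A runnable sketch using Python’s sqlite3—the table and column names are illustrative, not Lilli’s actual schema:

```python
import sqlite3

# Illustrative schema; these column names stand in for the real fields.
ALLOWED_SORT_FIELDS = {"name", "created_at", "user_id"}

def fetch_queries_sorted(conn, field_name):
    # Identifiers cannot be parameterized, so they must be validated
    # against a fixed allowlist before the query string is built.
    if field_name not in ALLOWED_SORT_FIELDS:
        raise ValueError(f"unsupported sort field: {field_name!r}")
    return conn.execute(
        f"SELECT * FROM queries ORDER BY {field_name}"
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE queries (user_id INTEGER, name TEXT, created_at TEXT)")
conn.execute("INSERT INTO queries VALUES (1, 'b', '2026-01-02'), (2, 'a', '2026-01-01')")

print([row[1] for row in fetch_queries_sorted(conn, "name")])  # ['a', 'b']
try:
    fetch_queries_sorted(conn, "name; DROP TABLE queries--")
except ValueError as exc:
    print("rejected:", exc)
```

The allowlist is deliberately closed: anything not explicitly named is rejected, so a malicious key never reaches the SQL string at all.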
## Why This Is Classical AppSec, Not an AI Failure
The most important analytical point about this incident: the AI model itself was not compromised. No jailbreaking occurred. No prompt injection exploited the LLM. Lilli’s underlying language model behaved exactly as designed throughout.
Promptfoo’s analysis of the incident notes: “the model became the interface to a compromised application” rather than the security failure itself.[^5] The vulnerability chain—unauthenticated endpoints, SQL injection, broken object-level authorization—is textbook OWASP Top 10 territory, present in enterprise software for decades.
That architectural choice is what made the incident notable. Because the SQL injection granted write access to the same database storing Lilli’s system prompts, an attacker could theoretically rewrite how the chatbot behaves for 43,000 consultants—which guardrails it follows, how it cites sources, what advice it generates—without deploying new code or triggering standard security alerts. As Promptfoo noted: “A write can become a prompt change. A metadata edit can change what the system retrieves.”[^5]
## The Broader Enterprise AI Security Gap
The McKinsey incident is not an anomaly—it is a data point in a measurable pattern.
Industry research indicates that while 94% of enterprises deploy AI in production, only 23% have mature AI security programs.[^6] That gap between adoption speed and security maturity is where attackers operate.
| Metric | Statistic | Source |
|---|---|---|
| Enterprises using AI in production | 94% | Practical DevSecOps, 2026 |
| Organizations with mature AI security | 23% | Practical DevSecOps, 2026 |
| Orgs lacking full AI risk visibility | 64% | CSA AI Security Governance Report |
| Security teams saying AI threats outpace expertise | 59% | Microsoft Data Security Index 2026 |
The CrowdStrike 2026 Global Threat Report documents adversaries actively injecting malicious prompts into GenAI tools at more than 90 organizations, with AI-enabled attack activity rising 89% year-over-year.[^7] eSecurity Planet reported over 91,000 attack sessions targeting AI deployments between October 2025 and January 2026.[^8]
The OWASP Top 10 for LLM Applications 2025 lists the specific failure modes driving these incidents: prompt injection, sensitive information disclosure, system prompt leakage, vector and embedding weaknesses, and excessive agency.[^9] The vulnerability that standard tools missed—JSON key injection—maps to inadequate input validation, a risk class that has appeared in every version of the OWASP Top 10 for traditional web applications since 2003.
## What Holds Up—and What Doesn’t
Responsible reporting requires acknowledging critical analysis. Edward Kiledjian’s examination of CodeWall’s disclosure raises substantive questions: the firm provided no proof-of-concept payloads, no file hashes, no screenshots substantiating the claimed data access.[^10] There is a meaningful difference between data that was theoretically reachable via SQL injection and data that was actually accessed and exfiltrated—CodeWall’s public writeup does not clearly distinguish these categories.
The timeline for blind SQL injection is also difficult to verify without a detailed technical breakdown. Blind injection typically progresses slowly, recovering structure one inference per request.
McKinsey’s forensic investigation, conducted by an external firm, “identified no evidence that client data or client confidential information were accessed by this researcher or any other unauthorized third party.”[^4]
What is not disputed: the technical attack chain was real, McKinsey patched it within 24 hours of disclosure, and the vulnerability had persisted for over two years despite internal security scanning.
## What Practitioners Need to Do Now
The actionable remediation from this incident is straightforward, if not always practiced:
- Audit SQL construction for identifier injection. Check everywhere that field names, column names, table names, or ORDER BY parameters are dynamically constructed. Parameterization of values is not sufficient.
- Treat system prompts as crown-jewel assets. Store them in governed, versioned configuration with change logging—not as mutable rows in an application database.
- Map your unauthenticated API surface. Run authentication audits against every endpoint, not just those flagged in threat models. Twenty-two unauthenticated endpoints in a production system handling confidential client data represents a systematic gap in security review.
- Include AI-specific coverage in penetration tests. Traditional scanners do not test AI-specific attack surfaces. The OWASP LLM Top 10 provides the testing framework; ensure your pen tests use it.
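The authentication audit in the third item can start as a simple sweep: send each endpoint an unauthenticated request and flag anything that answers with data instead of a 401/403. A sketch with stubbed responses in place of a real HTTP client—all endpoint names here are hypothetical:

```python
# Stubbed status codes an unauthenticated probe might receive; in a real
# audit these would come from HTTP requests sent with no credentials.
FAKE_RESPONSES = {
    "/api/search": 200,          # returns data without auth -- a finding
    "/api/health": 200,          # intentionally public
    "/api/messages": 401,
    "/api/admin/prompts": 403,
}

# Endpoints deliberately exposed without authentication.
PUBLIC_ALLOWLIST = {"/api/health"}

def probe_unauthenticated(endpoint):
    # Stand-in for a request sent without any Authorization header.
    return FAKE_RESPONSES[endpoint]

def audit(endpoints):
    findings = []
    for endpoint in sorted(endpoints):
        status = probe_unauthenticated(endpoint)
        if status not in (401, 403) and endpoint not in PUBLIC_ALLOWLIST:
            findings.append(endpoint)
    return findings

print(audit(FAKE_RESPONSES))  # ['/api/search']
```

The explicit public allowlist is the key design choice: every unauthenticated endpoint must be a documented decision, not a default.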
Organizations with mature AI governance programs report 45% fewer security incidents and resolve breaches 70 days faster than those without formal oversight structures.[^6] The McKinsey incident demonstrates exactly what “without formal oversight structures” looks like in practice: a two-year-old SQL injection flaw in the backend of a system trusted with every confidential client conversation the firm conducts.
## Frequently Asked Questions
Q: Was McKinsey’s AI model (the LLM) itself hacked?

A: No. The language model was not compromised. The attack exploited classical application security failures—SQL injection and broken authentication—in the backend infrastructure supporting the AI platform. The LLM behaved as designed throughout.

Q: What made the SQL injection in this case unusual?

A: The injection point was JSON field names (keys), not values. Standard security scanning tools test for value injection and would not flag this vector. The backend safely parameterized values but concatenated key names directly into SQL, creating a blind spot in conventional automated testing.

Q: Why does write access to a system prompt database matter?

A: System prompts control how an AI responds—what guardrails it follows, how it cites sources, what it will and won’t discuss. Write access means an attacker can silently alter AI behavior across an entire organization without deploying code or triggering standard security monitoring.

Q: Did McKinsey confirm data was actually accessed?

A: McKinsey stated that a third-party forensic investigation “identified no evidence that client data or client confidential information were accessed by this researcher or any other unauthorized third party.” CodeWall’s claims about the volume of data accessed have not been independently verified.

Q: What’s the single highest-priority action for enterprise AI security teams?

A: Audit your SQL construction for identifier injection (field names, table names, ORDER BY parameters) and conduct an authentication audit across all AI platform API endpoints. These classical AppSec controls prevent the class of attack that compromised Lilli—and standard automated scanners will not catch them.
## Footnotes
[^1]: McKinsey & Company. “Meet Lilli, our generative AI tool.” McKinsey.com, 2023. https://www.mckinsey.com/about-us/new-at-mckinsey-blog/meet-lilli-our-generative-ai-tool
[^2]: McKinsey & Company. “Rewiring the way McKinsey works with Lilli.” McKinsey.com. https://www.mckinsey.com/capabilities/tech-and-ai/how-we-help-clients/rewiring-the-way-mckinsey-works-with-lilli
[^3]: CodeWall. “How We Hacked McKinsey’s AI Platform.” CodeWall.ai, March 2026. https://codewall.ai/blog/how-we-hacked-mckinseys-ai-platform
[^4]: Claburn, Thomas. “AI agent hacked McKinsey chatbot for read-write access.” The Register, March 9, 2026. https://www.theregister.com/2026/03/09/mckinsey_ai_chatbot_hacked/
[^5]: Promptfoo. “McKinsey’s Lilli Looks More Like an API Security Failure Than a Model Jailbreak.” Promptfoo.dev, March 2026. https://www.promptfoo.dev/blog/mckinsey-lilli-appsec-vs-ai-jailbreak/
[^6]: Practical DevSecOps. “AI Security Statistics 2026: Latest Data, Trends & Research Report.” https://www.practical-devsecops.com/ai-security-statistics-2026-research-report/
[^7]: CrowdStrike. “2026 Global Threat Report: AI Accelerated Adversaries.” CrowdStrike.com, 2026. https://www.crowdstrike.com/en-us/press-releases/2026-crowdstrike-global-threat-report/
[^8]: eSecurity Planet. “AI Deployments Targeted in 91,000+ Attack Sessions.” https://www.esecurityplanet.com/threats/ai-deployments-targeted-in-91000-attack-sessions/
[^9]: OWASP GenAI Security Project. “2025 Top 10 Risk & Mitigations for LLMs and Gen AI Apps.” genai.owasp.org. https://genai.owasp.org/llm-top-10/
[^10]: Kiledjian, Edward. “CodeWall says it hacked McKinsey’s AI platform. Here’s what holds up—and what doesn’t.” Kiledjian.com, March 10, 2026. https://kiledjian.com/2026/03/10/codewall-says-it-hacked-mckinseys.html