ShareLock Splits MCP Poisoning Across Tools, Defeating Per-Tool Scanners by Construction

ShareLock, a multi-tool threshold poisoning attack described in an arXiv preprint submitted June 25, 2026, splits a malicious instruction into cryptographic shares spread across several MCP tool descriptions. No single description carries enough of the payload to trip a per-server scanner; the shares recombine into the attack inside the agent’s context only after a threshold set of cooperating tools load. That reconstruction step is something no deployed MCP scanner checks for, which is the gap the paper is built around.

Why an agent treats a tool description as an instruction

MCP tool descriptions are not documentation; they are instructions the model reads and acts on as part of its context window. The Model Context Protocol spec defines a server as a program that exposes tools, each with a name and a natural-language description, to a host such as Claude, ChatGPT, or VS Code. When the agent decides which tool to call, it is reading those descriptions the same way it reads any other prompt. A description that says a tool “reads the user’s calendar and posts a summary to a webhook” is, to the model, an instruction set with the same standing as the user’s own message.

That makes tool descriptions a prompt-injection surface, not a code-injection surface. The risk is not that the server runs code the host never authorized; it is that the server controls text the model treats as authoritative. The documented MCP threat model already includes prompt injection, cross-tool orchestration, and rug-pull tool redefinitions. The assumption baked into that model is that each tool, examined on its own, either contains a malicious instruction or it does not. ShareLock attacks exactly that assumption.

How single-tool poisoning and the scanner arms race worked

The first generation of defenses assumed the payload lived in one description, because the first generation of attacks put it there.

The paper labels the prevailing approach the “monolithic plaintext embedding paradigm”: the entire malicious instruction sits in a single tool description as readable text. It “fails to withstand manual inspection or automated detectors,” the authors write, because a string of suspicious prompt-injection language in one description is precisely what a scanner is built to find. That is the failure mode commercial products target. The eSully MCP Scanner, a representative per-server product, inspects individual tool descriptions, variable names, and prompt content for malicious patterns. Its architecture is one server in, one verdict out.

That architecture works against monolithic payloads. It is also exactly what ShareLock is designed to defeat.

How ShareLock splits one payload across many tools

ShareLock applies Shamir’s threshold secret-sharing scheme to tool description poisoning, turning one obvious attack into several boring-looking descriptions.

Shamir’s scheme is the (k, n) construction behind most threshold cryptography: split a secret S into n shares such that any k of them reconstruct S, while any k-1 reveal nothing, not even partial information. ShareLock replaces the secret with a malicious instruction and distributes the resulting shares across multiple tool descriptions, the paper’s abstract states. A “covert reconstruction trigger” planted during a server update then causes the agent to aggregate the loaded shares and reconstruct the hidden instruction, which the authors describe as producing “critical breaches of system assets or private data.”

The abstract does not specify the k and n parameters the authors used, nor which tools or servers carried shares in each scenario; those values live in the paper’s body and tables. What the abstract does establish is the shape of the attack: it does not live in any one tool. It lives in the set.

Information-theoretic secrecy is the property that makes ShareLock invisible to a per-tool scanner, and it is a mathematical guarantee rather than an implementation gap.

Shamir’s scheme provides information-theoretic secrecy: with fewer than k shares, the secret is uniformly random from the observer’s point of view. This is not “hard to detect” in the computational sense that AES is hard to break; it is that any k-1 shares contain literally zero information about the secret. So a scanner that reads tool descriptions one at a time is evaluating inputs that are, by construction, indistinguishable from honest shares of a legitimate instruction. There is no pattern to find, because the pattern only exists in the aggregate.

This is the structural gap the established per-tool threat model and the Elastic Security Labs taxonomy do not account for. Elastic’s September 2025 work does flag “orchestration injection” as a distinct category, describing attacks that use multiple tools across different servers or agents, but frames it as prompt-level manipulation rather than threshold secret sharing. ShareLock hardens that same vector cryptographically: the cross-tool cooperation is not visible in the prose, only in the combination.

The closest supply-chain analog is a malicious package that is individually benign and only misbehaves when a specific set of dependencies co-resolves. Each package passes review on its own; the compromise exists only in the dependency set. Per-tool MCP scanners face the identical blind spot, with the added wrinkle that “the set” is decided at runtime by whatever the agent chooses to load.

What ShareLock’s benchmark actually measured

Across mainstream LLMs and two MCP clients, the authors report ShareLock holding above 90% attack success while beating single-tool poisoning on description-based detection.

The evaluation, as described in the abstract, covers four multi-tool scenarios tested across mainstream LLMs on two distinct MCP clients. The headline result is twofold: ShareLock “significantly outperforms existing single-tool poisoning strategies in tool description-based detection,” and it does so “while maintaining an average attack success rate exceeding 90%.” Both are the paper’s own measurements, reported in the abstract rather than the full results tables, so the per-model and per-scenario breakdowns remain to be verified against the PDF.

For context on where per-tool and per-skill scanners already top out against simpler attacks, a separate June 2026 benchmark is instructive. SkillHarm (arXiv:2606.02540), from Ohio State, Amazon AGI, and Stanford, tested skill-package attacks against agent defenses and found existing scanners nearly unable to reliably stop them. SkillHarm targets skill packages (the SKILL.md manifest format used by agent frameworks), not MCP tool descriptions, so its numbers are not ShareLock’s numbers. They do bracket the ceiling: when defenses already struggle against attacks that carry their payload intact in a single place, an approach whose payload is provably undetectable on a per-tool basis starts from a much lower floor.

What catching it would require

Catching ShareLock requires cross-tool, session-level correlation, a primitive no current MCP scanner performs and that per-server auditing cannot produce.

A per-server scanner never sees more than the shares that arrived from one source. To detect a threshold attack, a defender would need to observe the agent’s loaded tool set over a session, run the same secret-sharing reconstruction an attacker would, and check whether the reconstructed output is a coherent malicious instruction. That requires the scanner to model not a single tool but a combination of tools, across servers, across the lifetime of a context window. It is a session primitive, not a static-analysis primitive.

The practical consequence is that the assurance a marketplace gives when it scans each submitted server for malicious descriptions is narrower than it looks: the scan rules out monolithic payloads, not cooperative ones. Two clean-looking servers, each scanned and passed, can still form a threshold set once an agent loads both.

What it means for MCP as a shared standard

With MCP now a Linux Foundation standard adopted across Anthropic, OpenAI, and Google DeepMind, the gap ShareLock exposes lands on whoever governs server trust.

The Model Context Protocol was introduced by Anthropic in November 2024, subsequently adopted by OpenAI and Google DeepMind, and donated to the Agentic AI Foundation under the Linux Foundation in December 2025. That trajectory is what makes a per-tool scanner gap durable rather than vendor-specific: the same protocol, the same tool-description format, and broadly the same agent context model now run across the three largest AI platforms. An attack that works against the protocol works against the ecosystem.

Two implications follow for anyone buying, building, or evaluating an MCP security tool. First, a scanner’s per-server verdict answers the wrong question for threshold attacks; the relevant unit is the loaded tool set, and a product that does not perform cross-tool reconstruction cannot rule ShareLock out, only rule monolithic poisoning out. Second, the trust boundary has moved from the manifest to the session. A defender who wants assurance against cooperative poisoning needs a record of which tools co-loaded and a way to reason about their union, neither of which the current scanning products advertise.

The paper does not ship a detector, and its results are preprint-grade as of this writing. What it does is redraw the threat model: MCP poisoning is no longer strictly a property of individual tool descriptions, and defenses that assume otherwise are solving yesterday’s attack.

Frequently Asked Questions

What detection rates did per-tool scanners achieve against simpler attacks before ShareLock appeared?

The June 2026 SkillHarm benchmark (Ohio State, Amazon AGI, Stanford) found existing skill-package scanners caught only 55.6% of fixed-payload single-tool attacks and 68.8% of self-mutating variants. Those percentages cover attacks that carry their full payload in one location; a threshold scheme whose individual shares contain provably zero signal is not a harder version of that problem but a categorically different one.

Which MCP threat classes were publicly documented before ShareLock?

Security researchers in April 2025 catalogued prompt injection, tool permissions enabling data exfiltration, and lookalike tools that silently replace trusted servers. Elastic Security Labs added an orchestration injection category in September 2025 covering multi-server attacks but described it as prompt-level manipulation. None of those prior frameworks anticipated a cryptographic splitting layer where each share is individually indistinguishable from benign content.

Does a ShareLock attack require one operator to control all cooperating servers?

No. Shares are generated by whoever authors the payload, but each poisoned tool description can reside on a separately operated server. An attacker who compromises k independent servers through supply-chain means can distribute shares without owning a majority of a user’s typical session, which separates ShareLock’s threat model from single-operator attacks and complicates marketplace accountability.

How expensive would session-level cross-tool reconstruction checking be in practice?

Checking every possible subset of n loaded tools for a coherent malicious reconstruction requires evaluating C(n, k) subsets per candidate threshold k, which grows quickly as agents load more tools. A practical scanner would need heuristics such as grouping tools by server origin or co-loading frequency, but those heuristics create exploitable gaps: an attacker can design shares to appear uncorrelated across servers and sidestep the filter.