Vercel's Remend Turns Streaming-Markdown Repair Into a Dependency

Vercel’s Remend (npm i remend), announced 3 December 2025, detects and closes the unterminated Markdown blocks LLM token streams leave behind: open code fences, dangling bold and italic markers, broken links, half-finished list items. It runs as a pre-processor before any Markdown renderer and rewrites its output as the real closing tokens arrive. Packaged as a standalone dependency, it exposes the termination logic Vercel previously shipped only inside its Streamdown renderer.

Why does token streaming break Markdown rendering?

Models stream tokens, not documents, and Markdown’s grammar assumes it sees whole blocks. A code fence opened with triple backticks stays ambiguous until its matching close arrives dozens or hundreds of tokens later: the parser cannot tell a block that is still streaming from one that will never be closed. The same applies to bold and italic markers, link syntax where the URL is still arriving, and list items whose numbering depends on siblings not yet generated.

Renderers like react-markdown were built for static documents. They re-parse on every chunk and have no concept of a block still in flight. Feed them a half-streamed fence and they render the interior as plain text, or worse, as a live code block that flashes open and closed as tokens arrive. The flicker, the broken formatting, and the half-rendered code every streaming-chat user has seen all trace back to this mismatch.

Every team that has shipped a streaming surface has solved it with a private patch: a regex that closes dangling fences, a function that balances brackets, a useEffect that rewrites the buffer before it hits the renderer. Remend is that patch, published.
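
For a sense of what such a private patch looks like, here is a minimal sketch of a fence closer. The function name closeDanglingFence is hypothetical and this is not Remend's implementation; it handles only fenced code blocks and ignores the guards a production heuristic needs.

```typescript
// Hypothetical per-app patch (not Remend's API): if the buffer ends inside a
// fenced code block, append a matching closing fence.
function closeDanglingFence(markdown: string): string {
  let openFence: string | null = null; // marker of the fence currently open

  for (const line of markdown.split("\n")) {
    // A fence line starts with up to three spaces, then 3+ backticks or tildes.
    const match = /^ {0,3}(`{3,}|~{3,})/.exec(line);
    if (!match) continue;
    const marker = match[1];

    if (openFence === null) {
      openFence = marker; // opening fence, possibly followed by an info string
    } else if (
      marker[0] === openFence[0] &&
      marker.length >= openFence.length &&
      line.trim() === marker
    ) {
      openFence = null; // closing fence: same character, nothing after it
    }
  }

  if (openFence === null) return markdown; // nothing left open
  const separator = markdown.endsWith("\n") ? "" : "\n";
  return markdown + separator + openFence;
}
```

A buffer that ends mid-block comes back with a closing fence appended; a buffer whose fences are already balanced comes back unchanged. A fence marker that is itself only half-streamed (two backticks so far) is not detected, which is the kind of edge case that pushes teams toward a maintained package.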

What is Remend and how was it built?

Streamdown, Vercel’s open-source drop-in replacement for react-markdown, shipped on 21 August 2025. It was built specifically for AI streaming and bundles GFM, Shiki code highlighting, KaTeX, Mermaid, memoized rendering, and rehype-harden. Remend is the unterminated-block handling that Streamdown shipped with, now callable ahead of any renderer.

The corrected string is handed to whatever renderer sits behind it (unified, remark, rehype, or anything else), with Remend’s synthetic closures yielding to genuine closing tokens as they arrive later in the stream. Vercel describes the heuristics as “battle-tested in production AI applications,” with explicit guards against false positives on LaTeX underscores, product-code asterisks and underscores, list-item markers, and nested link brackets.
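
One way an app might wire a pre-processor like this is sketched below. It assumes the repair is re-run on the full accumulated buffer after every chunk, which is an assumption about the integration, not a description of Remend's internals. The names framesForRenderer and closeDanglingBold are hypothetical, and the stand-in repair is far simpler than Remend's heuristics.

```typescript
type Repair = (partial: string) => string;

// Stand-in repair (an assumption, not Remend): append a closing "**" when the
// buffer contains an odd number of bold markers.
function closeDanglingBold(partial: string): string {
  const markers = partial.match(/\*\*/g)?.length ?? 0;
  return markers % 2 === 1 ? partial + "**" : partial;
}

// Accumulates chunks and returns the string handed to the renderer after each
// one. The raw buffer is never modified, so a synthetic close is dropped as
// soon as the genuine closing marker shows up in the stream.
function framesForRenderer(chunks: string[], repair: Repair): string[] {
  let buffer = "";
  const frames: string[] = [];
  for (const chunk of chunks) {
    buffer += chunk;
    frames.push(repair(buffer));
  }
  return frames;
}

const frames = framesForRenderer(["This is **bo", "ld** text"], closeDanglingBold);
// frames[0] carries a synthetic close; frames[1] contains only the real one.
```

Keeping the raw buffer separate from the repaired string is the design choice that matters here: the repair stays a pure function of the text received so far, so nothing synthetic ever accumulates.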

Where does Remend fit in Vercel’s AI stack?

Remend sits at the rendering layer of a stack Vercel has been assembling since the AI SDK shipped in 2023 for building conversational streaming interfaces. Below it sit the model providers and the SDK’s streaming transport, consumed through useChat from @ai-sdk/react. Above it sit Streamdown, the react-markdown replacement, and the AI Elements and AI Cloud surfaces. The Streamdown README explicitly lists unterminated-block parsing via Remend as a feature and shows it consumed through useChat.

The architectural read matters more than the component list. Extracting the termination logic into its own package decouples it from Streamdown: any app already consuming the SDK’s stream through useChat can run the same repair in front of its existing renderer. The correction now lives one dependency below the renderer, not bound to it.

How does Remend compare to StreamMD and streamark?

Remend, StreamMD, and streamark each solve streaming Markdown from a different layer: Remend repairs the string before render, StreamMD restructures the render to skip redundant work, and streamark owns the entire parse.

StreamMD (altrusian) attacks re-render cost rather than termination. It uses block-level incremental parsing with React.memo on frozen blocks, so that only the active block re-renders as tokens arrive. The author claims O(1) per-token parsing. The demo page reports react-markdown performing roughly 500 re-parses over a 500-token stream against StreamMD’s ~20, in a 30kB bundle with zero runtime dependencies. As of June 2026, these figures are not independently benchmarked.
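
The block-level idea can be illustrated with a small cache. This is a sketch under stated assumptions and not StreamMD's code: blocks are split naively on blank lines (which would break a fenced block containing blank lines), parseBlock stands in for a real block parser, and the class name BlockCache is hypothetical.

```typescript
// Hypothetical illustration of block-level memoization; not StreamMD's code.
class BlockCache {
  private parsed = new Map<string, string>();
  parseCalls = 0; // how many times the block parser actually ran

  render(buffer: string, parseBlock: (block: string) => string): string[] {
    const blocks = buffer.split(/\n{2,}/);
    return blocks.map((block, index) => {
      const isActive = index === blocks.length - 1; // last block is still streaming
      const key = `${index}:${block}`;
      const cached = this.parsed.get(key);
      if (!isActive && cached !== undefined) return cached; // frozen block: reuse

      this.parseCalls += 1;
      const output = parseBlock(block);
      if (!isActive) this.parsed.set(key, output); // freeze once a later block exists
      return output;
    });
  }
}

const cache = new BlockCache();
const wrap = (block: string) => `<p>${block}</p>`;
let streamed = "";
let lastRender: string[] = [];
for (const chunk of ["# Title\n\n", "First para", "graph.\n\n", "Second"]) {
  streamed += chunk;
  lastRender = cache.render(streamed, wrap);
}
```

Each frozen block is parsed once after it stops changing, and only the trailing block is re-parsed per chunk. That shows where the savings come from, but it does not reproduce or validate the figures on StreamMD's demo page.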

streamark (siinghd) takes a third route: a character-level state machine that auto-completes dangling tokens, built with Bun, zero runtime dependencies, and an optional React wrapper. As of 25 May 2026 the repo is archived and flagged NOT READY.
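
A character-level approach can be sketched in a few lines. The version below is an illustration only, not streamark's implementation: the function name autoCompleteInline is hypothetical, and it tracks just two inline constructs, single-backtick code spans and double-asterisk bold.

```typescript
// Hypothetical sketch of a character-level state machine; not streamark's code.
// Walks the buffer once, tracking whether an inline code span or a bold run is
// open, then appends whatever closers are still missing.
function autoCompleteInline(partial: string): string {
  let inCode = false;
  let inBold = false;

  for (let i = 0; i < partial.length; i++) {
    const ch = partial[i];
    if (ch === "`") {
      inCode = !inCode; // a backtick toggles the code-span state
    } else if (!inCode && ch === "*" && partial[i + 1] === "*") {
      inBold = !inBold; // "**" toggles bold, but only outside code spans
      i++; // consume the second asterisk
    }
  }

  let completed = partial;
  if (inCode) completed += "`"; // close the innermost construct first
  if (inBold) completed += "**";
  return completed;
}
```

Because bold can only open outside a code span in this sketch, closing the code span before the bold run keeps the nesting valid. A full parser has to carry far more state than this, which is part of why owning the entire parse is the most expensive of the three routes.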

Library	Layer it works at	Handles partial input by	Status (June 2026)
Remend (Vercel)	Pre-processor, renderer-agnostic	Closing unterminated blocks before render	Active; extracted from Streamdown, 3 Dec 2025
StreamMD (altrusian)	Custom renderer	Memoizing frozen blocks, re-rendering only the active block	Active; performance claims self-reported by the author
streamark (siinghd)	Custom parser	Auto-completing dangling tokens in a state machine	Archived 25 May 2026

The three are not strict substitutes, since a pre-processor, a custom renderer, and a custom parser operate at different layers of the same pipeline.

When does framework-layer recovery mask real bugs?

A recovery heuristic is a safety net, and safety nets catch things you would rather know about. Remend silently closes fences and balances markers that should not have been left open in the first place. When the upstream cause is just the streaming protocol cutting a chunk mid-block, that is benign. When the cause is your prompt producing malformed output, your transport slicing tokens badly, or the model emitting broken Markdown at boundaries, Remend hides the symptom.

The practitioner move is to instrument, not just adopt. Log the unterminated cases Remend repairs. If the same block type fails open in the same place repeatedly, that is a token-boundary defect in your stack, and Remend is rendering it invisible.
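
A minimal sketch of that instrumentation, assuming the repair step is a plain string-to-string function. The wrapper withRepairLog and the RepairEvent shape are hypothetical; offsets here are character positions, since token offsets would have to come from the transport layer.

```typescript
type Repair = (partial: string) => string;

// Hypothetical log record; the field names are illustrative.
interface RepairEvent {
  offset: number; // length of the raw buffer when the repair fired (characters)
  appended: string; // text the repair added, when it is a pure suffix
  promptTemplate: string; // lets repairs be correlated with the prompt that caused them
}

// Wraps any repair function and records every call where it changed the input.
function withRepairLog(repair: Repair, promptTemplate: string, log: RepairEvent[]): Repair {
  return (partial) => {
    const repaired = repair(partial);
    if (repaired !== partial) {
      log.push({
        offset: partial.length,
        appended: repaired.startsWith(partial) ? repaired.slice(partial.length) : "",
        promptTemplate,
      });
    }
    return repaired;
  };
}
```

Grouping the resulting events by promptTemplate and appended text is one way to spot the repeat offenders: a closer that keeps being inserted at similar offsets for the same template points at the prompt or transport, not at ordinary chunking.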

The heuristic guards have known limits, too. The false-positive handling for LaTeX underscores and product-code asterisks is rule-based, so novel edge cases can still mis-close formatting. A dependency that fixes your Markdown is also a dependency whose failure mode is silently wrong formatting on input it did not anticipate.

Is streaming-Markdown repair now framework infrastructure?

The news peg is consolidation, not a launch. Remend shipped in December 2025; streamark was archived in May 2026. Across the niche, as of June 2026, the partial-Markdown hack is migrating from per-app code into packaged dependencies, and the framework-backed option is winning. Teams that once maintained a private fence-closing function now consume it from the same vendor that ships their rendering layer.

That is a net reduction in the cost of shipping a streaming surface. The tradeoff is that rendering correctness is no longer your code, which means its failure modes are no longer your code either. The teams that get the most out of Remend will be the ones who keep watching what it repairs.

Frequently Asked Questions

What exactly should teams log when instrumenting Remend’s repairs?

Track the block type being closed (fence, bold, link, list), the token offset where the synthetic close was inserted, and how long the real closing marker took to arrive. A fence that stays synthetically closed past 200 tokens often means the model stopped emitting the block, not that transport sliced it. Correlate by prompt template to find which prompts produce malformed output.

Does Remend sanitize or security-harden the repaired Markdown?

No. Remend only closes unterminated blocks; it does not strip scripts, rewrite dangerous URLs, or apply rehype-harden. Streamdown pairs Remend’s repair with a separate rehype-harden pass, so teams adopting Remend in front of react-markdown still need their own sanitization step for untrusted model output.

What does Remend do with malformed-but-already-terminated Markdown?

Remend’s contract is unterminated-block completion, not semantic repair. A GFM table that is technically closed but structurally broken, or a blockquote with mismatched nesting, passes through unchanged because the markers are balanced from Remend’s perspective. Genuinely malformed-but-closed output still depends on the downstream renderer’s own error tolerance.

Why can’t teams rely on StreamMD’s 25x-fewer-re-parses claim when choosing?

The 500-versus-20 re-parse figure and the O(1) per-token claim come from StreamMD’s own demo page, not an independent benchmark or a published methodology. Two of the three options also work at different layers (StreamMD restructures rendering, Remend repairs strings), so a raw re-parse count is not a like-for-like comparison even if the numbers were independently verified.

What would push mid-stream Markdown repair back into per-app code?

If a model family starts emitting well-formed Markdown at every token boundary, or if the transport layer begins sending block-complete chunks instead of fixed-size token slices, the unterminated-block problem shrinks and Remend becomes overhead. The consolidation thesis assumes token-by-token streaming with mid-block cuts stays the dominant pattern.