Reading Vercel's Fluid Compute vs Cloudflare Workers Benchmark

Benchmarks published by independent developer Theo Browne and reported by Vercel show Fluid Compute running 1.2x to 5x faster than Cloudflare Workers across five workloads, averaging 2.55x. The number that matters for teams actually choosing a serverless runtime, though, is the one neither side benchmarked: cost at production request volumes, which can diverge by an order of magnitude because the two platforms bill for fundamentally different things.

What the benchmarks measured

The benchmarks ran 100 iterations of each of five workloads: Next.js, React SSR, SvelteKit, a pure-math task, and vanilla JavaScript. The headline: Fluid Compute averaged 2.55x faster than Cloudflare Workers across the suite.

The methodology is worth reading closely. Cloudflare Workers ran with shared CPU and 128 MB RAM, which is the platform’s default production configuration. Fluid Compute ran with 2 vCPU and 4 GB RAM, which is its default. These reflect what each vendor ships, but they are not equivalent resource allocations. A benchmark that puts a 4 GB container against a 128 MB isolate and reports “faster” is technically accurate and practically misleading if you treat the ratio as a property of the runtime rather than the tier.

Vercel’s blog post also noted that roughly one in five Cloudflare Workers requests on Next.js and SvelteKit took over 10 seconds on tasks averaging 1.2 seconds. Cloudflare Workers’ Next.js response times ranged from 0.800s to 3.971s, a 3.171-second spread. That tail latency is a more useful finding than the average ratio, because it points to a real scheduling pathology rather than a raw compute deficit.
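To see why a tail can matter more than a mean, here is a minimal sketch that summarizes a latency sample with a nearest-rank percentile. The sample values are hypothetical, chosen only to mimic a "one in five requests is slow" shape; they are not the benchmark's data, and the function names are illustrative.

```python
import math
import statistics

def nearest_rank(sorted_samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    rank = max(1, math.ceil(pct * len(sorted_samples) / 100))
    return sorted_samples[rank - 1]

def summarize(latencies_s):
    """Return mean, median, p95, and max for a list of latencies in seconds."""
    ordered = sorted(latencies_s)
    return {
        "mean": statistics.mean(ordered),
        "p50": nearest_rank(ordered, 50),
        "p95": nearest_rank(ordered, 95),
        "max": ordered[-1],
    }

# Hypothetical sample: four in five requests take 1.0 s, one in five takes 10.0 s.
samples = [1.0] * 8 + [10.0] * 2
summary = summarize(samples)
print(summary)
```

In this made-up sample the median stays at 1.0 s while the mean is dragged to 2.8 s and the p95 sits at 10.0 s, which is why a single average ratio can hide what a fifth of users experience.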

Cloudflare’s engineering response

Cloudflare published a detailed technical response attributing the initial gap to a range of infrastructure tuning issues and library-level differences between the platforms.

On investigation, Cloudflare found that a warm-isolate routing heuristic was poorly suited to CPU-bound burst workloads, introducing queuing delays when requests arrived in clusters. A long-standing V8 garbage-collector tuning parameter was also constraining performance; adjusting it yielded a measurable improvement with only a small memory increase, and the fix is now live for all Workers.

Notably, Cloudflare clarified that the scheduling problem inflated wall-clock latency but not billable CPU time. Workers pricing is CPU-time-based, so the queuing delay did not increase customer bills. Whether that is reassuring depends on whether you are watching the invoice or the latency your users experience.
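The distinction between wall-clock time and CPU time can be observed in any process. The sketch below is an analogy only, not Cloudflare's metering: it uses Python's standard clocks, with `time.sleep` standing in for time a request spends queued.

```python
import time

def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent while running fn."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start

# time.sleep stands in for queuing: the wall clock advances while the CPU sits idle.
wall_s, cpu_s = measure(lambda: time.sleep(0.2))
print(f"wall={wall_s:.3f}s cpu={cpu_s:.3f}s")
```

The wall-clock figure includes the wait; the CPU figure barely moves. A bill based on the second number is unaffected by queuing, while a user only ever perceives the first.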

Cloudflare also identified buffer-allocation inefficiencies in the OpenNext adapter and upstreamed fixes for hot paths. After the fixes, Cloudflare reports parity on all benchmarks except Next.js.

What neither side measured

The benchmark suite tested warm-path throughput on CPU-bound SSR renders. Three things production teams actually care about went unmeasured.

Cold starts. Workers uses V8 isolates, which have no container spin-up overhead. Third-party analysis reports Workers cold starts near 0 ms, against 50 to 250 ms for Vercel’s Edge Runtime and 200 to 500 ms or more for Node.js functions. For traffic patterns with bursty or low-volume endpoints, cold-start latency dominates the user experience, and the structural advantage belongs to the isolate model.

I/O-bound workloads. The benchmarks tested compute-heavy SSR renders. Most production serverless functions spend their time waiting on database queries, API calls, and cache reads. Vercel’s infrastructure runs on AWS, and Fluid Compute deploys in-region with the customer’s database, which Vercel argues reduces I/O latency for functions making multiple database round trips per request. That claim is plausible but unbenchmarked by either side.

Multi-region tail latency. As of mid-2026, Cloudflare operates over 300 PoPs to Vercel’s roughly 100 edge regions. For globally distributed traffic, the probability of hitting a nearby PoP matters more than average throughput on a single-region benchmark.

The billing model is the actual decision axis

Raw latency is a tiebreaker. Billing model is the constraint. The two platforms charge for compute in fundamentally different ways, and the gap compounds with volume.

Workers bills per request plus CPU time; Vercel bills in GB-hours, that is, allocated memory multiplied by execution duration. A function that uses 5 ms of CPU on a 128 MB isolate costs almost nothing on Workers. The same function on Vercel is billed as if it held its entire memory allocation for the wall-clock duration, including any time spent waiting on I/O. For high-volume, CPU-light workloads, this billing-model asymmetry can produce cost differences of an order of magnitude.
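The asymmetry can be made concrete with a minimal sketch of the two billable quantities as this section describes them. The function names, the 95 ms of I/O wait, and the 1 GB allocation are all illustrative assumptions, not either vendor's actual metering code or defaults.

```python
def billed_cpu_ms(cpu_ms: float, wait_ms: float) -> float:
    """CPU-time model: time spent waiting on I/O is not billed."""
    return cpu_ms

def billed_gb_seconds(cpu_ms: float, wait_ms: float, memory_gb: float) -> float:
    """Allocation model: the memory allocation is billed for the whole wall-clock duration."""
    return memory_gb * (cpu_ms + wait_ms) / 1000.0

# Hypothetical request: 5 ms of CPU plus 95 ms waiting on a database, 1 GB allocated.
cpu_only = billed_gb_seconds(5, 0, 1.0)
with_wait = billed_gb_seconds(5, 95, 1.0)
print(billed_cpu_ms(5, 95), cpu_only, with_wait)
```

Under these assumptions the I/O wait leaves the CPU-time quantity unchanged but multiplies the allocation-based quantity by about 20. How much of that shows up on an invoice depends on the actual rates and allowances of each plan.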

Where each platform genuinely wins

The resource ceilings tell the story more clearly than any benchmark ratio.

As of mid-2026, Workers caps at 128 MB memory and 30 seconds CPU time on the Paid plan. Vercel Node.js functions support up to 800 seconds execution and 3 GB memory on Pro and above. For sustained compute workloads (PDF generation, video processing, heavy SSR with large bundles), the resource ceiling favors Vercel.

Factor	Cloudflare Workers	Vercel Fluid Compute
Runtime model	V8 isolates	Node.js containers on AWS
Cold start	~0 ms	~50–500 ms depending on tier
Memory ceiling	128 MB	3 GB
Time ceiling	30 s CPU (Paid)	800 s execution (Pro+)
Billing unit	CPU-millisecond	GB-hour
PoPs	300+	~100
Best for	I/O-light, high-volume, globally distributed	Compute-heavy SSR, sustained workloads, in-region DB access

For a high-traffic API gateway routing requests to downstream services, Workers’ per-CPU-ms billing and isolate cold starts are structurally superior. For a Next.js app doing multiple heavy database queries per page render, Vercel’s in-region deployment and higher resource ceiling are the better fit. The benchmark debate is noise if you haven’t first answered which column describes your workload.

Frequently Asked Questions

What does each platform cost at 50 million monthly requests with light CPU usage?

At 50M requests/month averaging 5 ms of CPU time, Workers Paid runs roughly $21: the $5 base, about $12 for the 40 million requests beyond the 10 million included at $0.30 per million, and about $4.40 for the 220 million CPU-ms beyond the 30 million included at $0.02 per million. Vercel Pro bills $20 per seat per month plus approximately $0.18 per GB-hour for Edge functions, pushing the same profile to an estimated $80 to $150 or more. The multiplier holds because Vercel charges against the full memory allocation for the wall-clock duration, including time the function spends idle on I/O.
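One way to sanity-check the Workers side of an estimate like this is to plug the quoted rates into a small calculator. The sketch below assumes the Paid plan includes 10 million requests and 30 million CPU-ms per month before overage applies; treat those allowances, and the function name, as assumptions to verify against the current pricing page.

```python
def workers_paid_monthly_usd(
    requests: int,
    cpu_ms_per_request: float,
    *,
    base_usd: float = 5.0,                  # monthly base quoted above
    usd_per_million_requests: float = 0.30,  # overage rate quoted above
    usd_per_million_cpu_ms: float = 0.02,    # overage rate quoted above
    included_requests: int = 10_000_000,     # assumed monthly allowance
    included_cpu_ms: int = 30_000_000,       # assumed monthly allowance
) -> float:
    """Estimate a monthly Workers Paid bill from request count and mean CPU time."""
    extra_requests = max(0, requests - included_requests)
    extra_cpu_ms = max(0.0, requests * cpu_ms_per_request - included_cpu_ms)
    return (
        base_usd
        + extra_requests / 1_000_000 * usd_per_million_requests
        + extra_cpu_ms / 1_000_000 * usd_per_million_cpu_ms
    )

estimate = workers_paid_monthly_usd(50_000_000, 5)
print(f"${estimate:.2f}")
```

With these inputs the sketch comes out near $21: $5 base, $12 in request overage, and $4.40 in CPU overage. Changing `cpu_ms_per_request` shows how quickly the CPU term grows for heavier handlers.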

How much did the V8 garbage-collector tuning actually improve Workers performance?

Relaxing a young-generation size configuration that had been unchanged since June 2017 yielded roughly a 25 percent boost on the benchmark workloads with only a small increase in per-isolate memory. The tuning is live for all Workers users, so any CPU-bound function benefits regardless of framework. The original setting predated Workers itself and was never revisited as V8 evolved.

What happens when a Workers function hits the 30-second CPU limit?

The invocation is terminated with an error. There is no automatic escalation to a higher tier. Functions that periodically exceed 30 seconds of CPU time must be restructured into smaller, chained calls or moved to a platform with a longer ceiling. Vercel Node.js functions on Pro plans support up to 800 seconds, making them viable for PDF generation, video processing, or other sustained-compute tasks that Workers cannot accommodate.
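A minimal sketch of the "smaller, chained calls" pattern follows. Everything here is hypothetical: the per-item cost estimator, the budget value, and the loop that stands in for re-invoking the function (in practice each slice would be a separate invocation, triggered for example by a queue message carrying the cursor).

```python
from typing import Callable, Optional, Sequence, Tuple

def run_slice(
    items: Sequence[int],
    cursor: int,
    cpu_budget_ms: float,
    est_cost_ms: Callable[[int], float],
) -> Tuple[int, Optional[int]]:
    """Process items from cursor until the estimated CPU budget would be exceeded.

    Returns (items_processed, next_cursor); next_cursor is None when all work is done.
    """
    spent = 0.0
    i = cursor
    while i < len(items):
        cost = est_cost_ms(items[i])
        # Always handle at least one item per slice so the chain makes progress.
        if spent + cost > cpu_budget_ms and i > cursor:
            return i - cursor, i
        spent += cost  # real work on items[i] would happen here
        i += 1
    return i - cursor, None

def run_chained(items, cpu_budget_ms, est_cost_ms) -> int:
    """Simulate chained invocations; returns how many slices were needed."""
    invocations = 0
    cursor: Optional[int] = 0
    while cursor is not None:
        _, cursor = run_slice(items, cursor, cpu_budget_ms, est_cost_ms)
        invocations += 1
    return invocations

# Hypothetical job: 10 items at an estimated 10 s of CPU each, 30 s budget per call.
slices = run_chained(list(range(10)), 30_000, lambda _item: 10_000)
print(slices)
```

Here the 100 seconds of estimated CPU work is split into four invocations of at most 30 seconds each. The trade-off is added orchestration and the need for each slice to be resumable from a cursor.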

Do the benchmark results apply to API-only workloads with no server-side rendering?

Not directly. All five benchmarks tested SSR paths (Next.js, React SSR, SvelteKit) or synthetic CPU tasks. Pure API routing, JSON serialization, and request proxying involve minimal rendering and far less CPU per request. On those workloads, Workers’ near-zero cold start and per-CPU-ms billing give it a structural advantage the SSR benchmarks were not designed to measure.