Postgres was never designed for serverless. Every connection forks a dedicated OS process, allocating 5 to 10 MB of memory, and the default max_connections is 100. Scale that to hundreds of concurrent Lambda invocations and the database spends more time managing processes than serving queries. For years the fix was PgBouncer, a thin middleware layer that multiplexes hundreds of application connections down to a handful of real Postgres sessions. Vercel’s Fluid Compute, now promoted as a flagship offering on the Vercel homepage, claims to make that middleware unnecessary by keeping instances warm across requests. The claim is plausible. The evidence is thin.
Why Postgres hates serverless
The process-per-connection model is not a bug in Postgres; it is the architecture. Each client gets its own backend process, its own memory context, and its own transaction state. This is fine for a traditional application server with a bounded pool of 20 to 50 persistent connections. It breaks when a serverless runtime spawns a new process per incoming HTTP request, each one opening its own TCP connection to the database.
At 500 concurrent connections, process scheduling on the database host becomes a bottleneck, according to a PgBouncer guide on dev.to. The default max_connections of 100 means the problem arrives sooner than most teams expect: a modest traffic spike on a serverless frontend can exhaust the connection limit while most of those connections are still waiting to run their first query.
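The arithmetic behind that pressure can be sketched in a few lines. This is a back-of-the-envelope estimate using only the figures quoted above (5 to 10 MB per backend, max_connections of 100); the function names are illustrative, and the model ignores reserved superuser slots and workload-dependent memory.

```python
# Figures taken from the text above; real per-backend memory depends on workload.
MB_PER_BACKEND_LOW = 5
MB_PER_BACKEND_HIGH = 10
DEFAULT_MAX_CONNECTIONS = 100


def backend_memory_mb(connections: int) -> tuple[int, int]:
    """Return the (low, high) memory estimate in MB for N backend processes."""
    return connections * MB_PER_BACKEND_LOW, connections * MB_PER_BACKEND_HIGH


def rejected_connections(concurrent_invocations: int,
                         max_connections: int = DEFAULT_MAX_CONNECTIONS) -> int:
    """Count invocations that cannot connect if each one opens its own connection."""
    return max(0, concurrent_invocations - max_connections)


if __name__ == "__main__":
    for n in (100, 300, 500):
        low, high = backend_memory_mb(n)
        print(f"{n} connections: {low}-{high} MB of backend memory, "
              f"{rejected_connections(n)} rejected at max_connections=100")
```

Under these assumptions, 500 one-connection-per-invocation clients would need 2.5 to 5 GB for backends alone if the limit were raised, and 400 of them are refused outright at the default limit.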
The solution the industry converged on was connection pooling: an intermediary that accepts many client connections but holds far fewer real connections to Postgres, reusing them across requests.
PgBouncer as mandatory infrastructure
PgBouncer became the default answer. In transaction mode, it can reduce 300 application connections to roughly 20 actual Postgres connections by returning each connection to the pool immediately after its transaction commits, per the same dev.to guide. The math is compelling enough that Azure Database for PostgreSQL now ships built-in PgBouncer, scaling to 10,000 client connections via async I/O on port 6432 alongside the database VM. Microsoft treats the pooler as essential infrastructure, not optional plumbing.
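A toy model shows why the 300-to-20 ratio is reachable. The sketch below is not PgBouncer's implementation; it assumes, purely for illustration, that each client spends one tick in a transaction out of every 15 and that clients are evenly staggered. The class and function names are hypothetical.

```python
from collections import deque


class TransactionPool:
    """Toy model of transaction pooling: a backend is held for one transaction only."""

    def __init__(self, size: int) -> None:
        self.idle = deque(range(size))  # backend ids ready for use
        self.in_use = 0
        self.peak_in_use = 0

    def begin(self) -> int:
        if not self.idle:
            raise RuntimeError("pool exhausted; a real pooler would queue the client")
        backend = self.idle.popleft()
        self.in_use += 1
        self.peak_in_use = max(self.peak_in_use, self.in_use)
        return backend

    def commit(self, backend: int) -> None:
        # The backend returns to the pool as soon as the transaction ends.
        self.idle.append(backend)
        self.in_use -= 1


def simulate(clients: int, pool_size: int, period: int, ticks: int) -> int:
    """Each client runs a one-tick transaction once every `period` ticks.

    Returns the peak number of backends in use at the same time.
    """
    pool = TransactionPool(pool_size)
    for tick in range(ticks):
        # Clients whose slot falls on this tick start a transaction.
        active = [pool.begin() for c in range(clients) if c % period == tick % period]
        # Transactions finish within the tick; backends go back to the pool.
        for backend in active:
            pool.commit(backend)
    return pool.peak_in_use


if __name__ == "__main__":
    print(simulate(clients=300, pool_size=20, period=15, ticks=60))
```

With these inputs the peak is 20 backends for 300 clients. The ratio depends entirely on how much of each client's time is spent inside a transaction; shorten the period to 10 and the same pool runs out, which is where a real pooler would start queueing clients.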
PgBouncer itself is under active development. Version 1.25.0 (November 2025) added LDAP authentication, client-side direct TLS, and improved SCRAM authentication performance. Version 1.25.1 (December 2025) patched CVE-2025-12819, a SQL injection vulnerability during authentication with non-default search_path tracking.
The transaction-mode compatibility tax
PgBouncer’s transaction mode is the only mode that delivers the aggressive connection multiplexing serverless stacks need, but it comes with real compatibility costs. Session-level features such as SET, LISTEN/NOTIFY, session-level advisory locks, and temp tables break because the pooler cannot guarantee that the same backend process serves the same client across transactions. Teams building on serverless+Postgres tend to learn these limits through production incidents, not documentation.
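The failure mode is easiest to see in a toy model. The sketch below does not talk to Postgres or PgBouncer; `Backend` and `ToyTransactionPooler` are hypothetical stand-ins, and round-robin assignment is an assumption chosen to make the effect deterministic (a real pooler hands out whichever server connection is free).

```python
from itertools import cycle


class Backend:
    """Stands in for one Postgres backend process and its session state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.settings = {"search_path": "public"}  # session-level state


class ToyTransactionPooler:
    """Hands out backends round-robin, one per transaction (hypothetical model)."""

    def __init__(self, backends: list[Backend]) -> None:
        self._next = cycle(backends)

    def run_transaction(self, work):
        backend = next(self._next)  # no affinity between client and backend
        return work(backend)


pooler = ToyTransactionPooler([Backend("b1"), Backend("b2")])

# Transaction 1: the client changes a session-level setting.
pooler.run_transaction(lambda b: b.settings.update(search_path="tenant_42"))

# Transaction 2: the same client reads it back, but lands on a different backend.
seen = pooler.run_transaction(lambda b: b.settings["search_path"])
print(seen)  # "public": the change from transaction 1 lives on another backend
```

The client believes it set `search_path` for its session, but the second transaction runs on a backend that never saw the change. The same mechanism is what breaks LISTEN registrations, temp tables, and session-level advisory locks.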
What Fluid Compute claims to do
The Vercel homepage promotes Fluid Compute as “A compute model for all workloads. With Active CPU pricing”. The product’s positioning implies persistent instances that serve multiple requests while retaining serverless elasticity, though the homepage does not describe the architecture in detail.
The inference for database connections is straightforward: if a single instance handles multiple requests over its lifetime, it can hold a connection open across those requests. The cold-start-per-request model that forces connection churn goes away, and with it the need for an external pooler to multiplex hundreds of ephemeral connections down to a manageable number.
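The pattern this inference relies on is a connection held at module scope and opened lazily. The sketch below is a generic illustration, not a Vercel API: `FakeConnection`, `get_connection`, and `handle_request` are hypothetical names, and the fake connection exists only so the reuse can be counted without a database. Whether Fluid Compute keeps module state alive in this way is exactly the undocumented part.

```python
class FakeConnection:
    """Stand-in for a real driver connection; counts how many were opened."""

    opened = 0

    def __init__(self) -> None:
        FakeConnection.opened += 1
        self.closed = False

    def query(self, sql: str) -> str:
        return f"ran: {sql}"


_conn = None  # module scope: survives across requests while the instance is warm


def get_connection() -> FakeConnection:
    """Open a connection on first use, then reuse it for later requests."""
    global _conn
    if _conn is None or _conn.closed:
        _conn = FakeConnection()
    return _conn


def handle_request(path: str) -> str:
    # Hypothetical request handler; every request shares the module-level connection.
    return get_connection().query(f"SELECT 1 -- {path}")


if __name__ == "__main__":
    for p in ("/a", "/b", "/c"):
        handle_request(p)
    print(FakeConnection.opened)  # 1: one connection served all three requests
```

In a runtime that starts a fresh process per request, the module is re-imported each time and `opened` would climb with traffic. The saving only materializes if the instance, and therefore the module state, outlives the request.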
Vercel’s infrastructure ambition is not in doubt. The company raised $300M at a $9.3B valuation in September 2025 and appointed HashiCorp co-founder Mitchell Hashimoto to its board in March 2026. Fluid Compute is a flagship product, not an experiment.
What the docs do not say
Here is the problem: the sources reviewed for this article include no Vercel documentation, blog post, or technical specification describing Fluid Compute’s connection-lifecycle semantics. The homepage copy is marketing. The claim that Fluid eliminates the connection-pool tax is an inference from the product’s positioning, not a documented behavior.
For an engineer evaluating whether to drop PgBouncer from a Vercel-hosted stack, the unanswered questions are:
- Does a Fluid instance maintain a single TCP connection to Postgres across multiple request-response cycles, or does it reconnect per request?
- What happens to connection state during a scale-to-zero event? Is the connection gracefully terminated, or does Postgres see a client disconnect?
- Is connection reuse guaranteed, or is it best-effort depending on instance lifecycle and regional scheduling?
- What is the documented maximum connection lifetime on a Fluid instance?
Without answers to these, dropping PgBouncer is a bet on Vercel’s runtime behavior, not a decision supported by a spec.
When you still need a pooler
Even if Fluid Compute delivers on the full connection-reuse promise for Vercel-hosted workloads, PgBouncer and its successors remain necessary in several scenarios:
- Multi-cloud architectures where some services run on AWS Lambda or Cloudflare Workers and share the same Postgres instance. Those runtimes still incur per-invocation cold starts.
- Non-Vercel serverless platforms where Fluid’s persistent-instance model does not apply. PgBouncer, Supavisor, and AWS RDS Proxy continue to fill this gap.
- Mixed connection profiles where some clients need session-level features (prepared statements, LISTEN/NOTIFY) while others are fine with transaction-mode multiplexing. This requires a pooler that can route connections by mode.
- Compliance and observability layers where the pooler is the point of control for query logging, connection auditing, or authentication proxying.
The second-order pressure
The interesting question is not whether Fluid kills PgBouncer. It will not; the installed base and multi-cloud reality prevent that. The question is what happens to the economics of connection-pooling-as-a-service when a major platform vendor absorbs the pooler’s core function into the runtime.
If Vercel can document and guarantee that Fluid instances hold connections open across invocations, the value proposition of a dedicated pooler layer narrows to the edge cases listed above. Pooler vendors (Supavisor, Prisma Accelerate, and similar services not covered in the available sources) would need to compete on those edge cases or on cross-platform compatibility rather than on solving the core serverless+Postgres problem.
That shift also puts pressure on database vendors to expose connection-aware client SDKs that handle lifecycle management without external middleware. Postgres itself shows no signs of abandoning process-per-connection, so the burden falls on the layers above it.
The infrastructure is moving in a direction where the pooler becomes optional in narrow, single-vendor stacks. That is a real change. It is also a change that currently rests on homepage marketing copy and product positioning rather than published technical specifications. When the docs ship, this analysis will either look prescient or premature.
Frequently Asked Questions
Does PgBouncer have pooling modes besides transaction mode?
PgBouncer offers three modes. Session mode preserves every Postgres feature (prepared statements, LISTEN/NOTIFY, SET, advisory locks, temp tables) but only reclaims connections when a client disconnects, giving a multiplexing ratio near 1:1. Statement mode checks connections back after each individual SQL statement, maximizing reuse but breaking anything that spans multiple statements, including explicit transactions and cursors. Transaction mode is the practical compromise for serverless stacks, which is why it gets the most coverage.
What should teams monitor when moving off PgBouncer to a persistent-instance runtime?
Track Postgres backend count (it should drop if connection reuse is working), average connection age (should increase from seconds to minutes or hours), and connection error rates during scale-up and scale-down events. Watch for leaked connections too: if the runtime does not cleanly close TCP sockets when instances are recycled, idle backends accumulate on the database until hitting max_connections, reproducing the exact problem the pooler was there to solve.
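Those checks can be sketched as one query plus a small summary function. The SQL reads the standard `pg_stat_activity` view; the `BackendStats` type, the `summarize` helper, and the default limit of 100 are illustrative assumptions, and the rows would come from whichever Postgres driver the team already uses.

```python
from dataclasses import dataclass

# Query against the standard pg_stat_activity view; run it with any Postgres driver.
BACKEND_SAMPLE_SQL = """
SELECT state, EXTRACT(EPOCH FROM now() - backend_start) AS age_seconds
FROM pg_stat_activity
WHERE backend_type = 'client backend'
"""


@dataclass
class BackendStats:
    total: int              # client backends currently connected
    idle: int               # backends sitting in the "idle" state
    avg_age_seconds: float  # mean connection age
    headroom: int           # connections left before max_connections


def summarize(rows: list[tuple[str, float]], max_connections: int = 100) -> BackendStats:
    """Summarize (state, age_seconds) rows sampled from pg_stat_activity."""
    total = len(rows)
    idle = sum(1 for state, _ in rows if state == "idle")
    avg_age = sum(age for _, age in rows) / total if total else 0.0
    return BackendStats(total, idle, avg_age, max_connections - total)


if __name__ == "__main__":
    sample = [("idle", 600.0), ("active", 1200.0), ("idle", 1800.0)]
    print(summarize(sample))
```

Sampled on a schedule, a falling `total` with a rising `avg_age_seconds` suggests reuse is working. A rising `idle` count alongside shrinking `headroom` is the leaked-connection pattern described above.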
How does AWS RDS Proxy differ from PgBouncer for serverless Postgres stacks?
RDS Proxy is fully managed by AWS, integrates with IAM authentication natively, and maintains a shared connection pool in front of the database that Lambda execution environments connect through. Unlike PgBouncer, it is closed-source, only works with Aurora or RDS-hosted databases, and adds a small latency overhead (low single-digit milliseconds) for the proxy layer itself. PgBouncer can run against any Postgres endpoint, including self-hosted and non-AWS cloud instances, which makes it the more portable choice for multi-cloud or on-prem setups.
Would LISTEN/NOTIFY and advisory locks work under Fluid Compute if connections stay open?
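If a Fluid instance genuinely holds a single Postgres connection open across requests, LISTEN/NOTIFY and session-level advisory locks would work for as long as that connection lives. Both are coordinated by the Postgres server rather than by the client, so they span instances: a NOTIFY issued from one instance reaches every session listening on that channel in the same database, and an advisory lock held by instance A makes instance B wait only when B requests the same lock key; ordinary queries are not blocked. This is the same as running multiple application servers against Postgres without a pooler. The limitation is lifetime, not scope: when an instance is recycled its session ends, its advisory locks are released, and its LISTEN registrations are dropped, so notifications sent while it is disconnected are missed. Without a documented connection lifetime, any coordination built on these features has to tolerate that.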