Fluid compute is Vercel’s next-generation compute model, designed to handle modern workloads with real-time scaling, cost efficiency, and minimal overhead. Traditional serverless architectures optimize for fast execution but struggle with requests that spend significant time waiting on external models or APIs, so allocated compute sits idle while those requests wait.

To address these inefficiencies, Fluid compute adjusts dynamically to traffic demands, reusing existing resources before provisioning new ones. At the center of Fluid is the Vercel Functions router, which orchestrates function execution to minimize cold starts, maximize concurrency, and optimize resource usage. It routes each invocation to a pre-warmed or already active instance whenever one is available, which keeps execution latency low.
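To make the "reuse before provisioning" idea concrete, here is a minimal sketch of that routing order. It is an illustration only, not Vercel's implementation: the `SketchRouter` class, the `Instance` shape, the per-instance `maxConcurrency` limit, and the three route kinds are all hypothetical names chosen for this example.

```typescript
// Hypothetical model of a function instance; not Vercel's internal data structure.
type InstanceState = "active" | "prewarmed";

interface Instance {
  id: string;
  state: InstanceState;
  inFlight: number;       // invocations currently being handled
  maxConcurrency: number; // assumed per-instance concurrency limit
}

type RouteKind = "reused-active" | "used-prewarmed" | "cold-start";

interface RouteResult {
  instanceId: string;
  kind: RouteKind;
}

class SketchRouter {
  private instances: Instance[] = [];
  private nextId = 0;
  private maxConcurrency: number;

  constructor(maxConcurrency: number) {
    this.maxConcurrency = maxConcurrency;
  }

  private create(state: InstanceState): Instance {
    const instance: Instance = {
      id: `instance-${this.nextId++}`,
      state,
      inFlight: 0,
      maxConcurrency: this.maxConcurrency,
    };
    this.instances.push(instance);
    return instance;
  }

  // Add an idle, pre-warmed instance to the pool ahead of traffic.
  prewarm(): string {
    return this.create("prewarmed").id;
  }

  // Pick an instance for one invocation, preferring reuse over new capacity.
  route(): RouteResult {
    // 1. Reuse an active instance that still has spare concurrency.
    const active = this.instances.find(
      (i) => i.state === "active" && i.inFlight < i.maxConcurrency,
    );
    if (active) {
      active.inFlight += 1;
      return { instanceId: active.id, kind: "reused-active" };
    }

    // 2. Otherwise promote a pre-warmed instance, which avoids a cold start.
    const warm = this.instances.find((i) => i.state === "prewarmed");
    if (warm) {
      warm.state = "active";
      warm.inFlight = 1;
      return { instanceId: warm.id, kind: "used-prewarmed" };
    }

    // 3. Only when nothing can be reused, provision a new instance.
    const fresh = this.create("active");
    fresh.inFlight = 1;
    return { instanceId: fresh.id, kind: "cold-start" };
  }

  // Mark one invocation on the given instance as finished, freeing capacity.
  complete(instanceId: string): void {
    const instance = this.instances.find((i) => i.id === instanceId);
    if (instance && instance.inFlight > 0) {
      instance.inFlight -= 1;
    }
  }
}

// Usage: one pre-warmed instance that can handle two concurrent invocations.
const router = new SketchRouter(2);
router.prewarm();
const first = router.route();
const second = router.route();
const third = router.route();
console.log(first.kind, second.kind, third.kind);
```

In this sketch the first invocation lands on the pre-warmed instance, the second shares that same instance, and only the third, which exceeds the assumed concurrency limit, triggers a cold start. A production router would also weigh factors this example ignores, such as instance health, region, and memory pressure.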

By efficiently managing compute allocation, the router prevents unnecessary cold starts and scales capacity only when needed. Let's look at how it intelligently manages function execution.

How Fluid compute works on Vercel