Gillius's Programming

Cold Start in Azure for Scale-to-Zero Services

22 Oct 2025

In this post I explore the performance of a few ways to run services that can scale to zero in Azure. Scale-to-zero can reduce costs for services with very low utilization, such as a personal cloud learning environment or the idle dev/test environments of an enterprise application. My previous post about Java startup performance focused on improving cold start times in the hope of making scale-to-zero viable on a service like Azure Container Apps. Unfortunately, I was quite disappointed by Container Apps’ cold start performance, and pleasantly surprised by Azure Functions Flex Consumption. Below I detail my experience and compare Container Apps against Azure Functions and Azure App Service.

Applications

Each application implements a single HTTP endpoint (or HTTP trigger for functions) taking a name query parameter and returning a text greeting. All resources are in the US East 2 Azure region with public networking (no VNets).

For the function apps I use the Flex Consumption plan (introduced Dec 2024). Microsoft says cold starts are much better than in the old Consumption plan, and the old plan doesn’t support private networking (a requirement for many enterprise deployments), so for these two reasons I did not evaluate it.

I created four variations:

quarkus-rest

Java app created with the Quarkus 3.28.3 CLI (quarkus create app --extension=quarkus-rest). Compiled to a native executable with GraalVM 25 into a 105 MB container (containing a 49 MB executable) using the generated Dockerfile.native-micro, which uses quay.io/quarkus/ubi9-quarkus-micro-image:2.0-amd64 as its base. Served from a Basic-tier Azure Container Registry.

This application starts in about 50 ms on an M2 Pro MacBook, or 0.050-0.250 s in the cloud according to its logs.
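
The endpoint itself is trivial. A minimal sketch of what each variation implements (the class name and exact message here are illustrative, not the deployed code):

    package org.acme;

    import jakarta.ws.rs.GET;
    import jakarta.ws.rs.Path;
    import jakarta.ws.rs.QueryParam;

    // Minimal Quarkus REST resource: GET /hello?name=X returns a text greeting
    @Path("/hello")
    public class GreetingResource {

        @GET
        public String hello(@QueryParam("name") String name) {
            return "Hello " + (name != null ? name : "world");
        }
    }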

quarkus-azure-functions

Java app created with quarkus create app --extension=quarkus-azure-functions. By default this uses the long-deprecated Functions v3 framework, but I created a custom host.json that enables the v4 framework.
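
For reference, a minimal v4-era host.json looks roughly like this; this is a sketch, and my assumption is that the extension bundle version range is the setting that distinguishes the v3 and v4 setups:

    {
      "version": "2.0",
      "extensionBundle": {
        "id": "Microsoft.Azure.Functions.ExtensionBundle",
        "version": "[4.*, 5.0.0)"
      }
    }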

azure-functions-archetype

Java function app built from the Maven azure-functions-archetype as a “fat” jar (a single jar containing all dependencies) via the azure-functions-maven-plugin option “buildJarWithDependencies”, since the documentation says it improves cold start times.
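
In the generated pom.xml, that option is a flag on the plugin configuration. A sketch of the relevant fragment (plugin version and other configuration elided):

    <plugin>
        <groupId>com.microsoft.azure</groupId>
        <artifactId>azure-functions-maven-plugin</artifactId>
        <configuration>
            <!-- Package a single jar containing all dependencies,
                 which the docs say improves cold start time -->
            <buildJarWithDependencies>true</buildJarWithDependencies>
        </configuration>
    </plugin>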

JS quickstart

A “hello world” JavaScript app created with the Azure Functions Core Tools (func init + func new), as described in the Azure Functions quickstart for JavaScript.

Results

I first checked the options explicitly designed for scale-to-zero:

(Note: Results include an estimated 40ms network round trip time (RTT) and up to 150ms SSL setup time)

Application               | Service                   | Size              | Runtime                            | Cold Start Response Time (s)
quarkus-rest              | Container App             | 0.25 CPU / 512 MB | GraalVM 25 native + UBI9 container | 15 - 37
quarkus-azure-functions   | Flex Consumption Function | 512 MB            | Java 21                            | 1 - 1.8
azure-functions-archetype | Flex Consumption Function | 512 MB            | Java 21; “fat” jar                 | 0.9 - 1.7
JS quickstart             | Flex Consumption Function | 512 MB            | Node 22                            | 1 - 2.2

Additionally, I compared against Azure Web App, Azure Container App with a minimum instance count of 1, and Flex Consumption’s “always ready” mode.

Application   | Service                                             | Size                         | Runtime                            | Idle Response Time
JS quickstart | Flex Consumption Function (1 always ready instance) | 512 MB                       | Node 22                            | up to 200 ms
quarkus-rest  | Container App (min scale 1)                         | 512 MB                       | GraalVM 25 native + UBI9 container | up to 140 ms
quarkus-rest  | App Service                                         | Premium P0v3 (1 vCPU + 4 GB) | GraalVM 25 native + UBI9 container | up to 175 ms
quarkus-rest  | App Service                                         | Basic B1 (1 core + 1.75 GB)  | GraalVM 25 native + UBI9 container | up to 500 ms
quarkus-rest  | App Service                                         | Free F1                      | GraalVM 25 native + UBI9 container | 300 - 800 ms; ~20 s after a long idle

(Note: I wanted to try the new P0v4 tier launched in Sept 2025, but it wasn’t available in my subscription.)

As with the first table, results include network RTT (estimated 40-50 ms) and up to 150 ms of SSL setup time; therefore, the three services under 200 ms are probably within measurement error of each other.

Measuring Details

Measuring cold start time is easier with Container Apps, where you can set the cooldown period that determines when the service scales to zero. With Azure Functions you can’t control when the instance is stopped, so I waited 15-60 minutes or more between tries (in some cases overnight) and assumed any response over 200 ms involved some level of “cold” starting. I measured times using the Chrome DevTools network tab and noticed three different types of responses; for the responses under 200 ms I can assume the service is “hot” and network overhead dominates.
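
To script these measurements instead of clicking through DevTools, a simple probe like the following sketch would also work (the URL is a placeholder; this uses only the JDK’s built-in HTTP client):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class ColdStartProbe {
        public static void main(String[] args) throws Exception {
            // Placeholder URL; substitute your deployed endpoint
            URI uri = URI.create("https://example.azurewebsites.net/api/hello?name=world");
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(uri).GET().build();

            long start = System.nanoTime();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;

            // The first request includes DNS, TCP, and TLS setup time,
            // similar to what the browser measurements in this post include
            System.out.println(response.statusCode() + " in " + elapsedMs
                    + " ms: " + response.body());
        }
    }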

Discussion

Based on response times, the Free tier likely scales to zero or puts apps into some deep hibernation. I am unsure about the Basic plan; 500 ms is probably too fast to be a full cold start, but it’s definitely slower than network overhead alone.

Although the time ranges for the three function apps overlap heavily, I am surprised that the azure-functions-archetype cold response time was similar to (and sometimes slightly faster than) the JavaScript version, which I assumed would start faster.

The Azure Functions framework starts so fast that I suspect Microsoft uses pre-warmed containers with the base runtime already loaded. It would be interesting to research how much delay is added when bringing in VNet integration, authentication, and database connections.

I am surprised at the Container App start times and wonder how they would compare to scaling a deployment on a custom Kubernetes cluster (Container Apps is built on Kubernetes). The startup is so slow that I would not use the scale-to-zero configuration for anything performance sensitive. It can still make sense in a batch processing scenario, depending on how the pricing works out for the use case versus functions. For Container Apps, a GraalVM native build provides no meaningful benefit while adding a bit of build complexity: the 15-37 s cold start is dominated by infrastructure, so the native image’s 50-250 ms startup is lost in the noise.

For “always ready” use, Container Apps charge 1/8th the CPU price while idle, which comes to less than $10/month for the smallest 0.25 CPU + 512 MB size; not an issue in an enterprise environment. But if you don’t need the flexibility of a containerized service, you can write your application as a function, scale to zero with potentially reasonable response times, and potentially pay nothing thanks to the monthly free grants.

Function apps also support an “always ready” mode that avoids cold starts with reduced pricing when idle; it appears even slightly cheaper than Container Apps, under $10/month for the 512 MB size. There is no monthly free grant for this mode, though.

Azure Web App Sharing Alternative

I also compared the native-compiled Quarkus app container running in Azure Web App. Even at the Basic tier it’s an always-on service; however, a single plan can run from 8 to 64 apps depending on the tier. The problem with sharing a plan across unrelated apps is that scaling becomes impractical, since they must scale together; but if you have multiple apps with very low traffic, and you can accept the risk of one app starving the others of CPU and memory, this is cheaper than Container Apps. In this scenario memory usage matters, and native compilation lets more services fit per plan. If you must have a container-based service that is always ready, this seems like the cheapest possible option; though if Azure Functions adds container support to Flex Consumption, that would likely become the better choice.

In an enterprise context, Container Apps offer good pricing and flexibility for small, infrequently used services, in a safer way than sharing App Service plans, while allowing for future scaling. The introduction of Flex Consumption functions in Dec 2024 has greatly improved Azure’s serverless support in terms of cost and functionality, but Flex Consumption still cannot run a custom container.

Conclusions

For scale-to-zero, Flex Consumption functions are the clear winner: roughly 1-2 second cold starts regardless of language, versus 15-37 seconds for Container Apps, which rules the latter out for anything latency sensitive. If cold starts are unacceptable, both Container Apps with min scale 1 and always-ready function instances respond within network overhead for under $10/month at the smallest sizes. Native compilation did not help Container Apps’ cold starts, but its small memory footprint pays off when packing multiple apps into a shared App Service plan.

Other Notes

Another benefit of Container Apps is that you get more of the memory you request. On Azure App Service, the framework runs alongside your service and consumes about 1 GiB of memory (presumably Kudu is part or all of that). For example, with quarkus-rest in App Service B1 I saw 70% memory usage (1.22 GiB of 1.75 GiB), and in P0v3, 28% (1.12 GiB of 4 GiB) with the two quarkus-rest apps. Even after stopping one of the apps the memory usage stayed the same, so the overhead appears to be per plan instance rather than per app. In the quarkus-rest container app I saw a memory working set of 14-22 MiB, suggesting I could use the full 512 MiB of the container app but only ~750 MiB of the 1.75 GiB in App Service B1.