Gillius's Programming

Cold Start in Azure for Scale-to-Zero Services

22 Oct 2025

In this post I explore the performance of a few ways to run services that can scale to zero in Azure. Scale-to-zero can reduce costs for services with very low utilization, such as a personal cloud learning environment or the idle dev/test environments of an enterprise application. My previous post about Java startup performance focused on improving cold start times in the hope of making scale-to-zero viable on a service like Azure Container Apps. Unfortunately, I was quite disappointed by Container Apps’ cold start performance, and pleasantly surprised by Azure Functions Flex Consumption. Below I detail my experience and compare Container Apps against Azure Functions and Azure App Service.

Applications

Each application implements a single HTTP endpoint (or HTTP trigger for functions) taking a name query parameter and returning a text greeting. All resources are in the US East 2 Azure region with public networking (no VNets).

For the function apps I use the Flex Consumption plan (introduced Dec 2024). Microsoft says cold starts are much better than in the old Consumption plan, and the old plan doesn’t support private networking (a requirement for many enterprise deployments), so for these two reasons I did not evaluate it.

I created four variations:

quarkus-rest

Java app created with the Quarkus 3.28.3 CLI (quarkus create app --extension=quarkus-rest). Compiled to a native executable with GraalVM 25 into a 105 MB container (containing a 49 MB executable) using the generated Dockerfile.native-micro, which uses quay.io/quarkus/ubi9-quarkus-micro-image:2.0-amd64 as its base. Served from a Basic-tier Azure Container Registry.

This application starts in about 50 ms on an M2 Pro MacBook, or 0.050-0.250 s in the cloud according to its logs.
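
The endpoint itself is trivial. A minimal sketch of what each variation implements (the class name and exact message here are illustrative, not the deployed code):

    package org.acme;

    import jakarta.ws.rs.GET;
    import jakarta.ws.rs.Path;
    import jakarta.ws.rs.QueryParam;

    // Minimal Quarkus REST resource: GET /hello?name=X returns a text greeting
    @Path("/hello")
    public class GreetingResource {

        @GET
        public String hello(@QueryParam("name") String name) {
            return "Hello " + (name != null ? name : "world");
        }
    }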

quarkus-azure-functions

Java app created with quarkus create app --extension=quarkus-azure-functions. By default this uses the long-deprecated Functions v3 framework, but I created a custom host.json that enables the v4 framework.
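
For reference, a minimal v4-era host.json looks roughly like this; this is a sketch, and my assumption is that the extension bundle version range is the setting that distinguishes the v3 and v4 setups:

    {
      "version": "2.0",
      "extensionBundle": {
        "id": "Microsoft.Azure.Functions.ExtensionBundle",
        "version": "[4.*, 5.0.0)"
      }
    }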

azure-functions-archetype

Java function app built from the Maven azure-functions-archetype as a “fat” jar (a single jar containing all dependencies) via the azure-functions-maven-plugin option “buildJarWithDependencies”, since the documentation says it improves cold start times.
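
In the generated pom.xml, that option is a flag on the plugin configuration. A sketch of the relevant fragment (plugin version and other configuration elided):

    <plugin>
        <groupId>com.microsoft.azure</groupId>
        <artifactId>azure-functions-maven-plugin</artifactId>
        <configuration>
            <!-- Package a single jar containing all dependencies,
                 which the docs say improves cold start time -->
            <buildJarWithDependencies>true</buildJarWithDependencies>
        </configuration>
    </plugin>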

JS quickstart

A “hello world” JavaScript app created with the Azure Functions Core Tools (func init + func new), as described in the Azure Functions quickstart for JavaScript.

Results

I first checked the options explicitly designed for scale-to-zero:

(Note: Results include an estimated 40ms network round trip time (RTT) and up to 150ms SSL setup time)

Application               | Service                   | Size              | Runtime                            | Cold Start Response Time (s)
quarkus-rest              | Container App             | 0.25 CPU / 512 MB | GraalVM 25 native + UBI9 container | 15 - 37
quarkus-azure-functions   | Flex Consumption Function | 512 MB            | Java 21                            | 1 - 1.8
azure-functions-archetype | Flex Consumption Function | 512 MB            | Java 21; “fat” jar                 | 0.9 - 1.7
JS quickstart             | Flex Consumption Function | 512 MB            | Node 22                            | 1 - 2.2

Additionally, I compared against Azure Web App, Azure Container App with a minimum instance count of 1, and Flex Consumption’s “always ready” mode.

Application   | Service                                             | Size                         | Runtime                            | Idle Response Time
JS quickstart | Flex Consumption Function (1 always ready instance) | 512 MB                       | Node 22                            | up to 200 ms
quarkus-rest  | Container App (min scale 1)                         | 512 MB                       | GraalVM 25 native + UBI9 container | up to 140 ms
quarkus-rest  | App Service                                         | Premium P0v3 (1 vCPU + 4 GB) | GraalVM 25 native + UBI9 container | up to 175 ms
quarkus-rest  | App Service                                         | Basic B1 (1 core + 1.75 GB)  | GraalVM 25 native + UBI9 container | up to 500 ms
quarkus-rest  | App Service                                         | Free F1                      | GraalVM 25 native + UBI9 container | 300 - 800 ms; ~20 s after a long idle

(Note: I wanted to try the new P0v4 tier launched in Sept 2025, but it wasn’t available in my subscription.)

As with the first table, results include network RTT (estimated 40-50 ms) and up to 150 ms of SSL setup time; therefore, the three services under 200 ms are probably within measurement error of each other.

Measuring Details

Measuring cold start time is easier with Container Apps, where you can set the cooldown period that determines when the service scales to zero. With Azure Functions you can’t control when the instance is stopped, so I waited 15-60 minutes or more between tries (in some cases overnight) and assumed any response over 200 ms involved some level of “cold” starting. I measured times using the Chrome DevTools network tab and noticed three different types of responses; for the responses under 200 ms I can assume the service is “hot” and network overhead dominates.
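
To script these measurements instead of clicking through DevTools, a simple probe like the following sketch would also work (the URL is a placeholder; this uses only the JDK’s built-in HTTP client):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class ColdStartProbe {
        public static void main(String[] args) throws Exception {
            // Placeholder URL; substitute your deployed endpoint
            URI uri = URI.create("https://example.azurewebsites.net/api/hello?name=world");
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(uri).GET().build();

            long start = System.nanoTime();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;

            // The first request includes DNS, TCP, and TLS setup time,
            // similar to what the browser measurements in this post include
            System.out.println(response.statusCode() + " in " + elapsedMs
                    + " ms: " + response.body());
        }
    }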

Discussion

Based on response times, the Free tier likely scales to zero or puts apps into some deep hibernation. I am unsure about the Basic plan; 500 ms is probably too fast to be a full cold start, but it’s definitely slower than network overhead alone.

Although the time ranges for the three function apps overlap heavily, I am surprised that the azure-functions-archetype cold response time was similar to (and sometimes slightly faster than) the JavaScript version, which I assumed would start faster.

The Azure Functions framework starts so fast that I suspect Microsoft uses pre-warmed containers with the base runtime already loaded. It would be interesting to research how much delay is added when bringing in VNet integration, authentication, and database connections.

I am surprised at the Container App start times and wonder how they would compare to scaling a deployment on a custom Kubernetes cluster (Container Apps is built on Kubernetes). The startup is so slow that I would not use the scale-to-zero configuration for anything performance sensitive. It can still make sense in a batch processing scenario, depending on how the pricing works out for the use case versus functions. For Container Apps, a GraalVM native build provides no meaningful benefit while adding a bit of build complexity: the 15-37 s cold start is dominated by infrastructure, so the native image’s 50-250 ms startup is lost in the noise.

For “always ready” use, Container Apps charge 1/8th the CPU price while idle, which comes to less than $10/month for the smallest 0.25 CPU + 512 MB size; not an issue in an enterprise environment. But if you don’t need the flexibility of a containerized service, you can write your application as a function, scale to zero with potentially reasonable response times, and potentially pay nothing thanks to the monthly free grants.

Function apps also support an “always ready” mode that avoids cold starts with reduced pricing when idle; it appears even slightly cheaper than Container Apps, under $10/month for the 512 MB size. There is no monthly free grant for this mode, though.

Azure Web App Sharing Alternative

I also compared the native-compiled Quarkus app container running in Azure Web App. Even at the Basic tier it’s an always-on service; however, a single plan can run from 8 to 64 apps depending on the tier. The problem with sharing a plan across unrelated apps is that scaling becomes impractical, since they must scale together; but if you have multiple apps with very low traffic, and you can accept the risk of one app starving the others of CPU and memory, this is cheaper than Container Apps. In this scenario memory usage matters, and native compilation lets more services fit per plan. If you must have a container-based service that is always ready, this seems like the cheapest possible option; though if Azure Functions adds container support to Flex Consumption, that would likely become the better choice.

In an enterprise context, Container Apps offer good pricing and flexibility for small, infrequently used services, in a safer way than sharing App Service plans, while allowing for future scaling. The introduction of Flex Consumption functions in Dec 2024 has greatly improved Azure’s serverless support in terms of cost and functionality, but Flex Consumption still cannot run a custom container.

Conclusions

For scale-to-zero, Flex Consumption functions are the clear winner: roughly 1-2 second cold starts regardless of language, versus 15-37 seconds for Container Apps, which rules the latter out for anything latency sensitive. If cold starts are unacceptable, both Container Apps with min scale 1 and always-ready function instances respond within network overhead for under $10/month at the smallest sizes. Native compilation did not help Container Apps’ cold starts, but its small memory footprint pays off when packing multiple apps into a shared App Service plan.

Other Notes

Another benefit of Container Apps is that you get more of the memory you request. On Azure App Service, the framework runs alongside your service and consumes about 1 GiB of memory (presumably Kudu is part or all of that). For example, with quarkus-rest in App Service B1 I saw 70% memory usage (1.22 GiB of 1.75 GiB), and in P0v3, 28% (1.12 GiB of 4 GiB) with the two quarkus-rest apps. Even after stopping one of the apps the memory usage stayed the same, so the overhead appears to be per plan instance rather than per app. In the quarkus-rest container app I saw a memory working set of 14-22 MiB, suggesting I could use the full 512 MiB of the container app but only ~750 MiB of the 1.75 GiB in App Service B1.