API Payload Compression in ASP.NET Core

People talk about payload compression as if it were a single checkbox in Program.cs. Turn on gzip, maybe add Brotli, and move on. That approach is no good for a production system that serves high-volume JSON, handles uploads, runs behind a proxy, and needs predictable latency under load.
In real systems, compression is a transport concern with application-level consequences. It affects bandwidth, CPU, latency, caching behaviour, security posture, and even how you shape your contracts. ASP.NET Core gives you built-in middleware for response compression and request decompression, but the framework does not make the architectural decisions for you. You still need to decide what to compress, where to compress it, when to reject it, and how to avoid using compression as a bandage for bad API design. The current ASP.NET Core guidance is still clear on the fundamentals: use compression to reduce payload size, prefer server or proxy compression where available, and use the built-in middleware when Kestrel or HTTP.sys is serving the app directly, because neither server provides built-in compression itself.
The first mistake developers make is thinking compression is mainly about speed. It is really about trade-offs. Compression reduces bytes on the wire, which often improves responsiveness, especially for JSON and other text-heavy payloads. At the same time, it costs CPU to compress and decompress data. That trade-off is usually favourable for medium and large JSON responses over public networks, but not always favourable for tiny payloads or already-compressed binary content. That is why serious API design starts with payload shape first and compression second. If your endpoint returns bloated documents with duplicated fields, unnecessary nesting, and data the caller never asked for, compression will help, but only after you have already lost the bigger battle. Microsoft’s guidance frames compression as a way to reduce response size and improve responsiveness, not as a replacement for lean responses.
A useful mental model is to separate outbound compression from inbound decompression. Outbound compression is the default case. Your API produces JSON, problem details, text, CSV, or other compressible formats, and the client advertises supported encodings through Accept-Encoding. The response compression middleware examines the request and response, selects a provider such as Brotli or gzip, and writes the compressed payload if the response type is eligible. Inbound decompression is different. There, the client sends a compressed request body and marks it with Content-Encoding, and the request decompression middleware unwraps it before model binding or request body reading happens. ASP.NET Core supports both directions, but they solve different problems and they should not be enabled with the same level of enthusiasm. Response compression is broadly useful. Request decompression is useful only when clients are actually sending large compressed payloads, typically large JSON, text, or similar upload bodies.
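To make that negotiation concrete, here is a minimal client-side sketch, assuming a hypothetical local endpoint: enabling automatic decompression on the handler makes HttpClient advertise the corresponding Accept-Encoding values and transparently unwrap whatever Content-Encoding the server applied.
using System.Net;
// Client-side view of outbound compression: with AutomaticDecompression set,
// the handler advertises the matching Accept-Encoding values on the request
// and decompresses the response body before your code reads it.
// The base address below is a placeholder.
var handler = new SocketsHttpHandler
{
    AutomaticDecompression = DecompressionMethods.Brotli | DecompressionMethods.GZip
};
using var client = new HttpClient(handler)
{
    BaseAddress = new Uri("https://localhost:5001")
};
var json = await client.GetStringAsync("/api/orders/42");
Console.WriteLine($"Received {json.Length} characters after transparent decompression");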
In practice, the best default for a modern ASP.NET Core API is straightforward. Compress responses that are actually compressible. Prefer Brotli when the client supports it. Fall back to gzip for compatibility. Leave already-compressed formats alone. If you are running behind IIS, Apache, or Nginx, prefer server-based compression because Microsoft explicitly notes that server modules generally outperform the ASP.NET Core middleware. If you are serving directly from Kestrel or HTTP.sys, use the middleware because those servers do not currently offer built-in compression support.
The second mistake developers make is compressing everything indiscriminately. Compression is not magic. It works best on text-heavy formats because they contain repeating structure. JSON is the classic win because property names, quotes, punctuation, and repeated values compress well. XML, HTML, CSS, JavaScript, CSV, plain text, and problem details are all strong candidates. JPEG, PNG, MP4, ZIP, and many other binary formats are not. Recompressing data that is already compressed often gives you negligible size reduction and unnecessary CPU overhead. This is exactly why the ASP.NET Core response compression middleware is configured around MIME types. You tell it what content types are eligible instead of asking it to blindly compress whatever leaves the process.
Here is a production-friendly baseline for a .NET API using minimal APIs. It enables response compression, explicitly adds Brotli and gzip, includes JSON-related MIME types, and sets both providers to Fastest because API latency usually matters more than squeezing out the very last percentage point of compression ratio.
using Microsoft.AspNetCore.RequestDecompression;
using Microsoft.AspNetCore.ResponseCompression;
using System.IO.Compression;
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddResponseCompression(options =>
{
    options.EnableForHttps = true;
    options.Providers.Add<BrotliCompressionProvider>();
    options.Providers.Add<GzipCompressionProvider>();
    // The default list already covers common text and JSON types;
    // problem details and CSV are the notable additions here.
    options.MimeTypes = ResponseCompressionDefaults.MimeTypes.Concat(new[]
    {
        "application/json",
        "application/problem+json",
        "text/plain",
        "text/csv"
    });
});
builder.Services.Configure<BrotliCompressionProviderOptions>(options =>
{
    options.Level = CompressionLevel.Fastest;
});
builder.Services.Configure<GzipCompressionProviderOptions>(options =>
{
    options.Level = CompressionLevel.Fastest;
});
builder.Services.AddRequestDecompression();
var app = builder.Build();
app.UseRequestDecompression();
app.UseResponseCompression();
app.MapGet("/api/orders/{id:int}", (int id) =>
{
    var response = new
    {
        Id = id,
        Customer = "ACME Insurance",
        Lines = Enumerable.Range(1, 250).Select(i => new
        {
            LineNumber = i,
            Sku = $"SKU-{i:0000}",
            Quantity = i % 5 + 1,
            Price = 49.99m + i
        })
    };
    return Results.Json(response);
});
app.Run();
This gives you the right starting point, but serious systems usually need more discipline than a baseline setup. One example is compression level. Many developers instinctively choose Optimal, assuming it must be better because the name sounds better. That is too simplistic. In APIs, especially low-latency APIs, Fastest is often the better operational choice because it cuts CPU cost and still captures most of the size reduction on JSON. Optimal can make sense for larger batch-style responses or download scenarios where throughput matters more than raw request latency. The right answer is not theoretical. Benchmark it with your own payloads and concurrency profile.
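If you want a rough feel for that trade-off before a proper load test, a standalone sketch along these lines compares Brotli at Fastest and Optimal on an invented JSON payload; swap in your own response bodies and measure under realistic concurrency before drawing conclusions.
using System.Diagnostics;
using System.IO.Compression;
using System.Text.Json;
// Invented payload shape for illustration; substitute a real response body.
var payload = JsonSerializer.SerializeToUtf8Bytes(
    Enumerable.Range(1, 5_000).Select(i => new { Id = i, Sku = $"SKU-{i:0000}", Status = "Complete" }));
foreach (var level in new[] { CompressionLevel.Fastest, CompressionLevel.Optimal })
{
    var stopwatch = Stopwatch.StartNew();
    using var output = new MemoryStream();
    using (var brotli = new BrotliStream(output, level, leaveOpen: true))
    {
        brotli.Write(payload);
    }
    stopwatch.Stop();
    // Compressed size versus time spent compressing, single-shot and single-threaded,
    // so treat the numbers as a rough signal rather than a benchmark result.
    Console.WriteLine($"{level}: {payload.Length:N0} -> {output.Length:N0} bytes in {stopwatch.Elapsed.TotalMilliseconds:F1} ms");
}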
A useful way to think about the pipeline is this: request decompression must run before anything reads the request body, and response compression must be registered before the middleware and endpoints whose output it should compress. If either sits too late in the pipeline, it never sees the bytes it is supposed to transform.
Another place where developers get sloppy is HTTPS compression. ASP.NET Core exposes EnableForHttps, and the documented default is false. Microsoft also warns that enabling compression for HTTPS responses containing remotely manipulable content may expose security problems. That warning exists because compression can become part of a side-channel when attacker-controlled input and secret-bearing content share the same compressed response. In normal internal or line-of-business APIs, many people still enable HTTPS compression because the benefits are real and the attack surface may be limited, but that decision should be deliberate. If you reflect attacker-supplied content into a response that also carries secrets, tokens, or sensitive dynamic values, do not just enable HTTPS compression and forget about it. Understand what is actually in those responses.
Request decompression deserves even more caution. The feature is real and useful, but it is not something to switch on simply because the middleware exists. The request decompression middleware automatically inspects Content-Encoding and decompresses supported request bodies, which saves you from writing custom request-body handling code. That part is good. The hard part is operational safety. Inbound compressed payloads shift CPU work onto your servers and can amplify resource consumption if abused. If you accept large compressed uploads, you should pair that with request size limits, timeout controls, careful endpoint scoping, and monitoring. The middleware also needs to run before anything reads the body, otherwise you are too late.
A targeted inbound example looks like this:
app.UseRequestDecompression();
app.MapPost("/api/import/products", async (HttpContext httpContext) =>
{
    // Lift the request body size cap for this endpoint only; setting the feature's
    // MaxRequestBodySize to null removes the limit (requires Microsoft.AspNetCore.Http.Features).
    var sizeFeature = httpContext.Features.Get<IHttpMaxRequestBodySizeFeature>();
    if (sizeFeature is { IsReadOnly: false })
    {
        sizeFeature.MaxRequestBodySize = null;
    }
    // By this point the middleware has already decompressed the body based on Content-Encoding.
    using var reader = new StreamReader(httpContext.Request.Body);
    var json = await reader.ReadToEndAsync();
    return Results.Ok(new
    {
        Message = "Compressed request accepted",
        Characters = json.Length
    });
});
app.Run();
That sample shows the mechanics, but the operational point matters more than the syntax. You should not enable inbound decompression across every endpoint unless the endpoints really need it. A typical CRUD API that accepts small POST and PUT bodies gets little benefit from compressed requests. A bulk import endpoint that accepts a multi-megabyte JSON document might benefit a lot.
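One way to express that scoping is to branch the pipeline so decompression only runs for the import paths, as in this sketch, which assumes the bulk endpoints live under a hypothetical /api/import segment.
// Register the services as usual, but only run the decompression middleware
// for requests that actually carry large compressed bodies.
builder.Services.AddRequestDecompression();
var app = builder.Build();
app.UseWhen(
    context => context.Request.Path.StartsWithSegments("/api/import"),
    importBranch => importBranch.UseRequestDecompression());
app.UseResponseCompression();
The branch rejoins the main pipeline afterwards, so non-import traffic flows through exactly as before.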
You should also think about where compression belongs in a broader deployment. If you are behind Nginx or IIS, server-side compression at the edge is often the better place for outbound response compression because it takes work off the app and can be tuned centrally. Microsoft’s guidance says exactly that, noting that the performance of the ASP.NET Core middleware probably will not match dedicated server modules. That does not make middleware wrong. It just means you should not ignore the reverse proxy when you have one. If you already terminate traffic behind a capable gateway, that is often the best place to handle compression consistently.
Caching behaviour is another area where compression changes system behaviour more than people expect. Once you serve multiple encoded versions of the same representation, the cache key must vary by encoding. That is why compressed responses are tied to Accept-Encoding, and why intermediaries need to treat the compressed and uncompressed versions as distinct representations. If you run API caching, CDN caching, or reverse-proxy caching, compression is no longer just a transport tweak. It becomes part of representation management. That matters even more if you also use ETags. In a well-behaved system, you need consistency in how representations are generated and validated, especially if compression is handled at the proxy layer instead of the app layer.
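A small illustration of the representation angle: the response compression middleware sets Vary: Accept-Encoding when it compresses, and a cacheable endpoint such as the hypothetical one below relies on that header so caches keep the Brotli, gzip, and identity variants apart. The path and max-age are placeholders.
using Microsoft.Net.Http.Headers;
app.MapGet("/api/catalog", (HttpContext context) =>
{
    // Mark the representation as cacheable; the compression middleware adds
    // Vary: Accept-Encoding so intermediaries store each encoding as a
    // distinct cache entry instead of serving the wrong variant.
    context.Response.GetTypedHeaders().CacheControl = new CacheControlHeaderValue
    {
        Public = true,
        MaxAge = TimeSpan.FromMinutes(5)
    };
    return Results.Json(Enumerable.Range(1, 100).Select(i => new { Id = i, Name = $"Product {i}" }));
});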
Another point is that compression interacts with streaming. If your endpoint sends buffered JSON in one shot, compression is easy. If you are sending data progressively, such as large streamed responses, NDJSON, SSE-style traffic, or anything latency-sensitive where flushing behavior matters, compression may introduce buffering or delivery characteristics that work against the protocol. In those cases, the question is not just "can I compress this?" but "does compression preserve the delivery behaviour I actually want?" For real-time or progressive-delivery endpoints, the right answer is often endpoint-specific rather than global.
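As an illustration, here is a hypothetical NDJSON endpoint that flushes after each item. If its content type is added to the compressible MIME list, every flush also passes through the compression stream, so verify under your own setup that items still reach clients as promptly as the endpoint intends.
using System.Text.Json;
app.MapGet("/api/orders/stream", async (HttpContext context) =>
{
    // NDJSON: one JSON object per line, written and flushed as it is produced.
    // Whether this content type belongs in the compressible MIME list at all is
    // exactly the endpoint-specific decision described above.
    context.Response.ContentType = "application/x-ndjson";
    for (var i = 1; i <= 50; i++)
    {
        var line = JsonSerializer.Serialize(new { Id = i, Status = "Complete" });
        await context.Response.WriteAsync(line + "\n");
        await context.Response.Body.FlushAsync();
    }
});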
The security and resilience side should not be ignored either. Kestrel exposes minimum request and response data rate limits, and the documented defaults are 240 bytes per second with a 5 second grace period. That matters because slow clients, large request bodies, and decompression work can combine into unpleasant failure modes if you do not have sane guardrails. Kestrel also supports request header timeouts and request body size limits, and those settings should be part of your overall posture when you accept uploads or large bodies, compressed or otherwise. Compression is not an isolated tuning knob. It sits inside your wider transport hardening model.
Here is a fuller example with Kestrel limits and explicit compression setup:
using Microsoft.AspNetCore.ResponseCompression;
using System.IO.Compression;
var builder = WebApplication.CreateBuilder(args);
builder.WebHost.ConfigureKestrel(options =>
{
    options.Limits.RequestHeadersTimeout = TimeSpan.FromSeconds(30);
    options.Limits.MinRequestBodyDataRate = new(bytesPerSecond: 240, gracePeriod: TimeSpan.FromSeconds(5));
    options.Limits.MinResponseDataRate = new(bytesPerSecond: 240, gracePeriod: TimeSpan.FromSeconds(5));
    options.Limits.MaxRequestBodySize = 20 * 1024 * 1024; // 20 MB
});
builder.Services.AddResponseCompression(options =>
{
    options.EnableForHttps = true;
    options.Providers.Add<BrotliCompressionProvider>();
    options.Providers.Add<GzipCompressionProvider>();
    options.MimeTypes = ResponseCompressionDefaults.MimeTypes.Concat(new[]
    {
        "application/json",
        "application/problem+json"
    });
});
builder.Services.Configure<BrotliCompressionProviderOptions>(options =>
{
    options.Level = CompressionLevel.Fastest;
});
builder.Services.Configure<GzipCompressionProviderOptions>(options =>
{
    options.Level = CompressionLevel.Fastest;
});
var app = builder.Build();
app.UseResponseCompression();
app.MapGet("/api/report", () =>
{
    var report = Enumerable.Range(1, 10_000).Select(i => new
    {
        Id = i,
        Name = $"Item {i}",
        Status = i % 3 == 0 ? "Pending" : "Complete",
        Timestamp = DateTime.UtcNow.AddMinutes(-i)
    });
    return Results.Json(report);
});
app.Run();
That is the kind of configuration that belongs in a serious service. It does not just say "turn compression on." It defines the transport assumptions that go with it.
There is also a design lesson here for internal APIs and service-to-service calls. Developers sometimes assume compression matters only for internet-facing traffic. That is not always true. In cloud environments, especially across regions, VNets, or heavily loaded east-west traffic paths, payload size still matters. Compressing large JSON documents between services can reduce network cost and improve throughput. The catch is that the CPU trade-off now happens on your own estate at scale. If a service is already CPU-bound, compression can make it worse. If the network is the bottleneck, compression can help a lot. Again, this is why you benchmark real workloads instead of arguing from instinct.
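For the service-to-service case, the sending side of a compressed upload is only a few lines. This sketch gzips a JSON body and labels it with Content-Encoding so an endpoint sitting behind the request decompression middleware can unwrap it; the target address is a placeholder, and the CPU trade-off discussed above still applies on both ends.
using System.IO.Compression;
using System.Net.Http.Headers;
using System.Text.Json;
// Compress a JSON request body and mark it with Content-Encoding: gzip so the
// receiving service's request decompression middleware can unwrap it.
var documents = Enumerable.Range(1, 10_000).Select(i => new { Id = i, Name = $"Item {i}" });
var jsonBytes = JsonSerializer.SerializeToUtf8Bytes(documents);
using var buffer = new MemoryStream();
using (var gzip = new GZipStream(buffer, CompressionLevel.Fastest, leaveOpen: true))
{
    gzip.Write(jsonBytes);
}
var content = new ByteArrayContent(buffer.ToArray());
content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
content.Headers.ContentEncoding.Add("gzip");
// Placeholder address; in a real system this comes from configuration or service discovery.
using var client = new HttpClient { BaseAddress = new Uri("https://internal-import-service") };
var response = await client.PostAsync("/api/import/products", content);
Console.WriteLine($"Import responded with {(int)response.StatusCode}");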
If you want a clean set of rules that hold up in practice, they are these. Shape payloads properly first. Compress text-heavy responses by default. Prefer Brotli with gzip fallback. Leave already-compressed binaries alone. Use request decompression only for endpoints that genuinely need it. Prefer edge or proxy compression when your hosting stack supports it well. Treat HTTPS compression as a conscious security decision, not a default checkbox. Add limits and timeouts when you accept large request bodies. Measure the CPU and latency profile under realistic traffic before you call the job done. Those rules are not glamorous, but they are what separate a neat code sample from a production-grade API.
The big point is simple. Payload compression in ASP.NET Core is not a trick. It is part of transport engineering. When you treat it that way, the implementation becomes clearer. You stop asking whether you should "turn on gzip" and start asking the questions that actually matter: where should compression happen, which representations benefit, what security caveats apply, what limits protect the server, and whether your payloads deserved to be that large in the first place.
That's what serious API payload compression looks like in modern .NET. It's not complicated, but it does require intent.




