Distributed Transactions Without DTC
Coordinating APIs via Outbox + Event Grid

In the early days of enterprise .NET, distributed transactions seemed deceptively simple. The Distributed Transaction Coordinator (DTC) quietly linked database and message queue operations, giving us the illusion of atomicity across boundaries. The trouble was that the simplicity came at a cost: blocking coordination, slow two-phase commits, and a brittle web of locks that refused to scale and made me go grey overnight. Modern systems have outgrown that illusion. Once you split your architecture into multiple services, each with its own persistence and its own message broker, the idea of a global commit point stops being viable. Instead, we trade the illusion of atomicity for something more powerful: locally committed intent that the system can replay until the rest of the world catches up.
From DTC to the Outbox Pattern
The Outbox pattern exists because databases are honest and message brokers tell lies. When you insert a record into SQL Server, you can rely on the atomic guarantee. When you publish a message to Service Bus or Event Grid, there’s no guarantee that the message won’t be lost between your save and your publish. The Outbox acts as a local transaction buffer: you save your domain event into a database table as part of the same transaction that updates your entity. Once committed, a background process reads the Outbox table, forwards events to the broker, and marks them as dispatched.
In code, this looks deceptively simple:
// One local transaction covers both the entity change and the Outbox row
await using var tx = await db.Database.BeginTransactionAsync(stopToken);

order.MarkAsPaid();
db.Outbox.Add(new OutboxMessage(order.Id, "OrderPaid"));

await db.SaveChangesAsync(stopToken);
await tx.CommitAsync(stopToken); // both rows commit, or neither does
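The snippet assumes an OutboxMessage entity that the article never shows; a minimal sketch of one plausible shape (names and properties are illustrative, not prescribed by the pattern) might look like this:

```csharp
// Minimal Outbox entity: the event is captured at save time so the
// dispatcher never needs the originating aggregate in order to publish it.
public sealed class OutboxMessage
{
    public Guid Id { get; private set; } = Guid.NewGuid();
    public Guid EntityId { get; private set; }
    public string EventType { get; private set; }
    public string Payload { get; private set; }
    public DateTime CreatedUtc { get; private set; } = DateTime.UtcNow;
    public bool Dispatched { get; private set; }

    public OutboxMessage(Guid entityId, string eventType, string payload = "{}")
        => (EntityId, EventType, Payload) = (entityId, eventType, payload);

    public void MarkDispatched() => Dispatched = true;
}
```

Serialising the payload up front matters: by the time the dispatcher runs, the original request context is long gone, so the row must be self-describing.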
Later, a dispatcher picks up undelivered messages, publishes them to Event Grid, and marks them as sent. The beauty is that no matter what fails, whether the API, the broker, or the network, the system can replay its intent until it succeeds. The Outbox gives you “at least once” delivery without the weight of a distributed transaction.
Event Grid as the Coordinator
Azure Event Grid is not a queue; it’s a fabric. It’s designed for fan-out, filtering, and guaranteed delivery of discrete domain events. When you emit your Outbox messages into Event Grid, you’re not pushing a work item to a consumer; you’re declaring that something happened and letting interested systems react.
Imagine a Payment API and a Fulfilment API. The Payment API writes a record of a successful charge and publishes a PaymentCompleted event to Event Grid. The Fulfilment API subscribes to that event type, starts preparing the order, and emits its own OrderReadyForDispatch event once complete. Neither service blocks, and neither waits for a commit from the other. The “transaction” completes through a series of autonomous, verifiable steps.
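The publish side of this flow can use the Azure.Messaging.EventGrid SDK. A sketch, where the topic endpoint and access key are placeholders for configuration values:

```csharp
using Azure;
using Azure.Messaging.EventGrid;

// Publish one PaymentCompleted event to a custom Event Grid topic.
// Endpoint and key here are placeholders; in practice they come from config.
var client = new EventGridPublisherClient(
    new Uri("https://payments-topic.westeurope-1.eventgrid.azure.net/api/events"),
    new AzureKeyCredential("<topic-access-key>"));

var orderId = Guid.NewGuid();
var evt = new EventGridEvent(
    subject: $"payments/{orderId}",
    eventType: "PaymentCompleted",
    dataVersion: "1.0",
    data: new { OrderId = orderId });

await client.SendEventAsync(evt);
```

The Fulfilment API never sees this client; it only sees the event arrive through its own subscription, filtered by eventType.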
Each service owns only its own truth. The coordination happens through events, and Event Grid’s retry and dead-letter policies take care of reliability.
Designing Idempotent Handlers
The dark side of eventual consistency is duplication. Because Event Grid guarantees “at least once” delivery, handlers must be idempotent. That means each event must carry enough information to allow the receiver to check if it has already processed it. The most common strategy is to store a processed event log keyed by message ID or some other unique identifier.
if (await db.ProcessedEvents.AnyAsync(e => e.EventId == evt.Id, stopToken))
    return; // already handled
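The check alone isn’t enough; the check and the recording of the event must commit together, or a crash between them reopens the duplication window. A sketch of a full handler, assuming an EF Core ProcessedEvents table and a hypothetical HandlePaymentCompletedAsync for the domain work:

```csharp
// Idempotent handler: the dedup check, the state change, and the
// processed-event record all commit in one local transaction, so a
// redelivered event either sees the record or replays harmlessly.
await using var tx = await db.Database.BeginTransactionAsync(stopToken);

if (await db.ProcessedEvents.AnyAsync(e => e.EventId == evt.Id, stopToken))
    return; // duplicate delivery, nothing to do

await HandlePaymentCompletedAsync(evt, stopToken); // domain work
db.ProcessedEvents.Add(new ProcessedEvent(evt.Id, DateTime.UtcNow));

await db.SaveChangesAsync(stopToken);
await tx.CommitAsync(stopToken);
```

A unique index on EventId closes the remaining race: if two deliveries pass the check concurrently, the second SaveChangesAsync fails and that delivery can be safely discarded.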
Idempotency is not an afterthought. It’s what lets you replay events during recovery without breaking downstream state. In a distributed world, it’s the new transaction boundary.
Handling Failures Without Rollback
A DTC transaction rolls back if any participant fails. With Outbox-driven eventing, you don’t roll back; you compensate. If the Payment API succeeds but the Fulfilment API fails to respond, you can emit a PaymentRefundInitiated event after a timeout. Compensation is explicit, not automatic. It makes failure visible, and that’s a good thing. You can log, retry, or escalate without freezing the whole system, and you always know what’s going on.
Compensation workflows can be implemented with Durable Functions, a lightweight saga orchestrator, or even a background service that listens for missing transitions. The key is to treat the Outbox as the single source of truth for intent and let compensating events handle the recovery.
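In the background-service style, one lightweight shape is a watchdog that scans for payments with no recorded fulfilment transition after a deadline and emits the compensating event. A sketch, with the entity and property names invented for illustration:

```csharp
// Watchdog sketch: any completed payment older than the timeout with no
// fulfilment transition gets an explicit compensating event.
var deadline = DateTime.UtcNow - TimeSpan.FromMinutes(30);

var stalled = await db.Payments
    .Where(p => p.CompletedUtc < deadline
             && !p.FulfilmentStarted
             && !p.RefundInitiated)
    .ToListAsync(stopToken);

foreach (var payment in stalled)
{
    // Compensation goes through the same Outbox, so it inherits the
    // same at-least-once delivery guarantee as the original event.
    db.Outbox.Add(new OutboxMessage(payment.OrderId, "PaymentRefundInitiated"));
    payment.MarkRefundInitiated();
}

await db.SaveChangesAsync(stopToken);
```

Marking RefundInitiated inside the same transaction as the Outbox write keeps the watchdog itself idempotent: a rescan never compensates the same payment twice.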
Implementing the Outbox Dispatcher
A production quality Outbox dispatcher in .NET can be implemented as a hosted background service. It runs on a configurable schedule, reads unsent messages, publishes them, and updates their state.
public sealed class OutboxDispatcher(OutboxDbContext db, IEventPublisher publisher, ILogger<OutboxDispatcher> log)
    : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken stopToken)
    {
        while (!stopToken.IsCancellationRequested)
        {
            // Oldest first, in small batches: keeps ordering roughly stable
            // and bounds the work done in each polling cycle
            var pending = await db.OutboxMessages
                .Where(m => !m.Dispatched)
                .OrderBy(m => m.CreatedUtc)
                .Take(50)
                .ToListAsync(stopToken);

            foreach (var msg in pending)
            {
                try
                {
                    await publisher.PublishAsync(msg, stopToken);
                    msg.MarkDispatched();
                }
                catch (Exception ex)
                {
                    // Leave the message undispatched; the next cycle retries it
                    log.LogWarning(ex, "Failed to publish {MessageId}", msg.Id);
                }
            }

            await db.SaveChangesAsync(stopToken);
            await Task.Delay(TimeSpan.FromSeconds(10), stopToken);
        }
    }
}
This approach scales out, with one caveat: as written, multiple instances of the service would compete for the same batch. With proper locking or “claimed until” timestamps, each instance claims a disjoint set of rows, and you can safely scale the dispatcher horizontally.
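The “claimed until” idea can be expressed as a single atomic UPDATE that stamps a batch of rows before reading them back. A raw-SQL sketch for SQL Server with EF Core, where the table and column names are assumptions:

```csharp
// Atomically claim up to 50 unsent rows for this dispatcher instance.
// Rows whose lease has expired are fair game again, so a crashed
// instance's work is reclaimed automatically once its lease runs out.
var claimedUntil = DateTime.UtcNow.AddSeconds(30);

await db.Database.ExecuteSqlInterpolatedAsync($"""
    UPDATE TOP (50) OutboxMessages
    SET ClaimedBy = {instanceId}, ClaimedUntil = {claimedUntil}
    WHERE Dispatched = 0
      AND (ClaimedUntil IS NULL OR ClaimedUntil < {DateTime.UtcNow})
    """, stopToken);

// Read back only the rows this instance just claimed
var batch = await db.OutboxMessages
    .Where(m => m.ClaimedBy == instanceId && !m.Dispatched)
    .ToListAsync(stopToken);
```

Because the claim and the filter happen in one statement, two instances can never pull the same row, and no explicit lock is held beyond the UPDATE itself.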
Tracing and Observability
Outbox dispatchers form the connective layer of your system. Without visibility, diagnosing cross-service flows becomes impossible. Integrating OpenTelemetry tracing here is vital. Tag spans with message IDs, entity IDs, and event types, then export traces to Application Insights or Grafana Tempo. The result is an end-to-end trace showing how a command in one API triggers a chain of Outbox writes and Event Grid publishes that culminate in a state change somewhere else.
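In .NET, that usually means wrapping each publish in an Activity from a shared ActivitySource, which the OpenTelemetry SDK subscribes to by name. A fragment from inside the dispatcher loop, with the source name and tag keys as assumptions:

```csharp
using System.Diagnostics;

// Declared once per component; the OpenTelemetry SDK picks it up by name
// and exports the resulting spans to App Insights, Tempo, and so on.
private static readonly ActivitySource Source = new("OutboxDispatcher");

// Inside the publish loop:
using var activity = Source.StartActivity("outbox.publish", ActivityKind.Producer);
activity?.SetTag("messaging.message.id", msg.Id);
activity?.SetTag("outbox.event.type", msg.EventType);
activity?.SetTag("outbox.entity.id", msg.EntityId);

await publisher.PublishAsync(msg, stopToken);
activity?.SetTag("outbox.dispatched", true);
```

The null-conditional calls matter: StartActivity returns null when no listener is attached, so tracing costs almost nothing when it is switched off.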
Testing
Testing eventual consistency feels counterintuitive because you can’t assert instantaneous state. Instead, you assert convergence over time. Write integration tests that wait until all events for a given aggregate have propagated. Use in-memory Event Grid simulators during tests and verify that Outbox records eventually reach a dispatched state. This shows whether the system eventually stabilises into the correct outcome.
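A small polling helper makes these convergence assertions readable. A sketch, with the helper name and the xUnit-style usage being illustrative:

```csharp
// Polls a condition until it holds or the timeout expires; tests then
// assert convergence ("eventually true") instead of instantaneous state.
static async Task<bool> EventuallyAsync(
    Func<Task<bool>> condition,
    TimeSpan timeout,
    TimeSpan? pollEvery = null)
{
    var interval = pollEvery ?? TimeSpan.FromMilliseconds(200);
    var giveUpAt = DateTime.UtcNow + timeout;

    while (DateTime.UtcNow < giveUpAt)
    {
        if (await condition()) return true;
        await Task.Delay(interval);
    }
    return await condition(); // one final check at the deadline
}

// Usage in a test:
// Assert.True(await EventuallyAsync(
//     () => db.OutboxMessages.AllAsync(m => m.Dispatched),
//     TimeSpan.FromSeconds(30)));
```

The timeout is the test’s definition of “eventually”; keep it generous enough to survive CI jitter, and let a failure mean the system genuinely never converged.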
Evolving the Pattern
The Outbox + Event Grid combination is only the beginning. You can extend it with an Inbox pattern to handle deduplication, or introduce a “Transaction Log Tailing” approach where change data capture streams replace explicit Outbox tables. You can even replace Event Grid with Service Bus Topics for ordered message delivery when required. What we really care about is that the transaction boundary remains local and the communication remains event-driven.
Distributed systems can’t pretend to be monoliths. The Outbox pattern and Azure Event Grid let you model cross-service coordination without DTC. You gain scalability, traceability, and resilience at the cost of simplicity, and that’s a trade worth making more often than not.





