
Building a Camunda-Like Workflow Tracker Without BPMN in .NET


Camunda is a strong, mature platform, and it solves hard problems very well. But it's also designed to cover a huge surface area: visual modelling, execution, orchestration, human tasks, retries, compensation, and governance. If you have a full development team that already owns the business logic, the data model, and the deployment pipeline, you might not need all that capability. What you might just need is visibility, auditability, and the ability to answer questions about how work actually flows through the system. In those cases, a lightweight workflow tracker gives you most of the value at a fraction of the complexity, and lets engineers stay in code rather than in diagrams.

Most workflow engines start from the wrong place.

They begin with diagrams, XML, modelling tools, and an assumption that your business logic wants to be expressed as a flowchart. In practice, most production systems already have workflows. They just live in code, databases, queues, APIs, and human decisions. The missing piece isn't orchestration logic. It's visibility.

All you might want is a reliable way to answer simple questions. What step is this case on? How long did it spend there? Who touched it? What changed? Why did it fall off the happy path? Those questions are about tracking, not modelling.

Below, we’ll think about how we could build a stripped-down workflow tracker in .NET. No BPMN. No visual modeller. No engine deciding what happens next. The application decides that. Our system records what happened, when it happened, and why. Then it makes that data easy to query.

The result looks a lot like the most valuable parts of Camunda, but without the weight.

The core idea

A workflow is not a graph. It is a sequence of events over time.

Every time a piece of work moves forward, something observable happens. A step starts. A step completes. A decision is made. Data changes. Someone intervenes. If we capture those events in a consistent way, we can reconstruct the workflow after the fact with surprising power.

The tracker does not control execution. It listens.

This single constraint simplifies everything. There is no retry logic here. No compensation. No tokens moving through gateways. Your application owns those concerns already. The tracker’s only responsibility is to record transitions and state changes with strong guarantees.

Defining a workflow instance

We start by defining what a workflow instance means in our system.

An instance represents one unit of work moving through a process. That might be a loan application, an insurance submission, a customer onboarding case, or a background job. The tracker does not care.

An instance has a stable identifier, a workflow type, and some high-level metadata. It is created once and never deleted.

In C#, that looks like a simple aggregate.

public sealed class WorkflowInstance
{
    public long Id { get; init; }
    public string WorkflowType { get; init; }
    public string ExternalReference { get; init; }
    public DateTimeOffset CreatedAt { get; init; }
}

The ExternalReference is critical. This is how the rest of your system relates back to the workflow. It might be a ProgramId, SubmissionId, OrderId, or something similar.

Modelling steps as facts, not definitions

Traditional engines define steps up front. We don't.

Instead, every time the application reaches a meaningful point, it emits a step event. A step is identified by a string key that has meaning to the domain.

Examples might be submission_received, assessed, or review_started.

The tracker does not validate these names. It records them.

A step event captures when a step started, when it completed, who or what performed it, and whether it ended successfully.

public sealed class WorkflowStepEvent
{
    public long Id { get; init; }
    public long WorkflowInstanceId { get; init; }
    public string StepKey { get; init; }
    public StepStatus Status { get; init; }
    public DateTimeOffset Timestamp { get; init; }
    public string? Actor { get; init; }
}

The status might be Started, Completed, Failed, or Cancelled. That is enough to reconstruct timelines and durations later.
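The article names the statuses but never defines the type; as a sketch, they fit in a small enum:

```csharp
// Sketch of the StepStatus enum implied by the text: four terminal-or-open
// states are enough to reconstruct timelines and durations later.
public enum StepStatus
{
    Started,
    Completed,
    Failed,
    Cancelled
}
```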

This model deliberately allows repeated steps. If a case goes back to manual review three times, you will see three distinct events. That turns out to be extremely valuable for analytics.

Capturing decisions and data changes

Steps alone are not enough. Many workflows branch based on data. We want to record those decisions without baking logic into the tracker.

Whenever the application makes a decision that affects flow, it emits a decision event.

public sealed class WorkflowDecisionEvent
{
    public long Id { get; init; }
    public long WorkflowInstanceId { get; init; }
    public string DecisionKey { get; init; }
    public string Outcome { get; init; }
    public DateTimeOffset Timestamp { get; init; }
}

This might represent something like risk_outcome = refer or eligibility = declined. The tracker does not care how the decision was made. It just records the fact that it happened.

Optionally, we can also capture variable snapshots. These are key-value pairs recorded at points of interest.

public sealed class WorkflowVariableSnapshot
{
    public long WorkflowInstanceId { get; init; }
    public string Name { get; init; }
    public string Value { get; init; }
    public DateTimeOffset RecordedAt { get; init; }
}

This is not a full variable store. It is an audit trail. You record what mattered, when it mattered.

Writing events safely

The most important technical requirement is durability. If your application says a step completed, the tracker must not lose that fact.

The simplest approach is an append-only relational schema. Inserts only. No updates except for correcting mistakes explicitly.

Every call into the tracker is a single database transaction that inserts one or more events. There is no orchestration state to lock or mutate. In practice, this scales extremely well.

You can expose the tracker via a small internal API.

public interface IWorkflowTracker
{
    Task RecordStepAsync(
        long instanceId,
        string stepKey,
        StepStatus status,
        string? actor,
        CancellationToken cancellationToken);

    Task RecordDecisionAsync(
        long instanceId,
        string decisionKey,
        string outcome,
        CancellationToken cancellationToken);
}

The application calls this at natural boundaries. After a handler completes. When a background job finishes. When a user clicks approve.

This keeps the integration friction low.
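As a sketch, a call site at one of those boundaries might look like the following. The handler, its wiring, and the in-memory tracker are hypothetical; the enum and interface are repeated (with the decision method omitted) so the example is self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Repeated from earlier so this sketch compiles on its own.
public enum StepStatus { Started, Completed, Failed, Cancelled }

public interface IWorkflowTracker
{
    Task RecordStepAsync(long instanceId, string stepKey, StepStatus status,
        string? actor, CancellationToken cancellationToken);
}

// Hypothetical handler: the only tracker-specific code is the single
// call after the domain work succeeds.
public sealed class ApproveSubmissionHandler
{
    private readonly IWorkflowTracker _tracker;

    public ApproveSubmissionHandler(IWorkflowTracker tracker) => _tracker = tracker;

    public async Task HandleAsync(long instanceId, string approver, CancellationToken ct)
    {
        // ... existing domain logic: validate, persist the approval ...

        // Record the fact for the tracker once the work is done.
        await _tracker.RecordStepAsync(instanceId, "approved", StepStatus.Completed, approver, ct);
    }
}

// Minimal in-memory implementation, handy for unit tests.
public sealed class InMemoryTracker : IWorkflowTracker
{
    public List<(long InstanceId, string StepKey, StepStatus Status, string? Actor)> Steps { get; } = new();

    public Task RecordStepAsync(long instanceId, string stepKey, StepStatus status,
        string? actor, CancellationToken cancellationToken)
    {
        Steps.Add((instanceId, stepKey, status, actor));
        return Task.CompletedTask;
    }
}
```

Because the tracker is an interface with no orchestration state, swapping the in-memory version for a database-backed one changes nothing at the call sites.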

Reconstructing the workflow timeline

Once events are stored, the interesting part begins.

To reconstruct the current state of a workflow, you query all events for an instance ordered by timestamp. From that stream, you derive projections.

You can compute the current step by finding the latest Started event without a corresponding Completed or Failed event. You can calculate durations by pairing Started and Completed timestamps. You can detect loops by counting repeated step keys.

This logic lives in read models, not the write path.

public sealed class WorkflowTimeline
{
    public IReadOnlyList<StepExecution> Steps { get; init; }
    public IReadOnlyList<DecisionRecord> Decisions { get; init; }
    public TimeSpan TotalElapsed { get; init; }
}

Because everything is immutable, rebuilding projections is deterministic and safe. You can even change projection logic later and re-run it over historical data.
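As a sketch, the current-step projection described above can be derived in a few lines. The event shape here is simplified to just the fields the projection needs, and the enum is repeated for self-containment:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Repeated from earlier so this sketch compiles on its own.
public enum StepStatus { Started, Completed, Failed, Cancelled }

// Simplified event shape: only what the projection needs.
public sealed record StepEvent(string StepKey, StepStatus Status, DateTimeOffset Timestamp);

public static class TimelineProjection
{
    // The current step is the most recent Started step that has not
    // since seen a Completed, Failed, or Cancelled event.
    public static string? CurrentStep(IEnumerable<StepEvent> events)
    {
        var open = new Dictionary<string, DateTimeOffset>();
        foreach (var ev in events.OrderBy(e => e.Timestamp))
        {
            if (ev.Status == StepStatus.Started)
                open[ev.StepKey] = ev.Timestamp;   // step (re)entered
            else
                open.Remove(ev.StepKey);           // step left, however it ended
        }
        return open.OrderByDescending(kv => kv.Value)
                   .Select(kv => kv.Key)
                   .FirstOrDefault();
    }
}
```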

Analytics

This is where the tracker earns its keep. You can now answer questions like:

  • How long does each step take on average?

  • Where do cases get stuck?

  • How often do we bounce back to manual review?

  • Which decisions correlate with long cycle times?

  • How many workflows are active right now, and where are they sitting?

These are simple SQL queries over event tables. No engine internals. No proprietary formats.

Because step keys are just strings, teams can evolve workflows without migrations. New steps appear naturally in analytics as they are used.
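The first of those questions has a direct SQL form (pair Started and Completed timestamps, then GROUP BY step key). As a sketch, the same projection over in-memory events looks like this in C#; the event shape is simplified, and the enum is repeated for self-containment:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Repeated from earlier so this sketch compiles on its own.
public enum StepStatus { Started, Completed, Failed, Cancelled }

public sealed record StepEvent(long InstanceId, string StepKey, StepStatus Status, DateTimeOffset Timestamp);

public static class StepAnalytics
{
    // Average time per step, computed by pairing each Started event with
    // the next terminal event for the same instance and step key.
    public static Dictionary<string, TimeSpan> AverageDurations(IEnumerable<StepEvent> events)
    {
        var durations = new List<(string StepKey, TimeSpan Elapsed)>();
        foreach (var grp in events.GroupBy(e => (e.InstanceId, e.StepKey)))
        {
            DateTimeOffset? started = null;
            foreach (var ev in grp.OrderBy(e => e.Timestamp))
            {
                if (ev.Status == StepStatus.Started)
                    started = ev.Timestamp;
                else if (started is not null)
                {
                    durations.Add((ev.StepKey, ev.Timestamp - started.Value));
                    started = null;   // repeated steps yield multiple pairs
                }
            }
        }
        return durations
            .GroupBy(d => d.StepKey)
            .ToDictionary(g => g.Key,
                          g => TimeSpan.FromTicks((long)g.Average(d => d.Elapsed.Ticks)));
    }
}
```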

Heat Mapper

One of Camunda's main selling points is the heatmap view that highlights where instances spend time (hot spots) and how often paths are taken.

You could build a very similar UI even without BPMN. You just need to choose a visual shape for your workflow, then paint it with metrics.

The simplest approach is to render your workflow as a directed graph where each step key is a node and each observed transition between steps is an edge. You don’t need a modeller. You infer the graph from real executions: for each workflow instance, sort step events by time and emit transitions like stepA -> stepB. Aggregate those transitions across all instances and you get a live map of how the process actually runs.
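Inferring that edge list from the event log is a small projection. As a sketch (considering only step starts, which is enough to order transitions):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Only Started events matter for inferring transitions.
public sealed record StepStart(long InstanceId, string StepKey, DateTimeOffset Timestamp);

public static class TransitionMap
{
    // For each instance, order its step starts by time and count every
    // observed "from -> to" transition across all instances.
    public static Dictionary<(string From, string To), int> Infer(IEnumerable<StepStart> starts)
    {
        var edges = new Dictionary<(string From, string To), int>();
        foreach (var instance in starts.GroupBy(s => s.InstanceId))
        {
            var ordered = instance.OrderBy(s => s.Timestamp).Select(s => s.StepKey).ToList();
            for (var i = 1; i < ordered.Count; i++)
            {
                var edge = (ordered[i - 1], ordered[i]);
                edges[edge] = edges.TryGetValue(edge, out var n) ? n + 1 : 1;
            }
        }
        return edges;
    }
}
```

Aggregated over a time window, this dictionary is exactly the node-and-edge data a graph UI needs.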

Once you have nodes and edges, you compute heat metrics:

  • Node heat (time): average or p95 time spent in that step. You get this by pairing Started and Completed timestamps per step execution.

  • Node heat (volume): how many instances entered the step in a time window.

  • Edge heat (frequency): how often a transition happens, optionally as a percentage of all outgoing transitions from the source step.

  • Stuck heat: count of instances currently “in” a step (Started with no Completed/Failed yet), and how long they’ve been there.
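The stuck-heat metric above is another straightforward projection. As a sketch over simplified event types (enum and record repeated for self-containment):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Repeated from earlier so this sketch compiles on its own.
public enum StepStatus { Started, Completed, Failed, Cancelled }

public sealed record StepEvent(long InstanceId, string StepKey, StepStatus Status, DateTimeOffset Timestamp);

public static class StuckHeat
{
    // An instance is "in" a step when its latest event for that step is
    // Started with no later terminal event; report how long it has sat there.
    public static List<(string StepKey, long InstanceId, TimeSpan Stuck)> Compute(
        IEnumerable<StepEvent> events, DateTimeOffset now)
    {
        var result = new List<(string StepKey, long InstanceId, TimeSpan Stuck)>();
        foreach (var grp in events.GroupBy(e => (e.InstanceId, e.StepKey)))
        {
            var last = grp.OrderBy(e => e.Timestamp).Last();
            if (last.Status == StepStatus.Started)
                result.Add((last.StepKey, last.InstanceId, now - last.Timestamp));
        }
        return result;
    }
}
```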

From a UI point of view, you can build it in React with a graph library like React Flow, Cytoscape.js, or D3. You lay out nodes (either auto-layout or a simple left-to-right rank layout), draw edges, and then colour nodes/edges based on the selected metric. Add a time window filter (last 24h, 7d, 30d), environment filter, and a toggle between average and p95, and you’ve got something that feels very close to Camunda’s operational heatmaps.

A practical structure that works well is:

  • Overview page: “Workflow map” with heat applied (time or frequency).

  • Step drilldown: click a node to see distributions (p50/p95), failure rate, top decision outcomes, and a list of slowest instances.

  • Instance timeline: click an instance to see the ordered event stream with durations, actors, and variable snapshots at key points.

The key limitation is that you won’t automatically get a nice “business diagram” unless you define one. But that’s often a feature, not a bug. Your map becomes an honest picture of runtime behaviour. If you want it to look more like a designed process, you can optionally let teams maintain a lightweight “layout config” (node positions, grouping into lanes, friendly labels) without turning it into BPMN.

Comparing this to a full workflow engine

This approach deliberately gives up control in exchange for clarity.

You cannot model flows visually here. You cannot press a button to advance a token. That is intentional. Those features are expensive to operate and rarely reflect how systems actually behave. What you gain is observability, auditability, and freedom. Your application logic stays in code where it belongs. The tracker becomes a shared language across teams. In many places, this covers eighty percent of the value people think they need BPMN for.

When this approach is not enough

There are real cases where orchestration engines make sense. Long-running sagas with retries and compensation. Complex asynchronous dependencies across systems. Human task scheduling with SLAs and escalations.

The key insight is that you do not need to start there.

A lightweight tracker often becomes the foundation even when a full engine is later introduced. It provides ground truth. It tells you what your workflows actually look like, not what the diagram says they should look like.

If you strip workflow down to its essence, it is just time, decisions, and movement. By recording those facts cleanly, you unlock powerful insight without forcing your system into a modelling straitjacket. This kind of tracker is fast to build, cheap to run, and easy to explain.
