Part 6. Testing Strategy for Modular Monoliths: Beyond Unit Tests

By the time you reach this point in the series, you have made a lot of architectural promises. You have said that modules are isolated, that behaviour is local, that data is owned, that communication is explicit, and that authorisation does not leak across boundaries.
Testing is where those claims get audited. Not by diagrams, and not by good intentions, but by code that either passes cleanly or fails in ways you did not expect. Tests have a way of surfacing the truth about a design, because they force the architecture to be exercised rather than explained.

This is the part many developers quietly dread. It exposes the gap between architecture that sounds good in conversation and architecture that actually holds up under pressure. When tests are aligned with real boundaries, they reinforce the design. When they are not, they reveal exactly where those promises start to break down.
Why Traditional Testing Advice Breaks Down Here
Most traditional testing advice assumes one of two worlds. Either you are building a small application where unit tests are sufficient, or you are working in a distributed system where end-to-end tests are unavoidable. A modular monolith sits in an awkward middle ground that neither model fits particularly well.
If you only write unit tests, you miss important classes of failure. Broken wiring goes unnoticed. Boundary violations slip through. Configuration errors hide until runtime. The system looks healthy in isolation, but the pieces do not actually fit together the way you think they do.
At the other extreme, relying heavily on end-to-end tests creates a different set of problems. They are slow to run, brittle to maintain, and hard to diagnose when something fails. Over time, people stop trusting them, and once trust is gone, the tests stop doing their job.

The mistake is treating testing as a ladder, moving neatly from unit tests to integration tests to end-to-end tests. In a modular monolith, testing works better as layers of confidence, with each layer answering a different question about the system.
The Question Your Tests Should Answer
Before writing any test, I ask one thing:
What architectural promise am I trying to protect?
If you can’t answer that, the test probably doesn’t matter.
Level 1: Slice Tests (Your Primary Workhorse)
Vertical slices change what “unit testing” really means. The unit is no longer a method, a class, or a service. The unit is the use case itself, the thing the system actually does.
A slice test exercises the full behaviour of that use case. It runs the handler, applies validation, touches persistence, and enforces business rules. What it deliberately avoids is anything outside the slice’s responsibility. There is no HTTP layer involved. There is no serialisation. There is no real infrastructure beyond what the module owns.

The key shift is where mocking happens. Mocks belong at the module boundary, not inside the slice. Inside a module, you want reality. You want real code paths and real interactions, because that is how you gain confidence that the module actually works.
A CreateUser slice test, for example, should use a real UsersDbContext backed by an in-memory or test database. It should not mock repositories, and it should not mock EF. If the slice cannot work with its own real persistence model in a test, that is a signal worth paying attention to.
You want to know:
Does this behaviour actually work?
Not:
Can I make this method return the value I expect?
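To make that concrete, here is a minimal, self-contained sketch of a CreateUser slice test. In a real codebase the context would be the module’s actual EF Core UsersDbContext backed by an in-memory or test database; here a small hand-rolled stand-in keeps the example runnable on its own, and CreateUserHandler, User, and the validation rules are all illustrative, not code from the series:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Run the slice test (top-level statements must precede type declarations).
SliceTest.Run();

// Illustrative entity and context. A real slice test would use the module's
// actual UsersDbContext (EF Core in-memory or test database), not a stand-in.
public class User
{
    public int Id { get; set; }
    public string Email { get; set; } = "";
}

public class UsersDbContext
{
    public List<User> Users { get; } = new();

    public void SaveChanges()
    {
        // Simulate key assignment on save, as a real database would do.
        var next = Users.Count == 0 ? 1 : Users.Max(u => u.Id) + 1;
        foreach (var u in Users.Where(u => u.Id == 0)) u.Id = next++;
    }
}

// The slice under test: handler + validation + persistence, no HTTP, no mocks.
public class CreateUserHandler
{
    private readonly UsersDbContext _db;
    public CreateUserHandler(UsersDbContext db) => _db = db;

    public int Handle(string email)
    {
        if (string.IsNullOrWhiteSpace(email) || !email.Contains('@'))
            throw new ArgumentException("Invalid email", nameof(email));
        if (_db.Users.Any(u => u.Email == email))
            throw new InvalidOperationException("Email already registered");

        var user = new User { Email = email };
        _db.Users.Add(user);
        _db.SaveChanges();
        return user.Id;
    }
}

public static class SliceTest
{
    public static void Run()
    {
        var db = new UsersDbContext();            // real persistence path, nothing mocked
        var handler = new CreateUserHandler(db);

        var id = handler.Handle("alice@example.com");

        if (id == 0) throw new Exception("user was not persisted");
        if (db.Users.Single().Email != "alice@example.com")
            throw new Exception("wrong user persisted");
        Console.WriteLine("CreateUser slice test passed");
    }
}
```

The test asserts behaviour (a user exists with an id and the right email), not interactions with mocks, which is the whole point of testing at the slice level.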
Why Mocking Internals Destroys Confidence
Mocking internals buys you speed, but it does so at the cost of truth. A test that mocks repositories, DbContexts, or domain behaviour is not really testing the system. It is testing your expectations about how the system should behave.
In a modular monolith, most bugs do not live in pure logic. They live in mapping, configuration, wiring, and persistence assumptions. Those are exactly the areas that heavy mocking conveniently hides, which is why everything looks fine in tests.

When a slice test fails, you want it to fail because the system is wrong. You do not want it to fail because the test was overly clever or made assumptions that no longer hold. Tests that lean toward reality may be a little slower, but they give you something far more valuable: confidence that the behaviour you see in production is the behaviour you exercised in tests.
Level 2: Module Integration Tests (Boundaries Under Load)
Slice tests tell you that a feature works in isolation. They do not tell you whether modules cooperate correctly once they start talking to each other. That gap is where module integration tests earn their keep.
A module integration test boots a module in a way that is close to reality. It uses real persistence. It uses real in-process messaging. And it exercises multiple slices together, allowing behaviour to flow across a boundary rather than stopping at it.
For example, creating a user publishes an event. The Billing module reacts to that event. A billing profile is created. There is no HTTP involved and no UI in the way, just behaviour moving from one module to another exactly as it would at runtime.
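That flow can be sketched with a minimal in-process event bus. Real modules would use a proper dispatcher (MediatR or similar) and real persistence; EventBus, UsersModule, BillingModule, and UserCreated below are illustrative stand-ins, not the series’ actual code:

```csharp
using System;
using System.Collections.Generic;

// Wire both modules to the same in-process bus and exercise the flow.
var bus = new EventBus();
var billing = new BillingModule(bus);
var users = new UsersModule(bus);

var userId = users.CreateUser("carol@example.com");

if (!billing.Profiles.ContainsKey(userId))
    throw new Exception("billing profile was not created");
Console.WriteLine("module integration test passed");

// The published contract between the two modules.
public record UserCreated(Guid UserId, string Email);

// A minimal in-process event bus; a real system would use a proper dispatcher.
public class EventBus
{
    private readonly List<Action<object>> _handlers = new();

    public void Subscribe<T>(Action<T> handler) =>
        _handlers.Add(e => { if (e is T t) handler(t); });

    public void Publish(object evt)
    {
        foreach (var h in _handlers) h(evt);
    }
}

// Users module: creating a user publishes an event, nothing more.
public class UsersModule
{
    private readonly EventBus _bus;
    public UsersModule(EventBus bus) => _bus = bus;

    public Guid CreateUser(string email)
    {
        var id = Guid.NewGuid();
        _bus.Publish(new UserCreated(id, email));
        return id;
    }
}

// Billing module: reacts to the event by creating a billing profile.
public class BillingModule
{
    public Dictionary<Guid, string> Profiles { get; } = new();

    public BillingModule(EventBus bus) =>
        bus.Subscribe<UserCreated>(e => Profiles[e.UserId] = e.Email);
}
```

Note what the test exercises: behaviour crossing a module boundary through an event, with no HTTP and no UI in the way, exactly as it would at runtime.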
This level of testing answers a different question: do my modules interact the way I believe they do? When these tests fail, the cause is usually not a bug in business logic. It is an architectural failure. An event contract changed. A handler was not registered. A transaction boundary moved. A dependency leaked across a boundary.
That distinction matters, because architectural failures require architectural fixes. Treating them like logic bugs only papers over the real problem.
Level 3: Contract Tests (Protecting Module Agreements)
If modules communicate through contracts, those contracts deserve tests of their own. A contract test does not care how something is implemented. It cares about shape and meaning, and whether both sides still agree on what is being exchanged.
For example, an event is expected to contain specific fields. A query should return data in a particular form. A command should accept a defined set of inputs. None of this is about behaviour inside a module. It is about the agreement between modules.
These tests protect you from one of the most dangerous changes in a modular system: “I just renamed this property, nothing else uses it.” Something always uses it. The damage just happens quietly if nothing is watching. Contract tests live close to the contract itself. They run fast. They fail loudly. And when they fail, they fail for a good reason. They are cheap insurance against silent breakage in places where the system relies on trust between modules. If modules communicate through contracts, those contracts need to be treated as first-class citizens. A contract test exists for one purpose only, to protect the agreement between modules. It does not care how something is implemented internally. It cares about shape, semantics, and intent.
An event must contain specific fields with specific meaning. A query must return data in an expected form. A command must accept a defined set of inputs and reject anything else. These tests are not about business behaviour, they are about ensuring that two independently evolving pieces of the system still understand each other. They protect you from one of the most dangerous changes in modular systems, “I just renamed this property, nothing else uses it”. Something always uses it. The problem is that without contract tests, you do not find out until much later, and usually in a place far removed from the change.
Good contract tests live close to the contract. They run fast. They fail loudly. And when they fail, they tell you exactly what agreement was broken. They are cheap insurance against silent, slow-burn failures that otherwise erode confidence in the architecture over time.
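One cheap way to pin a contract is to assert its shape directly. The sketch below uses reflection to lock down an illustrative UserCreated event; the record and its fields are assumptions for the example, not code from the series:

```csharp
using System;
using System.Linq;

UserCreatedContractTest.Run();

// The published contract. Renaming a property here should make the
// contract test below fail loudly. Field names are illustrative.
public record UserCreated(Guid UserId, string Email, DateTime OccurredAtUtc);

public static class UserCreatedContractTest
{
    // The shape both sides agreed on, pinned explicitly in one place.
    private static readonly (string Name, Type Type)[] Expected =
    {
        ("UserId", typeof(Guid)),
        ("Email", typeof(string)),
        ("OccurredAtUtc", typeof(DateTime)),
    };

    public static void Run()
    {
        var actual = typeof(UserCreated)
            .GetProperties()
            .Select(p => (p.Name, p.PropertyType))
            .ToArray();

        foreach (var (name, type) in Expected)
            if (!actual.Contains((name, type)))
                throw new Exception(
                    $"contract broken: {name} ({type.Name}) is missing or changed");

        Console.WriteLine("UserCreated contract intact");
    }
}
```

A “rename this property” refactor now fails this test immediately, at the contract, instead of failing quietly somewhere in a downstream module.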
Level 4: Architecture Tests (Your Early Warning System)
Architecture tests sit at the top of the stack, and they do not test behaviour at all. They test structure. That is precisely why they are so effective, and often so uncomfortable.

These tests ask blunt questions. Does Billing depend on Users.Infrastructure? Does any module reference another module’s DbContext? Are internals leaking across assembly boundaries? None of these questions is about whether the application appears to work today.
They do not care if the app runs. They care if the architecture is honest. When an architecture test fails, it is usually because someone made a small, reasonable-looking change that would have turned into a serious problem six months later.
Architecture Tests Need Tooling Support
Architecture tests don’t work on intent alone. They need enforcement at build time.
In .NET, that usually means using a structural testing framework that can inspect assemblies and dependencies directly. The two I see used most often in real systems are:
NetArchTest – simple, focused, and very effective for dependency rules
ArchUnitNET – more expressive, closer to formal architectural modelling
The specific tool matters less than what you do with it.
A good architecture test doesn’t try to prove the system works. It tries to prove the system can’t cheat.
For example, this single test prevents one of the most damaging boundary violations you can make:
Types.InAssembly(typeof(BillingRoot).Assembly)
    .ShouldNot()
    .HaveDependencyOn("Users.Infrastructure")
    .GetResult()
    .IsSuccessful
    .Should()
    .BeTrue();
This test doesn’t care about behaviour. It doesn’t care about data. It cares about honesty.
When it fails, it fails because someone crossed a boundary they weren’t meant to cross. That’s exactly when you want the build to break.
If you only ever write one kind of architecture test, write this kind.
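The other blunt questions from earlier, module DbContext references and leaking internals, can be expressed the same way. A hedged sketch using NetArchTest with FluentAssertions and xUnit; BillingRoot, the namespace names, and the Users.* / Billing.* naming convention are illustrative assumptions:

```csharp
using FluentAssertions;
using NetArchTest.Rules;
using Xunit;

public class BillingBoundaryTests
{
    // No module may reach another module's persistence layer directly.
    // "Users.Persistence" is a hypothetical namespace for this example.
    [Fact]
    public void Billing_does_not_touch_users_persistence()
    {
        Types.InAssembly(typeof(BillingRoot).Assembly)
            .ShouldNot()
            .HaveDependencyOnAny("Users.Infrastructure", "Users.Persistence")
            .GetResult()
            .IsSuccessful
            .Should()
            .BeTrue();
    }

    // Infrastructure types stay internal so other assemblies cannot see them.
    [Fact]
    public void Billing_infrastructure_is_not_public()
    {
        Types.InAssembly(typeof(BillingRoot).Assembly)
            .That()
            .ResideInNamespace("Billing.Infrastructure")
            .Should()
            .NotBePublic()
            .GetResult()
            .IsSuccessful
            .Should()
            .BeTrue();
    }
}
```

Rules like these run in seconds as part of the normal test suite, which is what makes them an early warning system rather than a code-review checklist.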
Where End-to-End Tests Actually Fit
End-to-end tests are not useless. They are just overused. In a modular monolith, their role is much narrower than many people expect.
They should be few in number and focused only on critical paths. Their job is not to validate business rules or edge cases. It is to confirm that the system is wired together correctly at the highest level. Authentication works. Routing is correct. Major workflows do not explode when exercised end to end.
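A sketch of what such a test might look like in ASP.NET Core, assuming the Microsoft.AspNetCore.Mvc.Testing package, xUnit, and a hypothetical /health endpooint is illustrative; Program is the application’s entry point:

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

// Boots the whole application in memory: DI container, routing, middleware.
// Assumes the app exposes its Program class to the test project and has a
// /health endpoint; both are assumptions for this sketch.
public class CriticalPathTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly WebApplicationFactory<Program> _factory;

    public CriticalPathTests(WebApplicationFactory<Program> factory) =>
        _factory = factory;

    [Fact]
    public async Task Health_endpoint_responds()
    {
        var client = _factory.CreateClient();

        var response = await client.GetAsync("/health");

        // Only the wiring is under test here, not business rules.
        response.EnsureSuccessStatusCode();
    }
}
```

A handful of tests like this one confirm the top-level wiring; everything below that belongs to slice, module integration, and contract tests.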
If you find yourself relying on end-to-end tests to validate business logic, that is usually a sign that something is missing at a lower level. You are compensating for gaps in slice tests, module integration tests, or contract tests. That is a smell.

End-to-end tests are expensive, blunt instruments. Used sparingly, they provide confidence. Used as a safety net for everything else, they slow teams down and hide the real problems.
The Testing Pyramid Rewritten
In practice, the balance looks very different from the traditional testing pyramid. In a modular monolith, the weight shifts toward the places where behaviour and boundaries actually live.
You want many slice tests, because they validate real use cases in isolation. You want a healthy number of module integration tests, because they confirm that modules cooperate the way you think they do. On top of that, you add a thin but deliberate layer of contract and architecture tests to protect agreements and structural integrity. End-to-end tests should exist, but only in very small numbers.
If your test suite is inverted, with lots of end-to-end tests propping up a weak foundation, something upstream is wrong. Tests are compensating for missing confidence elsewhere, and that is usually a sign that the architecture itself needs attention.
Tests as Architectural Pressure
Here’s something that doesn’t get said often enough:
If something is hard to test, it’s probably badly designed.
Modular monoliths surface this fast.
If you find yourself struggling to test:
A slice
A module boundary
A permission rule
That struggle is feedback.
Ignore it and you’ll pay later.
Listen to it and the design improves.
Where We Are Now
At this point in the series, you have:
Real module boundaries
Feature-centric design
Isolated data
Honest communication
Localised authorisation
A testing strategy that reinforces all of it
What’s left is the question everyone eventually asks.
How do you know when a modular monolith is ready to split?
That’s not a technical question. It’s a judgment call.