How To Test Event-Sourced Applications

Yves Lorphelin  |  03 August 2021

In this article, Yves Lorphelin, Principal Solution Architect at Event Store, explains how to test event-sourced applications during development to help validate the model and to assess the impact of changes on existing systems.


There is a lot of content already available on how to model and implement event-sourced applications. There are also numerous libraries to help build such systems.

Unfortunately there is not much content out there around testing event-sourced applications. This article discusses this subject and gives some general guidance on what to test and how to test event-sourced systems.

The testing style discussed in this article has 3 purposes:

  • During initial development, help validate the model and its correctness.
  • Assess the impact of changes on the existing system.
  • Documentation: all parts will have tests covering their inputs and outputs.

These goals have the following constraints: the tests are heavily focused on one and only one part of the whole system. They solely check outputs given some inputs. This means those tests are not integration scenarios, and do not focus on the internal parts. Essentially this is about isolated, black box testing.

Why black box? While the private implementation of a component might change, the inputs and outputs are more likely to stay stable. After all, in an event-sourced system, we derive the state of the system by saving domain events. Domain events are stable; businesses do not change their processes lightly. On the other hand, the private implementation details of the components will change according to non-functional requirements, or some new use case. Those changes will (should) not affect the inputs and outputs of pre-existing functionality.

The more personal reasons are:

  • My background in electronics: this is explained further in this article.
  • I'm heavily invested in custom line-of-business applications that I need to support and evolve for years. While the domain evolves, the highest rate of breaking changes is in how the UI and services are implemented (SOAP, REST, gRPC, ...). So I need a way to isolate domain-related tests from how the domain is exercised at runtime.

Watch Yves' webinar on Testing Event-Sourced Applications.

An opinion on systems

The way I test event-sourced applications is heavily biased by the way I look at systems, due to my background in electronics and signal processing. In this field, one way to understand how a complete system works is to divide it into a series of components wired together.

Each component is characterized by a series of inputs, a function, some outputs and a feedback loop.


Does this look like the drawings in Thinking in Systems: A Primer? Yes, that kind of model is widely used in all sorts of fields.

Each component might get a name, and is characterized by its function and operational limits, built using more components. Components all the way down...


For example, an oscillator converts direct current (DC) to alternating current (AC). Oscillators are built using transistors or amplifiers. Amplifiers are complicated internally.


Event-sourced components and their inputs and outputs

So what kind of components do we have at our disposal in a typical event-sourced system using CQRS? What are their inputs and outputs?


Alberto Brandolini's artwork "The Picture that Explains Almost Everything" can help us there (note this is a slightly modified version for the purpose of this article).


Entities have behaviours; they are the active part of the system.

  • Inputs
    • Events: needed to rebuild some state.
    • Commands: the action to be performed.
  • Outputs
    • Events
  • State: the state of an entity is private to the entity. As a rule of thumb, no tests exercise it directly; it is exercised indirectly by the black box tests.
    Sometimes, though, some aspect of rebuilding the state might get complicated, so a few ephemeral tests might exist.

Typical tests will look like:

  • Given an event, another event, many more events, ...
  • When some action
  • Then (only) an event, another event, more events, ...

What we want to test here is correctness of the output given the inputs. Are the events correct, do they contain the necessary data, are they in the correct order?

Pseudo-code sample: adding an assessment to a portfolio. Scenario: an assessment test can be manually added.

AssessmentTest assessment = new AssessmentTest()
  .Given() // None, since this is a new test
  .When(assessment.AddManually("TheId", Now)) // note that we pass the time
  .Then(new AddedManually { "Id": "TheId", "On": "2021-03-13T15:26:05+00:00" })

Scenario: an assessment test can be added by an import through a platform called TAO (an online assessment test platform, used to create & deliver tests to be taken).

AssessmentTest assessment = new AssessmentTest()
  .Given() // None, this is a brand new test
  .When(assessment.Import("TheId", TaoDetails, Now))
  .Then(new Imported { "Id": "TheId", "On": "2021-03-13T15:26:05+00:00" })
  .Then(new TaoVersionAvailable { "Id": "TheId", "On": "2021-03-13T15:26:05+00:00", "Platform": "Tao", "Delivery": "ID_of_delivery_in_Tao" })
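The two scenarios above can be sketched as runnable code. What follows is a minimal Python sketch, not the author's implementation: the `scenario` helper, the method names `add_manually` and `import_from_tao`, and the rule that an existing assessment test cannot be added twice are all assumptions made for illustration. The time is passed in, as in the pseudo-code, so the expected events can be compared exactly.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AddedManually:
    id: str
    on: datetime


@dataclass(frozen=True)
class Imported:
    id: str
    on: datetime


@dataclass(frozen=True)
class TaoVersionAvailable:
    id: str
    on: datetime
    platform: str
    delivery: str


class AssessmentTest:
    """Hypothetical entity: state is rebuilt from past events and stays private."""

    def __init__(self, history=()):
        self._exists = False
        for event in history:
            self._apply(event)

    def _apply(self, event):
        if isinstance(event, (AddedManually, Imported)):
            self._exists = True

    def add_manually(self, id, now):
        # Assumed business rule, for illustration only.
        if self._exists:
            raise ValueError("assessment test already exists")
        return [AddedManually(id, now)]

    def import_from_tao(self, id, delivery, now):
        if self._exists:
            raise ValueError("assessment test already exists")
        return [Imported(id, now), TaoVersionAvailable(id, now, "Tao", delivery)]


def scenario(given, when, then):
    """Rebuild the entity from `given`, run `when`, require exactly `then`."""
    entity = AssessmentTest(given)
    produced = when(entity)
    assert produced == then, f"expected {then}, got {produced}"
    return produced


NOW = datetime(2021, 3, 13, 15, 26, 5, tzinfo=timezone.utc)

# Scenario: an assessment test can be manually added.
scenario(
    given=[],
    when=lambda entity: entity.add_manually("TheId", NOW),
    then=[AddedManually("TheId", NOW)],
)

# Scenario: an assessment test can be added by an import through TAO.
scenario(
    given=[],
    when=lambda entity: entity.import_from_tao("TheId", "ID_of_delivery_in_Tao", NOW),
    then=[
        Imported("TheId", NOW),
        TaoVersionAvailable("TheId", NOW, "Tao", "ID_of_delivery_in_Tao"),
    ],
)
```

Because the comparison is on the whole list, the check covers the three questions at once: which events, with which data, in which order.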

External systems

No system lives in isolation: we will need to call, and have actions performed on, systems we do not own. The actual calls are, most of the time, wrapped in some integration interface.

  • Inputs
    • Commands
  • Outputs
    • Events

State, as in entities, is private: the same rule of thumb applies.

Typical tests will look like:

  • Set up the dependencies
  • Given some state
  • When some action
  • Then (only) an event, another event, more events,...

What we want to test here is correctness of the output given the inputs: are the events correct, do they contain the necessary data, are they in the correct order? In some cases, we will need to fabricate test doubles of calls to the external service, in order to simulate failures etc...

Pseudo-code sample: scenario: successfully downloading a zip and extracting metadata, using the third-party TAO services.

successfulDownload = (fileToDownload) -> OK  // this is a mocked dependency
metadataExtract = (stream) -> { "Code": "TheId", "TargetGroup": "FR" } // this is a mocked dependency
TaoIntegration taoIntegration = new TaoIntegration(successfulDownload, metadataExtract)
 .Given () // None, this is a new clean download
 .When (taoIntegration.DownLoad(identifier))
 .Then (new DownloadedStarted { "Id": "identifier", "On": "2013-03-13T15:20:00+00:00" })
 .Then (new TaoTestDownloaded { "Id": "identifier", "On": "2013-03-13T15:20:30+00:00" })
 .Then (new MetaDataExtracted { "Id": "identifier", "Code": "TheId", "TargetGroup": "FR" })

Scenario: unsuccessful download of a zip, using the third-party TAO services.

unSuccessfulDownload = (fileToDownload) -> Error  // this is a mocked dependency
metadataExtract = (stream) -> Error  // this is a mocked dependency
TaoIntegration taoIntegration = new TaoIntegration(unSuccessfulDownload, metadataExtract)
 .Given () // None, this is a new clean download
 .When (taoIntegration.DownLoad(identifier))
 .Then (new DownloadedStarted { "Id": "identifier", "On": "2013-03-13T15:20:00+00:00" })
 .Then (new TaoDownloadFailed { "Id": "identifier", "On": "2013-03-13T15:20:15+00:00", "Reason": "error" })
 .Then (new TaoDownloadFailed { "Id": "identifier", "On": "2013-03-13T15:20:30+00:00", "Reason": "error" }) // Retries.
 .Then (new TaoDownloadAbandonned { "Id": "identifier", "On": "2013-03-13T15:20:45+00:00" })
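The failure scenario can be sketched in Python with hand-written test doubles. This is an illustration under assumptions, not the actual integration: the retry policy (two attempts, then abandon), the injected `clock`, and the `ticking_clock` helper are all hypothetical. Event names follow the pseudo-code above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DownloadedStarted:
    id: str
    on: datetime


@dataclass(frozen=True)
class TaoTestDownloaded:
    id: str
    on: datetime


@dataclass(frozen=True)
class MetaDataExtracted:
    id: str
    code: str
    target_group: str


@dataclass(frozen=True)
class TaoDownloadFailed:
    id: str
    on: datetime
    reason: str


@dataclass(frozen=True)
class TaoDownloadAbandonned:
    id: str
    on: datetime


class TaoIntegration:
    """Hypothetical wrapper; the real third-party calls are injected."""

    def __init__(self, download, extract_metadata, clock, max_attempts=2):
        self._download = download
        self._extract_metadata = extract_metadata
        self._clock = clock
        self._max_attempts = max_attempts  # assumed retry policy

    def download(self, identifier):
        events = [DownloadedStarted(identifier, self._clock())]
        for _ in range(self._max_attempts):
            try:
                stream = self._download(identifier)
            except IOError as error:
                events.append(TaoDownloadFailed(identifier, self._clock(), str(error)))
                continue
            events.append(TaoTestDownloaded(identifier, self._clock()))
            metadata = self._extract_metadata(stream)
            events.append(
                MetaDataExtracted(identifier, metadata["Code"], metadata["TargetGroup"])
            )
            return events
        events.append(TaoDownloadAbandonned(identifier, self._clock()))
        return events


def ticking_clock(start, step):
    """Test double for time: the first call returns `start`, each later call adds `step`."""
    current = start - step

    def now():
        nonlocal current
        current += step
        return current

    return now


START = datetime(2013, 3, 13, 15, 20, tzinfo=timezone.utc)
STEP = timedelta(seconds=15)


def failing_download(identifier):  # stub simulating the external failure
    raise IOError("error")


def unused_extract(stream):  # must never run when the download fails
    raise AssertionError("metadata extraction should not be called")


integration = TaoIntegration(failing_download, unused_extract, ticking_clock(START, STEP))
events = integration.download("identifier")
assert events == [
    DownloadedStarted("identifier", START),
    TaoDownloadFailed("identifier", START + STEP, "error"),
    TaoDownloadFailed("identifier", START + 2 * STEP, "error"),
    TaoDownloadAbandonned("identifier", START + 3 * STEP),
]
```

Injecting the clock alongside the download function keeps the timestamps deterministic, so the expected events can be written out in full.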

Read models

Read models are most of the time used by user interfaces, or exposed through some kind of service.

  • Inputs
    • Events.
  • Output
    • some data, in some form, in some store.

Typical tests will look like

  • Given an event, another event, more events,...
  • Then some data, in some storage,...

Again, correctness is tested: given some inputs, we expect some specific data in a given storage engine. Testing the outputs is as simple as executing the queries provided by the target system.

This type of test also helps during initial developments to ensure that all the data needed to build up the read models is available in the events.

Surprisingly, from a test authoring perspective, read models are probably the most difficult to test in isolation. Why? We will need to update a specific storage (a database, a document store), and we should not abstract it away. So if we have the opportunity to use an in-memory or zero-deploy version, we should use it. Nowadays Docker images can help as well, but this does make the test setup more complex.

Another aspect is composition: some served read model might need to query other storage to provide a complete data view.

For example, we want to provide a typical detail page read model, where the full name of the user who made the last change is shown. The events will not contain the full name of the user, only the userId. When building or serving such a read model we'll need to get the full name from the system owning this piece of information.

Where that composition can happen, and how to test it in isolation is a whole topic in itself. This talk by Yves Reynhout is a great resource on this topic.

Pseudo-code sample: scenario: provide a list of downloads and their status.

TaoDownloads downloads = new TaoDownloads(dependencies);
ExtractData extractData; // provides a way to extract the data from the real underlying storage
 .For (downloads, extractData)
 .Given (new DownloadedStarted { "Id": "identifier_0", "On": "2013-03-13T16:20:00+00:00" })
 .Given (new DownloadedStarted { "Id": "identifier_1", "On": "2013-03-13T15:20:00+00:00" })
 .Given (new DownloadedStarted { "Id": "identifier_2", "On": "2013-03-13T15:30:00+00:00" })
 .Given (new TaoTestDownloaded { "Id": "identifier_1", "On": "2013-03-13T15:20:30+00:00" })
 .Given (new TaoDownloadFailed { "Id": "identifier_2", "On": "2013-03-13T15:20:15+00:00", "Reason": "error" })
 .Given (new TaoDownloadFailed { "Id": "identifier_2", "On": "2013-03-13T15:20:30+00:00", "Reason": "error" }) // Retries.
 .Given (new TaoDownloadAbandonned { "Id": "identifier_2", "On": "2013-03-13T15:20:45+00:00" })
 .Then (new DownloadItem { "identifier_0", "download started", "2013-03-13T16:20" })
 .Then (new DownloadItem { "identifier_1", "download successful", "2013-03-13T15:20" })
 .Then (new DownloadItem { "identifier_2", "download abandoned", "2013-03-13T15:20" })
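As one way to keep the real storage engine in the test without any deployment, the scenario can be sketched in Python against an in-memory SQLite database from the standard library. The table layout, the `handle` method, and the `list_downloads` query are assumptions made for this sketch; the event data is illustrative rather than copied from the pseudo-code.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class DownloadedStarted:
    id: str
    on: str


@dataclass(frozen=True)
class TaoTestDownloaded:
    id: str
    on: str


@dataclass(frozen=True)
class TaoDownloadFailed:
    id: str
    on: str
    reason: str


@dataclass(frozen=True)
class TaoDownloadAbandonned:
    id: str
    on: str


class TaoDownloads:
    """Hypothetical projection keeping one row per download in SQLite."""

    def __init__(self, connection):
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS downloads ("
            "id TEXT PRIMARY KEY, status TEXT NOT NULL, started_on TEXT NOT NULL)"
        )

    def handle(self, event):
        if isinstance(event, DownloadedStarted):
            # Keep the timestamp to the minute, e.g. "2013-03-13T16:20".
            self._db.execute(
                "INSERT INTO downloads (id, status, started_on) VALUES (?, ?, ?)",
                (event.id, "download started", event.on[:16]),
            )
        elif isinstance(event, TaoTestDownloaded):
            self._set_status(event.id, "download successful")
        elif isinstance(event, TaoDownloadAbandonned):
            self._set_status(event.id, "download abandoned")
        # TaoDownloadFailed is ignored: in this sketch a retry does not change the row.

    def _set_status(self, id, status):
        self._db.execute("UPDATE downloads SET status = ? WHERE id = ?", (status, id))


def list_downloads(connection):
    """The query a consumer of the read model would run."""
    return connection.execute(
        "SELECT id, status, started_on FROM downloads ORDER BY id"
    ).fetchall()


connection = sqlite3.connect(":memory:")
downloads = TaoDownloads(connection)

given = [
    DownloadedStarted("identifier_0", "2013-03-13T16:20:00+00:00"),
    DownloadedStarted("identifier_1", "2013-03-13T15:20:00+00:00"),
    DownloadedStarted("identifier_2", "2013-03-13T15:20:00+00:00"),
    TaoTestDownloaded("identifier_1", "2013-03-13T15:20:30+00:00"),
    TaoDownloadFailed("identifier_2", "2013-03-13T15:20:15+00:00", "error"),
    TaoDownloadAbandonned("identifier_2", "2013-03-13T15:20:45+00:00"),
]
for event in given:
    downloads.handle(event)

rows = list_downloads(connection)
assert rows == [
    ("identifier_0", "download started", "2013-03-13T16:20"),
    ("identifier_1", "download successful", "2013-03-13T15:20"),
    ("identifier_2", "download abandoned", "2013-03-13T15:20"),
]
```

The test has no When step: events go in, and the assertion runs the same query a consumer would run against the store.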

Note: some may prefer to test those projections using integration tests, usually through some API query. This is perfectly valid of course, but I have had many cases where the API is not available yet, or where there is no API to query because the target store is not fully under control. For example, the target store is owned by the BI team.

Reactive components

Reactive components are triggered by events in the system: some need state, others do not. Depending on the purpose of the reaction, an external system or one or more entities might be triggered.

  • Inputs
    • Events
  • Output
    • Commands

Typical tests will look like

  • Given an event, another event, more events...
  • Then a command, another command, even more commands...

Once again, the private state of the reaction does not have dedicated tests.

In general, reactive components also have dependencies. Those will also be simulated in a simple way in order to verify the behaviour of the reactions. An example will be provided in part 2.

Pseudo-code sample: scenario: whenever we receive a third-party message saying an assessment is delivered, download it and then copy it to some file share.

successfulDownload = (fileToDownload) -> OK  // this is a mocked dependency
metadataExtract = (stream) -> { "Code": "TheId", "TargetGroup": "FR" } // this is a mocked dependency
TaoIntegration taoIntegration = new TaoIntegration(successfulDownload, metadataExtract)
CopyToFileShare copyToFileShare = (stream) -> { OK, location }

TaoDownloadReaction reaction = new TaoDownloadReaction(taoIntegration, copyToFileShare)

 .For (reaction)
 .Given (new AssessmentAvailable { "Id": "identifier" }) // the event we received from the third party
 .Then (new Download { "Id": "identifier" })
 .Then (new CopyToFileShare { "Id": "identifier" })

Note: In this context, third party means any service or application that the team does not have under their control.
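A reaction test of this shape might look like the following Python sketch. It is simplified on purpose: instead of injecting the integration and file-share dependencies, the hypothetical reaction receives a single `send` callable, and the test passes `list.append` as a recording double. The `handle` method name is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssessmentAvailable:  # event received from the third party
    id: str


@dataclass(frozen=True)
class Download:  # command
    id: str


@dataclass(frozen=True)
class CopyToFileShare:  # command
    id: str


class TaoDownloadReaction:
    """Hypothetical stateless reaction: events in, commands out."""

    def __init__(self, send):
        self._send = send  # how commands leave the component

    def handle(self, event):
        if isinstance(event, AssessmentAvailable):
            self._send(Download(event.id))
            self._send(CopyToFileShare(event.id))
        # Any other event is ignored by this reaction.


# Given an event ...
sent = []
reaction = TaoDownloadReaction(send=sent.append)
reaction.handle(AssessmentAvailable("identifier"))

# ... then exactly these commands, in this order.
assert sent == [Download("identifier"), CopyToFileShare("identifier")]
```

The recording list plays the same role as the event list in the entity tests: it captures the component's only observable output.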

What about CQRS?

Within the context of testing, the real question is about how to test command handlers.

Note that the above components are a CQRS implementation: they exhibit two of the most important qualities we look for in CQRS:

  • Clean separation of models, purpose built for reading or appending information.
  • A one way data flow.

Most of the time, command handlers are implemented as a command message sent to a handler alongside any dependencies needed. Handlers are typically chained together to form a pipeline.

For example

  • ArchiveHandler (ArchiveCommand)
  • RequiresArchiveRole ( ArchiveHandler )
  • LogErrors ( RequiresArchiveRole ( ArchiveHandler ))

The handler then loads or creates the relevant entity, calls some operation on it and then returns some information. That information is either an acknowledgment the work has been done or an error; both may contain additional information. Depending on the stack, the error might be an exception or a return value.

The operation on the entity has already been tested. So what we test here are the specifics of the command handler. This requires either mocking the dependencies or running them in integration style tests.

Note that those tests do generally evolve into integration style tests.

  • Inputs
    • Commands
  • Output
    • Error, additional information
    • Ok, additional information

Typical tests will look like

  • Given ( a command )
  • Then ( OK )

  • Given ( a command )
  • Then ( Error )
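To make the handler chain and its two outcomes concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: handlers are plain functions wrapped by other functions, the repository is a dictionary standing in for the real one, and `Ok` and `Error` are simple return values rather than exceptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveCommand:
    id: str
    user_roles: frozenset


@dataclass(frozen=True)
class Ok:
    info: str = ""


@dataclass(frozen=True)
class Error:
    info: str = ""


def archive_handler(repository):
    """Innermost handler: load the entity, call the operation, report back."""

    def handle(command):
        entity = repository.get(command.id)
        if entity is None:
            return Error(f"unknown document {command.id}")
        entity["archived"] = True  # stands in for the already-tested entity operation
        return Ok(f"archived {command.id}")

    return handle


def requires_archive_role(next_handler):
    def handle(command):
        if "archive" not in command.user_roles:
            return Error("archive role required")
        return next_handler(command)

    return handle


def log_errors(next_handler, log):
    def handle(command):
        try:
            result = next_handler(command)
        except Exception as error:  # turn unexpected failures into an Error result
            result = Error(str(error))
        if isinstance(result, Error):
            log(result.info)
        return result

    return handle


# LogErrors ( RequiresArchiveRole ( ArchiveHandler )), with in-memory doubles.
repository = {"doc-1": {"archived": False}}
logged = []
pipeline = log_errors(requires_archive_role(archive_handler(repository)), logged.append)

# Given ( a command ) Then ( OK )
allowed = ArchiveCommand("doc-1", frozenset({"archive"}))
assert pipeline(allowed) == Ok("archived doc-1")

# Given ( a command ) Then ( Error )
denied = ArchiveCommand("doc-1", frozenset())
assert pipeline(denied) == Error("archive role required")
assert logged == ["archive role required"]
```

Only the handler specifics are asserted here: the role check, the logging, and the shape of the result.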

Why don't you test the states?

Because the state is hidden from view: it should not be accessible from outside the component that uses it, so it's almost impossible to test it.

The other reason not to have tests directly on the state is that it will evolve with new requirements. New inputs or outputs might require changing the structure of the state, and any tests on the state would immediately break. If we focus on inputs and outputs, the tests will not break; if they do break, then we introduced a bug inside the component.

In short, the state is tested the same way you would test private methods: indirectly.

But I can do that only with integration tests or by generating events!

Of course, integration-style tests might cover this fully, including all the special corner cases like disconnections and so on. There are numerous high-quality libraries that allow you to generate data.

Remember this style of testing is a frame of reference that has three goals:

Feedback on models and abstractions: while whiteboard sessions or traditional analysis help to set up models and the shape of the system, I need some way to test them with focus. I don't want to have to create a complicated test setup and teardown because this or that part is served behind a web server or behind proxies, NLBs, etc.

Feedback on changes: when models are changed, or new events are introduced or deprecated, I need a quick way to know where the impact is and where they are used. Explicitly creating events or commands and explicitly checking their data shows right away which parts are impacted by a change. For example, when splitting or merging events, running the tests will immediately tell me where they are used and for what purpose. Note that some of those tests are transient: they exist while coding and will never be committed to the main repository.

Documentation: those kinds of tests cover all inputs and outputs of the system, and are readable by a human.


Yves Lorphelin is Principal Solution Architect at Event Store and helps customers and users reap the benefits of Event Sourcing and EventStoreDB. He has been in the industry for over 20 years, always focusing on finding and solving the actual problems that businesses want to solve with their IT systems.