Event sourcing 101

Talking about event sourcing means talking about an architecture pattern: event-driven architecture. It has become a more common solution thanks to specialised tools such as EventStoreDB and Kafka, which make it easier to apply at scale.

You can find references in the Microsoft Azure docs and on Martin Fowler’s blog for more information. It is not a new subject, but I wanted to write down the key concepts to sum it up, along with some insight into how to implement it.

What is event sourcing

Definition

With an event-driven architecture, everything is an event. Every change to an entity is made through an event. To rebuild the current state of the entity, you need to source (i.e. read from the beginning) all the events that happened on that entity.

One of the most common examples is a bank account: the sum of the events (deposits and withdrawals of money) corresponds to the current balance of the account.
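As a minimal sketch of the bank account example, the balance can be derived by folding over the events. The event names `Deposited` and `Withdrawn` and the helper `balance` are illustrative assumptions, not part of any library:

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Deposited:
    amount: int  # amount in cents, to avoid float rounding


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply(balance: int, event) -> int:
    """Return the new balance after applying one event."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise ValueError(f"unknown event: {event!r}")


def balance(events) -> int:
    """Source the events from the beginning to get the current balance."""
    return reduce(apply, events, 0)


history = [Deposited(10_000), Withdrawn(2_500), Deposited(500)]
print(balance(history))  # 8000
```

The balance itself is never stored here; it is always recomputed from the history.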

Let’s see where and when this type of architecture can be useful.

Advantages

Event sourcing differs from the traditional pattern, where the database always holds only the latest state of your system’s entities.

The benefits of event sourcing can be summarized as follows:

Event-driven architecture is best for systems that need a good audit mechanism, but it can be applied to many more.
With all events saved, replaying them allows rebuilding the entity’s state at different points in time.
- Also, each event comes with context (the who, the what, and the origin), which makes it easier to follow the transformation of the entity for better debugging and observability.
It is well suited for asynchronous interactions.
- Multiple events can come in at the same time, and the service can sync or catch up independently with its event store (which can be of any type, like Kafka, EventStoreDB, or a NoSQL DB).
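To illustrate the point-in-time replay and the context carried by each event, here is a small sketch. The `Event` fields (`author`, `source`, `occurred_at`) and the function `balance_at` are hypothetical names chosen for this example:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    entity_id: str
    kind: str              # "deposit" or "withdrawal"
    amount: int
    author: str            # who triggered the change
    source: str            # where it came from, e.g. "atm" or "web"
    occurred_at: datetime  # when it happened


def balance_at(events, entity_id: str, moment: datetime) -> int:
    """Replay only the events of one entity that happened up to `moment`."""
    total = 0
    for event in sorted(events, key=lambda e: e.occurred_at):
        if event.entity_id != entity_id or event.occurred_at > moment:
            continue
        total += event.amount if event.kind == "deposit" else -event.amount
    return total


log = [
    Event("acc-1", "deposit", 100, "alice", "web", datetime(2024, 1, 1)),
    Event("acc-1", "withdrawal", 30, "alice", "atm", datetime(2024, 1, 5)),
    Event("acc-1", "deposit", 50, "bob", "web", datetime(2024, 2, 1)),
]

print(balance_at(log, "acc-1", datetime(2024, 1, 10)))  # 70
print(balance_at(log, "acc-1", datetime(2024, 3, 1)))   # 120
```

Because the author and source travel with every event, the same log can also answer audit questions such as who made the last withdrawal.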

Key concepts

Components

Here are the main components:

An Event store, where all the events generated by our system are stored
A System containing:
- A Decide function that decides, based on the business logic, which event(s) to create
- An Apply function that applies the change to the state

First, the system’s input is a command sent by a user or another system. Then the resulting events are saved and can also be forwarded to other services. Finally, depending on the interaction, a state change can trigger further actions (retro-action).

Diagram

Here is a simple diagram of the core parts of an event-driven architecture and their interactions:

sequenceDiagram
    autonumber
    participant E as External
    participant A as Application
    participant S as Current state
    participant ES as Event Store
    E ->> A: New command
    opt Decide (Business logic)
        A -->> A: Create one or multiple events
    end
    A ->> ES: Save event(s)
    A ->> S: Update state
    opt React to state change
        S ->> A: Apply triggered action
        Note over A: Act as an internal command<br/>back to ❶
    end

The main blocks represent:

External: A user sending a command to the system
Application: Our system, which uses event-driven architecture
Current State: An abstract way to represent the current state of our system (it makes the diagram clearer)

The “Decide” component, which receives the command and creates the events, should behave independently of the “Apply” component, which only applies the events to modify the current state. This way, replaying the events to rebuild the current state won’t trigger new events.
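The flow described above can be sketched as follows. This is only one possible shape under stated assumptions: the in-memory `EventStore`, the `Account` state, the `react` function, and the command and event names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    balance: int = 0
    locked: bool = False


@dataclass
class EventStore:
    """Append-only, in-memory stand-in for a real event store."""
    events: list = field(default_factory=list)

    def append(self, new_events: list) -> None:
        self.events.extend(new_events)


def decide(state: Account, command: dict) -> list:
    """Business logic: turn a command into zero or more events."""
    if command["type"] == "deposit":
        return [{"type": "deposited", "amount": command["amount"]}]
    if command["type"] == "withdraw" and not state.locked:
        return [{"type": "withdrawn", "amount": command["amount"]}]
    if command["type"] == "lock":
        return [{"type": "locked"}]
    return []


def apply(state: Account, event: dict) -> Account:
    """State transition only: never decides and never creates events."""
    if event["type"] == "deposited":
        return Account(state.balance + event["amount"], state.locked)
    if event["type"] == "withdrawn":
        return Account(state.balance - event["amount"], state.locked)
    if event["type"] == "locked":
        return Account(state.balance, True)
    return state


def react(state: Account, event: dict) -> list:
    """Reaction to a state change: may emit internal commands."""
    if event["type"] == "withdrawn" and state.balance < 0:
        return [{"type": "lock"}]
    return []


def handle(state: Account, command: dict, store: EventStore) -> Account:
    """Command -> decide -> save -> apply -> react (back to decide)."""
    pending = [command]
    while pending:
        events = decide(state, pending.pop(0))
        store.append(events)
        for event in events:
            state = apply(state, event)
            pending.extend(react(state, event))
    return state


def replay(events: list) -> Account:
    """Rebuild the state with apply only, so no new event is produced."""
    state = Account()
    for event in events:
        state = apply(state, event)
    return state


store = EventStore()
account = handle(Account(), {"type": "deposit", "amount": 50}, store)
account = handle(account, {"type": "withdraw", "amount": 80}, store)
print(account)                          # Account(balance=-30, locked=True)
print(len(store.events))                # 3
print(replay(store.events) == account)  # True
print(len(store.events))                # 3 (replay added nothing)
```

Keeping `react` out of `replay` is the design choice that matters here: the overdraft reaction ran once when the withdrawal was handled, and its result (the `locked` event) is already in the store.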

Implementation

Design

Since the pattern may not speak to everyone, here is a graph of one possible implementation. It is a bit simplified and perhaps limited, but the main actors are there.

As a note:

The Event Store is a database-type component that stores the events
The Current State is another database-type component that saves the current state of the entities
The rest of the components in the system are assumed to all live in one application.

graph LR
    A((user)) -- send command --> B{Controller}
    subgraph system
        B <--> S[Business Service]
        S -- "❶" create event #2 --> E[(Event Store)]
        E --> H1[Event Handler #1]
        E --> H2[Event Handler #2]
        E --> H3[Event Handler #3]
        D -- "❸" trigger refresh --> D
        H2 -- "❹" apply event #2 --> D
        S -- "❷" request current state --> D
        D[(Current State)] -- "❺" return refreshed state --> S
    end

The User can send a command via an API. A controller serves as the interface and passes the data to the Business Service, which holds the logic for event creation.

In this context, one event is created and there is no retro-action.

The current state is updated when you request the entity. The update is essentially a replay of the events: the corresponding handlers pick up the events, use an id to match each one to its entity, and then apply the modification to the state.

The refreshed state can then be sent back to the User.
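The refresh step could look like the sketch below, where a plain list stands in for the Event Store. The handler names, the `HANDLERS` registry, and the event fields (`type`, `entity_id`) are assumptions made for the example:

```python
def on_user_added(state: dict, event: dict) -> dict:
    return {**state, "name": event["name"], "active": True}


def on_user_renamed(state: dict, event: dict) -> dict:
    return {**state, "name": event["name"]}


def on_user_deactivated(state: dict, event: dict) -> dict:
    return {**state, "active": False}


# One handler per event type, as in the graph above.
HANDLERS = {
    "UserAdded": on_user_added,
    "UserRenamed": on_user_renamed,
    "UserDeactivated": on_user_deactivated,
}


def current_state(event_store: list, entity_id: str) -> dict:
    """Refresh on request: replay the events that match the entity id."""
    state: dict = {}
    for event in event_store:
        if event["entity_id"] != entity_id:
            continue  # this event belongs to another entity
        handler = HANDLERS.get(event["type"])
        if handler is not None:
            state = handler(state, event)
    return state


event_store = [
    {"type": "UserAdded", "entity_id": "u1", "name": "Ada"},
    {"type": "UserAdded", "entity_id": "u2", "name": "Bob"},
    {"type": "UserRenamed", "entity_id": "u1", "name": "Ada L."},
    {"type": "UserDeactivated", "entity_id": "u2"},
]

print(current_state(event_store, "u1"))  # {'name': 'Ada L.', 'active': True}
print(current_state(event_store, "u2"))  # {'name': 'Bob', 'active': False}
```

This version replays the full log on every request, which keeps it simple but does not scale; it is meant to show the matching by id, not a production read path.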

Domain-Driven Design

To put it simply, DDD (Domain-Driven Design) is the separation of the code per domain instead of per type of component.

Event sourcing usually works well with DDD because you apply events to an entity, and event names typically follow this pattern:

{Entity}{Verb}Event
# e.g. UserAddedEvent, InvoicePaidEvent, ProductUpdatedEvent

That entity can easily be matched to your business domain (Invoice, User, Product).

Grouping each entity’s corresponding events, handlers, models, data source adapters (for the current state), and other logic components together makes a lot of sense in a context where you need to deal with multiple entities from multiple domains.
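One way to take advantage of the naming convention is to derive the domain directly from the event class name, for example to route an event to the code of its domain. The sketch below assumes single-word entity and verb names; the helper `domain_of` is hypothetical:

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class UserAddedEvent:
    user_id: str
    name: str


@dataclass(frozen=True)
class InvoicePaidEvent:
    invoice_id: str
    amount: int


# {Entity}{Verb}Event, with one capitalised word for each part.
_EVENT_NAME = re.compile(r"^(?P<entity>[A-Z][a-z]+)(?P<verb>[A-Z][a-z]+)Event$")


def domain_of(event) -> str:
    """Return the entity (domain) encoded in the event's class name."""
    match = _EVENT_NAME.match(type(event).__name__)
    if match is None:
        raise ValueError(f"{type(event).__name__} does not follow the convention")
    return match["entity"]


print(domain_of(UserAddedEvent("u1", "Ada")))    # User
print(domain_of(InvoicePaidEvent("inv-7", 42)))  # Invoice
```

In a real code base an explicit registry per domain package is probably more robust than parsing names, but the convention makes both approaches straightforward.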

Command and Query Responsibility Segregation (CQRS)

When browsing through event sourcing topics, you’ll often see CQRS, which stands for Command and Query Responsibility Segregation. It is a pattern where the process for querying data differs from the process for updating it (the command).

It works well with event-driven architecture: to query the data, you replay the events to get its current state, whereas creating or changing data goes through the emission of events.

This is in contrast to a traditional model, where you store and retrieve the same model of the data from your database.

It is very well explained in the Microsoft CQRS documentation and matches the event-driven way of handling data. If you worry about performance, you can optimize querying the current state of your entity with caching, so you don’t have to replay all the events every time.
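A rough sketch of that split, with a cached read side, might look like this. The class names `AccountCommands` and `AccountQueries`, the shared list used as event log, and the `replayed` counter are assumptions for the example:

```python
class AccountCommands:
    """Write side: validates the command and only appends events."""

    def __init__(self, log: list):
        self._log = log

    def deposit(self, account_id: str, amount: int) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._log.append(
            {"type": "deposited", "account_id": account_id, "amount": amount}
        )

    def withdraw(self, account_id: str, amount: int) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._log.append(
            {"type": "withdrawn", "account_id": account_id, "amount": amount}
        )


class AccountQueries:
    """Read side: keeps a cached view and replays only unseen events."""

    def __init__(self, log: list):
        self._log = log
        self._balances: dict = {}  # cached current state per account
        self._position = 0         # index of the next event to replay
        self.replayed = 0          # counter, only here to show the saving

    def balance(self, account_id: str) -> int:
        for event in self._log[self._position:]:
            sign = 1 if event["type"] == "deposited" else -1
            current = self._balances.get(event["account_id"], 0)
            self._balances[event["account_id"]] = current + sign * event["amount"]
            self.replayed += 1
        self._position = len(self._log)
        return self._balances.get(account_id, 0)


log: list = []
commands = AccountCommands(log)
queries = AccountQueries(log)

commands.deposit("acc-1", 100)
commands.withdraw("acc-1", 40)
print(queries.balance("acc-1"), queries.replayed)  # 60 2

commands.deposit("acc-1", 10)
print(queries.balance("acc-1"), queries.replayed)  # 70 3
```

The second query replays a single new event instead of the whole log, which is the effect the caching is after. Note that the write model (events) and the read model (balances) never share a data structure.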