In the summer of 2021 my friend Aaron Pedersen asked me to join him at IBM to rebuild an internal knowledge-sharing platform called Lighthouse. He'd been working on the front-end architecture of this application for years with a team called JLoop. They'd done a great job of creating a UI that was well thought out and easy to use. IBM loved their contributions, but the application still had fatal flaws.
The issues were not in the UI but in the backend. The microservices were grinding to a standstill. Depending on the microservice, response times could reach 20-30 seconds, which had a huge impact on usability. There were a number of reasons for this, including:
The IBM team had made the decision a few years earlier to move to a graph database for most (if not all) queries. This added complexity to the use cases for which a graph database was not a great fit.
They had the classic issue of microservice interdependency. All of their microservices called other microservices, adding latency to each request. This compounded whenever one of the downstream services was particularly slow.
These problems drove the decision to find a fundamentally better approach.
While we were diving into these questions, a new use case popped up. They needed auditability: a way to track any and all changes a user made. This was a familiar problem to me, and a hard one in regular CRUD systems. Historically, I'd have tried things like analyzing changes to an entity through proxies and/or property editors, and it always turned into a mess. This is often the front door to event sourcing: the need for a reliable audit trail.
One of the folks on the team (thanks Tom Friedhof) had the suggestion of using event sourcing. I wasn't familiar with event sourcing and had to do some research. This led me down an unexpected path that would change the way I build applications!
What is Event Sourcing?
After some research, I understood that event sourcing was really the combination of three ideas:
DDD (Domain Driven Design)
CQRS (Command query responsibility segregation)
Storing and replaying events
Back in 2003 (I'm old), like everyone else of that era, I purchased Eric Evans's book Domain-Driven Design, which was an interesting read. The idea is that you should structure your code to match the business domain. At the time, some of the ideas seemed impractical and hard to achieve given the way we architected systems back then, and it took time for them to really take hold. The concept most relevant to event sourcing is the Aggregate Root.
According to Evans's Domain-Driven Design:
An Aggregate is a cluster of associated objects that we treat as a unit for the purpose of data changes. Each Aggregate has a root and a boundary. The boundary defines what is inside the Aggregate. The root is a single, specific Entity contained in the Aggregate.
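To make Evans's definition concrete, here is a minimal sketch in Java. The domain (an Order with its OrderLines) and all names are illustrative, not from the book: the point is that the line entities live inside the boundary and are only reachable through the root, which enforces the invariants for the whole cluster.

```java
import java.util.ArrayList;
import java.util.List;

class OrderLine {                      // an entity inside the Aggregate boundary
    final String sku;
    final int quantity;

    OrderLine(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }
}

class Order {                          // the Aggregate Root
    private final List<OrderLine> lines = new ArrayList<>();

    // All changes go through the root, so it can guard the invariants
    // for every object inside its boundary.
    public void addLine(String sku, int quantity) {
        if (quantity <= 0) throw new IllegalArgumentException("quantity must be positive");
        lines.add(new OrderLine(sku, quantity));
    }

    public int totalItems() {
        return lines.stream().mapToInt(l -> l.quantity).sum();
    }
}
```

Callers never hold an OrderLine directly; they ask the root to make the change, which is what makes the cluster a single unit for data changes.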
CQRS (command query responsibility segregation) was new to me. This was the concept of separating your reads and writes. At first, my thought was, I already do this. Don't I have different controllers and services to front queries vs updates? But I was missing the point.
Your reads are rarely the same shape as your writes: the data a read use case needs is rarely exactly the data an update needs. On its face this is obvious, but the impact is profound. If you rely on an ORM to serve every query, you have an inherently inefficient system that returns more information than each use case actually needs. Of course, you can write custom queries to work around this. But the next part of event sourcing is where I truly had my epiphany.
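A hypothetical sketch of the CQRS split (the class and method names are mine, and the shared map stands in for a database): the write side validates commands and changes state; the read side serves queries. Because they are separate classes, each can be shaped for its own use cases.

```java
import java.util.HashMap;
import java.util.Map;

class ChangeUsername {                       // a command: an intent to change state
    final String userId, newUsername;

    ChangeUsername(String userId, String newUsername) {
        this.userId = userId;
        this.newUsername = newUsername;
    }
}

class UserCommandHandler {                   // write side: validate, then apply
    private final Map<String, String> store;

    UserCommandHandler(Map<String, String> store) { this.store = store; }

    void handle(ChangeUsername cmd) {
        if (cmd.newUsername == null || cmd.newUsername.isBlank())
            throw new IllegalArgumentException("username required");
        store.put(cmd.userId, cmd.newUsername);
    }
}

class UserQueries {                          // read side: returns only what the query needs
    private final Map<String, String> store;

    UserQueries(Map<String, String> store) { this.store = store; }

    String usernameOf(String userId) { return store.get(userId); }
}
```

Nothing stops you from backing the two sides with entirely different storage, which is exactly where event sourcing takes the idea.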
Lastly, the concept of storing and replaying events. To me, this looked like what we'd been using queues for. I'd been a heavy user of queues over the years to emit events for asynchronous operations. For example, if a user has just signed up, fire off an event to your queue and a consumer can pick it up and send their registration email. This was just a good way to separate concerns and a good architectural decision. But I'd never thought of storing all our data that way. After some thought it made sense since everything we do is an event.
If I update a username, that is an event.
If I overdraw my account, that is an event.
If I call my Mom, that is an event.
It seemed simple enough, but I was worried this would lead to a world of complexity.
I recognized that this was a different paradigm for storing data, but I didn't yet grasp that it was a sea change: a new data model that stores events and then replays and folds them to rebuild state.
The idea is that you store a discrete event for every state change and persist it in this new data model. Each of these events is immutable. If you need to modify a data point, you append a new event that does so. Then, to rebuild state, all you need to do is replay the events in order.
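That replay-and-fold idea can be sketched in a few lines of Java. The event and state types here are mine, not EventStoreDB's API: the essential point is that current state is just a left fold over the immutable event history.

```java
import java.util.List;

// Immutable events: each one records a single state change.
interface UserEvent {}
record UserRegistered(String username) implements UserEvent {}
record UsernameChanged(String newUsername) implements UserEvent {}

class UserState {
    String username;

    // Apply a single event to the state.
    void apply(UserEvent e) {
        if (e instanceof UserRegistered r) username = r.username();
        else if (e instanceof UsernameChanged c) username = c.newUsername();
    }

    // Rebuilding state is just replaying the history in order: a left fold.
    static UserState replay(List<UserEvent> history) {
        UserState state = new UserState();
        history.forEach(state::apply);
        return state;
    }
}
```

Nothing is ever updated in place; "changing" the username means appending a UsernameChanged event, and every past value remains in the stream.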
Tom suggested we do a quick proof of concept using a fully fledged open-source event database called EventStoreDB. I was game at this point since my curiosity was piqued.
I spent the next few weeks building an architecture to support event sourcing using EventStoreDB, and I cobbled together a system that worked. At this point, I wasn't very confident in what I had built, so I reached out to Event Store for advice, and Chris Condron (CTO at the time) came in to take a look. I walked Chris through what I'd built, and he took a dim view of it: the issue was in how I had chosen to replay events to rebuild state. He pointed me to some example projects in C# for inspiration on how to handle it properly.
It was at this point that I had the lightbulb moment.
I'd been building applications wrong for the last 20+ years.
By storing events instead of relying on a traditional relational (or, in IBM's case, graph) database, you gain a world of insight that is lost when you simply update in place. Storing events gives your data a third dimension: time. By storing each state transition discretely, you add a historical understanding of every change, giving each one the who, what, why, and when you never had before.
For example, take the earlier use case of updating a user's username. In a traditional CRUD application on a relational DB, I'd just update the user and overwrite their username: use an ORM to load the User entity, change the username, and persist it. But by doing so, I've lost all kinds of information. I've lost what the username was originally. I've lost the fact that the username was changed at all. I've lost the context: why was it updated? Who did it? When did it happen? The list goes on.
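To make that concrete, here is a hedged sketch of the same change expressed as an event. The field names are illustrative, but they capture exactly the information the in-place UPDATE throws away.

```java
import java.time.Instant;

// One immutable event carrying the full context of the change.
record UsernameChanged(
        String userId,
        String oldUsername,     // what it was - lost by an update-in-place
        String newUsername,     // what it became
        String changedBy,       // who made the change
        String reason,          // why the change was made
        Instant occurredAt) {}  // when it happened
```

Because the event is appended rather than overwriting anything, every one of those answers remains queryable forever.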
At that moment of realization, I was not only deeply ashamed to have never seen this but also truly excited. This was a new way of managing data. By using Event Store to record each state transition, we gained a new level of fidelity in our data. Traditional operational databases, whether relational, document, or graph, lose data, and data is a critical asset for modern information-driven enterprises.
This solved all of our use cases:
We could now audit anything we wanted, since we could track who did what, when, and most importantly, why.
We could resolve our latency issues by decoupling our microservices and dynamically replaying just the events each one needed. Where a bounded context needed a convenient local cache, we could keep one in a relational database, kept up to date from Event Store.
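The local-cache idea above is usually called a projection. Here's a hypothetical sketch (domain and names are mine): a projector folds events into a query-shaped read model, with a map standing in for a relational table. The cache is disposable: drop it and replay the stream to rebuild it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

interface Event {}
record ArticlePublished(String id, String title) implements Event {}
record ArticleRetitled(String id, String newTitle) implements Event {}

class ArticleTitleProjection {
    // Stand-in for a local relational table in this bounded context.
    private final Map<String, String> titlesById = new HashMap<>();

    // Fold one event into the read model.
    void onEvent(Event e) {
        if (e instanceof ArticlePublished p) titlesById.put(p.id(), p.title());
        else if (e instanceof ArticleRetitled r) titlesById.put(r.id(), r.newTitle());
    }

    // Rebuild the entire read model from scratch by replaying the stream.
    void rebuild(List<Event> stream) {
        titlesById.clear();
        stream.forEach(this::onEvent);
    }

    String titleOf(String id) { return titlesById.get(id); }
}
```

Each microservice can run its own projection shaped to its own queries, which is what breaks the chain of services calling services.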
I went on to create an event sourcing framework for Java called Dewdrop to help others build out an event-sourced architecture, and I ended up becoming the VP of Engineering for Event Store. I'm very excited about the future of EventStoreDB, Event Store Cloud, Event Store the business, and the whole event-driven application space. I believe it will fundamentally change the way software is written and data is stored, and I'm honored to help present the opportunity to change how others view and manage data.