Event sourcing was first adopted in highly transactional environments such as stock exchanges and gambling companies. Today it’s being used in many other domains. I’ve been discussing architectural styles with many people and realized there are common misconceptions about what Event Sourcing is. So let’s try to understand what it is not, based on those misconceptions.
The term Event Sourcing is generally attributed to Greg Young, as an architectural pattern representing an event-centric approach for an application to store business entities. An example would be an e-commerce application storing a product as the sequence of changes made to its properties, instead of storing the final state of the product as we do in the traditional approach.
That gives us the flexibility to do things like moving back and forth in time, performing temporal queries, auditing, troubleshooting, etc. It also makes us think of our applications more in terms of behaviours than in terms of structure.
The term event and the existence of a persistent event log quickly led to the association of event sourcing with messaging, which is a popular pattern applied in the scope of EDA (Event-Driven Architecture), and I guess that’s where the misconceptions begin. Having a distributed system performing all its communication asynchronously through events doesn’t mean it’s event sourced.
Let’s say we have the below script that defines two functions for building a key-value store.
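A minimal sketch of what such a script might look like, assuming one key=value line per key in a file named state_store. The line format, the STATE_FILE variable and the sample key and values are assumptions made for illustration, and keys are assumed not to contain "=" or newlines.

```shell
# Sketch of a state-based key-value store: the file holds one line per key.
STATE_FILE="${STATE_FILE:-state_store}"

# Store a pair: replace the value if the key exists, append it otherwise.
db_set() {
  local key="$1" value="$2"
  touch "$STATE_FILE"
  awk -F= -v k="$key" -v v="$value" '
    $1 == k { print k "=" v; found = 1; next }   # overwrite: old value is lost
    { print }
    END { if (!found) print k "=" v }            # key was missing: append it
  ' "$STATE_FILE" > "$STATE_FILE.tmp" && mv "$STATE_FILE.tmp" "$STATE_FILE"
}

# Retrieve the value part of the line that starts with the given key.
db_get() {
  awk -F= -v k="$1" '$1 == k { print substr($0, length(k) + 2) }' "$STATE_FILE"
}

# Hypothetical sample run, starting from an empty file.
: > "$STATE_FILE"
db_set product-1 "price:10"
db_set product-1 "price:12"
db_get product-1   # prints price:12
```

After this run the file holds a single line for product-1, and nothing in it reveals that the price was ever 10.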
The script contains two functions, db_set and db_get, responsible for storing and retrieving a key-value pair from a file, respectively. db_get extracts the value part of the line containing a specific key. db_set appends the key/value pair if the key is missing, and replaces the value part if the key already exists.
If we run the script above, the state_store file ends up with a single line per key, holding only the most recent value. The output of the script is just that latest value for each key it reads back.
Storing only the latest snapshot of the data can be really limiting in some cases. For example, it is sometimes useful to understand the changes that led to the current state, for auditing or troubleshooting purposes.
Let’s change the above example a bit so we can have that.
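A minimal sketch of the changed script, assuming the same key=value line format and a file named event_store. The EVENT_FILE variable and the sample key and values are assumptions made for illustration.

```shell
# Sketch of an event-sourced key-value store: the file is an append-only log.
EVENT_FILE="${EVENT_FILE:-event_store}"

# Record a new immutable fact; existing lines are never modified.
db_set() {
  printf '%s=%s\n' "$1" "$2" >> "$EVENT_FILE"
}

# Replay every event for the key and print only the last value seen.
db_get() {
  awk -F= -v k="$1" '
    $1 == k { last = substr($0, length(k) + 2); found = 1 }
    END { if (found) print last }
  ' "$EVENT_FILE"
}

# Hypothetical sample run, starting from an empty log.
: > "$EVENT_FILE"
db_set product-1 "price:10"
db_set product-1 "price:12"
db_get product-1   # prints price:12
```

Both writes remain in the log, so the earlier price is still available for inspection even though reads return only the latest one.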
The above example appends a really simplistic immutable fact/event that indicates “This entry was updated with this value”. The db_set function now simply appends the key/value pair to the end of the file instead of attempting to change an existing state. The db_get function got a bit more complex because we still want to show only the last state when retrieving an entity, not a list of changes.
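The event_store file will now contain one line for every write, in the order the writes happened, including the values that were later superseded.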
And the output of the script will still be the same as in the previous example.
This is an overly simplistic view and definitely not production-ready. The event schema in the example is a simple key/value pair that gives us PUT semantics. In a real-world scenario, we would also need to keep track of things like the event time and the event schema version.
But this still shows some interesting properties of an event sourced application.
- Auditability: Inspection of past events
- High-performance writes: Append-only storage is write-friendly
- Replayability: I can replay my sequence of events back and forth
- Temporal queries: Yes, we just need to add that timestamp :)
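To illustrate the last point, here is a minimal sketch that records an event time and a schema version with each event and then answers a temporal query. The record layout (version|epoch_seconds|key|value), the file name timed_event_store and the function name db_get_at are assumptions made for illustration, and values are assumed not to contain "|".

```shell
# Sketch: events carry a schema version and an event time.
EVENT_FILE="${EVENT_FILE:-timed_event_store}"
SCHEMA_VERSION=1

# Append an event. The optional third argument overrides the event time,
# which defaults to the current epoch seconds.
db_set() {
  local key="$1" value="$2" ts="${3:-$(date +%s)}"
  printf '%s|%s|%s|%s\n' "$SCHEMA_VERSION" "$ts" "$key" "$value" >> "$EVENT_FILE"
}

# Temporal query: replay the events for a key up to and including a point
# in time and print the last value seen.
db_get_at() {
  local key="$1" as_of="$2"
  awk -F'|' -v k="$key" -v t="$as_of" '
    $3 == k && $2 + 0 <= t + 0 { last = $4; found = 1 }
    END { if (found) print last }
  ' "$EVENT_FILE"
}

# Hypothetical sample run with explicit event times.
: > "$EVENT_FILE"
db_set product-1 "price:10" 1000
db_set product-1 "price:12" 2000
db_get_at product-1 1500   # prints price:10
db_get_at product-1 2500   # prints price:12
```

The same log answers both queries; only the cut-off point of the replay changes.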
Ok, I just event sourced my original bash application. So what is it not, then?
It’s not a top-level architecture
A top-level architecture dictates how multiple components are deployed and communicate with each other. That could be, for example, EDA (Event-Driven Architecture) or SOA (Service-Oriented Architecture). Event Sourcing is not that. Instead, event sourcing is an application architecture pattern and, like any other design pattern, it is meant to be a well-documented approach to a specific common problem.
Applying event-sourcing to an entire system is actually considered an anti-pattern. It’s a way of creating a big monolith that is event-sourced internally.
It’s not a framework
We don’t build an event sourcing framework, the same way we don’t build a framework for any other design pattern.
Nevertheless, event sourcing implies persistence, and the event store can be an isolated system that makes the pattern easier to apply, but it’s not the pattern itself. Examples of popular storage systems used for this are Event Store and Apache Kafka.
It’s not CQRS
Event sourcing and CQRS (Command-Query Responsibility Segregation) have been applied closely together for several years now, but they are different patterns.
CQRS builds on CQS (Command Query Separation), which Bertrand Meyer documented in 1988 in his book, Object-Oriented Software Construction, and which has been applied a lot since then.
Of course, by modelling state as events, it comes almost naturally that a different read path will be required. Customers wouldn’t like to get back a sequence of events. Rather, they want a consistent view of the state they are expecting. You may expect different read stores (caches, materialized views, etc.) optimized for certain kinds of queries or spanning multiple bounded contexts.
It’s not asynchronous or eventually consistent
It has nothing to do with eventual consistency. The above script stores events, and each read goes through all the events in order to show the latest state.
Instead of doing that synchronously, I could have had a background task reading the events and updating the state somewhere else, and changed the read path to go to that place instead. That would make it eventually consistent. But we don’t have to build it that way.
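The asynchronous variant can be sketched as follows, assuming the key=value log format. The project function, the read_view file and the sample data are hypothetical names chosen for illustration; in a real setup the projection would run in a background task or on a schedule rather than being called by hand.

```shell
# Sketch: writes go to the log, reads go to a separately maintained view.
EVENT_FILE="${EVENT_FILE:-event_store}"
VIEW_FILE="${VIEW_FILE:-read_view}"

# Write path: append-only, exactly as before.
db_set() {
  printf '%s=%s\n' "$1" "$2" >> "$EVENT_FILE"
}

# Projection: fold the whole log into one line per key (last value wins).
project() {
  awk -F= '
    { key = $1
      if (!(key in latest)) order[++n] = key
      latest[key] = substr($0, length(key) + 2) }
    END { for (i = 1; i <= n; i++) print order[i] "=" latest[order[i]] }
  ' "$EVENT_FILE" > "$VIEW_FILE.tmp" && mv "$VIEW_FILE.tmp" "$VIEW_FILE"
}

# Read path: only the view is consulted, so reads can lag behind the log.
db_get() {
  awk -F= -v k="$1" '$1 == k { print substr($0, length(k) + 2) }' "$VIEW_FILE"
}

# Hypothetical sample run, starting from an empty log.
: > "$EVENT_FILE"
db_set product-1 "price:10"
project
db_set product-1 "price:12"
db_get product-1   # prints price:10, the view has not caught up yet
project
db_get product-1   # prints price:12
```

The stale read in the middle is the eventual consistency. It comes from the choice to project asynchronously, not from storing events.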
The most popular event sourced application in the world is Git. Git is fully synchronous and enjoys all the advantages of being event sourced, like doing temporal queries or going back and forth in time to cherry-pick code at different points in time, amongst other use-cases.
It’s not storing everything that happened
Well, I know the script does that, but that was simply for the sake of brevity.
Unless we have a really small number of events or a really big budget for storage, the state is not built out of every single event that ever existed. Associated with event sourcing is the concept of snapshots or compaction, which allows us to be a bit more realistic about what to consider the “beginning of time”.
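Compaction can be sketched as follows, assuming the key=value log format. The compact function and the sample data are hypothetical; this variant keeps only the most recent event per key, which moves the “beginning of time” forward and discards the history before it.

```shell
# Sketch: compact an append-only log down to the latest event per key.
EVENT_FILE="${EVENT_FILE:-event_store}"

db_set() {
  printf '%s=%s\n' "$1" "$2" >> "$EVENT_FILE"
}

# Replay the events for a key and print the last value seen.
db_get() {
  awk -F= -v k="$1" '
    $1 == k { last = substr($0, length(k) + 2); found = 1 }
    END { if (found) print last }
  ' "$EVENT_FILE"
}

# Rewrite the log so that each key keeps only its most recent event.
compact() {
  awk -F= '
    { key = $1
      if (!(key in latest)) order[++n] = key
      latest[key] = $0 }
    END { for (i = 1; i <= n; i++) print latest[order[i]] }
  ' "$EVENT_FILE" > "$EVENT_FILE.tmp" && mv "$EVENT_FILE.tmp" "$EVENT_FILE"
}

# Hypothetical sample run, starting from an empty log.
: > "$EVENT_FILE"
db_set product-1 "price:10"
db_set product-1 "price:12"
db_set product-2 "price:5"
compact
grep -c . "$EVENT_FILE"   # prints 2, one event per key is left
db_get product-1          # prints price:12
```

Reads return the same values before and after compaction; what is given up is the ability to replay or audit anything older than the compaction point.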
What to conclude from this?
Event Sourcing is commonly mistaken for EDA (Event-Driven Architecture) and all the properties associated with it. Event sourcing is an application architecture pattern that allows for better auditing and gives us the ability to replay the state or move to a specific point in time. It also naturally implies a performance improvement on the write side, simply because storing events in an append-only persistent store is really fast.
Event sourcing an entire system is a big mistake and is considered an anti-pattern. It’s not an architecture by itself and it does not make applications communicate better. It just makes a single application more consistent and auditable by storing the facts of what happened instead of the current view.
It’s pretty much like when I lose my car keys and replay in my head everything that happened until I find them again. I do it synchronously and consistently, otherwise I would need to take the bus.