There was a time when I felt a little embarrassed to admit that I didn’t really know what a microservice was, even though microservices had been a hot topic in the blogosphere for years.
In retrospect, maybe that’s not so surprising: even though microservices are almost universally described as a good thing, every other day there seems to be a negative anecdote on Hacker News about using them.
I suspect that is because, like many buzzwordy tech concepts (SOA, Big Data, Web 2.0, Web3, etc.), it is difficult to find a clear definition of what makes a microservice a microservice, and consequently how to implement them in a beneficial way.
The goal of this blog post is to describe how you can actually get the benefits attributed to microservices.
What benefits?
Microservices are often described as providing the following main benefits:
- Improved scalability
- Better fault tolerance
- Programming language agnostic
- Enable more independent teams
We’ll drill down into these, but before that let me list a few more that show up in the first page of Google results and that are especially arbitrary:
- Better security
- Faster time to market and future proofing
- Greater business agility and support for DevOps
Of these last three I’ll just say that there’s nothing inherent in breaking down a single executable/application (a monolith) into several microservices that makes anything more secure. The concerns are the same, and the attack surface might even be bigger in the microservices case.
“Faster time to market” and “greater business agility and support for DevOps” aren’t guaranteed at all; done poorly, breaking down a monolith might lead to obscure dependencies that make deployments more complicated. There are a few high-profile cases of this happening; I believe Segment’s was one of them.
In the next sections let’s try to describe the main benefits in more detail.
Improved scalability
When we have several microservices instead of a single application we can scale them independently.
Have a microservice responsible for video format conversions that is struggling? Just spin up two or three more instances of it and spread the load between them.
This is the promise; however, for it to work out the microservices need to be independent, even between instances of the same microservice (we’ll get to this later).
Better fault tolerance
This one is especially beneficial but not guaranteed at all. In fact, breaking down an application into microservices and just having them communicate over a REST API (or any RPC-like protocol) is a guaranteed headache; just look at what happened with InVision.
Very briefly, the problem here is the temporal coupling that exists in the communication between microservices. In a request/response scenario there’s an expectation that we’ll get a response in a reasonable amount of time. If we don’t, that’s treated as a failure.
Imagine a chain of calls: microservice A calls B, B calls C, and C is borked and does not respond. B fails, then A fails. This coupling makes it so that all microservices need to be up at the same time, always.
Programming language agnostic
There’s not much to say about this one, and I don’t know how much of a quantifiable benefit it really is. It’s definitely a nice-to-have though.
Allows for more independent teams
Yes and no. For this to be the case we really need the microservices to be independent, not only in their data model but also in the way they communicate with each other. A good example of a failure here is Segment’s, where a shared library, when changed, potentially impacted all microservices.
It’s impossible for microservices to be truly independent if they need to talk to each other; however, there’s an approach we can take that at least promotes a mindset where you don’t need to be concerned with how other microservices will consume the one you are working on (which is what you need for truly independent teams).
So how do we get these benefits really?
First, let me emphasize again that everything that’s coming next doesn’t really require things to be running in different processes (microservices). It can all work in a single one (a monolith), but this approach makes it trivial to go from monolith to separate microservices.
The first thing to break free of is temporal coupling. This is the request/response pattern, where performing a request carries the expectation of getting a response in a reasonable amount of time.
We’re so used to this pattern that it might not be immediately obvious how many assumptions we make when using it. Take for example this `handleRegisterUserRequest` function that handles an Express request to register a new user.
```typescript
import { Request, Response } from 'express'

async function handleRegisterUserRequest(req: Request, res: Response) {
  const { username, email, password, firstName, lastName } = req.body

  if (await userService.isUserNameUnique(username) && !(await userService.isEmailTaken(email))) {
    const newUserDocument = await userService.createNewUser({ username, email, password, firstName, lastName })
    await emailService.sendWelcomeEmail({ email, firstName, lastName })
    res.redirect('/welcome-page')
  } else {
    res.sendStatus(400)
  }
}
```
Notice that the `emailService` being down might compromise the ability to register users. The email might have a “confirm you own this email” link, for example.
That being said, imagine following the naive approach of just pulling `userService` and `emailService` out of this codebase into their own microservices: instead of calling a method, we’d make an HTTP request to the user microservice to trigger the creation of a user, and another HTTP request to the email microservice to trigger the sending of the email.
Not much would change other than adding latency and more opportunities for failures.
So, instead of sending a request and waiting for a response, we just send a command with all the data we need to register a user. The term command comes from CQRS, which stands for Command Query Responsibility Segregation, and it’s really the segregation part we are interested in.
We can rewrite the above `handleRegisterUserRequest` like this:
```typescript
async function handleRegisterUserRequest(req: Request, res: Response) {
  const { username, email, password, firstName, lastName } = req.body

  await sendCommand(new RegisterUserCommand({
    username,
    email,
    password,
    firstName,
    lastName
  }))

  return res.redirect('/welcome-page')
}
```
I’m intentionally leaving what `sendCommand` does vague right now; we’ll look into it in more depth in a minute.
The thing I wanted to emphasize here is that the expectation of receiving a response is not there anymore. It also means that if the commands are consumed asynchronously (they can wait in a queue until they are consumed) then the microservice that handles them being offline won’t trigger a cascade of failures.
Now you might be thinking, what if the email or username are already taken? The user is still redirected to a welcome page? That’s not a great experience. In this particular case we would want to check that before issuing the command (and check it again when handling the command). We’ll revisit this again with a solution later.
Regarding how this approach influences the ability to scale, think of it from the perspective of “cattle vs pets”. The cattle vs pets analogy is often thrown around in DevOps circles. It’s this idea that your microservices should be treated as cattle and not pets. This means it should be possible to add and remove them without any “ceremony”.
If the microservices can come and go at any time, this means that a command might start being consumed and not be able to finish, which means it might need to be retried. It also means that running the same command twice should yield the same results. Commands should be idempotent.
This requirement of commands needing to be idempotent and the fact that you can’t make too many assumptions as to when a command is actually executed affects how we store the data that the microservices rely on. Ideally that data should not be temporally constrained as well. That’s what the next section is all about.
Event sourcing
Event sourcing means that the events are the source of data. State is derived from them.
A common example of this is a bank account’s balance. If there are two deposits (the events) of $50, you can infer that the balance is $100 (the current state).
This is a fundamentally different way of storing state (the state is implicit, transitions are explicit). In a “normal” database-driven approach we just store the state, which you can think of as a snapshot in time. With events as the source of data we have a way of knowing how we got to a particular state. This is not possible with state alone (is the balance $100 because there were 5 deposits of $20 or 2 of $50? No way to tell if you only know the balance).
This quality of being able to reconstruct state is what relieves the temporal constraint on data.
To help bring this point home imagine this example where we rely on an email microservice. Given that several email sending providers have the ability to send transactional emails (where there are notifications for when the user reads the email, for example) we want to leverage it in the following way:
- when the user’s subscription payment is overdue we send an email
- when the user opens it, a certain amount of time goes by, and they are still late on their subscription payment, we send another email with a discount
- if they don’t open that one after a certain period of time, we cancel their account
We want to do this but we want the email microservice to be free of this very specific billing/marketing logic.
Here’s how that might go. Imagine we have a userSubscription microservice that periodically identifies users whose subscription payment is overdue (the details here are not really important). When a user’s subscription payment is overdue, a command is sent to the email microservice that triggers the sending of the email. Here’s what that command might look like:
```js
{
  type: 'SendEmail',
  data: {
    id: 'some generated id',
    from: '[email protected]',
    to: '[email protected]',
    subject: 'Your subscription',
    body: 'We noticed that your payment is overdue...'
  },
  metadata: {
    originEntity: 'subscription.john_subscription_id'
  }
}
```
The events in the email service’s event stream (this is where the events get stored; the event stream is owned by the email service and no other services should write to it) can be something like:
```js
{
  type: 'EmailSent',
  data: {
    id: 'some generated id',
    from: '[email protected]',
    to: '[email protected]',
    subject: 'Your subscription',
    body: 'We noticed that your payment is overdue...'
  },
  metadata: {
    originEntity: 'subscription.john_subscription_id'
  }
}
{
  type: 'EmailOpened',
  data: {
    id: 'some generated id',
    timestamp: '2022-06-06T08:29:30Z'
  },
  metadata: {
    originEntity: 'subscription.john_subscription_id'
  }
}
```
Now, I’m leaving all sorts of details out on purpose, like how we handle the webhook that notifies us that the email was opened, and how that leads to the `EmailOpened` event with the correct `originEntity` in its metadata. Although all these details are important, they would distract us from the main point, which is how these two services can interact in this event-based approach.
First let us state explicitly that the userSubscription service must know about the format of the `SendEmail` command and also the events the email service “raises”. This shouldn’t be too surprising since this microservice uses the email microservice, so it depends on it. The important thing to highlight is that the email service does not have anything in it that is dictated by other services, in this case specifically by the userSubscription service.
When the userSubscription service sends the `SendEmail` command it includes an `originEntity` in the command’s metadata. That `originEntity` is copied to the metadata of every event raised while handling the command, in this case `EmailSent` and `EmailOpened`.
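To make that concrete, here’s a minimal sketch of how the email service’s command handler might do that copying. The `emailProvider` client and the `messageStore.write` helper are assumptions of mine (the `messageStore` abstraction shows up again in the code samples further down):

```typescript
// Sketch: the email service handles SendEmail and copies whatever originEntity
// the sender put in the command's metadata onto the events it raises.
async function handleSendEmail(command: { data: any; metadata: { originEntity?: string } }) {
  const { id, from, to, subject, body } = command.data

  await emailProvider.send({ from, to, subject, body }) // hypothetical email client

  await messageStore.write(`email.${id}`, {
    type: 'EmailSent',
    data: { id, from, to, subject, body },
    // The email service doesn't interpret originEntity, it just carries it along
    metadata: { originEntity: command.metadata.originEntity }
  })
}
```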
The userSubscription service can then “listen” to events raised by the email service and, when it sees one that matches one of its entities, update that entity in its own userSubscription event stream. Here’s what the subscription service’s event stream might look like for John’s subscription:
```js
{
  type: 'PaymentOverdue',
  data: {
    timestamp: '2022-06-06T07:29:30Z',
    subscriptionId: 'john_subscription_id'
  },
  metadata: {}
}
{
  type: 'PaymentOverdueNotificationSent',
  data: {
    timestamp: '2022-06-06T07:30:30Z',
    subscriptionId: 'john_subscription_id'
  },
  metadata: {}
}
{
  type: 'PaymentOverdueNotificationOpened',
  data: {
    timestamp: '2022-06-06T09:30:30Z',
    subscriptionId: 'john_subscription_id'
  },
  metadata: {}
}
{
  type: 'PaymentReceived',
  data: {
    timestamp: '2022-06-07T09:30:30Z',
    subscriptionId: 'john_subscription_id'
  },
  metadata: {}
}
```
Subscriptions and Consumers
I’ve casually mentioned event streams and subscriptions (to event streams). They deserve special attention here in case this is a new concept to you.
First, an event stream is just a sequence of events. Some events on that stream will be related to each other; they “belong” to a specific entity (sometimes also called a subject). For example, an event stream for an account:
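Something along these lines, with the event types and amounts made up purely for illustration:

```
seq no | subject     | type           | data
-------+-------------+----------------+-----------------
1      | account.123 | AccountOpened  | {}
2      | account.456 | AccountOpened  | {}
3      | account.123 | MoneyDeposited | { amount: 100 }
4      | account.456 | MoneyDeposited | { amount: 20 }
5      | account.123 | MoneyDeposited | { amount: 200 }
6      | account.123 | MoneyWithdrawn | { amount: 50 }
```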
This stream has data for 2 entities. If we “filter” it by `account.123`, keeping only that entity’s messages from the sketch above, we get:
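```
seq no | subject     | type           | data
-------+-------------+----------------+-----------------
1      | account.123 | AccountOpened  | {}
3      | account.123 | MoneyDeposited | { amount: 100 }
5      | account.123 | MoneyDeposited | { amount: 200 }
6      | account.123 | MoneyWithdrawn | { amount: 50 }
```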
A command stream is exactly like an event stream (event/command streams are sometimes referred to as message streams, and their contents as messages, because of these two “types” of messages). It’s just the semantics of its contents that change. Messages in a command stream have a type that denotes an action that should be performed, e.g. `SendEmail`. Contrast this with a message in an event stream, like `EmailSent`, which symbolizes an event that has taken place in the past.
How are these streams consumed by the microservices then?
It’s common to have a specific abstraction of a consumer, for example the userSubscription microservice from the previous example would have a consumer for the userSubscription command stream, another for the userSubscription event stream, and another for the email event stream.
A small aside: command streams should not be shared. There can be several microservices writing to one, but only one consuming it. Having multiple consumers on the email command stream would be like having multiple different microservices responding to requests to send emails.
Coming back to what a consumer really is: it’s just a way to keep track of where in the stream we are. For example, a consumer for the account stream might have saved that it’s at the message with sequence number 2 (the seq no in the listings above).
Consumers are typically persistent, which means that if the microservice that references one goes down, when it comes back up the consumer will still be there and will “remember” that the next message should be the 3rd. This is what keeps microservices temporally decoupled from the data (their view of the data won’t change while they are “not looking”).
Subscriptions usually encompass a filter over the events of a stream (e.g. `account.123` filters only events for that specific entity), a consumer, and definitions of how the event types in the stream should be handled, for example a map where the key is `SendEmail` and the value is a function that receives the `SendEmail` command message as a parameter. Here’s a code example:
```typescript
const subscription = await messageStore.createSubscription('email:command', {
  SendEmail: function (command) {
    const toAddress = command.data.to
    const fromAddress = command.data.from
    //...
  }
}, 'microservice:email')
```
Here the command stream name is `email:command` and the filter is `email:command.*` (not explicitly shown, but you can imagine that being a reasonable default; it means handling all events/messages whose entity/subject starts with `email:command.`, for example `email:command.123`). The subscription also has a definition for how `SendEmail` should be handled.
`microservice:email` is the name of the consumer, which you can imagine will be automatically created if it doesn’t exist (or could have been created beforehand).
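This also gives us enough to picture what the `sendCommand` call from the registration example might do: nothing more than appending a command message to the right command stream. A rough sketch, assuming a hypothetical `messageStore.write(subject, message)` helper and a command object that knows which stream it belongs to:

```typescript
import { randomUUID } from 'crypto'

// Sketch only: append the command to its command stream under a generated
// entity id. The handling microservice can use the same id for the entity's
// event stream (more on that below).
async function sendCommand(command: { stream: string; type: string; data: object }) {
  const entityId = randomUUID()

  await messageStore.write(`${command.stream}:command.${entityId}`, {
    type: command.type,
    data: { id: entityId, ...command.data },
    metadata: {}
  })
}
```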
With this information in mind, let’s have a go at explaining with the example how we are not temporally coupled with respect to how we consume the data.
Coming back to the userSubscription and email microservice examples, imagine that the userSubscription microservice goes down immediately after sending a `SendEmail` command. The email is sent, the user opens it, and all of that is handled while the userSubscription microservice is down (by the email microservice).
When it comes back up and it resumes its subscription on the email microservice’s event stream, it can handle the `EmailSent` and `EmailOpened` events without any issues.
In case it isn’t obvious, the userSubscription microservice will know that the events from the email microservice’s event stream are of interest to it because of the `originEntity` in the events’ metadata.
If you go back and look at the JSON with the example of the email events, you’ll see that the `originEntity` was something like `subscription.john_subscription_id`.
From that, the userSubscription microservice identifies that the message is of interest to it. It can also check which entity (John’s subscription) it pertains to. Handling this event should yield new events in the userSubscription’s event stream; in the example those are `PaymentOverdueNotificationSent` and `PaymentOverdueNotificationOpened`.
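Here’s a rough sketch of what that could look like from the userSubscription side, reusing the same hypothetical `messageStore` helpers:

```typescript
// Sketch: userSubscription consumes the email service's event stream and turns
// events about its own entities into events on its own stream.
const emailEventsSubscription = await messageStore.createSubscription('email', {
  EmailOpened: async function (event) {
    const origin = event.metadata.originEntity // e.g. 'subscription.john_subscription_id'

    // Not every email in the system originates from a subscription; ignore the rest
    if (!origin || !origin.startsWith('subscription.')) return

    await messageStore.write(origin, {
      type: 'PaymentOverdueNotificationOpened',
      data: {
        timestamp: event.data.timestamp,
        subscriptionId: origin.replace('subscription.', '')
      },
      metadata: {}
    })
  }
}, 'microservice:userSubscription')
```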
Using this approach, the userSubscription and email microservices would still work, in the limit, with the two of them never being active at the same time. This is what enables that big benefit of being able to scale up and down without any worries.
Super quick aside about the naming of consumers. You might have noticed that I named the consumer in the example `microservice:email`. This is because usually there’s tooling that will let you list the consumers of a stream. If you were to look at the email event stream’s consumers in the example above, you’d see `microservice:userSubscription`, and you’d know that way that the userSubscription microservice is “using” the email microservice. NATS JetStream’s CLI, for example, lets you list a stream’s consumers this way.
Projections and Aggregations
Having data in streams has all these nice properties, but it also has different consumption patterns. One we’ve already mentioned is to consume message by message using a subscription; the other is what we’ll describe next.
Imagine we want to know an account’s current balance. We have to go through all the events in the account’s stream and calculate its balance, for example for `account.123`:
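Using the made-up events from the `account.123` sketch above, the calculation is just a fold over the events, oldest first:

```typescript
type AccountEvent = { type: string; data: { amount?: number } }

// The events for account.123 from the illustrative stream above
const events: AccountEvent[] = [
  { type: 'AccountOpened', data: {} },
  { type: 'MoneyDeposited', data: { amount: 100 } },
  { type: 'MoneyDeposited', data: { amount: 200 } },
  { type: 'MoneyWithdrawn', data: { amount: 50 } }
]

const balance = events.reduce((total, { type, data }) => {
  if (type === 'MoneyDeposited') return total + (data.amount ?? 0)
  if (type === 'MoneyWithdrawn') return total - (data.amount ?? 0)
  return total
}, 0)
```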
The balance is 250.
This is called a projection.
Projections are often used to guarantee the idempotency of command executions. Say the email microservice is taken down, for whatever reason, after sending an email but before acknowledging that it handled the `SendEmail` command. Consumers support manual acknowledgement of when a message has been dealt with; if a message is not acknowledged, it is redelivered so it can be retried.
One important aspect here is that the `SendEmail` command is written to the stream using `email:command.emailId`, where `emailId` is a unique id.
The email microservice writes the events to the email stream using the same id as the command’s: `email.emailId`.
That means that when the email microservice gets a command, e.g. `email:command.0082b815-bfe6-4b8a-aeef-20517043ca65`, it can perform a projection over `email.0082b815-bfe6-4b8a-aeef-20517043ca65` and answer the question: did we already send this email? And if so, ignore it.
Here’s what that could look like in code:
```typescript
const initialProjectionValue = {
  isEmailSent: false,
  isEmailOpened: false
}

// Replay this email's events to find out what has already happened
const result = await messageStore.project(`email.${emailId}`, {
  EmailSent: (projected, event) => {
    projected.isEmailSent = true
  },
  EmailOpened: (projected, event) => {
    projected.isEmailOpened = true
  }
}, initialProjectionValue)

if (result.isEmailSent) {
  // ignore command, it was already handled
}
```
Just a few comments about the code sample before we move on. I named the instance that is capable of performing projections over streams `messageStore`. This is by no means universal; `eventStore` would’ve been another good name. In fact, there’s a product called EventStoreDB that supports everything we’ve been describing.
So, a projection is the process of going over all or a subset of events in a stream to extract information.
An aggregation is a little bit different from a projection. What we intend to achieve with an aggregation is to “convert” a stream’s data into something more easily consumable, specifically a database table.
Coming back to the user registration example, before we issue the `RegisterUser` command we want to be able to quickly check if a username is already taken.
If we have the list of users in a database we can quickly check for the uniqueness of the username and that way save us the trouble of having to issue a `RegisterUser` command.
We can, and should, still perform a projection on the user microservice when handling the `RegisterUser` command message. We need to do this to validate that the user hasn’t registered already, and we should not skip this step because if a `RegisterUser` command is retried we have to guarantee its idempotency.
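As a sketch of what that could look like, assuming the command carries a generated `userId` (playing the same role as the `emailId` above) and the same hypothetical `messageStore` helpers:

```typescript
const registerUserSubscription = await messageStore.createSubscription('user:command', {
  RegisterUser: async function (command) {
    const { userId, username, email, firstName, lastName } = command.data

    // Replay this user's own event stream to check for a previous registration
    const projected = await messageStore.project(`user.${userId}`, {
      UserRegistered: (state, event) => { state.isRegistered = true }
    }, { isRegistered: false })

    if (projected.isRegistered) return // retried command, already handled

    await messageStore.write(`user.${userId}`, {
      type: 'UserRegistered',
      data: { userId, username, email, firstName, lastName }, // password handling omitted
      metadata: {}
    })
  }
}, 'microservice:user')
```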
Aggregation is then the process of continuously updating the state of entities in a database. The process is much the same as creating a microservice that handles commands, but in an “aggregator” we subscribe to an event stream and use each event to update a record in a database.
For example, here’s how the MongoDB user collection might look while we have a user aggregator going through the user event stream:
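Something along these lines, sketching a single user document (the field names are assumptions):

```js
// users collection
{
  _id: ObjectId('...'),
  userId: 'user_123',
  firstName: 'John',
  lastName: 'Doe',
  email: 'john@example.com',
  lastUpdateSeq: 2
}
```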
In the example above you can see that the aggregator’s consumer is at the event with sequence number 2 and that the database values reflect that.
You might have noticed the `lastUpdateSeq` property. Using it lets us make sure that if we see the same event twice, just like with a command, the result of handling it again will be the same as handling it just once.
The messages in a stream have a specific order, and once they are added that order is immutable. Because of this, we can safely make some assumptions based on the sequence number of a message. For example, if we’ve already “seen” the message with sequence number 5 and suddenly we see 4 (again), we can safely ignore it.
For example, using Mongo, we could have the following aggregator subscription handling the `NameChanged` event:
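Something along these lines, using the official `mongodb` driver; the collection layout and the way the event exposes its sequence number (`event.seq`) are assumptions:

```typescript
import { MongoClient } from 'mongodb'

const mongo = await MongoClient.connect('mongodb://localhost:27017')
const users = mongo.db('app').collection('users')

// Aggregator sketch: fold user events into the users collection. lastUpdateSeq is
// what makes re-handling an old event a no-op. Creation of the document (e.g. on a
// UserRegistered event) is assumed to be handled elsewhere.
const userAggregator = await messageStore.createSubscription('user', {
  NameChanged: async function (event) {
    await users.updateOne(
      // Only touch the document if it hasn't seen this event (or a newer one) yet
      { userId: event.data.userId, lastUpdateSeq: { $lt: event.seq } },
      {
        $set: {
          firstName: event.data.firstName,
          lastName: event.data.lastName,
          lastUpdateSeq: event.seq
        }
      }
    )
  }
}, 'aggregator:user')
```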
This assumes we don’t scale aggregators up and down. If we do, we can’t assume that the events will reach our aggregator in order. This isn’t a big deal, as an aggregator should be fast enough that scaling it isn’t necessary. If it is, we’ll have to rely on another solution that doesn’t involve using the sequence number this way.
Also, “sequence number” is the term used by NATS JetStream, which is the store I’ve been using lately. It might have a different name in other event stores.
View Data and its Consumption Patterns
Sometimes the aggregated data is referred to as View Data, as that is often its intended use: to “feed” the views.
Coming back to the CQRS paradigm: we issue commands that are handled by microservices, these produce events, these events are aggregated and stored in a database.
All of this is “disconnected” in the sense that, from the point of view of whoever is issuing a command, everything is done once the command is sent. A good example that makes this easy to picture is the checkout process at Amazon: after clicking check out, the confirmation screen comes almost immediately. Details come later in an email.
That’s just one pattern though. Another one, let’s call it “just hope for the best”, is assuming the data will be ready right after issuing the command. Some media websites do this when you post a message/comment (presumably because some sort of moderation happens in the background): after posting the message you are sent to your feed/comments section and most of the time the message will be there (sometimes it isn’t, which leads to people “double posting”).
This is definitely an area where using this event-based architecture is harder than traditional approaches. However, with websockets you can achieve an experience that is indistinguishable from a traditional approach.
In broad strokes here’s how it goes: say you want to provide instant feedback on a name change operation. On your user microservice you’d create a subscription on the user event stream and you’d specifically handle the `NameChanged` event (it’s quite common for a microservice to subscribe to its own event stream).
On handling `NameChanged` you issue a `NotifyUser` command to a `socket-notification` microservice, on a subject like `socket-notification:command.userId`. The socket-notification microservice runs a socket service that the front-end can connect to.
The data in the `NotifyUser` command should contain the necessary information so that when we send the websocket event to the browser, using the right user’s socket connection, we’ll know what to do with it; in this case just the event type (`NameChanged`) is probably enough.
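A sketch of the user microservice’s side of this, with the same hypothetical helpers as before:

```typescript
// Sketch: the user microservice subscribes to its own event stream and asks the
// socket-notification microservice to push the event type to the user's browser.
const nameChangeNotifier = await messageStore.createSubscription('user', {
  NameChanged: async function (event) {
    const userId = event.data.userId

    await messageStore.write(`socket-notification:command.${userId}`, {
      type: 'NotifyUser',
      data: { userId, eventType: 'NameChanged' },
      metadata: { originEntity: `user.${userId}` }
    })
  }
}, 'microservice:user-notifier')
```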
So the full cycle from the website’s point of view is to issue a POST request to trigger the name change, and then wait for the `NameChanged` event to come through as a socket event.
From a user’s perspective this is indistinguishable from a traditional request/response cycle. The difference is that any of the microservices involved might get taken down during this process, and as long as they come back up, the only noticeable thing from the user’s point of view is that the “response” took a little bit longer to arrive.
Conclusion
That’s it. What this blog post describes can be summarized as an event-driven architecture. It has the properties that allow parts of it to run in different processes and, in the extreme, without any of them needing to be running at the same time, which is what we need to really reap the benefits attributed to microservices.