At the Financial Times, we built our first microservices in 2013. We like a microservices-based approach, because by breaking up the system into lots of independently deployable services - making releases small, quick and reversible - we can deliver more value, more quickly, to our customers and we can run hundreds of experiments a year.
However, to benefit from this new approach we had to make some pretty big changes to our culture. I don’t think we’d have been successful in adopting microservices without also adopting devops. Our delivery teams are responsible for building stable, resilient systems and fixing them when they go wrong.
So how do we go about building stable, resilient systems from microservices? And how do we make sure we can fix any problems as quickly as possible?
I’ll talk about building the necessary operational capabilities in from the start: how monitoring can help you work out when something has gone wrong and how observability tools like log aggregation, tracing and metrics can help you fix it as quickly as possible.
We’ve also now been building microservice architectures for long enough to start to hit a whole new set of problems. Projects finish and teams move on to another part of the system, or maybe an entirely new system. So how do we reduce the risk of big issues happening once the team gets smaller and there start to be services that no-one in the team has ever touched?
The next legacy systems are going to be microservices, not monoliths, and you need to be working now to prevent that causing a lot of pain in the future.
You need to find ways to maintain active ownership of any service that is important to your company. This means knowing exactly what services are out there, whether they are important, and making sure people know they are on the hook if there is a problem. It also means making it as easy as possible for people to restore service and get up to speed on functionality they may never have looked at before.