Last year Newsweaver began development of a new project with two primary goals. First, we had a new analytics product to develop. Second, we wanted to use this product as the starting point for a new approach to building and delivering software within Newsweaver. In this post I will explain our approach to development, our new architecture, and how we run and monitor our software. This is an overview; future blog posts on our new analytics product will provide more detail on each of these topics. In the spirit of agile we are open to continuous improvement, so while this represents where we are or where we are heading, both are likely to change.
What we are trying to solve
Newsweaver is growing as a company, and as we grow we need the ability to scale our teams and our software more effectively. Traditionally our product consisted of a main application with long-running tasks delegated to approximately 20 back-end services, all talking to a shared database. This was essentially what is referred to as a monolith. The database became an integration point, with changes having widespread impact. New developers needed to understand the complete codebase, which increased the time required for onboarding.
When we deployed, it was one release of the whole application. This process, although mostly automated, was still costly. This cost led us to deploy once a month, which meant we had a month’s worth of features in one release. That’s a lot of change in one go, and with additional teams there would be even more. We wanted teams to be able to work relatively independently and respond with greater agility.
There were many criteria for evaluating possible solutions. We had to take into account the limitations, outlined above, with our current approach but we also wanted to consider the impact of changes on:
- our sysadmins
- ease of scaling
- cost of deployment
- ease of monitoring.
This list of criteria led us to microservices, an approach that ticked all the boxes we were looking at and tied in with our thinking at the time on Domain-Driven Design, Continuous Delivery and DevOps.
The buzz around microservices has been growing over the last couple of years. The hype is best summed up by the following picture (also see Docker).
But behind the hype is a solid approach to building software. Microservices made sense for Newsweaver as they solve many of the issues inherent in scaling development teams. It is, however, important to note that they are not a silver bullet and that they introduce some complications of their own. The microservice approach to application architecture deserves a blog post of its own, which we will write soon.
Microservices are small (this is the micro bit) services which communicate in order to achieve some bigger goal. This requires an approach to building and running software that borrows heavily from the DevOps, Domain-Driven Design and Continuous Delivery movements. One approach to defining the composition of microservices is to use the Domain-Driven Design concept of bounded contexts. Each bounded context represents a different service or, more likely, provides a gateway into a set of services or smaller bounded contexts. Our services communicate through well-defined REST APIs. We use Consumer-Driven Contracts to ensure that we can make changes with confidence. Borrowing from DevOps, the running, monitoring and maintenance of the software become a concern for the people who write it. Continuous Delivery concentrates on short release cycles and ease of deployment. These are enabled through separation of concerns in smaller services and a shorter Mean Time To Repair.
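To make the idea of a Consumer-Driven Contract concrete, here is a minimal sketch in Python. It is an illustration only: the field names, the `CONSUMER_CONTRACT` dictionary and the `contract_violations` function are hypothetical and are not taken from the Newsweaver codebase or from any contract-testing library. The consumer states which fields it relies on, and the provider's build checks its responses against that statement.

```python
from typing import Any, Dict, List

# Hypothetical contract: the fields (and types) one consumer relies on
# in a provider's JSON response. All names are invented for illustration.
CONSUMER_CONTRACT = {
    "campaign_id": str,
    "opens": int,
    "clicks": int,
}


def contract_violations(response: Dict[str, Any], contract: Dict[str, type]) -> List[str]:
    """Return human-readable violations; an empty list means the contract holds."""
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            actual = type(response[field]).__name__
            problems.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return problems


# Extra fields are fine: the provider may add data the consumer ignores.
ok = {"campaign_id": "c1", "opens": 3, "clicks": 1, "bounces": 0}
print(contract_violations(ok, CONSUMER_CONTRACT))
```

Because only the fields a consumer actually uses are checked, the provider stays free to add or rearrange anything else, which is what lets each side change with confidence.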
When microservices are built correctly they provide a number of benefits. They enable largely independent functionality changes, decoupled deployment cycles and horizontal scaling to solve bottlenecks, and they reduce the impact of failure (you break one thing rather than the whole application). It also becomes easier to onboard new developers, as the smaller codebase of an individual service is conceptually easier to understand. However, if they are built incorrectly they can be difficult to run, and you end up maintaining multiple applications in a live environment rather than one monolith. You can also end up spreading complexity and coupling around, which is a worse problem than nice centralised complexity and coupling, also known as a big ball of mud.
The running burden
The observant among you will be questioning how we are easing the burden on operations by giving them lots of small services to deploy, monitor and run. We do this via automation, lots of automation.
We automate as many of our processes as possible, or at the very least require only one command or button push to trigger sensitive or restricted actions. This includes:
- build creation for new branches (Stash/Bamboo)
- code quality checks (SonarQube for reporting, and JaCoCo to fail builds for lack of coverage, issue count or technical debt)
- automated tests on every check-in
- automatic artifact creation
- build promotion
- automatic build delivery to test environments
- automatic provisioning of new environments (Ansible)
- service deployment (Kubernetes)
- service discovery (SmartStack)
- monitoring (Graphite/Grafana, Dropwizard Metrics).
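As a rough illustration of what "failing the build" on coverage, issue count or technical debt amounts to, the following Python sketch evaluates a quality gate. The threshold values, the report keys and the `gate_failures` function are all assumptions made for this example; they do not reflect the actual SonarQube or JaCoCo configuration.

```python
from typing import Dict, List

# Hypothetical thresholds; real limits would live in the build configuration.
LIMITS = {"min_coverage": 0.80, "max_issues": 0, "max_debt_minutes": 60}


def gate_failures(report: Dict[str, float], limits: Dict[str, float] = LIMITS) -> List[str]:
    """Return the reasons a build should fail; an empty list means it passes."""
    failures = []
    if report["coverage"] < limits["min_coverage"]:
        failures.append("coverage below minimum")
    if report["issues"] > limits["max_issues"]:
        failures.append("too many issues")
    if report["debt_minutes"] > limits["max_debt_minutes"]:
        failures.append("too much technical debt")
    return failures


# A build with low coverage and open issues is rejected with both reasons.
print(gate_failures({"coverage": 0.5, "issues": 2, "debt_minutes": 30}))
```

A build server would typically turn a non-empty result into a non-zero exit code so that the pipeline stops before any artifact is created.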
About the only thing not automated is pull requests. This is a manual process and enables us to share knowledge within teams while also ensuring the quality and maintainability of the codebase.
Our unit of deployment is a Docker image, and each microservice resides in its own image. The image is immutable and is promoted through our various environments (test, staging, production). Kubernetes is used for the deployment, management and replication of these images on our servers. This removes a lot of the overhead associated with deployment, as it is all automated and simply requires a Kubernetes config file per service.
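The key property of promoting an immutable image is that the very same artifact moves from one environment to the next and is never rebuilt along the way. A minimal Python sketch of that idea follows; the `Image` class, the `promote` function and the tag format are hypothetical, and only the environment names come from the text above.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Environment order as described in the post.
PROMOTION_ORDER = ("test", "staging", "production")


@dataclass(frozen=True)  # frozen: an image cannot be altered once built
class Image:
    service: str
    tag: str  # hypothetical build identifier


def promote(image: Image, deployed: Dict[str, Image]) -> Optional[str]:
    """Deploy the image to the first environment not yet running it.

    Returns the environment name, or None if the image is already everywhere.
    """
    for env in PROMOTION_ORDER:
        if deployed.get(env) != image:
            deployed[env] = image
            return env
    return None


deployed: Dict[str, Image] = {}
image = Image("analytics-api", "build-42")
print(promote(image, deployed))  # first promotion goes to "test"
```

Because `promote` always fills the earliest gap, an image can only reach production after it has been running in test and staging.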
We have zero-downtime deployments on the parts of our system running on the new environment. This is a great benefit and again ties into our development approach as a whole: zero-downtime deployments mean we are not tied to certain deployment times, so we can respond with greater agility to important features or bugs when they appear. The intention within Newsweaver is to move the entire application to microservices.
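One common way to achieve zero downtime is a rolling update, where instances are replaced one at a time so that capacity never drops. The post does not say which mechanism is used, so the Python sketch below is an assumption-laden illustration of the general technique, with a hypothetical `rolling_update` function and a caller-supplied health check.

```python
from typing import Callable, List


def rolling_update(running: List[str], new_version: str,
                   is_healthy: Callable[[str], bool]) -> List[str]:
    """Replace old instances one by one, never dropping below the original count."""
    target = len(running)
    result = list(running)
    for old in running:
        if old == new_version:
            continue  # this instance is already up to date
        if not is_healthy(new_version):
            break  # abort: the remaining old instances keep serving traffic
        result.append(new_version)  # start the new instance first
        result.remove(old)          # then retire one old instance
        assert len(result) >= target
    return result


print(rolling_update(["v1", "v1", "v1"], "v2", lambda version: True))
```

If the health check fails part-way through, the function stops and returns a mix of old and new instances, all of which are still able to serve requests.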
Having lots of microservices means there are lots of things that can break. We have multiple service types, all of which have multiple instances. We need to know instantly when something is broken, ideally before the customer does or, even better, before it actually breaks. This is why monitoring is a first-class member of our ecosystem. It cannot be an afterthought of a microservices implementation.
We use Graphite, with Grafana (which makes the graphs pretty) sitting on top, for displaying metrics, and Beacon for alerting. We have a number of data sources. We generally write our services using Dropwizard, which exposes monitoring metrics for the application and the JVM via Yammer Metrics. We use collectd for our OS-level infrastructure metrics. Other systems (Spark and Cassandra) feed into Graphite via native plugins.
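Whatever the source, data reaches Graphite in a very simple shape. Its plaintext protocol accepts one line per data point in the form `path value timestamp`, by default on TCP port 2003. The Python sketch below shows that format; the metric name and the `localhost` host are assumptions for the example, and the helper functions are hypothetical rather than part of any Graphite client library.

```python
import socket
import time
from typing import Optional


def format_metric(path: str, value: float, timestamp: Optional[float] = None) -> str:
    """Build one line of Graphite's plaintext protocol: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line: str, host: str = "localhost", port: int = 2003) -> None:
    """Send a formatted line to a Graphite (carbon) listener; host is an assumption."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric name; nothing is sent over the network here.
print(format_metric("analytics.api.requests", 12, 1700000000), end="")
```

The dotted path is what Graphite uses to build its metric tree, so a consistent naming scheme across services makes the Grafana dashboards much easier to assemble.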
Kanban and Lean
Kanban is used as a way to organise and schedule our work. This has been a success so far. The Kanban board provides good visibility for people on the teams. It helps to shift the focus away from getting features complete by a certain time and towards simply getting features complete. This enables us to focus instead on quality and on getting the right features built. The development team commit to a certain priority of features for a given cadence (similar to a sprint), but not to all of those features being built in that timeframe. Product managers benefit from greater agility in responding to market demands, thanks to the combined benefits of independent deployment cycles and short Kanban cadences.
Our features are expressed as stories, and keeping those stories small and lean is very important. This is where we tie it all together and take advantage of our architecture, automated testing, automated deployment and zero-downtime deployments. They are all interwoven and, when coupled with small stories, enable us to deploy, validate and iterate quickly on a feature.
This is a quick overview of our approach to software in Newsweaver: what we use, what we use it for and why we use it. Each of these topics warrants further posts, which we will get around to producing over the coming months. Up next is a post on our analytics product and the technologies we use.