5 take-homes for building a cloud self-service platform

Self-service development is an essential characteristic of cloud computing. It allows engineers to develop and ship code faster, releasing new features and products to market at pace. But it can be one of the hardest cloud-based capabilities to implement…

Steve Wade, ex-platform lead at Mettle, has served in technology leadership roles across financial services, government, real estate and gaming. He recently presented as part of the eSynergy Tech Series, looking back at his time leading the team at Mettle as they created a dynamic self-service platform for engineers.

The challenge at Mettle

Mettle is a venture within NatWest. It was spun up to provide business banking for small to medium-sized enterprises. The Mettle offering is completely digital.

The challenge was for Steve and his team to remove red tape, innovate at speed and save money (cost per customer), all the while providing a scalable platform to service Mettle’s customers.

There was initially a single platform engineer, swamped with requests from developers wanting to innovate. This engineer was unable to help because of their backlog. There was a constant tug of war between development and operations… This had to be addressed.

Steve’s 5 take-homes…

1. “Once you’ve found a pattern that works, your next goal should be helping others to do the same.” – Steve Wade.

At Mettle, we found a pattern that worked. One that scaled for the number of developers we had, and that enabled innovation within the product teams, without the platform team and the platform getting in the way.

2. “Confidence is contagious and so is a lack of confidence.” – Vince Lombardi.

We had to stop the existing tug of war between the developers and the platform team – we had to become one big team working on a common goal to deliver functionality and features to our customers.

This meant instilling confidence and providing self-service to the product teams, while getting away from having a massive backlog of things we needed to do for them.

3. In any good transformation, you need a good mission statement: one you can stand behind.

At Mettle, it was this: “To provide an easily extensible, config-driven, ephemeral platform, with clear ownership and, most importantly, reliability and consistency.”

We committed to Terraform up front, from day one. We wanted an à la carte menu, so that developers and product teams could pick and choose the modules they needed to deploy and run their applications or microservices.
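To make the à la carte idea concrete, here is a minimal sketch of how a team's menu choices could be turned into Terraform module blocks. It is an illustration only, not Mettle's implementation: the catalogue entries, the module source paths and the `render_modules` helper are all hypothetical.

```python
# Hypothetical catalogue: menu item -> Terraform module source.
# None of these names or paths come from Mettle's platform.
CATALOGUE = {
    "postgres": "./modules/postgres",
    "queue": "./modules/queue",
    "bucket": "./modules/bucket",
}


def render_modules(service: str, selected: list[str]) -> str:
    """Render one Terraform module block per item the team picked."""
    # Reject anything that is not on the menu before generating config.
    unknown = [name for name in selected if name not in CATALOGUE]
    if unknown:
        raise ValueError(f"not on the menu: {', '.join(unknown)}")

    blocks = []
    for name in selected:
        blocks.append(
            f'module "{service}_{name}" {{\n'
            f'  source = "{CATALOGUE[name]}"\n'
            f"}}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # A product team picks only what its service needs.
    print(render_modules("payments", ["postgres", "queue"]))
```

The point of the sketch is the shape of the interaction: the platform team owns the catalogue, and a product team only states which items it wants.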

We made this developer-friendly by implementing Atlantis as a centralised tool that all infrastructure changes would run through (Terraform execution by pull request).

Crucially, we achieved clear ownership by leveraging GitOps, and deployed changes through the environments by embedding GitOps into the way developers work – we did not replace the way they worked with GitOps.

We established a cookie-cutter environment for developers to deploy any application they wanted, to any Kubernetes platform, in a consistent fashion. This consistency was achieved through the three-character environment prefix: Dev / UAT / Stage / Production. That flowed through everything we did at Mettle.
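As a rough sketch of how an environment prefix can flow through resource naming, the helper below builds names from a fixed set of prefixes. The prefix values (`dev`, `uat`, `stg`, `prd`), the `resource_name` function and the naming pattern are assumptions for illustration; the talk summary does not spell out the exact values or format Mettle used.

```python
import re

# Assumed three-character prefixes; the real values are not given in the source.
ENVIRONMENTS = ("dev", "uat", "stg", "prd")


def resource_name(env: str, service: str) -> str:
    """Build a consistent '<env>-<service>' name (hypothetical convention)."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment prefix: {env!r}")
    # Keep service names lowercase with digits and hyphens only.
    if not re.fullmatch(r"[a-z][a-z0-9-]*", service):
        raise ValueError(f"invalid service name: {service!r}")
    return f"{env}-{service}"


if __name__ == "__main__":
    # The same service gets a predictable name in every environment.
    for env in ENVIRONMENTS:
        print(resource_name(env, "payments-api"))
```

Because every name starts with the same short prefix, anyone looking at a namespace, a bucket or a pipeline can tell at a glance which environment it belongs to.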

4. “Technology is easy, people are hard.” – Heather Downing.

The people part of the puzzle was the biggest challenge. Application engineers are not Kubernetes experts; their expertise is in application development. We had to work collaboratively with them on the migration, the rollout and making use of the platform.

We had the stability of being backed by a large organisation, but they put us very much under the microscope (particularly with regard to compliance and audit). Because we used Git as our central source of truth, our commits and our PR merges to master provided the most complete audit trail possible.

Our challenge throughout was distilling something that, from the outside, looked ridiculously complicated down to the bare minimum that people at Mettle needed to be able to leverage the platform and innovate for their customers while building better products.

5. The platform team’s customers are the developers. If the developers are happy, the platform team will surely be happy.

The most challenging part of the migration was getting everyone to a shared understanding. There was a shared responsibility and ownership model, but there was also a shared support model. So we needed to make sure all the product teams understood the components they needed to deploy changes for their apps, and also understood the bigger picture of how everything was put together.

Our work meant that:

  • Production deployments increased by 50%
  • Cluster mean time to recovery (MTTR) was 20 minutes (excluding data)
  • Developers were 75% less focused on operations and more focused on delivering value to the customers.

There are now clear ownership boundaries at Mettle. The platform is efficient and enables self-service. Developers can get on with their product development and keep the product leads and product teams happy. In short: everybody wins.

For more information on what Steve and his team achieved at Mettle, watch his eSynergy Tech Series presentation here.

Catch up on our March Tech Series with Steve Wade