Imagine being on call for a service and getting paged for high CPU but not knowing why. Traffic hasn't increased and no recent deployments have happened. However, through the Clutch Slack sink, you can see in Slack that a teammate resized your service’s HPA/ASG. Visibility into infrastructure changes is a top level concern for Clutch, which is why the Slack sink is one of the out-of-the-box features.
Services are bound to degrade. It’s a matter of when, not if. In a distributed system where there are many interdependent microservices, it is increasingly difficult to know what will happen when a service is unavailable, latency goes up, or when the success rate drops. Usually, companies find out the hard way when it happens in production and it affects their customers. This is where Chaos Engineering helps us.
Design is one of those things that is tricky but imperative to get right. While this may seem obvious, creating and implementing a great design is much harder than you might think.
Ever wonder how Clutch handles serving frontend assets without a CDN in a distributed multi-version deployment? This article will touch on how Lyft deploys Clutch and some of the early problems we faced when rolling it out.
Today we are excited to announce the open-source availability of Clutch, Lyft’s extensible UI and API platform for infrastructure tooling. Clutch empowers engineering teams to build, run, and maintain user-friendly workflows that also incorporate domain-specific safety mechanisms and access controls.