// increment.com
This month's issue of Increment shares approaches to reliability and resiliency in our software, technologies, and teams, and offers perspectives on the realities of failure in the systems we build.
// protocol.com
A little over 10 years ago, a group of operations-oriented engineers decided they were fed up with software developers who didn't care if their code actually worked. Those engineers created the Velocity Conference to come together as a community. That community sparked a revolution known as DevOps.
// datadoghq.com
Datadog moved their job system to Kubernetes. It took substantially more CPU time than before, yet completed jobs 40-50% more slowly. This post describes how they solved this performance regression. The solution involved some performance experiment design, light performance tuning, and timing analysis to get back to parity.
// codeascraft.com
For Etsy, 2020 was a year of unprecedented and volatile growth. Their site traffic leapt up in the second quarter, when lockdowns went into widespread effect, by an amount it normally would have taken several years to achieve. If they over-scaled, there was a risk of wasting money. If they under-scaled, it could have been much worse for their sellers.
// archive.org
[Video] Jonah Edwards runs the core infrastructure team at the Internet Archive. In this video he goes through the scale of the infrastructure that runs the Internet Archive.
// atodorov.me
A decent part of the proposals for a “new Kubernetes” are design choices already made by HashiCorp's Nomad, which is a pretty underrated orchestrator, and drastically simpler.
// productcrunch.substack.com
Can the characteristics of an agile roadmap be distilled into a set of universal principles? Such principles help you assess an existing roadmap or create a new one with them in mind. The result is an attempt at defining agile roadmapping principles.
// theregister.com
Windows Server releases every three years or so, and continues to make progress on its goals of easier administration, removing reliance on the server desktop GUI, and stripping down the operating system so that most features are optional components.
// engineering.mercari.com
This blog post provides an understanding of the retry pattern used in microservices architecture, why it should be used, a few considerations while using the retry pattern, and how to use it in Python.
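As a rough illustration of the retry pattern mentioned above, here is a minimal sketch in Python. It is not taken from the Mercari post; the decorator name `retry` and its parameters (`max_attempts`, `base_delay`, `max_delay`, `retry_on`, `sleep`) are assumptions made for this example. It retries only the listed exception types, backs off exponentially with random jitter, and re-raises the last error once the attempts are used up.

```python
import random
import time
from functools import wraps


def retry(max_attempts=3, base_delay=0.1, max_delay=2.0,
          retry_on=(Exception,), sleep=time.sleep):
    """Hypothetical retry decorator: exponential backoff with full jitter."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    # Out of attempts: let the caller see the last error.
                    if attempt == max_attempts:
                        raise
                    # Backoff ceiling doubles on each failure, capped at max_delay.
                    ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
                    # Jitter spreads retries out so clients do not retry in lockstep.
                    sleep(random.uniform(0, ceiling))
        return wrapper
    return decorator
```

Decorating a call with `@retry(max_attempts=5, retry_on=(ConnectionError,))` would retry only connection failures. Retries are generally safe only for idempotent operations, and limiting `retry_on` to transient errors avoids repeating requests that can never succeed.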
// gru.gq
This is a discussion of the complexity of the SolarWinds hack. Could SolarWinds have been too difficult for the KGB to use in an enablement operation? Yes. Of course, SolarWinds wasn’t close to reaching that level.