SRE NEWSLETTER

Issue #30 // June 4, 2021

DevOps Roadmap
// roadmap.sh
A visual, step-by-step guide for DevOps or any other operations role.
Best Practices Around Production Ready Web Apps with Docker Compose
// nickjanetakis.com
For DockerCon 21, Nick Janetakis gave a 29-minute talk covering a number of Docker Compose and Dockerfile patterns that he's been using and tweaking while developing and deploying web applications. This post is a written form of the video.
You Don't Need to Rebuild Your Development Docker Image on Every Code Change
// vsupalov.com
Local development in Docker can feel really slow. If you are using docker build frequently and your containers need to be restarted a lot, this post will help you to save some time.
Why Do Config Changes Keep Coming Up in Major Incidents?
// surfingcomplexity.blog
Lorin Hochstein offers a few hypotheses as to why so many outages involve configuration changes, including the possibility that the organization is actually MORE mature.
How to Hire, Assess, & Manage SREs
// blameless.com
Are you considering adopting SRE? Blameless explains the roles and responsibilities of an SRE team within your organization, and how to start building one.
The Advanced Principles of Chaos Engineering
// verica.io
Build a hypothesis around steady-state behavior. Vary real-world events. Run experiments in production. Automate experiments to run continuously. Minimize blast radius.
How a Jenkins Job Broke our Jenkins UI
// slack.engineering
Slack uses Jenkins to continuously build and test their mobile apps before release. One day their Jenkins UI stopped working, although the jobs continued to run. This post is a breakdown of how they ended up in this state and how they fixed the problem.
Salesforce Multi-Instance Service Disruption RCA
// help.salesforce.com
Salesforce made the news last month with a huge outage. This is an excellent, in-depth root cause analysis on the issue.
TeamTNT Targets Kubernetes, Nearly 50,000 IPs Compromised
// trendmicro.com
By analyzing data belonging to a few TeamTNT servers, Trend Micro discovered the tools and techniques used by the group to scan and compromise Kubernetes clusters in the wild.
Stack Overflow Sold to Prosus for $1.8 Billion
// wsj.com
Prosus, an investor in Udemy and Codecademy, struck a $1.8 billion deal to acquire Stack Overflow in a bet on growing demand for online tech learning.