SRE NEWSLETTER

Issue #41 // October 22, 2021

The big news this week comes from Cloudflare. They are working to become the 4th major cloud provider and taking a completely different approach from Amazon, Microsoft, and Google.

We also see a number of articles about planning for failure. Site reliability engineering isn't about avoiding failure. It's about knowing that failure will happen and coming up with an appropriate strategy for handling it.

Eating the Cloud from Outside In
// swyx.io
"AWS is playing chess. Cloudflare is playing go." Cloudflare is getting lots of love after announcing R2. This is a really good article about their unorthodox strategy of surrounding and smothering alternative services (as in the game of Go) as opposed to competing directly (as in chess).
Redis Anti-Patterns Every Developer Should Avoid
// developer.redis.com
Redis is the most loved database according to Stack Overflow, and I believe it is now the most common database run on AWS. This article shares a list of 12 things to watch out for.
Multicloud Failover is Almost Always a Terrible Idea
// cloudpundit.com
A full AWS / Azure / GCP outage of all regions is extremely rare. Even if you use IaaS or containers, it takes a ton of effort just to understand the different architectures and features of each cloud provider. The odds of you making an error are higher if you don't stick to one.
SRE Toolkit: Failure Domains
// willett.io
A simple little article about avoiding cascading failure. Worth reading just for the cartoon cow.
Software Developers have Stopped Caring about Reliability
// drewdevault.com
An opinionated piece fighting against the concept of "move fast and break things". The article is less about reliability and more about the overall laziness and lack of craftsmanship in application development - especially when it comes to usability.
How We Cut Our Load Balancing Cost by More Than 96%
// setops.co
I've seen this a number of times... You're overpaying in the cloud because you've provisioned a resource per application or per service (such as a load balancer), either due to copying and pasting IaC files or some "best practice". Save some money. Share.
Nomad vs. Kubernetes
// nomadproject.io
If you're already familiar with HashiCorp's Nomad, skip this one, but K8s gets so much love that I thought this was worth sharing. Nomad offers an architecturally simpler alternative to Kubernetes. This article goes into more detail on why Nomad may be a better fit for your containers.
vscode.dev
// code.visualstudio.com
Now when you go to https://vscode.dev, you'll be presented with a lightweight version of VS Code running fully in the browser. Open a folder on your local machine and start coding. No install required.
Four Lessons Every Company Should Learn from the Back-to-Back Facebook Outages
// venturebeat.com
I hope you have AdBlock for this site. Yuck. Here's a summary: (1) Acknowledge human error as a given and aim to compensate for it. (2) Conduct blameless post-mortems. (3) Avoid cascading failures. (4) Favor decentralized IT architectures.