If you are pressed for time and already familiar with basic chaos engineering concepts, you can jump to the 5-minute guide to running a Chaos Day and the case study on condensing the entire process into a 2.5-hour mini chaos event.
For those who want more depth, read on.
We’ve created this playbook to help teams and organisations design, plan, execute and review a Chaos Day. It’s not just for engineers; it is for everyone involved in delivering software. Product owners can learn more about the risks and impacts of failure, testers can learn how to explore edge cases and test for resilience, and designers can benefit from a greater understanding of the user experience of failure and how to design interfaces that are adaptable.
This playbook is for any organisation, regardless of its tech stack or maturity. You don’t have to use containers or Kubernetes, or be in AWS, GCP, Azure or any other cloud platform, to gain the benefits of probing your system’s response to failure.
A day of chaos is for everyone
- Chaos Days are great opportunities to run experiments that explore security threats. For a distillation of our thinking on how best to apply security within continuous delivery, look at our Secure Delivery Playbook.
- Services of any size benefit from Chaos Engineering. This playbook describes an approach that can be scaled up from a single service to an entire platform. We have further advice on why, when, and how to build a Digital Platform in our Digital Platform Playbook.
- For teams practising the You Build It You Run It (YBIYRI) operating model to build, deploy, operate, and support their own digital services, Chaos Days are a perfect tool for better understanding how their services respond to failure. You can learn more about the YBIYRI model in our You Build It, You Run It playbook, by Steve Smith and Bethan Timmins.