Confused Tester in Chaotic World #ChaosTesting
"You can’t legislate against failure, focus on fast detection and response."
You can think of this as a fairy tale:
Once upon a time, in theory, if everything worked perfectly, we had a plan to survive the disasters we had thought of in advance.
But the big question is: how did that work out?
This session sets out to answer that question.
While it is possible to sit down and anticipate some of the issues you can expect when a system fails, knowing what actually happens is another thing.
How far you go depends on your tolerance for failure and on how likely those failures are.
The result is that you are forced to design and build highly fault-tolerant systems that can withstand massive outages with minimal downtime.
The prevailing wisdom is that you will see failures in production; the only question is whether you'll be surprised by them or inflict them intentionally to test system resilience and learn from the experience. The latter approach is chaos engineering.
An important aspect of Chaos Engineering is Chaos Testing.
Historically, the emphasis has been on mean time to failure (MTTF): working hard to extend the time between system failures, with little emphasis on how fast a failure could be corrected.
In today's world, the emphasis needs to shift to mean time to recover (MTTR), minimizing the time it takes to recover from a failure.
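As a rough illustration of the two metrics, the sketch below computes MTTF and MTTR from a list of incidents. The `Incident` record, its field names, and the sample timestamps are assumptions made up for this example; they are not data from the case study.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical incident record; times are seconds since observation began."""
    failed_at: float
    recovered_at: float


def mttr(incidents):
    """Mean time to recover: average duration of each outage."""
    return sum(i.recovered_at - i.failed_at for i in incidents) / len(incidents)


def mttf(incidents, observation_start=0.0):
    """Mean time to failure: average healthy interval preceding each failure."""
    uptimes = []
    last_up = observation_start
    for i in sorted(incidents, key=lambda x: x.failed_at):
        uptimes.append(i.failed_at - last_up)
        last_up = i.recovered_at
    return sum(uptimes) / len(uptimes)


# Made-up sample data, for illustration only.
incidents = [
    Incident(failed_at=100.0, recovered_at=130.0),
    Incident(failed_at=400.0, recovered_at=410.0),
    Incident(failed_at=900.0, recovered_at=920.0),
]
print("MTTR (s):", mttr(incidents))
print("MTTF (s):", mttf(incidents))
```

Extending MTTF means making the healthy intervals longer; reducing MTTR means making each outage shorter, which is the shift in emphasis described above.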
At a high level, chaotic testing is simply creating the capability to continuously, but randomly, cause failures in your production system. This practice is meant to test the resiliency of the systems and the environment, as well as determine MTTR.
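The idea of continuously but randomly causing failures can be sketched as a small simulation. This is a toy model under stated assumptions, not the tooling used in the case study: the `Service` class, its `recovery_ticks` parameter, and the `run_chaos` function are hypothetical names invented for this example, and a real setup would kill actual processes or nodes rather than flip a counter.

```python
import random


class Service:
    """Hypothetical stand-in for a real service instance."""

    def __init__(self, name, recovery_ticks):
        self.name = name
        self.recovery_ticks = recovery_ticks  # ticks needed to come back after a kill
        self.down_for = 0                     # remaining ticks until healthy again

    @property
    def healthy(self):
        return self.down_for == 0

    def kill(self):
        self.down_for = self.recovery_ticks

    def tick(self):
        if self.down_for > 0:
            self.down_for -= 1


def run_chaos(services, ticks, kill_probability, seed=None):
    """Randomly kill healthy services and measure observed recovery times."""
    rng = random.Random(seed)
    killed_at = {}   # service name -> tick at which it was killed
    recoveries = []  # observed recovery durations, in ticks
    for now in range(ticks):
        # Advance the simulation and record any service that has just recovered.
        for s in services:
            s.tick()
            if s.healthy and s.name in killed_at:
                recoveries.append(now - killed_at.pop(s.name))
        # With the given probability, kill one randomly chosen healthy service.
        healthy = [s for s in services if s.healthy]
        if healthy and rng.random() < kill_probability:
            victim = rng.choice(healthy)
            victim.kill()
            killed_at[victim.name] = now
    mttr = sum(recoveries) / len(recoveries) if recoveries else None
    return recoveries, mttr


services = [Service("call-router", 2), Service("media-gateway", 5)]
recoveries, mttr = run_chaos(services, ticks=200, kill_probability=0.1, seed=42)
print("failures recovered:", len(recoveries), "MTTR (ticks):", mttr)
```

The point of the loop is that recovery time is observed from the outside rather than assumed, which is what makes the resulting MTTR figure worth trusting.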
Adopting chaotic testing will help improve your MTTR, increase organizational confidence in the resiliency of your production environment, and keep you out of tomorrow's headlines.
This case study shows how, in the real world, we can handle failures by testing proactively instead of waiting for an outage.
The product under study is one of the key products serving the contact center industry across the globe.
The impact of an outage in a contact center with 40K+ agents, especially during peak seasons, is huge. Contact centers are considered the backbone of industries such as e-commerce, telecom, and travel, which deal directly with people.
We will show how we ensured a seamless takeover between contact centers across the globe even if an entire high-availability contact center goes down, with several thousand established calls recovering in fractions of milliseconds, and how all of this was achieved by continuously testing the unknowns in a controlled environment.
Outline/Structure of the Case Study
- Introduction and context setting - 5 Mins
- What is Chaos Testing? - 5 Mins
- Should we be Chaos Testing? - 5 Mins
- Case Study - What we learnt while testing a leading contact center product in production - 10 Mins
- What are the advantages and challenges of Chaos Testing? - 10 Mins
- Feedback, Q&A, Reflection - 5 Mins
- What is Chaos Testing?
- Should we be Chaos Testing?
- How to start Chaos Testing?
- What are the advantages of Chaos Testing?
- A story of how we identified the need for testing in production and what we learnt from it.
- We will not tell you "this is how you should do it", but I hope you will all find some of the lessons useful in your context.
Anybody and everybody pursuing a professional career
Prerequisites for Attendees
- Come with an open learning mind
- Ask questions with focus on discovery and learning
- Participate fully in activities - this is a co-learning session, and I am not an expert here to give a one-way presentation.
- Share experiences
People who liked this proposal, also liked:
Ashish Kumar (Agile & Lean Coach, Siemens) - Shifting Gears for better quality or faster delivery - Shift Left or Shift Right!
In an Agile world, we are being asked to move faster, reducing the time to delivery while continuing to improve quality. At the same time, we face increased pressure to reduce testing costs. The main aim of shifting left is to move from ‘Early Defect Detection to Defect Prevention’.
Bugs are cheaper when caught young. I was a testing professional at the start of my career, where I saw an agile transformation from ground zero, and I experienced everything a tester usually faces at the start of an agile way of working.
The most difficult phase is the shift from a defect identification or detection mode towards a more collaborative approach of defect prevention.
I learnt why a tester is no longer the quality police and is not solely responsible for quality; the whole team is. Development teams need to focus on quality from the beginning, instead of waiting for errors and bugs to be discovered late in the game.
These learnings, and many more from my early career, helped me when I started coaching agile teams to deliver better-quality products. In a transformation I have been leading recently, one of the major asks was to break the boundaries and silos between development and system testing teams. This topic is close to my heart because I have practiced, performed, and tried to perfect it time and again.
I want to share how “Shift-Left” testing can help your product quality, and what it means for the testing community when people with vastly different skill sets get involved in the testing process. More specifically, it means that development teams are being incorporated into the testing cycle earlier than ever before.