“GameDay” - Achieving resilience through Chaos Engineering

Melbourne, Jun 29th, 11:30 AM - 12:15 PM, EN 413 (L80+)
Agility has brought us iterative software development, independent feature teams, nimble architectures and distributed, scalable infrastructure. But how do you have confidence that your production environment keeps working in the face of this emergent complexity and fast-paced change?
The answer is to anticipate failure, and to build resilience into every layer. This requires your whole system - not just the software and infrastructure, but also people and processes - to be able to respond quickly and appropriately to unexpected events. And the way to simulate the truly unexpected is to run experiments that deliberately introduce some chaos.
GameDays bring together people from across an organisation to collaboratively break, observe and recover a system - with the impact on the holistic customer experience at front of mind. Apart from learning how the technical system responds under stress, some of the main benefits come from the shared understanding and process improvements that are generated. GameDays should be more than just an event or a one-off exercise - they embody an enduring mindset and a culture.
This session will examine, from a first-hand perspective, several case studies in which GameDays have been successfully executed in organisations ranging from startup to enterprise scale. The theoretical underpinnings of chaos engineering will be explored, and a range of practical tips and reference material will be shared.

Outline/Structure of the Talk

  • Why Chaos Engineering and GameDay? (10 mins)
    • Principles of Chaos Engineering
    • How GameDay is different from Chaos Monkey and DR tests
    • What are the circumstances in organisations that contribute to challenges around resilience
    • We want to inspire organisations to give GameDays a go

  • Case Studies - How GameDay has been implemented in three different organisations and the common challenges and pitfalls (15 mins)
    • Startup scale (health insurance provider)
    • Medium scale (national classifieds provider)
    • Enterprise scale (one of the big 4 banks)

  • How to plan a GameDay (5 mins)
    • People
    • Scenario and hypothesis generation process
    • Logistics

  • Practical tips to ensure GameDay has an enduring effect on resilience and culture (10 mins)
    • Managing cultural perceptions of failure
    • Communicating the value of chaos engineering to stakeholders of all levels in an organisation
    • The journey towards automated resilience testing (e.g. Chaos Monkey)

  • Questions (5 mins)

Learning Outcome

  • What is chaos testing and why do it
  • What is a GameDay
  • How to plan a GameDay
  • Common challenges and pitfalls
  • How to ensure that your GameDay isn’t a one-hit wonder and has an enduring effect on resilience and culture

Target Audience

Members of agile delivery teams, delivery managers, technical leaders, product owners

Submitted 3 years ago

  • Anna Fiofilova

    Anna Fiofilova - Survival guide for women in IT

    Anna Fiofilova
    Sr. Software Engineer
    REA Group
    3 years ago
    20 Mins
    Working in the IT industry today is hard, but it is even harder if you are a woman. There is still lots of “old-school thinking” that women face daily. This talk is a survival guide based on real-life stories and different challenges from women of different ages, cultural backgrounds and roles in the Australian IT industry.
    Like any survival guide, this one provides you with the essential information to help you identify and overcome the most frequently encountered hazards. Each chapter contains useful tips, instructions and practical advice on a particular issue so you can implement the skills and techniques even under the most stressful circumstances. From the hiring process to promotions and corporate events, you'll have the tools to survive.
    You will learn these skills and more:
    • Assess your situation and prioritize your needs;
    • Survival techniques for the hiring process;
    • Assemble your own custom emergency kit with essential tools;
    • Manage extreme work conditions and overtime;
    • Survive corporate parties and drinks;
    • Build a trust network and create allies;
    • Identify your enemies and their habits.
    Preparation is the key. If you are starting your career in IT or navigating through it - this guide is for you.
  • Steve Mactaggart

    Steve Mactaggart - Evolving the role of team leadership in a devops transformation

    20 Mins

    There is much discussion about the changing roles of Development and Operations staff when organisations undergo agile/digital/devops transformations. But what about the changing role of the Team Leader?

    In pre-agile environments, as a Team Lead, your role is one of structure and co-ordination: work is routed through you. You know the skills and capacity of your team and are regularly making decisions about what can and can’t be done.

    But as your team starts to work in agile teams, the need for you to keep them busy is reduced, as this is now a responsibility of the product owner and agile team itself.

    You might find yourself asking “Do we need Team Leads in an agile/devops culture?”, and if so, “What value can I provide?”

    This session looks at the opportunities existing Team Leaders have to support and drive digital transformation, by discussing the areas of focus they can bring to the team.

  • John Contad

    John Contad - The Importance of Teaching in Organizations

    John Contad
    Development Lead
    3 years ago
    45 Mins

    Mentorship matters. A lot.

    The future is going to be weird: technologies are growing faster than we can teach them, and we need more experts quickly. In this talk, we'll discuss the many ways we teach DevOps practices in an organization as analogues of how systems transmit data. We'll talk about the advantages and pitfalls of:

     - Broadcast systems (e.g., Universities)
     - 1:1 Discovery (e.g., Mentorships)
     - Gossip protocols (e.g., Communities and guilds)

    We'll unpack each methodology, discuss the information dispersal mechanisms and attributes of each system, and see where they fit. Because really: DevOps isn't about technology choice, or language, or infrastructure. First and foremost, it's about people.


  • Robert Lamb

    Robert Lamb - The Viable Systems Model: a framework for organisational design

    45 Mins

    The Viable Systems Model (VSM) offers a systematic approach to diagnosing organisational issues and designing organisational structures in consideration of how the organisation adapts to its environment. This session will provide an overview of the VSM's insights and modes of use. 

    We propose that enterprises can be considered to have three aspects: setting the strategy (direction), getting the work done (realisation), and structuring for sustainable effectiveness (organisation).

    Popular contemporary frameworks for Strategy development include the Business Model Canvas and systems thinking approaches such as causal loop diagrams and system dynamics models.

    Realisation methodologies include Lean and Six Sigma techniques for process improvement and BPM for process management, as well as specialist IT practices such as Agile and Enterprise Architecture. 

    There appears to be a gap when it comes to methodologies for organisational design.

    The Viable Systems Model (VSM), developed by Stafford Beer on cybernetics principles in the 1960s, offers a complementary, systems-oriented approach to the Organisation dimension.

    Stafford Beer's books are not widely available, and much of the discussion on the internet is highly technical and specialised. However, the basic concepts of the VSM can be very useful in analysing and designing organisational structures and interactions.

    This talk will provide a non-technical introductory overview of the elements, objectives and application of the VSM, and will invite participants to consider its applicability to their own needs.

  • Jagannath Vaikuntham

    Jagannath Vaikuntham - Ensuring Better Quality with Docker

    Jagannath Vaikuntham
    Quality Craftsman
    3 years ago
    45 Mins

    Docker is an awesome container platform. In this talk, I would like to show and discuss how it can help solve some of the common and annoying issues faced while testing, namely:

    • The "It worked on my machine" problem
    • Testing different configurations with the same codebase
    • Efficiently testing on your own or a developer's machine
    • Setting up the Continuous Integration environment
    • Scaling / parallelising test runs (via Selenium Grid & Docker Compose)
  • Tim Pittman

    Tim Pittman / Shannon Rowe - From Consultant to Client

    20 Mins

    Join Shannon and Tim for a fast-paced account of their move from being high-flying consultants to down-to-earth product people.

    Where do they now add value? How has their relationship with their team changed? How many timesheets are they doing?

    All will be revealed!



  • Tom Partington

    Tom Partington - An introduction to Web Performance Optimisation - practical steps for reducing costs and improving the user experience

    45 Mins

    The web is increasingly becoming the standard way in which we conduct our business, but despite the use of ever-improving technologies, many websites are frustratingly slow and getting slower. It is becoming more difficult to compete for and retain users' attention, and if you operate in the online space you can no longer afford to ignore the performance of your website or platform.

    This talk will introduce web performance optimisation and its benefits, and explain why it matters now more than ever and why it is so commonly overlooked during the development process. It will also show how you can identify and fix the most common performance pitfalls, resulting in reduced costs and increased user engagement and satisfaction.

    During this talk there will be an opportunity to follow along on your own laptop and learn how to use some of the tools firsthand.

  • Colin Panisset

    Colin Panisset - Simplifying Amazon ECS by Weaving overlay networks

    45 Mins

    Amazon ECS is a widely used Docker container scheduling and orchestration platform, but some of the constraints it applies make for ... interesting ... workarounds. Chief among the challenges are connecting containers to each other, across services, clusters, and even regions. Working in hybrid cloud environments is also a problem.

    This talk will show you one way to address these challenges: by using the "weave" Docker overlay network, a self-discovering and self-configuring mesh network that permits configuration simplicity and out-of-the-box parity with the popular docker-compose structure.