Deep Learning Based Selenium Test Failure-triage Classification Systems

Problem Statement:

Running thousands of automated test scripts on every nightly schedule produces a mix of passes and failures. The problem begins when failed tests pile up: we were caught in the test-automation trap, unable to complete the failure triage from one automated test run before the next testable build was released.

Deep Learning Model:

The classification was achieved by introducing Machine Learning in the initial months, followed by Deep Learning algorithms, into our Selenium and Appium automation tests. Failed test cases were classified into four major categories: Script, Data, Environment, and Application Under Test, which internally had hundreds of sub-classifications.

To overcome this problem, we started to build and train an AI model using Deep Learning that simulates how a human categorizes a test result suite. For each failed test, the model predicts an outcome through an API, then categorizes and prioritizes it on a scale of 0 to 1. Based on the prediction, the algorithm takes an appropriate response action on the failed test cases, such as re-running the test or running it with different capabilities. We kick-started this by gathering a historical data set of 1.6 million records, collected over a 12-month period, covering the behavior of test case executions and the resulting suites.
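As a rough illustration of the last step, the sketch below maps one prediction to a follow-up action. The `Prediction` type, the `choose_action` function, the 0.8 cutoff, and the category-to-action mapping are all assumptions made for this example; the proposal only states that scores lie between 0 and 1 and that actions include re-running a test or running it with different capabilities.

```python
from dataclasses import dataclass

# The four top-level categories named in the proposal (lowercased here).
CATEGORIES = ("script", "data", "environment", "application")


@dataclass
class Prediction:
    test_id: str
    category: str  # one of CATEGORIES
    score: float   # model confidence in [0, 1]


def choose_action(pred: Prediction, cutoff: float = 0.8) -> str:
    """Map a prediction to a follow-up action (hypothetical policy)."""
    if pred.category not in CATEGORIES:
        raise ValueError(f"unknown category: {pred.category}")
    if not 0.0 <= pred.score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if pred.score < cutoff:
        # Low confidence: leave the failure for a human to triage.
        return "manual-triage"
    if pred.category == "environment":
        return "rerun"
    if pred.category == "script":
        return "rerun-with-different-capabilities"
    if pred.category == "data":
        return "refresh-test-data"
    return "log-defect"
```

In practice the cutoff and the mapping would be driven by the team's own data rather than hard-coded.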

This Deep Learning-based algorithm breaks down new defects by category and gives each a classification score on a scale of 0 to 1. We have also established a cutoff threshold that is tuned as accuracy improves, and we group failed test cases by similarity. Test cases are classified at high granularity for sophisticated analysis, and our statistical report shows that defect classification improved to 87% accuracy over a year. The system is built on feedback-adapting models: each correct classification earns a reward and each wrong one incurs a penalty. Whenever it receives a penalty, the system automatically adjusts itself for the next execution.
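Grouping failures by similarity can be sketched with nothing more than string comparison. The example below uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever similarity measure the actual system uses; `group_failures` and the 0.8 threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher


def group_failures(messages, threshold=0.8):
    """Greedily group failure messages whose text is similar.

    Each message joins the first group whose representative (its first
    member) has a similarity ratio >= threshold; otherwise it starts a
    new group.
    """
    groups = []
    for msg in messages:
        for group in groups:
            if SequenceMatcher(None, group[0], msg).ratio() >= threshold:
                group.append(msg)
                break
        else:
            groups.append([msg])
    return groups
```

A greedy pass like this is order-dependent and quadratic in the worst case, so it is only suitable as a first approximation on modest batches of failures.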

The algorithm includes a model for detecting false-positive test results, calculated from snapshot comparisons, test step counts, script execution time, and log messages. The model also has other features, such as duplicate failure detection, retry algorithms, and a defect logging API.
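The four signals listed above could be combined in many ways. The sketch below is a simple rule-based check, not the trained model described in the proposal; the `RunSignals` fields, the `untrustworthy_signals` function, and every threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RunSignals:
    snapshot_diff_ratio: float   # share of the screenshot differing from a baseline, 0..1
    steps_executed: int
    steps_expected: int
    duration_s: float
    baseline_duration_s: float
    log_lines: List[str] = field(default_factory=list)


def untrustworthy_signals(run: RunSignals,
                          max_diff: float = 0.02,
                          min_duration_ratio: float = 0.25) -> List[str]:
    """Return the reasons a reported result may not be trustworthy."""
    reasons = []
    if run.snapshot_diff_ratio > max_diff:
        reasons.append("snapshot-mismatch")
    if run.steps_executed < run.steps_expected:
        reasons.append("skipped-steps")
    if (run.baseline_duration_s > 0
            and run.duration_s < min_duration_ratio * run.baseline_duration_s):
        reasons.append("suspiciously-fast")
    if any("ERROR" in line or "Exception" in line for line in run.log_lines):
        reasons.append("errors-in-log")
    return reasons
```

An empty list means none of the hypothetical rules fired; a learned model would replace these fixed thresholds with weights fitted to historical runs.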

The entire classification system has been packaged and deployed in the cloud, where it can be used as a REST service. The application has its own reinforcement learning, which uses the classification score to improve itself and is programmed to operate when the score falls in an inconclusive range.
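One way to read the "inconclusive range" is as a band of scores between two thresholds where the classifier's answer is not trusted and feedback is collected instead. The sketch below illustrates that reading together with a simple reward/penalty update; the function names, the 0.4 and 0.7 band limits, and the reward and penalty sizes are assumptions, and the update rule is a plain clamped increment rather than the reinforcement learning used in the actual system.

```python
def triage_decision(score: float, lower: float = 0.4, upper: float = 0.7) -> str:
    """Three-way decision on a classification score in [0, 1]."""
    if score >= upper:
        return "accept"
    if score <= lower:
        return "reject"
    return "inconclusive"


def apply_feedback(weight: float, correct: bool,
                   reward: float = 0.05, penalty: float = 0.10) -> float:
    """Nudge a per-category weight up on a correct call, down on a wrong one."""
    updated = weight + reward if correct else weight - penalty
    # Keep the weight on the same 0..1 scale as the scores.
    return min(1.0, max(0.0, updated))
```

Results that land in the inconclusive band would be the ones routed to a human, whose verdict then feeds `apply_feedback`.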

In sum, this deep learning solution can help Selenium testers classify their test results in no time and take the next steps automatically, so that the team can focus its efforts on new test failures.

Link: https://github.com/testleaf-software/reinforced-selenium-test-execution

 
 

Outline/Structure of the Talk

- Failures classification (Traditional Approach) and Pain points (5 Minutes)
- General Classification Types: Script, Data, Environment or application (2 Minutes)
- Deep Learning Model design for failure triage classification system (5 Minutes)
- Historical data set: features and challenges (3 Minutes)
- Predictive scoring and cutoff threshold (2 Minutes)
- Grouping, Attribute-based assertion (3 Minutes)
- Feedback adapting models (Rewards and Penalty) (3 Minutes)
- Package, deploy in the cloud and make it as REST service (2 Minutes)
- Our success and failures during these deep learning implementations (5 Minutes)
- Reinforcement learning for false positives and true negatives, inconclusive range. (5 Minutes)

- Demonstration. (10 Minutes)

Learning Outcome

Test automation engineers can learn to implement the deep learning model at their workplace. With this session, our public code repository, and an introduction to the algorithms, they can classify failures on their own data set, with their own configured classifications and sub-types, in a short span of time. In addition, attendees get:

  • An understanding of ways to apply Machine Learning to their test execution results
  • Classification of their test results based on categorical algorithms
  • Ways to improve their test automation cycle time

Target Audience

Every test automation engineer

Prerequisites for Attendees

A basic understanding of ML algorithms and of the pain points in classifying failures from test automation results.

Submitted 11 months ago

Public Feedback


    • Gopinath Jayakumar

      Gopinath Jayakumar / Babu Narayanan Manickam - Expanding boundaries of WebDriver with DevTools Integration

      45 Mins
      Demonstration
      Beginner

      Problem Statement

      Though Selenium comfortably holds most of the UI test automation tool market, Selenium test automation engineers have always faced challenges that handicap them, especially when dealing with modern JS technologies. For example,

      • dealing with DOM elements to solve stale / loading / non-interactable element problems,
      • capturing full screenshots to see the elements at the left, bottom, etc.,
      • measuring the performance of request and response resources at different speeds,
      • monitoring the memory of the pages, controls, etc.,
      • attaching to an existing browser for debugging failed scripts, and many more.

      These problems were largely resolved by integrating Selenium with the DevTools protocol, which makes the Selenium engineer's life merrier than before.

      Why this proposal can be different from others?

      1. Our solution can be executed independently with Chrome DevTools or together with Selenium. That gives the power back to the automation engineer to choose what and how to debug/run their tests.
      2. We used this solution for one of our largest enterprise customers and moved it to a public repository this week (for this conference and beyond). We have tested it reasonably well with more than 10,000 test scripts and more than 1M tests.
      3. The present solution covers the Chrome DevTools API completely (100%) in Java, so any Java Selenium automation engineer can bind it to an existing code base in minutes with no additional dependencies.
      4. Finally, we would love to present at our local home to start our Selenium conference campaign. Where else?

      Solution:

      The present proposal is largely concerned with Chrome and Selenium in Java. However, nothing prevents expanding it to other language bindings and browsers.

      Google Chrome is the most widely picked browser, which makes it the primary focus for developers and testers. DevTools is a boon for developers and testers, especially the new generation of test automation engineers. With that said, we built the following design pattern to allow the Chrome DevTools API to marry Selenium using debugger address / remote targets.

      [Figure: Selenium DevTools]

    • Abhijeet Vaikar

      Abhijeet Vaikar - End-end test code as a first class citizen

      45 Mins
      Case Study
      Intermediate

      "All tests in today's automated regression run have been marked as Untested. What happened?"

      "No notifications are being sent for test runs on the channel"

      "I pulled latest code, and the framework dependency shows compilation error"

      "What does this new method in the framework do?"

      How often do you hear such things within your team?

      As Quality champions, we need to walk the talk. When we expect our developers to write quality code, write unit tests, build features without introducing bugs, the onus lies on us (as test engineers) to do the same. With almost every test engineering team writing automated tests to check functionality of their products and services, it becomes very important to ensure that the test automation framework and the test scripts are bug-free and follow good standards of software engineering.

      It cannot be stressed enough that test automation code should be as good as production code. In order to build production-quality test automation framework and scripts, a number of steps can be taken at:

      1. Code & System Level

      2. Process & People Level

      Our test engineering team went through a transition from having random & unexpected failing test runs to having greater confidence in the quality of the tests. Learn from this case study of our journey to ensure that end-end UI automated tests are built with quality in mind. We will also see a demonstration of some of the use cases.

    • Rajdeep varma

      Rajdeep varma - The Joy Of Green Builds - Running Tests Smartly

      Rajdeep varma
      Automation Lead
      Bumble
      10 months ago
      45 Mins
      Talk
      Intermediate

      So you have got a few UI tests and they are running in parallel, great! However, life will not be so sweet once these 'a few' turn into 'a lot'. We grew from a few to 1500 UI tests (although we are not particularly proud of this number, there are situations and reasons).

      We started with a simple parallel distribution of tests 3 years ago. As the test count increased, the failure count and run time increased, along with the number of flaky tests. Mobile tests had their own challenges (e.g. devices dropping off, random Wi-Fi issues, etc.). To keep up with this, we created a queue-and-workers-based solution which could distribute the tests more efficiently (https://github.com/badoo/parallel_cucumber). Over time, we made more improvements, in particular:

      • Segregating failures caused by infrastructure issues and re-queuing the tests
      • If a device/emulator malfunctions, rescuing the tests to another device
      • Repeating a single test on 100s of workers in parallel to detect flakiness
      • Repeating a test if it hits a known network issue
      • Terminating the build early if more than a certain number of tests have failed
      • Health check of each device before each test to ensure reliability
      • Muting a test if the failure is known, and highlighting outdated mutes if the related task is fixed
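The first improvement in the list, re-queuing tests that failed for infrastructure reasons, can be sketched with a plain queue. This is not the parallel_cucumber API; `run_with_requeue`, the `INFRA_ERRORS` keywords, and the `execute` callback signature are assumptions for illustration, and the loop is sequential rather than distributed across workers.

```python
from collections import deque

# Hypothetical substrings that mark a failure as infrastructure-related.
INFRA_ERRORS = ("device dropped", "wifi", "network")


def run_with_requeue(tests, execute, max_attempts=3):
    """Run tests from a queue, re-queuing those that fail on infrastructure.

    `execute(name, attempt)` must return a `(passed, error_message)` tuple.
    """
    queue = deque((name, 1) for name in tests)
    results = {}
    while queue:
        name, attempt = queue.popleft()
        passed, error = execute(name, attempt)
        if passed:
            results[name] = "passed"
        elif (attempt < max_attempts
              and any(key in error.lower() for key in INFRA_ERRORS)):
            # Infrastructure failure: put the test back for another attempt.
            queue.append((name, attempt + 1))
        else:
            results[name] = "failed"
    return results
```

Genuine assertion failures are reported immediately, while infrastructure failures get up to `max_attempts` tries before being counted as failed.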

      In this talk, I will talk about the initial challenges with running UI tests in parallel (Selenium and Appium), how we approached the queue based solution and continuous improvement of this solution; finally, how attendees can use it at their workplace or create their own solution based on our learnings.

    • Gaurav Singh

      Gaurav Singh - How to build an automation framework with Selenium : Patterns and practices

      Gaurav Singh
      Test automation Lead
      Gojek
      1 year ago
      45 Mins
      Talk
      Beginner

      With an ever-increasing number of businesses being conducted on the web, the need to write automated tests for an app's UI can never be ignored. As you all know, Selenium provides an API that enables us to do this quite effectively.

      However, when tasked with setting up the automation framework, there are a lot of questions that arise in the minds of aspiring test developers regardless of what level they are in their career.


      Some of these questions are:

      1. How does one actually go about the business of building a robust and effective automation framework on top of Selenium?
      2. What are the elementary building blocks to include in the framework that an aspiring automation developer should know of?
      3. How should we model our tests? XUnit style vs BDD?
      4. Are there good practices, sensible design patterns and abstractions that we can follow in our code?
      5. What are some of the anti-patterns/common mistakes we should avoid?

      A lot of literature, documentation and blogs exist on these topics on the web already.

      However, in this talk, I will combine this existing knowledge with my years of experience in building automation frameworks, break down these elements, and walk you through exactly the sort of decisions, considerations and practices that you can adopt while starting to implement or improve the UI automation for your team.

      Hope to see you there!

    • Syam Sasi

      Syam Sasi - When Ansible meets Selenium Grid - Story of building a stable local iOS simulator farm

      Syam Sasi
      Senior Software Engineer
      Carousell
      11 months ago
      45 Mins
      Case Study
      Beginner

      There are dozens of mature Docker-based solutions available in the market for Android automation testing,

      but

      what about iOS testing?

      It’s hard to create and maintain a Docker-based solution for iOS testing, since it requires Xcode and, for optimal performance, the system needs Apple-certified hardware.

      Combining Ansible with Selenium Grid yields a powerful combination because it allows us to set up our grid and nodes in just a few seconds. In this talk I will demonstrate how to use simulators to build a reliable and scalable in-house iOS simulator lab using Ansible, Selenium Grid and Appium.

    • Rabimba Karanjai

      Rabimba Karanjai - Testing Web Mixed Reality Applications: What you need to know for VR and AR

      Rabimba Karanjai
      Researcher
      Mozilla
      10 months ago
      45 Mins
      Talk
      Intermediate

      By 2018 there were already over 200 million users consuming VR applications. With Google and Mozilla pushing WebXR capabilities in the browser, and organizations like BBC, Amnesty International, Universal, Disney, Lenskart and many others adopting them on their websites, we will soon see a huge rise in demand for Web VR and Mixed Reality applications.

      But how do you test them at scale? How do you define "smooth" as opposed to just responsive?

      In this talk I will go over some key details of the WebXR specification, the work that Mozilla, Google and the W3C Immersive Web Group are doing, the differences between testing a regular web page and a Mixed Reality enabled one, what to watch for, and how you can automate it.

    • Shi Ling Tai

      Shi Ling Tai - Start with the scariest feature - how to prioritise what to test

      Shi Ling Tai
      CEO
      UI-licious
      11 months ago
      20 Mins
      Talk
      Beginner

      It can be intimidating for inexperienced teams embarking on their test automation journey for an existing code base. There is so much to test, and so many ways to test. I often see teams stuck debating where to start, what tools to use, and which best practices to follow:

      "We should start from unit tests"

      "No, integration tests are better!"

      "Should we use tool A or tool B?"

      I see this play out all the time, and I've been there before. And the worst that could happen is decision paralysis and inaction.

      The bigger question really is "What to test?".

      My rule of thumb is "Start with the scariest code". I'll share with you my framework for evaluating the ROI of writing a test for a feature and prioritising what to test.

    • Sanjay Kumar
      Sanjay Kumar
      ChroPath Creator
      AutonomIQ
      1 year ago
      45 Mins
      Case Study
      Intermediate
      • Want to save 70-80% of the manual effort of writing automation scripts?
      • Wasting time verifying XPaths one by one?
      • Want to complete automation scripts without wasting much time?
      • Are you still wasting time writing manual test cases in English?
    • Naveen Khunteta

      Naveen Khunteta - Best Practices to implement the test automation framework starting from Design - To -> Infrastructure - To -> Execution.

      45 Mins
      Talk
      Intermediate

      Best Practices - How to get the best 'Return ON Investment' (ROI) from your Test Automation.

      It has been observed that most test frameworks do not survive, due to lack of expertise, no maintenance, and no best practices being followed. The test automation is dead after a few months, and there is no "Return ON Investment" from it. This is the most common problem: most companies struggle and finally go back to square one with manual testing.

      My proposal: HOW to leverage your test automation in terms of best practices and best ROI, and how to adopt the best automation culture in your organisation.
      I strongly propose some important points/suggestions to achieve this in your Organisation/Team.
      1. Test Automation Practices:
      • Design Patterns (Web/Mobile/API)
      • What to Automate/Not to Automate
      2. Common Automation Frameworks at Org Level:
      • How to design Generic Utilities, Libraries and different Components, which can be suitable for all the teams in the same Org.
      • Best practices to design your Tests (Automation).
        • Common Design Patterns
        • Common application level and Page libraries
        • Best Practices to use Assertions in your Tests (How and What to write for assertions). Most people don't write proper assertions, which makes tests unreliable and leaves defects undetected during execution.
      3. Inclusion of API/Backend libraries in your UI test automation as external Maven/Gradle dependencies to avoid unnecessary tasks. Some important points to consider here:
      • User Creation from APIs (No need to automate user creation from web/app for all the test cases)
      • API tests are stable most of the time
      • API calls take less time than web interactions, so include API calls in your UI/App framework to save time.
      • Fewer flaky tests

      4. Best Code Review Process (Do not merge your code into Master without proper Code Review)

      • Implement PR (PULL Request) Process
      • Static Code Analysis using SonarQube, Cobertura, JaCoCo, etc.
      • Get the benefits of the Best Test Automation Quality Metrics
      • Sometimes a manual (functional) tester should review your code (assertions, test steps and use cases) to get the best coverage
      5. Quality is A Team responsibility:
      • Developers, POs, Manual QEs and Automation engineers should be included to get an overview of test automation coverage.
      6. Maintenance of the Frameworks
      • After a couple of months, the framework will make your life miserable if you don't maintain your libraries and framework properly.
      • Do not use Hard Coded values, make it simple and Generic.
      7. Infrastructure Setup for Test Design and Test Execution:
      • Proper Browser - OS lab setup
      • Proper Mobile Labs setup with different Devices - IOT, iOS, Android, iPad, Tablets
      • Proper CI - CD common configuration using Jenkins, Dev Ops, AWS, Docker and Cloud setup
      • Handling multiple Docker nodes using Kubernetes (use of Selenoid, GRID on Cloud)
    • Praveen Umanath

      Praveen Umanath - State-of-the-art test setups: How do the best of the best test?

      20 Mins
      Talk
      Intermediate

      The best engineering teams are releasing code hundreds of times in a day. This is supported by a test setup that is not just fast, but robust and accurate at the same time.

      We look at data (anonymized) from millions of tests running on BrowserStack, to figure out the very best test setups. We also analyze the testing behavior of these companies—how do they test, how frequently do they test, how many device-browser-OS combinations do they test on. Do they gain speed by running more parallels or leaner test setups?

      Finally, we see how these steps help these teams to test faster and release continuously, and how it ties-in to the larger engineering strategy.

    • Lavanya Mohan

      Lavanya Mohan / Priyank Shah - Analytics - Insights from unsaid customer feedback

      45 Mins
      Talk
      Beginner

      Are we investing our efforts in building things that actually matter? Is the new feature that we rolled out adding value to the customer? Is the new release doing better than the previous releases? How do we get answers to these questions and more? Analytics is our answer!

      Analytics information helps not just the business teams but also QAs, Devs, PMs and other members of the project in multiple different ways. It could help us uncover some critical issues, it could help us understand customer sentiments better, it could help us get a broader picture of how the customer actually uses the product and whether it was how we intended it to be, it can help us get ideas about what small or large changes the customers are looking for without them having to explicitly tell us.

      Analytics is important information to us. So, it is also critical that the information is correct. That means analytics information produced also needs to be tested and validated.

      This talk is intended to understand the testing of analytics events and why they are important to us.

      In this talk, we will cover our experience of how analytics information helped us understand our customers better and invest our time in building the right things. We will also cover how we validated it to ensure that the data that we were seeing was actually correct. In addition to this, we will also briefly cover some details about other sources of information that can be looked at if we are working in a mobile world.

      Please note: We’re open to tuning the proposal based on feedback

    • Smita Mishra

      Smita Mishra - Vision Boards - Project your goals

      Smita Mishra
      CEO
      QAZone Infosystems
      10 months ago
      20 Mins
      Talk
      Intermediate

      How do teams share their understanding of common goals? It is either audio or visual. Recording each talk and storing it (tagged) is not the most effective way to share common knowledge. Sketching is not new to agile teams. We are taking it a step forward in the form of Vision Boards. A Vision Board is a creative visualization of your goals. While our focus in this talk remains on how teams could use the board, individuals also use these to turn their life goals into reality. There are pictures or sketches of what they want, all pasted together on one board, so they constantly remind themselves of their ultimate goals in the bigger scheme of things. These goals may not be achievable with one task. They may need a series of tasks which do not seem directly connected with the goal. But these captured visualizations are very good indicators of what success means to one.

      We used Vision Boards to visualize our customer experience, their reactions and expected patterns of use for our application. This board single-handedly kept all our teams aligned, and as many changes happened, the teams knew their true north when they were discussing how to design the screens and which features to build on (priority). Our already agile teams were constantly looking at the short-term goals of prioritized features, but the vision board helped them reduce chaos and clutter and saved a lot of time on understanding the overall requirement. It also served as the basis for User Stories.

    • Krishnan Mahadevan

      Krishnan Mahadevan - My experiments with Grid

      45 Mins
      Tutorial
      Intermediate

      Everyone starts off with a simple grid setup which involves a hub and one or more nodes.

      This traditional setup is a good start, but the moment one starts to get serious with the Selenium grid and decides to host their own Selenium grid for local executions, that is when issues start.

      My experiences with the Selenium grid over the past couple of years have introduced me to some of the most prevalent problems with maintaining an in-house Selenium grid.

      • Nodes get unhooked randomly due to network glitches.
      • Nodes introduce false failures due to memory leaks.
      • Selenium Grid running out of capacity.
      • Nodes require OS upgrades/patches etc.
      • Needing to deal with auto upgrades by browsers (especially chrome and firefox)

      Some of these issues I managed to fix by building a "Self Healing" Grid wherein the nodes automatically get restarted after they have serviced "n" tests. But that still didn’t solve many of these other problems.
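The "restart after n tests" rule can be sketched as a small counter. This is not Selenium Grid code; `NodeRecycler` and its method are hypothetical names, and the sketch only decides when a restart is due, leaving the restart itself to whatever manages the nodes.

```python
from collections import defaultdict


class NodeRecycler:
    """Track tests served per node and signal when a node should restart."""

    def __init__(self, max_tests_per_node: int):
        if max_tests_per_node < 1:
            raise ValueError("max_tests_per_node must be at least 1")
        self.max_tests = max_tests_per_node
        self.served = defaultdict(int)

    def record_test(self, node_id: str) -> bool:
        """Record one finished test; return True if the node is due a restart."""
        self.served[node_id] += 1
        if self.served[node_id] >= self.max_tests:
            # Reset the counter on the assumption that the caller restarts the node.
            self.served[node_id] = 0
            return True
        return False
```

Recycling on a fixed count addresses gradual problems such as memory leaks, but, as the text notes, it does nothing for capacity, patching, or browser upgrades.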

      That was when I wondered: what if there was an on-demand Selenium grid?

      What if the Grid could do the following?

      • The Grid auto scales itself in terms of the nodes based on the current load.
      • The Grid does not require a lot of infrastructure to support it.
      • The Grid can plug itself into some of the cloud providers or leverage a solution such as Docker so that the nodes can be spun and shutdown at will.

      That was how the idea of "Just Ask", an on-demand grid, was born.

      Just-Ask is an on-demand grid. It has no nodes attached to it.

      It’s designed to spin up nodes on demand, run the test against the newly spun-up node and, after the test runs to completion, clean up the node as well. The node can be backed by anything. It could be Docker, or it could be a VM running on any of the popular clouds.

      The session aspires to walk the audience through my experiments with the Selenium grid, my learnings on the Selenium grid internals, and how I used all of that knowledge to build my own on-demand Selenium grid. What better avenue to share these learnings than a Selenium Conference?

      The session will introduce the audience to the grid internals and their concepts such as

      • What is a Selenium Remote Proxy? What is it used for? What can you do with it?
      • What is a Hub (or) Node level Servlet? When would you need one?
      • All of this followed by a quick demo on "Just Ask", the on-demand grid that I have built and open sourced here: https://github.com/rationaleEmotions/just-ask

    • Martin Schneider

      Martin Schneider / Prabhagharan D K - Building and scaling a virtual Android and iOS device lab

      45 Mins
      Case Study
      Intermediate

      Virtual mobile devices (emulators/simulators) are a cost-effective and straightforward alternative to testing on physical devices. We showcase how to set-up and scale an Android emulator farm using Appium, Docker and SQS and how it fits into our larger testing and quality strategy.
      Maintaining physical test devices for mobile automation can be expensive and time-consuming. On top of the initial investment, you need to consider maintenance cost, replacement devices and efforts for manual scaling. On the other side of the spectrum, cloud providers take care of these restrictions, but their services can come at a hefty price tag, especially when your use-case requires a large number of devices. We present a middle path and demonstrate how to use virtual devices to build a reliable and scalable in-house device lab using Docker and Appium.

    • Gopinath Jayakumar

      Gopinath Jayakumar / Babu Narayanan Manickam - Selenium Internals in Python Programming - Workshop

      90 Mins
      Workshop
      Beginner

      Are you keen to understand Selenium 4.x (experimental) more deeply, including its internal architecture and how it works with its drivers and its API in Python? Are you using Selenium with other language bindings and willing to cross-skill in Python for deep learning/machine learning/data science projects?

      If the answer to either question is yes, this workshop will get your hands dirty with practical exposure to several WebDriver commands in modern applications. The course is designed to assist test automation engineers of all experience levels.

      In sum, all attendees of this workshop will leave with a working Selenium codebase in Python in a git repo, and best practices for using the pytest framework on their projects.

    • Koushik Chatterjee

      Koushik Chatterjee - Let Your Precious Time Be Optimized While Automating

      Koushik Chatterjee
      QA Software Engineer
      Testleaf
      11 months ago
      20 Mins
      Case Study
      Beginner

      Automation is now widely preferred, particularly in agile environments, as it saves a lot of time. But wouldn’t it be a privilege to save even more time when locating web elements? Writing automation scripts reduces the time spent on regression testing, but automatically generating the XPath, and even an entire code statement, would add even more to automation scripting.

      My personal opinion is that automation isn’t pure automation if most of the work is done manually, with a hectic routine and a high chance of erroneous XPaths. The task itself kills tonnes of time: right-clicking, inspecting the element and writing the XPath in the DOM takes a lot of time.

      For an automation tester, finding locators and XPaths is vital for test automation and among the most frequent tasks. But doing this repeatedly for an entire day isn’t interesting. Ruto does this sleekly, not in seconds but in milliseconds. Opening the DOM and typing the XPath is time-consuming, but writing the entire statement with a few clicks saves a lot of time.

      Ruto is a free extension used to find XPaths for test automation. The tool makes finding elements in the DOM easy and provides code snippets based on the action that needs to be performed on the element. To store the found element, Ruto automatically suggests a variable name from the user’s perspective along with its return type. The user is informed if the element is located inside a frame/iframe. Ruto also provides a whole set of elements from a web page as snippets. Apart from this, we’ve added one additional (experimental) feature to support the Page Object Model.


      XPath supported by Ruto:

      • Supports dynamic XPath
      • Axes-based XPath can be achieved with Ruto by adding the web elements as source and target
      • Ruto alone can be used to handle web tables
      • Provides multiple XPaths at a time for an element
    • Khanh Do

      Khanh Do - Leveraging Artificial Intelligence to create self-healing tests

      Khanh Do
      QA Architect
      Kobiton
      1 year ago
      45 Mins
      Tutorial
      Intermediate

      A key requirement for successful test automation is to get past the brittle or fragile nature of test scripts. Any Selenium (or Appium) developer has encountered the dreaded "NoSuchElement Exception". A locator that worked yesterday may fail today. What's a test engineer to do?

      Fortunately the field of AI provides promising solutions and allows for the creation of self-healing tests. Tests that can find elements across all environments. Tests that can learn from "human-in-the-loop" intervention and work perfectly thereafter. Imagine automated tests that "just work"!

      This session will look at how to apply the latest in AI and Machine Learning technologies to improve your test scripts. With the plethora of new open source AI libraries made available by companies such as Google, the ability to leverage AI in your applications is more accessible than ever.

      This session will be a primer on AI technologies and how they can be utilized for perfect test automation.

    • Babu Narayanan Manickam

      Babu Narayanan Manickam / Sarath Muthuswamy - Reinforced Selenium test selection using Recommender Systems

      20 Mins
      Case Study
      Beginner

      Problem Statement

      One of our enterprise Oracle customers has more than 30,000 UI & API regression tests for end-to-end Apps integration testing. The average time it takes to run these tests (depending on parallelization) is between 11 and 12 hours.

      Given the considerable time needed to complete all regression tests, the objective was to find a working AI model that identifies the test scripts likely to fail and prioritizes them early, along with the most critical business automated tests.

      Solution:

      Selecting the most promising Selenium test scripts to detect application defects is hard, given the uncertainty about the impact of committed code changes and the breakage of many traceability links between code and automated tests. The design was to automatically choose the test scripts and prioritize them in the CI tool, with the goal of minimizing round-trip times between code commits and developer feedback on failed test scripts.

      In our DevOps environment, where new test scripts are created and obsolete tests are deleted constantly, the reinforced method learns to prioritize error-prone test scripts higher, guided by a reward function and by observing previous test cycles from historical data. Applying reinforcement learning techniques to the extracted data shows that reinforcement learning enables better automatic, adaptive test script selection and prioritization in CI and automated regression testing.
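To make the idea of reward-guided prioritization concrete, the sketch below uses an exponentially weighted update, a much simpler stand-in for the reinforcement learning method described above. A test that failed in the last cycle gets a reward of 1 and drifts up the order; one that passed gets 0 and drifts down. The function names, the learning rate, and the default priority for new tests are assumptions.

```python
def update_priorities(priorities, cycle_results, learning_rate=0.3, default=0.5):
    """Return new priorities after one test cycle.

    `cycle_results` maps test name -> True if the test failed in that cycle.
    Tests not seen before start from `default`.
    """
    updated = dict(priorities)
    for test, failed in cycle_results.items():
        reward = 1.0 if failed else 0.0
        old = updated.get(test, default)
        # Move the priority a fraction of the way toward the observed reward.
        updated[test] = old + learning_rate * (reward - old)
    return updated


def execution_order(priorities):
    """Highest priority first; ties broken by test name for stability."""
    return sorted(priorities, key=lambda test: (-priorities[test], test))
```

Newly created tests enter at the default priority, and deleted tests can simply be dropped from the dictionary before the next cycle.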

      As the first step, we created a predictive model that estimates the probability of each test failing for a newly proposed build. Instead of defining the model manually, we built it by using a large data set containing results of tests on historical builds/releases and then applying a standard machine learning technique, the gradient-boosted decision tree.

      Features:

      • Code changes based on build metadata
      • Code Owner Information
      • Historical test runs

      With this model, we can analyze a particular code change to find all potentially impacted tests that transitively depend on modified files, and then estimate the probability of that test detecting a regression introduced by the change. Based on those estimates, the system selects the tests that are most likely to fail for a particular change. The diagram below shows which tests (shown in blue) would be chosen for a change affecting the two files from the previous example, where the likelihood of each considered is represented by a number between zero and one.
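The selection step can be sketched independently of the model that produces the estimates. In the example below, `failure_probability` stands in for the model output and `impacted` for the tests that transitively depend on the changed files; both inputs, the `select_tests` name, and the fixed `budget` are assumptions for illustration.

```python
def select_tests(failure_probability, impacted, budget):
    """Pick up to `budget` impacted tests with the highest failure estimates.

    Tests with no estimate are treated as probability 0.0; ties are broken
    by test name so the result is deterministic.
    """
    candidates = [(test, failure_probability.get(test, 0.0)) for test in impacted]
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [test for test, _ in candidates[:budget]]
```

Treating unseen tests as 0.0 is a deliberately naive choice; a real system would more likely always run new tests until it has history for them.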

      In sum, our recommender system with reinforcement learning algorithms helps identify the priority test scripts at runtime; the model moves test scripts up or down from the sequence set at the beginning.

      Our github repo: https://github.com/testleaf-software/reinforced-selenium-test-execution

      Process Diagram: https://github.com/testleaf-software/reinforced-selenium-test-execution/blob/master/re-inforcement%20process.png