Big Data - Verify Data Completeness and Data Correctness using Selenium/JAVA
Java programs written on Hadoop MapReduce process large amounts of data, and testing is needed to confirm that these applications work correctly. The testing process includes manually verifying the business logic on each node for MapReduce process accuracy, data aggregation/segregation rules, and generation of key-value pairs. The output data files are also verified for transformation rules, successful load, data integrity, and data accuracy.
Because of the enormous amount of data and the many business rules, manual testing is time-consuming and validations can be missed. Automating the testing process with Selenium and Java adapters ensures that the data complies with all business/transformation rules and that data integrity is checked.
Outline/Structure of the Talk
Hadoop is used to gather data from different data sources, and programmers apply business logic to extract the required data. For example, suppose the data gathered by Hadoop is 100 PB (petabytes), a mix of good and bad data, and the business is interested only in the good data that carries the required information; say this is 50 TB out of the 100 PB. Hadoop programmers write a few MapReduce programs in Java, run them across multiple nodes, and gather around 50 TB. To validate this 50 TB, the traditional manual testing process takes random sample data files and validates them against the business logic. This does not guarantee 100% data completeness and data correctness, and manually validating data integrity and data accuracy takes months. Manual testing also cannot be done on unstructured data made up of different file types, for example audio, video, mobile calls, call center data, and images. Using automation, we can read the headers of these files and split them into structured files.
The test automation approach starts by splitting the 50 TB of data into smaller chunks and developing test Java programs known as Java adapters. The required business logic is implemented in these Java adapters, which are run against each individual data chunk to generate output data. The content and size of each output file are then verified against the data generated by the Hadoop MapReduce programs. This verifies data completeness and data correctness, as briefly explained below.
Data Correctness - Validating actual data (data by MapReduce) vs. expected data (data by test Java adapters).
Data Completeness - 100% data validation.
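The two checks above can be sketched in Java as a comparison of key-value records for one data chunk. This is a minimal illustration, not the adapters from the talk: the class `ChunkValidator`, its `Result` type, and the record keys are hypothetical, and it assumes both outputs for a chunk fit in memory as maps. Here "complete" is taken to mean that no expected record is missing, and "correct" that no record differs or is unexpected.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: all names are illustrative, not taken from the talk.
class ChunkValidator {

    /** Outcome of comparing expected vs. actual key-value records for one chunk. */
    static final class Result {
        final List<String> missingKeys = new ArrayList<>();     // expected, but absent from actual
        final List<String> unexpectedKeys = new ArrayList<>();  // in actual, but never expected
        final List<String> mismatchedKeys = new ArrayList<>();  // in both, but values differ

        // Assumption: completeness means every expected record is present.
        boolean isComplete() { return missingKeys.isEmpty(); }

        // Assumption: correctness means no wrong values and no extra records.
        boolean isCorrect() { return mismatchedKeys.isEmpty() && unexpectedKeys.isEmpty(); }
    }

    /** Compares adapter output (expected) with MapReduce output (actual). */
    static Result compare(Map<String, String> expected, Map<String, String> actual) {
        Result result = new Result();
        for (Map.Entry<String, String> entry : expected.entrySet()) {
            String key = entry.getKey();
            if (!actual.containsKey(key)) {
                result.missingKeys.add(key);
            } else if (!Objects.equals(entry.getValue(), actual.get(key))) {
                result.mismatchedKeys.add(key);
            }
        }
        for (String key : actual.keySet()) {
            if (!expected.containsKey(key)) {
                result.unexpectedKeys.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("cust-1", "42");
        expected.put("cust-2", "17");

        Map<String, String> actual = new LinkedHashMap<>();
        actual.put("cust-1", "42");
        actual.put("cust-2", "71");  // wrong value
        actual.put("cust-3", "5");   // record the adapter did not produce

        Result r = compare(expected, actual);
        System.out.println("complete=" + r.isComplete() + " correct=" + r.isCorrect());
        System.out.println("mismatched=" + r.mismatchedKeys + " unexpected=" + r.unexpectedKeys);
    }
}
```

For chunks too large to hold in memory, the same idea could be applied to record counts and per-file checksums instead of full maps.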
For output data validation, tools like Presto can be used. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes, ranging from gigabytes to petabytes.
With a test automation framework, large amounts of data can be verified in very little time, and data integrity and data accuracy are verified with 100% coverage.
Submitted 3 years ago
People who liked this proposal also liked:
rajesh sarangapani / Prabhu Epuri - Visualizing Real User Experience Using Integrated Open Source Stack (Selenium + JMeter + Appium + Visualization tools)
rajesh sarangapani, Associate Vice President - Technology, Gallop
Prabhu Epuri
- Provides page load times similar to the onload time of real browsers
- Generates a HAR file with the following statistics:
  - Summary of request times and content types
  - Waterfall chart with a page download time breakdown: DNS resolution time, connection time, SSL handshake time, request send time, wait time, and receive time
Integrating these open source tools lets us provide the same insights that commercial off-the-shelf tools would offer. At Gallop we have implemented this for multiple clients, giving them insight into various client-side bottlenecks, which helped us deliver a stronger value proposition.
Trinath Babu - Visual Test Automation using Selenium
Trinath Babu, Sr. Manager, Gallop Solutions
Visual Test Automation using Selenium
Visual testing is the method of verifying that the application’s GUI appears correctly to its users. Many people say visual testing is hard to automate. Given the number of variables (web browsers, operating systems, screen resolutions, responsive design, internationalization, etc.), the nature of visual testing can be complex, and verification with traditional automated functional testing tools can be very challenging. But with existing open source and commercial solutions, this complexity is manageable, making visual testing easier to automate than it once was.
This can be achieved by integrating Selenium with Applitools. This talk focuses on verifying the application’s graphical user interface (GUI) and finding visual bugs using Applitools. It is especially helpful for sites with graphical features such as charts, graphs, and dashboards, and for verifying that the GUI appears correctly across all devices and browsers. The payoff is well worth the effort.
Take pressure off manual QA: increase coverage and test faster and more accurately. Reduce maintenance effort: automatically propagate changes across execution environments. Release faster, flawlessly, and with confidence.
Applitools Eyes Express captures the screen you want to test, and compares it to a baseline image – instantly, with a single click. No extra testing code necessary, no boring error logs.
For example, a single automated visual test will look at a page and assert that every element on it has rendered correctly, effectively checking hundreds of things and telling you if any of them are out of place. This will occur every time the test is run, and it can be scaled to each browser, operating system, and screen resolution you care about.
Put another way, one automated visual test is worth hundreds of assertions. And if done in the service of an iterative development workflow, then you’re one giant leap closer.
Each of these tools follows some variation of the following workflow:
- Drive the application under test (AUT) and take a screenshot
- Compare the screenshot with an initial “baseline” image
- Report the differences
- Update the baseline as needed
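The compare-and-report steps of this workflow can be illustrated with a naive pixel-by-pixel comparison in plain Java. This is only a sketch of the general idea: it is not how Applitools or any particular tool compares images, and the class `BaselineComparer`, its methods, and the tolerance parameter are hypothetical. In a real test the "current" image would come from a screenshot of the application under test.

```java
import java.awt.image.BufferedImage;

// Hypothetical sketch of a naive baseline comparison; real visual testing
// tools use more tolerant, perception-aware comparison than this.
class BaselineComparer {

    /** Returns the number of differing pixels, or -1 if the image sizes differ. */
    static long countDifferentPixels(BufferedImage baseline, BufferedImage current) {
        if (baseline.getWidth() != current.getWidth()
                || baseline.getHeight() != current.getHeight()) {
            return -1;
        }
        long different = 0;
        for (int y = 0; y < baseline.getHeight(); y++) {
            for (int x = 0; x < baseline.getWidth(); x++) {
                if (baseline.getRGB(x, y) != current.getRGB(x, y)) {
                    different++;
                }
            }
        }
        return different;
    }

    /** True when the share of differing pixels is within maxDiffRatio (0.0 to 1.0). */
    static boolean matchesBaseline(BufferedImage baseline, BufferedImage current,
                                   double maxDiffRatio) {
        long different = countDifferentPixels(baseline, current);
        if (different < 0) {
            return false;  // a size change is always reported as a difference
        }
        long total = (long) baseline.getWidth() * baseline.getHeight();
        return (double) different / total <= maxDiffRatio;
    }

    public static void main(String[] args) {
        BufferedImage baseline = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage current = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        current.setRGB(1, 1, 0xFF0000);  // simulate one visually changed pixel

        System.out.println("different pixels: " + countDifferentPixels(baseline, current));
        System.out.println("matches exactly: " + matchesBaseline(baseline, current, 0.0));
    }
}
```

Updating the baseline then amounts to replacing the stored baseline image with the current screenshot once a reviewer accepts the change.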
Dilip S - TestComplete supports Selenium – Other Commercial tools will follow soon?
Dilip S, Associate Architect, Gallop
Oliver Wendell Holmes once said: I would not give a fig for the simplicity this side of complexity, but I would give my life for the simplicity on the other side of complexity.
Tool evangelists around the world have been using this phrase for selling their products. They make it a point to look into customer’s eyes and say “Scalability, you know, that’s the main problem with Selenium. What about 3 years down the line, when you have multiple applications in your landscape and Selenium does not support it?”
But they simply ignore the fact, knowingly of course, that this fast-moving technology world is heading to where Selenium already is. Applications are shrinking into browser windows and changing track to align themselves with the mobile era.
Even commercial tools like TestComplete from SmartBear have started supporting Selenium, and many will follow soon. The reason for this change is not only that most organizations prefer open source tools like Selenium as the starting point for their automation activities, but also that Selenium has proven itself to be one of the best automation tools for mobile or browser-based desktop automation.
Our aim here is to show how seamlessly Selenium integrates with TestComplete and QAComplete, and to make the point that it is not Selenium that needs other tools to extend it, but the other way round.
Vishal Aggarwal - Selenium Next Generation Framework
Vishal Aggarwal, Test Architect, Gallop Solutions
This talk focuses on the technical side of automated acceptance tests for web applications. There are a lot of high-level frameworks that allow definition of acceptance tests in natural language (Robot, JBehave, Cucumber etc). But when it comes to the technical implementation of the test cases, you are often forced to use the rather low-level WebDriver API directly.
Geb addresses exactly this problem. It is an abstraction of the WebDriver API and combines the expressive and concise Groovy language with a jQuery-like selection and traversal API. This makes the test implementation easier and the code more readable. On top of that, we get support for the page object pattern, asynchronous content lookup, and very good integration with existing test frameworks, which makes it the next generation of automation framework.
vishnu nallani chekravarthula - Extending Selenium Element Locator Strategies – Element Filtering
vishnu nallani chekravarthula, Test Automation Architect, Gallop Solutions Inc
Element locator strategies for Selenium WebDriver are highly flexible and have since been adopted by many commercial tools. They are also limited, however, in the sense that Selenium WebDriver does not currently allow its users to identify or filter UI elements with multiple locator strategies at a time, as many commercial tools do.
The solution discussed in this article is a library that allows Selenium WebDriver users to extend the Selenium element locator strategies with element filtering, along with a few use cases for the library.
The solution approach allows users to continue to use the existing UI Element definitions in their tests, and extend them, using the By reference. The library will replace the existing Selenium WebDriver “By” reference.
Filtering based on multiple locator strategies
In various scenarios, a complex XPath has to be written to uniquely identify a UI element. However, the element can often be identified uniquely by combining multiple locator strategies. The UI elements can also be filtered when there are multiple matches in a page. This is the UI element recognition mechanism used in many commercial test automation tools.
The algorithm for filtering UI Elements based on multiple locator strategies is based on priority of locator strategies. The priority of locator strategies when filtering is:
- LinkText and PartialLinkText
The By.elementFilter method takes multiple locator strategies. It searches the page for elements matching one locator strategy/property and checks whether the match is unique on the page; if not, it uses the next locator strategy passed to it, and so on.
This method is also very helpful when the application changes constantly but some of a UI element’s locators (XPath, ID, name, tag name, class name, etc.) remain unchanged. It thereby reduces much of the maintenance effort that UI element changes cause in Selenium WebDriver implementations.
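The fallback behaviour described for By.elementFilter can be sketched without Selenium by modelling a page as a list of attribute maps. This is not the library's implementation or API: `ElementFilterSketch`, `Locator`, `findAll`, and `firstUniqueMatch` are hypothetical names, and a locator is simplified to an attribute name plus a required value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: an in-memory stand-in for "try each locator strategy
// in order until one matches exactly one element".
class ElementFilterSketch {

    /** One simplified locator strategy: an attribute name and the value it must have. */
    static final class Locator {
        final String attribute;
        final String value;

        Locator(String attribute, String value) {
            this.attribute = attribute;
            this.value = value;
        }
    }

    /** Returns every element on the page that satisfies the locator. */
    static List<Map<String, String>> findAll(List<Map<String, String>> page, Locator locator) {
        List<Map<String, String>> matches = new ArrayList<>();
        for (Map<String, String> element : page) {
            if (locator.value.equals(element.get(locator.attribute))) {
                matches.add(element);
            }
        }
        return matches;
    }

    /** Tries the locators in the given order and returns the first unique match. */
    static Optional<Map<String, String>> firstUniqueMatch(
            List<Map<String, String>> page, List<Locator> locators) {
        for (Locator locator : locators) {
            List<Map<String, String>> matches = findAll(page, locator);
            if (matches.size() == 1) {
                return Optional.of(matches.get(0));
            }
        }
        return Optional.empty();
    }

    /** Builds an element from alternating attribute names and values. */
    static Map<String, String> element(String... keyValuePairs) {
        Map<String, String> element = new HashMap<>();
        for (int i = 0; i + 1 < keyValuePairs.length; i += 2) {
            element.put(keyValuePairs[i], keyValuePairs[i + 1]);
        }
        return element;
    }

    public static void main(String[] args) {
        List<Map<String, String>> page = new ArrayList<>();
        page.add(element("class", "btn", "name", "save"));
        page.add(element("class", "btn", "name", "cancel"));

        List<Locator> locators = new ArrayList<>();
        locators.add(new Locator("class", "btn"));  // matches two elements, so it is skipped
        locators.add(new Locator("name", "save"));  // unique, so this one decides

        String found = firstUniqueMatch(page, locators)
                .map(e -> e.get("name"))
                .orElse("no unique match");
        System.out.println(found);  // prints "save"
    }
}
```

The order of the locator list plays the role of the priority mentioned above: an earlier strategy wins whenever it is unique.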
Filtering based on Index
When there are multiple similar UI elements in a page, such as cells in a grid/table, it makes sense to identify objects by their index, that is, their order of appearance on the web page.
The By.indexFilter method allows users to define a UI element by its index of occurrence. The index starts from 1.
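Because Java lists are 0-based while the index described here starts from 1, the conversion is an easy place for off-by-one errors. The sketch below shows one way to handle it; `IndexFilterSketch` and `nth` are hypothetical names, not part of the library, and the list stands in for the elements matched on a page in order of appearance.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of 1-based index selection over a list of matches.
class IndexFilterSketch {

    /** Returns the match at the given 1-based index, rejecting out-of-range values. */
    static <T> T nth(List<T> matches, int oneBasedIndex) {
        if (oneBasedIndex < 1 || oneBasedIndex > matches.size()) {
            throw new IndexOutOfBoundsException(
                    "index " + oneBasedIndex + " is outside 1.." + matches.size());
        }
        return matches.get(oneBasedIndex - 1);  // convert to Java's 0-based indexing
    }

    public static void main(String[] args) {
        List<String> cells = Arrays.asList("first cell", "second cell", "third cell");
        System.out.println(nth(cells, 1));  // prints "first cell"
        System.out.println(nth(cells, 3));  // prints "third cell"
    }
}
```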
Filtering based on relative element
When a UI element cannot be identified uniquely and reliably by any of its own properties, but can be located through elements in its hierarchy or relative to a particular element, this method can be used to identify the element.
The By.relationFilter method allows users to define a UI element in relation to another element. The relation can be defined as “Left”, “Right”, “Top”, “Bottom”, “Child”, or “Parent”.
Filtering for Tables
When dealing specifically with tables, which have the <table> HTML tag, the By.tableFilter method allows the user to quickly identify specific cells in the table without having to write complex XPaths or custom logic.
The By.tableFilter method allows users to define a cell in the table by its row and column numbers. Users can then use the UI element directly in their code instead of writing their own logic each time, which also increases the efficiency and readability of the code.
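To show the kind of XPath such a helper saves users from writing by hand, here is a small sketch that builds a cell XPath from row and column numbers and evaluates it with the JDK's built-in XPath support. It is not the library's implementation: `TableCellXPath` and `cellXPath` are hypothetical, row and column numbers are assumed to start from 1, and the expression assumes a simple table whose `tr` rows are direct children of the table, containing only `td` cells.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical sketch: turns (row, column) into an XPath for a simple table.
class TableCellXPath {

    /** Builds an XPath for the cell at the given 1-based row and column. */
    static String cellXPath(String tableXPath, int row, int column) {
        if (row < 1 || column < 1) {
            throw new IllegalArgumentException("row and column must be 1 or greater");
        }
        return tableXPath + "//tr[" + row + "]/td[" + column + "]";
    }

    public static void main(String[] args) throws Exception {
        // A well-formed snippet so the JDK XML parser can read it directly.
        String html = "<table id='t'>"
                + "<tr><td>a</td><td>b</td></tr>"
                + "<tr><td>c</td><td>d</td></tr>"
                + "</table>";

        String expression = cellXPath("//table[@id='t']", 2, 1);
        XPath xpath = XPathFactory.newInstance().newXPath();
        String cellText = xpath.evaluate(expression, new InputSource(new StringReader(html)));

        System.out.println(expression);
        System.out.println(cellText);  // prints "c"
    }
}
```

Tables with header rows, `thead`/`tbody` sections, or nested tables would need a more careful expression than this one.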