DRAFT

ODSC India 2020

Tue, Dec 8
Timezone: Asia/Kolkata (IST)
09:00

    Opening Keynote - 45 mins

10:00

    Welcome Note - 15 mins

10:15

    Coffee Break - 15 mins

10:30
  • 10:30 - 10:50 AM, Online

    Anomaly detection has been one of the most sought-after analytical solutions for businesses operating in network operations, service operations, manufacturing, and many other sectors where continuity of operations is essential. Any degradation in operational service, or an outage, implies high losses and possible customer churn. The data in such real-world applications is generally noisy, has complex patterns, and is often correlated.

    Techniques such as auto-encoders can model complex patterns, but they can't explain the cause in the original feature space. Traditional univariate anomaly detection techniques use z-score and p-value methods. These rely on unimodality and on choosing the correct parametric form. If these assumptions are not satisfied, the result is a high number of false positives and false negatives.

    This is where the need arises to estimate a PDF (Probability Density Function) without assuming a prior parametric form, i.e. a non-parametric approach. The PDF needs to be modelled as close to the true distribution as possible. That is, it should have low bias and low variance, avoiding both over-smoothing and under-smoothing. Only then do we have a better chance of identifying true anomalies.

    Approaches like KDE (Kernel Density Estimation) assist in such non-parametric estimation. According to research, the type of kernel plays a lesser role than the bandwidth in obtaining a good PDF estimate. The default bandwidth selection technique used in both Python and R packages over-smooths the PDF and is not suitable for anomaly detection.

    We will explain another method, in which we run an optimisation over a cost function based on modelling the Gaussian kernel via FFT (Fast Fourier Transform) to obtain an appropriate bandwidth. We will then show how to apply it to anomaly detection even when the data is multi-modal (has multiple peaks) and the distribution can be of any shape.

    Based on research paper under publication "Optimal Kernel Density Estimation using FFT based cost function", currently scheduled for ICDM 2020, New York
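
    As a rough illustration of why a z-score can miss anomalies in multi-modal data, the sketch below compares it with a plain Gaussian KDE on synthetic bimodal data. It is not the FFT-based bandwidth optimisation described in the abstract: the data, the hand-picked bandwidth of 0.5, and all function names are assumptions made for this example.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def z_score(x, samples):
    """Classic z-score of x relative to the sample mean and standard deviation."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return (x - mean) / math.sqrt(var)


random.seed(0)
# Synthetic data with two well-separated modes around 0 and 10.
data = [random.gauss(0.0, 1.0) for _ in range(300)]
data += [random.gauss(10.0, 1.0) for _ in range(300)]

density = gaussian_kde(data, bandwidth=0.5)  # bandwidth chosen by hand
candidate = 5.0  # lies in the empty gap between the two modes

z = z_score(candidate, data)              # close to 0: looks "normal"
density_at_candidate = density(candidate)  # tiny: almost no data nearby
density_at_mode = density(0.0)

print(f"z-score: {z:.3f}")
print(f"density at candidate: {density_at_candidate:.6f}")
print(f"density at mode:      {density_at_mode:.6f}")
```

    The point in the gap sits near the overall mean, so its z-score is unremarkable, while the estimated density there is a small fraction of the density at either mode. How small depends heavily on the bandwidth, which is the choice the talk sets out to optimise.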

  • Akshay Bahadur - Indian Sign Language Recognition (ISLAR)

    10:30 - 10:50 AM, Online

    Consider this: two cities in India, Mumbai and Pune, though only 80 km apart, have distinctly different spoken dialects. Even stranger is the fact that their sign languages are also distinct, with very different signs for the same objects, expressions, and phrases. While regional diversification in spoken languages and scripts is well known and widely documented, it has apparently percolated into sign language as well, essentially resulting in multiple sign languages across the country. To help overcome these inconsistencies and to standardize sign language in India, I am collaborating with the Centre for Research and Development of Deaf & Mute (an NGO in Pune) and Google. I have adopted a two-pronged approach: a) I have developed an Indian Sign Language Recognition System (ISLAR), which uses artificial intelligence to accurately identify signs and translate them into text/vocals in real time, and b) I have proposed standardization of sign languages across India to the Government of India and the Indian Sign Language Research and Training Centre.

    The initiative aims to develop a lightweight machine-learning model for the 14 million speech/hearing-impaired Indians that is suited to Indian conditions and flexible enough to incorporate multiple signs for the same gesture. More importantly, unlike other implementations, which rely on additional external hardware, this approach uses a common surgical glove and a ubiquitous camera smartphone, and so has the potential for hardware-related savings at an all-India level. ISLAR received great attention from the open-source community, with Google inviting me to their India and global headquarters, in Bangalore and California respectively, to interact with and share my work with the TensorFlow team.

  • Gunjan Dewan - Developing a match-making algorithm between customers and Go-Jek products!

    10:30 - 10:50 AM, Online

    20+ products. Millions of active customers. An insane amount of data and a complex domain. Join me in this talk to learn about the journey we at Gojek took to predict which of our products a user is most likely to use next.

    A major problem we faced, as a company, was targeting our customers with promos and vouchers that were relevant to them. We developed a generalized model that takes into account the transaction history of users and gives a ranked list of our services that they are most likely to use next. From here on, we are able to determine the vouchers that we can target these customers with.

    In this talk, I will discuss how we used recommendation engines to solve this problem, the challenges we faced along the way, and the impact on our conversion rates. I will also cover the different iterations we went through and how our problem statement evolved as we solved it.

11:00
11:30
  • Venkata Pingali - Privacy-Law Aware ML Data Preparation

    11:30 - 11:50 AM, Online

    The new PDP (Personal Data Protection) Law, which is similar to GDPR
    and CCPA, is being implemented in India. All enterprise data services
    including analytics and data science within the scope of the law are
    required to comply with the same. Almost all major geographies have now
    passed similar laws. The expectation of responsible data handling from
    organizations is also increasing.

    Enrich, our product, is a high-trust data preparation platform for
    enterprises that provides data input to analysts and models at scale
    every day. Such data preparation services are on organizations’
    compliance and privacy-activity critical path because of their
    ‘fan-out’ nature. They provide a convenient location to enforce policy
    and safety mechanisms.

    In this talk we discuss some of the mechanisms that we are building
    for clients in our data preparation platform, Enrich. They include
    an open-source compliance checklist to help with the process, a ‘right
    to forget’ service using an anonymized lookup key service, and a
    metadata service to enable tracking of the datasets. The focus will be
    on the generic capabilities, and not on Scribble or our product.

    Note: Will update this over the next few days and weeks

  • 11:30 - 11:50 AM, Online

    This talk focuses on querying industry-grade big data systems. Enterprises have vast amounts of information spread across structured data stores (relational databases, data warehouses, etc.). Descriptive analytics over this data is limited to experts familiar with complex query languages (e.g., Structured Query Language) as well as the metadata and schema associated with such large data stores. The ability to convert natural language questions to SQL statements would make descriptive analytics and reporting much easier and more widespread. The problem of automatically converting natural language questions to SQL, known as Natural Language Interface to Databases (NLIDB), is well studied. We present our work on an end-to-end (E2E) system focussed on NLIDB.

    We describe two main aspects of E2E NLIDB systems: i) converting natural language to structured language and ii) understanding natural language. Such E2E systems have many applications across domains, e.g., healthcare, finance, logistics, etc.

  • Priyanshu Jain - Automated Ticket Routing for Large Enterprises

    11:30 - 11:50 AM, Online

    Large enterprises that provide services to consumers may receive millions of customer complaint tickets every month. Handling these tickets on time is very critical, as this directly impacts the quality of service and network efficiency.

    A ticket may be assigned to multiple teams before it gets resolved. Assigning a ticket to an appropriate group is usually done manually, as the complaint information provided by the customer is not very specific and may sometimes be inaccurate. This manual process incurs enormous labor costs and is very time-inefficient, as each ticket may sit in a queue for hours.

    In this talk, we will present an approach to automate the process of ticket routing completely. We will start by discussing how we can use Markov Chains to model the flow of tickets across different teams. Next, we will discuss the feature engineering part and why Factorization Machine Models are essential for such a use case. This will be followed by a discussion on the learning of decision rule sets in a supervised manner. These decision rules can be used to move tickets across multiple teams in an automated fashion, thus automating the complete process of ticket routing. We will also discuss how the proposed framework can be validated easily by SMEs, unlike many other AI solutions, which helps its quick acceptance in an organization. Finally, we will go through the different settings in which this solution can fit, illustrating its broad applicability.

    The framework can provide substantial cost savings to enterprises. It can also reduce response time to tickets significantly by almost eliminating queue time. Overall, it can help large enterprises by:

    1. Saving costs by reducing the workforce of ticket handling team

    2. Increasing revenue by improving quality of customer experience
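
    To make the Markov chain idea concrete, the sketch below estimates first-order transition probabilities between teams from historical routing paths. The team names and the four sample paths are hypothetical, and this is only one simple way to model ticket flow, not necessarily the formulation used in the talk.

```python
from collections import Counter, defaultdict


def estimate_transitions(paths):
    """Estimate P(next team | current team) from observed ticket paths."""
    counts = defaultdict(Counter)
    for path in paths:
        # Count every consecutive (current, next) hand-off in the path.
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: c / sum(dsts.values()) for dst, c in dsts.items()}
        for src, dsts in counts.items()
    }


# Hypothetical routing histories; "resolved" is the absorbing end state.
paths = [
    ["helpdesk", "network", "resolved"],
    ["helpdesk", "billing", "resolved"],
    ["helpdesk", "network", "field_ops", "resolved"],
    ["helpdesk", "network", "resolved"],
]

transitions = estimate_transitions(paths)
for team, nxt in transitions.items():
    print(team, nxt)
```

    A matrix like this shows where tickets typically bounce between teams, which is the kind of structure the routing rules are then meant to exploit.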

12:00
  • Kuldeep Singh - Simplify Experimentation, Deployment and Collaboration for ML and AI Models

    12:00 - 12:20 PM, Online

    Machine Learning and AI are changing, or rather have already changed, the way businesses operate. However, the data science community still lacks good practices for organizing projects, collaborating effectively, and experimenting quickly to reduce “time to market”.

    During this session, we will learn about one such open-source tool, “DVC”,
    which can help you make ML models shareable and reproducible.
    It is designed to handle large files, data sets, machine learning models, and metrics as well as code.

  • Darshan Ganji / Deepesh Agrawal - On-Demand Accelerating Deep Neural Network Inference via Edge Computing

    12:00 - 12:20 PM, Online

    Deep neural networks are both computationally intensive and memory intensive, making them difficult to deploy on mobile phones and embedded systems with limited hardware resources, and slow for both inference and training. For many mobile-first companies such as Baidu and Facebook, various apps are updated via different app stores, and they are very sensitive to the size of the binary files. For example, the App Store has the restriction “apps above 100 MB will not download until you connect to Wi-Fi”. As a result, a feature that increases the binary size by 100 MB will receive much more scrutiny than one that increases it by 10 MB. It is challenging to run computation-intensive DNN-based tasks on mobile devices due to the limited computation resources.

    This talk introduces the algorithms and hardware that can be used to accelerate inference or reduce the latency of deep learning workloads. We will discuss how to compress deep neural networks and techniques such as graph fusion and kernel auto-tuning for accelerating inference, as well as data and model parallelization, automatic mixed precision, and other techniques for accelerating training. We will also discuss specialized hardware for deep learning such as GPUs, FPGAs, and ASICs, including the Tensor Cores in NVIDIA’s Volta GPUs as well as Google’s Tensor Processing Units (TPUs). Finally, we will discuss the deployment of large deep learning models on edge devices such as NVIDIA Jetson Nano and Google's Edge TPU (Coral).

    Keywords: Graph Optimization, Tensor Fusion, Kernel Auto Tuning, Pruning, Weight sharing, quantization, low-rank approximations, binary networks, ternary networks, Winograd transformations, data parallelism, model parallelism, mixed precision, FP16, FP32, model distillation, Dense-Sparse-Dense training, NVIDIA Volta, Tensor Core, Google TPU.
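
    Of the compression techniques listed in the keywords, quantization is the easiest to show in a few lines. The sketch below is a minimal symmetric int8 scheme with one scale per tensor; the function names and toy weights are assumptions for illustration, not the specific method presented in the talk.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 representation."""
    return [q * scale for q in quantized]


weights = [0.02, -1.27, 0.5, 0.75]  # toy values standing in for a layer
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)

# Rounding moves each weight by at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q_weights, scale, max_error)
```

    Storing each weight in 8 bits instead of 32 cuts the size of the stored weights by roughly four times, at the cost of a bounded rounding error per weight.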

  • Vinayaka Mayura G G - Metamorphic Testing for Machine Learning Models with Search Relevancy Example

    12:00 - 12:20 PM, Online

    The accuracy of a model can be improved at several levels, across multiple variables, boundaries, and guidelines. Even with a well-known problem statement and solution, it is difficult to verify that the model predicts the expected outcome in every case. Machine learning models are, most of the time, solving problems for which the results are unknown. This gives rise to the test oracle problem. Recent surveys and work have shown that this difficulty can be reduced by black-box testing techniques such as metamorphic testing, fuzzing, dual coding, etc.

    Even though the output of a model is not known, we can make a few predictions based on metamorphic relations. A metamorphic relation refers to the relationship between a change in the software input and the change in output across multiple program executions. Many metamorphic relations are created from transformations of the training or test data set. We further classify them into coarse-grained and fine-grained data transformations.

    We will discuss different transformations. We will then go through the example of a search relevancy problem and analyse the application of metamorphic testing to verify the machine learning model built.
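
    A small sketch may help show what a metamorphic relation looks like in a search setting. The toy ranker, the documents, and the two relations below are all assumptions chosen for illustration; a real relevancy model would need relations suited to its own behaviour.

```python
def search(query, documents):
    """Toy ranker: score each document by how often the query terms occur."""
    terms = query.lower().split()
    scored = []
    for idx, doc in enumerate(documents):
        words = doc.lower().split()
        score = sum(words.count(t) for t in terms)
        if score > 0:
            # Sort by descending score, then by original position.
            scored.append((-score, idx, doc))
    return [doc for _, _, doc in sorted(scored)]


def holds_for_irrelevant_doc(query, documents, noise_doc):
    """Relation 1: adding a document with no query terms keeps the results."""
    return search(query, documents) == search(query, documents + [noise_doc])


def holds_for_term_order(query, documents):
    """Relation 2: reordering the query terms keeps the results."""
    reordered = " ".join(reversed(query.split()))
    return search(query, documents) == search(reordered, documents)


docs = ["red running shoes", "blue shoes", "red hat", "green scarf"]
print(search("red shoes", docs))
print(holds_for_irrelevant_doc("red shoes", docs, "wool socks"))
print(holds_for_term_order("red shoes", docs))
```

    Neither check needs to know the "correct" ranking. Each one only compares two runs of the system under test, which is how metamorphic testing sidesteps the test oracle problem.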

12:30

    LunchBreak - 60 mins

13:30
  • POOJA BALUSANI - Model Interpretability and Explainable AI in Manufacturing

    01:30 - 01:50 PM, Online

    In this talk, we present an industrial use case on “anomaly detection” in steel mills based on IoT sensor data. In large steel mills and manufacturing plants, the top reasons for unplanned downtime are:
    • Failure of critical asset
    • Quality spec of the end product in line not being met
    • Operational limits outside the recommended range (e.g. process, human-safety, equipment-safety, etc.)

    Unplanned downtime or line stoppage leads to loss of production or throughput and revenue loss.

    Anomaly detection can serve as an early warning system, providing alerts on anomalous behavior that could be detrimental to the equipment health or affect process quality. In this work, we are performing multi-variate anomaly detection on time-series sensor data in a steel mill to help the maintenance engineers and process operators take proactive actions and help reduce plant downtime. Anomaly is presented to the customer in terms of:
    • “time-intervals” – startTime: endTime chunks that exhibit deviant behavior
    • “anomaly-state” – type association of anomaly to a specific pattern or cluster state
    • “anomaly-contribution” – priority association to sensor signals that exhibited deviant behavior within the multi-variate list (more like signal importance)

    We shall introduce the approach, where we reformulate the unsupervised modeling to a supervised formulation to incorporate SHAP, LIME, and other explainable tools. We shall illustrate the steps to provide the above-mentioned meta-data for an anomaly to make it explainable and consumable for the end-customer.

  • Ujwala Musku - Supply Path Optimization in Video Advertising Landscape

    01:30 - 01:50 PM, Online

    In the programmatic era, with a lot of players in the market, it is quite complex for a buyer to reach the destination, namely advertising slot from the source, namely publisher. Auction Duplication, internal deals between DSP & SSP, and fraudulent activities are making the existing complex route even more complex day by day. Due to the aforementioned reasons, it is fairly evident that a single impression is being sold through multiple routes by multiple sellers at multiple prices. The new dilemma that has emerged recently is: Which route/path should the buyer choose and what should be the fair price to pay?

    In this talk, we will discuss a framework that solves the problem of choosing the best path at the right price in programmatic video advertising. Initially, we will give an overview of the different approaches tried, i.e., clustering, classification modelling, DEA, and scoring based on classification modelling. Of these, DEA and the scoring methodology gave better results, so we will present a detailed comparison of results and explain why a particular approach worked better. The final framework covers the two best-performing techniques: 1. Data Envelopment Analysis and 2. Scoring based on Classification Modelling. DEA is a non-parametric method used to rank the unlabelled dataset of various supply paths by estimating their relative efficiencies. These efficiencies are calculated by comparing all the possible production frontiers of decision-making units (here, supply paths). As a statistical and machine learning hybrid, the scoring method calculates a score for each supply path, helping us decide whether a path is worth bidding on.

    The results of these models are compared with each other to choose the best one based on campaign KPIs, i.e., CPM (cost per 1000 impressions) and CPCV (cost per completed view of the video ad). A 4-8% improvement in CPM is observed in multiple test video ad campaigns; however, there is a dip in the number of impressions delivered. This is tackled by including impressions as an input in both techniques. These clear improvements in CPM indicate that the technique results in better ROI than the heuristic approach. This approach can be used in various sectors such as banking (determining credit scores) and retail (supply path optimization in operations).

  • Dr. Manjeet Dahiya - Learning Maps from Geospatial Data Captured by Logistics Operations

    01:30 - 01:50 PM, Online

    Summary

    Logistics operations produce a huge amount of geospatial data and this talk tells how we can use it to create a mapping service such as Google Maps and Here Maps!

    Abstract

    E-commerce and logistics operations produce a vast amount of geospatial data while moving and delivering packages. As a logistics company supporting the e-commerce operations in multiple Asian countries, Delhivery produces over 50 million geo-coordinates daily. These geo-coordinates represent the movement of trucks and bikes or delivery events to the given postal addresses. The data has great potential to mine geospatial knowledge, and we demonstrate that a mapping service similar to Google Maps and Here Maps can be automatically built using the same. Specifically, we describe the learning of regional maps (localities, cities, etc) from the addresses labeled with geo-coordinates and the learning of roads from the geo-coordinates associated with movement.

    We propose an algorithm to construct polygons and polylines of the map entities given a set of geo-coordinates. The algorithm involves non-parametric spatial probability modelling of the map entities followed by classification of the cells in a hexagonal grid to the respective map entity. We show that our algorithm is capable of handling noise, which is significantly high in our setting due to various reasons such as scale and device issues. We present a property of the noise and the correct information under which our algorithm infers a correct map entity. We quantitatively measure the accuracy of our system by comparing its output with the available ground truth. We will showcase some localities that have incorrect polygons in Google Maps, whereas we can learn the correct version from our data and algorithm. We also discuss multiple applications of the generated maps in the context of e-commerce and logistics operations.


    A part of this work was accepted for publication at ACM/SIGAPP Symposium On Applied Computing 2020:

    "Learning Locality Maps from Noisy Geospatial Labels. In SAC 2020 at Brno, Czech Republic"

14:00
  • Soham Chakraborty - A Spurious Outlier Detection System For High Frequency Time Series Data

    02:00 - 02:20 PM, Online

    As we are living in the age of IoT, more and more processes are using information gathered from well-placed sensors to infer and predict better about their businesses. This sensor data is typically continuous and of enormous volume. Like any other data source, it is also contaminated by noise (outliers), which may or may not be preventable. The presence of these outlier points adversely affects the performance of any analytical model. Note that we differentiate between contextual anomalies and noisy outliers. The former are important to us for building predictive models. Here we propose an integrated and scalable approach to detect spurious outliers. The main modules of the proposed system are taken from the literature. To our knowledge, however, no such concerted approach exists that proposes an end-to-end robust system as we do here. Even though this method was developed specifically using manufacturing IoT data, it is equally applicable to any domain dealing with time series data, such as CPG, retail, healthcare, agrotech, etc.

  • Soumya Jain - Unsupervised learning approach for identifying retail store employees using footfall data

    02:00 - 02:20 PM, Online

    Analysis of customer visits (or footfall) in a store, traced via geolocation-enabled devices, helps digital firms understand customers and their buying behavior better. Insights gained through geo footfall analysis help clients and advertisers make informed decisions, choose profitable regions, recognize relevant advertising opportunities, and analyze their competitors to increase the success rate. But all this information can be misleading if people who walk past the store without entering, and staff of the store, are not excluded. Therefore, two groups of people contributing to the footfall at the store can be considered outliers: people passing by the store, and employees of the store. The behavior of these outliers is expected to be different from that of actual customers.

    Since the data collected by geofencing the stores and from SDK pings of the geo-enabled devices does not directly tag these outliers, they are not very evident and cannot be removed by extreme value analysis. To tackle this problem we have formulated a multivariate approach to identify and remove these outliers from our source data. As we have no labeled data that marks a footfall as an employee or customer, we use an unsupervised outlier detection model based on the DBSCAN algorithm to provide a coherent and complete dataset with labeled outliers. In this process, different techniques were considered to assess the effectiveness of features. Features like time spent by a visitor in and around the stores compared to other locations, monthly visit frequency, daily visit frequency, etc. were dominant in tagging the outliers.

    Discovering the structure of the data was another key step in optimizing the parameters of the DBSCAN algorithm for our use case, namely epsilon and the minimum number of points.

    Finally, the results were evaluated against those obtained with the k-means algorithm, which showed that DBSCAN has a higher detection rate and a lower rate of false positives in discovering outliers for the given problem statement.
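
    The role of epsilon and the minimum number of points can be seen in a compact, dependency-free DBSCAN sketch. The two features (scaled dwell time and scaled visit frequency), the eight sample visitors, and the parameter values are all hypothetical; the talk's actual features and tuning are not reproduced here.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE if it is an outlier."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nbrs)
        cluster += 1
    return labels


# Hypothetical (dwell_time, visit_frequency) pairs, already scaled to [0, 1].
visitors = [
    (0.10, 0.10), (0.12, 0.08), (0.15, 0.12),
    (0.11, 0.15), (0.14, 0.09), (0.13, 0.11),
    (0.90, 0.95),  # very long dwell, very frequent: employee-like
    (0.85, 0.70),  # another unusual pattern
]

labels = dbscan(visitors, eps=0.1, min_pts=3)
print(labels)
```

    With these values the six typical visitors form one dense cluster and the two unusual visitors are left as noise. A larger epsilon or smaller minimum point count would start absorbing them, which is why tuning those two parameters matters.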

  • Amogh Kamat Tarcar - Privacy Preserving Machine Learning Techniques

    02:00 - 02:20 PM, Online

    Privacy-preserving machine learning is an emerging field under active research. The most successful machine learning models today are built by aggregating all data together at a central location. While centralised techniques are great, there are plenty of scenarios, such as user privacy, legal concerns, business competitiveness, or bandwidth limitations, in which data cannot be aggregated together. Federated Learning can help overcome these challenges with its decentralised strategy for building machine learning models. Paired with privacy-preserving techniques such as encryption and differential privacy, Federated Learning presents a promising new way of advancing machine learning solutions.

    In this talk I’ll bring the audience up to speed with the progress in privacy-preserving machine learning, discuss platforms for developing models, and present a demo on healthcare use cases.

14:30
  • Aravind Kondamudi / Upasana Roy Chowdhury - AI in Manufacturing - Improving Process using Prescriptive Analytics

    02:30 - 02:50 PM, Online

    With the rise of Industry 4.0, computation power, data warehousing, and automation, factories have become increasingly intelligent. Preventive maintenance of machines and failure prediction have become increasingly common. AI has also empowered planning and logistics, where the quantity of items to be manufactured, and the timing, are decided from the outputs of ML models. Manufacturers are now increasingly focused on improving process quality and throughput through sustainable methods, as global warming is a rising concern. To improve efficiency and make the process sustainable, machine learning models coupled with optimization are used for prescriptive analytics. Industrial process data is often huge, with many process and control variables involved. Understanding the variables requires domain expertise coupled with feature engineering techniques. A search-based optimization can be used to find the Pareto-optimal solution, with the objectives of maximizing the KPI and finding support in historical data. Interaction effects are identified by learning from the data through a prediction model. The performance after the process is predicted by modelling the KPI. Sensitivity analysis was conducted to understand the effect of variables on the uncertainty of the model output and the KPI. The process, once optimized for maximum throughput, provides prescriptive analytics, thereby improving performance and reducing energy consumption.

  • Kavita Dwivedi - Portfolio Valuation for a Retail Bank using Monte Carlo Simulation and Forecasting for Risk Measurement

    02:30 - 02:50 PM, Online

    Banks today need a very good assessment of their portfolio value at any point in time. This is both a regulatory requirement and an operational metric that helps banks assess the risk of their portfolio and calculate the capital adequacy they need to maintain at portfolio and product levels, all aggregated at bank level.

    This presentation will walk you through a case study that discusses in detail how we calculated the portfolio value for a home loan portfolio on sample data. The bank wanted a scientific/statistical approach that it could take to regulators for approval, and thus convince them about the capital it holds for a particular portfolio.

    The other interesting dimension was that, if the bank wants to sell a particular loan book to another bank or third-party financial institution, it would be able to quote a price within the confidence interval of the calculated price. The same model/tool could also be shared with the buyer to convince them of the quoted price, making negotiation and selling smoother.

    We used Monte Carlo simulation on historical data of the portfolio to measure the value of a home loan portfolio over the next 5 years. It is a two-step modelling process: machine learning models predict default, and simulation is then used to calculate the portfolio value year on year for the next 5 years, taking diminishing returns into account.

    The presentation will take you through the approach and modelling process, and how Monte Carlo simulation helped us deliver the result to the customer with high accuracy and confidence.

    This is a real case study and will focus on why risk measurement is important and why Basel and CCAR implementation across banks worldwide helps central banks manage risks in the event of a financial downturn or Black Swan events.
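
    The two-step idea (a default probability per loan, then simulation of portfolio value) can be sketched as follows. This is a deliberately simplified model under stated assumptions: loans are interest-only with principal repaid at the horizon, default probabilities are constant per year and would in practice come from the ML model, and the loan figures, recovery rates, and discount rate are invented. Diminishing returns, mentioned in the abstract, are not modelled.

```python
import random


def simulate_once(loans, years, discount_rate, rng):
    """Present value of the loan book along one simulated default path."""
    total = 0.0
    for balance, annual_pd, rate, recovery in loans:
        defaulted = False
        for year in range(1, years + 1):
            discount = (1.0 + discount_rate) ** year
            if rng.random() < annual_pd:
                # On default, recover a fraction of the balance and stop.
                total += balance * recovery / discount
                defaulted = True
                break
            total += balance * rate / discount  # interest received this year
        if not defaulted:
            total += balance / (1.0 + discount_rate) ** years  # principal back
    return total


def simulate_portfolio(loans, years=5, discount_rate=0.05, runs=2000, seed=42):
    """Return the mean value and an empirical 5%-95% interval."""
    rng = random.Random(seed)
    values = sorted(
        simulate_once(loans, years, discount_rate, rng) for _ in range(runs)
    )
    mean = sum(values) / runs
    lower = values[int(0.05 * runs)]
    upper = values[int(0.95 * runs) - 1]
    return mean, lower, upper


# Hypothetical loans: (balance, annual default probability, rate, recovery).
loans = [
    (100_000.0, 0.02, 0.08, 0.40),
    (250_000.0, 0.05, 0.09, 0.50),
    (80_000.0, 0.10, 0.11, 0.30),
]

mean_value, lower_value, upper_value = simulate_portfolio(loans)
print(f"mean={mean_value:,.0f}  5%={lower_value:,.0f}  95%={upper_value:,.0f}")
```

    The interval between the lower and upper percentiles plays the role of the price range within which a loan book could be quoted.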

  • Parthiban Srinivasan - Coronavirus: Through The Lens Of AI

    02:30 - 02:50 PM, Online

    In a global pandemic such as COVID-19, technology, artificial intelligence, and data science have become critical in helping societies deal effectively with the outbreak. In this talk, I will discuss three case studies of how AI is being used in coronavirus research. The first part of the talk will discuss how a deep learning model detected COVID-19-caused pneumonia from computed tomography (CT) scans with performance comparable to expert radiologists. More specifically, I will discuss the UNet++ architecture that researchers implemented for evaluating lung infection in COVID-19 CT images. The second part of the talk will be devoted to recent attempts in natural language processing to generate new insights in support of the ongoing fight against this infectious disease. There is a growing urgency for these approaches because of the rapid acceleration in new coronavirus literature, which makes it difficult for the medical research community to keep up. Specifically, a BERT-based literature search engine for COVID-19 literature will be discussed.

    The third part of the talk deals with a deep learning based generative modelling framework for designing drug candidates specific to a given target protein sequence. One of the most important COVID-19 protein targets is the 3C-like protease, for which the crystal structure is known. We present different deep learning models designed for generating novel drug molecules with multiple desirable properties. The deep learning framework involves variational autoencoders, generative adversarial networks, reinforcement learning, and transfer learning. The generated molecules might serve as a blueprint for creating drugs that can potentially bind to the viral protein with high target affinity, as well as high drug-likeness. Last but not least, this talk will also touch upon how the world community responded by making data available to researchers, which enabled data scientists to explore and support the scientific community.

15:00

    Coffee Break - 15 mins

15:15
  • Dr. Sri Vallabha Deevi - Machine health monitoring with AI

    03:15 - 03:35 PM, Online

    Predictive maintenance is the most recent technique in maintenance engineering. Machine operational parameters are used to assess the health of equipment and decide on a maintenance schedule. In aviation, aircraft engine manufacturers continuously monitor their engine parameters in flight to evaluate performance and deviations from normal.

    Application of AI in this field enables measurement of behavior that is not observable using traditional means. AI-based monitoring provides the edge required to operate in Industry 4.0, where connected machines do away with buffers between processes and any unscheduled downtime of one machine affects the entire production chain.

    This demonstration will walk you through the development of AI models using IoT data for one of the largest metal manufacturing companies in India. It will help you master different types of AI models to answer questions like:

    • When do I plan the maintenance of a given piece of equipment?
    • Will a component last till the next maintenance cycle or do I replace it during the current maintenance?
    • How do I identify faulty equipment in a long production line?
  • Dat Tran / Tanuj Jain - imagededup - Finding duplicate images made easy!

    schedule  03:15 - 04:00 PM place Online

    The problem of finding duplicates in an image collection is widespread. Many online businesses rely on image galleries to deliver a good customer experience and consequently, generate more revenue. Hence, the image galleries need to be of the highest quality. Presence of duplicates in such galleries could potentially degrade the customer experience. Additionally, image-based machine learning models could generate misleading results due to the duplicates present in the training/evaluation/test sets.

    Therefore, finding and removing duplicates is an important requirement across several use cases. In this talk, we want to present imagededup, a Python package we built to solve the problem of finding exact and near duplicates in an image collection. We will speak about the motivation behind building it, its functionality and also give a demo.
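    To make the idea of near-duplicate detection concrete, here is a minimal sketch of one common technique, difference hashing compared by Hamming distance. It is not the imagededup implementation or its API: the function names, the `max_distance` threshold, and the assumption that each image has already been converted to a small grayscale grid of 0-255 values are all illustrative.

```python
from typing import Dict, List

Image = List[List[int]]  # assumed: small grayscale grid, each value 0-255


def dhash(image: Image) -> int:
    """Difference hash: one bit per pair of horizontally adjacent pixels."""
    bits = 0
    for row in image:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness drops from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def find_duplicates(images: Dict[str, Image], max_distance: int = 2) -> Dict[str, List[str]]:
    """Map each image name to the names whose hashes are within max_distance."""
    hashes = {name: dhash(img) for name, img in images.items()}
    return {
        name: sorted(
            other
            for other, other_hash in hashes.items()
            if other != name and hamming_distance(hashes[name], other_hash) <= max_distance
        )
        for name in hashes
    }


original = [[10, 20, 30, 40], [40, 30, 20, 10], [5, 50, 5, 50]]
brighter = [[p + 3 for p in row] for row in original]  # same picture, slightly brighter
different = [[40, 30, 20, 10], [10, 20, 30, 40], [50, 5, 50, 5]]

result = find_duplicates({"a": original, "b": brighter, "c": different})
```

    Because the hash only records whether brightness rises or falls between neighbours, a uniform brightness shift leaves it unchanged, which is why `a` and `b` are paired here while `c` is not. Exact duplicates are the special case of distance 0.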

  • schedule  03:15 - 04:00 PM place Online

    Short Abstract

    It is a well known fact that the more data we have, the better performance ML models can achieve. However, getting a large amount of training data annotated is a luxury most practitioners cannot afford. Computer vision has circumvented this via data augmentation techniques and has reaped rich benefits. Can NLP not do the same? In this talk we will look at various techniques available for practitioners to augment data for their NLP application and various bells and whistles around these techniques.

    Long Abstract

    In the area of AI, it is a well-established fact that data beats algorithms, i.e. large amounts of data with a simple algorithm often yield far superior results compared to the best algorithm with little data. This is especially true for deep learning algorithms, which are known to be data guzzlers. Getting data labeled at scale is a luxury most practitioners cannot afford. What does one do in such a scenario?

    This is where data augmentation comes into play. Data augmentation is a set of techniques to increase the size of datasets and introduce more variability in the data. This helps to train better and more robust models. Data augmentation is very popular in the area of computer vision. From simple techniques like rotation, translation, and adding salt-and-pepper noise to GANs, we have a whole range of techniques to augment images. It is well known that augmentation is one of the key anchors of the success of computer vision models in industrial applications.

    Most natural language processing (NLP) projects in industry still suffer from data scarcity. This is where recent advances in data augmentation for NLP can be very helpful. When it comes to NLP, data augmentation is not that straightforward. You want to augment data while keeping the syntactic and semantic properties of the text. In this talk we will take a deep dive into the various techniques that are available to practitioners to augment data for NLP. The talk is meant for Data Scientists, NLP engineers, ML engineers and industry leaders working on NLP problems.
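    As one illustration of text augmentation that tries to preserve syntax and meaning, the sketch below performs synonym replacement. The abstract does not say which techniques the talk covers, so this is only an example of the general idea; the `SYNONYMS` table, the function name and the replacement probability `p` are hypothetical.

```python
import random
from typing import Dict, List

# Hypothetical toy synonym table; a real system might draw on a thesaurus
# or on word embeddings instead.
SYNONYMS: Dict[str, List[str]] = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
}


def synonym_augment(sentence: str, p: float = 0.5, seed: int = 0) -> str:
    """Replace each word that has known synonyms with probability p."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    out = []
    for token in sentence.split():
        options = SYNONYMS.get(token.lower())
        if options and rng.random() < p:
            out.append(rng.choice(options))  # swap in a synonym
        else:
            out.append(token)  # keep the original word
    return " ".join(out)


sentence = "the quick dog is happy"
unchanged = synonym_augment(sentence, p=0.0)
replaced = synonym_augment(sentence, p=1.0)
```

    Word order is untouched, so the sentence structure survives; whether the meaning survives depends entirely on the quality of the synonym source, which is the hard part in practice.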

15:50
  • schedule  03:50 - 04:10 PM place Online

    A large fraction of NLP work in academia and research groups deals with clean datasets that are well structured and free of noise. However, when it comes to building real-world NLP applications, one often has to collect data from applications such as chats, user-discussion forums, social-media conversations, etc. Invariably, NLP applications in industrial settings have to deal with much noisier and more varied data: data with spelling mistakes, typos, acronyms, emojis, embedded metadata, etc.

    There is a high level of disparity between the data SOTA language models were trained on and the data these models are expected to work on in practice. This renders most commercial NLP applications working with noisy data unable to take advantage of SOTA advances in the field of language computation.

    Handcrafting rules and heuristics to correct this data on a large scale might not be a scalable option for most industrial applications. Most SOTA models in NLP are not designed keeping in mind noise in the data. They often give a substandard performance on noisy data.

    In this talk, we share our approach, experience, and learnings from designing a robust system to clean noise in data, without handcrafting the rules, using Machine Translation, and effectively making downstream NLP tasks easier to perform.

    This work is motivated by our business use case where we are building a conversational system over WhatsApp to screen candidates for blue-collar jobs. Our candidate user base often comes from tier-2 and tier-3 cities of India. Their responses to our conversational bot are mostly a code mix of Hindi and English coupled with non-canonical text (e.g. typos, non-standard syntactic constructions, spelling variations, phonetic substitutions, foreign language words in a non-native script, grammatically incorrect text, colloquialisms, abbreviations, etc.). The raw text our system gets is far from clean, well-formatted text, and text normalization becomes a necessity to process it any further.

    This talk is meant for computational language researchers, NLP practitioners, ML engineers, data scientists, senior leaders of AI/ML/DS groups and linguists working with non-canonical text in resource-rich, resource-constrained (i.e. vernacular) and code-mixed languages.

16:20
  • Ashwathi Nambiar - Quantization To The Rescue: An Edge AI Story

    schedule  04:20 - 04:40 PM place Online

    Over the last decade, deep neural networks have brought in a resurgence in artificial intelligence, with machines outperforming humans in some of the most popular image recognition problems. But all that jazz comes with its costs – high compute complexity and large memory requirements. These requirements translate to higher power consumption resulting in steep electricity bills and a sizeable carbon footprint. Optimizing model size and complexity thus becomes a necessity for a sustainable future for AI.
    Memory and compute complexity optimizations also bring the promise of unimaginable possibilities with edge AI: self-driving cars, predictive maintenance, smart speakers, and body monitoring are only the beginning. The smartphone market, with its reach to nearly 4 billion people, is only a fraction of the potential edge devices waiting to be truly ‘smart’. Think smart hospitals, or industrial automation in mining and oil and gas, and so much more.

    In this session we will talk about:

    • Challenges in deep neural network (DNN) deployment on embedded systems with resource constraints
    • Quantization, which has been popularly used in mathematics and digital signal processing to map values from a large often continuous set to values in a countable smaller set, now reimagined as a possible solution for compressing DNNs and accelerating inference.
      It is gaining popularity not only with machine learning frameworks like MATLAB, TensorFlow and PyTorch but also amidst hardware toolchains like NVIDIA® TensorRT and Xilinx® DNNDK. The core idea behind quantization is the resiliency of neural networks to noise. Deep neural networks, in particular, are trained to pick up key patterns and ignore noise. This means that the networks can cope with small changes resulting from quantization error, as backed by research indicating minimal impact of quantization on overall accuracy of the network. This, coupled with significant reduction in memory footprint, power consumption, and gains in computational speed, makes quantization an efficient approach for deploying neural networks to embedded hardware.
    • Example of a quantization solution for an object detection problem
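    The mapping "from a large often continuous set to values in a countable smaller set" can be sketched as affine (scale and zero-point) quantization of floats to 8-bit integers. This is a generic illustration, not the scheme used by any of the toolchains named above; the function names and the per-tensor min/max range are assumptions made for the example.

```python
from typing import List, Tuple


def quantize(values: List[float], num_bits: int = 8) -> Tuple[List[int], float, int]:
    """Map floats to unsigned integers using a scale and a zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The represented range is widened to include 0.0 so zero stays exact.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against an all-zero input
    zero_point = int(round(qmin - lo / scale))
    quantized = [
        max(qmin, min(qmax, int(round(v / scale)) + zero_point))  # round, shift, clamp
        for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate floats from the integer representation."""
    return [scale * (q - zero_point) for q in quantized]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

    Each weight now fits in one byte instead of four, and the reconstruction error is bounded by roughly one quantization step (`scale`). That small, bounded perturbation is the "noise" the network is expected to tolerate.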
  • Ravi Ranjan - Deep Reinforcement Learning Based RecSys Using Distributed Q Table

    schedule  04:20 - 04:40 PM place Online

    Recommendation systems (RecSys) are the core engine for any personalized experience on eCommerce and online media websites. Most companies leverage RecSys to increase user interaction, enrich shopping potential and generate upsell and cross-sell opportunities. Amazon uses recommendations as a targeted marketing tool throughout its website, which contributes 35% of its total revenue [1]. Netflix users watch ~75% of the recommended content and artwork [2]. Spotify employs a recommendation system to update personal playlists every week so that users won’t miss newly released music by artists they like. This has helped Spotify to increase its number of monthly users from 75 million to 100 million [3]. YouTube's personalized recommendations help users find relevant videos quickly and easily, and account for around 60% of video clicks from the homepage [4].

    In general, RecSys generates recommendations based on user browsing history and preferences, past purchases and item metadata. Most existing recommendation systems are based on three paradigms: collaborative filtering (CF) and its variants, content-based recommendation engines, and hybrid recommendation engines that combine content-based and CF or exploit more information about users in content-based recommendation. However, they suffer from limitations such as rapidly changing user data and preferences, static recommendations, grey sheep, cold start and malicious users.

    Classical RecSys algorithms such as content-based recommendation perform well on item-to-item similarity but will only recommend items related to one category, and may not recommend anything in other categories because the user never viewed those items before. Collaborative filtering solves this problem by exploiting users' behavior and preferences over the items when recommending items to new users. However, collaborative filtering suffers from a few drawbacks like cold start, popularity bias, and sparsity. The classical recommendation models treat recommendation as a static process. We can address static recommendation on rapidly changing user data with RL. RL-based RecSys captures the user’s temporal intentions and responds promptly. However, as the user-action and item matrix grows, it becomes difficult to provide recommendations using RL. Deep RL-based solutions like actor-critic and deep Q-networks overcome all the aforementioned drawbacks.

    Present systems suffer from two limitations. First, they treat recommendation as a static procedure and ignore the dynamic, interactive nature of the relationship between users and the recommender system. Second, most work focuses on the immediate feedback on recommended items and neglects long-term rewards. We propose a recommendation system that uses Q-learning, a powerful reinforcement learning method, combined with an ε-greedy policy; it handles these issues and gives the customer more chances to explore new pages or new products that are not so popular. When Reinforcement Learning (RL) is applied to real-world problems, both the state space and the action space are usually very large. To address this, we propose a multiple/distributed Q-table approach that can deal with a large state-action space and helps make the Q-learning algorithm practical for recommendation.
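    The combination of Q-learning and an ε-greedy policy can be sketched as below. Splitting the table into shards keyed by a hash of the state is only one possible reading of "multiple/distributed Q table" and is an assumption of this example, not the method of the referenced paper; the class name, the hyperparameter defaults and the toy states and actions are likewise hypothetical.

```python
import random
from collections import defaultdict
from typing import Hashable, List


class ShardedQTable:
    """Tabular Q-learning whose entries are spread over several smaller tables."""

    def __init__(self, actions: List[str], num_shards: int = 4, alpha: float = 0.1,
                 gamma: float = 0.9, epsilon: float = 0.1, seed: int = 0) -> None:
        self.actions = list(actions)
        self.shards = [defaultdict(float) for _ in range(num_shards)]  # Q defaults to 0.0
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def _shard(self, state: Hashable) -> defaultdict:
        # Assumed partitioning rule: route each state to a shard by its hash.
        return self.shards[hash(state) % len(self.shards)]

    def q(self, state: Hashable, action: str) -> float:
        return self._shard(state)[(state, action)]

    def choose(self, state: Hashable) -> str:
        """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q(state, a))

    def update(self, state: Hashable, action: str, reward: float, next_state: Hashable) -> None:
        """Q(s,a) += alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a))."""
        best_next = max(self.q(next_state, a) for a in self.actions)
        target = reward + self.gamma * best_next
        table = self._shard(state)
        table[(state, action)] += self.alpha * (target - table[(state, action)])


agent = ShardedQTable(actions=["shoes", "books"], epsilon=0.0)
# The user on the "home" page clicked the recommended "books" item (reward 1.0).
agent.update("home", "books", reward=1.0, next_state="home")
greedy_choice = agent.choose("home")
```

    A non-zero `epsilon` is what gives less popular items a chance of being shown, and the `gamma` term is what lets the agent value long-term reward over immediate feedback.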

    References:

    1. https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers
    2. https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429
    3. https://www.bloomberg.com/news/articles/2016-09-21/spotify-is-perfecting-the-art-of-the-playlist
    4. https://dl.acm.org/citation.cfm?id=1864770
    5. "Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modelling": https://arxiv.org/pdf/1810.12027.pdf
    6. "Deep Reinforcement Learning for Page-wise Recommendations": https://arxiv.org/pdf/1805.02343.pdf
    7. "Deep Reinforcement Learning for List-wise Recommendations": https://arxiv.org/pdf/1801.00209.pdf
    8. "Deep Reinforcement Learning Based RecSys Using Distributed Q Table": http://www.ieomsociety.org/ieom2020/papers/274.pdf
  • schedule  04:20 - 04:40 PM place Online

    The key aspect in solving ML problems in the telecom industry lies in continuous data collection and evaluation from different categories of customers and networks so as to track and dive into varying performance metrics. The KPIs form the basis of network monitoring, helping network/telecom operators to automatically add and scale network resources. Such smart automated systems are built with the objective of increasing customer engagement through enhanced customer experience and tracking anomalies in customer behavior with timely detection and correction. Further, the system is designed to scale and serve current LTE, 4G and upcoming 5G networks with minimal non-effective cell site visits and quick Root Cause Analysis (RCA).

    Network congestion has remained an ever-increasing problem. Operators have attempted a variety of strategies to match network demand with the capacity of existing infrastructure, as deploying additional network capacity is expensive. To keep the cost under control, operators apply control measures to attempt to allocate bandwidth fairly among users and throttle the bandwidth of users that consume excessive bandwidth. This approach has had limited success. Alternatively, techniques that utilize extra bandwidth for quality of experience (QoE) efficiency by over-provisioning the network have proved to be ineffective and inefficient due to the lack of proper estimation.

    The evolution of 5G networks would lead manufacturers and telecom operators to use high data-transfer rates, wide network coverage and low latency to build smart factories using automation, artificial intelligence and the Internet of Things (IoT). The application of advanced data science and AI can provide better predictive insights to improve network capacity-planning accuracy. Better network provisioning would yield better network utilization for both next-generation networks based on 5G technology and current LTE and 4G networks. Further, AI models can be designed to link application throughput with network performance, prompting users to plan their daily usage based on their current location and total monthly budget.

    In this talk, we will understand the current challenges in the telecom industry, the need for an AIOps platform, and the mission held by telecom operators and communication service providers across the world for designing such AI frameworks, platforms and best practices. We will see how increasing operator collaborations are helping to create, deploy and productionize AI platforms for different AI use-cases. We will study one industrial use-case (with code) based on real-time field research to predict network capacity. In this respect we will investigate how deep learning networks can be trained on large volumes of data at scale (millions of network cells), and how their use can help the upcoming 5G networks. We will also examine an end-to-end pipeline for hosting the scalable framework on Google Cloud, with special emphasis on Data Governance and Data Management. As the data volume is huge and the data needs to be stored in highly secured systems, we build our high-performing system with extra security features so that it can process millions of requests within a few milliseconds. As the session highlights parameters and metrics in creating the neural network, it also discusses the challenges and some of the key aspects involved in designing and scaling the system.

16:50

    Closing Keynote - 45 mins

17:45

    Closing Talk - 15 mins