filter_list help_outline
  • 20 Mins
    Demonstration
    Intermediate

    The key aspect in solving ML problems in telecom industry lies in continuous data collection and evaluation from different categories of customers and networks so as to track and dive into varying performance metrics. The KPIs form the basis of network monitoring helping network/telecom operators to automatically add and scale network resources. Such smart automated systems are built with the objective of increasing customer engagement through enhanced customer experience and tracking customer behavior anomaly with timely detection and correction. Further the system is designed to scale and serve current LTE, 4G and upcoming 5G networks with minimal non-effective cell site visits and quick identification of Root Cause Analysis (RCA).

    Network congestion has remained an ever-increasing problem. Operators have attempted a variety of strategies to match the network demand capacity with existing infrastructure, as the cost of deploying additional network capacities is expensive. To keep the cost under control, operators apply control measures to attempt to allocate bandwidth fairly among users and throttle the bandwidth of users that consume excessive bandwidth. This approach had limited success. Alternatively, techniques that utilize extra bandwidth for quality of experience (QOE) efficiency by over-provisioning the network has proved to be ineffective and inefficient due to lack of proper estimation.

    The evolution of 5G networks, would lead manufacturers and telecom operators to use high-data transfer rates, wide network coverage, low latency to build smart factories using automation, artificial intelligence and Internet of Things (IoT). The application of advanced data science and AI can provide better predictive insights to improve network capacity-planning accuracy. Better network provisioning would yield better network utilization for both next-generation networks based on 5G technology and current LTE and 4G networks. Further AI models can be designed to link application throughput with network performance, prompting users to plan their daily usage based on their current location and total monthly budget.

    In this talk, we will understand the current challenges in telecom industry, the need for an AIOPS platform, and the mission held by telecom operators, communication service providers across the world for designing such AI frameworks, platforms and best practices. We will see how increasing operator collaborations are helping to create, deploy and productionize AI platforms for different AI use-cases. We will study one industrial use-case (with code) based on real-time field research to predict network capacity. In this respect we will investigate how deep learning networks can be used to train large volumes of data at scale (millions of network cells), and how its use can help the upcoming 5G networks. We will also examine an end to end pipeline of hosting the scalable framework on Google Cloud with special emphasis on Data Governance and Data Mangement. As data volume is huge and data needs to be stored in highly secured systems, we build our high-performing system with extra security features that can process millions of request in an order of few mili-secs. As the session highlights parameters and metrics in creating the neural network, it also discusses the challenges and some of the key aspects involved in designing and scaling the system.

  • Ujwala Musku
    keyboard_arrow_down

    Ujwala Musku - Supply Path Optimization in Video Advertising Landscape

    Ujwala Musku
    Ujwala Musku
    Data Scientist II
    MiQ Digital
    schedule 7 months ago
    Sold Out!
    20 Mins
    Talk
    Beginner

    In the programmatic era, with a lot of players in the market, it is quite complex for a buyer to reach the destination, namely advertising slot from the source, namely publisher. Auction Duplication, internal deals between DSP & SSP, and fraudulent activities are making the existing complex route even more complex day by day. Due to the aforementioned reasons, it is fairly evident that a single impression is being sold through multiple routes by multiple sellers at multiple prices. The new dilemma that has emerged recently is: Which route/path should the buyer choose and what should be the fair price to pay?

    In this talk, we will discuss a framework that solves the problem of choosing the best path at the right price in programmatic Video Advertising. Initially, we will give an overview of all the different approaches tried i.e., Clustering, Classification Modelling, DEA, and Scoring based on Classification modeling. Out of these, DEA and Scoring Methodology had better results, and hence a detailed comparison of results and why a particular approach worked better will be illustrated. The final framework explains the two best-worked techniques: 1. Data Envelopment Analysis and 2.Scoring based on Classification Modeling. DEA is a non-parametric method used to rank the Unsupervised dataset of various supply paths by estimating the relative efficiencies. These efficiencies are calculated by comparing all the possible production frontiers of decision-making units (here supply paths). As a statistical and machine learning hybrid, the Scoring method calculates the score against each supply path, helping us decide whether a path is worth bidding.

    The results of these models are compared with each other to choose the best one based on campaign KPI i.e., CPM (Cost per 1000 impressions) and CPCV (Cost per completed view of the video ad). A 4 - 8% improvement in CPM is observed in multiple test video ad campaigns, however, there is a dip in the number of impressions delivered. This is tackled by including impressions as an input in both the techniques. These clear improvements in CPM indicate that the technique results in better ROI compared to the heuristic approach. This approach can be used in various sectors like Banks (determining Credit Score) and Retail Industries(supply path optimization in Operations).

  • Gunjan Dewan
    keyboard_arrow_down

    Gunjan Dewan - Developing a match-making algorithm between customers and Go-Jek products!

    Gunjan Dewan
    Gunjan Dewan
    Data Scientist
    Go-Jek
    schedule 8 months ago
    Sold Out!
    20 Mins
    Talk
    Beginner

    20+ products. Millions of active customers. Insane amount of data and complex domain. Come join me in this talk to know the journey we at Gojek took to predict which of our products a user is most likely to use next.

    A major problem we faced, as a company, was targeting our customers with promos and vouchers that were relevant to them. We developed a generalized model that takes into account the transaction history of users and gives a ranked list of our services that they are most likely to use next. From here on, we are able to determine the vouchers that we can target these customers with.

    In this talk, I will be talking about how we used recommendation engines to solve this problem, the challenges we faced during the time and the impact it had on our conversion rates. I will also be talking about the different iterations we went through and how our problem statement evolved as we were solving the problem.

  • Ravi Ranjan
    keyboard_arrow_down

    Ravi Ranjan - Deep Reinforcement Learning Based RecSys Using Distributed Q Table

    20 Mins
    Talk
    Intermediate

    Recommendation systems (RecSys) are the core engine for any personalized experience on eCommerce and online media websites. Most of the companies leverage RecSys to increase user interaction, to enrich shopping potential and to generate upsell & cross-sell opportunities. Amazon uses recommendations as a targeted marketing tool throughout its website that contributes 35% of its total revenue generation [1]. Netflix users watch ~75% of the recommended content and artwork [2]. Spotify employs a recommendation system to update personal playlists every week so that users won’t miss newly released music by artists they like. This has helped Spotify to increase its number of monthly users from 75 million to 100 million at a time [3]. YouTube's personalized recommendation helps users to find relevant videos quickly and easily which account for around 60% of video clicks from the homepage [4].

    In general, RecSys generates recommendations based on user browsing history and preferences, past purchases and item metadata. It turns out most existing recommendation systems are based on three paradigms: collaborative filtering (CF) and its variants, content-based recommendation engines, and hybrid recommendation engines that combine content-based and CF or exploit more information about users in content-based recommendation. However, they suffer from limitations like rapidly changing user data, user preferences, static recommendations, grey sheep, cold start and malicious user.

    Classical RecSys algorithm like content-based recommendation performs great on item to item similarities but will only recommend items related to one category and may not recommend anything in other categories as the user never viewed those items before. Collaborative filtering solves this problem by exploiting the user's behavior and preferences over the items in recommending items to the new users. However, collaborative filtering suffers from a few drawbacks like cold start, popularity bias, and sparsity. The classical recommendation models consider the recommendation as a static process. We can solve the static recommendation on rapidly changing user data by RL. RL based RecSys captures the user’s temporal intentions and responds promptly. However, as the user action and items matrix size increases, it becomes difficult to provide recommendations using RL. Deep RL based solutions like actor-critic and deep Q-networks overcome all the aforementioned drawbacks.

    Present systems suffer from two limitations, firstly considering the recommendation as a static procedure and ignoring the dynamic interactive nature between users and the recommender systems. Also, most of the works focus on the immediate feedback of recommended items and neglecting the long-term rewards based on reinforcement learning. We propose a recommendation system that uses the Q-learning method. We use ε-greedy policy combined with Q learning, a powerful method of reinforcement learning that handles those issues proficiently and gives the customer more chance to explore new pages or new products that are not so popular. Usually while implementing Reinforcement Learning (RL) to real-world problems both the state space and the action space are very vast. Therefore, to address the aforementioned challenges, we propose the multiple/distributed Q table approaches which can deal with large state-action space and that aides in actualizing the Q learning algorithm in the recommendation and huge state-action space.

    References:

    1. "https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers":https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers
    2. "https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429":https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429
    3. "https://www.bloomberg.com/news/articles/2016-09-21/spotify-is-perfecting-the-art-of-the-playlist":https://www.bloomberg.com/news/articles/2016-09-21/spotify-is-perfecting-the-art-of-the-playlist
    4. "https://dl.acm.org/citation.cfm?id=1864770":https://dl.acm.org/citation.cfm?id=1864770
    5. "Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modelling": https://arxiv.org/pdf/1810.12027.pdf
    6. "Deep Reinforcement Learning for Page-wise Recommendations": https://arxiv.org/pdf/1805.02343.pdf
    7. "Deep Reinforcement Learning for List-wise Recommendations": https://arxiv.org/pdf/1801.00209.pdf
    8. "Deep Reinforcement Learning Based RecSys Using Distributed Q Table": http://www.ieomsociety.org/ieom2020/papers/274.pdf
  • Amogh Kamat Tarcar
    keyboard_arrow_down

    Amogh Kamat Tarcar - Privacy Preserving Machine Learning Techniques

    20 Mins
    Demonstration
    Intermediate

    Privacy preserving machine learning is an emerging field which is in active research. The most prolific successful machine learning models today are built by aggregating all data together at a central location. While centralised techniques are great , there are plenty of scenarios such as user privacy, legal concerns ,business competitiveness or bandwidth limitations ,wherein data cannot be aggregated together. Federated Learningcan help overcome all these challenges with its decentralised strategy for building machine learning models. Paired with privacy preserving techniques such as encryption and differential privacy, Federated Learning presents a promising new way for advancing machine learning solutions.

    In this talk I’ll be bringing the audience upto speed with the progress in Privacy preserving machine learning while discussing platforms for developing models and present a demo on healthcare use cases.

  • Kuldeep Singh
    keyboard_arrow_down

    Kuldeep Singh - Simplify Experimentation, Deployment and Collaboration for ML and AI Models

    20 Mins
    Demonstration
    Intermediate

    Machine Learning and AI are changing or would say have changed the way how businesses used to behave. However, the Data Science community is still lacking good practices for organizing their projects and effectively collaborating and experimenting quickly to reduce “time to market”.

    During this session, we will learn about one such open-source tool “DVC”
    which can help you in helping ML models shareable and reproducible.
    It is designed to handle large files, data sets, machine learning models, metrics as well as code

  • 20 Mins
    Talk
    Advanced

    In the fast moving world today, rare events are becoming increasingly common. Ranging from studying incidents of safety hazards to identifying transaction fraud, they all fall under the radar of rare events. Identifying and studying rare events become of crucial importance, particularly when the underlying event conforms to a sensitive or an adverse issue. The thing to note here is, despite the probability of occurrence being very close to zero, the potential specification of the rare event could be quite extensive. For example, within the parent rare event of Product Safety, there could be multiple types of potential hazard (Fire, Electrical, Pharmaceutical, etc.), rendering the sub-classes rarer still. In this talk, we are going to discuss a novel algorithm designed to study a rare event and its sub-classes over time with primary focus on forecast and detecting anomalies.

    The anomalies studied here are relative anomalies i.e., they may not contribute to the long-term trend of the rare time series but represent deviation from the base state as seen in the immediate past.

    Keywords:

    Product Safety

    Fire Hazards

    Transaction Fraud

    Malware Detection

    Relevant Sister Classes

    Count Time Series

    Non Homogeneous Poisson Process

    Discrete Space Optimization

    Text Mining

    GloVe

    Sparsity Treatment in Text

    Dynamic Time Warping

    Discords

    Anomaly Detection

    Deviation Score

    Relative Local Density

    Density Based Clustering

  • Dr. Sri Vallabha Deevi
    keyboard_arrow_down

    Dr. Sri Vallabha Deevi - Machine health monitoring with AI

    20 Mins
    Demonstration
    Intermediate

    Predictive maintenance is the most recent technique in maintenance engineering. Machine operational parameters are used to assess the health of equipment and decide on maintenance schedule. In Aviation, aircraft engine manufacturers continuously monitor their engine parameters in flight to evaluate performance and deviations from normal.

    Application of AI in this field enables measurement of behavior that is not observable using traditional means. AI based monitoring provides the edge required to operate in Industry 4.0 where connected machines do away with buffers in between processes and any unscheduled downtime of one machine effects the entire production chain.

    This demonstration will walk you through the development of AI models using IoT data for one of the largest metal manufacturing company in India. It will help you master different types of AI models to answer questions like

    • When do I plan the maintenance of a given equipment?
    • Will a component last till the next maintenance cycle or do I replace it during the current maintenance?
    • How to identify faulty equipment in the long production line?
  • POOJA BALUSANI
    keyboard_arrow_down

    POOJA BALUSANI - Model Interpretability and Explainable AI in Manufacturing

    20 Mins
    Talk
    Intermediate

    In this talk, we present an industrial use case on “anomaly detection” in steel mills based on IoT sensor data. In large steel mills and manufacturing plants, the top reasons for unplanned downtime are:
    • Failure of critical asset
    • Quality spec of the end product in line not being met
    • Operational limits outside the recommended range (e.g. process, human-safety, equipment-safety, etc.)

    Unplanned downtime or line stoppage leads to loss of production or throughput and revenue loss.

    Anomaly detection can serve as an early warning system, providing alerts on anomalous behavior that could be detrimental to the equipment health or affect process quality. In this work, we are performing multi-variate anomaly detection on time-series sensor data in a steel mill to help the maintenance engineers and process operators take proactive actions and help reduce plant downtime. Anomaly is presented to the customer in terms of:
    • “time-intervals” – startTime: endTime chunks that exhibit deviant behavior
    • “anomaly-state” – type association of anomaly to a specific pattern or cluster state
    • “anomaly-contribution” – priority association to sensor signals that exhibited deviant behavior within the multi-variate list (more like signal importance)

    We shall introduce the approach, where we reformulate the unsupervised modeling to a supervised formulation to incorporate SHAP, LIME, and other explainable tools. We shall illustrate the steps to provide the above-mentioned meta-data for an anomaly to make it explainable and consumable for the end-customer.

  • Akshay Bahadur
    keyboard_arrow_down

    Akshay Bahadur - Indian Sign Language Recognition (ISLAR)

    Akshay Bahadur
    Akshay Bahadur
    SDE-I
    Symantec Softwares
    schedule 7 months ago
    Sold Out!
    20 Mins
    Demonstration
    Beginner

    Sample this – two cities in India; Mumbai and Pune, though only 80kms apart have a distinctly varied spoken dialect. Even stranger is the fact that their sign languages are also distinct, having some very varied signs for the same objects/expressions/phrases. While regional diversification in spoken languages and scripts are well known and widely documented, apparently, this has percolated in sign language as well, essentially resulting in multiple sign languages across the country. To help overcome these inconsistencies and to standardize sign language in India, I am collaborating with the Centre for Research and Development of Deaf & Mute (an NGO in Pune) and Google. Adopting a two-pronged approach: a) I have developed an Indian Sign Language Recognition System (ISLAR) which utilizes Artificial Intelligence to accurately identify signs and translate them into text/vocals in real-time, and b) have proposed standardization of sign languages across India to the Government of India and the Indian Sign Language Research and Training Centre.

    As previously mentioned, the initiative aims to develop a lightweight machine-learning model, for 14 million speech/hearing impaired Indians, that is suitable for Indian conditions along with the flexibility to incorporate multiple signs for the same gesture. More importantly, unlike other implementations, which utilize additional external hardware, this approach, which utilizes a common surgical glove and a ubiquitous camera smartphone, has the potential of hardware-related savings at an all-India level. ISLAR received great attention from the open-source community with Google inviting me to their India and global headquarters in Bangalore and California, respectively, to interact with and share my work with the TensorFlow team.

  • Soumya Jain
    keyboard_arrow_down

    Soumya Jain - Unsupervised learning approach for identifying retail store employees using footfall data

    Soumya Jain
    Soumya Jain
    Data Scientist II
    MiQ Digital
    schedule 7 months ago
    Sold Out!
    20 Mins
    Talk
    Beginner

    Analysis of customer visits (or footfall) in the store traced via geolocation enabled devices, helps digital firms understand customers and their buying behavior better. Insights gained through geo footfall analysis help clients and advertisers make an informed decision, choose profitable regions, recognize relevant advertising opportunities and analyze their competitors to increase the success rate. But all this information can be disingenuous if people who walk past the store without entering, and staff of the store are not excluded. Therefore, two groups of people contributing to the footfall at the store can be considered outliers - people passing by the store, and employees of the store. The behavior of these outliers is expected to be different from the actual customers.

    Since the data collected by geofencing the stores and pings from the SDK of the geo-enabled devices do not contribute much in tagging these outliers exclusively, these outliers are not very evident and cannot be removed by extreme value analysis. To tackle this problem we have formulated a multivariate approach to identify and remove these outliers from our source data. As we have no labeled data that marks a footfall as an employee or customer, we are using an unsupervised outlier detection model using the DBSCAN algorithm to provide a coherent and complete dataset with the labeled outliers. In this process, different techniques were taken into consideration to handle the effectiveness of features. Features like time spent by a visitor in and around the stores compared to other locations, monthly visit frequency, daily visit frequency, etc. were dominant in tagging the outliers.

    Discovering the structure of data was another key step to optimize parameters of the DBSCAN algorithm for our use case namely, epsilon and minimal points.

    Finally, the evaluation was done against the results obtained with that of the k-means algorithm, which showed that DBSCAN has a higher detection rate and a low rate of false positives in discovering outliers for the given problem statement.

  • Darshan Ganji
    keyboard_arrow_down

    Darshan Ganji / Deepesh Agrawal - On-Demand Accelerating Deep Neural Network Inference via Edge Computing

    20 Mins
    Demonstration
    Advanced

    Deep Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on mobile phones and embedded systems with limited hardware resources and taking more time for Inference and Training. For many mobile-first companies such as Baidu and Facebook, various apps are updated via different app stores, and they are very sensitive to the size of the binary files. For example, App Store has the restriction “apps above 100 MB will not download until you connect to Wi-Fi”. As a result, a feature that increases the binary size by 100MB will receive much more scrutiny than one that increases it by 10MB. It is challenging to run computation-intensive DNN-based tasks on mobile devices due to the limited computation resources.

    This talk introduces the Algorithms and Hardware that can be used to accelerate the Inferencing or reduce the latency of deep learning workloads. We will discuss how to compress the Deep Neural Networks and techniques like Graph Fusion, Kernel Auto-Tuning for accelerating inference, as well as Data and model parallelization, automatic mixed precision, and other techniques for accelerating training. We will also discuss specialized hardware for deep learning such as GPUs, FPGAs, and ASICs, including the Tensor Cores in NVIDIA’s Volta GPUs as well as Google’s Tensor Processing Units (TPUs). We will also discuss the Deployment of the Large Size Deep Learning Models on the Edge devices like NVIDIA Jetson Nano, Google's Edge TPU(Coral).

    Keywords: Graph Optimization, Tensor Fusion, Kernel Auto Tuning, Pruning, Weight sharing, quantization, low-rank approximations, binary networks, ternary networks, Winograd transformations, data parallelism, model parallelism, mixed precision, FP16, FP32, model distillation, Dense-Sparse-Dense training, NVIDIA Volta, Tensor Core, Google TPU.

  • 20 Mins
    Talk
    Intermediate

    A large fraction of work in NLP work in academia and research groups deals with clean datasets that are much more structured and free of noise. However, when it comes to building real-world NLP applications, one often has to collect data from applications such as chats, user-discussion forums, social-media conversations, etc. Invariably all NLP applications in industrial settings that have to deal with much more noisy and varying data - data with spelling mistakes, typos, acronyms, emojis, embedded metadata, etc. 

    There is a high level of disparity between the data SOTA language models were trained on & the data these models are expected to work on in practice. This renders most commercial NLP applications working with noisy data unable to take advantage of SOTA advances in the field of language computation.

    Handcrafting rules and heuristics to correct this data on a large scale might not be a scalable option for most industrial applications. Most SOTA models in NLP are not designed keeping in mind noise in the data. They often give a substandard performance on noisy data.

    In this talk, we share our approach, experience, and learnings from designing a robust system to clean noise in data, without handcrafting the rules, using Machine Translation, and effectively making downstream NLP tasks easier to perform.

    This work is motivated by our business use case where we are building a conversational system over WhatsApp to screen candidates for blue-collar jobs. Our candidate user base often comes from tier-2 and tier-3 cities of India. Their responses to our conversational bot are mostly a code mix of Hindi and English coupled with non-canonical text (ex: typos, non-standard syntactic constructions, spelling variations, phonetic substitutions, foreign language words in a non-native script, grammatically incorrect text, colloquialisms, abbreviations, etc). The raw text our system gets is far from clean well-formatted text and text normalization becomes a necessity to process it any further.

    This talk is meant for computational language researchers/NLP practitioners, ML engineers, data scientists, senior leaders of AI/ML/DS groups & linguists working with non-canonical resource-rich, resource-constrained i.e. vernacular  & code-mixed languages.

  • 20 Mins
    Talk
    Intermediate

    Anomaly Detection have been one of most sought after analytical solutions for businesses operating in the domain of Network Operation, Service Operation, Manufacturing etc. and many other sectors where continuity of operations is essential. Any degradation in operational service or an outage, implies high losses and possible customer churn. The data in such real world applications is generally noisy, have complex patterns and often correlated.

    There are techniques like Auto-Encoders available for modelling complex patterns, but they can't explain the cause in original feature space. The traditional univariate anomaly detection techniques uses the z-score and p-value methods. These rely upon unimodality and choice of correct parametric form. If assumptions are not satisfied then there would be a high number of False-Positives and False-Negatives.

    This is where the need for estimating a PDF (Probability Density Function) arises that too without assuming a prior parametric form i.e. Non-Parametric approach. The PDF needs to be modelled as close to the true distribution as possible. That is it should have a low bias and low variance to avoid over-smoothing and under-smoothing. Only then we would have better chances of identifying true anomalies.

    Approaches like KDE - Kernel Density Estimation assist in such non-parametric estimations. As per research the type of kernel has a lesser role to play than the bandwidth for a good PDF estimation. The default bandwidth selection technique used in both Python and R packages over-smooths the PDF and is not suitable for Anomaly Detection.

    We will explain another method, where we run optimisation over a cost function based on modelling Gaussian kernel via FFT (Fast Fourier Transform), to obtain the appropriate bandwidth. Then we will show how we can apply it for Anomaly Detection even when the data is multi-modal (have multiple peaks) and the distribution can be of any shape.

    Based on research paper under publication "Optimal Kernel Density Estimation using FFT based cost function", currently scheduled for ICDM 2020, New York

  • 45 Mins
    Talk
    Intermediate

    Short Abstract

    It is a well known fact that the more data we have, the better performance ML models can achieve. However, getting a large amount of training data annotated is a luxury most practitioners cannot afford. Computer vision has circumvented this via data augmentation techniques and has reaped rich benefits. Can NLP not do the same? In this talk we will look at various techniques available for practitioners to augment data for their NLP application and various bells and whistles around these techniques.

     

    Long Abstract

    In the area of AI, it is a well established fact that data beats algorithms i.e. large amounts of data with a simple algorithm often yields far superior results as compared to the best algorithm with little data. This is especially true for Deep learning algorithms that are known to be data guzzlers. Getting data labeled at scale is a luxury most practitioners cannot afford. What does one do in such a scenario?

     

    This is where Data augmentation comes into play. Data augmentation is a set of techniques to increase the size of datasets and introduce more variability in the data. This helps to train better and more robust models. Data augmentation is very popular in the area of computer vision. From simple techniques like rotation, translation, adding salt etc to GANs, we have a whole range of techniques to augment images. It is a well known fact that augmentation is one of the key anchors when it comes to success of computer vision models in industrial applications.

     

    Most natural language processing (NLP) projects in industry still suffer from data scarcity. This is where recent advances in data augmentation for NLP can come very helpful. When it comes to NLP, data augmentation is not that straight forward. You want to augment data while keeping the syntactic and semantic properties of the text. In this talk we will take a deep dive into the world of various techniques that are available to practitioners to augment data for NLP. The talk is meant for Data Scientists, NLP engineers, ML engineers and industry leaders working on NLP problems.

  • Priyanshu Jain
    keyboard_arrow_down

    Priyanshu Jain - Automated Ticket Routing for Large Enterprises

    20 Mins
    Talk
    Intermediate

    Large enterprises that provide services to consumers may receive millions of customer complaint tickets every month. Handling these tickets on time is very critical, as this directly impacts the quality of service and network efficiency.

    A ticket may be assigned to multiple teams before it gets resolved. Assigning a ticket to an appropriate group is usually done manually as the complaint information provided by the customer is not very specific and maybe inaccurate sometimes. This manual process incurs enormous labor costs and is very time inefficient as each ticket may end up in the queue for hours.

    In this talk, we will present an approach to automate the process of ticket routing completely. We will start by discussing how we can use Markov Chains to model the flow of tickets across different teams. Next, we will discuss the feature engineering part and why Factorization Machine Models are essential for such a use case. This will be followed by a discussion on the learning of decision rule sets in a supervised manner. These decision rules can be used to traverse tickets across multiple teams in an automated fashion. Thus, automating the complete process of ticket routing. We will also discuss that the proposed framework can be validated easily by SMEs, unlike other AI solutions, thus, resulting in its quick acceptability in an organization. Finally, we will go through the different settings in which this solution can fit, therefore, resulting in its broad applicability.

    The framework can provide substantial cost savings to enterprises. It can also reduce Response time to tickets significantly by almost eliminating the queue time. Overall, it can help large enterprises in

    1. Saving costs by reducing the workforce of ticket handling team

    2. Increasing revenue by improving quality of customer experience

  • Dr. Manjeet Dahiya
    keyboard_arrow_down

    Dr. Manjeet Dahiya - Learning Maps from Geospatial Data Captured by Logistics Operations

    20 Mins
    Talk
    Beginner

    Summary

    Logistics operations produce a huge amount of geospatial data and this talk tells how we can use it to create a mapping service such as Google Maps and Here Maps!

    Abstract

    E-commerce and logistics operations produce a vast amount of geospatial data while moving and delivering packages. As a logistics company supporting the e-commerce operations in multiple Asian countries, Delhivery produces over 50 million geo-coordinates daily. These geo-coordinates represent the movement of trucks and bikes or delivery events to the given postal addresses. The data has great potential to mine geospatial knowledge, and we demonstrate that a mapping service similar to Google Maps and Here Maps can be automatically built using the same. Specifically, we describe the learning of regional maps (localities, cities, etc) from the addresses labeled with geo-coordinates and the learning of roads from the geo-coordinates associated with movement.

    We propose an algorithm to construct polygons and polylines of the map entities given a set of geo-coordinates. The algorithm involves non-parametric spatial probability modelling of the map entities followed by classification of the cells in a hexagonal grid to the respective map entity. We show that our algorithm is capable of handling noise, which is significantly high in our setting due to various reasons such as scale and device issues. A property about the noise and the correct information is presented such that our algorithm infers a correct map entity. We quantitatively measure the accuracy of our system by comparing its output with the available ground truth. We will showcase some localities that have incorrect polygons in Google Maps whereas we can learn the correct version by our data and algorithm. We also discuss multiple applications of the generated maps in the context of e-commerce and logistics operations.


    A part of this work was accepted for publication at ACM/SIGAPP Symposium On Applied Computing 2020:

    "Learning Locality Maps from Noisy Geospatial Labels. In SAC 2020 at Brno, Czech Republic"

  • 20 Mins
    Talk
    Beginner

    This talk focuses on the topic of querying industry grade big data systems. Enterprises have vast amount of information spread across structured data stores (relational databases, data warehouses, etc.). Descriptive analytics over this data is limited to experts familiar with complex querying languages (e.g., Structured Query Language) as well as metadata and schema associated with such large datastores. The ability to convert natural language questions to SQL statements would make descriptive analytics and reporting much easier and widespread. Problem of automatically converting natural language questions to SQL is well studied, viz., Natural Language Interface to Databases (NLIDB). We present our work on an end-to-end (E2E) system focussed on NLIDB.

    We describe two main aspects of E2E NLIDB systems: i) Converting natural language to structured language and ii) understanding natural language. There is a plenitude of applications of such E2E systems across domains e.g., healthcare, finance, logistics, etc.

  • Ashwathi Nambiar
    keyboard_arrow_down

    Ashwathi Nambiar - Quantization To The Rescue: An Edge AI Story

    20 Mins
    Talk
    Intermediate

    Over the last decade, deep neural networks have brought in a resurgence in artificial intelligence, with machines outperforming humans in some of the most popular image recognition problems. But all that jazz comes with its costs – high compute complexity and large memory requirements. These requirements translate to higher power consumption resulting in steep electricity bills and a sizeable carbon footprint. Optimizing model size and complexity thus becomes a necessity for a sustainable future for AI.
    Memory and compute complexity optimizations also bring in the promise of unimaginable possibilities with edge AI - self-driving cars, predictive maintenance, smart speakers, body monitoring are only the beginning. The smartphone market, with its reach to nearly 4 billion people, is only a fraction of the potential edge devices waiting to be truly ‘smart’. Think smart hospitals or mining, oil and gas industrial automation and so much more.

    In this session we will talk about,

    • Challenges in deep neural network (DNN) deployment on embedded systems with resource constraints
    • Quantization, which has been popularly used in mathematics and digital signal processing to map values from a large often continuous set to values in a countable smaller set, now reimagined as a possible solution for compressing DNNs and accelerating inference.
      It is gaining popularity not only with machine learning frameworks like MATLAB, TensorFlow and PyTorch but also amidst hardware toolchains like NVIDIA® TensorRT and Xilinx® DNNDK. The core idea behind quantization is the resiliency of neural networks to noise. Deep neural networks, in particular, are trained to pick up key patterns and ignore noise. This means that the networks can cope with small changes resulting from quantization error, as backed by research indicating minimal impact of quantization on overall accuracy of the network. This, coupled with significant reduction in memory footprint, power consumption, and gains in computational speed, makes quantization an efficient approach for deploying neural networks to embedded hardware.
    • Example of a quantization solution for an object detection problem
  • Kavita Dwivedi
    keyboard_arrow_down

    Kavita Dwivedi - Portfolio Valuation for a Retail Bank using Monte Carlo Simulation and Forecasting for Risk Measurement

    Kavita Dwivedi
    Kavita Dwivedi
    Head Data Science
    ISM
    schedule 7 months ago
    Sold Out!
    20 Mins
    Demonstration
    Advanced

    Banks today need to have a very good assessment of their portfolio value at any point in time . This is both a regulatory requirement and an operational metrics which helps banks to assess risk of their portfolio and also calculate the Capital Adequacy that they need to maintain at portfolio levels , product levels and all of these aggregated at Bank level.

    This presentation will walk you through a case study which will discuss in detail how we went about calculating Portfolio value for a Home loan on a sample data . The bank wanted a scientific /statistical approach to this as they could take this to regulators for approval and thus convince them about the capital that they have for a particular portfolio.

    The other interesting dimension was that in case the bank wants to sell a particular loan book to another bank /third party financial institutions they would be able to quote a price within the confidence interval of the calculated price. The same model/tool could be also shared with the buyer to convince them on quoted price and will make the negotiation and selling smooth.

    We have used Monte Carlo Simulation on historical data of the portfolio to measure the Portfolio Value for the next 5 years of a Home loan Portfolio. It is a two step modeling process with Machine Learning Models to predict default and then further using simulation to calculate Portfolio value year on year for next 5 yrs taking in account diminishing returns too.

    The presentation will take you through the approach and modeling process and how Monte Carlo Simulation helped us deliver the same to Customer with high accuracy and confidence level.

    This is a real case study and will focus on why Risk Measurement is important and why Basel , CCAR implementation across banks worldwide helps the Central Banks to manage risks in case of a financial downturn or Black Swan events.