Bengaluru · Aug 8th, 05:30 - 06:15 PM IST · Grand Ball Room 2 · 112 Interested

A major application of AI is leveraging it to create a safer world around us and to help people make better choices. The open source revolution has taken the world by storm, and developers rely on a huge number of upstream third-party dependencies (too many to choose from: http://www.modulecounts.com/) to build applications that move petabytes of sensitive data and run mission-critical code where failures can be disastrous. Now more than ever, we need better developer tooling that helps developers make safer, better choices about their dependencies and gives them deeper insights into the code they are using. Thanks to deep learning, we are able to tackle these complex problems, and this talk covers two diverse and interesting problems we have been solving with deep learning models (recommenders and NLP).

Though we are data scientists, at heart we are also developers building intelligent systems powered by AI. We, the Red Hat developer group, seek to do the same through our “Dependency Analytics” platform and extension. We call this 'AI-based insights for developers, by developers'!

In this session we will go into the details of the deep learning models we have implemented and deployed to solve two major problems:

  1. Dependency Recommendations: Recommend dependencies for a user's specific application stack by inferring their intent with deep learning based recommender models.
  2. Pro-active Security and Vulnerability Analysis: We will also touch upon how our platform aims to make developer applications safer by way of CVE (Common Vulnerabilities and Exposures) analyses, and the experimental deep learning models for NLP we have built to proactively identify potential vulnerabilities.

This will be followed by a short architectural overview of the entire platform.

If we have enough time, we also intend to showcase some of the sample code behind these deep learning models as a short tutorial and walk through it!


Outline/Structure of the Tutorial

The intent of this talk is two-fold: we not only cover the work we have been doing for the last two years, but also focus on how open-source tools, techniques and the latest state-of-the-art models in AI can be leveraged to solve problems in a really tough domain - helping developers increase their productivity and confidence.

The focus will be on two major areas - providing dependency recommendations for developers, and proactively identifying security vulnerabilities with deep learning and NLP. We will divide our talk into the following two major parts, followed by a brief overview of our platform architecture and how we deploy and scale our models in production:

  • Part 1: AI models for dependency recommendations [15 - 20 mins]:
    • Depending on the ecosystem a model targets, we use either deep learning based or collaborative filtering based approaches to recommend dependencies to a user.
    • Architecture of the models, data pre-processing and automated training pipelines
    • Insights into the types of recommendation models used: deep learning and collaborative filtering
      • Leveraging generative deep learning models like Variational Autoencoders combined with Probabilistic Matrix Factorization to build a hybrid recommender that we run in production for large ecosystems (see the sketch after this outline)
      • Hierarchical Poisson Factorization (collaborative filtering) for ecosystems that are not very metadata-rich
  • Part 2: Experimental AI models for vulnerability prediction [15 - 20 mins]: Security vulnerabilities in software, particularly in open-source and third-party libraries (dependencies) and frameworks, can cost an enterprise dearly, since teams are often unaware of potential vulnerabilities that might exist in a particular dependency or even a specific version of a dependency. The idea here is: can we proactively find and flag dependencies showing signs of a potential vulnerability before it becomes a serious issue affecting all downstream applications? In our solution, we focus on the entire OpenShift-Golang-Kubernetes ecosystem and all repositories and dependencies belonging to it. We leverage state-of-the-art deep learning models for NLP that go through GitHub issues, PRs and commits to predict potential security vulnerabilities (a sketch of such a model follows this outline).
    • Brief overview of security vulnerabilities and their impact on enterprises
    • Deep dive into the sequential deep learning models for NLP used to predict potential vulnerabilities (pre-trained embeddings, Bi-LSTMs/GRUs, attention models, etc.)
    • Ways to integrate this solution into the developer ecosystem
  • Conclusion: Platform overview - scaling models in production [5 mins]
    • A short architectural overview of how our AI components combine with the rest of our platform (you need well-engineered software to really tap into the maximal potential of AI)
    • A demo/overview of how we containerize our models and micro-services and scale them with Docker/Kubernetes/OpenShift (time permitting)
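
To make Part 1 concrete, here is a minimal sketch of a VAE-style collaborative-filtering recommender over a binary stack-by-dependency matrix. This is illustrative only - the vocabulary size, latent dimension, KL weight and random training data below are placeholder assumptions, and our production hybrid (VAE combined with Probabilistic Matrix Factorization) is richer than this:

    # Minimal sketch of a VAE-based collaborative-filtering recommender
    # (illustrative; not the production Dependency Analytics model).
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers, Model

    n_deps, latent_dim = 5000, 64      # hypothetical dependency vocabulary

    class StackVAE(Model):
        def __init__(self):
            super().__init__()
            self.enc = layers.Dense(256, activation="tanh")
            self.mu = layers.Dense(latent_dim)
            self.logvar = layers.Dense(latent_dim)
            self.dec = layers.Dense(n_deps)          # logits over dependencies

        def call(self, x):
            h = self.enc(x)
            mu, logvar = self.mu(h), self.logvar(h)
            eps = tf.random.normal(tf.shape(mu))
            z = mu + tf.exp(0.5 * logvar) * eps      # reparameterization trick
            # KL term pulls the latent codes toward a unit Gaussian
            kl = -0.5 * tf.reduce_mean(tf.reduce_sum(
                1 + logvar - mu**2 - tf.exp(logvar), axis=1))
            self.add_loss(1e-3 * kl)                 # small beta weight
            return self.dec(z)

    model = StackVAE()
    model.compile(optimizer="adam",
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=True))
    # Rows are application stacks, columns are dependencies (1 = stack uses it)
    stacks = (np.random.rand(1024, n_deps) < 0.01).astype("float32")
    model.fit(stacks, stacks, epochs=1, batch_size=128)  # reconstruct stacks
    scores = model.predict(stacks[:1])   # score every dependency for one stack

Recommendations then come from ranking the reconstructed scores of dependencies that are not already present in a user's stack.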

Our intent is not just to talk about what we did, but also to show, to a good extent, how we did it with some sample tutorials depending on the scope.
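
For Part 2, here is a hedged sketch of the kind of sequential NLP model described above: a bidirectional GRU with simple additive attention pooling over the tokens of a GitHub issue, PR or commit message, scoring how likely the text is to be security-related. All sizes and hyperparameters are illustrative assumptions, not our experimental configuration:

    # Sketch of a Bi-GRU + attention classifier for flagging probable
    # vulnerability-related GitHub issues/PRs/commits (illustrative only).
    import tensorflow as tf
    from tensorflow.keras import layers, Model

    vocab_size, max_len, embed_dim = 20000, 200, 100   # placeholder sizes

    inp = layers.Input(shape=(max_len,))
    # In practice this embedding layer would be initialized from
    # pre-trained word vectors rather than trained from scratch.
    emb = layers.Embedding(vocab_size, embed_dim)(inp)
    seq = layers.Bidirectional(layers.GRU(64, return_sequences=True))(emb)
    # Additive attention pooling: score each time step, then weighted-sum
    score = layers.Dense(1, activation="tanh")(seq)     # (batch, T, 1)
    weights = layers.Softmax(axis=1)(score)             # attention weights
    context = layers.Lambda(
        lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([seq, weights])
    out = layers.Dense(1, activation="sigmoid")(context)  # P(security-related)

    model = Model(inp, out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC()])
    model.summary()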

Learning Outcome

Key Takeaways from this talk:

  • Learn how recommender engines work in a non-conventional setting - dependency recommendations!
  • Understand how generative deep learning models like Variational Autoencoders (VAEs) can be combined with traditional recommender models like Probabilistic Matrix Factorization to build robust hybrid recommender engines
  • Learn how we are tackling a unique problem - proactive identification of probable CVEs/security vulnerabilities using alternate open data sources
  • See the state-of-the-art deep learning models used in NLP for vulnerability identification (pre-trained embeddings, Bi-LSTMs/GRUs, attention models, etc.)
  • Get a brief look at leveraging containers/Kubernetes/OpenShift to scale and maintain highly available AI models in production (focus on deployment, scalability and availability)

Target Audience

Data Scientists, Engineers, Developers, Managers, AI and Data Enthusiasts

Prerequisites for Attendees

Participants are expected to know what AI, machine learning and deep learning are, along with some basics of the data science lifecycle: data, features, modeling and evaluation. Some examples will be shown in Python, so basic knowledge of Python helps. Familiarity with general software engineering principles and components like containers is useful but not mandatory.


  • Dr. Dakshinamurthy V Kolluru - Understanding Text: An exciting journey from Probabilistic Models to Neural Networks

    45 Mins
    Talk
    Intermediate

    We will trace the journey of NLP over the past 50-odd years, covering, chronologically, Hidden Markov Models, Elman networks, Conditional Random Fields, LSTMs, Word2Vec, Encoder-Decoder models, Attention models, transfer learning in text, and finally transformer architectures. Our emphasis will be on how the models became more powerful and simpler to implement at the same time. To demonstrate this, we take a few case studies solved at INSOFE with the primary goal of retaining accuracy while simplifying engineering. Traditional methods will be compared and contrasted against modern models, showing how the latest models are actually becoming easier for businesses to implement. We also explain how this enhanced comfort with text data is paving the way for state-of-the-art inclusive architectures.

  • Yogesh H. Kulkarni - MidcurveNN: Encoder-Decoder Neural Network for Computing Midcurve of a Thin Polygon

    45 Mins
    Talk
    Intermediate

    Various applications need lower-dimensional representations of shapes. A midcurve is a one-dimensional (1D) representation of a two-dimensional (2D) planar shape. It is used in applications such as animation, shape matching, retrieval, finite element analysis, etc. Methods available to compute midcurves vary based on the type of the input shape (images, sketches, etc.) and processing approaches such as Thinning, Medial Axis Transform (MAT), Chordal Axis Transform (CAT), Straight Skeletons, etc., all of which are rule-based.

    This presentation talks about a novel method called MidcurveNN, which uses an encoder-decoder neural network to compute the midcurve from images of 2D thin polygons in a supervised learning manner. The neural network learns this dimension-reduction transformation from an input 2D thin polygon image to an output 1D midcurve image, and can then compute the midcurve of an unseen 2D thin polygonal shape.
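
    The idea lends itself to a compact illustration. Below is a toy encoder-decoder sketch of the same image-to-image setup (binary polygon image in, midcurve image out); the 64x64 resolution and layer sizes are assumptions for illustration, not the MidcurveNN architecture itself:

        # Toy encoder-decoder for a polygon-image -> midcurve-image mapping
        # (illustrative sketch of the idea, not MidcurveNN itself).
        from tensorflow.keras import layers, models

        def build_net(size=64):
            inp = layers.Input(shape=(size, size, 1))   # binary polygon image
            x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
            x = layers.MaxPooling2D(2)(x)               # encode: downsample
            x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
            x = layers.MaxPooling2D(2)(x)
            x = layers.Conv2DTranspose(32, 3, strides=2, activation="relu",
                                       padding="same")(x)  # decode: upsample
            x = layers.Conv2DTranspose(16, 3, strides=2, activation="relu",
                                       padding="same")(x)
            out = layers.Conv2D(1, 1, activation="sigmoid")(x)  # midcurve mask
            return models.Model(inp, out)

        model = build_net()
        model.compile(optimizer="adam", loss="binary_crossentropy")
        # Trained on (polygon image, midcurve image) pairs, fully supervised.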

  • Dipanjan Sarkar - Explainable Artificial Intelligence - Demystifying the Hype

    45 Mins
    Tutorial
    Intermediate

    The field of Artificial Intelligence, powered by machine learning and deep learning, has gone through some phenomenal changes over the last decade. Starting off as a purely academic and research-oriented domain, it has seen widespread industry adoption across diverse domains including retail, technology, healthcare, science and many more. More often than not, the standard toolbox of machine learning, statistical or deep learning models remains the same. New models do come into existence, like Capsule Networks, but industry adoption of the same usually takes several years. Hence, in the industry, the main focus of data science or machine learning is more 'applied' than theoretical, and the effective application of these models on the right data to solve complex real-world problems is of paramount importance.

    A machine learning or deep learning model by itself consists of an algorithm which tries to learn latent patterns and relationships from data without hard-coding fixed rules. Hence, explaining how a model works to the business always poses its own set of challenges. In some domains in the industry, especially in the world of finance like insurance or banking, data scientists often end up having to use more traditional machine learning models (linear or tree-based), because model interpretability is very important for the business to explain each and every decision the model takes. However, this often leads to a sacrifice in performance. This is where complex models like ensembles and neural networks typically give us better and more accurate performance (since true relationships are rarely linear in nature), yet we end up being unable to properly interpret model decisions.

    To address these gaps, I will take a conceptual yet hands-on approach where we explore some of these challenges around explainable artificial intelligence (XAI) and human-interpretable machine learning in depth, and even showcase some examples using state-of-the-art model interpretation frameworks in Python!
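
    As a flavor of what such frameworks look like in practice, here is a brief example using SHAP, one common Python interpretation library (an assumption for illustration - the session does not commit to any single framework), to explain a tree ensemble:

        # Explaining a tree ensemble with SHAP values (a minimal sketch).
        import shap
        from sklearn.datasets import load_breast_cancer
        from sklearn.ensemble import RandomForestClassifier

        X, y = load_breast_cancer(return_X_y=True, as_frame=True)
        model = RandomForestClassifier(n_estimators=100).fit(X, y)

        explainer = shap.TreeExplainer(model)   # tree-model-specific explainer
        shap_values = explainer.shap_values(X)  # per-feature contributions
        shap.summary_plot(shap_values, X)       # global feature-impact view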

  • 45 Mins
    Case Study
    Intermediate

    Imitation Learning has been the backbone of robots learning from a demonstrator's behavior. Join us to learn how to train a robot to perform tasks like acrobatics.

    Two branches of AI - Deep Learning and Reinforcement Learning - are now responsible for many real-world applications. Machine Translation, Speech Recognition, Object Detection, Robot Control, and Drug Discovery are some of the numerous examples.

    Both approaches are data-hungry - DL requires many examples of each class, and RL needs to play through many episodes to learn a policy. Contrast this with human intelligence. A small child can typically see an image just once, and instantly recognize it in other contexts and environments. We seem to possess an innate model/representation of how the world works, which helps us grasp new concepts and adapt to new situations fast. Humans are excellent one/few-shot learners. We are able to learn complex tasks by observing and imitating other humans (e.g. cooking, dancing or playing soccer) - despite having a different point of view, sense modalities, body structure and mental faculties.

    Humans may be very good at picking up novel tasks, but Deep RL agents surpass us in performance. Once a Deep RL agent has learned a good representation [1], it can easily surpass human performance in complex tasks like Go [2], Dota 2 [3], and StarCraft [4]. We are biologically limited by time, memory and computation (a computer can be made to simulate thousands of plays in a minute).

    RL struggles with tasks that have sparse rewards. Take the example of a soccer-playing robot controlled by applying a torque to each of its joints. The environment rewards it when it scores a goal. If the policy is initialized randomly (we apply a random torque to each joint every few milliseconds), the probability of the robot scoring a goal is negligible - it won't even be able to learn how to stand up. In tasks requiring long-term planning or low-level skills, getting to that initial reward can prove impossible. These situations can greatly benefit from a demonstration - in this case, showing the robot how to walk and kick - and then letting it figure out how to score a goal.

    We have an abundance of visual data of humans performing various tasks in the public domain, in the form of videos from sources like YouTube. On YouTube alone, 400 hours of video are uploaded every minute, and it is easy to find demonstration videos for any skill imaginable. What if we could harness this by designing agents that could learn how to perform tasks just by watching a video clip?

    Imitation Learning, also known as apprenticeship learning, teaches an agent a sequence of decisions through demonstration, often by a human expert. It has been used in many applications, such as teaching drones how to fly [5] and autonomous cars how to drive [6], but it relies on domain-engineered features or extremely precise representations such as mocap [7]. Directly applying imitation learning to learn from videos proves challenging: there is a misalignment of representation between the demonstrations and the agent's environment. For example, how can a robot sensing its world through a 3D point cloud learn from a noisy 2D video clip of a soccer player dribbling?

    Leveraging recent advances in Reinforcement Learning, Self-Supervised Learning and Imitation Learning [8] [9] [10], we present a technical deep dive into an end-to-end framework which:

    1) Has prior knowledge about the world through Self-Supervised Learning - a relatively new area which seeks to build efficient deep learning representations from unlabelled data by training on a surrogate task. The surrogate task can be rotating an image and predicting the rotation angle, or cropping two patches of an image and predicting their relative positions - or a combination of several such objectives (see the sketch after this list).

    2) Has the ability to align the representation of how it senses the world with that of the video, allowing it to learn diverse tasks from video clips.

    3) Has the ability to reproduce a skill from only a single demonstration, using applied techniques from imitation learning.
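
    To make the surrogate-task idea in 1) concrete, here is a minimal sketch of rotation prediction on unlabelled images (the sizes and random stand-in data are placeholders; the actual framework combines several such objectives):

        # Self-supervised pretext task: rotate each unlabelled image by
        # 0/90/180/270 degrees and train a network to predict the rotation.
        import numpy as np
        from tensorflow.keras import layers, models

        def make_rotation_batch(images):
            """images: (N, H, W, C) array of unlabelled images."""
            ks = np.random.randint(0, 4, size=len(images))   # rotation class
            rotated = np.stack([np.rot90(im, k) for im, k in zip(images, ks)])
            return rotated, ks

        net = models.Sequential([
            layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
            layers.MaxPooling2D(2),
            layers.Conv2D(64, 3, activation="relu"),
            layers.GlobalAveragePooling2D(),
            layers.Dense(4, activation="softmax"),           # which rotation?
        ])
        net.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                    metrics=["accuracy"])

        unlabelled = np.random.rand(256, 32, 32, 3).astype("float32")
        x, y = make_rotation_batch(unlabelled)
        net.fit(x, y, epochs=1)   # learned features transfer to downstream tasks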

    [1] https://www.cse.iitb.ac.in/~shivaram/papers/ks_adprl_2011.pdf

    [2] https://ai.google/research/pubs/pub44806

    [3] https://openai.com/five/

    [4] https://deepmind.com/blog/alphastar-mastering-real-time-strategy-game-starcraft-ii/

    [5] http://cs231n.stanford.edu/reports/2017/pdfs/614.pdf

    [6] https://arxiv.org/pdf/1709.07174.pdf

    [7] https://en.wikipedia.org/wiki/Motion_capture

    [8] https://arxiv.org/pdf/1704.06888v3.pdf

    [9] https://bair.berkeley.edu/blog/2018/06/28/daml/

    [10] https://arxiv.org/pdf/1805.11592v2.pdf

  • Anuj Gupta - Natural Language Processing Bootcamp - Zero to Hero

    480 Mins
    Workshop
    Intermediate

    Data is the new oil, and unstructured data - especially text, images and videos - contains a wealth of information. However, due to the inherent complexity of processing and analyzing this data, people often refrain from spending the extra time and effort to venture out from structured datasets and analyze these unstructured sources, which can be a potential gold mine. Natural Language Processing (NLP) is all about leveraging tools, techniques and algorithms to process and understand natural-language-based unstructured data - text, speech and so on.

    Being specialized in domains like computer vision and natural language processing is no longer a luxury but a necessity which is expected of any data scientist in today’s fast-paced world! With a hands-on and interactive approach, we will understand essential concepts in NLP along with extensive case studies and hands-on examples to master state-of-the-art tools, techniques and frameworks for actually applying NLP to solve real-world problems. We leverage Python 3 and the latest and best state-of-the-art frameworks including NLTK, Gensim, SpaCy, Scikit-Learn, TextBlob, Keras and TensorFlow to showcase our examples. You will be able to learn a fair bit of machine learning as well as deep learning in the context of NLP during this bootcamp.

    In our journey in this field, we have struggled with various problems, faced many challenges, and learned various lessons over time. This workshop is our way of giving back a major chunk of the knowledge we’ve gained in the world of text analytics and natural language processing, where building a fancy word cloud from a bunch of text documents is not enough anymore. You might have had questions like ‘What is the right technique to solve a problem?’, ‘How does text summarization really work?’ and ‘Which are the best frameworks to solve multi-class text categorization?’ among many other questions! Based on our prior knowledge and learnings from publishing a couple of books in this domain, this workshop should help you avoid some of the pressing issues in NLP and learn effective strategies to master it.

    The intent of this workshop is to make you a hero in NLP so that you can start applying NLP to solve real-world problems. We start from zero and follow a comprehensive and structured approach to make you learn all the essentials in NLP. We will be covering the following aspects during the course of this workshop with hands-on examples and projects!

    • Basics of Natural Language and Python for NLP tasks
    • Text Processing and Wrangling
    • Text Understanding - POS, NER, Parsing
    • Text Representation - BOW, Embeddings, Contextual Embeddings
    • Text Similarity and Content Recommenders
    • Text Clustering
    • Topic Modeling
    • Text Summarization
    • Sentiment Analysis - Unsupervised & Supervised
    • Text Classification with Machine Learning and Deep Learning
    • Multi-class & Multi-Label Text Classification
    • Deep Transfer Learning and its promise
    • Applying Deep Transfer Learning - Universal Sentence Encoders, ELMo and BERT for NLP tasks
    • Generative Deep Learning for NLP
    • Next Steps

    With over 10 hands-on projects, the bootcamp will be packed with plenty of hands-on examples for you to go through, try out and practice, and we will try to keep theory to a minimum considering the limited time we have and the amount of ground we want to cover. We hope that at the end of this workshop you can take away some useful methodologies to apply when solving NLP problems in the future. We will be using Python to showcase all our examples.
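
    As a small taste of the text-classification track, here is a minimal scikit-learn pipeline (TF-IDF features plus a linear model) of the kind the bootcamp builds up from; the dataset and categories are illustrative choices, not workshop material:

        # Minimal text-classification pipeline: TF-IDF + logistic regression.
        from sklearn.datasets import fetch_20newsgroups
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        cats = ["sci.space", "rec.autos"]                  # toy 2-class task
        train = fetch_20newsgroups(subset="train", categories=cats)
        test = fetch_20newsgroups(subset="test", categories=cats)

        clf = make_pipeline(
            TfidfVectorizer(stop_words="english", ngram_range=(1, 2)),
            LogisticRegression(max_iter=1000),
        )
        clf.fit(train.data, train.target)
        print("test accuracy:", clf.score(test.data, test.target))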

  • Dr. Atul Singh - Endow the gift of eloquence to your NLP applications using pre-trained word embeddings

    45 Mins
    Talk
    Beginner

    Word embeddings are the plinth stones of Natural Language Processing (NLP) applications, used to transform human language into vectors that can be understood and processed by machine learning algorithms. Pre-trained word embeddings enable the transfer of prior knowledge about human language into a new application, thereby enabling the rapid creation of scalable and efficient NLP applications. Since the emergence of word2vec in 2013, the word embeddings field has developed by leaps and bounds, with each successive word embedding outperforming the prior one.

    The goal of this talk is to demonstrate the efficacy of using pre-trained word embeddings to create scalable and robust NLP applications, and to explain to the audience the underlying theory of word embeddings that makes this possible. The talk will cover prominent word embeddings such as BERT and ELMo from the recent literature.
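
    For a sense of how little code the "transfer of prior knowledge" takes with static embeddings, here is a minimal sketch using gensim's downloader (GloVe here is one illustrative choice; contextual models like ELMo and BERT have their own loading paths):

        # Loading pre-trained GloVe vectors via gensim (a minimal sketch).
        import gensim.downloader as api

        vectors = api.load("glove-wiki-gigaword-100")    # downloads on first use
        print(vectors.most_similar("language", topn=3))  # nearest neighbors
        print(vectors["word"].shape)                     # a (100,) dense vector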

  • Vishnu Murali - Deep learning for predictive maintenance: Towards Industry 4.0

    45 Mins
    Talk
    Intermediate

    Why does Industry 4.0 matter?

    Just 13% of organizations have attained the full impact of their digital investments, so empowering the rest is essential to unlock financial upside and drive digital expansion. The optimal combination of analytics/deep learning with IoT can save large enterprises and SMEs around $16 billion.

    What is predictive maintenance (PdM) of industrial physical assets?

    It is an online monitoring system that requires hardware and software components, including condition-monitoring sensors, gateways and modules to handle data processing and transmission, and a secured cloud server to handle data storage and analytics.

    Why is this important to industries?

    Cost, safety, availability, and reliability are the main reasons why key industrial players are investing in predictive maintenance. Predictive maintenance allows factories to monitor the condition of in-service equipment by measuring key parameters like vibration, temperature, pressure, and current. Such monitoring requires connected smart sensors featuring a high-speed signal chain, powerful processing, and wired and/or wireless connectivity.

    Solutions

    Considering the above sections, as with any machine learning implementation, there are hidden, underlying challenges involved in implementing PdM for industries.

    To tackle these, our research group has come up with a focused solution to seamlessly integrate machine learning algorithms with an industrial IoT platform. The real challenge is twofold: apart from the technical trials, there is also a need for agreement between plant engineers and the research community.

    Ambitious foresight

    • To bring awareness among engineers about Industry 4.0
    • To establish a technically sound way of implementing PdM
    • To provide deliverables and achieve ROI

    Keywords: Predictive maintenance, Industry 4.0, Behavioral change

  • Samiran Roy / Shibsankar Das - Semi-Supervised Insight generation from petabyte scale Text data

    45 Mins
    Case Study
    Intermediate

    Existing state-of-the-art supervised methods in machine learning require large amounts of annotated data to achieve good performance and generalization. However, manually constructing such a training data set with sentiment labels is a labor-intensive and time-consuming task. With the proliferation of data acquisition in domains such as images, text and video, the rate at which we acquire data is greater than the rate at which we can label it. Techniques that reduce the amount of labelled data needed to achieve competitive accuracies are of paramount importance for deploying scalable, data-driven, real-world solutions. Semi-Supervised Learning algorithms generally provide a way of learning about the structure of the data from the unlabelled examples, alleviating the need for labels.

    At Envestnet | Yodlee, we have deployed several advanced state-of-the-art machine learning solutions which process millions of data points on a daily basis under very stringent service-level commitments. A key aspect of our Natural Language Processing solutions is semi-supervised learning (SSL): a family of methods that also make use of unlabelled data for training - typically a small amount of labelled data with a large amount of unlabelled data. Purely supervised solutions fail to exploit the rich syntactic structure of the unlabelled data to improve decision boundaries.

    There is an abundance of published work in the field, but few papers have succeeded in showing significantly better results than state-of-the-art supervised learning. Often, methods make simplifying assumptions that fail to transfer to real-world scenarios, and there is a lack of practical guidelines for deploying effective SSL solutions. We attempt to bridge that gap by sharing our learnings from successful SSL models deployed in production.

    We will talk about best practices and challenges in deploying SSL solutions in NLP. We shall cover the following (see the sketch after this list):

    1. Our findings while working on SSL
    2. Techniques which have worked for us, and which have not
    3. Which SSL method is suitable for a given use-case
    4. How to deal with different distributions for labelled and unlabelled data
    5. How to quantify the effectiveness of each point in our training data
    6. How to build a feedback loop that chooses training points yielding the greatest accuracy boosts
    7. The effect of relative sizes of labelled and unlabelled data
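
    As one concrete reference point, here is a minimal sketch of self-training (pseudo-labelling), a widely used SSL baseline - illustrative only, not the production Envestnet | Yodlee method:

        # Self-training: iteratively pseudo-label confident unlabelled points.
        import numpy as np
        from scipy.sparse import vstack
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression

        def self_train(texts_l, y_l, texts_u, rounds=3, threshold=0.95):
            vec = TfidfVectorizer().fit(texts_l + texts_u)  # vocab from all text
            X_l, X_u = vec.transform(texts_l), vec.transform(texts_u)
            y = np.asarray(y_l)
            for _ in range(rounds):
                clf = LogisticRegression(max_iter=1000).fit(X_l, y)
                if X_u.shape[0] == 0:
                    break
                proba = clf.predict_proba(X_u)
                sure = proba.max(axis=1) >= threshold  # trust confident guesses
                if not sure.any():
                    break
                X_l = vstack([X_l, X_u[sure]])         # grow the labelled set
                y = np.concatenate(
                    [y, clf.classes_[proba[sure].argmax(axis=1)]])
                X_u = X_u[~sure]                       # shrink unlabelled pool
            return vec, clf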

    References:

    [1] https://arxiv.org/pdf/1804.09170.pdf

    [2] http://www.acad.bg/ebook/ml/MITPress-%20SemiSupervised%20Learning.pdf

    [3] https://github.com/brain-research/realistic-ssl-evaluation

    [4] https://arxiv.org/pdf/1511.01432.pdf

    [5] http://pages.cs.wisc.edu/~jerryzhu/pub/sslicml07.pdf
