  • Dr. Vikas Agrawal

    Dr. Vikas Agrawal - Non-Stationary Time Series: Finding Relationships Between Changing Processes for Enterprise Prescriptive Systems

    45 Mins
    Talk
    Intermediate

    It is tedious to keep asking questions, seeking explanations, or setting thresholds for trends and anomalies. Why not find problems before they happen, explain the glitches, and suggest the shortest paths to fixing them? Businesses are always changing, along with their competitive environment and processes, and no static model can handle that. Using dynamic models that find time-delayed interactions between multiple time series, we can make proactive forecasts of anomalous trends in risks and opportunities across operations, sales, revenue and personnel, based on multiple factors influencing each other over time. We need to know how to set what is “normal” and determine when the business processes from six months ago no longer apply, or apply to only 35% of cases today, while explaining the causes of risk and sources of opportunity, their relative directions and magnitudes, in the context of decision-making and transactional applications, using state-of-the-art techniques.

    Real-world processes and businesses keep changing, with one moving part driving another over time. Can we capture these changing relationships? Can we use multiple variables to flag risks on the key variables of interest? We will take a fun journey culminating in the most recent developments in the field. Which methods work well, and which break down? What can we use in practice?

    For instance, we can show a CEO that they would miss their revenue target by over 6% for the quarter, and tell them why, i.e. in what ways their business has changed over the last year. Then we provide prioritized lists of the quickest, cheapest and least risky paths to help turn the tide, with estimates of relative costs and expected probability of success.
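The kind of time-delayed interaction between series described above can be illustrated with a toy lagged-correlation scan. This is synthetic data and a deliberately naive approach, not the speaker's actual methods:

```python
import numpy as np

rng = np.random.default_rng(0)
n, true_lag = 500, 7
driver = rng.normal(size=n)
# The "response" series follows the driver with a 7-step delay plus noise.
response = np.roll(driver, true_lag) + 0.1 * rng.normal(size=n)

def best_lag(x, y, max_lag=30):
    """Return the lag (in steps) at which y correlates most strongly with x."""
    corrs = [np.corrcoef(x[:len(x) - k], y[k:])[0, 1] if k else np.corrcoef(x, y)[0, 1]
             for k in range(max_lag + 1)]
    return int(np.argmax(corrs))

print(best_lag(driver, response))  # recovers the planted lag: 7
```

A real system would of course handle many series at once, non-stationarity, and changing lags over time; this only shows the basic "one series leads another" signal.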

  • Maryam Jahanshahi

    Maryam Jahanshahi - Applying Dynamic Embeddings in Natural Language Processing to Analyze Text over Time

    Maryam Jahanshahi
    Research Scientist
    TapRecruit
    45 Mins
    Case Study
    Intermediate

    Many data scientists are familiar with word embedding models such as word2vec, which capture semantic similarity of words in a large corpus. However, word embeddings are limited in their ability to interrogate a corpus alongside other context or over time. Moreover, word embedding models either need significant amounts of data, or tuning through transfer learning of a domain-specific vocabulary that is unique to most commercial applications.

    In this talk, I will introduce exponential family embeddings. Developed by Rudolph and Blei, these methods extend the idea of word embeddings to other types of high-dimensional data. I will demonstrate how they can be used to conduct advanced topic modeling on medium-sized datasets that are specialized enough to require significant modifications of a word2vec model and that contain more general data types (including categorical, count, and continuous). I will discuss how my team implemented a dynamic embedding model using TensorFlow and our proprietary corpus of job descriptions. Using both categorical and natural language data associated with jobs, we charted the development of different skill sets over the last 3 years. I will specifically focus the discussion of results on how tech and data science skill sets have developed, grown and cross-pollinated other types of jobs over time.
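As a crude stand-in for the intuition (this is not the Rudolph & Blei exponential family embedding model, just simple co-occurrence counts on a made-up two-word-per-window corpus), one can watch a word's context vector drift between time slices:

```python
import numpy as np

# Toy corpora for two "years": the word "python" shifts context entirely.
slices = {
    2016: "python snake reptile python snake zoo".split(),
    2019: "python code data python code ml".split(),
}

def embed(tokens, vocab, window=1):
    """Crude count-based embedding: each word's vector is its co-occurrence row."""
    idx = {w: i for i, w in enumerate(vocab)}
    m = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                m[idx[w], idx[tokens[j]]] += 1
    return m, idx

vocab = sorted({w for toks in slices.values() for w in toks})
vecs = {}
for year, toks in slices.items():
    m, idx = embed(toks, vocab)
    vecs[year] = m[idx["python"]]

# Cosine similarity between the two years' vectors for "python".
cos = vecs[2016] @ vecs[2019] / (np.linalg.norm(vecs[2016]) * np.linalg.norm(vecs[2019]))
print(cos)  # 0.0 — the contexts are disjoint, so the "meaning" has fully drifted
```

Dynamic embedding models make this idea statistical: embeddings are shared across slices with a smoothness prior, rather than recomputed independently as here.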

  • Praveen Innamuri

    Praveen Innamuri - How we Effectively Scaled the Contact Insights Computation From 0 orgs to 20k orgs With our Spark Data Pipeline

    45 Mins
    Case Study
    Intermediate

    In a world of active conversations between multiple sales reps and customers, a sales rep often needs a quick introduction to kickstart their sales process. With millions of conversations across a large user base, building an activity graph is a time-consuming operation. The computation becomes harder still when we need to run it consistently for 20k organizations and keep the closeness computations up to date as new conversations and relationships arrive. We will walk through our initial approach to this scaling problem, the different approaches we tried and discarded, and how we eventually scaled it for a growing number of orgs.

  • Anne Ogborn

    Anne Ogborn - Symbolic AI in a Machine Learning Age

    Anne Ogborn
    Software Engineer
    Hasura.io
    45 Mins
    Talk
    Intermediate

    Before machine learning took over, AI was done symbolically.

    Symbolic methods still have value, and the merging of symbolic and statistical methods is an emerging research area.

    In particular, symbolic methods often have much greater explanatory power. Fusing symbolic methods with ML often creates a more explicable system.

    In this talk we will explore some areas of active work on hybrid applications of symbolic and machine learning.

  • Anant Jain

    Anant Jain - Adversarial Attacks on Neural Networks

    Anant Jain
    Co-Founder
    Compose Labs, Inc.
    45 Mins
    Talk
    Intermediate

    Since 2014, adversarial examples in Deep Neural Networks have come a long way. This talk aims to be a comprehensive introduction to adversarial attacks including various threat models (black box/white box), approaches to create adversarial examples and will include demos. The talk will dive deep into the intuition behind why adversarial examples exhibit the properties they do — in particular, transferability across models and training data, as well as high confidence of incorrect labels. Finally, we will go over various approaches to mitigate these attacks (Adversarial Training, Defensive Distillation, Gradient Masking, etc.) and discuss what seems to have worked best over the past year.
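The white-box attacks the talk covers can be illustrated with the classic fast gradient sign method (FGSM) on a toy logistic model. The weights, input, and epsilon below are fabricated for the demo; real attacks target deep networks via automatic differentiation:

```python
import numpy as np

# A tiny fixed logistic "classifier" — no training needed for the demo.
w = np.array([2.0, -3.0, 1.0])
b = 0.5

def predict(x):
    """P(class = 1) under the logistic model."""
    return 1 / (1 + np.exp(-(w @ x + b)))

def fgsm(x, y, eps):
    """FGSM: step in the sign of the loss gradient w.r.t. the input."""
    p = predict(x)
    grad = (p - y) * w            # d(cross-entropy)/dx for a logistic model
    return x + eps * np.sign(grad)

x = np.array([1.0, -1.0, 1.0])    # clean input, predicted class 1 with p ≈ 0.998
x_adv = fgsm(x, y=1, eps=1.5)

print(predict(x) > 0.5, predict(x_adv) > 0.5)  # True False — the attack flips the label
```

The same one-line gradient-sign perturbation, applied to image pixels with a small epsilon, is what makes adversarial examples imperceptible to humans yet label-flipping for the model.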

  • Dipanjan Sarkar

    Dipanjan Sarkar - Explainable Artificial Intelligence - Demystifying the Hype

    Dipanjan Sarkar
    Data Scientist
    Red Hat
    45 Mins
    Tutorial
    Intermediate

    The field of Artificial Intelligence, powered by Machine Learning and Deep Learning, has gone through some phenomenal changes over the last decade. Starting off as a purely academic and research-oriented domain, it has seen widespread industry adoption across diverse domains including retail, technology, healthcare, science and many more. More often than not, the standard toolbox of machine learning, statistical or deep learning models remains the same. New models do come into existence, like Capsule Networks, but industry adoption usually takes several years. Hence, in industry, the main focus of data science or machine learning is ‘applied’ rather than theoretical, and the effective application of these models on the right data to solve complex real-world problems is of paramount importance.

    A machine learning or deep learning model by itself consists of an algorithm which tries to learn latent patterns and relationships from data without hard-coding fixed rules. Hence, explaining how a model works to the business always poses its own set of challenges. In some industry domains, especially in the world of finance like insurance or banking, data scientists often end up having to use more traditional machine learning models (linear or tree-based), because model interpretability is very important: the business needs to explain each and every decision the model takes. However, this often leads to a sacrifice in performance. Complex models like ensembles and neural networks typically give us better and more accurate performance (since true relationships are rarely linear in nature), but we end up being unable to properly interpret model decisions.

    To address these gaps, I will take a conceptual yet hands-on approach: we will explore some of these challenges around explainable artificial intelligence (XAI) and human-interpretable machine learning in depth, and showcase examples using state-of-the-art model interpretation frameworks in Python!

  • Sayak Paul

    Sayak Paul - Interpretable Machine Learning - Fairness, Accountability and Transparency in ML systems

    Sayak Paul
    Data Science Instructor
    DataCamp
    45 Mins
    Talk
    Beginner

    The good news is building fair, accountable, and transparent machine learning systems is possible. The bad news is it’s harder than many blogs and software package docs would have you believe. The truth is nearly all interpretable machine learning techniques generate approximate explanations, that the fields of eXplainable AI (XAI) and Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) are very new, and that few best practices have been widely agreed upon. This combination can lead to some ugly outcomes!

    This talk aims to make your interpretable machine learning project a success by describing fundamental technical challenges you will face in building an interpretable machine learning system, defining the real-world value proposition of approximate explanations for exact models, and then outlining the following viable techniques for debugging, explaining, and testing machine learning models:

    • Model visualizations including decision tree surrogate models, individual conditional expectation (ICE) plots, partial dependence plots, and residual analysis.
    • Reason code generation techniques like LIME, Shapley explanations, and Tree-interpreter.
    • Sensitivity analysis.

    Plenty of guidance on when, and when not, to use these techniques will also be shared, and the talk will conclude by providing guidelines for testing generated explanations themselves for accuracy and stability.

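As a flavor of the partial dependence technique in that list, here is a from-scratch sketch on a synthetic "black-box" model (illustrative only; in practice one would use a library such as scikit-learn's inspection module):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 3))

def model(X):
    """Stand-in black-box model: nonlinear in feature 0, linear in feature 1."""
    return X[:, 0] ** 2 + 0.5 * X[:, 1]

def partial_dependence(model, X, feature, grid):
    """Average prediction with `feature` forced to each grid value."""
    pd = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        pd.append(model(Xv).mean())
    return np.array(pd)

grid = np.array([-1.0, 0.0, 1.0])
pd = partial_dependence(model, X, feature=0, grid=grid)
# The U-shape of x0**2 is recovered: pd[-1] ≈ pd[+1], and pd[0] is lower by ~1.
```

Averaging over the data marginalizes out the other features, so the curve exposes the model's learned shape for one feature — exactly the kind of approximate explanation the talk cautions must itself be tested.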
  • Tanay Pant

    Tanay Pant - Tick Tock: What the heck is time-series data?

    Tanay Pant
    Developer Advocate
    Crate.io
    45 Mins
    Talk
    Intermediate

    The rise of IoT and smart infrastructure has led to the generation of massive amounts of complex data. In this session, we will talk about time-series data, the challenges of working with it, ingesting it using NYC cab data as an example, and running real-time queries to gather insights. By the end of the session, we will understand what time-series data is, how to build streaming data pipelines for massive time-series data using Flink, Kafka and CrateDB, and how to visualise it all with the help of a dashboard.
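A tiny illustration of the fixed-window bucketing that underlies such real-time rollups — plain Python on a few hard-coded timestamps, not Flink/Kafka/CrateDB:

```python
from collections import Counter
from datetime import datetime

# Hypothetical pickup timestamps (ISO strings), as they might arrive from a stream.
events = [
    "2019-06-01T09:00:12", "2019-06-01T09:00:45",
    "2019-06-01T09:01:03", "2019-06-01T09:02:59",
    "2019-06-01T09:02:30",
]

def window_counts(events, minutes=1):
    """Bucket event timestamps into fixed windows of `minutes` length."""
    counts = Counter()
    for ts in events:
        t = datetime.fromisoformat(ts)
        bucket = t.replace(minute=t.minute - t.minute % minutes,
                           second=0, microsecond=0)
        counts[bucket.isoformat()] += 1
    return dict(counts)

print(window_counts(events))
# {'2019-06-01T09:00:00': 2, '2019-06-01T09:01:00': 1, '2019-06-01T09:02:00': 2}
```

A streaming engine does the same bucketing continuously and out-of-order-safely; a time-series store then answers range queries over these windows.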

  • Tanay Pant

    Tanay Pant - Machine data: how to handle it better?

    Tanay Pant
    Developer Advocate
    Crate.io
    45 Mins
    Talk
    Intermediate

    The rise of IoT and smart infrastructure has led to the generation of massive amounts of complex data. Traditional solutions struggle to cope with this shift, leading to a decrease in performance and an increase in cost. In this talk, we will take a look at this kind of data using a simulated Curiosity rover. Participants will learn how to create a data pipeline for ingestion and visualisation. By the end of this session, we will be able to set up a highly scalable data pipeline for complex time series data with real time query performance.

  • Dipanjan Sarkar

    Dipanjan Sarkar - A Hands-on Introduction to Natural Language Processing

    Dipanjan Sarkar
    Data Scientist
    Red Hat
    480 Mins
    Workshop
    Intermediate

    Data is the new oil, and unstructured data, especially text, images and videos, contains a wealth of information. However, due to the inherent complexity in processing and analyzing this data, people often refrain from spending extra time and effort venturing out from structured datasets to analyze these unstructured sources, which can be a potential gold mine. Natural Language Processing (NLP) is all about leveraging tools, techniques and algorithms to process and understand natural-language data, which is usually unstructured, like text and speech. In this workshop, we will look at tried and tested strategies, techniques and workflows which practitioners and data scientists can leverage to extract useful insights from text data.

    Being specialized in domains like computer vision and natural language processing is no longer a luxury but a necessity expected of any data scientist in today’s fast-paced world! With a hands-on and interactive approach, we will understand essential concepts in NLP, along with extensive case studies and hands-on examples, to master state-of-the-art tools, techniques and frameworks for actually applying NLP to solve real-world problems. We leverage Python 3 and the best state-of-the-art frameworks, including NLTK, Gensim, SpaCy, Scikit-Learn, TextBlob, Keras and TensorFlow, to showcase our examples.

    In my journey in this field so far, I have struggled with various problems, faced many challenges, and learned various lessons over time. This workshop will contain a major chunk of the knowledge I’ve gained in the world of text analytics and natural language processing, where building a fancy word cloud from a bunch of text documents is not enough anymore. Perhaps the biggest problem in learning text analytics is not a lack of information but too much of it, often called information overload. There are so many resources, so much documentation, and so many papers, books and journals that they often overwhelm someone new to the field. You might have had questions like ‘What is the right technique to solve a problem?’, ‘How does text summarization really work?’ and ‘Which are the best frameworks to solve multi-class text categorization?’, among many others! Based on my prior knowledge and learnings from publishing a couple of books in this domain, this workshop should help participants avoid the pressing issues I’ve faced in my journey so far and learn the strategies to master NLP.

    This workshop follows a comprehensive and structured approach. First it tackles the basics of natural language understanding and using Python to handle text data. Once you’re familiar with the basics, we cover text processing, parsing and understanding. Then we address interesting problems in text analytics in each of the remaining sections, including text classification, clustering and similarity analysis, text summarization and topic models, semantic analysis and named entity recognition, and sentiment analysis and model interpretation. The last section covers the recent advancements made in NLP thanks to deep learning and transfer learning, with an example of text classification using universal sentence embeddings.

  • Aamir Nazir

    Aamir Nazir - Evolution Of Image Recognition And Object Segmentation: From Apes To Machines

    Aamir Nazir
    Student
    -
    45 Mins
    Talk
    Intermediate

    For a long time, we have dreamed of harnessing the amazing gift of vision, because it could take us to new heights and open up endless possibilities, like cars that can drive themselves. Along the path to harnessing this power, we have found numerous algorithms. In this talk, we will cover the latest trends in this field, the architecture of each algorithm, and the evolution of image recognition: from the dinosaur age of image recognition to the cyborg age of object segmentation and beyond, from CNNs to R-CNNs to Mask R-CNN, with a close performance-wise analysis of these models.

  • Aamir Nazir

    Aamir Nazir - All-out Deep Learning - 101

    Aamir Nazir
    Student
    -
    45 Mins
    Talk
    Beginner

    In this talk, we will discuss different problems and the different focus areas of Deep Learning. The session is aimed at learners looking to go deeper into Deep Learning. We will take different tasks, see which deep neural network architecture can solve each problem, and learn about the alternative architectures for the same task.

  • Maulik Soneji

    Maulik Soneji - Using ML for Personalizing Food Search

    Maulik Soneji
    Product Engineer
    Go-jek
    45 Mins
    Talk
    Beginner

    GoFood, the food delivery product of Gojek, is one of the largest of its kind in the world. This talk summarizes the approaches considered and lessons learnt during the design and successful experimentation of a search system that uses ML to personalize restaurant results based on the user’s food and taste preferences.

    We formulated the estimation of relevance as a Learning to Rank ML problem, which makes performing ML inference for a very large number of customer-merchant pairs the next hurdle.
    The talk will cover our learnings and findings for the following:
    a. Creating a Learning to Rank model for food search
    b. Targeting experiments to a certain percentage of users
    c. Training the model from real-time data
    d. Enriching restaurant data with custom tags

    Our story should help the audience in making design decisions on the data pipelines and software architecture needed when using ML for relevance ranking in high throughput search systems.
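To make the Learning to Rank formulation concrete, here is a minimal pairwise (RankNet-style) sketch on toy data; the features, labels, and training loop are invented for illustration and bear no relation to GoFood's actual pipeline:

```python
import numpy as np

# Toy feature vectors for three restaurants and graded relevance labels
# (higher = better match for this user's taste).
X = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
rel = np.array([2, 0, 1])

w = np.zeros(2)

def pairwise_step(w, X, rel, lr=0.1):
    """One RankNet-style pass: push the more relevant doc's score above the less relevant's."""
    for i in range(len(X)):
        for j in range(len(X)):
            if rel[i] > rel[j]:
                diff = w @ (X[i] - X[j])
                p = 1 / (1 + np.exp(-diff))       # P(i ranked above j)
                w = w + lr * (1 - p) * (X[i] - X[j])
    return w

for _ in range(100):
    w = pairwise_step(w, X, rel)

order = np.argsort(-(X @ w))
print(order.tolist())  # [0, 2, 1] — the learned scores reproduce the relevance order
```

At serving time only the learned scoring function runs, which is why inference cost over millions of customer-merchant pairs, not training, becomes the scaling hurdle the abstract mentions.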

  • Kuldeep Jiwani

    Kuldeep Jiwani - "Sessionisation" of time sequenced events via Stochastic periods

    45 Mins
    Talk
    Intermediate

    In today's world, the majority of information is generated by self-sustaining systems: various kinds of bots, crawlers, servers, online services, and so on. This information flows along the axis of time and is generated by these actors under some complex logic. For example: a stream of buy/sell order requests from an order gateway in the financial world, a stream of web requests from a monitoring or crawling service, or a hacker's bot sitting on the internet and attacking various computers. We may not be able to know the motive or intention behind these data sources, but via unsupervised techniques we can try to infer patterns or correlate events based on their multiple occurrences along the axis of time, and thus automatically identify the signatures of various actors and take appropriate action.

    Sessionisation is one such unsupervised technique: it tries to find the signal in a stream of events associated with timestamps. In an ideal world this would reduce to finding the period of a mixture of sinusoidal waves, but in the real world it is a far more complex activity, as even the systematic events generated by machines over the internet behave erratically. So the notion of a period also changes: we can no longer associate it with a single number; it has to be treated as a random variable, with an expected value and an associated variance. Hence we need to model "stochastic periods" and learn their probability distributions in an unsupervised manner.

    In this talk we will go through a Sessionisation technique based on stochastic periods. The journey begins by extracting relevant data from a sequence of timestamped events. Then we apply techniques like FFT (Fast Fourier Transform), kernel density estimation, optimal signal selection and Gaussian Mixture Models, eventually discovering patterns in timestamped events.
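As a toy illustration of the FFT step above, the dominant period of a jittered event stream can be recovered from its spectrum. The data here is synthetic and the pipeline is a drastic simplification of the talk's method:

```python
import numpy as np

# Synthetic actor: fires roughly every 60 s with Gaussian jitter (a "stochastic period").
rng = np.random.default_rng(42)
times = np.cumsum(rng.normal(loc=60.0, scale=3.0, size=200))

# Bin the timestamps onto a 1-second grid of 0/1 impulses.
grid = np.zeros(int(times[-1]) + 1)
grid[times.astype(int)] = 1.0

# The dominant non-DC frequency of the impulse train estimates the period.
spectrum = np.abs(np.fft.rfft(grid - grid.mean()))
freqs = np.fft.rfftfreq(len(grid), d=1.0)
peak = freqs[1:][np.argmax(spectrum[1:])]
period = 1.0 / peak  # close to the planted 60 s
```

Because the inter-arrival time is random, the spectral peak has width as well as location; modeling that spread (e.g. with kernel density estimates or Gaussian mixtures, as the talk proposes) is what turns a point estimate into a stochastic period.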

  • Sayak Paul

    Sayak Paul / Anubhav Singh - End-to-end project on predicting collective sentiment for programming language using StackOverflow answers

    90 Mins
    Tutorial
    Intermediate

    In the world of a plethora of programming languages, and a diverse population of developers working on them, an interesting question is posed - “How happy are the developers of any given language?”.

    Sentiment about a language often creeps into the StackOverflow answers provided by users. With the ability to perform sentiment analysis on users' answers, we can take a step forward and aggregate the average sentiment per language, which conveniently answers our question of interest.

    The presenters create an end-to-end project which begins with pulling data from the StackOverflow API, making the collective sentiment prediction model and eventually deploying it as an API on the GCP Compute Engine.
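The aggregation step the presenters describe can be sketched in a few lines; the scores and tags below are made up, and a real pipeline would pull answers from the StackOverflow API and score them with a trained sentiment model:

```python
from collections import defaultdict

# Hypothetical per-answer sentiment scores in [0, 1], keyed by language tag.
answers = [
    ("python", 0.9), ("python", 0.7), ("python", 0.8),
    ("cobol", 0.2), ("cobol", 0.4),
]

def collective_sentiment(answers):
    """Average the per-answer sentiment per language tag."""
    totals = defaultdict(lambda: [0.0, 0])
    for lang, score in answers:
        totals[lang][0] += score
        totals[lang][1] += 1
    return {lang: s / n for lang, (s, n) in totals.items()}

print(collective_sentiment(answers))
# e.g. {'python': 0.8, 'cobol': 0.3} (up to floating-point rounding)
```

Deployed behind an API endpoint (as the presenters do on GCP Compute Engine), this aggregate becomes a queryable "happiness" score per language.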

  • Dat Tran

    Dat Tran - Demystifying the Buzz in Machine Learning! (This time for real)

    Dat Tran
    Head of Data Science
    idealo.de
    45 Mins
    Talk
    Intermediate

    When I started my data science career in 2013, everyone was into big data. In fact, big data was at the peak of inflated expectations (source: Gartner). You had to use tools like Hadoop and Spark to be one of the cool kids. Many data prophets out there told you that data is the new oil, or even gold. In 2018, things haven’t changed: data is still cool and going strong. It’s eating the world, and yes, you still need big data, and now also deep, deep, very deep learning. There’s a lot of bullshit bingo out there.


    In this talk, I want to demystify the buzz in machine learning by presenting some simple guidelines for successful data projects and real practical use cases. In fact, I will share some of the stuff that we’re working on at idealo (https://www.idealo.de/), Germany’s largest price comparison service. And yes it involves deep learning and yes it can be quite technical sometimes as well.

  • Nirav Shah

    Nirav Shah / Ananth Bala - Data Analysis, Dashboards and Visualization - How to create powerful visualizations like a Zen Master

    480 Mins
    Workshop
    Intermediate

    In today’s data economy and disruptive business environment, data is the new oil, and data analysis with data visualization is vital for professionals and companies to stay competitive. Data analysis and developing useful, interactive visualizations that provide insights may seem complex for a non-data professional. That should not be the case, thanks to various BI and data visualization tools. Tableau is one of the most popular, widely used across industries from individual users to enterprise rollouts.

    In this complete hands-on training session (slides, workbooks and data-sets will be distributed in advance), you will learn to turn your data into interactive dashboards, how to create stories with data and share these dashboards with your audience. We will begin with a quick refresher of basics about design and information literacy and discussions about practices for creating charts and storytelling utilizing best visual practices. Whether your goal is to explain an insight or let your audience explore data insights, using Tableau’s simple drag-and-drop user interface makes the task easy and enjoyable.

    You will learn to use functionalities like Table Calculations, Sets, Filters, Level of Detail expressions, Animations, predictive analytics using forecast functions, and Clustering. You will learn to integrate R and Tableau and how to use R within Tableau. You will learn advanced charts such as waterfall charts, Pareto charts, Gantt charts, control charts and box-and-whisker plots, as well as mapping, parameters and other visual functionalities. You will also learn about data preparation – joins, blending, unions and Tableau Prep.

  • Dr. Neha Sehgal

    Dr. Neha Sehgal - Open Data Science for Smart Manufacturing

    45 Mins
    Talk
    Intermediate

    Open Data offers a tremendous opportunity in transformation of today’s manufacturing sector to smarter manufacturing. Smart Manufacturing initiatives include digitalising production processes and integrating IoT technologies for connecting machines to collect data for analysis and visualisation.

    In this talk, the linkage between various industries within the manufacturing sector will be illustrated through the lens of Open Data Science. Data on manufacturing-sector companies, company profiles, officers and financials will be scraped from UK Open Data APIs.

    Typical tasks include data preprocessing, network analysis across industries, clustering, and deploying the model as an API on Google Cloud Platform. The presenter will discuss the necessity of an 'Analytical Thinking' approach as an aid to handling complex big data projects, and how to overcome challenges while working on real-life data science projects.

  • Kumar Nityan Suman

    Kumar Nityan Suman - Beating BERT at NER For E-Commerce Products

    Kumar Nityan Suman
    Data Scientist
    Youplus Inc.
    45 Mins
    Tutorial
    Intermediate

    Natural Language Processing is a messy and complicated affair but modern advanced techniques are offering increasingly impressive results. Embeddings are a modern machine learning technique that has taken the natural language processing world by storm.

    This hands-on tutorial will showcase the advantage of learning custom Word and Character Embeddings for natural language problems over pre-trained vectors like ELMo and BERT using a Named Entity Recognition case study over e-commerce data.

  • Akshay Bahadur

    Akshay Bahadur - DeepVision : Exploiting computer vision techniques to minimize CPU Utilization

    Akshay Bahadur
    SDE-I
    Symantec Softwares
    45 Mins
    Demonstration
    Beginner

    The advent of machine learning and its integration with computer vision has enabled users to efficiently develop image-based solutions for innumerable use cases. A machine learning model consists of an algorithm which draws meaningful correlations from data without being tightly coupled to a specific set of rules. It is crucial to explain the subtle nuances of the network along with the use case we are trying to solve. As technology has advanced, image quality has increased, which in turn has increased the resources needed to process images when building a model. The main question, however, is how to develop lightweight models while keeping the performance of the system intact.
    To connect the dots, we will talk about developing these applications specifically to provide equally accurate results without using many resources, achieved through image processing techniques and an optimized network architecture.
    These applications will range from recognizing digits and alphabets that the user can 'draw' at runtime, to a state-of-the-art facial recognition system, hand-emoji prediction, a self-driving system, malaria and brain tumour detection, and Google's 'Quick, Draw!' project on hand doodles.
    In this presentation, we will discuss the development of such applications using computer vision techniques to minimize CPU utilization.