Future of Technology covers trends in technology across the globe and the innovations changing the future.

 
 

Outline/Structure of the Talk

Nil

Learning Outcome

great

Target Audience

Anyone who loves technology

Prerequisites for Attendees

Nil

Submitted 2 years ago

Public Feedback


    • Kabir Rustogi

      Kabir Rustogi - Generation of Locality Polygons using Open Source Road Network Data and Non-Linear Multi-classification Techniques

      Kabir Rustogi
      Head - Data Sciences
      Delhivery
      1 year ago
      45 Mins
      Case Study
      Intermediate

      One of the principal problems in the developing world is the poor localization of its addresses. This inhibits the discoverability of local trade, reduces access to services such as opening bank accounts and the delivery of goods and services (e.g., e-commerce), and delays emergency services such as fire brigades and ambulances. In general, people in the developing world identify an address by neighbourhood/locality names and points of interest (POIs), which are not standardized and for which no official records exist that could help locate them systematically. In this paper, we describe an approach to building accurate geographical boundaries (polygons) for such localities.

      As training data, we are provided with two pieces of information for millions of address records: (i) a geocode, captured by a human for the given address, and (ii) the set of localities present in that address. The latter is determined either by manual tagging or by an algorithm that takes a raw address string as input and outputs the meaningful locality information present in that address. For example, for the address “A-161 Raheja Atlantis Sector 31 Gurgaon 122002”, the geocode is given as (28.452800, 77.045903), and the set of localities present in the address is given as (Raheja Atlantis, Sector 31, Gurgaon, Pin-code 122002). The development of this algorithm is part of another project we are working on; details about it can be found here.

      Many industries, such as food delivery, courier delivery and KYC (know-your-customer) data collection, are likely to have huge amounts of such data. Such crowdsourced data usually contains a large amount of noise, arising either from machine/human error in capturing the geocode or from errors in identifying the correct set of localities from a poorly written address. For example, for the address “Plot 1000, Sector 31 opposite Sector 40 road, Gurgaon 122002”, a machine may output the set of localities as (Sector 31, Sector 40, Gurgaon, Pin-code 122002), even though it is clear that the address does not lie in Sector 40.

      The solution described in this paper consumes the provided data and outputs a polygon for each locality identified in the address data. We assume that the localities for which we must build polygons are non-overlapping; this assumption holds, for example, for pin-codes. The problem is solved in two phases.

      In the first phase, we separate the noisy points from the points that lie within a locality. This is done by formulating the problem as a non-linear multi-class classification problem. In the training data, the latitude and longitude of each point in the non-overlapping localities act as features, and the corresponding locality name acts as the label. The classifier is expected to partition the 2D latitude-longitude space covering the union of all non-overlapping localities into disjoint regions, one per locality. These partitions are defined by non-linear boundaries, which are obtained by optimizing for two objectives: (i) the area enclosed by a boundary should contain as many points of the corresponding locality and as few points of other localities as possible, and (ii) the separation boundary should be smooth. We compare two algorithms, decision trees and neural networks, for creating such partitions.
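
      The first phase can be sketched with a tiny decision tree written from scratch. This is only an illustration under stated assumptions: the coordinates and labels are made up, the depth limit stands in for the smoothness objective, and points whose label disagrees with the tree's prediction are treated as noise. It is not the authors' implementation.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of locality labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(points, labels):
    """Return the (feature, threshold) pair that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(points)
    for f in (0, 1):  # 0 = latitude, 1 = longitude
        values = sorted({p[f] for p in points})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for p, l in zip(points, labels) if p[f] <= t]
            right = [l for p, l in zip(points, labels) if p[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(points, labels, max_depth):
    """A shallow tree keeps boundaries coarse, so isolated noisy points get no region."""
    split = best_split(points, labels) if max_depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority locality
    f, t = split
    left = [(p, l) for p, l in zip(points, labels) if p[f] <= t]
    right = [(p, l) for p, l in zip(points, labels) if p[f] > t]
    return (f, t,
            build_tree([p for p, _ in left], [l for _, l in left], max_depth - 1),
            build_tree([p for p, _ in right], [l for _, l in right], max_depth - 1))

def predict(tree, point):
    """Walk down the tree until a leaf (a locality name) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if point[f] <= t else right
    return tree

# Hypothetical geocodes; the last point is deliberately mislabelled as "Sector 40".
points = [(28.450, 77.040), (28.452, 77.045), (28.454, 77.043), (28.451, 77.047),
          (28.453, 77.041), (28.440, 77.060), (28.442, 77.065), (28.444, 77.063),
          (28.441, 77.067), (28.443, 77.061), (28.452, 77.044)]
labels = ["Sector 31"] * 5 + ["Sector 40"] * 5 + ["Sector 40"]

tree = build_tree(points, labels, max_depth=1)
noisy = [(p, l) for p, l in zip(points, labels) if predict(tree, p) != l]
clean = [(p, l) for p, l in zip(points, labels) if predict(tree, p) == l]
```

      With a single split, the mislabelled point falls in the Sector 31 region and ends up in `noisy`, while the remaining points become candidate points for the second phase.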

      In the second phase, we extract all the points that satisfy the partition constraints, i.e., that lie within the boundary of a locality L, as candidate points for generating the polygon for locality L. The resulting polygon must contain all candidate points and should have the minimum possible area while keeping the polygon boundary smooth. This objective can be achieved by algorithms such as concave hull. However, since localities are always bounded by roads, we further enhance our locality polygons by leveraging open-source road network data. To achieve this, we solve a non-linear optimisation problem that decides which set of roads to select so that the enclosed area is minimized while all the candidate points lie within it. The output of this optimisation problem is a set of roads, which represents the boundary of locality L.
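
      As a rough illustration of turning candidate points into a polygon, the sketch below computes a convex hull with Andrew's monotone chain algorithm. Note the substitution: a convex hull is used here as a simpler stand-in for the concave hull named above (a concave hull would generally enclose a smaller area), and the road-based optimisation is not shown. The points are hypothetical (x, y) coordinates in arbitrary planar units.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other, so drop it.
    return lower[:-1] + upper[:-1]

def polygon_area(poly):
    """Shoelace formula for the area of a simple polygon."""
    n = len(poly)
    twice_area = sum(poly[i][0] * poly[(i + 1) % n][1]
                     - poly[(i + 1) % n][0] * poly[i][1] for i in range(n))
    return abs(twice_area) / 2

# Hypothetical candidate points for one locality (two of them are interior).
candidates = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
hull = convex_hull(candidates)
area = polygon_area(hull)
```

      The area function gives a simple way to compare alternative boundaries for the same candidate points, which is the quantity the optimisation described above tries to minimise.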

    • Arun Krishnaswamy

      Arun Krishnaswamy - Federated Deep Learning in SAAS Applications

      45 Mins
      Demonstration
      Executive

      ML in SaaS applications becomes exceedingly difficult due to the lack of access to customer data. Customer data is locked down with no outside access, which makes it very hard to do ML on this data in the traditional way. The focus of this presentation is to provide alternative solutions for doing ML in a distributed fashion.

      We will focus on Split Neural Networks, a relatively new distributed ML technique, to solve data access issues in a SaaS application.

      We will walk through the motivations behind the Split Neural Network approach to ML.

      We will go through some concrete examples that are already using this technique.
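
      The core idea of a split neural network can be sketched with a two-parameter toy model. Everything here is an assumption made for illustration: the `Client` and `Server` classes, the scalar weights, the learning rate and the data are hypothetical, and the labels are shared with the server for simplicity (other configurations keep labels on the client). The point is only that raw inputs stay with the customer, while activations and gradients cross the boundary.

```python
class Client:
    """Customer side: keeps raw features private and owns the first layer."""

    def __init__(self, w1=0.5, lr=0.02):
        self.w1, self.lr = w1, lr
        self._x = None

    def forward(self, x):
        self._x = x
        return self.w1 * x  # only this activation leaves the customer boundary

    def backward(self, grad_h):
        # Chain rule: dLoss/dw1 = dLoss/dh * dh/dw1 = grad_h * x
        self.w1 -= self.lr * grad_h * self._x


class Server:
    """Vendor side: owns the second layer and never sees raw features."""

    def __init__(self, w2=0.5, lr=0.02):
        self.w2, self.lr = w2, lr

    def train_step(self, h, y):
        err = self.w2 * h - y               # prediction error for squared loss
        grad_h = 2 * err * self.w2          # gradient sent back to the client
        self.w2 -= self.lr * 2 * err * h    # local update of the server layer
        return err ** 2, grad_h


# Hypothetical private data following y = 2 * x.
data = [(0.5, 1.0), (1.0, 2.0), (1.5, 3.0), (2.0, 4.0)]
client, server = Client(), Server()

for _ in range(200):
    for x, y in data:
        h = client.forward(x)                    # client -> server: activation
        loss, grad_h = server.train_step(h, y)   # server -> client: gradient
        client.backward(grad_h)
```

      After training, the product `client.w1 * server.w2` approximates the slope of the data, even though neither party ever held the full model or, on the server side, the raw inputs.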

    • Suvro Shankar Ghosh

      Suvro Shankar Ghosh - Learning Entity Embeddings from Knowledge Graph

      45 Mins
      Case Study
      Intermediate
      • Over time, many knowledge bases have evolved. A knowledge base is a structured way of storing information, typically as (Subject, Predicate, Object) triples.
      • Such knowledge bases are an important resource for question answering and other tasks. However, they are often incomplete, since they cannot capture all the data in the world, and they therefore lack the ability to reason over their discrete entities and the unknown relationships between them. Here we introduce an expressive neural tensor network that is suitable for reasoning over known relationships between two entities.
      • With such a model in place, we can ask questions, and the trained model will try to predict the missing links in order to answer them: finding similar entities, reasoning over them, and predicting various relationship types between two entities that are not connected in the Knowledge Graph.
      • Knowledge Graph infoboxes were added to Google's search engine in May 2012
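
      The (Subject, Predicate, Object) form described above can be illustrated with a few triples and a pattern-matching helper. The relation names and the `match` function are hypothetical and chosen only for this sketch; the neural tensor network itself is not shown.

```python
# A tiny knowledge base stored as (subject, predicate, object) triples.
kb = [
    ("New Delhi", "capital_of", "India"),
    ("India", "located_in", "Asia"),
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
]

def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

capitals = match(kb, p="capital_of")

# No triple links New Delhi to Asia directly. A missing link like this one
# is what a link-prediction model is asked to score.
direct_link = match(kb, s="New Delhi", o="Asia")
```

      An empty result for `direct_link` does not mean the relationship is false, only that it is not stored, which is the incompleteness problem the bullets above describe.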

      What is the knowledge graph?

      ▶Knowledge in graph form!

      ▶Captures entities, attributes, and relationships

      More specifically, the “knowledge graph” is a database that collects millions of pieces of data about keywords people frequently search for on the World Wide Web and the intent behind those keywords, based on the already available content.

      ▶In most cases, KGs are based on Semantic Web standards and have been generated by a mixture of automatic extraction from text or structured data and manual curation work.

      ▶Structured Search & Exploration
      e.g. Google Knowledge Graph, Amazon Product Graph

      ▶Graph Mining & Network Analysis
      e.g. Facebook Entity Graph

      ▶Big Data Integration
      e.g. IBM Watson

      ▶Diffbot, GraphIQ, Maana, ParseHub, Reactor Labs, SpazioDati

    • Aakash Goel

      Aakash Goel / Ankit Kalra - Detect Workout Pose for Virtual Gym using CNN

      45 Mins
      Talk
      Beginner

      Approximately 80% of gym members across the globe do not use the gym, yet they pay $30 to $125/month. Attrition from gyms is linked to discouraging results and a lack of engagement. Traditional gym users often do not know the proper exercise regimen, and users prefer workout regimens that are fun, customizable and social.

      To combat the above problem, we came up with the idea of providing customized fitness solutions using Artificial Intelligence. In this talk, we showcase how Deep Learning architectures such as CNNs can be leveraged to build "workout pose detection" that tracks user movement, classifies it against a specific trained workout, and determines whether the performed pose is correct or wrong.


      Keywords: CNN, Deep Learning, Image classification Model, Computer Vision.

    • Indranil Chandra

      Indranil Chandra - Data Science Project Governance Framework

      Indranil Chandra
      Assistant Manager
      CITI
      1 year ago
      45 Mins
      Talk
      Executive

      The Data Science Project Governance Framework can be followed by any new Data Science business or team. It helps in formulating strategies for leveraging Data Science as a business, architecting Data Science based solutions, forming teams and calculating ROI, and it covers typical Data Science project lifecycle components, commonly available Deep Learning toolsets and frameworks, and best practices used by Data Scientists. I will use an actual use case while covering each of these aspects of building a team, and I will refer to examples from my own experience of setting up Data Science teams in a corporate/MNC setup.

      A lot of research is happening around the world in various domains to leverage Deep Learning, Machine Learning and Data Science based solutions for problems that would otherwise be impossible to solve using simple rule-based systems. Major market players and businesses are also setting up new Data Science teams to take advantage of modern state-of-the-art ML/DL techniques. Although most Data Scientists have strong knowledge of mathematical modeling techniques, many lack the business acumen and management knowledge to drive Data Science based solutions in a corporate/MNC setup. On the other hand, management executives in most corporates/MNCs do not have first-hand knowledge of setting up a new Data Science team or of approaching business problems using Data Science. This session will help bridge this gap and give Executives and Data Scientists a common ground on which they can build any Data Science business/team from the ground up.

      GitHub Link -> https://github.com/indranildchandra/DataScience-Project-Governance-Framework

    • Karthik Bharadwaj T

      Karthik Bharadwaj T - Failure Detection using Driver Behaviour from Telematics

      Karthik Bharadwaj T
      Sr. Data Scientist
      Teradata
      2 years ago
      45 Mins
      Case Study
      Beginner

      Telematics data has the potential to unlock revenue of 1.5 trillion. Unfortunately, this data has not been tapped by many users.

      In this case study, Karthik Thirumalai discusses how telematics data can be used to identify driver behaviour and perform preventive maintenance in automobiles.

    • Karthik Bharadwaj T

      Karthik Bharadwaj T - 7 Habits to Ethical AI

      Karthik Bharadwaj T
      Sr. Data Scientist
      Teradata
      2 years ago
      45 Mins
      Talk
      Beginner

      While AI is being put to use in solving great problems of the world, questions are being raised about the morality of how it is constructed and used. Karthik Thirumalai addresses the 7 habits of building ethical AI solutions and how they can be put to use for a better world. These habits (Data Governance, Fairness, Privacy and Security, Accountability, Transparency, Education) help organizations successfully implement an AI strategy that reflects fundamental human principles and moral values.

    • SUDIPTO PAL

      SUDIPTO PAL - Use cases of Financial Data Science Techniques in retail

      SUDIPTO PAL
      STAFF DATA SCIENTIST
      Walmart Labs
      2 years ago
      20 Mins
      Talk
      Intermediate

      Financial domains like Insurance and Banking have uncertainty itself as an inherent product feature, and hence make extensive use of statistical models to develop, value and price their products. This presentation showcases some of the techniques popularly used in financial products, such as survival models and cashflow prediction models, and shows through analogies and similarities how they can be used in Retail data science.

      Survival models were traditionally used for modeling mortality, and were then extended to modeling queues, waiting times and attrition. We showcase: 1) how the waiting-time aspect can be used to model the repeat purchase behavior of customers, and how this can be used for product recommendation at particular time intervals; 2) how the same survival or waiting-time problem can be solved using discrete-time binary response survival models (as opposed to traditional proportional hazard and AFT models for survival); 3) a quick coverage of other use cases such as attrition, CLTV (customer lifetime value) and inventory management.
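
      The discrete-time binary response formulation rests on a person-period expansion: each customer contributes one row per period at risk, with a 0/1 response marking whether the event happened in that period. The sketch below uses hypothetical durations (weeks until a repeat purchase) and estimates the hazard as the per-period event rate, which is the special case of a binary response model with only period indicators and no covariates. A real model would typically fit a logistic regression with covariates on the same rows.

```python
from collections import Counter

def person_period(records):
    """Expand (duration, event_observed) pairs into (period, response) rows."""
    rows = []
    for duration, event in records:
        for t in range(1, duration + 1):
            # The response is 1 only in the final period of an observed event.
            rows.append((t, 1 if (event and t == duration) else 0))
    return rows

def discrete_hazards(rows):
    """Per-period hazard: events in period t divided by customers at risk in t."""
    at_risk, events = Counter(), Counter()
    for t, y in rows:
        at_risk[t] += 1
        events[t] += y
    return {t: events[t] / at_risk[t] for t in sorted(at_risk)}

def survival_curve(hazards):
    """S(t) = product over k <= t of (1 - h_k)."""
    curve, s = {}, 1.0
    for t, h in hazards.items():
        s *= 1.0 - h
        curve[t] = s
    return curve

# Hypothetical customers; False means censored (no repeat purchase observed yet).
customers = [(2, True), (3, True), (3, False), (1, True)]
rows = person_period(customers)
hazards = discrete_hazards(rows)
survival = survival_curve(hazards)
```

      Here `survival[t]` is the estimated probability that a customer has not yet made a repeat purchase by the end of week t, which can guide when a recommendation is most timely.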

      We show a use case where survival models can be used to predict the timing of events (e.g. attrition/renewal, purchase, purchase order for procurement), and use that to predict the timing of cashflows associated with events (e.g. subscription fee received from renewals, procurement cost etc.), which are typically used for capital allocation.
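
      A minimal sketch of the link between event timing and cashflow timing, under assumptions made here for illustration: a subscriber pays a fixed fee at the end of each period in which they are still active, and the per-period attrition hazards have already been predicted by a survival model. The function name, hazards and fee are hypothetical.

```python
def expected_cashflows(hazards, fee):
    """Expected fee received in each period: fee times probability of still being active."""
    survival, flows = 1.0, []
    for h in hazards:
        survival *= 1.0 - h        # probability of surviving through this period
        flows.append(fee * survival)
    return flows

# Hypothetical predicted attrition hazards for three consecutive periods.
flows = expected_cashflows([0.1, 0.2, 0.5], fee=100.0)
total = sum(flows)
```

      Summing such per-customer series across a portfolio gives an expected cashflow schedule by period, the kind of input used for capital allocation.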

      We also show how backdated predicted cashflows can be used as a baseline for causal inference about a strategic intervention (e.g. a campaign launched to contain attrition) by comparing them with the actual cashflows post-intervention. This can be used to retrospectively evaluate the impact of strategic interventions.
