Aug 8th, 09:00 - 09:45 AM · Grand Ball Room · 34 Interested

Since we originally proposed the need for a first-class language, compiler and ecosystem for machine learning (ML), a view that is increasingly shared by many, there have been plenty of interesting developments in the field. Not only have the tradeoffs in existing systems, such as TensorFlow and PyTorch, not been resolved, but they are clearer than ever now that both frameworks contain distinct "static graph" and "eager execution" interfaces. Meanwhile, the idea of ML models fundamentally being differentiable algorithms, often called differentiable programming, has caught on.

Where current frameworks fall short, several exciting new projects have sprung up that dispense with graphs entirely, to bring differentiable programming to the mainstream. Myia, by the Theano team, differentiates and compiles a subset of Python to high-performance GPU code. Swift for TensorFlow extends Swift so that compatible functions can be compiled to TensorFlow graphs. And finally, the Flux ecosystem is extending Julia’s compiler with a number of ML-focused tools, including first-class gradients, just-in-time CUDA kernel compilation, automatic batching and support for new hardware such as TPUs.

This talk will demonstrate how Julia is increasingly becoming a natural language for machine learning, the kind of libraries and applications the Julia community is building, the contributions from India (there are many!), and our plans going forward.

 

Target Audience

All

Submitted 4 months ago

Public Feedback

Suggest improvements to the Speaker
  • Anoop Kulkarni  ~  4 months ago

    Viral, thanks for your submission. I have been using Julia for some time now as part of quantum computing, and have only recently moved to Julia for machine learning. Looking forward to this talk from you, given your pedigree in the area.


    ~anoop


  • Amit Doshi - Integrating Digital Twin and AI for Smarter Engineering Decisions

    45 Mins
    Talk
    Intermediate

    With the increasing popularity of AI, new frontiers are emerging in predictive maintenance and manufacturing decision science. However, there are many complexities associated with modeling plant assets, training predictive models for them, and deploying these models at scale for near real-time decision support. This talk will discuss these complexities in the context of building an example system.

    First, you must have failure data to train a good model, but equipment failures can be expensive to introduce for the sake of building a data set! Instead, physical simulations can be used to create large, synthetic data sets to train a model with a variety of failure conditions.
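
    The simulation-for-data idea above can be sketched in a few lines of Python (this is an invented illustration, not the speakers' actual tooling; the 50 Hz/120 Hz frequencies and noise level are assumptions made for the example):

```python
import numpy as np

def simulate_vibration(n_samples=2048, fs=1000.0, fault=False, seed=0):
    """Toy stand-in for a physics-based asset model: a healthy machine
    emits a 50 Hz rotation tone; a simulated bearing fault injects an
    extra 120 Hz harmonic on top of sensor noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / fs
    signal = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(n_samples)
    if fault:
        signal += 0.5 * np.sin(2 * np.pi * 120 * t)
    return signal

# A balanced, labeled data set of healthy/faulty runs -- no real equipment harmed.
X = np.stack([simulate_vibration(fault=bool(i % 2), seed=i) for i in range(100)])
y = np.array([i % 2 for i in range(100)])
```

    Because the failure mode is injected by the simulation, the labels are known exactly, which is precisely what makes synthetic data attractive for training.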

    These systems also involve high-frequency data from many sensors, reporting at different times. The data must be time-aligned to apply calculations, which makes it difficult to design a streaming architecture. These challenges can be addressed through a stream processing framework that incorporates time-windowing and manages out-of-order data with Apache Kafka. The sensor data must then be synchronized for further signal processing before being passed to a machine learning model.
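
    A minimal sketch of the time-windowing idea in plain Python, rather than an actual Kafka Streams deployment; the window size, lateness bound, and reading format below are assumptions for illustration only:

```python
from collections import defaultdict

def window_align(readings, window=1.0, allowed_lateness=2.0):
    """Group (timestamp, sensor, value) readings into fixed time windows
    while tolerating out-of-order arrival: a window is closed only once
    the max timestamp seen passes its end by `allowed_lateness` (a simple
    watermark, mimicking what a stream processor's time-windowing does)."""
    buckets, closed, max_ts = defaultdict(dict), [], float("-inf")
    for ts, sensor, value in readings:
        buckets[int(ts // window)][sensor] = (ts, value)
        max_ts = max(max_ts, ts)
        done = sorted(k for k in buckets
                      if (k + 1) * window + allowed_lateness <= max_ts)
        for key in done:
            closed.append((key, buckets.pop(key)))
    return closed, dict(buckets)  # closed windows, plus still-open ones

# Sensor B's 0.4 s reading arrives out of order but before the watermark
# passes, so it still lands in window 0 alongside sensor A.
readings = [(0.2, "A", 1.0), (1.1, "A", 2.0), (0.4, "B", 5.0), (3.9, "A", 3.0)]
closed, open_windows = window_align(readings)
```

    A production system would delegate this bookkeeping to the stream processor, but the watermark trade-off is the same: a larger lateness bound captures more stragglers at the cost of delayed output.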

    As these architectures and software stacks mature in areas like manufacturing, it is increasingly important to enable engineers and domain experts in this workflow to build and deploy the machine learning models and work with system architects on the system integration. This talk also highlights the benefit of using apps and exposing the functionality through API layers to help make these systems more accessible and extensible across the workflow.

    This session will focus on building a system to address these challenges using MATLAB and Simulink. We will start with a physical model of an engineering asset and walk through the process of developing and deploying a machine learning model for that asset as a scalable and reliable cloud service.

  • Anuj Gupta - Continuous Learning Systems: Building ML systems that keep learning from their mistakes

    Anuj Gupta, Scientist, Intuit
    Submitted 2 months ago
    Sold Out!
    45 Mins
    Talk
    Beginner

    Won't it be great to have ML models that can update their “learning” as and when they make a mistake and a correction is provided in real time? In this talk we look at a concrete business use case that warrants such a system. We will take a deep dive to understand the use case, how we went about building a continuously learning system for text classification, the approaches we took, and the results we got.

    For most machine learning systems, the “train once, predict thereafter” paradigm works well. However, there are scenarios where this paradigm does not suffice and the model needs to be updated often. Two of the most common cases are:

    1. When the distribution is non-stationary, i.e. the distribution of the data changes. This implies that, with time, the test data will have a very different distribution from the training data.
    2. The model needs to learn from its mistakes.

    While (1) is often addressed by retraining the model, (2) is often addressed with batch updates. Batch updating requires collecting a sizeable number of feedback points. What if you have far fewer feedback points? You need a model that can learn continuously – as and when the model makes a mistake and feedback is provided. To the best of our knowledge, the literature on this is very limited.
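
    A toy sketch of such a continuously learning model, assuming a simple linear classifier on synthetic 2-D data rather than whatever the speakers used in production: the model updates the instant a single correction arrives, with no batch required.

```python
import numpy as np

class OnlinePerceptron:
    """Minimal continuously-learning classifier: weights are updated the
    moment a mistake plus its correction arrive, instead of waiting to
    accumulate a batch of feedback points."""
    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return 1 if x @ self.w + self.b > 0 else 0

    def feedback(self, x, true_label):
        pred = self.predict(x)
        if pred != true_label:           # learn only from mistakes
            sign = 1 if true_label == 1 else -1
            self.w += self.lr * sign * x
            self.b += self.lr * sign
        return pred

# Two well-separated synthetic classes, streamed one point at a time.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (100, 2)), rng.normal(2, 0.5, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
model = OnlinePerceptron(n_features=2)
for i in rng.permutation(200):           # correct mistakes live, as they occur
    model.feedback(X[i], y[i])
accuracy = np.mean([model.predict(X[i]) == y[i] for i in range(200)])
```

    A real text-classification system would sit an encoder in front of such an online learner; the one-point-at-a-time update loop is the part that distinguishes it from batch retraining.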

  • Dr. Vikas Agrawal - Non-Stationary Time Series: Finding Relationships Between Changing Processes for Enterprise Prescriptive Systems

    45 Mins
    Talk
    Intermediate

    It is too tedious to keep asking questions, seeking explanations or setting thresholds for trends or anomalies. Why not find problems before they happen, find explanations for the glitches and suggest the shortest paths to fixing them? Businesses are always changing, along with their competitive environment and processes, and no static model can handle that. Using dynamic models that find time-delayed interactions between multiple time series, we need to make proactive forecasts of anomalous trends of risks and opportunities in operations, sales, revenue and personnel, based on multiple factors influencing each other over time. We need to know how to set what is “normal” and determine when the business processes from six months ago no longer apply, or only apply to 35% of the cases today, while explaining the causes of risk and sources of opportunity, their relative directions and magnitude, in the context of decision-making and transactional applications, using state-of-the-art techniques.

    Real-world processes and businesses keep changing, with one moving part changing another over time. Can we capture these changing relationships? Can we use multiple variables to find risks on the key interesting ones? We will take a fun journey culminating in the most recent developments in the field. Which methods work well, and which break? What can we use in practice?
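
    One basic building block for discovering such time-delayed interactions is a lagged cross-correlation scan; this is a generic sketch on synthetic data, not the speaker's method:

```python
import numpy as np

def best_lag(x, y, max_lag=10):
    """Find the time delay at which series x best predicts series y --
    e.g. marketing spend leading sales by k weeks. Returns the lag with
    the highest Pearson correlation, plus all the scores."""
    scores = []
    for lag in range(max_lag + 1):
        if lag == 0:
            c = np.corrcoef(x, y)[0, 1]
        else:
            c = np.corrcoef(x[:-lag], y[lag:])[0, 1]
        scores.append(c)
    return int(np.argmax(scores)), scores

# Synthetic pair: y follows x with an exact 3-step delay.
rng = np.random.default_rng(42)
x = rng.standard_normal(300).cumsum()    # a wandering driver series
y = np.empty_like(x)
y[3:] = x[:-3]
y[:3] = x[0]
lag, scores = best_lag(x, y)
```

    Real non-stationary data would require re-estimating such lags over rolling windows, since the relationships themselves drift over time.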

    For instance, we can show a CEO that they would miss their revenue target by over 6% for the quarter, and tell them why, i.e. in what ways their business has changed over the last year. Then we provide prioritized, ordered lists of the quickest, cheapest and least risky paths to help turn the tide, with estimates of relative costs and expected probability of success.

  • Paolo Tamagnini / Kathrin Melcher - Guided Analytics - Building Applications for Automated Machine Learning

    90 Mins
    Tutorial
    Beginner

    In recent years, a wealth of tools has appeared that automate the machine learning cycle inside a black box. We take a different stance. Automation should not result in black boxes, hiding the interesting pieces from everyone. Modern data science should allow automation and interaction to be combined flexibly into a more transparent solution.

    In some specific cases, if the analysis scenario is well defined, then full automation might make sense. However, more often than not, these scenarios are not that well defined and not that easy to control. In these cases, a certain amount of interaction with the user is highly desirable.

    By mixing and matching interaction with automation, we can use Guided Analytics to develop predictive models on the fly. More interestingly, by leveraging automated machine learning and interactive dashboard components, custom Guided Analytics Applications, tailored to your business needs, can be created in a few minutes.

    We'll build an application for automated machine learning using KNIME Software. It will have an input user interface to control the settings for data preparation, model training (e.g. using deep learning, random forest, etc.), hyperparameter optimization, and feature engineering. We'll also create an interactive dashboard to visualize the results with model interpretability techniques. At the conclusion of the workshop, the application will be deployed and run from a web browser.

  • Vivek Singhal / Shreyas J - Training Autonomous Driving Systems to Visualize the Road ahead for Decision Control

    90 Mins
    Workshop
    Intermediate

    We will show the audience how to develop advanced image segmentation with FCN/DeepLab algorithms, which can help visualize driving scenarios accurately, allowing the autonomous driving system to take appropriate action given the obstacles in view.

  • 90 Mins
    Workshop
    Intermediate

    Machine learning and deep learning have been rapidly adopted in various spheres of medicine, such as drug discovery, disease diagnosis, genomics, medical imaging and bioinformatics, for translating biomedical data into improved human healthcare. Machine learning/deep learning based healthcare applications assist physicians in making faster, cheaper and more accurate diagnoses.

    We have successfully developed three deep learning based healthcare applications and are currently working on two more healthcare related projects. In this workshop, we will discuss one healthcare application, titled "Deep Learning based Craniofacial Distance Measurement for Facial Reconstructive Surgery", which we developed using TensorFlow. Craniofacial distances play an important role in providing information related to facial structure. They include measurements of the head and face which are measured from images. They are used in facial reconstructive surgeries such as cephalometry, treatment planning of various malocclusions, craniofacial anomalies, facial contouring, facial rejuvenation and different forehead surgeries, in which reliable and accurate data are very important and cannot be compromised.

    Our discussion of this healthcare application will include the precise problem statement, the major steps involved in the solution (deep learning based face detection & facial landmarking, and craniofacial distance measurement), the data set, experimental analysis, and the challenges faced & overcome to achieve this success. Subsequently, we will provide hands-on exposure to implementing this healthcare solution using TensorFlow. Finally, we will briefly discuss possible extensions of our work and the future scope of research in the healthcare sector.

  • Dr. C.S.Jyothirmayee / Usha Rengaraju / Vijayalakshmi Mahadevan - Deep learning powered Genomic Research

    90 Mins
    Workshop
    Advanced

    Disease happens when there is a slip in the finely orchestrated dance between physiology, environment and genes. Treatment with chemicals (natural, synthetic or a combination) solved some diseases, but others persisted and were propagated along the generations. The molecular basis of disease became the prime focus of studies to understand and analyze root causes. Cancer, too, showed that the origin of disease, its detection, prognosis, treatment and cure is not an uncomplicated process; treatment of diseases has to be done on a case-by-case basis (no one size fits all).

    Next-generation sequencing, high-throughput analysis, enhanced computing power and new aspirations for neural networks now let us address this conundrum of complicated genetic elements (the structure and function of the various genes in our systems). This requires genomic material extraction, automated sequencing and analysis to map the strings of As, Ts, Gs, and Cs, which yields genomic datasets. These datasets are too large for traditional applied statistical techniques, and the important signals are often incredibly small amid blaring technical noise, requiring far more sophisticated analysis techniques. Artificial intelligence and deep learning give us the power to draw clinically useful information from the genetic datasets obtained by sequencing.
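
    For concreteness, the standard way those strings of As, Ts, Gs and Cs are fed to deep learning models is one-hot encoding; a minimal sketch (the all-zero convention for unknown bases such as 'N' is an assumption for this example):

```python
import numpy as np

BASES = "ACGT"

def one_hot_dna(seq):
    """Encode a DNA string as a (len, 4) one-hot matrix, the typical input
    representation for convolutional genomics models."""
    idx = {b: i for i, b in enumerate(BASES)}
    out = np.zeros((len(seq), 4))
    for pos, base in enumerate(seq.upper()):
        if base in idx:                  # unknown bases (e.g. 'N') stay all-zero
            out[pos, idx[base]] = 1.0
    return out

x = one_hot_dna("ACGTN")
```

    A single-nucleotide variant such as a SNP then shows up as a one-row difference between two such matrices, which is what lets a network learn the effect of tiny genetic variations.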

    The precision of these analyses has become vital and is the way forward for disease detection and predisposition, empowering medical authorities to make fair, situation-aware decisions about patient treatment strategies. This kind of genomic profiling, prediction and disease management is useful for tailoring FDA-approved treatment strategies based on molecular disease drivers and the patient’s molecular makeup.

    The present scenario encourages the design, development and testing of medicine based on existing genetic insights and models. Deep learning models are helping to analyze and interpret tiny genetic variations (like SNPs – single nucleotide polymorphisms) that unravel crucial cellular processes like metabolism and DNA wear and tear. These models also help identify disease-risk signatures, such as those for cancer, from various body fluids, and have immense potential to revolutionize the healthcare ecosystem. Clinical data collection today is haphazard rather than streamlined; making clinical data uniformly fetchable and combinable with genetic information would amplify its value, its interpretation, and decisive patient treatment modalities and their outcomes.

    There is a huge inflow of medical data from emerging human wearable technologies; integrated with other health data and the ability to quickly carry out complex analyses on rich genomic databases in the cloud, it could revitalize humans' disease-fighting capability. A last, still-emerging area of application is direct-to-consumer genomics (the success of 23andMe).

    This road map promises an end-to-end system to face disease in all its forms. Medical research and its applications – gene therapies, gene-editing technologies like CRISPR, molecular diagnostics and precision medicine – could be revolutionized by tailoring high-throughput computing methods and applying them to enhanced genomic datasets.

  • Badri Narayanan Gopalakrishnan / Shalini Sinha / Usha Rengaraju - Lifting Up: Deep Learning for implementing anti-hunger and anti-poverty programs

    45 Mins
    Case Study
    Intermediate

    Ending poverty and achieving zero hunger are the top two goals the United Nations aims to achieve by 2030 under its sustainable development program. Hunger and poverty are byproducts of multiple factors, and fighting them requires a multi-fold effort from all stakeholders. Artificial intelligence and machine learning have transformed the way we live, work and interact, yet business economics has limited their application to a few segments of society. A conscious effort is needed to bring the power of AI to the people who actually need it the most – those below the poverty line. Here we present our thoughts on how deep learning and big data analytics can be combined to enable effective implementation of anti-poverty programs.

    The advancements in deep learning and micro-diagnostics, combined with effective technology policy, are the right recipe for the progressive growth of a nation. Deep learning can help identify poverty zones across the globe based on night-time images, where the level of light correlates with higher economic growth. Once areas of lower economic growth are identified, geographic and demographic data can be combined to establish micro-level diagnostics of these underdeveloped areas. The insights from the data can help plan an effective intervention program. Machine learning can further be used to identify potential donors, investors and contributors across the globe based on their skill set, interests, history, ethnicity, purchasing power and native connection to the location of the proposed program.

    Adequate resource allocation and efficient program design alone will not guarantee success unless project execution is supervised at the grass-roots level. Data analytics can be used to monitor project progress and effectiveness, and to detect anomalies in case of fraud or mismanagement of funds.

  • Govind Chada - Using 3D Convolutional Neural Networks with Visual Insights for Classification of Lung Nodules and Early Detection of Lung Cancer

    Govind Chada, Researcher, Cy Woods
    Submitted 2 months ago
    Sold Out!
    45 Mins
    Case Study
    Intermediate

    Lung cancer is the leading cause of cancer death among both men and women in the U.S., with more than a hundred thousand deaths every year. The five-year survival rate is only 17%; however, early detection of malignant lung nodules significantly improves the chances of survival and prognosis.

    This study aims to show that 3D Convolutional Neural Networks (CNNs), which use the full 3D nature of the input data, perform better in classifying lung nodules than the previously used 2D CNNs. It also demonstrates an approach to developing an optimized 3D CNN that performs with state-of-the-art classification accuracies. CNNs, like other deep neural networks, have been black boxes, giving users no understanding of why they predict what they predict. This study, for the first time, demonstrates that Gradient-weighted Class Activation Mapping (Grad-CAM) techniques can provide visual explanations for model decisions in lung nodule classification by highlighting discriminative regions. Several CNN architectures were implemented with Keras and TensorFlow as part of this study. The publicly available LUNA16 dataset, comprising 888 CT scans with candidate nodules manually annotated by radiologists, was used to train and test the models. The models were optimized by varying the hyperparameters to reach accuracies exceeding 90%. Grad-CAM techniques were applied to the optimized 3D CNN to generate images that provide quality visual insight into the model's decision making. The results demonstrate the promise of 3D CNNs as highly accurate and trustworthy classifiers for early lung cancer detection, leading to improved chances of survival and prognosis.
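
    The Grad-CAM computation referred to above reduces to a small amount of arithmetic on the last convolutional layer's activations and gradients; here is a framework-agnostic sketch for the 3D case (the toy tensors are invented for illustration, not from the study):

```python
import numpy as np

def grad_cam_3d(feature_maps, gradients):
    """Core Grad-CAM arithmetic for a 3D CNN: weight each channel of the
    last conv layer's activations by the spatially-averaged gradient of
    the class score w.r.t. that channel, sum, and keep only positive
    evidence (ReLU). Both inputs have shape (C, D, H, W)."""
    weights = gradients.mean(axis=(1, 2, 3))           # alpha_c
    cam = np.tensordot(weights, feature_maps, axes=1)  # sum_c alpha_c * A_c
    cam = np.maximum(cam, 0)                           # ReLU
    if cam.max() > 0:
        cam /= cam.max()                               # normalize for display
    return cam

# Toy check: channel 0 fires at voxel (1, 1, 1) and has positive gradient,
# channel 1 has negative gradient, so the heatmap should peak at (1, 1, 1).
A = np.zeros((2, 4, 4, 4)); A[0, 1, 1, 1] = 1.0; A[1, 2, 2, 2] = 1.0
G = np.zeros_like(A); G[0] = 1.0; G[1] = -1.0
cam = grad_cam_3d(A, G)
```

    In practice the resulting low-resolution volume is upsampled to the CT scan's size and overlaid on the slices to highlight the discriminative region of the nodule.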

  • Dipanjan Sarkar / Anuj Gupta - Natural Language Processing Bootcamp - Zero to Hero

    Dipanjan Sarkar, Data Scientist, Red Hat
    Anuj Gupta, Scientist, Intuit
    Submitted 6 months ago
    Sold Out!
    480 Mins
    Workshop
    Intermediate

    Data is the new oil and unstructured data, especially text, images and videos contain a wealth of information. However, due to the inherent complexity in processing and analyzing this data, people often refrain from spending extra time and effort in venturing out from structured datasets to analyze these unstructured sources of data, which can be a potential gold mine. Natural Language Processing (NLP) is all about leveraging tools, techniques and algorithms to process and understand natural language based unstructured data - text, speech and so on.

    Being specialized in domains like computer vision and natural language processing is no longer a luxury but a necessity which is expected of any data scientist in today’s fast-paced world! With a hands-on and interactive approach, we will understand essential concepts in NLP along with extensive case studies and hands-on examples to master state-of-the-art tools, techniques and frameworks for actually applying NLP to solve real-world problems. We leverage Python 3 and the latest and best state-of-the-art frameworks including NLTK, Gensim, SpaCy, Scikit-Learn, TextBlob, Keras and TensorFlow to showcase our examples. You will be able to learn a fair bit of machine learning as well as deep learning in the context of NLP during this bootcamp.

    In our journey in this field, we have struggled with various problems, faced many challenges, and learned various lessons over time. This workshop is our way of giving back a major chunk of the knowledge we’ve gained in the world of text analytics and natural language processing, where building a fancy word cloud from a bunch of text documents is not enough anymore. You might have had questions like ‘What is the right technique to solve a problem?’, ‘How does text summarization really work?’ and ‘Which are the best frameworks to solve multi-class text categorization?’ among many other questions! Based on our prior knowledge and learnings from publishing a couple of books in this domain, this workshop should help readers avoid some of the pressing issues in NLP and learn effective strategies to master NLP.

    The intent of this workshop is to make you a hero in NLP so that you can start applying NLP to solve real-world problems. We start from zero and follow a comprehensive and structured approach to make you learn all the essentials in NLP. We will be covering the following aspects during the course of this workshop with hands-on examples and projects!

    • Basics of Natural Language and Python for NLP tasks
    • Text Processing and Wrangling
    • Text Understanding - POS, NER, Parsing
    • Text Representation - BOW, Embeddings, Contextual Embeddings
    • Text Similarity and Content Recommenders
    • Text Clustering
    • Topic Modeling
    • Text Summarization
    • Sentiment Analysis - Unsupervised & Supervised
    • Text Classification with Machine Learning and Deep Learning
    • Multi-class & Multi-Label Text Classification
    • Deep Transfer Learning and its promise
    • Applying Deep Transfer Learning - Universal Sentence Encoders, ELMo and BERT for NLP tasks
    • Generative Deep Learning for NLP
    • Next Steps

    With over 10 hands-on projects, the bootcamp will be packed with plenty of hands-on examples for you to go through, try out and practice, and we will try to keep theory to a minimum considering the limited time we have and the amount of ground we want to cover. We hope that at the end of this workshop you can take away some useful methodologies to apply when solving NLP problems in the future. We will be using Python to showcase all our examples.
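
    As a flavor of the kind of baseline the bootcamp starts from before moving to embeddings and deep models, here is a hedged scikit-learn sketch of text classification (the tiny corpus is invented; real projects would use proper train/test splits):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hypothetical labeled corpus: 1 = positive, 0 = negative sentiment.
texts = [
    "loved the movie, great acting and story",
    "what a wonderful, brilliant film",
    "terrible plot and awful acting",
    "boring, a complete waste of time",
]
labels = [1, 1, 0, 0]

# Text classification in two composable steps: a TF-IDF bag-of-words
# representation, then a linear classifier on top of it.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
pred = clf.predict(["brilliant story, loved it"])
```

    Swapping the vectorizer for contextual embeddings (Universal Sentence Encoder, ELMo, BERT) while keeping the same pipeline shape is essentially the transfer-learning progression the workshop outline describes.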

  • Gaurav Godhwani / Swati Jaiswal - Fantastic Indian Open Datasets and Where to Find Them

    45 Mins
    Case Study
    Beginner

    With the big boom in the data science and analytics industry in India, a lot of data scientists are keen on learning a variety of learning algorithms and data manipulation techniques. At the same time, there is growing interest among data scientists to give back to society, harness their acquired skills, and help fix some of the major burning problems in the nation. But how does one go about finding meaningful datasets connected to societal problems and plan data-for-good projects? This session will summarize our experience of working in the Data-for-Good sector over the last 5 years, sharing a few interesting datasets and associated use cases of employing machine learning and artificial intelligence in the social sector. The Indian social sector is replete with a good volume of open data on attributes like annotated images, geospatial information, time series, Indic languages, satellite imagery, etc. We will dive into the journey of a Data-for-Good project, getting essential open datasets, and insights from data projects in the development sector. Lastly, we will explore how we can work with various communities and scale our algorithmic experiments into meaningful contributions.

  • Aditya Singh Tomar - Building Your Own Data Visualization Platform

    Aditya Singh Tomar, Data Consultant, ACT Insights
    Submitted 2 months ago
    Sold Out!
    45 Mins
    Demonstration
    Beginner

    Ever thought about having a mini interactive visualization tool that caters to your specific requirements? That is the product I created when I started independent consulting. Two years on, I have decided to make it public, including the source code.

    This session will give you an overview about creating a custom, personalized version of a visualization platform built on R and Shiny. We will focus on a mix of structure and flexibility to address the varying requirements. We will look at the code itself and the various components involved while exploring the customization options available to ensure that the outcome is truly a personal product.

  • Deepak Mukunthu - Automated Machine Learning

    45 Mins
    Talk
    Beginner

    Intelligent experiences powered by AI can seem like magic to users. Developing them, however, is pretty cumbersome, involving a series of sequential and interconnected decisions along the way that are time-consuming. What if there were an automated service that identifies the best machine learning pipelines for a given problem and dataset? Automated Machine Learning does exactly that!

    With the goal of accelerating AI for data scientists by improving their productivity and democratizing AI for other data personas who want to get into machine learning, Automated ML comes in many different flavors and experiences. Automated ML is one of the top 5 AI trends this year. This session will cover concepts of Automated ML, how it works, different variations of it and how you can use it for your scenarios.
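
    Classical cross-validated hyperparameter search is the simplest flavor of what Automated ML generalizes into full pipeline search; a sketch using scikit-learn (the dataset, model and grid are arbitrary choices for illustration, not any particular Automated ML product):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Candidate pipeline: scaling followed by a linear classifier.
X, y = load_iris(return_X_y=True)
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# "Automation" here is just exhaustive search over settings, scored by
# cross-validation; Automated ML services extend the same idea to model
# families, preprocessing choices and feature engineering.
search = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)
```

    After fitting, `search.best_params_` and `search.best_estimator_` give the selected configuration and the refitted pipeline.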

  • Dr. Om Deshmukh - Key Principles to Succeed in Data Science

    90 Mins
    Tutorial
    Beginner

    Building a successful career in the field of data science needs a lot more than just a thorough understanding of the various machine learning models. One also has to undergo a paradigm shift in how one typically approaches technical problems: the patterns and insights unearthed from data analysis have to be the guiding North Star for the next best action, rather than the path implied by the data scientist's or their superior's intuition alone. One of the things that makes this shift trickier in reality is confirmation bias: a cognitive bias to interpret information in such a way that it furthers our pre-existing notions.

    In this session, we will discuss how the seemingly disjoint components of the digital ecosystem are working in tandem to make data-driven decisioning central to every functional aspect of every business vertical. This centrality accorded to the data makes it imperative that

    • (a) the data integrity is maintained across the lifetime of the data,
    • (b) the insights generated from the data are interpreted in the holistic context of the sources of the data and the data processing techniques, and
    • (c) human experts are systematically given an opportunity to override any purely data-driven decisions, especially when such decisions may have far-reaching consequences.

    We will discuss these aspects using three case studies from three different business verticals (financial sector, logistics sector and the third one selected by popular vote). For each of these three case studies, the "traditional" way of solving the problem will be contrasted with the data-driven approach of solving. The participants will be split into three groups and each group will be asked to present the best data-driven approaches to solve one of the case studies. The other two groups can critique the presentation/approach. The winning group will be picked based on the presentation and the proposed approach.

    At the end of the session, the attendees should be able to work through any new case study to

    • (a) translate a business problem into an appropriate data-driven problem,
    • (b) formulate strategies to capture and access relevant data,
    • (c) shortlist relevant data modelling techniques to unearth the hidden patterns, and
    • (d) tie back the value of the findings to the business problem.
  • Rishu Gupta / Amit Doshi - Addressing Deep Learning Challenges

    90 Mins
    Tutorial
    Intermediate

    Deep learning is getting lots of attention lately, and for good reason: it is achieving results that were not possible before. Getting started, though, is not always easy. MATLAB, being an integrated framework, allows you to accelerate building consumer and industrial applications while utilizing the capabilities of open-source frameworks like TensorFlow to train deep learning networks.

    Join us for a hands-on MATLAB workshop, in which you will explore and learn about the deep learning workflow in MATLAB while working on key concepts and challenges such as:

    • Accelerating/Automating ground truth labeling for data
    • Designing and Validating deep neural networks
    • Training and tuning deep learning algorithms

    Also, we will talk about interoperability with different frameworks and the workflow for deploying your deep learning algorithms to embedded targets.

  • 20 Mins
    Experience Report
    Beginner

    Videos account for about 75% of Internet traffic today. Enterprises are creating more and more videos and using them for various informational purposes, including marketing, training of customers, partners and employees, and internal communications. However, videos are considered the black holes of the Internet because it is very hard to see what is inside them. Their opaque nature equally impacts end users, who spend a lot of time navigating to their point of interest, leading to severe underutilization of videos as a powerful medium of information.

    In this talk, we will describe the visual processing pipeline of the VideoKen platform, which includes:

    1. A graph-based algorithm, along with deep scene-text detection, to identify key visual frames in the video,
    2. An FCN-based algorithm for semantic segmentation of screen content in visual frames,
    3. A transfer-learning based visual classifier to categorize screen content into different categories such as slides, code walkthrough, demo, handwritten, etc., and
    4. An algorithm to detect visual coherency and select indices from the video.

    We will discuss the challenges and experiences of implementing and iterating on these algorithms, drawing on our experience of processing 100K+ hours of video content.
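
    As a rough illustration of the key-frame idea (deliberately much simpler than the graph-based algorithm the talk describes; the threshold and synthetic frames are invented for the example), frame differencing flags moments where the on-screen content changes:

```python
import numpy as np

def key_frames(frames, threshold=0.1):
    """Naive key-frame detector: flag a frame as 'key' when its mean
    absolute pixel change from the previous frame exceeds a threshold,
    i.e. the slide on screen has changed. frames: (T, H, W) in [0, 1]."""
    diffs = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))
    return [0] + [i + 1 for i, d in enumerate(diffs) if d > threshold]

# Synthetic 'lecture video': the slide content changes at frames 4 and 8.
frames = np.zeros((12, 32, 32))
frames[4:8, 8:24, 8:24] = 1.0    # a new slide appears at frame 4
frames[8:, :, :] = 0.5           # another switch at frame 8
```

    A production pipeline additionally has to suppress spurious triggers from camera motion, animations, and speaker overlays, which is where the graph-based formulation and learned classifiers come in.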

  • Bargava Subramanian - Anomaly Detection for Cyber Security using Federated Learning

    Bargava Subramanian, Co-Founder, Binaize Labs
    Submitted 3 weeks ago
    Sold Out!
    20 Mins
    Experience Report
    Beginner

    In a network of connected devices, two aspects are critical to the system's success:

    1. Security – with many internet-connected devices, securing the network against cyber threats is essential.
    2. Privacy – the devices capture business-sensitive data that the organisation must safeguard to maintain its differentiation.

    I have used federated learning to build anomaly detection models that monitor data quality and cybersecurity while preserving data privacy.

    Federated learning enables edge devices to collaboratively learn deep learning models while keeping all of the data on the device itself. Instead of moving data to the cloud, the models are trained on the device and only the model updates are shared across the network.
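The idea of sharing only model updates can be sketched with a minimal federated-averaging loop. This is an illustrative toy, not the speaker's implementation: each "device" fits a single weight on its private data, and the server only ever sees the weights, never the data.

```python
# Minimal federated averaging (FedAvg) sketch: local gradient-descent
# updates on each device, then a server-side average of the weights.

def local_update(w, data, lr=0.05, steps=50):
    # Train on this device's private (x, y) pairs; data never leaves here.
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_round(w_global, device_datasets):
    # Each device returns an updated weight; the server averages them.
    updates = [local_update(w_global, d) for d in device_datasets]
    return sum(updates) / len(updates)

# Two edge devices, each holding private samples of the same trend y = 3x.
devices = [[(1, 3), (2, 6)], [(3, 9), (4, 12)]]
w = 0.0
for _ in range(5):
    w = federated_round(w, devices)
```

Real deployments add weighted averaging by dataset size, secure aggregation, and compression of the updates, but the round structure is the same.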

    Using federated learning gave me the following advantages:

    • Ability to build more accurate models faster
    • Low latency during inference
    • Privacy-preserving
    • Improved energy efficiency of the devices

    I built the deep learning models using TensorFlow and deployed them with uTensor, a lightweight ML inference framework built on Mbed and TensorFlow.

    In this talk, I will discuss in detail how I built federated learning models on edge devices.

  • Anil Arora - Building Machine Learning models from scratch and Deploying in downstream Applications

    Anil Arora
    Principal Data Scientist
    SAS
    schedule 3 weeks ago
    45 Mins
    Demonstration
    Beginner

    The session will open with a brief (5–7 minute) introduction to the evolutionary transformation of the SAS platform, then jump straight into the more exciting part: a demo of building machine learning models from scratch. The session will also cover the need for feature engineering before building any machine learning model. Many organizations still resist ML models because of the loss of interpretability, so we will see how ML models can be interpreted in SAS using various out-of-the-box statistics. The demo will also cover the AutoML functionality, giving data scientists a kickstart in developing and refining ML models. Finally, it will show how to consume and deploy the models in downstream applications such as mobile apps and websites, along with model governance. For open-source data science practitioners, the demo will conclude with how they can embrace and extend the power of open source with SAS.
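The feature-engineering step the session stresses can be illustrated outside SAS. A common example (shown here in Python purely for illustration; SAS provides this out of the box) is standardizing a raw numeric feature to zero mean and unit variance before it reaches a model.

```python
import statistics

# Scale a feature so its values have mean 0 and (population) std dev 1,
# preventing large-valued features from dominating a model's training.

def standardize(values):
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

incomes = [30_000, 50_000, 70_000]
scaled = standardize(incomes)        # roughly [-1.22, 0.0, 1.22]
```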

  • Rahul Agarwal - Continuous Data Integrity Tracking

    Rahul Agarwal
    Vice President
    American Express
    schedule 3 weeks ago
    20 Mins
    Experience Report
    Beginner

    "In God we trust; all others must bring data." - W. E. Deming, Author & Professor

    This philosophy is imbibed at the very core of American Express: as a data-driven company, it makes all strategic decisions based on numbers. But who ensures that the numbers are correct? That is the work of Data Quality and Governance. Given the dependency on data, ensuring data quality is one of our prime responsibilities.

    At American Express, data is generated and stored across multiple platforms. In a market like the US, for example, we process more than 200 transactions every second and make an authorization decision for each. Given this speed and scale of data generation, ensuring data quality becomes imperative and a unique challenge in itself. Hundreds of models run in AMEX's production platforms, with thousands of variables. Many variables originate in legacy systems (or have components derived from them) and are then passed to downstream systems for manipulation and for creating new attributes. A tech glitch or a logic issue could corrupt any variable at any point in this process, with disastrous consequences for model outputs that can translate into real-world customer impact and financial and reputational risk for the bank. So how do we catch these anomalies before they adversely impact processes?

    Traditional approaches to anomaly detection rely on measuring deviation from the variable's mean; fancier ones employ time-series forecasting. Both, however, are fraught with high false-positive rates. Since every alert generated must be analyzed by the business, which has a cost, high accuracy is essential. In this talk, we will discuss how AMEX has approached and solved this problem.
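The traditional deviation-from-the-mean baseline mentioned above is essentially a z-score test. The sketch below shows that baseline (not AMEX's production method, and the threshold is an assumed illustrative value), which makes it easy to see where false positives come from: a single threshold on global statistics ignores trends and seasonality.

```python
import statistics

# Flag values whose distance from the mean exceeds k standard
# deviations (a plain z-score anomaly check).

def zscore_anomalies(values, k=2.0):
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [i for i, v in enumerate(values)
            if sigma > 0 and abs(v - mu) / sigma > k]

daily_counts = [100, 102, 98, 101, 99, 100, 500]   # one glitchy day
flagged = zscore_anomalies(daily_counts)           # index of the spike
```

Note that the spike itself inflates both the mean and the standard deviation, which is one reason tuning `k` on real, trending data produces so many false alerts.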

  • Anuj Gupta - NLP Bootcamp

    Anuj Gupta
    Scientist
    Intuit
    schedule 2 months ago
    480 Mins
    Workshop
    Beginner

    Recent advances in machine learning have rekindled the quest to build machines that can interact with the outside environment as we humans do – using visual cues, voice, and text. An important piece of this trilogy is systems that can process and understand text in order to automate various workflows such as chatbots, named entity recognition, machine translation, information extraction, summarization, FAQ systems, etc.

    A key step towards any of the above tasks is using the right techniques to represent text in a form a machine can easily work with. Unlike images, where pixel intensities give a natural representation, text has no such natural form. No matter how good your ML algorithm is, it can only do so much unless there is a rich way to represent the underlying text data. Thus, whatever NLP application you are building, it is imperative to find a good representation for your text.
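The simplest such representation, and the usual starting point before the richer schemes the bootcamp covers, is a bag of words: count each vocabulary word's occurrences so every document becomes a fixed-length numeric vector. A minimal sketch (illustrative only; real pipelines add tokenization, normalization, and weighting such as TF-IDF):

```python
# Turn documents into count vectors over a shared, sorted vocabulary.

def build_vocab(docs):
    return sorted({w for doc in docs for w in doc.lower().split()})

def bag_of_words(doc, vocab):
    words = doc.lower().split()
    return [words.count(v) for v in vocab]

docs = ["NLP is fun", "NLP is hard"]
vocab = build_vocab(docs)                        # ['fun', 'hard', 'is', 'nlp']
vectors = [bag_of_words(d, vocab) for d in docs]
```

Its weakness is exactly what motivates the later techniques: the vectors discard word order and treat "fun" and "hard" as equally unrelated to every other word.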

    In this bootcamp, we will work through the key concepts, mathematics, and code behind state-of-the-art techniques for text representation, demystifying both the theory and the practice. By the end, participants will have a fundamental understanding of these schemes and the ability to implement them on datasets of their own interest.

    This will be a one-day, instructor-led, hands-on training session: learn and implement an end-to-end deep learning model for natural language processing.