Schedule: Sep 1st, 11:15 AM - 12:00 PM. Place: Grand Ball Room 2. 116 people interested.

Apache Spark is an amazing framework for distributing computations across a cluster in an easy and declarative way. It is becoming a standard across industries, so it would be great to bring the advances of Deep Learning to it. Parts of Deep Learning are computationally heavy, very heavy! Distributing these processes may be the solution to this problem, and Apache Spark is the easiest way I could think of to distribute them. Here I will talk about Deep Learning Pipelines, an open source library created by Databricks that provides high-level APIs for scalable deep learning in Python with Apache Spark, and about how to distribute your DL workflows with Spark.


Outline/Structure of the Case Study

  • Intro
  • Introduction to Deep Learning
  • What is hard about Deep Learning
  • Distributed Deep Learning
  • Introduction to Apache Spark 2.3.x
  • Deep Learning Pipelines
  • Brief Demo
  • Last words

Learning Outcome

In this talk, you'll get an overview of Apache Spark 2.3.x and learn how to use it to distribute your Deep Learning workflows with Deep Learning Pipelines, an open source library created by Databricks that provides high-level APIs for scalable deep learning in Python with Apache Spark.

Target Audience

Data Scientists, Data Engineers, Data Specialists, Machine Learning Engineers, Data Science Enthusiasts.

Prerequisites for Attendees

  • Prior knowledge of Python is necessary.
  • Some familiarity with Spark would be great.
  • Basic principles of Deep Learning and programming.
  • There will be some code examples and notebooks (they will be posted on GitHub), so if you want to follow along, bring your laptop and have an internet connection.
Submitted 8 months ago


  • Liked Favio Vázquez

    Favio Vázquez - Agile Data Science Workflows with Python, Spark and Optimus

    Favio Vázquez
    Sr. Data Scientist
    Raken Data Group
    8 months ago
    Sold Out!
    480 Mins

    Cleaning, preparing, transforming and exploring data is the most time-consuming and least enjoyable data science task, but also one of the most important. With Optimus we've solved this problem for small or huge datasets, while also improving the whole data science workflow and making it easier for everyone. You will learn how the combination of Apache Spark and Optimus with the Python ecosystem can form a complete framework for Agile Data Science, allowing people and companies to go beyond their common sense and intuition to solve complex business problems.