Deep Learning with Apache Spark
Apache Spark is an amazing framework for distributing computations across a cluster in an easy and declarative way. It is becoming a standard across industries, so it would be great to bring the amazing advances of Deep Learning to it. Parts of Deep Learning are computationally heavy, very heavy! Distributing these processes may be the solution to this problem, and Apache Spark is the easiest way I can think of to distribute them. Here I will talk about Deep Learning Pipelines, an open source library created by Databricks that provides high-level APIs for scalable deep learning in Python with Apache Spark, and about how to distribute your DL workflows with Spark.
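To give a flavor of those high-level APIs, here is a minimal transfer-learning sketch in the style of Deep Learning Pipelines (the `sparkdl` package). It assumes a Spark 2.3.x session with `sparkdl` available on the cluster and a directory of labeled images; `DeepImageFeaturizer` and the `image` data source are real APIs, but the paths, labels, and hyperparameters below are illustrative, not taken from the talk.

```python
# Minimal transfer-learning sketch with Deep Learning Pipelines (sparkdl).
# Assumes Spark 2.3.x with the sparkdl package installed on the cluster;
# the image paths and label values are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import lit
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from sparkdl import DeepImageFeaturizer

spark = SparkSession.builder.getOrCreate()

# Load two classes of images with Spark 2.3's built-in image data source.
daisies = spark.read.format("image").load("/data/flowers/daisy").withColumn("label", lit(0))
tulips = spark.read.format("image").load("/data/flowers/tulip").withColumn("label", lit(1))
train_df = daisies.union(tulips)

# Featurize each image with a pre-trained InceptionV3, then fit a simple
# classifier on top; Spark distributes the heavy featurization step
# across the cluster's workers.
featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features",
                                 modelName="InceptionV3")
lr = LogisticRegression(maxIter=20, regParam=0.05, labelCol="label")
model = Pipeline(stages=[featurizer, lr]).fit(train_df)
```

The point of the sketch is that the expensive deep-learning step (featurization) becomes just another Spark ML pipeline stage, so it scales out with the cluster like any other transformation.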
Outline/Structure of the Case Study
- Intro
- Introduction to Deep Learning
- What is hard about Deep Learning
- Distributed Deep Learning
- Introduction to Apache Spark 2.3.x
- Deep Learning Pipelines
- Brief Demo
- Last words
Learning Outcome
In this talk, you'll get an overview of Apache Spark 2.3.x and learn how to use it to distribute your Deep Learning workflows with Deep Learning Pipelines, an open source library created by Databricks that provides high-level APIs for scalable deep learning in Python with Apache Spark.
Target Audience
Data Scientists, Data Engineers, Data Specialists, Machine Learning Engineers, Data Science Enthusiasts.
Prerequisites for Attendees
- Prior knowledge of Python is necessary.
- Some familiarity with Spark would be great.
- Familiarity with the principles of Deep Learning and programming.
- There will be code examples and notebooks (they will be posted on GitHub), so bring your laptop and an internet connection if you want to follow along.
Submitted 2 years ago
People who liked this proposal also liked:
Favio Vázquez - Agile Data Science Workflows with Python, Spark and Optimus
480 Mins
Workshop
Intermediate
Cleaning, preparing, transforming, and exploring data is the most time-consuming and least enjoyable data science task, but also one of the most important. With Optimus we've solved this problem for both small and huge datasets, while improving the whole data science workflow and making it easier for everyone. You will learn how combining Apache Spark and Optimus with the Python ecosystem forms a framework for Agile Data Science, allowing people and companies to go beyond common sense and intuition to solve complex business problems.