Deep Learning with Apache Spark
Apache Spark is an amazing framework for distributing computations across a cluster in an easy, declarative way. It is becoming a standard across industries, so it would be great to bring the recent advances of Deep Learning to it. Parts of Deep Learning are computationally heavy, very heavy! Distributing these processes may be the solution to this problem, and Apache Spark is the easiest way I could think of to distribute them. Here I will talk about Deep Learning Pipelines, an open source library created by Databricks that provides high-level APIs for scalable deep learning in Python with Apache Spark, and about how to distribute your DL workflows with Spark.
Outline/structure of the Session
- Introduction to Deep Learning
- What is hard about Deep Learning
- Distributed Deep Learning
- Introduction to Apache Spark 2.3.x
- Deep Learning Pipelines
- Brief Demo
- Last words
In this talk, you'll get an overview of Apache Spark 2.3.x and learn how to use it to distribute your Deep Learning workflows with Deep Learning Pipelines, an open source library created by Databricks that provides high-level APIs for scalable deep learning in Python with Apache Spark.
Data Scientists, Data Engineers, Data Specialists, Machine Learning Engineers, Data Science Enthusiasts.
- Prior knowledge of Python is necessary.
- Some familiarity with Spark would be great.
- Basic principles of Deep Learning and programming.
- There will be some code examples and notebooks (they will be posted on GitHub), so if you want to follow along, bring your laptop and make sure you have an internet connection.