Apache Spark for Machine Learning on Large Data Sets

Location: Sydney. Schedule: Sep 19th, 01:30 - 02:15 PM. Place: Grand Lodge.

Apache Spark is a general-purpose framework for distributed data processing. MLlib, Spark's machine learning library, makes it straightforward to fit a model to a very large data set. Spark's general-purpose functionality likewise makes it possible to apply an already-trained model across a large collection of observations. We'll walk through fitting a model to a big data set using MLlib and applying a trained scikit-learn model to a large data set.

 
 

Target Audience

Anyone interested in machine learning with Apache Spark and MLlib.

Submitted 3 years ago
