Apache Spark for Machine Learning on Large Data Sets

Schedule: Sep 19th, 01:30 PM - 02:15 PM. Place: Grand Lodge. 84 attending.

Apache Spark is a general-purpose framework for distributed data processing. With MLlib, Spark's machine learning library, fitting a model to a very large data set becomes straightforward. Spark's general-purpose functionality likewise makes it possible to apply an already-trained model across a large collection of observations. We'll walk through two examples: fitting a model to a big data set using MLlib, and applying a trained scikit-learn model to a large data set.
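As a rough illustration of the second example, here is a minimal sketch of one common way to apply a trained model across a large collection of observations: broadcast the model once and score each partition in batches. It is not taken from the talk itself. The names `predict_partition`, `score_rdd`, and `ThresholdModel` are hypothetical; `ThresholdModel` is only a stand-in for any object with a scikit-learn-style `predict` method, and `sc` and `features_rdd` are assumed to be an existing SparkContext and an RDD of feature rows.

```python
from itertools import islice


def predict_partition(model, rows, batch_size=1000):
    """Score one partition's rows in batches, yielding (row, prediction) pairs."""
    rows = iter(rows)
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            return
        # One predict() call per batch instead of one call per row.
        for row, pred in zip(batch, model.predict(batch)):
            yield row, pred


def score_rdd(sc, model, features_rdd, batch_size=1000):
    """Assumes `sc` is a SparkContext and `features_rdd` an RDD of feature rows."""
    # Ship the model to each executor once rather than with every task.
    bc_model = sc.broadcast(model)
    return features_rdd.mapPartitions(
        lambda rows: predict_partition(bc_model.value, rows, batch_size)
    )


class ThresholdModel:
    """Hypothetical stand-in for a trained scikit-learn estimator."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, X):
        # Predict 1 when the sum of a row's features exceeds the threshold.
        return [1 if sum(x) > self.threshold else 0 for x in X]


if __name__ == "__main__":
    # Local check without Spark: treat a plain list as a single partition.
    model = ThresholdModel(1.0)
    for row, pred in predict_partition(model, [[0.2, 0.3], [0.9, 0.4]], batch_size=1):
        print(row, pred)
```

Because `predict_partition` only needs an iterable of rows, it can be tried locally on a plain list before being handed to `mapPartitions` on a cluster.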

Target Audience

Anyone interested in machine learning and in using Apache Spark and MLlib.

Submitted 11 months ago

Comments

  • Josh Graham  ~  11 months ago

    G'day Juliet, if we can see some sort of draft outline or even topic synopsis this week, that would be awesome.