Crunch Data and Deploy Serverless Architecture the Smart Way

Sep 2nd, 10:00 AM - 06:00 PM, Boardroom

The workshop shows how to run machine learning analyses in notebooks: participants run their own Jupyter or Databricks notebook to find predictive features in a dataset with many columns. It also shows how to deploy a serverless architecture using an AWS CloudFormation template, and provides an opportunity to discuss the differences between academic and commercial data science.

Before the workshop:

For the workshop:

 
 

Outline/Structure of the Workshop

1) Generating Insights

  • What is Data Science, Machine Learning and Analytics?
  • Definitions
  • Application cases
  • Challenges
  • Frameworks that help, e.g. Databricks

2) Generating Insights - hands-on: running a wide random forest on Databricks

VariantSpark on Databricks: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1020355316241938/1261269838839355/5046708719373721/latest.html

  • Where would you apply this?
  • Scoping project ideas for your business
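
To make the idea of finding predictive features in 'wide' data concrete, here is a minimal sketch in plain Python. It is not VariantSpark and not a full random forest: it is a simplified stand-in that grows a forest of one-split trees (stumps) on bootstrap samples with random feature subsets and sums each feature's Gini gain as an importance score. The synthetic dataset, the planted signal columns (7 and 42) and every function name are assumptions made for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_stump(rows, labels, features):
    """Return (feature, gain) for the best single binary split among `features`."""
    parent = gini(labels)
    best_feature, best_gain = None, 0.0
    for f in features:
        left = [y for x, y in zip(rows, labels) if x[f] == 0]
        right = [y for x, y in zip(rows, labels) if x[f] == 1]
        children = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        gain = parent - children
        if gain > best_gain:
            best_feature, best_gain = f, gain
    return best_feature, best_gain


def stump_forest_importance(rows, labels, n_trees=300, mtry=30, seed=0):
    """Sum the Gini gain of the winning feature over many randomized stumps."""
    rng = random.Random(seed)
    n_rows, n_features = len(rows), len(rows[0])
    importance = [0.0] * n_features
    for _ in range(n_trees):
        sample = [rng.randrange(n_rows) for _ in range(n_rows)]  # bootstrap rows
        features = rng.sample(range(n_features), mtry)           # random columns
        feature, gain = best_stump(
            [rows[i] for i in sample], [labels[i] for i in sample], features
        )
        if feature is not None:
            importance[feature] += gain
    return importance


def make_wide_data(n_rows=200, n_features=500, signal=(7, 42), seed=1):
    """Synthetic binary data; the label is 1 when any planted signal column is 1."""
    rng = random.Random(seed)
    rows = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_rows)]
    labels = [1 if any(row[j] for j in signal) else 0 for row in rows]
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_wide_data()
    importance = stump_forest_importance(rows, labels)
    ranked = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
    print("Top-ranked columns:", ranked[:5])
```

With these settings the planted columns should rank well above the noise columns. On real wide data the trees would be deeper and the work distributed, which is the part a framework such as Databricks takes care of.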

3) Going to production

4) Going to production - Demo: GT-Scan deployment

5) Going to production - Hands-on

AWS CloudFormation template: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/GettingStarted.Walkthrough.html
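
As a rough illustration of what a serverless CloudFormation template contains, the sketch below builds one in Python and writes it out as JSON. It is not the template used in the workshop or in the linked walkthrough: the logical IDs (HelloFunction, HelloFunctionRole), the inline handler code and the chosen runtime are assumptions made for illustration.

```python
import json


def build_template(function_name="HelloFunction"):
    """Return a minimal CloudFormation template: one Lambda function and its IAM role."""
    role_name = function_name + "Role"  # hypothetical logical ID for the execution role
    handler_source = (
        "def handler(event, context):\n"
        "    return {'statusCode': 200, 'body': 'hello from the workshop'}\n"
    )
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Minimal serverless sketch: one Lambda function and its role",
        "Resources": {
            role_name: {
                "Type": "AWS::IAM::Role",
                "Properties": {
                    # Lets the Lambda service assume this role.
                    "AssumeRolePolicyDocument": {
                        "Version": "2012-10-17",
                        "Statement": [{
                            "Effect": "Allow",
                            "Principal": {"Service": "lambda.amazonaws.com"},
                            "Action": "sts:AssumeRole",
                        }],
                    },
                    # AWS managed policy that allows writing logs to CloudWatch.
                    "ManagedPolicyArns": [
                        "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
                    ],
                },
            },
            function_name: {
                "Type": "AWS::Lambda::Function",
                "Properties": {
                    "Runtime": "python3.12",     # assumed runtime
                    "Handler": "index.handler",  # inline code is deployed as index.py
                    "Role": {"Fn::GetAtt": [role_name, "Arn"]},
                    "Code": {"ZipFile": handler_source},
                },
            },
        },
    }


def write_template(path="template.json"):
    """Serialize the template so it can be passed to the AWS CLI."""
    with open(path, "w") as fh:
        json.dump(build_template(), fh, indent=2)
    return path
```

A file written this way could then be deployed with the AWS CLI, for example `aws cloudformation deploy --template-file template.json --stack-name workshop-demo --capabilities CAPABILITY_IAM`, where the stack name is a placeholder. Because the whole stack lives in one file, an incremental change is an edit to the template followed by a redeploy.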

6) Stacking data science services

  • API, Serverless and "aaS" ecosystem
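
One way to picture "stacking" is as small services behind an API whose outputs feed each other. The sketch below assumes AWS Lambda handlers behind API Gateway's proxy integration, which expects a response with `statusCode`, `headers` and a string `body`. The two services (scaling a list of scores and picking the top k) are hypothetical and are not part of GT-Scan.

```python
import json


def respond(status, payload):
    """Build a response in the shape used by API Gateway's Lambda proxy integration."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def scale_handler(event, context=None):
    """Hypothetical service 1: min-max scale a non-empty list of numbers to [0, 1]."""
    values = json.loads(event["body"])["values"]
    low, high = min(values), max(values)
    span = high - low
    scaled = [(v - low) / span if span else 0.0 for v in values]
    return respond(200, {"values": scaled})


def top_k_handler(event, context=None):
    """Hypothetical service 2: return the indices of the k largest values."""
    body = json.loads(event["body"])
    values, k = body["values"], body.get("k", 1)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return respond(200, {"top": order[:k]})


def stacked_handler(event, context=None):
    """Stack the two services: the output of service 1 becomes the input of service 2."""
    k = json.loads(event["body"]).get("k", 1)
    scaled = json.loads(scale_handler(event)["body"])
    next_event = {"body": json.dumps({"values": scaled["values"], "k": k})}
    return top_k_handler(next_event)


if __name__ == "__main__":
    request = {"body": json.dumps({"values": [2, 4, 10], "k": 2})}
    print(stacked_handler(request))
```

Here the stacking happens in-process to keep the example self-contained. In a deployed setup each handler would typically sit behind its own endpoint, so either service can be reused on its own for a different question.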

7) Stacking data science services - hands-on:

GT-Scan Jupyter notebook: https://github.com/BauerLab/GT-scan2-Notebooks

Learning Outcome

  • Data-driven insights are essential for robust business decisions
  • VariantSpark is a random forest (RF) implementation for 'wide' data, i.e. datasets with many columns
  • Deploying infrastructure through infrastructure as code (IaC) is easy and reliable, and allows hypothesis-driven, incremental changes to cloud infrastructure
  • Data science modules can be "stacked" to easily reuse components for different questions

Target Audience

Data Scientists, Data Engineers, Data Specialists, Machine Learning Engineers, Data Science Enthusiasts

Prerequisites for Attendees

1) Make sure you have a free AWS account: https://portal.aws.amazon.com/billing/signup#/start

2) Make sure you have the AWS CLI installed on your machine: https://aws.amazon.com/cli/

3) Make sure you have a free Databricks account: https://databricks.com/try-databricks

4) Make sure you have Postman installed: https://chrome.google.com/webstore/detail/postman/fhbjgbiflinjbdggehcddcbncdddomop?hl=en
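
A quick way to confirm the command-line prerequisite before the day is a small check like the one below. It is a sketch using only the Python standard library; the helper name `cli_version` is an assumption, and it only checks that the `aws` command is on the PATH, not that credentials are configured.

```python
import shutil
import subprocess


def cli_version(command):
    """Return the output of `<command> --version`, or None if it is not on the PATH."""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some tools print their version to stderr rather than stdout.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = cli_version("aws")
    print("AWS CLI:", version if version else "not found; install it before the workshop")
```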

Submitted 1 year ago
