Feature Engineering with Excel and Machine Learning using Python

Machine Learning is an exciting topic to learn. For beginners, however, Feature Engineering is even more crucial during the learning process. Feature engineering is often explored and executed only through code, which does not give new learners enough time to grasp the concept fully.

In this workshop, I plan to use Excel to clean data by imputing missing values, create new features, divide the data into train and test datasets, and build intuition for modelling. The learners will get a visual cue of what happens in the background.

Once the learners are confident about the process, having them do the same thing in code will help them understand feature engineering better.

Once the pre-processing part is over, teaching various models, ranging from Logistic Regression to Deep Learning models, on the cleaned data will help them grasp the modelling process better.


Outline/Structure of the Talk

0/ Introduction and Agenda

------- Topics using Excel -------

1/ Download Data and Make the first submission

2/ Explore data and find Variables with Missing Values

3/ Treat Embarked Variable

4/ Treat Age & Fare Variable

5/ Treat Cabin and recode Sex Feature

6/ Extract new features from SibSp & Parch

7/ Extract Title from Name Variable

8/ Extract new Feature from Ticket Variable

9/ Create Dummy (One-Hot Encoded) Variables from Categorical Variables

10/ Create Train Test Split
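
For comparison with the Excel steps above, here is a minimal pandas sketch of steps 2 to 7, 9 and 10. It is an illustration under stated assumptions, not the workshop's actual notebook: the six rows are invented toy data that only mimic the Kaggle Titanic column names, and the derived column names (HasCabin, FamilySize, IsAlone, Title), the helper engineer_features, and the choice of mode/median imputation are assumptions. Step 8 is left out because the outline does not say which feature is taken from Ticket.

```python
import pandas as pd

# Toy rows invented for illustration; they only mimic the Titanic columns.
toy = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 1, 2, 3],
    "Name": [
        "Doe, Mr. John",
        "Roe, Mrs. Jane (Ann Poe)",
        "Poe, Miss. Ann",
        "Loe, Master. Tim",
        "Moe, Dr. Sam",
        "Coe, Mr. Lee",
    ],
    "Sex": ["male", "female", "female", "male", "male", "male"],
    "Age": [22.0, 38.0, None, 4.0, 50.0, None],
    "SibSp": [1, 1, 0, 3, 0, 0],
    "Parch": [0, 0, 0, 1, 0, 0],
    "Fare": [7.25, 71.28, 7.92, None, 13.0, 8.05],
    "Cabin": [None, "C85", None, "C123", None, None],
    "Embarked": ["S", "C", None, "S", "S", "Q"],
})


def engineer_features(df):
    """Hypothetical helper mirroring the Excel steps on a copy of df."""
    out = df.copy()
    # Steps 3-4: fill Embarked with its mode, Age and Fare with their medians
    # (assumed imputation rules; the workshop may use different ones).
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Fare"] = out["Fare"].fillna(out["Fare"].median())
    # Step 5: flag whether a cabin is recorded and recode Sex as 0/1.
    out["HasCabin"] = out["Cabin"].notna().astype(int)
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    # Step 6: combine SibSp and Parch into family-based features.
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    out["IsAlone"] = (out["FamilySize"] == 1).astype(int)
    # Step 7: the title is the text between the comma and the first period.
    out["Title"] = out["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
    # Step 9: one-hot encode the remaining categorical columns.
    out = pd.get_dummies(out, columns=["Embarked", "Title"], dtype=int)
    return out.drop(columns=["Name", "Cabin"])


features = engineer_features(toy)

# Step 10: a simple reproducible train/test split.
train = features.sample(frac=0.67, random_state=0)
test = features.drop(train.index)
```

Each line maps to one spreadsheet action (a fill-down, a lookup, a helper column), which is the correspondence the Excel part of the session is meant to make visible.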

------- Topics using Python -------

11/ Build a basic Logistic Model on train data

12/ Build a basic Logistic Model and submit to Kaggle

13/ Decision Tree Model - All Variables

14/ Use sklearn train test split

15/ Tune Decision Tree Hyperparameters with Grid Search

16/ Decision Tree with Grid Search - Selected features using SelectKBest

17/ Extra Trees Model - Selected features using SelectKBest

18/ Tune Random Forest Hyperparameters with Grid Search

19/ Tune Extra Trees Model (Ensemble Method) with Grid Search

20/ XGBoost Model

21/ Deep Learning - H2O and LIME

22/ Random Forest with Pipelines
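
A minimal scikit-learn sketch of how several of the Python topics fit together (a baseline logistic model, a train/test split, SelectKBest with a grid-searched decision tree, and a random forest inside a pipeline) might look as follows. It assumes scikit-learn is installed and uses a synthetic dataset from make_classification as a stand-in for the engineered Titanic features; the parameter grids, step names and variable names are illustrative choices, not the workshop's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned feature table (not real Titanic data).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5, random_state=0
)

# Hold out a quarter of the rows, keeping the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Baseline: a plain logistic regression on all features.
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
baseline_acc = baseline.score(X_test, y_test)

# Decision tree with feature selection, tuned by grid search.
# Both the number of kept features and the tree depth are searched together.
tree_pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("model", DecisionTreeClassifier(random_state=0)),
])
tree_grid = GridSearchCV(
    tree_pipe,
    param_grid={"select__k": [4, 8, 12], "model__max_depth": [2, 4, 6]},
    cv=5,
    scoring="accuracy",
)
tree_grid.fit(X_train, y_train)
tree_acc = tree_grid.score(X_test, y_test)

# Random forest wrapped in a pipeline so selection and model travel together.
forest_pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=8)),
    ("model", RandomForestClassifier(n_estimators=100, random_state=0)),
])
forest_pipe.fit(X_train, y_train)
forest_acc = forest_pipe.score(X_test, y_test)

print("logistic:", round(baseline_acc, 3))
print("tree (best params):", tree_grid.best_params_, round(tree_acc, 3))
print("forest pipeline:", round(forest_acc, 3))
```

Putting SelectKBest inside the pipeline means the feature selection is refit on the training part of every cross-validation fold, so the held-out fold does not influence which features are kept.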

Learning Outcome

1) A deeper understanding of Feature Engineering and sharper thinking about features

2) A visual aid for what Feature Engineering looks like when it is written in code

3) Exposure to building thinking around Feature Engineering

4) Learning how to create and evaluate different models and choose the best one

5) Exposure to the pros and cons of different machine learning techniques

6) Basic understanding of how to code and implement Machine Learning and Deep Learning models

7) Using Pipelines to build models more accurately for deployment

Target Audience

1) Anyone who intends to learn Machine Learning

2) Learners who have just finished an online or classroom course on Machine Learning

3) Learners who want to practice what they have learned

4) Learners who want to gain a deeper understanding of Feature Engineering and how various models are used

5) Curious trainers who want to teach better

Prerequisites for Attendees

1) This is a hands-on session, so carrying a laptop is mandatory

2) Anaconda (for Python) and Excel installed on the laptop

3) Entered the Titanic Competition on Kaggle (with datasets downloaded)

4) Eagerness to gain a deeper understanding of Feature Engineering

5) Readiness to build many Machine Learning models in a power-packed session



Submitted 1 year ago