Let’s make data science an easy takeaway. All you need is a little knowledge of Python, Python libraries, statistics, algebra, Jupyter Notebook, and other programming tools. Just fall in love with the journey you are about to start.
Table of Contents:
· Data Science Phases & My Familiarity With Concept
· Facing Emotions
· What is Machine Learning?
· Types of Machine Learning
· Metrics To Evaluate Machine Learning Algorithms Using Python
· Choosing An Algorithm for Machine Learning!
· Cheatsheet: Machine Learning Algorithms (Python & R Code)
· About the Author & Where to Find ME!
Data Science Phases & My Familiarity With Concept:
· Define Business Problem (Familiar with)
· Data Collection (Familiar with)
· Data Cleaning (Familiar with)
· Data Analysis (Familiar with)
· Predictive Analysis (NOW LEARNING)
· Validating Model (NEXT UP)
· Deployment (COMING SOON)
Facing Emotions!
It is easy to get overwhelmed when you learn that, with the elevator out of order, the only way to reach an apartment on the 12th floor is to take the stairs. Whether a task feels overwhelming depends on the person: one person may love it while another hates it. Emotions are temporary. Right now, I’m definitely overwhelmed by learning the Python code, terminology, and statistics associated with machine learning. WHY am I overwhelmed? I believe it is because it is all new and unknown to me at this moment. By the end of writing this blog, I’ll be a step closer to becoming a data scientist.
What is Machine Learning?
Machine learning draws on both math/statistics and computer science. Put simply, machine learning means teaching a machine by feeding it data, labeled or unlabeled, so that it can predict an outcome; the machine develops knowledge of the topic over time.
Types Of Machine Learning
“Supervised Learning: This algorithm includes a target/outcome variable (dependent variable) which is to be predicted from a given set of predictors (independent variables). Using these set of variables, a function can be generated to map inputs to desired outputs. The training process continues until the model achieves a desired level of accuracy on the training data. Examples of Supervised Learning: Regression, Decision Tree, Random Forest, KNN, Logistic Regression etc.
Unsupervised Learning: In this algorithm, there are no target or outcome variables to predict/estimate. Furthermore, this algorithm is used for clustering population in different groups, which is widely used for segmenting customers in different groups for specific intervention. Examples of Unsupervised Learning: Apriori algorithm, K-means.
Reinforcement Learning: Using this algorithm, the machine is trained to make specific decisions. It works this way: the machine is exposed to an environment where it trains itself continually using trial and error. This machine learns from past experience and tries to capture the best possible knowledge to make accurate business decisions. Example of Reinforcement Learning: Markov Decision Process”
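To make the difference between the first two types concrete for myself, here is a minimal sketch using scikit-learn. The dataset (Iris) and the two models (logistic regression and k-means) are my own picks for illustration, not something the quote above prescribes. The supervised model is given both the predictors and the target, while the unsupervised model only ever sees the predictors.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Iris: 150 flowers, 4 measurements each (predictors X) and a species label (target y).
# The dataset choice is an assumption made for illustration.
X, y = load_iris(return_X_y=True)

# Supervised learning: the model is fitted on predictors AND the target.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)
supervised_predictions = clf.predict(X[:5])

# Unsupervised learning: the model is fitted on predictors only
# and groups similar rows into clusters.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

print("Predicted species for the first 5 rows:", supervised_predictions)
print("Cluster ids for the first 5 rows:", cluster_ids[:5])
```

Note that the cluster ids are arbitrary labels: cluster 0 does not have to line up with species 0.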
Metrics To Evaluate Machine Learning Algorithms Using Python
Before we move on to the process of choosing an algorithm, it is important to note that the goal of our metrics, once we have created a model, is to evaluate whether it is a “good” model compared to the others you could use.
Classification metrics:
· Classification Accuracy
· Log Loss
· Area Under ROC Curve
· Confusion Matrix (Method for classification prediction results)
· Classification Report (Method for classification prediction results)
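Here is a rough sketch of how the classification metrics above can be computed with `sklearn.metrics`. The breast cancer dataset, the logistic regression model, and the 75/25 split are assumptions I picked just to have something to score.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    log_loss,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Binary classification data (an illustrative choice, not from the original post).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Scale the features, then fit a simple classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)         # hard class labels
y_prob = model.predict_proba(X_test)   # class probabilities, shape (n_samples, 2)

accuracy = accuracy_score(y_test, y_pred)      # share of correct predictions
loss = log_loss(y_test, y_prob)                # penalizes confident wrong probabilities
auc = roc_auc_score(y_test, y_prob[:, 1])      # uses the positive-class probability
cm = confusion_matrix(y_test, y_pred)          # rows: true class, columns: predicted class
report = classification_report(y_test, y_pred) # precision, recall, F1 per class

print("Accuracy:", accuracy)
print("Log loss:", loss)
print("ROC AUC:", auc)
print("Confusion matrix:\n", cm)
print(report)
```

Accuracy and the confusion matrix work on the predicted labels, while log loss and ROC AUC need the predicted probabilities.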
Regression Metrics:
· Mean Absolute Error
· Mean Squared Error
· R²
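The regression metrics can be sketched the same way. Again, the diabetes dataset and the plain linear regression are assumptions chosen only for illustration.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Regression data with a numeric target (an illustrative choice).
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)  # average size of the errors, in target units
mse = mean_squared_error(y_test, y_pred)   # squares the errors, so large misses count more
r2 = r2_score(y_test, y_pred)              # share of variance explained; 1.0 is a perfect fit

print("MAE:", mae)
print("MSE:", mse)
print("R^2:", r2)
```

Lower is better for MAE and MSE, while higher is better for R².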
Validating Results for Clustering:
· Internal validation, which revolves around the following metrics: cohesion within each cluster & separation between different clusters
· External validation (comparing the clusters against known labels, when you have them)
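As a small sketch of both kinds of validation: the silhouette score is one common internal measure that combines cohesion and separation, and the adjusted Rand index is one external measure that compares clusters to known labels. Both metric choices, the synthetic blobs, and the hand-picked centers are my assumptions for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Synthetic data: three well-separated groups with known "true" labels.
# The centers are hand-picked for illustration.
X, true_labels = make_blobs(
    n_samples=300,
    centers=[[0, 0], [5, 5], [-5, 5]],
    cluster_std=0.6,
    random_state=0,
)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Internal validation: uses only X and the cluster assignments.
# Ranges from -1 to 1; higher means tighter, better-separated clusters.
silhouette = silhouette_score(X, cluster_ids)

# External validation: compares the clusters against the known labels.
# 1.0 means the grouping matches perfectly, regardless of how the ids are named.
ari = adjusted_rand_score(true_labels, cluster_ids)

print("Silhouette score:", silhouette)
print("Adjusted Rand index:", ari)
```

In real projects the true labels are usually missing, which is why internal validation tends to be the one you can always run.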
Choosing An Algorithm for Machine Learning!
Here is how I think about the process:
· Think about the problem & what you are looking to predict (Are you looking to predict a number, classify something, etc.)
· Import Python packages
· Pick your 1st model. Algorithm types: Supervised Learning, Unsupervised Learning, and Reinforcement Learning
· Split the data: test & train (Split 1st to avoid data leakage)
· Scale data (ONLY x values)
· Cross validate
· Fit your model (regularization happens here)
· Predict
· Check metrics & evaluate (the metrics used to evaluate performance differ for each model type)
· (Optional) Compare your model by running another algorithm of the same machine learning type & repeat the steps above to compare the evaluation. You may tune the hyperparameters and repeat the same process until you achieve the desired performance. Your final model selection will depend on the optimal evaluation metrics for the chosen model and problem.
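The steps above can be sketched end to end as follows. This is only one way to do it, under my own assumptions: a classification problem on the wine dataset, logistic regression as the 1st model, kNN as the comparison model, and a scikit-learn pipeline so the scaler only ever touches the x values of the training data.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Problem: classify a wine into one of three classes (illustrative dataset).
X, y = load_wine(return_X_y=True)

# Split first, so nothing about the test set leaks into scaling or training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Two supervised candidates. Each pipeline scales X (never y) and then fits the model.
# C controls regularization strength for logistic regression (smaller C = stronger).
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(C=1.0, max_iter=1000)
    ),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

results = {}
for name, model in candidates.items():
    # Cross validate on the training data only; the scaler is refit inside each fold.
    cv_scores = cross_val_score(model, X_train, y_train, cv=5)

    # Fit on the full training set, then predict on the held-out test set.
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    results[name] = {
        "cv_mean": cv_scores.mean(),
        "test_accuracy": accuracy_score(y_test, y_pred),
    }

for name, scores in results.items():
    print(name, scores)
```

Putting the scaler inside the pipeline is a design choice: it keeps the "split 1st" rule true even during cross validation, because each fold fits its own scaler on its own training portion.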
Cheatsheet: Machine Learning Algorithms (Python & R Code)
Source: Includes algorithms for Linear Regression, Logistic Regression, Decision Tree, SVM (Support Vector Machine), Naive Bayes, kNN (k-Nearest Neighbors), k-Means, Random Forest, Dimensionality Reduction Algorithms, Gradient Boosting & AdaBoost