AWS Machine Learning Workshop

Apply AWS ML to problems for which you already have examples with known answers.
For example, to predict whether a new email is spam, you first need to collect examples of both spam and non-spam email.
Binary classification (true / false)
Is this email spam? Will this customer churn? Will this customer accept the campaign?
Multiclass classification (one of more than two outcomes)
Regression (a numeric value)
Building a Machine Learning Application
Frame the core ML problem
Collect, clean, and prepare data
Extract features from raw data
Feed the features to a learning algorithm to build models
Use the model to generate predictions for new data

The learning process computes one weight for each feature to form a model that can predict the target value.
For example, estimated target = 0.2 + 5 * age + 0.00003 * income
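
The formula above can be written as a small function. This is only a sketch: the function name estimate_target and the sample inputs (age 30, income 50,000) are made up for illustration, while the bias and weights are the ones from the example.

```python
def estimate_target(age, income, bias=0.2, w_age=5.0, w_income=0.00003):
    """Linear model: a bias term plus one learned weight per feature."""
    return bias + w_age * age + w_income * income


# Hypothetical customer: 30 years old with an income of 50,000.
estimate = estimate_target(age=30, income=50_000)
print(round(estimate, 2))
```

Training a model means finding the bias and weight values; prediction is just this weighted sum applied to new data.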

Learn the weights of the model
Loss function: the penalty incurred when the target estimated by the model does not equal the actual result.
Optimization technique: minimize the loss with Stochastic Gradient Descent (SGD). SGD makes several passes over the training data, and during each pass it updates the feature weights one example at a time, aiming to approach the optimal weights that minimize the loss.
For binary classification, Amazon ML uses logistic regression (logistic loss function + SGD).
For multiclass classification, Amazon ML uses multinomial logistic regression (multinomial logistic loss + SGD).
For regression, Amazon ML uses linear regression (squared loss function + SGD).
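
To make the loss function and SGD idea concrete, here is a minimal pure-Python sketch of logistic regression trained with SGD. It is not Amazon ML's implementation; the learning rate, number of passes, and the toy data set are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_loss(weights, bias, examples):
    """Average logistic loss over (features, label) pairs, label in {0, 1}."""
    total = 0.0
    for x, y in examples:
        p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
        p = min(max(p, 1e-12), 1 - 1e-12)  # keep log() away from zero
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(examples)


def sgd_train(examples, n_features, passes=50, learning_rate=0.1, seed=0):
    """Update the weights one example at a time over several passes."""
    rng = random.Random(seed)
    weights = [0.0] * n_features
    bias = 0.0
    data = list(examples)
    for _ in range(passes):
        rng.shuffle(data)
        for x, y in data:
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = p - y  # gradient of the logistic loss w.r.t. the linear score
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Toy data (hypothetical): one feature, label is 1 for the larger values.
toy = [([0.0], 0), ([1.0], 0), ([2.0], 1), ([3.0], 1)]
weights, bias = sgd_train(toy, n_features=1)
print(logistic_loss([0.0], 0.0, toy), "->", logistic_loss(weights, bias, toy))
```

The printed pair shows the loss before and after training; swapping the logistic loss for a squared loss on the raw linear score would give the regression variant.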

Download the samples from http://bit.ly/john-2017ml-labdata, create an S3 bucket, and upload the 3 CSV files into that bucket.
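
If you prefer scripting the upload over using the console, something like the following should work with boto3. Treat it as a sketch: the helper name upload_lab_files and the bucket name in the comment are hypothetical, and it assumes the three CSV files are already downloaded locally and that your AWS credentials are configured.

```python
import os


def upload_lab_files(s3_client, bucket, paths):
    """Create `bucket`, upload each local file under its base name, return the S3 URIs."""
    # Outside us-east-1, create_bucket also needs a CreateBucketConfiguration
    # with a LocationConstraint for your region.
    s3_client.create_bucket(Bucket=bucket)
    uris = []
    for path in paths:
        key = os.path.basename(path)
        s3_client.upload_file(path, bucket, key)
        uris.append(f"s3://{bucket}/{key}")
    return uris


# Example call (bucket name is a placeholder; requires boto3 and credentials):
#   import boto3
#   upload_lab_files(boto3.client("s3"), "my-ml-workshop-bucket",
#                    ["churn_new.csv", "banking.csv", "banking-batch.csv"])
```

The returned s3:// URIs are what you paste into the data source wizard in the next steps.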
churn_new.csv => create data source from S3 file link => create model => use custom recipe
The file has 3334 records with the columns "State,Account Length,Area Code,Phone,Intl Plan,VMail Plan,VMail Message,Day Mins,Day Calls,Day Charge,Eve Mins,Eve Calls,Eve Charge,Night Mins,Night Calls,Night Charge,Intl Mins,Intl Calls,Intl Charge,CustServ Calls,Churn?". Once you import it into AWS ML, the service builds a model that predicts whether a customer will leave or continue the subscription.
70% of the imported data is used to build the model, and the remaining 30% is used to evaluate its accuracy.
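
The 70/30 split can be pictured with a few lines of Python. This sketch assumes a sequential split (first 70% for training, the rest for evaluation); the helper names are hypothetical. The second helper builds the JSON string that the Amazon ML API accepts as a DataRearrangement when you create the two data sources yourself.

```python
import json


def sequential_split(rows, train_fraction=0.7):
    """Return (training_rows, evaluation_rows) without shuffling."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def rearrangement(percent_begin, percent_end):
    """DataRearrangement JSON selecting a percentage range of the input data."""
    return json.dumps(
        {"splitting": {"percentBegin": percent_begin, "percentEnd": percent_end}}
    )


# Stand-in for the 3334 records of churn_new.csv.
records = list(range(3334))
train_rows, eval_rows = sequential_split(records)
print(len(train_rows), len(eval_rows))
print(rearrangement(0, 70), rearrangement(70, 100))
```

If the records are ordered in some meaningful way (for example by date or by state), shuffling them before a sequential split is usually the safer choice.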
banking.csv => create data source from S3 file link => create model => use default
banking-batch.csv => create batch prediction from the model above
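
The batch prediction step can also be started from code instead of the console. The sketch below uses the boto3 machinelearning client; the function name, the ID and name values, and the schema and output locations are placeholders you would replace with your own. It assumes the model from banking.csv already exists and that a schema file for banking-batch.csv has been uploaded to S3.

```python
def start_batch_prediction(ml_client, model_id, batch_csv_uri, schema_uri, output_uri, run_id):
    """Register the batch file as a data source, then request batch predictions."""
    datasource_id = f"ds-{run_id}"
    ml_client.create_data_source_from_s3(
        DataSourceId=datasource_id,
        DataSourceName="banking-batch",
        DataSpec={
            "DataLocationS3": batch_csv_uri,
            "DataSchemaLocationS3": schema_uri,
        },
        ComputeStatistics=False,
    )
    response = ml_client.create_batch_prediction(
        BatchPredictionId=f"bp-{run_id}",
        BatchPredictionName="banking-batch-predictions",
        MLModelId=model_id,
        BatchPredictionDataSourceId=datasource_id,
        OutputUri=output_uri,
    )
    # Both calls are asynchronous: the results appear under OutputUri later.
    return response["BatchPredictionId"]


# Example call (all IDs and URIs are placeholders; requires boto3 and credentials):
#   import boto3
#   start_batch_prediction(boto3.client("machinelearning"), "ml-banking-model",
#                          "s3://my-ml-workshop-bucket/banking-batch.csv",
#                          "s3://my-ml-workshop-bucket/banking-batch.csv.schema",
#                          "s3://my-ml-workshop-bucket/output/", "001")
```

The output bucket needs a policy that lets Amazon ML write to it; the console sets this up for you, which is one reason the workshop uses the console.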

This 3-hour workshop is easy to follow and gives you a basic understanding of how to use the AWS Machine Learning service to automatically create a model, evaluate it, and call the API for predictions.
Prepare your data in CSV format and upload it to S3; AWS then handles the modeling and evaluation for you.
You can also import real production data from other sources, such as RDS or Redshift.
The visualizations make it easy to evaluate the model.
There are APIs for making predictions with the models you have created:
Batch prediction
Real-time prediction
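
For real-time prediction, a call through the boto3 machinelearning client looks roughly like this. It is a sketch under assumptions: the helper name predict_label is made up, and the model ID and endpoint URL in the comment are placeholders. A real-time endpoint must already have been created for the model.

```python
def predict_label(ml_client, model_id, endpoint_url, record):
    """Send one record to a real-time endpoint and return (label, scores)."""
    response = ml_client.predict(
        MLModelId=model_id,
        # The API expects every value in Record to be a string.
        Record={name: str(value) for name, value in record.items()},
        PredictEndpoint=endpoint_url,
    )
    prediction = response["Prediction"]
    return prediction.get("predictedLabel"), prediction.get("predictedScores", {})


# Example call (model ID and endpoint are placeholders; requires boto3 and credentials):
#   import boto3
#   client = boto3.client("machinelearning")
#   label, scores = predict_label(client, "ml-churn-model",
#                                 "https://realtime.machinelearning.us-east-1.amazonaws.com",
#                                 {"State": "KS", "Account Length": 128, "CustServ Calls": 1})
```

Use batch prediction when you have a whole file to score at once, and real-time prediction when an application needs an answer for a single record immediately.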
The hardest part is "How do you prepare your data and derive features from raw data?"
The AWS Machine Learning documentation is worth reading! It gives you a basic understanding of machine learning concepts and of how AWS implements them internally.

kuanhung c