AWS Machine Learning Workshop


Machine Learning Concepts

  • Apply AWS ML to problems for which you already have samples with the actual answers.
  • For example, to predict whether a new email is spam or not, you need to collect examples of both spam and non-spam.
  • Binary classification (true / false): is an email spam or not, will a customer churn, will a customer accept a campaign?
  • Multiclass classification (one of more than two outcomes)
  • Regression (a numeric value)
  • Building a machine learning application involves the following steps:
  • Frame the core ML problem
  • Collect, clean and prepare the data
  • Create features from the raw data
  • Feed the features to a learning algorithm to build models
  • Use the model to generate predictions for new data

Linear Models

  • The learning process computes one weight for each feature to form a model that can predict the target value.
  • For example, estimated target = 0.2 + 5 * age + 0.00003 * income
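
As a minimal sketch of the formula above, the model is just a bias term plus one weight per feature. The function name and the sample inputs below are made up for illustration; only the weights come from the example.

```python
def estimate_target(age: float, income: float) -> float:
    """Linear model from the example: a bias term plus one weight per feature."""
    bias = 0.2
    weights = {"age": 5.0, "income": 0.00003}
    return bias + weights["age"] * age + weights["income"] * income


# Hypothetical customer: 30 years old with an income of 50,000.
print(estimate_target(age=30, income=50_000))
```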

Learning Algorithm

  • Learn the weights of the model
  • Loss function: the penalty incurred when the target estimated by the model does not equal the actual result.
  • Optimization technique: minimizes the loss. Stochastic Gradient Descent (SGD) makes several passes over the data and, during each pass, updates the feature weights one example at a time, aiming to approach the optimal weights that minimize the loss.
  • For binary classification, Amazon ML uses logistic regression (logistic loss function + SGD).
  • For multiclass classification, Amazon ML uses multinomial logistic regression (multinomial logistic loss + SGD).
  • For regression, Amazon ML uses linear regression (squared loss function + SGD)
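
To make the idea of "logistic loss + SGD" concrete, here is a small pure-Python sketch. It is not Amazon ML's actual implementation: the learning rate, the number of passes, the shuffling and the toy data are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(bias, weights, features):
    """Probability of the positive class under the current weights."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_logistic_sgd(examples, n_features, passes=50, learning_rate=0.1, seed=0):
    """Learn a bias and one weight per feature, updating after every single example."""
    rng = random.Random(seed)
    weights = [0.0] * n_features
    bias = 0.0
    data = list(examples)
    for _ in range(passes):
        rng.shuffle(data)  # visit the examples in a new order on each pass
        for features, label in data:
            # Gradient of the logistic loss with respect to the linear score.
            error = predict_probability(bias, weights, features) - label
            bias -= learning_rate * error
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    return bias, weights


# Toy data (hypothetical): one feature, label 1 when the feature is large.
toy = [([0.0], 0), ([1.0], 0), ([2.0], 0), ([3.0], 1), ([4.0], 1), ([5.0], 1)]
bias, weights = train_logistic_sgd(toy, n_features=1, passes=200)
print(predict_probability(bias, weights, [0.0]), predict_probability(bias, weights, [5.0]))
```

Swapping the logistic loss for a squared loss (and dropping the sigmoid) would turn the same loop into SGD for linear regression.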

Evaluate Model Accuracy

  • 70% of the data is used to build the model and 30% is used for evaluation.
  • For binary classification, an AUC score of 0.5 is about the same as random guessing.
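
The two points above can be sketched in a few lines of Python. This is only an illustration: shuffling before the split and the pairwise AUC computation are assumptions of this sketch, not a description of what Amazon ML does internally.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle the rows and keep 70% for training, 30% for evaluation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Chance that a random positive is scored above a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


train, evaluation = split_70_30(range(10))
print(len(train), len(evaluation))

# A model that gives every record the same score cannot rank at all.
print(auc([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5]))
```

A model that ranks every positive above every negative reaches 1.0, while one that cannot tell them apart sits at 0.5, which is why 0.5 is the baseline to beat.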

Workshop

  • Download the samples from http://bit.ly/john-2017ml-labdata, create an S3 bucket and upload the 3 CSV files into that bucket.
  • churn_new.csv => create a data source from the S3 file link => create a model => use a custom recipe
  • churn_new.csv has 3334 records with the columns "State,Account Length,Area Code,Phone,Intl Plan, VMail Plan, VMail Message, Day Mins,Day Calls,Day Charge,Eve Mins,Eve Calls,Eve Charge,Night Mins,Night Calls,Night Charge,Intl Mins,Intl Calls,Intl Charge,CustServ Calls,Churn?". Once you import them into AWS ML, it automatically builds a model that predicts whether a customer will leave or continue the subscription.
  • 70% of the imported data is used to build the model and 30% is used to evaluate its accuracy.
  • banking.csv => create a data source from the S3 file link => create a model => use default
  • banking-batch.csv => create a batch prediction from the model above
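
The workshop does all of this in the console, but the same steps can presumably be scripted with the boto3 `machinelearning` client. The sketch below only builds the request parameters for `create_data_source_from_s3` and `create_ml_model`; the bucket name, the IDs and the schema file location are hypothetical, and the calls are only sent if you invoke `submit` yourself with valid AWS credentials.

```python
import json


def data_source_request(source_id, s3_csv_uri, schema_uri, percent_begin, percent_end):
    """Parameters for create_data_source_from_s3, limited to a slice of the CSV."""
    rearrangement = json.dumps(
        {"splitting": {"percentBegin": percent_begin, "percentEnd": percent_end}}
    )
    return {
        "DataSourceId": source_id,
        "DataSourceName": source_id,
        "DataSpec": {
            "DataLocationS3": s3_csv_uri,
            "DataSchemaLocationS3": schema_uri,  # hypothetical schema file location
            "DataRearrangement": rearrangement,
        },
        "ComputeStatistics": True,  # needed for data sources used in training
    }


def model_request(model_id, training_source_id):
    """Parameters for create_ml_model; churn prediction is a binary problem."""
    return {
        "MLModelId": model_id,
        "MLModelName": model_id,
        "MLModelType": "BINARY",
        "TrainingDataSourceId": training_source_id,
    }


def submit(train_request, model_req):
    """Send the requests to Amazon ML (requires boto3 and AWS credentials)."""
    import boto3

    client = boto3.client("machinelearning")
    client.create_data_source_from_s3(**train_request)
    client.create_ml_model(**model_req)


# Hypothetical bucket and IDs: the first 70% of churn_new.csv is used for training.
bucket = "s3://my-ml-workshop-bucket"
train_request = data_source_request(
    "churn-train", f"{bucket}/churn_new.csv", f"{bucket}/churn_new.csv.schema", 0, 70
)
print(json.dumps(train_request, indent=2))
print(model_request("churn-model", "churn-train"))
```

A second data source with `percent_begin=70` and `percent_end=100` would cover the 30% held back for evaluation.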

Thoughts

  • This 3-hour workshop is easy and gives you a basic understanding of how to use the AWS Machine Learning service to automatically create a model, evaluate it and call the API for predictions.
  • Prepare your data in CSV format and upload it to S3; AWS then takes care of the modeling and produces the evaluation results for you.
  • You can also import real production data from other sources such as RDS and Redshift.
  • The visualization makes it easy to evaluate the model.
  • There are APIs for making predictions with the models you created:
  • Batch prediction
  • Real-time prediction
  • The hardest part is "How do you prepare your data and create features from the raw data?"
  • The AWS Machine Learning documentation is worth reading! It gives you a basic understanding of machine learning concepts and of how AWS implements them internally.
