AWS Machine Learning Workshop


Machine Learning Concepts

  • AWS ML applies to problems for which you already have samples with known answers.
  • For example, to predict whether a new email is spam, you first need to collect examples of both spam and non-spam email.
  • Binary classification (true / false)
  • Is this email spam or not? Will the customer churn? Will the customer accept the campaign?
  • Multiclass classification (one of more than two outcomes)
  • Regression (a numeric value)
  • Building a machine learning application:
  • Frame the core ML problem
  • Collect, clean, and prepare the data
  • Derive features from the raw data
  • Feed the features to a learning algorithm to build a model
  • Use the model to generate predictions for new data

Linear Models

  • The learning process computes one weight for each feature to form a model that can predict the target value.
  • For example, estimated target = 0.2 + 5 * age + 0.00003 * income
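
To make the formula concrete, here is a minimal sketch that evaluates the example model for one customer. The function name and the sample inputs (age 30, income 50,000) are assumptions chosen for illustration; only the intercept and the two weights come from the notes above.

```python
def estimate_target(age: float, income: float) -> float:
    """Evaluate the example linear model: one weight per feature plus an intercept."""
    intercept = 0.2          # constant term from the example
    weight_age = 5.0         # weight learned for the "age" feature
    weight_income = 0.00003  # weight learned for the "income" feature
    return intercept + weight_age * age + weight_income * income


# Hypothetical customer: 30 years old with an income of 50,000.
print(estimate_target(age=30, income=50_000))
```

Note how the weights compensate for feature scale: income is numerically much larger than age, so its weight is much smaller.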

Learning Algorithm

  • Learn the weights of the model
  • Loss function: the penalty incurred when the target estimated by the model does not equal the actual result.
  • Optimization technique: minimizes the loss. Stochastic Gradient Descent (SGD) makes several passes over the data and, during each pass, updates the feature weights one example at a time, with the aim of approaching the optimal weights that minimize the loss.
  • For binary classification, Amazon ML uses logistic regression (logistic loss function + SGD).
  • For multiclass classification, Amazon ML uses multinomial logistic regression (multinomial logistic loss + SGD).
  • For regression, Amazon ML uses linear regression (squared loss function + SGD)
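
The update rule described above can be sketched in a few lines of plain Python. This is an illustrative toy for the regression case (squared loss + SGD), not Amazon ML's actual implementation; the learning rate, the number of passes, and the toy data generated from y = 1 + 2x are all assumptions.

```python
import random


def sgd_linear_regression(examples, learning_rate=0.1, passes=500, seed=0):
    """Fit target ~ bias + sum(w_i * x_i) with squared loss, one example at a time."""
    rng = random.Random(seed)
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    data = list(examples)
    for _ in range(passes):
        rng.shuffle(data)  # visit the examples in a new random order on every pass
        for features, target in data:
            prediction = bias + sum(w * x for w, x in zip(weights, features))
            # Derivative of the loss 0.5 * (prediction - target)**2 w.r.t. the prediction.
            error = prediction - target
            bias -= learning_rate * error
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    return bias, weights


# Toy data (assumption): one feature x in [0, 1], target generated from y = 1 + 2 * x.
toy_examples = [([i / 10.0], 1.0 + 2.0 * (i / 10.0)) for i in range(11)]
bias, weights = sgd_linear_regression(toy_examples)
print(bias, weights)
```

On this noiseless toy data the learned bias and weight should end up close to 1 and 2. Swapping the squared loss for a logistic loss would give the binary classification variant listed above.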

Evaluate Model Accuracy

  • 70% of the data is used to build the model, and the remaining 30% is used for evaluation.
  • For binary classification, the accuracy metric is AUC; a score of 0.5 is almost the same as random guessing.
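
The following sketch shows a 70/30 split and why a score of 0.5 means "no better than guessing". It assumes the binary metric is AUC, which Amazon ML reports for binary models, and computes it directly from its definition: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The sequential split and the function names are assumptions for illustration.

```python
def split_70_30(records):
    """Sequential split: first 70% for training, last 30% for evaluation."""
    cut = len(records) * 70 // 100
    return records[:cut], records[cut:]


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


train, evaluation = split_70_30(list(range(10)))
print(len(train), len(evaluation))

# A model that gives every example the same score cannot rank anything.
print(auc([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5]))
# A model that scores every positive above every negative ranks perfectly.
print(auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))
```

The pairwise loop is quadratic, which is fine for a demonstration but too slow for large evaluation sets.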

Workshop

  • Download the samples from http://bit.ly/john-2017ml-labdata, create an S3 bucket, and upload the 3 CSV files into that bucket.
  • churn_new.csv => create a data source from the S3 file link => create a model => use a custom recipe
  • The file has 3334 records with the columns "State,Account Length,Area Code,Phone,Intl Plan, VMail Plan, VMail Message, Day Mins,Day Calls,Day Charge,Eve Mins,Eve Calls,Eve Charge,Night Mins,Night Calls,Night Charge,Intl Mins,Intl Calls,Intl Charge,CustServ Calls,Churn?". Once you import it into AWS ML, the service automatically builds a model that predicts whether a customer will leave or continue the subscription.
  • 70% of the imported data is used to build the model, and 30% is used to evaluate its accuracy.
  • banking.csv => create a data source from the S3 file link => create a model => use the default settings
  • banking-batch.csv => create a batch prediction from the model above
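
Before uploading a file to S3, it can help to sanity-check its header locally. The sketch below parses the churn header listed above with Python's standard `csv` module and locates the target column. The helper names are hypothetical, and treating "Churn?" as the target is an assumption based on what the model predicts.

```python
import csv
import io

# Header line copied from the workshop notes (spacing kept as-is).
HEADER = (
    "State,Account Length,Area Code,Phone,Intl Plan, VMail Plan, VMail Message, "
    "Day Mins,Day Calls,Day Charge,Eve Mins,Eve Calls,Eve Charge,"
    "Night Mins,Night Calls,Night Charge,Intl Mins,Intl Calls,Intl Charge,"
    "CustServ Calls,Churn?"
)


def parse_header(header_line):
    """Split a CSV header line into column names, dropping stray spaces."""
    reader = csv.reader(io.StringIO(header_line), skipinitialspace=True)
    return [name.strip() for name in next(reader)]


def find_target(columns, target="Churn?"):
    """Return the index of the target column, rejecting duplicate or missing names."""
    if len(set(columns)) != len(columns):
        raise ValueError("duplicate column names in header")
    if target not in columns:
        raise ValueError(f"target column {target!r} not found")
    return columns.index(target)


columns = parse_header(HEADER)
target_index = find_target(columns)
print(len(columns), "columns; target at index", target_index)
```

The same two helpers could be pointed at the first line of the real file, for example with `open("churn_new.csv").readline()`.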

Thoughts

  • This 3-hour workshop is easy and gives you a basic understanding of how to use the AWS Machine Learning service to automatically create a model, evaluate it, and call the API for predictions.
  • Prepare your data in CSV format and upload it to S3; AWS then takes care of the modeling and the evaluation results for you.
  • You can also import real production data from other sources, such as RDS and Redshift.
  • The visualizations make it easy to evaluate the model.
  • There are APIs for making predictions with the models you created:
  • Batch prediction
  • Real-time prediction
  • The hardest part is "How do you prepare your data and derive features from the raw data?"
  • The AWS Machine Learning documentation is worth reading! It gives you a basic understanding of machine learning concepts and of how AWS implements them internally.
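
As a rough sketch of the real-time prediction API mentioned above, the helper below wraps the `predict` call of the boto3 `machinelearning` client. The model ID, endpoint URL, and record values are hypothetical placeholders, and the sketch assumes a real-time endpoint has already been created for the model. The client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
def predict_churn(client, model_id, endpoint_url, record):
    """Call the Amazon ML real-time Predict API; return (label, scores)."""
    # The Predict API expects every feature value to be sent as a string.
    string_record = {name: str(value) for name, value in record.items()}
    response = client.predict(
        MLModelId=model_id,
        Record=string_record,
        PredictEndpoint=endpoint_url,
    )
    prediction = response["Prediction"]
    return prediction.get("predictedLabel"), prediction.get("predictedScores", {})


# Usage sketch (needs AWS credentials; every identifier below is a placeholder):
#
#   import boto3
#   client = boto3.client("machinelearning", region_name="us-east-1")
#   label, scores = predict_churn(
#       client,
#       model_id="ml-EXAMPLE",                       # hypothetical model ID
#       endpoint_url="https://realtime.example.invalid",  # hypothetical endpoint
#       record={"Day Mins": 265.1, "CustServ Calls": 1},
#   )
```

For binary models the response carries a predicted label plus a score for it; batch predictions go through a separate call that reads its input from S3 and writes the results back to S3.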
