Data Analysis
Machine Learning
AUC = 0.5 (no discrimination)
0.7 ≤ AUC < 0.8 (acceptable discrimination)
0.8 ≤ AUC < 0.9 (excellent discrimination)
0.9 ≤ AUC ≤ 1.0 (outstanding discrimination)
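As a rough illustration of where an AUC value comes from, the sketch below computes it as the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. The function names auc_score and describe_auc and the toy data are made up for this example. Boundary values are assigned to the higher band, which is an assumption, and values outside the bands listed above are reported as not covered.

```python
def auc_score(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


def describe_auc(auc):
    """Map an AUC value to the bands listed above (boundaries go to the higher band)."""
    if auc >= 0.9:
        return "outstanding"
    if auc >= 0.8:
        return "excellent"
    if auc >= 0.7:
        return "acceptable"
    if auc == 0.5:
        return "no discrimination"
    return "not covered by the listed bands"


# Toy data: 1 = positive class, 0 = negative class
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auc_score(labels, scores)
print(auc, describe_auc(auc))  # 0.75 acceptable
```

The pairwise loop is quadratic in the number of examples, so it is only meant for reading, not for large datasets.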
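k-nearest neighbors (k-NN) can be used for both classification and regression
Usually used for classification
Non-parametric
Only the value of k needs to be chosen
k: the number of nearest neighbors that vote on the class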
k = 1
k = 3
k = 7
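To show how the choice of k can change the result, here is a minimal sketch of a majority-vote nearest-neighbor classifier. The function name knn_predict, the class labels "A" and "B", and the toy points are all invented for this illustration, and Euclidean distance is assumed.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # Rank training examples by Euclidean distance to the query point
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in ranked[:k]]
    # The most common label among the k nearest neighbors wins
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data, chosen so that the vote flips as k grows
points = [(0.5, 0), (1, 0), (0, 1.2), (2, 0), (0, 2.5), (3, 0), (0, 3.5)]
labels = ["A", "B", "B", "A", "A", "A", "B"]
query = (0, 0)

for k in (1, 3, 7):
    print(k, knn_predict(points, labels, query, k))
# 1 A
# 3 B
# 7 A
```

An odd k with two classes avoids tied votes, which is one common reason values such as 1, 3 and 7 are used.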
Manhattan distance (L1 distance)
Euclidean distance (L2 distance)
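The two distances can be sketched as follows. Manhattan distance sums the absolute coordinate differences, and Euclidean distance is the square root of the summed squared differences. The function names are chosen for this example only.

```python
import math


def manhattan_distance(a, b):
    """L1 distance: sum of absolute differences per coordinate."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """L2 distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


p, q = (0, 0), (3, 4)
print(manhattan_distance(p, q))  # 7
print(euclidean_distance(p, q))  # 5.0
```

For the same pair of points the Manhattan distance is never smaller than the Euclidean distance, so the two can rank neighbors differently.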
https://www.kaggle.com/c/titanic
It is your job to predict if a passenger survived the sinking of the Titanic or not.
For each PassengerId in the test set, you must predict a 0 or 1 value for the Survived variable.
Your score is the percentage of passengers you correctly predict. This is known as accuracy.
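Accuracy as described here is the number of correct predictions divided by the total number of predictions. A minimal sketch, with a made-up function name and toy labels:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy example: three of the four predictions are right
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```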
You should submit a csv file with exactly 418 entries plus a header row. Your submission will show an error if you have extra columns (beyond PassengerId and Survived) or rows.
The file should have exactly 2 columns:
PassengerId (sorted in any order)
Survived (contains your binary predictions: 1 for survived, 0 for deceased)
PassengerId,Survived
892,0
893,1
894,0
Etc.
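A submission in this format could be produced along these lines. This is only a sketch: the function name build_submission is made up, the placeholder predictions mark every passenger as deceased, and the passenger IDs are assumed to run consecutively from 892, as the sample rows suggest.

```python
import csv
import io

EXPECTED_ROWS = 418  # number of entries required by the competition text


def build_submission(predictions, expected_rows=EXPECTED_ROWS):
    """Return CSV text: a header row plus one (PassengerId, Survived) line each."""
    if len(predictions) != expected_rows:
        raise ValueError(f"expected {expected_rows} rows, got {len(predictions)}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    for passenger_id, survived in predictions:
        if survived not in (0, 1):
            raise ValueError(f"Survived must be 0 or 1, got {survived!r}")
        writer.writerow([passenger_id, survived])
    return buffer.getvalue()


# Placeholder predictions (assumption: IDs are consecutive starting at 892)
demo_predictions = [(892 + i, 0) for i in range(EXPECTED_ROWS)]
csv_text = build_submission(demo_predictions)
print(csv_text.splitlines()[:3])  # ['PassengerId,Survived', '892,0', '893,0']
```

To produce a file for upload, the returned text can be written out with open("submission.csv", "w").write(csv_text), where the file name is arbitrary.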