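# 1.1 What is Machine Learning?

algorithm: a sequence of instructions that should be carried out to transform input to output.

> For example, in a sorting algorithm, the input is a set of numbers and the output is their ordered list.
>
> Several different algorithms may accomplish the same task, and we want to find the most efficient one. "Most efficient" may mean requiring the fewest instructions, the least memory, or both.

$\rightarrow$ For some tasks, however, no such algorithm may exist.

> Example:
> - Telling spam from legitimate email
>> We have the email itself (input), and by reading it we can tell whether it is spam (output), but we do not know how to transform this input into the output. Moreover, what counts as spam may change over time and differ from person to person.

$\rightarrow$ We would like to use millions of emails, some of which are spam and some of which are not, to "learn" what constitutes spam.

:::success
We want the computer (machine) to extract an algorithm for the task automatically.
:::

data mining: applying ML methods to large databases

> The term is an analogy to mining, where a large volume of earth is processed to extract a small amount of precious ore; in data mining, we process a large volume of data to build a simple model that is valuable.
>> - "Valuable" may mean, for example, having high predictive accuracy
>> - The range of applications is broad; in finance, for example, models built from past data are used for fraud detection

However, ++ML is not just a database problem; it is also related to AI++. To be intelligent, a system in a changing environment should have the ability to learn. If the system can learn and adapt to changes, its designer need not foresee every possible situation or provide a solution for each one. Take face recognition as an application of ML:

> We recognize family and friends effortlessly every day just by seeing their faces or photographs, despite differences in pose, lighting, or hairstyle. We do this unconsciously and cannot explain how we do it; because we cannot explain it, we cannot write a program for it either.
>
> We do know, however, that a face image is not a random collection of pixels. A face has a specific structure (eyes, nose, and mouth in particular positions), and it is symmetric.
>
> By analyzing sample face images of a person, a learning program can capture the pattern specific to that person's face, and then recognize the person by checking whether a given image fits this pattern.

ML is programming computers to optimize a performance criterion using example data or past experience. We have ++a model defined up to some parameters++, and learning is the execution of a computer program that uses training data or past experience to ++optimize these parameters++. The model may be predictive or descriptive: a predictive model makes predictions about the future, whereas a descriptive model increases our knowledge of the data (a model can, of course, be both at once).

ML uses the theory of statistics to build mathematical models, because the core task is making inference from a sample. Computer science plays two roles here:

1. Training: we need efficient algorithms to solve the optimization problem, and to store and process the large amount of data that is available.
2. Once a model has been learned, its representation and the algorithmic solution for inference must be efficient as well.
   > In certain applications, time / space complexity may be as important as predictive accuracy.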
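
The definition above (a model defined by parameters, with learning optimizing those parameters against a performance criterion on training data) can be made concrete with a small sketch. Everything specific here is an assumption chosen for illustration and is not prescribed by these notes: the linear model `w * x + b`, mean squared error as the performance criterion, plain gradient descent as the optimizer, the toy data, and the function names `predict`, `mean_squared_error`, and `fit`.

```python
def predict(params, x):
    """The model: a line defined by two parameters (w, b)."""
    w, b = params
    return w * x + b


def mean_squared_error(params, data):
    """The performance criterion: average squared prediction error."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)


def fit(data, learning_rate=0.05, steps=2000):
    """Learning: adjust (w, b) by gradient descent to reduce the error."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy training data (hypothetical), generated from y = 2x + 1.
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

params = fit(training_data)
print("learned parameters:", params)
print("training error:", mean_squared_error(params, training_data))
print("prediction for x = 4:", predict(params, 4.0))
```

In this sketch the learned model is predictive, since `predict` can be applied to an unseen input, and also descriptive, since the fitted `w` and `b` summarize how the outputs in the data depend on the inputs. The two computer-science concerns show up as the cost of `fit` (training) and the cost of `predict` (inference).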