
01 Data Sources and Databases

MySQL Workbench Community Edition

https://dev.mysql.com/downloads/workbench/
MySQL Workbench Community Edition 8.0.34
It crashes constantly on macOS 14.2.1, so I do not recommend it.

Recommended instead: https://github.com/Sequel-Ace/Sequel-Ace
It has fewer features, but it is quite stable.

Relational Databases

One-to-Many

[Figure: one-to-many relationship diagram; image not available]

In the diagram, the markers next to "Patient ID" mean:
* denotes a primary key
** denotes a foreign key

A primary key consists of one or more columns. Its value must be unique, and it cannot be NULL.
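The primary key and foreign key rules above can be tried out with a minimal sketch. It uses Python's built-in `sqlite3` module rather than MySQL so that it runs without a server, and the `patients` and `appointments` tables with their columns are hypothetical names chosen for illustration, not the schema from the lost diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical one-to-many schema: one patient has many appointments.
conn.executescript("""
CREATE TABLE patients (
    patient_id INTEGER NOT NULL PRIMARY KEY,   -- primary key (*)
    name       TEXT NOT NULL
);
CREATE TABLE appointments (
    appointment_id INTEGER NOT NULL PRIMARY KEY,
    patient_id     INTEGER NOT NULL
                   REFERENCES patients (patient_id),  -- foreign key (**)
    visit_date     TEXT NOT NULL
);
""")

conn.execute("INSERT INTO patients VALUES (1, 'Patient A')")
conn.executemany(
    "INSERT INTO appointments VALUES (?, ?, ?)",
    [(10, 1, "2024-01-05"), (11, 1, "2024-02-09")],
)


def is_rejected(sql):
    """Return True if the database refuses the statement as a constraint violation."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A second patient with the same primary key value violates uniqueness.
duplicate_key_rejected = is_rejected("INSERT INTO patients VALUES (1, 'Patient B')")
# An appointment pointing at a patient that does not exist violates the foreign key.
orphan_rejected = is_rejected("INSERT INTO appointments VALUES (12, 99, '2024-03-01')")

visits = conn.execute(
    "SELECT COUNT(*) FROM appointments WHERE patient_id = 1"
).fetchone()[0]
print(duplicate_key_rejected, orphan_rejected, visits)
```

The same `CREATE TABLE` statements should carry over to MySQL with minor type changes, but that is an assumption and not something these notes verify.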

NOTE
NULL means empty (no value at all), not blank. A field that holds a blank space is not NULL.
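To make the difference between NULL and blank concrete, here is a small sketch using Python's built-in `sqlite3` module. The `patients` table and its `phone` column are hypothetical, invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, phone TEXT)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?)",
    [
        (1, None),  # NULL: no value was recorded at all
        (2, ""),    # blank: an empty string is still a value
        (3, " "),   # blank: a single space is still a value
    ],
)


def ids(where_clause):
    """Return the patient IDs matching a WHERE clause, in ascending order."""
    sql = f"SELECT patient_id FROM patients WHERE {where_clause} ORDER BY patient_id"
    return [row[0] for row in conn.execute(sql)]


null_ids = ids("phone IS NULL")           # only the truly empty row
blank_ids = ids("TRIM(phone) = ''")       # blank strings; NULL does not match
equals_null_ids = ids("phone = NULL")     # comparing with = NULL never matches

print(null_ids, blank_ids, equals_null_ids)
```

Note that NULL has to be tested with `IS NULL`; an ordinary `=` comparison against NULL does not evaluate to true for any row.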

Many-to-Many

[Figure: many-to-many relationship diagram; image not available]

A junction table is used to record a many-to-many relationship.
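A junction table can be sketched as follows, again with Python's built-in `sqlite3` module. The entities (patients and doctors) and the table name `patient_doctor` are assumptions made for illustration, since the original diagram is not available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE doctors  (doctor_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Junction table: each row links one patient to one doctor.
-- The composite primary key prevents the same pair from being stored twice.
CREATE TABLE patient_doctor (
    patient_id INTEGER NOT NULL REFERENCES patients (patient_id),
    doctor_id  INTEGER NOT NULL REFERENCES doctors (doctor_id),
    PRIMARY KEY (patient_id, doctor_id)
);

INSERT INTO patients VALUES (1, 'Patient A'), (2, 'Patient B');
INSERT INTO doctors  VALUES (1, 'Doctor X'), (2, 'Doctor Y');
INSERT INTO patient_doctor VALUES (1, 1), (1, 2), (2, 1);
""")


def doctors_of(patient_id):
    """Names of all doctors linked to one patient."""
    sql = """
        SELECT d.name
        FROM patient_doctor AS pd
        JOIN doctors AS d ON d.doctor_id = pd.doctor_id
        WHERE pd.patient_id = ?
        ORDER BY d.name
    """
    return [row[0] for row in conn.execute(sql, (patient_id,))]


def patients_of(doctor_id):
    """Names of all patients linked to one doctor."""
    sql = """
        SELECT p.name
        FROM patient_doctor AS pd
        JOIN patients AS p ON p.patient_id = pd.patient_id
        WHERE pd.doctor_id = ?
        ORDER BY p.name
    """
    return [row[0] for row in conn.execute(sql, (doctor_id,))]


print(doctors_of(1))   # one patient, many doctors
print(patients_of(1))  # one doctor, many patients
```

Reading the junction table in either direction gives one side of the relationship, which is why neither `patients` nor `doctors` needs a foreign key column of its own.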

1.4 Dimensional Data Warehouses

Data warehouses are usually designed with dimensional modeling.
Dimensional modeling splits the data into "facts" and "dimensions".

[Figure: star schema for doctor appointment data; image not available]

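As the figure above shows, we convert the doctor appointment data into a star schema, with an appointment fact table at the center that records the appointments.
This "appointment fact" can then be sliced along several dimensions for analysis: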

  • Patient dimension
  • Patient age dimension
  • Doctor dimension
  • Date dimension
  • Time dimension
  • Appointment reason dimension

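For example, by combining the doctor dimension with the patient age dimension, we can later analyze which patient age group a given doctor is most popular with.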
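The doctor-by-age-group analysis described above might look like the following sketch. It uses Python's built-in `sqlite3` module, and every table name, column name, and row of data here is hypothetical; only two of the six dimensions are modeled to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Two dimension tables (hypothetical names and contents).
CREATE TABLE doctor_dim      (doctor_key INTEGER PRIMARY KEY, doctor_name TEXT NOT NULL);
CREATE TABLE patient_age_dim (age_key    INTEGER PRIMARY KEY, age_group   TEXT NOT NULL);

-- Fact table: one row per appointment, pointing at each dimension.
CREATE TABLE appointment_fact (
    appointment_id INTEGER PRIMARY KEY,
    doctor_key     INTEGER NOT NULL REFERENCES doctor_dim (doctor_key),
    age_key        INTEGER NOT NULL REFERENCES patient_age_dim (age_key)
);

INSERT INTO doctor_dim VALUES (1, 'Doctor X'), (2, 'Doctor Y');
INSERT INTO patient_age_dim VALUES (1, '0-17'), (2, '18-64'), (3, '65+');
INSERT INTO appointment_fact VALUES
    (1, 1, 1), (2, 1, 1), (3, 1, 2),
    (4, 2, 3), (5, 2, 3), (6, 2, 3), (7, 2, 2);
""")

# Join the fact table to both dimensions and count appointments per combination.
rows = conn.execute("""
    SELECT d.doctor_name, a.age_group, COUNT(*) AS appointment_count
    FROM appointment_fact AS f
    JOIN doctor_dim      AS d ON d.doctor_key = f.doctor_key
    JOIN patient_age_dim AS a ON a.age_key    = f.age_key
    GROUP BY d.doctor_name, a.age_group
    ORDER BY d.doctor_name, appointment_count DESC
""").fetchall()

# Rows are sorted by count within each doctor, so the first row seen
# for a doctor is that doctor's most frequent age group.
top_age_group = {}
for doctor_name, age_group, _count in rows:
    top_age_group.setdefault(doctor_name, age_group)

print(top_age_group)
```

Adding another dimension such as date or appointment reason would follow the same pattern: one more key column in the fact table and one more `JOIN` in the query.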

1.5 Asking Questions About the Data Source

This section of the book is a list of example questions; read through it on your own.

1.6 The Farmers' Market Database

Download and import it yourself:
https://www.flag.com.tw/bk/st/F3234

1.7 Data Science Terminology

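  • The combination of values in one row can serve as input for training a model; such a row is usually called a training instance.
  • Each column of a row is a feature.
  • The output feature that the model tries to predict is called the target variable.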
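The three terms can be illustrated with a tiny dataset in plain Python. The column names (`age`, `visits_last_year`, `no_show`) and the values are made up for this sketch, and choosing `no_show` as the target is an assumption for the example.

```python
# Each dict is one row of a query result, i.e. one training instance.
rows = [
    {"age": 34, "visits_last_year": 2, "no_show": 0},
    {"age": 71, "visits_last_year": 6, "no_show": 0},
    {"age": 25, "visits_last_year": 0, "no_show": 1},
]

# The column the model should predict is the target variable.
target_name = "no_show"

# Every other column is an input feature.
feature_names = [column for column in rows[0] if column != target_name]

# X holds the feature values of each instance; y holds the matching targets.
X = [[row[column] for column in feature_names] for row in rows]
y = [row[target_name] for row in rows]

print(feature_names)
print(X)
print(y)
```

Most machine learning libraries expect input in roughly this shape: a table of feature values with one row per instance, plus a separate sequence of target values.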