###### tags: `Elective`

# Big Data Analysis and Applications

Instructor: Professor 吳帆

:::spoiler Must-know for the exam
:::

---

:::spoiler Assignments
1. 3C analysis
2. On-stage presentation, 5 min (上台5分)
:::

---

:::spoiler Background knowledge
[Introduction to data analysis: understanding the essence of analysis and basic thinking skills](https://procrustes.info/analysis-basic/)
:::

---

## Chapters

### 2022/02/19 Chap1-Introduction to Big Data Analysis and Applications
### 2022/03/05 Chap2-Database vs. Data Warehouse
### 2022/03/19 Chap3

---

:::spoiler 2022/02/19 Chap1-Introduction to Big Data Analysis and Applications

#### Mental models: how you see the world (the same data can be interpreted differently)
* Your statistical model depends on your mental model.

![](https://i.imgur.com/iAtLJHA.png)

---

#### Data analysis activities
1. Define: define the problem
2. Disassemble: break the problem down
3. Evaluate: assess the pieces
4. Decide: make a decision

![](https://i.imgur.com/q3wKGD7.png)

---

#### Furthermore: raw data

![](https://i.imgur.com/FqTpySx.png)

---

:::

---

:::spoiler 2022/03/05 Chap2 Database vs. Data Warehouse

### Data Warehouses

![](https://i.imgur.com/8e3e9mi.png)

* Time-variant
    * Data is stored to provide information from a historical perspective
* Nonvolatile
    * Requires only two operations: initial loading of data and access of data

### Data Warehouse Architecture

![](https://i.imgur.com/hMIDBcw.png)

### The lattice of a 4-dimensional cube

![](https://i.imgur.com/sFY74zf.png)

[Data cube](https://www.twblogs.net/a/5c8f4f70bd9eee35fc154770)

### Methods of data reduction
* Strategies of data reduction
    * data aggregation (e.g., sum, mean)
    * attribute subset selection (e.g., removing irrelevant attributes)
    * dimensionality reduction (e.g., using encoding schemes)
    * numerosity reduction (e.g., “replacing” the data by smaller representations such as clusters or parametric models)

:::

---

:::spoiler 2022/03/19 Chap3

## Data reduction: attribute subset selection

![](https://i.imgur.com/4nu0eWg.png)

## Decision tree

![](https://i.imgur.com/NLqiagn.png)

![](https://i.imgur.com/AT5icbl.png)

![](https://i.imgur.com/86wi4id.png)

![](https://i.imgur.com/gyePXxz.png)

![](https://i.imgur.com/HWdd8BO.png)

:::
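
---

As a supplement to the Chap2 notes on the lattice of a 4-dimensional cube and on data aggregation, the following is a minimal sketch rather than anything taken from the lecture. It enumerates every cuboid (every subset of the dimensions) and aggregates a toy fact table along one of them. The dimension names (`time`, `item`, `location`, `supplier`), the `sales` measure, and the rows are all hypothetical values chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical dimension names; the notes only say the cube has 4 dimensions.
DIMENSIONS = ("time", "item", "location", "supplier")

# Made-up toy fact table; "sales" is an assumed measure.
ROWS = [
    {"time": "Q1", "item": "phone", "location": "Taipei", "supplier": "A", "sales": 10},
    {"time": "Q1", "item": "laptop", "location": "Taipei", "supplier": "B", "sales": 20},
    {"time": "Q2", "item": "phone", "location": "Tainan", "supplier": "A", "sales": 5},
]


def cuboid_lattice(dimensions):
    """Group all cuboids by level, where level k groups by k dimensions."""
    return {
        k: list(combinations(dimensions, k))
        for k in range(len(dimensions) + 1)
    }


def aggregate(rows, group_by, measure="sales"):
    """Sum the measure over rows that share the same values for group_by."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in group_by)
        totals[key] += row[measure]
    return dict(totals)


lattice = cuboid_lattice(DIMENSIONS)
total_cuboids = sum(len(level) for level in lattice.values())

for k, cuboids in lattice.items():
    print(f"level {k}: {len(cuboids)} cuboid(s)")
print("total cuboids:", total_cuboids)

# Level 0 is the apex cuboid (no group-by); level 4 is the base cuboid.
print(aggregate(ROWS, ()))
print(aggregate(ROWS, ("time",)))
```

With n dimensions and no concept hierarchies the lattice has 2^n cuboids, so a 4-dimensional cube has 16. Grouping by fewer dimensions is also a simple case of data aggregation as a data reduction strategy: each higher level of the lattice summarizes the level below it.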