# Intermediate Python: Machine Learning
https://hackmd.io/@k8hertweck/RML
**Sign in to each class meeting** [here](https://goo.gl/forms/j4MbWJuPoIYeJET12)
This page is for easy access to links we'll use during class. Bookmark it for future reference.
Have you installed Anaconda and run the conda script to add plotnine? Instructions [here](http://www.fredhutch.io/software/#python-jupyter-notebooks)
If you have feedback about this course, please [comment here](https://goo.gl/forms/Bw8dTV0Wghq2iG5i2)
Complete class notes [here](https://github.com/fredhutchio/python_machine_learning)
If you're having trouble viewing a notebook directly on GitHub, try pasting the URL into this [Jupyter Notebook Viewer](https://nbviewer.jupyter.org/) to render the notebook as a static webpage.
_**Note on directory structure**_: Whatever you name your directory for this course, make sure it has these three directories inside: `data/`, `img/`, and `notebooks/`
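If you'd rather create the three directories from Python than by hand, the sketch below is one way to do it. The function name `make_course_dirs` and the example directory name `python_ml_course` are placeholders chosen for illustration, not names used in the course materials.

```python
from pathlib import Path

# The three subdirectories named in the note on directory structure.
SUBDIRS = ("data", "img", "notebooks")


def make_course_dirs(course_dir):
    """Create course_dir with data/, img/, and notebooks/ inside it.

    Directories that already exist are left untouched.
    """
    root = Path(course_dir)
    for name in SUBDIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    return root


if __name__ == "__main__":
    # "python_ml_course" is a hypothetical name; use whatever you like.
    root = make_course_dirs("python_ml_course")
    print(sorted(p.name for p in root.iterdir() if p.is_dir()))
```

Running it a second time is harmless, since `exist_ok=True` skips directories that are already there.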
**Week 1: Machine Learning and CRISP-DM Overview; EDA and Data Preparation**
* Files
* [week1.ipynb](https://raw.githubusercontent.com/fredhutchio/python_machine_learning/master/notebooks/week1.ipynb) <-- using `save as...`, save this notebook inside `notebooks/` making sure it has the `.ipynb` filename extension
* (1) [commute-times-train.csv](https://github.com/fredhutchio/python_machine_learning/raw/master/data/commute-times-train.csv); (2) [commute-times-test.csv](https://github.com/fredhutchio/python_machine_learning/raw/master/data/commute-times-test.csv) <-- using `save as...`, save these 2 files inside `data/` making sure they have the `.csv` filename extension
* Reference
* [Week 1 Slides from Concepts in ML](https://github.com/fredhutchio/concepts_machine_learning/blob/master/slides/ML_concepts_wk1_slides.pdf)
* [CRISP-DM](https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining)
* [Machine Learning](https://en.wikipedia.org/wiki/Machine_learning)
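As an alternative to `save as...`, the files linked on this page can be fetched with Python's standard library. This is only a sketch: the helper names `local_path` and `download` and the directory name `python_ml_course` are assumptions made for illustration, and the extension-to-directory mapping simply mirrors the save instructions on this page.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# Which subdirectory each file type belongs in, per the save instructions.
SUBDIR_BY_SUFFIX = {
    ".ipynb": "notebooks",
    ".csv": "data",
    ".txt": "data",
    ".png": "img",
    ".gif": "img",
}


def local_path(url, course_dir):
    """Return where a linked file should be saved inside course_dir."""
    name = Path(urlparse(url).path).name
    suffix = Path(name).suffix.lower()
    if suffix not in SUBDIR_BY_SUFFIX:
        raise ValueError(f"no target directory known for {name!r}")
    return Path(course_dir) / SUBDIR_BY_SUFFIX[suffix] / name


def download(url, course_dir):
    """Download url into the matching subdirectory (needs network access)."""
    dest = local_path(url, course_dir)
    dest.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(dest))
    return dest


if __name__ == "__main__":
    # Hypothetical course directory name; the URL is the Week 1 notebook link.
    url = (
        "https://raw.githubusercontent.com/fredhutchio/"
        "python_machine_learning/master/notebooks/week1.ipynb"
    )
    print(local_path(url, "python_ml_course"))
```

Call `download(url, "python_ml_course")` with any link from the Files lists to save that file; the original filename and extension are kept.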
**Week 2: Case Study in Regression**
* Files
* [week2.ipynb](https://raw.githubusercontent.com/fredhutchio/python_machine_learning/master/notebooks/week2.ipynb) <-- using `save as...`, save this notebook inside `notebooks/` making sure it has the `.ipynb` filename extension
* Reference
* [Week 2 Slides from Concepts in ML](https://github.com/fredhutchio/concepts_machine_learning/blob/master/slides/ML_concepts_wk2_slides.pdf)
* [Supervised Learning](https://en.wikipedia.org/wiki/Supervised_learning)
* [Regression Analysis](https://en.wikipedia.org/wiki/Regression_analysis)
**Week 3: Classification with Logistic Regression and Random Forests**
* Files
* [week3.ipynb](https://raw.githubusercontent.com/fredhutchio/python_machine_learning/master/notebooks/week3.ipynb) <-- using `save as...`, save this notebook inside `notebooks/` making sure it has the `.ipynb` filename extension
* [Confusion_Matrix.png](https://github.com/fredhutchio/python_machine_learning/raw/master/img/Confusion_Matrix.png) <-- using `save as...`, save this image file inside `img/` making sure it has the `.png` filename extension
* [tennis.txt](https://raw.githubusercontent.com/fredhutchio/python_machine_learning/master/data/tennis.txt) <-- using `save as...`, save this text file inside `data/` making sure it has the `.txt` filename extension
* Reference
* [Week 3 Slides from Concepts in ML](https://github.com/fredhutchio/concepts_machine_learning/blob/master/slides/ML_concepts_wk3_slides.pdf)
* [Statistical Classification](https://en.wikipedia.org/wiki/Statistical_classification)
* [Logistic Regression](https://en.wikipedia.org/wiki/Logistic_regression)
* [Confusion Matrix](https://en.wikipedia.org/wiki/Confusion_matrix)
* [Receiver Operating Characteristic (ROC Curve)](https://en.wikipedia.org/wiki/Receiver_operating_characteristic)
* [Random Forest](https://en.wikipedia.org/wiki/Random_forest)
**Week 4: PCA and Clustering; Case Study in Unsupervised Learning**
* Files
* [week4.ipynb](https://github.com/fredhutchio/python_machine_learning/raw/master/notebooks/week4.ipynb) <-- using `save as...`, save this notebook inside `notebooks/` making sure it has the `.ipynb` filename extension
* (1) [NCI60_X.csv](https://github.com/fredhutchio/python_machine_learning/raw/master/data/NCI60_X.csv); (2) [NCI60_y.csv](https://github.com/fredhutchio/python_machine_learning/raw/master/data/NCI60_y.csv); (3) [USArrests.csv](https://github.com/fredhutchio/python_machine_learning/raw/master/data/USArrests.csv) <-- using `save as...`, save these 3 files inside `data/` making sure they have the `.csv` filename extension
* [pca.gif](https://github.com/fredhutchio/python_machine_learning/raw/master/img/pca.gif) <-- using `save as...`, save this image file inside `img/` making sure it has the `.gif` filename extension
* (1) [clusters.png](https://github.com/fredhutchio/python_machine_learning/raw/master/img/clusters.png); (2) [iris-measurements.png](https://github.com/fredhutchio/python_machine_learning/raw/master/img/iris-measurements.png); (3) [kmeans.png](https://github.com/fredhutchio/python_machine_learning/raw/master/img/kmeans.png); (4) [letters-dendrogram.png](https://github.com/fredhutchio/python_machine_learning/raw/master/img/letters-dendrogram.png); (5) [letters-grouped.png](https://github.com/fredhutchio/python_machine_learning/raw/master/img/letters-grouped.png); (6) [letters-ungrouped.png](https://github.com/fredhutchio/python_machine_learning/raw/master/img/letters-ungrouped.png) <-- using `save as...`, save these 6 files inside `img/` making sure they have the `.png` filename extension
* Reference
* [Week 4 Slides from Concepts in ML](https://github.com/fredhutchio/concepts_machine_learning/blob/master/slides/ML_concepts_wk4_slides.pdf)
* [Unsupervised Learning](https://en.wikipedia.org/wiki/Unsupervised_learning)
* [Clustering Analysis](https://en.wikipedia.org/wiki/Cluster_analysis)
* [Principal Component Analysis (PCA)](https://en.wikipedia.org/wiki/Principal_component_analysis)
* [Curse of Dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality)
**Resources for continued learning**
* Learn about other courses through fredhutch.io [here](http://www.fredhutch.io/resources/).
* The Fred Hutch Bioinformatics and Data Science Cooperative, or the Coop, hosts many community meetings and office hours about data science. Learn more about these groups [here](https://research.fhcrc.org/coop/en/community/hosted-groups.html).
* Join the [Coop Community Slack](https://join.slack.com/t/fhbig/shared_invite/enQtMzUyMDIxNzk3MDU3LWE5NGUyMTY1NGU0N2VmMmEyNTM5YzM1MmNlMTk2YmM1OWNkMmJiNTQxMTQ4OTNkMTFjMjk3M2Q0MzkwYzQ3NDA) to talk about data science with other Hutch researchers!
* The [Fred Hutch Biomedical Data Science Wiki](https://sciwiki.fredhutch.org) is written by Hutch researchers and staff, and is a great place to find information about data management, bioinformatics, computing, and more.
###### tags: `fredhutch.io` `Machinelearning` `python`