# Cookiecutter Data Science

## Intro

- https://drivendata.github.io/cookiecutter-data-science/
- [5min video tutorial](https://www.youtube.com/watch?v=ugGu8fHWFog)

## Notes & Manifesto

- Why use this project structure?
  - Other people will thank you
  - You will thank you
- Directory structure
- **Data is immutable**
  - `data/raw` and `data/external` are read-only folders!
  - data doesn't need source control in the same way that code does
- Notebooks are for exploration and communication
- Keep secrets and configuration out of version control

# Cookiecutter

- Install `cookiecutter`
  - `pip install cookiecutter`
  - `conda install -c conda-forge cookiecutter`
  - (Debian/Ubuntu only) `apt-get install cookiecutter`
  - (mac only) `brew install cookiecutter`
- demo
  - set up the project
    - `cookiecutter git@github.com:drivendata/cookiecutter-data-science.git`
    - `make help`
    - `make create_environment`
    - `make test_environment`
    - `conda activate my_project`
    - `make requirements`
  - git?
    - `git init`
    - create remote on GitHub
    - connect local to remote
      - `git remote add origin git@github.com:pareyesv/my_project.git`
    - `git add .`
    - `git commit -m "chore: initial commit"`
    - push
      - `git push -u origin master`
  - package contents
    - ```python
      import src
      help(src)
      ```
  - Jupyter Notebooks
    - activate the environment (pyenv or conda)
    - install kernel
      - add `ipykernel` to `requirements.txt`
      - `python -m ipykernel install --user --name my_project --display-name "Python (my_project)"`
    - write
      ```python
      # file src/visualization/visualize.py
      def hello():
          print("hello")
      ```
    - jupyter notebook example
      ```python
      # reload edited modules automatically before each cell runs
      %load_ext autoreload
      %autoreload 2

      import src.visualization.visualize as vis

      vis.hello()
      ```
  - documentation
    - `cd docs/`
    - `make html`
    - try the [RTD theme](https://pypi.org/project/sphinx-rtd-theme/)
    - autodoc
      - add `.. autoclass:: src.visualization.visualize.Greetings` to an `.rst` page under `docs/`
      - add `extensions = ['sphinx.ext.autodoc', 'sphinx.ext.napoleon']` to `docs/conf.py`

# More templates

- Lists
  - http://cookiecutter-templates.sebastianruml.name/
  - https://github.com/topics/cookiecutter-template
  - https://github.com/search?q=cookiecutter&type=Repositories
  - https://cookiecutter.readthedocs.io/en/1.7.0/README.html#categories-of-cookiecutters
- Examples
  - docker
    - https://cookiecutter.readthedocs.io/en/latest/README.html#available-cookiecutters/cookiecutter-docker-science
  - DVC
    - https://github.com/pprados/cookiecutter-datascience

# Similar projects

- [PyScaffold DS](https://github.com/pyscaffold/pyscaffoldext-dsproject)
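
# Example: a `Greetings` class for autodoc

The autodoc step in the demo points at `src.visualization.visualize.Greetings`, but these notes never show that class. The block below is a minimal sketch of what it might look like, assuming Google-style docstrings (one of the formats `sphinx.ext.napoleon` parses). The constructor argument, the `hello` method, and the message text are all hypothetical, chosen only for illustration.

```python
# Hypothetical contents for src/visualization/visualize.py.
# The notes only name the class; its body here is an assumption.


class Greetings:
    """Build short greeting messages.

    Args:
        name: Who to greet. Defaults to ``"world"``.

    Attributes:
        name: The name used in every greeting.
    """

    def __init__(self, name: str = "world") -> None:
        self.name = name

    def hello(self) -> str:
        """Return a greeting for ``self.name``.

        Returns:
            The greeting text, for example ``"hello, world"``.
        """
        return f"hello, {self.name}"
```

With the class in place, `make html` should render the class docstring. Adding the `:members:` option under the `autoclass` directive would also document `hello`.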