---
tags: M2 Internship
---

Good Practices for Jupyter Notebooks
====================================

Head
----

- All **imports** should be at the **top** of the notebook (after the title and any introductory markdown cells, and before any other code). Use `isort` to sort the imports.
- Follow the imports with any configuration (e.g., `pd.set_option('display.max_columns', 100)`).
- Add the paths as constants just afterward (e.g., `DATA = Path('data')`) so that another user can change a path in a single, easily found place. Use `pathlib` for file paths.
- Next, put the other **constants** (e.g., `SEED = 42`).

Functions
---------

- Then, put all functions (one cell per function). Alternatively, put them in an external module and import them in the top code cell.
- Describe what each function does in its docstring, and consider using `doctest` examples to illustrate it with values and to catch regressions.

Main Code
---------

- Finally, put the main code, split into cells of reasonable size (e.g., 10-20 lines).
- Use tables and plots to present the results.
- Use markdown cells to describe the code and the results.

Style
-----

- Use `black` to format the code. Ensure no line is longer than 79 (or 88, the `black` default) characters (including comments).
- **Keep the notebook clean**. Remove unused imports, commented-out code, and unnecessary print statements.
- **Keep the notebook short**. A notebook should be used to investigate and present a single idea or topic, not a whole project (unless the project is tiny).
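
As an illustration of the ordering recommended in the Head, Functions, and Main Code sections, here is a minimal sketch written as one script. In a real notebook each commented block would be its own cell. The `RESULTS` path and the `sample_indices` function are hypothetical names introduced only for this example, and only the standard library is used so that the sketch runs anywhere.

```python
# Imports: at the very top, in isort order (standard library only here).
import doctest
import random
from pathlib import Path

# Configuration would follow the imports, e.g. the pandas display option
# mentioned above when pandas is part of the notebook.

# Paths: defined once, so another user changes them in a single place.
DATA = Path("data")
RESULTS = DATA / "results"  # hypothetical subfolder, for illustration

# Constants.
SEED = 42


# Functions: one per cell in an actual notebook.
def sample_indices(n_items, k, seed=SEED):
    """Return ``k`` distinct, sorted indices drawn from ``range(n_items)``.

    The draw is reproducible for a given ``seed``.

    >>> len(sample_indices(10, 3))
    3
    >>> sample_indices(10, 3) == sample_indices(10, 3)
    True
    >>> sample_indices(5, 0)
    []
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_items), k))


# Main code: short cells that use the definitions above.
# Check the docstring examples; failures, if any, are printed.
doctest.run_docstring_examples(
    sample_indices, {"sample_indices": sample_indices}
)

indices = sample_indices(n_items=10, k=3)
print(f"Sampled {len(indices)} indices; results would go to {RESULTS}")
```

Keeping the docstring examples next to the function means a later edit that changes its behavior is caught the next time the check cell runs. In a notebook, `doctest.testmod()` is a common alternative that checks every docstring defined so far.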