# RSH 016 internal: debugging

## Introduction

- This is a really challenging topic to talk about, since there are
  - So many ways to go wrong
  - So many ways to fix it
  - No one right answer, it's more of an art than a science
- https://en.wikipedia.org/wiki/List_of_software_bugs

## Types of bugs

- syntax errors
- runtime errors
  - Hard errors that are clearly reported as errors
  - results are wrong (like the kelvin example)
- Heisenbug: trying to study the bug changes it, e.g. adding print statements changes timings, memory locations change, etc.
  - Compiled with or without optimizations. Stepped through with debugger to eliminate race conditions.
  - you add a print statement and the bug goes away: memory bugs
- schrödinbug: it never worked, but you never noticed it.
- Local (can be clearly traced to one line) vs systematic (a property of the whole system)

## Approach to debugging

- Reality is not as fancy as you might think
- Take a break
- Reducing size of the problem
  - Make it run faster, but still produce the bug (smaller input data, fewer iterations, etc.)
  - Removing degrees of freedom: disable optional features until you get straight to it
  - I think of efficient debugging as an efficient tree search over possibilities "inside our head": eliminate the biggest branches as early as possible.
- Finding the point of the problem
  - bisection
  - `git bisect`: when you have good version control, when was it introduced?
  - Turn off optional features and see if the problem still occurs
  - "Deactivating code"/skipping code to locate memory problems: making the result wrong but making the code not crash
  - `git grep`
  - it can be useful to make the code produce "not scientifically meaningful" results for the sake of debugging
- Now, you roughly know the point. What now?
  - Reading error messages
    - How to approach stack traces and finding the problem
    - How to pick the interesting part out of an error message: read from both the top and the bottom
    - internet-searching solutions with the right error message
  - Turn it into a unit test or example script that is self-contained
    - Can you make the example portable for someone else to try (git, conda, containers, etc.)?
    - This is basically a prerequisite to asking someone for help
  - Asking for help
    - When to ask for help (after you have narrowed it down)
    - What to include when asking (not "it doesn't work")
  - what to do if it crashes/fails in somebody else's code (library or package)

## Preparing code for debugging

- various points from the Zen of Python, for example "Errors should never pass silently / unless explicitly silenced", but there is more in it that is relevant too
- print debugging
- writing good error messages, catching errors the right way
  - don't trap all errors and ignore them
- logging and verbosity
  - shell: `set -x`
  - `-v`, `-q`
  - stdout vs stderr for printing stuff
  - logging module **show example (rkdarst)**
    - Everything defined in levels: debug, info, warning, error, critical
    - https://github.com/NordicHPC/envkernel/blob/master/envkernel.py
- assertions, programming for safety
- shell script strict modes
  - `set -e ; set -u ; set -o pipefail`, often done as `set -euo pipefail`
- debug compile flags

## Debuggers and debugging tools

- "debuggers do not remove bugs, they run your code in slow motion"
- **gdb/pdb type interface (AF)**, let it run until error happens
  - demo-debugging/fortran/bugs
  - explicit error to make it start
- print statements
  - Useful as a starting point
  - Maybe use the logging module instead?
- jupyter: **%%debug (RD)**
- `import IPython ; IPython.embed()` or `from code import interact; interact()`
- **Valgrind and memory bugs (Radovan)**
  - C/C++ example with 3 memory bugs (use after free, memory leak, out of bounds access)
- **debugging through IDE**; remote debugging
- **variable inspector (RD)**

## Bonus if we have time

- git bisect example
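
The bisection idea behind `git bisect` can be sketched in a few lines of Python. This is only an illustration, not how git implements it: `first_bad`, `history`, and `check` are hypothetical names, and the sketch assumes a linear history where the first revision is good, the last is bad, and the bug never disappears once introduced. With real git, `check` plays the role of the test you run at each bisection step.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_bad() is true.

    Assumes revisions are ordered oldest to newest, the first one is good,
    the last one is bad, and the bug stays once introduced.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # bug already present: look in the older half
        else:
            lo = mid  # still fine: look in the newer half
    return revisions[hi]


# Hypothetical history in which the bug was introduced in "r6".
history = [f"r{i}" for i in range(10)]
tested = []  # records which revisions we actually had to test


def check(rev: str) -> bool:
    """Stand-in for 'check out this revision and run the failing test'."""
    tested.append(rev)
    return int(rev[1:]) >= 6


culprit = first_bad(history, check)
print(culprit, "found after testing", len(tested), "revisions")
```

Here ten revisions need only three tests; the number of tests grows roughly with the logarithm of the history length, which is why bisection pays off on long histories.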