# ARCHIVED How to debug code 2022

:::danger
## Info and important links

* This document is the archive of the chats from the 2022 edition of the debugging course
* The live document is at: https://hackmd.io/@AaltoSciComp/Debug2022
* Program: https://scicomp.aalto.fi/training/scip/debugging-2022/
:::

---

# Day 1 - Debugging with python

:::spoiler
The material is here: https://aaltoscicomp.github.io/python-debugging/

- If you plan to type-along or do the exercises, please git-clone the following two repositories:
    - https://github.com/AaltoSciComp/python-debugging (The examples subfolder has a few files of interest.)
    - https://github.com/AaltoSciComp/double-pendulum (example project)

After the presentation, you can do the exercises or work on your own code, and the course staff will try helping you.

## Icebreaker

**Can you share your most memorable python bug, and perhaps how you found it?**

- As boring as it sounds, coming from Matlab my usual early-days bugs (= mistakes) were about indexing arrays starting from 0 vs 1 (I still think starting from 1 makes more sense :) ). I usually found them with old-school print statements.
- Not so much a bug with python but a wrong assumption about some input data. I was assuming it to be in a certain format based on a couple of examples, and it took me quite some time to figure out why the results looked so odd, until I noticed that for some data points one element contained invalid data.
- I created a local package (directory in the project folder) with the same name as a pip-installed package. Import worked fine, but functions were missing or had the wrong signature! Figured it out after removing the system-level package: the import still worked in one directory :smile:
- Not sure if this is a bug. I was surprised when a Python function created a new object and my Pandas table did not get updated as expected. Otherwise, the bugs I found so far in packages raised exceptions and then it was clear it was a bug.
Another one: creating by accident a global variable by having a typo in the function. No, the method created a new object. Yes!
- Typing `&&` instead of `and` :smile: :+1:
- wrong indentation
- I always get confused whether Python passes the value or the reference.
    - Python always passes a reference to the object. Immutable objects behave as if passed by value, because the function cannot change them in place; mutable objects can be changed by the function.
- Using an empty list as a default argument in a function

**Which institution and/or department are you from?**

- Aalto University, School of Sciences
- Aalto University, School of Engineering, Department of Built Environment
- Aalto, SCI, NBE
- Aalto, SCI, Applied physics
- University of Vaasa, School of Technology and Innovations
- Aalto, NBE

## Introduction

*Material at https://aaltoscicomp.github.io/python-debugging/introduction/*

## Python features relevant for debugging

### QUESTIONS

- In Python I can tell a function what type of argument it can get. There is the `:` notation for giving the type. Or does he mean something else?
    - You can do that, but it is not required, and in a lot of code this is not done, so you often end up (at least in not-so-well-maintained libraries) with code where you don't necessarily know what comes in.
    - This syntax is mentioned in the material, where it says it's mostly a suggestion. So I guess you can call the function with wrong types?
        - Just tried. Indeed you can still call it with any argument.
    - I can call a function with any variable, even if the argument has a defined type? Oh no...
        - Yes, you can. The "defined" types are just hints/documentation, unless you make hard assertions in the code.
        - Thanks for the clarification. I sometimes use `assert` or `isinstance` to check.
- What's the difference between that and `str(x)`?
    - `str(x)` tries to create a new string object (and I would expect that it does call `__str__`), while `__str__` is a method that can be defined on each class individually.
- The other day I created by accident with scoping a global variable.
How can I protect myself against such mistakes, please? In particular in combination with Jupyter notebooks, where the order of cell execution is maybe not linear. I wondered the other day what to do. I was given the advice to do OOP, not just script. Could this help?
    - As far as I know there are only two ways to create a new scope: functions and classes. I guess they mean "use classes" when suggesting OOP.
    - One option would be to create a "main" function and only define variables in functions. So no variables at top level, only in functions.
    - As for the global variable use:
      ```
      def set_x():
          global x
          x = 1
      ```
      This function will change the outer `x` variable.
- In the just discussed example with the list, what would happen if I do not have `append_to` as the returned variable? It is also used as an input argument for the function. Could I avoid this problem then or not? I just return another variable?
    - The problem with not returning it is that if you don't pass a list, your resulting list ends up inaccessible (or well, it would be accessible via `append_to_list.__defaults__`, but not otherwise). That's partially why this "take an argument and return a modification" pattern is somewhat unpythonic: a function should either create a new object OR return a modified object from the input.
    - I still have an understanding problem, but how can I avoid this, please?
        - If you set `append_to=None` in the function definition (`def append_to_list(value, append_to=None)`) and then do:
          ```
          if append_to is None:
              append_to = []
          ```
          you create a new list every time the function is called without a list, and it is always a NEW list and not the list stored in the default arguments.
- Example of the closure problem:
  ```
  index = 1

  def closure_problem(i):
      global index
      index = i
      def new_function(x):
          return x + index
      return new_function

  f1 = closure_problem(1)
  f1(1)
  # output: 2
  f2 = closure_problem(2)
  f1(1)
  # output: 3
  ```
    - Not a very realistic example, but maybe makes it clearer?
Personally I have never encountered this problem in practice.
- I tried once to debug with pdb, but it said too many levels down or something like this. Then I did not know what to do anymore and I gave up. Is there a limit of levels down in code?
    - This sounds like you had a very deep recursion in your code, and very likely this amount of recursion was actually the bug (i.e. you did more recursions than intended due to some bug).
    - Python has a limit on how many function calls deep you can go. Usually you don't want to have that many nested calls. If you do, there is a way to change the limit.
      ```
      import sys
      new_recursion_limit = 1500
      sys.setrecursionlimit(new_recursion_limit)
      ```
    - Example of a loop instead of recursion:
      ```
      def recursive(n):
          if n == 0:
              return 1
          return n * recursive(n - 1)

      def not_recursive(n):
          # First loop: push arguments onto a stack instead of recursing.
          stack = [n]
          while stack[-1] != 0:
              stack.append(stack[-1] - 1)
          # Base case reached (n == 0 returns 1).
          stack.pop()
          result = 1
          # Second loop: unwind the stack, as the returning calls would.
          while stack:
              result *= stack.pop()
          return result
      ```
    - Sorry, not the clearest implementation. The idea is that you run the function until the recursive function call in the first loop, then push to a stack (instead of calling the function), and finish in the second loop.
    - Thank you. Could you also explain why it is better to write this loop? Or was this overall bad programming from my side that I ran into this problem in the first place?
        - It wasn't necessarily bad programming. But in general, if you end up with such deep recursion levels it is likely that using a loop is more efficient, because you have to e.g. allocate less memory (each recursion level needs some) and so on. Essentially, the recursion limit is how Python protects against stack-overflow errors.

**How is the speed so far? (vote by adding an o to the option)**

Too fast: o
Too slow: oo
Just right: ooo

## Defensive Programming

## The Python Debugger

- I've maybe missed it, can we download the example code?
    - https://github.com/AaltoSciComp/python-debugging , example code is in the examples folder
- Do we use pdb instead of the IDE debugger?
    - We will demonstrate both pdb and how to use a debugger from spyder. But since we want to stay general, we don't only show debugging from spyder.
- But how to set this up with the `flush=True`? Okay.
    - `print('This is my message', flush=True)`

## Exercises

**Are you planning on doing the exercises now**

yes: oo :dog: :cat:
no: oo

- debugging looks similar to C#, is it?
    - At least it is similar to C, not sure about C# specifically.
    - Thanks!
- Will you discuss the exercises later?
    - Simppa: I was not planning to, they are rather open-ended; just motivation to try the tools out.
    - Jarno: But if you have any questions, do ask.
- Seems I am quite lost in the double pendulum... So the recommendation is to run it with `pdb` and start at line 205, right?
    - Sure... Would it be possible to unmute? It

## Feedback:

1. Write one good thing about today
    - Good examples, interactivity (+1)
    - Encouraging to see that even experts can still be surprised by Python, and thanks for sorting this out live.
    - I learned new nuances about Python that I was never aware of
    - I definitely learned new things and ways to avoid making bugs. The format of the course is well-structured.
2. One thing we could improve (and how we could improve it, if you know)
    - Maybe use only one python editor? I am not familiar with all the editors used today.
    - I found the pdb demonstration a bit confusing. That was hard to follow.
        - +1 I was wondering whether it might be easier if there were some easier hands-on exercises (the "broken double pendulum" seems a bit harder to me) +1
    - More examples. Small examples and maybe even illustrations (e.g. when talking about references)
    - Are there any recommendations on incorporating an IDE into the workflow when working on CSC superclusters?
        - The most common one used around here is probably spyder. But e.g.
eclipse with PyDev also works quite well. In the end most IDEs are equally good.
    - I think it is better to show the exercises together, which also lowers the barrier to get started with them. This pendulum exercise looked scary.
:::

# Day 2 - Debugging with Matlab

:::spoiler
The material is here: https://github.com/AaltoSciComp/matlab-debugging

## Icebreaker

**1. Can you share your most memorable matlab bug, and perhaps how you found it?**

- It took me some time to realize that every time I restart matlab, the random number generator seed is set to be the same. This is problematic when starting many matlabs to do supposedly random independent computations.
- Noticing that (at least in Matlab 2018b) the matlab unit test system does not clear global variables set during a test. This led to very strange behaviour across different operating systems, where we were testing different solvers and setting the solver was done via a global variable. Some tests did not specify a particular solver because the solver wasn't thought to be relevant. On some OSes these tests ran with an unexpected solver and failed, while on other OSes they succeeded, which was extremely puzzling.

**2. Which institution and/or department are you from?**

- Aalto, SCI
- Aalto, SCI
- Aalto, SCI, Physics

# Interpreting Error Messages

QUESTIONS? Comments?

- A comment about the pointwise operators: here is something un-intuitive:
    - ``.'`` Transpose
    - ``'`` Complex conjugate transpose (So, ``'`` is not transpose, but complex conjugate + transpose)

# Programming concepts in Matlab and the errors they generate

## Operator Precedence

### Exercise 1 till XX:40

#### Exercise url: https://github.com/AaltoSciComp/matlab-debugging/blob/main/Exercises.md#1-operator-precedence

### Questions:

- A bit unsure now, which exercise file should we do now?
    - Exercise 1 : Operator Precedence
    - Ok, thanks

**Are you done with the exercise?
(answer by adding an o)**

Yes: o
No:

## Cell Arrays and Indexing

### Questions

### Exercise 2 till XX:20

#### Exercise url: https://github.com/AaltoSciComp/matlab-debugging/blob/main/Exercises.md#2-cell-arrays-and-indexing

### Questions:

### Exercise 3 till XX:45

#### Exercise url: https://github.com/AaltoSciComp/matlab-debugging/blob/main/Exercises.md#3-overloadingvisibility

### Questions:

- Uhh, so the only "non-overloadable keywords" would be the basic ones such as `for` and `if`, but otherwise everything can be overloaded?
    - Yes. Anything that is not syntactically forbidden can be overloaded, which means any function or variable. And my bad: it's `catch`, not `except`, in matlab, so no, keywords can't be overloaded, but all variables can be (and often are)...
    - This is also a potential problem with long scripts, when variables are redefined somewhere and this is not noticed. So using encapsulated smaller functions can help avoid these issues.

## Feedback:

A good thing about today:

- Exercises were good, I was able to +- do them in the time given.

A thing we can improve about today:

- I expected the file `Exercises.md` to be in the `exercises` directory, but that's a minor thing I guess.
:::

# Day 3 - Debugging with C/C++

:::spoiler
The material is here: https://rantahar.github.io/debugging-c-cpp/

## Icebreaker

**1. Can you share your most memorable C/C++ bug, and perhaps how you found it?**

- Boring one, but errors with pointers and segmentation faults are the most memorable from when I started, because of the hours and hours I had to spend fixing them (and learning them!) :)
- My favorite bug was in an HPC code. It manifested only after running over 12 hours with at least tens of processors, and went away when I added a debug print. This was actually Fortran, but the same tools were used. Never figured out what the root cause was.
- Coming from Java, I was trying to incorporate a c++ algorithm into a library.
That algorithm used a map to retrieve some data, and had a custom comparator for the keys of the map. Unfortunately, that comparator was buggy and so the map always returned a default value, leading to the code producing garbage. Since in java a non-existing object in a map is returned as null, and would cause a null pointer exception, it took me a long time to figure out what went wrong.
- "Am I a base or a derived?" problems, e.g. I created an array of `Base*`, then called member functions --> Which one would be called?

**2. Which institution and/or department are you from?**

- Aalto, SCI
- University of Vaasa, School of Technology and Innovations
- Aalto SCI

## Introduction

https://rantahar.github.io/debugging-c-cpp/introduction/

### Questions? Comments?

Poll: Are you interested in Fortran?

yes: oo
no: o

- How often is Fortran used? (I.e. I have no idea how much it is used, e.g. in scientific HPC...) I might answer the poll above according to it.
    - Not sure about active development, but quite some old code exists that is still in use, and needs to be interfaced or extended.
    - Physicists still use it a lot. In my field (fusion science), more than C. :+1:
    - What's your field, btw? :)

## Things about C/C++

https://rantahar.github.io/debugging-c-cpp/c_cpp_topics/

- The typical "With great power comes great responsibility" :smile:
- Hold on, do you need to write the `struct` keyword elsewhere than in the definition? Or is it an example bug?
    - `std::string` is the cpp string
    - [my reference to struct](https://www.learncpp.com/cpp-tutorial/introduction-to-structs-members-and-member-selection/#:~:text=COPY-,Accessing%20members,-Consider%20the%20following) -- there it's basically used like a class, i.e. like Thomas says (no `typedef` needed)
    - Could be a difference between c++ and C?
- Is there an easy way to allocate an `array` (or `std::vector`) and fill it with zeros? I.e. without the need to loop over everything and assign 0 manually?
Or is allocated memory always some garbage unless you do something with it?
    - Allocated memory has undefined contents. You cannot assume anything.
    - In C you would use `calloc()`, which zeroes the allocated memory.
    - There is a function called fill:
      ```
      std::vector<int> v;
      v.resize(10);
      std::fill(v.begin(), v.end(), 0);
      ```
- Is there any use case to prefer macros over `constexpr` variables and `inline` functions? So far I see many good reasons to avoid macros in general.
    - Sometimes you need to go around the rules of the language, and then macros can be of help.

## Finding Memory Issues

- download the example src code: https://raw.githubusercontent.com/rantahar/debugging-c-cpp/main/examples/memory_leak.c
- compile with debug
    - `gcc -g -o memory_leak memory_leak.c`
    - `cl /Zi /MT /EHsc /Oy- /Ob0 memory_leak.exe memory_leak.c`

### Poll: Are you using or compiling for Windows?

yes, using: o
yes, compiling: o
no: ooo

### Questions/comments?

## Feedback:

Write something good about today:

- The examples were good (showed some examples and how to use different tools).

Write something we can improve about today:

- The introduction seemed a bit lengthy to me, i.e. I would expect more time for the part "Errors and Exceptions" and later on

:::

# Day 4 - Debugging with Julia

:::spoiler
- **material**: https://github.com/lucaferranti/JuliaDebuggingWorkshop
- **instructors**: Luca Ferranti and Jarno Rantaharju
- **this page**: https://hackmd.io/@AaltoSciComp/Debug2022/edit

## Ice - breaker

What is the weirdest bug you have encountered in Julia?

- Using the `@animate` macro from the Plots package was doing something weird, but I never found out why.
- sometimes conversion and promotion rules give me a headache.
## Installation

- clone this repository and enter the folder

```
git clone https://github.com/lucaferranti/JuliaDebuggingWorkshop.git
```

```
cd JuliaDebuggingWorkshop
```

- activate and instantiate the environment

```
julia --project
```

```julia
julia> using Pkg; Pkg.instantiate()
```

- that's it, this will take care of downloading all dependencies you need.
- **from now on, we will assume you are in the `JuliaDebuggingWorkshop` folder**

## Schedule

**Tentative** schedule

| Time | Topic |
|:--------------:|:---------------------:|
| 12:00 -- 12:10 | welcome + ice breaker |
| 12:10 -- 12:50 | Julia overview and common julia gotchas |
| 12:50 -- 13:00 | break :coffee: |
| 13:00 -- 13:20 | Debugger.jl and Infiltrator.jl demo |
| 13:20 -- 13:40 | individual exercises |
| 13:40 -- 14:00 | exercises solution |
| 14:00 -- 14:10 | break :tea: |
| 14:10 -- 14:30 | defense against dark arts: type stability, world age problem |
| 14:30 -- 15:00 | time for extra questions/discussion/exercises |

### Questions:

## Episodes

### The only one you need to know

The ultimate debugging technique is... look for help on the internet, hence let us first have a quick overview on where and how to ask for help.

The Julia community is very friendly and if you get stuck there are a lot of places to get help, particularly:

- [Julia Discourse](https://discourse.julialang.org/): the julia forum, *the* place to post questions
- [Julia slack/discord/zulip](https://julialang.org/community/): chats where people passionate about julia hang out

When asking a question, it's good to remember that most people out there use their *free time* to develop open source, hence a few tips:

- be polite!
- be specific!
Don't just say "I want to do X but it doesn't work", always post an example of your non-working code to show you gave it a try first (this is also a smart application of [Cunningham's law](https://en.wikipedia.org/wiki/Ward_Cunningham#%22Cunningham's_Law%22))
- don't post a screenshot of your code and error message, otherwise people cannot directly copy-paste it to try; always use text
- most forums allow for syntax highlighting, check it out before posting your code.
- if your code is a bigger project, try to reduce it to a Minimum Working Example (MWE), that is, try to first identify what part is causing the problem (this workshop will hopefully help you) and write a short simple example demonstrating your issue. See also the [stack-overflow guide](https://stackoverflow.com/help/minimal-reproducible-example) on creating MWEs.
- Not getting your answer immediately does **not** mean a whole community is unwelcoming!!

### The one where you forgot to revise

- For a reproducible env, should I have both toml files or is just `Project.toml` enough?
    - You need both (`Project.toml` and `Manifest.toml`) for it to be fully reproducible (with exactly the same versions).

### The one where something is missing

### The one where something is ambiguous

### The one where better safe than sorry

#### Questions:

### The one where we get serious

Interactive demo about `Debugger.jl` and `Infiltrator.jl`.

### The one where you are on your own

Time to solve the exercises! All relevant functions are defined in the file with the corresponding exercise number, e.g. the functions for the first exercise are defined in `exercise1.jl` etc. All the functions are exported, meaning that if you have imported this package, you can just call them from the REPL.

#### Exercise 1

The function `increment` should add ``1`` to the given input, but it doesn't seem to work. Try to call e.g. `increment(1)`, study the error and fix it.
#### Exercise 2

This package has some tests that can be run with `using Pkg; Pkg.test()`. All tests regard the function in `exercise2.jl`. At the moment all tests are failing, it is your task to fix them!

**Hint!**: If you get stuck at a test and want to move on to fix the next one, you can change `@test` to `@test_skip`

#### Exercise 3

The function in `exercise3.jl` seems innocent and correct, but it actually has a bug. Try to find it and fix it!

**Note!**: This is a slightly more advanced exercise. If you don't like this (not everyone will!), feel free to skip it and move on.

#### Exercise 4

The functions in `exercise4.jl` have some method ambiguities, fix them! You can either use the function `test_ambiguities(JuliaDebuggingWorkshop)` from `Aqua.jl` (remember to import it) or the built-in static code analyzer in your brain.

### The one where the instructor is there for you

Instructors will go through the solutions here.

### The one where you get unstable

Demo about type instability and how to detect it
:::
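
The type-instability demo itself is not recorded in these notes. As a rough illustration of what "type instability and how to detect it" can look like, here is a minimal sketch. The functions `unstable_relu` and `stable_relu` are made-up names for this example and are not taken from the workshop material; `@code_warntype` and `Test.@inferred` are standard Julia tools.

```julia
using InteractiveUtils  # provides @code_warntype (loaded automatically in the REPL)
using Test              # provides @inferred

# Hypothetical example: the literal `0` is an Int, so for a Float64 input
# the return type depends on the *value* of x, not only on its type.
unstable_relu(x) = x > 0 ? x : 0

# `zero(x)` has the same type as `x`, so the return type is fixed by the input type.
stable_relu(x) = x > 0 ? x : zero(x)

# Inspect the inferred types; the unstable version reports a Union return type.
@code_warntype unstable_relu(1.5)
@code_warntype stable_relu(1.5)

# @inferred throws if the actual return type differs from the inferred one,
# so it passes here and would fail for `unstable_relu(1.5)`.
@inferred stable_relu(1.5)
```

In a package, a check like `@inferred` could go into the test suite, so that an accidental type instability shows up as a failing test rather than as a silent slowdown.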