# CR May 2021 Old
###### tags: `workshop`
We used this as a "spill-over" HackMD to save older notes and questions in case the main one became unresponsive.
Contents have been moved to the workshop webpage repository for editing & publishing.
# CodeRefinery May 2021 workshop
## Planned schedule day 6:
* 9:00 - 10:45 Software testing (Thor Wikfeldt, Johan)
- 9:00-9:05 Short info about today's breakout rooms and possible questions from yesterday
- 9:05-9:10 [Motivation](https://coderefinery.github.io/testing/motivation/)
- 9:10-9:20 [Concepts](https://coderefinery.github.io/testing/concepts/)
- 9:20-9:40 [Testing locally](https://coderefinery.github.io/testing/pytest/)
- 15 minute breakout session, normal rooms
- 9:40-9:50 Break
- 9:50-10:20 [Automated testing](https://coderefinery.github.io/testing/continuous-integration/)
- type-along session
- 10:20-10:45 [Test design](https://coderefinery.github.io/testing/test-design/)
- 25 minute breakout session, normal rooms plus a few language-specific rooms with (language) expert helpers
- 10:45-11:00 Break
* 11:00 - 12:15 Modular code development (Anne Fouilloux, Radovan)
- all as demo in main room
* 12:15 - 12:30 Summary and where to go from here
* End 12:30
## Links
- Link to this page: https://hackmd.io/@coderefinery/2021-may-workshop
- Workshop agenda and lesson material: https://coderefinery.github.io/2021-05-10-workshop/
- Visibility of this document: everybody can read and write
- Code of Conduct: https://coderefinery.org/about/code-of-conduct/
- Twitter hashtag #CodeRefinery
- Questions from previous days: https://coderefinery.github.io/2021-05-10-workshop/questions/
- If this document becomes unresponsive, you can also ask for help in exercise rooms here: https://hackmd.io/@coderefinery/exercise-room-help
## Please rename yourself on Zoom
Please rename yourself on Zoom to make breakout room management easier:
- Examples:
- Learner in room 12: "(12) Firstname Lastname"
- Exercise leader in room 9: "(9,H) Firstname Lastname"
- Expert helpers and instructors: "(CR) Firstname Lastname"
- Observers: "(OB) Firstname Lastname"
## How to raise issues and ask questions
- Please always ask questions at the bottom of this document
- Recommended feedback controls for learners to signal issues
- any questions about content/exercises should go to hackmd (this document)
- technical problems concerning video/audio: zoom chat
- don't use "raise hand" to ask questions, because the audio would end up in the stream
- When not editing this document, please use "view" mode ("eye" symbol top right or top left of page)
- but we also learned that when not logged-in, you may need to be in "view" mode to get live updates
## Room allocation
- Rooms 1-19 are teams
- Rooms 20- are independent
- Some rooms <20 are empty to allow large teams to split into two rooms
:::info
You can find your breakout room number in your email, and hopefully you have included it in your Zoom name. With recent Zoom versions, you can join the room yourself:
<img src="https://coderefinery.github.io/manuals/_images/zoom--breakout-room-button.png" width="49%">
<img src="https://coderefinery.github.io/manuals/_images/zoom--breakout-join.png" width="49%">
You can return from the breakout room whenever you want:
<img src="https://coderefinery.github.io/manuals/_images/zoom--leave.png" width="49%">
<img src="https://coderefinery.github.io/manuals/_images/zoom--leave-breakout-room.png" width="49%">
If you can't (for example, if you are using the web browser client), use the "raise hand" reaction and we will move you. Make sure that your name includes your breakout room number.
<img src="https://coderefinery.github.io/manuals/_images/zoom--reactions.png" width="75%">
:::
### Breakout room size adjustments
Please let us know here if your breakout room is too small and you'd like to merge with another room
## HackMD unresponsive & old questions
If this hackmd becomes unresponsive, we will also watch https://hackmd.io/@coderefinery/exercise-room-help
Questions and answers from previous days moved to https://coderefinery.github.io/2021-05-10-workshop/questions/
Old questions have been copied to here: https://hackmd.io/qIlUPDTbSVCCstUto10kTw?edit
## Icebreaker for day 6
### What do you know now that you wish someone had taught you when you started studies/research/work?
We may tweet some of these answers, of course anonymized.
- Only fools don't ask questions / asking questions is not foolish but essential in research :+1: :+1: :+1:
- First make your code work, only then make it good or fast (if necessary)
- Making a logical folder/data structure can save a lot of time :+1:
- Git :+1: :+1: :+1:
- Focus more on methods and less on theory. :+1:
- Save the editable version AND the scripts with the data for the figures
- Avoiding "clever" code. The simpler the code the better. :+1:
- To not worry so much about code speed. I have a feeling I spend more human time trying to understand code than CPU time running the code.
- Save all results, even the ones that don't seem necessary
- It is always worth it to spend time on version control, testing and documentation. Start early! :100:
- Write a paper when you have a result (really, no-one told me...)
- define proper processes to be followed: folder structure, version control, etc.
- To properly document the code +1 :+1: :+1:
- Spend time on making a good research design early on; this will save you a lot of time and make things easier along the way
- Make all the notes! Tomorrow you will not remember anything. :+1: :+1: :+1:
- great! I also started doing that. notes for everything.
- Don't be self-destructive. Nobody knows everything and it is OK to say you do not have the answer.
- That data management is really important!
- Most old professors are totally biased by obsolete methods: there is much more out there than what they think they know... :100:
- Comment, comment, comment! There is no such thing as too many comments in code.
- Never forget readme files :+1:
- Python, and coding in general, taught by actual computer engineers in dedicated lessons, not merely by physicists with poor programming practices
- Sharing your knowledge with others via teaching or mentoring is an excellent way to learn yourself and benefits others at the same time.
## Questions on earlier day's topics
- I have difficulties finding the links to the previous HackMD pages (I always need to go to the emails). Perhaps it would be nice to put the links in https://coderefinery.github.io/2021-05-10-workshop/ or any centralized page that could play the role of an index. :100:
- thanks for the suggestion. indeed we should collect all resources in one place and cross-reference them. I have moved this good suggestion to our "lessons learned" (on chat) so that we improve this in the future. we wish we could also put the zoom link on the workshop page but we are worried about zoom-bombing.
- We have been using Git Bash and the Anaconda Prompt. Some instructions only work on one or the other platform. Why? Can this be fixed? How? In some cases this has been a limitation in following the lessons and it slowed us down (surely it slowed me). [Windows question]
- If the setup is fully correct, everything included in the CodeRefinery materials will work in both Git Bash and the Anaconda Prompt on Windows. Please provide more details about what didn't work and we can concentrate on it when improving the troubleshooting guide
- pip, for example, did not work in Git Bash
## Software testing
https://coderefinery.github.io/testing/
### Motivation
- direct comparison could also give different results across different OSes and deployment environments, right?
- It is an important thing to consider and test for
- I totally don't get the point of this test. Seems trivial. Why would the function NOT return the expected result when you feed it the right input?
- Well, it's a first demonstration
- And what happens when it gets more and more complex over time? We'll see that a basic principle is to always have more than one place to verify a result
- OK now I get this, this is interesting, thanks. So if somebody updates / expands the function, the test should still return the same result for a specific input. Right!
- And finally, when I make the function the first time, I want to run it once to verify. I've learned: why not take that one sample, put it in a test instead of the console, and let it run over and over again? Gets a lot more useful with big code
- Also sometimes one person's trivial is other person's non-trivial. But of course I agree that this is a trivial example. But often a function can be "obviously" correct or wrong for the original developer but it may be a lot less obvious for a contributor or the new person joining a project.
- What is regression testing?
- Check that new behavior (or results) is the same as old (even if you don't exactly verify that either is "correct"). See the sketch below.
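A minimal sketch of what such a regression test could look like (the `simulate` function and `reference.json` file are made-up examples): a result recorded from an earlier, trusted version of the code is stored, and new output is compared against it.
```python
import json

def simulate(n):
    # stand-in for the (hypothetical) function under test
    return [i * i for i in range(n)]

def test_simulate_regression():
    # reference.json holds output recorded from an earlier, trusted
    # version of the code; we only check that behavior is unchanged,
    # not that it is "correct"
    with open("reference.json") as f:
        reference = json.load(f)
    assert simulate(len(reference)) == reference
```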
- For "when to make tests", one of my philosophies is "if I have to test manually, may as well make it automatic". It can also make the interface easier, since you can use all the test setups.
- also, automating it makes it easier to remember how to do the test steps. great for new people joining a project: they can look up the testing scripts and from that deduce how to install and test (if not documented elsewhere)
- Integration/Regression tests work also as examples of usage
- great point. they can be great "documentation" which is sometimes more up-to-date than the actual documentation :-) :+1:
### Concepts
- How do we test functions whose output is unknown to us? I.e., we know what they are supposed to do but don't know whether the result is correct or not (as in calculating a high-dimensional gradient)
- Good question. General advice is to try to calculate the result by other means, perhaps a different algorithm or approach. Of course this is not always possible.
- To me, it's the same philosophical question as when making the code in the first place. I would first test that it runs without failures. Are there any corner cases where you know the results ($p=0$ makes empty results)? Are results the same as in your first run? Can I implement the same thing in a simpler way to compare (even if I don't know whether either is scientifically correct)?
- I like to approach it like this: how do you verify yourself whether it does what you expect? Then try to express this in writing, then try to express this in script/code. Now you have a test.
- Ammm, tests are not supposed to be super complicated. They are supposed to check the basic functionality. Thus computing a high-dimensional gradient of a constant and some other analytically known simple results would be sufficient. If it works for the simple tractable cases, it should work for the more complex ones too.
- if you cannot predict the output exactly (numerical issues, randomness) you could also at least test some general sanity checks, e.g. that the dimension of the gradient is correct or that the returned type matches the expected type.
- For numerical algorithms, I would also get the result with a computer algebra system (Mathematica, Maple) and check that the output of my code matches that.
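To make the suggestions above concrete, here is a sketch in pytest (the `gradient` function is a made-up stand-in): test an analytically known simple case, plus sanity checks on shape and finiteness.
```python
import numpy as np
import pytest

def gradient(f, x, h=1e-6):
    # made-up function under test: central-difference gradient of f at x
    grad = np.empty_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2 * h)
    return grad

def test_gradient_of_constant_is_zero():
    # analytically known simple case: the gradient of a constant is zero
    x = np.random.rand(5)
    assert gradient(lambda x: 3.0, x) == pytest.approx(np.zeros(5), abs=1e-8)

def test_gradient_sanity_checks():
    # even when exact values are unknown: correct shape, finite entries
    x = np.random.rand(7)
    g = gradient(lambda x: np.sum(x**2), x)
    assert g.shape == x.shape
    assert np.all(np.isfinite(g))
```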
- For C++, for example, do I need to include the test tools in my package? What if others use a different test tool?
- tests are supposed to test your own package and are independent of how your package is used by others.
- No example for each one?! (from zoom chat)
- "each one" refers to what precisely?
- Different tests, like unit tests, integration tests, etc.
- examples for each type of test maybe?
- good suggestion. we should add examples for each of these concepts. also pull requests and suggestions very welcome (there is always room for improvement and it's a process ...)
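Until such examples are added to the lesson, here is a rough illustration of the difference between a unit test and an integration test, using made-up toy functions:
```python
def clean(text):
    return text.strip().lower()

def count_words(text):
    return len(text.split())

def test_clean():
    # unit test: checks one function in isolation
    assert clean("  Hello ") == "hello"

def test_clean_and_count():
    # integration test: checks that several units work together
    assert count_words(clean("  Hello beautiful world ")) == 3
```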
- Any way to test the response to time-dependent results? Time is difficult to reproduce on different platforms.
- in this case I would test functionality on many platforms but the timing only on one or a few, where it's somehow well defined, and I would test with some tolerance. but indeed it is tricky.
- "fake time" for your tests. Here we are discussing unit testing (mostly).
- actually I meant "run time" rather than time as input/argument.
- Can you elaborate? I still usually make my code believe there are specific run-time conditions.
- true, but if you want to test a function that does something in a given time, then the time has to be actual run time. The fake time will bypass the function that I want to test. I guess a better design would remove this need, somehow.
- hmm! I would like to hear more about this. how does this work?
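For the "fake time" idea mentioned above: one common approach in Python is to patch the clock with `unittest.mock` so the test becomes deterministic (the `is_expired` function is a made-up example). Note that this fakes the current time as seen by the code; measuring actual run time, as discussed above, is a different problem.
```python
import time
from unittest import mock

def is_expired(created_at, lifetime=3600):
    # made-up function that depends on the current time
    return time.time() - created_at > lifetime

def test_is_expired():
    # freeze "now" at a known value so the test is deterministic
    with mock.patch("time.time", return_value=10_000.0):
        assert is_expired(created_at=1_000.0)      # 9000 s old: expired
        assert not is_expired(created_at=9_500.0)  # 500 s old: still valid
```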
- If you cannot test everything, it's always good to test the part(s) of the code you just changed
- I'd always recommend testing before merging to the main/master branch
- Can we use `pytest` in a Jupyter notebook?
- Yes.
- You can also run shell commands in the notebook if you add `!` before the command, e.g. `!pytest -v example.py`
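One way to combine the two (a sketch; `test_example.py` is an arbitrary name): write the tests to a file with the `%%writefile` cell magic, then run pytest on it via the shell escape.
```python
%%writefile test_example.py
# this cell saves its contents to test_example.py
def add(a, b):
    return a + b

def test_add():
    assert add(2, 3) == 5
```
Then, in the next cell: `!pytest -v test_example.py`.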
## Break 09:40 - 09:50 ##
## Exercise: Testing locally 09:30 - 09:40 ##
- Interestingly, we found out that `assert add(0.2, 0.3) == 0.5` passes the test, but `add(0.1, 0.2) == 0.3` fails :D
- Numerical accuracy problems! Floating point numbers have many of these things :+1:
- that's because 0.5 is exactly representable as a floating-point number.
- This link (and the "Accuracy problems" section further down the page) may be a good starting point: <https://en.wikipedia.org/wiki/Floating-point_arithmetic#Representable_numbers,_conversion_and_rounding>
- addition is still a nice operation in floating point arithmetic, because you can actually exactly represent the rounding error as a floating point number (you can google error-free transformations if you are interested in knowing more)
- https://docs.python.org/3/tutorial/floatingpoint.html `round(0.1 + 0.2, 10) == round(0.3, 10)` (should help)
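As several rooms discovered below, the usual remedy in pytest is `pytest.approx`, which compares with a tolerance instead of exactly:
```python
import pytest

def add(a, b):
    return a + b

def test_add():
    # exact comparison would fail: 0.1 + 0.2 == 0.30000000000000004
    assert add(0.1, 0.2) == pytest.approx(0.3)
```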
Link: https://coderefinery.github.io/testing/pytest/
Duration: u
- 1: Done
- Room 2
- 3: Done. We discussed a bit what `approx` actually does.
- 4
- 5: Done
- 6
- 7
- 8
- 9: Done. We talked about floating-point precision, differences between float and double, etc. Also, it seems a bit unclear what the `approx` function does.
- 10 done
- 11 Done
- 12 (all in 11)
- 13 Done, but add(0.2,0.3) == 0.5 passes :/
- I would say you were lucky not to have floating-point rounding errors (or rather Python took care of them for you in the background). For larger floats, and for multiple addition operations, the error can easily build up, making it good practice to have tests that allow for the error.
- I know ;) And 0.1+0.2 does fail, just surprising ;)
- yes, some combinations work fortunately/unfortunately. this exercise has been carefully designed to fail with 0.1+0.2
- :D
- more info and workaround: https://docs.pytest.org/en/6.2.x/reference.html#pytest-approx
- 14 Done
- 15 Done
- 16 Done
- 17 Done
- 18
- 19 done
- 20 done
- 21: in break
- 22: done
- 23: Done
- 24: done
- 26: done
- (great question from breakout room): Why do we need pytest (or similar testing frameworks) if I can achieve the same by inserting asserts/isinstance into the code?
- inline testing is good for safety, but not really enough to run full models on test data with known outputs
- also, these asserts would then test at runtime, and a run may take many minutes or even hours; with a testing framework we can test functions in isolation, which often takes only seconds. Plus we get the flexibility to run some tests selectively. But it is still a good idea to also use assertions.
- great question! worth taking up in the main lecture?
- we should share more questions from BORs here in hackmd. :+1:
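A small illustration of the difference (with a made-up `mean` function): an inline assertion guards every real run of the code, while a pytest test lives separately, exercises the function in isolation, and can be run on demand and selectively (e.g. `pytest -k mean`).
```python
# inline assertion: runs every time the code runs,
# guarding against invalid state at runtime
def mean(values):
    assert len(values) > 0, "mean of empty list"
    return sum(values) / len(values)

# pytest test: lives separately and runs only when you ask for it
def test_mean():
    assert mean([1.0, 2.0, 3.0]) == 2.0
```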
- We were having fun with `approx` and were trying out e.g. `approx(add(0.2, 0.3)) == approx(0.5)`, and my question is basically stated below by the next person...
- great! I think it's very useful for the future to see this.
- but I also highly recommend inspecting the actual implementation: https://github.com/pytest-dev/pytest/blob/main/src/_pytest/python_api.py#L512
- We discussed the `approx` function a bit. What exactly is its output? We are now relying on an external function for our test but we don't understand exactly what the function does.
- ok, let's maybe first think how we would do this if we didn't have the `approx` function: you could check `assert abs(result - expected_result) < 1.0e-14` (adjusting the tolerance). also, the `approx` function from pytest has a built-in tolerance which can be changed.
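A short sketch comparing the manual check with `pytest.approx`; the default tolerances noted below (relative `1e-6`, absolute `1e-12`) are from the pytest documentation.
```python
import pytest

# manual version: absolute tolerance chosen by hand
assert abs((0.1 + 0.2) - 0.3) < 1.0e-14

# pytest.approx with its defaults (rel=1e-6, abs=1e-12)
assert 0.1 + 0.2 == pytest.approx(0.3)

# both tolerances can also be set explicitly
assert 0.1 + 0.2 == pytest.approx(0.3, rel=1e-9, abs=1e-12)
```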
- the HTML code below has messed up code highlighting, anyone have a solution? fixed with `:::`
- Fixed by adding `julia` after the opening triple backticks.
:::spoiler Julia implementation of `isapprox`
```julia
function isapprox(x::Number, y::Number;
                  atol::Real=0, rtol::Real=rtoldefault(x,y,atol),
                  nans::Bool=false, norm::Function=abs)
    x == y || (isfinite(x) && isfinite(y) && norm(x-y) <= max(atol, rtol*max(norm(x), norm(y)))) || (nans && isnan(x) && isnan(y))
end
```
:::
## Automated testing
https://coderefinery.github.io/testing/continuous-integration/
Questions:
- If you need external data to test the code, e.g. downloading an ML dataset, do you have to set up something specific for GitHub Actions?
- a GitHub Actions runner is like a normal system, so it can download data (but will it use too much bandwidth?)
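On the Python side, nothing GitHub-Actions-specific is needed; for example, a session-scoped pytest fixture could download the data once per test run (a sketch; `DATA_URL` is a placeholder, and caching the file to save bandwidth is left out):
```python
import urllib.request
import pytest

DATA_URL = "https://example.org/dataset.csv"  # placeholder URL

@pytest.fixture(scope="session")
def dataset(tmp_path_factory):
    # download once per test session into a temporary directory
    path = tmp_path_factory.mktemp("data") / "dataset.csv"
    urllib.request.urlretrieve(DATA_URL, str(path))
    return path

def test_dataset_is_nonempty(dataset):
    assert dataset.stat().st_size > 0
```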
- Should we include a license for the new repo or not? :)
- great suggestion!
- Which one would be nice for this example? MIT?
- yes, MIT or BSD are permissive and the only thing they require is attribution