CAM-Gerlach
## https://hackmd.io/@CAM-Gerlach/SJPDZXJ5h/edit

## Description

Let's face it: the overwhelming majority of current scientific code is siloed away into one-off scripts and notebooks, where the only real mechanism for reusing and building upon it is good old copy and paste. In order to keep "building upon the shoulders of giants", we need to achieve not only reproducibility of individual results, but also true reusability of research methods that can be shared, built upon, and deployed by users across the world.

At this BoF, we invite the community to share their tools and workflows for reusable science, and hope to explore how we can encourage users to expand beyond the current notebook-centric monoculture and toward holistic, open, modular and interoperable approaches to conducting research and developing scientific code.

The ideas and discussion at the BoF and in this document will inform future guides and resources on this topic, to be hosted on central community platforms like the Scientific Python organization.

## Schedule

5:45 - What is reusable research and why is it important, and what are the goals & outcomes of the BoF?
5:50 - What tools and techniques do people have to share for effective reusable research?
6:05 - How can we integrate reusable research into existing workflows?
6:20 - How do we teach students and researchers about reusable research, and encourage using them?
6:35 - Additional discussion, plugs, "talk to me afters" & closing

## What is reusable research and why is it important, and what are the goals & outcomes of the BoF?

* Reusable research
    * Can be not only replicated, but also built upon and extended easily by both the author and others
    * "Building upon the shoulders of giants" is the foundation of both open science and open source
    * One-off scripts and notebooks are typically not very reusable; you generally cannot easily:
        * import them
        * specify dependencies
        * extend them
        * use them for another project (without copy/paste and managing multiple versions)
    * And additionally, for notebooks specifically, you cannot easily:
        * track them in VCS (with clean diffs)
        * lint, type check, test or format them with standard Python tools
        * interoperate with most other non-notebook-specific ecosystems
        * Etc.
* Primary BoF goals (and topics)
    * Share tools, techniques and resources for reusable research
    * Discuss how we can better integrate them into common workflows
    * Determine how we can effectively teach and encourage their use among both newer students and established scientists
* BoF outcomes
    * This document, recording our discussions, ideas and resources from the BoF, which will be made public afterwards
    * Will inform potential future guides and resources hosted on a central community location (e.g. Scientific Python)
    * Serve as a jumping-off point for potential further events at future conferences

**_Even if you don't get a chance to speak up yourself, feel free to add any notes you like to the relevant categories!_**

## What tools, techniques and resources do people have to share for effective reusable research?

* There's a tool called nbflake8 to lint notebooks
    * Would be cool to have a Ruff-based tool too
* It can be difficult to compare outputs between notebooks created by different researchers
    * VSCode recently changed the notebook diff viewer to more easily show just the code, and optionally show metadata changes as well
    * Is there a way to see if there is just an output?
    * https://code.visualstudio.com/docs/datascience/jupyter-notebooks#_custom-notebook-diffing
* This is an idea I've been working on for 10 years
    * We put the stuff we want to be modular in a regular Python module, and then have a Jupyter notebook that shows an example using the code
    * Have a collection of modular calculations that start with a Python function, decorate it with a class and then it connects to the framework that
    * https://github.com/usnistgov/iprPy
* I'd probably add devcontainers: when working with a lab group or collaborating with a team, they allow you to work together and see everything on their screen. In VSCode, Live Share is also a really cool similar feature.
* One of the things we do on our project is that everything has to be documented, and one step we're still struggling with is reducing a notebook to the type of report NASA is typically looking for
* jbednar: I'd argue that a notebook is not a unit of reproducible research, a project is (notebooks or scripts + environment + record of commands to run there). See [8-levels of Reproducibility](https://www.anaconda.com/blog/8-levels-of-reproducibility) and [Conda Project](https://github.com/conda-incubator/conda-project).
* Adding a plug for [papermill](https://papermill.readthedocs.io/en/latest/) - super useful tool for parameterizing and executing notebooks programmatically

## How can we integrate reusable research into existing workflows?

* I really like the [cookie template that Henry (III)](https://github.com/scientific-python/cookie) has for packaging
    * A lot of my workflows are just messing around with my data
    * Having something like a package structure from the get-go makes it easier to not miss things
* Following up on that, I'm in nuclear engineering and we often have two-week projects with Jupyter at the center
    * We have a cookiecutter template that has Sphinx, a directory structure, and metadata that looks familiar and has everything set up by default
    * This particularly helps ensure that different colleagues and team members are on the same page with doing things
* Been using the [DrivenData cookiecutter template](https://drivendata.github.io/cookiecutter-data-science/) to have a structured way for where to put things
    * This helps ensure consistency in terms of what things are named, and the order to run things
* There's a really [cool tool called "Show your work"](https://github.com/showyourwork/showyourwork) that comes out of the astrophysics community, that's more in line with wanting to produce a paper at the end but include all the steps that show your work along the way
    * Show your work gives you a template so you can show your work at the end
    * It is built on a tool called Snakemake
    * Show your work sets up the template and the paper
    * Really helpful guide for getting started and ensuring all your projects have the same structure
    * Axel, who gave a talk on Wednesday, published their Gammapy paper using this tool
    * For a related tool for citing open source authors specifically, check out [duecredit](https://github.com/duecredit/duecredit/), which looks at your code and finds the authors (via git commit) that wrote the code
* Followup question: How is this different from Quarto?
    * Quarto is much more general, whereas Show your work was specifically built to allow users to produce a PDF in LaTeX at the end
* I do something very old-fashioned: I write an aaa_readme.txt file where I record a diary of what I was doing on that project, so if I take a break from working on it, I can go back to those notes and remind myself of what I was doing
* On that note, notebooks are supposed to be this thing to make programming literate
    * However, while beginners use them interactively because they don't know how to use debuggers, they don't always remember the literate part
    * nbdev is my favorite tool for that, but getting people accustomed to best practices can also be helpful for reproducibility
* I love notebooks, and also love modules, and love the flow of code from notebooks into modules once it approaches that point
    * Thinking of modules as a unit of documented, tested code, which doesn't mean a lot on its own, whereas a notebook combined with it gives it context and meaning
    * If your community is afraid of modules, then try to make creating them easier, rather than avoiding them, so that you have fully reimportable Python modules.
* For students, the notebooks often turn into a fancy scratch pad or script file, and once they get stuff that works, they can move that stuff out into modules
    * Then the notebooks start to morph into examples and a history of the work that can be interpreted by other researchers
    * Tools like autodoc in VSCode can be a great way to reduce the friction for students, as they just add the triple quotes and VSCode expands the rest
* Wrt [nbdev](https://github.com/fastai/nbdev), you can develop your code and let it grow, and then eventually you can export the parts of your code as modules at the end
    * Downside of the documentation is that it talks about everything as packages, but you can use it for individual notebooks and modules
    * We're talking about students here, and I was hesitant to show it to my students since they're early Python programmers, but it was actually quite easy to have that one line at the end
    * Do you have a page where you document this? I'm still learning Python and would like to learn more about this and teach it to my students
    * [nbconvert](https://nbconvert.readthedocs.io/en/latest/index.html) is a similar CLI tool that can convert notebooks to many different formats, including a Python script. This is also similar to the [built in VSCode feature](https://code.visualstudio.com/docs/datascience/jupyter-notebooks#_export-your-jupyter-notebook)
    * I did a tutorial here and can share that; the documentation is pretty intimidating but it would be great to have that in a smaller-scale setting
* Juanita: A cool output of the BoF is this document listing a bunch of tools, which can be the input to a series of guides and tool lists online
* Question: How do I get started?
* Issue: Documenting the parameters of your modules; without that, it's very difficult to use them.
    * jbednar: That's what param.holoviz.org is for. :-)

## How do we teach students and researchers about reusable research, and encourage using them?

* It's one thing when it's students, but how do you do that when it's your whole organizational culture that needs to change?
    * Juanita: I am a student myself, and no one ever really talked to me about IDEs and explained what they were and why you'd want to use one.
    * It's important for teachers to actually teach them about using the proper tools
    * But I have no idea when it comes to coworkers using these things
* With respect to the team situation, the most effective way I found is nerd sniping
    * You figure out what the biggest pain point is for the team, and it's usually something that should be automated
    * So I've tricked people into using better practices by showing them how these tools can fix that problem
    * (Juanita) Yeah, I think it's really just awareness; if you show someone a cool tool, most folks will make the decision to adopt it on their own, but there will always be folks who might not want that
* I think students mostly get introduced to notebooks through classes in contexts that are very different from how they would use them for their research
    * More a question really, as I don't have a good resource for that to hand to a student if they have a question or are confused about that
    * (Juanita) I think that should be part of the curriculum; why are people learning machine learning using Jupyter notebooks without learning how to use Jupyter notebooks?
* Many folks don't come from a traditional computer science background and may not know about all these tools, so we get a lot of benefit from students bringing in new ideas
* CAM: I feel the fact that students are only exposed to notebooks really makes them not necessarily want to reach for other tools even when they would be more appropriate down the line
    * (Response) I feel we should be encouraging students to use an IDE like JupyterLab that offers many of those IDE-like features while also allowing them to take advantage of the notebook's interactive features
    * Juanita: I was a Spyder developer (and CAM is too), and I feel that we should show students how to use tools like debugging and make it easier for them to do that, but give them the choice whether they want to use those tools. I think the right approach is not necessarily telling them what tool to use, but having documentation and exposure to those tools so students can pick the best option for them.
    * It's true, we want to give students options, but many might not need a debugger
* I work in the library here at UT, and we often only have an hour to introduce users to Python
    * We use Google Colab (notebooks) because it makes it a lot easier for students to get started with Python than having to download and install an IDE
    * And then students tend to be familiar with that tool and continue to use it
    * (Juanita) I'm a big fan of using videos to help reach students over reading the documentation, as I feel they are much more likely to watch them
* I am a particle physicist and I ask all my students to use jupytext.
    * This helps students convert notebooks to Python files that can be committed to Git.
    * The code can be committed as Python.
    * In Jupyter we can right-click and open a Python file as a notebook and continue working on it.

## Additional discussion, plugs, "talk to me afters" & closing

* One other thing I want to add is that students might have familiarity with Python or R, but Git is a completely different animal and it is quite challenging to factor that into education
    * My wife is a writer, and she would really benefit from Git but it's really hard to get her to use it
    * Yeah, we may not be aware of how inefficient the workflows we use are, because that's all we know
* Feel free to add more questions, comments, and feedback in the Slack channel or this document
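
A minimal sketch of the workflow several participants described in these notes: the reusable logic lives in a documented function that could later move into a module, and the notebook-style cells only call it. The file uses `# %%` cell markers (the "percent" format that jupytext can pair with a notebook), so it runs as a plain Python script and diffs cleanly in Git. The function name `normalize`, its parameters and the sample data are hypothetical, chosen only for illustration.

```python
# %% [markdown]
# # Example analysis (hypothetical)
# Reusable pieces live in functions; the cells below only call them.

# %%
from statistics import mean


def normalize(values, reference=None):
    """Scale values by a reference level.

    Parameters
    ----------
    values : list of float
        Measurements to scale.
    reference : float, optional
        Level to divide by; defaults to the mean of ``values``.
    """
    if reference is None:
        reference = mean(values)
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return [v / reference for v in values]


# %%
# Notebook-style usage cell: call the function on illustrative data.
scaled = normalize([2.0, 4.0, 6.0])
print(scaled)
```

Once `normalize` stabilizes, it can be moved into a regular module and imported from the notebook, leaving the notebook as the worked example and record of the analysis.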
