---
tags: tool-development
---
# Autograding Documentation
Some documentation of abc classroom here:
https://hackmd.io/0ZbGctpuSdqYK2OdPI51dw?both
## Jenny and Leah's Notes!!
YAML FILE CONFIGURATION -
We have a yaml file in abc classroom that specified
## Make Template Repo Workflow Comment
TODO: This script creates the template repo in the same directory as the nbgrader repo.
That is not optimal, as it means you potentially have a git repo nested in another git repo.
That could stay as a behavior, but we likely want the user to specify a location, or
the script places it in the same dir as the nbgrader repo (that is likely ideal?).
That could actually be a config yaml setting.
TODO: we should be able to specify the files contained in the directory or it should move everything.
TODO: YAML file could have both assign date and due date and time
TODO: Can you create a new template repo on github from the github api ?? it would be better if we could rather than manually creating it and then connecting -- look into that.
TODO: when the template repo already exists, and you try to use the both flag, it exits. Desired behaviour: it should just push and have a message that says, template repo exists, pushing to github.
TODO: when the files are pushed:
1) The only file that is pushed is the notebook. We want it to push other files in that directory too.
2) We want to populate all repos with standard files like .gitignore files and a readme.
3) ABC classroom handles standard files with a config.
4) We would like a standard readme file to be created for all assignments, with the option of overwriting it if we want to do that.
## Create the Assignment in Github Classroom
See jenny's note line 282 -- <>
TODO: When you create the github classroom assignment template, there is a check box for Template repository under settings. If this is not checked, the template association will fail. Can this be done at the CLI??
Github Classroom
* All commits are coming from Jenny??? Curious as to why?? must be a classroom setting
## Student Roster Issues
TODO: github provides IDENTIFIER in the csv download, nbgrader needs ID. I manually changed it to ID. If I switch it back to identifier, the CLONE step works ....
TODO: A github username column is required. so we need a template roster file for people creating it.
>[name=JennyPalomino] See my comment on the roster download from GitHub Classroom on line 216 (and repeated on line 334).
## Clone repos ...
> Note - clone organizes by student. Is organizing by assignment or by student preferred?
>
## Pull Down Assignments
TODO: there are several flags here including classroom name (which is actually the org name), roster and assignment that could be identified from the config.yml file. So the config could have
* student roster OR csv path to roster
* org name
* abc clone assignment-name-here
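As a rough sketch of what such a config could carry, assuming hypothetical key names (`org_name`, `roster`, `assignment`) that do not exist in any current script or in abc-classroom, the three flags could be collected and validated like this. It takes an already-parsed dict so that it stays standard-library only; reading `config.yml` itself would need a YAML parser.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class CourseConfig:
    """Hypothetical settings that are currently passed as CLI flags."""
    org_name: str    # GitHub organization (what the scripts call the classroom name)
    roster: Path     # path to the csv roster
    assignment: str  # assignment name, same in nbgrader and GitHub Classroom


def load_config(raw):
    """Build a CourseConfig from a dict, e.g. the parsed contents of config.yml."""
    required = ("org_name", "roster", "assignment")
    missing = [key for key in required if not raw.get(key)]
    if missing:
        raise ValueError("config is missing: " + ", ".join(missing))
    return CourseConfig(
        org_name=raw["org_name"],
        roster=Path(raw["roster"]),
        assignment=raw["assignment"],
    )
```

With something like this in place, the clone step would only need the config file rather than three separate flags.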
TODO: when I created the roster initially, I replaced identifier with ID. However, the clone step requires the identifier column. It also wants a github username column. We need to streamline this step.
TODO: when I graded, I got the assignment and a copy of the assignment. This is likely because the roster needs a specific set of columns. We need to build some functionality around roster format.
Question: Renamed assignment notebooks are pulled down into the submitted directory but they are not autograded. The autograding only works on the notebook that exactly matches the name of the released .ipynb. Is it possible to run the autograder on a .ipynb that has a different name? Or do we need to tell students to not rename the assignment notebook (even if they are only adding their name to the end of the filename)?
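Until that question is answered, one option is to at least flag renamed notebooks before grading. The sketch below is only an illustration: the function name is made up, and it assumes the default nbgrader layout of `release/<assignment>/*.ipynb` and `submitted/<student>/<assignment>/*.ipynb`. It does not make the autograder accept renamed files; it only lists them.

```python
from pathlib import Path


def find_renamed_notebooks(course_dir, assignment):
    """Return submitted notebooks whose file name matches no released notebook.

    Assumes the default nbgrader directory layout (see lead-in).
    """
    course_dir = Path(course_dir)
    release_dir = course_dir / "release" / assignment
    released = {nb.name for nb in release_dir.glob("*.ipynb")}

    renamed = []
    pattern = "*/{}/*.ipynb".format(assignment)
    for nb in sorted((course_dir / "submitted").glob(pattern)):
        if nb.name not in released:
            renamed.append(nb)
    return renamed
```

The returned paths could be printed as a report, or renamed by hand before running the autograder.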
## How to Create a good notebook
* this might end up in nbgrader repo / docs!!
## Overview of autograding workflow
We are using nbgrader and GitHub Classroom for the workflow. These two platforms are not integrated, so there are a set of [integration scripts](https://github.com/earthlab/autograding-notebooks/tree/master/scripts) that move information between them. There are also a few steps that need to be repeated in both places (adding assignments, adding students). At a high level, the workflow to set up the first assignment in a new course is as follows. Details of course and assignment setup are in the next sections.
> [name=Leah Wasser] what is the list below? is the first element the tool being used and the list is in order of where you would do each thing? i'm a bit confused by it. it seems like we could group by tool as some of these items don't need to be done in a particular order and others do.
> [name=Jenny Palomino] Karen made this list to provide an overview of what is happening where. So the first element is the where. I was thinking that this may work better as two tables: one for setting up the classroom, which would cover steps 1-2, and one for creating through grading assignments. However, there are two steps under Set up New Assignment that are sort of part of the new classroom setup but are not completed until after the first assignment is created:
> * [_github classroom_] After all students have linked their github accounts, download the student roster as a csv file.
> * [_nbgrader_] Import the students into the database from the csv roster.
> So maybe we can think about how best to represent this division.
Set-up New Classroom
* [_nbgrader_] Create a new local nbgrader course directory
* [_github classroom_] Create a new GitHub Classroom. Add the students via their github usernames.
Set-up New Assignment
* [_nbgrader_] Create a new assignment and add one or more notebooks for the assignment to the nbgrader directory
* Put the nbgrader repo onto GitHub as a private repo
* [_nbgrader_] Validate the notebook(s) and create the student version.
* Create a template repository in the GitHub organization. Use the `make_template_repo.py` script to 1) create a local template repo from the nbgrader directory and 2) link to the GitHub remote.
* [_github classroom_] Make a new assignment for the course, adding the template repository.
* Provide the assignment link to the students and have them complete the assignment and then submit by pushing their changes to GitHub.
* [_github classroom_] After all students have linked their github accounts, download the student roster as a csv file.
* [_nbgrader_] Import the students into the database from the csv roster.
* After the assignment deadline, use the `clone-all.py` script to clone all of the student repos to the instructor computer and move the completed notebook to the nbgrader directory.
* [_nbgrader_] Grade the notebooks (autograding and manual grading) and generate the feedback reports.
* Use the `git-feedback.py` script to push the feedback reports to the student repositories on GitHub.
## How to Setup Course To use with Nbgrader and Github Classroom
The steps below only have to be done once for every course.
### Step 1: JupyterHub (Optional)
Update / create JupyterHub using the [hub-ops github repo](https://github.com/earthlab/hub-ops) with required environment / packages
* Give students access to the hub by adding their github usernames to the whitelist.
*NOTE: You do NOT need JupyterHub to run the grading workflow.*
### Step 2: Setup Your Local Python Environment
Next you need to set up your local environment. For the earth analytics courses we use Miniconda and an earth-analytics conda environment created specifically for our courses.
1. Install [Miniconda - URL will be live after merge of open PR](https://www.earthdatascience.org/workshops/setup-earth-analytics-python/setup-git-bash-conda/)
2. Install the [Earth Lab earth-analytics-python conda environment](https://github.com/earthlab/earth-analytics-python-env)
3. Activate the `earth-analytics-python` conda environment that you just installed. `$ conda activate earth-analytics-python`
4. Install the [nbgrader extensions into the active environment](https://nbgrader.readthedocs.io/en/stable/user_guide/installation.html#nbgrader-extensions):
```
jupyter nbextension install --sys-prefix --py nbgrader --overwrite
jupyter nbextension enable --sys-prefix --py nbgrader
jupyter serverextension enable --sys-prefix --py nbgrader
```
Note that the extensions will be available to use anytime you activate the `earth-analytics-python` environment. You could also use this workflow with another Python environment. This setup is for the earth-analytics courses specifically.
> [name=Leah Wasser] We could create a little bash script to install everything. Might just make things easier. This could be an intern task.
> [name=Nathan Korinek] ^Done
### Step 3. Setup GitHub Classroom
Next, create a github classroom for your course (if you do not already have one set up). This classroom will live within your chosen organization.
https://classroom.github.com/classrooms/new
We are using the github.com/earth-analytics-edu organization.
*IMPORTANT: You need to be an admin / owner of the organization to create a classroom. This is the only step that requires you to be an owner for the organization.*
> [name=Leah Wasser]https://classroom.github.com/classrooms/45207559-ea-leah-test-class
Once you have a course setup within github classroom, you can add students to the classroom:
* Get list of students and their github usernames
* Add the students that you want to have access to your new classroom. On the classroom page, click the 'Settings' tab and then 'Roster management' in the sidebar. You can paste in a list of identifiers, one per line.
> [name=Karen Cranston]Would be good to use Canvas IDs for the identifiers. Then we will have canvas and github ids associated for each student.
> [name=Jenny Palomino] CANVAS GitHub classroom functionality has now been enabled for https://classroom.github.com/classrooms/23106100-earth-analytics-bootcamp-fall-2019. We just had to email CU IT and provide the credential keys under Settings > Connection Settings.
>
> [name=Leah Wasser] Jenny did you test this? Did the integration work well?
> we should add some more info about how the link with canvas works here... this will be specific to CU but that is ok.
> [name=Jenny Palomino] Initially, the GitHub classroom roster populated with the students whose names matched exactly between CANVAS and GitHub (i.e. the actual first and last name, not the GitHub username). So there is a little manual work to link those students with different names between the two (e.g. Jennifer in CANVAS but Jenny in GitHub). The GitHub classroom roster now matches CANVAS exactly.
### Step 4: Setup nbgrader repository
`nbgrader` requires a specific directory structure in order to work properly. See the [nbgrader philosophy](https://nbgrader.readthedocs.io/en/stable/user_guide/philosophy.html) for details. In order to use the autograding workflow, you need to set up your course using this directory structure. Note that there is **one** nbgrader directory for each course (not each assignment).
* Create an `nbgrader` directory on your local computer using the command
`nbgrader quickstart course-name-here`
This is the directory that you will use to manage all assignments in your course. The course name does not have to match the name of the course that you used for your github classroom setup but we suggest that you do make the names the same to keep things simple.
In bash, `cd` to your local course directory. It is not yet a git repo.
Initialize the `course-name-here` directory on your local computer as a git repo using `git init`.
> [name=Leah Wasser] the above step is only needed IF we plan to push to github i think?
> [name=Jenny Palomino] this is the nbgrader directory for the course, so you can create/release assignments, etc, so it needs to be created locally (so you can work with assignments) but doesn't have to be pushed to GitHub
### OPTIONAL - Proceed with Caution:
If you want to share your nbgrader repo with others (such as others who might be contributing to grading and teaching in your class), you may want to connect the local nbgrader repo to a remote repo on github. IMPORTANT: this is where student grades will be stored. Be sure to make this repository private!
To share your nbgrader repo as a private repo on github:
1. Create an nbgrader repo with the same name in your github organization. Be sure that it is PRIVATE as this is where grades will be stored.
2. Setup `remotes` to connect your local repository to the github repository. *Note: you can copy the remote url from github.*
`git remote add origin https://github.com/your-org-name-here/your-class-name-here-nbgrader.git`
* Next, [add the students to nbgrader](https://nbgrader.readthedocs.io/en/stable/user_guide/managing_the_database.html#managing-students).
> [name=Leah Wasser] this is confusing and needs some explanation. I thought i'm adding them to the classroom but here i'm adding them again? Any explanation you guys can add here would be really nice.
> [name=Jenny Palomino] we may want to move this step to later in the workflow, after the first assignment is created and accepted by students. I did not complete this step here because I waited until the students had accepted the assignment and then I can download the roster from GitHub classroom, as the HINT says below.
HINT: The easiest way to add students to nbgrader is to download a roster in .csv format from GitHub classroom. Wait until the students have accepted the first assignment and linked their github accounts. Then you can download the full roster as a `.csv` file from GitHub.
To download a roster from GitHub classroom:
1. Go to the classroom settings --> roster management. There you will see a list of all students in your classroom.
2. Click on the green download button in the upper right hand corner. The file will be called `classroom_roster.csv`.
3. Add this file to your local nbgrader directory.
4. The `.csv` file contains almost all of the information that you need to add to nbgrader. Open the csv file and add a column called `id`, populated with the values from the `github_username` column. nbgrader requires a column named `id` to import students. Leave the `identifier` column in place, because the clone step requires it.
5. Once you have added the `id` column, you can import your students into nbgrader:
`$ nbgrader db student import classroom_roster.csv`
> [name=Leah Wasser] This csv option doesn't seem to work as-is, because we need an id column in the csv and GH classroom by default has an identifier column. I wonder if we could submit a PR to nbgrader that would accept either id or identifier to play more nicely with github classroom.
> [name=Leah Wasser] Is there a way to get student first name, last name and email in there? Or does it matter? Ask Jenny.
>[name=JennyPalomino] 9/23/19 - I tested pulling down the repositories and running the autograde on the first assignment submitted via the GitHub classroom workflow. I used the roster that I downloaded from GitHub Classroom and had to add a column called `id` and populate it with the same value provided in the github_username column. This change was needed in order to import the student.csv roster in nbgrader. I left the identifier column in the spreadsheet. So far, the workflow is working and has been tested through the autograde step. Once I have completed the manual grading and pushed the feedback reports back, I will confirm that adding this `id` is the only change needed to the roster provided by GitHub Classroom.
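The manual roster edit described in the comment above could be scripted. This is a minimal sketch, not part of the integration scripts: the function name is made up, and it assumes the downloaded roster has the `identifier` and `github_username` columns mentioned in these notes. Skipping rows with an empty username (students who have not linked an account yet) is a design assumption, not tested behavior.

```python
import csv


def add_id_column(src_path, dest_path, source_column="github_username"):
    """Copy a roster csv, adding an `id` column filled from `source_column`.

    Rows with an empty `source_column` are skipped.
    Returns the number of rows written.
    """
    with open(src_path, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or [])
        if source_column not in fieldnames:
            raise ValueError("roster has no '{}' column".format(source_column))
        rows = [row for row in reader if (row[source_column] or "").strip()]

    # Keep every original column (including `identifier`) and put `id` first.
    if "id" not in fieldnames:
        fieldnames.insert(0, "id")

    with open(dest_path, "w", newline="") as dest:
        writer = csv.DictWriter(dest, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            row["id"] = row[source_column].strip()
            writer.writerow(row)
    return len(rows)
```

Writing to a second file keeps the original download untouched, so the same roster can still be used for the clone step.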
Once you have completed the steps above, you are ready to begin creating assignments for your course.
## Create an Assignment
To create an assignment, you need to set up the assignment both locally in nbgrader and in GitHub Classroom (if you are using Classroom). There is no explicit link between nbgrader and github classroom, but the [integration scripts](https://github.com/earthlab/autograding-notebooks) assume that the assignment name is the same in both.
> [name=Leah Wasser]ADD LINK TO THE INTEGRATION SCRIPTS. i keep forgetting where they are. This is also the first time these scripts are mentioned... with 0 explanation.
### Step 1: nbgrader assignment setup
The instructions below are using the command line to set things up. You can also perform the same steps in the nbgrader interface within Jupyter notebooks.
> [name=Leah Wasser]IMPORTANT. Our audience is not necessarily going to be command line savvy. I think we need to emphasize the GUI interface over the CLI. I could imagine two sets of docs in RTD - one for GUI (primary) and another for CLI (which we might use more)? I really like to see lists of students and such and can imagine using a GUI here.
> [name=Leah Wasser] NOTE II: I am getting some errors - when I launch notebooks I get an error fetching courses. I'd like to understand what is going on and how to troubleshoot these things.
> What is the best tool to look at a SQLite db?
> [name=Leah Wasser] https://sqlitebrowser.org/dl/
> BELOW COULD BE A RESULT OF the updated nbgrader version? i will start over
>
> [DbApp | WARNING] Outdated config: use CourseDirectory.course_id rather than Exchange.course_id
> [DbApp | WARNING] Outdated config: use CourseDirectory.course_id rather than Exchange.course_id
> [NbGraderApp | WARNING] Outdated config: use CourseDirectory.course_id rather than Exchange.course_id
> No db command given (run with --help for options). List of subcommands:
>
> student
> Add, remove, or list students in the nbgrader database.
> assignment
> Add, remove, or list assignments in the nbgrader database.
> upgrade
> Upgrade database schema to latest version.
>
> [name=Jenny Palomino] I am getting the same error for fetching courses. However, since I haven't pulled any repositories yet (I will after this Friday), I have not run into any issues yet. I'll report back after the students submit the first assignment for this workflow this Friday.
### Command Line Steps To Create a New Assignment
1. Add new assignment to nbgrader:
`nbgrader db assignment add <assignment_name>`
> [name=Leah Wasser]looks like there is an assignment due date option?? ``[DbAssignmentAddApp | INFO] Creating/updating assignment with ID 'new-assignment': {'duedate': None}``
2. Create the assignment notebook(s), including autograded test cells. Put notebook(s) in the `source` dir of the nbgrader repo: `nbgrader/source/<assignment_name>/*.ipynb`
3. Validate the assignment:
`nbgrader validate source/<assignment_name>/*.ipynb`
*or click on the validate button in the notebook.*
4. Create the student version of the assignment using:
`nbgrader assign "assignment_name" --IncludeHeaderFooter.header=source/header.ipynb --create`
*(or click on the generate button in the nbgrader interface)*
> [name=Leah Wasser]as i upgraded nbgrader i now need to update the assignment... we need to walk folks through this. can it be done in the jupyter interface? it doesn't look like i can do that
When you create the student version of the notebook, nbgrader generates the necessary files that the students need to complete the assignment in `nbgrader/release/<assignment_name>`. If all worked well, these files should have the answers to the problems hidden from the students.
### Step 2: GitHub Classroom Assignment Setup
Once you have created the student version of the assignment, you are ready to create the GitHub classroom assignment and link a template repository. The template is the repo that GitHub Classroom uses to create the assignment repositories that you will share with the students.
1. Go to GitHub. Create a template assignment repo in your classroom organization (e.g. earthlab-education organization). This is the repo that will be used to create the github classroom assignment. It will populate each student's repo with the files that they need to complete the assignment. Follow the [naming conventions in this document](https://hackmd.io/_4GDiG9wSwe6lBq18bDM7w#Naming-conventions).
> [name=Leah Wasser] should this be public or private? i'm assuming public is ok as long as no one else can commit to it.
> [name=Leah Wasser] this needs to happen before you create the assignment on classroom ideally as you can specify it there. I know you can go back and do it but why bother if you can just do this first?
> ok I'm confused now as the step above - is that necessary if the step below happens?
>
Config yaml file with
- org
- store credentials / secrets
- assignment: name and then files to include
> [name=Leah Wasser] Automate this as a part of the create assignment step... It could do nbgrader things and github things at the same time. i think we could use some of the abc-classroom stuff here. i remember this all being automated but u do have to manually create the assignment. maybe there's an api for that too now?
>[name=Jenny Palomino] I created it as a private repo, and I did not initialize the repo with a README.md, per the details on the make-template-repo.py script.
1. Locally, run the [make-template-repo](https://github.com/earthlab/autograding-notebooks/blob/master/scripts/make-template-repo.py) script, which will create the template repo and push to the empty GitHub repo that you previously created for the template. Sample call (see readme in [github](https://github.com/earthlab/autograding-notebooks/tree/master/scripts) for details):
`python make-template-repo.py both full-path-to-nbgrader-course-dir assignment-name --org_name earthlab-education`
`python ../autograding-notebooks/scripts/make-template-repo.py --org_name earth-analytics-edu create . new-assignment`
>[name=Jenny Palomino] If you already created the template repo locally (e.g. you already ran `python make-template-repo.py create` or `python make-template-repo.py both`), you will get an error that the repo already exists when trying to run `python make-template-repo.py both` again. To get around this for now, delete the local template repo and run `python make-template-repo.py both` again.
> [name=Leah Wasser] that script `make-template-repo.py` is not documented. Can someone please add instructions here on how to call it? I'm happy to help document things but right now there is no documentation for how to call it.
>[name=Jenny Palomino] Update from Karen: I need to set up ssh keys for git to avoid this error.
>When I first ran the `make-template-repo.py` script, I got a permission error:
>`Permission denied (publickey). fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists.`
To get around this, I changed line 62 in the script from `repo_url = "git@github.com:{}/{}.git".format(org_name,repo_name)` to `repo_url = "https://github.com/{}/{}.git".format(org_name,repo_name)` and the repo built successfully.
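Rather than editing the script by hand, the choice between the two URL forms could be a parameter. The sketch below is an assumption about how that might look: `build_repo_url` and `use_ssh` are made-up names and are not in the current script. Only the two format strings come from the note above.

```python
def build_repo_url(org_name, repo_name, use_ssh=True):
    """Return the git remote URL for a repo in a GitHub organization.

    use_ssh=True needs ssh keys set up for GitHub; use_ssh=False gives the
    https form that worked without keys in the note above.
    """
    if use_ssh:
        return "git@github.com:{}/{}.git".format(org_name, repo_name)
    return "https://github.com/{}/{}.git".format(org_name, repo_name)
```

For example, `build_repo_url("earth-analytics-edu", "ea-02-spatial-vector-data-template", use_ssh=False)` returns `https://github.com/earth-analytics-edu/ea-02-spatial-vector-data-template.git`.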
2. Create a new assignment in GitHub classroom and use the name of the template repo in the 'template repository' field. _The 'assignment title' on GitHub must be the same as the nbgrader assignment name. See Naming Conventions, at the bottom_.
>[name=Jenny Palomino] I have to manually check a box to enable the Template Repository option in the GitHub Settings of my template repo. Before I did this, it was not identified as a template repository that could be used to populate the new assignment repo in GitHub classroom. After I changed that setting, I was able to use the template repo as the Template Repository for the new assignment in GitHub classroom.
3. Once you create the assignment, GitHub will provide an assignment link for you to distribute to students. Give this link to the students.
4. Students click on the link, which generates a repo in the GitHub Classroom organization with their name in the repo name.
5. To work locally on the assignment, or to work in the hub or some other cloud space, students clone their repo locally. Then they complete the assignment (either on a local installation of jupyter notebook or on the jupyterhub). They submit using `git push` to their repositories.
> [name=Leah Wasser] I believe that the text below refers to the fact that we may
* (Later, we might want the students to work with forks and pull requests).
* The first time the students click on an assignment link, they will be asked to join the organization. Now you can link the student to their github account and download the roster as a csv file.
6. Add the students to nbgrader database: `nbgrader db student import students.csv`
> [name=Karen Cranston] I think there might be some re-formatting of the csv file needed between github export and nbgrader import.
>[name=JennyPalomino] 9/23/19 - I tested pulling down the repositories and running the autograde on the first assignment submitted via the GitHub classroom workflow. I used the roster that I downloaded from GitHub Classroom and had to add a column called `id` and populate it with the same value provided in the github_username column. This change was needed in order to import the student.csv roster in nbgrader. I left the identifier column in the spreadsheet. So far, the workflow is working and has been tested through the autograde step. Once I have completed the manual grading and pushed the feedback reports back, I will confirm that adding this `id` is the only change needed to the roster provided by GitHub Classroom.
### Step 3 - Grading assignments and returning feedback
1. Use the [clone-all.py](https://github.com/earthlab/autograding-notebooks/blob/master/scripts/clone-all.py) script to collect student repos and integrate notebooks into nbgrader repo. This takes the place of the nbgrader collect command. Sample call (see readme in [github](https://github.com/earthlab/autograding-notebooks/tree/master/scripts) for details):
`python clone-all.py --roster students.csv --assignment assignment-name --classroom earthlab-education`
>[name=Jenny Palomino] Update from Karen: I need to set up ssh keys for git to avoid this error.
>When I first ran the `clone-all.py` script, I got a permission error:
>`Permission denied (publickey). fatal: Could not read from remote repository.`
To get around this, I changed line 39 in the script from `url = "git@github.com:{}/{}.git".format(classroom, slug)` to `url = "https://github.com/{}/{}.git".format(classroom, slug)` and the script ran successfully.
2. Grade the notebooks using `nbgrader autograde <assignment_name>` and then do any [manual grading using the form grader](https://nbgrader.readthedocs.io/en/stable/user_guide/creating_and_grading_assignments.html#manual-grading).
3. Generate the feedback reports using `nbgrader feedback <assignment_name>`.
4. Keep updating the nbgrader repo via git push so that other instructors can see grades and reports.
5. Push the html feedback reports to the student repos using the [git-feedback.py](https://github.com/earthlab/autograding-notebooks/blob/master/scripts/git-feedback.py) script. Sample call (see readme in [github](https://github.com/earthlab/autograding-notebooks/tree/master/scripts) for details):
`python git-feedback.py --roster students.csv --assignment assignment-name`
6. Students run `git pull` to pull down the feedback.html to their repositories.
## Naming conventions
Based on discussion in [this issue](https://github.com/earthlab/autograding-notebooks/issues/6), these are the naming conventions for the various digital artifacts. Because so many of these names get concatenated into repo names, the general convention is to keep them short but descriptive.
* **GitHub organization**: [earth-analytics-edu](https://github.com/earth-analytics-edu)
* **nbgrader private repo**: where source notebooks and grades live (only one per class per semester). This repository mirrors the nbgrader directory structure.
* _format_:`ea-{classname}-nbgrader`, e.g. ea-bootcamp-nbgrader
* **Classname**
* _description_: The name of the class in GitHub Classroom, e.g. [class for bootcamp test](https://classroom.github.com/classrooms/45207559-ea-bootcamp-test). This does not affect any other names.
* _format_:`ea-{classname}`, e.g. ea-bootcamp
* **Acronym**:
* _description_: a short form of the classname; used in repository names to prevent them from being too long and unwieldy and also to disambiguate repos for different courses
* _format_: anything short, e.g. `ea` for 'earth-analytics' or `ea-bootcamp` for the bootcamp
* **Assignment {title/name}**
* _description_: Name of the assignment, including week number to facilitate ordering. Set twice - once as `assignment title` in GitHub classroom and once as `assignment name` in nbgrader (_name must match in both locations!_) and then used as parameter for [nbgrader-classroom integration scripts](https://github.com/earthlab/autograding-notebooks/tree/master/scripts).
* _format_: `{acronym}-{week-number}-{assignment}`, e.g. `ea-02-spatial-vector-data`. Note inclusion of course acronym, because all assignments for all courses go into same github org. Also week number with leading zero.
* **Assignment repository prefix**
* _description_: Required field when setting up a GitHub classroom assignment. This will be part of the student assignment repo name. Workflow assumes this is the same as Assignment.
* _format_: Same as Assignment, e.g. enter same text for `assignment title` and `assignment repository prefix` in GitHub classroom.
* **Template repository**
* _description_: name used for repo created by [template repo script](https://github.com/earthlab/autograding-notebooks/blob/master/scripts/make-template-repo.py) and used as template for assignments
* _format_: `{assignment}-template`, e.g. `ea-02-spatial-vector-data-template`
* **Student repositories**
* _description_: auto-generated names based on other choices, above. Cannot be modified.
* _format_: `{assignment}-{student-github-username}`, e.g. `ea-02-spatial-vector-data-lwasser`
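The conventions above can be written down as small helpers, which may make it easier to check that names line up across nbgrader, GitHub Classroom and the scripts. The function names here are hypothetical; only the formats come from the list above.

```python
def assignment_name(acronym, week, topic):
    """{acronym}-{week-number}-{assignment}, week number with a leading zero."""
    return "{}-{:02d}-{}".format(acronym, week, topic)


def template_repo_name(assignment):
    """{assignment}-template"""
    return "{}-template".format(assignment)


def student_repo_name(assignment, github_username):
    """{assignment}-{student-github-username}"""
    return "{}-{}".format(assignment, github_username)


name = assignment_name("ea", 2, "spatial-vector-data")
print(name)                                # ea-02-spatial-vector-data
print(template_repo_name(name))            # ea-02-spatial-vector-data-template
print(student_repo_name(name, "lwasser"))  # ea-02-spatial-vector-data-lwasser
```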