# Chapter 3 (version 2): A framework for alternative grading

The previous chapter was an unflinching look at the flaws in traditional grading systems. It's now time to start thinking about how to do better, and build a system of grading that promotes learning and growth. In this chapter, we'll begin to do this by isolating key elements of "better" grading systems, then examining instances of these systems used for evaluation and development outside of traditional classrooms. We'll then introduce some existing models of alternatives that will appear throughout the rest of the book: standards-based grading, specifications grading, and ungrading. Finally, we'll distill all of the characteristics that make these alternative systems work into a framework we call the Four Pillars of Alternative Grading.

## How to improve grading

The simplest formulation of how to build a better grading system might be *look at all the flaws of traditional grading from Chapter 2, and do the opposite*. If we took that route, our line of reasoning would look like this:

- Traditional grading has lots of false positives and false negatives. Instead: *Make grades directly connected to demonstrated learning, with clear criteria for what's "good enough".* Traditional assessments are often made of items whose purpose, and point values, are unclear. Students ask, *Why am I being tested on this, and why is this problem worth 12 points instead of 20?* So the grade on those items, and therefore on the assessment itself, is unclear -- both about the purpose of the assessment and about whether the flaws in student work are minor or major. Furthermore, since points are fungible, major issues can be masked by earning more points on minor concepts. Instead, student work should be graded in such a way that the purpose of the assessment is clear, the requirements for students are easy to grasp, and the results speak directly to what went well and what needs more work.
- Traditional grading is bias-prone because it tends to include non-academic factors. Instead: *Make grades directly connected to learning, and nothing else.* Related to the previous item, including things like attendance, punctuality, "engagement", and so on in the grade causes the grade not to accurately reflect what the student really knows. So, to avoid inequities, a grading system should avoid such items and focus only on concrete evidence of *learning*.
- Traditional grading is based on statistics that don't make sense. Instead: *Use simple descriptive language instead of points.* The use of points can give the appearance of scientific objectivity, but remember: these are just labels in the end, not numerical data. So instead of using numbers as a proxy for the categories into which we put student work, just skip the numbers and use the category labels -- and do it in a way that the labels tell students how they are doing.
- Traditional grading is demotivating, for a lot of reasons but especially because students are left out of the loop by one-and-done assessments. Instead: *Let students try again on work that isn't "good enough", until it is, using feedback.* We noted in Chapter 2 that all significant learning experiences happen through engagement with a *feedback loop*, and yet most traditional grading excludes feedback loops entirely. Perhaps the most effective step we can take to improve grading is to fix this, by building grading around trying things, getting feedback, and then having opportunities to improve based on that feedback.

If this approach to grading sounds like a fantasy, there is good news: It's not. We've seen it before, all around us.

## Real-world examples

Methods of evaluating student work that use all four of the basic ideas above are not only extant, but common. Here are just a few instances that should be very familiar to you.

### Kindergarten report cards

Let's start at the beginning: kindergarten.
Kindergarten report cards are a model of the grading system we are hoping to construct. A typical one, in part, looks like this:

[[table]]

Notice:

- The criteria, or *standards*, for what kids should be able to do are clearly stated using plain descriptive language: *I can count to 100 by 10s*. So are the descriptions of the kids' current statuses with those standards: "Does with help", "Does independently", "Needs assistance", and so on. There's not much left to the imagination, which is good news for the kids and their parents.
- However kids are assessed on these standards, the "grade" is a simple description that indicates how they are doing with the standard, relative to a criterion for success. It is *not* a point value. If the result on the report card for "I can recite the alphabet in order" said "91%", this might raise more questions than it answers. But the verbal descriptions tell you all you need to know.
- Although the report card doesn't necessarily say this, it seems highly unlikely that kids in kindergarten are given only one shot to recite the alphabet in order, and that the results of that attempt are what's on the card. It's far more likely that kids work on this *every day*, and the report card shows the results of either their most recent attempt or their best attempt. It shows what the kids have *eventually* been able to do, through regular engagement in a feedback loop.

### Peer review

Stepping forward a few decades, most instructors in higher education are intimately familiar with the process of submitting articles to research journals. This process, too, models the kind of "grading" system we wish to see:

- Most journals have sections of their websites with clear standards for what constitutes an acceptable publication, with specific criteria for word count, font size, and so on, as well as for stylistic and structural elements. Certainly, following these instructions doesn't guarantee publication. But the minimal standards for publication, at least, are clear and easy to access.
- The vast majority of journals do not assign numerical grades to submitted manuscripts! Instead, submissions are given labels that indicate what needs to happen next: *Rejected*, *Major revision*, *Minor revision*, or *Accepted*, for example. As in kindergarten, a point value attached to a journal submission would probably just be confusing.
- Unless the response from the journal is *Rejected*, the submission process isn't one-and-done. After getting the initial review, the authors iterate on the manuscript, using feedback from the reviewers to improve the work and resubmit it, as many times as needed until it's "good enough".

### Personnel evaluations

Annual personnel evaluations, both in academia and in the "real world", also follow this model:

- The organization (if it's not dysfunctional) maintains a curated list of clear criteria for annual performance review that explains what employees need to be doing in order to get a positive review. That list does *not* include non-work-related items, because including those introduces tremendous bias.
- At review time, the employee gets a 360-degree review from (among others) their managers, who give a thorough *verbal* report on their progress. These reviews almost never use points, nor are the results used to rank employees. (The notorious "rank and yank" system used by Jack Welch at General Electric has fallen into disrepute.)
- As with the other examples, the annual review is not one-and-done. Its purpose is to invite the employee into a feedback loop where the results of the last evaluation are productively enacted before the next one.
*Maybe a footnote from this article: https://www.theatlantic.com/politics/archive/2015/08/how-millennials-forced-ge-to-scrap-performance-reviews/432585/ which makes the point that Millennials benefit more from continuous review than from annual review*

## Alternative grading models

- SBG
- Specs grading
- Ungrading

*Keep these more or less as they appear already*

## The Four Pillars of Alternative Grading

*Now bring it all together -- the points of the ideal system, the real-life examples, and the grading systems -- and distill into the four pillars model*

*Much of this can probably be taken from the existing Chapter 3 with minor tweaks... I think*