Rethinking How and Why We Grade Students… Again and Again and Again

David Buck

1 Rethinking How and Why We Grade Students… Again and Again and Again

by Barry J. Fishman

The University of Michigan

fishman@umich.edu

This chapter is modified from the Introduction to the 50th Anniversary Edition of Wad-Ja-Get? The Grading Game in American Education, which employs a CC-BY-SA 4.0 license. The complete book is freely accessible at https://doi.org/10.3998/mpub.11900733

“American education is in trouble.” That’s how Howard Kirschenbaum, Sidney Simon, and Rodney Napier opened their classic book, Wad-Ja-Get? The Grading Game in American Education, five decades ago (Kirschenbaum et al., 1971). The authors named a litany of troubles relevant in the late 1960s and early 1970s, most of which are, sadly, still salient. Their core focus—grading—is probably not what most people, either then or now, might name as the most important issue facing society. But the success of our education system is crucial, and grading shapes learning and teaching in innumerable and significant ways. Perhaps that is why grading and grading systems remain perennial subjects of debate in American education. And that is why a 50-year-old book deserves another reading today, and why it is so important to have Wad-Ja-Get? back in print, both in physical form and in a free online and digital version. The way Wad-Ja-Get? introduces the debate over grading makes it particularly accessible to a wide range of audiences. My hope is that this book will be widely shared among teachers, parents, and policymakers in order to spark new conversations about the purposes and practices of grades and grading.

Wad-Ja-Get? presents its arguments in the form of a conversation among students, teachers, administrators, parents, and community members in the semi-fictional Mapleton High School. Although the story is set in a high school, the ideas are equally applicable to both K-12 and higher education. And though the story is set in the late 1960s, the topics and ideas feel current, not dated. When I first read Wad-Ja-Get? I was struck by how many of the arguments are the same ones I have with colleagues and students today. Readers will note that many of the themes, ideas, and solutions presented in the excellent Ungrading book (Blum, 2020) that inspired this collection are presaged in Wad-Ja-Get?, making it a conceptual “prequel” to today’s discussions. Readers should also note that while the ideas and arguments in Wad-Ja-Get? remain relevant, at times the writing conspicuously reflects the time period in which it was written. Instead of updating the text to reflect today’s ways of speaking about race, gender, and other topics, a mindful decision was made to leave the language in its original form. Doing so provides regular reminders of how much has changed in 50 years . . . and how much has not changed at all.

In this chapter, I explain how Wad-Ja-Get? came to be, and what happened after its publication, including how research on grades and grading has advanced in the last fifty years leading up to Ungrading. Then I consider how the underlying infrastructure for grading and college admissions keeps things from changing in meaningful ways, along with a few examples of how changes to that infrastructure are making a positive impact. Finally, I consider how two crises—systemic racism and a pandemic—should influence our thinking about grades and grading.

The Story Behind Wad-Ja-Get?

Why was Wad-Ja-Get? written in the first place? Grading and assessment were not a central scholarly focus for any of the three co-authors. Though Wad-Ja-Get? is a notable contribution to the literature on grading, Sid Simon and Howie Kirschenbaum are better known for their work in educational psychology, values clarification, and counseling. Rod Napier is an expert on leadership development, change management, and strategic thinking. The three met at the College of Education at Temple University in Philadelphia in the late 1960s, where Simon and Napier were faculty members, and Kirschenbaum was Simon’s graduate student. That’s where the story of Wad-Ja-Get? begins . . . with Sid Simon being denied tenure at Temple.

The reason for the tenure denial was Sid’s grading practices, and his refusal to change them at the administration’s request. He never liked letter grades. Sid told me that, when he taught high school, it broke his heart “to squeeze these kids into one of the five letters of the alphabet.” He much preferred narrative evaluations, where he could offer feedback using “all the letters of the alphabet.” By the time Sid was teaching in college, he had adopted a blanket grading policy where everyone in the class got a B. Nobody would fail, and nobody would get an A. In Sid’s way of thinking, this freed everyone to focus on their learning instead of their grades. At Temple, Kirschenbaum suggested that instead of giving everyone a B, Simon should give all As in order to get the attention of the larger institution and possibly spur a conversation about grading. This tactic worked; the administration noticed what Simon was doing and didn’t like it.

In response to the tenure denial, a “Save Simon” movement arose on campus, with flyers, protests, and coverage in both city and campus newspapers and even in the Temple alumni magazine. Kirschenbaum was active in the movement, organizing rallies and gathering support for his advisor. In response, a faculty committee was formed to investigate the research on grading, chaired by . . . Rod Napier. The results of this committee’s work indicated that the existing research did not provide strong arguments in support of traditional letter grading, a finding that continues to be true today. The combination of protest and scholarly review led to a reversal of the tenure decision and to the writing of Wad-Ja-Get?. Napier’s committee report became the review of research in the appendix for the book.

Wad-Ja-Get? had solid early sales and attracted enough attention that the authors kept the momentum going by organizing a national conference on grading alternatives. In 1972, the Ohio Education Association (an affiliate of the National Education Association) hosted a conference on grading that attracted seven hundred attendees. In 1973 the three authors organized four conferences, in New York City, New Orleans, Chicago, and San Francisco, attracting over two thousand people. The conferences were followed by the establishment of the National Center for Grading and Learning Alternatives, led by James Bellanca, who until his recent retirement served as the Executive Director of the Illinois Consortium for 21st Century Schools and has long been active in the skills-based or mastery learning movement, variants of which are leading alternatives to traditional grading (more on that later). In 1973 Bellanca and Kirschenbaum also conducted a survey of twenty-six hundred colleges to better understand the role that traditional grading plays in admissions (more on that later as well; Bellanca & Kirschenbaum, 1973). In 1976, the Association for Supervision and Curriculum Development published Degrading the Grading Myths, a volume edited by Simon and Bellanca that featured essays and an overview of research on grading, intended for educators and educational leaders (Simon & Bellanca, 1976).

Scholarship on Grades and Grading since Wad-Ja-Get?

The research reviewed in Wad-Ja-Get? covers more than sixty years of scholarship on grades and grading. In 2016, as part of the 100th anniversary celebration of the American Educational Research Association (AERA), Susan Brookhart and Thomas Guskey, both among the most accomplished scholars on grading and assessment, organized a group of colleagues for a comprehensive review article to address the question “What do grades mean?,” covering both pre-Wad-Ja-Get? scholarship and the fifty years since (Brookhart et al., 2016, p. 804). What has changed in those fifty years? Not that much. The general finding that teacher-assigned grades are subjective and unreliable remains constant. More recently, however, researchers have placed greater emphasis on non-cognitive skills including persistence, engagement, and positive school behaviors. Research also focuses on educational outcomes like successful graduation from high school or college, finding that grades can provide “a useful indicator of numerous factors that matter to students, teachers, parents, schools, and communities,” and have been shown to predict academic persistence, completion, and ease of transition from high school to college (Brookhart et al., 2016, p. 833). Grades appear to correlate with cognitive knowledge as measured by standardized tests. In this sense, grades can serve as a measure of success in school, although there’s a circular logic underlying these observations: Students who perform well on the dominant school-based performance indicator (grades) are observed to “do well” in school, both academically and behaviorally.

Brookhart, Guskey, and their colleagues conclude in their review that, though many “may wish grades were unadulterated measures of what students have learned and are able to do, strong evidence indicates that they are not” (Brookhart et al., 2016, p. 835). In her 2000 AERA Presidential Address, assessment scholar and psychometrician Lorrie Shepard argued that the dominant paradigm guiding educational measurement in the twentieth century is the heart of the problem (Shepard, 2000). A culture that views intelligence as innate, a curriculum based on social efficiency and transmission of knowledge, and a deep-rooted belief in “scientific” measurement and sorting of students produce the desire to see inherent value in grades as instruments of rational control. Combined with deeply ingrained social inequities, this culture also results in systematic racial biases in grading, a topic not well covered in the Brookhart et al. review, but well documented elsewhere (e.g., Malouff & Thorsteinsson, 2016). Shepard argues for the development of a “learning culture” that prioritizes the idea that all students can learn with a focus on higher-order thinking. Such a culture would also demand new and more dynamic forms of assessment and ways to record and report student learning and progress.

The history of American education has been characterized as a struggle between psychologist Edward Thorndike and philosopher and psychologist John Dewey going back to the first decades of the 1900s. Thorndike was a proponent of scientific management, believing that the goal of education was to sort young people by ability in order to improve the efficiency of the system. He believed deeply that “quality is more important than equality” (Rose, 2016). Dewey was a leading progressive voice, arguing for problem-based and experiential learning, a vision of education that was tailored for the individual, not for efficiency. As a leading educational historian has observed, a good summation of the last century of schooling would be: “Thorndike won. Dewey lost” (Lagemann, 2000). Many of the progressive education reforms introduced in recent decades—such as project-based learning and the current conversation on ungrading—are flashes of Deweyan sensibility, so the battle is not necessarily lost, though deep changes to the underlying educational structures that percentage-based grading feeds into would be needed to change the direction in which we have been, and still are, headed.

Rethinking the Infrastructures that Structure and Support Grading Practices

Creating real change in education is hard, perhaps as hard as any societal challenge. In Degrading the Grading Myths, Simon and Bellanca muse, “There may be no more difficult reform task than introducing a non-grade report system into schools. Everyone wants it, but few initiate it!” (Simon & Bellanca, 1976, p. 70). In a 1995 critique of teaching and assessment in higher education, University of Rochester English Professor David Bleich notes, “Testing and grading are normalized to such a degree that only a small minority of students, teachers, and administrators can conceive of any alternative to this system. . . . [They] are manifestations of the ideology of schooling” (Bleich, 1995, p. 568).

One way to explain this normalization is that percentage-based grading and letter grades have become part of the underlying infrastructure that shapes education. While we generally think of plumbing or highways when we think of infrastructure, Susan Leigh Star—a sociologist who studied information systems—described infrastructure as the embedded and transparent standards and conventions that shape practice (Star & Ruhleder, 1996). Things like grades become deep cultural practices embedded in schooling, and the structures of schooling—for instance the use of transcripts designed to record only final letter grades and the use of the grade point average (GPA) as a summation of all individual course grades—reinforce the importance of grades. To change this, we need to engage in what Star called infrastructuring work, creating new structures—such as new forms of transcript—and practices—such as the way colleges use information from secondary school for admissions—that create value for information about learning beyond the final letter grade and GPA.

College Admissions, Transcripts, and GPAs as Infrastructures

One such effort discussed in the Brookhart et al. (2016) review is the emergence—or reemergence—of mastery learning, which is a leading candidate considered by the Mapleton High School of Wad-Ja-Get? (no spoilers). In Wad-Ja-Get? the authors distinguish between a five-point system of mastery and a two-point system, which is akin to Pass / No Credit, sometimes referred to as standards-based grading. It is difficult to pin these terms down clearly in either literature or practice (see Guskey & Anderman, 2013 for some useful definitional work). The key, however, is that grading in these approaches is based on clear demonstrations of what students can do, and does not involve percentage-based grading or comparative grading such as curves. Much important work on mastery learning was done right before and during the time Wad-Ja-Get? was being written, first discussed by John Carroll in 1963 and further developed by Benjamin Bloom in 1968 and later work (Guskey, 2015). One key advantage of mastery learning is that it does not make time the main arbiter of learning, allowing for individual variation on the way to learning goals, with liberal use of formative assessments and feedback. Another advantage is that mastery assessment emphasizes what students know and can do, as opposed to merely ranking them against one another. Alfie Kohn, progressive education proponent (and author of the introduction to Ungrading), makes an elegant argument against using assessment to rank students, pointing out that comparative information by itself is not that useful. What would be more useful is a rating system, not a ranking system (Kohn, 2019). Current grading approaches seem to ration mastery, as though it isn’t conceivable that all students can learn. Through detailed monitoring of what students know and can do, it is possible to produce a report of their learning with much more information than a single letter grade.
Harvard education professor Todd Rose (2016) picks up on this in The End of Average, in which he calls out the culture of “averagarianism,” arguing that ranking and sorting learners around a mean ignores important differences in learning, and in fact masks information about actual learning. Rose points out that two students could have the same GPA yet vary widely in their strengths and weaknesses. Mastery learning and grading approaches allow us to respect learners as individuals. A comprehensive meta-analysis of mastery learning programs found them to be highly effective across a range of learners, subject matter, and levels, and one of the most consistently effective educational interventions (Kulik et al., 1990). Given findings like these from the 1990s, why aren’t mastery learning approaches more common today? A one-word answer: infrastructure.
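Rose's point about identical GPAs can be made concrete with a little arithmetic. The sketch below is purely illustrative: the two students, their courses, and the unweighted 4.0 scale are assumptions invented for the example, not data from Rose or from any study cited here.

```python
# Hypothetical unweighted 4.0 scale; the students and courses are invented.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def gpa(transcript):
    """Unweighted mean of grade points across all courses."""
    points = [GRADE_POINTS[grade] for grade in transcript.values()]
    return sum(points) / len(points)


# Two very different profiles of strengths and weaknesses.
student_1 = {"Writing": "A", "History": "A", "Algebra": "C", "Physics": "C"}
student_2 = {"Writing": "B", "History": "B", "Algebra": "B", "Physics": "B"}

# The single summary number is the same for both students,
# even though the course-level records differ in every course.
same_summary = gpa(student_1) == gpa(student_2)
courses_that_differ = [c for c in student_1 if student_1[c] != student_2[c]]
```

Both transcripts collapse to the same average, so an admissions reader who sees only the GPA cannot tell the uneven profile from the uniform one. That lost detail is what a mastery-style report would try to preserve.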

From the 1990s through today, education policy in the US has emphasized a culture of testing that has dramatically narrowed what learners view as important. Cathy Davidson, a professor of English at the City University of New York’s Graduate Center, put it well, writing that our students “were well-taught and learned well the lesson implicit in our society that what matters is not the process or the learning but the end result, the grade. A typical college freshman today has been through 10 years of No Child Left Behind educational philosophy where ‘success’ has been reduced to a score on a test” (Davidson, 2013). Davidson’s book The New Education includes a broader critique of grading, describing it as part of the development of modern college education and a misguided search for “merit” based on reductive information like GPAs and test scores (Davidson, 2017).

In thinking about our current-day obsession with grades and GPAs, one can plausibly argue that college admissions shares blame equally with education policy. As one of the characters in Wad-Ja-Get? muses, “It’s the colleges that have started the trend [towards percentage-based grading and GPAs] this time. . . . Harvard’s and Yale’s entrance requirements became the standard high school curriculum” (Kirschenbaum et al., 2021, p. 42). This common perception, that colleges required a GPA in applications, turned out not to be true. When Kirschenbaum and Bellanca surveyed college admissions offices about their policies in 1973, they were surprised to learn from the 1,900 or so college admissions officers who responded that fewer than five percent of colleges required grades or class ranks in the application, and seventy-seven percent responded that other forms of transcripted information would receive “fair and equal review” (Bellanca & Kirschenbaum, 1973). This is still true today—colleges are willing to consider applicants with non-traditional academic records, though this is not widely advertised—but the perception and practice of grades and GPAs has only become stronger over the past decades. Pressure to get into selective colleges has become so intense that students (and their parents) obsess over each element of the application, imagining that each question must be answered in a certain way to even have a chance of admission to a dream school. Given the uncertainty of the process, it makes sense that seemingly objective measures like grades would become a prominent focus. Colleges feel GPA pressure as well. College ranking systems (another powerful infrastructural component) overvalue admitted students’ GPA and test scores because they are easily measured and compared across colleges, even if it is unclear exactly what is being measured (O’Neil, 2017).
In this environment, it is not surprising that students respond with an intense focus on grades, with notable detrimental effects. In addition to de-emphasizing learning, there is evidence that the heightened focus on grades is harmful to student mental health (Crocker et al., 2003). And the emphasis on chasing grades does not diminish after college admission. When I talk with students in my own university about replacing their GPAs with richer or more detailed information about what they know or can do, their first instinct is to panic. Wouldn’t that put them at a disadvantage to other students with “high” GPAs to put on their graduate school or employment applications? The problem isn’t student resistance, resilience, or grit; the problem is that the whole system emphasizes ranking and grading over learning. We need to change the system to reignite a focus on learning. Which brings us back to infrastructure.

Infrastructure shapes culture. What if the infrastructure of college admissions were designed to use richer forms of information that more accurately depict students as individuals? What if we could give employers a way to identify students who were a good match with their organizational mission and needs, as opposed to a one-number ranking? One technology that has enlivened the conversation in this area is digital badges, which can be used to denote learning or accomplishment in more granular ways than traditional grades and can be used to guide individualized pathways towards specific learning goals. When multiple badges are assembled into portfolios, they allow learners to customize their self-presentation (Casilli & Hickey, 2016). The work in badges is related to work around mastery learning, which has seen a recent resurgence through organizations like CompetencyWorks (Sturgis & Casey, 2018).

In 2018, I co-hosted a meeting of college admissions personnel, experts in digital badges, and assessment experts to discuss the potential of badges as evidence in the college admissions process (Fishman et al., 2018). Admissions officers at this meeting viewed these representations of learning as potentially crucial for expanding the range of students who see themselves as good candidates for success in college and for their ability to identify those students. A leading effort to change the infrastructure of college admissions is the Mastery Transcript Consortium (MTC; http://mastery.org/). Independent school leader Scott Looney was frustrated with the limiting aspects of the traditional transcript, believing that its structure limited innovation, discouraged interdisciplinary and engaged learning, and was more useful for sifting and sorting students than anything else: a strong echo of the arguments discussed above. Even more frustrating, he felt unable to do anything about this without jeopardizing his students’ access to top colleges. A college admissions officer friend suggested that, while it might not be a great idea for his school to do its own thing, if a consortium of schools banded together, it would make it easier for colleges to respond positively to an alternative transcript. And so the MTC was born, with the goal of developing a transcript that represented areas of mastery instead of course titles and grades. The areas could be customized by each school and yet remain relatively easy for admissions officers to make sense of, in order to understand an applicant in the context of other applicants from the same school or state and against the specific goals of the college in building a broadly diverse admissions cohort. Since the establishment of the MTC network in 2017, it has grown to more than 370 private and public schools. In 2019–20, the first students used a pilot version of the Mastery Transcript to successfully apply to college.
MTC is building new infrastructure that makes mastery learning into an attractive option for high schools . . . again.

Another element of infrastructure that needs to change in order to upend the dominant system of letter grades and percentage-grading is gradebooks themselves. The most common form of technology in schools today is the learning management system (LMS). These platforms—such as Canvas, Blackboard, and Moodle—support functions like distributing class materials, collecting and grading student assignments, and supporting classroom communication. LMSs are both ubiquitous and nearly invisible in schools; they are infrastructure. And LMSs typically have a narrow view of what grades and grading look like. It’s difficult to find an LMS gradebook that doesn’t start with the assumption that 100% is “perfect,” thus making the objective for students to maintain grades that—on average—are as close to 100% as possible. Gradebooks are hopelessly averagarian. In response, students game the system within or across their courses to maintain or maximize their average.

School Is a Game . . . but It Is a Terrible Game

At this point, it’s fitting to emphasize that the subtitle of Wad-Ja-Get? is “The Grading Game in American Education.” School is indeed a kind of game, but it’s a terrible game, with broken engagement and reward structures. Students are motivated to get good grades, but not to learn. What if we tried to learn from well-designed games and applied those lessons to the design of school learning (Gee, 2003)? In a well-designed game, players willingly take on challenges, persist through difficulties, and are resilient in the face of multiple failures on the way to ultimate success. In general, if you succeed at everything you do on the first try, you probably aren’t being properly challenged. Eric Klopfer of the MIT Education Arcade is fond of pointing out that people play well-designed games because they are hard, not despite their difficulty. Why isn’t school like that? I can’t recall the last time a student came to my office and asked, “Professor, what’s the hardest thing I can do next?” That’s simply not how the current grading game is played.

To make school into a better game, my colleagues and I at the University of Michigan are developing a pedagogical approach we call gameful learning, after well-designed games (Fishman et al., in press; Fishman & Holman, 2015). A key to this approach is changing the frame for grading. Instead of starting with 100%—which you will most likely lose—in a gameful course you start with zero, but you can end up wherever you want based on the choices you make and the effort you put in. In this way, gameful learning is aligned with ideas from mastery learning. Learners are given autonomy, in terms of being able to make choices with respect to assignments and pathways through a course. Their feelings of competence are supported, in part through being able to make choices about what to work on, and through a sense of productive failure. What this means is that if a learner earns, for example, 60% of the points available on an assignment, this isn’t a failure at all (though it would certainly be viewed as one in most standard grading systems), but instead represents progress. What did they learn? What hasn’t been learned yet? How should we focus future work by this student to help ensure that all goals are met by the end of the course? Gameful courses also emphasize a sense of belonging, helping students to feel a part of something larger than themselves. In the study of academic motivation, self-determination theory has demonstrated that when learners’ autonomy, belonging, and competence are supported, they feel more intrinsically motivated (Deci & Ryan, 1985; Roy & Zaman, 2017). When these three elements are thwarted—as they are in much of contemporary education—extrinsic motivation is required to get learners to engage. For today’s learners, that extrinsic motivation comes from grades.
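The difference between the two grading frames described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of how GradeCraft or any particular gradebook works: the assignment scores, the function names, and the point thresholds for each level are all hypothetical.

```python
# Hypothetical completed work: (points earned, points possible).
completed = [(60, 100), (90, 100), (45, 50)]

# Hypothetical point thresholds for each level, in ascending order.
THRESHOLDS = [("C", 150), ("B", 250), ("A", 350)]


def running_average(work):
    """Conventional frame: percentage of the points attempted so far."""
    earned = sum(e for e, _ in work)
    possible = sum(p for _, p in work)
    return 100 * earned / possible


def points_total(work):
    """Gameful frame: start at zero; each assignment can only add points."""
    return sum(e for e, _ in work)


def level_reached(total, thresholds):
    """Highest level whose threshold has been met, or None if none has."""
    reached = [name for name, needed in thresholds if total >= needed]
    return reached[-1] if reached else None


# The learner chooses to attempt one more challenging assignment.
with_extra = completed + [(70, 100)]

average_before = running_average(completed)
average_after = running_average(with_extra)
level_before = level_reached(points_total(completed), THRESHOLDS)
level_after = level_reached(points_total(with_extra), THRESHOLDS)
```

With these invented numbers, attempting the extra assignment lowers the running average but raises the point total past the next threshold. In the averaging frame the additional challenge looks like a loss; in the additive frame the same work registers as progress.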

Note that gameful learning is not about playing literal games to support learning; it is about changing the rules of grading to make school itself into a better game. Unfortunately, this way of thinking about grading is also nearly impossible to support with traditional LMS gradebooks. Infrastructure strikes again! To make it easier for instructors to implement gameful learning in their classrooms, we created GradeCraft (http://www.gradecraft.com/), which integrates with popular LMS platforms and is custom-built to support gameful assessment approaches. At present, more than seventeen thousand learners have experienced gameful courses at more than seventy-five academic institutions. At the University of Michigan, forty-eight different programs across a broad range of subjects have courses that have been redesigned to be gameful. GradeCraft is an example of the infrastructuring work needed to change the way instructors and students think about grades and grading.

Beyond gameful learning, I argue that the entire college experience can be reshaped to better engage students as learners and to help them develop into the kinds of resilient problem-solvers the world needs now and in the future. The program I imagine would be inherently multidisciplinary, focused on problems instead of fields of study. There would be clearly defined learning goals, with achievement measured as progress towards those goals instead of by grades. There would be no required courses, and students would be supported in learning from all the resources and opportunities presented by the university and the surrounding community. This program imagines a different supportive infrastructure that in turn reshapes how learners interact with the institution and vice versa, and an admissions process that does not use GPA or standardized test scores to rank and sort students. I am leading an effort to develop an undergraduate degree program that makes this vision a reality at the University of Michigan. It is critical for this kind of work to happen at colleges, to send the message to K-12 schools and families that we value what is learned and how it is learned more than the final grades. There is growing evidence that the same is true for employers and graduate schools (National Academies of Sciences, Engineering, and Medicine, 2018). I note that much of our work in this area is supported by a capable campus Center for Academic Innovation, helping to spur new and transformative ideas in education and providing both a supportive community and the necessary infrastructures for bringing those ideas to life. I am happy to say that efforts to reimagine education through centers for academic and learning innovation are spreading across higher education more broadly (Kim & Maloney, 2020). The status quo around grading is in part a response to the perceived demands of college, so it is fitting that colleges should lead the way forward.

The Current Crises in America

At the outset of this chapter, I lamented that many of the troubles facing education and society fifty years ago are still present. One of the most significant of these is institutionalized racism and anti-Blackness, which has been constructed and continually reproduced in the United States for over four hundred years and is deeply embedded in the structure of US education. A new challenge is the emergence of COVID-19, the effects of which are amplified by racism. Both have deep implications for society in general, and for education in particular. Is it frivolous to advocate for changing grading practices against the backdrop of such challenges? I don’t think so. Rapid changes made in response to the coronavirus revealed many of the underlying flaws in current grading practices, in particular their focus on ranking instead of on learning, care, or equity. Progressive advancements in the way we think about grades and grading offer crucial opportunities to enhance equity and the care we show for all students, especially those who don’t fare well in the traditional game of school. This is an opportunity to increase our focus on learning over sorting.

Why might now be different in terms of reform? Despite a century of scholarship and advocacy about grades and grading, the use of percentage-based letter grades is more prevalent and entrenched than ever. As a frustrated Mapleton student states in Wad-Ja-Get?, “I’m tired of talking about grades. I’ve talked about grades as long as I’ve been in school and that seems like a long time. We’ve talked about grades in this class before. We’ve talked about the history of grading. . . . And here we are again, talking about grades. But that’s all we do around here: Talk” (Kirschenbaum et al., 2021, pp. 58–59). In Where Good Ideas Come From, science writer Steven Johnson illustrates how innovations come not from sudden breakthroughs or lone thinkers, but rather evolve over time, gathering (and losing) momentum in conjunction with other developments (Johnson, 2011). And sometimes, in what Johnson calls “the adjacent possible,” the right combination of elements exists in the right context to enable the innovation to be recognized for what it is and take hold. The current combination of technology (e.g., digital badges), advocacy (e.g., Mastery Transcript Consortium, the growth of academic innovation centers in higher education), and crisis as it relates to education might bring about an adjacent possible favorable to advancing approaches to grading that enhance learning and equity.

Many Americans view education as a means of advancing one’s standing in society, but like many of our institutions, its structures are more aligned with preserving the status quo, and these structures have been reinforced by decades of law and politics that created segregated neighborhoods and schools, and actively undermined efforts at meaningful desegregation (Ryan, 2010). “Race gaps” in achievement have long plagued American education (Barton & Coley, 2010). Black and brown students trail their peers academically as a result of chronic and systematic underfunding of the schools they attend (Baker et al., 2016). Repeated evidence indicates that socioeconomic status is one of the most reliable indicators of academic performance, and racist policies related to housing, employment, and other key elements of wealth creation for generations of Americans ensured that Black people in particular have less than white people. Even when overall achievement rises, the gap remains (Hanushek et al., 2019). A focus on gaps can be problematic in and of itself, leading us to view students in racialized categories labelled as “underachieving.” In our current system we too often apply a deficit lens to disparities, asking how group performance can be “fixed,” rather than addressing the systemic structures and processes that caused the gaps in the first place (Gutiérrez, 2008).

The field has never been level, and systems of grading and ranking contribute to preserving institutional inequity, with abundant evidence that teachers’ grading practices are subject to racial bias, irrespective of teachers’ racial attitudes (Quinn, 2020). I have already cited Todd Rose’s (2016) critique of using averages to rank students. Historian and founding director of Boston University’s Center for Antiracist Research Ibram X. Kendi argues that the statistical methods we use to “measure” learning were developed by scholars who were also proponents of eugenics, committed to “proving” that the Black race was inferior to others (Kendi, 2019). Institutional racism in education, supported by a grading and standardized testing system focused on ranking, was readily able to demonstrate that some students were “naturally” less able than others. Thorndike’s scientific management approaches to education played a key role in reinforcing and amplifying the inequality that has been there from the start.

How could we break this cycle? One way would be to champion grading approaches that focus on progress instead of comparison, on rating instead of ranking. Approaches like mastery-based grading allow for more individualization and autonomy and are thus inherently more equitable than approaches based on average performance because they do not ration success. One might argue that there are more important things to focus on, such as school funding. I don’t disagree, but I do reject the false choice. Education is a system, and when we work for improvement, we need to focus on its multiple interconnected elements simultaneously.

Wad-Ja-Get? is mostly silent on matters of race and economic inequality when discussing grades and grading systems. Though the conversation at Mapleton High does not directly address systemic racism, it is also clear that the students engaged in the discussion are headed to college and enjoy many privileges that are invisible to them, and the book pointedly describes Mapleton High as occupying a luxurious building in a town with a small percentage of Black residents. I have observed that K-12 schools and colleges with “alternative” grading systems are often enclaves of privilege. One of these privileges is the simple ability to question and challenge the dominant grading system. It is the nature of institutionalized inequality that the rules don’t need to apply to those at the top. Disadvantaged students and schools feel like they must compete within the system in order to be seen as equal. But the competition begins from unequal starting points. This is why it is so crucial to understand and address those rules and the infrastructures that support them in a way that emphasizes equity to the greatest extent possible.

Compared to the slow, constant crisis of institutional racism, the COVID-19 crisis emerged seemingly overnight and swept across the US education system at lightning speed. In a matter of weeks, almost all schools moved to remote instruction. As Na’ilah Suad Nasir and Megan Bang, President and Vice President of the influential Spencer Foundation, wrote in March 2020, “More drastic change to education systems has occurred in the last week than it has in arguably the last 50 years. What possibilities does this open up for the future of learning, for the reorganization of our institutions, for the centrality of families and family life?” (Nasir & Bang, 2020). Teachers and students alike found themselves working to find ways to continue learning in unfamiliar—and sometimes infelicitous—contexts. The social inequalities related to race and class that already were prevalent in the country were exacerbated for learners by differences in access to technology, networks, and safe and stable places to learn remotely. After the move to remote instruction, it was striking to note that the first major changes were to grading systems.

With few exceptions, K-12 and higher education institutions across the US rapidly switched to some variant of Pass/Fail or Pass/No Credit grading (Reich et al., 2020). Arguments in favor of the move tended to emphasize fairness and care for learners, on the grounds that most instructors and students were not well prepared for such a move, and the growing health crisis created a situation that would make it difficult to focus on learning, to say the least. The arguments against the change were the most revealing, however. Some argued that by moving to Pass/Fail, we were lowering our standards. Others argued that, without a letter grade to aim for, students would lose their motivation to work hard. I also heard that the move would be unfair to students who had been working hard for a high grade thus far in the term, or to students who were on an upward trajectory and needed these grades to raise their GPA.

To the first objection—that we are lowering standards by moving to a Pass/Fail system—I ask, what were our standards in the first place? Let’s begin with the assumption that passing is the equivalent of a C or C- in the standard grading approach. If you aren’t happy with students earning those grades, why do they exist at all? Shouldn’t a passing grade mean that the student has at least learned the core goals of the course? To me, this objection is an argument for raising standards such that nobody can pass a course without mastering the core learning objectives. This objection reveals a lack of focus on learning in our current grading systems.

The second objection—that without high grades to aim for, students will become unmotivated and stop working—reveals the devil’s bargain inherent in current grading systems, in which students are only working for grades, not for learning. Self-determination theory calls this the “overjustification effect,” wherein receiving a reward (a grade) for something you used to enjoy doing (learning) causes your enjoyment to decrease, and your need for extrinsic rewards (more grades) to increase in order for you to continue to engage (Lepper et al., 1973). If this isn’t a clear call to refocus assessment and grading on learning, I don’t know what is. This objection implies a lack of focus on care and equity.

The third objection—that this move is unfair to “hard workers” who were shooting for an A, or to students who need a good grade to improve their GPAs—again shows a lack of focus on care in our grading systems. The first group isn’t materially affected by the change to Pass/Fail. The second group can be supported with a note on their transcript, or in letters of recommendation. In any event, the real question here—if care is the focus—is what the best move is for most students.

The focus on the pros and cons of Pass/Fail, however, misses a larger opportunity. If we employed mastery grading approaches, the entire system would be better prepared for shocks like COVID-19 and would just be better in general. Mastery-based grading focuses on learning, not ranking. It respects the idea that learners might take different paths to the same outcomes, thus enhancing equity. It supports the use of rigorous standards, and the idea that it is possible for everyone to reach them. It offers the potential to communicate exactly what learners know and can do, in comparison to information-poor GPAs and transcripts. Finally, mastery learning is flexible and resilient in the face of unforeseen situations, allowing us to emphasize care for learners. Of course, without explicit attention to issues of systemic racism, moves towards mastery grading will be undercut by the ability of those with power to maintain their advantage by investing in resources only they can afford or access. Any effort to improve our grading practices needs to maintain a focus on the deeper challenges of racism and anti-Blackness. The crisis presented by the pandemic creates a space for everyone in education to ask, “What are grades for?” Hopefully the answers we respond with will emphasize learning, care, and equity.

Conclusion

I’m so grateful to Drs. Kirschenbaum, Simon, and Napier both for the gift of this book fifty years ago, and for their enthusiasm for reintroducing Wad-Ja-Get? to a new generation of learners, families, educators, and policymakers. Talking with them and hearing their stories was inspiring to me as an educator, as a scholar, and as a person. My hope is that new readers find Wad-Ja-Get? to be just as compelling as I found it, and that it fosters renewed and productive conversations about grading and assessment. Changing how we think about and practice grading is crucial to redesigning education systems to be more just, more equitable, and more focused on learning. It’s been fifty years since Wad-Ja-Get? first argued for that. Now is the time.

References

Baker, B. D., Farrie, D., & Sciarra, D. G. (2016). Mind the gap: 20 years of progress and retrenchment in school funding and achievement gaps. ETS Research Report Series, 2016(1), 1–37. https://doi.org/10.1002/ets2.12098

Barton, P. E., & Coley, R. J. (2010). The Black-White achievement gap: When progress stopped. In Educational Testing Service [Policy Information Report]. Educational Testing Service. https://eric.ed.gov/?id=ED511548

Bellanca, J. A., & Kirschenbaum, H. (Eds.). (1973). College guide for experimenting high schools. National Humanistic Education Center.

Bleich, D. (1995). Academic ideology and the new attention to teaching. New Literary History, 26(3), 565–590. JSTOR. https://www.jstor.org/stable/20057301

Blum, S. D. (Ed.). (2020). Ungrading: Why rating students undermines learning (1st edition). West Virginia University Press.

Brookhart, S. M., Guskey, T. R., Bowers, A. J., McMillan, J. H., Smith, J. K., Smith, L. F., Stevens, M. T., & Welsh, M. E. (2016). A Century of Grading Research: Meaning and Value in the Most Common Educational Measure. Review of Educational Research, 86(4), 803–848. https://doi.org/10.3102/0034654316672069

Casilli, C., & Hickey, D. (2016). Transcending conventional credentialing and assessment paradigms with information-rich digital badges. The Information Society, 32(2), 117–129. https://doi.org/10.1080/01972243.2016.1130500

Crocker, J., Karpinski, A., Quinn, D. M., & Chase, S. K. (2003). When grades determine self-worth: Consequences of contingent self-worth for male and female engineering and psychology majors. Journal of Personality and Social Psychology, 85(3), 507–516. https://doi.org/10.1037/0022-3514.85.3.507

Davidson, C. N. (2013, January 7). Why Students Gripe About Grades. HASTAC. https://www.insidehighered.com/views/2013/01/07/essay-how-end-student-complaints-grades

Davidson, C. N. (2017). The new education: How to revolutionize the university to prepare students for a world in flux (1st edition). Basic Books.

Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. Plenum.

Fishman, B., Hayward, C., & Niemer, R. (in press). Improve student engagement with gameful learning. In D. Seelow (Ed.), Teaching in the game-based classroom: Practical strategies for grades 6-12. Routledge.

Fishman, B., & Holman, C. (2015, December 1). Higher ed grading systems deserve an F. It’s Not Academic. http://blog-en.heqco.ca/2015/12/barry-fishman-and-caitlin-holman-higher-ed-grading-systems-deserve-an-f/

Fishman, B., Teasley, S., & Cederquist, S. (2018). Micro-Credentials as Evidence of College Readiness: Report of an NSF Workshop. University of Michigan. http://hdl.handle.net/2027.42/143851

Gee, J. P. (2003). What videogames have to teach us about learning and literacy. Palgrave Macmillan.

Guskey, T. R. (2015). Mastery learning. In J. D. Wright (Ed.), International Encyclopedia of the Social & Behavioral Sciences (2nd ed., Vol. 14, pp. 752–759). Elsevier.

Guskey, T. R., & Anderman, E. (2013, December). In search of a useful definition of mastery. Educational Leadership, 71(4), 18–23.

Gutiérrez, R. (2008). A “gap-gazing” fetish in mathematics education? Problematizing research on the achievement gap. Journal for Research in Mathematics Education, 39(4), 357–364. JSTOR. https://www.jstor.org/stable/40539302

Hanushek, E. A., Peterson, P. E., Talpey, L. M., & Woessmann, L. (2019). The unwavering SES achievement gap: Trends in U.S. student performance (Working Paper No. 25648; Working Paper Series). National Bureau of Economic Research. https://doi.org/10.3386/w25648

Johnson, S. (2011). Where good ideas come from: The natural history of innovation. Riverhead Books.

Kendi, I. X. (2019). How to be an antiracist. One World.

Kim, J., & Maloney, E. (2020). Learning innovation and the future of higher education. Johns Hopkins University Press.

Kirschenbaum, H., Napier, R., & Simon, S. (1971). Wad-ja-get? The grading game in American education. Hart Publishing Company.

Kirschenbaum, H., Napier, R., & Simon, S. (2021). Wad-ja-get? The grading game in American education (50th Anniversary Edition). Maize Books.

Kohn, A. (2019, June 16). Why can’t everyone get A’s? The New York Times, 8. https://www.nytimes.com/2019/06/15/opinion/sunday/schools-testing-ranking.html

Kulik, C.-L. C., Kulik, J. A., & Bangert-Drowns, R. L. (1990). Effectiveness of mastery learning programs: A meta-analysis. Review of Educational Research, 60(2), 265–299. https://doi.org/10.3102/00346543060002265

Lagemann, E. C. (2000). An elusive science: The troubling history of education research. University of Chicago Press.

Lepper, M. R., Greene, D., & Nisbett, R. E. (1973). Undermining children’s intrinsic interest with extrinsic reward: A test of the “overjustification” hypothesis. Journal of Personality and Social Psychology, 28(1), 129–137. https://doi.org/10.1037/h0035519

Malouff, J. M., & Thorsteinsson, E. B. (2016). Bias in grading: A meta-analysis of experimental research findings. Australian Journal of Education, 60(3), 245–256. https://doi.org/10.1177/0004944116664618

Nasir, N. S., & Bang, M. (2020, March 20). An open letter to our community: Covid-19. Spencer Foundation. https://www.spencer.org/news/an-open-letter-to-the-spencer-community-covid-19

National Academies of Sciences, Engineering, and Medicine. (2018). The Integration of the Humanities and Arts with Sciences, Engineering, and Medicine in Higher Education: Branches from the Same Tree. The National Academies Press. https://doi.org/10.17226/24988

O’Neil, C. (2017). Weapons of math destruction: How big data increases inequality and threatens democracy (Reprint edition). Broadway Books.

Quinn, D. M. (2020). Experimental evidence on teachers’ racial bias in student evaluation: The role of grading scales. Educational Evaluation and Policy Analysis. https://doi.org/10.3102/0162373720932188

Reich, J., Buttimer, C., Fang, A., Hillaire, G., Hirsch, K., Larke, L., Littenberg-Tobias, J., Moussapour, R., Napier, A., Thompson, M., & Slama, R. (2020). Remote learning guidance from state education agencies during the covid-19 pandemic: A first look. http://osf.io/k6zxy/

Rose, T. (2016). The end of average: How we succeed in a world that values sameness. HarperOne.

Roy, R. van, & Zaman, B. (2017). Why Gamification Fails in Education and How to Make It Successful: Introducing Nine Gamification Heuristics Based on Self-Determination Theory. In M. Ma & A. Oikonomou (Eds.), Serious Games and Edutainment Applications (pp. 485–509). Springer International Publishing. https://doi.org/10.1007/978-3-319-51645-5_22

Ryan, J. E. (2010). Five miles away, a world apart: One city, two schools, and the story of educational opportunity in modern America. Oxford University Press.

Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4–14.

Simon, S. B., & Bellanca, J. A. (Eds.). (1976). Degrading the grading myths: A primer of alternatives to grades and marks. Association for Supervision and Curriculum Development.

Star, S. L., & Ruhleder, K. (1996). Steps Toward an Ecology of Infrastructure: Design and Access for Large Information Spaces. Information Systems Research, 7(1), 111–134. https://doi.org/10.1287/isre.7.1.111

Sturgis, C., & Casey, K. (2018). Quality principles for competency-based education. iNACOL and CompetencyWorks. https://aurora-institute.org/resource/quality-principles-for-competency-based-education/

License

Icon for the Creative Commons Attribution-NonCommercial 4.0 International License
