Research Spotlight: “Self-Grading” in 1969

“An experiment in learning” by Peter G. Filene

Sep 20, 2024

UNC students in May 1970. This photograph, courtesy of UNC Libraries, was part of an exhibit on the student protests of the 1960s in Chapel Hill.

In my last post, I shared a study on student perceptions of ungrading that was published only a few months ago. Today, I want to share something much older: an article about an experiment in “self-grading” at the University of North Carolina, Chapel Hill in 1969, by Peter G. Filene. Thanks to my colleague Josh Eyler for bringing this one to my attention:

Filene, P. G. (1969). Self-Grading: An Experiment in Learning. The Journal of Higher Education, 40(6), 451–458. https://doi.org/10.2307/1979820

Filene is the author of the 2005 volume The Joy of Teaching: A Practical Guide for New College Instructors. I admit I have not read the book, but from what I can tell it doesn’t dive very deeply into grading practices.

Filene’s 1969 article, however, is a fascinating look at an early experiment in collaborative grading. In fact, it is the earliest published instance of collaborative grading (or something very like it) at the college level that I’ve found in my research. If you know of something earlier, please send it my way!

Filene begins by narrating both student and instructor experiences that are still common today: students view their grade as the main thing they “get out of” any given course; instructors are frustrated that grades corrupt the learning experience and place educators in the position of “policeman-judge” when they should ideally be serving as the “teacher-guide.” In considering this problem, a colleague suggested to Filene that since he could not abolish grades, perhaps he should have the students grade themselves instead.

Filene decided to try self-grading with the 172 students he was teaching across three sections of American history—two sections of an introductory survey and one section for upper-level students. He taught these lecture-based courses normally except that students’ essay exams were returned to them without grades, and with copious amounts of feedback. He also shared his own criteria for excellence on these exams and the suggested weight that each exam should carry in the final grade. Finally, he proposed two standards by which students should grade themselves:

“Grade yourself (a) by what you put into the course, in terms of effort and interest, and (b) by what you got out of the course relative to what was to be gotten.”

Having dispensed these guidelines, Filene left students to their own grading devices.

Throughout the semester, he noticed no changes in the way students attended, prepared for, or performed in the class relative to previous semesters. While he didn’t give grades on student exams, he kept a private record of what grade he would have given each student on their work if he were formally evaluating them. At the end of the semester, Filene met with students to record the grade they awarded themselves for the course.

This is what he found:

“As compared to the standard of my ‘private’ grades, 3 per cent of the students graded themselves lower. Roughly 57 per cent in each course gave themselves the same grade that they would have received if I had been grading them. Forty per cent…evaluated themselves one or two grades higher.”

Filene offers a few ideas about why 40% of students graded themselves higher than he would have. Some, he notes, may have had a more accurate perception of their achievement than he did, since so many aspects of that achievement are invisible to instructors. In other cases, however, students “evidently mistook enthusiasm for achievement” or simply caved to the pressures of an environment where grades are all-important, assigning themselves a higher grade than they deserved and then rationalizing that decision after the fact.

This leads Filene to the crux of the problem with self-grading: grades mean different things to different people, and they carry weight far beyond the classroom. Employers and graduate schools view grades not as tools for learning but as the instructor’s summary of a student’s achievement, and they “hold the professor responsible for those grades.” The most important lesson Filene learned from self-grading, then, is that…

“Yes, the flesh is often weak, and so is the spirit, but above all the system is strong…One instructor cannot blithely try to make his courses a version of pedagogical utopia and at the same time use the symbols employed and defined in other ways by the non-utopian outer world.”

Yet, Filene argues, self-grading has many benefits. It promotes student autonomy, helping students invest in real and meaningful learning. It also invites instructors to productively reconsider their relationships with students. Without the “grade weapon,” course assessments can no longer be used “to convict lazy or bluffing students” and must instead be thought of primarily as tools to further their learning.

So, how can we reap the learning benefits of self-grading while maintaining the more public functions of grades? Filene proposes two changes. On the systems level, he suggests abolishing A-F grades in favor of an honors/pass/fail system.

The “more adventurous change,” however, would be to practice a modified form of self-grading, in which students and instructors meet in the latter half of the semester to collaboratively determine a grade. Both parties would explain their reasoning and reach an agreement about where the student stands. Then, after the last exam, the instructor would determine the student’s final grade based on their exam performance and on the earlier conversation.

I’m struck by how much of Filene’s experience is similar to the experience of college instructors today, more than 50 years later. His observation that “self-grading enhances the process of learning but disrupts the public process of measurement and reward” is particularly salient. Instructors are still feeling the push and pull of these two functions of evaluation.[1] I’m also struck by how closely Filene’s proposed amendment to self-grading, which centers honest conversations between students and instructors, mirrors the way many practice collaborative grading today.

Throughout the article, Filene incorporates remarks from some of his students on this grading experiment, and here again the comments are familiar. Some students liked the system, noting that it helped them focus on learning rather than achieving a certain grade. Some were ambivalent, feeling both freed and disconcerted by the new autonomy afforded them. And others, of course, felt that it encouraged their classmates to cut corners.

My favorite piece of student feedback is this: “You have a good system but lack a perfect society in which to use it.” Perhaps the imperfection that this student refers to is the dishonesty of those who saw the course as an opportunity for an easy A. But they might just as well mean the imperfection of a society that places so much stock in such a flawed measurement.

The idea that “self-grading” only works in a perfect society seems, in fact, to sum up the conclusion Filene himself reaches. And it’s the conclusion I sometimes reach at the end of the semester when I have to generate a grade that reduces the complexity of another human’s development over the course of several months down to a single marker of (under)achievement.

Filene ends the article by observing that his conclusions are entirely tentative: “A first word, not a last.” In fact, a further word came in a subsequent issue of The Journal of Higher Education: in “Is Self-Grading the Answer?,” published in 1970, Ronald H. Mueller calls Filene’s experiment “a clear example of a trend that is gaining acceptance in many colleges and universities.” How widespread this trend actually was at the time I couldn’t say. But today, of course, more of us than ever are employing “self-grading,” or at least collaborative grading. Not a last word indeed.

[1] For an extended take on this, I recommend Jack Schneider and Ethan Hutt’s Off the Mark: How Grades, Ratings, and Rankings Undermine Learning (but Don’t Have to).

Unmaking the Grade
