Wednesday, December 4, 2013

Thoughts on the Grading Asymptote

During the question period of yesterday's faculty meeting, Professor Mansfield said that he had heard that the modal grade at Harvard (the most frequently given grade, as he put it) was A–. As the Crimson accurately reports,
[Dean of Undergraduate Education Jay] Harris then stood and looked towards FAS Dean Michael D. Smith in hesitation. 
“I can answer the question, if you want me to.” Harris said. “The median grade in Harvard College is indeed an A-. The most frequently awarded grade in Harvard College is actually a straight A.”
These complaints are a very old issue. Letter grading started at Harvard in 1886, and the first anti-inflationary committee report was issued in 1894! As I wrote in Excellence Without a Soul (p. 115),

“[The Committee on Raising the Standard] believes . . . that by defining anew the Grades A, B, C, D, and E, and by sending the definitions to every instructor, the Faculty may do something to keep up the standard of the higher grades. It believes that in the present practice Grades A and B are sometimes given too readily—Grade A for work of no very high merit, and Grade B for work not far above mediocrity.” More broadly, the Committee opined, lax grading was compromising the very significance of a Harvard degree. “One of the chief obstacles to raising the standard of the degree is the readiness with which insincere students gain passable grades by sham work. . . . These students maintain themselves in technically good standing with so little work that our degree would be seriously cheapened if its minimum cost were generally known.” 
Back in 2001 I did my best to reconstruct historical data; it seems that grades have been rising for a long time, not continuously, but more or less without interruption, except for the decade of the 1970s. I noted that "the present rate of increase can't go on forever." The news about the median and modal grades at Harvard doesn't settle the question of changes in grade point averages, but I'll bet they have continued to rise, though at a slower rate simply because they can't be higher than 4.0 in the Harvard system.

What is to be done?

With all due respect to my friend and colleague Professor Mansfield, it is not clear to me that anything needs to be done. Sorry to sound complacent, but almost any remedy I can think of would have side effects that are worse than the problem (which I dissect in detail in Excellence Without a Soul and will not burden you with here).

Toughening up on grading practices would run counter to other educational initiatives and desiderata with which we are engaged. For example, grades are higher in smaller, less anonymous courses, and we are trying to reduce the number of larger, more anonymous courses. It's pretty reasonable to think that faculty are softer on students they know well, and the trend is to try to improve education by making the experience more intimate. (That is actually implicit in the other story in today's Crimson, the success of the multiple, smaller life sciences concentrations.)

If it is still true (it was true a decade ago, but no recent data are publicly available) that grades in humanities courses are higher than grades in science courses, then any attempt to lower grades would probably have a differential effect on the humanities. But the humanists are well aware that they are already losing students and are alarmed about the trend lines. I can't imagine they would welcome any effort to make their courses less attractive by making the grading tougher.

We are also trying to reduce stress, and arguably tougher grading might raise stress levels. That is not really clear to me, though: there is always going to be competition as long as more than one grade is possible. And students get stressed even about ungraded courses they are virtually certain to pass -- they get stressed about their performance as a matter of personal pride even if the professor's assessment means next to nothing.

In any case, given the long history of claims of imminent harm from soft grading and the difficulty I have in finding a lot of slackers in my classes, I am skeptical about the direness of the problem. When new colleagues ask me what the community norms are, I typically tell them to grade however they want, as long as they will be able to look back in 3 years and remember who was a star, who was a workhorse, and who was just getting by.

Of course the situation in computer science may be atypical, since almost no employer asks candidates for transcripts. They administer a kind of oral quiz in the interview and try to figure out what the candidate knows. Grades are pretty irrelevant.

Having said all that, here are two suggestions that I think might help.

1) Require every department to have a discussion of grading once a year. Hand out the grades assigned by everyone in the department so they have to look at their own and their peers' practices while everyone is watching. Have someone from the administration go to the meeting to make sure it happens. No quotas, no rules, just information and a requirement for talking to each other about what grades are being given and why. The underlying idea here is to make the conversation more intimate, conducted not by a dean from on high but in a collegial way, by people the faculty have to interact with every day and whose respect they value.

2) Make grades less important. Here is one idea.

Largely decouple honors from overall GPA, which is a meaningless soup of uncertified ingredients. Set some generous minimum threshold, and then throw the matter of cum, magna, and summa recommendations to the departments, giving each department some kind of quota. Only the departments can look at students' transcripts and tell the difference between an ambitious program and a lazy one. Have departments make their honors recommendations however they want, based on the transcript, a thesis, and whatever else they know about the student. This would greatly reduce the incentive to take easy courses or to fight for microscopic increases in GPA, or to take courses that are not educational because the student already knows the material. (When I am on a prize committee that shares transcripts, the difference between differently ambitious transcripts with identical GPAs is pretty apparent.)

Of course this could be opposed as impractical or unfair. Students want to know exactly what they have to do to graduate magna and there would be no way to know in this subjective system. But these students all got into Harvard via exactly such a subjective system, one that does not ignore grades and scores, but uses them only as part of a more holistic review of essays, interviews, and letters of recommendation. Students would have to agree, would they not, that such a system can produce pretty good results?

And departments might not like the system because they don't know their students well enough to make these decisions in any non-mechanical way. Well, they could use a mechanical way if they preferred. Or they could get to know their students better!

See also this curious story (can't vouch for its accuracy!): "As Yale talks grade deflation, Princeton pulls back" (Yale Daily News). And for those who haven't seen it, a homily about grades I gave years ago at Morning Prayers. Beggars for punishment can also read this homily about education requirements.

[Revised 11:15pm on 12/4 to include note about grading in the humanities and reference to second Morning Prayer talk.]

[Added 12/5: I should have mentioned that what I propose for honors determination resembles what happens right now in Phi Beta Kappa elections. That was probably in the back of my mind while I was typing, actually.]


  1. The deepest side-effect I can see is inaccuracy. If the differences in exam scores are largely noise then it's inaccurate to give an A to those who lucked into the slightly higher scores. Yet a curve forces the professor into valorizing the noise.

    And if the professor is doing a very, very good job then everyone should know the material perfectly by the end of the course and get an A. The pre-ordained curve pretty much assumes that the professors won't do a very, very good job.

    And if a professor tries to fit a Gaussian to the exam scores, the professor also seems to rest on the idea that knowledge transmittal is largely a random process, another notion that's philosophically at odds with the idea that professors are unique and skilled carpenters who build a frame of knowledge in the students' brains.

    1. That all makes sense. Curves are a stupid idea for lots of reasons. Class enrollments are not random samples. Even if you did take a random sample of the Harvard student population, it would be a random sample drawn from an unbelievably skewed distribution according to tested ability.

  2. Peter:

    I'll start by granting the caveat that grades are necessarily the best approximation we have to a student's ability and that there will certainly be noise. I'll also start by stating I'm a Harvard professor (a colleague of Harry's) and my class is known for its lack of grade inflation (as Harry can verify); my median grade is usually a B/B+ (though the mode is, I think, usually an A-).

    You make some statements I disagree with.
    "If the differences in exam scores are largely noise then it's inaccurate to give an A to those who lucked into the slightly higher scores."
    I think the key here is that we should be designing exams that are not largely noise. In my exams, the grades typically range from 30% to 99%; while there's some noise, the large separation in grades means I get useful information. I note that students don't particularly like this; low grades do not make them feel good, even though having a large range of results provides the most useful information for grading purposes. More typical exams, where the range is typically 85-99%, provide much less information; if grades are to be given, I think it's appropriate to give an exam that tests the range of abilities.

    "And if the professor is doing a very, very good job then everyone should know the material perfectly by the end of the course and get an A."
    I just can't agree with this statement. This is only true if one is not challenging a majority of the students. I would say that if the professor is doing a very good job, then everyone should be learning as much new material as is reasonably possible (given the heterogeneity of ability and effort). Ideally, everyone who takes my class (and puts in the effort) is learning some minimum threshold amount. Also ideally, the talented and/or energetic students that come in are not bored but challenged, and have the opportunity to learn more. I suppose I could give a grade based on learning that minimum threshold amount and then everyone might get an A. But that doesn't seem appropriate to me.

    Of course, students need to realize that a "B" in my class would possibly be equal to an A or A- in another class, a message I have to try to reinforce often.

  3. If almost everyone gets some sort of A, that necessarily increases the stress of getting a B. If grade inflation continues, an A- will soon be horrible.