In 2016, around 1.6 million students took the SAT (either old or new) at least once. If every student submitted an essay, the College Board needed to grade 1.6 million essays. Since the essay was first offered with the writing section in 2005, the College Board has relied on human graders to evaluate student work. Assuming that a grader reads one essay every 3 minutes, 800 essays a week, and is paid $15 per hour, one grader can grade 40,000 essays in a year at a cost of $30,000. Put another way, it costs $1.50 to have two graders evaluate each essay. Using these metrics, the College Board spends $2.4 million each year paying graders to evaluate essays, not counting the cost of administering, transporting, scanning, and storing essays, or paying a third grader if the scores of the first two differed significantly. If only there were another way to grade essays and use the $2.4 million for other meaningful purposes…
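The back-of-the-envelope math above can be checked in a few lines. All of the figures (3 minutes per essay, $15 per hour, two readers, 1.6 million essays) are the article's own assumptions, not official College Board numbers:

```python
# Back-of-the-envelope cost of human essay grading, using the article's figures.
ESSAYS_PER_YEAR = 1_600_000   # essays submitted annually (assumed)
MINUTES_PER_ESSAY = 3         # one grader's reading time
HOURLY_WAGE = 15              # dollars per hour
READERS_PER_ESSAY = 2         # each essay is read twice

cost_per_reading = HOURLY_WAGE * MINUTES_PER_ESSAY / 60   # $0.75 per reading
cost_per_essay = cost_per_reading * READERS_PER_ESSAY     # $1.50 per essay
annual_cost = cost_per_essay * ESSAYS_PER_YEAR            # $2,400,000 per year

print(f"${cost_per_essay:.2f} per essay, ${annual_cost:,.0f} per year")
```

Replacing one of the two readers with a computer is what produces the hypothetical $1.2 million in savings discussed below.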
Enter the automated essay scorer, a mere theory in 1966 that has grown into a reality for many institutions. In 1999, the ETS (Educational Testing Service) offered one of the first automatic essay scorers, called e-rater, and testing companies have had more than 15 years to improve upon that earlier model. More recently, the GMAT published a 2009 study affirming the fairness of its automated essay scorer, IntelliMetric. The analytical writing assignment is scored by a human as well as a computer, and the two scores are averaged together. By incorporating a computer into the grading process, the GMAT not only saves half the cost of grading the essay, but also is able to perform an objective analysis of sentence structure, word count, and complexity that a human reader would not have the time to complete. With a human reader assessing the coherence of the argument and the computer comparing the essay with its database of essays, the GMAT can enjoy the best of both worlds.
It makes sense, then, that the College Board and ACT would be eager to follow in the GMAT’s footsteps. If they could replace one reader with a computer, each could potentially save the hypothetical $1.2 million per year and invest it elsewhere. The fact that both tests have expressed a desire to move to a digital format in the coming decade makes the transition that much simpler: if a test taker types an essay rather than writes it, a computer could deliver a tentative score instantaneously. Only one human reader would be required to follow up and ensure that the computer graded the essay appropriately.
In a preview of that world, the College Board teamed up with Khan Academy to electronically grade the practice essays available online. Currently, students can input essays for SAT Tests 1 and 2 on Khan’s website and receive automated feedback based on the College Board’s essay rubric: three scores, for reading, analysis, and writing, each out of 8 points.
Naturally, we had to test out the automated essay grader for ourselves.
0, 0, 0
Simply copying and pasting an unrelated article resulted in zeroes across the board.
0, 0, 0
Writing one relevant paragraph and copy/pasting it several times also resulted in zeroes.
7, 4, 7
Five well-written but shorter paragraphs yielded high marks for reading and writing, but low marks for analysis. The computer grader, like its human predecessors, knows the limits of a short essay.
8, 6, 7
Adding an additional paragraph to create a longer essay boosted analysis as well as reading.
Thus far, we noticed that the essay grader does a good job of identifying irrelevant, repeated material. It also evaluates length when determining its score. To test the program further, we asked ourselves how the grader would respond to a nonsensical essay that used all the right words and sentence structure, even referencing rhetorical devices and quoting the passage. Try to make sense of the following introduction, written by one of our more linguistically creative tutors. (The essay asked students to evaluate the rhetorical devices used by Bogard, who in a persuasive essay laments the diverse and damaging effects of light pollution on humans and animals.)
Darkness can symbolize a protean notion of absolute nihilism, floating endlessly in a void without any smattering of perception or purpose. Bogard embraces this absence and sees darkness as a lofty pursuit necessary for absolute harmony within our fractured post-modern existence. For when we lose the dark, we become absorbed by the light and the nocturnal chimeras of our subconscious cannot take flight. Using alliterative juxtapositions, carcinogenic conceits, and allusions to fiscal collapse, Bogard persuades the audience that we need to embrace the abyss in order to keep balance in an increasingly fractured and oppressive world.
This essay used very high-level vocabulary and sentence structure, relevantly addressed the rhetorical devices within the author’s passage, and even supplied quotations from key parts of the passage. Surely a human would be required to recognize the ingenious absurdity of this author’s writing!
The computer gave 7’s for reading and writing, fairly evaluating the author’s ability to read Bogard’s argument critically and craft well-written paragraphs. Much to our surprise, the computer gave the writer a 2 for analysis, easily recognizing that the author’s essay, however well it was written and however well it interacted with the rhetorical passage, was absurd to the extreme. Nicely done, automated grader.
In addition to the essay grader, which provides scores for Tests 1 and 2, Khan Academy also provides more personalized feedback. To serve students looking for more in-depth analysis, the College Board partnered with TurnItIn to give specific line-by-line suggestions for the practice essay section. Students can write essays and receive comments on particular sections of their essays based on their reading, analysis, and writing abilities.
The College Board and ACT have their work cut out for them to persuade colleges and universities that their essays are predictive of college success for applicants. Despite the initially lukewarm reception to the redesigned essays, the College Board is investing resources into electronic essay grading, demonstrating its belief that the exercise provides a valuable metric for colleges. We can expect at least one set of human eyes to continue grading student essays in the short term, but if the Khan Academy essay grader is any indication, even that role may be close to retirement.
In March of 2016, the College Board rolled out the new SAT. At the time, these changes to the SAT were the most significant since 2005, when the College Board introduced a writing section and increased the scoring range from 1600 points to 2400 points.
Initially, many students, teachers, tutors, and guidance counselors were anxious to see what the changes would mean. In fact, changes to the scoring structure and format of the new test were of particular concern, as many students did not know exactly how their performance would be assessed.
Now, almost a whole year later, we have a much better understanding of the new SAT and how it is scored. Specifically, we now know the new scoring scale, and we know that the actual scoring process is not much different from what it was for the older version of the SAT.
To learn more about the format, scoring scale, and scoring process for the new SAT, read on.
What is the format of the New SAT?
At first glance, the new SAT appears significantly different from the SAT administered prior to March 2016. It contains two primary test sections and one optional section, as opposed to the three required sections on the previous version of the test.
One of the primary tests is the Math Test, which comprises two smaller test sections: the Math Test With Calculator and the Math Test – No Calculator.
The other primary test is the Evidence-Based Reading and Writing Test, which also comprises two smaller test sections: the Critical Reading Test and the Writing and Language Test.
The final component of the new exam, the SAT Essay, is now optional.
How are tests scored?
When you are finished taking the SAT, the test supervisor will collect and count the test books to make sure all materials have been turned in before dismissing you from the testing room. This is to help ensure the security of testing materials.
All test materials are then put into a sealed envelope and sent to a scoring center. At the scoring center, SAT Essays are removed for separate scoring, while the remaining answer sheets are scanned by a machine that counts the number of correct answers bubbled in on each answer sheet.
Tests are scored based on the number of answers you got correct. With the exception of the SAT Essay, all tests have multiple-choice or grid-in answers, which means that answer sheets can be quickly scanned to tally raw scores. Because there is no scoring penalty for wrong answers, your raw score is simply the number of questions you answered correctly on each section.
Once your raw scores have been tallied, they are converted to scaled scores through a process called equating. Equating accounts for very slight differences in test difficulty and ensures that scores are consistent across different forms of the SAT.
The exact conversion from raw score to scaled score varies slightly from one test form to another and is adjusted in small increments to reflect the difficulty of each form.
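In practice, equating works like a lookup table: each raw score on a given test form maps to a scaled score. The sketch below illustrates the idea with a made-up table and a hypothetical `scale_math_score` helper; the real tables, with a row for every raw score, are published by the College Board for each practice test:

```python
# Illustrative raw-to-scaled conversion for a Math section.
# These values are invented for demonstration only; real conversion
# tables are published per test form by the College Board.
MATH_CONVERSION = {
    0: 200, 10: 330, 20: 440, 30: 540, 40: 640, 50: 730, 58: 800,
}

def scale_math_score(raw: int) -> int:
    """Map a raw score to a scaled score, interpolating between table rows."""
    if raw in MATH_CONVERSION:
        return MATH_CONVERSION[raw]
    lo = max(k for k in MATH_CONVERSION if k < raw)
    hi = min(k for k in MATH_CONVERSION if k > raw)
    frac = (raw - lo) / (hi - lo)
    return round(MATH_CONVERSION[lo] + frac * (MATH_CONVERSION[hi] - MATH_CONVERSION[lo]))

print(scale_math_score(58))  # 800: a perfect raw score maps to the top of the scale
print(scale_math_score(35))  # 590: halfway between the raw-30 and raw-40 rows
```

The official tables list every raw score directly, so no interpolation is needed there; it is included here only to keep the demonstration table short.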
You can get a better idea of the exact process by reviewing the scoring procedure for official SAT practice tests prepared by the College Board. Check out the Raw Score Conversion Tables beginning on page seven of the packet Scoring Your SAT Practice Test #1.
What is the score range for the new SAT?
Scaled scores for each required SAT test range from 200-800. You receive one score from 200-800 for the Math test, which takes into account your performance on both the Math Test With Calculator and Math Test – No Calculator sections. You receive another score from 200-800 for the Evidence-Based Reading and Writing test, which takes into account your performance on both the Writing and Language Test and the Critical Reading Test.
Your total SAT score will always range from 400-1600 and is calculated simply by adding together the scores from your Math test and your Evidence-Based Reading and Writing Test.
The new, optional SAT Essay is scored differently, on its own scale, and it has no bearing on your total SAT score.
To learn more about SAT scores, read CollegeVine’s What Is a Good SAT Score?
How is the new SAT essay scored?
The optional essay cannot be scored by computer since its answers are not multiple-choice or grid-in. Instead, each SAT essay is read by two qualified readers. The readers each assign a score from 1 to 4 in three different dimensions: Reading, Analysis, and Writing.
If the scores assigned by the readers to any single dimension vary by more than one point, a scoring director will read the essay to resolve the discrepancy.
The points assigned in each dimension are then totaled, giving a score from 2 to 8 in each dimension. The three dimension scores are added together for a total score ranging from 6 to 24.
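The scoring procedure described above can be sketched in a few lines of code. The function name and input format here are hypothetical, but the logic follows the process as stated: two readers each score 1-4 per dimension, a gap of more than one point in any dimension goes to a scoring director, and otherwise the two scores are summed:

```python
# Sketch of the SAT Essay scoring procedure: two readers, three dimensions,
# scores summed per dimension, large discrepancies escalated.
DIMENSIONS = ("Reading", "Analysis", "Writing")

def score_essay(reader_a: dict, reader_b: dict) -> dict:
    """Combine two readers' 1-4 scores; raise if a discrepancy needs review."""
    result = {}
    for dim in DIMENSIONS:
        a, b = reader_a[dim], reader_b[dim]
        if abs(a - b) > 1:
            raise ValueError(f"{dim}: scores differ by more than 1 point; "
                             "a scoring director must resolve the discrepancy")
        result[dim] = a + b                                   # each dimension: 2-8
    result["Total"] = sum(result[dim] for dim in DIMENSIONS)  # total: 6-24
    return result

print(score_essay({"Reading": 3, "Analysis": 2, "Writing": 4},
                  {"Reading": 3, "Analysis": 3, "Writing": 4}))
```

Note that the Analysis scores of 2 and 3 differ by only one point, so no scoring director is needed in this example.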
You can read more about the SAT Essay scoring process and preview the scoring rubric on CollegeBoard’s SAT Essay Scoring site.