AI Cheating Is Getting Worse
Kyle Jensen, the director of Arizona State University’s writing programs, is gearing up for the fall semester. The responsibility is enormous: Each year, 23,000 students take writing courses under his oversight. The teachers’ work is even harder today than it was a few years ago, thanks to AI tools that can generate competent college papers in a matter of seconds.
A mere week after ChatGPT appeared in November 2022, The Atlantic declared that “The College Essay Is Dead.” Two school years later, Jensen is done with mourning and ready to move on. The tall, affable English professor co-runs a National Endowment for the Humanities–funded project on generative-AI literacy for humanities instructors, and he has been incorporating large language models into ASU’s English courses. Jensen is one of a new breed of faculty who want to embrace generative AI even as they also seek to control its temptations. He believes strongly in the value of traditional writing but also in the potential of AI to facilitate education in a new way—in ASU’s case, one that improves access to higher education.
But his vision must overcome a stark reality on college campuses. The first year of AI college ended in ruin, as students tested the technology’s limits and faculty were caught off guard. Cheating was widespread. Tools for identifying computer-written essays proved insufficient to the task. Academic-integrity boards realized they couldn’t fairly adjudicate uncertain cases: Students who used AI for legitimate reasons, or even just consulted grammar-checking software, were being labeled as cheats. So faculty asked their students not to use AI, or at least to say so when they did, and hoped that might be enough. It wasn’t.
Now, at the start of the third year of AI college, the problem seems as intractable as ever. When I asked Jensen how the more than 150 instructors who teach ASU writing classes were preparing for the new term, he went immediately to their worries over cheating. Many had messaged him, he told me, to ask about a recent Wall Street Journal article about an unreleased product from OpenAI that can detect AI-generated text. The idea that such a tool had been withheld was vexing to embattled faculty.
ChatGPT arrived at a vulnerable moment on college campuses, when instructors were still reeling from the coronavirus pandemic. Their schools’ response—mostly to rely on honor codes to discourage misconduct—sort of worked in 2023, Jensen said, but it will no longer be enough: “As I look at ASU and other universities, there is now a desire for a coherent plan.”
Last spring, I spoke with a writing professor at a school in Florida who had grown so demoralized by students’ cheating that he was ready to give up and take a job in tech. “It’s just about crushed me,” he told me at the time. “I fell in love with teaching, and I have loved my time in the classroom, but with ChatGPT, everything feels pointless.” When I checked in again this month, he told me he had sent out lots of résumés, with no success. As for his teaching job, matters have only gotten worse. He said that he’s lost trust in his students. Generative AI has “pretty much ruined the integrity of online classes,” which are increasingly common as schools such as ASU attempt to scale up access. No matter how small the assignments, many students will complete them using ChatGPT. “Students would submit ChatGPT responses even to prompts like ‘Introduce yourself to the class in 500 words or fewer,’” he said.
If the first year of AI college ended in a feeling of dismay, the situation has now devolved into absurdism. Teachers struggle to continue teaching even as they wonder whether they are grading students or computers; in the meantime, an endless AI-cheating-and-detection arms race plays out in the background. Technologists have been trying out new ways to curb the problem; the Wall Street Journal article describes one of several frameworks. OpenAI is experimenting with a method to hide a digital watermark in its output, which could be spotted later on and used to show that a given text was created by AI. But watermarks can be tampered with, and any detector built to look for them can check only for those created by a specific AI system. That might explain why OpenAI hasn’t chosen to release its watermarking feature—doing so would just push its customers to watermark-free services.
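For readers curious about the mechanics, the general idea behind text watermarking can be sketched in a few lines of code. The details of OpenAI's unreleased system are not public; the snippet below is only a minimal illustration of one published approach (a "green list" scheme in the spirit of academic work on LLM watermarking), in which the generator is nudged toward a pseudo-random subset of words and a detector later checks whether that subset appears more often than chance would predict. Every name and threshold here is an assumption for illustration, not anyone's actual product.

```python
# Illustrative sketch only -- NOT OpenAI's unreleased watermark, whose details
# are not public. Shows the general "green list" idea: watermarked text is
# biased toward a pseudo-random subset of words, and a detector measures that bias.
import hashlib

def is_green(prev_token: str, token: str, fraction: float = 0.5) -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by the previous token."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).hexdigest()
    return (int(digest, 16) % 1000) / 1000 < fraction

def green_fraction(tokens: list[str]) -> float:
    """Fraction of tokens on the green list; watermarked text skews noticeably high."""
    hits = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)

# Ordinary human text should hover near 0.5; a watermarking generator would push
# this fraction higher, which is the statistical signal a detector tests for.
print(green_fraction("the quick brown fox jumps over the lazy dog".split()))
```

The sketch also makes the article's caveats concrete: paraphrasing the text scrambles the token pairs and weakens the signal, and a detector built around one provider's seeding scheme can say nothing about text from any other system.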
Other approaches have been tried. Researchers at Georgia Tech devised a system that compares how students used to answer specific essay questions before ChatGPT was invented with how they do so now. A company called PowerNotes integrates OpenAI services into a version of Google Docs that tracks AI-generated changes, allowing an instructor to see all of ChatGPT’s additions to a given document. But methods like these are either unproved in real-world settings or limited in their ability to prevent cheating. In its formal statement of principles on generative AI from last fall, the Association for Computing Machinery asserted that “reliably detecting the output of generative AI systems without an embedded watermark is beyond the current state of the art, which is unlikely to change in a projectable timeframe.”
This inconvenient fact won’t slow the arms race. One of the generative-AI providers will likely release a version of watermarking, perhaps alongside an expensive service that colleges can use in order to detect it. To justify the purchase of that service, those schools may enact policies that push students and faculty to use the chosen generative-AI provider for their courses; enterprising cheaters will come up with work-arounds, and the cycle will continue.
But giving up doesn’t seem to be an option either. If college professors seem obsessed with student fraud, that’s because it’s widespread. This was true even before ChatGPT arrived: Historically, studies estimate that more than half of all high-school and college students have cheated in some way. The International Center for Academic Integrity reports that, as of early 2020, nearly one-third of undergraduates admitted in a survey that they’d cheated on exams. “I’ve been fighting Chegg and Course Hero for years,” Hollis Robbins, the dean of humanities at the University of Utah, told me, referring to two “homework help” services that were very popular until OpenAI upended their business. “Professors are assigning, after decades, the same old paper topics—major themes in Sense and Sensibility or Moby-Dick,” she said. For a long time, students could just buy matching papers from Chegg, or grab them from the sorority-house files; ChatGPT provides yet another option. Students do believe that cheating is wrong, but opportunity and circumstance prevail.
Students are not alone in feeling that generative AI might solve their problems. Instructors, too, have used the tools to boost their teaching. Even last year, one survey found, more than half of K-12 teachers were using ChatGPT for course and lesson planning. Another survey, conducted just six months ago, found that more than 70 percent of the higher-ed instructors who regularly use generative AI were employing it to grade student work or give feedback on it. And the tech industry is providing them with tools to do so: In February, the educational publisher Houghton Mifflin Harcourt acquired a service called Writable, which uses AI to give grade-school students comments on their papers.
Jensen acknowledged that his cheat-anxious writing faculty at ASU were beset by work before AI came on the scene. Some teach five courses of 24 students each at a time. (The Conference on College Composition and Communication recommends no more than 20 students per writing course and ideally 15, and warns that overburdened teachers may be “spread too thin to effectively engage with students on their writing.”) John Warner, a former college writing instructor and the author of the forthcoming book More Than Words: How to Think About Writing in the Age of AI, worries that the mere existence of these course loads will encourage teachers or their institutions to use AI for the sake of efficiency, even if it cheats students out of better feedback. “If instructors can prove they can serve more students with a new chatbot tool that gives feedback roughly equivalent to the mediocre feedback they received before, won’t that outcome win?” he told me. In the most farcical version of this arrangement, students would be incentivized to generate assignments with AI, to which teachers would then respond with AI-generated comments.
Stephen Aguilar, a professor at the University of Southern California who has studied how AI is used by educators, told me that many simply want some leeway to experiment. Jensen is among them. Given ASU’s goal to scale up affordable access to education, he doesn’t feel that AI has to be a compromise. Instead of offering students a way to cheat, or faculty an excuse to disengage, it might open the possibility for expression that would otherwise never have taken place—a “path through the woods,” as he put it. He told me about an entry-level English course in ASU’s Learning Enterprise program, which gives online learners a path to university admission. Students start by reading about AI, studying it as a contemporary phenomenon. Then they write about the works they read, and use AI tools to critique and improve their work. Instead of focusing on the essays themselves, the course culminates in a reflection on the AI-assisted learning process.
Robbins said the University of Utah has adopted a similar approach. She showed me the syllabus from a college writing course in which students use AI to learn “what makes writing captivating.” In addition to reading and writing about AI as a social issue, they read literary works and then try to get ChatGPT to generate work in corresponding forms and genres. Then they compare the AI-generated works with the human-authored ones to suss out the differences.
But Warner has a simpler idea. Instead of making AI both a subject and a tool in education, he suggests that faculty should update how they teach the basics. One reason it’s so easy for AI to generate credible college papers is that those papers tend to follow a rigid, almost algorithmic format. The writing instructor, he said, is put in a similar position, thanks to the sheer volume of work they have to grade: The feedback that they give to students is almost algorithmic too. Warner thinks teachers could address these problems by reducing what they ask for in assignments. Instead of asking students to produce full-length papers that are assumed to stand alone as essays or arguments, he suggests giving them shorter, more specific prompts that are linked to useful writing concepts. They might be told to write a paragraph of lively prose, for example, or a clear observation about something they see, or some lines that transform a personal experience into a general idea. Could students still use AI to complete this kind of work? Sure, but they’ll have less of a reason to cheat on a concrete task that they understand and may even want to accomplish on their own.
“I long for a world where we are not super excited about generative AI anymore,” Aguilar told me. He believes that if or when that happens, we’ll finally be able to understand what it’s good for. In the meantime, deploying more technologies to combat AI cheating will only prolong the student-teacher arms race. Colleges and universities would be much better off changing something—anything, really—about how they teach, and what their students learn. To evolve may not be in the nature of these institutions, but it ought to be. If AI’s effects on campus cannot be tamed, they must at least be reckoned with. “If you’re a lit professor and still asking for the major themes in Sense and Sensibility,” Robbins said, “then shame on you.”