When There Are No Perfect Answers: Designing Assessment for the Age of AI
Published on 4 March 2026 by Stephen Wheeler.
Introduction: Living With the Assessment Dilemma
Amelia King recently captured something many educators are quietly experiencing: a growing sense that the traditional assumptions underpinning assessment no longer hold.
Her article, “What Do We Do About AI and Assessment When There Are No Good Answers?”, describes the circular conversations educators now find themselves in. We know AI detection does not work reliably, yet the proposed alternatives - in-class testing, oral examinations, or technological surveillance - threaten to reshape classrooms in ways that undermine their educational purpose.
As King puts it, we appear to be caught between two unsatisfying positions: we cannot reliably verify student work produced outside controlled environments, yet turning classrooms into assessment centres risks destroying the very conditions that make learning meaningful.
In other words, the problem is not simply technological. It is pedagogical.
In this post I want to respond to that dilemma from the perspective of work I have recently undertaken at the University of Manchester. For an entirely online and distance postgraduate programme, I developed an assessment design approach called Structured Peer Verification (SPV).
SPV emerged from a very specific constraint: how do we design assessments that remain credible in an online environment where invigilated exams are neither desirable nor practical, and where generative AI tools are now readily available to students?
Rather than attempting to eliminate AI use entirely, SPV begins from a different premise.
If we cannot guarantee the absence of AI, we must design assessment that guarantees the presence of reasoning.
The Dead End of AI Detection
One of the most striking observations in King’s article is the recognition that the traditional strategy of detecting cheating is increasingly untenable.
Students themselves sometimes assume teachers should simply “know” when work is AI-generated. Yet, as King recounts through classroom examples, comparisons with a student’s previous work often fail to resolve the question, because that earlier work may already resemble AI output in quality and style.
This exposes a deeper issue.
The modern generative AI model does not merely produce incorrect answers or stylistically alien prose. Increasingly, it produces plausible academic responses that resemble competent student work.
Attempts to detect AI therefore risk becoming epistemologically fragile:
- statistical detection produces false positives
- watermarking is unreliable
- stylistic analysis fails when students revise AI output
As Andrej Karpathy bluntly observed, there may simply be no reliable way to detect AI authorship.
If this is true, then a large portion of our current academic integrity infrastructure rests on unstable foundations.
The False Choice: Surveillance or Exams
King identifies the two dominant responses currently emerging.
The first is technological surveillance and AI detection.
The second is a return to controlled assessment environments: in-class writing, oral defences, and examinations.
Each response carries costs.
Surveillance technologies introduce ethical and practical concerns.
In-class testing, meanwhile, risks transforming classrooms into sites of verification rather than sites of learning. King expresses this concern powerfully:
classrooms risk becoming places of proving learning rather than enabling learning.
However, the difficulty becomes even sharper in online and distance education.
For programmes delivered entirely online, bringing assessment “back into the classroom” is often not possible in practice. International cohorts, professional learners, and asynchronous learning structures make traditional invigilated exams difficult to implement and pedagogically questionable.
In online education, therefore, the assessment challenge cannot be solved simply by relocating the exam.
The design of assessment itself has to change.
A Different Starting Point: Verifying Reasoning, Not Policing Tools
The Structured Peer Verification model that I developed for an online programme at my university begins from a different assumption.
Instead of asking:
“Did the student use AI?”
it asks:
“Can the reasoning behind the work withstand structured scrutiny?”
This shift moves the focus of assessment away from authorship detection and toward epistemic accountability.
The goal is not to guarantee that AI was never used. In practice, that guarantee may be impossible.
Instead, the goal is to ensure that the intellectual work remains visible, explainable, and defensible.
In this sense, SPV reframes assessment around three principles:
- reasoning must be inspectable
- understanding must be demonstrated
- intellectual claims must survive verification
The Structured Peer Verification Model
The SPV model was designed specifically for an online and distance postgraduate programme where large-scale oral examinations or invigilated testing were impractical.
It introduces a three-stage assessment process that embeds verification directly within the learning environment.
Stage 1: Student Submission
Students complete the assessment task as normal.
This stage resembles a traditional assignment submission and may involve calculation, analysis, or written explanation.
Stage 2: Structured Peer Verification
Rather than simply grading the work, peers conduct a structured verification of the reasoning behind it.
This involves:
- checking intermediate reasoning steps
- replicating calculations
- identifying conceptual gaps
- asking targeted clarification questions
The peer reviewer does not mark the work. Instead, they test the integrity of the reasoning.
Stage 2b: Author Response
The original student responds to the verification report.
They must:
- clarify reasoning
- address identified weaknesses
- explain methodological choices
This step requires the student to demonstrate ownership of the reasoning process.
Stage 3: Procedural Clarification
A random sample of submissions undergoes a short procedural check with the instructor.
This is not a full viva.
Instead, it functions as a light verification mechanism to ensure that reasoning can be articulated if required.
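The three stages above can be sketched as a simple data model. This is purely illustrative: every name here (Submission, peer_verify, sample_for_clarification, the 20% sampling rate) is a hypothetical choice of mine, not part of any real assessment system at the programme described.

```python
import random
from dataclasses import dataclass, field

# Illustrative sketch of the three SPV stages as a data model.
# All names and parameters are hypothetical.

@dataclass
class Submission:
    author: str
    work: str                                          # Stage 1: the assessed artefact
    verification: list = field(default_factory=list)   # Stage 2: peer checks/questions
    responses: list = field(default_factory=list)      # Stage 2b: author replies

def peer_verify(sub: Submission, questions: list) -> None:
    """Stage 2: a peer records targeted checks rather than a mark."""
    sub.verification.extend(questions)

def author_respond(sub: Submission, replies: list) -> None:
    """Stage 2b: the author must answer every verification point."""
    if len(replies) < len(sub.verification):
        raise ValueError("every verification point needs a response")
    sub.responses.extend(replies)

def sample_for_clarification(subs, rate=0.2, seed=None):
    """Stage 3: a random subset is flagged for a short instructor check."""
    rng = random.Random(seed)
    k = max(1, round(rate * len(subs)))
    return rng.sample(subs, k)
```

The point of the sketch is structural: peer verification and author response are recorded alongside the work itself, and the procedural check in Stage 3 touches only a random sample, keeping instructor load light while preserving the possibility of scrutiny for every submission.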
Why This Matters in the Age of AI
The SPV approach does not attempt to eliminate AI from student workflows.
Students may consult AI tools, just as they consult textbooks, lecture notes, or peers.
However, AI assistance alone cannot complete the verification cycle.
To succeed in the assessment, students must still be able to:
- explain their reasoning
- respond to critique
- defend methodological decisions
In other words, the model shifts assessment away from product authenticity and toward reasoning accountability.
This makes it significantly harder to outsource intellectual engagement entirely.
Preserving the Classroom as a Space for Learning
One of the most compelling concerns raised in King’s article is the risk that classrooms become dominated by testing rather than learning.
For online programmes, the equivalent danger is different but related: replacing learning with surveillance.
Assessment systems that rely primarily on AI detection tools, remote invigilation, or technological policing risk undermining trust and distorting the learning environment.
The SPV model attempts to avoid this by embedding verification within collaborative academic practice.
Rather than increasing surveillance, it increases scrutiny of reasoning.
Rather than relying on technological detection, it relies on structured intellectual dialogue between students.
Assessment becomes embedded within learning processes rather than imposed upon them.
Accepting Imperfect Solutions
King’s article ends with an honest admission: perhaps there are no perfect answers.
That may be true.
But the absence of perfect solutions does not mean all responses are equal.
Some responses narrow the possibilities of education.
Others expand them.
Structured Peer Verification is not a complete solution to the AI assessment challenge. It is an approach developed for a specific online programme that attempts to preserve three things simultaneously:
- intellectual rigour
- pedagogical integrity
- the realities of AI-enabled learning environments
If generative AI has forced educators to rethink assessment, that disruption may ultimately prove valuable.
It compels us to ask a deeper question:
not how we detect cheating,
but how we design assessment that makes genuine thinking visible.
References
King, A. (2026) What Do We Do About AI and Assessment When There Are No Good Answers? LinkedIn article, 12 February.
Karpathy, A. (2024) Public comments on AI detection and education policy.