Should universities bring back in-person exams to tackle AI?

83 replies

Jaxx · 07/06/2026 12:26

From the Speccie:
Why isn’t Durham University taking AI cheating seriously?

At Durham University, I have been Chair of the Board of Examiners for Philosophy since 2016. Last week I resigned, because I feel that it is my responsibility to raise a vital issue in higher education, one whose true significance is not understood. The existence of a crisis requiring immediate action is not generally recognised. I am not blaming the deans and pro-vice-chancellors. I want to hold the appropriate figures to account: the Vice-Chancellors.

Durham University is a beacon of excellence in the UK university sector. I wish to maintain standards in its top-rated Philosophy department. The issue I am addressing affects students past, present and future, in many leading British universities. It involves a crime that is not victimless. It is this: the lazy student cheats with professional-grade versions of AI chatbots such as ChatGPT or Claude, and gets a first-class result. The hard-working student uses no bots, or a non-professional bot honestly, thinking and writing for themselves, but gets a 2:1. That’s not fair. Yet Vice-Chancellors – to mix metaphors – are sitting on their lavishly-remunerated backsides and adopting the ostrich position.

Let me explain how the university marking system works, at least in the UK. Until Covid, there was a mix of assessment by sit-down exam and continuous assessment. During Covid, this was replaced, for understandable reasons, by “at home” exams done on computer – to the relief of academics who no longer had to decipher student handwriting. After Covid this continued, and worked satisfactorily. A system called Turnitin would check for plagiarism by scanning essays for text compiled from published sources, and it was hard to cheat.

That system has been subverted by ChatGPT and similar AI tools. These apps now work very well for producing university-style essays and essay components, especially in the professional version which many Durham students can afford. (You see that there is discrimination already against working-class students. What a surprise.)

A return to sit-down exams is the obvious solution. Universities say that these are financially impractical for various reasons. I doubt this, but in any case it is too late at Durham and elsewhere to implement sit-down exams this year. (They should have been implemented a year ago.) So in response to the immediate crisis I must suggest sticking plaster solutions. I am old enough to recall the introduction of anonymous marking, which happened because research showed that female students were discriminated against. Now, when the alternative is what I call a chaotic system of guesswork, abandoning it may be our least-worst immediate alternative.

AI, in contrast, is now often impractical to detect with enough reliability to meet the high standards of proof required for accusations of cheating. Perhaps there is no ultimate alternative to sit-down exams – other options such as vivas are very time-consuming. My suggestion of abandoning anonymous marking seems a reasonable sticking plaster. We need an immediate response to the unfairness in the coming exam season. Yet most of my colleagues don’t seem to understand how it is fairer.

The buck stops with VCs, who should be exercising leadership. Obviously students who cheat are behaving badly – but the point is that the system allows them to do so with ease. Students have always cheated, but now they know they can do so with impunity.

I have been an academic since 1988, and worked at Durham University since 1991. I’ve been very lucky to get a job at Durham – and privileged. The students are excellent, the university is well-known and generally well-run. Any student who gets a place here will be both excellent academically, and well-set for future employment. Academic work is the only full-time employment I’ve had, after a series of temporary jobs as a student, and taking a PGCE course at Moray House Edinburgh (Primary). But now, shortly before retirement, I find that the skills I’ve developed as a marker – essential to my teaching role – aren’t able to be put to their proper use. Much effort is being expended by academics in working out how to identify the improper use of AI. But this vain pursuit is becoming very difficult and may soon be impossible. AI “tells” that remain are being aggressively stamped out by AI companies. One cannot mark essays under the Covid system, in the era of ChatGPT. Yet the university leadership is still in the ostrich position.

I was on research leave for the first term this academic year; Durham still operates a term system. Since then I have chaired one disciplinary panel for AI cheating, detected by the junior colleague who marked it. He noticed that the referencing contained tell-tale signs of AI “hallucinations”. This AI “tell” can prove misuse, as has happened in a number of high-profile cases. But AI makers have reduced its frequency, and cheaters can cover their tracks by minimally checking or omitting references. With existing tools, AI cheating is almost impossible to prove except by student confession. Stylistic cues, incongruous sophistication or maturity of writing for an undergrad cannot reliably show that AI has been misused, even when the marker is suspicious. Only with the last batch of essays which I started marking in April, did the true horrors of the situation become apparent to me. The system was broken and needed immediate emergency repair.

It should be clear that my complaint is not against Durham University. Few British universities have re-introduced sit-down exams extensively. Many departments have never had anonymisation; music, for obvious reasons. I know from bitter experience that one cannot play the piano with a paper bag over one’s head. Universities seem unable to respond in a timely manner to the current crisis. Academics are grumbling. Someone surely has to blow the whistle?

I guess that few whistle-blowers are enthusiastic. I have never engaged in a lawsuit, or been the victim of one, on any matter. A cursory Google search reveals that UK whistleblower laws protect my salary and pension, and I’ll get a hundred quid for this article. So I’m prepared to light the blue touchpaper and retire, in both senses.

It has been a very difficult decision to get this article published. Colleagues will be upset, and students disconcerted. But should they continue in ignorance? There needs to be quick action – something that large institutions such as universities have never been known for. Think of Air Chief Marshal Dowding in 1940, confronted with massive losses of planes in France. He goes to Winston Churchill and says that we cannot send any more Hurricanes and Spitfires there. But instead of reluctantly agreeing, Churchill sets up a cabinet committee to investigate, and committees in parliament. By the time they report, Britain is being invaded. I think I would be on Dowding’s side in that debate.

The imperative is the wellbeing of students who are suffering under the present system, though they don’t realise it. Staff in UK universities, and elsewhere, are not able to mark with the integrity the matter demands, because there is no sufficiently reliable way of detecting or preventing improper use of AI. Academics are doing their best, but as one young colleague said to me, “We are fighting with our hands tied behind our backs”.

Cromwell’s immortal words to the Rump Parliament apply to the current generation of VCs: “You have sat too long here for any good you have been doing lately…Depart, I say, and let us have done with you. In the name of God, go!” We need an equivalent of the mani pulite movement in Italian politics – one that does not end with a Berlusconi.

by Andy Hamilton, Professor of Philosophy and former Chair of Board of Examiners at Durham University

Full disclosure, my son is at Durham doing a humanities degree and I totally agree there should be a return to in-person exams for all degrees. He doesn’t use AI to write essays tbf, but does use it to help plan, find sources and check his work, acting on recommendations he agrees with.

greenlampern · Yesterday 13:17

poetryandwine · Yesterday 12:51

Yes, I am a STEM academic. I don’t specialise in anything medical or biological, but I regard the use of LLMs in medical imaging and diagnostics as one of the unmitigated successes to date. I think shortly LLMs will be able to take over a great deal here and I hope that will relieve a burden in the NHS.

I think you’re talking about a new scale of workings rather than refuting the scientific methodology, but I daresay scientists and engineers felt the same way during the Industrial Revolution! In 200 years the AI paradigm shift will seem slow paced.

People are focused on LLMs, but it's the genAI tools that seem to have the greatest potential now.

Yes, not refuting the scientific method at all. But having to develop methodologies that align and answer to it as you go along is intense. It's the Wild Wild West, and no one has any idea how it should or could be done.

It'll take so much less money and time to generate evidence and knowledge. The problem, in my opinion, is that policy makers and people at the top of these projects are understandably resistant, but also hostile.

greenlampern · Yesterday 13:26

Also, another interesting point on what you're saying - a lot of people are leaving academia and going into industry and just doing the same work for better pay. Work that was traditionally done within universities. The difference is they don't have to answer to rigorous academic rules, which in reality do stifle innovation. You see the negative impact of this with the abandonment of ethics, but it is also unprecedented and interesting to watch.

AmaryllisNightAndDay · Yesterday 13:51

poetryandwine · Yesterday 12:51

Yes, I am a STEM academic. I don’t specialise in anything medical or biological, but I regard the use of LLMs in medical imaging and diagnostics as one of the unmitigated successes to date. I think shortly LLMs will be able to take over a great deal here and I hope that will relieve a burden in the NHS.

I think you’re talking about a new scale of workings rather than refuting the scientific methodology, but I daresay scientists and engineers felt the same way during the Industrial Revolution! In 200 years the AI paradigm shift will seem slow paced.

Are you specifically talking about LLMs or AI in general? I worked in a hospital in the late 1970s which ran one of the first AI-based report generators. It wasn't an LLM or even what we'd now think of as AI - it simply took sets of figures, interpreted them and translated them into an English sentence in a strict rule-based way (it was a version of PUFF which interpreted flow-volume loops together with some other patient data, to add to the printed graph). We called it a "summary generator" not AI. It was always subordinate to the judgment of the consultant, who was expected to check it. And because it applied a small number of strict and rigid explicit rules to generate the summaries, and the graph was right there to see, it was reliable and verifiable and easy to test and fix.

AI has moved on massively since then, especially the growth of statistical AI and machine learning. The AI paradigm shift may seem slow because AI isn't infallible and it's not transparent and we can't yet scope the possibilities for failure. And that especially matters in safety-critical systems like medical diagnosis.

And AI is more than LLMs. I don't think I'd ever want to be diagnosed by an LLM! LLMs generate plausible text, not accurate text. Forcing an LLM to check ground truth is still an open research problem.

Owlbookend · Yesterday 14:50

I don't think anybody thinks AI isn't a useful tool or that students shouldn't be taught about where and how to use AI appropriately or effectively.
The problem is that unless you believe that assessments can be tackled in any way using any AI tool, students using it covertly will be problematic. Most universities will ask students to include an AI statement with uncontrolled assessments (coursework assessments etc.) and will have a policy on what is or is not acceptable AI use. However, if the AI statement is not truthful, it is very difficult to prove this. If you think any type of AI use is acceptable, it doesn't matter. However, if you think that AI use needs to be limited or used only in particular ways, it does.
Students are developing the ability to analyse and interpret data, sources or literature. You can get LLM AI to do this. Does that mean students don't have to learn how to do it themselves? Do we not need the ability to do this as humans so we can analyse the validity of the output? Is undergraduate education simply about developing your skills to write effective AI prompts? There are quite fundamental educational issues about skills, product and process.
If we take a step back, how about in pre-university education? Should young people use AI in GCSE English assessments? If asked to write a magazine article promoting sport to young people, is it okay to just plug that question into ChatGPT and have the output marked? You could do that in 'real life'. Playing devil's advocate, we don't need to develop writing skills - AI can do it. Why learn to solve quadratic equations or analyse the causes of WW2? A computer can do it for the student.

Fundamentally, education is about developing and refining skills and abilities; it is not about creating 'products' (reports, essays, proofs, solutions etc.). We already know computers can sometimes do these things more efficiently. The assessments are just mechanisms to get students using and developing their skills and abilities. I don't think the only skill people need to develop is cutting and pasting assignment briefs into an AI engine.

Owlbookend · Yesterday 14:55

& I don't think every assignment should be either
A) Something personal/reflective
B) Analysing the validity and accuracy of AI outputs
These are the two strategies that are commonly argued to be robust. Although both can have their place, they can't be a complete assessment strategy.

poetryandwine · Yesterday 16:20

Your early report generator sounds fascinating.

I was thinking specifically of medical imaging and genetics/ genomics, which are about pattern finding. I think LLMs, machine learning and some generative AI are key drivers, are they not? This is not my specialism.

Yes, of course AI is much more than LLMs and as its scope expands so does the need to stay connected to reality.

Surely GIGO applies as much to AI use (albeit somewhat differently, perhaps - it may be better to think of sow’s ears and silk purses) as to any other field. Too many students do not appreciate this. It is one of the reasons they trip themselves up.

You know a lot more than me about the use of AI in other areas of diagnostics. With the caveat that a consultant should always be in charge, may I ask what types of AI usage would make you uncomfortable? Or is it that you fear consultants would relinquish independent thought?

blueshipsail · Yesterday 18:03

My daughter is doing a STEM degree at Surrey and has sat seven in-person exams in y1, plus some multiple-choice tests under supervision. It sounds like there’s a big discrepancy between courses and across institutions.

ElephantGrey101 · Yesterday 18:21

I am a university lecturer and we are changing our assessments to in-class tests to make it harder to cheat.

I have failed some students and put some through academic misconduct for AI use but that is only a tiny minority of those who have used AI to complete their assignment.

We tried to get the students to do a reflective assignment where they critiqued an AI output, but many of them actually got the AI to critique itself (I wish I was joking), so that assignment has to be changed.
