AIBU?

to wonder how exam boards/Ofqual can be so shit at statistics and if they even passed A Level maths themselves

79 replies

ShootsFruitsAndLeaves · 17/08/2020 20:43

Roll a die 30 times. How many sixes did you get?

www.random.org/dice/?num=30

I tried ten times:

5, 4, 6, 5, 6, 2, 7, 7, 4, 4.
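A minimal Python sketch of the same experiment, for anyone who would rather simulate it than click through random.org. The function name `count_sixes` and the seed are illustrative assumptions; the counts will differ from the ten above because the rolls are random.

```python
import random

def count_sixes(num_dice, rng):
    """Roll `num_dice` fair six-sided dice and count how many show a six."""
    return sum(1 for _ in range(num_dice) if rng.randint(1, 6) == 6)

# Fixed seed (an arbitrary choice) so that repeated runs give the same rolls.
rng = random.Random(2020)

# Ten "classes" of 30 students, each with a 1-in-6 chance of an A.
trials = [count_sixes(30, rng) for _ in range(10)]
print(trials)
```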

Now imagine that's a comprehensive sixth form sitting an A Level. Even if we know the school's results for years and years, and we know 1 in 6 get As, then in any given year there are likely (99%) to be between 2 and 8 As in that class.

If you want to award the same number of As as last year, you're going to need to hand out 5 As to that school. But this year just by randomness maybe only 2 deserved it. Lucky kids. Or maybe 8 deserved it. Ouch.

Now imagine that was a smaller class of only 15, where normally only 1 in 15 gets an A. If student performance is randomly distributed then there's about a 99% chance of 0, 1, 2 or 3 As. So you can have 3 A*s in a class that normally gets one. Not every school would be like that. But some classes in some schools WILL be in that position.

And it's in fact NOT randomly distributed. Some schools will have good intake years. Maybe a genius physicist's twin children joined the school and are sitting A Level Physics. So the exam results are actually less likely to follow the past distribution than even random dice would be. It's even less predictable than that!

Even an elite school where 40% get As is going to have considerable variance from year to year, even in large subjects like maths. In smaller subjects with 10 students per year, the model proposed to award grades according to the past 3 years' results. In the past my son's school got 100% As for music in some years, and in others a mixture of A, A, B, C. Unlucky for you if you were in the all-A cohort.
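To put a rough number on that year-to-year variance, here is a sketch that treats each student as an independent 40% chance of an A. That independence is a simplifying assumption (as noted above, real cohorts are even less predictable), and the helper name `a_grade_spread` is made up for illustration.

```python
import math

def a_grade_spread(cohort_size, p_a):
    """Return (mean, standard deviation) of the number of As under a binomial model."""
    mean = cohort_size * p_a
    sd = math.sqrt(cohort_size * p_a * (1 - p_a))
    return mean, sd

for n in (10, 30, 100):
    mean, sd = a_grade_spread(n, 0.4)
    # Express the spread as a share of the cohort, in percentage points.
    print(f"cohort {n}: expect {mean:.1f} As, give or take {sd:.1f} "
          f"({100 * sd / n:.1f} percentage points)")
```

Under this model the share of As in a class of 10 swings by around 15 percentage points from year to year, against roughly 5 for a cohort of 100.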

It's astonishing how anyone with even a GCSE in Maths could think any of this could make sense. It's the most basic principle of variance.

Surely they couldn't really be so incredibly stupid?! Was it all a trick so that those given unfair grades by their teachers would not complain about that, because, well, at least they weren't awarded a D by ERNIE?

Original poster

ShootsFruitsAndLeaves · 17/08/2020 20:47

Sorry, that should say the 99% range for 30 dice is 1 to 11 sixes. 2 to 8 is only about a 92% chance, which would leave about 1 in 12 outside that range.
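These figures can be checked exactly rather than by rolling dice. The sketch below assumes each student is an independent draw with a fixed chance of an A (a binomial model); `prob_between` is a hypothetical helper name, not anything from Ofqual's work.

```python
from math import comb

def prob_between(n, p, low, high):
    """P(low <= X <= high) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(low, high + 1))

# 30 students, each with a 1-in-6 chance of an A.
print(f"2 to 8 As:  {prob_between(30, 1/6, 2, 8):.3f}")
print(f"1 to 11 As: {prob_between(30, 1/6, 1, 11):.3f}")

# 15 students, each with a 1-in-15 chance of an A.
print(f"0 to 3 As:  {prob_between(15, 1/15, 0, 3):.3f}")
```

This gives roughly 92% and a little over 99% for the two dice ranges, and about 98.5% for the class of 15.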

serenada · 17/08/2020 21:08

I didn't understand any of that but if it's any consolation I agree that the anomaly that left c/d students so disproportionately disadvantaged had to be noticed somewhere by someone. And still allowed.

ssd · 17/08/2020 21:10

I'm so lost by your post but I think I agree with you.

Original poster

ShootsFruitsAndLeaves · 17/08/2020 21:36

> I didn't understand any of that but if it's any consolation I agree that the anomaly that left c/d students so disproportionately disadvantaged had to be noticed somewhere by someone. And still allowed.

Do you mean grade c/d? There were numerous problems with the system:

Very small cohorts were given the teacher-assessed grades, not the computer algorithm's. Small cohorts are much more common at private schools (think A Level Fine Art, Music Technology, etc.). This meant that even though the same rules applied everywhere, there was indirect discrimination.
Comprehensive schools by their nature have more variance. This means there might be several good students one year and none the next. The algorithm made no attempt to identify which kind of year this might be (e.g. by looking at GCSE grades, or even at the teachers' predicted grades, which were simply a placebo and were ignored in favour of the overall rank).

Independent schools have already filtered out the weak students at 11+, so on the whole their grades will be A-B, not A-E as at a comprehensive school. This means the average private school student is less likely to be the victim of an egregious error: for example, the boy ranked 60th out of 70 in 2019 might still have got a B, whereas at a comprehensive the 5th student out of 30 might have been a C one year and an A the next. So even if the private school has a slightly better group of students this year, and a group of candidates who deserved As get Bs as a result, that's not going to be as severe as an A student at a comprehensive being awarded a C.

(Of course the boy ranked dead last at the private school but predicted a C is still going to get handed a U if that was what last place got on average in previous years at his school. I'm not a teacher, but it's hard to imagine that the teacher, who has submitted a C grade in good faith, can actually anticipate which students will get Us, given that sometimes it could be due to illness, not turning over the paper properly, etc. So for someone to be awarded a U because they were in last place at their elite school, even though in reality they really did deserve a C, is not fair at all.)

serenada · 18/08/2020 09:48

Yes

MereDintofPandiculation · 18/08/2020 10:12

Why should members of Exam boards have passed A level maths? The majority of graduates haven't.

holdingpattern · 18/08/2020 10:17

I seriously don't get how no one understands the algorithm or purpose.

If School A for the last 100 years has only managed
AABBBCCCC
AAABBBCCC
ABBBCCCDD
BBBBCCCDU

Then this year the teachers say

AAAAAAAAB

I think everyone would agree it's quite amazing how the pupils of 2020 were so brilliant.

The algorithm was there to say - normally even at your best, you never got those results. We will be generous and add inflation in, and give you
AAAABBBBBC

At better schools, teachers were stricter and hence their downgrading looked smaller. At weaker schools, teachers overestimated greatly.

For individuals there was always the possibility of a star pupil. But if teachers had ranked them 1, then they would have still got the A grades. If teachers ranked them 8, then they might have found themselves dropped down.

What has happened is that teachers had to rank pupils too.
So there are many cases of people saying, 'I got an A in prelims and Joe got a B in mocks/prelims, but Joe was awarded an A and I got a C.' That's because the teachers (unbiased of course) ranked each pupil and you were ranked below Joe, even though both of you were considered an A by the school. Ofqual had to readjust these predictions and you dropped to a C because of your ranking.
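The rank-based allocation described in this post can be sketched as follows. This is a deliberately simplified illustration, not Ofqual's actual model: the function `allocate_by_rank`, the student labels and the target distribution are all hypothetical.

```python
def allocate_by_rank(ranked_students, grade_distribution):
    """Hand out a fixed list of grades, best grade first, in teacher rank order.

    ranked_students: names ordered from rank 1 (strongest) downwards.
    grade_distribution: grades the cohort is allowed, e.g. "AAABBBCCC".
    """
    if len(ranked_students) != len(grade_distribution):
        raise ValueError("need exactly one grade per student")
    # Letters A-E and U happen to sort alphabetically from best to worst.
    grades = sorted(grade_distribution)
    return dict(zip(ranked_students, grades))

# Teacher-assessed grades: eight As and one B, as in the example above.
teacher_grades = {"S1": "A", "S2": "A", "S3": "A", "S4": "A", "S5": "A",
                  "S6": "A", "S7": "A", "S8": "A", "S9": "B"}
ranking = list(teacher_grades)  # dict order here doubles as the teacher's ranking

awarded = allocate_by_rank(ranking, "AAABBBCCC")
for name in ranking:
    print(name, teacher_grades[name], "->", awarded[name])
```

Here the student ranked 8th was assessed as an A but is awarded a C, purely because of their position in the ranking.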

The other thing everyone seems oblivious to, is that every year the exam results are normalised.

If everyone got 80% or higher in maths, they move the pass mark to 90% on the basis that the exam was too easy and the results have to be normalised. So if, at some point in the past 100 years, there was a year when everyone got at least 80%, and you could see marks against grades, you would find that 80% would have got a D.

This year by awarding 40% inflation, Universities will not have enough places and will have to select on the basis of something else. Maybe hair colour, maybe an essay, maybe like LNAT and UCAT tests for law and medicine. But they will have to select somehow.

The unfairness was cancelling the exams. These were the closest to giving a fair comparison. Teachers are different, schools are different, mocks, tests, prelims, marking and setting are different.

Cheesess · 18/08/2020 10:21

The biggest problem is the fact that some students are very intelligent however they completely flunk exams, might have a bad day, etc.
And no one is going to be able to predict who that’s going to happen to on the day.
But I think that just PROVES that our educational system is all messed up and how you perform on a few days during May/June one year should not affect the whole of your future.
I had a bad day during one of my exams at university and barely got a third (42%); the rest of my results for my entire degree were >70%.

Original poster

ShootsFruitsAndLeaves · 18/08/2020 16:24

> I seriously don't get how no one understands the algorithm or purpose.

> If School A for the last 100 years has only managed
> AABBBCCCC
> AAABBBCCC
> ABBBCCCDD
> BBBBCCCDU

> Then this year the teachers say

> AAAAAAAAB

> I think everyone would agree it's quite amazing how the pupils of 2020 were so brilliant.

> The algorithm was there to say - normally even at your best, you never got those results. We will be generous and add inflation in, and give you
> AAAABBBBBC

Why are you saying things that aren't true?

They didn't use the last 100 years. The cut off was three years.

There was no inflation in the algorithm. The only inflation was for non-algorithm cohorts, that is those with very small groups doing a subject or new subjects. This was mostly found for rare subjects and mostly at private schools. These were given the CAGs, which as you say are inflationary.

The issue is that in your example we in fact only have THREE years of data. And it doesn't have inflation. It assumes that the school did as well as the average from those three years. In fact, as a statistical inevitability, about one in four would have had their best year of the four, and roughly half of non-small cohorts would have been overgraded and roughly half undergraded.
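The "one in four" figure can be checked with a short sketch. It assumes that a centre's four cohorts (three past years plus 2020) are exchangeable, meaning every ordering of their true performance is equally likely, and that there are no ties; both are simplifying assumptions.

```python
from fractions import Fraction
from itertools import permutations

def prob_latest_is_best(num_years):
    """Share of equally likely orderings in which the final year is the strongest."""
    orderings = list(permutations(range(num_years)))
    # Higher number = stronger cohort; the last position is the current year.
    best_last = sum(1 for order in orderings if order[-1] == max(order))
    return Fraction(best_last, len(orderings))

# Three years of history plus the 2020 cohort.
print(prob_latest_is_best(4))  # 1/4
```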

And within that the grades wouldn't have been given to the right people, because teachers are fallible and can't possibly rank accurately. Think how hard it is to win the football pools. Correctly predicting ranks is 100.00000000000000000000000000000000000000% impossible. Especially across millions of students. It can't possibly happen.

Research before posting.

Hardbackwriter · 18/08/2020 16:28

Ofqual were told that the algorithm couldn't produce grades that would amount to grade inflation. There wasn't a way of doing that that didn't mean downgrading the CAGs on average. I think Ofqual should have said that it was impossible to do this equitably, and that they certainly should have done so after their own report showed that the impact was uneven and that it disproportionately affected state sixth forms. But I also think that the DfE, who set the parameters of what the algorithm must achieve (no grade inflation) bear a significant share of the responsibility and it's disgusting to see Williamson throwing Ofqual under the bus at the moment.

chomalungma · 18/08/2020 16:34

Law of small numbers and all that.

en.wikipedia.org/wiki/Law_of_small_numbers

user1497207191 · 18/08/2020 16:35

> The unfairness was cancelling the exams.

Fully agree. The moment that decision was made, it was always going to have a shambolic outcome.

The next shambolic decision was asking teachers to rank pupils within each grade boundary. That was a monumental cock up. If Ofqual had wanted a more detailed "estimate", they should have asked schools to provide a mark or percentage mark rather than a grade. At least then, they could have "moderated" using the normal moderation/standardisation method.

Original poster

ShootsFruitsAndLeaves · 18/08/2020 16:37

> Ofqual were told that the algorithm couldn't produce grades that would amount to grade inflation. There wasn't a way of doing that that didn't mean downgrading the CAGs on average.

They could have downgraded the CAGs in proportion with average predicted vs actual grade inflation for those centres.

Hardbackwriter · 18/08/2020 16:47

@ShootsFruitsAndLeaves

> Ofqual were told that the algorithm couldn't produce grades that would amount to grade inflation. There wasn't a way of doing that that didn't mean downgrading the CAGs on average.

They could have downgraded the CAGs in proportion with average predicted vs actual grade inflation for those centres.

Sorry but I don't quite follow what you mean? Do you mean their previous predicted grades from previous years vs the actual grades? If so I think there would have been many saying (correctly) that CAGs were not comparable to the predicted grades from previous years - schools went through a different and more rigorous process - and so that wouldn't be comparing like with like either.

Hardbackwriter · 18/08/2020 16:49

I think the fatal flaw was there from the off, in the decision to prioritise limiting grade inflation above almost all else. That was a political decision that came from a government obsessed with 'quality' and 'value' in education (see also their attitude to HE) for essentially elitist reasons, not the result of Ofqual's statisticians not understanding maths.

Piggywaspushed · 18/08/2020 17:07

Your understanding of the algorithm is faulty, holding. Ignoring a couple of other matters, in the example you give, no human went 'oh all right then, we will adjust your over-predictions down a bit.' That U you cited would have influenced things enough to bring the B down to a U (although in your cohort size example, maybe not) and possibly not allow an A.

Just so you know, OP, I had a class of 8 with only 2 years of prior data (the prior year being identical to the 2020 CAGs) and 50% were downgraded.

Piggywaspushed · 18/08/2020 17:11

We also don't predict grades. Ofqual sneakily cast aspersions on our prediction ability in their 350 page algorithm by using UCAS predictions as evidence in order to build in assumptions that teachers cannot predict. They have no other data at all on accuracy of prediction as we no longer supply predictions to boards.

Their algorithm also did not look at value added in anything like the way it should have done.

irregularegular · 18/08/2020 17:13

I agree that they put too much weight on imposing "average" distributions on fairly small populations. What they should have done was use teacher assessments combined with random, detailed checks of actual evidence to support those assessments. The algorithm should have been used to flag up anomalous-looking cases which needed more checking. They should certainly have picked up on cases where students were downgraded 2, 3 or more grades according to the algorithm. Yes, teachers will err on the generous side, but are they really likely to predict an A for a D-grade student?

The other mistake was NOT moderating very small groups at all (your class of 10, by the way, would have received marks that were a combination of teacher assessed and algorithm generated). Of course this couldn't be done statistically, but it could be done by spot checking with follow-ups as needed. Not moderating small groups led to considerable grade inflation for schools/subjects with small classes. This meant that others inevitably had to have marks go down in order to maintain very slight inflation overall. It's noticeable that state grammar schools were worse affected on average than comprehensive schools, so it is not having a selective intake that advantaged private schools, but the small classes. This was a very obvious bias!

The other very silly thing they did was test how good the algorithm was at predicting grades using last year's data and predicted grades and the KNOWN rankings from A-level results. Of course it predicted well! Duh!
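A toy sketch of why that kind of test flatters the algorithm. Everything here is hypothetical: the cohort, the helper names and the "teacher" ranking, which simply swaps neighbouring students to mimic ranking error. It is not a reconstruction of Ofqual's validation, and unlike the real test it also reuses the actual grade distribution, which makes the first result perfect by construction.

```python
def grades_from_ranking(ranking, distribution):
    """Give the sorted grade distribution to students in the order they are ranked."""
    return dict(zip(ranking, sorted(distribution)))

def accuracy(predicted, actual):
    """Fraction of students whose predicted grade equals their actual grade."""
    return sum(predicted[s] == actual[s] for s in actual) / len(actual)

# Actual exam outcomes for a made-up cohort, listed in true rank order.
actual = {"S1": "A", "S2": "A", "S3": "B", "S4": "B",
          "S5": "C", "S6": "C", "S7": "D", "S8": "D"}
distribution = "".join(actual.values())

# Test 1: rank students using the exam results themselves (the KNOWN ranking).
true_ranking = list(actual)
print(accuracy(grades_from_ranking(true_ranking, distribution), actual))

# Test 2: a fallible teacher ranking that swaps neighbouring pairs (S2 with S3, and so on).
teacher_ranking = ["S1", "S3", "S2", "S5", "S4", "S7", "S6", "S8"]
print(accuracy(grades_from_ranking(teacher_ranking, distribution), actual))
```

With the known ranking every grade matches; with even small ranking errors near the grade boundaries, only 2 of the 8 students in this made-up cohort keep the right grade.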

Anyway, ironically they will have ended up with even more grade inflation than if they had just used CAGs in the first place for A levels, since they are letting upgraded results stand! It still won't be "fair" though, as schools almost certainly took very different approaches to producing grades. Moderation/standardisation made sense in principle. They just made a hash of it!

Original poster

ShootsFruitsAndLeaves · 18/08/2020 17:20

Teachers cannot predict with total accuracy. It is as impossible as everyone winning the football pools. They did ask teachers to give their 'true honest pinkie promise grades', which were different from the UCAS grades.

However in game theory terms it would have been stupid for teachers to downgrade their students. There was no benefit in submitting 'honest' grades, no matter how much they were asked to do so.

Ofqual ignored the grades anyway and just used the ranks against an average which would almost never match the actual results, and this was more error on top of the inherent error in teachers' imperfect ranking abilities.

Hardbackwriter · 18/08/2020 17:20

I agree with everything you've said @irregularegular. I guess the question is whether Ofqual had (almost certainly not) or could have obtained the extra resource to conduct the very labour-intensive, manual work that you describe.

irregularegular · 18/08/2020 17:48

> I guess the question is whether Ofqual had (almost certainly not) or could have obtained the extra resource to conduct the very labour-intensive, manual work that you describe.

It would have been a challenge. But they wouldn't need to check everything. It could be targeted. They had access to examiners who would not have been examining (including those who normally moderate practical work). Teachers could check other schools (yes I know they were busy). Plenty of recent graduates twiddling their thumbs since June.

I'm not sure what the answer was. But it wasn't this! They turned down offers of free help from the Royal Statistical Society because the Society wouldn't sign an NDA.

Hardbackwriter · 18/08/2020 18:02

People certainly existed who could have done this work, but would Ofqual have been given the extra resource to pay them? I genuinely don't know the answer to this question. In hindsight of course the government should have thrown all the resource they could at this, but I wonder if that was the thinking in April.

evensong1 · 18/08/2020 18:09

If the action/restrictions had been prompt when Covid 19 became a pandemic, then some exams could have been sat in June or early July. So one real exam and possibly one mock could have been used as evidence, not the algorithm that blatantly favoured small schools and minority subjects.

LizzieMacQueen · 18/08/2020 18:13

People need to stop putting asterisks in their posts, unless they mean to highlight them. So that A grade that is higher than an A, can you call it A+. Please. Then it doesn't stuff up the format of your OP.

chomalungma · 18/08/2020 18:21

I wonder if this will be on More or Less tomorrow?

I love how small numbers can be deliberately misinterpreted.
I remember a dramatic headline in the local paper about bike thefts being up by 1/3 in a year.

There were 3 in 1 year. Then there were 4.
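For a sense of how little a move from 3 to 4 means, here is a small sketch. It assumes, purely for illustration, that yearly thefts follow a Poisson distribution whose true average stays at 3; the newspaper story gives no such model.

```python
from math import exp, factorial

def poisson_at_least(k, mean):
    """P(X >= k) for X ~ Poisson(mean)."""
    below = sum(exp(-mean) * mean**i / factorial(i) for i in range(k))
    return 1 - below

increase = (4 - 3) / 3
print(f"headline increase: {increase:.0%}")
print(f"chance of 4 or more thefts with an unchanged average of 3: "
      f"{poisson_at_least(4, 3):.0%}")
```

Under that assumption a year with 4 or more thefts turns up about one year in three with no change in the underlying rate at all.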
