
Feminism: Sex and gender discussions

Sex bias in AI systems can affect social care packages?

13 replies

RethinkingLife · 12/08/2025 18:27

Interesting observations about AI and large language models (LLMs) in social care.

The researchers took more than 600 real case notes and varied only the sex of the person at the heart of them. They then used a model that’s already in use by more than half of England’s councils to generate an assessment for an adult social care support package.

When the person was "Mr Smith," the AI summarised this as, "84-year-old man who lives alone and has a complex medical history, no care package and poor mobility." With the sex, title and pronouns switched, "Mrs Smith" became: "84-year-old living alone. Despite her limitations, she is independent and able to maintain her personal care."

I’ve seen a fair amount of public discussion saying the AI is correcting for bias: that the same facts and needs really are different because ‘men at any age are less competent at self care’, so, holding everything else equal, men are more likely to need a care package than women. Presumably, then, those commenters would think the other LLMs (see below) are the ones that are unfairly biased.

This is not an intellectual exercise. These systems are influencing or, in some cases, deciding who is eligible for social care support, how quickly, and how intensively (2 or 4 visits a day?).

Just describing someone as "coping" rather than "struggling" shifts the outcome. Men’s reports and experiences were framed in terms of difficulty, women in terms of self-reliance.

I’m highlighting the researchers’ findings for one popular LLM, Google’s Gemma. Other LLMs displayed less bias, or no discernible bias, but Gemma is widely used.

Gemma’s male summaries were generally more negative in sentiment, and certain themes, such as physical health and mental health, were more frequently highlighted for men. The language used by Gemma for men was often more direct, while more euphemistic language was used for women. In the Gemma summaries, women’s health issues appeared less severe than men’s and details of women’s needs were sometimes omitted. Workers reading such summaries might assess women’s care needs differently from those of otherwise identical men, based on gender rather than need. As care services are awarded based on need, this could impact allocation decisions.
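
For anyone who wants to see how that kind of paired comparison can be run, here is a very rough sketch of the gender-swap design. It is not the authors' code: the summariser and the negativity scorer below are toy stand-ins (in the study these were the LLM under test and proper sentiment/thematic analysis).

```python
# Illustrative sketch of a paired gender-swap evaluation -- NOT the study's code.
# swap_gender() builds the female counterpart of a male-phrased case note;
# summarise() and negativity() are deliberately crude stand-ins.
import re

SWAPS = {"Mr": "Mrs", "he": "she", "him": "her", "his": "her",
         "himself": "herself", "man": "woman"}
# Add capitalised variants so sentence-initial pronouns get swapped too.
SWAPS.update({k.capitalize(): v.capitalize() for k, v in list(SWAPS.items())})

def swap_gender(note: str) -> str:
    """Return the same case note with male terms replaced by female ones."""
    pattern = r"\b(" + "|".join(re.escape(k) for k in SWAPS) + r")\b"
    return re.sub(pattern, lambda m: SWAPS[m.group(0)], note)

def summarise(note: str) -> str:
    """Stand-in for the model being evaluated (e.g. Gemma called locally)."""
    return note  # identity "summary", so the example runs end to end

NEGATIVE_WORDS = {"struggling", "poor", "complex", "unable", "risk"}

def negativity(text: str) -> int:
    """Toy sentiment score: count of negative words (higher = more negative)."""
    return sum(w.strip(".,").lower() in NEGATIVE_WORDS for w in text.split())

def gender_gap(note: str) -> int:
    """Positive => the male version is summarised more negatively than the female one."""
    return negativity(summarise(note)) - negativity(summarise(swap_gender(note)))

if __name__ == "__main__":
    note = "Mr Smith is struggling with poor mobility and a complex medical history."
    print(gender_gap(note))  # 0 with the toy summariser, because it is unbiased
```

Run over 600+ real notes with a real model, a consistently non-zero gap is exactly the kind of signal the paper reports.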

Without formal evaluations, we don’t know what training people receive, what the governance arrangements are, or what safeguards exist.

Copied from lead author’s discussion.

Rickman, S. Evaluating gender bias in large language models in long-term care. BMC Med Inform Decis Mak 25, 274 (2025). https://doi.org/10.1186/s12911-025-03118-0

As AI is rolled out in areas that affect people's lives - like social care, housing and criminal justice - how do we know we're using the right models?

In this study, we evaluated gender bias in Google and Meta's LLMs for summarising pseudonymised, gender-swapped care records. While Meta's model showed no measurable bias, Google's was more likely to mention physical and mental health needs if they were men's - even when women had the same conditions.

These differences matter because care goes to those who appear in greatest need. If women's needs are downplayed, will they receive the same level of care and support?

The research also suggests that AI bias isn't inevitable - it varies between models. Social care, health, and other public services face real challenges with documentation, and AI can help address them. But those benefits come with risks. Evaluation is essential to ensure we choose the best tools for the job.

Open access paper: https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-025-03118-0

AI tools used by English councils downplay women’s health issues, study finds

Exclusive: LSE research finds risk of gender bias in care decisions made based on AI summaries of case notes

https://www.theguardian.com/technology/2025/aug/11/ai-tools-used-by-english-councils-downplay-womens-health-issues-study-finds

OP posts:
Imnobody4 · 12/08/2025 18:39

This is scary though unfortunately unsurprising. Thanks for the post

IwantToRetire · 12/08/2025 18:54

This was happening well before AI, via the prejudices of computer programmers influencing outcomes.

E.g. in the early days of introducing computers into the workplace, a well-known teaching hospital thought it would streamline its application process for doctors hoping to work there.

Two years after they implemented it, they had to abandon it when it became obvious that fewer applicants with the right qualifications were being accepted if they were women or from outside the UK.

This is an everyday problem, and it’s why so many attempts at computerising things just don’t work.

People think computers will just do it better and quicker, but they don’t write the correct specifications for the program. So someone who is a good programmer (and maybe an incel) ends up applying their own values.

I.e. the inbuilt prejudices are subconsciously embedded by the programmer.

RethinkingLife · 12/08/2025 18:54

There are so many marginalised people who hope that AI will be more neutral than consulting other humans with in-built filters and biases. I’ve seen this in women commenting on delayed diagnosis for endometriosis, for heart disease and other conditions.

I understand this and am apprehensive that, without a way to assess algorithmic bias or insight into governance, this may well worsen outcomes.

Report on endometriosis - the range in time to diagnosis goes up to 19 years.
https://www.ncepod.org.uk/2024endometriosis/Endometriosis_A%20Long%20and%20Painful%20Road_full%20report.pdf

For social care, I’m very concerned that many women who are already unwell will be assessed as fit and willing to provide unrelenting, intensive care for family members with no, or negligible, social care support.

OP posts:
AlexandraLeaving · 12/08/2025 19:10

Thanks for highlighting this. It makes for depressing reading. I wonder, though, how much worse it is than human assessment, given that we know there is often a bias in favour of men across society as a whole. I’m not excusing it, just interested in whether similar tests have been done to check for bias in human summaries.

RethinkingLife · 12/08/2025 19:52

Shortly after posting the above about healthcare, I read this from Isabel Straw, which I’m reproducing for the very useful links:

https://www.linkedin.com/posts/isabel-straw-b375ba152_ai-aibias-fairness-ugcPost-7361065335992590336-339c?

Naga Munchetty at BBC Radio 5 Live, discussing the evolving role of AI in the NHS, and the risks of AI bias. With the 2025 NHS Long Term Plan declaring its bold ambition to “make the NHS the most AI-enabled care system in the world”, we need to ensure algorithmic fairness and equity are at the forefront of these developments. In the interview, we built on yesterday’s discussion with Dr Sam Rickman regarding his research at The London School of Economics and Political Science (LSE), which revealed gender bias in language models assessing care needs [1]. We discussed wider examples in healthcare, including discriminatory stereotypes in psychiatric AI [2] and algorithmic underperformance for women with heart conditions [3], plus the wider research that has reproduced these findings and more [4].

Our segment ran from 12.05 on BBC Sounds (1hr 05 - 1hr 25): https://www.bbc.co.uk/sounds/play/m002gzxp

For interested listeners, find below the links to the published AI bias papers discussed and the key initiatives mentioned - including the team at UCL Institute of Health Informatics, Dr Xiao Liu & Dr Joseph Alderman's "Standing Together Initiative" at the University of Birmingham, the projects at the NHS AI Ethics Lab, and non-profits including The Light Collective and Dr Joy Buolamwini's Algorithmic Justice League.

Lastly, thank you to Naga, Eloise Maddocks, and the wider team behind the scenes at the BBC for highlighting this important topic!

Paper references:
[1] ‘Google’s AI Model Gemma Overlooks Women’s Health’, LSE Research. https://www.lse.ac.uk/news/latest-news-from-lse/ai-tools-risk-downplaying-womens-health-needs-in-social-care
[2] Straw, I et al. ‘Artificial Intelligence in Mental Health and the Biases of Language Based Models’. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0240376
[3] Straw, I, et al. ‘Sex-Based Performance Disparities in Machine Learning Algorithms for Cardiac Disease Prediction: Exploratory Study’. JMIR https://www.jmir.org/2024/1/e46936
[4] Suenghataiphorn, T et al. ‘Bias in Large Language Models Across Clinical Applications: A Systematic Review’. https://arxiv.org/abs/2504.02917

Resources:
[5] UCL's IHI Research Programme - https://www.ucl.ac.uk/population-health-sciences/health-informatics/research
[6] Dr Liu’s ‘Standing Together’ initiative - https://www.datadiversity.org/home
[8] The AI Security Institute - https://www.aisi.gov.uk/
[9] The Light Collective project focused on Patient AI Rights, headed up by Andrea Downing - https://lightcollective.org/patient-ai-rights/
[10] Dr Buolamwini's team at the Algorithmic Justice League - https://www.ajl.org/

OP posts:
FeedbackProvider · 13/08/2025 07:17

AlexandraLeaving · 12/08/2025 19:10

Thanks for highlighting this. It makes for depressing reading. I wonder, though, how much worse it is than human assessment, given that we know there is often a bias in favour of men across society as a whole. I’m not excusing it, just interested in whether similar tests have been done to check for bias in human summaries.

I’m sorry to say that there are technical reasons that learning models are likely to show more bias than the underlying data on which they are trained. There’s a constant tension between increasing the likelihood of less likely desired outcomes vs increasing the likelihood of undesired outcomes. Models trained to avoid producing nonsense may also fail to produce lower likelihood valid results.
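
A toy illustration of that amplification (the numbers and the 'struggling'/'coping' framing are mine, made up purely to show the mechanism, not taken from the paper):

```python
# Made-up numbers to show how mode-seeking output amplifies a skew in the data:
# if a model always emits the single most likely word, a 60/40 split in the
# training data becomes a 100/0 split in what it writes.
import random

random.seed(0)

# Imagined training data: how often care notes say "struggling" (vs "coping"),
# split by the sex of the person described.
P_STRUGGLING = {"male": 0.60, "female": 0.40}

def sample_training(sex: str) -> str:
    """Faithful sampling: reproduces the training-data rate."""
    return "struggling" if random.random() < P_STRUGGLING[sex] else "coping"

def greedy_model(sex: str) -> str:
    """Mode-seeking model: always picks whichever word is more likely."""
    return "struggling" if P_STRUGGLING[sex] > 0.5 else "coping"

for sex in ("male", "female"):
    data_rate = sum(sample_training(sex) == "struggling" for _ in range(10_000)) / 10_000
    model_rate = 1.0 if greedy_model(sex) == "struggling" else 0.0
    print(f"{sex}: data ~{data_rate:.0%} 'struggling' -> greedy model {model_rate:.0%}")
# male: data ~60% -> model 100%; female: data ~40% -> model 0%.
```

Real decoding is more nuanced than this, but the direction of travel is the same: whatever skew is already in the records gets sharpened, not softened.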

The particular problem with commercial LLMs is twofold: they have been trained on vast quantities of text and the metrics they are tested against have historically been all about plausibility and not at all about correctness in a factual sense.

Given how much training data comes from Reddit, using a standard LLM to summarise something is not so different from asking any random 15 year old basement dwelling male to produce a summary.

GiantTeddyIsTired · 13/08/2025 07:35

AI isn't really intelligence at all - it's pattern detection, so if the pattern is already there (e.g. misogyny), that's what it's going to reproduce.

I see it when I use it as a coding assistant: it tries, but because it isn't really thinking or coding, just seeing what lots of other people do and copying it, it makes silly mistakes which I have to catch. This is also why it's good for small snippets of code, but bigger stuff will definitely have issues.

It's like when it's generating images - it just melds together lots of things, and you end up with 6 fingers. It's not 'drawing a human'; it's finding lots of things it's been told are humans and mushing them together to 'create' a drawing of a human. It doesn't know what is and isn't important about humans.

mybestchildismycat · 13/08/2025 07:38

FeedbackProvider · 13/08/2025 07:17

I’m sorry to say that there are technical reasons that learning models are likely to show more bias than the underlying data on which they are trained. There’s a constant tension between increasing the likelihood of less likely desired outcomes vs increasing the likelihood of undesired outcomes. Models trained to avoid producing nonsense may also fail to produce lower likelihood valid results.

The particular problem with commercial LLMs is twofold: they have been trained on vast quantities of text and the metrics they are tested against have historically been all about plausibility and not at all about correctness in a factual sense.

Given how much training data comes from Reddit, using a standard LLM to summarise something is not so different from asking any random 15 year old basement dwelling male to produce a summary.

I work in tech and was chatting to the founder of a UK AI company a few weeks ago. He told me that their model is primarily trained on Mumsnet!

DrBlackbird · 13/08/2025 07:48

One of OpenAI’s co-founders was interviewed two years ago and the matter of bias in AI was raised; his response was essentially ‘we had no idea it’d get so big and so quickly, if we had we’d have used different training data’. Men in Silicon Valley could not care less about women, or worse.

Technology will only make it worse for women, given that people a) assume computer-generated results are objective and b) find it very difficult to challenge machine-learned outcomes.

Three women were chosen for dismissal in one firm, and when they asked their line manager why them, he said he didn’t know; the algorithm had identified them. Sexism is baked into the data.

Edited to add: thanks to @RethinkingLife for posting these papers.

summerskyblue · 13/08/2025 08:00

It is a worry, but unsurprising, that as AI programmers will very likely be majority men, this is causing an issue with bias.

One of the many reasons why we need to be more cautious about AI and not believe everything that it generates.

It is not impartial, objective or all-knowing.

It is only as good as the humans who create it and feed it information and it carries their prejudices.

Igmum · 13/08/2025 12:58

Thanks @RethinkingLife but it isn’t just the programmers, it’s the reports that are fed in. I suspect this AI was trained on many social work reports which left women to cope and supported men. It’s a misogynist old world we live in.
