
Mumsnet Corpus

1000 replies

TokyoBouncyBall · 19/04/2024 11:36

Not a TAAT, but a bit of googling as a result of a now deleted thread has led me to this:

https://fold.aston.ac.uk/handle/123456789/18

I note it says that the License is uncertain. Can you confirm that you have given permission for posts to be used in this way, or is there something that Aston might like to look into?

I note it says Users who wish to access this dataset must make a detailed application to FoLD and the researcher, as well as potentially gain additional agreement from an external organisation before they can be approved for access.

Given one of the uses it is being put to, I think it is a bit dubious to say the least.

Ereshkigalangcleg · 23/04/2024 16:03

Have they accessed our real names/emails/locations too?

It's unlikely that they can get that information from Mumsnet unless it was posted on a thread by the poster. See Justine's post.

Cauliflowery · 23/04/2024 16:03

Ereshkigalangcleg · 23/04/2024 15:45

@JustineMumsnet

Please ask them to clarify how they know via the model whether the "linguistic transphobia" is original or reported speech, and how they take account of the fact that people are constantly quoting other people and media articles, both posted on the thread and offline. If there was one comment, it might be quoted by 20 people. That's not 20 separate uses of the "transphobic" term in any meaningful sense.

Point raised by @VitoCorleoneOfMNMafia

Even on occasions where MN have given permission for a university to use the data we provide, it seems really inappropriate for the university to approve women's data to be used for a research project that demonstrates such a huge misogynist bias. It implies pretty much anyone at Aston could get access to the data if they want it.

(Imagine a university ethics board backing an openly racist PhD using data from Lipstick Alley, or an openly homophobic PhD using data from gay Reddit forums).

Why aren't Aston encouraging their researchers to deal with their unconscious bias before having access to data created by a marginalised subset of society?

AlisonDonut · 23/04/2024 16:04

We need to know all the reports that have been created using our data and have access to them and they need to delete the damn database.

Encyclopediaofnonsense · 23/04/2024 16:04

JustineMumsnet · 23/04/2024 15:16

Update - Aston Uni have responded and offered a call with their Vice Chancellor to explain the reasons for the research, how they manage ethical approval and protect privacy and data. I'll be taking them up on that and putting some of our own (and your) questions. Will report back!

Can you assure us you're not going to seek to make a profit or financial settlement from this, but will endeavour to seek the best outcome for your site users?

Ereshkigalangcleg · 23/04/2024 16:04

Why aren't Aston encouraging their researchers to deal with their unconscious bias before having access to data created by a marginalised subset of society?

I'd like to know that too. I imagine we can complain to the ethics department of Aston?

C8H10N4O2 · 23/04/2024 16:16

JustineMumsnet · 23/04/2024 15:16

Update - Aston Uni have responded and offered a call with their Vice Chancellor to explain the reasons for the research, how they manage ethical approval and protect privacy and data. I'll be taking them up on that and putting some of our own (and your) questions. Will report back!

CDA approach to language analysis is fundamentally problematic due to the significant impact of subjective decisions made by the researchers.

This isn't just about pro/anti/dontcare on trans issues, it applies to any model built where subjective researchers start with a strong opinion on the subject. Methodologically the risk is saying "I define X, I say X is bad, this is X, now use that pattern to find more bad, oh look I have found bad". The whole approach of CDA tends to be politicised and in this particular case there are clear and strong pre-existing views - it's hard to see the credibility. Using this approach to build language models to sell or licence for NLP use is lucrative but against most Responsible AI guidelines, and in areas which are politically or socially controversial such models can be particularly discriminatory. It's a real challenge in building models, even for sentiment analysis in this type of space.

The issues of Author Authenticity and potential for abuse are, I think, already covered by the thread.

VitoCorleoneOfMNMafia · 23/04/2024 16:18

Ereshkigalangcleg · 23/04/2024 15:45

@JustineMumsnet

Please ask them to clarify how they know via the model whether the "linguistic transphobia" is original or reported speech, and how they take account of the fact that people are constantly quoting other people and media articles, both posted on the thread and offline. If there was one comment, it might be quoted by 20 people. That's not 20 separate uses of the "transphobic" term in any meaningful sense.

Point raised by @VitoCorleoneOfMNMafia

Reposting:

you have people blockquoting other posters prior to the Quote button being introduced, so you can't assume that a given post contains only material written by that poster.

the "recentsy primer" (my spelling might be wrong) [this refers to something mentioned in the 2019 talk video] might well be an artifact of posters quoting PPs. It's the most plausible explanation I can think of to explain a user adopting a phrase for only one thread and never using it again.

Pro-tip, Aston researchers: when A quotes B, that's B's words, not A's, not even "A's for the duration of the thread".

My per diem for tearing holes in your research is £1000 to Women's Aid.
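
To make the quoting point concrete, here is a minimal sketch in Python. It is not Aston's method, which the thread does not describe. It assumes, purely for illustration, that each post is a dict with hypothetical `author` and `text` keys and that a quote is a paragraph repeated verbatim from earlier in the thread. It compares a naive per-post count of a term with a count that credits each paragraph only to the poster who wrote it first.

```python
from collections import Counter


def original_paragraphs(posts):
    """Yield (author, paragraph) for paragraphs not seen earlier in the thread.

    Assumption: a paragraph repeated verbatim (ignoring case and spacing)
    is a quote, so it belongs to whoever posted it first.
    """
    seen = set()
    for post in posts:
        for para in post["text"].split("\n\n"):
            key = " ".join(para.lower().split())
            if not key or key in seen:
                continue  # empty, or a verbatim repeat of an earlier paragraph
            seen.add(key)
            yield post["author"], para


def count_term_uses(posts, term):
    """Count, per author, the original paragraphs that contain the term."""
    counts = Counter()
    for author, para in original_paragraphs(posts):
        if term.lower() in para.lower():
            counts[author] += 1
    return counts


# Hypothetical thread: A and C both blockquote B before replying.
thread = [
    {"author": "B", "text": "I think the phrase is fine."},
    {"author": "A", "text": "I think the phrase is fine.\n\nI disagree with B."},
    {"author": "C", "text": "I think the phrase is fine.\n\nSame here."},
]

# Naive count: every post containing the term is treated as a separate use.
naive = sum("phrase" in p["text"].lower() for p in thread)
attributed = count_term_uses(thread, "phrase")
```

Here the naive count sees three uses, while the attributed count credits one use to B only. Verbatim matching is the weakest possible test: it would miss trimmed or paraphrased quotes and anything quoted from media articles or offline, so a real analysis would need something considerably more careful.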

AmaryllisNightAndDay · 23/04/2024 16:19

Did they scrape the MN disability boards?

Ereshkigalangcleg · 23/04/2024 16:21

I think they scraped the whole site, for their "sandbox" rather than particular boards. We know they scraped posts about fertility treatment and dieting.

AmaryllisNightAndDay · 23/04/2024 16:22

Ereshkigalangcleg · 23/04/2024 16:21

I think they scraped the whole site, for their "sandbox" rather than particular boards. We know they scraped posts about fertility treatment and dieting.

Probably yes, but I would like @JustineMumsnet to ask so that he is aware.

The disability boards are slightly protected, they don't appear in trending etc.

Sparklybutold · 23/04/2024 16:23

JustineMumsnet · 23/04/2024 15:16

Update - Aston Uni have responded and offered a call with their Vice Chancellor to explain the reasons for the research, how they manage ethical approval and protect privacy and data. I'll be taking them up on that and putting some of our own (and your) questions. Will report back!

Given the obvious anger and uncertainty this has caused many MN members, could I suggest you also take a MN user with you? I would propose @ArabellaScott or @Boiledbeetle. As a user, I feel the positionality of this research warrants assurances that the topic area it has crashed into is appropriately scrutinised.

I mention above as they are the most well versed in this area. I'm sure there are others.

@JustineMumsnet ? What are your thoughts?

Also apologies to @Boiledbeetle and @ArabellaScott for throwing you the lit branch which you may not want?

AgathaAllAlong · 23/04/2024 16:24

I would really want to understand how the system works. What is being scraped, and how? Who owns the database? How do they decide who will get to use it?

Ask them if you can access the database.

It does rather sound like they're hoping this will go away and they can carry on after the call....

Ereshkigalangcleg · 23/04/2024 16:25

Probably yes, but I would like @JustineMumsnet to ask so that he is aware.

Absolutely.

PerkingFaintly · 23/04/2024 16:26

VitoCorleoneOfMNMafia · 23/04/2024 15:27

Given we had no knowledge our site was being scraped (against our T&Cs) and the data used in this way, surely the only question that matters is "have you deleted the scraped data yet?"

Yeah this.

Will await results of convo, obvs.

But VC wants "to explain the reasons for the research, how they manage ethical approval and protect privacy and data" sounds like, "We did something illegal, we're not sorry, and we're going to yap at you until you give up and go away".

Ereshkigalangcleg · 23/04/2024 16:28

But VC wants "to explain the reasons for the research, how they manage ethical approval and protect privacy and data" sounds like, "We did something illegal, we're not sorry, and we're going to yap at you until you give up and go away".

It does.

PerkingFaintly · 23/04/2024 16:32

I'm getting flashbacks to dealing with the builders who fucked up my house. Hmm

shockeditellyou · 23/04/2024 16:35

Fucking A, Mumsnet!

I’d also be interested in Aston’s (no doubt highly comprehensive) training for researchers dealing with data…..

Ereshkigalangcleg · 23/04/2024 16:37

I'm not sure we are going to go away, whatever the results of this conversation. The research exemption to explicit consent in the DPA 2018 hinges on the research being ethical and not liable to cause distress.

https://ico.org.uk/media/for-organisations/documents/1061/anonymisation-code.pdf

Chapter 9 and the appendices deal with research.

Riva5784 · 23/04/2024 16:38

PerkingFaintly · 23/04/2024 16:26

Yeah this.

Will await results of convo, obvs.

But VC wants "to explain the reasons for the research, how they manage ethical approval and protect privacy and data" sounds like, "We did something illegal, we're not sorry, and we're going to yap at you until you give up and go away".

Yep. The VC has probably been advised by their lawyers not to admit to anything in writing.

GCITC · 23/04/2024 16:38

If there's one thing Aston has learnt, it's not to piss off the vipers found on the feminism board.

I'm really looking forward to someone's PhD thesis on the subject.

Grin
everythingthelighttouches · 23/04/2024 16:39
  • correction of technical points

Separately to the issue of whether Aston inadvertently re-identify previously pseudonymised data,

I would just like to add that in order to process special category data under UK GDPR any institution needs to have a lawful basis AND a condition for processing.

I expect Aston will say their lawful basis is “public task” under Article 6(1)(e), and that their condition for processing is Article 9(2)(e), “made public by the data subject”.

However, it is worth noting that Mumsnet or any of the subjects can still challenge this on the basis of whether the processing is fair.

I would argue it is not fair because the research title makes the assumption that the discourse is transphobic.

Additionally, they need to consider the subject’s reasonable expectations for further use of the data.

PerkingFaintly · 23/04/2024 16:40

They like MN language, so I hope Justine offers them the classic specimen of "No", as a complete sentence...

Encyclopediaofnonsense · 23/04/2024 16:47

Any of their data, once published, will be identifiable if the post is distinctive enough. All people have to do is paste the text into Google and they've found the poster.

Ereshkigalangcleg · 23/04/2024 16:48

Hasn't Eden Palmer already explained the reason for the research? To study "transphobic hate crimes on Mumsnet" as she said in her now deleted LinkedIn post. Not just transphobia, "transphobic hate crimes".

Astontacious · 23/04/2024 16:49

Ereshkigalangcleg · 23/04/2024 16:03

Have they accessed our real names/emails/locations too?

It's unlikely that they can get that information from Mumsnet unless it was posted on a thread by the poster. See Justine's post.

I would still like them to confirm it. The only definitive way they can test their theories is by having our real data. Since they scraped the site illegally, and they are computer bods and people that didn’t like working at mumsnet have been quoted in this thread, it seems a reasonable question to ask.
