
Mumsnet Corpus

1000 replies

TokyoBouncyBall · 19/04/2024 11:36

Not a TAAT, but a bit of googling as a result of a now deleted thread has led me to this:

https://fold.aston.ac.uk/handle/123456789/18

I note it says that the License is uncertain. Can you confirm that you have given permission for posts to be used in this way, or is there something that Aston might like to look into?

I note it says Users who wish to access this dataset must make a detailed application to FoLD and the researcher, as well as potentially gain additional agreement from an external organisation before they can be approved for access.

Given one of the uses it is being put to, I think it is a bit dubious to say the least.

Whinge · 24/04/2024 10:40

It is not satisfactory just to take Aston's position on this. We need independent verification of this, from the experts. This is the ICO's job. That's what it is there for.

I agree. I appreciate it's a developing situation and we're not privy to all the details, but from this side of the screen it doesn't seem like MNHQ are taking this as seriously as they should.

IncompleteSenten · 24/04/2024 10:41

Of course they're going to double down on this. They won't want to accept any wrongdoing.

IncompleteSenten · 24/04/2024 10:42

ArabellaScott · 24/04/2024 10:38

They don't get to scapegoat the student, btw. She's just been the canary.

Absolutely.

KellieJaysLapdog · 24/04/2024 10:42

ArabellaScott · 24/04/2024 10:38

They don't get to scapegoat the student, btw. She's just been the canary.

Absolutely. Please do let Aston know that MN users will not accept the VC at Aston University attempting to throw a young adult female student under the bus (even if said student doesn’t identify as a woman and thinks we’re all a bunch of nasty terfs).

Everyone needs to be held to account, not just the person at the very bottom of the totem pole.

Ereshkigalangcleg · 24/04/2024 10:42

I agree Arabella. She's responsible for her own PhD choice but most of this stuff has been signed off higher up. I don't think her PhD is ethical so I don't have a problem with them looking at it. I would say they should have done a better job before they approved it though, and that isn't down to her.

Boiledbeetle · 24/04/2024 10:43

"no intention to identify individual posters from their posts."

I call double bollocks on that line.

Especially as posters here were able to find the adoption thread, with the poster's comment that was mentioned in a previous study, in double quick time, I don't think he can legitimately say that. It only took a few words to identify the adoption poster.

TokyoBouncyBall · 24/04/2024 10:44

Boiledbeetle · 24/04/2024 10:43

"no intention to identify individual posters from their posts."

I call double bollocks on that line.

Especially as posters here were able to find the adoption thread, with the poster's comment that was mentioned in a previous study, in double quick time, I don't think he can legitimately say that. It only took a few words to identify the adoption poster.

I raise you quadruple bollocks on that because it's the POINT OF THE WORK.

Winnading · 24/04/2024 10:45

AstonCanKissMyArse · 24/04/2024 08:23

Have you had a reply? What email address did you contact?

I've contacted them about it but no reply.

Edit for clarity.

No reply yet. I sent the email about 7 this morning. I imagine there are many more for the same reason. I don't expect a quick response, and it's sort of too late now. I'm more thinking about the future and other entities doing the same thing.

I am very bloody angry at Aston university, beyond livid.
Women, yet again, not thought of as actual human beings, just something to dissect in any way.

I scrolled to the bottom of the page, clicked on "contact us" and used a generic email on that page.

Ereshkigalangcleg · 24/04/2024 10:47

I had a feeling that's what Aston would say.

Boiledbeetle · 24/04/2024 10:47

TokyoBouncyBall · 24/04/2024 10:44

I raise you quadruple bollocks on that because it's the POINT OF THE WORK.

That is true! (I'm not sufficiently with it today!)

😱 Bollocks!

ArabellaScott · 24/04/2024 10:49

For the VC:

A couple of things he wanted to stress - Aston believe they have legitimate rights to use the data

I don't believe them. They have contravened MN t&cs. They've acted contrary to the ICO statement on data scraping. There are also questions about breaching users' human rights on sensitive data handling and sharing.

there is/has been no intention to identify individual posters from their posts.

The two men who created the 'corpus'/scraped the data used it as a 'sandbox' to 'play with'. They also work on software that explicitly attempts to identify 'authorship'.

He also accepted that the recent research by a first year PHD into "transphobia" may not be of the quality they expect and that he will investigate and commit to enhancements in quality if appropriate.

This isn't about 'quality', this is about legality, ethics, and data protection.

AlisonDonut · 24/04/2024 10:52

If you look at the tweet in the screenshot above, by Nikki, they are using access to this data as a means to attract students.

This is so unethical from every angle.

Winnading · 24/04/2024 10:52

TokyoBouncyBall · 24/04/2024 09:16

It all depends on why you are anonymous. I don't post about feminism on any social media under my own name because I am a visible part of a small organisation which has a policy of not getting involved in this discussion in any way. Which I think is right for what we do.

However, I let rip here, under many user names.

If my user names were linked to my real life, a) there would be seven shades of hell to pay for someone b) it would not be my fault so I cannot see that it would be any issue for my organisation.

It becomes much more difficult for people who work in places like Aston but are GC (the heavens help you all) or are on other parts of the site but are suffering abuse or DV or whatever.

So me, I will continue to let rip.

I'm anonymous because stalker. I want to stay anonymous.

My limited social media is under fake names, but it wouldn't take a genius to figure it out. Unless I change my name completely to Jane Smith and move countries, I'm findable. And the last thing I need is my stalker back.

IcakethereforeIam · 24/04/2024 10:52

Yup. Eden meet bus.

Nice bit of minimising too. Eden's supervisor is obviously on board because, well, she's the supervisor and the change in the twiX banner wasn't coincidental. And it wasn't just research into transphobia. MN was called transphobic, were committing hate crimes and, ffs, they were going to have a lecture asserting that to anyone who attended or saw the stream. I feel confident that it would have been recorded for posterity and for any random who wanted to watch it or cite it.

It's not just 'a first year PHD' student!

AmaryllisNightAndDay · 24/04/2024 10:53

VitoCorleoneOfMNMafia · 24/04/2024 10:32

Many of us change names to segment aspects of our lives or to make doxxing harder over time.

This software and techniques developed by these researchers Piotr and Krzysztof is intended to demonstrate whether two different usernames on different forums are the same person by examining how words are used. It's a tiny step of logic to decide to use it to see if two different usernames on the same forum are the same person.

As @VitoCorleoneOfMNMafia says... some of Aston's research is intended to break the same informal privacy mechanisms that MumsNet users have specifically used to protect themselves and their families. Aston don't have "ground truth" (i.e. links from posting names to registered names) but in future they may get some good probabilities, especially if they also test the techniques on other sets where they do have ground truth.

If this is successful then it is open to being turned against the women who used the mechanism in the past, without realising it could be broken in future. And if Aston hold this dataset then these women and families are vulnerable to their privacy being broken without any control but Aston's own. These women have not consented to this use and indefinite storage of their data and neither MumsNet nor Aston can consent on their behalf.

Encyclopediaofnonsense · 24/04/2024 10:55

Can someone contact the ICO if @mnhq aren't going to?

IncompleteSenten · 24/04/2024 10:57

So the purpose of the massive data collection is to see if it's possible to develop AI or programs that can identify posters across name changes and possibly across different platforms, but they have no intention to identify posters.

What's the point of the study then?

They're full of shit.

They'd better not try to push this PhD student under the bus here.

She's done us such a huge favour here we should really fight for her to not be scapegoated. Without her we may never have known what was going on. We owe her a massive thanks.

Purpel · 24/04/2024 10:57

there is/has been no intention to identify individual posters from their posts.

I would be intrigued to know how they define "identify".

Linking poster A who’s talking about feminist rights, to a name change where she talks about her kids, to another name change where she talks in Scotsnet/Craicnet about living in a certain place, to another post where she talks about her niche job, is identifying, even if it isn't literally "poster A is Jane Smith who lives on X street". Linking those posts regardless is identifying, and that seems to be the overarching research that those two scrapers do.

AgathaAllAlong · 24/04/2024 11:00

This is serious for MN. This platform is the only place I post, because I believed in the ethos and anonymous protection. I'm sure it's the same for many. If this is no longer guaranteed, well, I guess that's one way of kicking my MN habit.

KellieJaysLapdog · 24/04/2024 11:01

Surely the MAIN POINT of FORENSIC LINGUISTICS is to assist law enforcement in identifying an author?

We are the only normie, non crime-related dataset in the FoLD Repository!

RedToothBrush · 24/04/2024 11:04

ArabellaScott · 24/04/2024 10:49

For the VC:

A couple of things he wanted to stress - Aston believe they have legitimate rights to use the data

I don't believe them. They have contravened MN t&cs. They've acted contrary to the ICO statement on data scraping. There are also questions about breaching users' human rights on sensitive data handling and sharing.

there is/has been no intention to identify individual posters from their posts.

The two men who created the 'corpus'/scraped the data used it as a 'sandbox' to 'play with'. They also work on software that explicitly attempts to identify 'authorship'.

He also accepted that the recent research by a first year PHD into "transphobia" may not be of the quality they expect and that he will investigate and commit to enhancements in quality if appropriate.

This isn't about 'quality', this is about legality, ethics, and data protection.

Quite

Why would we believe anything about this that Aston says when

a) they are admitting there is a problem. A problem big enough for the VC to get involved
b) MN are saying outright it's a breach of t&c. So what is the point in the T&C's if this can just be ignored. Users feel they had certain protections which may now not be the case. Have we misunderstood or been misled?
c) the ICO say there is a problem with data scraping without consent and issued a statement on it.

We deserve better. We need to know whether we've been protected and what protections we have going forward so we can make informed decisions.

Users here need the clarity from an independent source. That is ICO involvement to draw clarity to this area.

This is a public interest issue.

I see no logical reason why neither MN nor Aston has contacted the ICO, if only to reassure people here.

KellieJaysLapdog · 24/04/2024 11:06

Encyclopediaofnonsense · 24/04/2024 10:55

Can someone contact the ICO if @mnhq aren't going to?

There is an absolute corker of an FOI on ‘What Do They Know?’, so I suggest we wait for that. It’ll definitely have to go to the ICO at some point.

https://www.whatdotheyknow.com/request/ethics_and_data_protection_for_t

I’d like to think MN know how bad this is but won’t risk showing their legal hand by posting anything opinion-based in public.

There is no way that Aston isn’t following these threads.

Ethics and data protection for the acquisition of the source material for the "Toward linguistic explanation of idiolectal variation – understanding the black box" conference talk - a Freedom of Information request to Aston University

Please send to me any and all records that you hold about the ethics approval and Data Protection Act 2018 due diligence surrounding acquisition of the source material for the "Toward linguistic explanation of idiolectal variation – understanding the b...

ArabellaScott · 24/04/2024 11:09

KellieJaysLapdog · 24/04/2024 11:01

Surely the MAIN POINT of FORENSIC LINGUISTICS is to assist law enforcement in identifying an author?

We are the only normie, non crime-related dataset in the FoLD Repository!

To be fair, they obscured the names of some of the other corpora. Mumsnet was openly named, for some reason. I'd like to know why.

SqueakyDinosaur · 24/04/2024 11:10

Whinge · 24/04/2024 10:40

It is not satisfactory just to take Aston's position on this. We need independent verification of this, from the experts. This is the ICO's job. That's what it is there for.

I agree. I appreciate it's a developing situation and we're not privy to all the details, but from this side of the screen it doesn't seem like MNHQ are taking this as seriously as they should.

This came to light a couple of days ago. The CEO of Mumsnet has today spoken to the VC of Aston. No doubt there are people at Aston scrambling to get a defensible position together, but at this stage I'm not sure what else we could expect. MN may be getting their legal team to apply for remedy.

I would hope that MN is demanding that Aston stop all data-scraping from here and delete existing data, but I think saying they aren't taking it seriously is jumping the gun here.

KellieJaysLapdog · 24/04/2024 11:15

I suspect we’ve also piqued the interest of some journalists.

Has anyone got any suggestions of MPs or Lords with a particular interest in data protection and privacy?

Just to ready the eventual letter-writing troops.
