Mumsnet Corpus

1000 replies

TokyoBouncyBall · 19/04/2024 11:36

Not a TAAT, but a bit of googling as a result of a now deleted thread has led me to this:

https://fold.aston.ac.uk/handle/123456789/18

I note it says that the License is uncertain. Can you confirm that you have given permission for posts to be used in this way, or is there something that Aston might like to look into?

I note it says Users who wish to access this dataset must make a detailed application to FoLD and the researcher, as well as potentially gain additional agreement from an external organisation before they can be approved for access.

Given one of the uses it is being put to, I think it is a bit dubious to say the least.

VitoCorleoneOfMNMafia · 23/04/2024 14:35

@JustineMumsnet @LilyMumsnet Did you know that Lancaster Uni pulled a similar stunt?

wp.lancs.ac.uk/forge/2022/10/28/liu-assessing-hybrid-identities-in-online-extremist-communities-through-sociolinguistic-styles/

VitoCorleoneOfMNMafia · 23/04/2024 14:47

Encyclopediaofnonsense · 23/04/2024 14:43

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7499161/

No indication UCL got Mumsnet agreement either.

Et tu, UCL? Then fall, Vito!

RethinkingLife · 23/04/2024 14:52

Encyclopediaofnonsense · 23/04/2024 14:43

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7499161/

No indication UCL got Mumsnet agreement either.

UCL might consider the processes they've listed to be adequate. Some might dissent.

The initial data set consisted of 50 threads that contained over 750 posts. The threads dated from February 4, 2019, to June 6, 2019.

Parsehub is a freely available web-based scraping tool designed to extract internet data. Any original posts or comments including information that could potentially identify the user, such as age, name, or location, were omitted manually by the researchers before the data were analyzed.

Parsehub may be freely available but that's irrelevant if MNHQ's T&Cs have always forbidden webscraping.

The authors have also used non-attributed quotations from the threads.

This study is not without its flaws. There is a lack of demographic data available about Mumsnet users because of both the anonymized nature of the site and ethical restrictions that required any identifiable information such as age or occupation to be removed. Thus, inferences cannot be made about whether sample characteristics may have influenced the results.

I'm particularly taken aback to discover that

This work is supported by the National Institute for Health Research (NIHR) Great Ormond Street Hospital Biomedical Research Centre.

Where was their oversight of this before agreeing to the funding?

VitoCorleoneOfMNMafia · 23/04/2024 14:55

RethinkingLife · 23/04/2024 14:52

UCL might consider the processes they've listed to be adequate. Some might dissent.

The initial data set consisted of 50 threads that contained over 750 posts. The threads dated from February 4, 2019, to June 6, 2019.

Parsehub is a freely available web-based scraping tool designed to extract internet data. Any original posts or comments including information that could potentially identify the user, such as age, name, or location, were omitted manually by the researchers before the data were analyzed.

Parsehub may be freely available but that's irrelevant if MNHQ's T&Cs have always forbidden webscraping.

The authors have also used non-attributed quotations from the threads.

This study is not without its flaws. There is a lack of demographic data available about Mumsnet users because of both the anonymized nature of the site and ethical restrictions that required any identifiable information such as age or occupation to be removed. Thus, inferences cannot be made about whether sample characteristics may have influenced the results.

I'm particularly taken aback to discover that

This work is supported by the National Institute for Health Research (NIHR) Great Ormond Street Hospital Biomedical Research Centre.

Where was their oversight of this before agreeing to the funding?

No part of the site may be distributed, scraped or copied for any purpose without express approval.

This means you, academics.

There's something familiar about predominantly male researchers scraping a site used by predominantly women and owned by a woman without even checking that the T&C permit that or just ignoring the T&C.

Ah yes. Male entitlement and total disregard for women. Again.

CornishPorsche · 23/04/2024 15:01

From the Exeter study:

"Proof of concept: Study 1 data. The online forum data for training our model were gathered from the online website Mumsnet UK (www.mumsnet.com/talk), the largest parent online network in the UK, with the kind permission of Mumsnet UK. This site provides different sub-forums in which users can discuss particular topics and themes. We analyzed posts from two sub-forums, “Being a Parent” and “Feminism”. The posts were collected in September 2012 from 2500 threads per sub-forum. Every person who wishes to contribute to Mumsnet UK is required to create a user account with a unique user ID. Hence, posts from the same author can be matched by the user ID, irrespective of the sub-forum in which they were posted".

Would have been nice to know we were being used as guinea pigs when MN gave them permission.

DrSpartacular · 23/04/2024 15:04

Every person who wishes to contribute to Mumsnet UK is required to create a user account with a unique user ID. Hence, posts from the same author can be matched by the user ID, irrespective of the sub-forum in which they were posted.

@JustineMumsnet are you able to confirm that no researchers have had access to our unique user IDs please?

stealthninjamum · 23/04/2024 15:07

Placemarking. This just feels so sinister.

JustineMumsnet · 23/04/2024 15:13

DrSpartacular · 23/04/2024 15:04

Every person who wishes to contribute to Mumsnet UK is required to create a user account with a unique user ID. Hence, posts from the same author can be matched by the user ID, irrespective of the sub-forum in which they were posted.

@JustineMumsnet are you able to confirm that no researchers have had access to our unique user IDs please?

Yes I can confirm that all researchers would be able to scrape are posts on the website and no other info.

JustineMumsnet · 23/04/2024 15:16

Update - Aston Uni have responded and offered a call with their Vice Chancellor to explain the reasons for the research, how they manage ethical approval and how they protect privacy and data. I'll be taking them up on that and putting some of our own (and your) questions. Will report back!

DrSpartacular · 23/04/2024 15:18

Thanks @JustineMumsnet

Maybe Aston could do a webchat/Q&A?

ADoggyDogWorld · 23/04/2024 15:24

Thank you Justine, I hope the call provides satisfactory answers.

VitoCorleoneOfMNMafia · 23/04/2024 15:27

Given we had no knowledge our site was being scraped (against our T&Cs) and the data used in this way, surely the only answer that matters is to "have you deleted the scraped data yet?"

TokyoBouncyBall · 23/04/2024 15:32

DrSpartacular · 23/04/2024 15:18

Thanks @JustineMumsnet

Maybe Aston could do a webchat/Q&A?

I agree. I think it should be a condition of any settlement!

Tallisker · 23/04/2024 15:37

And please ask them for a truthful answer on why they cancelled Thursday's event.

Garlicked · 23/04/2024 15:42

Theeyeballsinthesky · 19/04/2024 17:17

votes for women!

(except mean TERFS who won’t accept TWAW amirite?)

I wonder how much they’ve looked into how newspapers of the time reported on the suffragettes and compared it to how GC women & men are talked about now, and if that’s possibly jogged any thoughts in their brain….

This is a really good point - one I've only seen made by Mumsnetters and Glosswitch - which would form a valid foundation for a thesis.

AmaryllisNightAndDay · 23/04/2024 15:44

Thank you @JustineMumsnet

Basic questions like how long has this been going on, how many times has data been scraped, which groups.

Is he aware that women discuss the most personal, private and intimate details of their own and their children's lives, including mental health, sex, sexuality, and physical, emotional and sexual abuse, as well as discussing the politically sensitive matters that also interest Aston? Does he agree that it is not appropriate to use such sensitive material for research into linkage and de-anonymisation?

I want to be reassured that Aston university understands that women who post on MumsNet deserve the very highest level of privacy, security and respect, and that this is reflected in Aston's ethics procedures and fully implemented. That includes an appropriate level of informed consent to use any of our data.

I'd like to know what measures are taken to protect individuals who have posted under their own names or who can easily be identified.

And will Aston review the collection and use of the data, and will they delete the dataset altogether if they have failed to gather it ethically or to manage it correctly? Are they going to report themselves to the ICO?

And please do remind him not to mess with the Vipers.

IDoNotConsentToAstonResearch · 23/04/2024 15:44

Oh to be a fly on the wall!
I look forward to hearing how he justifies it.

Ereshkigalangcleg · 23/04/2024 15:45

@JustineMumsnet

Please ask them to clarify how they know via the model whether the "linguistic transphobia" is original or reported speech, and how they take account of the fact that people are constantly quoting other people and media articles, both posted on the thread and offline. If there was one comment, it might be quoted by 20 people. That's not 20 separate uses of the "transphobic" term in any meaningful sense.

Point raised by @VitoCorleoneOfMNMafia

everythingthelighttouches · 23/04/2024 15:46

@JustineMumsnet

I hope Aston can reassure you that their software AND storage systems take steps to remove any identifying information about subjects.

What is considered identifying is probably debatable.

A lot of the information in these types of threads is likely special category data which requires a higher standard of treatment.

AGlinnerOfHope · 23/04/2024 15:51

I love this site. So much.

Ereshkigalangcleg · 23/04/2024 15:57

Philosophical beliefs are special category data, as well as the more personally identifiable stuff. And as we all know, and Aston might like to consider, gender critical beliefs, and separately a lack of belief in gender identity, are considered worthy of respect in a democratic society and protected under the Equality Act 2010. So the bar really needs to be high for "transphobia".

ArabellaScott · 23/04/2024 15:57

Thanks, @JustineMumsnet.

My question: Is there any reason we should not contact the ICO over this unethical and dubious data scraping?

Astontacious · 23/04/2024 16:01

Have they accessed our real names/emails/locations too?

Why did they do this without asking?

Why is it that they have decided mumsnet is transphobic in the PhD title?

This thread is not accepting new messages.