I wasn't going to keep responding on this thread because I know the p.o.v. I'm advocating for isn't popular here, and I've never wanted to derail or disrupt threads.
I just want to say this and will then duly shut up: IMV, private or other ownership interests on the public internet shouldn't be able to gatekeep public content from public scrutiny. Think of all the sites on the public internet that ought to be open to scrutiny.
Creating T&Cs may protect a public site from being ripped for commercial use, if the site has an appetite for lawsuits, but IMV they won't protect it from public-interest journalism or academic research where approved research protocols are properly in place and complied with. I'm familiar with those, though I no longer work in HEIs.
Research protocols provide guidelines, risk assessments and hurdles for ethical approval. If university researchers don't conduct studies or handle data properly and as agreed, they will / should be internally sanctioned. If the university hasn't taken enough care to ensure compliance, it can be prosecuted. What the consequences are depends on the case. Because of the principles of public interest and academic freedom in democracies, the parameters of a study aren't simply decided by the institutions, publications or people who might be studied.
The new element for protocols is the way the public internet, specifically social media, combines citizens and publication, so protocols for data gathering fall somewhere between those for studying publications / media and those for studying groups of people. If you set up a private group hosted on the internet, whose content no-one can see unless they sign up with whatever data the organisers require, you have an expectation of privacy, just as you would if you met in person. But if the content of a site is publicly visible to consume, contributors are not private in the same way: contributors are the monetised content, and their position isn't that of private people communicating only with one another. The dilemma is how researchers can follow protocols that default to anonymity for the general public, in this case site users. This is the debate.
My view is that if data is handled well, further anonymising of users is possible. The biggest risk of breaching anonymity will always be the content users themselves publicly post. Sites can offer various aids to anonymity and to erasure of content (though the content may still persist elsewhere), and it's their responsibility not to allow the private data they hold on users to be accessible. The arguments about the status of user-contributors can go on, but a site can't be both a public resource, a monetised public entertainment or a public influence, and also private.
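For concreteness, here's roughly what I mean by "handled well": a minimal Python sketch of pseudonymising gathered posts with a salted hash, so published extracts can't be trivially matched back to handles. The field names and the per-study salt are my own illustrative assumptions, not anyone's actual method, and as I say above, it can't fix the bigger risk of self-identifying content in the posts themselves.

```python
# Minimal sketch of pseudonymising scraped posts with a salted hash.
# Field names and the per-study salt are illustrative assumptions.
import hashlib
import secrets

# One random salt per study, stored separately and never published,
# so pseudonyms can't be reversed by hashing guessed usernames.
STUDY_SALT = secrets.token_hex(16)

def pseudonymise(records):
    """Replace each handle with a stable, study-specific pseudonym."""
    out = []
    for rec in records:
        digest = hashlib.sha256(
            (STUDY_SALT + rec["user"]).encode("utf-8")
        ).hexdigest()
        out.append({"user": f"user_{digest[:12]}", "text": rec["text"]})
    return out

sample = [{"user": "alice99", "text": "a publicly posted comment"}]
print(pseudonymise(sample))
# Note: this only pseudonymises handles; it can't remove identifying
# detail users put in the text itself, which is the bigger risk.
```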
It's not possible to acquire individual consent from user-content contributors, an unidentifiable number of people contributing over time, the way it is from participants in old-fashioned field studies. You could argue this means you can't study sites at all, but I disagree, and I don't think I'm out of line with ethics panels on that. IMV the consent, or lack of consent, of site or platform owners should not decide whether a site can be scrutinised or studied.
The technologies which enable the public internet and social media aren't in the hands of its users, or even their governments, so it's important to be able to scrutinise web content independently. Public web content isn't as easy to study as printed or other physical records, just by the nature of the medium: it can take far longer and far more bodies to gather data from an ever-moving, multifaceted source of information. The biggest obstacle to getting the data is therefore time / funding, which is why researchers use software to speed up collection.

Using software to access private or hidden data is hacking and is illegal. Using software tools to retrieve and catalogue public content isn't, though what you then do with it is variously restricted. Academic researchers have access to commercial made-for-purpose software which requires user licences; this removes some independent control of the data from the researchers and is potentially problematic for that reason. Using self-built open-source tools to do the same job shouldn't IMV impede data gathering: it's not ethically different from doing it by hand, and the tools aren't the problem so much as what is done with the data. But that doesn't mean university managements won't self-censor, as they are increasingly nervous of litigation, increasingly commercial themselves and fearful of funding cuts.
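To illustrate the kind of self-built tool I mean, here's a minimal Python sketch that checks a site's robots.txt, fetches a list of publicly visible pages at a polite rate, and catalogues what was retrieved. The URLs, user-agent string and output file name are placeholders I've made up; it's a sketch of the approach, not any particular project's scraper.

```python
# Minimal sketch of a self-built tool for retrieving and cataloguing
# public pages. URLs, user agent and file name are made-up placeholders.
import csv
import time
import urllib.robotparser
from urllib.parse import urlsplit

import requests  # third-party: pip install requests

URLS = ["https://example.com/public-thread-1"]  # hypothetical public pages
USER_AGENT = "research-crawler/0.1 (contact: researcher@example.ac.uk)"

def allowed(url: str) -> bool:
    """Only fetch what the site's robots.txt permits for this user agent."""
    parts = urlsplit(url)
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    try:
        rp.read()
    except OSError:
        return False  # if robots.txt can't be read, err on the side of not fetching
    return rp.can_fetch(USER_AGENT, url)

with open("catalogue.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["url", "status", "retrieved_at"])
    for url in URLS:
        if not allowed(url):
            continue  # skip anything the site disallows to crawlers
        resp = requests.get(url, headers={"User-Agent": USER_AGENT}, timeout=30)
        writer.writerow([url, resp.status_code,
                         time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())])
        time.sleep(2)  # throttle: no faster than a patient human clicking through
```

The throttling and the robots.txt check are the point: done this way, the tool is just a faster version of reading and noting down public pages by hand, which is why I don't see it as ethically different.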
Most barriers to research come down to money / resources, and to confidence in being supported by governments and the law: the allocation of funding and validation of institutions, or the threat of their withdrawal, can have a chilling effect on research. Universities, media companies, journalists and even governments are increasingly subject to the power of global internet technology barons to withdraw their technologies. For these reasons, I think the principle of public scrutiny of the public internet is a prime concern.
Aside from bad or illegal use of data, the issue with sites being studied isn't the production of papers like the one that generated this thread, but the lack of motivation for people researching in unis right now to produce better, more challenging studies from a wide variety of angles. Maybe working academics on MN can do that, hopefully unhampered by 'T&Cs'.