I think there are possibly a number of reasons they are digging in about the thesis.
The first and most obvious is that it is ongoing. So they have taken on a student, either self-funded or funded by someone, who has either brought this project proposal to them or is doing the PhD on a supervisor-led proposal.
It is not the first time a PhD student at Aston has used MN; there is the example on the earlier thread of Jai MacKenzie. She took a completely different and much more ethical approach, but I did just wonder if there are other PhDs we just don’t know about.
Conversely, one might also reasonably think that the Aston doxy dudes have finished with MN and done the research they planned from the 2019 data scrape, and the database has been deposited for the use of other researchers. Therefore, it is the other researchers who now cannot use the material.
The other point is scale. If Aston are relying on copyright exemptions for research for their argument, then it would be odd to delete what is a fractional percentage of the site, because the fair use part (5% I think) would more clearly apply here. It is clear from the other thread that other academics have created mini corpuses for linguistic research (the Manchester and Nottingham examples) regardless of whether it breaches MN T&Cs. So there is a question of precedent (I don’t agree with this, but it is the logic of their argument).
Alternatively, Aston have gone away and are going to come back with a more reasonable proposal for using the data which does not start from a negative presumption. I mean reasonable in terms of research question. As someone said on one of the FWR threads or the first site stuff one, it is possible to envisage a useful and interesting PhD which looks at this as a site for women’s activism and ties it to broader social and political developments. It is of course also possible to do that in the ethical Jai MacKenzie way and not rely on a data scrape. It is possibly not the forensic linguistics PhD envisaged or started, though.
The other alternative scenario is of course that Aston will post an update saying they have rethought and the 2024 data set will be deleted.