Libraries to store all UK web content - including MN posts?

(31 Posts)
ControlGeek Fri 05-Apr-13 08:11:57

Or have I read this wrong?

www.bbc.co.uk/news/entertainment-arts-22028738

"The British Library and four other "legal deposit libraries'" have the right to collect and store everything that is published online in the UK.

...

The archive will cover 4.8 million websites and will include magazines, books and academic journals as well as alternative sources of literature, news and comment such as Mumsnet, the Beano online, Stephen Hawking's website, and the unofficial armed forces' bulletin board, ARRSE."

SandyMumsnet (MNHQ) Tue 09-Apr-13 11:14:46

Hi there,

This is v interesting and we are going to have a think about it. <scratches chin>
In the meantime we will be moving the thread to Other subjects, just so it doesn't get lost.

Thanks

NetworkGuy Sat 06-Apr-13 21:31:36

ControlGeek - I just 'reported' the thread to see if it can be moved...

Might be one to consider for "Site Stuff" or "In the News" as it does perhaps need wider discussion, MN input, and not to disappear from Chat in 3 months' time... or will it be archived by BL and be visible forever? smile

NetworkGuy Sat 06-Apr-13 21:27:43

was the thread of black boxes

gah! was the threat of black boxes

which no doubt do exist by now... hint, encrypt, encrypt, encrypt and use VPN to get a connection to somewhere outside the UK smile

NetworkGuy Sat 06-Apr-13 21:22:59

I don't know how they are going to identify which things are posted online "in the UK" - do they save everything from Twitter and Facebook "in case" the user was in the UK when they posted it? What about people who travel a lot for fun or for work - does everything an air steward/ess writes get stored, or only the UK-posted stuff?

I must find out how they define "published". I remember when there was the thread of black boxes being installed at ISP services to log everything, one ISP (Clara.Net which hardly serves the residential market now) had a comment posted to the effect "they'd move their mail servers offshore to provide their customers with some degree of privacy".

I wouldn't expect many of my dozens of websites to be archived and may take steps to move some from the UK, but it begs the question as to whether the author, being in the UK, is described as "publishing online in the UK" merely by uploading, in which case it matters not that I have some sites stored on non-UK servers.

(I hasten to add I'm doing nothing illegal, but remember someone whom I had not had contact with for 30+ years sending an e-mail out of the blue... I don't necessarily want such surprises in future and removed the method of identification from the site where the person had identified me.

If it was all archived by BL then someone could still access the information from the archive even though I had removed it from current online sites... and yes, I am aware of Archive.org but you can {a} disable archiving with that and {b} my sites aren't important / popular enough to have been archived very much anyway!)

NetworkGuy Sat 06-Apr-13 21:21:48

I loved the comment on the Beeb site (even though it got some negative score):

"Just got to make sure they store most of it in fiction as you can't believe everything you read on the internet!"

It's one thing to archive it (and the "behind paywalls" question is a very good one) but another to even identify what will be archived.

Another comment - "It sounds like a great contract for someone: it's impossible without a magically replenishing blank cheque." HOW TRUE!

Also liked comments querying the usefulness of Tweets, Facebook etc.

Yeh, I like that idea too OneHundred as I've been doing a bit of family research recently. But I wonder if our descendants would be able to identify us from our posts, or if it's more likely our views would just be included in some general social research? I think Mumsnet could make a great social history archive for the future smile I often think we don't realise how old-fashioned some attitudes sadly still are, such as in equality/feminist issues and working rights. I'm hoping progress will mean they look as old-fashioned in 100 years as Edwardian life looks to us today! (Was just re-reading Vera Brittain's "Chronicle of Youth" diaries at the weekend, telling of life, especially for young women, on the brink of the First World War)

Storage is cheap. Access is the key - what do they plan to do to make sure these pages are still accessible in even 10 years' time, let alone 100? I have the misery of an old browser at work (IE7 - positively from the Dark Ages...) and a lot of websites already tell me my browser is not supported.
So, are they going to keep copies of all the old browsers, and the old operating systems, and the old PCs to run the old pages on? Or are they going to migrate all these millions of pages to new standards over time? And, if they do that, will the pages look completely the same as they did to the original reader, or is it ok that they don't so long as the words are still there?
Archiving electronic material is important, but there isn't an agreed standard yet on how exactly we are meant to do that. If they don't watch out they'll have millions of pages and no way to access them.

Look up the BBC Domesday Project for an example of how quickly technology changes and makes your files unreadable grin

OneHundredSecondsofSolitude Sat 06-Apr-13 14:23:00

I love the idea of our descendants researching us through our Facebook posts or what we put on Mumsnet. So much more interesting than birth, death and marriage certificates smile

ControlGeek Sat 06-Apr-13 14:06:04

Sadly it's more political than ignorance tee hmm

I'm also curious about the archiving of pages behind paywalls - does this mean they will ultimately be available for free?

Tee2072 Sat 06-Apr-13 09:26:14

Well then your server team are stupid. 10GB is nothing on a server and should have been installed in minutes. Perhaps fire them all and get ones who know what they are doing?

Yes, Java, good search engine algorithm will be key.

javabean Sat 06-Apr-13 09:23:21

Don't think storage is a problem, but good access will be smile we need good ways to search and access it as it's far too much data to just look through. Computer/data scientists of the future will never be out of a job!

It's an interesting idea though, think I read somewhere that, for example, a lot of the online reaction to 7/7 has been lost. That sort of thing will be really interesting to analyse in the future.

hedgefund Sat 06-Apr-13 09:13:20

From my understanding, if it's published on the net for all the public to see it's gonna be archived. I think it's fab myself!

ControlGeek Sat 06-Apr-13 09:11:38

Would you mind telling that to my server team please tee? I've been trying to get an extra 10GB on one of the servers I support for the best part of four years.

To be honest, when I look at what kids grow up with these days, technology-wise not to mention the associated issues, it's as different from my childhood (25-30 ish years ago) as my own was from someone who lived say 100 years ago or more. The more I think about it the more I like the idea of there being some kind of archive or record to document this explosion because it's happening so fast it will be really difficult in the future to pinpoint the various turning points.

Tee2072 Sat 06-Apr-13 08:30:24

Maybe not MN specifically, ControlGeek, but online chat rooms/message boards and their effect on society and societal interaction.

And storage really isn't that much of an issue, considering my phone can hold 64 gigs on its own, and look how tiny that is.

Or the 2 terabyte hard drive that sits on my desk as a back-up system, which is the size of a paperback book.

Storage is cheap, small and plentiful and getting more so every day.

ControlGeek Sat 06-Apr-13 08:20:44

Oooh do you think at some point in the future we might be able to study for a degree in historic literature specialising in mumsnet then? And do essays on comparing and contrasting the various threads (obviously citing the relevant DM references)? What will they make of the slow cooked porn??

<getting a bit carried away with the idea now>

Manchesterhistorygirl Sat 06-Apr-13 00:16:34

It's a sort of furtherance of the mass observation diaries kept at some university or other. Sorry, tired so my brain isn't working properly.

MsVestibule Sat 06-Apr-13 00:12:10

I think it's a great idea! Imagine in 100 years' time, people reading the endless WOHM/SAHM debates, the benefit bashing etc, in utter disbelief. Previously, history has recorded the main events, but not the 'average' person's reaction to them.

aufaniae Fri 05-Apr-13 23:52:18

I think it's a great idea as it'll be a fascinating and valuable insight into the past (once we are significantly in the past enough to be interesting that is!)

It would be a real shame to lose all the data on the internet because no one's bothered to store it IMO.

History? I love the idea of some random in the future coming across hamster stew or literally twatting a spider or something about neighbours stealing spoons grin

booksteensandmagazines Fri 05-Apr-13 20:07:56

When I was studying to be a librarian, digitally storing info was the big thing, because that's how people communicate now and where so much info is stored. It's part of our history, I suppose.

Notquite Fri 05-Apr-13 12:26:53

It does reproduce a huge no. of out-of-copyright books in a rather elegant, page-turning form - much more user-friendly than Google Books.

Notquite Fri 05-Apr-13 12:23:53

I visualise it floating around our ears ControlGeek (but your name suggests you know better!).

ControlGeek Fri 05-Apr-13 09:52:04

thumbtack I've got a vague recollection of hearing something about that, and laughing myself silly. The govt don't exactly have a great track record of implementing computer systems.

notquite I'll check out the link tonight from home. I wonder how they store it all - do they trim off the images and just store the text? That's a lot of pages!

meditrina Fri 05-Apr-13 09:41:40

Quite a lot is archived anyhow. This doesn't make that much difference, except expanded capacity, as everyone ought to know already that publishing online means publishing to the world and anything which can be seen, can be kept.

Thumbtack Fri 05-Apr-13 09:28:59

*wrong!! blush
