Libraries to store all UK web content - including MN posts?

(31 Posts)
ControlGeek Fri 05-Apr-13 08:11:57

Or have I read this wrong?

www.bbc.co.uk/news/entertainment-arts-22028738

"The British Library and four other "legal deposit libraries'" have the right to collect and store everything that is published online in the UK.

...

The archive will cover 4.8 million websites and will include magazines, books and academic journals as well as alternative sources of literature, news and comment such as Mumsnet, the Beano online, Stephen Hawking's website, and the unofficial armed forces' bulletin board, ARRSE."

CogitoErgoSometimes Fri 05-Apr-13 08:18:59

They're going to need one helluva storage facility....

ControlGeek Fri 05-Apr-13 09:14:06

Storage will be fine, it's been 10 years in the planning so obviously it will go perfectly smoothly! Love the randomness of the Beano and Stephen Hawking though [smile]

Thumbtack Fri 05-Apr-13 09:20:41

They have the 'right' but, like Cogito said, it's a helluva storage facility... The government was once considering a similar thing: storing all texts and emails. They asked my brother for advice (he is an IT whizz kid) and my bro said it would be a logistical nightmare (or something to that effect).

Thumbtack Fri 05-Apr-13 09:28:18

I might have got that wong (my bro will kill me!) [confused]

Notquite Fri 05-Apr-13 09:28:49

archive.org has archived 240bn web pages since 1996 [shock]

Thumbtack Fri 05-Apr-13 09:28:59

*wrong!! [blush]

meditrina Fri 05-Apr-13 09:41:40

Quite a lot is archived anyhow. This doesn't make that much difference, except expanded capacity, as everyone ought to know already that publishing online means publishing to the world and anything which can be seen, can be kept.

ControlGeek Fri 05-Apr-13 09:52:04

thumbtack I've got a vague recollection of hearing something about that, and laughing myself silly. The govt don't exactly have a great track record of implementing computer systems.

notquite I'll check out the link tonight from home. I wonder how they store it all: do they trim off the images and just store the text? That's a lot of pages!

Notquite Fri 05-Apr-13 12:23:53

I visualise it floating around our ears ControlGeek (but your name suggests you know better!).

Notquite Fri 05-Apr-13 12:26:53

It does reproduce a huge no. of out-of-copyright books in a rather elegant, page-turning form - much more user-friendly than Google Books.

booksteensandmagazines Fri 05-Apr-13 20:07:56

When I was studying to be a librarian, storing info digitally was the big thing because that's how people communicate now and where so much info is stored. It's part of our history I suppose.

PetiteRaleuse Fri 05-Apr-13 21:38:47

History? I love the idea of some random in the future coming across hamster stew or literally twatting a spider or something about neighbours stealing spoons [grin]

aufaniae Fri 05-Apr-13 23:52:18

I think it's a great idea as it'll be a fascinating and valuable insight into the past (once we are far enough in the past to be interesting, that is!)

It would be a real shame to lose all the data on the internet because no one's bothered to store it IMO.

MsVestibule Sat 06-Apr-13 00:12:10

I think it's a great idea! Imagine in 100 years' time, people reading the endless WOHM/SAHM debates, the benefit bashing etc, in utter disbelief. Previously, history has recorded the main events, but not the 'average' person's reaction to them.

Manchesterhistorygirl Sat 06-Apr-13 00:16:34

It's a sort of furtherance of the Mass Observation diaries kept at some university or other. Sorry, tired so my brain isn't working properly.

ControlGeek Sat 06-Apr-13 08:20:44

Oooh do you think at some point in the future we might be able to study for a degree in historic literature specialising in Mumsnet then? And write essays comparing and contrasting the various threads (obviously citing the relevant DM references)? What will they make of the slow cooked porn??

<getting a bit carried away with the idea now>

Tee2072 Sat 06-Apr-13 08:30:24

Maybe not MN specifically, ControlGeek, but online chat rooms/message boards and their effect on society and societal interaction.

And storage really isn't that much of an issue, considering my phone can hold 64 gigs on its own, and look how tiny that is!

Or the 2 terabyte hard drive that sits on my desk as a backup system, which is the size of a paperback book.

Storage is cheap, small and plentiful and getting more so every day.

ControlGeek Sat 06-Apr-13 09:11:38

Would you mind telling that to my server team please tee? I've been trying to get an extra 10GB on one of the servers I support for the best part of four years.

To be honest, when I look at what kids grow up with these days, technology-wise not to mention the associated issues, it's as different from my childhood (25-30 ish years ago) as my own was from someone who lived say 100 years ago or more. The more I think about it the more I like the idea of there being some kind of archive or record to document this explosion because it's happening so fast it will be really difficult in the future to pinpoint the various turning points.

hedgefund Sat 06-Apr-13 09:13:20

From my understanding, if it's published on the net for all the public to see it's gonna be archived. I think it's fab myself!

javabean Sat 06-Apr-13 09:23:21

Don't think storage is a problem, but good access will be [smile] We need good ways to search and access it as it's far too much data to just look through. Computer/data scientists of the future will never be out of a job!

It's an interesting idea though, think I read somewhere that, for example, a lot of the online reaction to 7/7 has been lost. That sort of thing will be really interesting to analyse in the future.

Tee2072 Sat 06-Apr-13 09:26:14

Well then your server team are stupid. 10GB is nothing on a server and should have been installed in minutes. Perhaps fire them all and get ones who know what they are doing?

Yes, Java, a good search engine algorithm will be key.

ControlGeek Sat 06-Apr-13 14:06:04

Sadly it's more politics than ignorance, tee [hmm]

I'm also curious about the archiving of pages behind paywalls - does this mean they will ultimately be available for free?

OneHundredSecondsofSolitude Sat 06-Apr-13 14:23:00

I love the idea of our descendants researching us through our Facebook posts or what we put on Mumsnet. So much more interesting than birth, death and marriage certificates [smile]

Storage is cheap. Access is the key - what do they plan to do to make sure these pages are still accessible in even 10 years' time, let alone 100? I have the misery of an old browser at work (IE7 - positively from the Dark Ages...) and a lot of websites already tell me my browser is not supported.
So, are they going to keep copies of all the old browsers, and the old operating systems, and the old PCs to run the old pages on? Or are they going to migrate all these millions of pages to new standards over time? And, if they do that, will the pages look completely the same as they did to the original reader, or is it ok that they don't so long as the words are still there?
Archiving electronic material is important, but there isn't an agreed standard yet on how exactly we are meant to do that. If they don't watch out they'll have millions of pages and no way to access them.

Look up the BBC Domesday Project for an example of how quickly technology changes and makes your files unreadable [grin]
