## Optical Storage

by lani Thu 09 Jun 2005 16:36:22

This is a shameless cry for geek help!

<p>Our lab would like to archive files for the future. We are storing large TIFF files that are about 200MB each. Hopefully, I am putting out decent data and we will need to keep the files for...let's say 10-20 years at least. I was looking into external DVD burners (both Mac &amp; PC compatible) and am currently considering the Plextor 716UF. And whilst searching, <a href="http://www.blu-ray.com/">Blu-ray</a> and HD-DVD came up. So here's my question...and it may be stupid because I'm not a computer person...but does anyone think that DVD storage of data will be uselessly obsolete in 10-20 years? Will I need to put DVD-ROM drives into storage along with the discs?</p>
a volatile conversation topic indeed! my 2 cents: think for the long haul. "burned" dvds are non-archival. there are too many issues with the quality of the blanks (mfgs don't exactly test every disc), the effectiveness of any burner over heavy usage (are you willing to manually test and verify each burn?), and the persistence of the format you select (will you be able to buy a dvd+r capable drive in 20 years?). but then what to do, what to do? there's always tape drives, but for anything over 80 GB of data, your overall system price starts to make your ears bleed. i love tape robots, but they are an enterprise level solution.
my suggestion would be a redundant hard disk array. massive drives are among us. a terabyte of raid 5 storage can be tossed together for under $1k. with the new perpendicular recording technologies coming out, current drive prices will be coming down. building a network attached storage device with an enterprise level raid controller can be done for under $1500. at that point you can add drives to the array as you need them. selecting a good manufacturer like seagate with a 5 year warranty, and running this device solely for archiving (minimizing its usage), means your drive failure rate should not push your yearly maintenance costs above 100-200 bucks. anyway i am rambling. basically you need to determine your budget and how much data you really need to store.

sorry for the delay in responding (i just got back from puerto rico and a place without internet). thanks for your advice though. thank god the order for the external dvd burner i put in didn't actually go through. i like the redundant hard disk array idea because it seems like it will be easily expandable. part of the problem is that i really have no idea how much data we will need to store. i think i can assess some of that now, but some of it is out of my hands and depends on how successful we are. (actually, thinking that my data may be useful in 10-20 yrs is a touch on the optimistic side...but it's still better to have it backed up.)

definitely avoid the optical media. i burn cds if i want to move data around or make something easy to play in a commercial cd player or dvd player. never for archiving something i actually care about. the first thing you need to figure out is your total storage needs. how many of those 200MB tiff files do you need to keep around? how often are they produced? emile's right. at this point in time, regular hard drives are the best and cheapest option. you can build a 1TB raid drive for $1k. for that price, i'd actually recommend getting two and making them redundant.
raid protects you from a drive failing, but it won't help you if you accidentally delete something that you shouldn't have. so build one big drive to store your stuff, then build another as a backup and set up a nightly script that automatically copies stuff to the backup drive. i can't stress the 'automatically' enough. any backup strategy that involves someone having to remember to do something on a regular basis is absolutely doomed to catastrophic failure. the one day that you forget to copy stuff over to the backup drive is the day that you accidentally delete everything. murphy's law guarantees it.
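
for anyone wondering what such a nightly script might look like, here is a minimal python sketch. it is only one way to do it, not a recommendation of a specific tool. the function names (`sha256_of`, `backup_tree`) are made up for illustration, and the sketch assumes both arrays are mounted as ordinary directories. it copies new or changed files, checks each copy against a checksum, and never deletes anything from the backup.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a 200MB tiff never sits fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_tree(source: Path, backup: Path) -> list[Path]:
    """Copy new or changed files from source to backup; return what was copied.

    Nothing is ever deleted from the backup, so a file removed from the
    source by accident can still be recovered.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        dest_file = backup / src_file.relative_to(source)
        if dest_file.exists() and sha256_of(dest_file) == sha256_of(src_file):
            continue  # already backed up and identical
        dest_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dest_file)
        # verify the copy instead of trusting it
        if sha256_of(dest_file) != sha256_of(src_file):
            raise IOError(f"copy of {src_file} failed verification")
        copied.append(dest_file)
    return copied
```

to make it automatic, have the system scheduler (cron, Task Scheduler, launchd) call `backup_tree` with the two mount points every night. one caveat with this design: a file that changes in the source overwrites its backup copy, so a corrupted original would replace a good backup. if that worries you, write each night's copies into a dated folder instead.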

as i said above, i'm not really sure how much we will need to keep around. i think i'm going to have to poke around the other microarray labs and find out what their storage solutions are.
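
a rough way to put numbers on "how much", sketched in python. the 200MB file size comes from the original post. everything else (the files-per-week rate, the drive sizes, the function names) is a placeholder assumption to be swapped for real figures.

```python
def archive_size_gb(files_per_week: float, years: float, file_mb: float = 200.0) -> float:
    """Total archive size in GB (1 GB = 1000 MB) for a steady output rate."""
    total_files = files_per_week * 52 * years
    return total_files * file_mb / 1000.0


def raid5_usable_gb(drive_count: int, drive_gb: float) -> float:
    """RAID 5 spends one drive's worth of space on parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_gb


# assumed rate: 10 scans a week for 10 years
print(archive_size_gb(10, 10))   # 1040.0 GB, i.e. roughly 1TB
# assumed array: five 250 GB drives in RAID 5
print(raid5_usable_gb(5, 250))   # 1000 GB usable
```

so at ten 200MB files a week, a 1TB array would last about a decade. at a hundred files a week it would fill in about a year, which is why the production rate is the number to pin down first.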

i will definitely look into some type of automatic script, because if not, everything would depend on me...

even though i think the posts above basically override what i'm about to say, i'll add this in the off chance that you might find it useful. this is controversial, as emile pointed out, and in the archival world the only medium (other than paper, of course) that archivists place any faith in is microfilm. obviously you won't be going that route, and so to get to the point, the only semi-acceptable digital storage at this point is actually cds, not dvds. people i've talked to tend to prefer gold cds because the gold coat is more stable, but how stable? and for how long until they're obsolete? i don't know.

you may have already thought of this, but the other thing to standardize first /before/ storing is the metadata attached to your images. who's going to be using them? how will they search? what will they be looking for? that sort of thing. if you're interested, here are some places you can read up on past experiences, digital library/archiving standards, questions, and so forth:

an experience with tape drives: http://memory.loc.gov/ammem/techdocs/libt1999/libt1999.html#media via: http://memory.loc.gov/ammem/about/techIn.html

if you're not on enough listservs: www.dli2.nsf.gov/lists

for your free time: (much of this will be irrelevant, but there are some good things mentioned such as the book Moving Theory into Practice: Digital Imaging for Libraries and Archives and also the section on Objects): http://www.niso.org/framework/Framework2.html

thanks for the expert advice! i must admit i had no idea what went into picking an archiving system.

we are planning on building a searchable database for our data. (we are working with microarray data.) i think part of the problem is that we are not storing just images, but pixel values that are important for analysis. you bring up a good point about thinking of archiving in terms of what people will need to look for later, and that's something that i will have to do some research on.
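
one low-tech way to start on the metadata before a real database exists is a small json "sidecar" file next to each image. the python sketch below is only an illustration of that idea. the field names (`experiment`, `organism`) and the function names are invented here and are not taken from any microarray or library standard, so treat them as placeholders until the lab settles on its own vocabulary.

```python
import hashlib
import json
from pathlib import Path


def write_sidecar(image: Path, metadata: dict) -> Path:
    """Write <image name>.json next to the image; return the sidecar path."""
    record = dict(metadata)
    record["file_name"] = image.name
    record["size_bytes"] = image.stat().st_size
    # read_bytes is fine for a sketch; hash in chunks for very large files
    record["sha256"] = hashlib.sha256(image.read_bytes()).hexdigest()
    sidecar = image.with_name(image.name + ".json")
    sidecar.write_text(json.dumps(record, indent=2, sort_keys=True))
    return sidecar


def find_images(folder: Path, **wanted) -> list[str]:
    """Return file names whose sidecar matches every key=value in wanted."""
    hits = []
    for sidecar in sorted(folder.glob("*.json")):
        record = json.loads(sidecar.read_text())
        if all(record.get(key) == value for key, value in wanted.items()):
            hits.append(record["file_name"])
    return hits
```

for example, `write_sidecar(Path("slide1.tif"), {"experiment": "exp01"})` followed by `find_images(Path("."), experiment="exp01")` would list `slide1.tif`. the stored checksum also gives you a way to tell, years later, whether a file has silently changed.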

i should probably google "microarray archiving".

thanks for the direction!

oh sure, like a librarian knows anything about archiving massive amounts of information...

where's my fishing rod?

please accept this as my humble apology. it took a lotta baksheesh to finagle this.

