case western reserve university

UNIVERSITY ARCHIVES

Digital Preservation

Most University employees create or receive digital documents such as email messages, PDF reports, Excel spreadsheets, PowerPoint presentations, JPEG images, and Word files. Because these digital documents are used to conduct the University's work, they are University records. Many digital records do not need to be stored after the purpose for which they were created has been accomplished. However, some digital records need to be kept longer for program reporting, as precedents for future use, or to satisfy policy or regulatory requirements. Some digital records need to be kept permanently because they provide important information about University activities. Permanent digital records should be transferred to the University Archives.

Availability of trustworthy information about Case Western Reserve University's development in five years or one hundred years depends on the way digital records are managed today -- before they are transferred to the Archives.

This list of Frequently Asked Questions is intended for those in the Case Western Reserve University community who want to prolong the life of digital information at work and at home.

Overview

What is the problem?

What can I do to minimize the risks of information loss?

What is a preservation strategy?

Storage Media

What are storage media?

What do I need to keep in mind when choosing a storage medium?

Where can I store files so that they will last indefinitely?

If nothing lasts indefinitely, where do I put files?

Why do you recommend hard drives?

Why do you recommend CD-R?

What about CD-RW?

What about magnetic media?

What about DVDs or flash drives?

How long do CD-Rs last?

How do I choose a brand of CD-R?

What can I do to make my CDs last longer?

How should I store my CDs?

How should I handle my CDs?

How should I label CDs?

What about adhesive labels?

Ooops, I already put adhesive labels on CDs. How should I take them off?

How can I clean CDs?

I always thought the bottom of a CD was the side to be careful of. Why worry about the top?

Any other storage media tips?

As long as the storage media are not deteriorating, I'll be able to get files off them, right?

Ok. So as long as I still have old hardware around, I can use the files?

File Formats

What are file formats?

How do I identify a file format?

What do I need to keep in mind when choosing a file format?

What file formats should I use for the long term?

Should I compress digital files that I want to keep long term?

Should I encrypt digital files that I want to keep long term?

Safe Computing

What is safe computing?

I know backing up my computer is important, but I have no idea where to start. What should I do?

What can I do to protect information from hackers and viruses?

Any other safe computing tips?

What will happen if I don't do anything?

Additional Resources


Overview

What is the problem?

Digital records are powerful because of the ease with which they can be created and modified, distributed and copied, stored and retrieved. But digital records are fragile because they depend on many layers of technology to be rendered in ways that humans can see or hear. Encoding formats, application and operating system software, and storage media change rapidly and on different cycles. You may have software that can open the file, but the file is on a disc for which your computer doesn't have a drive. You may have upgraded your operating system, but the application publisher has stopped supporting your platform and the old version of the application won't run on the newer operating system. The farther into the future you need to keep digital records, the greater the chance that these incompatibilities will make those records inaccessible.

If you box up your paper records and put the box in the back of the closet, those records will still be readable when you retire in twenty years and your successor pulls the box out, barring a disaster such as fire or flood. We call this benign neglect. Benign neglect isn't the best approach to paper preservation, but it isn't a big threat.

Applying the benign neglect approach to digital preservation is almost a guarantee that your digital information won't be readable in the future. Until inexpensive and easy-to-implement technical solutions to digital obsolescence are developed, digital preservation will require a continuous program of monitoring and migration -- which means transferring digital records to each new generation of technology.

What can I do to minimize the risks of information loss?

Develop a preservation strategy.

What is a preservation strategy?

A preservation strategy is a plan for keeping records, and the information contained in those records, usable for as long as they are needed. A good preservation strategy includes smart selection of storage media and file formats, migration of files to new formats, and following safe computing practices.

Storage Media

What are storage media?

Storage media are the physical objects, such as hard drives, CDs, or floppy disks, that hold information. Storage media are where information is stored, and are not to be confused with file formats, such as JPEG or PDF, which determine how information is stored.

What do I need to keep in mind when choosing a storage medium?

> Convenience: how much information can each piece hold? It's easier to store and take care of fewer items.

> Efficiency: how easy and fast is it to copy to and from the medium?

> How widespread is it? The more people who are using it, the more likely it will be to stay around for a while and the easier it will be to get equipment to read it.

> How much does it cost? What is the cost of the equipment needed to read it?

> How long will the information on the medium be in good enough shape to be read?

Where can I store files so that they will last indefinitely?

There is currently no digital storage medium that can be expected to last indefinitely. How long different storage media will last depends on many factors such as how they are made, what they are made of, and how they are stored. Manufacturers' claims for the life spans of their products are not independently verified, and manufacturing processes for storage media are not standardized. In addition, many of these media have not been around long enough for people to really know what may happen to them after ten, fifteen, or fifty years.

If nothing lasts indefinitely, where do I put files?

In an office or a home situation, where professionally managed file servers are not used, we recommend two things. Your safest bet, provided safe computing practices are followed, is a hard drive. For removable media, we recommend Recordable Compact Discs (CD-R).

Why do you recommend hard drives?

Items on hard drives are less likely than removable media to be forgotten. Hard drives are very reliable, as long as safe computing practices are followed. On the downside, large volumes (hundreds of gigabytes) of material magnify the risk of something going wrong, and, of course, it's essential that you back up your data.

Why do you recommend CD-R?

For removable media, we recommend Recordable Compact Discs, also known as CD-R (R stands for Recordable). These are the discs that you purchase blank and then burn your own information to, and once it's burned, it can't be changed. CD-Rs from reputable manufacturers, if handled properly, will probably outlive the hardware and software necessary to read them. This means that the CDs themselves will physically still be intact when your CD drive has been replaced by whatever is going to replace CD drives.

What about CD-RW?

Another type of CD is rewriteable, or CD-RW (RW stands for Re-Writeable). These can be written, erased, and rewritten, like the older floppy disks. The technology that's used in these discs that makes them re-writeable also makes them susceptible to damage from the environment, particularly exposure to light. In addition, information on CD-RWs is less secure because it can be changed or rewritten. Therefore, these are not recommended for long-term storage.

What about magnetic media?

Flexible magnetic disks, which include 3.5-inch diskettes and Zip disks, are considered to have a lifespan of five years or less and should not be used for long-term storage.

What about DVDs or flash drives?

DVDs and solid-state media, such as flash drives, haven't been around long enough to develop a track record, and cannot be recommended for long-term storage.

How long do CD-Rs last?

This depends on many factors such as how they are made, what they are made of, and how they are stored. Manufacturers' claims for the life spans of their products are not independently verified, and manufacturing processes for CDs are not standardized. CD-Rs from reputable manufacturers, if handled properly, will probably outlive the hardware and software necessary to read them. This means that the CDs themselves will physically still be intact when your CD drive has been replaced by whatever is going to replace CD drives.

How do I choose a brand of CD-R?

If you can, use a disc brand recommended by the manufacturer of your recorder to decrease the likelihood of errors during burning. According to research done in 2003, CD-Rs that use a gold metal reflective layer and phthalocyanine (THAL-o-CY-a-neen)-based dyes (so-called gold/gold discs) have the greatest life span. The gold/gold discs are more expensive than others, but they seem to be more stable. Be careful: a label that says "gold" or "silver", or a disc that looks gold or silver, does not guarantee that the product's metal layer is actually gold or silver.

What can I do to make my CDs last longer?

Store and handle them properly.

How should I store my CDs?

CDs should be stored in a stable environment with temperatures between 40°F and 68°F and relative humidity between 20% and 50%. In an office environment, store them away from water lines and out of direct sunlight. In your home, the main things to remember are not to store them in your hot attic or your damp basement, in direct sunlight, or over a radiator. A dust and smoke-free environment is helpful, and you want to keep your CDs away from food and liquids. It is best to store the discs in rigid jewel cases because they give greater physical protection than paper sleeves. The jewel cases should be stored vertically, like a book.

How should I handle my CDs?

> Handle them as little as possible.

> Put them back in their jewel cases when not in use.

> Handle them by the edges or the center hole.

> Don't touch the top or the bottom.

> Don't bend the discs: remove them from their jewel cases by pressing down on the hub of the case while holding the outer edge of the disc and lifting.

How should I label CDs?

Don't use sharp or pointed writing implements, because these can scratch the thin lacquer and metal layers on top of CDs. Similarly, the chemicals in some markers can migrate into the protective layer and damage it. What you should do is mark the center part of the label side (the part you can see through around the hole) using a soft-tip marker with water-soluble permanent ink. And no, Sharpies are not water-soluble. Anything that has a strong odor probably isn't water-soluble.

What about adhesive labels?

You definitely don't want to use adhesive labels. The weight of the label can upset the balance of the disc during use, and the adhesives can damage the top layers of the disc.

Ooops, I already put adhesive labels on CDs. How should I take them off?

Don't try to remove labels that are already on CDs! Pulling off the label can damage the top layer of the disc, and if you can't get the whole label off, you could end up with even more balance problems than you had with the label on.

How can I clean CDs?

You should only clean CDs when absolutely necessary. Clean only the non-label side of CDs, and wipe from center to edge, not in a spiral, with a lint-free cloth. If you absolutely have to use something stronger to clean them, use a little isopropyl alcohol (rubbing alcohol).

I always thought the bottom of a CD was the side to be careful of. Why worry about the top?

Recordable CDs are made up of three layers. They have a clear plastic base on the bottom. The laser has to read through this layer, so scratches, dirt, and smudges on the bottom of the disc can all interfere with retrieving data.

The other side, often called the label side, has a metal layer covered by a thin coating of lacquer. In between the clear plastic layer on the bottom and the reflective metal layer on the top is the layer where the information is actually recorded.

The laser shines up through the clear bottom layer and is reflected back, and that's how the information is read. If the metal layer is scratched or there are holes in it, the laser passes through instead of being reflected, and the information can't be read.

Any other storage media tips?

> You want your CDs to be as fresh as possible, so don't stockpile them. Buy them as you need them, open them just before use, and check the disc surface before recording to make sure it looks ok. Then spot-check the data after recording.

> Once a year, inspect the discs for visible signs of damage or deterioration, and check a sample of them for readability at the same time.

> Periodically copy the files onto newer, fresher storage media, because storage media deteriorate as they age.

> Don't record at the maximum speed. Slower recording may take a few minutes longer, but it will reduce the chance of introducing errors into your data.
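The spot-check after recording and the yearly readability check can be partly automated. The following is a minimal sketch, not a recommended tool: it assumes Python 3.9 or later is available, and the function names (`build_manifest`, `find_problems`) are hypothetical. It records a SHA-256 checksum for every file in a folder before burning, then later re-reads the files from the disc and reports any that are missing, unreadable, or changed.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: str) -> dict[str, str]:
    """Map each file's path (relative to folder) to its SHA-256 checksum."""
    root = Path(folder)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_problems(manifest: dict[str, str], folder: str) -> list[str]:
    """List files that are missing, unreadable, or changed since the manifest."""
    root = Path(folder)
    problems = []
    for name, expected in manifest.items():
        try:
            if sha256_of(root / name) != expected:
                problems.append(f"changed: {name}")
        except OSError:
            # Covers both a missing file and a read error from a failing disc.
            problems.append(f"unreadable or missing: {name}")
    return problems
```

In use, you would build the manifest from the folder on your hard drive, keep the manifest somewhere other than the disc itself, and call `find_problems` with the disc's mount point or drive letter as the folder. An empty list means every file was read back and matched.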

As long as the storage media are not deteriorating, I'll be able to get files off them, right?

Not necessarily. You can't read floppy disks with a CD drive. Monitor the marketplace. As newer storage media and drives replace current technology, copy information to the newer media. Don't wait until current technology has become obsolete. But don't rush to embrace each new technology as soon as it appears. Wait until the new media have an established presence.

Ok. So as long as I still have old hardware around, I can use the files?

Not necessarily. File formats are subject to rapid technological obsolescence and evolution. If files are in an old file format and you no longer have the software that created them, you may not be able to open the files. If you can open them, they might not display as intended. It is important to choose file formats with care to prevent problems in the future.

File Formats

What are file formats?

File formats are how information is stored, and should not be confused with storage media, such as hard drives, CDs, or floppy disks, where information is stored.

How do I identify a file format?

The extension at the end of the file name is a clue to the format. Examples include .pdf, .jpg, .xls.
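Because an extension is only a clue (a file can be renamed), it can help to compare it with the first few bytes of the file, which many formats use as a fixed signature. The sketch below is illustrative only: the signature table covers just a handful of formats, and the function names `sniff_format` and `describe` are hypothetical.

```python
from pathlib import Path

# Leading signature bytes for a few common formats.
# This table is illustrative, not exhaustive.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"{\\rtf", "RTF"),
    (b"PK\x03\x04", "ZIP container"),
]


def sniff_format(header: bytes) -> str:
    """Return a format name guessed from the first bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def describe(path: str) -> str:
    """Report the extension and the signature-based guess for one file."""
    p = Path(path)
    with p.open("rb") as f:
        header = f.read(16)
    guess = sniff_format(header)
    return f"{p.name}: extension={p.suffix or '(none)'}, signature={guess}"
```

A mismatch between the two, such as a `.jpg` extension on a file whose signature says PDF, is a hint that the file was renamed or mislabeled. Plain ASCII text has no signature, so it will be reported as "unknown" here.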

What do I need to keep in mind when choosing a file format?

> Is it proprietary or open? Proprietary file formats, such as Word and Excel, are developed by software companies, such as Microsoft, to encode data produced - and read - by their software. In contrast, open file formats, such as ASCII or RTF, can be supported by multiple software applications on different platforms. Chances for loss are increased if the information is locked into a proprietary format. What if the software to read the format is no longer available, does not have backward compatibility, or is simply not supported? Open formats provide a better chance of being supported since multiple software applications can read them.

> Is it well documented? If it's proprietary and the company that owns it goes out of business, will there be documentation around to help you get at your information?

> How long has the format been around? You might not want to pick a file format that is brand new.

> Has it been widely adopted? If you do use a proprietary file format, choose one that is very popular, such as PDF. This will increase the likelihood that it will be around for a while.

> Is it usable on different hardware and software platforms?

> Does it have a migration path and backward compatibility, such as Word's ability to open 5.1 version files and save them as Word 2004 versions?

What file formats should I use for the long term?

If you need to remove digital files from active use, here are some suggested formats:

> Store text as ASCII (American Standard Code for Information Interchange), RTF (Rich Text Format: Microsoft Word can save files as RTF) or PDF (Portable Document Format).

> Store databases and spreadsheets as comma-delimited or tab-delimited ASCII text or XML (many common proprietary database and spreadsheet applications can export data into these formats).

> Store PowerPoint as GIF or PDF.

> Store images as uncompressed TIFF.
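As an illustration of exporting a database to comma-delimited text, here is a minimal sketch that assumes the data happens to live in a SQLite database, which Python can read with its standard library. The function name `export_table_to_csv` is hypothetical, and other database or spreadsheet applications have their own export commands. The file is written as UTF-8, of which plain ASCII is a subset.

```python
import csv
import sqlite3


def export_table_to_csv(conn: sqlite3.Connection, table: str, out_path: str) -> int:
    """Write every row of `table` to a comma-delimited text file.

    Returns the number of data rows written. The table name is quoted as an
    identifier, but it should still come from a trusted source.
    """
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")
    header = [col[0] for col in cursor.description]
    count = 0
    # newline="" lets the csv module control line endings itself.
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)  # column names go on the first line
        for row in cursor:
            writer.writerow(row)
            count += 1
    return count
```

The first line of the output holds the column names, so the file remains understandable without the original application. Values that contain commas are quoted automatically by the `csv` module.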

Should I compress digital files that I want to keep long term?

Compression adds complexity to long-term preservation. Some compression techniques shed "redundant" information. As an example, JPEG removes information to reduce file size. The image might look fine on your current monitor, but as monitors improve, the lower quality of the image will be more obvious.

Should I encrypt digital files that I want to keep long term?

Encryption is used to make sure that information is read and/or used only by authorized users. Like compression, encryption adds complexity to long-term preservation. Encryption algorithms change over time and backwards compatibility should not be assumed.

Safe Computing

What is safe computing?

There are a number of regular practices you should adopt to minimize the likelihood of short-term information loss.

I know backing up my computer is important, but I have no idea where to start. What should I do?

> Write data to different types of storage media. Storage media are not all manufactured the same way, even within a single brand of CDs. If you can, use two different types of media, such as a hard drive and a CD, or two different manufacturers or batches of CDs.

> Write copies with different software. This will protect against corruption from malfunctions, viruses, or bugs.

> Store backed-up files off-site. What's the point of having back-ups if a disaster strikes your home or office, and your back-ups are destroyed along with your computer?
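As a starting point, the idea of writing copies to more than one place can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a substitute for backup software: the function name `back_up` is hypothetical, and the destination folders stand in for whatever second hard drive or other medium you actually use. Each copy is compared byte for byte with the original before it is trusted.

```python
import filecmp
import shutil
from pathlib import Path


def back_up(source: str, destinations: list[str]) -> list[Path]:
    """Copy one file into each destination folder and verify each copy."""
    src = Path(source)
    copies = []
    for folder in destinations:
        target_dir = Path(folder)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / src.name
        shutil.copy2(src, target)  # copy2 also keeps timestamps where it can
        # shallow=False compares contents byte by byte rather than trusting
        # file size and modification time.
        if not filecmp.cmp(src, target, shallow=False):
            raise OSError(f"copy does not match original: {target}")
        copies.append(target)
    return copies
```

Pointing the destinations at two physically different devices follows the first tip in the list above. The off-site tip still has to be handled by hand or by a separate service.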

What can I do to protect information from hackers and viruses?

> Install firewalls. We are protected at the University; however, when you use a high-speed connection such as cable modem or DSL, your computer is connected to the Internet as long as it is on, not just when it is being used.

> Use "safe" passwords. Use combinations of numbers and letters, as well as capitalization. Avoid dictionary words, the name of the computer, and your account name. Change passwords frequently.

> Keep security patches current.

> Never open any unrequested or unidentifiable files you receive as email attachments until you know what they contain, even if the message appears to have been sent by someone you know and trust. Turn off features in email software that automatically open attachments.

> Computer viruses can destroy or corrupt data on computers they infect. Always use virus detection and removal software. Update virus definitions and scan your computer frequently. Better yet, set preferences to automatically update.

Any other safe computing tips?

> How many times have you accidentally altered a file? Or thrown it away and then emptied the trash? Lock those important files to prevent accidental alteration or destruction.

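> Use surge protectors to guard against electrical spikes, lightning, and other power surges. A surge protector does not keep the computer running during a power outage; an uninterruptible power supply (UPS) is needed for that.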
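Most operating systems let you lock a file from its properties or info window. For many files at once, the same thing can be scripted. The sketch below is one possible approach, with hypothetical function names (`lock_file`, `is_locked`): it clears the write-permission bits so that the file becomes read-only.

```python
import os
import stat

# All three write-permission bits: owner, group, and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def lock_file(path: str) -> None:
    """Clear all write-permission bits so the file is read-only."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~WRITE_BITS)


def is_locked(path: str) -> bool:
    """Return True when no write-permission bit is set on the file."""
    return not os.stat(path).st_mode & WRITE_BITS
```

This guards against accidental overwriting. On Unix-like systems, whether a file can be deleted depends on the permissions of the folder that contains it, so a read-only file is not necessarily protected from being thrown away.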

What will happen if I don't do anything?

If basic steps are not taken, information will be lost and the cost of recovering it could be very high.

Additional Resources

ITS help desk's website <http://help.case.edu/safe/maintain/>

Ohio State University's website on Safe Computing at <http://safecomputing.osu.edu/safecomputing>

Wilhelm Imaging Research <http://www.wilhelm-research.com/> - a good source for specific information about printers and dyes.

PADI (Preserving Access to Digital Information) <http://www.nla.gov.au/padi/index.html> - a gateway to international digital preservation resources. Following one of the trails is a good way to get started.

Image Permanence Institute, Rochester Institute of Technology Consumer Guides <http://www.imagepermanenceinstitute.org/sub_pages/consguides.htm> - a good source for traditional and digital photographic preservation.

Fred R. Byers, Care and Handling of CDs and DVDs--A Guide for Librarians and Archivists <http://www.itl.nist.gov/div895/carefordisc/> - technical and detailed, but provides excellent information about CDs and DVDs.

The DVD Association and the National Institute of Standards and Technology are working to establish a standard industry test to determine the archival quality of recordable CDs and DVDs. The number of years used in the test standard will be determined by responses to a 2-question survey available at <http://www.dvda.org/html/nist_survey.php>. Responses will be collected until May 31, 2005.
