The BBC avoided an archiving nightmare last week when it upgraded its outdated storage unit to a tape library to house its vast online photo library of some 300,000 images. The move is part of a huge archiving project that demonstrates the importance of digital data preservation.
An enormous amount of information is now created, stored and preserved only in electronic format, instead of being copied to paper, photo or film. Modern war reporters use digital cameras, which produce images and footage that never leave the electronic environment.
This has stirred up historians, who fear that information will get lost in the next significant technology upgrade. Some even believe that images were changed before they reached the public.
Similar fears hold true for corporate data. Certain information, such as electronically approved contracts and agreements, needs to be preserved for a long time. But most industry observers say that network managers do not take electronic data preservation seriously enough.
"Part of the network data needs to be preserved for a very long time. For instance, to defend yourself in a lawsuit, you may have to retrieve old emails to reconstruct a pattern of conversation," said Gordon Buxton, senior developer at Oxford Computer Consultants. "Nobody knows how long a data CD lasts. Some say it is 20 years, but for certain data that is not a long time."
The National Digital Archive of Datasets (NDAD) takes data preservation very seriously. Its digital archives group manager, Kevin Ashley, said record preservation was crucial for a cultural and historical sense as well as for commercial and scientific reasons.
"Computer data is fragile," he said. "It is easy to change, so unless you make sure any changes will leave an audit trail, it is difficult to prove the record is authentic.
"Electronic data is not visible and so is easily forgotten. When you neglect a paper document and realise years later that it was important, you can still read it. If the same thing happens with computer data, it has become either irretrievable or very expensive to retrieve."
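One way to picture the audit trail Ashley describes is a log in which every entry carries a hash of the record's content and of the previous entry, so that any later change to the log becomes detectable. The Python sketch below is illustrative only, not a description of NDAD's systems; the names `append_entry` and `verify_trail` and the entry fields are hypothetical.

```python
import hashlib
import json


def _digest(entry: dict) -> str:
    """Hash an entry's fields in a stable (sorted-key) JSON form."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(trail: list, record_id: str, action: str, content: bytes) -> dict:
    """Append an entry that is chained to the previous one by its hash."""
    prev_hash = trail[-1]["entry_hash"] if trail else ""
    entry = {
        "record_id": record_id,      # hypothetical identifier for the record
        "action": action,            # e.g. "create" or "modify"
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "prev_hash": prev_hash,
    }
    # The entry hash covers every field above, including prev_hash.
    entry["entry_hash"] = _digest(entry)
    trail.append(entry)
    return entry


def verify_trail(trail: list) -> bool:
    """Return True only if no entry has been altered, removed or reordered."""
    prev_hash = ""
    for entry in trail:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev_hash or _digest(body) != entry["entry_hash"]:
            return False
        prev_hash = entry["entry_hash"]
    return True
```

A chain like this shows only that the log has been tampered with, not who did it, and it assumes the most recent hash is itself kept somewhere trustworthy.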
Corporate data preservation is just as important as public data preservation. "It is an urgent problem. Disks don't last very long," said Ashley.
"First, realise that you've got a preservation problem. Think about how you will access information when it outlives the software," he said, explaining why data needs to be extracted from the software that created it and why preservation should be planned into software upgrades.
NDAD archivist Patricia Sleeman pointed out that some data records were difficult to extract from their technology, as was the case in archiving the electronic geographic data repository GIS. "GIS contains many layers of data mixed with proprietary technology. So what do you preserve?" she said.
Network managers also need to identify which network data is transient and which is important, with the help of a company record or file manager.
Record management tools from companies such as Tower Software and Fabasoft can improve authenticity by making a genuine copy any time a file is created. The tools also support typed user keys, to identify the document's hierarchy. "But the hardest part is getting people to be consistent in the keywords that they use," Ashley said.
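The idea of taking a genuine copy when a file is created can be sketched in a few lines of Python. This is a generic illustration, not the way the Tower Software or Fabasoft products work; the function names and the archive naming scheme are assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def preserve_copy(source: Path, archive_dir: Path) -> Path:
    """Copy a file into the archive and check that the copy is identical."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    digest = sha256_of(source)
    # Hypothetical naming scheme: checksum prefix plus the original file name.
    target = archive_dir / f"{digest[:16]}_{source.name}"
    shutil.copy2(source, target)  # copy2 also keeps timestamps where it can
    if sha256_of(target) != digest:
        raise IOError(f"archived copy of {source} does not match the original")
    return target
```

Storing the checksum with the copy lets anyone confirm later that the archived file still matches what was originally saved.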
Key Points:
- Most companies neglect data preservation
- Good data preservation extracts data from software
- Focus preservation resources on significant data




