While proper storage and back-up of research data are essential to fulfil the good stewardship requirements of the Australian Code for the Responsible Conduct of Research, it is also crucial to check that your data can be successfully and fully restored from back-up.
One way to verify and validate back-up files is to restore them to another location and compare them with the original. Back-up copies can be checked for completeness and integrity by comparing, for example, checksum values, file sizes and modification dates. If checksums differ, some files or parts of files may have been corrupted.
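As an illustration of the comparison described above, here is a minimal Python sketch, assuming the original and the restored copy are both ordinary directories on disk. The function names `sha256_of` and `compare_trees` are made up for this example, and SHA-256 is just one reasonable choice of checksum. Dates are deliberately left out of the comparison, because many restore tools reset modification times.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(original: Path, restored: Path) -> list[str]:
    """List the problems found when comparing a restored tree with the original."""
    problems = []
    for source in sorted(p for p in original.rglob("*") if p.is_file()):
        relative = source.relative_to(original)
        copy = restored / relative
        if not copy.is_file():
            problems.append(f"missing: {relative}")
        elif source.stat().st_size != copy.stat().st_size:
            # A size mismatch is cheap to detect, so check it before hashing.
            problems.append(f"size differs: {relative}")
        elif sha256_of(source) != sha256_of(copy):
            problems.append(f"checksum differs: {relative}")
    return problems
```

An empty list means every file in the original was found in the restored copy with the same size and checksum. Files that exist only in the restored copy are not reported by this sketch.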
How you restore your data will largely depend on how the data was originally backed up. In some cases, it may simply be a matter of restoring a single uncompressed file, such as a database. In others, the file may have been compressed for back-up and must be decompressed before it can be used.
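For the compressed case, the following is a rough Python sketch, assuming the back-up is a ZIP archive (other formats need other tools). The function name `restore_zip` is hypothetical. It checks the archive for damage before extracting anything.

```python
import zipfile
from pathlib import Path


def restore_zip(archive: Path, destination: Path) -> list[str]:
    """Verify a ZIP back-up, then extract it into a new directory."""
    with zipfile.ZipFile(archive) as bundle:
        # testzip() reads every member and returns the first name
        # whose CRC or header check fails, or None if all are fine.
        damaged = bundle.testzip()
        if damaged is not None:
            raise ValueError(f"corrupted member in back-up: {damaged}")
        # exist_ok=False makes this fail rather than reuse an existing folder.
        destination.mkdir(parents=True, exist_ok=False)
        bundle.extractall(destination)
        return bundle.namelist()
```

A call such as `restore_zip(Path("backup.zip"), Path("restored"))` (both paths are placeholders) returns the list of member names that were extracted.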
In more complicated cases, where there are multiple files and the back-ups record incremental changes to only some of them, the restore software must work through every change made to each file since the start of the process, in order, and then rebuild and reconnect the files. Without a clear roadmap to manage this, it is easy to make mistakes.
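To make the idea of replaying incremental changes concrete, here is a simplified Python sketch. It assumes a hypothetical layout in which the full back-up is a directory and each incremental back-up is a directory holding only the files that changed. Real back-up software uses its own formats and also tracks deletions, which this sketch does not.

```python
import shutil
from pathlib import Path


def restore_incremental(full: Path, increments: list[Path], target: Path) -> None:
    """Rebuild the latest state: copy the full back-up, then replay each increment."""
    # copytree raises FileExistsError if target already exists.
    shutil.copytree(full, target)
    # Increments must be supplied oldest first, or old versions will win.
    for increment in increments:
        for changed in sorted(p for p in increment.rglob("*") if p.is_file()):
            destination = target / changed.relative_to(increment)
            destination.parent.mkdir(parents=True, exist_ok=True)
            # The version from the later increment replaces the earlier one.
            shutil.copy2(changed, destination)
```

The order of `increments` is the "roadmap" here: applying them out of sequence would silently leave older versions of some files in place.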
Be sure to approach any data restoration process patiently, and be ready to call for expert IT help if things do not go as planned.
If you are planning to implement some kind of compressed, incremental archive for your research data, it would be wise to test-drive the restoration process with dummy data and files to ensure it will work in the way you expect. Document each step as you go.
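A test drive of this kind might look like the Python sketch below. It assumes a gzip-compressed tar archive stands in for whatever archive format you actually plan to use. The function name `round_trip_ok`, the file names and the file contents are all invented dummy values.

```python
import filecmp
import tarfile
from pathlib import Path


def round_trip_ok(workdir: Path) -> bool:
    """Create dummy files, archive them, restore elsewhere, and compare contents."""
    source = workdir / "dummy_data"
    source.mkdir()
    (source / "notes.txt").write_text("placeholder notes\n")
    (source / "readings.csv").write_text("sample,value\nA,1\nB,2\n")

    # Step 1: back up the dummy data as a compressed archive.
    archive = workdir / "dummy_backup.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname="dummy_data")

    # Step 2: restore into a separate directory, not over the source.
    restored = workdir / "restored"
    restored.mkdir()
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(restored)

    # Step 3: compare file contents byte by byte (shallow=False).
    names = [p.name for p in source.iterdir()]
    match, mismatch, errors = filecmp.cmpfiles(
        source, restored / "dummy_data", names, shallow=False
    )
    return sorted(match) == sorted(names) and not mismatch and not errors
```

Running `round_trip_ok` inside a scratch directory and getting `True` shows only that this particular archive-and-restore path works for these dummy files. It is a starting point for documenting your own steps, not proof that a production system is sound.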
If you test-drive the process before any significant data is added, you are less likely to encounter problems when it is too late to correct them.
When you have to restore data from a back-up, be sure to restore to a new location or drive, so that you do not mistakenly overwrite new data with old.
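One way to build this safeguard into a script is sketched below in Python, assuming the back-up is a plain directory. The function `restore_to_new_location` and the `restore-<timestamp>` naming scheme are illustrative choices, not an established convention.

```python
import shutil
from datetime import datetime
from pathlib import Path


def restore_to_new_location(backup: Path, live: Path, restore_root: Path) -> Path:
    """Copy a back-up into a fresh directory, never into the live data."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = (restore_root / f"restore-{stamp}").resolve()
    live = live.resolve()
    # Refuse to restore into the live data directory or anywhere beneath it.
    if target == live or live in target.parents:
        raise ValueError("restore target overlaps the live data directory")
    # copytree raises FileExistsError if the target already exists.
    shutil.copytree(backup, target)
    return target
```

The restored files can then be inspected side by side with the live data before anything is copied across by hand.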
To ensure your files are safely backed up, implement a data validation process. This should check whether each file is actually present, is not corrupted, and is what you expect it to be.
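A validation process of this kind could be sketched in Python as follows, assuming a checksum manifest is written when the back-up is made and stored outside the data directory. The JSON manifest format and the function names `write_manifest` and `validate` are assumptions made for this example. Whole files are read into memory, which suits small files only.

```python
import hashlib
import json
from pathlib import Path


def write_manifest(data_dir: Path, manifest: Path) -> None:
    """Record a SHA-256 checksum for every file under data_dir."""
    entries = {
        p.relative_to(data_dir).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }
    manifest.write_text(json.dumps(entries, indent=2))


def validate(data_dir: Path, manifest: Path) -> dict[str, str]:
    """Return {relative path: status} for every file listed in the manifest."""
    expected = json.loads(manifest.read_text())
    report = {}
    for name, checksum in expected.items():
        candidate = data_dir / name
        if not candidate.is_file():
            report[name] = "missing"  # the file is not present
        elif hashlib.sha256(candidate.read_bytes()).hexdigest() != checksum:
            report[name] = "corrupted or changed"  # not what was recorded
        else:
            report[name] = "ok"
    return report
```

Running `validate` against a back-up copy, using the manifest written from the primary copy, reports each listed file as `ok`, `missing`, or `corrupted or changed`.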
Be sure to designate a primary copy of the data before you create any back-ups. If you have no means of identifying what your primary copy is, how will you know whether or not you have restored the right data?