Research data management: Data formats

Research data management covers planning, collecting, organising, managing, storing, securing, backing up, preserving and sharing your data.

Why your choice of data format is important

The choice of format for your data will determine how that data may be used, analysed, backed up, stored and potentially reused in the future.

Researchers should consider saving their research data in standard, interchangeable and long-lasting formats, so that the data remains usable both during a long-term project and in the future. Standard formats should also be considered for back-ups of data.

The best way to ensure long-term usability of, and access to, your data is to use standard formats that most software can easily interpret, or that can easily be migrated to a new format at a later date.

While the choice of format will depend to some extent on discipline-specific standards, the software used and the method of analysis, it is generally recommended that researchers use non-proprietary software and formats based on open standards where possible. For example:

  • Open Document Format (ODF)
  • Tab-delimited format
  • Comma-separated values
  • XML
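As a minimal sketch of what saving to these formats can look like, the Python example below writes the same small table as comma-separated and as tab-delimited text using only the standard library, then reads it back to confirm nothing was lost. The sample records, field names and helper function names are hypothetical and serve only as illustration.

```python
import csv
import io

# Hypothetical sample records, for illustration only.
FIELDS = ["sample_id", "site", "reading"]
ROWS = [
    {"sample_id": "S1", "site": "North, upper", "reading": "4.2"},
    {"sample_id": "S2", "site": "South", "reading": "3.9"},
]


def to_delimited(rows, fields, delimiter):
    """Serialise rows as delimited text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer, fieldnames=fields, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_delimited(text, delimiter):
    """Parse delimited text back into a list of dictionaries."""
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    return [dict(row) for row in reader]


csv_text = to_delimited(ROWS, FIELDS, ",")    # comma-separated values
tsv_text = to_delimited(ROWS, FIELDS, "\t")   # tab-delimited format

# Round-trip check: both plain-text formats reproduce the original records.
csv_round_trip = from_delimited(csv_text, ",")
tsv_round_trip = from_delimited(tsv_text, "\t")
```

Both outputs are plain text that can be opened in any text editor or spreadsheet program. Note that the value containing a comma is quoted automatically in the comma-separated output, which is one reason to use a CSV library rather than joining strings by hand.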

The UK Data Archive has a comprehensive list of recommended file formats. It also provides advice on the suitability of formats for file conversion.

Choosing appropriate formats will also minimise loss of data through compression or software obsolescence.

Researchers should also consider:

  • the longevity of any hardware used to create, analyse or visualise research data
  • archiving the software and its documentation as well as the data, if your work needs specific software to be understood
  • saving files in their original state: uncompressed and unencrypted

Considerations when choosing formats

There are a number of things to consider when selecting a format for your data. These should be addressed in your data management plan.

Data issues to consider include:

  • could the hardware, software or digital storage media fail or become obsolete within the project's lifetime?
  • what impact would such a failure have on the project?
  • what level of technical support is available long term?
  • can existing data be migrated easily to new hardware and software platforms?

Be sure to list in your data management plan:

  • what data formats will be used
  • what special software will be used, if any
  • what the expected data volume will be
  • what the data access requirements are
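The four items above can also be kept as a small structured record alongside the written plan. The Python sketch below is one possible way to do this and is not a required or standard layout: the class name, field names, helper function and example values are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FormatPlan:
    """Hypothetical record of the format-related items in a data management plan."""

    data_formats: list          # what data formats will be used
    special_software: list      # what special software will be used (may be empty)
    expected_volume_gb: float   # what the expected data volume will be
    access_requirements: str    # what the data access requirements are


def missing_items(plan):
    """Return the names of required items that have not been filled in."""
    problems = []
    if not plan.data_formats:
        problems.append("data_formats")
    if plan.expected_volume_gb <= 0:
        problems.append("expected_volume_gb")
    if not plan.access_requirements.strip():
        problems.append("access_requirements")
    return problems


# Illustrative values only.
plan = FormatPlan(
    data_formats=["CSV", "XML"],
    special_software=[],
    expected_volume_gb=12.5,
    access_requirements="Project team only until publication",
)

plan_gaps = missing_items(plan)
plan_json = json.dumps(asdict(plan), indent=2)  # plain-text copy to store with the data
```

The software list is allowed to be empty because special software is only listed "if any"; the other three items are treated as required.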

More resources

The hard truth about data formats

CC-BY-NC-SA by Cyberslayer via flickr