Research data management: Sensitive Data

Research data management covers the planning, collecting, organising, managing, storing, securing, backing up, preserving, and sharing of your data.

Data Management Planning and Sensitive Data

When working with sensitive data, consider at the earliest stages of your Data Management Plan how you will manage that data and what potential it has for future re-use or sharing. Key things to consider include:

  • Type of data
  • Consent and ethics approvals
  • Legal requirements
  • Processes for confidentialising and de-identifying the data
  • Storage and security issues
  • Conditions that may be placed around access
  • Licensing

The ANDS Guide to Sharing and Publishing Sensitive Data has more information on each of these issues. You can also contact your Client Services Librarian.

What is Sensitive Data?

ANDS defines sensitive data as data "that can be used to identify an individual, species, object, process, or location that introduces a risk of discrimination, harm, or unwanted attention. Under law and the research ethics governance of most institutions, sensitive data cannot typically be shared in this form, with few exceptions."

It can include, but is not restricted to: 

  • Human health and personal data, including information about secret or sacred practices; or
  • Ecological data that may place vulnerable species at risk.
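For location data such as sightings of vulnerable species, one commonly used way to reduce risk is to publish coordinates at a coarser resolution than they were recorded. The following is a minimal sketch of that idea, not a method prescribed by ANDS or this guide. The function names and the default 0.1 degree cell size are assumptions chosen for illustration; an appropriate resolution depends on the species and the relevant ethics and legal advice.

```python
import math


def generalise_coordinate(value: float, cell_size: float = 0.1) -> float:
    """Snap a latitude or longitude to the lower edge of its grid cell.

    cell_size is in decimal degrees (illustrative default, an assumption).
    """
    snapped = math.floor(value / cell_size) * cell_size
    # Round to remove floating-point noise introduced by the multiplication.
    return round(snapped, 6)


def generalise_location(lat: float, lon: float, cell_size: float = 0.1):
    """Return a (lat, lon) pair coarsened to the given grid cell size."""
    return (
        generalise_coordinate(lat, cell_size),
        generalise_coordinate(lon, cell_size),
    )


if __name__ == "__main__":
    # A precise point is replaced by the corner of the cell that contains it.
    print(generalise_location(-27.4975, 153.0137))
```

Every record inside the same grid cell is published with the same coordinates, so the exact site cannot be read directly from the shared dataset.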

Sharing Sensitive Data

The University of Queensland, funding agencies and others actively encourage the sharing of datasets, including sensitive datasets, provided they are managed according to best practice. Sharing and re-using these datasets brings the same benefits as for other types of research data, along with improved efficiency, conservation, and reduced participant fatigue and disturbance. Sharing your data also means others can discover and cite your work.

ANDS points out that "Research data - even sensitive and confidential data - can be shared ethically and legally if researchers pay attention, from the beginning of research, to four important aspects":

  • including provision for data sharing when gaining informed consent
  • protecting people's identities by anonymising data where needed
  • considering controlling access to data
  • applying an appropriate licence

Many sensitive datasets can likely be shared ethically, since much of the data obtained from participants raises no confidentiality concerns.

Remember, too, that you can publish only a description of your data, to aid discoverability, without making the data itself publicly available. Alternatively, you may wish to put access conditions or assessment processes in place before sharing the data.
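As a rough illustration of publishing a description without the data, the sketch below builds a metadata-only record that states how access is handled. The class name, field names and the three access levels ("open", "mediated", "closed") are hypothetical and are not taken from any particular repository schema; a real repository will define its own fields.

```python
import json
from dataclasses import asdict, dataclass, field

# Hypothetical access levels, for illustration only.
ACCESS_LEVELS = {"open", "mediated", "closed"}


@dataclass
class DatasetDescription:
    """A discoverable description of a dataset; it holds no research data."""

    title: str
    description: str
    access: str
    access_conditions: list = field(default_factory=list)
    contact: str = ""

    def __post_init__(self):
        # Reject access levels outside the illustrative set above.
        if self.access not in ACCESS_LEVELS:
            raise ValueError(f"unknown access level: {self.access}")


def to_public_record(description: DatasetDescription) -> str:
    """Serialise the description as JSON suitable for a public catalogue."""
    return json.dumps(asdict(description), indent=2)


if __name__ == "__main__":
    record = DatasetDescription(
        title="Example interview study",
        description="De-identified interview transcripts.",
        access="mediated",
        access_conditions=["Ethics approval required", "Signed data agreement"],
        contact="data-custodian@example.org",
    )
    print(to_public_record(record))
```

The record can be indexed and cited while requests for the data itself go through whatever assessment process the conditions describe.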

Appropriate steps taken at the data management planning stage, such as gaining consent and putting anonymisation in place, can address many of the issues.

De-Identification Software

The following are examples of software that might be useful for de-identifying some types of data.

Research Data

Clinical Data

Anonymisation

Anonymisation is the "process taken to turn data into a form which does not identify individuals, and where identification is not likely to take place".  The need for anonymisation arises where research data is to be shared or re-used.

Anonymisation may be needed for ethical reasons, to protect individuals' identities, for legal reasons, or for commercial reasons. Personal data should not be disclosed from research information unless a participant has given specific consent to do so.

ANDS provides tips and techniques for de-identifying data, and the UK Data Service provides guidelines and methods for de-identifying quantitative and qualitative data.  Links for more resources can be found above.
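To make the idea of de-identification more concrete, here is a minimal sketch for a single tabular record, using only the Python standard library. It is an illustration under stated assumptions, not the procedure recommended by ANDS or the UK Data Service: the field names (name, email, phone, participant_id, age) and the ten-year age bands are hypothetical. It removes direct identifiers, replaces the participant ID with a keyed hash so records can still be linked, and generalises exact age into a band.

```python
import hashlib
import hmac

# Hypothetical direct identifiers, chosen for illustration only.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymise_id(participant_id: str, secret_key: bytes) -> str:
    """Replace an ID with a keyed hash (HMAC-SHA256), truncated for readability."""
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def age_band(age: int, width: int = 10) -> str:
    """Generalise an exact age into a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with identifiers removed or generalised."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["participant_id"] = pseudonymise_id(
        str(record["participant_id"]), secret_key
    )
    cleaned["age_band"] = age_band(cleaned.pop("age"))
    return cleaned


if __name__ == "__main__":
    raw = {
        "participant_id": "P001",
        "name": "Jane Citizen",
        "email": "jane@example.org",
        "phone": "0400 000 000",
        "age": 37,
        "response": "agree",
    }
    # The key must be stored securely and separately from the shared data.
    print(deidentify(raw, secret_key=b"replace-with-a-secret-key"))
```

Steps like these reduce the risk of identification but do not guarantee anonymity on their own, because combinations of the remaining fields can still single out an individual. The result should be assessed against the guidance referred to above before any data is shared.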