Data Sharing Policy

1. Purpose 

The MHP is committed to open science and strives to make data available to safe researchers in a timely and responsible manner. This document covers the general principles for sharing MHP data, including computer code, with other MHP members, with the academic community and with others (e.g. those seeking to use data for purposes that involve profit, such as drug development). The guidance applies to core-funded MHP Hub and associate members. It is based upon other published UK and international guidance, particularly that published by the UK Dementia Research Institute (UK DRI).

 

2. Background 

The MHP is dedicated to advancing mental health research that will improve the health outcomes of people with mental illness. MHP researchers are generating datasets that are important both to the primary researchers and to the wider scientific community, which may wish to reproduce findings, conduct meta-analyses or carry out novel studies. We are committed to ensuring MHP data can be put to best use by the scientific community for public benefit, where legally and ethically appropriate. The basis of effective data sharing and reuse is outlined in the FAIR Guiding Principles for scientific data management and stewardship, which stress the Findability, Accessibility, Interoperability and Reusability of data (FAIR data). Briefly, data should be easily findable by both humans and computers (enabling automatic discovery). Data should be accessible and suitable for use with processing, storage, and analytic software. Data should be well described to allow replication and/or combination in different settings to maximise reuse.

 

3. Policy Statements 

The need to safeguard intellectual property (IP) and commercialisation opportunities should not unduly delay or prevent data sharing. We expect data to be shared with as few restrictions/barriers as possible in a timely and responsible manner. 

 

  • All final research data and computer code (referred to as ‘data’) generated by UK MHP members are covered by this guidance and should be shared so that they are findable, accessible, interoperable (harmonised) and reusable (FAIR).
  • Researchers should follow regulatory requirements relating to information governance and the ethical use of data. 
  • Research data generated by investigators at a specific UKRI-funded Hub is available to those investigators in the first instance, but that should not prevent data sharing on the DATAMIND Trusted Research Environment (TRE) for the purposes of secure storage and archiving, with a view to its later re-use. 
  • Research metadata, detailing the study PI, Hub identity, number of participants assessed, participant characteristics, and the types of samples and data collected, should be provided to DATAMIND for publication on the HDR UK Gateway. This should contain no sensitive data that cannot be shared publicly. 
  • Managed access to final research data on DATAMIND’s TRE should be provided no later than the publication of the main findings or the submission of the main findings to a preprint repository, whichever is earlier. 
  • Computer code should be released upon submission of the manuscript to a preprint repository or upon publication, whichever is earlier. DATAMIND will provide a GitHub organisation and Mental Health Platform Team/Project for this purpose. Users will need to be registered with GitHub and will need to share their usernames with the DATAMIND team. 
  • Data should be annotated and documented with metadata to avoid misinterpretation. Established standards for (meta)data collection, management, ontology, and formats should be adopted wherever possible. 
  • Data and code should be made available under an appropriate licence (e.g. CC BY 4.0) that allows reuse with minimal restrictions, and should be deposited, wherever possible, in established public repositories.
  • Researchers who generate, preserve, and share data should receive recognition by secondary users, their academic institutions, and funders. 
  • Users are expected to cite the source, preserve data confidentiality, and observe ethical and legal requirements.
  • The parent institution of the researcher or Hub that generates the data remains the data controller. In cross-Hub projects where researchers process the same data with a common purpose, the parent institutions will be joint data controllers. The UK MHP coordinating centre and DATAMIND will support MHP researchers to share their data. 
  • UK MHP researchers should include data management plans (DMPs) as an explicit part of their research projects. We expect DMPs to be “living documents” that may change throughout the lifecycle of a project.
  • By the time a project ends, there should be a clear plan for how its data will be managed locally (or remotely on DATAMIND) that is consistent with the host institution’s data management policy.
  • Data generated across the MHP through the Collaboration and Innovation or Capacity Building Award should be made available on DATAMIND as soon as possible after generation. Decisions on who will be given access to these data will be taken by the Leadership Team, which will include at least one investigator from each Hub.
  • Disputes around the storage or access of data from the MHP will be resolved through discussion in the Leadership Team who will help to support investigators in finding an acceptable solution. 
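
As an illustration of the study-level metadata described in the policy statements above (study PI, Hub identity, number of participants, participant characteristics, and types of samples and data collected), the following is a minimal sketch of how such a record might be assembled and serialised. The class name, field names and example values are hypothetical and are not a DATAMIND or HDR UK Gateway schema; researchers should confirm the required format with the DATAMIND team.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class StudyMetadata:
    """Hypothetical study-level metadata record (illustrative field names only)."""

    study_pi: str
    hub: str
    n_participants: int
    participant_characteristics: dict
    sample_types: list = field(default_factory=list)
    data_types: list = field(default_factory=list)

    def validate(self) -> None:
        # Basic sanity checks before the record is shared publicly.
        if not self.study_pi or not self.hub:
            raise ValueError("study_pi and hub are required")
        if self.n_participants < 0:
            raise ValueError("n_participants must be non-negative")

    def to_json(self) -> str:
        # Serialise only aggregate, non-sensitive descriptors.
        self.validate()
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Example values are invented for illustration.
record = StudyMetadata(
    study_pi="Dr A. Example",
    hub="Example Hub",
    n_participants=250,
    participant_characteristics={"age_range": "18-65", "percent_female": 52},
    sample_types=["blood"],
    data_types=["questionnaire", "MRI"],
)
metadata_json = record.to_json()
```

The record holds only aggregate descriptors, in keeping with the requirement that published metadata contain no sensitive data.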

 

4. Practical considerations

A dataset might include raw data and derived variables, which should be described in the documentation accompanying the dataset. Summary data refers to the results of analyses conducted on the raw/derived data and is often presented in submitted manuscripts. 

 

5. Ethical and governance considerations

It is important to be aware of relevant legislation covering data use, including the Data Protection Act 2018, the UK’s implementation of the General Data Protection Regulation (GDPR). Anonymising or pseudonymising data better protects participants’ privacy. Data that is effectively anonymised is not subject to the same restrictions as personal data, whereas pseudonymised data remains personal data under the UK GDPR. 

For research with human participants, researchers must have participants’ consent to collect, use, store and share their personal data. Consent procedures should include provision for data sharing in a way that maximises the value of the data for wider research use, while providing adequate safeguards for participants. Every effort must be made to protect the identity of participants, including through appropriate anonymisation procedures and managed access processes. Prior to sharing, data should be anonymised, and any indirect identifiers that may lead to deductive disclosures should be removed to reduce the risk of identification. All appropriate ethical, legal, and institutional regulatory permissions must be in place before the data can be shared. 
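
The removal of direct identifiers and the coarsening of indirect identifiers described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a complete anonymisation procedure: the column names, the ten-year age bands and the use of the outward postcode are all hypothetical choices, and any real dataset needs a formal disclosure-risk assessment before sharing.

```python
# Hypothetical column names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "nhs_number", "email"}


def age_band(age: int, width: int = 10) -> str:
    """Replace an exact age with a band, e.g. 37 becomes '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def deidentify(row: dict) -> dict:
    """Drop direct identifiers and coarsen indirect ones in a single record."""
    out = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in out:
        out["age_band"] = age_band(out.pop("age"))
    if "postcode" in out:
        # Assumes a space-separated UK postcode; keeps only the outward code.
        out["postcode_area"] = out.pop("postcode").split()[0]
    return out


# Invented example record.
raw = {"name": "Jane Doe", "age": 37, "postcode": "EH8 9YL", "score": 12}
shared = deidentify(raw)
```

Coarsening reduces, but does not eliminate, the risk of deductive disclosure, which is why managed access is used alongside it.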

 

6. Data deposition

All research data generated from the MHP should be deposited on DATAMIND’s Trusted Research Environment. This ensures a safe off-site back-up of all datasets collected in the MHP and ensures that they can be used by researchers within and outside of the MHP in future. Deposition in the DATAMIND TRE does not mean that the data is immediately available to others. Data may be deposited with DATAMIND as Open Access (e.g. anonymised results from individual studies) or as Managed Access. Managed Access covers data that is provided for secondary uses contingent upon approval by the data owner, who is usually the Principal Investigator or a Co-Investigator of the study that generated the data. Managed Access decisions may also, in future, be delegated to a third party, such as the DATAMIND Data Controller. 

 

7. Summary data and analytic computer code

All results (participant-unidentifiable/anonymous summary data) from MHP studies should be made publicly available on acceptance of the primary research paper, or after the end of the current funding period (whichever is earlier). Computer code should be submitted to a code repository upon submission of the manuscript to a preprint repository or upon publication, whichever is earlier. We encourage researchers to commit and back up analytic code regularly to a code repository such as GitHub, to prevent loss.

Open access data includes data that can be released into the public domain without restriction. It must be impossible to re-identify the research participants. Researchers are strongly encouraged to opt for the most open licence to allow the widest possible scope for reuse and redistribution. The Creative Commons Attribution (CC BY) licence should be used in open access publications. For more information on other conformant open licences for data and content, please consult the Open Definition lists.

 

8. Sharing of raw data and derived variables

Raw data and derived variables can only be released under the restrictions on use set out in the ethical approvals, and their release should be consistent with the Patient Information Sheet and Consent forms used in the study. This type of data should be deposited in a controlled, secure repository with strict protocols governing how information is managed, stored, and distributed. This active management of individual-level participant data is referred to as ‘Managed Access’. MHP datasets should be deposited in the Trusted Research Environment of the Mental Health Data Hub, DATAMIND. Researchers should opt for managed access in most cases, as it is typically required for raw datasets that contain data on individual participants. 

 

9. Making data available for secondary use

Data should ideally be made available under licence to safe secondary users so that its maximum public benefit can be realised. Examples of secondary data use include:

  • Combining multiple samples together so that the effects of a particular risk factor can be more accurately measured (meta-analysis).
  • Attempting to test whether the results of one study can be repeated using data from a second, independent one (replication).
  • Allowing academic, health care or company researchers secure remote access to data stored on another computer, so that they can ask and answer their own research questions.

Procedures for accessing data subject to the necessary legal, ethical and governance requirements are well-established in DATAMIND. Advice should be sought from the DATAMIND Team in advance of deposition and preferably before ethical permission has been applied for. 

MHP Hubs and other award holders must capture their data-sharing activities (including unique identifiers and where the data is shared) on ResearchFish. Researchers will be required to enter this information into their own project’s ResearchFish entry and will also need to alert the Coordinating Centre staff to its presence, so that it can be reported MHP-wide, where appropriate.

 

10. Data access policy

Access to MHP data generated from a single Hub (‘Hub-Award funded’) will be decided by application to the MHP through a mechanism to be determined in the coming months. It is assumed that decisions around data sharing from a single Hub will be led by their PI, supported by the Leadership Team. We anticipate that applications seeking to test a hypothesis set out in the original Hub application, or developed subsequently, will either be refused or deferred until the Hub has been able to address that question or the end of the MHP funding period (whichever is earlier). 

Applications to access data collected across two or more Hubs, typically funded by the MHP Collaboration and Innovation Award, will be considered upon application to the MHP Leadership Team, who will act as a Data Access Committee. We will proactively discuss applications with lived experience members of the MHP, at least one of whom will be a member of the Data Access Committee. 

Data deposition and access flow chart

Figure 1: Overview of data deposition and access procedures for the MHP