The sharing of patient-level data between research groups and pharmaceutical companies has the potential to generate enormous benefits for health. However, in an article published in the BMJ, Elizabeth Pisani and colleagues argue that data sharing per se is not enough to guarantee these benefits; time and effort must also be invested in ensuring that the data are organised effectively.
Pisani et al. distinguish between shared data that are accessible, data that are useable, and data that are useful. Data deposited in an open access repository, such as Figshare, can be accessed by anyone with an internet connection. But because the data can be uploaded in any format and with only minimal accompanying information, they are not necessarily useable. To be useable, shared data should additionally be discoverable and accompanied by well-documented metadata. The pharmaceutical industry has committed to depositing clinical trial data in useable forms in repositories such as Clinical Study Data Request.
To be truly useful, shared data – which will have been collected by different groups at different times using different protocols and endpoints – must also be standardised and quality controlled. Examples of curated repositories include the Infectious Diseases Data Observatory and Yale University Open Data Access. This additional processing requires a substantial upfront investment of time and money, but transforms the shared data into a valuable resource. As Pisani et al. conclude, ‘funders must commit to long term investments in both technical and human infrastructure if they want to promote data sharing that is useful, used, and likely to change policies for the greater benefit of patients.’