Recent rapid expansions of health and biological data, combined with the availability of sophisticated computational technologies, offer unprecedented opportunities to benefit public health. It is timely that the European regulatory network has, with this workshop, sought to define the landscape of the big data field in order to clarify the possibilities and challenges and identify how medicines regulators can exploit big data to support medicines development and regulatory decision-making.
In the medicines field, expanding data sets from, for example, electronic health records and mobile devices, and new insights gleaned from analysis of entire organisms and biological systems, offer tantalising potential to benefit patients and consumers. While collection and use of data come with the responsibility to ensure security and protect patient privacy, patient groups expressed their willingness to share data in the knowledge that it advances research and ultimately helps other patients.
Massive and continually increasing amounts of healthcare data are being produced in all dimensions: data from the population level down to the individual, health and disease data, diverse data types from traditional health measurements to ‘omics and social media, and information on patients’ experiences through time. Unfortunately, most of these data are unstructured and stored in poorly curated repositories or in silos, and thus inaccessible.
To maximise the opportunities for public health, data must be made findable, accessible, interoperable and reusable (FAIR). Integration of data from various sources for subsequent analysis is a further challenge. However, this may be overcome using approaches such as common data models: the Observational Medical Outcomes Partnership (OMOP) and the Sentinel Common Data Models were presented during the workshop.
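To illustrate the common data model idea in the simplest terms, the sketch below maps one hypothetical source record into a simplified OMOP-style row. The field names and concept identifier are loose, invented approximations for illustration, not the actual OMOP CDM specification.

```python
# Illustrative sketch only: transforms a hypothetical source EHR record
# into a simplified OMOP-style "drug_exposure" row. Field names and the
# concept id are invented stand-ins, not the real OMOP CDM.

from datetime import date

# Hypothetical raw record as exported from one site's EHR
source_record = {
    "patient": "P-0042",
    "med_name": "amoxicillin",
    "start": "2016-03-01",
}

# A local lookup standing in for OMOP's standardised vocabularies
LOCAL_CONCEPTS = {"amoxicillin": 1713332}  # concept id is illustrative

def to_drug_exposure(rec):
    """Map one site-specific record onto the shared, common-model shape."""
    return {
        "person_id": rec["patient"],
        "drug_concept_id": LOCAL_CONCEPTS[rec["med_name"]],
        "drug_exposure_start_date": date.fromisoformat(rec["start"]),
    }

row = to_drug_exposure(source_record)
print(row["drug_concept_id"])
```

The point of the transformation is that once every participating site emits rows of this shared shape, one analysis script can run unchanged across all of them.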
Once data are available and in a structured format, computational technologies for analysis, such as machine learning and data mining, already exist. Many insights from big data analysis were presented during the workshop, including examples in target discovery, drug-drug interactions, image analysis, mapping vaccine uptake, patterns of medicine use and prediction of disease. Although storage and analysis of huge data sets are computationally intensive, use of the cloud instead of on-site systems can help provide the necessary capacity.
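As a toy sketch of one data-mining idea mentioned above, the example below scans per-patient prescription lists for frequently co-occurring medicine pairs, a first step towards flagging candidate drug-drug interaction signals. The data and the threshold are invented for illustration.

```python
# Toy data-mining sketch: count co-occurring medicine pairs across
# patient prescription records. Pairs seen repeatedly become candidate
# signals for closer review. All data here are invented.

from collections import Counter
from itertools import combinations

# Hypothetical per-patient prescription sets
prescriptions = [
    {"warfarin", "aspirin", "statin"},
    {"warfarin", "aspirin"},
    {"statin", "metformin"},
]

pair_counts = Counter()
for drugs in prescriptions:
    # Sort so each unordered pair is counted under one canonical key
    for pair in combinations(sorted(drugs), 2):
        pair_counts[pair] += 1

# Pairs appearing in more than one record become candidate signals
candidates = [pair for pair, n in pair_counts.items() if n > 1]
print(candidates)  # [('aspirin', 'warfarin')]
```

A real signal-detection pipeline would of course use far larger data and statistical disproportionality measures, but the counting step is the same in spirit.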
However, big data is noisy, and there may be missing data and known or unknown biases. It can be difficult to create trust in results and to establish causality and rule out coincidence, factors that are essential for evidence in medicines regulation. Approaches to overcome these challenges discussed during the workshop include analysing lower quality data to create hypotheses, followed by corroboration with careful retrospective analysis of higher quality, more meticulously selected data and laboratory experiments; performing randomised trials with real world data; and using data sources to supplement prospective randomised clinical trials.
In addition, to avoid bad analysis of observational data and increase robustness of results, the health and research community should come together to agree best practices, develop open-source analytical tools and establish quality standards for observational data analogous to those currently in place for clinical trials.
Rather than the development of proprietary tools and restriction of access to data, an ‘open science’ approach to big data, incorporating public-private collaborations, was favoured by workshop participants. This would create networks of people implementing new technologies, approaches and standards and promote a collaborative environment driving sharing of knowledge.
The impact of both big data and real world data (data collected outside randomised clinical trials, usually during normal clinical care) is significant across the medicines life cycle. There are many existing and potential applications: in drug discovery, promising compounds for screening can be selected in silico; in clinical trials, epidemiological information can inform trial designs in terms of defining outcome measures and site selection for optimal recruitment; in the post-marketing stage, real world data can be used to continually assess the benefit-risk balance in the wider population. In the early post-authorisation phase, determining the added benefit of a medicine compared to standard of care, the remit of health technology assessment (HTA) bodies, and later relative effectiveness, could be a key application of real world data.
Privacy, security and transparency were overarching themes of the workshop. Frank and transparent exchange with patients and the public about how their data are used, how their privacy is protected and the potential benefits of data sharing must underpin projects to ensure success. The free flow of data, while protecting individuals' privacy, will be facilitated by the General Data Protection Regulation, coming into force in Europe in May 2018. This regulation focuses on the accountability of data controllers, making them responsible for compliance with data protection principles.
Workshop participants emphasised that patients and the public appreciate the value of sharing their data and that patients and patient groups play a pivotal role, not only in consenting to data use, but also in proactively driving data capture and providing informative patient perspectives.
EMA will continue to engage with stakeholders and will develop the skills and regulatory processes needed to ensure big data is harnessed to facilitate robust medicines assessments and complement clinical trial data. EMA is committed to an open and forward-looking vision, aligning the regulators with advances in technology for the protection of public health.
The report was developed as a result of EMA's workshop on November 14-15, 2016. Videos and presentations from the workshop are available on EMA's website.