WP3 has been assigned the task of creating a common data model to support the homogenization of the data and the creation of relationships among the datasets. This will facilitate comparisons between treatments, patient histories, and clinical and demographic data.
- Exploit the common data model's ability to combine heterogeneous data structures from each source (cooperative groups, hospitals, academic partners, and so on) into groupings/data types such as demographics, clinical information, epidemiology, molecular data, etc.;
- Support future data collection from existing data sources;
- Standardize the format and content of the observational data;
- Create models that will facilitate the establishment of relationships between the data;
- Create a definition for data-use requirements that will be aligned with HARMONY’s needs;
- Harmonize, through the application of algorithms and rule engines, the processing of data to create an organized system for the categorization of findings.
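As a rough illustration of the rule-based harmonization described above, the following minimal Python sketch maps records from two differently structured sources into one common record type. All names here (`CommonRecord`, `SOURCE_RULES`, the source labels and field names) are hypothetical and chosen only for illustration; they do not describe HARMONY's actual data model or rule engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CommonRecord:
    """Hypothetical common data model entry (illustrative fields only)."""
    patient_id: str
    sex: Optional[str]
    diagnosis: Optional[str]


def normalize_sex(value) -> Optional[str]:
    """Map source-specific encodings of sex onto one vocabulary."""
    mapping = {"m": "male", "male": "male", "f": "female", "female": "female"}
    return mapping.get(str(value).strip().lower())


# One mapping rule per source; each rule turns a raw record (a dict)
# into the keyword arguments of CommonRecord. Source names and raw
# field names are assumptions made for this sketch.
SOURCE_RULES: Dict[str, Callable[[dict], dict]] = {
    "hospital_a": lambda r: {
        "patient_id": str(r["id"]),
        "sex": normalize_sex(r.get("gender")),
        "diagnosis": r.get("dx"),
    },
    "registry_b": lambda r: {
        "patient_id": str(r["PatientNo"]),
        "sex": normalize_sex(r.get("Sex")),
        "diagnosis": r.get("Disease"),
    },
}


def harmonize(source: str, raw: dict) -> CommonRecord:
    """Apply the rule registered for `source` to one raw record."""
    return CommonRecord(**SOURCE_RULES[source](raw))
```

With this sketch, `harmonize("hospital_a", {"id": "A1", "gender": "F", "dx": "AML"})` and `harmonize("registry_b", {"PatientNo": "A1", "Sex": "female", "Disease": "AML"})` yield equal `CommonRecord` values, which is what makes records from different sources directly comparable.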
Bayer, Celgene, EBMT, ELN, EORTC, GMV, HULAFE, IBSAL, Janssen, LeukaNET, MediUni Wien, Menarini, Novartis, Takeda, Ulm University, UNIBO, University of York.
First Year Achievements
Considerable progress has been made in establishing the Big Data platform and the methodology for analyzing data:
- HARMONY's Big Data platform has been established;
- A common data model that adheres to the FAIR (findable, accessible, interoperable, and reusable) data-sharing principles has been created;
- HARMONY has begun developing and testing models based on available data on AML (TCGA public data, UNIBO internal data and Sanger Institute data);
- HARMONY has started analyzing the description of datasets.
In 2018, WP3 will expand the recently established HARMONY Big Data platform, with an initial focus on AML, followed by MDS and CLL structures and data.
- Exploitation of the ‘Data Access’ with the onboarding of new data sets in AML and other selected indications;
- Implementation of the data monitoring plan;
- Further expansion of the common data model to the new indications;
- Collection of data sets supporting the first set of research questions across the different HMs (AML, CLL, MDS, MM);
- Evaluation of the proposed workflow for the execution of the research questions during the pilot study;
- Generation of quality reports for each source to support the evaluation of clinical quality by the Data Quality Supervision Committee (DQSC).
Work Package Leadership