eMATCH: electronic Muse dATa Center for biomedical researcH

Extraction of patient data and interoperability for basic and clinical research; validation using a multi-pathology/multi-institute cohort.

Project description and objectives

Clinical and laboratory data and biomarker analysis are essential for confirming diagnosis and defining pathology sub-types, for establishing links with disease progression or response to treatment, and more. These topics are at the heart of many academic hospital research projects across different MUSE fields, from basic research to clinical investigation.

IT tools and solutions are critically needed to extract patient data in a consistent format that allows optimal exploitation, while also taking into account the information strategy of each healthcare institution. Because it produces interoperable data, automated extraction also facilitates data exchange at the local, national, and international levels.

The initial objective of this project is to develop and implement a solution to automatically link clinical data from the Electronic Health Record (EHR) to biobank data within each institute. Then, the data from each institute can be sent to a common intermediary data repository in an authorized health data hosting system.

To start, only structured data coded in a standardized manner will be extracted from the EHR, in particular ICD-10 diagnosis codes, laboratory test results, and pharmacological treatments coded according to the Anatomical Therapeutic Chemical (ATC) classification.
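The linkage step described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the record layout, the field names (`patient_id`, `sample_id`, `icd10_codes`, `atc_codes`, `lab_results`), and the function `link_samples_to_records` are all hypothetical, and it assumes that the biobank and the EHR of an institute share a common patient identifier.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Sample:
    """A biobank sample (hypothetical layout)."""
    sample_id: str
    patient_id: str


@dataclass
class ClinicalRecord:
    """Structured EHR data for one patient (hypothetical layout)."""
    patient_id: str
    icd10_codes: list[str] = field(default_factory=list)        # diagnoses
    atc_codes: list[str] = field(default_factory=list)          # treatments
    lab_results: dict[str, float] = field(default_factory=dict)  # test -> value


def link_samples_to_records(samples, records):
    """Return (sample, record) pairs for samples whose patient has EHR data.

    Assumes both sources use the same patient identifier.
    """
    by_patient = {record.patient_id: record for record in records}
    pairs = []
    for sample in samples:
        record = by_patient.get(sample.patient_id)
        if record is not None:
            pairs.append((sample, record))
    return pairs


# Illustrative data only: E11 is the ICD-10 code for type 2 diabetes
# mellitus and A10BA02 is the ATC code for metformin.
samples = [Sample("S1", "P1"), Sample("S2", "P2"), Sample("S3", "P3")]
records = [
    ClinicalRecord("P1", ["E11"], ["A10BA02"], {"HbA1c": 7.2}),
    ClinicalRecord("P2", ["C50"]),
]

pairs = link_samples_to_records(samples, records)
print(len(pairs))  # number of patient/biological sample pairs with data
```

Counting the returned pairs corresponds to the kind of patient/biological sample pair count used as a performance indicator. Samples without a matching clinical record (here `S3`) are simply left out of the result.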

All operations will be carried out in full compliance with the French Data Protection Act (Loi Informatique et Libertés) and with European rules and regulations on the protection of personal data, in order to safeguard data confidentiality.

The project concerns the three MUSE healthcare institutes (Montpellier and Nîmes University Hospital Centers, and Montpellier Cancer Institute – ICM), each of which has a different health information system.

Performance indicators


  • Number of patient/biological sample pairs with data extracted
  • Number of institutes allowing data extraction
  • Establishment of the KIM data repository (EDKIM)
  • Number of patient/biological sample pairs with data transferred to EDKIM

Two-year objective:

  • 500 patient/biological sample pairs integrated in EDKIM, distributed among the five pathology domains that correspond to the FHU-COEN network (cancer, infectious diseases, experimental and regenerative medicine, neurology, and metabolic diseases)
  • Two of the three healthcare institutes contributing data to EDKIM