A major challenge in EPOS is the integration of multi-disciplinary, multi-organisational, distributed resources and community assets into a single overarching Research Infrastructure - the EPOS Integrated Core Services (ICS). ICS aggregate and harmonise descriptions of datasets, data products, software and services from different domain-specific services - the Thematic Core Services (TCS). TCS adopt heterogeneous formats, vocabularies, protocols and standards to represent and make their resources available.
Two main ingredients are used to achieve interoperability:
- metadata, which describe the services and assets provided by the Thematic Communities;
- a service-based architecture, which enables the Thematic Communities' Research Infrastructures to provide metadata, data and data products in a simple and flexible way.
This architecture takes into account the specific requirements of the communities and does not force thematic data providers to adopt a single technology or technical approach, while still implementing a robust procedure to achieve data integration.
The exchange of metadata between ICS and TCS is crucial to achieve integration and interoperability in EPOS. To capture, organise and harmonise information from different sources and to enable semantic interoperability, a data model named EPOS-DCAT-AP has been developed and adopted. It extends and builds on an established W3C standard - the Data Catalog Vocabulary (DCAT). EPOS-DCAT-AP is represented in RDF/Turtle. Its latest version is available on GitHub, together with a UML diagram, the ontology definition, examples and more details.
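To make the RDF/Turtle representation more concrete, the following minimal sketch serialises one dataset description using only plain W3C DCAT and Dublin Core terms (`dcat:Dataset`, `dct:title`, `dct:description`, `dcat:distribution`, `dcat:accessURL`). It deliberately does not use any EPOS-DCAT-AP-specific classes or properties, which are defined in the GitHub repository. The function names and the `example.org` URIs are assumptions made for illustration only.

```python
# Minimal sketch: describe one dataset with plain W3C DCAT terms in Turtle.
# Only standard DCAT / Dublin Core terms are used; EPOS-DCAT-AP-specific
# classes and properties are deliberately left out.

PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a Turtle literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def dataset_to_turtle(uri: str, title: str, description: str, access_url: str) -> str:
    """Serialise a dataset and one distribution as a Turtle document."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    lines.append(f"<{uri}> a dcat:Dataset ;")
    lines.append(f'    dct:title "{escape_literal(title)}" ;')
    lines.append(f'    dct:description "{escape_literal(description)}" ;')
    lines.append("    dcat:distribution [")
    lines.append("        a dcat:Distribution ;")
    lines.append(f"        dcat:accessURL <{access_url}>")
    lines.append("    ] .")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical identifiers, used purely as placeholders.
    print(
        dataset_to_turtle(
            uri="https://example.org/dataset/seismic-stations",
            title="Seismic station inventory",
            description="Illustrative dataset description.",
            access_url="https://example.org/api/stations",
        )
    )
```

A real EPOS-DCAT-AP description would carry additional, profile-specific information; the examples in the GitHub repository are the authoritative reference for that.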
The catalog itself uses CERIF (Common European Research Information Format) as its format. CERIF is an EU Recommendation to Member States for research information. It is maintained by euroCRIS, is widely used (including outside Europe), is embedded in products from Elsevier and Thomson Reuters, and is used in OpenAIRE.
The metadata catalog is populated from the heterogeneous metadata formats used within the TCS. These are converted via EPOS-DCAT-AP in an ingestion pipeline (see below), which includes dynamic updates by specified TCS managers within a secure environment.
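The harmonisation idea behind such a pipeline can be illustrated with a small sketch. It is not the EPOS ingestion pipeline: the source names, field names and the `CommonRecord` layout are all hypothetical, and the sketch only shows how records with different field names might be mapped onto one common structure before being loaded into a catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical harmonised record; field names are illustrative only."""

    identifier: str
    title: str
    access_url: str


# Hypothetical per-source mappings: common field name -> source field name.
FIELD_MAPS = {
    "source_a": {"identifier": "id", "title": "name", "access_url": "endpoint"},
    "source_b": {"identifier": "doi", "title": "label", "access_url": "url"},
}


def harmonise(source: str, raw: dict) -> CommonRecord:
    """Map one raw source record onto the common record layout."""
    mapping = FIELD_MAPS[source]
    # Fail early if the source record lacks a field the mapping requires.
    missing = [field for field in mapping.values() if field not in raw]
    if missing:
        raise ValueError(f"{source} record lacks fields: {missing}")
    # Copy each value under its common name, normalising surrounding whitespace.
    values = {common: str(raw[src]).strip() for common, src in mapping.items()}
    return CommonRecord(**values)


if __name__ == "__main__":
    record = harmonise(
        "source_a",
        {"id": "a-1", "name": " Station list ", "endpoint": "https://example.org/a"},
    )
    print(record)
```

Keeping the mappings as data rather than code means that, in this sketch, supporting another source format only requires adding one more entry to `FIELD_MAPS`.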