EOSC Synergy and its 10 thematic services

EOSC Synergy aims to expand the uptake of EOSC by building capacities. Thematic services constitute an important part of the project and are the final layer exposed to end users. Expanding the capacity of the thematic services will therefore require improved platform services and improved infrastructure services.

The thematic services expect to reach a workload of between 400 and 46,500 CPU hours per week per service (71K CPU hours per week in aggregate), consumed by up to 10k jobs per week that require a median of 16 GB of RAM and 15 GB of storage per job. Persistent storage needs range from 2 GB to 500 GB (a median of 100 GB and a total of 1 PB). Supporting this workload is not straightforward and will require involving additional resource providers.
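To give a feel for what these figures imply, the short sketch below converts the aggregate weekly workload into the number of cores that would have to be busy around the clock, and into the average CPU hours per job at the upper job count. It only uses the 71K CPU hours and 10k jobs quoted above; the function names are illustrative, and the calculation assumes perfectly even utilisation, which real workloads will not achieve.

```python
HOURS_PER_WEEK = 7 * 24  # 168 wall-clock hours in a week


def sustained_cores(cpu_hours_per_week: float) -> float:
    """Cores that must be fully busy all week to deliver the given CPU hours."""
    return cpu_hours_per_week / HOURS_PER_WEEK


def cpu_hours_per_job(cpu_hours_per_week: float, jobs_per_week: int) -> float:
    """Average CPU hours per job if the weekly job count is reached."""
    return cpu_hours_per_week / jobs_per_week


if __name__ == "__main__":
    total_cpu_hours = 71_000  # aggregate figure quoted in the text
    max_jobs = 10_000         # upper bound on jobs per week quoted in the text

    print(f"Sustained cores: {sustained_cores(total_cpu_hours):.1f}")
    print(f"CPU hours per job: {cpu_hours_per_job(total_cpu_hours, max_jobs):.2f}")
```

Under these idealised assumptions the aggregate demand corresponds to a little over 420 continuously busy cores, so peak provisioning would need to be higher.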

The latest analysis of the thematic services also defined the Data Management Plans (DMPs) for the 10 thematic services. These DMPs will be improved progressively as the thematic services evolve during the project.

The thematic services have also defined a set of performance metrics grouped into five categories: impact on users, on capacity and capability of the service, on scientific outreach, on the usability of the service and on cross-fertilization. These metrics can provide quantitative indicators of the performance of the thematic services and how they improve.

Thematic services constitute a key activity to evaluate the impact of the capabilities in EOSC Synergy with respect to adopting mature and scalable services, software and service quality assurance, increased resource capacity and improved user skills. Do we still have your attention? Good! Let’s dive a bit deeper into the thematic services of EOSC Synergy.

What are these thematic services all about and who do they serve?

The project has identified ten thematic services addressing four scientific areas (Earth Observation, Environment, Biomedicine and Astrophysics). These thematic services are heterogeneous, covering a wide range of requirements, maturity levels, user targets and usage models.

In the area of Earth observation, the services address the monitoring of coastal changes and inundations, the processing of satellite image data and the estimation of forest mass, each addressing different types of targets. In environment, the thematic services cover the monitoring and protection of ozone, the forecasting of sand and dust storms, the simulation of water distribution networks and untargeted mass-spectrometry analysis for toxics. In astrophysics, the project aims to set up a European service for the Latin American Giant Observatory, and in biomedicine EOSC Synergy covers the benchmarking of genomic data processing tools and the processing of cryo-electron microscopy images.

Within this project, these thematic services will improve in terms of authentication and authorisation, resource management, job scheduling, data management and accounting. Not all the services have identified gaps in every one of these aspects, so each thematic service will focus its adaptation on the aspects most relevant to its own bottlenecks.

Which technical tools support these services?

A preliminary analysis performed by all the thematic services identified several technical commonalities and differences.

  • All thematic services stress the importance of using a robust Authentication and Authorisation Infrastructure (AAI) compatible with the ones used by the target institutions. EGI Check-in has proven to be a widely accepted choice.
  • With respect to resource management, all services are interested in dynamically provisioning processing resources, in most cases on demand. Infrastructure Manager (IM) and the Elastic Compute Clusters in the Cloud (EC3) client have been identified by most of the thematic services as technologies capable of filling this gap.
  • Regarding job management, most thematic services use batch queues, which could be extended to support containerised jobs. The usage of Kubernetes to orchestrate microservices and job queues of containers is also considered.
  • The most challenging part is the management of data. Thematic services have identified important issues in transferring and accessing large volumes of data, and require smart caching, advanced data transfer and massive persistent data storage. Solutions available in the EOSC marketplace will be studied and prototyped before being adapted for the thematic services.
  • Finally, monitoring will be inherent to the usage of platform services.
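
As an illustration of the containerised job option mentioned in the list above, the sketch below builds a Kubernetes batch/v1 Job manifest whose resource requests mirror the median per-job figures quoted earlier (16 GB of RAM and 15 GB of storage, approximated here as 16Gi and 15Gi). This is a minimal sketch under assumptions and not a configuration used by the project: the job name, container image and command are hypothetical placeholders, and the helper function is invented for this example.

```python
import json


def build_job_manifest(name, image, command, memory="16Gi", scratch="15Gi"):
    """Return a Kubernetes batch/v1 Job manifest as a plain dictionary.

    The default requests approximate the median per-job needs quoted in
    the text; real services would tune them individually.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 2,  # retry a failed pod at most twice
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,      # hypothetical image
                            "command": command,  # hypothetical command
                            "resources": {
                                "requests": {
                                    "memory": memory,
                                    "ephemeral-storage": scratch,
                                }
                            },
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = build_job_manifest(
        name="example-analysis",                        # placeholder name
        image="registry.example.org/analysis:latest",   # placeholder image
        command=["python", "run_analysis.py"],          # placeholder command
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f`, since kubectl accepts JSON as well as YAML manifests.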