This section covers general details about your project.
1.1 Please provide the name of your project.
A framework for streamlining research workflow in Geosciences
1.2 Please provide a description of your project.
This project aims to develop a fully integrated computing framework for running Geosciences applications. Many scientists in Geosciences are still reluctant to use HPC resources, so providing users with appropriate support is vital. Even users who have HPC expertise spend a lot of time setting up the same models on the same machines. It is therefore important not only to optimize user applications and models but also to provide an appropriate framework for running large simulations. The current bottleneck for most applications in Geosciences is defining efficient workflows (fetching input data, running parallel models, storing output data with data exploration facilities). In this project we aim to develop such a framework for a range of known applications: openIFS (ECMWF forecast model), FLEXPART transport model (“FLEXible PARTicle dispersion model”), MITgcm (MIT General Circulation Model), WRF (Weather Research and Forecasting), ENKI (hydrological modelling toolbox and hydrological forecasting system), CESM (Community Earth System Model) and ad hoc user codes developed at the Geosciences Department. This framework will be used for both research and teaching (for instance GEF 4530 at the University of Oslo). We have successfully installed most of these applications on the targeted computing platforms and have started to write workflows in Python. We will now continue this effort and put emphasis on how best to use NorStore facilities in these workflows (and for data exploration), as most of these models require a large volume of input data and generate large outputs to be analyzed.
1.3 Which academic subject(s) does your project belong to?
Climate Science
1.4 Please provide the name of the project principal investigator.
Anne Fouilloux
1.5 Please provide the funding sources for this project.
Storage for master's students, new PhD students and postdocs is a core activity supported by the department of Geosciences.
EOSC-Nordic: The EOSC-Nordic project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 857652.
1.6 Who will be the Data Officer for your project?
Anne Fouilloux (and at some point my replacement)
1.7 Does your project have the appropriate resources for the management of your data?
Data management activities related to EOSC-Nordic are funded by EOSC-Nordic (staffing) while core activities related to education of students and researchers are funded by the department of Geosciences.
For each folder, data creators are expected to generate a README file with a minimum of information on the datasets. We encourage users to use the following template: https://cornell.app.box.com/v/ReadmeTemplate
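As an illustration only, a check like the one below could help spot folders that still lack a README. It is a minimal sketch, not a description of the tooling actually in use: the function name `folders_missing_readme` and the convention that any file whose name starts with "readme" (case-insensitive) counts are assumptions made for this example.

```python
from pathlib import Path


def folders_missing_readme(project_root):
    """Return the names of direct subfolders that contain no README file.

    Assumption for this sketch: a README is any regular file whose name
    starts with "readme", ignoring case (README, README.txt, readme.md, ...).
    """
    missing = []
    subfolders = sorted(p for p in Path(project_root).iterdir() if p.is_dir())
    for folder in subfolders:
        has_readme = any(
            entry.is_file() and entry.name.lower().startswith("readme")
            for entry in folder.iterdir()
        )
        if not has_readme:
            missing.append(folder.name)
    return missing
```

Calling `folders_missing_readme("/path/to/project")` with a placeholder path returns a sorted list of folder names, which could then be passed on to the data creators concerned.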
Data are archived as soon as possible, i.e. as soon as consolidated datasets are produced. These datasets then become read-only (permissions are changed on the NIRD project area) and are kept on NS1000K until the scientists (or students) approve their removal. Emails are sent every 6 months to all data "creators" so that we can get up-to-date information about their datasets.
For publications, plots and the code used to generate them are sometimes added to the archive.
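The permission change could be done along the following lines. This is a hedged sketch rather than the procedure actually applied on the NIRD project area: the function name `make_read_only` is hypothetical, and the choice to clear the write bit for user, group and others alike is an assumption.

```python
import os
import stat
from pathlib import Path

# Write permission bits for user, group and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(dataset_dir):
    """Clear all write bits on dataset_dir and everything below it.

    Returns the list of paths whose mode was updated. Symbolic links are
    skipped so that nothing outside the dataset is modified.
    """
    root = Path(dataset_dir)
    changed = []
    # Walk bottom-up so that entries are handled before their parent folder.
    for current, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            path = Path(current) / name
            if path.is_symlink():
                continue
            path.chmod(path.stat().st_mode & ~WRITE_BITS)
            changed.append(path)
    # Finally protect the top-level dataset folder itself.
    root.chmod(root.stat().st_mode & ~WRITE_BITS)
    changed.append(root)
    return changed
```

Only the write bits are cleared; read and execute permissions are left untouched, so the dataset remains browsable by whoever could read it before.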
Archiving after the end of a project is funded by the department of Geosciences.