Make Model Output Comparable

Type Bachelor or Master

Task A key issue in the ocean and climate modelling community is comparing the output of different models, because the outputs use different formats, notations, and units. The goal of this thesis is to create a common format, or a way to specify such a format, together with transformations that convert model output to this common format.
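
As a rough illustration of the kind of transformation meant here, the following Python sketch maps model-specific variable names and units to one common convention. All variable names, the common names, and the choice of Kelvin as target unit are assumptions made for this example; they are not taken from CMIP, ESMValTool, or CMOR.

```python
# Hypothetical mapping: source variable -> (common name, common unit, converter).
# Every name and unit below is an illustrative assumption.
COMMON_FORMAT = {
    "sst_celsius": ("sea_surface_temperature", "K", lambda v: v + 273.15),
    "sst_kelvin": ("sea_surface_temperature", "K", lambda v: v),
    "salinity_gkg": ("sea_water_salinity", "g/kg", lambda v: v),
}


def to_common(record):
    """Convert one {source_name: value} record to common names and units."""
    converted = {}
    for name, value in record.items():
        if name not in COMMON_FORMAT:
            raise KeyError(f"no transformation defined for variable '{name}'")
        common_name, unit, convert = COMMON_FORMAT[name]
        converted[common_name] = (convert(value), unit)
    return converted


# Output of two hypothetical models becomes directly comparable.
model_a = to_common({"sst_celsius": 10.0})
model_b = to_common({"sst_kelvin": 283.15})
```

Keeping the mapping as data rather than code is one possible way to "specify a format"; a thesis might well choose a different representation.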

Previous work on this topic exists:

  • CMIP
  • ESMValTool https://www.esmvaltool.org/
  • CMOR https://cmor.llnl.gov/

Dataflow Analysis of Climate Models

Type Bachelor or Master

Task: Statically analyze Fortran code with FParser to extract data flow and create a data flow model from it. FParser is written in Python, whereas most of the analysis tools are written in Java. Thus, FParser will be used to identify read and write accesses to data and return a list of these accesses, which can then be used to enrich our existing dataflow model.
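
To make the intended output more concrete, here is a minimal Python sketch that lists read and write accesses for simple Fortran assignment statements. It deliberately does not use FParser: a regular expression stands in for the parse tree, so it only handles single-line assignments and treats function names and array indices on the right-hand side as reads. The record layout (`line`, `name`, `access`) and the JSON export are assumptions for illustration.

```python
import json
import re

# Simplified stand-in for a real parser: "name = expr" or "name(index) = expr".
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_]\w*)\s*(?:\([^=]*\))?\s*=(?![=>])\s*(.+)$")
# Identifiers that are not part of a numeric literal (1.0e3) or an operator (.and.).
IDENTIFIER = re.compile(r"(?<![\w.])[A-Za-z_]\w*")


def extract_accesses(source):
    """Return a list of read/write access records for simple assignments."""
    accesses = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        code = line.split("!", 1)[0]  # drop comments (ignores '!' inside strings)
        match = ASSIGNMENT.match(code)
        if not match:
            continue
        target, expression = match.groups()
        # Fortran is case-insensitive, so names are normalized to lower case.
        accesses.append({"line": line_no, "name": target.lower(), "access": "write"})
        for name in IDENTIFIER.findall(expression):
            accesses.append({"line": line_no, "name": name.lower(), "access": "read"})
    return accesses


SAMPLE = "temp(i) = temp(i) + dt * flux  ! explicit step\nsalt = 35.0e0\n"
# A language-neutral format such as JSON could hand the list to Java tooling.
print(json.dumps(extract_accesses(SAMPLE), indent=2))
```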

Identify and analyze coding techniques for mathematical methods in Fortran

Type Bachelor (one project, one analysis)

Task: Analyze a scientific modeling software package written in Fortran for
coding techniques that implement mathematical methods, identify their
invariants, and create testable assertions in Python.
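
A small Python sketch of what such a testable assertion could look like follows. The explicit diffusion step on a periodic 1-D grid is an assumed stand-in for a numerical method found in the Fortran code; it is not taken from UVic or MITgcm. Two invariants of this scheme are checked: the total tracer amount is conserved, and for 0 <= kappa <= 0.5 no new extrema appear.

```python
def diffuse(tracer, kappa):
    """One explicit diffusion step on a periodic 1-D grid (unit spacing and time step)."""
    n = len(tracer)
    return [
        tracer[i] + kappa * (tracer[(i - 1) % n] - 2.0 * tracer[i] + tracer[(i + 1) % n])
        for i in range(n)
    ]


def check_invariants(before, after, tol=1e-12):
    """Raise AssertionError if a step violates one of the two invariants."""
    # Invariant 1: periodic diffusion conserves the total tracer amount.
    assert abs(sum(after) - sum(before)) <= tol * max(1.0, abs(sum(before)))
    # Invariant 2: for 0 <= kappa <= 0.5 each new value is a weighted average
    # of old values, so the field stays within its previous bounds.
    assert min(before) - tol <= min(after)
    assert max(after) <= max(before) + tol


field = [0.0, 0.0, 1.0, 0.0, 0.0]
stepped = diffuse(field, kappa=0.25)
check_invariants(field, stepped)
```

In the thesis, the values passed to such a check would come from the Fortran model itself rather than from a Python reimplementation.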

Sources & Notes

Starting point: Existing Fortran software

  • UVic http://terra.seos.uvic.ca/model/
  • MITgcm https://github.com/MITgcm/MITgcm

Thematic Domain Analysis for Ocean Modeling

We just published our paper on the analysis of the ocean modeling domain. It provides answers on the characteristics of the domain, how scientists develop and research models, how they implement them, and how technologies and methods are applied to this endeavor. Based on these answers, software engineers can better apply their tools, methods, and approaches to the scientific modeling domain and support the software side of model development, which suffers from a lack of engineering insight.

The paper is available as a preprint at https://arxiv.org/abs/2202.00747, and the final version is available via DOI 10.1016/j.envsoft.2022.105323

Software Development Processes in Ocean System Modeling

Scientific modeling provides mathematical abstractions of real-world systems and builds software as implementations of these mathematical abstractions. Ocean science is a multidisciplinary discipline developing scientific models and simulations as ocean system models that are an essential research asset.
In software engineering and information systems research, modeling is also an essential activity. In particular, business process modeling for business process management and systems engineering is the activity of representing processes of an enterprise, so that the current process may be analyzed, improved, and automated.
In this paper, we employ process modeling for analyzing scientific software development in ocean science to advance the state in engineering of ocean system models and to better understand how ocean system models are developed and maintained in ocean science. We interviewed domain experts in semi-structured interviews, analyzed the results via thematic analysis, and modeled the results via the business process modeling notation BPMN.
The processes modeled as a result describe an aspired state of software development in the domain, which are often not (yet) implemented. This enables existing processes in simulation-based system engineering to be improved with the help of these process models.

The paper can be found at https://arxiv.org/abs/2108.08589

Thematic Map of the Domain Analysis

One of our first initiatives was to understand the domain of ocean modeling, its processes, and its characteristics. Therefore, we conducted a set of interviews with domain experts, i.e., scientists, research software engineers, and technicians. To analyze the interview data, we relied on a Thematic Analysis approach. The resulting map can be found here. To open or close a theme (yellow) or category (blue), click on the respective node.

Develop a DSL for Bio-Geo-Chemical Models

Type Master Thesis

Task Part of the OceanDSL project is to provide a DSL for biogeochemical models or parts of them. These models can be specified in various ways. Our goal is to provide a concise DSL that allows users to create and extend such models. The key tasks in this thesis are:

  • Analyze the domain to understand how bio-geo-chemical models are researched and developed.
  • Identify parts we can address with a new DSL.
  • Design a DSL based on our technology stack, the stack of Dusk/Dawn or PSyclone, or a combination of those, depending on your findings.

Resources

  • Dusk/Dawn https://github.com/MeteoSwiss-APN/dawn
  • Metos3D
  • PSyclone https://psyclone.readthedocs.io/en/stable/
  • Biogeochemical models
    • Piwonski, J. and Slawig, T. (2016). Metos3D: the Marine Ecosystem Toolkit for Optimization and Simulation in 3-D – Part 1: Simulation Package v0.3.2. Geoscientific Model Development, 9:3729–3750
    • Kriest, I., Khatiwala, S., and Oschlies, A. (2010). Towards an assessment of simple global marine biogeochemical models of different complexity. Progress In Oceanography, 86(3-4):337–360

DSL Designing And Evaluating For Ocean Models

The development of ocean models requires knowledge from different domains. One aspect of modeling is the model configuration, which takes place in code files or parameter lists. The configuration process differs for each ocean model, and users must know the differences. To make configuring ocean models easier, we can implement a DSL that generates valid configuration files for each model. In this thesis, we design and implement such a configuration DSL. To this end, we study use case scenarios involving model parameterization and one ocean model. Based on the findings, we designed and implemented the DSL. Although the DSL does not generate all configuration files, the evaluation shows that the concept works.

  1. Serafim Simonov. 2020. DSL Designing And Evaluating For Ocean Models. Kiel University. Retrieved from http://eprints.uni-kiel.de/51160/
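
To illustrate the general idea of generating a valid configuration file from a checked description, here is a minimal Python sketch. It is not the DSL developed in the thesis. The schema, the parameter names, and the group name are hypothetical; the output uses Fortran namelist syntax, a common input format for Fortran programs.

```python
# Hypothetical schema: parameter name -> expected Python type.
SCHEMA = {"time_step": float, "use_ice": bool, "title": str}


def format_value(value):
    """Render a Python value as a Fortran namelist literal."""
    # bool must be tested before numbers, because bool is a subclass of int.
    if isinstance(value, bool):
        return ".TRUE." if value else ".FALSE."
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    if isinstance(value, (int, float)):
        return repr(value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def render_namelist(group, parameters):
    """Validate parameters against SCHEMA and render one namelist group."""
    lines = [f"&{group}"]
    for name, value in parameters.items():
        if name not in SCHEMA:
            raise KeyError(f"unknown parameter '{name}'")
        if not isinstance(value, SCHEMA[name]):
            raise TypeError(f"parameter '{name}' must be {SCHEMA[name].__name__}")
        lines.append(f" {name} = {format_value(value)},")
    lines.append("/")
    return "\n".join(lines) + "\n"


text = render_namelist("run_control", {"time_step": 3600.0, "use_ice": False, "title": "demo"})
print(text)
```

Rejecting unknown or wrongly typed parameters before writing the file is what makes the generated configuration "valid" in this sketch; a real DSL would add model-specific rules.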

First Visualization of the UVic Architecture

Our goal is to understand the composition of climate and ocean models to support their modularization and future development. Recently, we applied runtime monitoring to the MITgcm model. Based on our experience there, we applied the technique to the Earth System Climate Model (ESCM) of the University of Victoria, Canada. Please be aware that these are very early results and may be erroneous.

The UVic model can be compiled with GNU Fortran (gfortran), but the setup we currently use only produces a running executable with the Intel Fortran compiler (ifort). Fortunately, ifort supports the same interface for runtime instrumentation as gfortran. Thus, we could apply the same probes in this context.

Based on this setup, we recorded 79 GB of binary monitoring data from a partial model run. We aim to record a complete run in the future, but for the proof of concept, a partial run is sufficient. For our analysis, we intended to use the standard Kieker trace-analysis tool.

However, the Kieker trace-analysis tool uses call traces to reconstruct the deployed architecture. This design is based on experience with web-based and service-oriented systems, which usually have a small set of calls per trace, triggered by an incoming event, message, or request. Models are quite different: they are started once and run for a long time, which essentially results in one big trace, in our case a 79 GB trace. This would not fit into memory, and even if it did, it would be very slow to process. Thus, we created a new architecture reconstruction tool based on another set of Kieker analysis stages. Utilizing this tool, we could generate our first component and operation graphs. The first component graph can be seen below.

UVic architecture based on Kieker monitoring data. Files are considered to be components.
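
The difference between reconstructing whole traces and aggregating calls as they stream by can be sketched in a few lines of Python. This is only an illustration of the streaming idea, not our Kieker-based tool: the semicolon-separated record layout and the file names are invented for the example, and Kieker's binary format and API are not used.

```python
from collections import Counter


def read_records(lines):
    """Lazily yield (caller_file, caller_op, callee_file, callee_op) tuples."""
    for line in lines:
        line = line.strip()
        if line:
            yield tuple(line.split(";"))


def component_graph(records):
    """Count calls between files, keeping only the aggregated edges in memory."""
    edges = Counter()
    for caller_file, _caller_op, callee_file, _callee_op in records:
        edges[(caller_file, callee_file)] += 1
    return edges


# In practice `LOG` would be an open file that is read line by line.
LOG = ["a.F;main;b.F;step", "b.F;step;c.F;mix", "a.F;main;b.F;step", ""]
graph = component_graph(read_records(LOG))
for (caller, callee), calls in sorted(graph.items()):
    print(f"{caller} -> {callee}: {calls} calls")
```

Memory use then grows with the number of distinct file pairs rather than with the length of the trace.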

We will continue our analysis to provide more readable graphs.

Thematic Analysis Tool

Thematic Analysis is a method for qualitatively analyzing text, audio, video, and other material. In OceanDSL, we use this method to analyze interview transcripts. The goal of the thesis is to develop suitable tooling, or parts of it, and to evaluate the tool based on existing transcripts.

Type Bachelor / Master

Task Create a coding editor in Eclipse supporting coding of text, tagging/categorizing, code and category editing.

Task Provide an interactive visualization based on ELK/Kieler.

Task Develop a web-based visualization tool

Features

  • coding, categorizing/themes
  • regrouping of codes
  • re-coding
  • splitting codes
  • merging codes
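
The features above could be backed by a small data model. The following Python sketch is one possible starting point under simple assumptions: a code labels a list of text segments, each code belongs to exactly one category, and all class and method names are hypothetical.

```python
class Codebook:
    """Hypothetical in-memory model: codes label text segments, categories group codes."""

    def __init__(self):
        self.segments = {}    # code name -> list of coded text segments
        self.categories = {}  # code name -> category (theme) name

    def code(self, segment, name, category="uncategorized"):
        """Attach a code to a text segment, creating the code if needed."""
        self.segments.setdefault(name, []).append(segment)
        self.categories.setdefault(name, category)

    def regroup(self, name, category):
        """Move an existing code to another category."""
        if name not in self.segments:
            raise KeyError(name)
        self.categories[name] = category

    def merge(self, names, merged_name):
        """Merge several codes into one; the result keeps the first code's category."""
        category = self.categories[names[0]]
        merged = []
        for name in names:
            merged.extend(self.segments.pop(name))
            self.categories.pop(name)
        self.segments[merged_name] = merged
        self.categories[merged_name] = category

    def split(self, name, assignment):
        """Split a code; `assignment` maps each of its segments to a new code name."""
        missing = [s for s in self.segments[name] if s not in assignment]
        if missing:
            raise ValueError(f"segments without a new code: {missing}")
        category = self.categories.pop(name)
        for segment in self.segments.pop(name):
            self.code(segment, assignment[segment], category)


book = Codebook()
book.code("segment 1", "testing", "quality")
book.code("segment 2", "plot checks", "quality")
book.merge(["testing", "plot checks"], "validation")
book.split("validation", {"segment 1": "unit tests", "segment 2": "visual checks"})
book.regroup("visual checks", "visualization")
```

Re-coding a segment can then be expressed as a split followed by a merge.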