How to Recover an Architecture from Monitoring Data

We can collect monitoring data from software using Kieker monitoring probes. This can be done with Kieker 4 Java, Kieker 4 C, and Kieker 4 Python. The C package can also be used with other programming languages that can be compiled with the GNU Compiler Collection or the Intel compilers, and it may be usable with others as well.

In this how-to, we explain the process for Fortran and Java software, showing one way to set up the monitoring for each. Please note that other options are available; they can be found in the Kieker documentation.

Prerequisites

For this how-to, we assume that you have installed:

  • gcc/gfortran or ifort compiler (for the Fortran/C example)
  • Java 11 or newer (there might be issues with Java 17)
  • Build tools like make, autoconf, automake, and libtool (for the Fortran/C example)
  • git client
  • tar and gzip

Tool Installation

For the dynamic analysis we need the OceanDSL tools, the Kieker tools, and the Kieker monitoring probes for C and Java. To start, it is helpful to have a working directory for your setup; a scripted version of the steps below follows the list.

  • Create a working directory. In this how-to we call this $WORKSPACE.
  • Create the directory with: mkdir $WORKSPACE
  • Change to the workspace: cd $WORKSPACE
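
For the later commands to work verbatim, you can store the fully qualified path in the WORKSPACE environment variable first; the location below is only an example:

# pick any location you like; this path is just an example
export WORKSPACE=$HOME/architecture-recovery
mkdir $WORKSPACE
cd $WORKSPACE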

Install OceanDSL Tools

Clone oceandsl tools with: git clone https://git.se.informatik.uni-kiel.de/oceandsl/oceandsl-tools.git

To build the oceandsl tools enter the tool directory with

cd oceandsl-tools

Run the gradle build script

./gradlew build

Once the build has succeeded, assemble all tools into one archive. This is not strictly necessary, but it reduces work later, as you do not have to unpack each tool separately.

./assemble-tools.sh

Leave the oceandsl-tools directory, move the repository out of the way and extract the tool archive.

cd ..
mv oceandsl-tools oceandsl-tools-repo
tar -xvzpf oceandsl-tools-repo/build/oceandsl-tools.tgz

This creates a folder oceandsl-tools containing bin and lib directories with all tools necessary for this how-to and others on this site.

Install Kieker Language pack C

First, you have to clone the Kieker language pack C from GitHub.

git clone https://github.com/kieker-monitoring/kieker-lang-pack-c.git

Build the Kieker language pack C library for monitoring (this is only necessary for the Fortran/C example). Enter the directory with

cd kieker-lang-pack-c/source

and run the following tools from the autoconf, automake, and libtool packages. If these tools are not installed, please install them first.

libtoolize
aclocal
autoconf
automake --add-missing

Note: We aim to provide a Debian package for this language pack; however, it is currently not available. Therefore, you have to compile the library yourself.

Continue with compiling the library. It can be installed system-wide: ./configure && make builds it, and make install (run as root) installs it. However, in many cases this is not the best option. To install it into the $WORKSPACE instead, run

./configure --prefix=$WORKSPACE
make
make install

The latter will install the library and some examples in $WORKSPACE/lib and $WORKSPACE/bin, respectively.

Switch back to the working directory with cd $WORKSPACE.

Install Kieker Collector

The Kieker Collector is used to collect all monitoring events during runtime. It is part of the Kieker bundle. Download the bundle from:

https://github.com/kieker-monitoring/kieker/releases/download/1.15.2/kieker-1.15.2-binaries.zip

Unzip the bundle and unpack the collector in the $WORKSPACE directory.

unzip kieker-1.15.2-binaries.zip
unzip kieker-1.15.2/tools/collector-1.15.2.zip

The collector requires a configuration file that specifies where to store the monitoring data and on which port to listen. Below is an example configuration file collector.conf. Please note that the word WORKSPACE has to be replaced with the fully qualified path of the working directory.

# common
kieker.monitoring.name=
kieker.monitoring.hostname=
kieker.monitoring.metadata=true

# TCP collector
kieker.tools.source=kieker.tools.source.MultipleConnectionTcpSourceCompositeStage
kieker.tools.source.MultipleConnectionTcpSourceCompositeStage.port=5678
kieker.tools.source.MultipleConnectionTcpSourceCompositeStage.capacity=8192

# dump stage
kieker.monitoring.writer=kieker.monitoring.writer.filesystem.FileWriter
kieker.monitoring.writer.filesystem.FileWriter.customStoragePath=WORKSPACE
kieker.monitoring.writer.filesystem.FileWriter.charsetName=UTF-8
kieker.monitoring.writer.filesystem.FileWriter.maxEntriesInFile=25000
kieker.monitoring.writer.filesystem.FileWriter.maxLogSize=-1
kieker.monitoring.writer.filesystem.FileWriter.maxLogFiles=-1
kieker.monitoring.writer.filesystem.FileWriter.mapFileHandler=kieker.monitoring.writer.filesystem.TextMapFileHandler
kieker.monitoring.writer.filesystem.TextMapFileHandler.flush=true
kieker.monitoring.writer.filesystem.TextMapFileHandler.compression=kieker.monitoring.writer.compression.NoneCompressionFilter
kieker.monitoring.writer.filesystem.FileWriter.logFilePoolHandler=kieker.monitoring.writer.filesystem.RotatingLogFilePoolHandler
kieker.monitoring.writer.filesystem.FileWriter.logStreamHandler=kieker.monitoring.writer.filesystem.BinaryLogStreamHandler
kieker.monitoring.writer.filesystem.FileWriter.flush=true
kieker.monitoring.writer.filesystem.FileWriter.bufferSize=81920
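
Assuming the $WORKSPACE variable holds the fully qualified path, the placeholder can be replaced in place with sed:

# substitute the WORKSPACE placeholder with the actual workspace path
sed -i "s|WORKSPACE|$WORKSPACE|g" collector.conf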

For more details on these options, consult the Kieker documentation.

Fortran Example Setup

For the Fortran example, we use the publicly available MIT General Circulation Model (MITgcm).

Check out the model with

git clone https://github.com/MITgcm/MITgcm.git

Switch to the verification directory, which contains a couple of example model setups:

cd MITgcm/verification

For this how-to, we select tutorial_barotropic_gyre and type

cd tutorial_barotropic_gyre

First, we check whether it compiles at all, to be sure that everything is in place (cf. README.md). We type:

cd build
../../../tools/genmake2 -mods ../code [-of my_platform_optionFile]
make depend
make
cd ..

The option -of can be omitted for this test, but to add instrumentation we have to find the correct option file for our platform. Thus, we look into the ../../tools/build_options/ directory with ls ../../tools/build_options/.

If you have gfortran installed and work on Linux, choose ../../tools/build_options/linux_amd64_gfortran.

Now try to run the model with

cd run
ln -s ../input/* .
ln -s ../build/mitgcmuv .
./mitgcmuv > output.txt
cd ..

After the model has run, check the output.txt file for errors; a quick, non-exhaustive check is sketched below.
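
Such a scan could look like this (a sketch only; consult the README.md for a proper verification):

# crude scan of the model log for error messages
grep -i error output.txt || echo "no obvious errors found"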

If everything went fine, you can instrument the model. We describe this for the example at hand; for other C or Fortran projects, the details will differ. Therefore, we provide a general explanation first and then apply it to MITgcm's setup.

As you have installed the Kieker language pack C, a libkieker library resides in the local library folder ($WORKSPACE/lib). To be able to use it, you have to do four things with gcc or ifort (see the example after the list):

  1. Specify -finstrument-functions as command line parameter to the compiler
  2. Add the library to the library path with -L$WORKSPACE/lib
  3. Add the library to the libraries to be used during compilation with -lkieker
  4. Set option -g to include debug symbols in the executable
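
For a generic project, assuming a gfortran toolchain and a placeholder source file main.f90, this could look as follows:

# 1. + 4.: instrument all operations and keep debug symbols
gfortran -g -finstrument-functions -c main.f90
# 2. + 3.: link against the Kieker monitoring library in $WORKSPACE/lib
gfortran -o model main.o -L$WORKSPACE/lib -lkieker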

With MITgcm this is done by creating a new platform file. Let's assume you have used the linux_amd64_gfortran file before. Duplicate the file and append the lines below.

cp ../../tools/build_options/linux_amd64_gfortran ../../tools/build_options/linux_amd64_gfortran_kieker
nano ../../tools/build_options/linux_amd64_gfortran_kieker

This opens the nano editor, if available; in case you want to use another editor, feel free to do so. Append the following lines at the bottom of the platform file. Please replace the word WORKSPACE with the actual path of the workspace directory to avoid unwanted behavior.

# Kieker setup
FOPTIM=""
F90OPTIM=""

FFLAGS="$FFLAGS -finstrument-functions -g"
CFLAGS="$CFLAGS -finstrument-functions -g"
LIBS="$LIBS -L/usr/lib/x86_64-linux-gnu -L$WORKSPACE/lib -lkieker -ldl"

In some instances, the -L/usr/lib/x86_64-linux-gnu and -ldl options are not necessary. This seems to depend on the Linux version and distribution.

Save the file.

cd build
../../../tools/genmake2 -mods ../code -of ../../../tools/build_options/linux_amd64_gfortran_kieker
make depend
make
cd ..

Start the collector in a separate shell or window with the configuration file you created above:

collector-1.15.2/bin/collector -c collector.conf

When the collector is running properly, it will output something like the listing below. However, if it shows a strange stack trace, this could be due to Java 17. In this case, you have to install an older version of Java and specify its location in the JAVA_HOME variable, e.g.,

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64

The listing below depicts the output of a properly running collector. To terminate it, press CTRL-C. The collector does not terminate automatically, as it works as a monitoring collection service to which applications can connect, disconnect, and reconnect. Therefore, it will not terminate just because no application is currently using the service.

17:19:19.094 [main] INFO  k.m.core.controller.TCPController - Could not parse port for the TCPController, deactivating this option. Received string was: 
17:19:19.115 [main] INFO  k.m.core.controller.StateController - Enabling monitoring
17:19:19.123 [main] INFO  k.m.c.c.MonitoringController - Current State of kieker.monitoring (1.15.2) Status: 'enabled'
	Name: '2.9'; Hostname: 'glasgow'; experimentID: '1'
JMXController: JMX disabled
TimeSource: 'kieker.monitoring.timer.SystemNanoTimer'
	Time in nanoseconds (with nanoseconds precision) since Thu Jan 01 01:00:00 CET 1970'
ProbeController: disabled
WriterController:
	Queue type: class kieker.monitoring.queue.BlockingQueueDecorator
	Queue capacity: 10000
	Insert behavior (a.k.a. QueueFullBehavior): class kieker.monitoring.queue.behavior.BlockOnFailedInsertBehavior
		numBlocked: 0
Writer: 'kieker.monitoring.writer.filesystem.FileWriter'
	Configuration:
		kieker.monitoring.writer.filesystem.FileWriter.logFilePoolHandler='kieker.monitoring.writer.filesystem.RotatingLogFilePoolHandler'
		kieker.monitoring.writer.filesystem.FileWriter.charsetName='UTF-8'
		kieker.monitoring.writer.filesystem.FileWriter.logStreamHandler='kieker.monitoring.writer.filesystem.BinaryLogStreamHandler'
		kieker.monitoring.writer.filesystem.FileWriter.bufferSize='81920'
		kieker.monitoring.writer.filesystem.FileWriter.maxEntriesInFile='25000'
		kieker.monitoring.writer.filesystem.FileWriter.maxLogFiles='-1'
		kieker.monitoring.writer.filesystem.FileWriter.maxLogSize='-1'
		kieker.monitoring.writer.filesystem.FileWriter.mapFileHandler='kieker.monitoring.writer.filesystem.TextMapFileHandler'
		kieker.monitoring.writer.filesystem.FileWriter.flush='true'
		kieker.monitoring.writer.filesystem.FileWriter.customStoragePath='WORKSPACE'
		kieker.monitoring.writer.filesystem.FileWriter.actualStoragePath='WORKSPACE/kieker-20230223-161919-19823802456291-UTC--2.9'

	Automatic assignment of logging timestamps: 'true'
Sampling Controller: Periodic Sensor available: Poolsize: '0'; Scheduled Tasks: '0'
17:19:19.132 [main] INFO  teetime.framework.Execution - Using scheduler: teetime.framework.scheduling.pushpullmodel.PushPullScheduling@250d440

Now switch back to the other shell and start the model. If both are running on the same machine, the model should connect to the collector, which in turn will print statistics on how many events it has received.

cd run
ln -s ../input/* .
ln -s ../build/mitgcmuv .
./mitgcmuv > output.txt
cd ..

Depending on the runtime of the model, this may take a while, so now might be a good moment for a coffee or a walk.

Java Example Setup

<to be added later>

Recover Architecture from Monitoring Data

When the model has terminated, you can stop the collector with CTRL-C or send the application a term signal.

To run the architecture recovery, you have to find the model's executable and the addr2line tool. If addr2line is not installed, it is usually contained in a package called binutils; however, this may differ from system to system. addr2line extracts the symbol names from the executable, and the analysis tool can use this information to translate the function pointers that were recorded during monitoring.

On a standard Linux installation, addr2line is in /usr/bin/addr2line and the model executable should be in $WORKSPACE/MITgcm/verification/tutorial_barotropic_gyre/run/mitgcmuv.
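
For example, addr2line can resolve a recorded function pointer to a symbol name and source location (the address and the resulting names below are made up for illustration):

# resolve a (hypothetical) recorded address to a function name and source line
addr2line -f -e $WORKSPACE/MITgcm/verification/tutorial_barotropic_gyre/run/mitgcmuv 0x4a3c10
# prints, e.g.:
#   calc_gw_
#   ./calc_gw.f:123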

To execute the dynamic analysis, we run the recovery tool dar:

$WORKSPACE/oceandsl-tools/bin/dar -i $WORKSPACE/kieker-* -o $WORKSPACE/model \
    -c -E uniq-experiment-name -l dynamic -m file-mode -s elf \
    -a /usr/bin/addr2line \
    -e $WORKSPACE/MITgcm/verification/tutorial_barotropic_gyre/run/mitgcmuv

This produces multiple model files in the $WORKSPACE/model directory, including a .project file suitable for Eclipse, so we can easily open the models in Eclipse and use the Kieker Development Tools to inspect them. However, this is part of another how-to.

Information on the options used in this run and their usage can be found in the tool documentation.

Architecture Recovery of UVic

We recently compiled a summary on how we analyze earth system models (ESMs) based on runtime data. We also do this based on code analysis, but both approaches have their merits: dynamic architecture recovery, i.e., recovery based on runtime observations, allows us to see which parts of a model program are actually used, how much they are used, and how much time has been consumed by the respective function or operation.

Dynamic recovery process

Step 1: Understand the Model’s Build Process

Before we can perform any analysis, we need to understand the build process of a scientific model and how to instrument it with Kieker. This often requires consulting the model developers.

Step 2: Configure Model and Setup Parameters

It is important to develop a model setup that ensures all required parts of the model are executed, but does not take an excessive amount of time to run.
This matters for two reasons:

  • Monitoring will introduce overhead, and on top of that some code optimizations must be turned off, as they would otherwise remove probes from the code.
  • The longer the run, the more monitoring data is generated; the log can become quite extensive and hard to process. To ensure we have a good example, we compile and run the chosen setup once. This verifies that the code compiles and that all the necessary setup and forcing data is in place.

Step 3: Instrument Scientific Model

We use the ability of the GNU Compiler Collection (GCC) and the Intel Fortran compiler (ifort, versions 19.0.4 and 2021.1.2) to weave instrumentation probes into a program (command line option -finstrument-functions). Kieker4C provides specific functions for that.

Both compiler suites are capable of instrumenting all functions, procedures, and subroutines (we refer to these as operations) in Fortran, C, and other compatible languages.
It is also possible to select only a subset of these operations.

Besides activating instrumentation in the compiler, we have also included the Kieker monitoring library in the build path.

The compiler weaves in calls to two probes that are invoked at the beginning and end of each operation, respectively. While -finstrument-functions causes all operations to be instrumented, this can be narrowed with additional flags, such as -finstrument-functions-exclude-function-list and -finstrument-functions-exclude-file-list, which exclude specific operations and files.
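
For instance, operations defined in auxiliary files could be left uninstrumented like this (a sketch; the file name fragments are placeholders):

# instrument everything except operations whose defining file matches the list
gfortran -g -finstrument-functions \
  -finstrument-functions-exclude-file-list=timers,diagnostics \
  -c model.f90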

The monitoring library has to provide implementations for these two probes, which have the following signatures:

void __cyg_profile_func_enter(void *this_fn, void *call_site);
void __cyg_profile_func_exit(void *this_fn, void *call_site);

The Kieker4C library implements both probe functions and uses Kieker to produce the minimal set of trace events, i.e., BeforeOperationEvent, AfterOperationEvent, and TraceMetadata.

In Java, we can obtain method and class names at runtime. This is not possible in compiled Fortran code. Instead, the compiler can add debug symbols to the program, which are then used to resolve names during analysis. Our analysis tools automatically call addr2line to extract the necessary information to resolve the names in the Kieker events.

Names in Fortran are case-insensitive, but symbols in object code are case-sensitive. Thus, compilers convert names to lowercase and append an underscore. During recovery, we remove these decorations, as the recovered names would otherwise deviate from the source code.
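
You can observe this with nm; for example, a Fortran subroutine CALC_GW appears as a lowercase symbol with a trailing underscore (the symbol and address below are illustrative):

# list symbols and look for a subroutine from the source code
nm run/mitgcmuv | grep -i calc_gw
# prints, e.g.:
#   00000000004a3c10 T calc_gw_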

Details on how to introduce compiler flags and the library into the respective models can be found in our replication package.

Step 4: Model Execution

When the scientific model is set up, we execute it to collect monitoring data. Depending on the model and setup, this can take minutes or hours. For the actual collection, we use the Kieker collector to receive, compress, and store all monitoring data. The collector can be started on a different machine and produces Kieker logs, including splitting up logs to avoid file size issues.

In case the collector is too slow to process all events (it instantiates new objects for every event and supports event modifications), we can use netcat (nc) as a server, which is available for various platforms. Together with split, it is possible to create a setup that allows storing huge monitoring logs. On Linux and similar operating systems, it can be run with nc -l 5678 | split -b 1000000000 - data- where 5678 is the port number the probes write to. These raw dumps can later only be read by the TCP reader stage of Kieker when replaying the log.
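
A capture-and-replay setup could look like the following sketch; it assumes the probes send to port 5678 and that a tool with a Kieker TCP reader later listens on the same port on localhost:

# capture: accept the probes' TCP stream and split it into ~1 GB chunks
nc -l 5678 | split -b 1000000000 - data-

# replay: stream the chunks, in order, to a tool with a TCP reader stage
cat data-* | nc localhost 5678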

To execute and monitor the scientific model, we first start the collector or NetCAT and then start the instrumented scientific model.

Step 5: Monitoring Data Analytics

After the model run, we analyze the collected monitoring data. Depending on how it was collected (see above), we use the file reader or the TCP reader stage with our analysis tools. The architecture reconstruction relies only on operation calls and can be performed on the log data with a minimal memory footprint. Our tools utilize the Kieker Architecture Model, but other architecture models can be used too.

The analysis produces a basic architecture model based on observations and debug symbols. To improve the results, it is possible to generate a map file that lists every function found in the monitoring data together with the file in which it is defined, plus an additional column identifying a grouping; the directory structure of the source code is one example of such a grouping. Our tooling provides options to generate such mapping files automatically, which can then be tweaked by the engineer.
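
The exact file format depends on the tooling; as an illustrative sketch only, such a mapping could be a simple table with one function per row (all names below are made up):

# operation;defining file;additional grouping (here: the source directory)
calc_gw;model/src/calc_gw.f;model/src
ini_fields;model/src/ini_fields.f;model/src
mom_fluxform;pkg/mom_fluxform/mom_fluxform.f;pkg/mom_fluxform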

Besides the dynamic analysis with Kieker, we also perform static code analysis and merge in the recovered static architecture. All elements of these architectures are tagged to indicate their origin. This allows identifying whether an operation or component exists in the statically or dynamically recovered architecture. It is also possible to join multiple dynamic analyses to identify shared components, which is helpful when analyzing variants and versions.

[Figure: UVic model (v2.9.2) architecture with two levels of components]

Step 6: Recover Interfaces

While newer Fortran dialects support interfaces comparable to the interfaces of units in Pascal and modules in Modula-2, older versions do not carry any interface information. Therefore, we aim to recover interfaces based on the calls between components. Different strategies are available; for example, all calls from one component to another component can be grouped into one interface. However, this produces very large interfaces and is not helpful for program comprehension. Therefore, we collect for each provided operation its callee component and all its caller components. Operations with an identical set of caller components are then put into one provided interface of the callee component. For example, if operations f and g of a component are called only by components A and B, while h is called only by A, then f and g form one provided interface and h another. This still creates too many interfaces, as not every component will use all operations provided by another component. However, it provides a good starting point for semi-automated refinement.

Step 7: Inspect the Recovered Architecture

There are different tools available to visualize and measure the recovered architecture.
First, the Kieker development tools include two views that allow viewing the architecture in Eclipse utilizing KLighD. One view only shows the composition of the assembly model without call-based links; the other includes call information. Both visualizations allow inspecting the recovered model interactively.

Second, the mvis command line tool can visualize, inspect, and measure recovered architectures. It can color the model based on the data source of a recovery, which is helpful when mixing architectures from dynamic and static recovery.
For example, to identify components and operations present in both architectures, shared elements can be colored differently. In addition, mvis can compute different metrics on the architecture.


Make Model Output Comparable

Type: Bachelor or Master

Task: A key issue in the ocean and climate modelling community is comparing output from different models, as it comes in different formats, notations, and units. The goal of this thesis is to create a common format, or a way to specify a format together with transformations that convert model output into a common form.

Previous work exists on this topic:

  • CMIP
  • ESMValTool https://www.esmvaltool.org/
  • CMOR https://cmor.llnl.gov/

Dataflow Analysis of Climate Models

Type: Bachelor or Master

Task: Static analysis of Fortran code with FParser to extract data flow and create a data flow model from it. FParser is written in Python, while most of our analysis tools are written in Java. Thus, FParser will be used to identify read and write accesses to data and return a list of such accesses, which can then be used to enrich our existing dataflow model.

Identify and analyze coding techniques for mathematical methods in Fortran

Type: Bachelor (one project, one analysis)

Task: Analyze a scientific modeling software written in Fortran for coding techniques that implement mathematical methods, identify invariants, and create testable assertions in Python.

Sources & Notes

Starting point: Existing Fortran software

  • UVic http://terra.seos.uvic.ca/model/
  • MITgcm https://github.com/MITgcm/MITgcm

Thematic Domain Analysis for Ocean Modeling

We just published our paper on the analysis of the ocean modeling domain. It provides answers on the characteristics of the domain, how scientists develop and research models, how they implement them, and how technologies and methods are applied to this endeavor. Based on these answers, software engineers can better apply their tools, methods, and approaches to the scientific modeling domain to support the software side of model development, which suffers from a lack of engineering insight.

The paper is available as a pre-print at https://arxiv.org/abs/2202.00747, and the final version is available via DOI 10.1016/j.envsoft.2022.105323.

Software Development Processes in Ocean System Modeling

Scientific modeling provides mathematical abstractions of real-world systems and builds software as implementations of these mathematical abstractions. Ocean science is a multidisciplinary discipline developing scientific models and simulations as ocean system models that are an essential research asset.
In software engineering and information systems research, modeling is also an essential activity. In particular, business process modeling for business process management and systems engineering is the activity of representing processes of an enterprise, so that the current process may be analyzed, improved, and automated.
In this paper, we employ process modeling for analyzing scientific software development in ocean science to advance the state in engineering of ocean system models and to better understand how ocean system models are developed and maintained in ocean science. We interviewed domain experts in semi-structured interviews, analyzed the results via thematic analysis, and modeled the results via the business process modeling notation BPMN.
The resulting process models describe an aspired state of software development in the domain, which is often not (yet) implemented. This enables existing processes in simulation-based systems engineering to be improved with the help of these process models.

The paper can be found at https://arxiv.org/abs/2108.08589

Thematic Map of the Domain Analysis

One of our initial initiatives was to understand the domain of ocean modeling, its processes, and its characteristics. Therefore, we conducted a set of interviews with domain experts, i.e., scientists, research software engineers, and technicians. To analyze the interview data, we relied on a Thematic Analysis approach. The resulting thematic map distinguishes themes (yellow) and categories (blue).

Develop a DSL for Bio-Geo-Chemical Models

Type: Master Thesis

Task: Part of the OceanDSL project is to provide a DSL for biogeochemical models or parts of them. These models can be specified in various ways. Our goal is to provide a concise DSL that allows creating and extending such models. The key tasks in this thesis are:

  • Analyze the domain to understand how biogeochemical models are researched and developed.
  • Identify parts we can address with a new DSL.
  • Design a DSL based on our technology stack, the stack of Dusk/Dawn or PSyclone, or a combination of those, depending on your findings.

Resources

  • Dusk/Dawn https://github.com/MeteoSwiss-APN/dawn
  • Metos3D
  • PSyclone https://psyclone.readthedocs.io/en/stable/
  • Biogeochemical models
    • Piwonski, J. and Slawig, T. (2016). Metos3D: the Marine Ecosystem Toolkit for Optimization and Simulation in 3-D – Part 1: Simulation Package v0.3.2. Geoscientific Model Development, 9:3729–3750
    • Kriest, I., Khatiwala, S., and Oschlies, A. (2010). Towards an assessment of simple global marine biogeochemical models of different complexity. Progress In Oceanography, 86(3-4):337–360

DSL Designing And Evaluating For Ocean Models

The development of ocean models requires knowledge from different domains. One aspect of the modeling is the model configuration, which takes place in code files or parameter lists. The configuration process differs for each ocean model, and users must know the differences. To make configuring ocean models easy, we can implement a DSL that generates valid configuration files for each model. In this thesis, we design and implement such a configuration DSL. We study use case scenarios involving model parameterization and one ocean model; based on the findings, we designed and implemented the DSL. Although the DSL does not generate all configuration files, the evaluation shows that the concept works.

  1. Serafim Simonov. 2020. DSL Designing And Evaluating For Ocean Models. Kiel University. Retrieved from http://eprints.uni-kiel.de/51160/