We recently compiled a summary of how we analyze earth system models (ESMs) based on runtime data. We also analyze them based on their code, and both approaches have their merits. Dynamic architecture recovery, i.e., recovery based on runtime observations, shows which parts of a model program are actually used, how often they are used, and how much time each function or operation consumes.
Step 1: Understand the Model’s Build Process
Before we can perform any analysis, we need to understand the build process of the scientific model and how to instrument it with Kieker. This often requires consulting the model developers.
Step 2: Configure Model and Setup Parameters
We need a model setup that executes all required parts of the model but does not take an excessive amount of time to run.
Keeping the run short is important for two reasons:
- Monitoring introduces overhead, and on top of that some code optimizations must be turned off. Otherwise, probes would be removed from the code.
- The longer the run, the more monitoring data is generated. The log can become quite extensive and hard to process.
To ensure that we have a good setup, we compile and run it. This confirms that the code compiles and that all the necessary setup and forcing data are in place.
Step 3: Instrument Scientific Model
We use the ability of the GNU Compiler Collection (GCC) and the Intel Fortran compiler (ifort, versions 19.0.4 and 2021.1.2) to weave instrumentation probes into a program (command line option -finstrument-functions). Kieker4C provides specific functions for that.
Both compiler suites are capable of instrumenting all functions, procedures, and subroutines (we refer to these as operations) in Fortran, C, and other compatible languages.
The compiler weaves in calls to two probes, which are invoked at the beginning and at the end of an operation, respectively.
By default, -finstrument-functions instruments all operations. Additional flags can restrict the instrumentation to a subset of the operations, such as -finstrument-functions-exclude-file-list, which excludes the operations defined in the listed files.
Besides activating the instrumentation feature in the compiler, we have to provide an implementation of the two probes, so we include the Kieker monitoring library in the build path. The probes have the following signatures:
void __cyg_profile_func_enter(void *this_fn, void *call_site);
void __cyg_profile_func_exit(void *this_fn, void *call_site);
The Kieker4C-library implements both probe functions and produces with Kieker the minimal set of trace events, i.e.,
In Java, we can obtain method and class names at runtime. This is not possible in compiled Fortran code. Instead, the compiler can add debug symbols to the program, which are then used to resolve names during analysis. Our analysis tools automatically call this program to extract the necessary information to resolve the names in the Kieker events.
Names in Fortran are case-insensitive, but symbols in object code are case-sensitive. Thus, compilers convert names to lowercase and prefix them with _. During recovery, we remove these decorations; otherwise, the recovered names would deviate from the source code.
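The name clean-up can be sketched as follows. This is a minimal illustration, not the code used by our tools: the function name normalize_symbol is hypothetical, and the module pattern assumes gfortran-style mangling of the form __<module>_MOD_<procedure>. Other compilers decorate symbols differently.

```python
import re

# Assumption: gfortran-style mangling for module procedures,
# e.g. "__ocean_MOD_step_forward". Other compilers use other schemes.
_MODULE_PATTERN = re.compile(r"^__(?P<module>\w+?)_MOD_(?P<name>\w+)$")


def normalize_symbol(symbol):
    """Map an object-code symbol to a (module, operation) pair.

    The module is None for operations that are not part of a module.
    """
    match = _MODULE_PATTERN.match(symbol)
    if match:
        return match.group("module").lower(), match.group("name").lower()
    # Plain external procedure: drop leading and trailing underscores.
    return None, symbol.strip("_").lower()


if __name__ == "__main__":
    for raw in ("_solve_", "__ocean_MOD_step_forward"):
        print(raw, "->", normalize_symbol(raw))
```

Stripping underscores on both sides keeps the sketch independent of whether a compiler adds the decoration in front of or behind the name.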
Details on how to introduce compiler flags and the library into the respective models can be found in our replication package.
Step 4: Model Execution
When the scientific model is set up, we execute it to collect monitoring data. Depending on the model and setup, this can take minutes or hours. For the actual collection of the monitoring data, we use the Kieker collector to receive all monitoring data, compress it and store it. The collector can be started on a different machine and produce Kieker logs, including splitting up logs to avoid file size issues.
If the collector is too slow to process all events, because it instantiates new objects for every event and facilitates event modifications, we can use NetCAT as a server instead, which is available for various platforms. Together with split, this makes it possible to store huge monitoring logs. On Linux and similar operating systems, it can be run with
nc -l 5678 | split -b 1000000000 - data-
where 5678 is the port number the probes write to. These dumps can then only be read by the TCP reader stage of Kieker when replaying the log.
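Replaying such dumps means sending the chunks back over TCP in the order in which split wrote them. The following sketch illustrates the idea under some assumptions: the helper names are hypothetical, the chunks are assumed to sort lexicographically by suffix (data-aa, data-ab, ...), and the host and port of the listening reader are placeholders that depend on your analysis setup.

```python
import socket
from pathlib import Path


def chunk_files(directory, prefix="data-"):
    """Return the dump chunks sorted by suffix (aa, ab, ...)."""
    return sorted(Path(directory).glob(prefix + "*"))


def read_blocks(directory, prefix="data-", block_size=1 << 20):
    """Yield the content of all chunks, in order, in fixed-size blocks."""
    for chunk in chunk_files(directory, prefix):
        with chunk.open("rb") as stream:
            while block := stream.read(block_size):
                yield block


def replay(directory, host, port, prefix="data-"):
    """Stream the recorded bytes to a reader listening on host:port."""
    with socket.create_connection((host, port)) as connection:
        for block in read_blocks(directory, prefix):
            connection.sendall(block)


if __name__ == "__main__":
    # Placeholder target; the actual host and port depend on the analysis setup.
    replay(".", "localhost", 5678)
```

Reading in blocks keeps memory usage constant, which matters because each chunk is about one gigabyte with the split settings shown above.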
To execute and monitor the scientific model, we first start the collector or NetCAT and then start the instrumented scientific model.
Step 5: Monitoring Data Analytics
After the model run, we analyze the collected monitoring data. Depending on how the monitoring data was collected (see above), we use the file reader or TCP reader stage with our analysis tools. The architecture reconstruction only relies on operation calls and can be created from the log data with a minimal memory footprint. Our tools utilize the Kieker Architecture Model, but other architecture models can be used too.
The analysis produces a basic architecture model based on observations and debug symbols. To improve the results, it is possible to generate a map file that lists all the functions found in the monitoring data and the file in which each is defined, with an additional column that assigns each function to a grouping. The directory structure of the source code is an example of such a grouping. Our tooling provides options to generate such mapping files automatically, which can then be tweaked to satisfy the engineer.
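To illustrate what such a map file might contain, here is a small sketch that derives the grouping from the source directory. The CSV layout, the column names, and the function names are assumptions made for this example; they are not the format our tooling uses.

```python
import csv
import io
from pathlib import PurePosixPath


def build_map_rows(operation_files):
    """Create (operation, file, group) rows; the group is the source directory."""
    rows = []
    for operation, source_file in sorted(operation_files.items()):
        group = str(PurePosixPath(source_file).parent)
        rows.append((operation, source_file, group))
    return rows


def write_map_file(rows, stream):
    """Write the rows as CSV with a header line (hypothetical layout)."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["operation", "file", "group"])
    writer.writerows(rows)


if __name__ == "__main__":
    # Example operations and paths are made up for illustration.
    observed = {
        "step_forward": "src/ocean/dynamics.f90",
        "read_forcing": "src/io/forcing.f90",
    }
    buffer = io.StringIO()
    write_map_file(build_map_rows(observed), buffer)
    print(buffer.getvalue(), end="")
```

The group column is the part an engineer would typically edit by hand afterwards, for example to merge several directories into one component.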
Besides the dynamic analysis with Kieker, we also perform static code analysis and merge the recovered static architecture with the dynamic one. All elements of these architectures are tagged to indicate their origin. This makes it possible to identify whether an operation or component exists in the statically or the dynamically recovered architecture. It is also possible to join multiple dynamic analyses to identify shared components. This is helpful when analyzing variants and versions.
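The tagging idea can be sketched with plain sets of element names. This is a simplified illustration, assuming that each recovered architecture is reduced to the names of its operations; the function names and tags are hypothetical and do not reflect the Kieker Architecture Model API.

```python
def merge_with_origin(sources):
    """Merge several architectures and record where each element came from.

    sources maps an origin tag (e.g. "static") to a collection of element names.
    The result maps each element name to the set of tags it was found in.
    """
    merged = {}
    for tag, elements in sources.items():
        for element in elements:
            merged.setdefault(element, set()).add(tag)
    return merged


def shared_elements(merged):
    """Return the elements that appear in more than one source."""
    return sorted(name for name, tags in merged.items() if len(tags) > 1)


if __name__ == "__main__":
    # Example operation names are made up for illustration.
    merged = merge_with_origin({
        "static": ["init", "step_forward", "unused_helper"],
        "dynamic": ["init", "step_forward"],
    })
    print(shared_elements(merged))
```

The same function can join several dynamic runs by using one tag per run, which is how shared components across variants or versions would show up.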
Step 6: Recover Interfaces
While newer Fortran dialects support interfaces comparable to those of units and modules in Pascal and Modula-2, respectively, older versions do not carry any interface information. Therefore, we aim to recover interfaces based on the calls between components. Different strategies are available. For example, all calls from one component to another component can be grouped into one interface. This produces very large interfaces and is not helpful for program comprehension. Instead, we collect for each provided operation its callee component and all its caller components. Then, operations with an identical set of caller components are put into one provided interface of the callee component. This can create too many interfaces, as not every component uses all operations provided by another component. However, it provides a good starting point for semi-automated refinement.
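The grouping by caller set can be sketched as follows. The input format (a list of caller component, callee component, operation triples) and the function name are assumptions for this example. The sketch also assumes that calls within a component do not contribute to its provided interfaces.

```python
from collections import defaultdict


def recover_interfaces(calls):
    """Group provided operations by the set of components that call them.

    calls is an iterable of (caller_component, callee_component, operation).
    Returns {callee_component: {frozenset(caller_components): {operations}}}.
    """
    callers = defaultdict(set)
    for caller_component, callee_component, operation in calls:
        # Assumption: component-internal calls need no provided interface.
        if caller_component != callee_component:
            callers[(callee_component, operation)].add(caller_component)

    interfaces = defaultdict(lambda: defaultdict(set))
    for (callee_component, operation), caller_set in callers.items():
        # Operations with an identical caller set share one interface.
        interfaces[callee_component][frozenset(caller_set)].add(operation)
    return {component: dict(groups) for component, groups in interfaces.items()}


if __name__ == "__main__":
    # Component and operation names are made up for illustration.
    observed_calls = [
        ("atmosphere", "ocean", "get_sst"),
        ("ice", "ocean", "get_sst"),
        ("atmosphere", "ocean", "get_salinity"),
        ("ice", "ocean", "get_salinity"),
        ("atmosphere", "ocean", "set_wind"),
        ("ocean", "ocean", "internal_helper"),
    ]
    for component, groups in recover_interfaces(observed_calls).items():
        for caller_set, operations in groups.items():
            print(component, sorted(caller_set), sorted(operations))
```

In this example, the ocean component ends up with two provided interfaces: one used by both atmosphere and ice, and one used by atmosphere only.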
Step 7: Inspect the Recovered Architecture
There are different tools available to visualize and measure the recovered architecture.
First, the Kieker development tools include two views that show the architecture in Eclipse utilizing KLighD. One view only addresses the composition of the assembly model without links based on calls; the other one includes call information. Both visualizations make it possible to inspect the recovered model interactively.
Second, the mvis command line tool can visualize, inspect, and measure recovered architectures. It can color the model based on the data source of a recovery, which is helpful when mixing different recovered architectures from dynamic and static recovery.
For example, to identify components and operations present in both architectures, shared elements can be colored differently. In addition, mvis is able to compute different metrics regarding the architecture.