How to Recover an Architecture from Monitoring Data

We can collect monitoring data from software by using Kieker monitoring probes. Probes are available in Kieker 4 Java, Kieker 4 C and Kieker 4 Python. The C package can also be used for other programming languages that can be compiled with the GNU Compiler Collection or the Intel compilers. It may also be usable with other compilers.

In this how to, we explain the process for Fortran and Java software and show one way to set up the monitoring for each. Please note that other options are available; they are described in the Kieker documentation.

Prerequisites

For this how to, we assume that you have installed:

  • gcc/gfortran or ifort compiler (for the Fortran/C example)
  • Java 11 or newer (there might be issues with Java 17)
  • Build tools, like make, autoconf, automake, autotools, libtool (for the Fortran/C example)
  • git client
  • tar, gzip and unzip
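
The prerequisites can be checked with a small shell function before starting. This is only a sketch: the function name check_tools is made up for this example, and the list of commands in the example call is an assumption that has to be adjusted to your compiler and setup.

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Returns 0 if all commands were found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call for the Fortran/C setup with gfortran (adjust as needed):
# check_tools gfortran java make autoconf automake libtoolize git tar gzip unzip
```

The function only tests whether a command exists; it does not check versions.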

Tool Installation

For the dynamic analysis, we need the OceanDSL tools, the Kieker tools and the Kieker monitoring probes for C and Java. To start, it is helpful to have a working directory for your setup.

  • Create a working directory. In this how to we call this $WORKSPACE.
  • Create the directory with: mkdir $WORKSPACE
  • Change to the workspace: cd $WORKSPACE
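
The commands in this how to use $WORKSPACE as a shell variable, so it has to be set in every shell you work in. The following sketch shows one way to do this; the function name enter_workspace and the directory in the example call are assumptions, not part of any tool.

```shell
#!/bin/sh
# Create the workspace directory if needed, change into it and
# export WORKSPACE as an absolute path.
# $1: directory to use as workspace
enter_workspace() {
    mkdir -p "$1" || return 1
    cd "$1" || return 1
    WORKSPACE=$(pwd)
    export WORKSPACE
}

# Example call (the directory name is only an illustration):
# enter_workspace "$HOME/kieker-workspace"
```

The function has to be defined in the current shell (pasted or sourced), because cd and export in a separate script would not affect your shell. An absolute path matters later on, for example for ./configure --prefix and for the collector configuration.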

Install OceanDSL Tools

Clone oceandsl tools with: git clone https://git.se.informatik.uni-kiel.de/oceandsl/oceandsl-tools.git

To build the oceandsl tools enter the tool directory with

cd oceandsl-tools

Run the Gradle build script

./gradlew build

Once the build has succeeded, assemble all tools in one archive. This is not strictly necessary, but it reduces work later, as you do not have to unpack each tool separately.

./assemble-tools.sh

Leave the oceandsl-tools directory, move the repository out of the way and extract the tool archive.

cd ..
mv oceandsl-tools oceandsl-tools-repo
tar -xvzpf oceandsl-tools-repo/build/oceandsl-tools.tgz

This creates a folder oceandsl-tools with a bin and a lib folder inside, containing all tools necessary for this how to and the others on this site.

Install Kieker Language pack C

First, you have to clone the Kieker language pack C from GitHub.

git clone https://github.com/kieker-monitoring/kieker-lang-pack-c.git

Build the Kieker language pack C library for monitoring (this is only necessary for the Fortran/C example). Enter the directory with

cd kieker-lang-pack-c/source

and run the following tools, which are part of the autoconf, automake and libtool packages. If these tools are not installed, please install them first.

libtoolize
aclocal
autoconf
automake --add-missing

Note: We intend to provide a Debian package for this language pack. However, it is currently not available. Therefore, you have to compile the library yourself.

Continue with compiling the library. The library can be installed system-wide; in that case, ./configure ; make builds it, and make install, run as root user, installs it. However, in many cases this is not the best option. To install it in the $WORKSPACE instead, run

./configure --prefix=$WORKSPACE
make
make install

This installs the library in $WORKSPACE/lib and some examples in $WORKSPACE/bin.

Switch back to working directory with cd $WORKSPACE

Install Kieker Collector

The Kieker Collector is used to collect all monitoring events during runtime. It is part of the Kieker bundle. Download the bundle from:

https://github.com/kieker-monitoring/kieker/releases/download/1.15.2/kieker-1.15.2-binaries.zip

Unzip the bundle and unpack the collector in the $WORKSPACE directory.

unzip kieker-1.15.2-binaries.zip
unzip kieker-1.15.2/tools/collector-1.15.2.zip

The collector requires a configuration file that tells it where to store the monitoring data and on which port to listen. Here is an example configuration file collector.conf. Please note that the word WORKSPACE has to be replaced with the fully qualified path of the working directory.

# common
kieker.monitoring.name=
kieker.monitoring.hostname=
kieker.monitoring.metadata=true

# TCP collector
kieker.tools.source=kieker.tools.source.MultipleConnectionTcpSourceCompositeStage
kieker.tools.source.MultipleConnectionTcpSourceCompositeStage.port=5678
kieker.tools.source.MultipleConnectionTcpSourceCompositeStage.capacity=8192

# dump stage
kieker.monitoring.writer=kieker.monitoring.writer.filesystem.FileWriter
kieker.monitoring.writer.filesystem.FileWriter.customStoragePath=WORKSPACE
kieker.monitoring.writer.filesystem.FileWriter.charsetName=UTF-8
kieker.monitoring.writer.filesystem.FileWriter.maxEntriesInFile=25000
kieker.monitoring.writer.filesystem.FileWriter.maxLogSize=-1
kieker.monitoring.writer.filesystem.FileWriter.maxLogFiles=-1
kieker.monitoring.writer.filesystem.FileWriter.mapFileHandler=kieker.monitoring.writer.filesystem.TextMapFileHandler
kieker.monitoring.writer.filesystem.TextMapFileHandler.flush=true
kieker.monitoring.writer.filesystem.TextMapFileHandler.compression=kieker.monitoring.writer.compression.NoneCompressionFilter
kieker.monitoring.writer.filesystem.FileWriter.logFilePoolHandler=kieker.monitoring.writer.filesystem.RotatingLogFilePoolHandler
kieker.monitoring.writer.filesystem.FileWriter.logStreamHandler=kieker.monitoring.writer.filesystem.BinaryLogStreamHandler
kieker.monitoring.writer.filesystem.FileWriter.flush=true
kieker.monitoring.writer.filesystem.FileWriter.bufferSize=81920

For more details on the options, you can consult the Kieker documentation.
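
Replacing the WORKSPACE placeholder can also be scripted. The following is a minimal sketch using sed; the function name set_storage_path is made up for this example, and it assumes that the workspace path contains none of the characters |, & or \, which have a special meaning in the sed expression.

```shell
#!/bin/sh
# Set the customStoragePath entry of a collector configuration file
# to the absolute path of the given directory.
# $1: configuration file, $2: workspace directory
set_storage_path() {
    conf=$1
    dir=$(cd "$2" && pwd) || return 1
    key='kieker.monitoring.writer.filesystem.FileWriter.customStoragePath'
    # Rewrite only the line that starts with the key; keep all other lines.
    sed "s|^\($key\)=.*|\1=$dir|" "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
}

# Example call:
# set_storage_path collector.conf "$WORKSPACE"
```

Only the customStoragePath line is changed; the port and all other settings stay as they are.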

Fortran Example Setup

For the Fortran example, we use the publicly available MIT General Circulation Model (MITgcm).

Check out the model with

git clone https://github.com/MITgcm/MITgcm.git

Switch to the verification directory which contains a couple of example model setups with:

cd MITgcm/verification

For this how to, we select tutorial_barotropic_gyre and type

cd tutorial_barotropic_gyre

First, we check whether it compiles at all, to be sure that everything is in place (cf. README.md). We type:

cd build
../../../tools/genmake2 -mods ../code [-of my_platform_optionFile]
make depend
make
cd ..

The option -of can be omitted for this test, but we have to find the correct platform file for our platform to add the instrumentation later. Thus, we look into the ../../tools/build_options/ directory with ls ../../tools/build_options/.

If you have gfortran installed and work on Linux, choose ../../tools/build_options/linux_amd64_gfortran.

Now try to run the model with

cd run
ln -s ../input/* .
ln -s ../build/mitgcmuv .
./mitgcmuv > output.txt
cd ..

After the model has run, check the output.txt file for errors. In case everything went fine, we can instrument the model.

We describe the instrumentation using this example. For other C or Fortran projects, the details will differ. Therefore, we provide a general explanation first and then apply it to MITgcm’s setup.

As you have installed the Kieker language pack C, the libkieker library is available in the local library folder. To use it, you have to do four things with gcc or ifort:

  1. Specify -finstrument-functions as command line parameter to the compiler
  2. Add the library to the library path with -L$WORKSPACE/lib
  3. Add the library to the libraries to be used during compilation with -lkieker
  4. Set option -g to include debug symbols in the executable

With MITgcm, this is done by creating a new platform file. Let's assume you have used the linux_amd64_gfortran file before. Duplicate the file and append the lines below.

cp ../../tools/build_options/linux_amd64_gfortran ../../tools/build_options/linux_amd64_gfortran_kieker
nano ../../tools/build_options/linux_amd64_gfortran_kieker

This opens the nano editor, if available. Feel free to use another editor instead. In the editor, append the following lines at the bottom of the platform file. Please replace the word WORKSPACE with the actual path to the workspace directory to avoid unwanted behavior.

# Kieker setup
FOPTIM=""
F90OPTIM=""

FFLAGS="$FFLAGS -finstrument-functions -g"
CFLAGS="$CFLAGS -finstrument-functions -g"
LIBS="$LIBS -L/usr/lib/x86_64-linux-gnu -L$WORKSPACE/lib -lkieker -ldl"

In some instances, the -L/usr/lib/x86_64-linux-gnu and -ldl options are not necessary. This seems to depend highly on the version and distribution of Linux.
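
As an alternative to editing by hand, the copy and the appended lines can be produced by a small script. This is a sketch under the assumption that the lines shown above are all that is needed; the function name make_kieker_optfile is made up for this example.

```shell
#!/bin/sh
# Copy a platform file and append the Kieker settings to the copy.
# $1: original platform file, $2: new platform file, $3: workspace directory
make_kieker_optfile() {
    cp "$1" "$2" || return 1
    # \$ keeps $FFLAGS, $CFLAGS and $LIBS literal; $3 is expanded now.
    cat >> "$2" <<EOF

# Kieker setup
FOPTIM=""
F90OPTIM=""

FFLAGS="\$FFLAGS -finstrument-functions -g"
CFLAGS="\$CFLAGS -finstrument-functions -g"
LIBS="\$LIBS -L/usr/lib/x86_64-linux-gnu -L$3/lib -lkieker -ldl"
EOF
}

# Example call from the tutorial_barotropic_gyre directory:
# make_kieker_optfile ../../tools/build_options/linux_amd64_gfortran \
#     ../../tools/build_options/linux_amd64_gfortran_kieker "$WORKSPACE"
```

The workspace path is written into the file as a fixed string, so the platform file does not depend on the WORKSPACE variable being set when genmake2 runs.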

Save the file and rebuild the model with the new platform file:

cd build
../../../tools/genmake2 -mods ../code -of ../../../tools/build_options/linux_amd64_gfortran_kieker
make depend
make
cd ..

Start the collector in a separate shell or window with the configuration file you created above:

collector-1.15.2/bin/collector -c collector.conf

When the collector is running properly, it will output something like the listing below. However, if it shows an unexpected stack trace instead, this could be due to Java 17. In this case, you have to install an older version of Java and specify its location in the JAVA_HOME variable, e.g.,

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
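
To find out which Java version is picked up, the output of java -version can be inspected. The following sketch extracts the major version from the first line of that output; the function name java_major is made up for this example, and the version line formats in the comments are assumptions based on common OpenJDK output.

```shell
#!/bin/sh
# Extract the major Java version from a version line such as:
#   openjdk version "11.0.18" 2023-01-17
# $1: first line of the output of `java -version`
java_major() {
    v=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$v" in
        1.*) v=${v#1.} ;;   # legacy scheme: 1.8.0_352 means Java 8
    esac
    # Keep only the leading digits.
    echo "${v%%[!0-9]*}"
}

# Example call (java -version writes to standard error):
# java_major "$(java -version 2>&1 | head -n 1)"
```

If the result is 17 and the collector fails, setting JAVA_HOME as shown above is the next step.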

The listing below shows the output of a properly running collector. To terminate it, you can press CTRL-C. The collector does not terminate automatically, as it works as a monitoring collection service to which applications can connect, disconnect and reconnect. Therefore, it will not terminate just because no application is currently using the service.

17:19:19.094 [main] INFO  k.m.core.controller.TCPController - Could not parse port for the TCPController, deactivating this option. Received string was: 
17:19:19.115 [main] INFO  k.m.core.controller.StateController - Enabling monitoring
17:19:19.123 [main] INFO  k.m.c.c.MonitoringController - Current State of kieker.monitoring (1.15.2) Status: 'enabled'
	Name: '2.9'; Hostname: 'glasgow'; experimentID: '1'
JMXController: JMX disabled
TimeSource: 'kieker.monitoring.timer.SystemNanoTimer'
	Time in nanoseconds (with nanoseconds precision) since Thu Jan 01 01:00:00 CET 1970'
ProbeController: disabled
WriterController:
	Queue type: class kieker.monitoring.queue.BlockingQueueDecorator
	Queue capacity: 10000
	Insert behavior (a.k.a. QueueFullBehavior): class kieker.monitoring.queue.behavior.BlockOnFailedInsertBehavior
		numBlocked: 0
Writer: 'kieker.monitoring.writer.filesystem.FileWriter'
	Configuration:
		kieker.monitoring.writer.filesystem.FileWriter.logFilePoolHandler='kieker.monitoring.writer.filesystem.RotatingLogFilePoolHandler'
		kieker.monitoring.writer.filesystem.FileWriter.charsetName='UTF-8'
		kieker.monitoring.writer.filesystem.FileWriter.logStreamHandler='kieker.monitoring.writer.filesystem.BinaryLogStreamHandler'
		kieker.monitoring.writer.filesystem.FileWriter.bufferSize='81920'
		kieker.monitoring.writer.filesystem.FileWriter.maxEntriesInFile='25000'
		kieker.monitoring.writer.filesystem.FileWriter.maxLogFiles='-1'
		kieker.monitoring.writer.filesystem.FileWriter.maxLogSize='-1'
		kieker.monitoring.writer.filesystem.FileWriter.mapFileHandler='kieker.monitoring.writer.filesystem.TextMapFileHandler'
		kieker.monitoring.writer.filesystem.FileWriter.flush='true'
		kieker.monitoring.writer.filesystem.FileWriter.customStoragePath='WORKSPACE'
		kieker.monitoring.writer.filesystem.FileWriter.actualStoragePath='WORKSPACE/kieker-20230223-161919-19823802456291-UTC--2.9'

	Automatic assignment of logging timestamps: 'true'
Sampling Controller: Periodic Sensor available: Poolsize: '0'; Scheduled Tasks: '0'
17:19:19.132 [main] INFO  teetime.framework.Execution - Using scheduler: teetime.framework.scheduling.pushpullmodel.PushPullScheduling@250d440

Now switch back to the other shell and start the model. If both are running on the same machine, the model should connect to the collector, which in turn will print out statistics on how many events it has received.

cd run
ln -sf ../input/* .
ln -sf ../build/mitgcmuv .
./mitgcmuv > output.txt
cd ..

Depending on your setup and the runtime of the model, this may take some time, so now is a good moment for a coffee or a walk.

Java Example Setup

<to be added later>

Recover Architecture from Monitoring Data

When the model has terminated, you can stop the collector with CTRL-C or by sending it a TERM signal.

To run the architecture recovery, you have to find the model's executable and the addr2line tool. If addr2line is not installed, it is often part of a package called binutils. However, this may differ from system to system. addr2line is able to extract the symbol names from the executable, and the analysis tool can use this information to translate the function pointers that have been recorded during monitoring.

On a standard Linux installation, addr2line is in /usr/bin/addr2line and the model executable should be in $WORKSPACE/MITgcm/verification/tutorial_barotropic_gyre/run/mitgcmuv.
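
Before starting the analysis, it can help to verify that all inputs are in place. The following sketch checks the two paths and looks for a log directory; the function name check_recovery_inputs is made up for this example, and the pattern kieker-*-UTC-* is an assumption derived from the actualStoragePath shown in the collector output.

```shell
#!/bin/sh
# Check that addr2line and the model executable exist and are executable,
# and that at least one Kieker log directory is present.
# $1: addr2line path, $2: model executable, $3: workspace directory
check_recovery_inputs() {
    [ -x "$1" ] || { echo "addr2line not found: $1"; return 1; }
    [ -x "$2" ] || { echo "model executable not found: $2"; return 1; }
    for d in "$3"/kieker-*-UTC-*; do
        [ -d "$d" ] && return 0
    done
    echo "no Kieker log directory found in $3"
    return 1
}

# Example call:
# check_recovery_inputs /usr/bin/addr2line \
#     "$WORKSPACE/MITgcm/verification/tutorial_barotropic_gyre/run/mitgcmuv" \
#     "$WORKSPACE"
```

The narrower pattern is used because the workspace also contains other entries starting with kieker-, such as kieker-lang-pack-c and kieker-1.15.2.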

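To execute the dynamic analysis, we run the dynamic analysis recovery tool. The pattern kieker-*-UTC-* matches only the log directories written by the collector (see actualStoragePath in the collector output above) and not the other kieker-* entries in the workspace, such as kieker-lang-pack-c and kieker-1.15.2:

$WORKSPACE/oceandsl-tools/bin/dar -i $WORKSPACE/kieker-*-UTC-* -o $WORKSPACE/model -c -E uniq-experiment-name -l dynamic -m file-mode -s elf -a /usr/bin/addr2line -e $WORKSPACE/MITgcm/verification/tutorial_barotropic_gyre/run/mitgcmuv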

This produces multiple model files in the $WORKSPACE/model directory, including a .project file suitable for Eclipse, so we can easily open the models in Eclipse and use the Kieker Development Tools to inspect them. However, this is part of another how to.

The options used in this run and their usage are described in the tool documentation.