Research Topic

InstRO: A Component-Based Toolbox For Performance Instrumentation


Performance analysis is an essential aspect of optimizing the viability, efficiency and scalability of high performance computing (HPC) applications. Performance analysis is defined as the exploration of the target application to identify scalability and efficiency limiting artifacts, called bottlenecks.

A frequently used source of information for such analyses is performance measurement, i.e. the observation of the runtime behavior of the target.

Three dominant inspection techniques are used to facilitate such performance measurements: sampling, pre-link instrumentation and binary instrumentation.
As with all measurements, these techniques cause measurement perturbation, called overhead. Depending on how and in what detail the measurements occur, this overhead can easily surpass the runtime of the original application, even by orders of magnitude.

Therefore, users must strike a balance, and often compromise, between the inevitable overhead and the data coverage. This requires the flexibility to adjust what is measured and how often, which state-of-the-art tools based on sampling or binary instrumentation offer. However, current pre-link instrumentation tools, such as the popular GNU and Intel compilers, are lacking in this regard. Since each of the three techniques is needed, depending on the circumstances and the analysis question at hand, an improved pre-link instrumentation capability is required.
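
To make the trade-off concrete, the following minimal sketch estimates relative overhead under a deliberately simple assumption: every measurement event adds a fixed cost, and nothing else changes. The function name, the event counts and the per-event cost of 100 ns are hypothetical illustration values, not measured data.

```python
def relative_overhead(base_runtime_s: float, events: int, cost_per_event_s: float) -> float:
    """Added measurement time divided by the uninstrumented runtime.

    Assumes a fixed cost per event and ignores secondary effects
    such as cache pollution or inhibited compiler optimizations.
    """
    return events * cost_per_event_s / base_runtime_s


# Hypothetical target: 10 s uninstrumented runtime, 100 ns per measurement event.
BASE_RUNTIME_S = 10.0
COST_PER_EVENT_S = 100e-9

# Coarse instrumentation: only a few thousand events are recorded.
coarse = relative_overhead(BASE_RUNTIME_S, 10_000, COST_PER_EVENT_S)

# Fine instrumentation: a tiny function called ten billion times is measured as well.
fine = relative_overhead(BASE_RUNTIME_S, 10_000_000_000, COST_PER_EVENT_S)

print(f"coarse: {coarse:.2%} of the original runtime")
print(f"fine:   {fine:.0f} times the original runtime")
```

Under these assumptions the coarse setup adds about 0.01% to the runtime, while the fine setup costs about 100 times the original runtime. This is why control over what is measured matters.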


To enable performance analysis of current and future HPC programs, we propose InstRO, a component-based, compiler-based toolbox for performance instrumentation that offers the required flexibility and control.

The InstRO concept provides two layers that serve different needs observed in research and production environments: the primary analyst layer with the analyst interface, and the supporting tooling layer with the tooling interface. The analyst layer offers a high-level instrumentation specification mechanism based on selection and instrumentation components, called passes. Each pass provides a specific instrumentation action, such as the selection of specific constructs according to some property, or the injection of measurement code.

By orchestrating these components into an instrumentation graph, called an instrumentation configuration (IC), the analyst can customize the instrumentation according to the analysis question and target application at hand. Fig. 1 shows an example of such an IC, specifying a simple GCC-style function instrumentation with black- and white-listing.
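
The idea of composing passes into a configuration can be sketched as follows. This is a conceptual model only, not the InstRO interface: all names (`select_all`, `select_by_name`, `intersection`, `difference`, `instrument_functions`, `gcc_style_configuration`) are hypothetical, a construct is reduced to a function name, and the semantics assumed here are that a white list restricts the candidates and a black list always wins.

```python
from typing import Callable, FrozenSet, Iterable, List

# Hypothetical model: a program is a set of function names,
# and a selection pass maps a program to the subset it selects.
Selector = Callable[[FrozenSet[str]], FrozenSet[str]]


def select_all() -> Selector:
    """Selection pass: every function in the program."""
    return lambda program: program


def select_by_name(names: Iterable[str]) -> Selector:
    """Selection pass: functions whose name appears in the given list."""
    wanted = frozenset(names)
    return lambda program: program & wanted


def intersection(a: Selector, b: Selector) -> Selector:
    """Combining pass: constructs selected by both inputs."""
    return lambda program: a(program) & b(program)


def difference(a: Selector, b: Selector) -> Selector:
    """Combining pass: constructs selected by a but not by b."""
    return lambda program: a(program) - b(program)


def instrument_functions(selected: Selector) -> Callable[[FrozenSet[str]], List[str]]:
    """Instrumentation pass: here it only reports which functions
    would receive entry/exit probes instead of rewriting code."""
    return lambda program: sorted(selected(program))


def gcc_style_configuration(whitelist: List[str], blacklist: List[str]):
    """Wire the passes into one configuration (assumed semantics:
    an empty white list means 'all functions'; the black list wins)."""
    candidates = select_all()
    if whitelist:
        candidates = intersection(candidates, select_by_name(whitelist))
    return instrument_functions(difference(candidates, select_by_name(blacklist)))


program = frozenset({"main", "solve", "dot", "log_debug"})
ic = gcc_style_configuration(whitelist=["main", "solve", "dot"], blacklist=["dot"])
instrumented = ic(program)
print(instrumented)  # ['main', 'solve']
```

Because every pass has the same shape, changing the analysis question only means rewiring the graph, for example swapping the name-based selector for one driven by a different property.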

As research and HPC methods evolve, novel analysis and measurement schemes will become necessary. In such cases, and in situations where the predefined components are insufficient, the analyst can fall back on the tooling layer, which offers the analyses and interfaces necessary to customize existing passes or develop new ones.

The available analyses, such as the call-graph or the control-flow-graph analyses, are tailored to the instrumentation needs and offer abstraction of the complex compiler technology used. Additional code transformation and instrumentation abstractions enable easy interfacing with different analysis tools.

Fig. 1: An instrumentation configuration for a simple GCC-style function instrumentation with black- and white-listing


With the InstRO prototypes, based either on the ROSE source-to-source framework or on the Clang compiler framework, we were able to develop new and useful methods that provide engineers with improved measurement capabilities:

  • Call-context tracking: A novel call-context tracking technique, called call path differentiation (CPD), was enabled by the InstRO concept. In many cases, this technique enables considerably cheaper call-context tracking than existing methods.
  • Improved measurement detail: The InstRO concept made it possible to extend instrumentation coverage beyond the established functions to control constructs, such as loops. This enables measurement of the key driving loop structures and functions of the SPEC MPI 104.milc benchmark with less than 3% overhead, giving engineers easy access to an understanding of their programs' behaviour.
  • Overview measurement support: The InstRO technology provides mechanisms for out-of-the-box overview measurements, i.e. the initial measurement without prior information, which aims to capture performance information for as many constructs as possible without generating too much overhead. Existing full instrumentation fails here, with overheads of orders of magnitude for our three test cases. Using InstRO to customize a novel selection scheme based on call-graph analysis, instrumentation without a priori information achieved usable overview measurements with less than 10% overhead.
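
As an illustration of what a call-graph-driven selection can look like, the sketch below keeps only functions reachable from an entry point within a bounded number of call edges. This depth bound is a stand-in heuristic chosen for the example, not the selection scheme referred to above; the function `select_by_call_depth` and the toy call graph are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def select_by_call_depth(call_graph: Dict[str, List[str]], root: str, max_depth: int) -> Set[str]:
    """Breadth-first walk from root over caller -> callee edges.

    Returns every function whose shortest call distance from root
    is at most max_depth. Deeper functions are left uninstrumented.
    """
    depth = {root: 0}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        if depth[caller] == max_depth:
            continue  # do not descend below the depth bound
        for callee in call_graph.get(caller, []):
            if callee not in depth:  # first visit is the shortest distance in BFS
                depth[callee] = depth[caller] + 1
                queue.append(callee)
    return set(depth)


# Toy static call graph (hypothetical): caller -> list of callees.
call_graph = {
    "main": ["init", "solve"],
    "init": [],
    "solve": ["step"],
    "step": ["dot", "axpy"],
}

selected = select_by_call_depth(call_graph, "main", max_depth=2)
print(sorted(selected))  # ['init', 'main', 'solve', 'step']
```

The small kernels `dot` and `axpy` sit below the bound and stay unmeasured. The assumption behind this choice is that such deep, frequently called functions contribute most of the measurement events.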


The individual examples researched show that the InstRO design and prototypes enable new instrumentation schemes. Altogether, we believe that the instrumentation philosophy that InstRO represents will provide HPC analysts with another powerful tool that empowers them for extensive and productive performance exploration of sophisticated and substantial HPC codes.
InstRO provides a basis for some of the current and planned research projects of the Chair of Scientific Computing and is used by the Hessian Center for High Performance Computing to provide refined performance measurement capabilities.

The current version of InstRO is available on GitHub.

Key Research Area

Performance measurement techniques; Compiler-based Program Instrumentation; Massive Parallelism; High Performance Computation




Dolivostraße 15

D-64293 Darmstadt



+49 6151 16 - 24401 or 24402


+49 6151 16 - 24404


christian.iwainsky (at) sc.tu-...
