TypeART [TA18; TA20; TA22; TA24] is a type and memory allocation tracking sanitizer based on the LLVM compiler toolchain for C/C++ (OpenMP) codes. It pairs a compiler plugin (for instrumentation) with a runtime library to track memory type, size, and location of heap, stack and global allocations.
Low-level C APIs often rely on void* pointers for generic types, requiring users to specify type and size manually, a process prone to errors. Examples of type-unsafe APIs include the Message-Passing Interface (MPI), checkpointing libraries, and numeric solver libraries. TypeART facilitates verification by ensuring, for example, that a void* argument corresponds to an array of expected type T with length n.
MUST [MU13], a dynamic MPI correctness checker, detects issues like deadlocks or mismatched MPI datatypes. For more details, visit its project page.
MUST intercepts MPI calls for analysis but cannot deduce the effective type of void* buffers in MPI APIs. TypeART addresses this by tracking memory allocations relevant to MPI communication in user code, allowing MUST to validate type compatibility between MPI buffers and declared datatypes.
To demonstrate the utility of TypeART, consider the following code:
// Otherwise unknown to MUST, TypeART tracks this allocation (memory address, type and size):
double* array = (double*) malloc(length*sizeof(double));
// MUST intercepts this MPI call, asking TypeART's runtime for type information:
// 1. Is the first argument of type double (due to MPI_DOUBLE)?
// 2. Is the allocation at least of size *length*?
MPI_Send((void*) array, length, MPI_DOUBLE, ...)
MUST and TypeART also support MPI derived datatypes with complex underlying data structures. For further details, see our publications, or download MUST (v1.8 or higher integrates TypeART) from its project page.
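The two questions in the comments above can be illustrated with a small, self-contained sketch. It is not the TypeART runtime or the MUST interface: `TypeKind`, `AllocationTable`, and `demo_send_check` are hypothetical names used only to show the idea of recording an allocation and later checking a buffer against a declared type and element count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>

// Hypothetical stand-ins; these are NOT TypeART's or MUST's real types.
enum class TypeKind { Double, Float, Int };

struct AllocationInfo {
  TypeKind type;      // element type recorded at allocation time
  std::size_t count;  // number of elements in the allocation
};

// Toy allocation table keyed by the start address of each allocation.
class AllocationTable {
 public:
  void on_alloc(const void* addr, TypeKind type, std::size_t count) {
    table_[addr] = AllocationInfo{type, count};
  }
  void on_free(const void* addr) { table_.erase(addr); }

  // Answers both questions: same element type, and at least `count` elements?
  bool buffer_matches(const void* addr, TypeKind expected, std::size_t count) const {
    const auto it = table_.find(addr);
    if (it == table_.end()) {
      return false;  // unknown pointer: nothing can be verified
    }
    return it->second.type == expected && it->second.count >= count;
  }

 private:
  std::map<const void*, AllocationInfo> table_;
};

// Replays the MPI_Send scenario without MPI: track a double array, then check it.
inline bool demo_send_check(std::size_t length, TypeKind declared, std::size_t send_count) {
  AllocationTable table;
  double* array = static_cast<double*>(std::malloc(length * sizeof(double)));
  if (array == nullptr) {
    return false;
  }
  table.on_alloc(array, TypeKind::Double, length);
  const bool ok = table.buffer_matches(array, declared, send_count);
  table.on_free(array);
  std::free(array);
  return ok;
}
```

In this sketch, `demo_send_check(8, TypeKind::Double, 8)` passes, while declaring `TypeKind::Float` or sending 9 elements from an 8-element array is rejected.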
Using TypeART involves two phases:
- Compilation, see Section 1.1: Compile code with Clang/LLVM using the TypeART LLVM pass plugin via the compiler wrapper script. The plugin (1) serializes static type information and (2) instruments relevant allocations.
- Execution, see Section 1.2: Run the instrumented program. The TypeART runtime tracks all memory allocations. Clients can query the runtime for type information regarding a memory pointer at relevant points during program execution.
+----Compiler----+         +-----------------------------------+
| typeart-mpicc  +----+--->| TypeART-instrumented Application  |
+----------------+    |    +--+-----+-------------------+------+
     ^              Static    |     |                   |
     |               Type     v     v                   v
+----+----+          Info   Alloc/Free           Intercepted API
| Sources |           |    +-----------+         +-------------+
+---------+           |    |  TypeART  |+--------+ Correctness |
                      +--->|  Runtime  || Query  |    Tool     |
                           |           |+------->| (ex. MUST)  |
                           +-----------+         +-------------+
The TypeART LLVM compiler pass instruments allocations and serializes static type layouts. Compiler wrapper scripts are provided (available in the bin directory of the installation) for Clang and MPI. By default, these wrappers instrument heap, stack, and global allocations. MPI wrappers additionally filter allocations unrelated to MPI calls (see Section 2.3).
Replace the compiler variable as follows:
| Variable | TypeART Wrapper | Equivalent to |
|---|---|---|
| CXX | typeart-clang++ | clang++ |
| CC | typeart-clang | clang |
| MPICC | typeart-mpicc | mpicc |
| MPICXX | typeart-mpic++ | mpic++ |
The wrappers handle the LLVM pass injection and linking:
# Compile, replace direct clang++ call with wrapper of the TypeART installation:
$> typeart-clang++ -O2 $(COMPILE_FLAGS) -c code.cpp -o code.o
# Link, also with the wrapper:
$> typeart-clang++ $(LINK_FLAGS) code.o -o binary
When using CMake, disable the wrapper during configuration (to pass internal compiler checks) but enable it for the build step.
# Temporarily disable wrapper with environment flag TYPEART_WRAPPER=OFF for configuration:
$> TYPEART_WRAPPER=OFF cmake -B build -DCMAKE_C_COMPILER=/path/to/typeart-clang
# Compile with typeart-clang:
$> cmake --build build --target install
Execute the target binary directly.
# Ensure the TypeART runtime is in the library path:
$> env LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(TYPEART_LIBPATH) ./binary
The folder demo contains an example of MPI-related type errors that can be detected using TypeART. The target code is instrumented with TypeART and executed by preloading the MPI-related check library implemented in tool.c. The tool library uses the TypeART runtime query interface. It overloads the required MPI calls and checks that the passed void* buffer corresponds to the MPI derived datatype.
To compile and run the demo targets:
- Makefile
# Valid MPI demo:
$> MPICC=*TypeART prefix*/bin/typeart-mpicc make run-demo
# Type-error MPI demo:
$> MPICC=*TypeART prefix*/bin/typeart-mpicc make run-demo_broken
- CMake, likewise:
$> TYPEART_WRAPPER=OFF cmake -S demo -B build_demo -DCMAKE_C_COMPILER=*TypeART prefix*/bin/typeart-mpicc
$> cmake --build build_demo --target run-demo
$> cmake --build build_demo --target run-demo_broken
Pass behavior is configured via the environment flags listed below. The TypeART pass prioritizes environment flags (if set) over default configuration options.
Specifically, TYPEART_OPTIONS can globally modify the TypeART pass (stack/heap specific options exist). The format requires option names separated by a semicolon, e.g., TYPEART_OPTIONS="filter-glob=API_*;no-stats" sets the filter glob target to API_* and deactivates stats printing. Prepending no- to boolean flags sets them to false.
Note: Single environment options take precedence over TYPEART_OPTIONS.
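As an illustration of the option-string format and the precedence rule, here is a minimal sketch. It is not the parser used by the TypeART pass: `parse_options` and `resolve` are hypothetical helpers, and treating a bare flag as "true" is an assumption made for the example.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parses "key=value;flag;no-flag" into a key/value map.
// Assumption: a bare flag means "true" and a "no-" prefix means "false".
inline std::map<std::string, std::string> parse_options(const std::string& text) {
  std::map<std::string, std::string> options;
  std::istringstream stream(text);
  std::string token;
  while (std::getline(stream, token, ';')) {
    if (token.empty()) {
      continue;  // tolerate a trailing or doubled semicolon
    }
    const auto eq = token.find('=');
    if (eq != std::string::npos) {
      options[token.substr(0, eq)] = token.substr(eq + 1);
    } else if (token.rfind("no-", 0) == 0) {
      options[token.substr(3)] = "false";
    } else {
      options[token] = "true";
    }
  }
  return options;
}

// A single-option environment value (if set) wins over the TYPEART_OPTIONS entry,
// which in turn wins over the built-in default.
inline std::string resolve(const std::map<std::string, std::string>& parsed,
                           const std::string& name, const char* single_env_value,
                           const std::string& fallback) {
  if (single_env_value != nullptr) {
    return single_env_value;
  }
  const auto it = parsed.find(name);
  return it != parsed.end() ? it->second : fallback;
}
```

With the string from the text, `parse_options("filter-glob=API_*;no-stats")` yields `filter-glob` mapped to `API_*` and `stats` mapped to `false`. Passing the result of `std::getenv("TYPEART_STATS")` as `single_env_value` would model the precedence described in the note.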
| Env. variable | Option name | Default value | Description |
|---|---|---|---|
| TYPEART_OPTIONS | | | Set multiple options at once, separated by ;. |
| TYPEART_OPTIONS_STACK | | | Same as above for stack phase only. |
| TYPEART_OPTIONS_HEAP | | | Same as above for heap phase only. |
| TYPEART_TYPES | types | typeart-types.yaml | Serialized type layout information of user-defined types. File location and name can also be controlled with the env variable TYPEART_TYPES. |
| TYPEART_HEAP | heap | true | Instrument heap allocations |
| TYPEART_STACK | stack | false | Instrument stack and global allocations. Enables instrumentation of global allocations. |
| TYPEART_STACK_LIFETIME | stack-lifetime | true | Instrument stack llvm.lifetime.start instead of alloca directly |
| TYPEART_GLOBAL | global | false | Instrument global allocations (see stack). |
| TYPEART_TYPEGEN | typegen | dimeta | Values: dimeta, ir. How type information is serialized, see Section 2.2. |
| TYPEART_TYPE_SERIALIZATION | type-serialization | hybrid | Values: file, hybrid, inline. How type information is stored (in the executable or externally), see Section 2.2. |
| TYPEART_STATS | stats | false | Show instrumentation statistic counters |
| TYPEART_FILTER | filter | false | Filter stack and global allocations. See also Section 2.3 |
| TYPEART_FILTER_IMPLEMENTATION | filter-implementation | std | Values: std, none. See also Section 2.3 |
| TYPEART_FILTER_GLOB | filter-glob | *MPI_* | Filter API string target (glob string) |
| TYPEART_FILTER_GLOB_DEEP | filter-glob-deep | MPI_* | Filter values based on specific API: Values passed as ptr are correlated when string matched. |
| TYPEART_ANALYSIS_FILTER_GLOBAL | analysis-filter-global | true | Filter global alloca based on heuristics |
| TYPEART_ANALYSIS_FILTER_HEAP_ALLOCA | analysis-filter-heap-alloca | true | Filter stack alloca that have a store instruction from a heap allocation |
| TYPEART_ANALYSIS_FILTER_NON_ARRAY_ALLOCA | analysis-filter-non-array-alloca | false | Filter scalar valued allocas |
| TYPEART_ANALYSIS_FILTER_POINTER_ALLOCA | analysis-filter-pointer-alloca | true | Filter allocas of pointer types |
Additionally, there are two debug environment flags for dumping the LLVM IR per phase (pre heap, heap, opt, stack) to a set of files.
| Env. variable | Description |
|---|---|
| TYPEART_WRAPPER_EMIT_IR | If set, the compiler wrapper will create 4 files for each TypeART phase with the file pattern ${source_basename}_heap.ll etc. |
| TYPEART_PASS_INTERNAL_EMIT_IR | Internal pass use only. Toggled by wrapper. |
TypeART uses either the LLVM IR type system (typegen=ir) or the external library llvm-dimeta (typegen=dimeta), which extracts type information using LLVM debug metadata. The latter is the default; the former is compatible only with LLVM 14.
The layout is serialized either as a global variable inside each translation unit (type-serialization=hybrid or inline) or via an external YAML file (type-serialization=file).
Note: In file mode, compilation must be serialized (e.g., make -j 1) to ensure consistent type information across translation units.
The type information for each user-defined type (mode hybrid) or for all types (mode inline) is stored as (constant) globals with the following format:
struct GlobalTypeInfo {
std::int32_t type_id;
const std::uint32_t extent;
const GlobalTypeInfoData* data; // nullptr for built-ins
};
struct GlobalTypeInfoData {
const char* type_name;
// data : [ num_member, flag, offsets[num_member], array_sizes[num_member] ]:
const std::uint16_t* data;
const GlobalTypeInfo** member_types;
};
Each type is registered at startup with the TypeART runtime using the callback void __typeart_register_type(const void* type_ptr);. This adds the type information to the type database (for user queries) and assigns a unique type-id.
Each user-defined type layout is assigned a unique integer type-id starting at 256. Built-in types (e.g., float) use predefined type-ids (< 256) and byte layouts. The runtime library correlates the allocation with the respective type (and layout) during execution via the type-id.
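The id scheme can be pictured with a small sketch. `TypeRegistry` is a hypothetical class, not the runtime's type database, and returning the same id when a name is registered twice is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical registry; the real runtime's data structures are not shown here.
class TypeRegistry {
 public:
  static constexpr std::int32_t kFirstUserId = 256;  // ids below are reserved for built-ins

  // Returns the id of a user-defined type, assigning the next free id on first sight.
  // Assumption: registering the same name twice yields the same id.
  std::int32_t register_type(const std::string& name, std::uint32_t extent) {
    const auto it = ids_.find(name);
    if (it != ids_.end()) {
      return it->second;
    }
    const std::int32_t id = next_id_++;
    ids_.emplace(name, id);
    extents_.emplace(id, extent);
    return id;
  }

  // Extent in bytes, or 0 if the id is unknown to this registry.
  std::uint32_t extent_of(std::int32_t id) const {
    const auto it = extents_.find(id);
    return it != extents_.end() ? it->second : 0;
  }

  static bool is_builtin(std::int32_t id) { return id >= 0 && id < kFirstUserId; }

 private:
  std::int32_t next_id_ = kFirstUserId;
  std::unordered_map<std::string, std::int32_t> ids_;
  std::unordered_map<std::int32_t, std::uint32_t> extents_;
};
```

Registering two distinct struct names in a fresh registry returns 256 and then 257, and `is_builtin` separates those ids from the predefined range below 256.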
After instrumentation, the file typeart-types.yaml (env TYPEART_TYPES) contains the static type information. Each user-defined type layout is
extracted and an integer type-id is attached to it (similarly to hybrid and inline serialization).
For example, consider the following C struct:
struct s1_t {
char a[3];
struct s1_t* b;
};
The TypeART pass may write a typeart-types.yaml file with the following content:
- id: 256            # struct type-id
  name: s1_t
  extent: 16         # size in bytes
  member_count: 2
  offsets: [ 0, 8 ]  # byte offsets from struct start
  types: [ 5, 1 ]    # member type-ids (5->char, 1->pointer)
  sizes: [ 3, 1 ]    # member (array) length
Executing a target binary requires access to the typeart-types.yaml file to correlate the type-id with actual type layouts. Specify the path using the environment variable TYPEART_TYPES:
$> export TYPEART_TYPES=/path/to/typeart-types.yaml
# If the TypeART runtime is not resolved, LD_LIBRARY_PATH is set:
$> env LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(TYPEART_LIBPATH) ./binary
The list of supported built-in type-ids is defined in TypeInterface.h and reflects the types that TypeART can represent with LLVM Debug Metadata. In contrast, when using LLVM IR Type System, certain constraints are imposed. For instance, C/C++ types like unsigned integers are unsupported (and represented like signed integers).
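To relate the s1_t layout entry shown earlier to the compiler's view of the struct, the following sketch recomputes extent and offsets with sizeof and offsetof. It assumes a typical 64-bit ABI (8-byte pointers), where the values come out as 16 and [0, 8]. `StructLayout`, `layout_of_s1`, and `member_at` are illustrative names; the type-ids 5 and 1 are copied from the YAML example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct s1_t {
  char a[3];
  struct s1_t* b;
};

// Mirrors the fields of the YAML entry; the struct name is illustrative only.
struct StructLayout {
  std::size_t extent;                // size in bytes
  std::vector<std::size_t> offsets;  // byte offset of each member
  std::vector<int> types;            // member type-ids
  std::vector<std::size_t> sizes;    // member (array) lengths
};

// Computes the layout with the compiler's own sizeof/offsetof.
inline StructLayout layout_of_s1() {
  return StructLayout{sizeof(s1_t), {offsetof(s1_t, a), offsetof(s1_t, b)}, {5, 1}, {3, 1}};
}

// Index of the last member starting at or before byte_offset, or -1 if out of range.
// Simplification: padding bytes are attributed to the preceding member.
inline int member_at(const StructLayout& layout, std::size_t byte_offset) {
  if (byte_offset >= layout.extent) {
    return -1;
  }
  int index = -1;
  for (std::size_t i = 0; i < layout.offsets.size(); ++i) {
    if (layout.offsets[i] <= byte_offset) {
      index = static_cast<int>(i);
    }
  }
  return index;
}
```

A lookup such as `member_at(layout_of_s1(), 8)` resolves to the pointer member b, which is the kind of offset-to-member resolution a layout table makes possible.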
To improve performance, a translation unit-local (TU) data-flow filter for global and stack variables exists. It follows the LLVM IR use-def chain. If the allocation provably never reaches the target API, it can be filtered. Otherwise, it is instrumented. Use the option filter to enable filtering and filter-glob=<target API glob> (default: *MPI_*) to specify the API.
Consider the following example.
extern void foo_bar(float*); // No definition in the TU
void bar(float* x, float* y) {
*x = 2.f; // x is not used after
MPI_Send(y, ...);
}
void foo() {
float a = 1.f, b = 2.f, c = 3.f;
bar(&a, &b);
foo_bar(&c);
}
- a is filtered because the aliasing pointer x is never part of an MPI call.
- b is instrumented because the aliasing pointer y is part of an MPI call.
- c is instrumented because the body of foo_bar cannot be reasoned about.
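The filter decides which callee names count as the target API by glob matching (filter-glob). The sketch below shows how such a match behaves for names like MPI_Send and foo_bar. It is a hand-written matcher for illustration only; `glob_match` is a hypothetical helper, and the pass may rely on a different glob implementation with different rules.

```cpp
#include <cassert>
#include <string>

// Minimal glob matcher supporting '*' (any run of characters) and '?' (one character).
// Illustrative only; not the matcher used by the TypeART pass.
inline bool glob_match(const std::string& pattern, const std::string& text) {
  std::size_t p = 0;
  std::size_t t = 0;
  std::size_t star = std::string::npos;  // position of the last '*' seen in pattern
  std::size_t mark = 0;                  // text position where that '*' started matching
  while (t < text.size()) {
    if (p < pattern.size() && pattern[p] == '*') {
      star = p++;  // remember the star and first try to match it against nothing
      mark = t;
    } else if (p < pattern.size() && (pattern[p] == '?' || pattern[p] == text[t])) {
      ++p;
      ++t;
    } else if (star != std::string::npos) {
      p = star + 1;  // backtrack: let the last '*' absorb one more character
      t = ++mark;
    } else {
      return false;
    }
  }
  while (p < pattern.size() && pattern[p] == '*') {
    ++p;  // trailing stars match the empty remainder
  }
  return p == pattern.size();
}
```

With the default glob `*MPI_*`, both `MPI_Send` and `PMPI_Send` match, while `foo_bar` does not. In the example above, foo_bar still forces instrumentation of c, but for a different reason: its body is not visible in the TU.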
TypeART supports LLVM version 14, 18-21, and CMake version >= 3.20.
- MPI library: (soft requirement) Needed for the MPI compiler wrappers, tests, the demo, our MPI interceptor library, and for logging with our TypeART runtime library within an MPI target application.
- OpenMP-enabled Clang compiler: Needed for some tests.
Other smaller, external dependencies are defined within the externals folder (depending on configuration options), see Section 3.3 (Runtime). They are automatically downloaded during configuration time.
TypeART uses CMake to build, cf. GitHub CI build file for a complete recipe to build.
Example build recipe (debug build, installs to default prefix ${typeart_SOURCE_DIR}/install/typeart):
$> git clone https://github.com/tudasc/TypeART
$> cd TypeART
$> cmake -B build
$> cmake --build build --target install --parallel

| Option | Default | Description |
|---|---|---|
| TYPEART_MPI_WRAPPER | ON | Install TypeART MPI wrapper (mpicc, mpic++). Requires MPI. |
| TYPEART_USE_LEGACY_WRAPPER | OFF | Use legacy wrapper invoking opt/llc directly instead of Clang's -fpass-plugin. |

| Option | Default | Description |
|---|---|---|
| TYPEART_ABSEIL | ON | Enable usage of btree-backed map of the Abseil project (LTS release) for storing allocation data. |
| TYPEART_PHMAP | OFF | Enable usage of a btree-backed map (alternative to Abseil). |
| TYPEART_SOFTCOUNTERS | OFF | Enable runtime tracking of #tracked addrs. / #distinct checks / etc. |
| TYPEART_LOG_LEVEL_RT | 0 | Granularity of runtime logger. 3 is most verbose, 0 is least. |
Default mode is to protect the global data structure with a (shared) mutex. Two main options exist:
| Option | Default | Description |
|---|---|---|
| TYPEART_DISABLE_THREAD_SAFETY | OFF | Disable thread safety of runtime |
| TYPEART_SAFEPTR | OFF | Instead of a mutex, use a special data structure wrapper for concurrency, see object_threadsafe |
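The default mode, a (shared) mutex protecting a global data structure, follows a common C++17 pattern: queries take a shared lock, updates take an exclusive lock. The sketch below shows that pattern only. `GuardedTracker` is a hypothetical class and not the TypeART runtime's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <shared_mutex>

// Hypothetical tracker: readers share the lock, writers hold it exclusively.
class GuardedTracker {
 public:
  void on_alloc(const void* addr, std::size_t bytes) {
    std::unique_lock<std::shared_mutex> lock(mutex_);  // exclusive: modifies the map
    sizes_[addr] = bytes;
  }

  void on_free(const void* addr) {
    std::unique_lock<std::shared_mutex> lock(mutex_);  // exclusive: modifies the map
    sizes_.erase(addr);
  }

  // Returns the tracked size in bytes, or 0 for an untracked pointer.
  std::size_t size_of(const void* addr) const {
    std::shared_lock<std::shared_mutex> lock(mutex_);  // shared: concurrent queries allowed
    const auto it = sizes_.find(addr);
    return it != sizes_.end() ? it->second : 0;
  }

 private:
  mutable std::shared_mutex mutex_;
  std::map<const void*, std::size_t> sizes_;
};
```

The trade-off behind options such as TYPEART_DISABLE_THREAD_SAFETY is visible here: every query and update pays for a lock acquisition, which a purely single-threaded target does not need.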

| Option | Default | Description |
|---|---|---|
| TYPEART_SHOW_STATS | ON | Passes show compile-time summary w.r.t. allocations counts |
| TYPEART_MPI_INTERCEPT_LIB | ON | Library to intercept MPI calls by preloading and check whether TypeART tracks the buffer pointer |
| TYPEART_MPI_LOGGER | ON | Enable better logging support in MPI execution context |
| TYPEART_LOG_LEVEL | 0 | Granularity of pass logger. 3 is most verbose, 0 is least |

| Option | Default | Description |
|---|---|---|
| TYPEART_TEST_CONFIG | OFF | Enable testing, and set (force) logging levels to appropriate levels for test runner to succeed |
| TYPEART_CODE_COVERAGE | OFF | Enable code coverage statistics using LCOV 1.14 and genhtml (gcovr optional) |
| TYPEART_LLVM_CODE_COVERAGE | OFF | Enable llvm-cov code coverage statistics (llvm-cov and llvm-profdata required) |
| TYPEART_ASAN, TSAN, UBSAN | OFF | Enable Clang sanitizers (tsan is mutually exclusive w.r.t. ubsan and asan as they don't play well together) |
The wrappers typeart-mpicc and typeart-mpic++ are generated for compiling MPI codes with TypeART.
The build system detects the vendor to generate wrappers with appropriate environment variables that force the use of the Clang/LLVM compiler.
Detection is supported for OpenMPI, Intel MPI, and MPICH based on mpi.h symbols. The following flags are used to set the Clang compiler:
| Vendor | Symbol | C compiler env. var | C++ compiler env. var |
|---|---|---|---|
| Open MPI | OPEN_MPI | OMPI_CC | OMPI_CXX |
| Intel MPI | I_MPI_VERSION | I_MPI_CC | I_MPI_CXX |
| MPICH | MPICH_NAME | MPICH_CC | MPICH_CXX |
Example using CMake FetchContent for consuming the TypeART runtime library.
FetchContent_Declare(
typeart
GIT_REPOSITORY https://github.com/tudasc/TypeART
GIT_TAG v2.2
GIT_SHALLOW 1
)
FetchContent_MakeAvailable(typeart)
target_link_libraries(my_project_target PRIVATE typeart::Runtime)

| [TA18] | Hück, Alexander and Lehr, Jan-Patrick and Kreutzer, Sebastian and Protze, Joachim and Terboven, Christian and Bischof, Christian and Müller, Matthias S. "Compiler-aided type tracking for correctness checking of MPI applications." In 2nd International Workshop on Software Correctness for HPC Applications (Correctness), pages 51–58. IEEE, 2018. DOI: 10.1109/Correctness.2018.00011 |
| [TA20] | Hück, Alexander and Protze, Joachim and Lehr, Jan-Patrick and Terboven, Christian and Bischof, Christian and Müller, Matthias S. "Towards compiler-aided correctness checking of adjoint MPI applications." In 4th International Workshop on Software Correctness for HPC Applications (Correctness), pages 40–48. IEEE/ACM, 2020. DOI: 10.1109/Correctness51934.2020.00010 |
| [TA22] | Hück, Alexander and Kreutzer, Sebastian and Protze, Joachim and Lehr, Jan-Patrick and Bischof, Christian and Terboven, Christian and Müller, Matthias S. "Compiler-Aided Type Correctness of Hybrid MPI-OpenMP Applications." In IT Professional, vol. 24, no. 2, pages 45–51. IEEE, 2022. DOI: 10.1109/MITP.2021.3093949 |
| [TA24] | Hück, Alexander and Ziegler, Tim and Schwitanski, Simon and Jenke, Joachim and Bischof, Christian. "Compiler-Aided Correctness Checking of CUDA-Aware MPI Applications." In SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 204–213, IEEE/ACM, 2024. DOI: 10.1109/SCW63240.2024.00032 |
| [MU13] | Hilbrich, Tobias and Protze, Joachim and Schulz, Martin and de Supinski, Bronis R. and Müller, Matthias S. "MPI Runtime Error Detection with MUST: Advances in Deadlock Detection." In Scientific Programming, vol. 21, no. 3-4, pages 109–121, 2013. DOI: 10.3233/SPR-130368 |