Clang reflector for Data Layout and OpenMP


This is a work in progress on a tool for analyzing C++ code. Two operations are of particular interest:
  • Extraction of the memory layout of all the types used in a program, even std ones, for the purpose of accessing them without manual serialization. This means that only fully instantiated templates are reported.
  • Extraction of the OpenMP statements in the program.
The output of the program is a JSON document containing the information above. The implementation is based on Clang, specifically version 3.7 because of its OpenMP support, and the program itself is a Clang tool, so it accepts all the Clang options.
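
To make the idea of accessing objects without manual serialization more concrete, here is a minimal Python sketch. It assumes a hypothetical layout record: the type Point3, the keys "size", "fields", "name", "offset" and "fmt", and the little-endian 4-byte int and float encoding are all invented for this example and are not the actual JSON schema emitted by the tool. Given such a record, the fields of an object can be read straight out of a raw byte buffer:

```python
import json
import struct

# Hypothetical layout description; the real JSON schema of the tool may differ.
LAYOUT_JSON = """
{
  "type": "Point3",
  "size": 16,
  "fields": [
    {"name": "id", "offset": 0,  "fmt": "<i"},
    {"name": "x",  "offset": 4,  "fmt": "<f"},
    {"name": "y",  "offset": 8,  "fmt": "<f"},
    {"name": "z",  "offset": 12, "fmt": "<f"}
  ]
}
"""


def read_record(layout, buffer):
    """Decode every field of one object from raw bytes using its layout."""
    if len(buffer) < layout["size"]:
        raise ValueError("buffer is smaller than the described type")
    return {
        field["name"]: struct.unpack_from(field["fmt"], buffer, field["offset"])[0]
        for field in layout["fields"]
    }


layout = json.loads(LAYOUT_JSON)
# Stand-in for memory written by a C++ program holding
# struct Point3 { int id; float x, y, z; };
raw = struct.pack("<ifff", 7, 1.0, 2.0, 3.0)
record = read_record(layout, raw)
print(record)
```

The point of the sketch is that the reader never needs hand-written serialization code for Point3: the offsets and field types come entirely from the layout description.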

The source code is on GitHub: https://github.com/eruffaldi/yreflectcxx

Updated 2016-1-31: discussion about execution flow and added lines

Updated 2017-10-31: Facebook provides a broader tool for converting the C++ AST into JSON and then analyzing it using OCaml: https://github.com/facebook/facebook-clang-plugins

OpenMP Analysis

The OpenMP static analysis is useful in the context of research on static scheduling of OpenMP programs. It is also related to the paper we are going to present at the ACM SAC conference this April:
Ruffaldi E., Dabisias G., Brizzi F. & Buttazzo G. (2016). SOMA: An OpenMP Toolchain For Multicore Partitioning. In 31st ACM/SIGAPP Symposium on Applied Computing. ACM. (PDF)
SOMA is a toolchain that performs a hybrid static and dynamic analysis of an OpenMP program and then computes a static schedule for a multicore machine based on the concept of virtual processors.

The yextractmp tool follows the usual command-line syntax of Clang tools:

yextractmp source.cpp [tooloptions] -- [compileroptions]

with -fopenmp added automatically to every invocation.

The yextractmp tool takes the Clang AST and emits a simplified AST restricted to the statements of interest. This AST can then be enriched with the statement flow by a Python script that manipulates the JSON file (ast2flow.py). The flow captures the relationships between statements, including control-flow statements such as if, break and continue, and it can be compacted into a flow of basic blocks by merging sequences of contiguous statements. Please note that leaving a compound statement has no side effects in C, but in C++ it releases resources (the destructors of local objects run), so the exit of a compound statement needs to be stated explicitly in the flow. The flow is a graph in which the nodes are the statements and the edges are control-flow transitions annotated with their semantics (e.g. then, else, return, ...).
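
The compaction into basic blocks can be sketched as follows. This is not the code of ast2flow.py: the edge representation as (source, target, label) triples, the "next" label for plain sequential flow and the statement names are assumptions made for this example. A statement is merged into the block of its predecessor when that predecessor has it as its only successor, through a "next" edge, and the statement has no other predecessor.

```python
from collections import defaultdict


def compact_flow(edges):
    """Merge straight-line chains of statements into basic blocks.

    edges: list of (source, target, label) triples, where "next" marks
    plain sequential flow (the label names are assumptions of this sketch).
    Assumes every node is reachable from an entry node without predecessors.
    Returns (blocks, block_edges); blocks maps the first statement of each
    block to the ordered list of statements in that block.
    """
    successors = defaultdict(list)
    predecessors = defaultdict(list)
    nodes = []
    for source, target, label in edges:
        successors[source].append((target, label))
        predecessors[target].append(source)
        for node in (source, target):
            if node not in nodes:
                nodes.append(node)

    def absorbs_next(node):
        """True if the only successor of node belongs to the same block."""
        if len(successors[node]) != 1:
            return False
        target, label = successors[node][0]
        return label == "next" and len(predecessors[target]) == 1

    def is_leader(node):
        """True if node starts a new basic block."""
        preds = predecessors[node]
        return not (len(preds) == 1 and absorbs_next(preds[0]))

    blocks, owner = {}, {}
    for node in nodes:
        if not is_leader(node):
            continue
        chain = [node]
        while absorbs_next(chain[-1]):
            chain.append(successors[chain[-1]][0][0])
        blocks[node] = chain
        for member in chain:
            owner[member] = node

    # Keep only the edges that cross from one block to another block entry.
    block_edges = [
        (owner[source], owner[target], label)
        for source, target, label in edges
        if not absorbs_next(source)
    ]
    return blocks, block_edges


edges = [
    ("decl", "assign", "next"),
    ("assign", "if", "next"),
    ("if", "call_a", "then"),
    ("if", "call_b", "else"),
    ("call_a", "ret", "next"),
    ("call_b", "ret", "next"),
]
blocks, block_edges = compact_flow(edges)
print(blocks)
print(block_edges)
```

In this example the three statements leading up to the if end up in one block, while the two branches and the return stay separate because the return has two predecessors.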

The next step (in development) takes the flow and annotates it with the parallel execution semantics of OpenMP. In particular, two transitions are important: parallel execute and wait.

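The overall steps are:
  • yextractmp takes the Clang AST and emits a simplified AST as JSON.
  • ast2flow.py annotates the simplified AST with the statement flow.
  • The flow is compacted into a flow of basic blocks.
  • The flow is annotated with the parallel execution semantics of OpenMP (in development).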

Future Work

The OpenMP extractor needs to be completed with support for all the major OpenMP directives and their associated clauses. Support for member functions has been removed for now, but it will be added back later.

The data layout tool needs some bug fixes for special structures and more testing in general. One idea is to provide a way to limit the output by filtering on a target namespace, so that only the types belonging to that namespace, together with all their dependencies, are listed.
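
The namespace filter could work along these lines. This is only a sketch of the idea, not existing code: the dictionary mapping each qualified type name to the names it depends on is a hypothetical structure, and the type names are invented for the example.

```python
def filter_by_namespace(types, namespace):
    """Keep the types declared in `namespace` plus everything they depend on.

    types: dict mapping a qualified type name to the list of qualified
    names it depends on (a hypothetical structure, not the tool's schema).
    """
    prefix = namespace + "::"
    pending = [name for name in types if name.startswith(prefix)]
    selected = set()
    while pending:
        name = pending.pop()
        if name in selected:
            continue
        selected.add(name)
        # Follow the dependencies transitively, whatever namespace they are in.
        pending.extend(types.get(name, []))
    return sorted(selected)


types = {
    "app::Particle": ["app::Vec3", "std::string"],
    "app::Vec3": ["float"],
    "std::string": [],
    "float": [],
    "other::Unused": ["int"],
    "int": [],
}
selected = filter_by_namespace(types, "app")
print(selected)
```

Types outside the target namespace are reported only when something inside it needs them, which is what keeps std types in the output while dropping unrelated ones.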
