LLVM
1 Note from LLVM
When casting, don't use C++ static_cast or dynamic_cast. Instead, use the following:
- isa<>: a type check that returns bool.
- cast<>: a checked cast from base class to derived class. It triggers an assertion failure if the instance is not of the target class.
- dyn_cast<>: a checking cast that returns nullptr on failure.
The same functions are also available in the clang namespace.
The template argument should be the class type, NOT a pointer type: write dyn_cast<Instruction>(V), not dyn_cast<Instruction *>(V).
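To make the three operators concrete, here is a minimal sketch of the mechanism behind them, assuming a hypothetical Shape hierarchy and simplified my_isa, my_cast and my_dyn_cast helpers. None of these names come from LLVM; the real templates in llvm/Support/Casting.h also handle references, const and more, but they rest on the same classof convention.

```cpp
#include <cassert>

// Hypothetical class hierarchy, for illustration only (not LLVM classes).
struct Shape {
  enum Kind { K_Circle, K_Square };
  explicit Shape(Kind K) : TheKind(K) {}
  Kind getKind() const { return TheKind; }

private:
  const Kind TheKind;
};

struct Circle : Shape {
  Circle() : Shape(K_Circle) {}
  // Each derived class says how to recognise itself.
  static bool classof(const Shape *S) { return S->getKind() == K_Circle; }
};

struct Square : Shape {
  Square() : Shape(K_Square) {}
  static bool classof(const Shape *S) { return S->getKind() == K_Square; }
};

// Simplified stand-in for isa<>: a type check that returns bool.
template <typename To, typename From> bool my_isa(From *V) {
  return To::classof(V);
}

// Simplified stand-in for cast<>: asserts that the type check succeeds.
template <typename To, typename From> To *my_cast(From *V) {
  assert(my_isa<To>(V) && "cast to an incompatible type");
  return static_cast<To *>(V);
}

// Simplified stand-in for dyn_cast<>: returns nullptr when the check fails.
template <typename To, typename From> To *my_dyn_cast(From *V) {
  return my_isa<To>(V) ? static_cast<To *>(V) : nullptr;
}
```

Given Circle C; Shape *S = &C;, my_isa<Circle>(S) is true and my_dyn_cast<Square>(S) yields nullptr, while my_cast<Square>(S) would trip the assertion in a build with assertions enabled. Note that the template argument is the class (Circle), not Circle *.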
2 Build and install LLVM system
svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
cd llvm/tools
svn co http://llvm.org/svn/llvm-project/cfe/trunk clang
cd clang/tools
svn co http://llvm.org/svn/llvm-project/clang-tools-extra/trunk extra
cd ../../../projects
svn co http://llvm.org/svn/llvm-project/compiler-rt/trunk compiler-rt
cd ..
mkdir build
cd build
Using make
cmake -G "Unix Makefiles" ..
make
make install
Or using Ninja
cmake -G Ninja ..
ninja
ninja install
Note that a build without a build type set is slow and consumes a lot of memory, and the resulting executables are huge (in my case the clang executable was 5G), because debug information is built into them. Use a release build instead:
cmake -G Ninja .. -DLLVM_BUILD_TESTS=ON -DCMAKE_BUILD_TYPE=Release
Variables:
- -DLLVM_ENABLE_RTTI=ON: build LLVM with RTTI support
3 Try the LLVM toolchain
- clang --help
- clang file.c -fsyntax-only: check for correctness
- clang file.c -S -emit-llvm -o -: print out unoptimized LLVM code
- clang file.c -S -emit-llvm -o - -O3: print out optimized LLVM code
- clang file.c -S -O3 -o -: output native machine code
from http://llvm.org/docs/GettingStarted.html
Use the following code to test
#include <stdio.h>

int main() {
  printf("hello world\n");
  return 0;
}
clang can be used just like gcc:
# one way to run
clang hello.c -o hello
./hello
Compile into llvm bitcode:
clang -O3 -emit-llvm hello.c -c -o hello.bc
Bitcode can be inspected by converting it back to IR:
# look at the LLVM assembly (IR)
llvm-dis < hello.bc | less
Bitcode can be run directly:
lli hello.bc
Alternatively, you can compile the LLVM bitcode into an assembly file, then assemble and run it:
llc hello.bc -o hello.s
gcc hello.s -o hello.native
./hello.native
4 Use the framework
4.1 Project setup
The project directory should look like this:
pass-project/CMakeLists.txt
pass-project/mypass/CMakeLists.txt
pass-project/mypass/MyPass.cpp
The top-level CMakeLists.txt configures the environment, including finding the LLVM package:
find_package(LLVM REQUIRED CONFIG)
add_definitions(${LLVM_DEFINITIONS})
include_directories(${LLVM_INCLUDE_DIRS})
add_definitions(-std=c++11)  # patch: use C++11
# patch: I didn't compile LLVM with RTTI,
# so I need to disable RTTI when compiling the pass,
# or I will get an error when loading my pass with opt -load
SET(CMAKE_CXX_FLAGS "-Wall -fno-rtti")
# must match the sub-directory name in the layout above
add_subdirectory(mypass)
The sub-directory CMakeLists.txt lists the pass source files:
add_library(HebiPass MODULE MyPass.cpp)
The pass source file should look like this:
#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

namespace {
struct Hello : public FunctionPass {
  static char ID;
  Hello() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    errs() << "Hello: ";
    errs().write_escaped(F.getName()) << "\n";
    return false;
  }
};
}

char Hello::ID = 0;
static RegisterPass<Hello> X("hello", "Hello World Pass", false, false);
Compile it into a shared library. To run it, first load the library with -load /path/to/so/file. The -hello option then tells opt to run this pass; the name is given in the source file through the RegisterPass class.
cmake .
make
# output: mypass/libHebiPass.so
opt -load ./mypass/libHebiPass.so -hello < hello.bc
4.2 Passes
4.2.1 Various passes
Each of these functions returns true if it modified the code and false otherwise.
class ModulePass {
  virtual bool runOnModule(Module &M) = 0;
};
class FunctionPass {
  virtual bool runOnFunction(Function &F) = 0;
};
class BasicBlockPass {
  virtual bool runOnBasicBlock(BasicBlock &BB) = 0;
};
4.2.2 Register a pass
The four parameters:
- The command-line option that invokes the pass (-hello)
- The help message
- CFG-only flag: true if the pass walks the CFG without modifying it
- Analysis flag: true if the pass is an analysis pass, for example the dominator tree pass
static RegisterPass<Hello> X("hello", "Hello World Pass",
                             false /* Only looks at CFG */,
                             false /* Analysis Pass */);
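One way to picture why a single static object is enough to make -hello available: the object's constructor runs when the library is loaded and adds an entry to a global table. The sketch below shows that self-registration idiom in isolation. passRegistry, RegisterMyPass, runPass and the string-based PassFn are hypothetical names for illustration, not LLVM APIs, and the real RegisterPass records more than a name and a callback.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a pass: returns true if it changed its input.
using PassFn = std::function<bool(std::string &)>;

// A function-local static avoids static-initialisation-order problems.
inline std::map<std::string, PassFn> &passRegistry() {
  static std::map<std::string, PassFn> Registry;
  return Registry;
}

// Constructing one of these at namespace scope registers a pass by name.
struct RegisterMyPass {
  RegisterMyPass(const std::string &Name, PassFn Fn) {
    bool Inserted = passRegistry().emplace(Name, std::move(Fn)).second;
    assert(Inserted && "pass name registered twice");
    (void)Inserted; // silence unused-variable warnings when asserts are off
  }
};

// A read-only pass: it leaves the text alone and reports no change.
static RegisterMyPass X("hello", [](std::string &) -> bool { return false; });

// A transforming pass: it appends a marker and reports a change.
static RegisterMyPass Y("mark", [](std::string &S) -> bool {
  S += "!";
  return true;
});

// Look up a pass by its registered name and run it on Input.
inline bool runPass(const std::string &Name, std::string &Input) {
  auto It = passRegistry().find(Name);
  assert(It != passRegistry().end() && "unknown pass name");
  return It->second(Input);
}
```

Calling runPass("hello", Text) leaves Text untouched and returns false, while runPass("mark", Text) appends "!" and returns true, which mirrors the return-value convention of the runOn* methods.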
4.2.3 Pass Interaction
MyPass::getAnalysisUsage declares the passes that this pass requires. It also states what information this pass modifies (or preserves).
void MyPass::getAnalysisUsage(AnalysisUsage &AU) const {
  AU.setPreservesAll();
  // AU.setPreservesCFG();
  AU.addRequired<LoopInfoWrapperPass>();
}
Inside the pass, use getAnalysis to obtain the required pass. In this example, getLoopInfo is a method of LoopInfoWrapperPass.
bool MyPass::runOnFunction(Function &F) {
  // getAnalysis must be called from inside the Pass class
  LoopInfo &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
  //...
}
4.3 LLVM template
isa: This is a type check that returns bool.
if (isa<Constant>(V) || isa<Argument>(V) || isa<GlobalValue>(V))
  return true;
cast: This is a checked cast. If the cast is not valid, an assertion fails.
cast<Instruction>(V)->getParent()
dyn_cast: This is a checking cast. If the cast is not valid, a null pointer is returned.
if (AllocationInst *AI = dyn_cast<AllocationInst>(Val)) {}
4.4 Values
4.4.1 Function
Iterating basic blocks:
// func is a pointer to a Function instance
for (Function::iterator it = func->begin(), end = func->end(); it != end; ++it) {
  BasicBlock *bb = &*it;
}
Iterating instructions directly:
// requires #include "llvm/IR/InstIterator.h"
// f is a pointer to a Function instance
for (inst_iterator it = inst_begin(f), end = inst_end(f); it != end; ++it) {
  Instruction *inst = &*it;
}
4.4.2 BasicBlock
// blk is a pointer to a BasicBlock instance
for (BasicBlock::iterator it = blk->begin(), end = blk->end(); it != end; ++it) {
  Instruction *inst = &*it;
}
4.5 User
Get users of a value:
Function *F; // points to the function of interest
for (User *U : F->users()) {
  if (Instruction *Inst = dyn_cast<Instruction>(U)) {
    errs() << "F is used in instruction:\n";
    errs() << *Inst << "\n";
  }
}
Get used values of an instruction:
Instruction *pi; // points to the instruction of interest
for (Use &U : pi->operands()) {
  Value *v = U.get();
}
4.6 CFG
The CFG consists of basic blocks. To iterate over the predecessors of a block:
#include "llvm/IR/CFG.h" // was llvm/Support/CFG.h before LLVM 3.5
BasicBlock *BB = ...;
for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
  BasicBlock *Pred = *PI;
}
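Conceptually, the predecessors of a block are the blocks that list it as a successor. The sketch below computes that relation for a tiny hypothetical CFG stored as index lists. Successors and computePredecessors are illustrative names, not LLVM APIs, and LLVM itself derives predecessors from the block's use list rather than from a table like this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG: entry i lists the indices of block i's successors.
using Successors = std::vector<std::vector<std::size_t>>;

// For every block, collect the blocks that branch to it.
inline std::vector<std::vector<std::size_t>>
computePredecessors(const Successors &Succs) {
  std::vector<std::vector<std::size_t>> Preds(Succs.size());
  for (std::size_t From = 0; From < Succs.size(); ++From) {
    for (std::size_t To : Succs[From]) {
      assert(To < Succs.size() && "successor index out of range");
      Preds[To].push_back(From);
    }
  }
  return Preds;
}
```

For a diamond-shaped CFG such as {{1, 2}, {3}, {3}, {}}, block 0 has no predecessors, blocks 1 and 2 each have block 0, and block 3 has blocks 1 and 2.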
5 Reference
Some useful manuals:
- Manual: http://llvm.org/docs/ProgrammersManual.html
- IR Reference: http://llvm.org/docs/LangRef.html
- Pass: http://llvm.org/docs/WritingAnLLVMPass.html
- Source Level Debugging: http://llvm.org/docs/SourceLevelDebugging.html