[19jun2024] injecting symbols into llvm jit

1. Introduction

I have a hobby/learning project, xo-jit, that uses the content of the LLVM Kaleidoscope tutorial as its jumping-off point. [1] In contrast to Kaleidoscope, xo-jit builds a shared library; this is intended to support jit-compiled code invoked from a python REPL.

2. TL;DR

When binding a symbol for in-process jit, consider supplying an absolute address instead of a symbol name.

3. Context

  • trying to make library functions callable from llvm-compiled functions.
  • library xo-reflect provides c++ reflection
  • library xo-expression provides abstract syntax trees (for a typed lambda calculus-ish language)
  • library xo-jit compiles expressions using LLVM + links into running process via JIT
  • libraries xo-pyreflect, xo-pyexpression, xo-pyjit provide pybind11 wrappers.

Using these libraries we can, from python:

  • construct an AST for a program
  • compile to machine code
  • run resulting machine code from the same python session, thanks to Jit

4. Setup

Libraries are built with CMAKE_INSTALL_PREFIX set to ~/local2. Then run python like:

PYTHONPATH=~/local2/lib:$PYTHONPATH python

(my PYTHONPATH is rather long; it contains a plethora of nix directories)

Then from python:

from xo_pyreflect import *
from xo_pyexpression import *
from xo_pyjit import *

# building a program like
#   lambda (x) : x * x
# with
#   x :: double

f64_t = TypeDescr.lookup_by_name('double')
x = make_var('x', f64_t)
#f = make_sin_pm()     # works
f = make_mul_f64_pm()  # fails to resolve
c = make_apply(f, [x, x])
lm = make_lambda('sq', [x], c)

mp = MachPipeline.make()

code = mp.codegen(lm)
mp.machgen_current_model()

5. Problem + Diagnosis

All this appears to work. However, the llvm jit is lazy: at this point it is ready to produce machine code, but it hasn't actually done so yet.
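A toy model of what "lazy" means here, in plain python. It is not LLVM and not xo-jit: ToyLazyJit, add_module and add_known_symbol are hypothetical names invented for illustration. The point is only that adding code records its external references, while checking those references is deferred until the first lookup.

```python
class ToyLazyJit:
    """Toy model only (hypothetical names): not LLVM, not xo-jit."""

    def __init__(self):
        self._pending = {}    # function name -> external symbols it refers to
        self._known = set()   # external symbols this toy "process" can resolve

    def add_known_symbol(self, name):
        self._known.add(name)

    def add_module(self, fn_name, external_refs):
        # Eager step: just record the function. Nothing is resolved yet,
        # so a missing external symbol goes unnoticed here.
        self._pending[fn_name] = list(external_refs)

    def lookup(self, fn_name):
        # Lazy step: external references are only checked on lookup.
        for ref in self._pending[fn_name]:
            if ref not in self._known:
                raise LookupError(f"Unable to resolve symbol: {ref}")
        return f"<code for {fn_name}>"


jit = ToyLazyJit()
jit.add_known_symbol("sin")
jit.add_module("sq", ["mul_f64"])   # succeeds, although mul_f64 is unknown

try:
    jit.lookup("sq")
    lookup_error = None
except LookupError as ex:
    lookup_error = str(ex)
```

In this model a missing symbol stays invisible through add_module and only surfaces in lookup, which is the stage where a lazy jit would be expected to report it.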

To get hold of the llvm-compiled sq function, so we can invoke it from python, we want to fetch the corresponding symbol from the jit:

sq = mp.lookup_fn('double (*)(double)', 'sq')

This step fails with the error:

Unable to resolve symbol: mul_f64

This happens even though inspecting libxo_jit.so shows the symbol is present:

$ readelf -d ~/proj/local2/lib/libxo_jit.so | grep mul_f64
... D mul_f64 ...

We were relying on a feature adopted from Kaleidoscope's JIT:

// in xo/jit/Jit.hpp

dest_dynamic_lib_.addGenerator
    (cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess
              (data_layout_.getGlobalPrefix())));

Although this seems to work for kaleidoscope, it's somehow not sufficient here.

Hypothesis:

  • perhaps DynamicLibrarySearchGenerator::GetForCurrentProcess() behaves differently when invoked from a shared library (in this case libxo_jit.so)?
  • maybe it can find the symbol for ::sin() because that lives in a library that's in scope starting from libLLVM.so, but can't find ::mul_f64() because that comes from a sibling xo library (even when jamming the symbol into libxo_jit.so itself!)
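One way to probe these hypotheses from the python side is to ask the dynamic loader whether a name is reachable through the process-wide handle. The sketch below assumes Linux, and assumes that GetForCurrentProcess ends up doing the same kind of lookup as dlopen(NULL) followed by dlsym; that is an assumption, not something checked against the LLVM sources here. visible_in_process and extensions_loaded_globally are hypothetical helper names.

```python
import ctypes
import os
import sys


def visible_in_process(name):
    """True if `name` resolves through the process-wide handle.

    ctypes.CDLL(None) corresponds to dlopen(NULL) on Linux (assumed platform).
    """
    process = ctypes.CDLL(None)
    # Attribute access on a CDLL performs the symbol lookup and raises
    # AttributeError when the symbol is not found.
    return hasattr(process, name)


def extensions_loaded_globally():
    """True if python's current dlopen flags include RTLD_GLOBAL."""
    return bool(sys.getdlopenflags() & os.RTLD_GLOBAL)


print("malloc  :", visible_in_process("malloc"))
print("mul_f64 :", visible_in_process("mul_f64"))
print("RTLD_GLOBAL for extension modules:", extensions_loaded_globally())
```

Run this after importing xo_pyjit. If mul_f64 is reported as not visible while malloc is, that would be consistent with the symbol sitting in a library that was loaded without RTLD_GLOBAL, which may be the case for python extension modules and the libraries they pull in. It would not prove that this is what GetForCurrentProcess sees.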

6. Workaround

  • Found this article on stack overflow:
  • One of the answers gives a workaround for LLVM-16. Was able to adapt that solution for LLVM-18:
  • Workaround is to explicitly bind the symbol to an absolute address.
  • This only works via JIT in a running process, since only then do we know the absolute address for a symbol.

In our jit:

class Jit {
public:
    /** intern @p symbol, binding it to address @p dest **/
    template <typename T>
    llvm::Error intern_symbol(const std::string & symbol, T * dest) {
        llvm::orc::SymbolMap symbol_map;
        symbol_map[mangler_(symbol)]
            = llvm::orc::ExecutorSymbolDef(llvm::orc::ExecutorAddr::fromPtr(dest),
                                           llvm::JITSymbolFlags());

        auto materializer = llvm::orc::absoluteSymbols(symbol_map);

        return dest_dynamic_lib_.define(materializer);
    } /*intern_symbol*/
};
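The idea behind the workaround, binding a name directly to an absolute address that is only known inside the running process, can be illustrated in plain python with ctypes. This is an analogy only, not LLVM: BINARY_F64, symbol_table and call_by_name are hypothetical names, and a python callback stands in for the real mul_f64.

```python
import ctypes

# C signature assumed for illustration: double (*)(double, double)
BINARY_F64 = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double, ctypes.c_double)


def _mul_f64(x, y):
    return x * y


# Wrap the python function as a C-callable; keep a reference so it stays alive.
native_mul_f64 = BINARY_F64(_mul_f64)

# The absolute address only exists once the process is running.
mul_f64_addr = ctypes.cast(native_mul_f64, ctypes.c_void_p).value

# Bind the name straight to the address, bypassing any by-name library search.
symbol_table = {"mul_f64": mul_f64_addr}


def call_by_name(name, *args):
    # Rebuild a callable from the raw address stored under `name`.
    fn = BINARY_F64(symbol_table[name])
    return fn(*args)


result = call_by_name("mul_f64", 3.0, 4.0)
```

Here symbol_table plays roughly the role of the SymbolMap passed to absoluteSymbols: the lookup never consults a library by name, so it does not matter how (or whether) the function is exported.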

Footnotes:

1

Using gcc 13.2, LLVM 18.1.5, building on ubuntu (really WSL2-on-windows), with nix supplying dependencies

Author: Roland Conybeare

Created: 2024-09-08 Sun 18:01
