LLVM IR

toycalculator, an MLIR/LLVM compiler experiment.

April 27, 2025 C/C++ development and debugging.

Over the last 8 years, I’ve been intimately involved in building a pair of LLVM based compilers for the COBOL and PL/I languages.  However, a lot of my work was on the runtime side of the story. This was non-trivial work, with lots of complex interactions to figure out, but it also meant that I didn’t get to play with the fun (codegen) part of the compiler.

Over the last month or so, I’ve been incrementally building an MLIR based compiler for a toy language.  I thought this would be a great way to get my hands dirty, and learn as I went.  The required elements included a grammar and parser for the toy language, an MLIR dialect defined with tablegen, a builder to construct MLIR from the parse, lowering passes down to the LLVM dialect, object code emission, and a small runtime library for the PRINT statement.

As it turns out, MLIR tablegen is pretty finicky when you have no clue what you are doing.  Once you get the basic infrastructure in place it makes a lot more sense, and you can look at the generated C++ classes associated with your tablegen to get a good idea of what is happening under the covers.
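Using the generated classes from the parser side then looks roughly like this.  This is a simplified sketch, not the literal builder code: the toy::DeclareOp name and its builder arguments are stand-ins for whatever tablegen actually generates.

// Build a toy.declare for a "DCL x;" statement (see the examples below).
// Op name and builder arguments are illustrative assumptions.
mlir::OpBuilder builder( &context );
builder.setInsertionPointToEnd( programBlock );
builder.create<toy::DeclareOp>( builder.getUnknownLoc(),
                                builder.getStringAttr( "x" ) );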

Here are a couple of examples that illustrate the toy language:

// empty.toy
// This should be allowed by the grammar.



// dcl.toy
DCL x; // the next simplest non-empty program (i.e.: also has a comment)



// foo.toy
DCL x;
x = 3;
// This indenting is to test location generation, and to verify that the resulting columnar position is right.
     PRINT x;



// unary.toy
DCL x;
x = 3;
x = +x;
x = -x;
PRINT x;



// test.toy
DCL x;
DCL y;
x = 5 + 3;
y = x * 2;
PRINT x;

There is also a RETURN statement, not shown explicitly in any of those examples. I added that language element to simplify the LLVM lowering process. Here’s a motivating example:

> ../build/toycalculator empty.toy  --location
"builtin.module"() ({
  "toy.program"() ({
  ^bb0:
  }) : () -> () loc(#loc1)
}) : () -> () loc(#loc)
#loc = loc(unknown) 
#loc1 = loc("empty.toy":2:1)

Notice the weird looking ‘^bb0:’ in the MLIR dump. This is a representation of an empty basic block, and it was a bit of a pain to figure out how to lower properly. What I ended up doing was inserting a RETURN operation into the program if there were no other statements. I wanted to support such a dumb trivial program with no actual statements as a first test of the lowering, to see that things worked end to end before tackling some of the trickier lowering. With a return statement, empty.toy’s MLIR now looks like:

"builtin.module"() ({ 
  "toy.program"() ({
    "toy.return"() : () -> () loc(#loc1)
  }) : () -> () loc(#loc1)
}) : () -> () loc(#loc)
#loc = loc(unknown)
#loc1 = loc("../samples/empty.toy":2:1)
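That injected RETURN can be produced with a small check after building the program body.  A sketch, assuming a toy::ReturnOp builder that takes only a location:

// If the parse produced no statements, append an explicit toy.return so
// the block has a terminator to lower.  (Sketch; the real builder code may differ.)
if ( programBlock->empty() )
{
    mlir::OpBuilder builder( &context );
    builder.setInsertionPointToEnd( programBlock );
    builder.create<toy::ReturnOp>( builder.getUnknownLoc() );
}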

The idea behind the lowering is that each MLIR operation can be matched by a rewrite pattern, and the program iterated over, gradually replacing all of the MLIR operations with LLVM dialect equivalents.

For example, an element like:

"toy.return"() : () -> ()

can be replaced by:

  %0 = "llvm.mlir.constant"() <{value = 0 : i32}> : () -> i32
  "llvm.return"(%0) : (i32) -> ()

Once that replacement is made, we can delete the toy.return element:

    // Lower toy.return to nothing (erase).
    class ReturnOpLowering : public ConversionPattern
    {
       public:
        ReturnOpLowering( MLIRContext* context )
            : ConversionPattern( toy::ReturnOp::getOperationName(), 1, context )
        {
        }           

        LogicalResult matchAndRewrite(
            Operation* op, ArrayRef<Value> operands,
            ConversionPatternRewriter& rewriter ) const override
        {   
            LLVM_DEBUG( llvm::dbgs()
                        << "Lowering toy.return: " << *op << '\n' );
            // ...
            rewriter.eraseOp( op );
            return success();
        }
    };
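To actually run patterns like this one, the lowering pass collects them and hands them to the dialect conversion driver. Roughly (the pass name and the full set of patterns are placeholders; only ReturnOpLowering comes from the snippet above):

    // Sketch of a lowering pass driving the conversion patterns.
    void ToyToLLVMLoweringPass::runOnOperation()
    {
        LLVMConversionTarget target( getContext() );
        target.addLegalOp<ModuleOp>();

        RewritePatternSet patterns( &getContext() );
        patterns.add<ReturnOpLowering>( &getContext() );
        // ... one pattern per remaining toy.* operation ...

        if ( failed( applyFullConversion( getOperation(), target,
                                          std::move( patterns ) ) ) )
            signalPassFailure();
    }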

For the empty.toy sample, the toy.program element also needs to be deleted. It gets replaced by an LLVM basic block, moving all the MLIR basic block elements into it. The last step is the removal of the outermost MLIR module, but there's existing support for that. When all is said and done, we are left with the following LLVM IR:

define i32 @main() {
  ret i32 0
}
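The toy.program rewrite (wrapping the body in the llvm.func @main that shows up in the dumps below) might look roughly like this.  Again a sketch, assuming toy.program carries a single region and that main takes no arguments:

// Replace toy.program with an llvm.func @main returning i32, moving the body across.
LogicalResult matchAndRewrite(
    Operation* op, ArrayRef<Value> operands,
    ConversionPatternRewriter& rewriter ) const override
{
    auto funcType = LLVM::LLVMFunctionType::get( rewriter.getI32Type(), {} );
    auto mainFunc = rewriter.create<LLVM::LLVMFuncOp>(
        op->getLoc(), "main", funcType );

    rewriter.inlineRegionBefore( op->getRegion( 0 ), mainFunc.getBody(),
                                 mainFunc.getBody().end() );
    rewriter.eraseOp( op );
    return success();
}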

Here's the MLIR for foo.toy, which is slightly more interesting:

"builtin.module"() ({
  "toy.program"() ({
    %0 = "memref.alloca"() <{operandSegmentSizes = array}> : () -> memref
    "toy.declare"() <{name = "x"}> : () -> ()
    %1 = "arith.constant"() <{value = 3 : i64}> : () -> i64
    %2 = "toy.unary"(%1) <{op = "+"}> : (i64) -> f64
    "memref.store"(%2, %0) : (f64, memref) -> ()
    "toy.assign"(%2) <{name = "x"}> : (f64) -> ()
    "toy.print"(%0) : (memref) -> ()
    "toy.return"() : () -> ()
  }) : () -> ()
}) : () -> ()

As we go through the lowering replacements, more and more of the MLIR operations get replaced with LLVM equivalents. Here's an example partway through:

"llvm.func"() <{CConv = #llvm.cconv, function_type = !llvm.func, linkage = #llvm.linkage, sym_name = "main", visibility_ = 0 : i64}> ({
  %0 = "llvm.mlir.constant"() <{value = 1 : i64}> : () -> i64
  %1 = "llvm.alloca"(%0) <{alignment = 8 : i64, elem_type = f64}> : (i64) -> !llvm.ptr
  %2 = "memref.alloca"() <{operandSegmentSizes = array}> : () -> memref
  "toy.declare"() <{name = "x"}> : () -> ()
  %3 = "llvm.mlir.constant"() <{value = 3 : i64}> : () -> i64
  %4 = "arith.constant"() <{value = 3 : i64}> : () -> i64
  %5 = "toy.unary"(%4) <{op = "+"}> : (i64) -> f64
  "memref.store"(%5, %2) : (f64, memref) -> ()
  "toy.assign"(%5) <{name = "x"}> : (f64) -> ()
  "toy.print"(%2) : (memref) -> ()
  "toy.return"() : () -> ()
}) : () -> ()

and after a few more:

"llvm.func"() <{CConv = #llvm.cconv, function_type = !llvm.func, linkage = #llvm.linkage, sym_na
me = "main", visibility_ = 0 : i64}> ({
  %0 = "llvm.mlir.constant"() <{value = 1 : i64}> : () -> i64
  %1 = "llvm.alloca"(%0) <{alignment = 8 : i64, elem_type = f64}> : (i64) -> !llvm.ptr
  %2 = "memref.alloca"() <{operandSegmentSizes = array}> : () -> memref
  "toy.declare"() <{name = "x"}> : () -> ()
  %3 = "llvm.mlir.constant"() <{value = 3 : i64}> : () -> i64
  %4 = "arith.constant"() <{value = 3 : i64}> : () -> i64
  %5 = "llvm.sitofp"(%3) : (i64) -> f64
  %6 = "toy.unary"(%4) <{op = "+"}> : (i64) -> f64
  "llvm.store"(%5, %1) <{ordering = 0 : i64}> : (f64, !llvm.ptr) -> ()
  "memref.store"(%6, %2) : (f64, memref) -> ()
  "toy.assign"(%6) <{name = "x"}> : (f64) -> ()
  %7 = "llvm.load"(%1) <{ordering = 0 : i64}> : (!llvm.ptr) -> f64
  "llvm.call"(%7) <{CConv = #llvm.cconv, TailCallKind = #llvm.tailcallkind, callee = @__toy_print, fastmathFlags = #llvm.fastmath, op_bundle_sizes = array, operandSegmentSizes = array}> : (f64) -> ()
  "toy.print"(%2) : (memref) -> ()
  %8 = "llvm.mlir.constant"() <{value = 0 : i32}> : () -> i32
  "llvm.return"(%8) : (i32) -> ()
  "toy.return"() : () -> ()
}) : () -> ()

Eventually, after various LLVM IR blocks get merged (almost magically by one of the passes), we end up with:

declare void @__toy_print(double)

define i32 @main() {
  %1 = alloca double, i64 1, align 8
  store double 3.000000e+00, ptr %1, align 8
  %2 = load double, ptr %1, align 8
  call void @__toy_print(double %2)
  ret i32 0
}
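The __toy_print function in that IR is supplied by the toy_runtime library linked in below. It needs to be nothing more than a printf wrapper, something like this sketch (the real runtime may differ):

#include <cstdio>

// Runtime print helper called by the generated code.  The "%f\n" format
// matches the program output shown below.
extern "C" void __toy_print( double value )
{
    std::printf( "%f\n", value );
}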

Enabling an assembly printer pass, we get an object file:

fedora:/home/pjoot/toycalculator/samples> objdump -dr foo.o

foo.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <main>:
   0:   50                      push   %rax
   1:   48 b8 00 00 00 00 00    movabs $0x4008000000000000,%rax
   8:   00 08 40
   b:   48 89 04 24             mov    %rax,(%rsp)
   f:   f2 0f 10 05 00 00 00    movsd  0x0(%rip),%xmm0        # 17 <main+0x17>
  16:   00
                        13: R_X86_64_PC32       .LCPI0_0-0x4
  17:   e8 00 00 00 00          call   1c <main+0x1c>
                        18: R_X86_64_PLT32      __toy_print-0x4
  1c:   31 c0                   xor    %eax,%eax
  1e:   59                      pop    %rcx
  1f:   c3                      ret
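For reference, producing that object file from the lowered LLVM module is the standard TargetMachine plus addPassesToEmitFile sequence. A generic sketch, assuming a recent LLVM and an already-configured TargetMachine (not necessarily how this project wires it up):

#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Target/TargetMachine.h"

// Run the codegen (assembly printer) passes to write an object file.
void emitObjectFile( llvm::TargetMachine &tm, llvm::Module &m,
                     llvm::StringRef path )
{
    std::error_code ec;
    llvm::raw_fd_ostream out( path, ec, llvm::sys::fs::OF_None );
    if ( ec )
        llvm::report_fatal_error( "could not open output file" );

    llvm::legacy::PassManager pm;
    if ( tm.addPassesToEmitFile( pm, out, /*DwoOut=*/nullptr,
                                 llvm::CodeGenFileType::ObjectFile ) )
        llvm::report_fatal_error( "target cannot emit object files" );

    pm.run( m );
    out.flush();
}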

Here's an end-to-end example of compiling, linking, and running this little module:

fedora:/home/pjoot/toycalculator/samples> ../build/toycalculator foo.toy 
Generated object file: foo.o
fedora:/home/pjoot/toycalculator/samples> clang -o foo foo.o -L ../build -l toy_runtime -Wl,-rpath,`pwd`/../build
fedora:/home/pjoot/toycalculator/samples> ./foo
3.000000

A month of work and 1800 lines of code, and now I can print a single constant number!

LLVM IR Null pointer constants and function pointers. A wild goose chase after a bad assumption.

March 30, 2017 clang/llvm

With ELLCC, you can easily check out the LLVM IR for code like:

typedef void ( *f )( void );
void foo( void );

f bar() {
    return (f)foo;
}

The LLVM IR for that code is:

define nonnull void ()* @bar() local_unnamed_addr {
  ret void ()* @foo
}

declare void @foo()

I was trying to use @foo in a “struct” object, and was getting an error attempting this:

llvm/lib/IR/Constants.cpp:879:llvm::ConstantAggregate::ConstantAggregate(
 llvm::CompositeType*, llvm::Value::ValueTy, llvm::ArrayRef<llvm::Constant*>):
 Assertion `V[I]->getType() == T->getTypeAtIndex(I) &&
 "Initializer for composite element doesn't match!"' failed.

After adding:

fooFunc->dump();

which dumps the whole function body of foo(), I thought that was where the error was coming from, and that I needed some other method to obtain just “@foo”, a global reference to the function, and not the function body itself.

The actual story is much simpler. Here's the LLVM code to generate the IR for a foo() with this interface:

//------------
// void foo(){ }
//
auto vt = m_builder.getVoidTy();
auto voidFuncVoidType = FunctionType::get( vt, false /* varargs */ );

Function *fooFunc = Function::Create(
    voidFuncVoidType, Function::InternalLinkage, "foo",
    m_module );
BasicBlock *fooBB =
    BasicBlock::Create( m_context, "", fooFunc );
m_builder.SetInsertPoint( fooBB );
m_builder.CreateRetVoid();

My clue that the error was something else was that I was able to build a function that returns a foo function pointer:

//------------
// void(*)() bar() { return foo ; }
//
auto fpRetFuncType = FunctionType::get( voidFuncVoidType->getPointerTo(), false /* varargs */ );

Function *barFunc = Function::Create(
    fpRetFuncType, Function::ExternalLinkage, "bar",
    m_module );
BasicBlock *barBB =
    BasicBlock::Create( m_context, "", barFunc );
m_builder.SetInsertPoint( barBB );
m_builder.CreateRet( fooFunc );

The module at this point looks like:

define internal void @foo() {
   ret void
}

define void ()* @bar() {
   ret void ()* @foo
}

So why can I use fooFunc in a return statement, but apparently not in a structure object? Here's the code that created that structure type:

//------------
//
// struct { int, void (*)(), char * }
auto i8t = m_builder.getInt8Ty();
auto i32t = m_builder.getInt32Ty();
std::vector<Type *> consStructMembers{
    i32t, voidFuncVoidType->getPointerTo(), i8t->getPointerTo()};
auto consStructType =
    StructType::create( m_context, consStructMembers, "" );

and my attempt to populate an object of this type:

//
// %struct { int, void (*)(), char * } = { 65535, foo, null };
//
auto consPriority = ConstantInt::get( i32t, 65535 );
auto consDataZero = ConstantInt::get( i8t->getPointerTo(), 0 );

std::vector<Constant *> v{consPriority, fooFunc, consDataZero};
Constant *g = ConstantStruct::get( consStructType, v );

The actual error was in the third struct member initialization, and had nothing to do with the function pointer value. In retrospect, this makes sense since llvm::Function is derived from llvm::Constant, so there shouldn’t logically be a mismatch there.

What actually fixed the error was simply:

auto consDataZero = ConstantPointerNull::get( i8t->getPointerTo() );

It appears that the numeric zero value isn’t the same thing as an LLVM ‘null’. With that corrected, my variable declaration is:

%"type 0x10ea0c0" { i32 65535, void ()* @foo, i8* null }

… so I should now be able to proceed with the actual task at hand.