LLVM IR

Line stepping through MLIR with a debugger!

February 10, 2026 C/C++ development and debugging.

gdb session

I’ve added an alternate input source for the silly compiler.  As well as the .silly files that it previously accepted, it now also accepts .mlir (silly-dialect) files as input.

This means that if there’s an experimental language feature that requires new style MLIR, but I don’t want to figure out how to push that all the way through grammar -> parser -> builder -> lowering all at once, I might be able to at least understand the required MLIR patterns by manually modifying existing MLIR (generated with ‘silly --emit-mlir’).

For example, I don’t have BREAK support for FOR loops. I can do something simple:

INT64 v;

FOR (INT64 myLoopVar : (1, 5))
{
    PRINT myLoopVar;
    v = myLoopVar + 1;
};

PRINT "after loop: ", v;

The MLIR for this (with location info stripped out), looks like:

fedoravm:/home/peeter/toycalculator/tests/endtoend/for> silly-opt --pretty -s out/for_simplest.mlir 
module {
  func.func @main() -> i32 {
    %c0_i32 = arith.constant 0 : i32
    %c5_i64 = arith.constant 5 : i64
    %c1_i64 = arith.constant 1 : i64
    "silly.scope"() ({
      %0 = "silly.declare"() <{sym_name = "v"}> : () -> !silly.var
      scf.for %arg0 = %c1_i64 to %c5_i64 step %c1_i64  : i64 {
        "silly.print"(%c0_i32, %arg0) : (i32, i64) -> ()
        %3 = "silly.add"(%arg0, %c1_i64) : (i64, i64) -> i64
        silly.assign %0 :  = %3 : i64
      }
      %1 = "silly.string_literal"() <{value = "after loop: "}> : () -> !llvm.ptr
      %2 = silly.load %0 :  : i64
      "silly.print"(%c0_i32, %1, %2) : (i32, !llvm.ptr, i64) -> ()
      "silly.return"(%c0_i32) : (i32) -> ()
    }) : () -> ()
    "silly.yield"() : () -> ()
  }
}

If I want to add a BREAK into the mix (which the grammar, parser, and builder don’t support yet), something like:

INT64 v; 
FOR (INT64 i : (1, 5)) {
    PRINT i; 
    v = i + 1; 
    IF (i == 3) { BREAK; }; 
};
PRINT "after loop: ", v; 

Then it can be done by replacing the scf.for with scf.while, and putting in additional termination condition logic. Example:

module {
  func.func @main() -> i32 {
    %c0_i32 = arith.constant 0 : i32
    %c1_i64 = arith.constant 1 : i64
    %c3_i64 = arith.constant 3 : i64
    %c5_i64 = arith.constant 5 : i64
    %true = arith.constant true
    %false = arith.constant false

    "silly.scope"() ({
      %0 = "silly.declare"() <{sym_name = "v"}> : () -> !silly.var

      scf.while (%i = %c1_i64, %broke = %false) : (i64, i1) -> (i64, i1) {
        %not_broke = arith.xori %broke, %true : i1
        %in_range = arith.cmpi slt, %i, %c5_i64 : i64
        %continue = arith.andi %in_range, %not_broke : i1
        scf.condition(%continue) %i, %broke : i64, i1
      } do {
      ^bb0(%loop_var: i64, %break_flag: i1):
        "silly.print"(%c0_i32, %loop_var) : (i32, i64) -> ()
        %2 = "silly.add"(%loop_var, %c1_i64) : (i64, i64) -> i64
        silly.assign %0 :  = %2 : i64

        %is_three = arith.cmpi eq, %loop_var, %c3_i64 : i64
        %should_break = arith.ori %break_flag, %is_three : i1

        %next = arith.addi %loop_var, %c1_i64 : i64
        scf.yield %next, %should_break : i64, i1
      }

      %lit = "silly.string_literal"() <{value = "after loop: "}> : () -> !llvm.ptr
      %p = silly.load %0 :  : i64
      "silly.print"(%c0_i32, %lit, %p) : (i32, !llvm.ptr, i64) -> ()

      "silly.return"(%c0_i32) : (i32) -> ()
    }) : () -> ()
    "silly.yield"() : () -> ()
  }
}
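To sanity check what that scf.while computes, here is the same control flow as a plain C++ sketch. The names runForWithBreak and LoopResult are made up for illustration, and the upper bound is treated as exclusive to match the slt compare in the MLIR above.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper type, for illustration only.
struct LoopResult
{
    std::vector<int64_t> printed; // values that PRINT would have emitted
    int64_t v = 0;                // final value of the variable v
};

// Mirrors the scf.while above: the "before" region computes
// (i < hi) && !broke, and the "do" region runs the loop body, updates
// the break flag, and yields the incremented induction variable.
LoopResult runForWithBreak( int64_t lo, int64_t hi, int64_t breakAt )
{
    LoopResult r;
    int64_t i = lo;
    bool broke = false;

    while ( ( i < hi ) && !broke ) // scf.condition(%continue)
    {
        r.printed.push_back( i );          // silly.print
        r.v = i + 1;                       // silly.assign v = i + 1
        broke = broke || ( i == breakAt ); // arith.ori %break_flag, %is_three
        i = i + 1;                         // %next
    }

    return r;
}
```

Calling runForWithBreak( 1, 5, 3 ) records 1, 2, 3 and leaves v at 4: the flag is only consulted at the top of the next iteration, so the iteration that sets it still runs to completion.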

Now, here’s where things get cool.  I noticed something curious when I looked at the .mlir dump from the MLIR parser (which I dumped to verify I was getting the expected round trip output before lowering). The MLIR parser, given only MLIR source, and no other location tagging, goes off and tags everything with location info for the MLIR source itself.  Example:

#loc15 = loc("forbreak.mlsilly":27:12)
#loc16 = loc("forbreak.mlsilly":27:28)
module {
  func.func @main() -> i32 {
    %c0_i32 = arith.constant 0 : i32 loc(#loc2)
    %c1_i64 = arith.constant 1 : i64 loc(#loc3)
    %c3_i64 = arith.constant 3 : i64 loc(#loc4)
    %c5_i64 = arith.constant 5 : i64 loc(#loc5)
    %true = arith.constant true loc(#loc6)
    %false = arith.constant false loc(#loc7)
    "silly.scope"() ({
      %0 = "silly.declare"() <{sym_name = "v"}> : () -> !silly.var loc(#loc9)
      %1:2 = scf.while (%arg0 = %c1_i64, %arg1 = %false) : (i64, i1) -> (i64, i1) {
        %4 = arith.xori %arg1, %true : i1 loc(#loc11)
        %5 = arith.cmpi slt, %arg0, %c5_i64 : i64 loc(#loc12)
        %6 = arith.andi %5, %4 : i1 loc(#loc13)
        scf.condition(%6) %arg0, %arg1 : i64, i1 loc(#loc14)
      } do {
      ^bb0(%arg0: i64 loc("forbreak.mlsilly":27:12), %arg1: i1 loc("forbreak.mlsilly":27:28)):
        "silly.print"(%c0_i32, %arg0) : (i32, i64) -> () loc(#loc17)
        %4 = "silly.add"(%arg0, %c1_i64) : (i64, i64) -> i64 loc(#loc18)
        silly.assign %0 :  = %4 : i64 loc(#loc19)
        %5 = arith.cmpi eq, %arg0, %c3_i64 : i64 loc(#loc20)
        %6 = arith.ori %arg1, %5 : i1 loc(#loc21)
        %7 = arith.addi %arg0, %c1_i64 : i64 loc(#loc22)
        scf.yield %7, %6 : i64, i1 loc(#loc23)
      } loc(#loc10)
      %2 = "silly.string_literal"() <{value = "after loop: "}> : () -> !llvm.ptr loc(#loc24)
      %3 = silly.load %0 :  : i64 loc(#loc25)
      "silly.print"(%c0_i32, %2, %3) : (i32, !llvm.ptr, i64) -> () loc(#loc26)
      "silly.return"(%c0_i32) : (i32) -> () loc(#loc27)
    }) : () -> () loc(#loc8)
    "silly.yield"() : () -> () loc(#loc28)
  } loc(#loc1)
} loc(#loc)
#loc = loc("forbreak.mlsilly":9:1)
#loc1 = loc("forbreak.mlsilly":10:3)
#loc2 = loc("forbreak.mlsilly":11:15)
#loc3 = loc("forbreak.mlsilly":12:15)
#loc4 = loc("forbreak.mlsilly":13:15)
#loc5 = loc("forbreak.mlsilly":14:15)
#loc6 = loc("forbreak.mlsilly":15:13)
#loc7 = loc("forbreak.mlsilly":16:14)
...
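Each of those entries is just a file name, a line, and a column. As a rough illustration of how little information that is, here is a standalone C++ sketch that pulls the three fields out of the printed text. The FileLineCol struct and parseFileLineCol function are hypothetical names, and the compiler itself presumably works from the in-memory location objects rather than from text like this.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical record for one file:line:col location (not an MLIR type).
struct FileLineCol
{
    std::string file;
    unsigned line = 0;
    unsigned col = 0;
};

// Parse text of the exact form loc("file":line:col).
// Anything else, including alias references like loc(#loc2), yields nullopt.
std::optional<FileLineCol> parseFileLineCol( const std::string& text )
{
    const std::string prefix = "loc(\"";
    if ( text.compare( 0, prefix.size(), prefix ) != 0 )
        return std::nullopt;

    const std::size_t quote = text.find( '"', prefix.size() );
    if ( quote == std::string::npos )
        return std::nullopt;

    FileLineCol loc;
    loc.file = text.substr( prefix.size(), quote - prefix.size() );
    std::size_t pos = quote + 1;

    // Consume ":<digits>" starting at pos, storing the number in out.
    auto readNumber = [&]( unsigned& out ) {
        if ( pos >= text.size() || text[pos] != ':' )
            return false;
        ++pos;
        const std::size_t start = pos;
        unsigned value = 0;
        while ( pos < text.size() && std::isdigit( static_cast<unsigned char>( text[pos] ) ) )
        {
            value = value * 10 + static_cast<unsigned>( text[pos] - '0' );
            ++pos;
        }
        out = value;
        return pos != start;
    };

    if ( !readNumber( loc.line ) || !readNumber( loc.col ) )
        return std::nullopt;

    // Require the closing parenthesis and nothing after it.
    if ( pos + 1 != text.size() || text[pos] != ')' )
        return std::nullopt;

    return loc;
}
```

The line field is the part that matters for debugging: it is the number a debugger reports when it stops, as in forbreak.mlsilly:27.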

My compiler can then turn that location info into DWARF DI, just as it does for a regular .silly source file, so I can actually line step through the MLIR itself with any debugger! Here’s an example session:

Breakpoint 1, main () at forbreak.mlsilly:25
25              scf.condition(%continue) %i, %broke : i64, i1
(gdb) l
20            
21            scf.while (%i = %c1_i64, %broke = %false) : (i64, i1) -> (i64, i1) {
22              %not_broke = arith.xori %broke, %true : i1
23              %in_range = arith.cmpi slt, %i, %c5_i64 : i64
24              %continue = arith.andi %in_range, %not_broke : i1
25              scf.condition(%continue) %i, %broke : i64, i1
26            } do {
27            ^bb0(%loop_var: i64, %break_flag: i1):
28              "silly.print"(%c0_i32, %loop_var) : (i32, i64) -> ()
29              %2 = "silly.add"(%loop_var, %c1_i64) : (i64, i64) -> i64
(gdb) l
30              silly.assign %0 :  = %2 : i64
31              
32              %is_three = arith.cmpi eq, %loop_var, %c3_i64 : i64
33              %should_break = arith.ori %break_flag, %is_three : i1
34              
35              %next = arith.addi %loop_var, %c1_i64 : i64
36              scf.yield %next, %should_break : i64, i1
37            }
38
39            %lit = "silly.string_literal"() <{value = "after loop: "}> : () -> !llvm.ptr
(gdb) b 32
Breakpoint 2 at 0x40076c: file forbreak.mlsilly, line 32.
(gdb) c
Continuing.
1

Breakpoint 2, main () at forbreak.mlsilly:32
32              %is_three = arith.cmpi eq, %loop_var, %c3_i64 : i64
(gdb) disassemble
Dump of assembler code for function main:
   0x000000000040072c <+0>:     sub     sp, sp, #0x60
   0x0000000000400730 <+4>:     stp     x30, x21, [sp, #64]
   0x0000000000400734 <+8>:     stp     x20, x19, [sp, #80]
   0x0000000000400738 <+12>:    mov     w19, wzr
   0x000000000040073c <+16>:    mov     w20, #0x1                       // #1
   0x0000000000400740 <+20>:    mov     w21, #0x1                       // #1
   0x0000000000400744 <+24>:    str     xzr, [sp, #8]
   0x0000000000400748 <+28>:    cmp     x21, #0x4
   0x000000000040074c <+32>:    b.gt    0x400784 
   0x0000000000400750 <+36>:    tbnz    w19, #0, 0x400784 
   0x0000000000400754 <+40>:    add     x1, sp, #0x10
   0x0000000000400758 <+44>:    mov     w0, #0x1                        // #1
   0x000000000040075c <+48>:    stp     x21, xzr, [sp, #24]
   0x0000000000400760 <+52>:    str     x20, [sp, #16]
   0x0000000000400764 <+56>:    bl      0x4005b0 <__silly_print@plt>
   0x0000000000400768 <+60>:    add     x21, x21, #0x1
=> 0x000000000040076c <+64>:    cmp     x21, #0x4
   0x0000000000400770 <+68>:    str     x21, [sp, #8]
   0x0000000000400774 <+72>:    cset    w8, eq  // eq = none
   0x0000000000400778 <+76>:    orr     w19, w19, w8
   0x000000000040077c <+80>:    cmp     x21, #0x4
   0x0000000000400780 <+84>:    b.le    0x400750 
   0x0000000000400784 <+88>:    mov     x8, #0x3                        // #3
   0x0000000000400788 <+92>:    ldr     x9, [sp, #8]
   0x000000000040078c <+96>:    mov     w10, #0xc                       // #12
   0x0000000000400790 <+100>:   movk    x8, #0x1, lsl #32
   0x0000000000400794 <+104>:   add     x1, sp, #0x10
   0x0000000000400798 <+108>:   mov     w0, #0x2                        // #2
   0x000000000040079c <+112>:   stp     x8, x10, [sp, #16]
   0x00000000004007a0 <+116>:   adrp    x8, 0x400000
   0x00000000004007a4 <+120>:   add     x8, x8, #0x7f8
   0x00000000004007a8 <+124>:   stp     x9, xzr, [sp, #48]
   0x00000000004007ac <+128>:   mov     w9, #0x1                        // #1
   0x00000000004007b0 <+132>:   stp     x8, x9, [sp, #32]
   0x00000000004007b4 <+136>:   bl      0x4005b0 <__silly_print@plt>
   0x00000000004007b8 <+140>:   ldp     x20, x19, [sp, #80]
   0x00000000004007bc <+144>:   mov     w0, wzr
   0x00000000004007c0 <+148>:   ldp     x30, x21, [sp, #64]
   0x00000000004007c4 <+152>:   add     sp, sp, #0x60
   0x00000000004007c8 <+156>:   ret
End of assembler dump.



(gdb) c
Continuing.
2

Breakpoint 2, main () at forbreak.mlsilly:32
32              %is_three = arith.cmpi eq, %loop_var, %c3_i64 : i64
(gdb) p v
$2 = 2

Having built a compiler for an arbitrary language, and having implemented DWARF instrumentation for that language, I get line support for stepping through the MLIR itself, if I want it.

I can imagine a scenario where I’ve screwed up the MLIR ops generation in the builder. This lets me set a breakpoint right at the MLIR line in question, and poke around at the disassembly for that point in the code, and see what’s going on. What a cool compiler debugging tool!

Added FUNCTION/CALL support to my toy compiler

July 7, 2025 clang/llvm

I’ve tagged V4 for my toy language and MLIR based compiler.

See the Changelog for the gory details (or the commit history).  There are three specific new features, relative to the V3 tag:

    1. Adds support (grammar, builder, lowering) for function declarations, and function calls. Much of the work for this was done in branch use_mlir_funcop_with_scopeop, later squashed and merged as a big commit. Here’s an example:

      FUNCTION bar ( INT16 w, INT32 z )
      {
          PRINT "In bar";
          PRINT w;
          PRINT z;
          RETURN;
      };
      
      FUNCTION foo ( )
      {
          INT16 v;
          v = 3;
          PRINT "In foo";
          CALL bar( v, 42 );
          PRINT "Called bar";
          RETURN;
      };
      
      PRINT "In main";
      CALL foo();
      PRINT "Back in main";
      

      Here is the MLIR for this program:

      module {
        func.func private @foo() {
          "toy.scope"() ({
            "toy.declare"() <{type = i16}> {sym_name = "v"} : () -> ()
            %c3_i64 = arith.constant 3 : i64
            "toy.assign"(%c3_i64) <{var_name = @v}> : (i64) -> ()
            %0 = "toy.string_literal"() <{value = "In foo"}> : () -> !llvm.ptr
            toy.print %0 : !llvm.ptr
            %1 = "toy.load"() <{var_name = @v}> : () -> i16
            %c42_i64 = arith.constant 42 : i64
            %2 = arith.trunci %c42_i64 : i64 to i32
            "toy.call"(%1, %2) <{callee = @bar}> : (i16, i32) -> ()
            %3 = "toy.string_literal"() <{value = "Called bar"}> : () -> !llvm.ptr
            toy.print %3 : !llvm.ptr
            "toy.return"() : () -> ()
          }) : () -> ()
          "toy.yield"() : () -> ()
        }
        func.func private @bar(%arg0: i16, %arg1: i32) {
          "toy.scope"() ({
            "toy.declare"() <{param_number = 0 : i64, parameter, type = i16}> {sym_name = "w"} : () -> ()
            "toy.declare"() <{param_number = 1 : i64, parameter, type = i32}> {sym_name = "z"} : () -> ()
            %0 = "toy.string_literal"() <{value = "In bar"}> : () -> !llvm.ptr
            toy.print %0 : !llvm.ptr
            %1 = "toy.load"() <{var_name = @w}> : () -> i16
            toy.print %1 : i16
            %2 = "toy.load"() <{var_name = @z}> : () -> i32
            toy.print %2 : i32
            "toy.return"() : () -> ()
          }) : () -> ()
          "toy.yield"() : () -> ()
        }
        func.func @main() -> i32 {
          "toy.scope"() ({
            %c0_i32 = arith.constant 0 : i32
            %0 = "toy.string_literal"() <{value = "In main"}> : () -> !llvm.ptr
            toy.print %0 : !llvm.ptr
            "toy.call"() <{callee = @foo}> : () -> ()
            %1 = "toy.string_literal"() <{value = "Back in main"}> : () -> !llvm.ptr
            toy.print %1 : !llvm.ptr
            "toy.return"(%c0_i32) : (i32) -> ()
          }) : () -> ()
          "toy.yield"() : () -> ()
        }
      }
      

      Here’s a sample program that passes a literal value to a CALL:

      FUNCTION bar ( INT16 w )
      {
          PRINT w;
          RETURN;
      };
      
      PRINT "In main";
      CALL bar( 3 );
      PRINT "Back in main";
      

      The MLIR for this one looks like:

      module {
        func.func private @bar(%arg0: i16) {
          "toy.scope"() ({
            "toy.declare"() <{param_number = 0 : i64, parameter, type = i16}> {sym_name = "w"} : () -> ()
            %0 = "toy.load"() <{var_name = @w}> : () -> i16
            toy.print %0 : i16
            "toy.return"() : () -> ()
          }) : () -> ()
          "toy.yield"() : () -> ()
        }
        func.func @main() -> i32 {
          "toy.scope"() ({
            %c0_i32 = arith.constant 0 : i32
            %0 = "toy.string_literal"() <{value = "In main"}> : () -> !llvm.ptr
            toy.print %0 : !llvm.ptr
            %c3_i64 = arith.constant 3 : i64
            %1 = arith.trunci %c3_i64 : i64 to i16
            "toy.call"(%1) <{callee = @bar}> : (i16) -> ()
            %2 = "toy.string_literal"() <{value = "Back in main"}> : () -> !llvm.ptr
            toy.print %2 : !llvm.ptr
            "toy.return"(%c0_i32) : (i32) -> ()
          }) : () -> ()
          "toy.yield"() : () -> ()
        }
      }
      

      I’ve implemented a two stage lowering, where the toy.scope, toy.yield, toy.call, and toy.return ops are stripped out, leaving just the func and llvm dialects. Code from that stage of the lowering is cleaner looking:

      llvm.mlir.global private constant @str_1(dense<[66, 97, 99, 107, 32, 105, 110, 32, 109, 97, 105, 110]> : tensor<12xi8>) {addr_space = 0 : i32} : !llvm.array<12 x i8>
      func.func private @__toy_print_string(i64, !llvm.ptr)
      llvm.mlir.global private constant @str_0(dense<[73, 110, 32, 109, 97, 105, 110]> : tensor<7xi8>) {addr_space = 0 : i32} : !llvm.array<7 x i8>
      func.func private @__toy_print_i64(i64)
      func.func private @bar(%arg0: i16) {
        %0 = llvm.mlir.constant(1 : i64) : i64
        %1 = llvm.alloca %0 x i16 {alignment = 2 : i64, bindc_name = "w.addr"} : (i64) -> !llvm.ptr
        llvm.store %arg0, %1 : i16, !llvm.ptr
        %2 = llvm.load %1 : !llvm.ptr -> i16
        %3 = llvm.sext %2 : i16 to i64
        call @__toy_print_i64(%3) : (i64) -> ()
        return
      }
      func.func @main() -> i32 {
        %0 = llvm.mlir.constant(0 : i32) : i32
        %1 = llvm.mlir.addressof @str_0 : !llvm.ptr
        %2 = llvm.mlir.constant(7 : i64) : i64
        call @__toy_print_string(%2, %1) : (i64, !llvm.ptr) -> ()
        %3 = llvm.mlir.constant(3 : i64) : i64
        %4 = llvm.mlir.constant(3 : i16) : i16
        call @bar(%4) : (i16) -> ()
        %5 = llvm.mlir.addressof @str_1 : !llvm.ptr
        %6 = llvm.mlir.constant(12 : i64) : i64
        call @__toy_print_string(%6, %5) : (i64, !llvm.ptr) -> ()
        return %0 : i32
      }
      

      There are some dead code constants left there (%3), seemingly due to type conversion, but they get stripped out nicely by the time we get to LLVM-IR:

      @str_1 = private constant [12 x i8] c"Back in main"
      @str_0 = private constant [7 x i8] c"In main"
      
      declare void @__toy_print_string(i64, ptr)
      
      declare void @__toy_print_i64(i64)
      
      define void @bar(i16 %0) {
        %2 = alloca i16, i64 1, align 2
        store i16 %0, ptr %2, align 2
        %3 = load i16, ptr %2, align 2
        %4 = sext i16 %3 to i64
        call void @__toy_print_i64(i64 %4)
        ret void
      }
      
      define i32 @main() {
        call void @__toy_print_string(i64 7, ptr @str_0)
        call void @bar(i16 3)
        call void @__toy_print_string(i64 12, ptr @str_1)
        ret i32 0
      }
    2. Generalize NegOp lowering to support all types, not just f64.
    3. Allow PRINT of string literals, avoiding requirement for variables. Example:

          %0 = "toy.string_literal"() <{value = "A string literal!"}> : () -> !llvm.ptr loc(#loc)
          "toy.print"(%0) : (!llvm.ptr) -> () loc(#loc)

       
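The function call listings in item 1 show two integer conversions: literals start out as i64 and get narrowed to the parameter type (arith.trunci), and PRINT widens narrow values back to i64 (llvm.sext) before calling the one integer print function. A minimal C++ sketch of that round trip follows. The function name truncThenSignExtend is hypothetical, and it only models the bit arithmetic, not the compiler's conversion code.

```cpp
#include <cstdint>

// Keep only the low `bits` bits of a 64-bit constant, then reinterpret
// them as a signed value of that width: the combined effect of
// arith.trunci followed by llvm.sext back to i64.
// Assumes 1 <= bits <= 64.
int64_t truncThenSignExtend( int64_t value, unsigned bits )
{
    if ( bits >= 64 )
        return value;

    const uint64_t mask = ( uint64_t{ 1 } << bits ) - 1;
    const uint64_t low = static_cast<uint64_t>( value ) & mask;
    const uint64_t signBit = uint64_t{ 1 } << ( bits - 1 );

    // (low ^ signBit) - signBit sign-extends using unsigned wraparound.
    return static_cast<int64_t>( ( low ^ signBit ) - signBit );
}
```

For the values in the examples (3 at 16 bits, 42 at 32 bits) the round trip is the identity. A literal that does not fit wraps silently instead, for example 40000 at 16 bits comes back as -25536, because truncation simply drops the high bits.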

The next obvious thing to do for the language/compiler would be to implement conditionals (IF/ELIF/ELSE) and loops. I think that there are MLIR dialects to facilitate both (like the affine dialect for loops.)

However, having now finished this function support feature (which I’ve been working on for quite a while), I’m going to take a break from this project. Even though I’ve only been working on this toy compiler project in my spare time, it periodically invades my thoughts. With all that I have to learn for my new job, I’d rather have one less extra thing to think about, so that I don’t feel pulled in too many directions at once.

Tagged V3 of my toy compiler (playing with the MLIR -> LLVM-IR toolchain)

June 3, 2025 clang/llvm


 

I’ve added a number of elements to the language and compiler:

  • comparison operators (<, <=, EQ, NE) yielding BOOL values.  These work for any combinations of floating and integer types (including BOOL.)

  • integer bitwise operators (OR, AND, XOR).  These only work for integer types (including BOOL.)

  • a NOT operator, yielding BOOL.

  • Array + string declaration and lowering support, including debug instrumentation, and print support for string variables.

This version also fixes a few specific issues:

  • Fixed -g/-OX propagation to lowering.  If -g is not specified, the DI is no longer generated.
  • Show the optimized .ll with --emit-llvm instead of the just-lowered .ll (unless not invoking the assembly printer, where the ll optimization passes are registered.)
  • Reorganize the grammar so that all the simple lexer tokens are last.  Rename a bunch of the tokens, introducing some consistency.
  • calculator.td: introduce IntOrFloat constraint type, replacing AnyType usage; array decl support, and string support.
  • driver: writeLL helper function, pass -g to lowering if set.
  • parser: handle large integer constants properly, array decl support, and string support.
  • simplest.cpp: This MWE is updated to include a global variable and global variable access.
  • parser: implicit exit: use the last saved location, instead of the module start.  This means the line numbers don’t jump around at the very end of the program anymore (i.e.: implicit return/exit)

I started with the comparison operators, thinking that I’d add if statement support, and loops, but got sidetracked.  In particular, I generated a number of really large test programs, and without some way to print a string message, it was hard to figure out where an error was occurring.  This led to implementing PRINT string-variable support as an interesting feature first.

As a side effect of adding STRING support, I’ve also got declaration support for arbitrary fixed size arrays for any type.  I haven’t implemented array access yet, but that probably won’t be too hard.

Here’s an example of a program that uses STRING variables:

STRING t[2];
STRING u[3];
STRING s[2];
INT8 s2;
s = "hi";
PRINT s;
t = "hi";
PRINT t;
u = "hi";
PRINT u;
u = "bye";
PRINT u;

This is the MLIR for the program:

module {
  toy.program {
    "toy.declare"() <{name = "t", size = 2 : i64, type = i8}> : () -> ()
    "toy.declare"() <{name = "u", size = 3 : i64, type = i8}> : () -> ()
    "toy.declare"() <{name = "s", size = 2 : i64, type = i8}> : () -> ()
    "toy.declare"() <{name = "s2", type = i8}> : () -> ()
    toy.string_assign "s" = "hi"
    %0 = toy.load "s" : !llvm.ptr
    toy.print %0 : !llvm.ptr
    toy.string_assign "t" = "hi"
    %1 = toy.load "t" : !llvm.ptr
    toy.print %1 : !llvm.ptr
    toy.string_assign "u" = "hi"
    %2 = toy.load "u" : !llvm.ptr
    toy.print %2 : !llvm.ptr
    toy.string_assign "u" = "bye"
    %3 = toy.load "u" : !llvm.ptr
    toy.print %3 : !llvm.ptr
    toy.exit
  }
}

It’s a bit clunky, because I cheated and didn’t try to implement PRINT support of string literals directly. I thought that since I had variable support already (which emits llvm.alloca), I could change that alloca trivially to an array from a scalar value.

I think that this did turn out to be a relatively easy way to do it, but this little item did take much more effort than I expected.

The DeclareOp builder is fairly straightforward:

builder.create<toy::DeclareOp>( loc, builder.getStringAttr( varName ), mlir::TypeAttr::get( ty ), nullptr ); // for scalars
...
auto sizeAttr = builder.getI64IntegerAttr( arraySize );
builder.create<toy::DeclareOp>( loc, builder.getStringAttr( varName ), mlir::TypeAttr::get( ty ), sizeAttr ); // for arrays

This matches the Optional size now added to DeclareOp for arrays:

def Toy_DeclareOp : Op<Toy_Dialect, "declare"> {
  let summary = "Declare a variable or array, specifying its name, type (integer or float), and optional size.";
  let arguments = (ins StrAttr:$name, TypeAttr:$type, OptionalAttr<I64Attr>:$size);
  let results = (outs);

There’s a new AssignStringOp complementing AssignOp:

def Toy_AssignOp : Op<Toy_Dialect, "assign"> {
  let summary = "Assign a (non-string) value to a variable associated with a declaration";
  let arguments = (ins StrAttr:$name, AnyType:$value);
  let results = (outs);

  // toy.assign "x", %0 : i32
  let assemblyFormat = "$name `,` $value `:` type($value) attr-dict";
}

def Toy_AssignStringOp : Op<Toy_Dialect, "string_assign"> {
  let summary = "Assign a string literal to an i8 array variable";
  let arguments = (ins StrAttr:$name, Builtin_StringAttr:$value);
  let results = (outs);
  let assemblyFormat = "$name `=` $value attr-dict";
}

I also feel this is a kludge. I probably really want a string literal type like flang’s. Here’s a Fortran hello world:

program hello
  print *, "Hello, world!"
end program hello

and selected parts of the flang fir dialect MLIR for it:

    %4 = fir.declare %3 typeparams %c13 {fortran_attrs = #fir.var_attrs, uniq_name = "_QQclX48656C6C6F2C20776F726C6421"} : (!fir.ref<!fir.char<1,13>>, index) -> !fir.ref<!fir.char<1,13>>
    %5 = fir.convert %4 : (!fir.ref<!fir.char<1,13>>) -> !fir.ref
    %6 = fir.convert %c13 : (index) -> i64
    %7 = fir.call @_FortranAioOutputAscii(%2, %5, %6) fastmath : (!fir.ref, !fir.ref, i64) -> i1
  ...

  fir.global linkonce @_QQclX48656C6C6F2C20776F726C6421 constant : !fir.char<1,13> {
    %0 = fir.string_lit "Hello, world!"(13) : !fir.char<1,13>
    fir.has_value %0 : !fir.char<1,13>
  }

I had to specialize the LoadOp builder too so that it didn’t create a scalar load. That code looks like:

mlir::Type varType;
mlir::Type elemType = declareOp.getTypeAttr().getValue();
        
if ( declareOp.getSizeAttr() )    // Check if size attribute exists
{       
    // Array: load a generic pointer 
    varType = mlir::LLVM::LLVMPointerType::get( builder.getContext(), /*addressSpace=*/0 );
}       
else    
{       
    // Scalar: load the value
    varType = elemType;
}       
        
auto value = builder.create<toy::LoadOp>( loc, varType, builder.getStringAttr( varName ) );

Lowering didn’t require too much. I needed a print function object:

auto ptrType = LLVM::LLVMPointerType::get( ctx );
auto printFuncStringType = LLVM::LLVMFunctionType::get( LLVM::LLVMVoidType::get( ctx ),
                                                        { pr_builder.getI64Type(), ptrType }, false );
pr_printFuncString = pr_builder.create<LLVM::LLVMFuncOp>( pr_module.getLoc(), "__toy_print_string",
                                                          printFuncStringType, LLVM::Linkage::External );

With LoadOp now possibly returning a pointer, its lowering has to handle both the array and the scalar case:

if ( mlir::isa<mlir::LLVM::LLVMPointerType>( loadOp.getResult().getType() ) )
{           
    // Return the allocated pointer
    LLVM_DEBUG( llvm::dbgs() << "Loading array address: " << allocaOp.getResult() << '\n' );
    rewriter.replaceOp( op, allocaOp.getResult() );
}           
else        
{           
    // Scalar load
    auto load = rewriter.create<LLVM::LoadOp>( loc, elemType, allocaOp );
    LLVM_DEBUG( llvm::dbgs() << "new load op: " << load << '\n' );
    rewriter.replaceOp( op, load.getResult() );
}           

assign-string lowering basically just generates a memcpy from a global:

Type elemType = allocaOp.getElemType();
int64_t numElems = 0;
if ( auto constOp = allocaOp.getArraySize().getDefiningOp<LLVM::ConstantOp>() )
{           
    // dyn_cast yields a null attribute on a type mismatch, so check before use
    if ( auto intAttr = mlir::dyn_cast<IntegerAttr>( constOp.getValue() ) )
    {
        numElems = intAttr.getInt();
    }
}           
LLVM_DEBUG( llvm::dbgs() << "numElems: " << numElems << '\n' );
LLVM_DEBUG( llvm::dbgs() << "elemType: " << elemType << '\n' );

if ( !mlir::isa<mlir::IntegerType>( elemType ) || elemType.getIntOrFloatBitWidth() != 8 )
{           
    return rewriter.notifyMatchFailure( assignOp, "string assignment requires i8 array" );
}           
if ( numElems == 0 )
{           
    return rewriter.notifyMatchFailure( assignOp, "invalid array size" );
}           

size_t strLen = value.size();
size_t copySize = std::min( strLen + 1, static_cast<size_t>( numElems ) );
if ( strLen > static_cast<size_t>( numElems ) )
{           
    return rewriter.notifyMatchFailure( assignOp, "string too large for array" );
}           

mlir::LLVM::GlobalOp globalOp = lState.lookupOrInsertGlobalOp( rewriter, value, loc, copySize, strLen );

auto globalPtr = rewriter.create<LLVM::AddressOfOp>( loc, globalOp ); 

auto destPtr = allocaOp.getResult();

auto sizeConst = 
    rewriter.create<LLVM::ConstantOp>( loc, rewriter.getI64Type(), rewriter.getI64IntegerAttr( copySize ) );

rewriter.create<LLVM::MemcpyOp>( loc, destPtr, globalPtr, sizeConst, rewriter.getBoolAttr( false ) );

rewriter.eraseOp( op );
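The size logic in that lowering is easy to lose among the rewriter calls, so here it is on its own as a small C++ sketch. The function name stringCopySize is hypothetical, and it folds the two rejection checks and the std::min into one place.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>

// Number of bytes to memcpy when a string literal of length strLen is
// assigned to an i8 array of numElems elements, or std::nullopt when the
// assignment has to be rejected (zero-sized array, or literal too long).
// The trailing NUL is included only when the array has room for it.
std::optional<std::size_t> stringCopySize( std::size_t strLen, std::size_t numElems )
{
    if ( numElems == 0 || strLen > numElems )
        return std::nullopt;

    return std::min( strLen + 1, numElems );
}
```

For the sample program, "hi" into a 2 element array copies 2 bytes (no terminator), while "hi" or "bye" into a 3 element array copies 3 bytes. An exactly full array is therefore not NUL terminated, which fits with the print runtime function taking an explicit length argument.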

I used globals like what we’d find in clang LLVM-IR. For example, here’s a small C program:

#include <string.h>

int main()
{
    const char* s = "hi there";
    char buf[100];
    memcpy( buf, s, strlen( s ) + 1 );

    return strlen( buf );
}

for which clang’s LLVM-IR looks like:

@.str = private unnamed_addr constant [9 x i8] c"hi there\00", align 1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @main() #0 {
  %1 = alloca i32, align 4
  %2 = alloca ptr, align 8
  %3 = alloca [100 x i8], align 16
  store i32 0, ptr %1, align 4
  store ptr @.str, ptr %2, align 8
  %4 = getelementptr inbounds [100 x i8], ptr %3, i64 0, i64 0
  %5 = load ptr, ptr %2, align 8
  %6 = load ptr, ptr %2, align 8
  %7 = call i64 @strlen(ptr noundef %6) #3
  %8 = add i64 %7, 1
  call void @llvm.memcpy.p0.p0.i64(ptr align 16 %4, ptr align 1 %5, i64 %8, i1 false)
  %9 = getelementptr inbounds [100 x i8], ptr %3, i64 0, i64 0
  %10 = call i64 @strlen(ptr noundef %9) #3
  %11 = trunc i64 %10 to i32
  ret i32 %11
}

My lowered LLVM-IR for the program is similar:

@str_1 = private constant [3 x i8] c"bye"
@str_0 = private constant [2 x i8] c"hi"

declare void @__toy_print_f64(double)

declare void @__toy_print_i64(i64)

declare void @__toy_print_string(i64, ptr)

define i32 @main() {
  %1 = alloca i8, i64 2, align 1
  %2 = alloca i8, i64 3, align 1
  %3 = alloca i8, i64 2, align 1
  %4 = alloca i8, i64 1, align 1
  call void @llvm.memcpy.p0.p0.i64(ptr align 1 %3, ptr align 1 @str_0, i64 2, i1 false)
  call void @__toy_print_string(i64 2, ptr %3)
  call void @llvm.memcpy.p0.p0.i64(ptr align 1 %1, ptr align 1 @str_0, i64 2, i1 false)
  call void @__toy_print_string(i64 2, ptr %1)
  call void @llvm.memcpy.p0.p0.i64(ptr align 1 %2, ptr align 1 @str_0, i64 3, i1 false)
  call void @__toy_print_string(i64 3, ptr %2)
  call void @llvm.memcpy.p0.p0.i64(ptr align 1 %2, ptr align 1 @str_1, i64 3, i1 false)
  call void @__toy_print_string(i64 3, ptr %2)
  ret i32 0
}
...

I managed my string literals with a simple hash, avoiding replication if repeated:

mlir::LLVM::GlobalOp lookupOrInsertGlobalOp( ConversionPatternRewriter& rewriter, mlir::StringAttr& stringLit,
                                             mlir::Location loc, size_t copySize, size_t strLen )
{       
    mlir::LLVM::GlobalOp globalOp;
    auto it = pr_stringLiterals.find( stringLit.str() );
    if ( it != pr_stringLiterals.end() )
    {       
        globalOp = it->second;
        LLVM_DEBUG( llvm::dbgs() << "Reusing global: " << globalOp.getSymName() << '\n' ); 
    }       
    else    
    {       
        auto savedIP = rewriter.saveInsertionPoint();
        rewriter.setInsertionPointToStart( pr_module.getBody() );

        auto i8Type = rewriter.getI8Type();
        auto arrayType = mlir::LLVM::LLVMArrayType::get( i8Type, copySize );

        SmallVector<char> stringData( stringLit.begin(), stringLit.end() );
        if ( copySize > strLen )
        {       
            stringData.push_back( '\0' ); 
        }       
        auto denseAttr = DenseElementsAttr::get( RankedTensorType::get( { static_cast<int64_t>( copySize ) }, i8Type ),
                                                 ArrayRef<char>( stringData ) );

        std::string globalName = "str_" + std::to_string( pr_stringLiterals.size() );
        globalOp =
            rewriter.create<LLVM::GlobalOp>( loc, arrayType, true, LLVM::Linkage::Private, globalName, denseAttr );
        globalOp->setAttr( "unnamed_addr", rewriter.getUnitAttr() );

        pr_stringLiterals[stringLit.str()] = globalOp;
        LLVM_DEBUG( llvm::dbgs() << "Created global: " << globalName << '\n' );

        rewriter.restoreInsertionPoint( savedIP );
    }           

    return globalOp;
}         

Without the insertion point swaperoo, this GlobalOp creation doesn’t work, as we need to be in the ModuleOp level where the symbol table lives.
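Stripped of the MLIR machinery, the interning idea can be sketched with just the standard library. The StringLiteralPool class below is a hypothetical stand-in that hands out symbol names rather than GlobalOps. It also makes one deliberate change from the code above: it keys on the padded bytes instead of the bare literal. In the lowered IR shown earlier, the u = "hi" assignment appears to copy 3 bytes out of the 2 byte @str_0, and keying on the padded bytes is one possible way to keep "hi" and "hi\0" as separate globals.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the lowering state: maps literal bytes to a
// global symbol name instead of an mlir::LLVM::GlobalOp.
class StringLiteralPool
{
    std::unordered_map<std::string, std::string> pr_names;

public:
    // Return the symbol name for this literal, creating "str_<N>" on
    // first use. The key is the byte sequence that would be stored,
    // including the NUL when copySize leaves room for one.
    const std::string& lookupOrInsert( const std::string& literal, std::size_t copySize )
    {
        std::string bytes = literal;
        if ( copySize > literal.size() )
        {
            bytes.push_back( '\0' );
        }

        auto it = pr_names.find( bytes );
        if ( it == pr_names.end() )
        {
            // Name is derived from the pool size before insertion.
            std::string name = "str_" + std::to_string( pr_names.size() );
            it = pr_names.emplace( std::move( bytes ), std::move( name ) ).first;
        }

        return it->second;
    }

    std::size_t size() const { return pr_names.size(); }
};
```

With this keying, repeated requests for "hi" with a copy size of 2 share one name, and a request for "hi" with a copy size of 3 gets its own. The cost is a little duplication of literal data, which seems like a fair trade for never reading past the end of a global.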

 

… anyways, it looks like I’m droning on.  There’s been lots of stuff to get this far, but there are still many many things to do before what I’ve got even qualifies as a basic programming language (if statements, loops, functions, array assignments, types, …)

Have added boolean operations to my toy MLIR compiler

May 30, 2025 C/C++ development and debugging.


The git repo for the project now has a way to encode predicates, which I figured was a good first step towards adding some useful control flow (IF+LOOPS).  Specifically, the toy language/compiler now supports the following operators:

  • <
  • <=
  • >
  • >=
  • EQ
  • NE

This list works for any floating point or integer type (including BOOL, which is like “INT1”).  I also added AND,OR,XOR (for integer types, including BOOL.)  The grammar has a NOT operator, but it’s not implemented in the parser yet.

Here’s a sample program:

BOOL b;
BOOL i1;
INT16 l16;
i1 = TRUE;
l16 = -100;
b = i1 < l16;
PRINT b;
b = i1 > l16;
PRINT b;

My MLIR is:

module {
  toy.program {
    toy.declare "b" : i1
    toy.declare "i1" : i1
    toy.declare "l16" : i16
    %true = arith.constant true
    toy.assign "i1", %true : i1
    %c-100_i64 = arith.constant -100 : i64
    toy.assign "l16", %c-100_i64 : i64
    %0 = toy.load "i1" : i1
    %1 = toy.load "l16" : i16
    %2 = "toy.less"(%0, %1) : (i1, i16) -> i1
    toy.assign "b", %2 : i1
    %3 = toy.load "b" : i1
    toy.print %3 : i1
    %4 = toy.load "i1" : i1
    %5 = toy.load "l16" : i16
    %6 = "toy.less"(%5, %4) : (i16, i1) -> i1
    toy.assign "b", %6 : i1
    %7 = toy.load "b" : i1
    toy.print %7 : i1
    toy.exit
  }
}

Here’s the LLVM-IR after lowering:

declare void @__toy_print_f64(double)

declare void @__toy_print_i64(i64)

define i32 @main() !dbg !4 {
  %1 = alloca i1, i64 1, align 1, !dbg !8
    #dbg_declare(ptr %1, !9, !DIExpression(), !8)
  %2 = alloca i1, i64 1, align 1, !dbg !11
    #dbg_declare(ptr %2, !12, !DIExpression(), !11)
  %3 = alloca i16, i64 1, align 2, !dbg !13
    #dbg_declare(ptr %3, !14, !DIExpression(), !13)
  store i1 true, ptr %2, align 1, !dbg !16
  store i16 -100, ptr %3, align 2, !dbg !17
  %4 = load i1, ptr %2, align 1, !dbg !18
  %5 = load i16, ptr %3, align 2, !dbg !18
  %6 = zext i1 %4 to i16, !dbg !18
  %7 = icmp slt i16 %6, %5, !dbg !18
  store i1 %7, ptr %1, align 1, !dbg !18
  %8 = load i1, ptr %1, align 1, !dbg !19
  %9 = zext i1 %8 to i64, !dbg !19
  call void @__toy_print_i64(i64 %9), !dbg !19
  %10 = load i1, ptr %2, align 1, !dbg !20
  %11 = load i16, ptr %3, align 2, !dbg !20
  %12 = zext i1 %10 to i16, !dbg !20
  %13 = icmp slt i16 %11, %12, !dbg !20
  store i1 %13, ptr %1, align 1, !dbg !20
  %14 = load i1, ptr %1, align 1, !dbg !21
  %15 = zext i1 %14 to i64, !dbg !21
  call void @__toy_print_i64(i64 %15), !dbg !21
  ret i32 0, !dbg !8
}
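To make the two results easier to reason about, here is a standalone C++ model of the lowered comparisons. It assumes only what the LLVM-IR above shows: the BOOL operand is zero-extended to i16, the compare is a signed less-than, and > is emitted as a less-than with swapped operands. The function names are made up for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Models `zext i1 %x to i16`: TRUE becomes 1 (a sign extension would give -1).
inline std::int16_t zextBoolToI16( bool b )
{
    return b ? 1 : 0;
}

// Models `icmp slt i16 %lhs, %rhs` (signed less-than).
inline bool icmpSlt( std::int16_t lhs, std::int16_t rhs )
{
    return lhs < rhs;
}

// b = i1 < l16;
inline bool lessBoolInt16( bool i1, std::int16_t l16 )
{
    return icmpSlt( zextBoolToI16( i1 ), l16 );
}

// b = i1 > l16;  (emitted as toy.less with the operands swapped)
inline bool greaterBoolInt16( bool i1, std::int16_t l16 )
{
    return icmpSlt( l16, zextBoolToI16( i1 ) );
}
```

With i1 = TRUE and l16 = -100, lessBoolInt16 evaluates to false and greaterBoolInt16 to true, since TRUE zero-extends to 1 and -100 is less than 1 in a signed compare.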

I’d like to refactor the type conversion logic, as what I have now in the lowering is pretty clunky.

Debugging now works in my toy MLIR compiler!

May 25, 2025 C/C++ development and debugging.

Screenshot

Screenshot

I’ve now got both line debugging (break, next, continue) and variable display (and modification) working for my toy language and compiler.

Here’s an example program:

BOOL i1;
i1 = TRUE;
PRINT i1;

INT8 i8;
i8 = 10;
PRINT i8;

INT16 i16;
i16 = 1000;
PRINT i16;

INT32 i32;
i32 = 100000;
PRINT i32;

INT64 i64;
i64 = 100000000000;
PRINT i64;

FLOAT32 f32;
f32 = 1.1;
PRINT f32;

FLOAT64 f64;
f64 = 2.2E-1;
PRINT f64;

It doesn’t do anything interesting, other than demonstrate that I got the DILocalVariableAttr declarations right for each supported type. Here’s the MLIR for this program:

"builtin.module"() ({
  "toy.program"() ({
    "toy.declare"() <{name = "i1", type = i1}> : () -> () loc(#loc)
    %0 = "arith.constant"() <{value = true}> : () -> i1 loc(#loc1)
    "toy.assign"(%0) <{name = "i1"}> : (i1) -> () loc(#loc1)
    %1 = "toy.load"() <{name = "i1"}> : () -> i1 loc(#loc2)
    "toy.print"(%1) : (i1) -> () loc(#loc2)
    "toy.declare"() <{name = "i8", type = i8}> : () -> () loc(#loc3)
    %2 = "arith.constant"() <{value = 10 : i64}> : () -> i64 loc(#loc4)
    "toy.assign"(%2) <{name = "i8"}> : (i64) -> () loc(#loc4)
    %3 = "toy.load"() <{name = "i8"}> : () -> i8 loc(#loc5)
    "toy.print"(%3) : (i8) -> () loc(#loc5)
    "toy.declare"() <{name = "i16", type = i16}> : () -> () loc(#loc6)
    %4 = "arith.constant"() <{value = 1000 : i64}> : () -> i64 loc(#loc7)
    "toy.assign"(%4) <{name = "i16"}> : (i64) -> () loc(#loc7)
    %5 = "toy.load"() <{name = "i16"}> : () -> i16 loc(#loc8)
    "toy.print"(%5) : (i16) -> () loc(#loc8)
    "toy.declare"() <{name = "i32", type = i32}> : () -> () loc(#loc9)
    %6 = "arith.constant"() <{value = 100000 : i64}> : () -> i64 loc(#loc10)
    "toy.assign"(%6) <{name = "i32"}> : (i64) -> () loc(#loc10)
    %7 = "toy.load"() <{name = "i32"}> : () -> i32 loc(#loc11)
    "toy.print"(%7) : (i32) -> () loc(#loc11)
    "toy.declare"() <{name = "i64", type = i64}> : () -> () loc(#loc12)
    %8 = "arith.constant"() <{value = 100000000000 : i64}> : () -> i64 loc(#loc13)
    "toy.assign"(%8) <{name = "i64"}> : (i64) -> () loc(#loc13)
    %9 = "toy.load"() <{name = "i64"}> : () -> i64 loc(#loc14)
    "toy.print"(%9) : (i64) -> () loc(#loc14)
    "toy.declare"() <{name = "f32", type = f32}> : () -> () loc(#loc15)
    %10 = "arith.constant"() <{value = 1.100000e+00 : f64}> : () -> f64 loc(#loc16)
    "toy.assign"(%10) <{name = "f32"}> : (f64) -> () loc(#loc16)
    %11 = "toy.load"() <{name = "f32"}> : () -> f32 loc(#loc17)
    "toy.print"(%11) : (f32) -> () loc(#loc17)
    "toy.declare"() <{name = "f64", type = f64}> : () -> () loc(#loc18)
    %12 = "arith.constant"() <{value = 2.200000e-01 : f64}> : () -> f64 loc(#loc19)
    "toy.assign"(%12) <{name = "f64"}> : (f64) -> () loc(#loc19)
    %13 = "toy.load"() <{name = "f64"}> : () -> f64 loc(#loc20)
    "toy.print"(%13) : (f64) -> () loc(#loc20)
    "toy.exit"() : () -> () loc(#loc)
  }) : () -> () loc(#loc)
}) : () -> () loc(#loc)
#loc = loc("types.toy":1:1)
#loc1 = loc("types.toy":2:6)
#loc2 = loc("types.toy":3:1)
#loc3 = loc("types.toy":5:1)
#loc4 = loc("types.toy":6:6)
#loc5 = loc("types.toy":7:1)
#loc6 = loc("types.toy":9:1)
#loc7 = loc("types.toy":10:7)
#loc8 = loc("types.toy":11:1)
#loc9 = loc("types.toy":13:1)
#loc10 = loc("types.toy":14:7)
#loc11 = loc("types.toy":15:1)
#loc12 = loc("types.toy":17:1)
#loc13 = loc("types.toy":18:7)
#loc14 = loc("types.toy":19:1)
#loc15 = loc("types.toy":21:1)
#loc16 = loc("types.toy":22:7)
#loc17 = loc("types.toy":23:1)
#loc18 = loc("types.toy":25:1)
#loc19 = loc("types.toy":26:7)
#loc20 = loc("types.toy":27:1)

and here is the generated LLVM-IR:

; ModuleID = 'types.toy'
source_filename = "types.toy"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

declare void @__toy_print_f64(double)

declare void @__toy_print_i64(i64)

define i32 @main() !dbg !4 {
  %1 = alloca i1, i64 1, align 1, !dbg !8
    #dbg_declare(ptr %1, !9, !DIExpression(), !8)
  store i1 true, ptr %1, align 1, !dbg !11
  %2 = load i1, ptr %1, align 1, !dbg !12
  %3 = zext i1 %2 to i64, !dbg !12
  call void @__toy_print_i64(i64 %3), !dbg !12
  %4 = alloca i8, i64 1, align 1, !dbg !13
    #dbg_declare(ptr %4, !14, !DIExpression(), !13)
  store i8 10, ptr %4, align 1, !dbg !16
  %5 = load i8, ptr %4, align 1, !dbg !17
  %6 = sext i8 %5 to i64, !dbg !17
  call void @__toy_print_i64(i64 %6), !dbg !17
  %7 = alloca i16, i64 1, align 2, !dbg !18
    #dbg_declare(ptr %7, !19, !DIExpression(), !18)
  store i16 1000, ptr %7, align 2, !dbg !21
  %8 = load i16, ptr %7, align 2, !dbg !22
  %9 = sext i16 %8 to i64, !dbg !22
  call void @__toy_print_i64(i64 %9), !dbg !22
  %10 = alloca i32, i64 1, align 4, !dbg !23
    #dbg_declare(ptr %10, !24, !DIExpression(), !23)
  store i32 100000, ptr %10, align 4, !dbg !26
  %11 = load i32, ptr %10, align 4, !dbg !27
  %12 = sext i32 %11 to i64, !dbg !27
  call void @__toy_print_i64(i64 %12), !dbg !27
  %13 = alloca i64, i64 1, align 8, !dbg !28
    #dbg_declare(ptr %13, !29, !DIExpression(), !28)
  store i64 100000000000, ptr %13, align 8, !dbg !31
  %14 = load i64, ptr %13, align 8, !dbg !32
  call void @__toy_print_i64(i64 %14), !dbg !32
  %15 = alloca float, i64 1, align 4, !dbg !33
    #dbg_declare(ptr %15, !34, !DIExpression(), !33)
  store float 0x3FF19999A0000000, ptr %15, align 4, !dbg !36
  %16 = load float, ptr %15, align 4, !dbg !37
  %17 = fpext float %16 to double, !dbg !37
  call void @__toy_print_f64(double %17), !dbg !37
  %18 = alloca double, i64 1, align 8, !dbg !38
    #dbg_declare(ptr %18, !39, !DIExpression(), !38)
  store double 2.200000e-01, ptr %18, align 8, !dbg !41
  %19 = load double, ptr %18, align 8, !dbg !42
  call void @__toy_print_f64(double %19), !dbg !42
  ret i32 0, !dbg !8
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare void @llvm.dbg.declare(metadata, metadata, metadata) #0

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

!llvm.module.flags = !{!0}
!llvm.dbg.cu = !{!1}
!llvm.ident = !{!3}

!0 = !{i32 2, !"Debug Info Version", i32 3}
!1 = distinct !DICompileUnit(language: DW_LANG_C, file: !2, producer: "toycalculator", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug)
!2 = !DIFile(filename: "types.toy", directory: ".")
!3 = !{!"toycalculator V2"}
!4 = distinct !DISubprogram(name: "main", linkageName: "main", scope: !2, file: !2, line: 1, type: !5, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !1)
!5 = !DISubroutineType(types: !6)
!6 = !{!7}
!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!8 = !DILocation(line: 1, column: 1, scope: !4)
!9 = !DILocalVariable(name: "i1", scope: !4, file: !2, line: 1, type: !10, align: 8)
!10 = !DIBasicType(name: "bool", size: 8, encoding: DW_ATE_boolean)
!11 = !DILocation(line: 2, column: 6, scope: !4)
!12 = !DILocation(line: 3, column: 1, scope: !4)
!13 = !DILocation(line: 5, column: 1, scope: !4)
!14 = !DILocalVariable(name: "i8", scope: !4, file: !2, line: 5, type: !15, align: 8)
!15 = !DIBasicType(name: "int8_t", size: 8, encoding: DW_ATE_signed)
!16 = !DILocation(line: 6, column: 6, scope: !4)
!17 = !DILocation(line: 7, column: 1, scope: !4)
!18 = !DILocation(line: 9, column: 1, scope: !4)
!19 = !DILocalVariable(name: "i16", scope: !4, file: !2, line: 9, type: !20, align: 16)
!20 = !DIBasicType(name: "int16_t", size: 16, encoding: DW_ATE_signed)
!21 = !DILocation(line: 10, column: 7, scope: !4)
!22 = !DILocation(line: 11, column: 1, scope: !4)
!23 = !DILocation(line: 13, column: 1, scope: !4)
!24 = !DILocalVariable(name: "i32", scope: !4, file: !2, line: 13, type: !25, align: 32)
!25 = !DIBasicType(name: "int32_t", size: 32, encoding: DW_ATE_signed)
!26 = !DILocation(line: 14, column: 7, scope: !4)
!27 = !DILocation(line: 15, column: 1, scope: !4)
!28 = !DILocation(line: 17, column: 1, scope: !4)
!29 = !DILocalVariable(name: "i64", scope: !4, file: !2, line: 17, type: !30, align: 64)
!30 = !DIBasicType(name: "int64_t", size: 64, encoding: DW_ATE_signed)
!31 = !DILocation(line: 18, column: 7, scope: !4)
!32 = !DILocation(line: 19, column: 1, scope: !4)
!33 = !DILocation(line: 21, column: 1, scope: !4)
!34 = !DILocalVariable(name: "f32", scope: !4, file: !2, line: 21, type: !35, align: 32)
!35 = !DIBasicType(name: "float", size: 32, encoding: DW_ATE_float)
!36 = !DILocation(line: 22, column: 7, scope: !4)
!37 = !DILocation(line: 23, column: 1, scope: !4)
!38 = !DILocation(line: 25, column: 1, scope: !4)
!39 = !DILocalVariable(name: "f64", scope: !4, file: !2, line: 25, type: !40, align: 64)
!40 = !DIBasicType(name: "double", size: 64, encoding: DW_ATE_float)
!41 = !DILocation(line: 26, column: 7, scope: !4)
!42 = !DILocation(line: 27, column: 1, scope: !4)

Interesting bits include:

source_filename = "types.toy"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

!llvm.module.flags = !{!0}
!llvm.dbg.cu = !{!1}
!llvm.ident = !{!3}

!0 = !{i32 2, !"Debug Info Version", i32 3}
!1 = distinct !DICompileUnit(language: DW_LANG_C, file: !2, producer: "toycalculator", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug)
!2 = !DIFile(filename: "types.toy", directory: ".")
!3 = !{!"toycalculator V2"}
!4 = distinct !DISubprogram(name: "main", linkageName: "main", scope: !2, file: !2, line: 1, type: !5, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !1)
!5 = !DISubroutineType(types: !6)
!6 = !{!7}

Unlike flang’s AddDebugInfoPass DI instrumentation pass, I didn’t try to do anything fancy, and instead just implemented a few helper functions.

One for the target triple:

void setModuleAttrs()
{
    std::string targetTriple = llvm::sys::getDefaultTargetTriple();
    llvm::Triple triple( targetTriple );
    assert( triple.isArch64Bit() && triple.isOSLinux() );

    std::string error;
    const llvm::Target* target = llvm::TargetRegistry::lookupTarget( targetTriple, error );
    assert( target );
    llvm::TargetOptions options;
    auto targetMachine = std::unique_ptr<llvm::TargetMachine>( target->createTargetMachine(
        targetTriple, "generic", "", options, std::optional<llvm::Reloc::Model>( llvm::Reloc::PIC_ ) ) );
    assert( targetMachine );
    std::string dataLayoutStr = targetMachine->createDataLayout().getStringRepresentation();

    module->setAttr( "llvm.ident", builder.getStringAttr( COMPILER_NAME COMPILER_VERSION ) );
    module->setAttr( "llvm.data_layout", builder.getStringAttr( dataLayoutStr ) );
    module->setAttr( "llvm.target_triple", builder.getStringAttr( targetTriple ) );
}

One for the DICompileUnitAttr and DISubprogramAttr:

void createMain()
{
    auto ctx = builder.getContext();
    auto mainFuncType = LLVM::LLVMFunctionType::get( builder.getI32Type(), {}, false );
    mainFunc =
        builder.create<LLVM::LLVMFuncOp>( module.getLoc(), ENTRY_SYMBOL_NAME, mainFuncType, LLVM::Linkage::External );

    // Construct module level DI state:
    fileAttr = mlir::LLVM::DIFileAttr::get( ctx, driverState.filename, "." );
    auto distinctAttr = mlir::DistinctAttr::create( builder.getUnitAttr() );
    auto compileUnitAttr = mlir::LLVM::DICompileUnitAttr::get(
        ctx, distinctAttr, llvm::dwarf::DW_LANG_C, fileAttr, builder.getStringAttr( COMPILER_NAME ), false,
        mlir::LLVM::DIEmissionKind::Full, mlir::LLVM::DINameTableKind::Default );
    auto ta =
        mlir::LLVM::DIBasicTypeAttr::get( ctx, (unsigned)llvm::dwarf::DW_TAG_base_type, builder.getStringAttr( "int" ),
                                          32, (unsigned)llvm::dwarf::DW_ATE_signed );
    llvm::SmallVector<mlir::LLVM::DITypeAttr, 1> typeArray;
    typeArray.push_back( ta );
    auto subprogramType = mlir::LLVM::DISubroutineTypeAttr::get( ctx, 0, typeArray );
    subprogramAttr = mlir::LLVM::DISubprogramAttr::get(
        ctx, mlir::DistinctAttr::create( builder.getUnitAttr() ), compileUnitAttr, fileAttr,
        builder.getStringAttr( ENTRY_SYMBOL_NAME ), builder.getStringAttr( ENTRY_SYMBOL_NAME ), fileAttr, 1, 1,
        mlir::LLVM::DISubprogramFlags::Definition, subprogramType, llvm::ArrayRef<mlir::LLVM::DINodeAttr>{},
        llvm::ArrayRef<mlir::LLVM::DINodeAttr>{} );
    mainFunc->setAttr( "llvm.debug.subprogram", subprogramAttr );

    // This is the key to ensure that translateModuleToLLVMIR does not strip the location info (instead converts
    // loc's into !dbg's)
    mainFunc->setLoc( builder.getFusedLoc( { module.getLoc() }, subprogramAttr ) );
}

The ‘setLoc’ call above, right near the end is critical.  Without that, the call to mlir::translateModuleToLLVMIR strips out all the loc() references, instead of replacing them with !DILocation.

Finally, one for the variable DI creation:

void constructVariableDI( llvm::StringRef varName, mlir::Type& elemType, mlir::FileLineColLoc loc,
                          unsigned elemSizeInBits, mlir::LLVM::AllocaOp& allocaOp )
{
    auto ctx = builder.getContext();
    allocaOp->setAttr( "bindc_name", builder.getStringAttr( varName ) );

    mlir::LLVM::DILocalVariableAttr diVar;

    if ( elemType.isa<mlir::IntegerType>() )
    {
        const char* typeName{};
        unsigned dwType = llvm::dwarf::DW_ATE_signed;
        unsigned sz = elemSizeInBits;

        switch ( elemSizeInBits )
        {
            case 1:
            {
                typeName = "bool";
                dwType = llvm::dwarf::DW_ATE_boolean;
                sz = 8;
                break;
            }
            case 8:
            {
                typeName = "int8_t";
                break;
            }
            case 16:
            {
                typeName = "int16_t";
                break;
            }
            case 32:
            {
                typeName = "int32_t";
                break;
            }
            case 64:
            {
                typeName = "int64_t";
                break;
            }
            default:
            {
                llvm_unreachable( "Unsupported integer type size" );
            }
        }

        auto diType = mlir::LLVM::DIBasicTypeAttr::get( ctx, llvm::dwarf::DW_TAG_base_type,
                                                        builder.getStringAttr( typeName ), sz, dwType );

        diVar = mlir::LLVM::DILocalVariableAttr::get( ctx, subprogramAttr, builder.getStringAttr( varName ), fileAttr,
                                                      loc.getLine(), 0, sz, diType, mlir::LLVM::DIFlags::Zero );
    }
    else
    {
        const char* typeName{};

        switch ( elemSizeInBits )
        {
            case 32:
            {
                typeName = "float";
                break;
            }
            case 64:
            {
                typeName = "double";
                break;
            }
            default:
            {
                llvm_unreachable( "Unsupported float type size" );
            }
        }

        auto diType =
            mlir::LLVM::DIBasicTypeAttr::get( ctx, llvm::dwarf::DW_TAG_base_type, builder.getStringAttr( typeName ),
                                              elemSizeInBits, llvm::dwarf::DW_ATE_float );

        diVar =
            mlir::LLVM::DILocalVariableAttr::get( ctx, subprogramAttr, builder.getStringAttr( varName ), fileAttr,
                                                  loc.getLine(), 0, elemSizeInBits, diType, mlir::LLVM::DIFlags::Zero );
    }
            
    builder.setInsertionPointAfter( allocaOp );
    builder.create<mlir::LLVM::DbgDeclareOp>( loc, allocaOp, diVar );
        
    symbolToAlloca[varName] = allocaOp;
}

In this code, the call to builder.setInsertionPointAfter is critical.  When the lowering eraseOp takes out the DeclareOp, we need the replacement instructions to all end up in the same place.  Without that, the subsequent AssignOp lowering results in an error like this:

//===-------------------------------------------===//
Legalizing operation : 'toy.assign'(0x2745ab50) {
  "toy.assign"(%3) <{name = "x"}> : (i64) -> ()Fold {
  } -> FAILURE : unable to fold
Pattern : 'toy.assign -> ()' {
Trying to match "toy::AssignOpLowering"
Lowering AssignOp: toy.assign "x", %c5_i64 : i64
name: x
value: ImplicitTypeIDRegistry::lookupOrInsert(mlir::PromotableOpInterface::Trait<mlir::TypeID::get()::Empty>)
...
operand #0 does not dominate this use
mlir-asm-printer: 'builtin.module' failed to verify and will be printed in generic form
%3 = "arith.constant"() <{value = 5 : i64}> : () -> i64
valType: i64
elemType: f64
** Insert  : 'llvm.sitofp'(0x274a6ed0)
ImplicitTypeIDRegistry::lookupOrInsert(mlir::LLVM::detail::StoreOpGenericAdaptorBase::Properties)
** Insert  : 'llvm.store'(0x27437f30)
** Erase   : 'toy.assign'(0x2745ab50)
"toy::AssignOpLowering" result 1
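As an aside on constructVariableDI: the integer and float switch statements only select a name, an encoding and a size, so they could be collapsed into a single table lookup. Here is a rough sketch of that idea, written standalone so that it compiles without LLVM. BasicTypeInfo and lookupBasicType are hypothetical names, and the encoding constants are the DWARF values behind llvm::dwarf::DW_ATE_boolean, DW_ATE_float and DW_ATE_signed:

```cpp
#include <cassert>
#include <cstring>

// DWARF base type encodings (DW_ATE_* values from the DWARF standard).
enum : unsigned
{
    ATE_boolean = 0x02,
    ATE_float = 0x04,
    ATE_signed = 0x05
};

// Hypothetical record describing one debugger-visible basic type.
struct BasicTypeInfo
{
    const char* name;     // nullptr if the type is unsupported
    unsigned encoding;    // DW_ATE_* value
    unsigned sizeInBits;  // size reported in the DIBasicType
};

// Map (integer or float, storage width in bits) to the DI description.
// BOOL is stored as i1, but described to the debugger as an 8 bit bool.
inline BasicTypeInfo lookupBasicType( bool isInteger, unsigned elemSizeInBits )
{
    struct Row
    {
        bool isInteger;
        unsigned bits;
        BasicTypeInfo info;
    };
    static const Row table[] = {
        { true, 1, { "bool", ATE_boolean, 8 } },
        { true, 8, { "int8_t", ATE_signed, 8 } },
        { true, 16, { "int16_t", ATE_signed, 16 } },
        { true, 32, { "int32_t", ATE_signed, 32 } },
        { true, 64, { "int64_t", ATE_signed, 64 } },
        { false, 32, { "float", ATE_float, 32 } },
        { false, 64, { "double", ATE_float, 64 } },
    };

    for ( const Row& row : table )
    {
        if ( row.isInteger == isInteger && row.bits == elemSizeInBits )
        {
            return row.info;
        }
    }

    // Unsupported combination: the caller decides how to report it.
    return BasicTypeInfo{ nullptr, 0, 0 };
}
```

The caller would then build one DIBasicTypeAttr and one DILocalVariableAttr from the returned record, and could report an unsupported type when name is nullptr.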

My DI insertion isn’t fancy like flang’s, but I have only simple types to deal with, and don’t even support functions yet, so my simple way seemed like a reasonable choice. Regardless, getting working debugger support is a nice milestone.