C/C++ development and debugging.

Mixed results with more C++ module experimentation

November 22, 2025 C/C++ development and debugging.

Here is some follow-up from my earlier attempt to use bleeding-edge C++ module support.

First experiment: do I need to import all of the std library?

I tried this:

import iostream;

int main() {

  std::cout << "hello world\n";

  return 0;
}

I didn’t know if gcm.cache/std.gcm (already built from my previous experiment) would supply that export, but I get:

g++ -std=c++23 -fmodules   -c -o broken.o broken.cc
In module imported at broken.cc:1:1:
iostream: error: failed to read compiled module: No such file or directory
iostream: note: compiled module file is ‘gcm.cache/iostream.gcm’
iostream: note: imports must be built before being imported
iostream: fatal error: returning to the gate for a mechanical issue
compilation terminated.
make: *** [<builtin>: broken.o] Error 1

so it appears the answer is no. Also, /usr/include/c++/15/bits/ only appears to have a std.cc, and no iostream.cc:

> find  /usr/include/c++/15/bits/ -name "*.cc"
/usr/include/c++/15/bits/std.cc
/usr/include/c++/15/bits/std.compat.cc

so it appears, for the time being, g++-15 is all or nothing with respect to std imports. However, when using precompiled headers, you usually want a big pre-generated pch that has just about everything, and this is similar, so maybe that’s not so bad (other than namespace pollution.)

Second experiment. Adding a non-std import/export.

I moved a variant of Stroustrup’s collect_lines function into a separate module, like so:

// stuff.cc
export module stuff;
import std;

namespace stuff {

void helper() { std::cout << "call to a private function\n"; }

export
std::vector<std::string> collect_lines(std::istream &is) {

  helper();

  std::unordered_set<std::string> s;
  for (std::string line; std::getline(is, line);)
    s.insert(line);

  //return std::vector<std::string>(s.begin(), s.end());
  return std::vector{std::from_range, s};
}
} // namespace stuff

It turns out that I needed the export keyword on ‘module stuff’, as well as for any function that I wanted to export. Without that I get:

> make 
g++ -std=c++23 -fmodules -c /usr/include/c++/15/bits/std.cc
g++ -std=c++23 -fmodules   -c -o stuff.o stuff.cc
g++ -std=c++23 -fmodules   -c -o try.o try.cc
try.cc: In function ‘int main()’:
try.cc:8:12: error: ‘stuff’ has not been declared
    8 |   auto v = stuff::collect_lines(std::cin);
      |            ^~~~~
make: *** [<builtin>: try.o] Error 1

The compile error is not very good. It doesn’t complain that collect_lines is not exported, but instead complains that stuff, the namespace itself, is not declared.

I can export the namespace, which is the naive resolution to the compiler diagnostic presented, for example:

export module stuff;
import std;

export namespace stuff {

void helper() { std::cout << "call to a private function\n"; }

//export
std::vector<std::string> collect_lines(std::istream &is) {

  helper();

  std::unordered_set<std::string> s;
  for (std::string line; std::getline(is, line);)
    s.insert(line);

  //return std::vector<std::string>(s.begin(), s.end());
  return std::vector{std::from_range, s};
}
} // namespace stuff

However, that means that the calling code can now call stuff::helper, which was not my intent.

There also does not appear to be any good way to enumerate exports available in the gcm.cache. nm output for the symbol is not any different with or without the export keyword:

> nm stuff.o | grep collect_lines | c++filt
0000000000000028 T stuff::collect_lines@stuff[abi:cxx11](std::basic_istream<char, std::char_traits<char> >&)

This is a critically important tooling failure if modules are going to be used in production. Anybody who has programmed with Windows DLLs, AIX shared objects, or Linux shared objects with symbol versioning knows how hellish the resulting linker error chase is when an export is missing from such an enumeration. Hopefully, there’s some external tool that can enumerate gcm.cache exports. Neither Grok nor ChatGPT was able to suggest a tool for this sort of task. The best answer was ChatGPT’s recommendation to dump the module state with -fdump-lang-module:

> g++ -std=c++23 -fmodules -save-temps -fdump-lang-module   -c -o stuff.o stuff.cc 
fedoravm:/home/peeter/physicsplay/programming/module> ls
broken.cc  gcm.cache  makefile  makefile.clang  std.o  stuff.cc  stuff.cc.002l.module  stuff.ii  stuff.o  stuff.s  try.cc

but that *.module output doesn’t have anything that obviously distinguishes exported vs. non-exported symbols:

> grep -2e stuff::helper -e stuff::collect_lines *.module
Wrote section:28 named-by:'::std::vector<::std::__cxx11::basic_string@std:1<char,::std::char_traits@std:1,::std::allocator@std:1>,::std::allocator<::std::__cxx11::basic_string@std:1<char,::std::char_traits@std:1,::std::allocator@std:1>>>'
Writing section:29 2 depsets
 Depset:0 decl entity:403 function_decl:'::stuff::collect_lines'
 Wrote declaration entity:403 function_decl:'::stuff::collect_lines'
 Depset:1 binding namespace_decl:'::stuff::collect_lines'
Wrote section:29 named-by:'::stuff::collect_lines'
Writing section:30 2 depsets
 Depset:0 decl entity:404 function_decl:'::stuff::helper'
 Wrote declaration entity:404 function_decl:'::stuff::helper'
 Depset:1 binding namespace_decl:'::stuff::helper'
Wrote section:30 named-by:'::stuff::helper'
Writing section:31 4 depsets
 Depset:0 specialization entity:405 type_decl:'::std::__replace_first_arg<::std::allocator<::std::__detail::_Hash_node<::std::__cxx11::basic_string@std:1<char,::std::char_traits@std:1,::std::allocator@std:1>,0x1>>,::std::__cxx11::basic_string@std:1<char,::std::char_traits@std:1,::std::allocator@std:1>>'
--
Writing binding table
 Bindings '::std::operator==' section:8
 Bindings '::stuff::collect_lines' section:29
 Bindings '::stuff::helper' section:30
 Bindings '::std::swap' section:35
Writing pending-entities

ChatGPT summarizes this as follows:

“This is confirmed by overwhelming evidence:

  • GCC bug 113590
  • GCC mailing list discussion July 2024
  • Confirmation from module implementers: “GCC BMIs do not currently record export flags.”

This is intentional (for now): GCC’s binary module interface tracks reachable declarations, not exported ones.”

Trying clang

After considerable experimentation, and help from both Grok and ChatGPT, I was finally able to get a working compile and link sequence using the clang toolchain:

fedoravm:/home/peeter/physicsplay/programming/module> make -f *.clang clean
rm -f *.o *.pcm try
fedoravm:/home/peeter/physicsplay/programming/module> make -f *.clang 
clang++ -std=c++23 -stdlib=libc++ -Wall -Wextra -Wno-reserved-module-identifier --precompile /usr/share/libc++/v1/std.cppm -o std.pcm
clang++ -std=c++23 -stdlib=libc++ -Wall -Wextra -fmodule-file=std=std.pcm --precompile stuff.cppm -o stuff.pcm
clang++ -std=c++23 -stdlib=libc++ -Wall -Wextra -fmodule-file=std=std.pcm -fmodule-file=stuff=stuff.pcm -c try.cc -o try.o
clang++ -std=c++23 -stdlib=libc++ -Wall -Wextra -fmodule-file=std=std.pcm -c stuff.cc -o stuff.o
clang++ -std=c++23 -stdlib=libc++ -Wall -Wextra -fmodule-file=std=std.pcm -fmodule-file=stuff=stuff.pcm try.o stuff.o -o try

Unlike g++, I have to build both the module and the object code for stuff.cc (which I facilitated with a stuff.cppm -> stuff.cc symlink). On the other hand, I didn’t need a std.o (for reasons that I don’t understand.)

Dumping the clang-AST appears to be the closest that we can get to enumerating exports. Example:

> clang++ -std=c++23 -stdlib=libc++ -Wall -Wextra -fmodule-file=std=std.pcm  -Xclang -ast-dump -fsyntax-only stuff.cc | less -R

This shows output like:

[Screenshot of the -ast-dump output, not reproduced here.]

This is not terribly user friendly, and not something that a typical clang front end user would attempt to do.

This hints that the “way” to dump exports would be to write a clang-AST visitor that dumps all the ExportDecl’s that are encountered (or a complex grep script that attempts to mine the -ast-dump output).
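As a rough sketch of the grep-style approach, the following scans the text of an -ast-dump and keeps the direct children of each ExportDecl node. It rests on assumptions that are not verified across clang versions: the dump is uncolored text, each nesting level moves the node name two columns to the right behind a prefix made of '|', '`', '-' and spaces, and exported declarations appear as direct children of an ExportDecl. The names nodeColumn and exportedDeclLines are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Column where the node kind starts, i.e. the first character that is not
// part of the tree-drawing prefix. Returns npos for blank lines.
std::size_t nodeColumn(const std::string& line) {
    return line.find_first_not_of(" |`-");
}

// Returns the dump lines (prefix stripped) that are direct children of an
// ExportDecl node. Assumes two columns of indentation per nesting level.
std::vector<std::string> exportedDeclLines(const std::string& dumpText) {
    std::vector<std::string> exported;
    std::istringstream dump(dumpText);
    bool inExport = false;
    std::size_t exportCol = 0;

    for (std::string line; std::getline(dump, line);) {
        const std::size_t col = nodeColumn(line);
        if (col == std::string::npos)
            continue;
        assert(col < line.size());

        // A node at the same or a shallower depth ends the ExportDecl subtree.
        if (inExport && col <= exportCol)
            inExport = false;

        if (line.compare(col, 10, "ExportDecl") == 0) {
            inExport = true;
            exportCol = col;
            continue;
        }

        if (inExport && col == exportCol + 2)
            exported.push_back(line.substr(col));
    }
    return exported;
}
```

Feeding it the saved dump text should list one line per exported declaration, which is still a long way from a proper export enumeration tool, but is easier to read than the raw AST.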

C++ sample code with modules!

November 21, 2025 C/C++ development and debugging.

A coworker shared the Stroustrup paper titled “21st Century C++”. I was reading a PDF version, but a search turns up an online version too.

This paper included use of C++ with modules. I’ve had my eyes on those since working on DB2, which suffered from include file hell (DB2’s include file hierarchy was a fully connected graph). However, until today, I didn’t realize that there were non-experimental compilers that included module support.

Here’s a sample program that uses modules (Stroustrup’s, with a main added)

import std;

using namespace std;

vector<string> collect_lines(istream &is) {
  unordered_set<string> s;
  for (string line; getline(is, line);)
    s.insert(line);

  return vector{from_range, s};
}

int main() {
  auto v = collect_lines(cin);
  for (const auto &i : v) {
    cout << format("{}\n", i);
  }

  return 0;
}

A first attempt to compile this, even with -std=c++23 bombs:

fedoravm:/home/peeter/physicsplay/programming/module> g++ -std=c++23 -o try try.cc 2>&1 | head -5
try.cc:1:1: error: ‘import’ does not name a type
    1 | import std;
      | ^~~~~~
try.cc:1:1: note: C++20 ‘import’ only available with ‘-fmodules’, which is not yet enabled with ‘-std=c++20’
try.cc:5:8: error: ‘string’ was not declared in this scope

but we get a hint about what is needed (-fmodules). However, that’s not enough by itself:

fedoravm:/home/peeter/physicsplay/programming/module> g++ -std=c++23 -fmodules -o try try.cc 2>&1 | head -5
In module imported at try.cc:1:1:
std: error: failed to read compiled module: No such file or directory
std: note: compiled module file is ‘gcm.cache/std.gcm’
std: note: imports must be built before being imported
std: fatal error: returning to the gate for a mechanical issue

Here’s the magic sequence that we need, which includes a build of the C++ std export too:

g++ -std=c++23 -fmodules -c /usr/include/c++/15/bits/std.cc
g++ -std=c++23 -fmodules   -c -o try.o try.cc
g++ -std=c++23 -fmodules -o try std.o try.o  

On this VM, I have g++-15 installed, which is sufficient to build and run this little program, modules and all.
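For comparison, a header-based variant of collect_lines can be built without -fmodules or a prebuilt std.gcm. This is only a sketch: it swaps import std for #include, uses the iterator-pair vector constructor in place of from_range (which needs C++23 library support), and adds collect_lines_from_text, a hypothetical helper that exists only for this illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Collect the distinct lines of a stream, in unspecified order.
std::vector<std::string> collect_lines(std::istream& is) {
    std::unordered_set<std::string> s;
    for (std::string line; std::getline(is, line);)
        s.insert(line);

    // Pre-C++23 spelling of: return std::vector{std::from_range, s};
    std::vector<std::string> v(s.begin(), s.end());
    assert(v.size() == s.size());
    return v;
}

// Hypothetical convenience wrapper, so the function can be tried on a string.
std::vector<std::string> collect_lines_from_text(const std::string& text) {
    std::istringstream is(text);
    return collect_lines(is);
}
```

Because the lines pass through an unordered_set, duplicates are dropped and the output order is not the input order.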

An update to floatexplorer.

August 4, 2025 C/C++ development and debugging.

The IEEE 32-bit float explorer that I wrote about previously, has now been extended from just float (e8m23) to include floating point support for a number of other representations, including additional CPU floating point types:

  • 64-bit IEEE (double: e11m52),
  • Intel “80-bit” (long double: e15m64),
  • 128-bit IEEE (long double on ARM Linux: e15m112).   This is also the GCC quadmath representation,

and GPU floating point types:

  • e5m2
  • e4m3
  • fp16 (e5m10)
  • bf16 (e8m7)

The CUDA API is used for floating point conversions of the GPU floating point types (if available), and a manual converter has been implemented for use when CUDA is not available.

The Intel long double format is currently only supported when building on x64.  Unlike all the other types, this one stores the leading mantissa bit explicitly, instead of implying it for normal values.
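To make the explicit leading bit concrete, here is a minimal sketch that decodes the two pieces of an 80-bit value (the 16-bit sign and exponent word, and the 64-bit significand) into a double. It is not floatexplorer's code: decodeX87 is a made-up name, only normal values are handled, and converting the significand to double rounds it to 53 bits.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// x87 extended precision: 1 sign bit, 15 exponent bits (bias 16383), and a
// 64-bit significand whose top bit is the explicit integer bit.
double decodeX87(std::uint16_t signAndExponent, std::uint64_t significand) {
    const bool negative = (signAndExponent & 0x8000u) != 0;
    const int biased = signAndExponent & 0x7FFF;
    assert(biased != 0 && biased != 0x7FFF); // normal values only in this sketch

    // significand / 2^63 is in [1, 2) when the explicit integer bit is set,
    // so no implied 1 is added here, unlike float or double.
    const double magnitude =
        std::ldexp(static_cast<double>(significand), biased - 16383 - 63);
    return negative ? -magnitude : magnitude;
}
```

With the hex value 4000C000000000000000, the first word is 0x4000 and the significand is 0xC000000000000000, which decodes to 1.5 x 2^1 = 3.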

I have not implemented mainframe HEXFLOAT support.

Here is some sample output:

type: bf16
value:    3
hex:      4040
bits:     0100000001000000
sign:     0
exponent:  10000000                        (127 +1)
mantissa:          1000000
number:          1.1000000 x 2^(1)

type: fp16
value:    3
hex:      4200
bits:     0100001000000000
sign:     0
exponent:  10000                        (15 +1)
mantissa:       1000000000
number:       1.1000000000 x 2^(1)

type: e4m3
value:    3
hex:      44
bits:     01000100
sign:     0
exponent:  1000                        (7 +1)
mantissa:      100
number:      1.100 x 2^(1)

type: e5m2
value:    3
hex:      42
bits:     01000010
sign:     0
exponent:  10000                        (15 +1)
mantissa:       10
number:       1.10 x 2^(1)

type: float
value:    3
hex:      40400000
bits:     01000000010000000000000000000000
sign:     0
exponent:  10000000                        (127 +1)
mantissa:          10000000000000000000000
number:          1.10000000000000000000000 x 2^(1)

type: double
value:    3
hex:      4008000000000000
bits:     0100000000001000000000000000000000000000000000000000000000000000
sign:     0
exponent:  10000000000                                                     (1023 +1)
mantissa:             1000000000000000000000000000000000000000000000000000
number:             1.1000000000000000000000000000000000000000000000000000 x 2^(1)

type: long double
value:    3
hex:      4000C000000000000000
bits:     01000000000000001100000000000000000000000000000000000000000000000000000000000000
sign:     0
exponent:  100000000000000                                                     (16383 +1)
mantissa:                 1100000000000000000000000000000000000000000000000000000000000000
number:                 0.1100000000000000000000000000000000000000000000000000000000000000 x 2^(2)

type: float128
value:    3.000000
hex:      40008000000000000000000000000000
bits:     01000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
sign:     0
exponent:  100000000000000                                                     (16383 +1)
mantissa:                 1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
number:                 1.1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 x 2^(1)
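The sign/exponent/mantissa split shown in the sample output can be reproduced for the plain float case with a few shifts and masks. The following is a sketch, not the floatexplorer implementation: the names are invented for this example, and the bf16 conversion simply truncates the low 16 bits, which is an assumption (a real converter may round).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Float32Fields {
    std::uint32_t sign;     // 1 bit
    std::uint32_t exponent; // 8 bits, biased by 127
    std::uint32_t mantissa; // 23 bits, leading 1 implied for normal values
};

// Reinterpret the float's storage as a 32-bit integer.
std::uint32_t float32Bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

Float32Fields decodeFloat32(float f) {
    const std::uint32_t bits = float32Bits(f);
    return {bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}

// bf16 (e8m7) has the same layout as the top half of a float.
// Truncation only: no rounding is attempted in this sketch.
std::uint16_t truncateToBf16(float f) {
    return static_cast<std::uint16_t>(float32Bits(f) >> 16);
}

// Checks against the "value: 3" entries in the sample output.
void checkThree() {
    const Float32Fields f = decodeFloat32(3.0f);
    assert(float32Bits(3.0f) == 0x40400000u);
    assert(f.sign == 0 && f.exponent == 127 + 1 && f.mantissa == 0x400000u);
    assert(truncateToBf16(3.0f) == 0x4040u);
}
```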

Have added boolean operations to my toy MLIR compiler

May 30, 2025 C/C++ development and debugging.

The git repo for the project now has a way to encode predicates, which I figured was a good first step towards adding some useful control flow (IF+LOOPS).  Specifically, the toy language/compiler now supports the following operators:

  • <
  • <=
  • >
  • >=
  • EQ
  • NE

This list works for any floating point or integer type (including BOOL, which is like “INT1”).  I also added AND, OR, XOR (for integer types, including BOOL.)  The grammar has a NOT operator, but it’s not implemented in the parser yet.

Here’s a sample program:

BOOL b;
BOOL i1;
INT16 l16;
i1 = TRUE;
l16 = -100;
b = i1 < l16;
PRINT b;
b = i1 > l16;
PRINT b;

My MLIR is:

module {
  toy.program {
    toy.declare "b" : i1
    toy.declare "i1" : i1
    toy.declare "l16" : i16
    %true = arith.constant true
    toy.assign "i1", %true : i1
    %c-100_i64 = arith.constant -100 : i64
    toy.assign "l16", %c-100_i64 : i64
    %0 = toy.load "i1" : i1
    %1 = toy.load "l16" : i16
    %2 = "toy.less"(%0, %1) : (i1, i16) -> i1
    toy.assign "b", %2 : i1
    %3 = toy.load "b" : i1
    toy.print %3 : i1
    %4 = toy.load "i1" : i1
    %5 = toy.load "l16" : i16
    %6 = "toy.less"(%5, %4) : (i16, i1) -> i1
    toy.assign "b", %6 : i1
    %7 = toy.load "b" : i1
    toy.print %7 : i1
    toy.exit
  }
}

Here’s the LLVM-IR after lowering:

declare void @__toy_print_f64(double)

declare void @__toy_print_i64(i64)

define i32 @main() !dbg !4 {
  %1 = alloca i1, i64 1, align 1, !dbg !8
    #dbg_declare(ptr %1, !9, !DIExpression(), !8)
  %2 = alloca i1, i64 1, align 1, !dbg !11
    #dbg_declare(ptr %2, !12, !DIExpression(), !11)
  %3 = alloca i16, i64 1, align 2, !dbg !13
    #dbg_declare(ptr %3, !14, !DIExpression(), !13)
  store i1 true, ptr %2, align 1, !dbg !16
  store i16 -100, ptr %3, align 2, !dbg !17
  %4 = load i1, ptr %2, align 1, !dbg !18
  %5 = load i16, ptr %3, align 2, !dbg !18
  %6 = zext i1 %4 to i16, !dbg !18
  %7 = icmp slt i16 %6, %5, !dbg !18
  store i1 %7, ptr %1, align 1, !dbg !18
  %8 = load i1, ptr %1, align 1, !dbg !19
  %9 = zext i1 %8 to i64, !dbg !19
  call void @__toy_print_i64(i64 %9), !dbg !19
  %10 = load i1, ptr %2, align 1, !dbg !20
  %11 = load i16, ptr %3, align 2, !dbg !20
  %12 = zext i1 %10 to i16, !dbg !20
  %13 = icmp slt i16 %11, %12, !dbg !20
  store i1 %13, ptr %1, align 1, !dbg !20
  %14 = load i1, ptr %1, align 1, !dbg !21
  %15 = zext i1 %14 to i64, !dbg !21
  call void @__toy_print_i64(i64 %15), !dbg !21
  ret i32 0, !dbg !8
}
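In C++ terms, the lowering above treats the BOOL operand as an unsigned 0/1 value widened to INT16 (the zext), and then does a signed comparison (the icmp slt). A minimal sketch of that reading of the IR follows; lessBoolInt16 and lessInt16Bool are made-up names, not functions from the compiler.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors: %6 = zext i1 %4 to i16 ; %7 = icmp slt i16 %6, %5
bool lessBoolInt16(bool lhs, std::int16_t rhs) {
    const std::int16_t widened = lhs ? 1 : 0; // zero extension: TRUE becomes 1, never -1
    return widened < rhs;
}

// Mirrors: %12 = zext i1 %10 to i16 ; %13 = icmp slt i16 %11, %12
bool lessInt16Bool(std::int16_t lhs, bool rhs) {
    const std::int16_t widened = rhs ? 1 : 0;
    return lhs < widened;
}

// The sample program: i1 = TRUE, l16 = -100.
void sampleProgram() {
    assert(!lessBoolInt16(true, -100)); // b = i1 < l16 is false
    assert(lessInt16Bool(-100, true));  // b = i1 > l16 is lowered as l16 < i1, which is true
}
```

Had the widening been a sign extension, TRUE would become -1 and both comparisons against -100 would still come out the same here, but a comparison against 0 would flip, so the choice of zext matters.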

I’m going to want to try to refactor the type conversion logic, as what I have now in lowering is pretty clunky.

Debugging now works in my toy MLIR compiler!

May 25, 2025 C/C++ development and debugging.

I’ve now got line debugging (break, next, continue) and variable display (and modification) working for my toy language and compiler.

Here’s an example program:

BOOL i1;
i1 = TRUE;
PRINT i1;

INT8 i8;
i8 = 10;
PRINT i8;

INT16 i16;
i16 = 1000;
PRINT i16;

INT32 i32;
i32 = 100000;
PRINT i32;

INT64 i64;
i64 = 100000000000;
PRINT i64;

FLOAT32 f32;
f32 = 1.1;
PRINT f32;

FLOAT64 f64;
f64 = 2.2E-1;
PRINT f64;

It doesn’t do anything interesting, other than demonstrate that I got the DILocalVariableAttr declarations right for each supported type. Here’s the MLIR for this program:

"builtin.module"() ({
  "toy.program"() ({
    "toy.declare"() <{name = "i1", type = i1}> : () -> () loc(#loc)
    %0 = "arith.constant"() <{value = true}> : () -> i1 loc(#loc1)
    "toy.assign"(%0) <{name = "i1"}> : (i1) -> () loc(#loc1)
    %1 = "toy.load"() <{name = "i1"}> : () -> i1 loc(#loc2)
    "toy.print"(%1) : (i1) -> () loc(#loc2)
    "toy.declare"() <{name = "i8", type = i8}> : () -> () loc(#loc3)
    %2 = "arith.constant"() <{value = 10 : i64}> : () -> i64 loc(#loc4)
    "toy.assign"(%2) <{name = "i8"}> : (i64) -> () loc(#loc4)
    %3 = "toy.load"() <{name = "i8"}> : () -> i8 loc(#loc5)
    "toy.print"(%3) : (i8) -> () loc(#loc5)
    "toy.declare"() <{name = "i16", type = i16}> : () -> () loc(#loc6)
    %4 = "arith.constant"() <{value = 1000 : i64}> : () -> i64 loc(#loc7)
    "toy.assign"(%4) <{name = "i16"}> : (i64) -> () loc(#loc7)
    %5 = "toy.load"() <{name = "i16"}> : () -> i16 loc(#loc8)
    "toy.print"(%5) : (i16) -> () loc(#loc8)
    "toy.declare"() <{name = "i32", type = i32}> : () -> () loc(#loc9)
    %6 = "arith.constant"() <{value = 100000 : i64}> : () -> i64 loc(#loc10)
    "toy.assign"(%6) <{name = "i32"}> : (i64) -> () loc(#loc10)
    %7 = "toy.load"() <{name = "i32"}> : () -> i32 loc(#loc11)
    "toy.print"(%7) : (i32) -> () loc(#loc11)
    "toy.declare"() <{name = "i64", type = i64}> : () -> () loc(#loc12)
    %8 = "arith.constant"() <{value = 100000000000 : i64}> : () -> i64 loc(#loc13)
    "toy.assign"(%8) <{name = "i64"}> : (i64) -> () loc(#loc13)
    %9 = "toy.load"() <{name = "i64"}> : () -> i64 loc(#loc14)
    "toy.print"(%9) : (i64) -> () loc(#loc14)
    "toy.declare"() <{name = "f32", type = f32}> : () -> () loc(#loc15)
    %10 = "arith.constant"() <{value = 1.100000e+00 : f64}> : () -> f64 loc(#loc16)
    "toy.assign"(%10) <{name = "f32"}> : (f64) -> () loc(#loc16)
    %11 = "toy.load"() <{name = "f32"}> : () -> f32 loc(#loc17)
    "toy.print"(%11) : (f32) -> () loc(#loc17)
    "toy.declare"() <{name = "f64", type = f64}> : () -> () loc(#loc18)
    %12 = "arith.constant"() <{value = 2.200000e-01 : f64}> : () -> f64 loc(#loc19)
    "toy.assign"(%12) <{name = "f64"}> : (f64) -> () loc(#loc19)
    %13 = "toy.load"() <{name = "f64"}> : () -> f64 loc(#loc20)
    "toy.print"(%13) : (f64) -> () loc(#loc20)
    "toy.exit"() : () -> () loc(#loc)
  }) : () -> () loc(#loc)
}) : () -> () loc(#loc)
#loc = loc("types.toy":1:1)
#loc1 = loc("types.toy":2:6)
#loc2 = loc("types.toy":3:1)
#loc3 = loc("types.toy":5:1)
#loc4 = loc("types.toy":6:6)
#loc5 = loc("types.toy":7:1)
#loc6 = loc("types.toy":9:1)
#loc7 = loc("types.toy":10:7)
#loc8 = loc("types.toy":11:1)
#loc9 = loc("types.toy":13:1)
#loc10 = loc("types.toy":14:7)
#loc11 = loc("types.toy":15:1)
#loc12 = loc("types.toy":17:1)
#loc13 = loc("types.toy":18:7)
#loc14 = loc("types.toy":19:1)
#loc15 = loc("types.toy":21:1)
#loc16 = loc("types.toy":22:7)
#loc17 = loc("types.toy":23:1)
#loc18 = loc("types.toy":25:1)
#loc19 = loc("types.toy":26:7)
#loc20 = loc("types.toy":27:1)

and the generated LLVM-IR

; ModuleID = 'types.toy'
source_filename = "types.toy"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

declare void @__toy_print_f64(double)

declare void @__toy_print_i64(i64)

define i32 @main() !dbg !4 {
  %1 = alloca i1, i64 1, align 1, !dbg !8
    #dbg_declare(ptr %1, !9, !DIExpression(), !8)
  store i1 true, ptr %1, align 1, !dbg !11
  %2 = load i1, ptr %1, align 1, !dbg !12
  %3 = zext i1 %2 to i64, !dbg !12
  call void @__toy_print_i64(i64 %3), !dbg !12
  %4 = alloca i8, i64 1, align 1, !dbg !13
    #dbg_declare(ptr %4, !14, !DIExpression(), !13)
  store i8 10, ptr %4, align 1, !dbg !16
  %5 = load i8, ptr %4, align 1, !dbg !17
  %6 = sext i8 %5 to i64, !dbg !17
  call void @__toy_print_i64(i64 %6), !dbg !17
  %7 = alloca i16, i64 1, align 2, !dbg !18
    #dbg_declare(ptr %7, !19, !DIExpression(), !18)
  store i16 1000, ptr %7, align 2, !dbg !21
  %8 = load i16, ptr %7, align 2, !dbg !22
  %9 = sext i16 %8 to i64, !dbg !22
  call void @__toy_print_i64(i64 %9), !dbg !22
  %10 = alloca i32, i64 1, align 4, !dbg !23
    #dbg_declare(ptr %10, !24, !DIExpression(), !23)
  store i32 100000, ptr %10, align 4, !dbg !26
  %11 = load i32, ptr %10, align 4, !dbg !27
  %12 = sext i32 %11 to i64, !dbg !27
  call void @__toy_print_i64(i64 %12), !dbg !27
  %13 = alloca i64, i64 1, align 8, !dbg !28
    #dbg_declare(ptr %13, !29, !DIExpression(), !28)
  store i64 100000000000, ptr %13, align 8, !dbg !31
  %14 = load i64, ptr %13, align 8, !dbg !32
  call void @__toy_print_i64(i64 %14), !dbg !32
  %15 = alloca float, i64 1, align 4, !dbg !33
    #dbg_declare(ptr %15, !34, !DIExpression(), !33)
  store float 0x3FF19999A0000000, ptr %15, align 4, !dbg !36
  %16 = load float, ptr %15, align 4, !dbg !37
  %17 = fpext float %16 to double, !dbg !37
  call void @__toy_print_f64(double %17), !dbg !37
  %18 = alloca double, i64 1, align 8, !dbg !38
    #dbg_declare(ptr %18, !39, !DIExpression(), !38)
  store double 2.200000e-01, ptr %18, align 8, !dbg !41
  %19 = load double, ptr %18, align 8, !dbg !42
  call void @__toy_print_f64(double %19), !dbg !42
  ret i32 0, !dbg !8
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare void @llvm.dbg.declare(metadata, metadata, metadata) #0

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

!llvm.module.flags = !{!0}
!llvm.dbg.cu = !{!1}
!llvm.ident = !{!3}

!0 = !{i32 2, !"Debug Info Version", i32 3}
!1 = distinct !DICompileUnit(language: DW_LANG_C, file: !2, producer: "toycalculator", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug)
!2 = !DIFile(filename: "types.toy", directory: ".")
!3 = !{!"toycalculator V2"}
!4 = distinct !DISubprogram(name: "main", linkageName: "main", scope: !2, file: !2, line: 1, type: !5, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !1)
!5 = !DISubroutineType(types: !6)
!6 = !{!7}
!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!8 = !DILocation(line: 1, column: 1, scope: !4)
!9 = !DILocalVariable(name: "i1", scope: !4, file: !2, line: 1, type: !10, align: 8)
!10 = !DIBasicType(name: "bool", size: 8, encoding: DW_ATE_boolean)
!11 = !DILocation(line: 2, column: 6, scope: !4)
!12 = !DILocation(line: 3, column: 1, scope: !4)
!13 = !DILocation(line: 5, column: 1, scope: !4)
!14 = !DILocalVariable(name: "i8", scope: !4, file: !2, line: 5, type: !15, align: 8)
!15 = !DIBasicType(name: "int8_t", size: 8, encoding: DW_ATE_signed)
!16 = !DILocation(line: 6, column: 6, scope: !4)
!17 = !DILocation(line: 7, column: 1, scope: !4)
!18 = !DILocation(line: 9, column: 1, scope: !4)
!19 = !DILocalVariable(name: "i16", scope: !4, file: !2, line: 9, type: !20, align: 16)
!20 = !DIBasicType(name: "int16_t", size: 16, encoding: DW_ATE_signed)
!21 = !DILocation(line: 10, column: 7, scope: !4)
!22 = !DILocation(line: 11, column: 1, scope: !4)
!23 = !DILocation(line: 13, column: 1, scope: !4)
!24 = !DILocalVariable(name: "i32", scope: !4, file: !2, line: 13, type: !25, align: 32)
!25 = !DIBasicType(name: "int32_t", size: 32, encoding: DW_ATE_signed)
!26 = !DILocation(line: 14, column: 7, scope: !4)
!27 = !DILocation(line: 15, column: 1, scope: !4)
!28 = !DILocation(line: 17, column: 1, scope: !4)
!29 = !DILocalVariable(name: "i64", scope: !4, file: !2, line: 17, type: !30, align: 64)
!30 = !DIBasicType(name: "int64_t", size: 64, encoding: DW_ATE_signed)
!31 = !DILocation(line: 18, column: 7, scope: !4)
!32 = !DILocation(line: 19, column: 1, scope: !4)
!33 = !DILocation(line: 21, column: 1, scope: !4)
!34 = !DILocalVariable(name: "f32", scope: !4, file: !2, line: 21, type: !35, align: 32)
!35 = !DIBasicType(name: "float", size: 32, encoding: DW_ATE_float)
!36 = !DILocation(line: 22, column: 7, scope: !4)
!37 = !DILocation(line: 23, column: 1, scope: !4)
!38 = !DILocation(line: 25, column: 1, scope: !4)
!39 = !DILocalVariable(name: "f64", scope: !4, file: !2, line: 25, type: !40, align: 64)
!40 = !DIBasicType(name: "double", size: 64, encoding: DW_ATE_float)
!41 = !DILocation(line: 26, column: 7, scope: !4)
!42 = !DILocation(line: 27, column: 1, scope: !4)

Interesting bits include:

source_filename = "types.toy"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

!llvm.module.flags = !{!0}
!llvm.dbg.cu = !{!1}
!llvm.ident = !{!3}

!0 = !{i32 2, !"Debug Info Version", i32 3}
!1 = distinct !DICompileUnit(language: DW_LANG_C, file: !2, producer: "toycalculator", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug)
!2 = !DIFile(filename: "types.toy", directory: ".")
!3 = !{!"toycalculator V2"}
!4 = distinct !DISubprogram(name: "main", linkageName: "main", scope: !2, file: !2, line: 1, type: !5, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !1)
!5 = !DISubroutineType(types: !6)
!6 = !{!7}

Unlike flang’s AddDebugInfoPass DI instrumentation pass, I didn’t try to do anything fancy, instead just implemented a couple of helper functions.

One for the target triple:

void setModuleAttrs()
{
    std::string targetTriple = llvm::sys::getDefaultTargetTriple();
    llvm::Triple triple( targetTriple );
    assert( triple.isArch64Bit() && triple.isOSLinux() );

    std::string error;
    const llvm::Target* target = llvm::TargetRegistry::lookupTarget( targetTriple, error );
    assert( target );
    llvm::TargetOptions options;
    auto targetMachine = std::unique_ptr<llvm::TargetMachine>( target->createTargetMachine(
        targetTriple, "generic", "", options, std::optional<llvm::Reloc::Model>( llvm::Reloc::PIC_ ) ) );
    assert( targetMachine );
    std::string dataLayoutStr = targetMachine->createDataLayout().getStringRepresentation();

    module->setAttr( "llvm.ident", builder.getStringAttr( COMPILER_NAME COMPILER_VERSION ) );
    module->setAttr( "llvm.data_layout", builder.getStringAttr( dataLayoutStr ) );
    module->setAttr( "llvm.target_triple", builder.getStringAttr( targetTriple ) );
}

one for the DICompileUnitAttr, and DISubprogramAttr:

void createMain()
{
    auto ctx = builder.getContext();
    auto mainFuncType = LLVM::LLVMFunctionType::get( builder.getI32Type(), {}, false );
    mainFunc =
        builder.create<LLVM::LLVMFuncOp>( module.getLoc(), ENTRY_SYMBOL_NAME, mainFuncType, LLVM::Linkage::External );

    // Construct module level DI state:
    fileAttr = mlir::LLVM::DIFileAttr::get( ctx, driverState.filename, "." );
    auto distinctAttr = mlir::DistinctAttr::create( builder.getUnitAttr() );
    auto compileUnitAttr = mlir::LLVM::DICompileUnitAttr::get(
        ctx, distinctAttr, llvm::dwarf::DW_LANG_C, fileAttr, builder.getStringAttr( COMPILER_NAME ), false,
        mlir::LLVM::DIEmissionKind::Full, mlir::LLVM::DINameTableKind::Default );
    auto ta =
        mlir::LLVM::DIBasicTypeAttr::get( ctx, (unsigned)llvm::dwarf::DW_TAG_base_type, builder.getStringAttr( "int" ),
                                          32, (unsigned)llvm::dwarf::DW_ATE_signed );
    llvm::SmallVector<mlir::LLVM::DITypeAttr, 1> typeArray;
    typeArray.push_back( ta );
    auto subprogramType = mlir::LLVM::DISubroutineTypeAttr::get( ctx, 0, typeArray );
    subprogramAttr = mlir::LLVM::DISubprogramAttr::get(
        ctx, mlir::DistinctAttr::create( builder.getUnitAttr() ), compileUnitAttr, fileAttr,
        builder.getStringAttr( ENTRY_SYMBOL_NAME ), builder.getStringAttr( ENTRY_SYMBOL_NAME ), fileAttr, 1, 1,
        mlir::LLVM::DISubprogramFlags::Definition, subprogramType, llvm::ArrayRef<mlir::LLVM::DINodeAttr>{},
        llvm::ArrayRef<mlir::LLVM::DINodeAttr>{} );
    mainFunc->setAttr( "llvm.debug.subprogram", subprogramAttr );

    // This is the key to ensure that translateModuleToLLVMIR does not strip the location info (instead converts
    // loc's into !dbg's)
    mainFunc->setLoc( builder.getFusedLoc( { module.getLoc() }, subprogramAttr ) );
}

The ‘setLoc’ call above, right near the end is critical.  Without that, the call to mlir::translateModuleToLLVMIR strips out all the loc() references, instead of replacing them with !DILocation.

Finally, one for the variable DI creation:

void constructVariableDI( llvm::StringRef varName, mlir::Type& elemType, mlir::FileLineColLoc loc,
                          unsigned elemSizeInBits, mlir::LLVM::AllocaOp& allocaOp )
{
    auto ctx = builder.getContext();
    allocaOp->setAttr( "bindc_name", builder.getStringAttr( varName ) );

    mlir::LLVM::DILocalVariableAttr diVar;

    if ( elemType.isa<mlir::IntegerType>() )
    {
        const char* typeName{};
        unsigned dwType = llvm::dwarf::DW_ATE_signed;
        unsigned sz = elemSizeInBits;

        switch ( elemSizeInBits )
        {
            case 1:
            {
                typeName = "bool";
                dwType = llvm::dwarf::DW_ATE_boolean;
                sz = 8;
                break;
            }
            case 8:
            {
                typeName = "int8_t";
                break;
            }
            case 16:
            {
                typeName = "int16_t";
                break;
            }
            case 32:
            {
                typeName = "int32_t";
                break;
            }
            case 64:
            {
                typeName = "int64_t";
                break;
            }
            default:
            {
                llvm_unreachable( "Unsupported integer type size" );
            }
        }

        auto diType = mlir::LLVM::DIBasicTypeAttr::get( ctx, llvm::dwarf::DW_TAG_base_type,
                                                        builder.getStringAttr( typeName ), sz, dwType );

        diVar = mlir::LLVM::DILocalVariableAttr::get( ctx, subprogramAttr, builder.getStringAttr( varName ), fileAttr,
                                                      loc.getLine(), 0, sz, diType, mlir::LLVM::DIFlags::Zero );
    }
    else
    {
        const char* typeName{};

        switch ( elemSizeInBits )
        {
            case 32:
            {
                typeName = "float";
                break;
            }
            case 64:
            {
                typeName = "double";
                break;
            }
            default:
            {
                llvm_unreachable( "Unsupported float type size" );
            }
        }

        auto diType =
            mlir::LLVM::DIBasicTypeAttr::get( ctx, llvm::dwarf::DW_TAG_base_type, builder.getStringAttr( typeName ),
                                              elemSizeInBits, llvm::dwarf::DW_ATE_float );

        diVar =
            mlir::LLVM::DILocalVariableAttr::get( ctx, subprogramAttr, builder.getStringAttr( varName ), fileAttr,
                                                  loc.getLine(), 0, elemSizeInBits, diType, mlir::LLVM::DIFlags::Zero );
    }
            
    builder.setInsertionPointAfter( allocaOp );
    builder.create<mlir::LLVM::DbgDeclareOp>( loc, allocaOp, diVar );
        
    symbolToAlloca[varName] = allocaOp;
}

In this code, the call to builder.setInsertionPointAfter is critical.  When the lowering eraseOp takes out the DeclareOp, we need the replacement instructions to all end up in the same place.  Without that, the subsequent AssignOp lowering results in an error like this:

//===-------------------------------------------===//
Legalizing operation : 'toy.assign'(0x2745ab50) {
  "toy.assign"(%3) <{name = "x"}> : (i64) -> ()Fold {
  } -> FAILURE : unable to fold
Pattern : 'toy.assign -> ()' {
Trying to match "toy::AssignOpLowering"
Lowering AssignOp: toy.assign "x", %c5_i64 : i64
name: x
value: ImplicitTypeIDRegistry::lookupOrInsert(mlir::PromotableOpInterface::Trait<mlir::TypeID::get()::Empty>)
...
operand #0 does not dominate this use
mlir-asm-printer: 'builtin.module' failed to verify and will be printed in generic form
%3 = "arith.constant"() <{value = 5 : i64}> : () -> i64
valType: i64
elemType: f64
** Insert  : 'llvm.sitofp'(0x274a6ed0)
ImplicitTypeIDRegistry::lookupOrInsert(mlir::LLVM::detail::StoreOpGenericAdaptorBase::Properties)
** Insert  : 'llvm.store'(0x27437f30)
** Erase   : 'toy.assign'(0x2745ab50)
"toy::AssignOpLowering" result 1

My DI insertion isn’t fancy like flang’s, but I have only simple types to deal with, and don’t even support functions yet, so my simple way seemed like a reasonable choice. Regardless, getting working debugger support is a nice milestone.
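The type selection inside constructVariableDI boils down to a mapping from (integer or float, bit width) to a DWARF base type name, size and encoding. Here is a standalone sketch of just that mapping, with no LLVM or MLIR dependency. The Encoding enum, BasicTypeInfo and basicTypeFor are hypothetical names for this illustration; the numeric values are the DW_ATE_boolean, DW_ATE_float and DW_ATE_signed codes from the DWARF specification.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// DWARF base type encodings used by the toy compiler's types.
enum class Encoding : std::uint8_t { Boolean = 0x02, Float = 0x04, Signed = 0x05 };

struct BasicTypeInfo {
    std::string_view name;
    unsigned sizeInBits; // size reported in the DIBasicType
    Encoding encoding;
};

// Returns the debug type description, or nullopt for an unsupported width.
std::optional<BasicTypeInfo> basicTypeFor(bool isInteger, unsigned bits) {
    if (isInteger) {
        switch (bits) {
            case 1:  return BasicTypeInfo{"bool", 8, Encoding::Boolean}; // i1 is described as one byte
            case 8:  return BasicTypeInfo{"int8_t", 8, Encoding::Signed};
            case 16: return BasicTypeInfo{"int16_t", 16, Encoding::Signed};
            case 32: return BasicTypeInfo{"int32_t", 32, Encoding::Signed};
            case 64: return BasicTypeInfo{"int64_t", 64, Encoding::Signed};
        }
    } else {
        switch (bits) {
            case 32: return BasicTypeInfo{"float", 32, Encoding::Float};
            case 64: return BasicTypeInfo{"double", 64, Encoding::Float};
        }
    }
    return std::nullopt;
}

// Spot checks against the DIBasicType entries in the generated LLVM-IR.
void checkAgainstIR() {
    const auto b = basicTypeFor(true, 1);
    assert(b && b->name == "bool" && b->sizeInBits == 8 && b->encoding == Encoding::Boolean);
    const auto d = basicTypeFor(false, 64);
    assert(d && d->name == "double" && d->sizeInBits == 64 && d->encoding == Encoding::Float);
}
```

Returning nullopt instead of calling llvm_unreachable is a choice made for this sketch so that it can be exercised outside the compiler.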