Peeter Joot's Blog » Added FUNCTION/CALL support to my toy compiler

I’ve tagged V4 for my toy language and MLIR-based compiler.

See the Changelog for the gory details (or the commit history). There are three specific new features, relative to the V3 tag:

Adds support (grammar, builder, lowering) for function declarations and function calls. Much of the work for this was done in the branch use_mlir_funcop_with_scopeop, later squashed and merged as one big commit. Here’s an example:

FUNCTION bar ( INT16 w, INT32 z )
{
    PRINT "In bar";
    PRINT w;
    PRINT z;
    RETURN;
};

FUNCTION foo ( )
{
    INT16 v;
    v = 3;
    PRINT "In foo";
    CALL bar( v, 42 );
    PRINT "Called bar";
    RETURN;
};

PRINT "In main";
CALL foo();
PRINT "Back in main";

Here is the MLIR for this program:

module {
  func.func private @foo() {
    "toy.scope"() ({
      "toy.declare"() <{type = i16}> {sym_name = "v"} : () -> ()
      %c3_i64 = arith.constant 3 : i64
      "toy.assign"(%c3_i64) <{var_name = @v}> : (i64) -> ()
      %0 = "toy.string_literal"() <{value = "In foo"}> : () -> !llvm.ptr
      toy.print %0 : !llvm.ptr
      %1 = "toy.load"() <{var_name = @v}> : () -> i16
      %c42_i64 = arith.constant 42 : i64
      %2 = arith.trunci %c42_i64 : i64 to i32
      "toy.call"(%1, %2) <{callee = @bar}> : (i16, i32) -> ()
      %3 = "toy.string_literal"() <{value = "Called bar"}> : () -> !llvm.ptr
      toy.print %3 : !llvm.ptr
      "toy.return"() : () -> ()
    }) : () -> ()
    "toy.yield"() : () -> ()
  }
  func.func private @bar(%arg0: i16, %arg1: i32) {
    "toy.scope"() ({
      "toy.declare"() <{param_number = 0 : i64, parameter, type = i16}> {sym_name = "w"} : () -> ()
      "toy.declare"() <{param_number = 1 : i64, parameter, type = i32}> {sym_name = "z"} : () -> ()
      %0 = "toy.string_literal"() <{value = "In bar"}> : () -> !llvm.ptr
      toy.print %0 : !llvm.ptr
      %1 = "toy.load"() <{var_name = @w}> : () -> i16
      toy.print %1 : i16
      %2 = "toy.load"() <{var_name = @z}> : () -> i32
      toy.print %2 : i32
      "toy.return"() : () -> ()
    }) : () -> ()
    "toy.yield"() : () -> ()
  }
  func.func @main() -> i32 {
    "toy.scope"() ({
      %c0_i32 = arith.constant 0 : i32
      %0 = "toy.string_literal"() <{value = "In main"}> : () -> !llvm.ptr
      toy.print %0 : !llvm.ptr
      "toy.call"() <{callee = @foo}> : () -> ()
      %1 = "toy.string_literal"() <{value = "Back in main"}> : () -> !llvm.ptr
      toy.print %1 : !llvm.ptr
      "toy.return"(%c0_i32) : (i32) -> ()
    }) : () -> ()
    "toy.yield"() : () -> ()
  }
}

Here’s a smaller sample program, with a CALL that passes a literal argument:

FUNCTION bar ( INT16 w )
{
    PRINT w;
    RETURN;
};

PRINT "In main";
CALL bar( 3 );
PRINT "Back in main";

The MLIR for this one looks like:

module {
  func.func private @bar(%arg0: i16) {
    "toy.scope"() ({
      "toy.declare"() <{param_number = 0 : i64, parameter, type = i16}> {sym_name = "w"} : () -> ()
      %0 = "toy.load"() <{var_name = @w}> : () -> i16
      toy.print %0 : i16
      "toy.return"() : () -> ()
    }) : () -> ()
    "toy.yield"() : () -> ()
  }
  func.func @main() -> i32 {
    "toy.scope"() ({
      %c0_i32 = arith.constant 0 : i32
      %0 = "toy.string_literal"() <{value = "In main"}> : () -> !llvm.ptr
      toy.print %0 : !llvm.ptr
      %c3_i64 = arith.constant 3 : i64
      %1 = arith.trunci %c3_i64 : i64 to i16
      "toy.call"(%1) <{callee = @bar}> : (i16) -> ()
      %2 = "toy.string_literal"() <{value = "Back in main"}> : () -> !llvm.ptr
      toy.print %2 : !llvm.ptr
      "toy.return"(%c0_i32) : (i32) -> ()
    }) : () -> ()
    "toy.yield"() : () -> ()
  }
}
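The arith.trunci ops in the listings above narrow the i64 literal to the declared parameter type (i16 or i32). As a rough illustration of what that narrowing does to a value, here is a minimal Python sketch of two's-complement truncation. The helper name truncate_signed is hypothetical, and whether the compiler diagnoses literals that don't fit is not something this post covers.

```python
def truncate_signed(value: int, bits: int) -> int:
    """Keep the low `bits` bits of value and reinterpret them as two's complement."""
    mask = (1 << bits) - 1
    low = value & mask
    # If the sign bit of the narrow type is set, the narrow value is negative.
    if low & (1 << (bits - 1)):
        low -= 1 << bits
    return low


print(truncate_signed(3, 16))      # 3: the literal passed to bar fits in an i16
print(truncate_signed(42, 32))     # 42
print(truncate_signed(40000, 16))  # -25536: too big for i16, so it wraps
```

Small literals like 3 and 42 pass through unchanged, which is why the later stages can fold the truncation into a plain i16 constant.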

I’ve implemented a two-stage lowering, in which the toy.scope, toy.yield, toy.call, and toy.return ops are stripped out, leaving just the func and llvm dialects. The code from that stage of the lowering is cleaner looking:

llvm.mlir.global private constant @str_1(dense<[66, 97, 99, 107, 32, 105, 110, 32, 109, 97, 105, 110]> : tensor<12xi8>) {addr_space = 0 : i32} : !llvm.array<12 x i8>
func.func private @__toy_print_string(i64, !llvm.ptr)
llvm.mlir.global private constant @str_0(dense<[73, 110, 32, 109, 97, 105, 110]> : tensor<7xi8>) {addr_space = 0 : i32} : !llvm.array<7 x i8>
func.func private @__toy_print_i64(i64)
func.func private @bar(%arg0: i16) {
  %0 = llvm.mlir.constant(1 : i64) : i64
  %1 = llvm.alloca %0 x i16 {alignment = 2 : i64, bindc_name = "w.addr"} : (i64) -> !llvm.ptr
  llvm.store %arg0, %1 : i16, !llvm.ptr
  %2 = llvm.load %1 : !llvm.ptr -> i16
  %3 = llvm.sext %2 : i16 to i64
  call @__toy_print_i64(%3) : (i64) -> ()
  return
}
func.func @main() -> i32 {
  %0 = llvm.mlir.constant(0 : i32) : i32
  %1 = llvm.mlir.addressof @str_0 : !llvm.ptr
  %2 = llvm.mlir.constant(7 : i64) : i64
  call @__toy_print_string(%2, %1) : (i64, !llvm.ptr) -> ()
  %3 = llvm.mlir.constant(3 : i64) : i64
  %4 = llvm.mlir.constant(3 : i16) : i16
  call @bar(%4) : (i16) -> ()
  %5 = llvm.mlir.addressof @str_1 : !llvm.ptr
  %6 = llvm.mlir.constant(12 : i64) : i64
  call @__toy_print_string(%6, %5) : (i64, !llvm.ptr) -> ()
  return %0 : i32
}
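The string globals in this listing are plain byte arrays with no trailing NUL, and the length is passed to __toy_print_string as a separate i64 argument. A small Python sketch (the helper name string_global is hypothetical) shows how the dense<[...]> payload and the length constant relate to the source literal:

```python
from typing import List, Tuple


def string_global(text: str) -> Tuple[List[int], int]:
    """Return the byte values and length for a string literal (no trailing NUL)."""
    data = list(text.encode("ascii"))
    return data, len(data)


data, length = string_global("Back in main")
print(data)    # [66, 97, 99, 107, 32, 105, 110, 32, 109, 97, 105, 110]
print(length)  # 12
```

Those values match @str_1 and the 12 : i64 constant passed alongside its address.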

There are some dead code constants left there (%3 in main), seemingly due to type conversion, but they get stripped out nicely by the time we get to LLVM IR:

@str_1 = private constant [12 x i8] c"Back in main"
@str_0 = private constant [7 x i8] c"In main"

declare void @__toy_print_string(i64, ptr)

declare void @__toy_print_i64(i64)

define void @bar(i16 %0) {
  %2 = alloca i16, i64 1, align 2
  store i16 %0, ptr %2, align 2
  %3 = load i16, ptr %2, align 2
  %4 = sext i16 %3 to i64
  call void @__toy_print_i64(i64 %4)
  ret void
}

define i32 @main() {
  call void @__toy_print_string(i64 7, ptr @str_0)
  call void @bar(i16 3)
  call void @__toy_print_string(i64 12, ptr @str_1)
  ret i32 0
}
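The unused %3 constant is gone from the LLVM IR. One likely reason is that LLVM IR constants are values rather than instructions, so an unused llvm.mlir.constant has nothing to translate into. The underlying idea of dropping side-effect-free ops with unused results can be sketched in Python. This is only an illustrative model, not how MLIR or LLVM implement it, and the Op class and eliminate_dead_ops function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Op:
    result: Optional[str]            # SSA value defined by the op, if any
    name: str                        # op name, e.g. "llvm.mlir.constant"
    operands: Tuple[str, ...] = ()   # SSA values the op reads
    has_side_effects: bool = False   # calls and returns must be kept


def eliminate_dead_ops(ops: List[Op]) -> List[Op]:
    """Drop side-effect-free ops whose result is never used (straight-line code only)."""
    live: List[Op] = []
    used = set()
    # Walk backwards so that every use is seen before the op that defines it.
    for op in reversed(ops):
        if op.has_side_effects or op.result in used:
            live.append(op)
            used.update(op.operands)
    live.reverse()
    return live


# The body of @main from the func/llvm dialect listing, abbreviated.
main_body = [
    Op("%0", "llvm.mlir.constant"),
    Op("%1", "llvm.mlir.addressof"),
    Op("%2", "llvm.mlir.constant"),
    Op(None, "call @__toy_print_string", ("%2", "%1"), True),
    Op("%3", "llvm.mlir.constant"),   # the leftover i64 constant
    Op("%4", "llvm.mlir.constant"),
    Op(None, "call @bar", ("%4",), True),
    Op("%5", "llvm.mlir.addressof"),
    Op("%6", "llvm.mlir.constant"),
    Op(None, "call @__toy_print_string", ("%6", "%5"), True),
    Op(None, "return", ("%0",), True),
]

kept = [op.result for op in eliminate_dead_ops(main_body) if op.result]
print(kept)  # ['%0', '%1', '%2', '%4', '%5', '%6']
```

Only %3 is dropped, since nothing reads it and a constant has no side effects.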

Generalize NegOp lowering to support all types, not just f64.

Allow PRINT of string literals, avoiding the requirement for variables. Example:

    %0 = "toy.string_literal"() <{value = "A string literal!"}> : () -> !llvm.ptr loc(#loc)
    "toy.print"(%0) : (!llvm.ptr) -> () loc(#loc)

The next obvious thing to do for the language/compiler would be to implement conditionals (IF/ELIF/ELSE) and loops. I think that there are MLIR dialects to facilitate both (like the affine dialect for loops).

However, having now finished this function support feature (which I’ve been working on for quite a while), I’m going to take a break from this project. Even though I’ve only been working on this toy compiler project in my spare time, it periodically invades my thoughts. With all that I have to learn for my new job, I’d rather have one less extra thing to think about, so that I don’t feel pulled in too many directions at once.
