working storage

Multilanguage debugging in lldb: print call to function.

December 13, 2023 C/C++ development and debugging.

There probably aren’t many people who care about debugging multiple languages, but I learned a new trick today that is worth making a note of, even if that note is for a future amnesiac self.

Here’s a debug session where C code is calling COBOL. In the COBOL frame, the language rules prohibit using print to show the result of a C function call (examples: printf, strlen, strspn, …).

To make a function call in lldb, I used to go up the stack to a C language frame.  For example, if this was the COBOL code I was debugging:

(lldb) n
12/13/23 19:27:26 LTE14039I Opening LzMQZ connection. QMGR: MQZ1 MQZCONN: 0x7ff920625170 API: 0x7fed0008e0e0
Process 1673776 stopped
* thread #57, name = 'LZOCREG1', stop reason = step over
    frame #0: 0x00007ff9243b31f2 WINDC.NATIVE.LZPDS.A0116662(LTESVCXC).f3968a73`LTESVCXC at LTESVCXC.cbl:36:1
   34                  DISPLAY 'WSCHECK: "' WORK-VAR '"'
-> 36                 EXEC CICS LINK PROGRAM ('LTESVCXC')
   37                      COMMAREA(WORK-COMMAREA)
   38                      LENGTH   (LENGTH OF WORK-COMMAREA)
   39                 END-EXEC
(lldb) p &WORK-VAR
(*char [10]) $4 = 0x00007fadef810478
(lldb) p WORK-VAR
(char [10]) WORK-VAR = "STORISOK  "
(lldb) fr v -format x WORK-VAR
(char [10]) WORK-VAR = {
  [0] = 0xe2
  [1] = 0xe3
  [2] = 0xd6
  [3] = 0xd9
  [4] = 0xc9
  [5] = 0xe2
  [6] = 0xd6
  [7] = 0xd2
  [8] = 0x40
  [9] = 0x40
}

Aside: If you object to the use of a C address-of operator against a COBOL variable, that’s just because our debugger has a C-like & notational shorthand for the COBOL ‘ADDRESS OF …’, which is very useful.

If I want to run a C function against that COBOL WORKING-STORAGE variable, like strchr, to look for the address of the first EBCDIC space (0x40) in that string, I used to do it by going up the stack into a C frame, like so:

(lldb) up 2
frame #2: 0x00007ff9243b3f7e WINDC.NATIVE.LZPDS.A0116662(LTESVCXC).f3968a73`pgm_ltesvcxc + 382
->  0x7ff9243b3f7e <+382>: jmp    0x7ff9243b3f88            ; <+392>
    0x7ff9243b3f80 <+384>: addq   $0x128, %rsp              ; imm = 0x128 
    0x7ff9243b3f87 <+391>: retq   
    0x7ff9243b3f88 <+392>: leaq   0x201039(%rip), %rdi
(lldb) print (char *)strchr(0x00007fadef810478, 0x40)
(char *) $6 = 0x00007fadef810480 "@@"

Sure enough, that space is found 8 bytes into the string, as expected. This is a very short string, and I could have seen that by inspection, but it’s just to illustrate that we can make calls to functions within the debugger, and they can even be functions that aren’t in the program or language that we are debugging.
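For anybody who wants to check that offset outside the debugger, here is a minimal C sketch of the same strchr lookup. The byte values are copied from the hex dump above. The trailing NUL, the array name work_var, and the helper name first_ebcdic_space are assumptions of the sketch (a COBOL PIC X field is not NUL terminated, but strchr needs a terminator to stop on).

```c
#include <stddef.h>
#include <string.h>

/* EBCDIC bytes of WORK-VAR ("STORISOK  "), as shown by `fr v -format x`.
 * The final 0x00 is an assumption of this sketch: it gives strchr a
 * terminator, which a real COBOL field would not have. */
const unsigned char work_var[11] = {
    0xe2, 0xe3, 0xd6, 0xd9, 0xc9, 0xe2, 0xd6, 0xd2, 0x40, 0x40, 0x00
};

/* Offset of the first EBCDIC space (0x40) in s, or -1 if there is none.
 * (Hypothetical helper name, for illustration only.) */
long first_ebcdic_space(const unsigned char *s)
{
    const char *base = (const char *)s;
    const char *hit = strchr(base, 0x40);

    return hit ? (long)(hit - base) : -1;
}
```

Calling first_ebcdic_space(work_var) gives 8, which matches the difference between the two addresses in the lldb output (0x...480 - 0x...478). For a fixed-length field with no terminator, memchr with an explicit length of 10 would be the safer call.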

I noticed today that ‘print’ is an alias for ‘expression --’, and that expression takes a language option (-l). This means that I can make cross-language calls like this in any frame, provided I specify the language I want. Example:

(lldb) down 2
frame #0: 0x00007ff9243b31f2 WINDC.NATIVE.LZPDS.A0116662(LTESVCXC).f3968a73`LTESVCXC at LTESVCXC.cbl:36:1
   34                  DISPLAY 'WSCHECK: "' WORK-VAR '"'
-> 36                 EXEC CICS LINK PROGRAM ('LTESVCXC')
   37                      COMMAREA(WORK-COMMAREA)
   38                      LENGTH   (LENGTH OF WORK-COMMAREA)
   39                 END-EXEC
(lldb) expression -l c -- (char *)strchr(0x00007fadef810478, 0x40)
(char *) $7 = 0x00007fadef810480 "@@"

Ten points to me for learning yet another obscure debugger trick.

The evil of COBOL: everything is in global variables

December 7, 2023 COBOL

COBOL does not have stack variables.  Everything is a global variable.  There is a loose equivalent of a function, called a paragraph, which can be called using a PERFORM statement, but a paragraph does not have any input or output variables, and no return code, so if you want it to behave like a function, you have to construct some sort of complicated naming convention using your global variables.
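For readers who don’t know COBOL, a rough C analogy of that calling style may help. This is only a sketch: every name in it is hypothetical, and C file-scope variables are standing in for WORKING-STORAGE items.

```c
/* Rough C analogy of a COBOL paragraph: no parameters, no return value,
 * everything travels through file-scope variables.
 * All names are hypothetical, chosen only for this illustration. */
static double add_in_1;   /* "input" global, by naming convention  */
static double add_in_2;   /* second "input" global                 */
static double add_out;    /* "output" global, by naming convention */

/* Plays the role of a paragraph invoked with PERFORM. */
static void add_paragraph(void)
{
    add_out = add_in_1 + add_in_2;
}

/* The caller has to copy values in and out around every PERFORM. */
double add_via_globals(double a, double b)
{
    add_in_1 = a;      /* MOVE A TO ADD-IN-1    */
    add_in_2 = b;      /* MOVE B TO ADD-IN-2    */
    add_paragraph();   /* PERFORM ADD-PARAGRAPH */
    return add_out;    /* MOVE ADD-OUT TO ...   */
}
```

Nothing stops any other code from reading or clobbering add_in_1, add_in_2, or add_out between those steps, which is the problem with the convention.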

I’ve seen real customer COBOL programs with many thousands of global variables.  A production COBOL program is usually a giant sequence of MOVEs, MOVE A TO B, MOVE B TO C, MOVE C TO D, MOVE D TO E, … with various PERFORMs or GOTOs, or other things in between.  If you find that your variable has a bad value in it, that is probably because it has been copied from something that was copied from something, that was copied from something, that’s the output of something else, that was copied from something, 9 or 10 times.

I was toying around with the idea of coding up a COBOL implementation of 2D Euclidean geometric algebra, just as a joke, as it is surely the worst language in the world.  Yes, I work on a COBOL compiler project. The project is a lot of fun, and the people I work with are awesome, but I don’t have to like the language.

If I were to implement this simplest of geometric algebras in COBOL, the logical starting place would be to implement complex numbers in COBOL first.  That is because we can use a pair of complex numbers to implement a 2D multivector, with one complex number for the vector part, and one complex number for the scalar and pseudoscalar parts.  That technique has been detailed on this blog previously, and also in a Mathematica module Cl20.m.
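To make the pair-of-complex-numbers idea concrete, here is a minimal C99 sketch of a geometric product built that way. The convention is an assumption of the sketch and may not match Cl20.m: a multivector is written as m = z + e1 w, where i = e1 e2, z holds the scalar and pseudoscalar parts, and w = x + y i encodes the vector x e1 + y e2. The type name multivector2 and function name gp are hypothetical.

```c
#include <complex.h>

/* 2D multivector as a pair of complex numbers (assumed convention):
 *   m = z + e1 * w,   with i = e1 e2
 *   z = scalar + pseudoscalar * i
 *   w = x + y * i,    so that e1 * w = x e1 + y e2 */
typedef struct {
    double complex z;   /* scalar + pseudoscalar part */
    double complex w;   /* vector part, factored as e1 * w */
} multivector2;

/* Geometric product. Since i e1 = -e1 i, we have z e1 = e1 conj(z),
 * and e1 e1 = 1, so
 *   (z1 + e1 w1)(z2 + e1 w2)
 *     = z1 z2 + conj(w1) w2 + e1 (conj(z1) w2 + w1 z2) */
multivector2 gp(multivector2 a, multivector2 b)
{
    multivector2 r;

    r.z = a.z * b.z + conj(a.w) * b.w;
    r.w = conj(a.z) * b.w + a.w * b.z;
    return r;
}
```

With e1 = {0, 1} and e2 = {0, I}, this gives e1 e1 = e2 e2 = 1 and e1 e2 = -e2 e1 = i, which are the relations a 2D Euclidean geometric algebra has to satisfy.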

Trying to implement a couple of complex number operations in COBOL got absurd really fast.  Here’s an example.  First step was to create some complex number types.  I did that with a copybook (include file), like so:

This can be included multiple times, each time with a different name, like so:

I structured all my helper functions with one set of global variables for input (at least one), and, if appropriate, one output global variable.  Here’s an example:

So, if I want to compute and display a value, I have a whole pile of stupid MOVEs to do in and out of the appropriate global variables for each of the helper routines in question:

I wrote enough of this little complex number library that I could do conjugate, real, imaginary, multiply, inverse, and divide operations.  I can run that little program with the following JCL:


and get this SYSOUT:

A                    =  .10000000000000000E 01 + ( .20000000000000000E 01) I
B                    =  .30000000000000000E 01 + ( .40000000000000000E 01) I
CONJ(A)              =  .10000000000000000E 01 + (-.20000000000000000E 01) I
RE(A)                =  .10000000000000000E 01
IM(A)                =  .20000000000000000E 01
A * B                = -.50000000000000000E 01 + ( .10000000000000000E 02) I
1/A                  =  .20000000000000000E 00 + (-.40000000000000000E 00) I
A/B                  =  .44000000000000000E 00 + ( .80000000000000000E-01) I
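As a cross-check on those values, and for contrast, the same operations take only a few lines with C99 <complex.h>. This is a comparison sketch, not a port of the COBOL program. The helper names cx_inverse, show and show_all are hypothetical, and the output layout only loosely imitates the SYSOUT.

```c
#include <complex.h>
#include <stdio.h>

/* 1/z, the only operation here without a built-in operator or function. */
double complex cx_inverse(double complex z)
{
    return 1.0 / z;
}

/* Print one complex value on a line, loosely modelled on the SYSOUT. */
void show(const char *label, double complex z)
{
    printf("%-20s = %g + (%g) I\n", label, creal(z), cimag(z));
}

/* Print the same quantities as the COBOL program, for inputs a and b. */
void show_all(double complex a, double complex b)
{
    show("A", a);
    show("B", b);
    show("CONJ(A)", conj(a));
    printf("%-20s = %g\n", "RE(A)", creal(a));
    printf("%-20s = %g\n", "IM(A)", cimag(a));
    show("A * B", a * b);
    show("1/A", cx_inverse(a));
    show("A/B", a / b);
}
```

Calling show_all(1.0 + 2.0 * I, 3.0 + 4.0 * I) from a main function reproduces the numbers above: A * B = -5 + 10i, 1/A = 0.2 - 0.4i, and A/B = 0.44 + 0.08i.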

If you would like your eyes burned further, you can access the full program on GitHub here. It takes almost 200 lines of code to do almost nothing.