Mainframe

An absurd COBOL library: 2D Euclidean GA

December 31, 2023 COBOL, math and physics play

I’ve achieved a new pinnacle of obscurity, and have now written a rudimentary COBOL implementation of a geometric algebra library for \( \mathbb{R}^2 \) calculations.

Who will use this?  Absolutely nobody.  Effectively, nobody knows geometric algebra.  Nobody wants to know COBOL, but some do anyway.  The intersection of those two groups is vanishingly small (probably exactly one person: argued below.)

I understand that some Opus Dei members have taught themselves COBOL, as looking at COBOL has been found to be as painful as a course of self-flagellation.

Figure 0. A flagellation representation of COBOL.

Assuming that no Opus Dei practitioners know geometric algebra, that means that there is exactly one person in the world who knows both COBOL and geometric algebra.  Me.

Why did I write this little library?  Well, I was tickled to write something so completely stupid, and I’ve been laughing at the absurdity of it. I also thought I might learn a few things about COBOL in the process of trying to use it for something slightly non-trivial.  I’m adept at writing simple test programs that exercise various obscure compiler features, but those are usually fairly small.  On the flip side of complexity, I have to debug through a number of horribly complicated customer programs as part of my compiler validation work.  A simple real-life test scenario might run 100+ COBOL programs in a set of CICS transactions, executing thousands of EXEC DLI and EXEC CICS statements as well as all of the rest of the COBOL language statements!  Despite having gained familiarity with COBOL from that sort of observational use, walking through stuff in the debugger doesn’t provide the same level of comfort with the language as writing code from scratch.  Since I have no interest in simulating a boring business application, why not do something just for fun, as a learning game?

The compiler I am using does not seem to support object-COBOL (which would have been nicely suited for this project), so I’ve written my little toy in conventional COBOL, using one external procedure for each type of mathematical operation.  In the huge set of customer COBOL code that I’ve examined and done test compilations of, none of it has used object-COBOL.  I am guessing that the object-COBOL community is as large as the user base for my little toy COBOL geometric algebra library will ever be.

I’ve implemented methods to construct multivectors with scalar, vector and pseudoscalar components, or a general multivector with all of the above.  I’ve also implemented multiply, add, subtract, scalar multiplication, grade selection, and a DISPLAY function to write a multivector to SYSOUT (stdout equivalent.)

The multivector “type”

Figure 1 shows my multivector type, implemented in a copybook (include file) named MVI.  I have an alternate MV copybook that doesn’t have the VALUE (initialization) clauses, as you don’t want initialization for LINKAGE SECTION values (i.e.: program parameters.)

Figure 1. Copybook with multivector declaration and initialization.

If you are wondering what the hell a ‘PIC S9(9) USAGE IS COMP-5’ is, well, that’s the “easy to remember” way to declare a 32-bit signed integer in COBOL.  A COMP-2, on the other hand, is a double-precision (64-bit) floating point value.
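Since the figures are images, here is a sketch of the flavour of declaration being described.  This is not the actual library code: I’m assuming a :TAG: placeholder and the COPY REPLACING mechanism for the per-variable prefix, with the field names taken from the debugger output shown further below.

      *    MVI-style copybook sketch: a grade flag plus two
      *    complex-number-shaped pairs of COMP-2 fields.
       01  :TAG:-MV.
           05 :TAG:-GRADE    PIC S9(9) USAGE IS COMP-5 VALUE 0.
           05 :TAG:-G02.
              10 :TAG:-SC    USAGE IS COMP-2 VALUE 0.
              10 :TAG:-PS    USAGE IS COMP-2 VALUE 0.
           05 :TAG:-G1.
              10 :TAG:-X     USAGE IS COMP-2 VALUE 0.
              10 :TAG:-Y     USAGE IS COMP-2 VALUE 0.

With that sort of copybook, a statement like

       COPY MVI REPLACING ==:TAG:== BY ==A==.

would expand to an A-MV declaration with A-GRADE, A-SC, A-PS, A-X and A-Y fields.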

Figure 2 shows an example of the use of this copybook:

Figure 2. Using the multivector copybook.

Figure 3 shows these two copybook declarations after preprocessor expansion:

Figure 3. Multivector global variable examples after preprocessing.

The global variable declarations above are roughly equivalent to the following pseudo C++ code (pretending that we can have anonymous unions that match the COBOL declarations above):

#include <complex>

using complex = std::complex<double>;

struct ga20{
   int grade{};
   union {
      struct { double sc{}; double ps{}; };
      complex g02{};
   };
   union { 
      struct { double x{}; double y{}; };
      complex g1{};
   };
};

ga20 a;
ga20 b;

COBOL is inherently untyped, but CALL parameters still have to match in layout, or else all hell ensues, so you have to rely on naming conventions and other mechanisms to enforce the required type equivalences.  In this toy GA library, I’ve used copybooks to enforce the types required for everything.  Global variables like these A-MV and B-MV variables are declared only using a copybook that knows the required representation, and all uses of the effective -MV “type” in sub-programs use a matching copybook for their declarations.  However, I’ve also made use of the lack of typing to treat A-G02, B-G02, A-G1, and B-G1 as if they were complex numbers, and to pass those “variables” off to complex number sub-programs, knowing that I’ve constructed the parameters to those programs in a way that is bit compatible with the MV field values.  You can screw things up really nicely doing stuff like this, especially because all COBOL sub-program parameters are (generally) passed by reference.  If you don’t match up the types right, “fun ensues.”
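For example, a call along these lines (the CPXMULT program name and its WS-RESULT output area are hypothetical here, but this is the shape of the trick) hands the (A-SC, A-PS) pair off, by reference, to a complex multiply routine whose LINKAGE SECTION declares matching pairs of COMP-2 fields:

      *    treat A-G02 and B-G02 as complex numbers; the layouts must
      *    match the callee's LINKAGE SECTION exactly, bit for bit.
           CALL 'CPXMULT' USING A-G02, B-G02, WS-RESULT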

Also observe that the nested level specifiers are optional in COBOL.  For nested fields in C++, we might write a.g1.x.  With a nested variable like this in COBOL, we could write something equivalent to that, like:

A-X OF A-G1 OF A-MV

but we can leave out any of the intermediate “level” specifications if we want.  This gets really confusing in complicated real-life COBOL code.  If you are looking for where something is modified, you not only have to look for the variable of interest, but also for any of the higher level fields, since any of those could have been passed off to other code, which implicitly wrote the value you are interested in.

Here’s what one of these multivectors looks like in memory on my (Linux x86-64) system:

(lldb) c
Process 3903259 resuming
Process 3903259 stopped
* thread #10, name = 'GA20', stop reason = breakpoint 7.1
    frame #0: 0x00007fffd9189a02 PJOOT.GA20V01.LOADLIB(MULT).ec73dc4b`MULT at MULT.cob:50:1
   47              CALL GA-MKVECTOR-MODIFY USING C-MV, A-X, A-Y
   48              CALL GA-MKPSEUDO-MODIFY USING D-MV, A-PS
   49  
-> 50              MOVE 'A' TO WS-DISPPARM-N
   51              CALL GA-DISPLAY USING
   52                WS-DISPPARM-N,
   53                A-MV
(lldb) p A-MV
(A-MV) A-MV = {
  A-GRADE = -1
  A-G02 = (A-SC = 1, A-PS = 4)
  A-G1 = (A-X = 2, A-Y = 3)
}

i.e.: this has the value \( 1 + 2 \mathbf{e}_1 + 3 \mathbf{e}_2 + 4 \mathbf{e}_{12} \).

Looking at the multivector in its hex representation:

(lldb) fr v -format x A-MV
(A-MV) A-MV = {
  A-GRADE = 0xffffffff
  A-G02 = {
    A-SC = 0x3ff0000000000000
    A-PS = 0x4010000000000000
  }
  A-G1 = {
    A-X = 0x4000000000000000
    A-Y = 0x4008000000000000
  }
}

we see that the debugger is showing an underlying IEEE floating point representation for the COMP-2 variables, reflecting how the program was compiled (0x3FF0000000000000 is 1.0 in IEEE double format.)

I have a multivector print routine that prints multivectors to SYSOUT:

Figure 4. Calling the multivector DISPLAY function.

where WS-DISPPARM-N is a PIC X(20) (i.e.: a fixed-size character array.)  Output for the A-MV value shown in the debug session above looks like:

A                     ( .10000000000000000E 01)                                                                         
                    + ( .20000000000000000E 01) e_1 + ( .30000000000000000E 01) e_2                                     
                    + ( .40000000000000000E 01) e_{12}            

End of sentence required for nested IFs?

I encountered a curious language issue in my multivector multiply function.  Here’s an example of how I’ve been coding IF statements:

Figure 5. An IF END-IF pair without a period to terminate the sentence.

Notice that I don’t do anything special between the END-IF and the statement that follows it.  However, if I have an IF statement that includes nested IF END-IFs, then it appears that I need a period after the final END-IF, like so:

Figure 6. An IF with nested conditions that seems to require a period to terminate the sentence.

If I didn’t include that period after the final END-IF (ending the COBOL sentence), then in some circumstances I saw the program exit after the last interior basic block within this nested IF was executed.  In COBOL parlance, it was as if a GOBACK (i.e.: return) had been implicitly executed once we fell out of the big nested IF.  Why is that period required for a nested IF, but not for a simple IF?
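For reference, here is a minimal standalone skeleton (not the library code) with the shape in question; the period after the outermost END-IF is the only thing ending the sentence:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. IFDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-A    PIC S9(9) USAGE IS COMP-5 VALUE 1.
       01  WS-B    PIC S9(9) USAGE IS COMP-5 VALUE 2.
       PROCEDURE DIVISION.
           IF WS-A = 1
              IF WS-B = 2
                 DISPLAY 'BOTH MATCH'
              ELSE
                 DISPLAY 'ONLY WS-A MATCHES'
              END-IF
           ELSE
              DISPLAY 'WS-A DOES NOT MATCH'
           END-IF.
           DISPLAY 'STATEMENT AFTER THE NESTED IF'
           GOBACK.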

In my copy of “Murach’s Mainframe COBOL”, the author ends ALL IF statements with a period, even simple IFs.  I don’t see a rationale for that anywhere in the book, but it’s a ~700 page book, so perhaps he says why at some point.

I’ve asked our compiler guys if this is a bug or expected behaviour, but I am guessing the latter…. I just don’t know why.

The multiplication kernel for this library

The workhorse of this GA(2,0) implementation is a multivector multiplication operation, which can be implemented in two lines in Mathematica (or C++):

multivector /: multivector[_, m1_, m2_] ** multivector[_, n1_, n2_] := 
   multivector[-1, m1 n1 + Conjugate[m2] n2, n1 m2 + Conjugate[m1] n2 ]
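Unpacking that notation: writing the multivectors as \( A = z_a + \mathbf{e}_1 w_a \) and \( B = z_b + \mathbf{e}_1 w_b \), where \( z, w \) are complex numbers over the unit pseudoscalar \( i = \mathbf{e}_1 \mathbf{e}_2 \), and using \( \mathbf{e}_1 z = z^* \mathbf{e}_1 \), the product is \( A B = \left( z_a z_b + w_a^* w_b \right) + \mathbf{e}_1 \left( w_a z_b + z_a^* w_b \right) \), which is exactly what the two lines above encode.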

In COBOL, it takes a lot more, and as usual, COBOL verbosity obfuscates things considerably. Here’s the equivalent code in my library:

Figure 7. GA(2,0) multiplication kernel in COBOL.

The library and a little test program.

If you are curious, you can poke around in the code for this library and the test program on github.  The sample/test program is src/MULT.cob, and running the job gives the following SYSOUT:

Figure 8. Sample SYSOUT for MULT.cob

A less evil COBOL toy complex number library

December 29, 2023 COBOL

In a previous post ‘The evil of COBOL: everything is in global variables’, I discussed the implementation of a toy complex number library in COBOL.

That example code was a single module, having one paragraph for each function. I used a naming convention to work around the fact that COBOL functions (paragraphs) are completely braindead and have neither input nor output parameters, and that all such functions in a given load module have access to all the variables of the entire program.

Perhaps you will try to call me out on my claim that COBOL has neither parameters nor return values.  That’s true if you consider COBOL paragraphs to be the equivalent of functions.  I’ve heard paragraphs described as not-really-functions, and there’s some truth to that, especially since you can PERFORM a range that executes a set of paragraphs, and there can be non-intuitive control flow transfers between the paragraphs of such a range, which is entirely non-function-like.

There is one circumstance where COBOL parameters can be found.  It is actually possible to have both input and output parameters in COBOL, but it can only be done at the program level (i.e.: the equivalent of int main( int argc, char ** )). So, you can write a less braindead COBOL library, with a set of meaningful input and output parameters for each function, by using CALL instead of PERFORM, with a whole set of external programs, one for each of the operations that is desired. With that in mind, I’ve reworked my COBOL complex number toy library to use this program-level organization.  This is still a toy library implementation, but it serves to illustrate the ideas.  The previous paragraph implementation can be found in the same repository, in the ../paragraphs-as-library/ directory.
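As a minimal sketch of the idea (hypothetical program and field names, not the actual library code): the callee declares its parameters in the LINKAGE SECTION and names them on PROCEDURE DIVISION USING, and the caller passes its own storage, by reference, on the CALL:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. CPXMULT.
       DATA DIVISION.
       LINKAGE SECTION.
       01  LS-A.
           05 LS-A-RE   USAGE IS COMP-2.
           05 LS-A-IM   USAGE IS COMP-2.
       01  LS-B.
           05 LS-B-RE   USAGE IS COMP-2.
           05 LS-B-IM   USAGE IS COMP-2.
       01  LS-C.
           05 LS-C-RE   USAGE IS COMP-2.
           05 LS-C-IM   USAGE IS COMP-2.
       PROCEDURE DIVISION USING LS-A, LS-B, LS-C.
      *    (a + b i)(c + d i) = (a c - b d) + (b c + a d) i
           COMPUTE LS-C-RE = LS-A-RE * LS-B-RE - LS-A-IM * LS-B-IM
           COMPUTE LS-C-IM = LS-A-IM * LS-B-RE + LS-A-RE * LS-B-IM
           GOBACK.

and in the caller:

           CALL 'CPXMULT' USING WS-A, WS-B, WS-C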

Here are some examples of the functions in this little library, and examples of calls to them.

Multiply code:

And here’s a call to it:

Notice that I’ve opted to use dynamic calls to the COBOL functions, using a copybook that lists all the possible function names:

This frees me from the constraint of having to use inscrutable 8-character function names, which will get confusing as the library grows.
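A sketch of that flavour of copybook and call (the names here are invented for illustration): the load module names are still limited to 8 characters, but the code only ever refers to the readable aliases.

      *    copybook of function names, used for dynamic CALLs
       01  CPX-FUNCTIONS.
           05 COMPLEX-MULTIPLY    PIC X(8) VALUE 'CPXMULT'.
           05 COMPLEX-CONJUGATE   PIC X(8) VALUE 'CPXCONJ'.
           05 COMPLEX-DIVIDE      PIC X(8) VALUE 'CPXDIV'.

      *    a dynamic call: the target program name is whatever the
      *    identifier contains at run time, rather than a literal.
           CALL COMPLEX-MULTIPLY USING WS-A, WS-B, WS-C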

Like everything in COBOL, the verbosity makes it fairly unreadable, but refactoring all the paragraphs into external programs does make the calling code, and even the library functions themselves, much more readable.  It still takes 49 lines of code to initialize two complex numbers, multiply them, and display them to stdout.

Compare to the same thing in C++, which is 18 lines for a grow-your-own complex implementation with multiply:

#include <iostream>

struct complex{
   double re_;
   double im_;
};

complex mult(const complex & a, const complex & b) {
   // (a + b i)(c + d i) = a c - b d + i( b c + a d) 
   return complex{ a.re_ * b.re_ - a.im_ * b.im_,
                   a.im_ * b.re_ + a.re_ * b.im_ };
}

int main()
{
   complex a{1,2};
   complex b{3,4};
   complex c = mult(a, b);
   std::cout << "c = " << c.re_ << " +(" << c.im_ << ") I\n";

   return 0;
}

and only 11 lines if we use the standard library complex implementation:

#include <iostream>
#include <complex>

using complex = std::complex<double>;

int main() 
{  
   complex a{1,2}; 
   complex b{3,4};
   complex c = a * b;
   std::cout << "c = " << c << "\n";

   return 0;
}

Basically, we have one line for each operation: init, init, multiply, display, and all the rest is one-time fluff (the includes, main decl, return, …)

It turns out that the so-called OBJECT oriented COBOL extension to the language (circa Y2K) is basically a packaging of external-style programs into collections that are class prefixed, just as I’ve done above.  This provides the capability for information hiding, and allows functions to have parameters and return values.  However, it doesn’t try to rectify the fundamental failure of the COBOL language: everything has to be in a global variable.  This language extension appears to be a hack that may have been done primarily for Java integration, which is probably why nobody uses it.  You just can’t take the dinosaur out of COBOL.

Sadly, it didn’t take people long to figure out that it’s incredibly dumb to require all variables to be global.  Even PL/I, which is 59 years old at the time I write this (only five years younger than COBOL), got it right.  It added parameters and return values to functions, and allows functions to have variables that are limited to that scope.  PL/I probably went too far, and added lots of features that are also braindead (like the PL/I macro preprocessor), but the basic language is at least sane.  It’s interesting that COBOL never evolved.  A language like C++ may have evolved too much, and still is evolving, but the most flagrant design flaw in the COBOL language has been there since inception, despite every other language in the world figuring out that such stupidity should not be propagated.

Note that I work on the development of a COBOL and PL/I compilation stack.  I really like my work, which is challenging and great fun, and I work with awesome people. That doesn’t stop me from acknowledging that COBOL is a language spawned in hell by Satan. I can love my work, which provides tools for customers allowing them to develop, maintain and debug COBOL code, but also have great pity and remorse for those customers, having inherited ancient code built with an ancient language, and having no easy migration path away from that language.

The evil of COBOL: everything is in global variables

December 7, 2023 COBOL

COBOL does not have stack variables.  Everything is a global variable.  There is a loose equivalent of a function, called a paragraph, which can be called using a PERFORM statement, but a paragraph does not have any input or output variables, and no return code, so if you want it to behave like a function, you have to construct some sort of complicated naming convention using your global variables.

I’ve seen real customer COBOL programs with many thousands of global variables.  A production COBOL program is usually a giant sequence of MOVEs, MOVE A TO B, MOVE B TO C, MOVE C TO D, MOVE D TO E, … with various PERFORMs or GOTOs, or other things in between.  If you find that your variable has a bad value in it, that is probably because it has been copied from something that was copied from something, that was copied from something, that’s the output of something else, that was copied from something, 9 or 10 times.

I was toying around with the idea of coding up a COBOL implementation of 2D Euclidean geometric algebra, just as a joke, as it is surely the worst language in the world.  Yes, I work on a COBOL compiler project. The project is a lot of fun, and the people I work with are awesome, but I don’t have to like the language.

If I were to implement this simplest geometric algebra in COBOL, the logical starting place would be to implement complex numbers in COBOL first.  That is because we can use a pair of complex numbers to implement a 2D multivector, with one complex number for the vector part, and another for the scalar and pseudoscalar parts.  That technique has been detailed on this blog previously, and also in a Mathematica module Cl20.m.

Trying to implement a couple of complex number operations in COBOL got absurd really fast.  Here’s an example.  The first step was to create some complex number types.  I did that with a copybook (include file), like so:

This can be included multiple times, each time with a different name, like so:
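The original post shows the actual copybook and its inclusion as images; the gist, sketched here with invented names, is a copybook along the lines of

      *    complex number "type" copybook sketch (hypothetical)
       01  :TAG:-CPX.
           05 :TAG:-RE    USAGE IS COMP-2 VALUE 0.
           05 :TAG:-IM    USAGE IS COMP-2 VALUE 0.

pulled in once per desired variable with something like

       COPY CPXNUM REPLACING ==:TAG:== BY ==A==.
       COPY CPXNUM REPLACING ==:TAG:== BY ==B==.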

The way that I structured all my helper functions was with one set of global variables for input (at least one), and, if appropriate, one output global variable.  Here’s an example:

So, if I want to compute and display a value, I have a whole pile of stupid MOVEs to do in and out of the appropriate global variables for each of the helper routines in question:
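Again sketching with invented names (the real code is linked below), the pattern is: copy the inputs into the helper’s global input variables, PERFORM the paragraph, then copy its global output back out:

      *    C = A * B, paragraph-style: everything via global variables
           MOVE A-RE TO MULT-IN1-RE
           MOVE A-IM TO MULT-IN1-IM
           MOVE B-RE TO MULT-IN2-RE
           MOVE B-IM TO MULT-IN2-IM
           PERFORM 300-COMPLEX-MULTIPLY
           MOVE MULT-OUT-RE TO C-RE
           MOVE MULT-OUT-IM TO C-IM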

I wrote enough of this little complex number library that I could do conjugate, real, imaginary, multiply, inverse, and divide operations.  I can run that little program with the following JCL:

//COMPLEX JOB
//A EXEC PGM=COMPLEX
//SYSOUT   DD SYSOUT=*
//STEPLIB  DD DSN=PJOOT.SAMPLE.COMPLEX,
//  DISP=SHR

and get this SYSOUT:

STEP A SYSOUT:
A                    =  .10000000000000000E 01 + ( .20000000000000000E 01) I
B                    =  .30000000000000000E 01 + ( .40000000000000000E 01) I
CONJ(A)              =  .10000000000000000E 01 + (-.20000000000000000E 01) I
RE(A)                =  .10000000000000000E 01
IM(A)                =  .20000000000000000E 01
A * B                = -.50000000000000000E 01 + ( .10000000000000000E 02) I
1/A                  =  .20000000000000000E 00 + (-.40000000000000000E 00) I
A/B                  =  .44000000000000000E 00 + ( .80000000000000000E-01) I

If you would like your eyes burned further, you can access the full program on github here. It takes almost 200 lines of code to do almost nothing.

Debugging a C coding error from an XPLINK assembly listing.

March 12, 2021 C/C++ development and debugging, Mainframe

There are at least two\({}^1\) z/OS C calling conventions, the traditional “LE” OSLINK calling convention, and the newer\({}^2\) XPLINK convention.  In the LE calling convention, parameters aren’t passed in registers, but in an array pointed to by R1.  Here’s an example of an OSLINK call to strtof():

*  float strtof(const char *nptr, char **endptr);
LA       r0,ep(,r13,408)
LA       r2,buf(,r13,280)
LA       r4,#wtemp_1(,r13,416)
L        r15,=V(STRTOF)(,r3,4)
LA       r1,#MX_TEMP3(,r13,224)
ST       r4,#MX_TEMP3(,r13,224)
ST       r2,#MX_TEMP3(,r13,228)
ST       r0,#MX_TEMP3(,r13,232)
BASR     r14,r15
LD       f0,#wtemp_1(,r13,416)

R1 points to r13 + 224 (a location on the stack). If the original call was:

float f = strtof( mystring, &err );

The compiler has internally translated it into something of the form:

STRTOF( &f, mystring, &err );

where all of {&f, mystring, &err} are stuffed into the memory starting at the 224(R13) location. Afterwards the value has to be loaded from memory into a floating point register (F0) so that it can be used.  Compare this to a Linux strtof() call:

* char * e = 0;
* float x = strtof( "1.0", &e );
  400b74:       mov    $0x400ef8,%edi       ; first param is address of "1.0"
  400b79:       movq   $0x0,0x8(%rsp)       ; e = 0;
  400b82:       lea    0x8(%rsp),%rsi       ; second param is &e
  400b87:       callq  400810 <strtof@plt>  ; call the function, returning a value in %xmm0

Here the input parameters are RDI, RSI, and the output is XMM0. Nice and simple. Since XPLINK was designed for C code, we expect it to be more sensible. Let’s see what an XPLINK call looks like. Here’s a call to fmodf:

*      float r = fmodf( 10.0f, 3.0f );
            LD       f0,+CONSTANT_AREA(,r9,184)
            LD       f2,+CONSTANT_AREA(,r9,192)
            L        r7,#Save_ADA_Ptr_9(,r4,2052)
            L        r6,=A(__fmodf)(,r7,76)
            L        r5,=A(__fmodf)(,r7,72)
            BASR     r7,r6
            NOP      9
            LDR      f2,f0
            STE      f2,r(,r4,2144)
*
*      printf( "fmodf: %g\n", (double)r );

There are some curious details that would have to be explored to understand the code above (why f0, f2, and not f0, f1?), however, the short story is that all the input and output values are in (floating point) registers.

The mystery that led me to looking at this was a malfunctioning call to strtof:

*      float x = strtof( "1.0q", &e );
            LA       r2,e(,r4,2144)
            L        r7,#Save_ADA_Ptr_12(,r4,2052)
            L        r6,=A(strtof)(,r7,4)
            L        r5,=A(strtof)(,r7,0)
            LA       r1,+CONSTANT_AREA(,r9,20)
            BASR     r7,r6
            NOP      17
            LR       r0,r3
            CEFR     f2,r0
            STE      f2,x(,r4,2148)
*
*      printf( "strtof: v: %g\n", x );

The CEFR instruction converts an integer to a (hfp32) floating point representation, so we appear to have strtof returning its value in R3, which is an integer register. That then gets copied into R0, and finally into F2 (and after that into a stack spill location before the printf call.) I scratched my head about this code for quite a while, trying to figure out if the compiler had some mysterious way to make this work that I wasn’t seeing. Eventually, I clued in. I’m so used to using a C++ compiler that I forgot about the old-style implicit int return for an unprototyped function. But I had included <stdlib.h> in this code, so strtof should have been prototyped? However, the Language Runtime reference specifies that on z/OS you need an additional define to have strtof visible:

#define _ISOC99_SOURCE
#include <stdlib.h>

Without the additional define, the call to strtof() is as if it was prototyped as:

int strtof( const char *, char ** );

My expectation is that with such a correction, the call to strtof() should return its value in f0, just like fmodf() does. The result should also not be garbage!

 

Footnotes:

  1.  There is also a “metal” compiler and probably a different calling convention to go with that.  I don’t know how metal differs from XPLINK.
  2. Newer in the lifetime of the mainframe means circa 2001, which is bleeding edge given the speed that mainframe development moves.

Example of PL/I macro

January 27, 2021 Mainframe

Up until last week, PL/I macros were a bit of a mystery to me.  Most of the ones that I’d seen in customer code were impressively inscrutable, and if I had to look at any of them, my reaction was to throw my hands in the air and plead with the compiler backend guys for help.  Implementing one such macro has been very helpful for understanding how these work.

Here is a C program that roughly models some PL/I code of interest

The documentation for the ‘foo’ function says of the final return code parameter that it is 12 bytes long, and that the ‘rcvalues.h’ header file has a set of RCNNN constants and a RCCHECK macro that can be used to test for any one of those constants.  A possible C implementation of that header might look something like:

/* rcvalues.h */
#define RC000 0x0000000000000000LL
#define RC001 0x0000000123456789LL
/* ... */

#define RCCHECK( urc, crc ) ( memcmp( &(urc), &(crc), 8 ) == 0 )

PL/I APIs do not typically use modern constructs like typedefs.  The closest that I have seen is for an API header file (copybook in the mainframe lingo) to declare a variable (which becomes a local variable with a specific name in the including module), which the programmer can refer to using the LIKE keyword, as in the following example:

I believe there is also a DEFINE keyword available in newer PL/I compilers, which provides a typedef like mechanism, but most existing code probably doesn’t use such new-fangled nonsense, when cut and paste has far superior maintenance characteristics.  For that reason, the API would be unlikely to have a typedef equivalent for the return code structure.  Instead, the PL/I equivalent of the C code above, would probably look like:

(i.e. the C code is really modeled on the PL/I code of this form, and if this was a C API, the API would have a struct declaration or a typedef for the return code structure)

The RCNNN constants would actually be found as named variables (not immutable constants) in the copy book, perhaps declared something like:

I struggled a bit to figure out what the PL/I equivalent of my C RCCHECK macro would be.  The following inner function correctly did the required type casting and comparisons:

The implementation is very long, since the entire declaration of the input parameter type has to be duplicated.

If I were to put the RCCHECK implementation above into my RCVALUES.inc header file, it would only work if all the customer declarations of their return code structure objects were field-by-field compatible.  What I really want is for my RCCHECK function to take the address of the parameter, and pass that instead of the underlying type.  It was not at all obvious how to do that, but with some help, I was eventually able to construct a PL/I macro (with a helper inner-function) of the following form:

It’s clearly no longer a one liner.  Some notes on this PL/I macro:

  • The PL/I macro body looks like a regular PL/I function, but the begin-PROCEDURE and END statements start with % (% is not part of the PROC name.)
  • Macro parameters and return values are explicit strings, regardless of the types of the parameters that were actually passed.
  • In PL/I the || symbol is used for string concatenation, so this constructs output that inserts an ADDR() call around ARG1 token and then passes the ARG2 token as is.
  • I don’t know if there’s a way to implement this macro in a way that doesn’t require a helper function, and still have the output work in the context of an IF statement.
  • You have to explicitly enable the macro, using %ACTIVATE.  In my case, without %ACTIVATE, the RCCHECK symbol ended up as an undeclared external entry, and no call to the @RCCHECK_HELPER function was generated \({}^{[1]}\).
  • Observe that the PL/I macro provides a mechanism to jam whatever you want into the code, as the compiler’s macro preprocessor replaces the macro call tokens with the string that you have provided, leaving that string for the final compiler pass to interpret instead.

If I compile the code using this macro version of RCCHECK, the preprocessor output looks like:

I’m still pretty horrified at some of the macros that I’ve seen in customer code — they almost seem like the source equivalent of self modifying code.  You can’t figure out what is going on without also looking at all the output of the precompiler passes.  This is especially evil, since you can write PL/I preprocessor macros that generate preprocessor macros and require multiple preprocessor passes to produce the final desired output!

Footnotes

[1] Note that @ is a valid PL/I character to use in a symbol name, as are # and $, so if you want your functions to look like swear words, this is a language where that is possible.  Something like the following is probably valid PL/I:

V = #@$1A#@(1);

For added entertainment, your file names (i.e. PDS member names) can also be like ‘#@$1A#@’. Storing files with names like that on a Unix filesystem results in hours of fun, as you are then left with the task of figuring out how to properly quote file names with embedded $’s and #’s in scripts and makefiles.