PL/I

Example of PL/I macro

January 27, 2021 Mainframe

Up until last week, PL/I macros were a bit of a mystery to me.  Most of the ones that I’d seen in customer code were impressively inscrutable, and if I had to look at any of them, my reaction was to throw my hands in the air and plead with the compiler backend guys for help.  Implementing one such macro has been very helpful in understanding how these work.

Here is a C program that roughly models some PL/I code of interest

The documentation for the ‘foo’ function says of the final return code parameter that it is 12 bytes long, and that the ‘rcvalues.h’ header file has a set of RCNNN constants and a RCCHECK macro that can be used to test for any one of those constants.  A possible C implementation of that header might look something like:

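/* rcvalues.h */
#include <string.h> /* memcmp */

/* Objects rather than literals, so that RCCHECK can take their address. */
static const long long RC000 = 0x0000000000000000LL;
static const long long RC001 = 0x0000000123456789LL;
/* ... */

#define RCCHECK( urc, crc ) ( memcmp( &(urc), &(crc), 8 ) == 0 )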
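For illustration, a caller of such an API might look roughly like the sketch below. Everything specific in it is an assumption made for the example: the `foo` stub and its single parameter, the `rc_t` name, and the idea that the value being tested lives in the first 8 bytes of the 12-byte return code area. The RCNNN values are declared as objects rather than literals so that the macro can take their address.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for the contents of rcvalues.h.
static const long long RC000 = 0x0000000000000000LL;
static const long long RC001 = 0x0000000123456789LL;

#define RCCHECK( urc, crc ) ( std::memcmp( &(urc), &(crc), 8 ) == 0 )

// Hypothetical 12 byte return code area, as a caller might declare it.
struct rc_t {
    char bytes[12];
};

// Stub for the API: pretends that the call completed with RC001.
void foo( rc_t * rc ) {
    std::memset( rc, 0, sizeof( *rc ) );
    std::memcpy( rc, &RC001, sizeof( RC001 ) );
}

// Calls the stub, then tests the return code against two of the constants.
bool call_foo_and_check() {
    rc_t rc;
    foo( &rc );
    return RCCHECK( rc, RC001 ) && !RCCHECK( rc, RC000 );
}
```

Note that the macro only works because the caller's return code object and the constant are both addressable, which is the property the PL/I version has to reproduce.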

PL/I APIs do not typically use modern constructs like typedefs.  The closest that I have seen is for an API header file (a copybook, in mainframe lingo) to declare a variable (which becomes a local variable with a specific name in the including module), which the programmer can then refer to using the LIKE keyword, as in the following example:

I believe there is also a DEFINE keyword available in newer PL/I compilers, which provides a typedef-like mechanism, but most existing code probably doesn’t use such new-fangled nonsense, when cut and paste has far superior maintenance characteristics.  For that reason, the API would be unlikely to have a typedef equivalent for the return code structure.  Instead, the PL/I equivalent of the C code above would probably look like:

(i.e. the C code is really modeled on the PL/I code of this form, and if this was a C API, the API would have a struct declaration or a typedef for the return code structure)

The RCNNN constants would actually be found as named variables (not immutable constants) in the copybook, perhaps declared something like:

I struggled a bit to figure out what the PL/I equivalent of my C RCCHECK macro would be.  The following inner function correctly did the required type casting and comparisons:

The implementation is very long, since the entire declaration of the input parameter type has to be duplicated.

If I were to put the RCCHECK implementation above into my RCVALUES.inc header file, it would only work if all of the customer declarations of their return code structure objects were field-by-field compatible.  What I really want is for my RCCHECK function to take the address of the parameter, and pass that instead of the underlying type.  How to do that was not at all obvious, but with some help, I was eventually able to construct a PL/I macro (with a helper inner function) of the following form:

It’s clearly no longer a one liner.  Some notes on this PL/I macro:

  • The PL/I macro body looks like a regular PL/I function, but the begin-PROCEDURE and END statements start with % (% is not part of the PROC name.)
  • Macro parameters and return values are explicit strings, regardless of the types of the parameters that were actually passed.
  • In PL/I the || symbol is used for string concatenation, so this constructs output that inserts an ADDR() call around the ARG1 token and then passes the ARG2 token as is.
  • I don’t know if there’s a way to implement this macro in a way that doesn’t require a helper function, and still have the output work in the context of an IF statement.
  • You have to explicitly enable the macro, using %ACTIVATE.  In my case, without %ACTIVATE, the RCCHECK symbol ended up as an undeclared external entry, and there was no call to the @RCCHECK_HELPER function [1].
  • Observe that the PL/I macro provides a mechanism to jam whatever you want into the code, as the compiler’s macro preprocessor replaces the macro call tokens with the string that you have provided, leaving that string for the final compiler pass to interpret instead.
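The purely textual nature of this substitution can be modeled outside of PL/I. The following C++ sketch is only an analogy: `expand_rccheck` is a hypothetical name, and the exact spacing and argument order of the generated call are assumptions based on the notes above, not actual preprocessor output.

```cpp
#include <cassert>
#include <string>

// Models what the preprocessor procedure does: both arguments arrive as
// text, regardless of what they name, and the returned text replaces the
// macro invocation in the source that the compiler proper then sees.
std::string expand_rccheck( const std::string & arg1, const std::string & arg2 ) {
    // Wrap the first token in ADDR(), pass the second token through as is.
    return "@RCCHECK_HELPER( ADDR(" + arg1 + "), " + arg2 + " )";
}
```

Because the result is just text, nothing checks it until the compiler pass parses the expanded source, which is what makes this mechanism both flexible and hard to read.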

If I compile the code using this macro version of RCCHECK, the preprocessor output looks like:

I’m still pretty horrified at some of the macros that I’ve seen in customer code — they almost seem like the source equivalent of self modifying code.  You can’t figure out what is going on without also looking at all the output of the precompiler passes.  This is especially evil, since you can write PL/I preprocessor macros that generate preprocessor macros and require multiple preprocessor passes to produce the final desired output!

Footnotes

[1] Note that @ is a valid PL/I character to use in a symbol name, as are # and $, so if you want your functions to look like swear words, this is a language where that is possible.  Something like the following is probably valid PL/I:

V = #@$1A#@(1);

For added entertainment, your file names (i.e. PDS member names) can also be like ‘#@$1A#@’. Storing files with names like that on a Unix filesystem results in hours of fun, as you are then left with the task of figuring out how to properly quote file names with embedded $’s and #’s in scripts and makefiles.

Mainframe development: a story, chapter 1.

April 19, 2018 Mainframe

Once upon a time, in a land far from any modern developers, were languages named COBOL and PL/I, which generated programs that were consumed by a beast known as Mainframe. Developers for those languages compiled and linked their applications huddled around strange luminous green screens and piles of hole-filled papers while chanting vaguely Latin-sounding incantations like “Om-padre-JCL-beget-loadmodule-pee-dee-ess.”

In these ancient times, version control tools like git were not available. There was no notion of makefiles, so compilation and link was a batch process, with no dependency tracking, and no parallelism. Developers used printf-style debugging, logging trace information to files.  In order to keep the uninitiated from talking to the Mainframe, files were called datasets.  In order to use graphical editors, developers had to repeatedly feed their source to the Mainframe using a slave named ftp, while praying that the evil demon EBCDIC-conversion didn’t mangle their work. The next day, they could go back and see if Mainframe accepted their offering.

[TO BE CONTINUED.]

Incidentally, as of a couple days ago, I’ve now been working for LzLabs for 2 years.  My work is not yet released, nor announced, so I can’t discuss it here yet, but it can be summarized as really awesome.  I’m still having lots of fun with my development work, even if I have to talk in languages that the beast understands.

Unpacking a PL/I VSAM keyed write loop.

June 7, 2017 Mainframe

I found myself faced with the task of understanding the effects of a PL/I WRITE loop that does the initial sequential load of a VSAM DATASET.  The block of code I was looking at had the following declarations:

     dcl IXUPD FILE UNBUFFERED KEYED env(vsam);
     dcl
        01 recArea,
            03 recPrefix,
                05 recID        PIC'(4)9' init (0),
                05 recKeyC      CHAR (4)  init (' '),
            03 recordData       CHAR (70) init (' ');

     dcl recIndx FIXED BIN(31) INITIAL(0);

     dcl keyListSize fixed bin(31) initial(10);
     dcl keyList(10) char(8);

As a C++ programmer, there are a few oddities here:

  • Options for the FILE are specified at the file declaration point (or can be), not at the OPEN point.  They can also be specified at the OPEN point.  The designers of PL/I seem to have been guided by the general principle of “why have one way of doing something, when it can be done in an infinite variety of possible ways”.
  • There is a hybrid “structure & variable” declaration above.  recArea is like an object of an unnamed structure, containing nested parts (with lots of ugly COBOL-like nesting specifications to show the depth of the various “structure” members).  It’s something like the following struct declaration (with C++11 default initializer specifiers):

    #include <stdio.h>
    
    int main() {
        struct {
            struct {
                char recID[4]{'0', '0', '0', '0'};
                char recKeyC[4]{' ', ' ', ' ', ' '};
            } recPrefix;
            char recordData[70]{ ' ', ' ', /* ... 70 spaces total */ };
        } recArea;
    
        printf( "recID: %.4s\n", recArea.recPrefix.recID );
        printf( "recKeyC: '%.4s'\n", recArea.recPrefix.recKeyC );
    
        return 0;
    }
    

    To PL/I’s credit, only ~45 years after the creation of PL/I did C++ add a simple way of encoding default structure member initializers.

    We’ll see below that PL/I lets you access the inner members without any qualification if desired (i.e. recID == recArea.recPrefix.recID). The PL/I compiler writer is basically faced with the unenviable task of continually trying to guess what the programmer could have possibly meant.

  • The int32_t types have the annoying “mainframe”ism of being referred to as 31-bit integers (FIXED BIN(31)). Even if the high bit in pointers is ignored by the hardware (allowing the programmer to set 0x80000000 as a flag, for example for end of list in a list of pointers), that doesn’t mean that the registers aren’t fully 32-bit, nor does it mean that a 32-bit integer isn’t representable. I can’t for the life of me understand why a 32-bit integer variable should be declared as FIXED BINARY(31).
  • The recID variable is declared with a PICTURE specification, as we also saw in COBOL code. PIC '9999' (or PIC'(4)9', for “short”) means that the character array will have four (EBCDIC) digits in it. I don’t quite understand this specification in this case, since the code (to follow) seems to put ‘RNNN’ in this field, where N is a digit.
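One possible explanation for the FIXED BIN(31) spelling, offered here as an assumption rather than as a quote from the language reference, is that the precision counts only the value digits and not the sign bit. C++ exposes the same count for its own 32-bit signed type, as this small sketch shows (the names `value_bits` and `max_fixed_bin_31` are made up for the example):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Number of value (non-sign) bits in a signed 32-bit integer.
constexpr int value_bits = std::numeric_limits<std::int32_t>::digits;

// Largest value representable with that many value bits: 2^31 - 1.
constexpr std::int32_t max_fixed_bin_31 = std::numeric_limits<std::int32_t>::max();

static_assert( value_bits == 31, "the sign bit is not counted" );
static_assert( max_fixed_bin_31 == 2147483647, "2^31 - 1" );
```

Under that reading, a FIXED BIN(31) variable and an int32_t would describe the same 4 bytes of storage, just counted differently.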

Here’s how the declarations above are used:

    
     keyList(1) = 'R001';
     keyList(2) = 'R002';
...
     OPEN FILE(IXUPD) OUTPUT;

     put skip list ('====== Write record to file by key.');
     do while (recIndx < keyListSize);
        recIndx = recIndx + 1;
        recID = recIndx;
        recKeyC = 'Abcd';
        recordData = 'Data for ' || keyList(recIndx);
        write FILE(IXUPD) FROM(recArea) KEYFROM(keyList(recIndx));
     end;
     put skip list (recIndx, ' records is written to file by key.');

     CLOSE FILE(IXUPD);

My guess about what this ‘WRITE FROM(recArea)’ would do is to create records of the form:

0001AbcdData for R001
0002AbcdData for R002
0003AbcdData for R003
...

However, the VSAM DATASET (which was created with key offset 0, and key size 8), actually ends up with:

R001    Data for R001
R002    Data for R002
R003    Data for R003
...

Despite the fact that we are writing from recArea, which includes the recID and recKeyC fields (numeric and character respectively), only the non-key portion of the WRITE “data payload” ends up hitting the disk.

If that is the case, where do the spaces in the key-portion of the records come from? Again, the C programmer in me is interfering with my understanding. I look at:

dcl keyList(10) char(8);
keyList(1) = 'R001';

and think that keyList(1) = “R001\x00\x00\x00\x00”, but it must actually be space filled in PL/I! This seems to be confirmed empirically, based on the expected results for the test, but I can also see it in the debugger after manually relocating the 32-bit mainframe address:

(gdb) p keyLen
$1 = 8
(gdb) p /x aKey + 0x7ffbc4000000
$2 = 0x7ffbc5005740
(gdb) set target-charset EBCDIC-US
(gdb) p (char *)$2
$3 = 0x7ffbc5005740 "R001    R002    R003    R004    R005    R006    R007    R008    R009    R010    "
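The blank padding seen in the debugger can be modeled with a small C++ helper. This is a sketch of the assignment semantics as observed above, not of how any PL/I runtime implements them, and `assign_fixed_char` is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Models assignment to a fixed-length CHAR(width) variable: shorter values
// are padded on the right with blanks, and longer values are cut to fit.
std::string assign_fixed_char( const std::string & value, std::size_t width ) {
    std::string result = value.substr( 0, width );
    result.resize( width, ' ' );
    return result;
}
```

With this model, assigning 'R001' to a CHAR(8) element gives 'R001' followed by four blanks, which matches the keyList contents in the gdb output.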

The final form of the records in the VSAM DATASET (mainframe for a file) is now fully understood. Note that the data disagrees with the PICTURE specification for the recID field in the recArea variable declaration, but that’s okay, at least for this part of the program, since there is never any store to that field that is non-numeric. Would anything even have to have been written to recID or recKeyC? I suspect not. Once we have R00N in that part of the record, what happens if we read it into recArea with the numeric-only PICTURE specification? Does that raise a PL/I condition?
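Putting the pieces together, the observed record contents can be reproduced with the following C++ model. It is a sketch of the behaviour seen in this test only: the function names are hypothetical, and the key offset and size are the ones this dataset was created with.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

const std::size_t KEY_OFFSET = 0; // key position defined for the dataset
const std::size_t KEY_SIZE = 8;   // key length defined for the dataset

// Blank pad (or cut) a value to a fixed width, like a CHAR(n) assignment.
std::string pad( const std::string & value, std::size_t width ) {
    std::string result = value.substr( 0, width );
    result.resize( width, ' ' );
    return result;
}

// What FROM(recArea) supplies: PIC'(4)9' recID, CHAR(4) recKeyC,
// CHAR(70) recordData, for 78 bytes in total.
std::string build_rec_area( int recID, const std::string & recKeyC,
                            const std::string & recordData ) {
    char digits[16];
    std::snprintf( digits, sizeof digits, "%04d", recID % 10000 );
    return std::string( digits ) + pad( recKeyC, 4 ) + pad( recordData, 70 );
}

// Models the observed result of the keyed WRITE: the KEYFROM value
// replaces whatever the record area held in the key field.
std::string write_keyed( std::string record, const std::string & keyFrom ) {
    record.replace( KEY_OFFSET, KEY_SIZE, pad( keyFrom, KEY_SIZE ) );
    return record;
}
```

In this model `build_rec_area( 1, "Abcd", "Data for R001" )` starts with `0001AbcdData for R001`, my original guess, while passing that through `write_keyed` with the key `R001` starts with `R001    Data for R001`, which is what actually landed in the dataset.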

ps. Notice how the payload for the keyList array entries is nicely packed into memory. This is done in a very non-C-like fashion, with no requirement for an array of pointers, and none of the cache misses that those pointers cause when accessing a big multilevel C array.

Still amused reading my PL/1 book: external storage.

May 28, 2017 Mainframe

The z/OS Enterprise PL/I Language Reference is the primary reference I have been using for the PL/I that I’ve had to learn, but it is too modern, and not nearly as fun as the 1970s-era “PL/I structured programming book” I’ve got:

 

I’m not sure what a disk pack is, but I presume it is a predecessor to the hard drive.

Edit: Art Kaufmann, who I worked with at IBM, knew what a disk pack was (picture above from wikipedia):

“Back in the day, disk drives used removable media called “disk packs.” These were stacks of disks (usually about 2′ across) on a spindle with a plastic cover. See the IBM 2311 and 2314 for examples of these drives. You’d open the drive, lower the pack into place, twist the handle and remove the cover, then close the drive. The big risk was getting dust in when the cover was off; that would cause a head crash. Then some nitwit operator would either put a different disk pack in that drive (ruining that pack) or move the bad pack to a new drive, crashing that one. Or both.”