Mainframe

I see mainframes: a real life PDS container!

September 22, 2017 Mainframe

While walking about my neighbourhood this morning, I found a PDS container:

 

Just like the mainframe version, you can put all sorts of stuff in this one.

A mainframe PDS (partitioned data set) is technically a different sort of container, as you can only put DATASETs (mainframe’ze for files) in it. For example, if you have two programs (load modules in mainframe’ze) both named PEETERJO, then you can create two PDS datasets, each having a PEETERJO member, say:

PEETER.JOOT.IS.THE.BEST(PEETERJO)
PEETER.JOOT.IS.STILL.AWESOME(PEETERJO)

From these you could then choose which one you want your JCL script to execute with a STEPLIB statement like:

//A EXEC PGM=PEETERJO
//STEPLIB  DD DSN=PEETER.JOOT.IS.THE.BEST,DISP=SHR
//SYSOUT   DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSTERM  DD SYSOUT=*
//SYSABEND DD SYSOUT=*

This works around the global name space issue that you’d have with storing two different datasets, both with the name PEETERJO.

You can also put any file into a PDS, provided you are willing to have the PDS member name for that file be a 1-8 character string. The PDS is sort of the mainframe equivalent of a directory (the long strings of A.B.C.D.E DATASET names can also be viewed as a directory of sorts).
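The directory analogy can be made concrete with a toy model. This is only a sketch: `Pds`, `Catalog`, `validMemberName` and `findProgram` are hypothetical names invented for illustration, not z/OS APIs, and the only member-name rule checked is the 1-8 character length mentioned above.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model: a PDS maps member names to contents, and a catalog
// maps dataset names to PDSs.  None of these names are real z/OS APIs.
using Pds = std::map<std::string, std::string>;   // member name -> contents
using Catalog = std::map<std::string, Pds>;       // dataset name -> PDS

// Only the length rule from the text is checked here (1-8 characters).
bool validMemberName(const std::string& name) {
    return !name.empty() && name.size() <= 8;
}

// Search the listed libraries in order, the way a STEPLIB selects which
// PEETERJO gets run.  Returns nullptr if no library has the member.
const std::string* findProgram(const Catalog& catalog,
                               const std::vector<std::string>& steplib,
                               const std::string& member) {
    for (const std::string& dsn : steplib) {
        Catalog::const_iterator pds = catalog.find(dsn);
        if (pds == catalog.end()) continue;
        Pds::const_iterator m = pds->second.find(member);
        if (m != pds->second.end()) return &m->second;
    }
    return nullptr;
}
```

With two catalog entries that each hold a PEETERJO member, the library list passed to `findProgram` decides which one is found, which is the same role the STEPLIB DD plays in the JCL above.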

I’m not sure if you can put a PDS in a PDS. If that is possible, I also don’t know if a PDS member can be accessed as a PDS without first copying it out.

Unpacking a PL/I VSAM keyed write loop.

June 7, 2017 Mainframe

I found myself faced with the task of understanding the effects of a PL/I WRITE loop that does the initial sequential load of a VSAM DATASET.  The block of code I was looking at had the following declarations:

     dcl IXUPD FILE UNBUFFERED KEYED env(vsam);
     dcl
        01 recArea,
            03 recPrefix,
                05 recID        PIC'(4)9' init (0),
                05 recKeyC      CHAR (4)  init (' '),
            03 recordData       CHAR (70) init (' ');

     dcl recIndx FIXED BIN(31) INITIAL(0);

     dcl keyListSize fixed bin(31) initial(10);
     dcl keyList(10) char(8);

As a C++ programmer, I see a few oddities here:

  • Options for the FILE are specified at the file declaration point (or can be), not at the OPEN point.  They can also be specified at the OPEN point.  The designers of PL/I seem to have been guided by the general principle of “why have one way of doing something, when it can be done in an infinite variety of possible ways”.
  • There is a hybrid “structure & variable” declaration above.  recArea is like an object of an unnamed structure, containing nested parts (with lots of ugly COBOL-like nesting specifications to show the depth of the various “structure” members).  It’s something like the following struct declaration (with C++11 default member initializers):

    #include <stdio.h>
    
    int main() {
        struct {
            struct {
                char recID[4]{'0', '0', '0', '0'};
                char recKeyC[4]{' ', ' ', ' ', ' '};
            } recPrefix;
            char recordData[70]{ ' ', ' ', /* ... 70 spaces total */ };
        } recArea;
    
        printf( "recID: %.4s\n", recArea.recPrefix.recID );
        printf( "recKeyC: '%.4s'\n", recArea.recPrefix.recKeyC );
    
        return 0;
    }
    

    To PL/I’s credit, only ~45 years after the creation of PL/I did C++ add a simple way of encoding default structure member initializers.

    We’ll see below that PL/I lets you access the inner members without any qualification if desired (i.e. recID == recArea.recPrefix.recID). The PL/I compiler writer is basically faced with the unenviable task of continually trying to guess what the programmer could have possibly meant.

  • The int32_t types have the annoying “mainframe”ism of being referred to as 31-bit integers (FIXED BIN(31)). Even if the high bit in pointers is ignored by the hardware (allowing the programmer to set 0x80000000 as a flag, for example for end of list in a list of pointers), that doesn’t mean that the registers aren’t fully 32-bit, nor does it mean that a 32-bit integer isn’t representable. I can’t for the life of me understand why a 32-bit integer variable should be declared as FIXED BINARY(31).
  • The recID variable is declared with a PICTURE specification, as we also saw in COBOL code. PIC ‘9999’ (or PIC'(4)9', for “short”), means that the character array will have four (EBCDIC) digits in it. I don’t quite understand this specification in this case, since the code (to follow) seems to put ‘RNNN’ (where N is a digit) in this field.
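On the FIXED BIN(31) point, one possible reading (an assumption on my part, not something stated in the code above) is that the precision counts binary digits of magnitude and leaves out the sign bit. C++ happens to use the same convention for `std::numeric_limits<T>::digits`, which the following sketch checks at compile time. The names `valueBits` and `largest` are just illustrative.

```cpp
#include <cstdint>
#include <limits>

// numeric_limits<T>::digits counts the value bits of T, excluding the
// sign bit.  For a 32-bit signed integer that is 31, the same number that
// appears in FIXED BIN(31) if the precision is read as "31 binary digits
// plus a sign" (that reading is an assumption).
constexpr int valueBits = std::numeric_limits<std::int32_t>::digits;
constexpr std::int32_t largest = std::numeric_limits<std::int32_t>::max();

static_assert(valueBits == 31, "a 32-bit signed integer has 31 value bits");
static_assert(largest == 2147483647, "largest value is 2^31 - 1");
```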

Here’s how the declarations above are used:

    
     keyList(1) = 'R001';
     keyList(2) = 'R002';
...
     OPEN FILE(IXUPD) OUTPUT;

     put skip list ('====== Write record to file by key.');
     do while (recIndx < keyListSize);
        recIndx = recIndx + 1;
        recID = recIndx;
        recKeyC = 'Abcd';
        recordData = 'Data for ' || keyList(recIndx);
        write FILE(IXUPD) FROM(recArea) KEYFROM(keyList(recIndx));
     end;
     put skip list (recIndx, ' records is written to file by key.');

     CLOSE FILE(IXUPD);

My guess about what this ‘WRITE FROM(recArea)’ would do is to create records of the form:

0001AbcdData for R001
0002AbcdData for R002
0003AbcdData for R003
...

However, the VSAM DATASET (which was created with key offset 0, and key size 8), actually ends up with:

R001    Data for R001
R002    Data for R002
R003    Data for R003
...

Despite the fact that we are writing from recArea, which includes the recID and recKeyC fields (numeric and character respectively), only the non-key portion of the WRITE “data payload” ends up hitting the disk.

If that is the case, where do the spaces in the key-portion of the records come from? Again, the C programmer in me is interfering with my understanding. I look at:

dcl keyList(10) char(8);
keyList(1) = 'R001';

and think that keyList(1) = “R001\x00\x00\x00\x00”, but it must actually be space filled in PL/I! This seems to be confirmed empirically, based on the expected results for the test, but I can also see it in the debugger after manually relocating the 32-bit mainframe address:

(gdb) p keyLen
$1 = 8
(gdb) p /x aKey + 0x7ffbc4000000
$2 = 0x7ffbc5005740
(gdb) set target-charset EBCDIC-US
(gdb) p (char *)$2
$3 = 0x7ffbc5005740 "R001    R002    R003    R004    R005    R006    R007    R008    R009    R010    "
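The blank padding seen in the debugger can be mimicked in C++ with a small helper. This is a sketch of the observed behaviour only; `assignFixed` is a hypothetical name, and the truncation of over-long values is my assumption about what a fixed-length CHAR(n) target does.

```cpp
#include <string>

// Emulates assignment to a fixed-length CHAR(n) variable: the result is
// always exactly n characters, padded on the right with blanks (and, by
// assumption, truncated if the source is too long).
std::string assignFixed(const std::string& value, std::size_t n) {
    std::string out = value.substr(0, n);
    out.resize(n, ' ');
    return out;
}
```

Here `assignFixed("R001", 8)` gives the eight characters `R001` followed by four blanks, matching the first array entry in the gdb output.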

The final form of the records in the VSAM DATASET (mainframe for a file) is now fully understood. Note that the data disagrees with the PICTURE specification for the recID field in the recArea variable declaration, but that’s okay, at least for this part of the program, since there is never any store to that field that is non-numeric. Would anything even have had to be written to recID or recKeyC? I suspect not. Once we have R00N in that part of the record, what happens if we read it into recArea with the numeric-only PICTURE specification? Does that raise a PL/I condition?
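Putting the pieces together, the observed record contents are consistent with the runtime building the record from recArea and then overlaying the blank-padded KEYFROM value at the key position. The sketch below models that guess; it is inferred from the output above, not taken from any PL/I or VSAM documentation, and `padTo`, `buildRecArea` and `overlayKey` are hypothetical helpers.

```cpp
#include <cstdio>
#include <string>

// Pad (or truncate) to exactly n characters, blank filled.
std::string padTo(const std::string& s, std::size_t n) {
    std::string out = s.substr(0, n);
    out.resize(n, ' ');
    return out;
}

// The 78 byte image of recArea: recID as four digits (PIC'(4)9'),
// recKeyC as CHAR(4), and recordData as CHAR(70).
std::string buildRecArea(int recID, const std::string& recKeyC,
                         const std::string& recordData) {
    char digits[16];
    std::snprintf(digits, sizeof digits, "%04d", recID % 10000);
    return std::string(digits) + padTo(recKeyC, 4) + padTo(recordData, 70);
}

// Assumption: the KEYFROM value, padded to the key length, replaces the
// bytes at the key offset (0 and 8 for this DATASET).
std::string overlayKey(std::string record, const std::string& key,
                       std::size_t keyOffset, std::size_t keyLen) {
    record.replace(keyOffset, keyLen, padTo(key, keyLen));
    return record;
}
```

Under this model, `buildRecArea(1, "Abcd", "Data for R001")` starts with `0001AbcdData for R001` (my original guess), and `overlayKey(..., "R001", 0, 8)` turns that into `R001    Data for R001`, which is what actually landed in the file.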

PS. Notice how the payload for the keyList array entries is nicely packed into memory. This is done in a very non-C-like fashion, with no requirement for an array of pointers, and without the cache misses that those pointers cause when accessing a big multilevel C array.

Still amused reading my PL/1 book: external storage.

May 28, 2017 Mainframe

The z/OS Enterprise PL/I, Language Reference is the primary reference I have been using for the PL/1 that I’ve had to learn, but it is too modern, and not nearly as fun as the 1970’s era “PL/I structured programming book” I’ve got:

 

I’m not sure what a disk pack is, but I presume it is a predecessor to the hard drive.

Edit: Art Kaufmann, who I worked with at IBM, knew what a disk pack was (picture above from wikipedia):

“Back in the day, disk drives used removable media called “disk packs.” These were stacks of disks (usually about 2′ across) on a spindle with a plastic cover. See the IBM 2311 and 2314 for examples of these drives. You’d open the drive, lower the pack into place, twist the handle and remove the cover, then close the drive. The big risk was getting dust in when the cover was off; that would cause a head crash. Then some nitwit operator would either put a different disk pack in that drive (ruining that pack) or move the bad pack to a new drive, crashing that one. Or both.”

 

 

 

 

COBOL code! Where’s the eyewash station?

March 20, 2017 Mainframe

In code that I am writing for work, I’m calling into COBOL code from C, and in order to set up the parameters and interpret the results, I have to know a little bit about how variables are declared in COBOL. I got an explanation of a little bit of COBOL syntax today that takes some of the mystery away.

Here’s the equivalent of something like a declaration of compile time constant variables in COBOL, a hierarchical beast something akin to a structure:

004500 01  CONSTANT-VALUES.                                             ORIG_SRC
004600     02  AN-CONSTANT PIC X(5) VALUE "IC104".                      ORIG_SRC
004700     02  NUM-CONSTANT PIC 99V9999 VALUE 0.7654.                   ORIG_SRC

This is roughly the equivalent of the following pseudo-C++11:

struct
{
   char AN_CONSTANT[5]{'I','C','1','0','4'};
   // PIC 99V9999: six digits, with the decimal point (V) implied, not stored.
   char NUM_CONSTANT[6]{'0', '0', '7', '6', '5', '4'};
} ;

Some points:

  • The first 6 characters are source sequence numbers.  They aren’t line numbers like in BASIC (ie. you wouldn’t do a ‘goto 004500’), but were related to punch cards to make sure that out of sequence cards weren’t inserted into the card reader, or a card wasn’t fed into the reader by the operator by accident.
  • The ‘ORIG_SRC’ stuff in column 73+ is ignored.  These columns are also related to punch cards, as an additional card sequence number could be encoded in those locations.
  • The 01 indicates the first level of the ‘structure’.
  • The 02 means a second level.  I don’t know if the indenting of the 01, 02 is significant, but I suspect not.
  • PIC or PICTURE basically means the structure line is a variable and not the name of a new level.
  • A sequence of 9’s means that the variable takes numeric digits in those locations, whereas the V marks the position of an implied decimal point, which takes up no storage of its own.
  • A sequence of X’s (or the X(5) here that means XXXXX), means that those characters can be alphanumeric.
  • There is no reference to ‘CONSTANT-VALUES’ when the variables are referenced.  That is like a namespace of some sort.
  • The level indicators 01, 02 are arbitrary, but have to be in the range 01 to 49 (numbers such as 66, 77 and 88 are reserved for special purposes; 77 shows up below).  For example 05, 10 could have been used, so that another level could have been inserted in between without renumbering things.

The 01, 02 level indicators are also used for global variable declarations, also somewhat struct like:

004900 01  GRP-01.                                                      ORIG_SRC
005000     02  AN-FIELD PICTURE X(5).                                   ORIG_SRC
005100     02  NUM-DISPLAY PIC 99.                                      ORIG_SRC
005200     02  GRP-LEVEL.                                               ORIG_SRC
005300         03  A-FIELD PICTURE A(3).                                ORIG_SRC

This might be considered equivalent to:

struct
{
   char AN_FIELD[5];
   char NUM_DISPLAY[2];
   struct {
      char A_FIELD[3];
   } GRP_LEVEL;
} GRP_01;

Here:

  • A(3), equivalent to AAA, means the field can hold alphabetic values (letters or spaces).
  • The name ‘GRP-LEVEL’ header for the 03 structure level is not referenced in the code.

It is also possible to declare a variable as binary, like so:

005400 77  ELEM-01 PIC  V9(4) COMPUTATIONAL.                            ORIG_SRC
  • Here 77 is a special magic level number, that really means what follows is a variable and not a “structure”.
  • The V here means an implied decimal place in the interpretation of the value.
  • The 9(4), equivalent to 9999, means the variable must be able to hold 4 numeric digits.
  • The COMPUTATIONAL means the underlying variable must be able to hold a value as big as 9999.  i.e. a short or unsigned short must be used, and not a char or unsigned char.
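The choice of a short for a four digit COMPUTATIONAL item follows the usual IBM mainframe mapping from digit count to binary storage size (halfword, fullword, doubleword). The sketch below encodes that mapping as I understand it; `compBytes` is a hypothetical name, and other COBOL compilers may size binary items differently.

```cpp
#include <cstddef>

// Bytes of storage for a binary (COMPUTATIONAL) item with the given number
// of 9s in its PICTURE, assuming the IBM mainframe convention:
//   1-4 digits   -> halfword   (2 bytes)
//   5-9 digits   -> fullword   (4 bytes)
//   10-18 digits -> doubleword (8 bytes)
std::size_t compBytes(int digits) {
    if (digits <= 4) return 2;
    if (digits <= 9) return 4;
    return 8;
}
```

So ELEM-01, declared as V9(4), lands in the 2 byte case, which is why a uint16_t is a reasonable C-side type for it.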

The final variable group in the code I was looking at was:

005500 01  GRP-02.                                                      ORIG_SRC
005600     02  GRP-03.                                                  ORIG_SRC
005700         03  NUM-ITEM PICTURE S99.                                ORIG_SRC
005800         03  EDITED-FIELD  PIC XXBX0X.                            ORIG_SRC

which is roughly equivalent to:

struct
{  
   struct {
      char NUM_ITEM[2];
      struct
      {
         char digits1[2];
         char blank1[1]{' '};
         char digits2[1];
         char zero1[1]{'0'};
         char digits3[1];
      } EDITED_FIELD;
   } GRP_03;
} GRP_02;         

Here

  • EDITED-FIELD includes fixed blank and zero markers (B, 0 respectively).  When a four character variable is copied into this field, only the character positions that are not reserved for the blank and the zero are touched.
  • NUM-ITEM is a signed numeric value.  Its representation is strange:

The signed representation is also char based, and uses what is referred to as an “over-punch” to encode the sign.  The normal (EBCDIC) encoding of a two digit variable 42 without a sign, would be:

‘4’, ‘2’ == 0xF4, 0xF2

When the S modifier is used in the PICTURE declaration, the F in the EBCDIC encoding of the last digit is changed to either C or D, for positive and negative values respectively.  That means that +42 is encoded as:

0xF4, 0xC2

whereas the negative value “-42” is encoded as:

0xF4, 0xD2
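The over-punch rule is easy to express in code. The following is a sketch that produces the EBCDIC bytes for a signed zoned value with the sign carried in the last digit, as in the examples above; `zonedSigned` is a hypothetical helper name, and it assumes `digits` is at least 1 and large enough to hold the value.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Encode `value` as `digits` EBCDIC zoned-decimal bytes.  Every digit gets
// the 0xF zone, except the last one, whose zone is replaced by 0xC for a
// positive value or 0xD for a negative one (the "over-punch").
std::vector<std::uint8_t> zonedSigned(int value, int digits) {
    std::vector<std::uint8_t> out(digits);
    int magnitude = std::abs(value);
    for (int i = digits - 1; i >= 0; --i) {
        out[i] = static_cast<std::uint8_t>(0xF0 | (magnitude % 10));
        magnitude /= 10;
    }
    const std::uint8_t zone = (value < 0) ? 0xD0 : 0xC0;
    out[digits - 1] = static_cast<std::uint8_t>(zone | (out[digits - 1] & 0x0F));
    return out;
}
```

`zonedSigned(42, 2)` yields 0xF4, 0xC2 and `zonedSigned(-42, 2)` yields 0xF4, 0xD2, matching the two encodings shown above.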

The procedure prototype, specifically the list of parameters to the function, is given in a ‘PROCEDURE DIVISION’ block, like so:

005900 PROCEDURE DIVISION USING GRP-01 ELEM-01 GRP-02. 

Here

  • The first 6 characters are still just punch card junk.
  • Three variables are passed to and from the function: GRP-01, ELEM-01, GRP-02.  These are 10, 2, and 8 bytes respectively.
  • On the mainframe the COBOL function could be called with R1 something like:

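#include <stdint.h>
#include <string.h>

struct parms {
    void * pointers[3];
    char ten[10];
    uint16_t h;
    char eight[8];
};

//...
struct parms p;

p.pointers[0] = &p.ten[0];
p.pointers[1] = &p.h;
// Flag the last pointer with the high bit to mark the end of the list.
p.pointers[2] = (void *)((uintptr_t)&p.eight[0] | 0x80000000);

strncpy( p.ten, "XXXXX00ZZZ", 10 );
p.h = 0;
strncpy( p.eight, "99XXBX0X", 8 );

// setregister is pseudocode for loading R1 with the address of the list.
setregister( R1, &p );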

The 0x80000000 is the mainframe “31-bit” way of indicating the end of list. It relies on the fact that virtual memory addresses in 32-bit z/OS processes have only 31 bits of addressable space available, so you can hack one extra bit into a pointer to indicate the end of a list of pointers.
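From the callee's point of view, the end-of-list bit is what makes it possible to walk the list without knowing its length in advance. Here is a sketch of that walk, modelling the 31-bit addresses as plain 32-bit words rather than real pointers; `END_OF_LIST` and `readParameterList` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

const std::uint32_t END_OF_LIST = 0x80000000u;

// Reads address words until one has the high bit set, and returns the
// addresses with that flag bit stripped.  `list` stands in for what R1
// would point at; the caller must guarantee the last word is flagged.
std::vector<std::uint32_t> readParameterList(const std::uint32_t* list) {
    std::vector<std::uint32_t> addresses;
    for (;;) {
        const std::uint32_t word = *list++;
        addresses.push_back(word & ~END_OF_LIST);
        if (word & END_OF_LIST) break;
    }
    return addresses;
}
```

For a three entry list whose last word is 0x8000100C, the function returns three addresses, the last of which is 0x0000100C.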

Suppose the program has statements like the following to populate its output fields:

006400 MOVE AN-CONSTANT TO AN-FIELD. 
006500 ADD 25 TO NUM-DISPLAY. 
006600 MOVE "YES" TO A-FIELD. 
006700 MOVE NUM-CONSTANT TO ELEM-01. 
006800 MOVE NUM-DISPLAY TO NUM-ITEM. 
006900 MOVE "ABCD" TO EDITED-FIELD. 

The results of this are roughly:

strncpy( p.ten, "IC104", 5 ); // MOVE AN-CONSTANT TO AN-FIELD (GRP-01)
strncpy( p.ten + 5, "25", 2 ); // ADD 25 TO NUM-DISPLAY (GRP-01): since the initial value was "00"
strncpy( p.ten + 7, "YES", 3 ); // MOVE "YES" TO A-FIELD. 
p.h = 7654; // MOVE NUM-CONSTANT TO ELEM-01. 
strncpy( p.eight, "25", 2 ); // MOVE NUM-DISPLAY TO NUM-ITEM (ignoring the sign over-punch on the last digit). 
strncpy( p.eight + 2, "AB C0D", 6 ); // MOVE "ABCD" TO EDITED-FIELD. 

It appears that the assignment of NUM-CONSTANT, a number of the form 99.9999, to the numeric value ELEM-01, which is of the form .9999, just truncates any whole portion of the number.
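Two of those moves are worth sketching in code: the move into the edited field, and the decimal-aligned move that drops the whole part. Both functions below are hypothetical helpers that model only the behaviour described in this post (X, B and 0 picture characters, and unsigned values held as scaled integers); they are not a general implementation of COBOL MOVE.

```cpp
#include <string>

// MOVE of an alphanumeric value into an edited picture such as XXBX0X:
// each X takes the next source character (blank once the source runs
// out), B inserts a blank, and 0 inserts a zero character.
std::string moveEdited(const std::string& source, const std::string& picture) {
    std::string out;
    std::size_t next = 0;
    for (char p : picture) {
        if (p == 'B') out += ' ';
        else if (p == '0') out += '0';
        else out += (next < source.size()) ? source[next++] : ' ';
    }
    return out;
}

// MOVE of a value with `fractionDigits` decimals (held as a scaled
// integer, so 0.7654 is 7654) into a field of the form V9(fractionDigits):
// the fraction digits are kept and any whole part is dropped.
long moveToFraction(long scaledValue, int fractionDigits) {
    long modulus = 1;
    for (int i = 0; i < fractionDigits; ++i) modulus *= 10;
    return scaledValue % modulus;
}
```

`moveEdited("ABCD", "XXBX0X")` gives `AB C0D`, and `moveToFraction(7654, 4)` gives 7654, the value that ended up in ELEM-01. A value with a non-zero whole part, such as 12.7654 held as 127654, would also come out as 7654.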

VSAM creation and population with JCL and IDCAMS

March 7, 2017 Mainframe

I learned a few JCL DATASET related things yesterday that seemed notable, at least for a JCL newbie.

Delete a DATASET, and ignore any error.

Each time I’ve wanted a DATASET cleanup step in JCL I’ve been using a separate script, and running that first.  A better way of doing this is to include an IDCAMS job step in the script, and have that do the deletion:

//CLEANUP EXEC PGM=IDCAMS
//SYSIN DD *
  DELETE PJOOT.XXXXX005
  SET MAXCC = 0
/*
//SYSPRINT DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//*SYSTERM  DD SYSOUT=*

This deletes the file PJOOT.XXXXX005, which in this case was a VSAM file. In case that file (a DATASET in mainframe-eze) did not exist, the error code for that DELETE is ignored by setting MAXCC=0. If you have multiple things that you want to do with IDCAMS, you can combine them in one step, for example a DELETE followed immediately by an ALLOCATE (DEFINE CLUSTER), such as:

//REALLOC EXEC PGM=IDCAMS
//SYSIN DD *
  DELETE PJOOT.XXXXX005
  SET MAXCC = 0
  DEFINE CLUSTER (NAME(PJOOT.XXXXX005) -
               CYLINDERS(1) VOLUMES(LZ0000) -
               INDEXED -
               KEYS(4 0) -
               RECORDSIZE(240 240) -
               ) -
         DATA (NAME(PJOOT.XXXXX005.DATA)) -
         INDEX (NAME(PJOOT.XXXXX005.INDEX))
/*
//SYSPRINT DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//*SYSTERM  DD SYSOUT=*

This does the DELETE, ignores any error, and then proceeds to do the new ALLOCATE for the VSAM file. I haven’t seen any way described of ALLOCATING a VSAM file other than using IDCAMS, except in 3270 screens. I think I’ve seen that LzLabs has 3270 capabilities for this sort of stuff, but I’m not inclined to try to figure out how to use it. I’d rather use our much more intuitive GUI or do it in script with JCL like this.

Copy a DATASET.

Here is some JCL to copy an (INLINE) dataset into the VSAM file created above:

//COPY2VS EXEC PGM=IDCAMS
//TARGET DD DSN=PJOOT.XXXXX005,DISP=(OLD,KEEP,KEEP)
//INLINEDD DD DATA,DCB=(BLKSIZE=240,LRECL=240,RECFM=F)
a
brown
fox
quick
/*
//SYSIN DD *
REPRO -
  INFILE(INLINEDD) -
  OUTFILE(TARGET)
/*
//SYSPRINT DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//SYSTERM  DD SYSOUT=*

There are two quirks that are noteworthy here.

  1. The VSAM file requires the input be sorted, which is why the words from ‘a quick brown fox’ are in the explicitly sorted order above.
  2. The VSAM file was created with RECORDSIZE 240, so the input file had to be forced to LRECL=240 to match.

Omission of either sort or the LRECL matching causes the VSAM load to fail.
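Both requirements can be checked before submitting the job. The following sketch pads each input line to the record length and verifies that the keys are in ascending order. `toFixedRecord` and `keysAscending` are hypothetical helpers, strict ordering (no duplicate keys) is my assumption, and the comparison uses the host character set, which does not order all characters the same way as EBCDIC.

```cpp
#include <string>
#include <vector>

// Pad (or truncate) a line to exactly lrecl bytes, like RECFM=F input.
std::string toFixedRecord(const std::string& line, std::size_t lrecl) {
    std::string rec = line.substr(0, lrecl);
    rec.resize(lrecl, ' ');
    return rec;
}

// True if the keys (keyLen bytes at keyOffset in each record) are strictly
// ascending.  Records must already be at least keyOffset + keyLen long.
// Note: this compares in the host character set, not in EBCDIC order.
bool keysAscending(const std::vector<std::string>& records,
                   std::size_t keyOffset, std::size_t keyLen) {
    for (std::size_t i = 1; i < records.size(); ++i) {
        if (records[i - 1].substr(keyOffset, keyLen) >=
            records[i].substr(keyOffset, keyLen)) {
            return false;
        }
    }
    return true;
}
```

With the KEYS(4 0) and RECORDSIZE(240 240) values from the DEFINE CLUSTER above, the four words in the order a, brown, fox, quick pass the check once padded to 240 bytes, while the original sentence order (a, quick, brown, fox) does not.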

This was the first time that I’d seen this specific INLINE DD syntax, with explicit parameters.  The way I’d seen it before was how SYSIN was specified above, with ‘NAME DD *’, ending with the C “comment start” /* sequence.  It turns out that the default end-of-data delimiter can also be overridden with DLM. For example, this also works:

//INLINEDD DD DATA,DLM=@@,DCB=(BLKSIZE=240,LRECL=240,RECFM=F)
a
brown
fox
quick
@@

Cat a file to spool

Because IDCAMS can copy files, this can also be used to cat a file to SPOOL if desired.  Here’s an example:

//CATVS JOB
//CATVS EXEC PGM=IDCAMS
//TARGET DD DSN=PJOOT.XXXXX005,DISP=(OLD,KEEP,KEEP)
//SYSIN DD *
REPRO -
  INFILE(TARGET) -
  OUTFILE(SYSOUT)
/*
//SYSPRINT DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//SYSTERM  DD SYSOUT=*

If I include a step like this, I’m able to see the file contents in our nice GUI spool browser along with the JCL script and all the other output.