Mainframe

An invalid transformation of a COBOL data description entry

August 28, 2020 COBOL No comments , , , , ,

Here’s a subtle gotcha that we saw recently.  A miraculous tool transformed some putrid DELTA generated COBOL code from GOTO soup into human readable form.  Among the transformations that this tool did, were modifications to working storage data declarations (removing unused variables in the source, and simplifying some others).  One of those transformations was problematic.  In that problematic case the pre-transformed declarations were:

This declaration is basically a union of char[8] with a structure that has four char[2]’s, with the COBOL language imposed restriction that the character values can be only numeric (EBCDIC) digits (i.e. ‘\xF0’, …, ‘\xF9’).  In the code in question none of the U044-BIS* variables (neither the first, nor the aliases) were ever used explicitly, but they were passed into another COBOL program as LINKAGE SECTION variables and used in the called program.

Here’s how the tool initially transformed the declaration:

It turns out that dropping that first PIC and removing the corresponding REDEFINES clause, was an invalid transformation in this case, because the code used INITIALIZE on the level 01 object that contained these variables.

On page 177, of the “178 Enterprise COBOL for z/OS: Enterprise COBOL for z/OS, V6.3 Language Reference”, we have:

(copyright IBM)

FILLER
A data item that is not explicitly referred to in a program. The keyword FILLER is optional. If specified,
FILLER must be the first word following the level-number.

… snip …

In an INITIALIZE statement:
• When the FILLER phrase is not specified, elementary FILLER items are ignored.

The transformation of the code in question would have been correct provided the “INITIALIZE foo” was replaced with “INITIALIZE foo WITH FILLER”.  The bug in the tool was fixed, and the transformed code in question was, in this case, changed to drop all the aliasing:

As a side effect of encountering this issue, I learned a number of things:

  • FILLER is actually a COBOL language keyword, with specific semantics, and not just a variable naming convention.
  • Both ‘INITIALIZE’ and ‘INITIALIZE … WITH FILLER’ are allowed.
  • INITIALIZE (without FILLER) doesn’t do PIC appropriate initialization of FILLER variables (we had binary zeros instead of EBCDIC zeros as a result.)

Using the debugger to understand COBOL level 88 semantics.

August 10, 2020 COBOL No comments , ,

COBOL has an enumeration mechanism called a LEVEL-88 variable.  I found a few aspects of this counter-intuitive, as I’ve mentioned in a few previous posts.  With the help of the debugger, I now think I finally understand what’s going on.  Here’s an example program that uses a couple of LEVEL-88 variables:

We can use a debugger to discover the meaning of the COBOL code. Let’s start by single stepping past the first MOVE statement to just before the SET MY88-VAR-1:

Here, I’m running the program LZ000550 from a PDS named COBRC.NATIVE.LZ000550. We expect EBCDIC spaces (‘\x40’) in the FEEDBACK structure at this point and that’s what we see:

Line stepping past the SET statement, our structure memory layout now looks like:

By setting the LEVEL-88 variable to TRUE all the memory starting at the address of feedback is now overwritten by the numeric value 0x0000000141C33A3BL. If we continue the program, the SYSOUT ends up looking like:

This condition was expected.
This condition was expected.

The first ‘IF MY88-VAR-1 THEN’ fires, and after the subsequent ‘SET MY88-VAR-2 TO TRUE’, the second ‘IF MY88-VAR-2 THEN’ fires. The SET overwrites the structure memory at the point that the 88 was declared, and an IF check of that same variable name checks if the memory in that location has the value in the 88 variable. It does not matter what the specific layout of the structure is at that point. We see that an IF check of the level-88 variable just tests whether or not the value at that address has the pattern specified in the variable. In this case, we have only on level-88 variable with the given name in the program, so the ‘IF MY88-VAR-2 OF feedback’ that was used was redundant, and could have been coded as just ‘IF MY88-VAR-2’, or could have been coded as ‘IF MY88-VAR-2 OF CONDITION-TOKEN-VALUE of Feedback’

We can infer that the COBOL code’s WORKING-STORAGE has the following equivalent C++ layout:

struct CONDITION_TOKEN_VALUE
{
   short SEVERITY;
   short MSG_NO;
   char CASE_SEV_CTL;
   char FACILITY_ID[3];
};

enum my88_vars
{
   MY88_VAR_1 = 0x0000000141C33A3BL,
   MY88_VAR_2 = 0x0000000241C33A3BL
};

struct feedback
{
   union {
      CONDITION_TOKEN_VALUE c;
      my88_vars e;
   } u;
   int I_S_INFO;
};

and that the control flow of the program can be modeled as the following:

//...
feedback f;

int main()
{
   memset( &f, 0x40, sizeof(f) );
   f.u.e = MY88_VAR_1;
   if ( f.u.e == MY88_VAR_2 )
   {
      impossible();
   }
   if ( f.u.e == MY88_VAR_1 )
   {
      expected();
   }

   f.u.e = MY88_VAR_2;
   if ( f.u.e == MY88_VAR_1 )
   {
      impossible();
   }
   if ( f.u.e == MY88_VAR_2 )
   {
      expected();
   }

   return 0;
}

Things also get even more confusing if the LEVEL-88 variable specifies less bytes than the structure that it is embedded in.  In that case, SET of the variable pads out the structure with spaces and a check of the variable also looks for those additional trailing spaces:

The CONDITION-TOKEN-VALUE object uses a total of 8 bytes.  We can see the spaces in the display of the FEEDBACK structure if we look in the debugger:

See the four trailing 0x40 spaces here.

Incidentally, it can be hard to tell what the total storage requirements of a COBOL structure is by just looking at the code, because the mappings between digits and storage depends on the usage clauses.  If the structure also uses REDEFINES clauses (embedded unions), as was the case in the program that I was originally looking at, the debug output is also really nice to understand how big the various fields are, and where they are situated.

Here are a few of the lessons learned:

  • You might see a check like ‘IF MY88-VAR-1 THEN’, but nothing in the program explicitly sets MY88-VAR-1. It is effectively a global variable value that is set as a side effect of some other call (in the real program I was looking at, what modified this “variable” was actually a call to CEEFMDA, a LE system service.) We have pass by reference in the function calls so it can be a reverse engineering task to read any program and figure out how any given field may have been modified, and that doesn’t get any easier by introducing LEVEL-88 variables into the mix.
  • This effective enumeration mechanism is not typed in any sense. Correct use of a LEVEL-88 relies on the variables that follow it to have the appropriate types. In this case, the ‘IF MY88-VAR-1 THEN’ is essentially shorthand for:
  • There is a disconnect between the variables modified or checked by a given LEVEL-88 variable reference that must be inferred by the context.
  • An IF check of a LEVEL-88 variable may include an implicit check of the trailing part of the structure with EBCDIC spaces, if the fields that follow the 88 variable take more space than the value of the variable. Similarly, seting such a variable to TRUE may effectively memset a trailing subset of the structure to EBCDIC spaces.
  • Exactly what is modified by a given 88 variable depends on the context.  For example, if the level 88 variables were found in a copybook, and if I had a second structure that had the same layout as FEEDBACK, with both structures including that copybook, then I’d have two instances of this “enumeration”, and would need a set of “OF foo OF BAR” type clauses to disambiguate things.  Level 88 variables aren’t like a set of C defines.  Their meaning is context dependent, even if they masquerade as constants.

Unexpected COBOL implicit operator distribution!

May 27, 2020 COBOL No comments , , , , , , , , , , ,

Another day, another surprise from COBOL.  I was debugging a failure in a set of COBOL programs, and it seemed that the place things started going wrong was a specific IF check, which basically looked like:

The original code was triple incomprehensible, as it:

  • Was in German.
  • Was in COBOL.
  • Was generated by DELTA and was completely disgusting spaghetti code.  A map of the basic blocks would have looked like it was colored by a three year old vigorously scribbling with a crayon.

It turns out that there was a whole pile of error handling code that happened after the IF check, and I correctly guessed that there was something wrong with how our compiler handled the IF statement.

What I didn’t guess was what the actual operator precedence in this IF check was.  Initially, my C programmer trained mind looked at that IF condition, and said “what the hell is that!?”  I then guessed, incorrectly, that it meant:

if ( X != SPACES and X = ZERO)

where X is the array slice expression.  That interpretation did not explain the runtime failure, so I was hoping that I was both wrong about what it meant, but right that there was a compiler bug.

It turns out that in COBOL the implicit operator for the second part of the IF statement is  ‘NOT =’.  i.e. the NOT= distributes over the AND, so this IF actually meant:

if ( X != SPACES and X != ZERO)

In the original program, that IF condition actually makes sense.  After some reflection, I see there is some sense to this distribution, but it certainly wasn’t intuitive after programming C and C++ for 27 years. I’d argue that the root cause of the confusion is COBOL itself. A real programming language would use a temporary variable for the really long array slice expression, which would obliterate the need for counter-intuitive operator distribution requirements. Imagine something like:

  VAR X = PAYLOAD-DATA(PAYLOAD-START(TALLY): PAYLOAD-END(TALLY))

  IF (X NOT = SPACES) AND (X NOT = LOW-VALUE)
     NEXT SENTENCE ELSE GO TO CHECK-IT-DONE.

(Incidentally LOW-VALUE means binary-zero, not a ‘0’ character that has a 0xF0 value in EBCDIC).

COBOL is made especially incomprehensible since you can’t declare an in-line temporary in COBOL.  If you want one, you have to go thousands of lines up in the code to the WORKING-STORAGE section, and put a variable in there somewhere.  Such a variable is global to the whole program, and you have to search to determine it’s usage scope.  You probably also need a really long name for it because of namespace collision with all the other global variables.  Basically, you are better off not using any helper variables, unless you want to pay an explicit cost in overall code complexity.

In my test program that illustrated the compiler bug, I made other COBOL errors. I blame the fact that I was using gross GOTO ridden code like the original. Here was my program:

Because I misinterpreted the NOT= distribution, I thought this should produce:

000000001: !(not space and low-value.)
000000002: !(not space and low-value.)
000000003: !(not space and low-value.)
000000003: not space and low-value.

Once the subtle compiler bug was fixed, the actual SYSOUT from the program was:

000000001: not space and low-value.
000000001: !(not space and low-value.)
000000002: !(not space and low-value.)
000000003: !(not space and low-value.)

See how both the TRUE and FALSE basic blocks executed in my code. That didn’t occur in the original code, because it used an additional dummy EXIT paragraph to end the PERFORM range, and had a GOTO out of the first paragraph.

There is more modern COBOL syntax that can avoid this GOTO hell, but I hadn’t used it, as I kept the reproducer somewhat like the original code.

Does this COBOL level-88 IF check make any sense?

May 21, 2020 COBOL 2 comments ,

I find COBOL level-88 declarations a bit confusing, which isn’t made any easier by usage that is probably wrong. Here’s an example from code that I was trying to step through in the debugger (anonymized):

       WORKING-STORAGE SECTION.
       01  data.
           10  function-type         PIC  X(01).
               88  option-a          VALUE '1'.
               88  option-b          VALUE '2'.
               88  option-c          VALUE '3'.
               88  option-d          VALUE '4'.

With the use like so:

           IF option-a AND option-b AND option-c
           NEXT SENTENCE ELSE GO TO meaningless-label-2.

It’s my understanding that this is essentially equivalent to:

           IF function-type = '1' AND function-type = '2' AND
              function-type = '3'
           NEXT SENTENCE ELSE GO TO meaningless-label-2.

Do I misunderstand the level-88 variables should be used, or is this just a plain old impossible-to-be-true if check? Putting this into a little sample program, confirms that we hit the ELSE:

       IDENTIFICATION DIVISION.
       PROGRAM-ID.                 TESTPROG.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  data.
           10  function-type         PIC  X(01).
               88  option-a          VALUE '1'.
               88  option-b          VALUE '2'.
               88  option-c          VALUE '3'.
               88  option-d          VALUE '4'.
       PROCEDURE DIVISION.
           move '1' to function-type

           perform meaningless-label-1 thru meaningless-label-6

           goback
           .

       meaningless-label-1.

      *    IF function-type = '1' AND function-type = '2' AND
      *       function-type = '3'
           IF option-a AND option-b AND option-c
           NEXT SENTENCE ELSE GO TO meaningless-label-2.

           display 'IF was true.'

           goto meaningless-label-6
           .

       meaningless-label-2.

           display 'IF was not true.'
           .

       meaningless-label-6.
           EXIT
           .

I get SYSOUT of:

IF was not true.

as I expected. If these were level-88 variables each “belonging” to a different variable, such as:

       IDENTIFICATION DIVISION.
       PROGRAM-ID.                 TESTPROG.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  data.
           10  blah                 PIC  X(01).
               88  blah-option-a                 VALUE '1'.
               88  blah-option-b                 VALUE '2'.
           10  foo                  PIC  X(01).
               88  foo-option-a                  VALUE '1'.
               88  foo-option-b                  VALUE '2'.
               88  foo-option-c                  VALUE '3'.
           10  bar                  PIC  X(01).
               88  bar-option-c                  VALUE '3'.
               88  bar-option-d                  VALUE '4'.

       PROCEDURE DIVISION.
           move '1' to blah
           move '2' to foo
           move '3' to bar

           perform meaningless-label-1 thru meaningless-label-6

           goback
           .

       meaningless-label-1.

           IF blah-option-a AND foo-option-b AND bar-option-c
           NEXT SENTENCE ELSE GO TO meaningless-label-2.

           display 'IF was true.'

           goto meaningless-label-6
           .

       meaningless-label-2.

           display 'IF was not true.'
           .

       meaningless-label-6.
           EXIT
           .

This has the ‘IF was true’ SYSOUT. Perhaps the original coder meant to use OR instead of AND?

COBOL spaghetti code: EXIT does nothing!

May 20, 2020 COBOL 1 comment , , , , , , , , , ,

I was staring down COBOL code of the following form:

       LOOP-COUNTER-INCREMENT.
           ADD 1 TO J.
       LOOP-PREDICATE-CHECK.   
           IF J GREATER 10 GO TO MYSTERIOUS-LABEL-1.
           
           IF ARRAY-1 (J)      NOT = ZERO
           NEXT SENTENCE ELSE GO TO MYSTERIOUS-LABEL-1.
           
           IF ARRAY-2 (J) = MYSTERIOUS-MAGIC-NUMBER-CONSTANT
           NEXT SENTENCE ELSE GO TO COUNTER-INCREMENT-SPAGGETTIFI.
           
     *     ...MORE STUFF...                                        
     
           GO TO MYSTERIOUS-LABEL-3.
           
       COUNTER-INCREMENT-SPAGGETTIFI.
           GO TO LOOP-COUNTER-INCREMENT.
           
       MYSTERIOUS-LABEL-1.
                       EXIT.
       MYSTERIOUS-LABEL-2.
                       EXIT.
       MYSTERIOUS-LABEL-3.
                       EXIT.

I had to get some guru help understanding what this was about (thanks Roger!). I didn’t understand why somebody would code a GOTO LABEL, when the the code at that LABEL just did an EXIT. If my intuition could be trusted, I would have assumed that this code was equivalent to the much simpler:

       LOOP-COUNTER-INCREMENT.
           ADD 1 TO J.
       LOOP-PREDICATE-CHECK.   
           IF J GREATER 10 EXIT.
           
           IF ARRAY-1 (J)      NOT = ZERO
           NEXT SENTENCE ELSE EXIT.
           
           IF ARRAY-2 (J) = MYSTERIOUS-MAGIC-NUMBER-CONSTANT
           NEXT SENTENCE ELSE GO TO LOOP-COUNTER-INCREMENT.
           
     *     ...MORE STUFF...                                        
     
           EXIT.

It turns out that intuition is not much use when looking at COBOL code. In this case, that intuition failure is because EXIT doesn’t actually do anything. It is not like a return, which is what I assumed, but is just something that you can put in a paragraph at the end of the section so that the code can exit the section (or at the end of a sequence of paragraphs invoked by PERFORM THRU, so that the code can return to the caller.)  The EXIT in such a paragraph is just a comment, and you could use an empty paragraph to do the same thing.

In my transformation of the code the EXIT would do nothing, and execution would just fall through to the next sentence!

Some of the transformations I made are valid. In particular, the spaghettification-indirection used to increment the loop counter, by using a goto to goto the target location instead of straight there, has no reason to exist.

The code in question was an edited version of a program that was generated by a 4GL language (DELTA), so some of the apparent stupidity can be blamed on the code generator. I also assume DELTA can also be blamed for the multiple EXIT paragraphs, when it would seem more natural to just have one per section.

This code also uses EXIT after other paragraph labels too. The first paragraph in the following serving of horror has such an example:

            PERFORM TRANSFER-CHECK THRU TRANSFER-CHECK-EXIT.

            [snip]

       TRANSFER-CHECK.
                       EXIT.
       MEANINGLESS-LABEL-1.
           IF [A COMPOUND PREDICATE CHECK]
           NEXT SENTENCE ELSE GO TO MEANINGLESS-LABEL-2.
                 [SNIP]
           PERFORM [MORE STUFF]
           GO TO MEANINGLESS-LABEL-100.
       MEANINGLESS-LABEL-2.
           [STUFF]
           GO TO MEANINGLESS-LABEL-4.
       MEANINGLESS-LABEL-3.
           [increment loop counter, and fall through]
       MEANINGLESS-LABEL-4.
           [loop body]
...
       MEANINGLESS-LABEL-50.
           GO TO MEANINGLESS-LABEL-3.
           [SNIP]
...
       MEANINGLESS-LABEL-99.                            
                       EXIT.                               
       MEANINGLESS-LABEL-100.                                       
                       EXIT. 
       TRANSFER-CHECK-EXIT.
                       EXIT.

Nothing ever branches to MEANINGLESS-LABEL-1 directly, so why even have that there? Using my new found knowledge that EXIT doesn’t do anything, I’m pretty sure that you could just write:

            PERFORM TRANSFER-CHECK THRU TRANSFER-CHECK-EXIT.

            [snip]

       TRANSFER-CHECK.
       
           IF [A COMPOUND PREDICATE CHECK]

Is there some subtle reason that this first no-op paragraph was added? My guess is that the programmer was either being paid per line of code, or the code generator is to blame.

I’m not certain about the flow-control in the TRUE evaluation above. My intuition about the THRU use above is that if we have a GOTO that bypasses one of the paragraphs, then all the preceding paragraphs are counted as taken (i.e. if you get to the final paragraph in the THRU evaluation, no matter how you get there, then you are done.) I’ll have to do an experiment to determine if that’s actually the case.

Reverse engineering a horrible COBOL structure initialization

May 16, 2020 Mainframe 2 comments , , , ,

The COBOL code that I was looking at used a magic value 999, and I couldn’t see where it could be coming from.  After considerable head scratching, I managed to figure out that all the array structure instantiations in the code are initialized using strings.  That seems to be the origin of the magic (standalone) 999’s scattered through the code.

To share the horror, here is an (anonymized) example of the offending array structure initialization

where I added in the block comment that points out each of the interesting regions of the initialization strings.

Here’s what’s going on.  We have a global variable array (effectively unnamed) that has three fields:

  • two-characters (numeric only)
  • dummy-structure-name, containing a 3 character field and a pad.
  • nine-more-characters

If you add up all the characters in this data structure we have: 2 + 1 + 4 * (3 + 1) + 9 = 28, so this array initialization is effectively done by aliasing the array elements with the memory containing a char[7][28].

My eyes are burning!

As far as I can tell, COBOL has no notion of a structure type, you just have instances of structures everywhere (they are probably called something different — a level 01 declaration, or something like that).  A lot of the PL/I code I’ve seen is also like that, although in PL/I you can declare your structure types if you want to.

The display’s above make use of the fact that COBOL variables don’t have to use all the high level qualifiers (unless there is ambiguity).  My SYSOUT shows that, sure enough, the (5) element of the array (COBOL arrays are one’s counted) has the values I expected:

1 22
2 999
3 1/2
4
5
6 SF

Basically, the horrendous initialization above, is as if you if declared your structure as:

struct arrayname
{                   
   char numeric2[2];
   char filler1[1];
   struct               
   {                 
      char threemore[3];
      char filler2[1];
   } threepluspad[4];

   char ninemore[9];     
}; 

and then initialized it with:

char globalmemory[7][28] = {
   // n2       f    x    x    x    y    x    x    x    y    x    x    x    y    x    x    x    y    'K', 'l', 'a', 's', 's', 'e', ' ', ' ', ' '},
   { '0', '1', ' ', ' ', ' ', '0', ' ', ' ', ' ', '0', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'K', 'l', 'a', 's', 's', 'e', ' ', ' ', ' '},
   { '0', '2', ' ', ' ', ' ', '0', ' ', ' ', ' ', '0', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'K', 'l', 'a', 's', 's', 'e', ' ', ' ', ' '},
   { '1', '3', ' ', '9', '9', '9', ' ', '9', '9', '9', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'K', 'l', 'a', 's', 's', 'e', ' ', ' ', ' '},
   { '2', '1', ' ', '9', '9', '9', ' ', '1', '/', '2', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'S', 'F', ' ', ' ', ' ', ' ', ' ', ' ', ' '},
   { '2', '2', ' ', '9', '9', '9', ' ', '1', '/', '2', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'S', 'F', ' ', ' ', ' ', ' ', ' ', ' ', ' '},
   { '2', '3', ' ', '1', '/', '2', ' ', '1', '/', '2', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'S', 'F', ' ', ' ', ' ', ' ', ' ', ' ', ' '},
   { '3', '1', ' ', ' ', ' ', '1', ' ', ' ', ' ', '1', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'S', 'F', ' ', ' ', ' ', ' ', ' ', ' ', ' '},
};

struct arrayname * p = (struct arrayname*)globalmemory;

and then and then printed:

   printf( "1 %.2s\n", p[4].numeric2 );
   printf( "2 %.3s\n", p[4].threepluspad[0].threemore );
   printf( "3 %.3s\n", p[4].threepluspad[1].threemore );
   printf( "4 %.3s\n", p[4].threepluspad[2].threemore );
   printf( "5 %.3s\n", p[4].threepluspad[3].threemore );
   printf( "6 %.9s\n", p[4].ninemore );

Of course, the use of fixed length strings without a null terminator wouldn’t ever be done in C, so a more natural equivalent (assuming one doesn’t care about the specific memory equivalence of the two representations, and can tolerate null terminators instead of spaces) would just be something like:

struct arrayname
{
   char numeric2[3];
   struct 
   {
      char threemore[4];
   } threepluspad[4];

   char ninemore[9];
};

struct arrayname g[7] = {
   { "01", {"  0", "  0", "   ", "   "}, "Klasse  " },
   { "02", {"  0", "  0", "   ", "   "}, "Klasse  " },
   { "13", {"999", "999", "   ", "   "}, "Klasse  " },
   { "21", {"999", "1/2", "   ", "   "}, "SF      " },
   { "22", {"999", "1/2", "   ", "   "}, "SF      " },
   { "23", {"1/2", "1/2", "   ", "   "}, "SF      " },
   { "31", {"  1", "  1", "   ", "   "}, "SF      " }
};  

You could argue that the COBOL way isn’t so bad once you’ve seen the pattern, and is only cosmetically different from the natural C analogue. That is, if you ignore the fact that there is no separation of fields in the initializer strings, and that you have to name a whole bunch of dummy initializer objects and fill characters, and the fact that any semblance of typing is completely obliterated.

The code in question is also complete spaghetti, with GOTO all over the place.  Perhaps COBOL versions after COBOL77, which is what I assume I’m looking at, added loops and better initialization syntax?

Computing “offsetof” in COBOL

May 15, 2020 Mainframe 1 comment , , , , ,

I couldn’t find a way to compute something like C offsetof in COBOL code.  What I could manage to figure out how to do is compare addresses of a runtime instantiation of the structure, effectively doing this indirectly.  Here’s the ugly mess that I cooked up:

I couldn’t figure out the right syntax to do a single compute statement that was just the difference of addresses, as I got numeric/pointer compare errors from the compiler, no matter what I tried.  I think that ‘USAGE IS POINTER’ may be required on my variables, but that would still require a temporary.  I’m probably either doing this the hard way, or there is no easy way in COBOL.

This program was run with the following simple JCL

//TESTPROG JOB
//A EXEC PGM=TESTPROG
//SYSOUT DD SYSOUT=*
//STEPLIB DD DSN=COBRC.NATIVE.TESTPROG,
// DISP=SHR

and produced the following SYSOUT

address of TESTPROG-STRUCT = 0016800264
offsetof(ARRAY-NAME,RUECK-BKL) = 0000000002
offsetof(ARRAY-NAME,RUECK-BS) = 0000000004
offsetof(ARRAY-NAME,RUECK-SF) = 0000000007
sizeof(ARRAY-NAME(1)) = 0000000019

Looking at that output, we can conclude the following:

  • PIC S9(3) COMP-3 is effectively horrible eye-burning syntax for a “short”
  • There is no alignment padding between fields, nor end of array-member padding to force natural alignment of the next array element, should the structure start have been aligned.

I knew the latter, but wasn’t sure what size the first field was, and thought that trying to figure it out with COBOL code would be a good learning exercise.

File organization in really old COBOL code.

May 7, 2020 Mainframe No comments , , , , , , , , , , , , ,

I encountered customer COBOL code today with a file declaration of the following form:

000038   SELECT AUSGABE ASSIGN TO UR-S-AUSGABE            
000039    ACCESS IS SEQUENTIAL.                   
...
000056 FD  AUSGABE                                                     
000057     RECORDING F                                                  
000058     BLOCK 0 RECORDS                                              
000059     LABEL RECORDS OMITTED.                                       

where the program’s JCL used an AUSGABE (German “output”) DDNAME of the following form:

//AUSGABE   DD    DUMMY

The SELECT looked completely wrong to me, as I thought that SELECT is supposed to have the form:

SELECT cobol-file-variable-name ASSIGN TO ddname

That’s the syntax that my Murach’s Mainframe COBOL uses, and also what I’d seen in big-blue’s documentation.

However, in this customer’s code, the identifier UR-S-AUSGABE is longer than 8 characters, so it sure didn’t look like a DDNAME. I preprocessed the code looking to see if UR-S-AUSGABE was hiding in a copybook (mainframe lingo for an include file), but it wasn’t. How on Earth did this work when it was compiled and run on the original mainframe?

It turns out that [LABEL-]S- or [LABEL]-AS- are ways that really old COBOL code used to specify file organization (something like PL/I’s ENV(ORGANIZATION) clauses for FILEs). This works on the mainframe because a “modern” mainframe COBOL compiler strips off the LABEL- prefix if specified and the organization prefix S- as well, essentially treating those identifier fragments as “comments”.

For anybody reading this who has only programmed in a sane programming language, on sane operating systems, this all probably sounds like verbal diarrhea.  What on earth is a file organization and ddname?  Do I really have to care about those just to access a file?  Well, on the mainframe, yes, you do.

These mysterious dependencies highlight a number of reasons why COBOL code is hard to migrate. It isn’t just a programming language, but it is tied to the mainframe with lots of historic baggage in ways that are very difficult to extricate.  Even just to understand how to open a file in mainframe COBOL you have a whole pile of obstacles along the learning curve:

  • You don’t just run the program in a shell, passing in arguments, but you have to construct a JCL job step to do so.  This specifies parameters, environment variables, file handles, and other junk.
  • You have to know what a DDNAME is.  This is like a HANDLE in the JCL code that refers to a file.  The file has a filename (DSNAME), but you don’t typically use that.  Instead the JCL’s job step declares an arbitrary DDNAME to refer to that handle, and the program that is run in that job step has to always refer to the file using that abstract handle.
  • The file has all sorts of esoteric attributes that you have to know about to access it properly (fixed, variable, blocked, record length, block size, …).  The program that accesses the file typically has to make sure that these attributes are all encoded with the equivalent language specific syntax.
  • Files are not typically just byte streams on the mainframe but can have internal structure that can be as complicated as a simple database (keyed records, with special modes to access them to initialize vs access/modify.)
  • To make life extra “fun”, files are found in a variety of EBCDIC code pages.  In some cases these can’t be converted to single byte iso-8859-X code pages, so you have to use utf-8, and can get into trouble if you want to do round trip conversions.
  • Because of the internal structure of a mainframe file, you may not be able to transfer it to a sane operating system unless special steps are taken.  For example, a variable format file with binary data would typically have to be converted to a fixed format representation so that it’s possible to seek from record to record.
  • Within the (COBOL) code you have three sets of attributes that you have to specify to “declare” a file, before you can even attempt to open it: the DDNAME to COBOL-file-name mapping (SELECT), the FD clause (file properties), and finally record declarations (global variables that mirror the file data record structure that you have to use to read and write the file.)

You can’t just learn to program COBOL, like you would any sane programming language, but also have to learn all the mainframe concepts that the COBOL code is dependent on.  Make sure you are close enough to your eyewash station before you start!

Ugg, COBOL is so horrible.

September 23, 2019 Mainframe 5 comments , , ,

If you don’t know COBOL (and I clearly don’t know it well enough), you probably won’t spot that there’s a syntax error here:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. BLAH.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       LINKAGE SECTION.

       PROCEDURE DIVISION.

           DISPLAY 'BLAH.' UPON CONSOLE.

           STOP-RUN.

The C equivalent to this program is:

#include <stdio.h>
int main(){
  printf("BLAH.\n");
}

but it is meant to have an exit(0), so we want:

#include <stdio.h>
int main(){
  printf("BLAH.\n");
  return 0;
}

Translating that back into COBOL, we need the STOP-RUN statement to say STOP RUN (no dash).  With the -, this gets interpreted as a paragraph label (i.e. STOP-RUN delimits a new function) with no statements in it. What I did was something akin to:

#include <stdio.h>
void STOPRUN(){
}
int main(){
  printf("BLAH.\n");
  STOPRUN();
}

The final program doesn’t have the required ‘return 0’ or ‘exit(0)’ and blows up with a “No stop run nor go back” error condition.  Logically, what I really wanted was:

 

       IDENTIFICATION DIVISION.
       PROGRAM-ID. BLAH.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       LINKAGE SECTION.

       PROCEDURE DIVISION.

           DISPLAY 'BLAH.' UPON CONSOLE

           STOP RUN

           .

 

This time I’ve left off the period that I had after CONSOLE (which was superfluous and should really only be used at the end of the paragraph (end of “function”)), and also fixes the STOP-RUN -> STOP RUN typo.

This is not the first time I’ve gotten mixed up on when to use DASHES or not.  For example, even above, I’d not expect a programmer to know or remember that you need to know to use dashes for WORKING-STORAGE and PROGRAM-ID but not for PROCEDURE DIVISION, nor some of the other space delimited fields.

My eyes are burning, and I left my COBOL programmer safety gear at home.

Getting closer to a JCL one liner equivalent to Unix head -1.

June 14, 2019 Mainframe No comments , , , , , , , ,

I was somewhat bemused by how much JCL it took to do the equivalent of a couple ‘head -1’ commands.  It was pointed out to me that INDATASET, OUTDATASET can be used to eliminate all the DD lines, and that all but the SYSPRINT DDs for IDCAMS were not actually required.  This allows the JCL for these pair of ‘head -1’ commands to be shortened to:

The REPRO lines still have to be split up because of the annoying punch-card derived 72 column restrictions of JCL. Note that to use OUTDATASET in this way, I had to sacrifice the JCL shell variable expansion that I had been using. To retain my shell variables (SET TID=UT; SET CID=UT128) I still need DDNAME statements to do the shell expansion in JCL proper, since that doesn’t occur in the SYSIN specification.  Translated to Unix, we must think of this sort of SYSIN “file” as being single and not double quoted (unlike a Unix <<EOF…EOF inline file where shell script are expanded).  The JCL is left reduced to:

Note that since I opted to retain the DDNAME statements, the REPRO lines are now short enough to each fit on a single line.

It turns out that there’s also a way to do variable expansion within the SYSIN, essentially treating something like a Unix double quoted script variable.  You need to explicitly export the symbols in the JCL prologue using EXPORT SYMLIST, and then import them in the SYSIN specification using SYMBOLS=CNVTSYS

I’ve switched to IDS and ODS to make the lines shorter, which makes it possible for one of the REPRO lines to be a one liner (with 6 lines of helper code).  The final JCL line count weighs in at 8:2 vs. Unix, but is not as bad as the original JCL I constructed (22 lines.)