I couldn’t find a way to compute something like C’s offsetof in COBOL code. What I did manage to figure out was how to compare the addresses of fields in a runtime instance of the structure, which gets the same result indirectly. Here’s the ugly mess that I cooked up:
I couldn’t figure out the right syntax for a single COMPUTE statement that was just the difference of two addresses, as I got numeric/pointer compare errors from the compiler no matter what I tried. I think that ‘USAGE IS POINTER’ may be required on my variables, but that would still require a temporary. I’m probably either doing this the hard way, or there is no easy way in COBOL.
This program was run with the following simple JCL:
//A EXEC PGM=TESTPROG
//SYSOUT DD SYSOUT=*
//STEPLIB DD DSN=COBRC.NATIVE.TESTPROG,
and produced the following SYSOUT:
address of TESTPROG-STRUCT = 0016800264
offsetof(ARRAY-NAME,RUECK-BKL) = 0000000002
offsetof(ARRAY-NAME,RUECK-BS) = 0000000004
offsetof(ARRAY-NAME,RUECK-SF) = 0000000007
sizeof(ARRAY-NAME(1)) = 0000000019
Looking at that output, we can conclude the following:
- PIC S9(3) COMP-3 is effectively horrible eye-burning syntax for something the size of a “short”: a packed decimal field with three digits and a sign, occupying two bytes.
- There is no alignment padding between fields, and no padding at the end of an array element to force natural alignment of the next element, even if the start of the structure happened to be aligned.
I knew the latter, but wasn’t sure what size the first field was, and thought that trying to figure it out with COBOL code would be a good learning exercise.
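As a cross-check of those conclusions, here is a minimal Python sketch (not COBOL) that recomputes the reported offsets. It assumes the usual packed decimal storage rule of two digits per byte plus a half-byte sign, and it infers the field widths from the differences between the offsets in the SYSOUT above. FIRST-FIELD is a hypothetical name for the first field, and the 12 bytes attributed to RUECK-SF may really be RUECK-SF plus whatever follows it.

```python
def packed_decimal_bytes(digits):
    """Bytes occupied by a COMP-3 item with the given number of digits."""
    # Two decimal digits per byte, plus one half-byte for the sign.
    return digits // 2 + 1


def unpadded_layout(fields):
    """Lay out (name, size) pairs back to back, with no alignment padding.

    Returns a dict of field offsets and the total size in bytes.
    """
    offsets = {}
    position = 0
    for name, size in fields:
        offsets[name] = position
        position += size
    return offsets, position


# Assumed layout of one array element, for illustration only.
ELEMENT = [
    ("FIRST-FIELD", packed_decimal_bytes(3)),  # PIC S9(3) COMP-3 (hypothetical name)
    ("RUECK-BKL", 2),   # width inferred as 4 - 2
    ("RUECK-BS", 3),    # width inferred as 7 - 4
    ("RUECK-SF", 12),   # 19 - 7: RUECK-SF plus anything that follows it
]

if __name__ == "__main__":
    offsets, size = unpadded_layout(ELEMENT)
    for name, offset in offsets.items():
        print(f"offsetof({name}) = {offset}")
    print(f"sizeof(element) = {size}")
```

If the layout really has no padding, the offsets computed this way should match the ones the COBOL program printed.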
May 7, 2020
I encountered customer COBOL code today with a file declaration of the following form:
       SELECT AUSGABE ASSIGN TO UR-S-AUSGABE
           ACCESS IS SEQUENTIAL.

       FD  AUSGABE
           RECORDING F
           BLOCK 0 RECORDS
           LABEL RECORDS OMITTED.
where the program’s JCL used an AUSGABE (German “output”) DDNAME of the following form:
The SELECT looked completely wrong to me, as I thought that SELECT is supposed to have the form:
SELECT cobol-file-variable-name ASSIGN TO ddname
That’s the syntax that my Murach’s Mainframe COBOL uses, and also what I’d seen in big-blue’s documentation.
However, in this customer’s code, the identifier UR-S-AUSGABE is longer than 8 characters, so it sure didn’t look like a DDNAME. I preprocessed the code looking to see if UR-S-AUSGABE was hiding in a copybook (mainframe lingo for an include file), but it wasn’t. How on Earth did this work when it was compiled and run on the original mainframe?
It turns out that the [LABEL-]S- and [LABEL-]AS- prefixes are ways that really old COBOL code used to specify file organization (something like PL/I’s ENV(ORGANIZATION) clauses for FILEs). This works on the mainframe because a “modern” mainframe COBOL compiler strips off the LABEL- prefix, if specified, and the organization prefix S- as well, essentially treating those identifier fragments as “comments”.
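To make the prefix stripping concrete, here is a minimal Python sketch of how such an assignment name could be split apart. It is only an illustration of the idea, not the compiler’s actual algorithm: it assumes that the DDNAME is whatever follows the last hyphen, that an S or AS component immediately before it is the organization, and that anything earlier is the label. The function name parse_assignment_name is made up for this example.

```python
def parse_assignment_name(assignment_name):
    """Split an ASSIGN TO name into (label, organization, ddname).

    Simplifying assumption: the ddname is whatever follows the last hyphen.
    """
    parts = assignment_name.split("-")
    ddname = parts[-1]
    prefix = parts[:-1]

    # An S or AS component right before the ddname is the organization.
    organization = None
    if prefix and prefix[-1] in ("S", "AS"):
        organization = prefix.pop()

    # Whatever remains in front is the label, which acts as a comment.
    label = "-".join(prefix) if prefix else None

    if not 1 <= len(ddname) <= 8:
        raise ValueError(f"{ddname!r} is not a 1-8 character DDNAME")
    return label, organization, ddname


if __name__ == "__main__":
    print(parse_assignment_name("UR-S-AUSGABE"))
    print(parse_assignment_name("AUSGABE"))
```

Under these assumptions, UR-S-AUSGABE splits into a UR label, an S organization and the AUSGABE DDNAME that the JCL actually uses.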
For anybody reading this who has only programmed in a sane programming language, on sane operating systems, this all probably sounds like verbal diarrhea. What on earth is a file organization and ddname? Do I really have to care about those just to access a file? Well, on the mainframe, yes, you do.
These mysterious dependencies highlight a number of reasons why COBOL code is hard to migrate. It isn’t just a programming language, but it is tied to the mainframe with lots of historic baggage in ways that are very difficult to extricate. Even just to understand how to open a file in mainframe COBOL you have a whole pile of obstacles along the learning curve:
- You don’t just run the program in a shell, passing in arguments, but you have to construct a JCL job step to do so. This specifies parameters, environment variables, file handles, and other junk.
- You have to know what a DDNAME is. This is like a HANDLE in the JCL code that refers to a file. The file has a filename (DSNAME), but you don’t typically use that. Instead the JCL’s job step declares an arbitrary DDNAME to refer to that handle, and the program that is run in that job step has to always refer to the file using that abstract handle.
- The file has all sorts of esoteric attributes that you have to know about to access it properly (fixed, variable, blocked, record length, block size, …). The program that accesses the file typically has to make sure that these attributes are all encoded with the equivalent language specific syntax.
- Files are not typically just byte streams on the mainframe but can have internal structure that can be as complicated as a simple database (keyed records, with special modes to access them to initialize vs access/modify.)
- To make life extra “fun”, files are found in a variety of EBCDIC code pages. In some cases these can’t be converted to single-byte ISO-8859-X code pages, so you have to use UTF-8, and can get into trouble if you want to do round-trip conversions.
- Because of the internal structure of a mainframe file, you may not be able to transfer it to a sane operating system unless special steps are taken. For example, a variable format file with binary data would typically have to be converted to a fixed format representation so that it’s possible to seek from record to record.
- Within the (COBOL) code you have three sets of attributes that you have to specify to “declare” a file, before you can even attempt to open it: the DDNAME to COBOL-file-name mapping (SELECT), the FD clause (file properties), and finally record declarations (global variables that mirror the file data record structure that you have to use to read and write the file.)
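The point above about converting variable format files to a fixed format so that records can be located by seeking can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records are already available as a list of byte strings, the pad byte and the function names are made up for the example, and padding discards each record’s original length unless that is stored somewhere else.

```python
def to_fixed(records, lrecl, pad=b"\x00"):
    """Pad each variable-length record to exactly lrecl bytes."""
    out = bytearray()
    for record in records:
        if len(record) > lrecl:
            raise ValueError("record longer than the fixed record length")
        out += record.ljust(lrecl, pad)
    return bytes(out)


def read_record(data, lrecl, index):
    """Fetch record number index (0-based) by computing its offset."""
    start = index * lrecl
    return data[start:start + lrecl]


if __name__ == "__main__":
    data = to_fixed([b"AB", b"CDEF", b"G"], lrecl=4)
    # Record 1 starts at byte 4 no matter how long record 0 originally was.
    print(read_record(data, 4, 1))
```

With fixed-length records, the position of record n is simply n times the record length, which is what makes seeking possible without scanning the file from the start.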
You can’t just learn to program in COBOL, like you would with any sane programming language; you also have to learn all the mainframe concepts that the COBOL code depends on. Make sure you are close enough to your eyewash station before you start!
I was somewhat bemused by how much JCL it took to do the equivalent of a couple of ‘head -1’ commands. It was pointed out to me that INDATASET and OUTDATASET can be used to eliminate all the DD lines, and that none of the IDCAMS DDs other than SYSPRINT were actually required. This allows the JCL for this pair of ‘head -1’ commands to be shortened to:
The REPRO lines still have to be split up because of the annoying punch-card derived 72 column restrictions of JCL. Note that to use OUTDATASET in this way, I had to sacrifice the JCL shell variable expansion that I had been using. To retain my shell variables (SET TID=UT; SET CID=UT128) I still need DDNAME statements to do the shell expansion in JCL proper, since that doesn’t occur in the SYSIN specification. Translated to Unix, we must think of this sort of SYSIN “file” as being single quoted and not double quoted (unlike a Unix <<EOF…EOF inline file, where shell variables are expanded). The JCL is then reduced to:
Note that since I opted to retain the DDNAME statements, the REPRO lines are now short enough to each fit on a single line.
It turns out that there’s also a way to do variable expansion within the SYSIN, essentially treating it like a Unix double-quoted string. You need to explicitly export the symbols in the JCL prologue using EXPORT SYMLIST, and then import them in the SYSIN specification using SYMBOLS=CNVTSYS.
I’ve switched to IDS and ODS to make the lines shorter, which makes it possible for one of the REPRO lines to be a one liner (with 6 lines of helper code). The final JCL line count weighs in at 8:2 vs. Unix, but is not as bad as the original JCL I constructed (22 lines.)
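The single-quoted versus double-quoted analogy can be illustrated with a small Python sketch of JCL-style symbol substitution. It is a simplification, not a description of what the system really does: the regular expression only approximates the symbol naming rules, the function name expand_symbols is made up, and the dataset name is just the Unix one from the one-liners rewritten with JCL symbols. The one detail it does try to capture is that a period directly after a symbol ends the symbol name and is consumed, which is why a literal period needs to be doubled.

```python
import re

# &NAME optionally followed by one period that terminates the symbol name.
SYMBOL = re.compile(r"&([A-Z#$@][A-Z0-9#$@]{0,7})\.?")


def expand_symbols(text, symbols):
    """Replace &NAME references with values, like a double-quoted string."""
    def replace(match):
        name = match.group(1)
        if name not in symbols:
            # Leave unknown symbols untouched.
            return match.group(0)
        return symbols[name]
    return SYMBOL.sub(replace, text)


if __name__ == "__main__":
    symbols = {"TID": "UT", "CID": "UT128"}   # SET TID=UT; SET CID=UT128
    line = "&TID..&CID..SYSOUT.ACT"
    print(line)                            # "single quoted": no expansion
    print(expand_symbols(line, symbols))   # "double quoted": expanded
```

Without the symbol export and import, a SYSIN line behaves like the first print; with them, it behaves like the second.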
Suppose you wanted to do the equivalent of the following Unix shell code on the mainframe in JCL:
head -1 < UT128.SYSOUT.EXPECTED > $TID.$CID.SYSOUT.ACT
head -1 < UT128.COBPRINT.EXPECTED > $TID.$CID.COBPRINT.ACT
Here’s the JCL equivalent of this pair of one-liners:
There are probably shorter ways to do this, but the naive way weighs in at 22:2 lines for JCL:Unix — damn!
I can’t help but add a punny comment: knowing JCL must once have been really good JOB security.
While walking about my neighbourhood this morning, I found a PDS container:
Just like the mainframe version, you can put all sorts of stuff in this one.
A mainframe PDS (partitioned data set) is technically a different sort of container, as you can only put DATASETs (mainframe’ze for files) in them. For example, if you have two programs (load modules in mainframe’ze) both named PEETERJO, then you can create two PDS datasets, each having a PEETERJO member, say:
From these you could then choose which one you want your JCL script to execute with a STEPLIB statement like:
//A EXEC PGM=PEETERJO
//STEPLIB DD DSN=PEETER.JOOT.IS.THE.BEST,DISP=SHR
//SYSOUT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSTERM DD SYSOUT=*
//SYSABEND DD SYSOUT=*
This works around the global name space issue that you’d have with storing two different datasets, both with the name PEETERJO.
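One way to picture this is as a dictionary of dictionaries, with the STEPLIB acting as a search path. The following Python sketch is only a mental model: SOME.OTHER.LOADLIB is a hypothetical second library name, the member descriptions are placeholders, and a real system has further places to look for a program that this sketch ignores.

```python
# Library (PDS) name -> {member name -> placeholder description}.
LIBRARIES = {
    "PEETER.JOOT.IS.THE.BEST": {"PEETERJO": "first PEETERJO"},
    "SOME.OTHER.LOADLIB": {"PEETERJO": "second PEETERJO"},  # hypothetical
}


def find_program(pgm, steplib, libraries=LIBRARIES):
    """Return (library, member) for the first STEPLIB entry containing pgm."""
    for dsname in steplib:
        members = libraries.get(dsname, {})
        if pgm in members:
            return dsname, members[pgm]
    raise LookupError(f"{pgm} not found in {steplib}")


if __name__ == "__main__":
    # Same member name, different library, different program.
    print(find_program("PEETERJO", ["PEETER.JOOT.IS.THE.BEST"]))
    print(find_program("PEETERJO", ["SOME.OTHER.LOADLIB"]))
```

The member name alone is ambiguous; it is the library named on the STEPLIB that decides which PEETERJO runs.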
You can also put any file into a PDS, provided you are willing to have the PDS member name for that file be a 1-8 character string. The PDS is sort of the mainframe equivalent of a directory (the long strings of A.B.C.D.E DATASET names can also be viewed as a directory of sorts).
I’m not sure if you can put a PDS in a PDS. If that is possible, I also don’t know if a PDS member can be accessed as a PDS without first copying it out.