Unexpected COBOL implicit operator distribution!

May 27, 2020 COBOL No comments

Another day, another surprise from COBOL.  I was debugging a failure in a set of COBOL programs, and it seemed that the place things started going wrong was a specific IF check, which basically looked like:

The original code was triply incomprehensible, as it:

  • Was in German.
  • Was in COBOL.
  • Was generated by DELTA and was completely disgusting spaghetti code.  A map of the basic blocks would have looked like it was colored by a three-year-old vigorously scribbling with a crayon.

It turns out that there was a whole pile of error handling code that happened after the IF check, and I correctly guessed that there was something wrong with how our compiler handled the IF statement.

What I didn’t guess was what the actual operator precedence in this IF check was.  Initially, my C programmer trained mind looked at that IF condition, and said “what the hell is that!?”  I then guessed, incorrectly, that it meant:

if ( X != SPACES and X = ZERO)

where X is the array slice expression.  That interpretation did not explain the runtime failure, so I was hoping that I was wrong about what it meant, but right that there was a compiler bug.

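It turns out that in COBOL the implicit operator for the second part of the IF statement is ‘NOT =’ (this is an abbreviated combined relation condition, where an omitted subject and relational operator are taken from the preceding relation).  That is, the NOT = distributes over the AND, so this IF actually meant: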

if ( X != SPACES and X != ZERO)

In the original program, that IF condition actually makes sense.  After some reflection, I see there is some sense to this distribution, but it certainly wasn’t intuitive after programming C and C++ for 27 years. I’d argue that the root cause of the confusion is COBOL itself. A real programming language would use a temporary variable for the really long array slice expression, which would obliterate the need for counter-intuitive operator distribution requirements. Imagine something like:

  VAR X = PAYLOAD-DATA(PAYLOAD-START(TALLY): PAYLOAD-END(TALLY))

  IF (X NOT = SPACES) AND (X NOT = LOW-VALUE)
     NEXT SENTENCE ELSE GO TO CHECK-IT-DONE.

(Incidentally LOW-VALUE means binary-zero, not a ‘0’ character that has a 0xF0 value in EBCDIC).

COBOL is made especially incomprehensible since you can’t declare an in-line temporary in COBOL.  If you want one, you have to go thousands of lines up in the code to the WORKING-STORAGE section, and put a variable in there somewhere.  Such a variable is global to the whole program, and you have to search to determine its usage scope.  You probably also need a really long name for it because of namespace collision with all the other global variables.  Basically, you are better off not using any helper variables, unless you want to pay an explicit cost in overall code complexity.

In my test program that illustrated the compiler bug, I made other COBOL errors. I blame the fact that I was using gross GOTO ridden code like the original. Here was my program:

Because I misinterpreted the NOT= distribution, I thought this should produce:

000000001: !(not space and low-value.)
000000002: !(not space and low-value.)
000000003: !(not space and low-value.)
000000003: not space and low-value.

Once the subtle compiler bug was fixed, the actual SYSOUT from the program was:

000000001: not space and low-value.
000000001: !(not space and low-value.)
000000002: !(not space and low-value.)
000000003: !(not space and low-value.)

See how both the TRUE and FALSE basic blocks executed in my code. That didn’t occur in the original code, because it used an additional dummy EXIT paragraph to end the PERFORM range, and had a GOTO out of the first paragraph.

There is more modern COBOL syntax that can avoid this GOTO hell, but I hadn’t used it, as I kept the reproducer somewhat like the original code.

My collection of Peeter Joot physics paperbacks

May 22, 2020 math and physics play No comments

I ordered a copy of my old PHY456 Quantum Mechanics II notes for myself, and it arrived today!  Here it is with its buddies (Grad QM and QFT):

With the shipping cost from the US to Canada (because I’m now paying for amazon prime anyways) it’s actually cheaper for me to get a regular copy than to order an author proof, so this time I have no “not for resale” banding.

This little stack of Quantum notes weighs in at about 1050 pages, and makes a rather impressive pile.  There’s a lot of info there, for the bargain price of either free or about $30 USD, depending on whether you want a PDF or print copy of this set.  Of course, most people want neither, and get all their quantum mechanics through osmosis from the engineering of the microchips and electronics in their phones and computers.

I have to admit that it’s a fun ego boost to see your name in print.  In order to maximize the ego boost, you can use my strategy and do large scale vanity press, making a multiple volume set for yourself.  Here’s my whole collection, which includes the bulk of my course notes, plus my little book:

Based on the height of the stack, I’d guess this is about 3000 pages total, the product of about 10 years of study and work.

Making these all available for free to anybody in PDF form surely cripples my potential physical copy sales volume, but that doesn’t matter too much since I’ve set the price so low that I only get a token payment for each copy anyways.  Based on linear extrapolation of my sales so far, I’ll recoup my tuition costs (not counting the opportunity cost of working part time while I took the courses) after another 65 years of royalties.

Does this COBOL level-88 IF check make any sense?

May 21, 2020 COBOL 2 comments

I find COBOL level-88 declarations a bit confusing, which isn’t made any easier by usage that is probably wrong. Here’s an example from code that I was trying to step through in the debugger (anonymized):

       WORKING-STORAGE SECTION.
       01  data.
           10  function-type         PIC  X(01).
               88  option-a          VALUE '1'.
               88  option-b          VALUE '2'.
               88  option-c          VALUE '3'.
               88  option-d          VALUE '4'.

With the use like so:

           IF option-a AND option-b AND option-c
           NEXT SENTENCE ELSE GO TO meaningless-label-2.

It’s my understanding that this is essentially equivalent to:

           IF function-type = '1' AND function-type = '2' AND
              function-type = '3'
           NEXT SENTENCE ELSE GO TO meaningless-label-2.

Do I misunderstand how level-88 variables should be used, or is this just a plain old impossible-to-be-true IF check? Putting this into a little sample program confirms that we hit the ELSE:

       IDENTIFICATION DIVISION.
       PROGRAM-ID.                 TESTPROG.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  data.
           10  function-type         PIC  X(01).
               88  option-a          VALUE '1'.
               88  option-b          VALUE '2'.
               88  option-c          VALUE '3'.
               88  option-d          VALUE '4'.
       PROCEDURE DIVISION.
           move '1' to function-type

           perform meaningless-label-1 thru meaningless-label-6

           goback
           .

       meaningless-label-1.

      *    IF function-type = '1' AND function-type = '2' AND
      *       function-type = '3'
           IF option-a AND option-b AND option-c
           NEXT SENTENCE ELSE GO TO meaningless-label-2.

           display 'IF was true.'

           goto meaningless-label-6
           .

       meaningless-label-2.

           display 'IF was not true.'
           .

       meaningless-label-6.
           EXIT
           .

I get SYSOUT of:

IF was not true.

as I expected. If these were level-88 variables each “belonging” to a different variable, such as:

       IDENTIFICATION DIVISION.
       PROGRAM-ID.                 TESTPROG.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  data.
           10  blah                 PIC  X(01).
               88  blah-option-a                 VALUE '1'.
               88  blah-option-b                 VALUE '2'.
           10  foo                  PIC  X(01).
               88  foo-option-a                  VALUE '1'.
               88  foo-option-b                  VALUE '2'.
               88  foo-option-c                  VALUE '3'.
           10  bar                  PIC  X(01).
               88  bar-option-c                  VALUE '3'.
               88  bar-option-d                  VALUE '4'.

       PROCEDURE DIVISION.
           move '1' to blah
           move '2' to foo
           move '3' to bar

           perform meaningless-label-1 thru meaningless-label-6

           goback
           .

       meaningless-label-1.

           IF blah-option-a AND foo-option-b AND bar-option-c
           NEXT SENTENCE ELSE GO TO meaningless-label-2.

           display 'IF was true.'

           goto meaningless-label-6
           .

       meaningless-label-2.

           display 'IF was not true.'
           .

       meaningless-label-6.
           EXIT
           .

This has the ‘IF was true’ SYSOUT. Perhaps the original coder meant to use OR instead of AND?

COBOL spaghetti code: EXIT does nothing!

May 20, 2020 COBOL 1 comment

I was staring down COBOL code of the following form:

       LOOP-COUNTER-INCREMENT.
           ADD 1 TO J.
       LOOP-PREDICATE-CHECK.   
           IF J GREATER 10 GO TO MYSTERIOUS-LABEL-1.
           
           IF ARRAY-1 (J)      NOT = ZERO
           NEXT SENTENCE ELSE GO TO MYSTERIOUS-LABEL-1.
           
           IF ARRAY-2 (J) = MYSTERIOUS-MAGIC-NUMBER-CONSTANT
           NEXT SENTENCE ELSE GO TO COUNTER-INCREMENT-SPAGGETTIFI.
           
     *     ...MORE STUFF...                                        
     
           GO TO MYSTERIOUS-LABEL-3.
           
       COUNTER-INCREMENT-SPAGGETTIFI.
           GO TO LOOP-COUNTER-INCREMENT.
           
       MYSTERIOUS-LABEL-1.
                       EXIT.
       MYSTERIOUS-LABEL-2.
                       EXIT.
       MYSTERIOUS-LABEL-3.
                       EXIT.

I had to get some guru help understanding what this was about (thanks Roger!). I didn’t understand why somebody would code a GOTO LABEL, when the code at that LABEL just did an EXIT. If my intuition could be trusted, I would have assumed that this code was equivalent to the much simpler:

       LOOP-COUNTER-INCREMENT.
           ADD 1 TO J.
       LOOP-PREDICATE-CHECK.   
           IF J GREATER 10 EXIT.
           
           IF ARRAY-1 (J)      NOT = ZERO
           NEXT SENTENCE ELSE EXIT.
           
           IF ARRAY-2 (J) = MYSTERIOUS-MAGIC-NUMBER-CONSTANT
           NEXT SENTENCE ELSE GO TO LOOP-COUNTER-INCREMENT.
           
     *     ...MORE STUFF...                                        
     
           EXIT.

It turns out that intuition is not much use when looking at COBOL code. In this case, that intuition failure is because EXIT doesn’t actually do anything. It is not like a return, which is what I assumed, but is just something that you can put in a paragraph at the end of the section so that the code can exit the section (or at the end of a sequence of paragraphs invoked by PERFORM THRU, so that the code can return to the caller.)  The EXIT in such a paragraph is just a comment, and you could use an empty paragraph to do the same thing.

In my transformation of the code the EXIT would do nothing, and execution would just fall through to the next sentence!

Some of the transformations I made are valid. In particular, the spaghettification-indirection used to increment the loop counter, which uses a GO TO that lands on another GO TO instead of jumping straight to the target, has no reason to exist.

The code in question was an edited version of a program that was generated by a 4GL (DELTA), so some of the apparent stupidity can be blamed on the code generator. I assume DELTA can also be blamed for the multiple EXIT paragraphs, when it would seem more natural to just have one per section.

This code also uses EXIT after other paragraph labels. The first paragraph in the following serving of horror has such an example:

            PERFORM TRANSFER-CHECK THRU TRANSFER-CHECK-EXIT.

            [snip]

       TRANSFER-CHECK.
                       EXIT.
       MEANINGLESS-LABEL-1.
           IF [A COMPOUND PREDICATE CHECK]
           NEXT SENTENCE ELSE GO TO MEANINGLESS-LABEL-2.
                 [SNIP]
           PERFORM [MORE STUFF]
           GO TO MEANINGLESS-LABEL-100.
       MEANINGLESS-LABEL-2.
           [STUFF]
           GO TO MEANINGLESS-LABEL-4.
       MEANINGLESS-LABEL-3.
           [increment loop counter, and fall through]
       MEANINGLESS-LABEL-4.
           [loop body]
...
       MEANINGLESS-LABEL-50.
           GO TO MEANINGLESS-LABEL-3.
           [SNIP]
...
       MEANINGLESS-LABEL-99.                            
                       EXIT.                               
       MEANINGLESS-LABEL-100.                                       
                       EXIT. 
       TRANSFER-CHECK-EXIT.
                       EXIT.

Nothing ever branches to MEANINGLESS-LABEL-1 directly, so why even have that there? Using my newfound knowledge that EXIT doesn’t do anything, I’m pretty sure that you could just write:

            PERFORM TRANSFER-CHECK THRU TRANSFER-CHECK-EXIT.

            [snip]

       TRANSFER-CHECK.
       
           IF [A COMPOUND PREDICATE CHECK]

Is there some subtle reason that this first no-op paragraph was added? My guess is that the programmer was either being paid per line of code, or the code generator is to blame.

I’m not certain about the flow-control in the TRUE evaluation above. My intuition about the THRU use above is that if we have a GOTO that bypasses one of the paragraphs, then all the preceding paragraphs are counted as taken (i.e. if you get to the final paragraph in the THRU evaluation, no matter how you get there, then you are done.) I’ll have to do an experiment to determine if that’s actually the case.

My old Quantum II notes are now available on amazon

May 17, 2020 phy456 No comments

PHY456, Quantum Mechanics II, was one of the first few courses that I did as part of my non-degree upper year physics program.  That was a self-directed, part-time study program, where I took most of the interesting-seeming fourth year undergrad physics courses at UofT.

I was never really pleased with how my QMII notes came out, and unlike some of my other notes compilations, I never made a version available on amazon, instead just had the PDF available for free on my Quantum Mechanics page.  That page also outlines how to get a copy of the latex sources for the notes (for the curious, or for the zealous reader who wants to submit merge requests with corrections.)

Well, over the last month or so, I’ve gradually cleaned up these QMII notes enough that they are “print-ready” (no equations overflowing into the “gutter”, …), and have gone ahead and made them available on amazon, for $10 USD.  Like my other class notes “books”, this is published using amazon’s print on demand service.  In the likely event that nobody will order a copy, there is no upfront requirement for me to order a minimal sized print run, and then be stuck with a whole bunch of copies that I can’t give away.

There are still lots of defects in this set of notes.  In particular, I seem to have never written up my problem set solutions in latex, and subsequently lost those solutions.  There’s also lots of redundant material, as I reworked a few of the derivations multiple times, and never went back and purged the crud.  That said, they are available as-is, now in paper form, as well as a free PDF.

I’ll share the preface, and the contents below.

Preface.

These are my personal lecture notes for the Fall 2011, University of Toronto Quantum mechanics II course (PHY456H1F), taught by Prof. John E Sipe.

The official description of this course was:

“Quantum dynamics in Heisenberg and Schrodinger Pictures; WKB approximation; Variational Method; Time-Independent Perturbation Theory; Spin; Addition of Angular Momentum; Time-Dependent Perturbation Theory; Scattering.”

This document contains a few things

  • My lecture notes.
  • Notes from reading of the text \citep{desai2009quantum}. This may include observations, notes on what seem like errors, and some solved problems.
  • Different ways of tackling some of the assigned problems than the solution sets.
  • Some personal notes exploring details that were not clear to me from the lectures.
  • Some worked problems.

There were three main themes in this course, my notes for which can be found in

  • Approximate methods and perturbation,
  • Spin, angular momentum, and two particle systems, and
  • Scattering theory.

Unlike some of my other course notes compilations, this one is short and contains few worked problems. It appears that I did most of my problem sets on paper and subsequently lost my solutions. There are also some major defects in these notes:

  • There are plenty of places where things weren’t clear, and there are still comments to follow up on those issues to understand them.
  • There is redundant content, from back to back lectures on materials that included review of the previous lecture notes.
  • A lot of the stuff in the appendix (mostly personal notes and musings) should be merged into the appropriate lecture note chapters. Some work along those lines has been started, but that work was very preliminary.
  • I reworked some ideas from the original lecture notes to make sense of them (in particular, adiabatic approximation theory), but then didn’t go back and consolidate all the different notes for the topic into a single coherent unit.
  • There were Mathematica notebooks for some of the topics with issues that I never did figure out.
  • Lots of typos, bad spelling, and horrendous grammar.
  • The indexing is very spotty.

Hopefully, despite these and other defects, these notes may be of some value to other students of Quantum Mechanics.

I’d like to thank Professor Sipe for teaching this course. I learned a lot and it provided a great foundation for additional study.

Phy456 (QM II) Contents:

  • Copyright
  • Document Version
  • Dedication
  • Preface
  • Contents
  • List of Figures
  • 1 Approximate methods.
  • 1.1 Approximate methods for finding energy eigenvalues and eigenkets.
  • 1.2 Variational principle.
  • 2 Perturbation methods.
  • 2.1 States and wave functions.
  • 2.2 Excited states.
  • 2.3 Problems.
  • 3 Time independent perturbation.
  • 3.1 Time independent perturbation.
  • 3.2 Issues concerning degeneracy.
  • 3.3 Examples.
  • 4 Time dependent perturbation.
  • 4.1 Review of dynamics.
  • 4.2 Interaction picture.
  • 4.3 Justifying the Taylor expansion above (not class notes).
  • 4.4 Recap: Interaction picture.
  • 4.5 Time dependent perturbation theory.
  • 4.6 Perturbation expansion.
  • 4.7 Time dependent perturbation.
  • 4.8 Sudden perturbations.
  • 4.9 Adiabatic perturbations.
  • 4.10 Adiabatic perturbation theory (cont.)
  • 4.11 Examples.
  • 5 Fermi’s golden rule.
  • 5.1 Recap. Where we got to on Fermi’s golden rule.
  • 5.2 Fermi’s Golden rule.
  • 5.3 Problems.
  • 6 WKB Method.
  • 6.1 WKB (Wentzel-Kramers-Brillouin) Method.
  • 6.2 Turning points.
  • 6.3 Examples.
  • 7 Composite systems.
  • 7.1 Hilbert Spaces.
  • 7.2 Operators.
  • 7.3 Generalizations.
  • 7.4 Recalling the Stern-Gerlach system from PHY354.
  • 8 Spin and Spinors.
  • 8.1 Generators.
  • 8.2 Generalizations.
  • 8.3 Multiple wavefunction spaces.
  • 9 Two state kets and Pauli matrices.
  • 9.1 Representation of kets.
  • 9.2 Representation of two state kets.
  • 9.3 Pauli spin matrices.
  • 10 Rotation operator in spin space.
  • 10.1 Formal Taylor series expansion.
  • 10.2 Spin dynamics.
  • 10.3 The hydrogen atom with spin.
  • 11 Two spins, angular momentum, and Clebsch-Gordan.
  • 11.1 Two spins.
  • 11.2 More on two spin systems.
  • 11.3 Recap: table of two spin angular momenta.
  • 11.4 Tensor operators.
  • 12 Rotations of operators and spherical tensors.
  • 12.1 Setup.
  • 12.2 Infinitesimal rotations.
  • 12.3 A problem.
  • 12.4 How do we extract these buried simplicities?
  • 12.5 Motivating spherical tensors.
  • 12.6 Spherical tensors (cont.)
  • 13 Scattering theory.
  • 13.1 Setup.
  • 13.2 1D QM scattering. No potential wave packet time evolution.
  • 13.3 A Gaussian wave packet.
  • 13.4 With a potential.
  • 13.5 Considering the time independent case temporarily.
  • 13.6 Recap.
  • 14 3D Scattering.
  • 14.1 Setup.
  • 14.2 Seeking a post scattering solution away from the potential.
  • 14.3 The radial equation and its solution.
  • 14.4 Limits of spherical Bessel and Neumann functions.
  • 14.5 Back to our problem.
  • 14.6 Scattering geometry and nomenclature.
  • 14.7 Appendix.
  • 14.8 Verifying the solution to the spherical Bessel equation.
  • 14.9 Scattering cross sections.
  • 15 Born approximation.
  • A Harmonic oscillator Review.
  • A.1 Problems.
  • B Simple entanglement example.
  • C Problem set 4, problem 2 notes.
  • D Adiabatic perturbation revisited.
  • E 2nd order adiabatically Hamiltonian.
  • F Degeneracy and diagonalization.
  • F.1 Motivation.
  • F.2 A four state Hamiltonian.
  • F.3 Generalizing slightly.
  • G Review of approximation results.
  • G.1 Motivation.
  • G.2 Variational method.
  • G.3 Time independent perturbation.
  • G.4 Degeneracy.
  • G.5 Interaction picture.
  • G.6 Time dependent perturbation.
  • G.7 Sudden perturbations.
  • G.8 Adiabatic perturbations.
  • G.9 WKB.
  • H Clebsch-Gordan zero coefficients.
  • H.1 Motivation.
  • H.2 Recap on notation.
  • H.3 The \(J_z\) action.
  • I One more adiabatic perturbation derivation.
  • I.1 Motivation.
  • I.2 Build up.
  • I.3 Adiabatic case.
  • I.4 Summary.
  • J Time dependent perturbation revisited.
  • K Second form of adiabatic approximation.
  • L Verifying the Helmholtz Green’s function.
  • M Mathematica notebooks.
  • Index
  • Bibliography

Reverse engineering a horrible COBOL structure initialization

May 16, 2020 Mainframe 2 comments

The COBOL code that I was looking at used a magic value 999, and I couldn’t see where it could be coming from.  After considerable head scratching, I managed to figure out that all the array structure instantiations in the code are initialized using strings.  That seems to be the origin of the magic (standalone) 999’s scattered through the code.

To share the horror, here is an (anonymized) example of the offending array structure initialization

where I added in the block comment that points out each of the interesting regions of the initialization strings.

Here’s what’s going on.  We have a global variable array (effectively unnamed) that has three fields:

  • two-characters (numeric only)
  • dummy-structure-name, containing a 3 character field and a pad.
  • nine-more-characters

If you add up all the characters in this data structure we have: 2 + 1 + 4 * (3 + 1) + 9 = 28, so this array initialization is effectively done by aliasing the array elements with the memory containing a char[7][28].

My eyes are burning!

As far as I can tell, COBOL has no notion of a structure type, you just have instances of structures everywhere (they are probably called something different — a level 01 declaration, or something like that).  A lot of the PL/I code I’ve seen is also like that, although in PL/I you can declare your structure types if you want to.

The displays above make use of the fact that COBOL variables don’t have to use all the high level qualifiers (unless there is ambiguity).  My SYSOUT shows that, sure enough, the (5) element of the array (COBOL arrays are indexed from one) has the values I expected:

1 22
2 999
3 1/2
4
5
6 SF

Basically, the horrendous initialization above is as if you declared your structure as:

struct arrayname
{                   
   char numeric2[2];
   char filler1[1];
   struct               
   {                 
      char threemore[3];
      char filler2[1];
   } threepluspad[4];

   char ninemore[9];     
}; 

and then initialized it with:

char globalmemory[7][28] = {
  // n2   n2   f    x    x    x    y    x    x    x    y    x    x    x    y    x    x    x    y    ninemore[9] (x: threemore, y: filler2)
   { '0', '1', ' ', ' ', ' ', '0', ' ', ' ', ' ', '0', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'K', 'l', 'a', 's', 's', 'e', ' ', ' ', ' '},
   { '0', '2', ' ', ' ', ' ', '0', ' ', ' ', ' ', '0', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'K', 'l', 'a', 's', 's', 'e', ' ', ' ', ' '},
   { '1', '3', ' ', '9', '9', '9', ' ', '9', '9', '9', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'K', 'l', 'a', 's', 's', 'e', ' ', ' ', ' '},
   { '2', '1', ' ', '9', '9', '9', ' ', '1', '/', '2', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'S', 'F', ' ', ' ', ' ', ' ', ' ', ' ', ' '},
   { '2', '2', ' ', '9', '9', '9', ' ', '1', '/', '2', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'S', 'F', ' ', ' ', ' ', ' ', ' ', ' ', ' '},
   { '2', '3', ' ', '1', '/', '2', ' ', '1', '/', '2', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'S', 'F', ' ', ' ', ' ', ' ', ' ', ' ', ' '},
   { '3', '1', ' ', ' ', ' ', '1', ' ', ' ', ' ', '1', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'S', 'F', ' ', ' ', ' ', ' ', ' ', ' ', ' '},
};

struct arrayname * p = (struct arrayname*)globalmemory;

and then printed:

   printf( "1 %.2s\n", p[4].numeric2 );
   printf( "2 %.3s\n", p[4].threepluspad[0].threemore );
   printf( "3 %.3s\n", p[4].threepluspad[1].threemore );
   printf( "4 %.3s\n", p[4].threepluspad[2].threemore );
   printf( "5 %.3s\n", p[4].threepluspad[3].threemore );
   printf( "6 %.9s\n", p[4].ninemore );

Of course, the use of fixed length strings without a null terminator wouldn’t ever be done in C, so a more natural equivalent (assuming one doesn’t care about the specific memory equivalence of the two representations, and can tolerate null terminators instead of spaces) would just be something like:

struct arrayname
{
   char numeric2[3];
   struct 
   {
      char threemore[4];
   } threepluspad[4];

   char ninemore[9];
};

struct arrayname g[7] = {
   { "01", {"  0", "  0", "   ", "   "}, "Klasse  " },
   { "02", {"  0", "  0", "   ", "   "}, "Klasse  " },
   { "13", {"999", "999", "   ", "   "}, "Klasse  " },
   { "21", {"999", "1/2", "   ", "   "}, "SF      " },
   { "22", {"999", "1/2", "   ", "   "}, "SF      " },
   { "23", {"1/2", "1/2", "   ", "   "}, "SF      " },
   { "31", {"  1", "  1", "   ", "   "}, "SF      " }
};  

You could argue that the COBOL way isn’t so bad once you’ve seen the pattern, and is only cosmetically different from the natural C analogue. That is, if you ignore the fact that there is no separation of fields in the initializer strings, and that you have to name a whole bunch of dummy initializer objects and fill characters, and the fact that any semblance of typing is completely obliterated.

The code in question is also complete spaghetti, with GOTO all over the place.  Perhaps COBOL versions after COBOL77, which is what I assume I’m looking at, added loops and better initialization syntax?

Computing “offsetof” in COBOL

May 15, 2020 Mainframe 1 comment

I couldn’t find a way to compute something like C offsetof in COBOL code.  What I could manage to figure out how to do is compare addresses of a runtime instantiation of the structure, effectively doing this indirectly.  Here’s the ugly mess that I cooked up:

I couldn’t figure out the right syntax to do a single compute statement that was just the difference of addresses, as I got numeric/pointer compare errors from the compiler, no matter what I tried.  I think that ‘USAGE IS POINTER’ may be required on my variables, but that would still require a temporary.  I’m probably either doing this the hard way, or there is no easy way in COBOL.

This program was run with the following simple JCL

//TESTPROG JOB
//A EXEC PGM=TESTPROG
//SYSOUT DD SYSOUT=*
//STEPLIB DD DSN=COBRC.NATIVE.TESTPROG,
// DISP=SHR

and produced the following SYSOUT

address of TESTPROG-STRUCT = 0016800264
offsetof(ARRAY-NAME,RUECK-BKL) = 0000000002
offsetof(ARRAY-NAME,RUECK-BS) = 0000000004
offsetof(ARRAY-NAME,RUECK-SF) = 0000000007
sizeof(ARRAY-NAME(1)) = 0000000019

Looking at that output, we can conclude the following:

  • PIC S9(3) COMP-3 is effectively horrible eye-burning syntax for a two byte field, the size of a “short” (it is packed decimal: three digits plus a sign nibble).
  • There is no alignment padding between fields, nor end of array-member padding to force natural alignment of the next array element, even if the start of the structure was aligned.

I knew the latter, but wasn’t sure what size the first field was, and thought that trying to figure it out with COBOL code would be a good learning exercise.

Exploring 0^0, x^x, and z^z.

May 10, 2020 math and physics play No comments

My Youtube home page knows that I’m geeky enough to watch math videos.  Today it suggested Eddie Woo’s video about \(0^0\).

Mr Woo has great enthusiasm, and must be an awesome teacher to have in person.  He reminds his class about the exponent laws, which allow for an interpretation that \(0^0\) would be equal to 1.  He points out that \(0^n = 0\) for any positive integer \(n\), which admits a second contradictory value for \( 0^0 \), if this was true for \(n=0\) too.

When reviewing the exponent laws Woo points out that the exponent law for subtraction \( a^{n-n} \) requires \(a\) to be non-zero.  Given that restriction, we really ought to have no expectation that \(0^{n-n} = 1\).

To attempt to determine a reasonable value for this question, resolving the two contradictory possibilities, neither of which we actually have any reason to assume are valid possibilities, he asks the class to perform a proof by calculator, computing a limit table for \( x \rightarrow 0+ \). I stopped at that point and tried it by myself, constructing such a table in Mathematica. Here is what I used

griddisp[labelc1_, labelc2_, f_, values_] := Grid[({
({{labelc1}, values}) // Flatten,
({ {labelc2}, f[#] & /@ values} ) // Flatten
}) // Transpose,
Frame -> All]
decimalFractions[n_] := ((10^(-#)) & /@ Range[n])
With[{m = 10}, griddisp[x, x^x, #^# &, N[decimalFractions[m], 10]]]
With[{m = 10}, griddisp[x, x^x, #^# &, -N[decimalFractions[m], 10]]]

Observe that I calculated the limits from both above and below. The results are

and for the negative limit

Sure enough, from both below and above, we see numerically that \(\lim_{\epsilon\rightarrow 0} \epsilon^\epsilon = 1\), as if the exponent law argument for \( 0^0 = 1 \) was actually valid.  We see that this limit appears to be valid despite the fact that \( x^x \) can be complex valued.  That ignores the fact that a rigorous limit argument should be valid for any path approaching \( x = 0 \), and not just along two specific (real valued) paths.
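The numerical table for the positive side can be backed up analytically.  Here is a sketch using L’Hôpital’s rule on \( x \ln x \) (this step is not part of the calculator argument from the video):
\begin{equation}
\begin{aligned}
\lim_{x \rightarrow 0^+} x \ln x
&= \lim_{x \rightarrow 0^+} \frac{\ln x}{1/x} \\
&= \lim_{x \rightarrow 0^+} \frac{1/x}{-1/x^2} \\
&= \lim_{x \rightarrow 0^+} (-x) = 0,
\end{aligned}
\end{equation}
so \( x^x = e^{x \ln x} \rightarrow e^0 = 1 \) as \( x \rightarrow 0^+ \).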

Let’s get a better idea where the imaginary component of \((-x)^{-x}\) comes from.  To do so, consider \( f(z) = z^z \) for complex values of \( z \) where \( z = r e^{i \theta} \). The logarithm of such a beast is

\begin{equation}\label{eqn:xtox:20}
\begin{aligned}
\ln z^z
&= z \ln \lr{ r e^{i\theta} } \\
&= z \ln r + i \theta z \\
&= e^{i\theta} \ln r^r + i \theta z \\
&= \lr{ \cos\theta + i \sin\theta } \ln r^r + i r \theta \lr{ \cos\theta + i \sin\theta } \\
&= \cos\theta \ln r^r - r \theta \sin\theta
+ i r \lr{ \sin\theta \ln r + \theta \cos\theta },
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:xtox:40}
z^z =
e^{ r \lr{ \cos\theta \ln r - \theta \sin\theta}} \times
e^{i r \lr{ \sin\theta \ln r + \theta \cos\theta }}.
\end{equation}
In particular, picking the \( \theta = \pi \) branch, we have, for any \( x > 0 \)
\begin{equation}\label{eqn:xtox:60}
(-x)^{-x} = e^{-x \ln x - i x \pi } = \frac{e^{ - i x \pi }}{x^x}.
\end{equation}
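As a sanity check on the algebra, the modulus/phase factorization can be compared numerically against a direct principal-branch evaluation. This is a small Python sketch, not part of the original calculation. The function names are hypothetical, and the comparison assumes \( \theta \in (-\pi, \pi] \), so that it matches the branch that `cmath.log` picks.

```python
import cmath
import math

def zz_polar(r, theta):
    """z**z for z = r e^{i theta}, using the modulus and phase factors derived above."""
    modulus = math.exp(r * (math.cos(theta) * math.log(r) - theta * math.sin(theta)))
    phase = r * (math.sin(theta) * math.log(r) + theta * math.cos(theta))
    return modulus * cmath.exp(1j * phase)

def zz_principal(z):
    """z**z evaluated directly as exp(z log z) on the principal branch."""
    return cmath.exp(z * cmath.log(z))

def neg_x_closed_form(x):
    """(-x)**(-x) = exp(-i pi x) / x**x for x > 0 (the theta = pi branch)."""
    return cmath.exp(-1j * math.pi * x) / x**x

if __name__ == "__main__":
    for r, theta in [(0.5, 0.3), (2.0, -1.2), (1.5, 3.0)]:
        z = cmath.rect(r, theta)
        print(r, theta, abs(zz_polar(r, theta) - zz_principal(z)))
    for x in (0.1, 0.5, 2.5):
        print(x, abs(neg_x_closed_form(x) - zz_principal(complex(-x, 0.0))))
```

The printed differences should all be at the level of floating point roundoff.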

Let’s get some visual appreciation for this interesting \(z^z\) beastie, first plotting it for real values of \(z\)


Manipulate[
Plot[ {Re[x^x], Im[x^x]}, {x, -r, r}
, PlotRange -> {{-r, r}, {-r^r, r^r}}
, PlotLegends -> {Re[x^x], Im[x^x]}
], {{r, 2.25}, 0.0000001, 10}]

From this display, we see that the imaginary part of \( x^x \) is zero for integer values of \( x \).  That’s easy enough to verify explicitly: \( (-1)^{-1} = -1, (-2)^{-2} = 1/4, (-3)^{-3} = -1/27, \cdots \).
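The \( \theta = \pi \) result above makes this explicit.  Splitting it into real and imaginary parts, for \( x > 0 \), gives
\begin{equation}
(-x)^{-x} = \frac{\cos(\pi x)}{x^x} - i \frac{\sin(\pi x)}{x^x},
\end{equation}
so the imaginary part vanishes exactly when \( \sin(\pi x) = 0 \), that is, for integer \( x = n \), where \( (-n)^{-n} = (-1)^n/n^n \).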

The newest version of Mathematica has a few nice new complex number visualization options.  Here are two that I found illuminating.  The first is an absolute value plot that highlights the poles and zeros, and also shows some of the phase action:

Manipulate[
ComplexPlot[ x^x, {x, s (-1 - I), s (1 + I)},
PlotLegends -> Automatic, ColorFunction -> "GlobalAbs"], {{s, 4},
0.00001, 10}]

We see the branch cut nicely, the tendency to zero in the left half plane, as well as some of the phase periodicity in the regions between the zeros and the poles.  We can also plot just the phase, which shows its interesting periodic nature:


Manipulate[
ComplexPlot[ x^x, {x, s (-1 - I), s (1 + I)},
PlotLegends -> Automatic, ColorFunction -> "CyclicArg"], {{s, 6},
0.00001, 10}]

I’d like to take the time to play with some of the other ColorFunction options for ComplexPlot, which appears to be a powerful and flexible visualization tool.

File organization in really old COBOL code.

May 7, 2020 Mainframe

I encountered customer COBOL code today with a file declaration of the following form:

         SELECT AUSGABE ASSIGN TO UR-S-AUSGABE
          ACCESS IS SEQUENTIAL.
...
       FD  AUSGABE
           RECORDING F
           BLOCK 0 RECORDS
           LABEL RECORDS OMITTED.

where the program’s JCL used an AUSGABE (German “output”) DDNAME of the following form:

//AUSGABE   DD    DUMMY

The SELECT looked completely wrong to me, as I thought that SELECT is supposed to have the form:

SELECT cobol-file-variable-name ASSIGN TO ddname

That’s the syntax that my Murach’s Mainframe COBOL uses, and also what I’d seen in big-blue’s documentation.

However, in this customer’s code, the identifier UR-S-AUSGABE is longer than 8 characters, so it sure didn’t look like a DDNAME. I preprocessed the code looking to see if UR-S-AUSGABE was hiding in a copybook (mainframe lingo for an include file), but it wasn’t. How on Earth did this work when it was compiled and run on the original mainframe?

It turns out that the [LABEL-]S- and [LABEL-]AS- prefixes are ways that really old COBOL code used to specify file organization (something like PL/I’s ENV(ORGANIZATION) clauses for FILEs). This works on the mainframe because a “modern” mainframe COBOL compiler strips off the LABEL- prefix if specified and the organization prefix S- as well, essentially treating those identifier fragments as “comments”.  In this case, stripping UR- and S- leaves AUSGABE, which matches the DDNAME in the JCL.
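Here is a minimal Python sketch of that prefix stripping. It assumes the assignment name has the shape [label-][S-|AS-]name, and that the final hyphen-separated component is the DDNAME. The function name `parse_assignment_name` is hypothetical, and this only illustrates the idea. It is not how any particular compiler implements it.

```python
def parse_assignment_name(assignment_name):
    """Split an old-style ASSIGN TO name into (label, organization, ddname).

    Assumed shape: [label-][S-|AS-]name, where name contains no hyphens.
    """
    parts = assignment_name.upper().split("-")
    ddname = parts[-1]        # last component is taken to be the DDNAME
    prefixes = parts[:-1]     # everything before it is label and/or organization
    organization = None
    if prefixes and prefixes[-1] in ("S", "AS"):
        organization = prefixes.pop()
    label = "-".join(prefixes) or None
    return label, organization, ddname

if __name__ == "__main__":
    print(parse_assignment_name("UR-S-AUSGABE"))
    print(parse_assignment_name("AUSGABE"))
```

For the customer code above, this would treat UR as the label, S as the organization, and AUSGABE as the name that has to match the DD statement.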

For anybody reading this who has only programmed in a sane programming language, on sane operating systems, this all probably sounds like verbal diarrhea.  What on earth is a file organization and ddname?  Do I really have to care about those just to access a file?  Well, on the mainframe, yes, you do.

These mysterious dependencies highlight a number of reasons why COBOL code is hard to migrate. It isn’t just a programming language, but it is tied to the mainframe with lots of historic baggage in ways that are very difficult to extricate.  Even just to understand how to open a file in mainframe COBOL you have a whole pile of obstacles along the learning curve:

  • You don’t just run the program in a shell, passing in arguments, but you have to construct a JCL job step to do so.  This specifies parameters, environment variables, file handles, and other junk.
  • You have to know what a DDNAME is.  This is like a HANDLE in the JCL code that refers to a file.  The file has a filename (DSNAME), but you don’t typically use that.  Instead the JCL’s job step declares an arbitrary DDNAME to refer to that handle, and the program that is run in that job step has to always refer to the file using that abstract handle.
  • The file has all sorts of esoteric attributes that you have to know about to access it properly (fixed, variable, blocked, record length, block size, …).  The program that accesses the file typically has to make sure that these attributes are all encoded with the equivalent language specific syntax.
  • Files are not typically just byte streams on the mainframe but can have internal structure that can be as complicated as a simple database (keyed records, with special modes to access them to initialize vs access/modify.)
  • To make life extra “fun”, files are found in a variety of EBCDIC code pages.  In some cases these can’t be converted to single byte iso-8859-X code pages, so you have to use utf-8, and can get into trouble if you want to do round trip conversions.
  • Because of the internal structure of a mainframe file, you may not be able to transfer it to a sane operating system unless special steps are taken.  For example, a variable format file with binary data would typically have to be converted to a fixed format representation so that it’s possible to seek from record to record.
  • Within the (COBOL) code you have three sets of attributes that you have to specify to “declare” a file, before you can even attempt to open it: the DDNAME to COBOL-file-name mapping (SELECT), the FD clause (file properties), and finally record declarations (global variables that mirror the file data record structure that you have to use to read and write the file.)
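To make the variable-to-fixed conversion mentioned above a bit more concrete, here is a rough Python sketch. It assumes the file was transferred in binary with its record descriptor words (RDWs) intact, that there are no block descriptor words, and that padding short records with EBCDIC blanks is acceptable for the data in question. The function names and the `cp037` code page choice are assumptions made for illustration only.

```python
import struct

def make_rdw_record(payload):
    """Prefix a payload with a 4-byte RDW (length includes the RDW itself)."""
    return struct.pack(">HH", len(payload) + 4, 0) + payload

def split_rdw_records(blob):
    """Yield the payload of each variable-length record in blob."""
    offset = 0
    while offset < len(blob):
        if offset + 4 > len(blob):
            raise ValueError(f"truncated RDW at offset {offset}")
        # RDW: 2-byte big-endian record length, followed by 2 reserved bytes.
        (length,) = struct.unpack_from(">H", blob, offset)
        if length < 4 or offset + length > len(blob):
            raise ValueError(f"bad RDW length {length} at offset {offset}")
        yield blob[offset + 4 : offset + length]
        offset += length

def to_fixed(records, lrecl, pad=b"\x40"):
    """Pad each record to lrecl bytes (0x40 is the EBCDIC blank)."""
    out = []
    for rec in records:
        if len(rec) > lrecl:
            raise ValueError(f"record of {len(rec)} bytes exceeds LRECL {lrecl}")
        out.append(rec.ljust(lrecl, pad))
    return b"".join(out)

if __name__ == "__main__":
    blob = b"".join(make_rdw_record(s.encode("cp037")) for s in ("AUSGABE", "OK"))
    fixed = to_fixed(split_rdw_records(blob), 8)
    # Every record now starts at a multiple of 8, so seeking is trivial.
    print(repr(fixed.decode("cp037")))
```

Once the records are fixed length, record k starts at byte offset k * lrecl, which is what makes seeking from record to record possible on the target system.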

You can’t just learn to program COBOL, like you would any sane programming language, but also have to learn all the mainframe concepts that the COBOL code is dependent on.  Make sure you are close enough to your eyewash station before you start!

Finding the cheapest copy of my geometric algebra book on amazon

May 3, 2020 Geometric Algebra for Electrical Engineers

My book, “Geometric Algebra for Electrical Engineers” is available as a free PDF here on my website, but also available in color ($40) and black-and-white ($12) formats on amazon.  Both versions are basically offered close to cost, should the reader be like me, preferring a print copy that can be marked up.  In fact, I made it available initially just so that I could get a cheap bound copy for my own use that I could mark up myself.

I noticed today that amazon now hides the cheapest version of my book, and seems to show the price of a reseller first.  For example, if you click the link to the $12 black-and-white version, it now appears that the book is selling for $13.01

but if you click on “Other Sellers”, the kindle-direct (print on demand) version that amazon offers itself hides further down in the list of sellers.  The version that I’m selling directly through amazon.com is third on the list, despite it being the cheapest:

I guess that I’ve priced the black-and-white version of the book so low, that there are resellers that are willing to try to make some profit selling their own copies.  Do they depend on amazon giving them preferential listing order to make those sales?  I wonder how many of the people who have bought my book have ended up accidentally paying a higher price, using one of these resellers?

It does not appear that any resellers have played this game with the color version of the book, which has a higher price point.  I’m curious now to look at the sales stats for the two variations of the book to see how many of each version are selling (hardly any in either case, as the subject matter is too esoteric, but it was actually enough over the whole year that I did include the revenue on my income taxes.)