Month: December 2023

An absurd COBOL library: 2D Euclidean GA

December 31, 2023 COBOL, math and physics play , , , ,

I’ve achieved a new pinnacle of obscurity, and have now written a rudimentary COBOL implementation of a geometric algebra library for \( \mathbb{R}^2 \) calculations.

Who will use this?  Absolutely nobody.  Effectively, nobody knows geometric algebra.  Nobody wants to know COBOL, but some do.  The union of those two groups is vanishingly small (probably one: argued below.)

I understand that some Opus Dei members have taught themselves COBOL, as looking at COBOL has been found to be equally painful as a course of self flagellation.

Figure 0. A flagellation representation of COBOL.

Assuming that no Opus Dei practitioners know geometric algebra, that means that there is exactly one person in the world that both knows COBOL and geometric algebra.  Me.

Why did I write this little library?  Well, I was tickled to write something so completely stupid, and I’ve been laughing at the absurdity of it. I also thought I might learn a few things about COBOL in the process of trying to use it for something slightly non-trivial.  I’m adept at writing simple test programs that exercise various obscure compiler features, but those are usually fairly small.  On the flip side of complexity, I have to debug through a number of horribly complicated customer programs as part of my compiler validation work.  A simple real life test scenario might run 100+ COBOL programs in a set of CICS transactions, executing thousands of EXEC DLI and EXEC CICS statements as well as all of the rest of the COBOL language statements!  Despite having gained familiarity with COBOL from that sort of observational use, walking through stuff in the debugger doesn’t provide the same level of comfort with the language as writing code from scratch.  Since I have no interest in simulating a boring business application, why not do something just for fun as a learning game.

The compiler I am using does not seem to support object-COBOL (which would have been nicely suited for this project), so I’ve written my little toy in conventional COBOL, using one external procedure for each type of mathematical operation.  In the huge set of customer COBOL code that I’ve examined and done test compilations of, none of it has used object-COBOL.  I am guessing that the object-COBOL community is as large as the user base for my little toy COBOL geometric algebra library will ever be.

I’ve implemented methods to construct multivectors with scalar, vector and pseudoscalar components, or a general multivector with all of the above.  I’ve also implemented multiply, add, subtract, scalar multiplication, grade selection, and a DISPLAY function to write a multivector to SYSOUT (stdout equivalent.)

The multivector “type”

Figure 1 shows the implementation of my multivector type, implemented in copybook (include file) named MVI.  I have an alternate MV copybook that doesn’t have the VALUE (initialization) clauses, as you don’t want initialization for LINKAGE-SECTION values (i.e.: program parameters.)

Figure 1. Copybook with multivector declaration and initialization.

If you are wondering what the hell a ‘PIC S9(9) USAGE IS COMP-5’ is, well, that’s the “easy to remember” way to declare a 32-bit signed integer in COBOL.  A COMP-2, on the other hand, is a floating point value.

Figure 2 shows an example of the use of this copybook:

Figure 2. Using the multivector copybook.

Figure 3 shows these two copybook declarations after preprocessor expansion

Figure 3. Multivector global variable examples after preprocessing.

The global variable declarations above are roughly equivalent to the following pseudo C++ code (pretending that we can have anonymous unions that match the COBOL declarations above):

#include <complex>

using complex = std::complex<double>;

struct ga20{
   int grade{};
   union {
      struct { double sc{}; double ps{}; };
      complex g02{};
   };
   union { 
      struct { double x{}; double y{}; };
      complex g1{};
   };
};

ga20 a;
ga20 b;

COBOL is inherently untyped, but requires matching types for CALL parameters, or else all hell ensues, so you have to rely on naming conventions and other mechanisms to enforce the required type equivalences.  In this toy GA library, I’ve used copybooks to enforce the types required for everything.  Global variable declarations like these A-MV and B-MV variables are declared only using a copybook that knows the representation required, and all the uses in sub-programs of the effective -MV “type” use a matching copybook for their declarations.  However, I’ve also made use of the lack of typing to treat A-G02, B-G02, A-G1, and B-G1 as if they were complex numbers, and pass those “variables” off to complex number sub-programs, knowing that I’ve constructed the parameters to those programs in a way that is bit compatible with the MV field values.  You can screw things up really nicely doing stuff like this, especially because all COBOL sub-program parameters are (generally) passed by reference.  If you don’t match up the types right “fun ensues.”

Also observe that the nested level specifiers are optional in COBOL.  For nested fields in C++, we might write a.g1.x.  With a nested variable like this in COBOL, we could write something equivalent to that, like:

A-X OF A-G1 OF A-MV

but we can leave out any of the intermediate “level” specifications if we want.  This gets really confusing in complicated real-life COBOL code.  If you are looking to see where something is modified, you have to not only look for the variable of interest, but also any of the higher level fields, since any of those could have been passed off to other code, which implicitly wrote the value you are interested in.

Here’s what one of these multivectors looks like in memory on my (Linux x86-64) system

(lldb) c
Process 3903259 resuming
Process 3903259 stopped
* thread #10, name = 'GA20', stop reason = breakpoint 7.1
    frame #0: 0x00007fffd9189a02 PJOOT.GA20V01.LOADLIB(MULT).ec73dc4b`MULT at MULT.cob:50:1
   47              CALL GA-MKVECTOR-MODIFY USING C-MV, A-X, A-Y
   48              CALL GA-MKPSEUDO-MODIFY USING D-MV, A-PS
   49  
-> 50              MOVE 'A' TO WS-DISPPARM-N
   51              CALL GA-DISPLAY USING
   52                WS-DISPPARM-N,
   53                A-MV
(lldb) p A-MV
(A-MV) A-MV = {
  A-GRADE = -1
  A-G02 = (A-SC = 1, A-PS = 4)
  A-G1 = (A-X = 2, A-Y = 3)
}

i.e.: this has the value \( 1 + 2 \mathbf{e}_{12} + 3 \mathbf{e}_1 + 4 \mathbf{e}_1 \).

Looking at the multivector in it’s hex representation:

(lldb) fr v -format x A-MV
(A-MV) A-MV = {
  A-GRADE = 0xffffffff
  A-G02 = {
    A-SC = 0x3ff0000000000000
    A-PS = 0x4010000000000000
  }
  A-G1 = {
    A-X = 0x4000000000000000
    A-Y = 0x4008000000000000
  }
}

we see that the debugger is showing an underlying IEEE floating point representation for the COMP-2 variables in the program as it was compiled.

I have a multivector print routine that prints multivectors to SYSOUT:

Figure 4. Calling the multivector DISPLAY function.

where WS-DISPPARM-N is a PIC X(20).  (i.e.: a fixed size character array.)  Output for the A-MV value showing in the debug session above looks like:

A                     ( .10000000000000000E 01)                                                                         
                    + ( .20000000000000000E 01) e_1 + ( .30000000000000000E 01) e_2                                     
                    + ( .40000000000000000E 01) e_{12}            

End of sentence required for nested IFs?

I encountered a curious language issue in my multivector multiply function.  Here’s an example of how I’ve been coding IF statements

Figure 5. An IF END-IF pair without a period to terminate the sentence.

Notice that I don’t do anything special between the END-IF and the statement that follows it.  However, if I have an IF statement that includes nested IF END-IFs, then it appears that I need a period after the final END-IF, like so:

Figure 6. An IF with nested conditions that seems to require a period to terminate the sentence.

If I don’t include that period after the final END-IF (ending the COBOL sentence), then in some circumstances, I was seeing the program exit after the last interior basic block within this nested IF was executed.  In COBOL parlance, it seems as if a GOBACK (i.e.: return) was implicitly executed once we fell out of the big nested IF.  Why is that period required for a nested IF, but not for a simple IF?

In my “Murach’s mainframe COBOL”, he ends ALL if statements with a period, even simple IFs.  I don’t see a rationale for that in the book anywhere, but it’s a ~700 page book, so perhaps he says why at some point.

I’ve asked our compiler guys if this is a bug or expected behaviour, but I am guessing the latter…. I just don’t know why.

The multiplication kernel for this library

The workhorse of this GA(2,0) implementation, is a multivector multiplication operation, which can be implemented in two lines in Mathematica (or C++)

multivector /: multivector[_, m1_, m2_] ** multivector[_, n1_, n2_] := 
   multivector[-1, m1 n1 + Conjugate[m2] n2, n1 m2 + Conjugate[m1] n2 ]

In COBOL, it takes a lot more, and as usual, COBOL verbosity obfuscates things considerably. Here’s the equivalent code in my library:

Figure 7. GA(2,0) multiplication kernel in COBOL.

The library and a little test program.

If you are curious, you can poke around in the code for this library and the test program on github.  The sample/test program is src/MULT.cob, and running the job gives the following SYSOUT:

Figure 8. Sample SYSOUT for MULT.cob

A less evil COBOL toy complex number library

December 29, 2023 COBOL , , , , , , , , , ,

In a previous post ‘The evil of COBOL: everything is in global variables’, I discussed the implementation of a toy complex number library in COBOL.

That example code was a single module, having one paragraph for each function. I used a naming convention to work around the fact that COBOL functions (paragraphs) are completely braindead and have no input nor output parameters, and all such functions in a given loadmodule have access to all the variables of the entire program.

Perhaps you try to call me on my claim that COBOL doesn’t have parameters, nor does it have return values.  That’s true if you consider COBOL paragraphs to be the equivalent to functions.  I’ve heard paragraphs described as not-really-functions, and there’s some truth to that, especially since you can do a PERFORM range that executes a set of paragraphs, and there can be non-intuitive control flow transfers between paragraphs of such a range of execution, that is entirely non-function like.

There is one circumstance where COBOL parameters can be found.  It is actually possible to have both input and output parameters in COBOL, but it can only be done at a program level (i.e.: int main( int argc, char ** )). So, you can write a less braindead COBOL library, with a set of meaningful input and output parameters for each function, by using CALL instead of PERFORM, and a whole set of external programs, one for each of the operations that is desired. With that in mind, I’ve reworked my COBOL complex number toy library to use this program-level organization.  This is still a toy library implementation, but serves to illustrate the ideas.  The previous paragraph implementation can be found in the same repository, in the ../paragraphs-as-library/ directory.

Here are some examples of the functions in this little library, and examples of calls to them.

Multiply code:

And here’s a call to it:

Notice that I’ve opted to use dynamic calls to the COBOL functions, using a copybook that lists all the possible function names:

This frees me from the constraint of having to use inscrutable 8-character function names, which will get confusing as the library grows.

Like everything in COBOL, the verbosity makes it fairly unreadable, but refactoring all paragraphs into external programs, does make the calling code, and even the library functions themselves, much more readable.  It still takes 49 lines of code, to initialize two complex numbers, multiply them and display them to stdout.

Compare to the same thing in C++, which is 18 lines for a grow-your-own complex implementation with multiply:

#include <iostream>

struct complex{
   double re_;
   double im_;
};

complex mult(const complex & a, const complex & b) {
   // (a + b i)(c + d i) = a c - b d + i( b c + a d) 
   return complex{ a.re_ * b.re_ - a.im_ * b.im_,
                   a.im_ * b.re_ + a.re_ * b.im_ };
}

int main()
{
   complex a{1,2};
   complex b{3,4};
   complex c = mult(a, b);
   std::cout << "c = " << c.re_ << " +(" << c.im_ << ") I\n";

   return 0;
}

and only 11 lines if we use the standard library complex implementation:

#include <iostream>
#include <complex>

using complex = std::complex<double>;

int main() 
{  
   complex a{1,2}; 
   complex b{3,4};
   complex c = a * b;
   std::cout << "c = " << c << "\n";

   return 0;
}

Basically, we have one line for each operation: init, init, multiply, display, and all the rest is one-time fluff (the includes, main decl, return, …)

It turns out that the so-called OBJECT oriented COBOL extension to the language (circa Y2K), is basically a packaging of external-style-programs into collections that are class prefixed, just as I’ve done above.  This provides the capability for information hiding, and allows functions to have parameters and return values.  However, this doesn’t try to rectify the fundamental failure of the COBOL language: everything has to be in a global variable.  This language extension appears to be a hack that may have been done primarily for Java integration, which is probably why nobody uses it.  You just can’t take the dinosaur out of COBOL.

Sadly, it didn’t take people long to figure out that it’s incredibly dumb to require all variables to be global.  Even PL/I, which is 59 years old at the time I write this (only five years younger than COBOL), got it right.  They added parameters and return values to functions, and allow functions to have variables that are limited to that scope.  PL/I probably went too far, and added lots of features that are also braindead (like the PL/I macro preprocessor), but the basic language is at least sane.  It’s interesting that COBOL never evolved.  A language like C++ may have evolved too much, and still is, but the most flagrant design flaw in the COBOL language has been there since inception, despite every other language in the world figuring out that sort of stupidity should not be propagated.

Note that I work on the development of a COBOL and PL/I compilation stack.  I really like my work, which is challenging and great fun, and I work with awesome people. That doesn’t stop me from acknowledging that COBOL is a language spawned in hell by Satan. I can love my work, which provides tools for customers allowing them to develop, maintain and debug COBOL code, but also have great pity and remorse for those customers, having inherited ancient code built with an ancient language, and having no easy migration path away from that language.

V0.3.5 of Geometric Algebra for Electrical Engineers (and temp hardcover price drop)

December 16, 2023 Geometric Algebra for Electrical Engineers , , , , ,

Yes, I just published an update last week, but here’s another one.

Temporary price drop on hardcover.

It’s been 4 years since I printed a copy of the book for myself to mark up and edit.  In particular, having added some vector calculus identities and their geometric algebra equivalents to chapter II, it messes up the flow a bit, and I’d like a paper copy to review to help figure out how to sequence it all better.  I may just start with div and curl (in their GA forms) before moving on to curvilinear coordinates, the vector derivative and all the integration theorems, …

While I intend to mark up my copy, I’m going to treat myself to a hardcover version this time, to see what it looks like.

However, to make things a bit cheaper on myself, I’ve reduced the price on the amazon.com marketplace for the hardcover version of the book to the absolute cheapest amazon will let me make it: $16.12 USD.  At that price, amazon will make some profit above printing costs, but I will not.  I believe this will also result in a price drop on the amazon.ca marketplace (unlike the paperback pricing, there is no explicit option to set an amazon.ca price for the hardcover version, so I think it’s just the USD price converted to CAD.)

So if you would like a hardcover print copy for yourself at bargain prices, now is your chance.  The paperback, in contrast, is $15.50 USD, so for only $0.62 USD more, you (and I) can get a hardcover version!  I’ll wait about a week before ordering my copy to make sure that it’s the newest version when I order, and will leave the hardcover price at $16.12 USD until I get my copy in the mail.  After that, the price will go back up, and I’ll make a couple bucks for any hardcover sales again after that.  Note that the PDF version is still available for free, as always.

If you ask “What about author proofs”.  Well, Kindle direct publishing (formerly Createspace) does have a mechanism for ordering author proofs, and if I lived in the USA, I’d use that.  However, for us poor second class Canadians, it costs just about as much to get an author proof with shipping, as just buying a copy.

What changed in this version

V0.3.5 (Dec 15, 2023)

  • Rewrite spherical polar section with the geometry first, not the coordinate representation, nor the CAS stuff.
  • Move the coordinates -> GA derivation to a problem.
  • New figure: sphericalPolarFig2.
  • update spherical polar figure1 with orientation of j.
  • Add to helpful formulas: Vector calculus identities.
  • Note about ambiguity of our curl notation.
  • Add some references to the d’Alembertian (wave equation) operator.
  • New section (chapter 2): Vector calculus identities.
  • Two (wedge) curl examples (vector field) to make things less abstract.
  • bivector field curl examples (problem.)
  • curl with polar form representation of gradient and field (problem.)
  • curl of 3D vector field example.

 

Multilanguage debugging in lldb: print call to function.

December 13, 2023 C/C++ development and debugging. , , , , , ,

There probably aren’t many people that care about debugging multiple languages, but I learned a new trick today that is worth making a note of, even if that note is for a future amnesiatic self.

Here’s a debug session where C code is calling COBOL, but in the COBOL frame, the language rules prohibit running print to show the results of a C function call (example: printf, strlen, strspn, …)

To make a function call in lldb, I used to go up the stack to a C language frame.  For example, if this was the COBOL code I was debugging:

(lldb) n
12/13/23 19:27:26 LTE14039I Opening LzMQZ connection. QMGR: MQZ1 MQZCONN: 0x7ff920625170 API: 0x7fed0008e0e0
Process 1673776 stopped
* thread #57, name = 'LZOCREG1', stop reason = step over
    frame #0: 0x00007ff9243b31f2 WINDC.NATIVE.LZPDS.A0116662(LTESVCXC).f3968a73`LTESVCXC at LTESVCXC.cbl:36:1
   33  
   34                  DISPLAY 'WSCHECK: "' WORK-VAR '"'
   35  
-> 36                 EXEC CICS LINK PROGRAM ('LTESVCXC')
   37                      COMMAREA(WORK-COMMAREA)
   38                      LENGTH   (LENGTH OF WORK-COMMAREA)
   39                 END-EXEC
(lldb) p &WORK-VAR
(*char [10]) $4 = 0x00007fadef810478
(lldb) p WORK-VAR
(char [10]) WORK-VAR = "STORISOK  "
(lldb) fr v -format x WORK-VAR
(char [10]) WORK-VAR = {
  [0] = 0xe2
  [1] = 0xe3
  [2] = 0xd6
  [3] = 0xd9
  [4] = 0xc9
  [5] = 0xe2
  [6] = 0xd6
  [7] = 0xd2
  [8] = 0x40
  [9] = 0x40
}

Aside: If you object to the use of a C address-of operator against a COBOL variable, that’s just because our debugger has C like & notational shorthand for the COBOL ‘ADDRESS OF …’, which is very useful.

If I want to run a C function against that COBOL WORKING-STORAGE variable, like strchr, to look for the address of the first EBCDIC space (0x40) in that string, I used to do it by going up the stack into a C frame, like so:

(lldb) up 2
frame #2: 0x00007ff9243b3f7e WINDC.NATIVE.LZPDS.A0116662(LTESVCXC).f3968a73`pgm_ltesvcxc + 382
WINDC.NATIVE.LZPDS.A0116662(LTESVCXC).f3968a73`pgm_ltesvcxc:
->  0x7ff9243b3f7e <+382>: jmp    0x7ff9243b3f88            ; <+392>
    0x7ff9243b3f80 <+384>: addq   $0x128, %rsp              ; imm = 0x128 
    0x7ff9243b3f87 <+391>: retq   
    0x7ff9243b3f88 <+392>: leaq   0x201039(%rip), %rdi
(lldb) print (char *)strchr(0x00007fadef810478, 0x40)
(char *) $6 = 0x00007fadef810480 "@@"

Sure enough, that space is found 8 bytes into the string, as expected. This is a very short string, and I could have seen that by inspection, but it’s just to illustrate that we can make calls to functions within the debugger, and they can even be functions that aren’t in the program or language that we are debugging.

I noticed today that ‘print’ is an alias for ‘expression –‘, and that expression takes a language option. This means that I can do cross language calls like this in any frame, provided I specify the language I want. Example:

(lldb) down 2
frame #0: 0x00007ff9243b31f2 WINDC.NATIVE.LZPDS.A0116662(LTESVCXC).f3968a73`LTESVCXC at LTESVCXC.cbl:36:1
   33  
   34                  DISPLAY 'WSCHECK: "' WORK-VAR '"'
   35  
-> 36                 EXEC CICS LINK PROGRAM ('LTESVCXC')
   37                      COMMAREA(WORK-COMMAREA)
   38                      LENGTH   (LENGTH OF WORK-COMMAREA)
   39                 END-EXEC
(lldb) expression -l c -- (char *)strchr(0x00007fadef810478, 0x40)
(char *) $7 = 0x00007fadef810480 "@@"

Ten points to me for learning yet another obscure debugger trick.

Some new tmux tricks (for me)

December 9, 2023 perl and general scripting hackery , ,

I’ve been using tmux since RHEL stopped shipping screen by default.  We now use Oracle Linux at work, not RHEL, and perhaps Oracle Linux ships screen, but I’m unlikely to switch back now.

I blundered upon a tmux book at the Toronto Public Library today.  It’s a little book, and most of what is interesting looking to me, is in the first couple chapters.  Here are some notes.

tmux sessions.

I normally want just one tmux session per machine, but tmux does have the capability of creating named sessions. I knew this, but hadn’t used it, and it’s spelled out nicely in the book.  Here are two bash aliases to exploit that feature:

alias tplay='tmux new -s play'
alias twork='tmux new -s work'

Here are two such sessions in action:

Notice the little [work] and [play] indicators in the bottom left corner of each screen session.  Nice!

Of course, I created a couple aliases to attach to my named sessions:

alias atplay='tmux at -t play'
alias atwork='tmux at -t work'

tmux window switching “menu”

The only thing screen feature that I miss is the menu of windows for the session, but I’ve gotten used to that limitation.

However, because tmux shows the active sessions at the bottom, you can use that as an built in session switching mechanism.

Do this with [Prefix] N, where N is the window number.  [Prefix] is the notation from the book that has the control sequence used for shortcuts.  For me, because I have this in my .tmux.conf

unbind C-b
set -g prefix ^N
bind n send-prefix

For me, [Prefix] is control-n (I don’t remember why I had ‘set -g’ and bind, which looks redundant to me.) So, I can do window switching with

control-n 1
control-n 2

Most of the time I use my F5, and F8 commmand key mappings to do window switching

bind-key -T root F5 select-window -p
bind-key -T root F6 select-window -l
bind-key -T root F8 select-window -n
bind-key -T root F9 new-window

but this ‘[Prefix] N’ gives me another way to do window switching.

However, what do you know, there is an equivalent to the old screen window selection mechanism, as I’ve just learned from the book:

[Prefix] w

This window switcher is actually really cool. It shows a preview of the window to be selected, and even shows the windows in both named sessions, so not only is it a window switcher, but also a session switcher!

Panes

I’d seen that tmux has support for window splitting, but have never used it.

Here are the commands, for new vertical split, new horizontal split, pane cycle, and pane navigate:

[Prefix] %

[Prefix] ”

[Prefix] o

[Prefix] Up-Arrow,Down-Arrow,Left-Arrow,Right-Arrow

Looks like the pane splitting is specific to the active window, so if you do a split pane, and have run a rename-window command, for example, using something like:

alias tre='tmux rename-window'
function tnd
{ 
    tmux rename-window `pwd | sed 's,.*/,,'`
}

then the split pane only shows up in that named window, like so:

Some random tricks

Detach: [Prefix] d

(I use command line ‘tmux det’)

Switch between horizontal and vertical panes: [Prefix] space

Create new window: [Prefix] c

(I use my F9 bind-key mapping)

 

Pane mapping bind-keys for adjusting window size:

tmux bind -r H resize-pane -L 5
tmux bind -r J resize-pane -D 5
tmux bind -r K resize-pane -U 5
tmux bind -r L resize-pane -R 5

(can also be put into the .tmux.conf without the leading tmux)

This allows you to resize panes. For example, perhaps you want to tail the output of something long running, or run top, or something else, but have the rest of the upper pane in action, something like: