C/C++ development and debugging.

Visualizing the 3D Mandelbrot set.

February 1, 2021 C/C++ development and debugging.

In “Geometric Algebra for Computer Science” is a fractal problem based on a vectorization of the Mandelbrot equation, which allows for generalization to \( N \) dimensions.

I finally got around to trying the 3D variation of this problem.  Recall that the Mandelbrot set is a visualization of the iteration of the following complex number equation:
\begin{equation}
z \rightarrow z^2 + c,
\end{equation}
where the idea is that \( z \) starts as the constant \( c \), and if this sequence remains bounded, then the point \( c \) is in the set.

The idea in the problem is that this equation can be cast as a vector equation, instead of a complex number equation. All we have to do is set \( z = \Be_1 \Bx \), where \( \Be_1 \) is the x-axis unit vector, and \( \Bx \) is an \(\mathbb{R}^2\) vector. Expanding in coordinates, with \( \Bx = \Be_1 x + \Be_2 y \), we have
\begin{equation}
z
= \Be_1 \lr{ \Be_1 x + \Be_2 y }
= x + \Be_1 \Be_2 y,
\end{equation}
but since the bivector \( \Be_1 \Be_2 \) squares to \( -1 \), we can represent complex numbers as even grade multivectors. Making the same substitution in the Mandelbrot equation, we have
\begin{equation}
\Be_1 \Bx \rightarrow \Be_1 \Bx \Be_1 \Bx + \Be_1 \Bc,
\end{equation}
or
\begin{equation}
\Bx \rightarrow \Bx \Be_1 \Bx + \Bc.
\end{equation}
Voilà! This is a vector version of the Mandelbrot equation, and we can use it in 2 or 3 or N dimensions, as desired.  Observe that despite all the vector products above, the result is still a vector, since \( \Bx \Be_1 \Bx \) is the geometric algebra form of a reflection of \( \Be_1 \) about the direction of \( \Bx \), scaled by \( \Bx^2 \).
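The iteration can be sketched without any geometric algebra library by using the identity \( \Bx \Be_1 \Bx = 2 (\Bx \cdot \Be_1) \Bx - \Bx^2 \Be_1 \), which follows from \( \Be_1 \Bx = 2 \Be_1 \cdot \Bx - \Bx \Be_1 \). This is only a minimal sketch, not the code used for the images in this post: the names `Vec3`, `step`, and `iterationsBeforeEscape` are made up for illustration, and the escape radius of 2 is an assumption carried over from the usual 2D test.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;

// One step of x -> x e1 x + c, expanded in coordinates using
// x e1 x = 2 (x . e1) x - |x|^2 e1.
Vec3 step(const Vec3& x, const Vec3& c) {
    const double n2 = x[0] * x[0] + x[1] * x[1] + x[2] * x[2];
    const double s = 2.0 * x[0];
    return {s * x[0] - n2 + c[0], s * x[1] + c[1], s * x[2] + c[2]};
}

// Counts the steps taken before |x| exceeds the (assumed) escape radius 2.
// Returns maxIter if the orbit is still inside that radius after maxIter steps.
int iterationsBeforeEscape(const Vec3& c, int maxIter) {
    assert(maxIter > 0);
    Vec3 x = c;  // the sequence starts at c, as in the text above
    for (int i = 0; i < maxIter; ++i) {
        const double n2 = x[0] * x[0] + x[1] * x[1] + x[2] * x[2];
        if (n2 > 4.0) {
            return i;
        }
        x = step(x, c);
    }
    return maxIter;
}
```

With the third coordinate set to zero, `step` reduces to the familiar complex square: the vector (1, 2, 0) maps to (-3, 4, 0), matching \( (1 + 2i)^2 = -3 + 4i \).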

The problem with generalizing this from 2D is really one of visualization. How can we visualize a 3D Mandelbrot set? One idea I had was to use ray tracing, so that only the points on the surface from the desired viewpoint need be evaluated. I don’t think I’ve ever written a ray tracer, but I thought that there has got to be a quick and dirty way to do this.  Also, figuring out how to make a ray tracer interact with an irregular surface like this is probably nontrivial!

What I did instead was a brute force evaluation of all the points in the upper half plane around the origin, one slice of the set at a time. Here’s the result

Code for the visualization can be found on GitHub. I’ve used Pauli matrices to do the evaluation, which is actually pretty quick (but slower than plain std::complex<double> evaluation), and the C++ ImageMagick API to save individual png files for the slices. There are better coloring schemes for the Mandelbrot set, and if I scrounge up some more time, I may try one of those instead.
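For readers who haven’t seen the Pauli matrix trick, here is a rough sketch of what such an evaluation can look like. It is not the code from the repository: the names `Vector3`, `Mat2`, `toPauli`, `fromPauli`, `multiply`, `pauliStep`, and `checkAgainstComplexSquare` are all hypothetical. The idea is to represent \( \Be_1, \Be_2, \Be_3 \) by the matrices \( \sigma_1, \sigma_2, \sigma_3 \), so that the geometric product becomes a 2x2 complex matrix product.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>

using C = std::complex<double>;
using Mat2 = std::array<std::array<C, 2>, 2>;
using Vector3 = std::array<double, 3>;

// x sigma_1 + y sigma_2 + z sigma_3 = [[z, x - iy], [x + iy, -z]]
Mat2 toPauli(const Vector3& v) {
    Mat2 m{};
    m[0][0] = C{v[2], 0.0};
    m[0][1] = C{v[0], -v[1]};
    m[1][0] = C{v[0], v[1]};
    m[1][1] = C{-v[2], 0.0};
    return m;
}

// Reads the vector coordinates back out of a matrix of the form above.
Vector3 fromPauli(const Mat2& m) {
    return {m[1][0].real(), m[1][0].imag(), m[0][0].real()};
}

// Plain 2x2 complex matrix product.
Mat2 multiply(const Mat2& a, const Mat2& b) {
    Mat2 r{};
    for (std::size_t i = 0; i < 2; ++i)
        for (std::size_t j = 0; j < 2; ++j)
            for (std::size_t k = 0; k < 2; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// One step of x -> x e1 x + c, evaluated as a matrix product.
Vector3 pauliStep(const Vector3& x, const Vector3& c) {
    const Mat2 X = toPauli(x);
    const Mat2 e1 = toPauli(Vector3{1.0, 0.0, 0.0});
    const Vector3 v = fromPauli(multiply(multiply(X, e1), X));
    return {v[0] + c[0], v[1] + c[1], v[2] + c[2]};
}

// Sanity check: in the x-y plane the step matches (1 + 2i)^2 = -3 + 4i.
void checkAgainstComplexSquare() {
    const Vector3 r = pauliStep(Vector3{1.0, 2.0, 0.0}, Vector3{0.0, 0.0, 0.0});
    assert(r[0] == -3.0 && r[1] == 4.0 && r[2] == 0.0);
}
```

The matrix form does more arithmetic per step than a coordinate expansion, which is consistent with it being slower than a std::complex<double> loop, but it has the advantage that the code reads like the vector equation.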

As your T.A., I have to punish you …

December 19, 2020 C/C++ development and debugging.

Back in university, I had to implement a reverse polish notation calculator in a software engineering class.  Overall the assignment was pretty stupid, and I entertained myself by writing a very compact implementation.  It worked perfectly, but I got a 25/40 (62.5%) grade on it.  That mark was well deserved, although I did not think so at the time.

The grading remarks were actually some of the best feedback that I ever received, and also really funny to boot.  I no longer remember the name of this TA, but I took his advice to heart, and kept his grading remarks on the wall of my IBM office for years.  That served as an excellent reminder not to write overcomplicated code.

Today, I found those remarks again, and am posting them for posterity.  Enjoy!

Transcription for easy reading

  • It is obvious that you are a very clever person, but this program is like a big puzzle, and in understanding it, I appreciated it and enjoyed it, because of your cleverness. However much I enjoyed it, it is none the less a very poorly designed program.
  • A program should be constructed in the easiest and simplest to understand manner because when you construct very large programs the “complexity” of them will increase greatly.
  • A program should not be an intricate puzzle, where you show off how clever you are.
  • Your string class is an elephant gun trying to kill a mouse.
  • macros Build_binary_op and Binary_op are the worst examples of programming style I have ever seen in my entire life!  Very clever, but a cardinal sin of programming style.
  • Your binary_expr constructor does all the computation.  Not good style.
  • Your “expr” class is a baroque mess.
  • Although I enjoyed your program, Never write a program like this in your life again.  As your T.A., I have to punish you so that you do not develop bad habits in the future.  I hate to do it, but I can only give you 25/40 for this “clever puzzle”.

Reflection.

The only part of this feedback that I would dispute was the comment about the string class.  That was actually a pretty good string implementation.  I didn’t write it because I was a vicious mouse hunter, but because I hit a porting issue with pre-std:: C++.  In particular, we had two sets of Solaris machines available to us, and I was using one that had a compiler that included a nice C++ string class.  So, naturally I used it.  For submission, our code had to compile and run on a different Solaris machine, and lo and behold, the string class that all my code was based on was not available.

What I should have done (20/20 hindsight) was throw out my horrendous code and start over from scratch.  However, I took the more fun approach, and wrote my own string class so that my code would compile on either machine.

Amusingly, when I worked on IBM LUW, there was a part of the query optimizer code that seemed to have learned all its tricks from the ugly macros and token pasting that I did in this assignment.  It was truly gross, but there was 10000x more of it than in my assignment.  Having been thoroughly punished for my atrocities, I easily recognized this code for the evil it was.  The only way that you could debug that optimizer code was by running it through the preprocessor, cutting and pasting the results, and filtering that cut and paste through something like cindent (these days you would probably use clang-format.)  That code was brutal, and I always wished that its authors had had the good luck of having a TA like mine.  That code is probably still part of LUW, terrorizing developers.  Apparently the justification for it was that it was originally written by an IBM researcher using templates, but templates couldn’t be used in DB2 code because we didn’t have compilers on all platforms that supported them at the time.

I have used token pasting macros very judiciously and sparingly in the 26 years since I originally used them in this assignment, and I do think that there are a few good uses for that sort of generative code.  However, if you do have to write that sort of code, I think it’s better to write perl (or some other language) code that generates understandable code that can be debugged, instead of relying on token pasting.
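For anybody who hasn’t run into token pasting, here’s a tiny made-up illustration of the technique. This is not the code from my assignment, nor anything from LUW: the macro name `DEFINE_BINARY_OP` and the generated `apply_*` functions are hypothetical.

```cpp
// Hypothetical macro: ## pastes `name` onto the prefix apply_, so each use
// stamps out one function per operator.
#define DEFINE_BINARY_OP(name, op) \
    constexpr double apply_##name(double a, double b) { return a op b; }

DEFINE_BINARY_OP(add, +)
DEFINE_BINARY_OP(sub, -)
DEFINE_BINARY_OP(mul, *)
DEFINE_BINARY_OP(div, /)

// After preprocessing, the first use above becomes:
//   constexpr double apply_add(double a, double b) { return a + b; }
// Only this expanded form is something you can read and step through.
```

Note that nothing named `apply_add` appears anywhere in the source text, so searching for its definition finds nothing. A script that emits the expanded functions into a real source file avoids that problem.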

Listing the code pages for gdb ‘set target-charset’

August 21, 2020 C/C++ development and debugging.

I wanted to display some internal state as an IBM-1141 codepage, but didn’t know the name to use.  I knew that EBCDIC-US could be used for IBM-1047, but gdb didn’t like ibm-1141:

(gdb) set target-charset EBCDIC-US
(gdb) p (char *)0x7ffbb7b58088
$2 = 0x7ffbb7b58088 "{Jim       ;012}", ' ' <repeats 104 times>
(gdb) set target-charset ibm-1141
Undefined item: "ibm-1141".

I either didn’t know or had forgotten that we can get a list of the supported codepages. The help shows this:

(gdb) help set target-charset
Set the target character set.
The `target character set' is the one used by the program being debugged.
GDB translates characters and strings between the host and target
character sets as needed.
To see a list of the character sets GDB supports, type `set target-charset'<TAB>

I had to hit tab twice, but after doing so, I see:

(gdb) set target-charset 
Display all 200 possibilities? (y or n)
1026               866                ARABIC7            CP-HU              CP1129             CP1158             CP1371             CP4517             CP856              CP903
1046               866NAV             ARMSCII-8          CP037              CP1130             CP1160             CP1388             CP4899             CP857              CP904
1047               869                ASCII              CP038              CP1132             CP1161             CP1390             CP4909             CP860              CP905
10646-1:1993       874                ASMO-708           CP1004             CP1133             CP1162             CP1399             CP4971             CP861              CP912
10646-1:1993/UCS4  8859_1             ASMO_449           CP1008             CP1137             CP1163             CP273              CP500              CP862              CP915
437                8859_2             BALTIC             CP1025             CP1140             CP1164             CP274              CP5347             CP863              CP916
500                8859_3             BIG-5              CP1026             CP1141             CP1166             CP275              CP737              CP864              CP918
500V1              8859_4             BIG-FIVE           CP1046             CP1142             CP1167             CP278              CP770              CP865              CP920
850                8859_5             BIG5               CP1047             CP1143             CP1250             CP280              CP771              CP866              CP921
851                8859_6             BIG5-HKSCS         CP1070             CP1144             CP1251             CP281              CP772              CP866NAV           CP922
852                8859_7             BIG5HKSCS          CP1079             CP1145             CP1252             CP282              CP773              CP868              CP930
855                8859_8             BIGFIVE            CP1081             CP1146             CP1253             CP284              CP774              CP869              CP932
856                8859_9             BRF                CP1084             CP1147             CP1254             CP285              CP775              CP870              CP933
857                904                BS_4730            CP1089             CP1148             CP1255             CP290              CP803              CP871              CP935
860                ANSI_X3.110        CA                 CP1097             CP1149             CP1256             CP297              CP813              CP874              CP936
861                ANSI_X3.110-1983   CN                 CP1112             CP1153             CP1257             CP367              CP819              CP875              CP937
862                ANSI_X3.4          CN-BIG5            CP1122             CP1154             CP1258             CP420              CP850              CP880              CP939
863                ANSI_X3.4-1968     CN-GB              CP1123             CP1155             CP1282             CP423              CP851              CP891              CP949
864                ANSI_X3.4-1986     CP-AR              CP1124             CP1156             CP1361             CP424              CP852              CP901              CP950
865                ARABIC             CP-GR              CP1125             CP1157             CP1364             CP437              CP855              CP902              auto
*** List may be truncated, max-completions reached. ***

There’s my ibm-1141 in there, but masquerading as CP1141, so I’m able to view my data in that codepage, and look up the value of characters of interest in 1141:

(gdb) set target-charset CP1141
(gdb) p (char *)0x7ffbb7b58088
$3 = 0x7ffbb7b58088 "äJim       ;012ü", ' ' <repeats 104 times>
(gdb) p /x '{'
$4 = 0x43
(gdb) p /x '}
Unmatched single quote.
(gdb) p /x '}'
$5 = 0xdc
(gdb) p /x *(char *)0x7ffbb7b58088
$6 = 0xc0

I’m able to conclude that the buffer in question appears to be in CP1047, not CP1141 (the first character, which is supposed to be ‘{’, doesn’t have the CP1141 value of ‘{’).
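The same check is easy to script. Here’s a small sketch, assuming the only two candidates are the codepages above: the 0x43 value is what gdb printed for ‘{’ in CP1141, and 0xC0 is the first byte of the buffer, which displayed as ‘{’ with EBCDIC-US.  The names `BraceByte`, `kCandidates`, and `guessCodePage` are made up for this example.

```cpp
#include <cstdint>
#include <string>

// The byte that encodes '{' in each candidate codepage.
struct BraceByte {
    const char* name;
    std::uint8_t openBrace;
};

// 0x43 comes from the gdb output above (p /x '{' with CP1141 selected).
// 0xC0 is the buffer's first byte, which displayed as '{' with EBCDIC-US.
constexpr BraceByte kCandidates[] = {
    {"CP1047", 0xC0},
    {"CP1141", 0x43},
};

// Guess the codepage of a buffer that is known to start with '{'.
std::string guessCodePage(std::uint8_t firstByte) {
    for (const BraceByte& candidate : kCandidates) {
        if (firstByte == candidate.openBrace) {
            return candidate.name;
        }
    }
    return "unknown";
}
```

Feeding it the 0xc0 that gdb printed for the first byte of my buffer gives CP1047, which matches the conclusion above.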

Splitting the last git commit into two

April 23, 2020 C/C++ development and debugging.

In the blog post, Split a commit in two with Git, Emmanuel provides a super clear explanation of how to split an old commit into multiple commits, each with a subset of the files initially committed.

It took me a while before I could figure out how to apply this to the very last commit.  Here’s the required git magic:

git log -n 1 > m
git reset HEAD^
git add ...
git commit -m "First part"
git add ...
git commit -m "Second part"

The difference is really just that the first and last rebase steps are skipped (don’t start an interactive rebase, and don’t continue that rebase when done.) This was probably obvious to the author of the more general instructions.

Note that before resetting HEAD to the previous commit, I collect the current commit message, under the assumption that portions of it will be used in either of the two (or more) new commit messages.  If you don’t do that, you can fish it out of your history by looking at ‘git reflog’ to see what the message was before mucking around with HEAD.

Interesting z/OS (clang based) compiler release notes.

December 13, 2019 C/C++ development and debugging.

The release notes for the latest z/OS C/C++ compiler are interesting.  When I was at IBM they were working on “clangtana”, a clang frontend melded with the legacy TOBY backend.  This really surprised me, but was consistent with the fact that the IBM compiler guys kept saying that they were continually losing their internal funding, and that project was a clever way to do more with fewer resources.  I think they’d made the clangtana switch for zLinux by the time I left, with AIX to follow once they had resolved some ABI incompatibility issues.  At the time, I didn’t know (nor care) about the status of that project on z/OS.

Well, years later, it looks like they’ve now switched to a clang based compiler frontend on z/OS too.  This major change appears to have a number of side effects that I can imagine will be undesirable to existing mainframe customers:

  • Compiler now requires POSIX(ON) and Unix System Services.  No more compilation using JCL.
  • Compiler support for 31-bit applications appears to be dropped (64-bit only!)
  • Support for C, FASTLINK, and OS linkage conventions has been dropped (XPLINK only.)
  • Only ibm-1047 is supported for both source and runtime character set encoding.
  • C89 support appears to have been dropped.
  • Hex floating support has been dropped.
  • No decimal floating point support.
  • SIMD support isn’t implemented.
  • Metal C support has been dropped.

i.e. if you want C++14, you have to be willing to give up a lot to get it.  They must be using an older clang, because this “new” compiler doesn’t include C++17 support.  I’m surprised that they didn’t even manage multiple character set support for this first compiler release.

It is interesting that they’ve also dropped IPA and PDF support, and that the optimization options have changed.  Does that mean that they’ve actually not only dropped the old Montana frontend, but also gutted the whole backend, switching to clang exclusively?