I wanted to display some internal state in the IBM-1141 codepage, but didn’t know the charset name that gdb expects. I knew that EBCDIC-US could be used for IBM-1047, but gdb didn’t like ibm-1141:
(gdb) set target-charset EBCDIC-US
(gdb) p (char *)0x7ffbb7b58088
$2 = 0x7ffbb7b58088 "{Jim ;012}", ' ' <repeats 104 times>
(gdb) set target-charset ibm-1141
Undefined item: "ibm-1141".
I either didn’t know or had forgotten that we can get a list of the supported codepages. The help shows this:
(gdb) help set target-charset
Set the target character set.
The `target character set' is the one used by the program being debugged.
GDB translates characters and strings between the host and target
character sets as needed.
To see a list of the character sets GDB supports, type `set target-charset'<TAB>
I had to hit tab twice, but after doing so, I see:
(gdb) set target-charset
Display all 200 possibilities? (y or n)
1026              866              ARABIC7    CP-HU   CP1129  CP1158  CP1371  CP4517  CP856    CP903
1046              866NAV           ARMSCII-8  CP037   CP1130  CP1160  CP1388  CP4899  CP857    CP904
1047              869              ASCII      CP038   CP1132  CP1161  CP1390  CP4909  CP860    CP905
10646-1:1993      874              ASMO-708   CP1004  CP1133  CP1162  CP1399  CP4971  CP861    CP912
10646-1:1993/UCS4 8859_1           ASMO_449   CP1008  CP1137  CP1163  CP273   CP500   CP862    CP915
437               8859_2           BALTIC     CP1025  CP1140  CP1164  CP274   CP5347  CP863    CP916
500               8859_3           BIG-5      CP1026  CP1141  CP1166  CP275   CP737   CP864    CP918
500V1             8859_4           BIG-FIVE   CP1046  CP1142  CP1167  CP278   CP770   CP865    CP920
850               8859_5           BIG5       CP1047  CP1143  CP1250  CP280   CP771   CP866    CP921
851               8859_6           BIG5-HKSCS CP1070  CP1144  CP1251  CP281   CP772   CP866NAV CP922
852               8859_7           BIG5HKSCS  CP1079  CP1145  CP1252  CP282   CP773   CP868    CP930
855               8859_8           BIGFIVE    CP1081  CP1146  CP1253  CP284   CP774   CP869    CP932
856               8859_9           BRF        CP1084  CP1147  CP1254  CP285   CP775   CP870    CP933
857               904              BS_4730    CP1089  CP1148  CP1255  CP290   CP803   CP871    CP935
860               ANSI_X3.110      CA         CP1097  CP1149  CP1256  CP297   CP813   CP874    CP936
861               ANSI_X3.110-1983 CN         CP1112  CP1153  CP1257  CP367   CP819   CP875    CP937
862               ANSI_X3.4        CN-BIG5    CP1122  CP1154  CP1258  CP420   CP850   CP880    CP939
863               ANSI_X3.4-1968   CN-GB      CP1123  CP1155  CP1282  CP423   CP851   CP891    CP949
864               ANSI_X3.4-1986   CP-AR      CP1124  CP1156  CP1361  CP424   CP852   CP901    CP950
865               ARABIC           CP-GR      CP1125  CP1157  CP1364  CP437   CP855   CP902    auto
*** List may be truncated, max-completions reached. ***
There’s my ibm-1141 in there, but masquerading as CP1141, so I’m able to view my data in that codepage, and look up the values of characters of interest in 1141:
(gdb) set target-charset CP1141
(gdb) p (char *)0x7ffbb7b58088
$3 = 0x7ffbb7b58088 "äJim ;012ü", ' ' <repeats 104 times>
(gdb) p /x '{'
$4 = 0x43
(gdb) p /x '}
Unmatched single quote.
(gdb) p /x '}'
$5 = 0xdc
(gdb) p /x *(char *)0x7ffbb7b58088
$6 = 0xc0
I’m able to conclude that the buffer in question appears to be in CP1047, not CP1141: its first byte is 0xc0, but the first character is supposed to be ‘{’, which CP1141 encodes as 0x43.
As your T.A., I have to punish you …
December 19, 2020 C/C++ development and debugging. grading comments, horrible code, macros, token pasting
Back in university, I had to implement a reverse polish notation calculator in a software engineering class. Overall the assignment was pretty stupid, and I entertained myself by writing a very compact implementation. It worked perfectly, but I got a 25/40 (62.5%) grade on it. That mark was well deserved, although I did not think so at the time.
The grading remarks were actually some of the best feedback that I ever received, and also really funny to boot. I no longer remember the name of that TA, but I took his advice to heart, and kept his grading remarks on my wall in my IBM office for years. That served as an excellent reminder not to write overcomplicated code.
Today, I found those remarks again, and am posting them for posterity. Enjoy!
Transcription for easy reading
Reflection.
The only part of this feedback that I would dispute was the comment about the string class. That was actually a pretty good string implementation. I didn’t write it because I was a vicious mouse hunter, but because I hit a porting issue with pre-std:: C++. In particular, we had two sets of Solaris machines available to us, and I was using one that had a compiler that included a nice C++ string class. So, naturally I used it. For submission, our code had to compile and run on a different Solaris machine, and lo and behold, the string class that all my code was based on was not available.
What I should have done (20/20 hindsight) was throw out my horrendous code, and start over from scratch. However, I took the more fun approach, and wrote my own string class so that my code would compile on either machine.
Amusingly, when I worked on IBM LUW, there was a part of the query optimizer code that seemed to have learned all its tricks from the ugly macros and token pasting that I did in this assignment. It was truly gross, but there was 10000x more of it than my assignment. Having been thoroughly punished for my atrocities, I easily recognized this code for the evil it was. The only way that you could debug that optimizer code was by running it through the preprocessor, cutting and pasting the results, and filtering that cut and paste through something like cindent (these days you would probably use clang-format). That code was brutal, and I always wished that its authors had had the good luck of having a TA like mine. That code is probably still part of LUW, terrorizing developers. Apparently the justification for it was that it was originally written by an IBM researcher using templates, but templates couldn’t be used in DB2 code because we didn’t have compilers on all platforms that supported them at the time.
I have used token pasting macros very judiciously and sparingly in the 26 years since I originally used them in this assignment, and I do think that there are a few good uses for that sort of generative code. However, if you do have to write that sort of code, I think it’s better to write Perl (or some other language) code that generates understandable code that can be debugged, instead of relying on token pasting.