clang

The C compiler is too forgiving! sizeof(variable_name+1) allowed?

April 28, 2022 C/C++ development and debugging. , , , ,

I carelessly passed:

sizeof(s.st_size+1)

to an allocator call, instead of:

s.st_size+1

and corrupted memory nicely.

What the hell would sizeof(variable+1) even mean, and why on earth would the compiler think that is anything close to valid? Both gcc and clang, each with -Wall, are completely quiet about this error!

Interesting z/OS (clang based) compiler release notes.

December 13, 2019 C/C++ development and debugging. , , , ,

The release notes for the latest z/OS C/C++ compiler are interesting.  When I was at IBM they were working on “clangtana”, a clang frontend melded with the legacy TOBY backend.  This really surprised me, but was consistent with the fact that the IBM compiler guys kept saying that they were continually losing their internal funding — that project was a clever way to do more with less resources.  I think they’d made the clangtana switch for zLinux by the time I left, with AIX to follow once they had resolved some ABI incompatibility issues.  At the time, I didn’t know (nor care) about the status of that project on z/OS.

Well, years later, it looks like they’ve now switched to a clang based compiler frontend on z/OS too.  This major change appears to have a number of side effects that I can imagine will be undesirable to existing mainframe customers:

  • Compiler now requires POSIX(ON) and Unix System Services.  No more compilation using JCL.
  • Compiler support for 31-bit applications appears to be dropped (64-bit only!)
  • Support for C, FASTLINK, and OS linkage conventions has been dropped (XPLINK only.)
  • Only ibm-1047 is supported for both source and runtime character set encoding.
  • C89 support appears to have been dropped.
  • Hex floating support has been dropped.
  • No decimal floating point support.
  • SIMD support isn’t implemented.
  • Metal C support has been dropped.

i.e. if you want C++14, you have to be willing to give up a lot to get it.  They must be using an older clang, because this “new” compiler doesn’t include C++17 support.  I’m surprised that they didn’t even manage multiple character set support for this first compiler release.

It is interesting that they’ve also dropped IPA and PDF support, and that the optimization options have changed.  Does that mean that they’ve actually not only dropped the old Montana frontend, but also gutted the whole backend, switching to clang exclusively?

using ltrace to dig into shared libraries

October 19, 2016 C/C++ development and debugging., clang/llvm , ,

I was trying to find where the clang compiler is writing out constant global data values, and didn’t manage to find it by code inspection. If I run ltrace (also tracing system calls), I see the point where the ELF object is written out:

std::string::compare(std::string const&) const(0x7ffc8983a190, 0x1e32e60, 7, 254) = 5
std::string::compare(std::string const&) const(0x1e32e60, 0x7ffc8983a190, 7, 254) = 0xfffffffb
std::string::compare(std::string const&) const(0x7ffc8983a190, 0x1e32e60, 7, 254) = 5
write@SYS(4, "\177ELF\002\001\001", 848)         = 848
lseek@SYS(4, 40, 0)                              = 40
write@SYS(4, "\220\001", 8)                      = 8
lseek@SYS(4, 848, 0)                             = 848
lseek@SYS(4, 60, 0)                              = 60
write@SYS(4, "\a", 2)                            = 2
lseek@SYS(4, 848, 0)                             = 848
std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()(0x1e2a2e0, 0x1e2a2e8, 0x1e27978, 0x1e27978) = 0
rt_sigprocmask@SYS(2, 0x7ffc8983bb58, 0x7ffc8983bad8, 8) = 0
close@SYS(4)                                     = 0
rt_sigprocmask@SYS(2, 0x7ffc8983bad8, 0, 8)      = 0

This is from running:

ltrace -S --demangle \
   ...

The -S is to display syscalls as well as library calls. To my suprise, this seems to show calls to libstdc++ library calls, but I’m not seeing much from clang itself, just:

clang::DiagnosticsEngine::DiagnosticsEngine
clang::driver::ToolChain::getTargetAndModeFromProgramName
llvm::cl::ExpandResponseFiles
llvm::EnablePrettyStackTrace
llvm::errs
llvm::install_fatal_error_handler
llvm::llvm_shutdown
llvm::PrettyStackTraceEntry::PrettyStackTraceEntry
llvm::PrettyStackTraceEntry::~PrettyStackTraceEntry
llvm::raw_ostream::preferred_buffer_size
llvm::raw_svector_ostream::write_impl
llvm::remove_fatal_error_handler
llvm::StringMapImpl::LookupBucketFor
llvm::StringMapImpl::RehashTable
llvm::sys::PrintStackTraceOnErrorSignal
llvm::sys::Process::FixupStandardFileDescriptors
llvm::sys::Process::GetArgumentVector
llvm::TimerGroup::printAll

There’s got to be a heck of a lot more that the compiler is doing!? It turns out that ltrace doesn’t seem to trace out all the library function calls that lie in shared libraries (I’m using a shared library + split dwarf build of clang). The default output was a bit deceptive since I saw some shared lib calls, in particular the there were std::… calls (from libstc++.so) in the ltrace output. My conclusion seems to be that the tool is lying by default.

This can be confirmed by explicitly asking to see the functions from a specific shared lib. For example, if I call ltrace as:

$ ltrace -S --demangle -e @libLLVMX86CodeGen.so \
/clang/be.b226a0a/bin/clang-3.9 \
-cc1 \
-triple \
x86_64-unknown-linux-gnu \
...

Now I get ~68K calls to libLLVMX86CodeGen.so functions that didn’t show up in the default ltrace output! The ltrace tool won’t show me these by default (although the man page seems to suggest that it should), but if I narrow down what I’m looking through to a single shared lib, at least I can now examine the function calls in that shared lib.

On the SONAME

Note that the @lib….so name has to match the SONAME.  For example if the shared libraries on disk were:

libLLVMX86CodeGen.so -> libLLVMX86CodeGen.so.3
libLLVMX86CodeGen.so.3 -> libLLVMX86CodeGen.so.3.9
libLLVMX86CodeGen.so.3.9 -> libLLVMX86CodeGen.so.3.9.0

$ objdump -x libLLVMX86CodeGen.so | grep SONAME

would give you the name to use.  This becomes relevant in clang 4.0 where the SONAME ends up with .so.4 instead of just .so (when building clang with shared libs instead of archive libs).

How to invoke the 2nd pass of the clang compiler manually

October 3, 2016 clang/llvm , , , , ,

Because the clang front end reexecs itself, breakpoints on the interesting parts of the clang front end don’t get hit by default. Here’s an example

$ cat g2
b llvm::Module::setDataLayout
b BackendConsumer::BackendConsumer
b llvm::TargetMachine::TargetMachine
b llvm::TargetMachine::createDataLayout
run -mbig-endian -m64 -c bytes.c -emit-llvm -o big.bc

$ gdb `which clang`
GNU gdb (GDB) Red Hat Enterprise Linux 7.9.1-19.lz.el7
...
(gdb) source g2
Breakpoint 1 at 0x2c04c3d: llvm::Module::setDataLayout. (2 locations)
Breakpoint 2 at 0x3d08870: file /source/llvm/lib/Target/TargetMachine.cpp, line 47.
Breakpoint 3 at 0x33108ca: file /source/llvm/include/llvm/Target/TargetMachine.h, line 133.
...
Detaching after vfork from child process 15795.
[Inferior 1 (process 15789) exited normally]

(The debugger finishes and exits, hitting none of the breakpoints)

One way to deal with this is to set the fork mode to child:

(gdb) set follow-fork-mode child

An alternate way of dealing with this is to use strace to collect the command line that clang invokes itself with. For example:

$ strace -f -s 1024 -v clang -mbig-endian -m64 big.bc -c 2>&1 | grep exec | tail -2 | head -1

This provides the command line options for the self invocation of clang

[pid  4650] execve("/usr/local/bin/clang-3.9", ["/usr/local/bin/clang-3.9", "-cc1", "-triple", "aarch64_be-unknown-linux-gnu", "-emit-obj", "-mrelax-all", "-disable-free", "-main-file-name", "big.bc", "-mrelocation-model", "static", "-mthread-model", "posix", "-mdisable-fp-elim", "-fmath-errno", "-masm-verbose", "-mconstructor-aliases", "-fuse-init-array", "-target-cpu", "generic", "-target-feature", "+neon", "-target-abi", "aapcs", "-dwarf-column-info", "-debugger-tuning=gdb", "-coverage-file", "/workspace/pass/run/big.bc", "-resource-dir", "/usr/local/bin/../lib/clang/3.9.0", "-fdebug-compilation-dir", "/workspace/pass/run", "-ferror-limit", "19", "-fmessage-length", "0", "-fallow-half-arguments-and-returns", "-fno-signed-char", "-fobjc-runtime=gcc", "-fdiagnostics-show-option", "-o", "big.o", "-x", "ir", "big.bc"],

With a bit of vim tweaking you can turn this into a command line that can be executed (or debugged) directly

/usr/local/bin/clang-3.9 -cc1 -triple aarch64_be-unknown-linux-gnu -emit-obj -mrelax-all -disable-free -main-file-name big.bc -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -fuse-init-array -target-cpu generic -target-feature +neon -target-abi aapcs -dwarf-column-info -debugger-tuning=gdb -coverage-file /workspace/pass/run/big.bc -resource-dir /usr/local/bin/../lib/clang/3.9.0 -fdebug-compilation-dir /workspace/pass/run -ferror-limit 19 -fmessage-length 0 -fallow-half-arguments-and-returns -fno-signed-char -fobjc-runtime=gcc -fdiagnostics-show-option -o big.o -x ir big.bc

Note that doing this also provides a mechanism to change the compiler triple manually, which is something that I wondered how to do (since clang documents -triple as an option, but seems to ignore it). For example, I’m able to able to change -triple aarch64_be to aarch64 and get little endian object code from bytecode prepared with -mbig-endian.

speeding up clang debug and builds

October 2, 2016 clang/llvm , , , , , , ,

I found the default static library configuration of clang slow to rebuild, so I started building it with in shared mode. That loaded pretty slow in gdb, so I went looking for how to enable split dwarf, and found a nice little presentation on how to speed up clang builds.

There’s a followup blog post with some speed up conclusions.

A failure of that blog post is actually listing the cmake commands required to build with all these tricks. Using all these tricks listed there, I’m now trying the following:

mkdir -p ~/freeware
cd ~/freeware

git clone git://sourceware.org/git/binutils-gdb.git
cd binutils-gdb
./configure --prefix=$HOME/local/binutils.gold --enable-gold=default
make 
make install

cd ..
git clone git://github.com/ninja-build/ninja.git 
cd ninja
./configure.py --bootstrap
mkdir -p ~/local/ninja/bin/
cp ninja ~/local/ninja/bin/

With ninja in my PATH, I can now build clang with:

CC=clang CXX=clang++ \
cmake -G Ninja \
../llvm \
-DLLVM_USE_SPLIT_DWARF=TRUE \
-DLLVM_ENABLE_ASSERTIONS=TRUE \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_INSTALL_PREFIX=$HOME/clang39.be \
-DCMAKE_SHARED_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index' \
-DCMAKE_EXE_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index' \
-DBUILD_SHARED_LIBS=true \
-DLLVM_TARGETS_TO_BUILD=X86 \
2>&1 | tee o

ninja

ninja install

This does build way faster, both for full builds and incremental builds.

Build tree size

Dynamic libraries: 4.4 Gb. Static libraries: 19.8Gb.

Installed size

Dynamic libraries: 0.7 Gb. Static libraries: 14.7Gb.

Results: full build time.

Static libraries, non-ninja, all backends:

real    51m6.494s
user    160m47.027s
sys     8m49.429s

Dynamic libraries, ninja, split dwarf, x86 backend only:

real    26m19.360s
user    86m11.477s
sys     3m14.478s

Results: incremental build. touch lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp.

Static libraries, non-ninja, all backends:

real    2m17.709s
user    6m8.648s
sys     0m28.594s

Dynamic libraries, ninja, split dwarf, x86 backend only:

real    0m3.245s
user    0m6.104s
sys     0m0.802s

make install times

make:

real    2m6.353s
user    0m7.827s
sys     0m15.316s

ninja:

real    0m2.138s
user    0m0.420s
sys     0m0.831s

The time for rerunning a sharedlib-config ‘ninja install’ is even faster!

Results: time for gdb, b main, run, quit

Static libraries:

real    0m45.904s
user    0m32.376s
sys     0m1.787s

Dynamic libraries, with split dwarf:

real    0m44.440s
user    0m37.096s
sys     0m1.067s

This one isn’t what I would have expected. The initial gdb load time for the split-dwarf exe is almost instantaneous, however it still takes a long time to break in main and continue to that point. I guess that we are taking the hit for a lot of symbol lookup at that point, so it comes out as a wash.

Thinking about this, I noticed that the clang make system doesn’t seem to add ‘-Wl,-gdb-index’ to the link step along with the addition of -gsplit-dwarf to the compilation command line. I thought that was required to get all the deferred symbol table lookup?

Attempting to do so, I found that the insertion of an alternate linker in my PATH wasn’t enough to get clang to use it. Adding –Wl,–gdb-index into the link flags caused complaints from /usr/bin/ld! The cmake magic required was:

-DCMAKE_SHARED_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index' \
-DCMAKE_EXE_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index' \

This is in the first cmake invocation flags above, but wasn’t used for my initial 45s gdb+clang time measurements. With –gdb-index, the time for the gdb b-main, run, quit sequence is now reduced to:

real    0m10.268s
user    0m3.623s
sys     0m0.429s

A 4x reduction, which is quite nice!