## using ltrace to dig into shared libraries

I was trying to find where the clang compiler is writing out constant global data values, and didn’t manage to find it by code inspection. If I run ltrace (also tracing system calls), I see the point where the ELF object is written out:

std::string::compare(std::string const&) const(0x7ffc8983a190, 0x1e32e60, 7, 254) = 5
std::string::compare(std::string const&) const(0x1e32e60, 0x7ffc8983a190, 7, 254) = 0xfffffffb
std::string::compare(std::string const&) const(0x7ffc8983a190, 0x1e32e60, 7, 254) = 5
write@SYS(4, "\177ELF\002\001\001", 848)         = 848
lseek@SYS(4, 40, 0)                              = 40
write@SYS(4, "\220\001", 8)                      = 8
lseek@SYS(4, 848, 0)                             = 848
lseek@SYS(4, 60, 0)                              = 60
write@SYS(4, "\a", 2)                            = 2
lseek@SYS(4, 848, 0)                             = 848
std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()(0x1e2a2e0, 0x1e2a2e8, 0x1e27978, 0x1e27978) = 0
close@SYS(4)                                     = 0


This is from running:

ltrace -S --demangle \
...


The -S is to display syscalls as well as library calls. To my surprise, this shows calls into libstdc++, but I’m not seeing much from clang itself, just:

clang::DiagnosticsEngine::DiagnosticsEngine
clang::driver::ToolChain::getTargetAndModeFromProgramName
llvm::cl::ExpandResponseFiles
llvm::EnablePrettyStackTrace
llvm::errs
llvm::install_fatal_error_handler
llvm::llvm_shutdown
llvm::PrettyStackTraceEntry::PrettyStackTraceEntry
llvm::PrettyStackTraceEntry::~PrettyStackTraceEntry
llvm::raw_ostream::preferred_buffer_size
llvm::raw_svector_ostream::write_impl
llvm::remove_fatal_error_handler
llvm::StringMapImpl::LookupBucketFor
llvm::StringMapImpl::RehashTable
llvm::sys::PrintStackTraceOnErrorSignal
llvm::sys::Process::FixupStandardFileDescriptors
llvm::sys::Process::GetArgumentVector
llvm::TimerGroup::printAll


There’s got to be a heck of a lot more that the compiler is doing!? It turns out that ltrace doesn’t trace all the function calls that land in shared libraries (I’m using a shared library + split dwarf build of clang). The default output was a bit deceptive since I saw some shared lib calls, in particular the std::… calls (from libstdc++.so). My conclusion is that the tool is lying by default.

This can be confirmed by explicitly asking to see the functions from a specific shared lib. For example, if I call ltrace as:

$ ltrace -S --demangle -e @libLLVMX86CodeGen.so \
/clang/be.b226a0a/bin/clang-3.9 \
-cc1 \
-triple \
x86_64-unknown-linux-gnu \
...

Now I get ~68K calls to libLLVMX86CodeGen.so functions that didn’t show up in the default ltrace output! The ltrace tool won’t show me these by default (although the man page seems to suggest that it should), but if I narrow down what I’m looking through to a single shared lib, at least I can now examine the function calls in that shared lib.

## On the SONAME

Note that the @lib….so name has to match the SONAME. For example if the shared libraries on disk were:

libLLVMX86CodeGen.so -> libLLVMX86CodeGen.so.3
libLLVMX86CodeGen.so.3 -> libLLVMX86CodeGen.so.3.9
libLLVMX86CodeGen.so.3.9 -> libLLVMX86CodeGen.so.3.9.0

$ objdump -x libLLVMX86CodeGen.so | grep SONAME

would give you the name to use.  This becomes relevant in clang 4.0 where the SONAME ends up with .so.4 instead of just .so (when building clang with shared libs instead of archive libs).

## How to invoke the 2nd pass of the clang compiler manually

October 3, 2016

Because the clang front end reexecs itself, breakpoints on the interesting parts of the clang front end don’t get hit by default. Here’s an example:

$ cat g2
b llvm::Module::setDataLayout
b BackendConsumer::BackendConsumer
b llvm::TargetMachine::TargetMachine
b llvm::TargetMachine::createDataLayout
run -mbig-endian -m64 -c bytes.c -emit-llvm -o big.bc

$ gdb $(which clang)
GNU gdb (GDB) Red Hat Enterprise Linux 7.9.1-19.lz.el7
...
(gdb) source g2
Breakpoint 1 at 0x2c04c3d: llvm::Module::setDataLayout. (2 locations)
Breakpoint 2 at 0x3d08870: file /source/llvm/lib/Target/TargetMachine.cpp, line 47.
Breakpoint 3 at 0x33108ca: file /source/llvm/include/llvm/Target/TargetMachine.h, line 133.
...
Detaching after vfork from child process 15795.
[Inferior 1 (process 15789) exited normally]


(The debugger finishes and exits, hitting none of the breakpoints)
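The vfork line in the gdb output hints at what is going on: the driver process spawns a child, and that child is the -cc1 compiler where the breakpoints would be hit. Here is a minimal sketch of that parent/child pattern, assuming plain fork and execvp. The run_in_child helper is hypothetical, not code from clang, and the real driver's process launching code differs.

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not clang code): run the given command in a child
// process and return its exit status, or -1 if the child could not be
// created, was killed by a signal, or no command was supplied.
int run_in_child( const std::vector<std::string> & args )
{
   if ( args.empty() )
   {
      return -1 ;
   }

   // execvp wants a null terminated array of char pointers.
   std::vector<char *> argv ;
   for ( const std::string & a : args )
   {
      argv.push_back( const_cast<char *>( a.c_str() ) ) ;
   }
   argv.push_back( nullptr ) ;

   pid_t pid = fork() ;
   if ( pid < 0 )
   {
      return -1 ;
   }

   if ( pid == 0 )
   {
      // Child: this is the process that does the interesting work.
      execvp( argv[0], argv.data() ) ;
      _exit( 127 ) ; // only reached if the exec failed
   }

   // Parent: this is the process a debugger stays attached to by default.
   int status = 0 ;
   if ( waitpid( pid, &status, 0 ) < 0 )
   {
      return -1 ;
   }

   return WIFEXITED( status ) ? WEXITSTATUS( status ) : -1 ;
}
```

With gdb's default follow-fork-mode of parent, the debugger stays with the process that called fork, so breakpoints that only the child would reach never fire.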

One way to deal with this is to set the fork mode to child:

(gdb) set follow-fork-mode child


An alternate way of dealing with this is to use strace to collect the command line that clang invokes itself with. For example:

$ strace -f -s 1024 -v clang -mbig-endian -m64 big.bc -c 2>&1 | grep exec | tail -2 | head -1

This provides the command line options for the self invocation of clang:

[pid 4650] execve("/usr/local/bin/clang-3.9", ["/usr/local/bin/clang-3.9", "-cc1", "-triple", "aarch64_be-unknown-linux-gnu", "-emit-obj", "-mrelax-all", "-disable-free", "-main-file-name", "big.bc", "-mrelocation-model", "static", "-mthread-model", "posix", "-mdisable-fp-elim", "-fmath-errno", "-masm-verbose", "-mconstructor-aliases", "-fuse-init-array", "-target-cpu", "generic", "-target-feature", "+neon", "-target-abi", "aapcs", "-dwarf-column-info", "-debugger-tuning=gdb", "-coverage-file", "/workspace/pass/run/big.bc", "-resource-dir", "/usr/local/bin/../lib/clang/3.9.0", "-fdebug-compilation-dir", "/workspace/pass/run", "-ferror-limit", "19", "-fmessage-length", "0", "-fallow-half-arguments-and-returns", "-fno-signed-char", "-fobjc-runtime=gcc", "-fdiagnostics-show-option", "-o", "big.o", "-x", "ir", "big.bc"],

With a bit of vim tweaking you can turn this into a command line that can be executed (or debugged) directly:

/usr/local/bin/clang-3.9 -cc1 -triple aarch64_be-unknown-linux-gnu -emit-obj -mrelax-all -disable-free -main-file-name big.bc -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -fuse-init-array -target-cpu generic -target-feature +neon -target-abi aapcs -dwarf-column-info -debugger-tuning=gdb -coverage-file /workspace/pass/run/big.bc -resource-dir /usr/local/bin/../lib/clang/3.9.0 -fdebug-compilation-dir /workspace/pass/run -ferror-limit 19 -fmessage-length 0 -fallow-half-arguments-and-returns -fno-signed-char -fobjc-runtime=gcc -fdiagnostics-show-option -o big.o -x ir big.bc

Note that doing this also provides a mechanism to change the compiler triple manually, which is something that I wondered how to do (since clang documents -triple as an option, but seems to ignore it).
For example, I’m able to change -triple aarch64_be to aarch64 and get little endian object code from bytecode prepared with -mbig-endian.

## speeding up clang debug and builds

I found the default static library configuration of clang slow to rebuild, so I started building it in shared mode. That loaded pretty slowly in gdb, so I went looking for how to enable split dwarf, and found a nice little presentation on how to speed up clang builds. There’s a followup blog post with some speed up conclusions. One failure of that blog post is that it doesn’t actually list the cmake commands required to build with all these tricks. Using all the tricks listed there, I’m now trying the following:

mkdir -p ~/freeware
cd ~/freeware

git clone git://sourceware.org/git/binutils-gdb.git
cd binutils-gdb
./configure --prefix=$HOME/local/binutils.gold --enable-gold=default
make
make install

cd ..
git clone git://github.com/ninja-build/ninja.git
cd ninja
./configure.py --bootstrap
mkdir -p ~/local/ninja/bin/
cp ninja ~/local/ninja/bin/


With ninja in my PATH, I can now build clang with:

CC=clang CXX=clang++ \
cmake -G Ninja \
../llvm \
-DLLVM_USE_SPLIT_DWARF=TRUE \
-DLLVM_ENABLE_ASSERTIONS=TRUE \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_INSTALL_PREFIX=$HOME/clang39.be \
-DCMAKE_SHARED_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index" \
-DCMAKE_EXE_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index"

The --gdb-index option is included in the cmake invocation flags above, but wasn’t used for my initial 45s gdb+clang time measurements. With --gdb-index, the time for the gdb b-main, run, quit sequence is now reduced to:

real 0m10.268s
user 0m3.623s
sys 0m0.429s

A 4x reduction, which is quite nice!

## stack corruption detection with clang: safe-stack notes

At LZ our development and nightly builds are done with clang, so it is of interest to check out what stack protection checking compiler options are available. DB2 LUW used the intel compiler, which had very nice stack clobbering detection code. How does clang fare against the intel compiler in this respect?

Clang does support a safe-stack option. Here’s an example of some stack corrupting code:

#include <string.h>

void corrupt( char * b ) ;

int main()
{
   char b[12] ;

   corrupt( b ) ;

   return 0 ;
}

void corrupt( char * b )
{
   memset( b - 4, 'a', 20 ) ;
}

Running this without safe stack results in a SEGV on return from corrupt(). With safe-stack we have a “nicer” trap:

(gdb) run
Starting program: /home/pjoot/lznotes/proto/stackcorrupt2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
__memset_sse2 () at ../sysdeps/x86_64/memset.S:415
415 L(P4Q0): mov %edx,-0x4(%rdi)
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.3-9.el7.x86_64 libstdc++-4.8.3-9.el7.x86_64
(gdb) where
#0 __memset_sse2 () at ../sysdeps/x86_64/memset.S:415
#1 0x0000000000411835 in corrupt (
   b=0x7ffff6bd2ff4 'a' <repeats 12 times>, "\177ELF\002\001\001\003")
   at stackcorrupt2.cc:16
#2 0x00000000004117f1 in main () at stackcorrupt2.cc:9

The compiler is able to catch the corruption in the act, right in the offending memset.
## Limitations

What I do notice with this compiler option is that the implementation has opted not to catch corruptions within the valid stack frame, nor is there any attempt to catch a corruption that does not walk over the return pointer. Here’s an example:

#include <string.h>

void corrupt( char * b ) ;

#define SZ 12

#if 0
// safe-stack catches this:
#define PRESZ 0
#define POSTSZ 4
#else
// safe-stack does not catch this buffer overwrite
#define PRESZ 4
#define POSTSZ 0
#endif

int main()
{
   char b[SZ] ;

   corrupt( b ) ;

   return 0 ;
}

void corrupt( char * b )
{
   memset( b - PRESZ, 'a', SZ + PRESZ + POSTSZ ) ;
}

and another non-trapping stack corruption:

#include <stdio.h>
#include <string.h>

void corrupt2( int & a, char * b, int & c ) ;

int main()
{
   int a = 0 ;
   char b[12] ;
   int c = 0 ;

   corrupt2( a, b, c ) ;

   printf( "0x%08X 0x%08X\n", a, c ) ;

   return 0 ;
}

void corrupt2( int & a, char * b, int & c )
{
   memset( b - 4, 'a', 20 ) ;
}

The intel compiler appeared to use guard bytes between stack variables, and was able to tell you exactly which stack variable was overwritten. Clang appears to be opting for a write-forbidden guard region on the stack frame, so it is able to catch the corruption in the act, but only if the corruption is “big enough”. There are benefits of both approaches.

Unfortunately, there are a number of restrictions in the safe-stack documentation. I’m not sure that I’ll be able to use this at all in LZ code.

## First build break at the new job: C++ uniform initialization

Development builds at LZ are done with clang-3.8, but there is an alternate nightly build done with the older RHEL7 GCC-4.8.3 compiler (gcc is up to 6.1 now, so the RHEL7 default is truly _ancient_).
This bit of code didn’t compile with gcc:

template <typename mutex_type>
class shared_lock
{
   mutex_type & m_mutex ;

public:
   /** construct and acquire the mutex in shared mode */
   explicit shared_lock( mutex_type & mutex )
      : m_mutex{ mutex }
   {

The error is:

error: invalid initialization of non-const reference of type ‘lz::shared_mutex&’ from an rvalue of type ‘<brace-enclosed initializer list>’

This seems like a compiler bug to me, one that I’d seen when doing my scinet scientific computing course, which mandated the use of at least -std=c++11. In the scinet assignments, I fixed all such issues by using -std=c++14, which worked fine, but I was using gcc-5.3 for those assignments.

It appears that this is a compiler bug, and not just an issue with the c++11 language specification, as I initially thought while doing my scinet assignments. If I rebuild this code with g++-6.1, explicitly specifying -std=c++11 (GCC 6.1 defaults to c++14), then the issue goes away, so specification of -std=c++14 is not required to allow uniform initialization to work in this situation.

Because of being forced to use the older compiler, it looks like I have to fix this by using pre-c++11 syntax:

explicit shared_lock( mutex_type & mutex )
   : m_mutex( mutex )

My conclusion is that gcc-4.8.3 is not truly up to the job of building c++11 compliant code. I’ll have to be more careful with the language features that I use in the future.

## extern vs const in C++ and C code.

We now build DB2 on linux ppcle with the IBM xlC 13.1.2 compiler. This version of the compiler is a hybrid compared to any previous compilers, retaining the IBM xlC backend for power, but using the clang front end. Because of this we are exposed to a large number of warnings that we don’t see with many other compilers (well we probably do for our MacOSX port, but we do not really have active development on that platform at the moment), and I’ve been trying to take down those counts to manageable levels.
Header files that produce warnings have been my first target since they introduce the most repeated noise. One message that I was seeing hundreds of times was:

warning: 'extern' variable has an initializer [-Wextern-initializer]

This seemed to be coming from headers that did something like:

#if defined FOO_INITIALIZE_IT_IN_SOME_SOURCE_FILE
extern const TYPE foo[] = { ... } ;
#else
extern const TYPE foo[] ;
#endif

where FOO_INITIALIZE_IT_IN_SOME_SOURCE_FILE is defined at the top of a source file that explicitly includes this header. My attempt to handle the messages was to remove the ‘extern’ from the initialization case, but I was surprised to see link errors as a result of some of those changes. It turns out that there are some subtle differences between different variations of const and extern with an array declaration of this sort. Here’s a bit of sample code:

// t.h
extern const int x[] ;
extern int y[] ;
extern int z[] ;

// t.cc
#if defined WANT_LINK_ERROR
const int x[] = { 42 } ;
#else
extern const int x[] = { 42 } ;
#endif
extern int y[] = { 42 } ;
int z[] = { 42 } ;

When WANT_LINK_ERROR isn’t defined, this produces just one clang warning message:

t.cc:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
^

Note that the ‘extern const’ has no such warning, nor does the non-const symbol that’s been declared ‘extern’ in the header. However, removing the extern from the const case (via -DWANT_LINK_ERROR) results in no symbol ‘x’ available to other consumers. The extern is required for const symbols, but generates a warning for non-const symbols.

It appears that this is also C++ specific. A const symbol in C compiled code is available for external use, regardless of whether extern is used:

$ clang -c t.c
t.c:5:18: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern const int x[] = { 42 } ;
^
t.c:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
^
2 warnings generated.

$ nm t.o
0000000000000000 R x
0000000000000000 D y
0000000000000004 D z

$ clang -c -DWANT_LINK_ERROR t.c
t.c:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
^
1 warning generated.

$ nm t.o
0000000000000000 R x
0000000000000000 D y
0000000000000004 D z

whereas that same symbol requires extern if it is const in C++:

$ clang++ -c t.cc
t.cc:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
^
1 warning generated.

$ nm t.o
0000000000000000 R x
0000000000000000 D y
0000000000000004 D z

$ clang++ -c -DWANT_LINK_ERROR t.cc
t.cc:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
^
1 warning generated.

$ nm t.o
0000000000000000 D y
0000000000000004 D z

I hadn’t expected the const to interact this way with extern. I am guessing that C++ allows for the compiler to not generate symbols for global scope const variables, unless you ask for that by using extern, whereas with C you get the symbol like-it-or-not.

This particular message from the clang front end is only for non-const extern initializations, which makes across the board fixing of messages for extern initialization of the sort above trickier: you can’t do an across the board replacement of extern in initializers for a given file without first ensuring that the symbol isn’t const. It looks like dealing with this will have to be done much more carefully than I first tried.

## resolving merge conflicts due to automated C to C++ comment changes

November 17, 2014

I was faced with hundreds of merge conflicts that had the following diff3 -m conflict structure:

    <<<<<<< file.C.mine
    /* Allocate memory for ProcNamePattern and memset to blank */
    /* ProcNamePattern is used as an intermediate upper case string to capture the procedure name*/
    /* Allocate space for 128 byte schema, 128 byte procedure name */
    rc = BAR(0, len+SCHEMA_IDENT+1, MEM_DEFAULT, &ProcNamePattern);
    ||||||| file.C.orig
    /* Allocate memory for ProcNamePattern and memset to blank */
    /* ProcNamePattern is used as an intermediate upper case string to capture the procedure name*/
    /* Allocate space for 128 byte schema, 128 byte procedure name */
    rc = FOO(0, len+SCHEMA_IDENT+1, (void **) &ProcNamePattern);
    =======
    // Allocate memory for ProcNamePattern and memset to blank
    // ProcNamePattern is used as an intermediate upper case string to capture the procedure name
    // Allocate space for 128 byte schema, 128 byte procedure name
    rc = FOO(0, len+SCHEMA_IDENT+1, (void **) &ProcNamePattern);
    >>>>>>> file.C.new
    if (rc ) {

I’d run a clang based source editing tool that changed FOO to BAR, added a parameter, and removed a cast. Other maintainers of the code had run a tool, or perhaps an editor macro, that changed most (but not all) of the C style /* … */ comments into C++ single line comments // …

Those pairs of changes were unfortunately close enough to generate a diff3 -m conflict. I can run my clang editing tool again (and will), but need to get the source compile-able first, so was faced with either tossing and regenerating my changes, or resolving the conflicts. Basically I needed to filter these comments in the same fashion, and then accept all of my changes, provided there were no other changes in the .orig -> .new stream. Here’s a little perl filter I wrote for this task:

    #!/usr/bin/perl -n

    # a script to replace single line /* */ comments with C++ // comments in a restricted fashion.
    #
    # - doesn't touch comments of the form:
    #   /* ... */ ... /* ...
    #   ^^            ^^
    # - doesn't touch comments with leading non-whitespace or trailing non-whitespace
    #
    # This is used to filter new/old/mine triplets in merges to deal with automated replacements of this sort.

    chomp ;

    if ( ! m,/\*.*/\*, )
    {
       s,
          ^(\s*)   # restrict replacement to comments that start only after beginning of line and whitespace
          /\*      # start of comment
          \s*      # opt spaces
          (.*)     # payload
          \s*      # opt spaces
          \*/      # end of comment
          \s*$     # opt spaces and end of line
       ,$1//$2,x ;
    }

    #s,/\* *(.*)(?=\*/ *$)\*/ *$,// $1, ;

    print "$_\n" ;


This reads stdin and writes to stdout, making the automated change that had been applied to the code. I didn’t want it to do anything with comments of any of the forms:

• [non-whitespace] /* … */
• /* … */ … non-whitespace
• /* … */ … /* … */

I excluded these because the comment filtering that was applied to the current version of the files didn’t convert them, and I didn’t want to introduce more conflicts due to spacing changes.

With this filter run on all the .mine, .orig, .new versions I was able to rerun

diff3 -m file.mine file.orig file.new

and have only a few actual conflicts to deal with. Most of those were also due to space change and in some cases comment removal.
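The restricted comment conversion described above can also be sketched in C++. This is a hypothetical counterpart to the filter, not the script that was actually used: convert_comment_line is an illustrative name, and the choice to emit `// ` followed by the trimmed comment text is an assumption made to match the converted comments shown in the conflict example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: convert a line that consists solely of one
// /* ... */ comment (optionally indented) into a // comment.
// Every other line is returned unchanged.
std::string convert_comment_line( const std::string & line )
{
   // Leave lines with a second comment opener alone: /* ... */ ... /* ...
   std::size_t first = line.find( "/*" ) ;
   if ( ( first != std::string::npos ) &&
        ( line.find( "/*", first + 2 ) != std::string::npos ) )
   {
      return line ;
   }

   // Leading whitespace, the comment opener, the payload, the comment
   // closer, and nothing but whitespace up to the end of the line.
   static const std::regex single_comment( R"(^(\s*)/\*\s*(.*?)\s*\*/\s*$)" ) ;

   std::smatch m ;
   if ( !std::regex_match( line, m, single_comment ) )
   {
      // Code before the comment, text after it, or no comment at all.
      return line ;
   }

   // Assumed output style: indentation, "// ", then the trimmed payload.
   return m[1].str() + "// " + m[2].str() ;
}
```

A driver for this would read each line with std::getline and print the converted result, so that the same filter can be run over the .mine, .orig and .new files before the merge is retried.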

A lot of this trouble stems from the fact that our product has no coding standards for layout (or very little, or ones that are component specific). I maintain our coding standards for correctness, but when I was given these standards as a fairly green developer I didn’t have the guts to take on the role of coding standards dictator, and impose style guidelines on developers very much my senior.

Even without style guidelines, a lot of these sorts of merge conflicts could be avoided or minimized significantly if we would only make automated changes with tools that everybody could (or must) run. That would allow conflict filtering of this sort to be done automatically, without having to write ad-hoc tools to “re-play” the automated change in a subset of the merge contributors.

My use of a clang rewriter flies in the face of this ideal conflict avoidance strategy, since our build environment is not currently tooled up for our developers to run it. However, in this case, being able to do robust automated maintenance will hopefully be worth the conflicts that this itself will inject.