clang

using ltrace to dig into shared libraries

October 19, 2016 C/C++ development and debugging., clang/llvm

I was trying to find where the clang compiler is writing out constant global data values, and didn’t manage to find it by code inspection. If I run ltrace (also tracing system calls), I see the point where the ELF object is written out:

std::string::compare(std::string const&) const(0x7ffc8983a190, 0x1e32e60, 7, 254) = 5
std::string::compare(std::string const&) const(0x1e32e60, 0x7ffc8983a190, 7, 254) = 0xfffffffb
std::string::compare(std::string const&) const(0x7ffc8983a190, 0x1e32e60, 7, 254) = 5
write@SYS(4, "\177ELF\002\001\001", 848)         = 848
lseek@SYS(4, 40, 0)                              = 40
write@SYS(4, "\220\001", 8)                      = 8
lseek@SYS(4, 848, 0)                             = 848
lseek@SYS(4, 60, 0)                              = 60
write@SYS(4, "\a", 2)                            = 2
lseek@SYS(4, 848, 0)                             = 848
std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()(0x1e2a2e0, 0x1e2a2e8, 0x1e27978, 0x1e27978) = 0
rt_sigprocmask@SYS(2, 0x7ffc8983bb58, 0x7ffc8983bad8, 8) = 0
close@SYS(4)                                     = 0
rt_sigprocmask@SYS(2, 0x7ffc8983bad8, 0, 8)      = 0

This is from running:

ltrace -S --demangle \
   ...

The -S is to display syscalls as well as library calls. To my surprise, this shows calls into libstdc++, but I’m not seeing much from clang itself, just:

clang::DiagnosticsEngine::DiagnosticsEngine
clang::driver::ToolChain::getTargetAndModeFromProgramName
llvm::cl::ExpandResponseFiles
llvm::EnablePrettyStackTrace
llvm::errs
llvm::install_fatal_error_handler
llvm::llvm_shutdown
llvm::PrettyStackTraceEntry::PrettyStackTraceEntry
llvm::PrettyStackTraceEntry::~PrettyStackTraceEntry
llvm::raw_ostream::preferred_buffer_size
llvm::raw_svector_ostream::write_impl
llvm::remove_fatal_error_handler
llvm::StringMapImpl::LookupBucketFor
llvm::StringMapImpl::RehashTable
llvm::sys::PrintStackTraceOnErrorSignal
llvm::sys::Process::FixupStandardFileDescriptors
llvm::sys::Process::GetArgumentVector
llvm::TimerGroup::printAll

There’s got to be a heck of a lot more that the compiler is doing!? It turns out that, by default, ltrace doesn’t trace all the function calls that land in shared libraries (I’m using a shared library + split dwarf build of clang). The default output was a bit deceptive since I did see some shared lib calls, in particular the std::… calls (from libstdc++.so). My conclusion is that the tool is lying by default.

This can be confirmed by explicitly asking to see the functions from a specific shared lib. For example, if I call ltrace as:

ltrace -S --demangle -e @libLLVMX86CodeGen.so \
/clang/be.b226a0a/bin/clang-3.9 \
-cc1 \
-triple \
x86_64-unknown-linux-gnu \
...

Now I get ~68K calls to libLLVMX86CodeGen.so functions that didn’t show up in the default ltrace output! The ltrace tool won’t show me these by default (although the man page seems to suggest that it should), but if I narrow down what I’m looking through to a single shared lib, at least I can now examine the function calls in that shared lib.

How to invoke the 2nd pass of the clang compiler manually

October 3, 2016 clang/llvm

Because the clang front end reexecs itself, breakpoints on the interesting parts of the clang front end don’t get hit by default. Here’s an example:

$ cat g2
b llvm::Module::setDataLayout
b BackendConsumer::BackendConsumer
b llvm::TargetMachine::TargetMachine
b llvm::TargetMachine::createDataLayout
run -mbig-endian -m64 -c bytes.c -emit-llvm -o big.bc

$ gdb `which clang`
GNU gdb (GDB) Red Hat Enterprise Linux 7.9.1-19.lz.el7
...
(gdb) source g2
Breakpoint 1 at 0x2c04c3d: llvm::Module::setDataLayout. (2 locations)
Breakpoint 2 at 0x3d08870: file /source/llvm/lib/Target/TargetMachine.cpp, line 47.
Breakpoint 3 at 0x33108ca: file /source/llvm/include/llvm/Target/TargetMachine.h, line 133.
...
Detaching after vfork from child process 15795.
[Inferior 1 (process 15789) exited normally]

(The debugger finishes and exits, hitting none of the breakpoints)

One way to deal with this is to set the fork mode to child:

(gdb) set follow-fork-mode child

An alternate way of dealing with this is to use strace to collect the command line that clang invokes itself with. For example:

$ strace -f -s 1024 -v clang -mbig-endian -m64 big.bc -c 2>&1 | grep exec | tail -2 | head -1

This provides the command line options for the self invocation of clang

[pid  4650] execve("/usr/local/bin/clang-3.9", ["/usr/local/bin/clang-3.9", "-cc1", "-triple", "aarch64_be-unknown-linux-gnu", "-emit-obj", "-mrelax-all", "-disable-free", "-main-file-name", "big.bc", "-mrelocation-model", "static", "-mthread-model", "posix", "-mdisable-fp-elim", "-fmath-errno", "-masm-verbose", "-mconstructor-aliases", "-fuse-init-array", "-target-cpu", "generic", "-target-feature", "+neon", "-target-abi", "aapcs", "-dwarf-column-info", "-debugger-tuning=gdb", "-coverage-file", "/workspace/pass/run/big.bc", "-resource-dir", "/usr/local/bin/../lib/clang/3.9.0", "-fdebug-compilation-dir", "/workspace/pass/run", "-ferror-limit", "19", "-fmessage-length", "0", "-fallow-half-arguments-and-returns", "-fno-signed-char", "-fobjc-runtime=gcc", "-fdiagnostics-show-option", "-o", "big.o", "-x", "ir", "big.bc"],

With a bit of vim tweaking you can turn this into a command line that can be executed (or debugged) directly

/usr/local/bin/clang-3.9 -cc1 -triple aarch64_be-unknown-linux-gnu -emit-obj -mrelax-all -disable-free -main-file-name big.bc -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -fuse-init-array -target-cpu generic -target-feature +neon -target-abi aapcs -dwarf-column-info -debugger-tuning=gdb -coverage-file /workspace/pass/run/big.bc -resource-dir /usr/local/bin/../lib/clang/3.9.0 -fdebug-compilation-dir /workspace/pass/run -ferror-limit 19 -fmessage-length 0 -fallow-half-arguments-and-returns -fno-signed-char -fobjc-runtime=gcc -fdiagnostics-show-option -o big.o -x ir big.bc

Note that doing this also provides a mechanism to change the compiler triple manually, which is something that I wondered how to do (since clang documents -triple as an option, but seems to ignore it). For example, I’m able to change -triple aarch64_be to aarch64 and get little endian object code from bitcode prepared with -mbig-endian.

speeding up clang debug and builds

October 2, 2016 clang/llvm

I found the default static library configuration of clang slow to rebuild, so I started building it in shared mode. That loaded pretty slowly in gdb, so I went looking for how to enable split dwarf, and found a nice little presentation on how to speed up clang builds.

There’s a followup blog post with some speed up conclusions.

A failure of that blog post is that it doesn’t actually list the cmake commands required to build with all these tricks. Using all the tricks listed there, I’m now trying the following:

mkdir -p ~/freeware
cd ~/freeware

git clone git://sourceware.org/git/binutils-gdb.git
cd binutils-gdb
./configure --prefix=$HOME/local/binutils.gold --enable-gold=default
make 
make install

cd ..
git clone git://github.com/ninja-build/ninja.git 
cd ninja
./configure.py --bootstrap
mkdir -p ~/local/ninja/bin/
cp ninja ~/local/ninja/bin/

With ninja in my PATH, I can now build clang with:

CC=clang CXX=clang++ \
cmake -G Ninja \
../llvm \
-DLLVM_USE_SPLIT_DWARF=TRUE \
-DLLVM_ENABLE_ASSERTIONS=TRUE \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_INSTALL_PREFIX=$HOME/clang39.be \
-DCMAKE_SHARED_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index" \
-DCMAKE_EXE_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index" \
-DBUILD_SHARED_LIBS=true \
-DLLVM_TARGETS_TO_BUILD=X86 \
2>&1 | tee o

ninja

ninja install

This does build way faster, both for full builds and incremental builds.

Build tree size

Dynamic libraries: 4.4 GB. Static libraries: 19.8 GB.

Installed size

Dynamic libraries: 0.7 GB. Static libraries: 14.7 GB.

Results: full build time.

Static libraries, non-ninja, all backends:

real    51m6.494s
user    160m47.027s
sys     8m49.429s

Dynamic libraries, ninja, split dwarf, x86 backend only:

real    26m19.360s
user    86m11.477s
sys     3m14.478s

Results: incremental build. touch lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp.

Static libraries, non-ninja, all backends:

real    2m17.709s
user    6m8.648s
sys     0m28.594s

Dynamic libraries, ninja, split dwarf, x86 backend only:

real    0m3.245s
user    0m6.104s
sys     0m0.802s

make install times

make:

real    2m6.353s
user    0m7.827s
sys     0m15.316s

ninja:

real    0m2.138s
user    0m0.420s
sys     0m0.831s

The time for rerunning a sharedlib-config ‘ninja install’ is even faster!

Results: time for gdb, b main, run, quit

Static libraries:

real    0m45.904s
user    0m32.376s
sys     0m1.787s

Dynamic libraries, with split dwarf:

real    0m44.440s
user    0m37.096s
sys     0m1.067s

This one isn’t what I would have expected. The initial gdb load time for the split-dwarf exe is almost instantaneous, however it still takes a long time to break in main and continue to that point. I guess that we are taking the hit for a lot of symbol lookup at that point, so it comes out as a wash.

Thinking about this, I noticed that the clang make system doesn’t seem to add ‘-Wl,--gdb-index’ to the link step along with the addition of -gsplit-dwarf to the compilation command line. I thought that was required to get all the deferred symbol table lookup?

Attempting to do so, I found that the insertion of an alternate linker in my PATH wasn’t enough to get clang to use it. Adding -Wl,--gdb-index into the link flags caused complaints from /usr/bin/ld! The cmake magic required was:

-DCMAKE_SHARED_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index" \
-DCMAKE_EXE_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index" \

This is in the first cmake invocation flags above, but wasn’t used for my initial 45s gdb+clang time measurements. With --gdb-index, the time for the gdb b-main, run, quit sequence is now reduced to:

real    0m10.268s
user    0m3.623s
sys     0m0.429s

A 4x reduction, which is quite nice!

stack corruption detection with clang: safe-stack notes

June 10, 2016 C/C++ development and debugging.

At LZ our development and nightly builds are done with clang, so it is of interest to check out what stack protection compiler options are available.  DB2 LUW used the intel compiler, which had very nice stack clobbering detection.  How does clang fare against the intel compiler in this respect?

Clang does support a safe-stack option.  Here’s an example of some stack corrupting code:

#include <string.h>

void corrupt( char * b ) ;

int main()
{
   char b[12] ;

   corrupt( b ) ;

   return 0 ;
}

void corrupt( char * b )
{
   memset( b - 4, 'a', 20 ) ;
}

Running this without safe stack results in a SEGV on return from corrupt():

[Screen shot: 2016-06-10 at 11.08.54 AM]

With safe-stack we have a “nicer” trap:

(gdb) run
Starting program: /home/pjoot/lznotes/proto/stackcorrupt2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
__memset_sse2 () at ../sysdeps/x86_64/memset.S:415
415     L(P4Q0): mov    %edx,-0x4(%rdi)
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.3-9.el7.x86_64 libstdc++-4.8.3-9.el7.x86_64
(gdb) where
#0  __memset_sse2 () at ../sysdeps/x86_64/memset.S:415
#1  0x0000000000411835 in corrupt (
    b=0x7ffff6bd2ff4 'a' <repeats 12 times>, "\177ELF\002\001\001\003")
    at stackcorrupt2.cc:16
#2  0x00000000004117f1 in main () at stackcorrupt2.cc:9

The compiler is able to catch the corruption in the act, right in the offending memset.

Limitations

What I do notice with this compiler option is that the implementation has opted not to catch corruptions within the valid stack frame, nor is there any attempt to catch a corruption that does not walk over the return pointer. Here’s an example:

#include <string.h>

void corrupt( char * b ) ;

#define SZ 12
#if 0
   // safe-stack catches this:
   #define PRESZ 0
   #define POSTSZ 4
#else
   // safe-stack catches this buffer overwrite
   #define PRESZ 4
   #define POSTSZ 0
#endif
int main()
{
   char b[SZ] ;

   corrupt( b ) ;

   return 0 ;
}

void corrupt( char * b )
{
   memset( b - PRESZ, 'a', SZ + PRESZ + POSTSZ ) ;
}

and another non-trapping stack corruption:

#include <stdio.h>
#include <string.h>

void corrupt2( int & a, char * b, int & c ) ;

int main()
{
   int a = 0 ;
   char b[12] ;
   int c = 0 ;

   corrupt2( a, b, c ) ;

   printf( "0x%08X 0x%08X\n", a, c ) ;

   return 0 ;
}

void corrupt2( int & a, char * b, int & c )
{
   memset( b - 4, 'a', 20 ) ;
}

The intel compiler appeared to use guard bytes between stack variables, and was able to tell you exactly which stack variable was overwritten. Clang appears to be opting for a write-forbidden guard region on the stack frame, so it is able to catch the corruption in the act, but only if the corruption is “big enough”. There are benefits of both approaches. Unfortunately, there are a number of restrictions in the safe-stack documentation. I’m not sure that I’ll be able to use this at all in LZ code.

First build break at the new job: C++ uniform initialization

May 12, 2016 C/C++ development and debugging.

Development builds at LZ are done with clang-3.8, but there is an alternate nightly build done with the older RHEL7 GCC-4.8.3 compiler (gcc is up to 6.1 now, so the RHEL7 default is truly _ancient_). This bit of code didn’t compile with gcc:

   template <typename mutex_type>
   class shared_lock
   {  
      mutex_type &      m_mutex ;

   public:

      /** construct and acquire the mutex in shared mode */
      explicit shared_lock( mutex_type & mutex )
         : m_mutex{ mutex }
      {  

The error is:

error: invalid initialization of non-const reference of type ‘lz::shared_mutex&’ from an rvalue of type ‘<brace-enclosed initializer list>’

This seems like a compiler bug to me, one that I’d seen when doing my scinet scientific computing course, which mandated the use of at least -std=c++11. In the scinet assignments, I fixed all such issues by using -std=c++14, which worked fine, but I was using gcc-5.3 for those assignments.

It appears that this is a compiler bug, and not just an issue with the c++11 language specification, as I initially thought while doing my scinet assignments. If I rebuild this code with g++-6.1, explicitly specifying -std=c++11 (GCC 6.1 defaults to c++14), then the issue goes away, so specification of -std=c++14 is not required to allow uniform initialization to work in this situation.

Because of being forced to use the older compiler, it looks like I have to fix this by using pre-c++11 syntax:

      explicit shared_lock( mutex_type & mutex )
         : m_mutex( mutex )

My conclusion is that gcc-4.8.3 is not truly up to the job of building c++11 compliant code. I’ll have to be more careful with the language features that I use in the future.

extern vs const in C++ and C code.

October 5, 2015 C/C++ development and debugging.

We now build DB2 on linux ppcle with the IBM xlC 13.1.2 compiler. This version of the compiler is a hybrid compared to any previous compilers, retaining the IBM xlC backend for power, but using the clang front end. Because of this we are exposed to a large number of warnings that we don’t see with many other compilers (well we probably do for our MacOSX port, but we do not really have active development on that platform at the moment), and I’ve been trying to take down those counts to manageable levels. Header files that produce warnings have been my first target since they introduce the most repeated noise.

One message that I was seeing hundreds of was

warning: 'extern' variable has an initializer [-Wextern-initializer]

This seemed to be coming from headers that did something like:

#if defined FOO_INITIALIZE_IT_IN_SOME_SOURCE_FILE
extern const TYPE foo[] = { ... } ;
#else
extern const TYPE foo[] ;
#endif


where FOO_INITIALIZE_IT_IN_SOME_SOURCE_FILE is defined at the top of a source file that explicitly includes this header. My attempt to handle the messages was to remove the ‘extern’ from the initialization case, but I was surprised to see link errors as a result of some of those changes. It turns out that there are some subtle differences between different variations of const and extern with an array declaration of this sort. Here’s a bit of sample code:

// t.h
extern const int x[] ;
extern int y[] ;
extern int z[] ;


// t.cc
#if defined WANT_LINK_ERROR
const int x[] = { 42 } ;
#else
extern const int x[] = { 42 } ;
#endif

extern int y[] = { 42 } ;
int z[] = { 42 } ;


When WANT_LINK_ERROR isn’t defined, this produces just one clang warning message

t.cc:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
           ^

Note that the ‘extern const’ has no such warning, nor does the non-const symbol that’s been declared ‘extern’ in the header. However, removing the extern from the const case (via -DWANT_LINK_ERROR) results in no symbol ‘x’ available to other consumers. The extern is required for const symbols, but generates a warning for non-const symbols.

It appears that this is also C++ specific. A const symbol in C compiled code is available for external use, regardless of whether extern is used:



$ clang -c t.c
t.c:5:18: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern const int x[] = { 42 } ;
                 ^
t.c:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
           ^
2 warnings generated.

$ nm t.o
0000000000000000 R x
0000000000000000 D y
0000000000000004 D z

$ clang -c -DWANT_LINK_ERROR t.c
t.c:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
           ^
1 warning generated.
$  nm t.o
0000000000000000 R x
0000000000000000 D y
0000000000000004 D z


whereas that same symbol requires extern if it is const in C++:


$ clang++ -c t.cc
t.cc:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
           ^
1 warning generated.
$ nm t.o
0000000000000000 R x
0000000000000000 D y
0000000000000004 D z



$ clang++ -c -DWANT_LINK_ERROR t.cc
t.cc:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
           ^
1 warning generated.
$ nm t.o
0000000000000000 D y
0000000000000004 D z


I hadn’t expected const to interact with extern this way. In C++ a const variable at namespace scope has internal linkage unless it is declared extern, so no externally visible symbol is generated for it, whereas with C you get the symbol like it or not. In C++ code, this particular message from the clang front end is only issued for non-const extern initializations, so you can’t do an across the board removal of extern from the initializers in a given file without first ensuring that each symbol isn’t const. It looks like dealing with this will have to be done much more carefully than I first tried.

resolving merge conflicts due to automated C to C++ comment changes

November 17, 2014 C/C++ development and debugging.

I was faced with hundreds of merge conflicts that had the following diff3 -m conflict structure:


<<<<<<< file.C.mine
   /* Allocate memory for ProcNamePattern and memset to blank */
   /* ProcNamePattern is used as an intermediate upper case string to capture the procedure name*/
   /* Allocate space for 128 byte schema, 128 byte procedure name */
   rc = BAR(0,
            len+SCHEMA_IDENT+1,
            MEM_DEFAULT,
            &ProcNamePattern);
||||||| file.C.orig
   /* Allocate memory for ProcNamePattern and memset to blank */
   /* ProcNamePattern is used as an intermediate upper case string to capture the procedure name*/
   /* Allocate space for 128 byte schema, 128 byte procedure name */
   rc = FOO(0,
            len+SCHEMA_IDENT+1,
            (void **) &ProcNamePattern);
=======
   // Allocate memory for ProcNamePattern and memset to blank
   // ProcNamePattern is used as an intermediate upper case string to capture the procedure name
   // Allocate space for 128 byte schema, 128 byte procedure name
   rc = FOO(0,
            len+SCHEMA_IDENT+1,
            (void **) &ProcNamePattern);
>>>>>>> file.C.new
   if (rc  )
   {


I’d run a clang based source editing tool that changed FOO to BAR, added a parameter, and removed a cast. Other maintainers of the code had run a tool, or perhaps an editor macro that changed most (but not all) of the C style /* … */ comments into C++ single line comments // …

Those pairs of changes were unfortunately close enough to generate a diff3 -m conflict.

I can run my clang editing tool again (and will), but need to get the source compile-able first, so was faced with either tossing and regenerating my changes, or resolving the conflicts. Basically I needed to filter these comments in the same fashion, and then accept all of my changes, provided there were no other changes in the .orig -> .new stream.

Here’s a little perl filter I wrote for this task:

#!/usr/bin/perl -n

# a script to replace single line /* */ comments with C++ // comments in a restricted fashion.
#
# - doesn't touch comments of the form:
#                                         /* ... */ ... /* ...
#                                         ^^            ^^
# - doesn't touch comments with leading non-whitespace or trailing non-whitespace
#
# This is used to filter new/old/mine triplets in merges to deal with automated replacements of this sort.

chomp ;

if ( ! m,/\*.*/\*, )
{
   s,
^(\s*)   # restrict replacement to comments that start only after beginning of line and whitespace
/\*      # start of comment
\s*      # opt spaces
(.*)     # payload
\s*      # opt spaces
\*/      # end of comment
\s*$     # opt spaces and end of line
,$1// $2,x ;
}

#s,/\* *(.*)(?=\*/ *$)\*/ *$,// $1, ;

print "$_\n" ;

This consumes stdin, and spits out stdout, making the automated change that had been applied to the code. I didn’t want it to do anything with comments of any of the forms:

  • [non-whitespace] /* … */
  • /* … */ … non-whitespace
  • /* … */ … /* … */

I excluded those because the comment filtering that’s now in the current version of the files didn’t convert those forms, and I didn’t want to introduce more conflicts due to spacing changes.

With this filter run on all the .mine, .orig, .new versions I was able to rerun

diff3 -m file.mine file.orig file.new

and have only a few actual conflicts to deal with. Most of those were also due to space change and in some cases comment removal.

A lot of this trouble stems from the fact that our product has no coding standards for layout (or very little, or ones that are component specific). I maintain our coding standards for correctness, but when I was given these standards as a fairly green developer I didn’t have the guts to take on the role of coding standards dictator, and impose style guidelines on developers very much my senior.

Even without style guidelines, a lot of these sorts of merge conflicts could be avoided or minimized significantly if we would only make automated changes with tools that everybody could (or must) run. That would allow conflict filtering of that sort to be done automatically, without having to write ad-hoc tools to “re-play” the automated change in a subset of the merge contributors.

My use of a clang rewriter flies in the face of this ideal conflict avoidance strategy since our build environment is not currently tooled up for our developers to do so. However, in this case, being able to do robust automated maintenance will hopefully be worth the conflicts that this itself will inject.