C/C++ development and debugging.

gdb set target-charset

January 9, 2017 C/C++ development and debugging. , , ,

I was looking for a way to convert ASCII and EBCDIC strings in gdb debugging sessions and was experimenting with gdb python script extensions. I managed to figure out how to add my own command that read a gdb variable, and print it out, but it failed when I tried to run a character conversion function. In the process of debugging that char encoding error, I found that there’s a built in way to do exactly what I wanted to do:

(gdb) p argv[0]
$16 = 0x7fd8fbda0108 "\323\326\303\301\323\305\303\326"
(gdb) set target-charset EBCDIC-US
(gdb) p argv[0]
$17 = 0x7fd8fbda0108 "LOCALECO"
(gdb) set target-charset ASCII
(gdb) p argv[0]
$18 = 0x7fd8fbda0108 "\323\326\303\301\323\305\303\326"

A wierd way to invoke the compiler

November 18, 2016 C/C++ development and debugging. ,

Did you know that you can run the compiler from sources specified in stdin?  Here’s an example:

// m.c
#include <stdio.h>

int main()
{   
    int x = 3;
    printf("%d\n", x);
    return 0;
}

You have to specify the language for the code explicitly, since that can’t be inferred from the filename when that file data is coming from stdin:

$ cat m.c | clang -g -x c - -o f
$ ./f
3

This fact came up in conversation the other day. The result is something that is completely undebuggable, but you can do it! I’m curious if there’s actually a use case for this?

Another Linux shared library trace facility

October 27, 2016 C/C++ development and debugging. , ,

I previously blogged about a way to force ltrace to show some shared memory trace records that didn’t show up by default.

Where that fails to be useful, is when you don’t have a guess about what shared library the code in question lives in. I just blundered on the latrace command that uses a Linux dynamic loader audit facility to give a complete trace of all the function-name/library-name pairs that are executed!

Here’s an example invocation:

latrace \
clang xx.c -c 2>&1 | c++filt

without output like:

...
 9022     std::operator&(std::memory_order, std::__memory_order_modifier) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMAnalysis.so]
 9022     std::operator&(std::memory_order, std::__memory_order_modifier) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMAnalysis.so]
 9022     strlen [/lib64/libc.so.6]
 9022     strlen [/lib64/libc.so.6]
 9022     strlen [/lib64/libc.so.6]
 9022     llvm::cl::basic_parser::basic_parser(llvm::cl::Option&) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMSupport.so]
 9022     strlen [/lib64/libc.so.6]
 9022     llvm::cl::Option::setArgStr(llvm::StringRef) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMSupport.so]
 9022     strlen [/lib64/libc.so.6]
 9022     std::pair::__type, std::__decay_and_strip::__type> std::make_pair(void const**&&, bool&&) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMX86CodeGen.so]
 9022       void const**&& std::forward(std::remove_reference::type&) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMX86CodeGen.so]
 9022       bool&& std::forward(std::remove_reference::type&) []
 9022       void const**&& std::forward(std::remove_reference::type&) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMX86CodeGen.so]
 9022       bool&& std::forward(std::remove_reference::type&) []
 9022     void const**&& std::forward(std::remove_reference::type&) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMX86CodeGen.so]
...

With this latrace command, we get all the shared library function call names and their corresponding shared library names. Using that info we can dig into a specific shared library with ltrace or the debugger, once a point of interest is determined.

using ltrace to dig into shared libraries

October 19, 2016 C/C++ development and debugging., clang/llvm , ,

I was trying to find where the clang compiler is writing out constant global data values, and didn’t manage to find it by code inspection. If I run ltrace (also tracing system calls), I see the point where the ELF object is written out:

std::string::compare(std::string const&) const(0x7ffc8983a190, 0x1e32e60, 7, 254) = 5
std::string::compare(std::string const&) const(0x1e32e60, 0x7ffc8983a190, 7, 254) = 0xfffffffb
std::string::compare(std::string const&) const(0x7ffc8983a190, 0x1e32e60, 7, 254) = 5
write@SYS(4, "\177ELF\002\001\001", 848)         = 848
lseek@SYS(4, 40, 0)                              = 40
write@SYS(4, "\220\001", 8)                      = 8
lseek@SYS(4, 848, 0)                             = 848
lseek@SYS(4, 60, 0)                              = 60
write@SYS(4, "\a", 2)                            = 2
lseek@SYS(4, 848, 0)                             = 848
std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()(0x1e2a2e0, 0x1e2a2e8, 0x1e27978, 0x1e27978) = 0
rt_sigprocmask@SYS(2, 0x7ffc8983bb58, 0x7ffc8983bad8, 8) = 0
close@SYS(4)                                     = 0
rt_sigprocmask@SYS(2, 0x7ffc8983bad8, 0, 8)      = 0

This is from running:

ltrace -S --demangle \
   ...

The -S is to display syscalls as well as library calls. To my suprise, this seems to show calls to libstdc++ library calls, but I’m not seeing much from clang itself, just:

clang::DiagnosticsEngine::DiagnosticsEngine
clang::driver::ToolChain::getTargetAndModeFromProgramName
llvm::cl::ExpandResponseFiles
llvm::EnablePrettyStackTrace
llvm::errs
llvm::install_fatal_error_handler
llvm::llvm_shutdown
llvm::PrettyStackTraceEntry::PrettyStackTraceEntry
llvm::PrettyStackTraceEntry::~PrettyStackTraceEntry
llvm::raw_ostream::preferred_buffer_size
llvm::raw_svector_ostream::write_impl
llvm::remove_fatal_error_handler
llvm::StringMapImpl::LookupBucketFor
llvm::StringMapImpl::RehashTable
llvm::sys::PrintStackTraceOnErrorSignal
llvm::sys::Process::FixupStandardFileDescriptors
llvm::sys::Process::GetArgumentVector
llvm::TimerGroup::printAll

There’s got to be a heck of a lot more that the compiler is doing!? It turns out that ltrace doesn’t seem to trace out all the library function calls that lie in shared libraries (I’m using a shared library + split dwarf build of clang). The default output was a bit deceptive since I saw some shared lib calls, in particular the there were std::… calls (from libstc++.so) in the ltrace output. My conclusion seems to be that the tool is lying by default.

This can be confirmed by explicitly asking to see the functions from a specific shared lib. For example, if I call ltrace as:

$ ltrace -S --demangle -e @libLLVMX86CodeGen.so \
/clang/be.b226a0a/bin/clang-3.9 \
-cc1 \
-triple \
x86_64-unknown-linux-gnu \
...

Now I get ~68K calls to libLLVMX86CodeGen.so functions that didn’t show up in the default ltrace output! The ltrace tool won’t show me these by default (although the man page seems to suggest that it should), but if I narrow down what I’m looking through to a single shared lib, at least I can now examine the function calls in that shared lib.

On the SONAME

Note that the @lib….so name has to match the SONAME.  For example if the shared libraries on disk were:

libLLVMX86CodeGen.so -> libLLVMX86CodeGen.so.3
libLLVMX86CodeGen.so.3 -> libLLVMX86CodeGen.so.3.9
libLLVMX86CodeGen.so.3.9 -> libLLVMX86CodeGen.so.3.9.0

$ objdump -x libLLVMX86CodeGen.so | grep SONAME

would give you the name to use.  This becomes relevant in clang 4.0 where the SONAME ends up with .so.4 instead of just .so (when building clang with shared libs instead of archive libs).

brace matching in vim, regardless of how it is formatted?

August 31, 2016 C/C++ development and debugging. ,

DB2 functions were usually formatted with the brace on the leading line like so:

size_t table_count( T * table )
{ 
   size_t count = 0 ;
   ....
} 

For such code, typing [[ in vim anywhere from somewhere in the function text would take you to the beginning of the function. It has always annoyed me that this key sequence didn’t work for functions formatted without the leading { in the first column, such as

size_t table_count( T * table ) { 
    size_t count = 0 ;
    ....
} 

Having my handy [[ command sequence take me to the first line of the file is pretty annoying, enough that I looked up the way to do what I want. A key sequence that does part of this job is:

[{

This takes you to the outermost ending position of the current scope, and you can use % to get to the beginning of that scope. You can repeat this as many times as necessary, until you get the outermost scope.

Is there a better way to go directly to the outermost scope directly, regardless of how the function happens to be formatted?