C/C++ development and debugging.

gdb pretty print of structures

March 9, 2017 C/C++ development and debugging. No comments , , , ,

Here’s a nice little gdb trick for displaying structure contents in a less compact format

(gdb) set print pretty on
(gdb) p dd[0]
$4 = {
  jfcb = {
    datasetName = "PJOOT.NVS1", ' ' <repeats 34 times>,
    vols = {"<AAAiW", "\000\000\000\000\000", "\000\000\000\000\000", "\000\000\000\000\000", "\000\000\000\000\000"},
  block_size = 800,
  device_class = 32 '\040',
  device_type = 15 '\017',
  disp_normal = 8 '\010',
  disp_cond = 8 '\010',
  volsers = 0x7fb71801ecd6 "<AAAiW",

compare this to the dense default

(gdb) set print pretty on
(gdb) p dd[0]
$5 = {jfcb = {datasetName = "PJOOT.NVS1", ... vols = {"<AAAiW", "\000\000\000\000\000", "\000\000\000\000\000", "\000\000\000\000\000", "\000\000\000\000\000"}, block_size = 800, block_size_limit = 0, device_class = 32 '\040', device_type = 15 '\017', disp_normal = 8 '\010', disp_cond = 8 '\010', volsers = 0x7fb71801ecd6 "<AAAiW"}

For really big structures (this one actually is, but I’ve pruned a bunch of stuff), this makes the structure print display a whole lot more readable. Additionally, if you combine this with ‘(gdb) set logging on’, then with pretty print enabled you can prune the output by line easily to see just what you want.

peeking into relocation of function static in shared library

February 27, 2017 C/C++ development and debugging. 2 comments , , , , , , , ,

Here’s GUI (TUI) output of a function with a static variable access:

B+ |0x7ffff7616800 <st32>           test   %edi,%edi                                                                                                                     |
   |0x7ffff7616802 <st32+2>         je     0x7ffff7616811 <st32+17>                                                                                                      |
   |0x7ffff7616804 <st32+4>         mov    %edi,%eax                                                                                                                     |
   |0x7ffff7616806 <st32+6>         bswap  %eax                                                                                                                          |
   |0x7ffff7616808 <st32+8>         mov    %eax,0x200852(%rip)        # 0x7ffff7817060 <st32.yst32>                                                                        |
   |0x7ffff761680e <st32+14>        mov    %edi,%eax                                                                                                                     |
   |0x7ffff7616810 <st32+16>        retq                                                                                                                                 |
  >|0x7ffff7616811 <st32+17>        mov    0x200849(%rip),%edi        # 0x7ffff7817060 <st32.yst32>                                                                        |
   |0x7ffff7616817 <st32+23>        bswap  %edi                                                                                                                          |
   |0x7ffff7616819 <st32+25>        mov    %edi,%eax                                                                                                                     |
   |0x7ffff761681b <st32+27>        retq                                                                                                                                 |
   |0x7ffff761681c <_fini>          sub    $0x8,%rsp                                                                                                                     |
   |0x7ffff7616820 <_fini+4>        add    $0x8,%rsp                                                                                                                     |
   |0x7ffff7616824 <_fini+8>        retq                                                                                                                                 |
   |0x7ffff7616825                  add    %al,(%rcx)                                                                                                                    |
   |0x7ffff7616827 <x16+1>          add    (%rcx),%al                                                                                                                    |

The associated code is:

int st32( int v ) {
    static int yst32 = 0x1a2b3c4d;

    if ( v ) {
        yst32 = v;

    return yst32;

The object code dump (prior to relocation) just has zeros in the offset for the variable:

$ objdump -d g.bs.o | grep -A12 '<st32>'
0000000000000050 <st32>:
  50:   85 ff                   test   %edi,%edi
  52:   74 0d                   je     61 <st32+0x11>
  54:   89 f8                   mov    %edi,%eax
  56:   0f c8                   bswap  %eax
  58:   89 05 00 00 00 00       mov    %eax,0x0(%rip)        # 5e <st32+0xe>
  5e:   89 f8                   mov    %edi,%eax
  60:   c3                      retq   
  61:   8b 3d 00 00 00 00       mov    0x0(%rip),%edi        # 67 <st32+0x17>
  67:   0f cf                   bswap  %edi
  69:   89 f8                   mov    %edi,%eax
  6b:   c3                      retq   

The linker has filled in the real offsets in question, and the dynamic loader has collaborated to put the data segment in the desired location.

The observant reader may notice bwsap instructions in the listings above that don’t make sense for x86_64 code. That is because this code is compiled with an LLVM pass that performs byte swapping at load and store points, making it big endian in a limited fashion.

The book Linkers and Loaders has some nice explanation of how relocation works, but I wanted to see the end result first hand in the debugger. It turned out that my naive expectation that the sum of $rip and the constant relocation factor is the address of the global variable (actually static in this case) is incorrect. Check that out in the debugger:

(gdb) p /x 0x200849+$rip
$1 = 0x7ffff781705a

(gdb) x/10 $1
0x7ffff781705a <gy+26>: 0x22110000      0x2b1a4433      0x00004d3c      0x00000000
0x7ffff781706a: 0x00000000      0x00000000      0x00000000      0x30350000
0x7ffff781707a: 0x20333236      0x64655228

My magic value 0x1a2b3c4d looks like it is 6 bytes into the $rip + 0x200849 location that the disassembly appears to point to, and that is in fact the case:

(gdb) x/10 $1+6
0x7ffff7817060 <st32.yst32>:      0x4d3c2b1a      0x00000000      0x00000000      0x00000000
0x7ffff7817070 <y32>:   0x00000000      0x00000000      0x32363035      0x52282033
0x7ffff7817080: 0x48206465      0x34207461

My guess was the mysterious offset of 6 required to actually find this global address was the number of bytes in the MOV instruction, and sure enough that MOV is 6 bytes long:

(gdb) disassemble /r
Dump of assembler code for function st32:
   0x00007ffff7616800 <+0>:     85 ff   test   %edi,%edi
   0x00007ffff7616802 <+2>:     74 0d   je     0x7ffff7616811 <st32+17>
   0x00007ffff7616804 <+4>:     89 f8   mov    %edi,%eax
   0x00007ffff7616806 <+6>:     0f c8   bswap  %eax
   0x00007ffff7616808 <+8>:     89 05 52 08 20 00       mov    %eax,0x200852(%rip)        # 0x7ffff7817060 <st32.yst32>
   0x00007ffff761680e <+14>:    89 f8   mov    %edi,%eax
   0x00007ffff7616810 <+16>:    c3      retq
=> 0x00007ffff7616811 <+17>:    8b 3d 49 08 20 00       mov    0x200849(%rip),%edi        # 0x7ffff7817060 <st32.yst32>
   0x00007ffff7616817 <+23>:    0f cf   bswap  %edi
   0x00007ffff7616819 <+25>:    89 f8   mov    %edi,%eax
   0x00007ffff761681b <+27>:    c3      retq
End of assembler dump.

So, it appears that the %rip reference in the disassembly is really the value of the instruction pointer after the instruction executes, which is curious.

Note that this 4 byte relocation requires the shared library code segment and the shared library data segment be separated by no more than 4G. The linux dynamic loader has put all the segments back to back so that this is the case. This can be seen from /proc/PID/maps for the process:

$ ps -ef | grep maindl
pjoot    17622 17582  0 10:50 pts/3    00:00:00 /home/pjoot/workspace/pass/global/maindl libglobtestbs.so

$ grep libglob /proc/17622/maps
7ffff7616000-7ffff7617000 r-xp 00000000 fc:00 2492653                    /home/pjoot/workspace/pass/global/libglobtestbs.so
7ffff7617000-7ffff7816000 ---p 00001000 fc:00 2492653                    /home/pjoot/workspace/pass/global/libglobtestbs.so
7ffff7816000-7ffff7817000 r--p 00000000 fc:00 2492653                    /home/pjoot/workspace/pass/global/libglobtestbs.so
7ffff7817000-7ffff7818000 rw-p 00001000 fc:00 2492653                    /home/pjoot/workspace/pass/global/libglobtestbs.so

We’ve got a read-execute mmap region, where the code lies, and a read-write mmap region for the data. There’s a read-only segment which I presume is for read only global variables (my shared lib has one such variable and we have one page worth of space allocated for read only memory).

I wonder what the segment that has none of the read, write, nor execute permissions set is?

New book for work: Linkers and Loaders

February 24, 2017 C/C++ development and debugging. No comments , ,

Fresh off the press:

I got this book to get some background on relocation of ELF globals, and was surprised to find a bit on z/OS (punch card compatible!) object format layout:

… an interesting bonus that’s topical.

gdb set target-charset

January 9, 2017 C/C++ development and debugging. No comments , , ,

I was looking for a way to convert ASCII and EBCDIC strings in gdb debugging sessions and was experimenting with gdb python script extensions. I managed to figure out how to add my own command that read a gdb variable, and print it out, but it failed when I tried to run a character conversion function. In the process of debugging that char encoding error, I found that there’s a built in way to do exactly what I wanted to do:

(gdb) p argv[0]
$16 = 0x7fd8fbda0108 "\323\326\303\301\323\305\303\326"
(gdb) set target-charset EBCDIC-US
(gdb) p argv[0]
$17 = 0x7fd8fbda0108 "LOCALECO"
(gdb) set target-charset ASCII
(gdb) p argv[0]
$18 = 0x7fd8fbda0108 "\323\326\303\301\323\305\303\326"

A wierd way to invoke the compiler

November 18, 2016 C/C++ development and debugging. 2 comments ,

Did you know that you can run the compiler from sources specified in stdin?  Here’s an example:

// m.c
#include <stdio.h>

int main()
    int x = 3;
    printf("%d\n", x);
    return 0;

You have to specify the language for the code explicitly, since that can’t be inferred from the filename when that file data is coming from stdin:

$ cat m.c | clang -g -x c - -o f
$ ./f

This fact came up in conversation the other day. The result is something that is completely undebuggable, but you can do it! I’m curious if there’s actually a use case for this?

Another Linux shared library trace facility

October 27, 2016 C/C++ development and debugging. No comments , ,

I previously blogged about a way to force ltrace to show some shared memory trace records that didn’t show up by default.

Where that fails to be useful, is when you don’t have a guess about what shared library the code in question lives in. I just blundered on the latrace command that uses a Linux dynamic loader audit facility to give a complete trace of all the function-name/library-name pairs that are executed!

Here’s an example invocation:

latrace \
clang xx.c -c 2>&1 | c++filt

without output like:

 9022     std::operator&(std::memory_order, std::__memory_order_modifier) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMAnalysis.so]
 9022     std::operator&(std::memory_order, std::__memory_order_modifier) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMAnalysis.so]
 9022     strlen [/lib64/libc.so.6]
 9022     strlen [/lib64/libc.so.6]
 9022     strlen [/lib64/libc.so.6]
 9022     llvm::cl::basic_parser::basic_parser(llvm::cl::Option&) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMSupport.so]
 9022     strlen [/lib64/libc.so.6]
 9022     llvm::cl::Option::setArgStr(llvm::StringRef) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMSupport.so]
 9022     strlen [/lib64/libc.so.6]
 9022     std::pair::__type, std::__decay_and_strip::__type> std::make_pair(void const**&&, bool&&) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMX86CodeGen.so]
 9022       void const**&& std::forward(std::remove_reference::type&) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMX86CodeGen.so]
 9022       bool&& std::forward(std::remove_reference::type&) []
 9022       void const**&& std::forward(std::remove_reference::type&) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMX86CodeGen.so]
 9022       bool&& std::forward(std::remove_reference::type&) []
 9022     void const**&& std::forward(std::remove_reference::type&) [/home/pjoot/clang/be.5e0ac1f.lz31/bin/../lib/libLLVMX86CodeGen.so]

With this latrace command, we get all the shared library function call names and their corresponding shared library names. Using that info we can dig into a specific shared library with ltrace or the debugger, once a point of interest is determined.

using ltrace to dig into shared libraries

October 19, 2016 C/C++ development and debugging., clang/llvm 1 comment , ,

I was trying to find where the clang compiler is writing out constant global data values, and didn’t manage to find it by code inspection. If I run ltrace (also tracing system calls), I see the point where the ELF object is written out:

std::string::compare(std::string const&) const(0x7ffc8983a190, 0x1e32e60, 7, 254) = 5
std::string::compare(std::string const&) const(0x1e32e60, 0x7ffc8983a190, 7, 254) = 0xfffffffb
std::string::compare(std::string const&) const(0x7ffc8983a190, 0x1e32e60, 7, 254) = 5
write@SYS(4, "\177ELF\002\001\001", 848)         = 848
lseek@SYS(4, 40, 0)                              = 40
write@SYS(4, "\220\001", 8)                      = 8
lseek@SYS(4, 848, 0)                             = 848
lseek@SYS(4, 60, 0)                              = 60
write@SYS(4, "\a", 2)                            = 2
lseek@SYS(4, 848, 0)                             = 848
std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()(0x1e2a2e0, 0x1e2a2e8, 0x1e27978, 0x1e27978) = 0
rt_sigprocmask@SYS(2, 0x7ffc8983bb58, 0x7ffc8983bad8, 8) = 0
close@SYS(4)                                     = 0
rt_sigprocmask@SYS(2, 0x7ffc8983bad8, 0, 8)      = 0

This is from running:

ltrace -S --demangle \

The -S is to display syscalls as well as library calls. To my suprise, this seems to show calls to libstdc++ library calls, but I’m not seeing much from clang itself, just:


There’s got to be a heck of a lot more that the compiler is doing!? It turns out that ltrace doesn’t seem to trace out all the library function calls that lie in shared libraries (I’m using a shared library + split dwarf build of clang). The default output was a bit deceptive since I saw some shared lib calls, in particular the there were std::… calls (from libstc++.so) in the ltrace output. My conclusion seems to be that the tool is lying by default.

This can be confirmed by explicitly asking to see the functions from a specific shared lib. For example, if I call ltrace as:

$ ltrace -S --demangle -e @libLLVMX86CodeGen.so \
/clang/be.b226a0a/bin/clang-3.9 \
-cc1 \
-triple \
x86_64-unknown-linux-gnu \

Now I get ~68K calls to libLLVMX86CodeGen.so functions that didn’t show up in the default ltrace output! The ltrace tool won’t show me these by default (although the man page seems to suggest that it should), but if I narrow down what I’m looking through to a single shared lib, at least I can now examine the function calls in that shared lib.


Note that the @lib….so name has to match the SONAME.  For example if the shared libraries on disk were:

libLLVMX86CodeGen.so -> libLLVMX86CodeGen.so.3
libLLVMX86CodeGen.so.3 -> libLLVMX86CodeGen.so.3.9
libLLVMX86CodeGen.so.3.9 -> libLLVMX86CodeGen.so.3.9.0

$ objdump -x libLLVMX86CodeGen.so | grep SONAME

would give you the name to use.  This becomes relevant in clang 4.0 where the SONAME ends up with .so.4 instead of just .so (when building clang with shared libs instead of archive libs).

brace matching in vim, regardless of how it is formatted?

August 31, 2016 C/C++ development and debugging. No comments ,

DB2 functions were usually formatted with the brace on the leading line like so:

size_t table_count( T * table )
   size_t count = 0 ;

For such code, typing [[ in vim anywhere from somewhere in the function text would take you to the beginning of the function. It has always annoyed me that this key sequence didn’t work for functions formatted without the leading { in the first column, such as

size_t table_count( T * table ) { 
    size_t count = 0 ;

Having my handy [[ command sequence take me to the first line of the file is pretty annoying, enough that I looked up the way to do what I want. A key sequence that does part of this job is:


This takes you to the outermost ending position of the current scope, and you can use % to get to the beginning of that scope. You can repeat this as many times as necessary, until you get the outermost scope.

Is there a better way to go directly to the outermost scope directly, regardless of how the function happens to be formatted?

Playing with c++11 and posix regular expression libraries

July 24, 2016 C/C++ development and debugging. No comments , , , , , , , , ,

I was curious how the c++11 std::regex interface compared to the C posix regular expression library. The c++11 interfaces are almost as easy to use as perl. Suppose we have some space separated fields that we wish to manipulate, showing an order switch and the original:

my @strings = ( "hi bye", "hello world", "why now", "one two" ) ;

foreach ( @strings )
   s/(\S+)\s+(\S+)/'$&' -> '$2 $1'/ ;

   print "$_\n" ;

The C++ equivalent is

   const char * strings[] { "hi bye", "hello world", "why now", "one two" } ;

   std::regex re( R"((\S+)\s+(\S+))" ) ;

   for ( auto s : strings )
      std::cout << regex_replace( s, re, "'$&' -> '$2 $1'\n" )  ;

We have one additional step with the C++ code, compiling the regular expression. Precompilation of perl regular expressions is also possible, but that is usually just as performance optimization.

The posix equivalent requires precompilation too

void posixre_error( regex_t * pRe, int rc )
   char buf[ 128 ] ;

   regerror( rc, pRe, buf, sizeof(buf) ) ;

   fprintf( stderr, "regerror: %s\n", buf ) ;
   exit( 1 ) ;

void posixre_compile( regex_t * pRe, const char * expression )
   int rc = regcomp( pRe, expression, REG_EXTENDED ) ;
   if ( rc )
      posixre_error( pRe, rc ) ;

but the transform requires more work:

void posixre_transform( regex_t * pRe, const char * input )
   constexpr size_t N{3} ;
   regmatch_t m[N] {} ;

   int rc = regexec( pRe, input, N, m, 0 ) ;

   if ( rc && (rc != REG_NOMATCH) )
      posixre_error( pRe, rc ) ;

   if ( !rc )
      printf( "'%s' -> ", input ) ;
      int len ;
      len = m[2].rm_eo - m[2].rm_so ; printf( "'%.*s ", len, &input[ m[2].rm_so ] ) ;
      len = m[1].rm_eo - m[1].rm_so ; printf( "%.*s'\n", len, &input[ m[1].rm_so ] ) ;

To get at the capture expressions we have to pass an array of regmatch_t’s. The first element of that array is the entire match expression, and then we get the captures after that. The awkward thing to deal with is that the regmatch_t is a structure containing the start end end offset within the string.

If we want more granular info from the c++ matcher, it can also provide an array of capture info. We can also get info about whether or not the match worked, something we can do in perl easily

my @strings = ( "hi bye", "helloworld", "why now", "onetwo" ) ;

foreach ( @strings )
   if ( s/(\S+)\s+(\S+)/$2 $1/ )
      print "$_\n" ;

This only prints the transformed line if there was a match success. To do this in C++ we can use regex_match

const char * pattern = R"((\S+)\s+(\S+))" ;

std::regex re( pattern ) ;

for ( auto s : strings )
   std::cmatch m ;

   if ( regex_match( s, m, re ) )
      std::cout << m[2] << ' ' << m[1] << '\n' ;

Note that we don’t have to mess around with offsets as was required with the Posix C interface, and also don’t have to worry about the size of the capture match array, since that is handled under the covers. It’s not too hard to do wrap the posix C APIs in a C++ wrapper that makes it about as easy to use as the C++ regex code, but unless you are constrained to using pre-C++11 code and can also live with a Unix only restriction. There are also portability issues with the posix APIs. For example, the perl-style regular expressions like:

   R"((\S+)(\s+)(\S+))" ) ;

work fine with the Linux regex API, but that appears to be an exception. To make code using that regex work on Mac, I had to use strict posix syntax


Actually using the Posix C interface, with a portability constraint that avoids the Linux regex extensions, would be horrendous.

Notes on “memory and resources” of Stroustrup’s “The C++ Programming Language”.

July 21, 2016 C/C++ development and debugging. No comments , , , , , , ,

Some chapter 34 notes.


There’s a fixed size array type designed to replace raw C style arrays. It doesn’t appear that it is bounds checked by default, and the Xcode7 (clang) compiler doesn’t do bounds checking for it right now. Here’s an example

#include <array>

using a10 = std::array<int, 10> ;

void foo( a10 & a )
   a[3] = 7 ;
   a[13] = 7 ;

void bar( int * a )
   a[3] = 7 ;
   a[13] = 7 ;

The generated asm for both of these is identical

$ gobjdump -d --reloc -C --no-show-raw-insn d.o

d.o:     file format mach-o-x86-64

Disassembly of section .text:

0000000000000000 <foo(std::__1::array<int, 10ul>&)>:
   0:   push   %rbp
   1:   mov    %rsp,%rbp
   4:   movl   $0x7,0xc(%rdi)
   b:   movl   $0x7,0x34(%rdi)
  12:   pop    %rbp
  13:   retq   
  14:   data16 data16 nopw %cs:0x0(%rax,%rax,1)

0000000000000020 <bar(int*)>:
  20:   push   %rbp
  21:   mov    %rsp,%rbp
  24:   movl   $0x7,0xc(%rdi)
  2b:   movl   $0x7,0x34(%rdi)
  32:   pop    %rbp
  33:   retq   
  34:   data16 data16 nopw %cs:0x0(%rax,%rax,1)

The foo() function here is also not compile-time bounds checked if the out of bounds access is changed to

   a.at(13) = 7 ;

however, this does at least generate an out of bounds error

$ ./d
libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: array::at
Abort trap: 6

Even though we don’t get compile-time bounds checking (at least with the current clang compiler), array has the nice advantage of knowing its own size, so you can’t screw it up:

void blah( a10 & a )
   a[0] = 1 ;

   for ( int i{1} ; i < a.size() ; i++ )
      a[i] = 2 * a[i-1] ;

bitset and vector bool

The bitset class provides a fixed size bit array that appears to be formed from an array of register sized words. On a 64-bit platform (mac+xcode 7) I’m seeing that sizeof() == 8 for <= 64 bits, and doubles after that for <= 128 bits.

The code for something like the following (set two bits), is pretty decent, basically a single or immediate instruction:

using b70 = std::bitset<70> ;

void foo( b70 & v )
   v[3] = 1 ;
   v[13] = 1 ;

Array access operators are provided to access each bit position:

   for ( int i{} ; i < v.size() ; i++ )
      char sep{ ' ' } ;
      if ( ((i+1) % 8) == 0 )
         sep = '\n' ;

      std::cout << v[i] << sep ;
   std::cout << '\n' ;

There is no range-for support built in for this class. I was able to implement a wrapper that allowed that using a wrapper class

template <int N>
struct iter ;

template <int N>
struct mybits : public std::bitset<N>
   using T = std::bitset<N> ;

   using T::T ;
   using T::size ;

   inline iter<N> begin( ) ;

   inline iter<N> end( ) ;
} ;

and a helper iterator

template <int N>
struct iter
   unsigned pos{} ;
   const mybits<N> & b ;

   iter( const mybits<N> & bits, unsigned p = {} ) : pos{p}, b{bits} {}

   const iter & operator++()
      pos++ ;

      return *this ;

   bool operator != ( const iter & i ) const
      return pos != i.pos ;

   int operator*() const
      return b[ pos ] ;
} ;

plus the begin and end function bodies required for the loop

template <int N>
inline iter<N> mybits<N>::begin( )
   return iter<N>( *this ) ;

template <int N>
inline iter<N> mybits<N>::end( )
   return iter<N>( *this, size() ) ;

I’m not sure what the rationale for not including such range for support is, when std::vector has exactly that? vector is a vector specialization that is also supposed to be compact, but unlike bitset, allows for a variable sized bit array.

bitset also has a number of handy type conversion operators that vector does not (to string, and string to integer)


The std::tuple type generalizes std::pair, allowing for easy structures of N different types.

I saw that tuple has a tie method that allows it to behave very much like a perl array assignment. Such an assignment looks like


my ($a, $b, $c) = foo() ;

printf( "%0.1f $b $c\n", $a ) ;

exit 0 ;

sub foo
   return (1.0, "blah", 3) ;

A similar C++ equivalent is more verbose

#include <tuple>
#include <stdio.h>

using T = std::tuple<float, const char *, int> ;

T foo()
   return std::make_tuple( 1.0, "blah", 3 ) ;

int main()
   float f ;
   const char * k ;
   int i ;

   std::tie( f, k, i ) = foo() ;

   printf("%f %s %d\n", f, k, i ) ;

   return 0 ;

I was curious how the code that accepts a tuple return using tie, using different variables (as above), and using a structure return differed

struct S
   float f ;
   const char * s ;
   int i ;
} ;

S bar()
   return { 1.0, "blah", 3 } ;

In each case, using -O2 and the Xcode 7 compiler (clang), a printf function similar to the above ends up looking pretty much uniformly like:

$ gobjdump -d --reloc -C --no-show-raw-insn u.o 

0000000000000110 <h()>:
 110:   push   %rbp
 111:   mov    %rsp,%rbp
 114:   sub    $0x20,%rsp
 118:   lea    -0x18(%rbp),%rdi
 11c:   callq  121 <h()+0x11>
                        11d: BRANCH32   foo()
 121:   mov    -0x10(%rbp),%rsi
 125:   mov    -0x8(%rbp),%edx
 128:   movss  -0x18(%rbp),%xmm0
 12d:   cvtss2sd %xmm0,%xmm0
 131:   lea    0xd(%rip),%rdi        # 145 <h()+0x35>
                        134: DISP32     .cstring-0x145
 138:   mov    $0x1,%al
 13a:   callq  13f <h()+0x2f>
                        13b: BRANCH32   printf
 13f:   add    $0x20,%rsp
 143:   pop    %rbp
 144:   retq   

The generated code is pretty much dominated by the stack pushing required for the printf call. I used printf here instead of std::cout because the generated code for std::cout is so crappy looking (and verbose).


Reading the section on shared_ptr, it wasn’t obvious that it was a thread safe interface. I wondered if some sort of specialization was required to make the reference counting thread safe. It appears that thread safety is built in

This can also be seen in the debugger (assuming the gcc libstdc++ is representitive)

Breakpoint 1, main () at sharedptr.cc:33
33    std::shared_ptr<T> p = std::make_shared<T>() ;
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.5-4.el7.x86_64 libstdc++-4.8.5-4.el7.x86_64
(gdb) n
35    foo( p ) ;
(gdb) s
std::shared_ptr<T>::shared_ptr (this=0x7fffffffe060) at /usr/include/c++/4.8.2/bits/shared_ptr.h:103
103         shared_ptr(const shared_ptr&) noexcept = default;
(gdb) s
std::__shared_ptr<T, (__gnu_cxx::_Lock_policy)2>::__shared_ptr (this=0x7fffffffe060) at /usr/include/c++/4.8.2/bits/shared_ptr_base.h:779
779         __shared_ptr(const __shared_ptr&) noexcept = default;
(gdb) s
std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count (this=0x7fffffffe068, __r=...)
    at /usr/include/c++/4.8.2/bits/shared_ptr_base.h:550
550         : _M_pi(__r._M_pi)
(gdb) s
552      if (_M_pi != 0)
(gdb) s
553        _M_pi->_M_add_ref_copy();
(gdb) s
std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_add_ref_copy (this=0x607010) at /usr/include/c++/4.8.2/bits/shared_ptr_base.h:131
131         { __gnu_cxx::__atomic_add_dispatch(&_M_use_count, 1); }

This was looking at a call of the following form

using Tp = std::shared_ptr<T> ;

void foo( Tp p ) ;

int main()
   std::shared_ptr<T> p = std::make_shared<T>() ;

   foo( p ) ;

   return 0 ;