clang

stack corruption detection with clang: safe-stack notes

June 10, 2016 C/C++ development and debugging. , , ,

At LZ our development and nightly builds are done with clang, so it is of interest to check out what stack protection checking compiler options are available.  DB2 LUW used the intel compiler, which had very nice stack clobbering code.  How does clang’s fair against the intel compiler in this respect?

Clang does support a safe-stack option.  Here’s an example of some stack corrupting code:

#include <string.h>

void corrupt( char * b ) ;

int main()
{
   char b[12] ;

   corrupt( b ) ;

   return 0 ;
}

void corrupt( char * b )
{
   memset( b - 4, 'a', 20 ) ;
}

Running this without safe stack results in a SEGV on return from from corrupt():

Screen Shot 2016-06-10 at 11.08.54 AM

with safe-stack we have a “nicer” trap:

(gdb) run
Starting program: /home/pjoot/lznotes/proto/stackcorrupt2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
__memset_sse2 () at ../sysdeps/x86_64/memset.S:415
415     L(P4Q0): mov    %edx,-0x4(%rdi)
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.3-9.el7.x86_64 libstdc++-4.8.3-9.el7.x86_64
(gdb) where
#0  __memset_sse2 () at ../sysdeps/x86_64/memset.S:415
#1  0x0000000000411835 in corrupt (
    b=0x7ffff6bd2ff4 'a' <repeats 12 times>, "\177ELF\002\001\001\003")
    at stackcorrupt2.cc:16
#2  0x00000000004117f1 in main () at stackcorrupt2.cc:9

The compiler is able to catch the corruption in the act, right in the offending memset.

Limitations

What I do notice with this compiler option, is that the implementation has opted not to catch corruptions within the valid stack frame, nor is there any attempt to catch a corruption that does not walk over the return pointer. Here’s an example:

#include <string.h>

void corrupt( char * b ) ;

#define SZ 12
#if 0
   // safe-stack catches this:
   #define PRESZ 0
   #define POSTSZ 4
#else
   // safe-stack catches this buffer overwrite
   #define PRESZ 4
   #define POSTSZ 0
#endif
int main()
{
   char b[SZ] ;

   corrupt( b ) ;

   return 0 ;
}

void corrupt( char * b )
{
   memset( b - PRESZ, 'a', SZ + PRESZ + POSTSZ ) ;
}

and another non-trapping stack corruption:

#include <stdio.h>
#include <string.h>

void corrupt2( int & a, char * b, int & c ) ;

int main()
{
   int a = 0 ;
   char b[12] ;
   int c = 0 ;

   corrupt2( a, b, c ) ;

   printf( "0x%08X 0x%08X\n", a, c ) ;

   return 0 ;
}

void corrupt2( int & a, char * b, int & c )
{
   memset( b - 4, 'a', 20 ) ;
}

The intel compiler appeared to use guard bytes between stack variables, and was able to tell you exactly which stack variable was overwritten. Clang appears to be opting for a write-forbidden guard region on the stack frame, so it able to catch the corruption in the act, but only if the corruption is “big enough”. There are benefits of both approaches. Unfortunately, there are a number of restrictions in the safe-stack documentation. I’m not sure that I’ll be able to use this at all in LZ code.

First build break at the new job: C++ uniform initialization

May 12, 2016 C/C++ development and debugging. , , , , ,

Development builds at LZ are done with clang-3.8, but there is an alternate nightly build done with the older RHEL7 GCC-4.8.3 compiler (gcc is up to 6.1 now, so the RHEL7 default is truly _ancient_). This bit of code didn’t compile with gcc:

   template <typename mutex_type>
   class shared_lock
   {  
      mutex_type &      m_mutex ;

   public:

      /** construct and acquire the mutex in shared mode */
      explicit shared_lock( mutex_type & mutex )
         : m_mutex{ mutex }
      {  

The error is:

error: invalid initialization of non-const reference of type ‘lz::shared_mutex&’ from an rvalue of type ‘<brace-enclosed initializer list>’

This seems like a compiler bug to me, one that I’d seen when doing my scinet scientific computing course, which mandated the use of at least -std=c++11. In the scinet assignments, I fixed all such issues by using -std=c++14, which worked fine, but I was using gcc-5.3 for those assignments.

It appears that this is a compiler bug, and not just an issue with the c++11 language specification, as I initially thought while doing my scinet assignments. If I rebuild this code with g++-6.1, explicitly specifying -std=c++11 (GCC 6.1 defaults to c++14), then the issue goes away, so specification of -std=c++14 is not required to allow uniform initialization to work in this situation.

Because of being forced to use the older compiler, it looks like I have to fix this by using pre-c++11 syntax:

      explicit shared_lock( mutex_type & mutex )
         : m_mutex( mutex )

My conclusion is that gcc-4.8.3 is not truly up to the job of building c++11 compliant code. I’ll have to be more careful with the language features that I use in the future.

extern vs const in C++ and C code.

October 5, 2015 C/C++ development and debugging. , , , , , ,

We now build DB2 on linux ppcle with the IBM xlC 13.1.2 compiler. This version of the compiler is a hybrid compared to any previous compilers, retaining the IBM xlC backend for power, but using the clang front end. Because of this we are exposed to a large number of warnings that we don’t see with many other compilers (well we probably do for our MacOSX port, but we do not really have active development on that platform at the moment), and I’ve been trying to take down those counts to manageable levels. Header files that produce warnings have been my first target since they introduce the most repeated noise.

One message that I was seeing hundreds of was

warning: 'extern' variable has an initializer [-Wextern-initializer]

This seemed to be coming from headers that did something like:

#if defined FOO_INITIALIZE_IT_IN_SOME_SOURCE_FILE
extern const TYPE foo[] = { ... } ;
#else
extern const TYPE foo[] ;
#endif


where FOO_INITIALIZE_IT_IN_SOME_SOURCE_FILE is defined at the top of a source file that explicitly includes this header. My attempt to handle the messages was to remove the ‘extern’ from the initialization case, but I was suprised to see link errors as a result of some of those changes. It turns out that there are some subtle differences between different variations of const and extern with an array declaration of this sort. Here’s a bit of sample code:

// t.h
extern const int x[] ;
extern int y[] ;
extern int z[] ;


// t.cc
#if defined WANT_LINK_ERROR
const int x[] = { 42 } ;
#else
extern const int x[] = { 42 } ;
#endif

extern int y[] = { 42 } ;
int z[] = { 42 } ;


When WANT_LINK_ERROR isn’t defined, this produces just one clang warning message

t.cc:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
           ^

Note that the ‘extern const’ has no such warning, nor does the non-const symbol that’s been declared ‘extern’ in the header. However, removing the extern from the const case (via -DWANT_LINK_ERROR) results in no symbol ‘x’ available to other consumers. The extern is required for const symbols, but generates a warning for non-const symbols.

It appears that this is also C++ specific. A const symbol in C compiled code is available for external use, regardless of whether extern is used:



$ clang -c t.c
t.c:5:18: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern const int x[] = { 42 } ;
                 ^
t.c:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
           ^
2 warnings generated.

$ nm t.o
0000000000000000 R x
0000000000000000 D y
0000000000000004 D z

$ clang -c -DWANT_LINK_ERROR t.c
t.c:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
           ^
1 warning generated.
$  nm t.o
0000000000000000 R x
0000000000000000 D y
0000000000000004 D z


whereas that same symbol requires extern if it is const in C++:


$ clang++ -c t.cc
t.cc:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
           ^
1 warning generated.
$ nm t.o
0000000000000000 R x
0000000000000000 D y
0000000000000004 D z



$ clang++ -c -DWANT_LINK_ERROR t.cc
t.cc:8:12: warning: 'extern' variable has an initializer [-Wextern-initializer]
extern int y[] = { 42 } ;
           ^
1 warning generated.
$ nm t.o
0000000000000000 D y
0000000000000004 D z


I hadn’t expected the const to interact this way with extern. I am guessing that C++ allows for the compiler to not generate symbols for global scope const variables, unless you ask for that by using extern, whereas with C you get the symbol like-it-or-not. This particular message from the clang front end is only for non-const extern initializations, making across the board fixing of messages for extern initialization of the sort above trickier. This makes it so that you can’t do an across the board replacement of extern in initializers for a given file without first ensuring that the symbol isn’t const. It looks like dealing with this will have to be done much more carefully than I first tried.

resolving merge conflicts due to automated C to C++ comment changes

November 17, 2014 C/C++ development and debugging. , , , ,

I was faced with hundreds of merge conflicts that had the following diff3 -m conflict structure:


<<<<<<< file.C.mine
   /* Allocate memory for ProcNamePattern and memset to blank */
   /* ProcNamePattern is used as an intermediate upper case string to capture the procedure name*/
   /* Allocate space for 128 byte schema, 128 byte procedure name */
   rc = BAR(0,
            len+SCHEMA_IDENT+1,
            MEM_DEFAULT,
            &ProcNamePattern);
||||||| file.C.orig
   /* Allocate memory for ProcNamePattern and memset to blank */
   /* ProcNamePattern is used as an intermediate upper case string to capture the procedure name*/
   /* Allocate space for 128 byte schema, 128 byte procedure name */
   rc = FOO(0,
            len+SCHEMA_IDENT+1,
            (void **) &ProcNamePattern);
=======
   // Allocate memory for ProcNamePattern and memset to blank
   // ProcNamePattern is used as an intermediate upper case string to capture the procedure name
   // Allocate space for 128 byte schema, 128 byte procedure name
   rc = FOO(0,
            len+SCHEMA_IDENT+1,
            (void **) &ProcNamePattern);
>>>>>>> file.C.new
   if (rc  )
   {


I’d run a clang based source editing tool that changed FOO to BAR, added a parameter, and removed a cast. Other maintainers of the code had run a tool, or perhaps an editor macro that changed most (but not all) of the C style /* … */ comments into C++ single line comments // …

Those pairs of changes were unfortunately close enough to generate a diff3 -m conflict.

I can run my clang editing tool again (and will), but need to get the source compile-able first, so was faced with either tossing and regenerating my changes, or resolving the conflicts. Basically I needed to filter these comments in the same fashion, and then accept all of my changes, provided there were no other changes in the .orig -> .new stream.

Here’s a little perl filter I wrote for this task:

#!/usr/bin/perl -n

# a script to replace single line /* */ comments with C++ // comments in a restricted fashion.
#
# - doesn't touch comments of the form:
#                                         /* ... */ ... /* ...
#                                         ^^            ^^
# - doesn't touch comments with leading non-whitespace or trailing non-whitespace
#
# This is used to filter new/old/mine triplets in merges to deal with automated replacements of this sort.

chomp ;

if ( ! m,/\*.*/\*, )
{
   s,
^(\s*)   # restrict replacement to comments that start only after beginning of line and whitespace
/\*      # start of comment
\s*      # opt spaces
(.*)     # payload
\s*      # opt spaces
\*/      # end of comment
\s*$     # opt spaces and end of line
,$1// $2,x ;
}

#s,/\* *(.*)(?=\*/ *$)\*/ *$,// $1, ;

print "$_\n" ;

This consumes stdin, and spits out stdout, making the automated change that had been applied to the code. I didn’t want it to do anything with comments of any of the forms:

  • [non-whitespace] /* … */
  • /* … */ … non-whitespace
  • /* … */ … /* … */

Since the comment filtering that’s now in the current version of the files didn’t do this, and I didn’t want to introduce more conflicts due to spacing changes.

With this filter run on all the .mine, .orig, .new versions I was able to rerun

diff3 -m file.mine file.orig file.new

and have only a few actual conflicts to deal with. Most of those were also due to space change and in some cases comment removal.

A lot of this trouble stems from the fact that our product has no coding standards for layout (or very little, or ones that are component specific). I maintain our coding standards for correctness, but when I was given these standards as fairly green developer I didn’t have the guts to take on the role of coding standards dictator, and impose style guidelines on developers very much my senior.

Without style guidelines, a lot of these sorts of merge conflicts could be avoided or minimized significantly if we would only make automated changes with tools that everybody could (or must) run. That would allow conflict filtering of those sort to be done automatically, without having to write ad-hoc tools to “re-play” the automated change in a subset of the merge contributors.

My use of a clang rewriter flies in the face of this ideal conflict avoidance strategy since our build environment is not currently tooled up for our development to do so. However, in this case, being able to do robust automated maintenance ill hopefully be worth the conflicts that this itself will inject.