I was faced with hundreds of merge conflicts that had the following diff3 -m conflict structure:


<<<<<<< file.C.mine
   /* Allocate memory for ProcNamePattern and memset to blank */
   /* ProcNamePattern is used as an intermediate upper case string to capture the procedure name*/
   /* Allocate space for 128 byte schema, 128 byte procedure name */
   rc = BAR(0,
            len+SCHEMA_IDENT+1,
            MEM_DEFAULT,
            &ProcNamePattern);
||||||| file.C.orig
   /* Allocate memory for ProcNamePattern and memset to blank */
   /* ProcNamePattern is used as an intermediate upper case string to capture the procedure name*/
   /* Allocate space for 128 byte schema, 128 byte procedure name */
   rc = FOO(0,
            len+SCHEMA_IDENT+1,
            (void **) &ProcNamePattern);
=======
   // Allocate memory for ProcNamePattern and memset to blank
   // ProcNamePattern is used as an intermediate upper case string to capture the procedure name
   // Allocate space for 128 byte schema, 128 byte procedure name
   rc = FOO(0,
            len+SCHEMA_IDENT+1,
            (void **) &ProcNamePattern);
>>>>>>> file.C.new
   if (rc  )
   {


I’d run a clang based source editing tool that changed FOO to BAR, added a parameter, and removed a cast. Other maintainers of the code had run a tool, or perhaps an editor macro that changed most (but not all) of the C style /* … */ comments into C++ single line comments // …

Those pairs of changes were unfortunately close enough to generate a diff3 -m conflict.

I can run my clang editing tool again (and will), but need to get the source compile-able first, so was faced with either tossing and regenerating my changes, or resolving the conflicts. Basically I needed to filter these comments in the same fashion, and then accept all of my changes, provided there were no other changes in the .orig -> .new stream.

Here’s a little perl filter I wrote for this task:

#!/usr/bin/perl -n

# a script to replace single line /* */ comments with C++ // comments in a restricted fashion.
#
# - doesn't touch comments of the form:
#                                         /* ... */ ... /* ...
#                                         ^^            ^^
# - doesn't touch comments with leading non-whitespace or trailing non-whitespace
#
# This is used to filter new/old/mine triplets in merges to deal with automated replacements of this sort.

chomp ;

if ( ! m,/\*.*/\*, )
{
   s,
^(\s*)   # restrict replacement to comments that start only after beginning of line and whitespace
/\*      # start of comment
\s*      # opt spaces
(.*)     # payload
\s*      # opt spaces
\*/      # end of comment
\s*$     # opt spaces and end of line
,$1// $2,x ;
}

#s,/\* *(.*)(?=\*/ *$)\*/ *$,// $1, ;

print "$_\n" ;

This consumes stdin, and spits out stdout, making the automated change that had been applied to the code. I didn’t want it to do anything with comments of any of the forms:

  • [non-whitespace] /* … */
  • /* … */ … non-whitespace
  • /* … */ … /* … */

Since the comment filtering that’s now in the current version of the files didn’t do this, and I didn’t want to introduce more conflicts due to spacing changes.

With this filter run on all the .mine, .orig, .new versions I was able to rerun

diff3 -m file.mine file.orig file.new

and have only a few actual conflicts to deal with. Most of those were also due to space change and in some cases comment removal.

A lot of this trouble stems from the fact that our product has no coding standards for layout (or very little, or ones that are component specific). I maintain our coding standards for correctness, but when I was given these standards as fairly green developer I didn’t have the guts to take on the role of coding standards dictator, and impose style guidelines on developers very much my senior.

Without style guidelines, a lot of these sorts of merge conflicts could be avoided or minimized significantly if we would only make automated changes with tools that everybody could (or must) run. That would allow conflict filtering of those sort to be done automatically, without having to write ad-hoc tools to “re-play” the automated change in a subset of the merge contributors.

My use of a clang rewriter flies in the face of this ideal conflict avoidance strategy since our build environment is not currently tooled up for our development to do so. However, in this case, being able to do robust automated maintenance ill hopefully be worth the conflicts that this itself will inject.