Month: October 2016

How to invoke the 2nd pass of the clang compiler manually

October 3, 2016 clang/llvm , , , , ,

Because the clang front end reexecs itself, breakpoints on the interesting parts of the clang front end don’t get hit by default. Here’s an example

$ cat g2
b llvm::Module::setDataLayout
b BackendConsumer::BackendConsumer
b llvm::TargetMachine::TargetMachine
b llvm::TargetMachine::createDataLayout
run -mbig-endian -m64 -c bytes.c -emit-llvm -o big.bc

$ gdb `which clang`
GNU gdb (GDB) Red Hat Enterprise Linux 7.9.1-19.lz.el7
...
(gdb) source g2
Breakpoint 1 at 0x2c04c3d: llvm::Module::setDataLayout. (2 locations)
Breakpoint 2 at 0x3d08870: file /source/llvm/lib/Target/TargetMachine.cpp, line 47.
Breakpoint 3 at 0x33108ca: file /source/llvm/include/llvm/Target/TargetMachine.h, line 133.
...
Detaching after vfork from child process 15795.
[Inferior 1 (process 15789) exited normally]

(The debugger finishes and exits, hitting none of the breakpoints)

One way to deal with this is to set the fork mode to child:

(gdb) set follow-fork-mode child

An alternate way of dealing with this is to use strace to collect the command line that clang invokes itself with. For example:

$ strace -f -s 1024 -v clang -mbig-endian -m64 big.bc -c 2>&1 | grep exec | tail -2 | head -1

This provides the command line options for the self invocation of clang

[pid  4650] execve("/usr/local/bin/clang-3.9", ["/usr/local/bin/clang-3.9", "-cc1", "-triple", "aarch64_be-unknown-linux-gnu", "-emit-obj", "-mrelax-all", "-disable-free", "-main-file-name", "big.bc", "-mrelocation-model", "static", "-mthread-model", "posix", "-mdisable-fp-elim", "-fmath-errno", "-masm-verbose", "-mconstructor-aliases", "-fuse-init-array", "-target-cpu", "generic", "-target-feature", "+neon", "-target-abi", "aapcs", "-dwarf-column-info", "-debugger-tuning=gdb", "-coverage-file", "/workspace/pass/run/big.bc", "-resource-dir", "/usr/local/bin/../lib/clang/3.9.0", "-fdebug-compilation-dir", "/workspace/pass/run", "-ferror-limit", "19", "-fmessage-length", "0", "-fallow-half-arguments-and-returns", "-fno-signed-char", "-fobjc-runtime=gcc", "-fdiagnostics-show-option", "-o", "big.o", "-x", "ir", "big.bc"],

With a bit of vim tweaking you can turn this into a command line that can be executed (or debugged) directly

/usr/local/bin/clang-3.9 -cc1 -triple aarch64_be-unknown-linux-gnu -emit-obj -mrelax-all -disable-free -main-file-name big.bc -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -fuse-init-array -target-cpu generic -target-feature +neon -target-abi aapcs -dwarf-column-info -debugger-tuning=gdb -coverage-file /workspace/pass/run/big.bc -resource-dir /usr/local/bin/../lib/clang/3.9.0 -fdebug-compilation-dir /workspace/pass/run -ferror-limit 19 -fmessage-length 0 -fallow-half-arguments-and-returns -fno-signed-char -fobjc-runtime=gcc -fdiagnostics-show-option -o big.o -x ir big.bc

Note that doing this also provides a mechanism to change the compiler triple manually, which is something that I wondered how to do (since clang documents -triple as an option, but seems to ignore it). For example, I’m able to able to change -triple aarch64_be to aarch64 and get little endian object code from bytecode prepared with -mbig-endian.

speeding up clang debug and builds

October 2, 2016 clang/llvm , , , , , , ,

I found the default static library configuration of clang slow to rebuild, so I started building it with in shared mode. That loaded pretty slow in gdb, so I went looking for how to enable split dwarf, and found a nice little presentation on how to speed up clang builds.

There’s a followup blog post with some speed up conclusions.

A failure of that blog post is actually listing the cmake commands required to build with all these tricks. Using all these tricks listed there, I’m now trying the following:

mkdir -p ~/freeware
cd ~/freeware

git clone git://sourceware.org/git/binutils-gdb.git
cd binutils-gdb
./configure --prefix=$HOME/local/binutils.gold --enable-gold=default
make 
make install

cd ..
git clone git://github.com/ninja-build/ninja.git 
cd ninja
./configure.py --bootstrap
mkdir -p ~/local/ninja/bin/
cp ninja ~/local/ninja/bin/

With ninja in my PATH, I can now build clang with:

CC=clang CXX=clang++ \
cmake -G Ninja \
../llvm \
-DLLVM_USE_SPLIT_DWARF=TRUE \
-DLLVM_ENABLE_ASSERTIONS=TRUE \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_INSTALL_PREFIX=$HOME/clang39.be \
-DCMAKE_SHARED_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index' \
-DCMAKE_EXE_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index' \
-DBUILD_SHARED_LIBS=true \
-DLLVM_TARGETS_TO_BUILD=X86 \
2>&1 | tee o

ninja

ninja install

This does build way faster, both for full builds and incremental builds.

Build tree size

Dynamic libraries: 4.4 Gb. Static libraries: 19.8Gb.

Installed size

Dynamic libraries: 0.7 Gb. Static libraries: 14.7Gb.

Results: full build time.

Static libraries, non-ninja, all backends:

real    51m6.494s
user    160m47.027s
sys     8m49.429s

Dynamic libraries, ninja, split dwarf, x86 backend only:

real    26m19.360s
user    86m11.477s
sys     3m14.478s

Results: incremental build. touch lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp.

Static libraries, non-ninja, all backends:

real    2m17.709s
user    6m8.648s
sys     0m28.594s

Dynamic libraries, ninja, split dwarf, x86 backend only:

real    0m3.245s
user    0m6.104s
sys     0m0.802s

make install times

make:

real    2m6.353s
user    0m7.827s
sys     0m15.316s

ninja:

real    0m2.138s
user    0m0.420s
sys     0m0.831s

The time for rerunning a sharedlib-config ‘ninja install’ is even faster!

Results: time for gdb, b main, run, quit

Static libraries:

real    0m45.904s
user    0m32.376s
sys     0m1.787s

Dynamic libraries, with split dwarf:

real    0m44.440s
user    0m37.096s
sys     0m1.067s

This one isn’t what I would have expected. The initial gdb load time for the split-dwarf exe is almost instantaneous, however it still takes a long time to break in main and continue to that point. I guess that we are taking the hit for a lot of symbol lookup at that point, so it comes out as a wash.

Thinking about this, I noticed that the clang make system doesn’t seem to add ‘-Wl,-gdb-index’ to the link step along with the addition of -gsplit-dwarf to the compilation command line. I thought that was required to get all the deferred symbol table lookup?

Attempting to do so, I found that the insertion of an alternate linker in my PATH wasn’t enough to get clang to use it. Adding –Wl,–gdb-index into the link flags caused complaints from /usr/bin/ld! The cmake magic required was:

-DCMAKE_SHARED_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index' \
-DCMAKE_EXE_LINKER_FLAGS="-B$HOME/local/binutils.gold/bin -Wl,--gdb-index' \

This is in the first cmake invocation flags above, but wasn’t used for my initial 45s gdb+clang time measurements. With –gdb-index, the time for the gdb b-main, run, quit sequence is now reduced to:

real    0m10.268s
user    0m3.623s
sys     0m0.429s

A 4x reduction, which is quite nice!

Helmholtz theorem

October 1, 2016 math and physics play , , , , , , , , , , , , , ,

[Click here for a PDF of this post with nicer formatting]

This is a problem from ece1228. I attempted solutions in a number of ways. One using Geometric Algebra, one devoid of that algebra, and then this method, which combined aspects of both. Of the three methods I tried to obtain this result, this is the most compact and elegant. It does however, require a fair bit of Geometric Algebra knowledge, including the Fundamental Theorem of Geometric Calculus, as detailed in [1], [3] and [2].

Question: Helmholtz theorem

Prove the first Helmholtz’s theorem, i.e. if vector \(\BM\) is defined by its divergence

\begin{equation}\label{eqn:helmholtzDerviationMultivector:20}
\spacegrad \cdot \BM = s
\end{equation}

and its curl
\begin{equation}\label{eqn:helmholtzDerviationMultivector:40}
\spacegrad \cross \BM = \BC
\end{equation}

within a region and its normal component \( \BM_{\textrm{n}} \) over the boundary, then \( \BM \) is
uniquely specified.

Answer

The gradient of the vector \( \BM \) can be written as a single even grade multivector

\begin{equation}\label{eqn:helmholtzDerviationMultivector:60}
\spacegrad \BM
= \spacegrad \cdot \BM + I \spacegrad \cross \BM
= s + I \BC.
\end{equation}

We will use this to attempt to discover the relation between the vector \( \BM \) and its divergence and curl. We can express \( \BM \) at the point of interest as a convolution with the delta function at all other points in space

\begin{equation}\label{eqn:helmholtzDerviationMultivector:80}
\BM(\Bx) = \int_V dV’ \delta(\Bx – \Bx’) \BM(\Bx’).
\end{equation}

The Laplacian representation of the delta function in \R{3} is

\begin{equation}\label{eqn:helmholtzDerviationMultivector:100}
\delta(\Bx – \Bx’) = -\inv{4\pi} \spacegrad^2 \inv{\Abs{\Bx – \Bx’}},
\end{equation}

so \( \BM \) can be represented as the following convolution

\begin{equation}\label{eqn:helmholtzDerviationMultivector:120}
\BM(\Bx) = -\inv{4\pi} \int_V dV’ \spacegrad^2 \inv{\Abs{\Bx – \Bx’}} \BM(\Bx’).
\end{equation}

Using this relation and proceeding with a few applications of the chain rule, plus the fact that \( \spacegrad 1/\Abs{\Bx – \Bx’} = -\spacegrad’ 1/\Abs{\Bx – \Bx’} \), we find

\begin{equation}\label{eqn:helmholtzDerviationMultivector:720}
\begin{aligned}
-4 \pi \BM(\Bx)
&= \int_V dV’ \spacegrad^2 \inv{\Abs{\Bx – \Bx’}} \BM(\Bx’) \\
&= \gpgradeone{\int_V dV’ \spacegrad^2 \inv{\Abs{\Bx – \Bx’}} \BM(\Bx’)} \\
&= -\gpgradeone{\int_V dV’ \spacegrad \lr{ \spacegrad’ \inv{\Abs{\Bx – \Bx’}}} \BM(\Bx’)} \\
&= -\gpgradeone{\spacegrad \int_V dV’ \lr{
\spacegrad’ \frac{\BM(\Bx’)}{\Abs{\Bx – \Bx’}}
-\frac{\spacegrad’ \BM(\Bx’)}{\Abs{\Bx – \Bx’}}
} } \\
&=
-\gpgradeone{\spacegrad \int_{\partial V} dA’
\ncap \frac{\BM(\Bx’)}{\Abs{\Bx – \Bx’}}
}
+\gpgradeone{\spacegrad \int_V dV’
\frac{s(\Bx’) + I\BC(\Bx’)}{\Abs{\Bx – \Bx’}}
} \\
&=
-\gpgradeone{\spacegrad \int_{\partial V} dA’
\ncap \frac{\BM(\Bx’)}{\Abs{\Bx – \Bx’}}
}
+\spacegrad \int_V dV’
\frac{s(\Bx’)}{\Abs{\Bx – \Bx’}}
+\spacegrad \cdot \int_V dV’
\frac{I\BC(\Bx’)}{\Abs{\Bx – \Bx’}}.
\end{aligned}
\end{equation}

By inserting a no-op grade selection operation in the second step, the trivector terms that would show up in subsequent steps are automatically filtered out. This leaves us with a boundary term dependent on the surface and the normal and tangential components of \( \BM \). Added to that is a pair of volume integrals that provide the unique dependence of \( \BM \) on its divergence and curl. When the surface is taken to infinity, which requires \( \Abs{\BM}/\Abs{\Bx – \Bx’} \rightarrow 0 \), then the dependence of \( \BM \) on its divergence and curl is unique.

In order to express final result in traditional vector algebra form, a couple transformations are required. The first is that

\begin{equation}\label{eqn:helmholtzDerviationMultivector:800}
\gpgradeone{ \Ba I \Bb } = I^2 \Ba \cross \Bb = -\Ba \cross \Bb.
\end{equation}

For the grade selection in the boundary integral, note that

\begin{equation}\label{eqn:helmholtzDerviationMultivector:740}
\begin{aligned}
\gpgradeone{ \spacegrad \ncap \BX }
&=
\gpgradeone{ \spacegrad (\ncap \cdot \BX) }
+
\gpgradeone{ \spacegrad (\ncap \wedge \BX) } \\
&=
\spacegrad (\ncap \cdot \BX)
+
\gpgradeone{ \spacegrad I (\ncap \cross \BX) } \\
&=
\spacegrad (\ncap \cdot \BX)

\spacegrad \cross (\ncap \cross \BX).
\end{aligned}
\end{equation}

These give

\begin{equation}\label{eqn:helmholtzDerviationMultivector:721}
\boxed{
\begin{aligned}
\BM(\Bx)
&=
\spacegrad \inv{4\pi} \int_{\partial V} dA’ \ncap \cdot \frac{\BM(\Bx’)}{\Abs{\Bx – \Bx’}}

\spacegrad \cross \inv{4\pi} \int_{\partial V} dA’ \ncap \cross \frac{\BM(\Bx’)}{\Abs{\Bx – \Bx’}} \\
&-\spacegrad \inv{4\pi} \int_V dV’
\frac{s(\Bx’)}{\Abs{\Bx – \Bx’}}
+\spacegrad \cross \inv{4\pi} \int_V dV’
\frac{\BC(\Bx’)}{\Abs{\Bx – \Bx’}}.
\end{aligned}
}
\end{equation}

References

[1] C. Doran and A.N. Lasenby. Geometric algebra for physicists. Cambridge University Press New York, Cambridge, UK, 1st edition, 2003.

[2] A. Macdonald. Vector and Geometric Calculus. CreateSpace Independent Publishing Platform, 2012.

[3] Garret Sobczyk and Omar Le’on S’anchez. Fundamental theorem of calculus. Advances in Applied Clifford Algebras, 21:221–231, 2011. URL https://arxiv.org/abs/0809.4526.