I just watched an Intel-at-Google talk on the memory model, from right around the time they were putting the much-clarified memory ordering text into the ISA document.

These slides and the talk do a nice job of explaining the significance of a somewhat obscure point in the ISA document:

  • Locked instructions have a total order.

(See ISA Vol 3a-part1, chapter 8.2 Multiple-Processor Management)

The subtlety of this point previously escaped me.  The implication is that the effects of LOCKed instructions, even ones to different memory addresses, will always be observed by all processors in the same order.  I’ve always thought of Intel LOCKed instructions as a way of implementing something like a PowerPC LWARX/STWCX. pair, but that PowerPC instruction pair has no implied ordering with respect to any address other than the one it operates on.  On PowerPC such ordering is really only possible in a pairwise fashion, by inserting fencing instructions (LWSYNC and ISYNC typically).  I’m not sure of a way, on PowerPC, of obtaining a total ordering like the one the Intel LOCKed instructions provide.
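One way to see the practical effect of that total order is the classic store-buffering litmus test. The sketch below is my own illustration, not something taken from the talk or the ISA document; the function name `count_both_zero` is made up for the example. It assumes a compiler that lowers a sequentially consistent `exchange` to an implicitly locked XCHG on x86, which is typical but not guaranteed by the C++ standard. Each thread does a locked write to its own address and then reads the other thread's address. Because the two locked writes fall into one total order, at least one thread has to see the other's write, so both reads returning 0 should never happen.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the store-buffering litmus test `iterations` times and returns how
// many times both threads read 0 (the outcome a total order rules out).
// The name and shape of this helper are hypothetical, for illustration only.
int count_both_zero(int iterations) {
    assert(iterations >= 0);
    int both_zero = 0;

    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int r1 = -1;
        int r2 = -1;

        std::thread t1([&] {
            // Locked write to x (typically XCHG on x86), then read y.
            x.exchange(1, std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_seq_cst);
        });
        std::thread t2([&] {
            // Locked write to y, a different address, then read x.
            y.exchange(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });

        t1.join();
        t2.join();

        // join() synchronizes with thread completion, so r1 and r2 are
        // safe to read here without further atomics.
        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

Calling `count_both_zero(2000)` should return 0. If the two `exchange` calls were replaced with relaxed plain stores, the both-zero outcome would become permitted, which is the kind of reordering the PowerPC LWARX/STWCX. pair alone does nothing to prevent across different addresses.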