Consider using xchg instead of mfence for CS stores
|Reported by:||tibbe||Owned by:|
|Type of failure:||Runtime performance bug||Test Case:|
|Related Tickets:||Differential Revisions:|
To get sequential consistency for atomicWriteIntArray# we use an mfence instruction. An alternative is to use an xchg instruction (which has an implicit lock prefix), which might have lower latency. We should check what other compilers do.