I've compiled and analyzed the assembly output for:

    #include <atomic>

    struct S { public: int a, b, c, d, e, f, g, h, i, j, k; };

    int main() {
        S s;
        std::atomic<S> as;
        as.store(s);
        return 0;
    }

I want to see how the atomic store is really implemented. It is easy when it comes to aligned "small" operands, but here we have a wider operand, so the situation is more complicated.
In another question (Atomicity on x86), @Peter Cordes said:
For wider operands, like atomically writing new data into multiple entries of a struct, you need to protect it with a lock which all accesses to it respect. (You may be able to use x86 lock cmpxchg16b with a retry loop to do an atomic 16b store. Note that there's no way to emulate it without a lock.)
OK, what does that mean exactly? What does it mean to lock? In particular, I know that lock is a prefix that ensures the atomicity of the "prefixed" instruction. In particular, @Peter Cordes said:
You may be able to use x86 lock cmpxchg16b with a retry loop to do an atomic 16b store
I cannot understand how it is possible to keep this atomic. OK, I can imagine that a 16b chunk of memory can be stored in an atomic way, but what about the next iterations?
I hope my doubts are understandable, because I had a problem expressing them.
I was debugging the above program and, to my eye, the magic is behind atomic_store. I suppose that function does what @Peter Cordes described. If anyone wants, I can paste the disassembled __atomic_store here.
You may be able to use x86 lock cmpxchg16b with a retry loop to do an atomic 16b store
Did I say 16b instead of 16B? Oops. I'll fix that as part of a larger edit.
That lets you do one 16B atomic store, as a read-modify-rewrite that keeps retrying until the compare part succeeds. You can't use it to store more than 16B atomically.
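The retry-loop idea above can be sketched in portable C++. This is only an illustration under stated assumptions: `Pair` and `store_via_cas` are hypothetical names, and the payload is 8 bytes rather than 16 so the example builds without extra flags or libraries. For a 16B object the loop has the same shape, but whether it ends up as an inline lock cmpxchg16b depends on the compiler, its options, and its runtime library.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical 8-byte payload: two fields that must change together.
struct Pair {
    std::uint32_t lo;
    std::uint32_t hi;
};

// Store `desired` using only compare-and-swap: a read-modify-rewrite
// that keeps retrying until the compare part succeeds.
inline void store_via_cas(std::atomic<Pair>& target, Pair desired) {
    Pair expected = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak overwrites `expected` with the
    // current contents, so the next iteration compares against fresh data.
    while (!target.compare_exchange_weak(expected, desired,
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed)) {
        // Another thread changed the value (or the weak CAS failed
        // spuriously); just try again.
    }
}
```

Each successful iteration replaces the whole object in a single atomic step. Several such steps cannot be chained into one larger atomic store, which is why anything wider than the hardware's compare-and-swap width needs a lock instead.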
What does it mean to lock? In particular, I know that lock is a prefix that ensures the atomicity of the "prefixed" instruction.
Lock as in spinlock / mutex, not the lock prefix. The lock prefix only ever works on read-modify-write instructions; there is no lock mov [mem], eax to do an atomic unaligned store or something. locked bus cycles are always read-modify-write, as documented by Intel in the docs for cmpxchg. So a lock mov store would also generate a load, which has different semantics if you use it on memory-mapped I/O. (A read can trigger side effects.)
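To make "lock as in spinlock / mutex" concrete, here is a minimal sketch of a wide struct protected by a lock that all accesses respect. `LockedS` is a hypothetical wrapper written for illustration; it is not how any particular standard library implements std::atomic<S>.

```cpp
#include <mutex>

struct S { int a, b, c, d, e, f, g, h, i, j, k; };

// Hypothetical wrapper: every reader and every writer takes the same mutex,
// so nobody can observe a half-written S.
class LockedS {
public:
    void store(const S& v) {
        std::lock_guard<std::mutex> guard(m_);
        value_ = v;            // plain, non-atomic copy of all members
    }

    S load() const {
        std::lock_guard<std::mutex> guard(m_);
        return value_;         // copy made while the lock is held
    }

private:
    mutable std::mutex m_;     // mutable so load() can stay const
    S value_{};                // zero-initialized
};
```

The protection only holds if every access goes through `store` and `load`; a single access that bypasses the mutex can still see a torn value.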
I've compiled and analyzed the assembly output ...
Why would you put your code in main(), and store uninitialized garbage from s into as? Besides that, main is special in a few ways. It's always better to write a function that takes an arg (or just affects a global). Also, the atomic<S> needs to be a global, not a local that can potentially be partly optimized away, if you want to be sure you're seeing what gcc "really" does.
    #include <atomic>

    struct S { int a, b, c, d, e, f, g, h, i, j, k; };  // or int a[11]
    std::atomic<S> as;

    void atomic_struct_store_zero() {
        S s = { 0 };   // initializes all members to 0
        as.store(s);
    }
This compiles to a function call to __atomic_store, passing src and dest pointers and a size. Presumably it uses a lock somewhere, but the lock isn't part of as. (sizeof(as) == sizeof(S) == 44.)
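The size observation can be checked at compile time. The following is a small sketch, assuming C++17 and a 4-byte int; the constant names are hypothetical, and the results are implementation-specific (some standard libraries do embed a lock inside large atomics, in which case the sizes differ).

```cpp
#include <atomic>
#include <cstddef>

struct S { int a, b, c, d, e, f, g, h, i, j, k; };

// 11 ints: 44 bytes when int is 4 bytes (an assumption about the target).
constexpr std::size_t kStructSize = sizeof(S);
constexpr std::size_t kAtomicSize = sizeof(std::atomic<S>);

// Compile-time answer to "is this type always lock-free here?" (C++17).
constexpr bool kAlwaysLockFree = std::atomic<S>::is_always_lock_free;

// True when the atomic wrapper adds no storage, i.e. no lock embedded in it.
constexpr bool kNoEmbeddedLock = (kAtomicSize == kStructSize);
```

When `kAlwaysLockFree` is false and `kNoEmbeddedLock` is true, the store has to be made safe by something outside the object, which fits the call to __atomic_store seen in the compiler output.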