メモリモデルとThin-air read

C++11メモリモデルと、atomic変数relaxedメモリ操作で起こりえる "Thin-air read" への対策について。

注意：本記事は内容を理解して書いたわけでなく、関連情報の単なるメモ。

"Thin-air read"（または "out-of-thin-air value"）は定式化されたメモリモデルで起こりえる、存在しない値*1のreadを許容する動作。（後述N3132のサンプルコードが分かり易い）
C++11メモリモデルやJavaメモリモデルでは共に "Thin-air read" を禁止している。Javaでは複雑なメモリモデル定義によって対処し、C++11では（Javaに比べて）単純なメモリモデル定義＋禁止条項の追加にて対処している。

(PDF)N3132 Mathematizing C++ Concurrency: The Post-Rapperswil Modelより言及箇所を引用。

3. Exploring the model by example
Thin-air reads The model defined here allows thin-air reads: the program below reads the value 1, yet there is no occurrence of 1 in the program source.*2
int main() {
  int r1, r2;
  atomic_int x = 0;
  atomic_int y = 0;
  {{{ { r1 = x.load(mo_relaxed));
        y.store(r1,mo_relaxed); }
  ||| { r2 = y.load(mo_relaxed));
        x.store(r2,mo_relaxed); }
  }}}
  return 0; }
（図は省略）
An execution like this would be surprising, and in fact would not happen with typical hardware and compilers. In the Java Memory Model [MPA05], much of the complexity of the model arises from the desire to outlaw thin-air reads, which there is essential to prevent forging of pointers. The C++0x final committee draft attempts to forbid such executions as well, but the restrictions it imposes to that end seem to have unfortunate consequences, as we discuss in Section 5; they are not incorporated into the formal model presented here.

5. From standard to formalisation and back
Overlapping executions and thin-air reads The N3092 draft standard attempts to forbid thin air reads, with: An atomic store shall only store a value that has been computed from constants and program input values by a finite sequence of program evaluations, such that each evaluation observes the values of variables as computed by the last prior assignment in the sequence. This seems to be overly constraining. For example, two subexpression evaluations (in separate threads) can overlap (e.g. if they are the arguments of a function call) and can contain multiple actions. With relaxed atomics there can be consistent executions in which it is impossible to disentangle the two into any sequence, for example as below, where the SC-write*3 of x must be between the two reads of x. In our formalisation we currently do not impose any thin-air condition.
int main() {
  atomic_int x = 0;
  int y;
  {{{ x.store(1);
  ||| { y = (x.load()==x.load()); }
  }}};
  return 0; }
（図は省略）
The Final Committee Draft defines the model in terms of actions and evaluations of expressions. Our model does not keep track of which evaluations give rise to which actions, and instead considers only actions.

上記で言及しているN3092と同様、N3337にも同一条項が存在する。N3337 29.3/p9-10より引用。

9 An atomic store shall only store a value that has been computed from constants and program input values by a finite sequence of program evaluations, such that each evaluation observes the values of variables as computed by the last prior assignment in the sequence. The ordering of evaluations in this sequence shall be such that:

if an evaluation B observes a value computed by A in a different thread, then B does not happen before A, and

if an evaluation A is included in the sequence, then every evaluation that assigns to the same variable and happens before A is included.

10 [ Note: The second requirement disallows "out-of-thin-air" or "speculative" stores of atomics when relaxed atomics are used. Since unordered operations are involved, evaluations may appear in this sequence out of thread order. For example, with x and y initially zero,
// Thread 1:
r1 = y.load(memory_order_relaxed);
x.store(r1, memory_order_relaxed);

// Thread 2:
r2 = x.load(memory_order_relaxed);
y.store(42, memory_order_relaxed);
is allowed to produce r1 = r2 = 42. The sequence of evaluations justifying this consists of:
y.store(42, memory_order_relaxed);
r1 = y.load(memory_order_relaxed);
x.store(r1, memory_order_relaxed);
r2 = x.load(memory_order_relaxed);
On the other hand,
// Thread 1:
r1 = y.load(memory_order_relaxed);
x.store(r1, memory_order_relaxed);

// Thread 2:
r2 = x.load(memory_order_relaxed);
y.store(r2, memory_order_relaxed);
may not produce r1 = r2 = 42, since there is no sequence of evaluations that results in the computation of 42. (snip) --end note ]

一方で明確に "Thin-air read" を禁止した後でも、C++11メモリモデル上は下記例示のような直観に反する動作結果を許容する。ただし現実の処理系ではこのような実行結果は起こりえない。N3337 29.3/p11より引用。

[Note: The requirements do allow r1 == r2 == 42 in the following example, with x and y initially zero:
// Thread 1:
r1 = x.load(memory_order_relaxed);
if (r1 == 42) y.store(r1, memory_order_relaxed);

// Thread 2:
r2 = y.load(memory_order_relaxed);
if (r2 == 42) x.store(42, memory_order_relaxed);
However, implementations should not allow such behavior. --end note]

T1:y=r1(42)書込みのためにはT1:r1=x(42)読込みが行われなければならないが、同atomic変数xに対してT2:x=42書込みを行うのはT2:r2=y(42)読込んだ場合に限られる。初期値はx=y=0であり、直観的には（実動作においても）このような実行結果は生じない。

yohhoyの日記

技術的メモをしていきたい日記

メモリモデルとThin-air read