yohhoy's Diary

A diary for keeping technical notes

OpenMP 2.0 and the Dark Side of Its Memory Model

Notes on the lock acquire/release routines omp_set_lock/omp_unset_lock provided by OpenMP, and on how they relate to the OpenMP memory model and the flush directive.

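The code below looks harmless, but under a strict reading of the OpenMP 2.0 (and earlier) specification it is not guaranteed to behave as the programmer expects. This defect in the specification was fixed in OpenMP 2.5.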

#include <assert.h>
#include <omp.h>

int main()
{
  int count = 0;
  omp_lock_t lockvar;
  omp_init_lock(&lockvar);

  #pragma omp parallel num_threads(10)
  {
    //...
    omp_set_lock(&lockvar);
    count++;
    omp_unset_lock(&lockvar);
    //...
  }
  assert(count == 10);  // ??
}

The specification up to OpenMP 2.0

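Up to OpenMP 2.0, the lock acquire/release routines omp_set_lock/omp_unset_lock are not defined to have any flush effect. Following the definition of the OpenMP execution model, an explicit flush directive is therefore required before a change to the shared variable count becomes visible to other threads. A flush directive is also required to keep the OpenMP implementation from reordering the accesses to count across the calls to the lock routines.*1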

// "Correct" code according to the OpenMP 2.0 specification
omp_set_lock(&lockvar);
#pragma omp flush
count++;
#pragma omp flush
omp_unset_lock(&lockvar);
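For reference, the fragment above can be placed into a complete function. This is only a sketch: the function name count_with_explicit_flush and the team_size parameter are made up for illustration, and the serial stand-ins in the #else branch are an assumption so that the code still builds when the compiler has no OpenMP support. Because num_threads(10) is only a request, the counter is compared against the team size actually used rather than against 10.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial stand-ins (assumption): let the sketch build without OpenMP.
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
static int  omp_get_num_threads(void)       { return 1; }
#endif

// Hypothetical helper: returns the final counter value and stores the
// number of threads that executed the parallel region in *team_size.
int count_with_explicit_flush(int *team_size)
{
  int count = 0;
  omp_lock_t lockvar;
  omp_init_lock(&lockvar);

  #pragma omp parallel num_threads(10)
  {
    omp_set_lock(&lockvar);
    #pragma omp flush  // make the previous lock holder's update visible
    count++;
    *team_size = omp_get_num_threads();  // same value from every thread
    #pragma omp flush  // publish the update before releasing the lock
    omp_unset_lock(&lockvar);
  }

  omp_destroy_lock(&lockvar);
  return count;
}
```

A caller would pass the address of an int and check that the returned count equals the stored team size.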

Partial quotation from the OpenMP C/C++ API Version 2.0 specification*2, sections 1.3 Execution Model and 2.6.5 flush Directive (the list formatting has been slightly changed).

If a thread modifies a shared object, it affects not only its own execution environment, but also those of the other threads in the program. The modification is guaranteed to be complete, from the point of view of one of the other threads, at the next sequence point (as defined in the base language) only if the object is declared to be volatile. Otherwise, the modification is guaranteed to be complete after first the modifying thread, and then (or concurrently) the other threads, encounter a flush directive that specifies the object (either implicitly or explicitly). Note that when the flush directives that are implied by other OpenMP directives are not sufficient to ensure the desired ordering of side effects, it is the programmer's responsibility to supply additional, explicit flush directives.

A flush directive without a variable-list is implied for the following directives:

  • barrier
  • At entry to and exit from critical, ordered, parallel
  • At exit from for, sections, single
  • At entry to and exit from parallel for, parallel sections
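Since the list above includes entry to and exit from critical, the same counter can be protected with a critical construct instead of a lock, and the flushes are then implied even under OpenMP 2.0. The following is a minimal sketch of that alternative; the function name count_with_critical and the team_size parameter are hypothetical, and the fallback definition of omp_get_num_threads is an assumption for builds without OpenMP.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial stand-in (assumption): lets the sketch build without OpenMP.
static int omp_get_num_threads(void) { return 1; }
#endif

// Hypothetical helper: returns the final counter value and stores the
// number of threads that executed the parallel region in *team_size.
int count_with_critical(int *team_size)
{
  int count = 0;

  #pragma omp parallel num_threads(10)
  {
    // A flush without a variable list is implied at entry and at exit.
    #pragma omp critical
    {
      count++;
      *team_size = omp_get_num_threads();
    }
  }
  return count;
}
```

The trade-off is that an unnamed critical construct is a single program-wide mutual exclusion, whereas a lock variable can be created per data structure.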

The specification from OpenMP 2.5 onward

To resolve the problem described above, OpenMP 2.5 changed the specification so that the OpenMP lock acquire/release routines have an implicit flush effect.

// Correct code for OpenMP 2.5 and later
omp_set_lock(&lockvar);    // implicit flush
count++;
omp_unset_lock(&lockvar);  // implicit flush
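Code that has to build against both old and new implementations could select the explicit flushes at compile time. The sketch below assumes the implementation reports its specification version through the _OPENMP macro in yyyymm form, with 200505 corresponding to OpenMP 2.5. The macro LOCK_NEEDS_EXPLICIT_FLUSH, the function name count_portable and the serial stand-ins are all made up for illustration.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial stand-ins (assumption): let the sketch build without OpenMP.
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
static int  omp_get_num_threads(void)       { return 1; }
#endif

// Hypothetical switch: explicit flushes only for pre-2.5 implementations.
#if defined(_OPENMP) && _OPENMP < 200505
#define LOCK_NEEDS_EXPLICIT_FLUSH 1
#else
#define LOCK_NEEDS_EXPLICIT_FLUSH 0
#endif

// Hypothetical helper: returns the final counter value and stores the
// number of threads that executed the parallel region in *team_size.
int count_portable(int *team_size)
{
  int count = 0;
  omp_lock_t lockvar;
  omp_init_lock(&lockvar);

  #pragma omp parallel num_threads(10)
  {
    omp_set_lock(&lockvar);      // implies a flush from OpenMP 2.5 on
#if LOCK_NEEDS_EXPLICIT_FLUSH
    #pragma omp flush
#endif
    count++;
    *team_size = omp_get_num_threads();
#if LOCK_NEEDS_EXPLICIT_FLUSH
    #pragma omp flush
#endif
    omp_unset_lock(&lockvar);    // implies a flush from OpenMP 2.5 on
  }

  omp_destroy_lock(&lockvar);
  return count;
}
```

Leaving the explicit flushes in unconditionally would also be valid on newer implementations; the conditional form only avoids the redundant directives.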

The relevant passages are quoted from the OpenMP API Version 2.5 specification*3, sections 2.7.5 flush Construct and 3.3 Lock Routines (underlined parts are emphasized).

A flush region without a list is implied at the following locations:

  • During a barrier region.
  • At entry to and exit from parallel, critical and ordered regions.
  • At exit from work-sharing regions, unless a nowait is present.
  • At entry to and exit from combined parallel work-sharing regions.
  • During omp_set_lock and omp_unset_lock regions.
  • During omp_test_lock, omp_set_nest_lock, omp_unset_nest_lock and omp_test_nest_lock regions, if the region causes the lock to be set or unset.

The OpenMP lock routines access a lock variable in such a way that they always read and update the most current value of the lock variable. Therefore, it is not necessary for an OpenMP program to include explicit flush directives to ensure that the lock variable's value is consistent among different threads.
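The quoted list also covers omp_test_lock, but only when the call actually sets the lock. A minimal sketch of a counter guarded that way follows; the function name count_with_test_lock and the serial stand-ins are assumptions for illustration, and the empty retry loop marks the place where a real program might do unrelated work.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial stand-ins (assumption): let the sketch build without OpenMP.
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static int  omp_test_lock(omp_lock_t *l)    { if (*l) return 0; *l = 1; return 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
static int  omp_get_num_threads(void)       { return 1; }
#endif

// Hypothetical helper: returns the final counter value and stores the
// number of threads that executed the parallel region in *team_size.
int count_with_test_lock(int *team_size)
{
  int count = 0;
  omp_lock_t lockvar;
  omp_init_lock(&lockvar);

  #pragma omp parallel num_threads(10)
  {
    // omp_test_lock returns nonzero once it has set the lock; under
    // OpenMP 2.5 only that successful call implies a flush.
    while (!omp_test_lock(&lockvar)) {
      // lock is busy: retry
    }
    count++;
    *team_size = omp_get_num_threads();
    omp_unset_lock(&lockvar);  // implicit flush under OpenMP 2.5
  }

  omp_destroy_lock(&lockvar);
  return count;
}
```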

Related URLs

*1: If a flush directive with a variable list is used instead, both the variable count and the lock lockvar must be specified, as in #pragma omp flush(count, lockvar).

*2:http://www.openmp.org/mp-documents/cspec20.pdf

*3:http://www.openmp.org/mp-documents/spec25.pdf