yohhoy's Diary

A diary for keeping technical notes

macOS does not support POSIX unnamed semaphores

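macOS (formerly Mac OS X) deliberately does not support POSIX unnamed semaphores.
Calls to sem_init (which creates an unnamed semaphore) and sem_destroy (which destroys one) always fail, returning -1 with errno set to ENOSYS (function not supported). (Bonus: sem_getvalue is not supported either.)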
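
The failure mode described above can be observed with a small probe. The following is a minimal sketch, not taken from the original post; the function name probe_unnamed_semaphore is hypothetical. It returns 0 where unnamed semaphores work (Linux, for example) and otherwise the errno value left by sem_init, which should be ENOSYS on macOS according to the description above.

```cpp
#include <cerrno>
#include <semaphore.h>

// Hypothetical helper: returns 0 if an unnamed semaphore could be created,
// otherwise the errno value reported by sem_init.
int probe_unnamed_semaphore() {
  sem_t sem;
  errno = 0;
  if (sem_init(&sem, /*pshared=*/0, /*value=*/1) == -1) {
    return errno;  // expected to be ENOSYS on macOS
  }
  sem_destroy(&sem);  // creation succeeded, so clean up
  return 0;
}
```

Because the check runs at run time, it avoids the link-only detection problem that the email quoted below describes.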

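The unnamed semaphore DispatchSemaphore can be used as a substitute. It is part of the GCD (Grand Central Dispatch) runtime, which is normally called from Objective-C or Swift, but because it is also exposed as a C API it can be used from native C/C++ code as well.
Example implementation: https://github.com/yohhoy/yamc/blob/master/include/gcd_semaphore.hpp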
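
A minimal sketch of wrapping the GCD C API in a C++ class, assuming a macOS toolchain that provides <dispatch/dispatch.h>. The class name GcdSemaphore and its member names are hypothetical and are not taken from the linked implementation. The code targets plain C++ without ARC, so it releases the object by hand.

```cpp
#if defined(__APPLE__)
#include <dispatch/dispatch.h>

// Hypothetical RAII wrapper around a GCD semaphore (plain C++, no ARC).
class GcdSemaphore {
public:
  // Created with 0 and then signalled `initial` times: libdispatch is known
  // to trap if a semaphore is released while its value is lower than the
  // value it was created with.
  explicit GcdSemaphore(long initial)
    : sem_(dispatch_semaphore_create(0))
  {
    for (long i = 0; i < initial; ++i) dispatch_semaphore_signal(sem_);
  }
  ~GcdSemaphore() { dispatch_release(sem_); }
  GcdSemaphore(const GcdSemaphore&) = delete;
  GcdSemaphore& operator=(const GcdSemaphore&) = delete;

  // Blocks until the count is positive, then decrements it.
  void acquire() { dispatch_semaphore_wait(sem_, DISPATCH_TIME_FOREVER); }
  // Non-blocking variant: returns true if the count was decremented.
  bool try_acquire() {
    return dispatch_semaphore_wait(sem_, DISPATCH_TIME_NOW) == 0;
  }
  // Increments the count, waking one waiter if there is any.
  void release() { dispatch_semaphore_signal(sem_); }

private:
  dispatch_semaphore_t sem_;
};
#endif  // __APPLE__
```

For example, `GcdSemaphore s(1);` followed by `s.acquire(); /* critical section */ s.release();` uses it as a binary semaphore.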

A partial quote from an email written by Terry Lambert*1 around April 2009.

They aren't required for UNIX conformance because they are not mandatory interfaces per Appendix 9 of the Single UNIX Specification, and there are lots of alternatives, most of them with better overall semantics, which this tread has amply demonstrated. Most portable and historical software uses System V semaphores, since POSIX semaphores are relatively new, so there's little software portability incentive. The software which does like to use them is typically based on GNU autoconf/automake which typically uses linkage tests to find interfaces and blows it by doing that and ignoring <unistd.h> contents, since historical conformance with older versions of the specification permitted stub functions which errored out, and the linkage tests only check linkability rather than functionality. So basically the software that wants them usually fails to conform to the standards which would have allowed them to be used safely and reliably in the first place.

Then there is the little problem of binary backward compatibility for POSIX named semaphores, if the error or success returns don't happen to match the standard once testing of a full implementation became possible.

As a Mac OS X implementation detail, sem_t is the required data type for both POSIX named semaphores and POSIX unnamed semaphores according to the standard. The sem_t definition has historically been a 32 bit value containing an fd cast to a pointer, which is problematic for maintaining binary and source backward compatibility (hint: think symbol decoration) for named semaphores while at the same time permitting typical usage of unnamed semaphores. Specifically, typical usage of unnamed semaphores is to use them as an IPC synchronization mechanism between otherwise unrelated programs by allocating a shared memory region shared between them of sizeof(sem_t) *number_of_semaphores_desired, and the casting the base address of that memory region to a (sem_t *) and indexing off that to get a semaphore index in the range 0..(number_of_semaphores_desired - 1).

The implementation problems should now be obvious. They are not insurmountable; I sometimes pose how to resolve the conflicting goals involved as an interview question. 8-). But there isn't really a very obvious fix I'd call elegant, either.

https://lists.apple.com/archives/darwin-kernel/2009/Apr/msg00010.html

*1: He reportedly wrote 6% of the Mac OS X kernel. https://www.quora.com/profile/Terry-Lambert

Objective-C-style null (nil) safety

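In Objective-C, when the receiver of a message is nil, the Objective-C runtime does nothing (no error is raised and no method is called). If the method has a return type, the call behaves as if a value equivalent to 0 had been returned.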

Sending Messages to nil
In Objective-C, it is valid to send a message to nil -- it simply has no effect at runtime. There are several patterns in Cocoa that take advantage of this fact. The value returned from a message to nil may also be valid:

  • If the method returns an object, then a message sent to nil returns 0 (nil). For example:
Person *motherInLaw = [[aPerson spouse] mother];

If the spouse object here is nil, then mother is sent to nil and the method returns nil.

  • If the method returns any pointer type, any integer scalar of size less than or equal to sizeof(void*), a float, a double, a long double, or a long long, then a message sent to nil returns 0.
  • If the method returns a struct, as defined by the OS X ABI Function Call Guide to be returned in registers, then a message sent to nil returns 0.0 for every field in the struct. Other struct data types will not be filled with zeros.
  • If the method returns anything other than the aforementioned value types, the return value of a message sent to nil is undefined.
The Objective-C Programming Language, Objects, Classes, and Messaging

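Note: In terms of convenience this resembles the Optional types of other programming languages, and it lets method chains be written concisely. On the other hand, because neither the compiler nor the runtime checks anything, it risks hiding genuine bugs and deferring the problem until later.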

Uses for a single-member union

In C++, a union that contains only a single member makes it possible to create and destroy an object explicitly. A poor man's Optional.

#include <iostream>
#include <new>  // placement new

template <typename T>
union Wrapper {
  // The union's constructor and destructor must be defined explicitly
  Wrapper() {}
  ~Wrapper() {}
  // Explicitly initialize the object
  void init() { new (&obj_) Holder; }
  // Explicitly destroy the object
  void destroy() { obj_.~Holder(); }
  // The class whose lifetime is being controlled
  struct Holder {
    T m_;
  } obj_;
};

struct S {
  S() { std::cout << "S::ctor\n"; }
  ~S() { std::cout << "S::dtor\n"; }
};

int main()
{
  Wrapper<S> opt;
  // No object of type S has been created at this point

  std::cout << "call init\n";
  opt.init();  // calls S::S()

  std::cout << "call destroy\n";
  opt.destroy();  // calls S::~S()
}

A partial quote from C++17 12.3/p1 and p6.

1 In a union, a non-static data member is active if its name refers to an object whose lifetime has begun and has not ended (6.8). At most one of the non-static data members of an object of union type can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. (snip)

6 [Note: In general, one must use explicit destructor calls and placement new-expression to change the active member of a union. -- end note] [Example: Consider an object u of a union type U having non-static data members m of type M and n of type N. If M has a non-trivial destructor and N has a non-trivial constructor (for instance, if they declare or inherit virtual functions), the active member of u can be safely switched from m to n using the destructor and placement new-expression as follows:

u.m.~M();
new (&u.n) N;

-- end example]
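
Pairing the single-member union with a flag that records whether the member is active gives something closer to an actual Optional. The following is a minimal sketch under that assumption; the PoorOptional name and its member functions are hypothetical, and it omits the copying, moving, and checked access that std::optional provides.

```cpp
#include <new>
#include <utility>

// Hypothetical sketch: a single-member union plus an "engaged" flag.
template <typename T>
class PoorOptional {
  union Storage {
    Storage() {}   // does not construct `value`
    ~Storage() {}  // does not destroy `value`
    T value;
  } storage_;
  bool engaged_ = false;

public:
  PoorOptional() = default;
  PoorOptional(const PoorOptional&) = delete;
  PoorOptional& operator=(const PoorOptional&) = delete;
  ~PoorOptional() { reset(); }

  // Destroys any current value, then constructs a new one in place.
  template <typename... Args>
  T& emplace(Args&&... args) {
    reset();
    ::new (static_cast<void*>(&storage_.value)) T(std::forward<Args>(args)...);
    engaged_ = true;  // set only after the constructor has succeeded
    return storage_.value;
  }

  // Destroys the value if one is present.
  void reset() {
    if (engaged_) {
      storage_.value.~T();
      engaged_ = false;
    }
  }

  bool has_value() const { return engaged_; }
  // Precondition: has_value() is true.
  T& value() { return storage_.value; }
};
```

Unlike the bare union, the flag lets the destructor clean up automatically, so a forgotten destroy call no longer leaks the object.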

Unicode literals by name

In addition to literal notation by Unicode code point*1, Python also supports notation by name, following the Unicode Character Database (UCD)*2.

print("\U0001F4DB")      # 📛
print("\N{NAME BADGE}")  # 📛

print("\N{TOFU ON FIRE}")
# SyntaxError:
#   (unicode error) 'unicodeescape' codec can't decode bytes in position 0-15:
#   unknown Unicode character name

*1: \uXXXX (UCS-2) or \UXXXXXXXX (UCS-4)

*2:https://www.unicode.org/reports/tr44/

In Go, ++/-- are statements

In Go, the increment ++ and decrement -- operators are allowed only in postfix form, and they form a statement rather than an expression.

i++  // equivalent to i += 1
i--  // equivalent to i -= 1

Why are ++ and -- statements and not expressions? And why postfix, not prefix?
Without pointer arithmetic, the convenience value of pre- and postfix increment operators drops. By removing them from the expression hierarchy altogether, expression syntax is simplified and the messy issues around order of evaluation of ++ and -- (consider f(i++) and p[i] = q[++i]) are eliminated as well. The simplification is significant. As for postfix vs. prefix, either would work fine but the postfix version is more traditional; insistence on prefix arose with the STL, a library for a language whose name contains, ironically, a postfix increment.

https://golang.org/doc/faq

Custom diagnostic messages: the diagnose_if attribute

The Clang compiler provides the diagnose_if attribute, which emits user-defined compile-time warning or error messages.

The diagnose_if attribute can be placed on function declarations to emit warnings or errors at compile-time if calls to the attributed function meet certain user-defined criteria. For example:

void abs(int a)
  __attribute__((diagnose_if(a >= 0, "Redundant abs call", "warning")));
void must_abs(int a)
  __attribute__((diagnose_if(a >= 0, "Redundant abs call", "error")));

int val = abs(1); // warning: Redundant abs call
int val2 = must_abs(1); // error: Redundant abs call
int val3 = abs(val);
int val4 = must_abs(val); // Because run-time checks are not emitted for
                          // diagnose_if attributes, this executes without
                          // issue.
https://releases.llvm.org/5.0.0/tools/clang/docs/AttributeReference.html#diagnose-if

libcxx, an implementation of the C++ standard library, uses this diagnose_if attribute to partially diagnose violations of the C++ standard library requirements.
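
A library can hide the attribute behind a macro so that the same headers still compile on compilers that lack diagnose_if. The following is a minimal sketch of that pattern; the macro name DIAGNOSE_WARNING and the function checked_div are hypothetical and are not libcxx's actual definitions.

```cpp
// Expand to the Clang attribute when available, and to nothing otherwise.
#if defined(__has_attribute)
#  if __has_attribute(diagnose_if)
#    define DIAGNOSE_WARNING(cond, msg) \
       __attribute__((diagnose_if(cond, msg, "warning")))
#  endif
#endif
#ifndef DIAGNOSE_WARNING
#  define DIAGNOSE_WARNING(cond, msg)
#endif

// Hypothetical function: the attribute is attached to the declaration.
int checked_div(int num, int den)
    DIAGNOSE_WARNING(den == 0, "checked_div called with a zero divisor");

// The diagnostic is compile-time only, so the body still has to handle
// a zero divisor that arrives at run time.
int checked_div(int num, int den) {
  return den == 0 ? 0 : num / den;
}
```

With Clang, a call such as `checked_div(1, 0)` should produce the warning because the condition can be evaluated at compile time, while a call whose divisor is only known at run time compiles silently.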