In the Java programming language, a Unicode escape in source code may contain any number of u characters (one or more, with no upper limit) after the backslash (\).
    String s0 = "\u65e5\u672c\u8a9e";        // "日本語"
    String s1 = "\uu65e5\uuu672c\uuuu8a9e";  // s0.equals(s1) == true
The grammar (excerpt) and the explanation below are quoted from The Java Language Specification (3rd Ed.), 3.3 Unicode Escapes.
UnicodeEscape:
        \ UnicodeMarker HexDigit HexDigit HexDigit HexDigit

UnicodeMarker:
        u
        UnicodeMarker u

HexDigit: one of
        0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F

The \, u, and hexadecimal digits here are all ASCII characters.
(snip)
The Java programming language specifies a standard way of transforming a program written in Unicode into ASCII that changes a program into a form that can be processed by ASCII-based tools. The transformation involves converting any Unicode escapes in the source text of the program to ASCII by adding an extra u - for example, \uxxxx becomes \uuxxxx - while simultaneously converting non-ASCII characters in the source text to Unicode escapes containing a single u each.

This transformed version is equally acceptable to a compiler for the Java programming language ("Java compiler") and represents the exact same program. The exact Unicode source can later be restored from this ASCII form by converting each escape sequence where multiple u's are present to a sequence of Unicode characters with one fewer u, while simultaneously converting each escape sequence with a single u to the corresponding single Unicode character.
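The round trip described in the quoted passage can be sketched in a few lines of Java. This is only an illustration, not the specification's or any tool's actual implementation: the class name UnicodeEscapeRoundTrip and the methods toAscii and toUnicode are hypothetical names chosen here. As a simplifying assumption, the sketch treats every backslash followed by u's and four hex digits as an escape and does not check how many backslashes precede it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the transformation quoted above.
final class UnicodeEscapeRoundTrip {
    // A backslash, one or more 'u' markers, then exactly four hex digits.
    // Simplification: preceding backslashes are not counted.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\(u+)([0-9a-fA-F]{4})");

    // Unicode source -> ASCII-only source.
    static String toAscii(String unicodeSource) {
        // Step 1: add one extra 'u' to every escape that is already present.
        String marked = ESCAPE.matcher(unicodeSource).replaceAll("\\\\u$1$2");
        // Step 2: turn each non-ASCII UTF-16 code unit into a single-u escape.
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < marked.length(); i++) {
            char c = marked.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append('\\').append('u').append(String.format("%04x", (int) c));
            }
        }
        return out.toString();
    }

    // ASCII-only source -> original Unicode source.
    static String toUnicode(String asciiSource) {
        Matcher m = ESCAPE.matcher(asciiSource);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(asciiSource, last, m.start());
            String markers = m.group(1);
            if (markers.length() == 1) {
                // Single marker: restore the character itself.
                out.append((char) Integer.parseInt(m.group(2), 16));
            } else {
                // Several markers: keep the escape, with one fewer 'u'.
                out.append('\\').append(markers, 1, markers.length()).append(m.group(2));
            }
            last = m.end();
        }
        out.append(asciiSource, last, asciiSource.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Two raw non-ASCII characters followed by one escape written out literally.
        String original = "s = \"\u65e5\u672c\\u8a9e\";";
        String ascii = toAscii(original);
        System.out.println(ascii);
        System.out.println(toUnicode(ascii).equals(original));
    }
}
```

Because escapes that were already in the source end up with two or more u's, while escapes created by the conversion have exactly one, the reverse step can tell the two apart and restore the original text exactly.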
Related URLs: