Check for a Chinese character

This function checks whether a character falls in the CJK Unified Ideographs block (U+4E00 to U+9FFF), which spans 20,992 code points. The block covers the Chinese hanzi, Japanese kanji, and Korean hanja in common usage. Some rarer characters and variants are stored in other blocks, but it is still a reliable test for real-world applications.

In [1]:
def is_hanzi(char):
    """Check for CJK Unified Ideograph."""
    return 0x4e00 <= ord(char) <= 0x9fff

ord is a built-in function that returns the Unicode code point of a single-character string.
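
As a quick illustration of how ord relates to the range bounds, the sketch below looks up the code point of the first character in the block and converts it back with chr. The variable names are only for illustration.

```python
# U+4E00 is the first code point in the CJK Unified Ideographs block.
first = '\u4e00'          # the character 一
code_point = ord(first)   # integer code point of the character

print(code_point)         # decimal value of the code point
print(hex(code_point))    # same value in hexadecimal, matching the lower bound
print(chr(code_point) == first)  # chr is the inverse of ord
```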

In [2]:
for char in 'a. 一见钟情':
    print(is_hanzi(char))
False
False
False
True
True
True
True
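
If the rarer characters stored outside the main block matter for an application, one possible extension is to test several ranges. The sketch below is an assumption rather than part of the original function: the name is_hanzi_extended and the CJK_RANGES tuple are hypothetical, and only Extension A and Extension B are added, so it is still not an exhaustive test.

```python
# Hypothetical extension: the original function only tests U+4E00 to U+9FFF.
CJK_RANGES = (
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs (the block tested above)
    (0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    (0x20000, 0x2A6DF),  # CJK Unified Ideographs Extension B
)


def is_hanzi_extended(char):
    """Check for a code point in any of the listed CJK blocks."""
    code = ord(char)
    return any(start <= code <= end for start, end in CJK_RANGES)


# First code point of each listed block, followed by a Latin letter.
for char in '\u4e00\u3400\U00020000a':
    print(is_hanzi_extended(char))
```

This prints True for the first three characters and False for the letter. Keeping the ranges in one tuple means further blocks can be added without touching the function body.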