21 April 2009

This NYT article was, alas, written as a human-interest story, rather than as a geek-interest story. The author explains that the new identity cards issued by China’s Public Security Bureau are rejecting many citizens’ names on account of their “obscure” characters, because “[t]he bureau’s computers… are programmed to read only 32,252 of the roughly 55,000 Chinese characters”.

Since 32,252 is suspiciously close to 32,767, the largest value that fits in 15 bits, I am wondering if the Bureau is stuck using an obsolete 15-bit-wide character set. Perhaps someone from the technical press can follow up and ask the appropriate government officials: “what character set are you using?” and “why don’t you just use Unicode?”
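
For anyone who wants to check the arithmetic behind that suspicion, here is a minimal Python sketch. The two character counts are the figures quoted in the article; the variable names are my own, and nothing here says anything about how the Bureau's system is actually built.

```python
# Largest value representable in 15 bits (also the maximum of a signed 16-bit integer).
MAX_15_BIT = 2**15 - 1

# Figures quoted in the NYT article.
SUPPORTED_CHARACTERS = 32_252   # characters the Bureau's computers can read
ALL_CHARACTERS = 55_000         # rough total number of Chinese characters

# How far the supported count falls short of the 15-bit ceiling.
headroom = MAX_15_BIT - SUPPORTED_CHARACTERS

# Minimum number of bits needed to index each count.
bits_for_supported = SUPPORTED_CHARACTERS.bit_length()
bits_for_all = ALL_CHARACTERS.bit_length()

print(MAX_15_BIT)          # 32767
print(headroom)            # 515
print(bits_for_supported)  # 15
print(bits_for_all)        # 16
```

So the supported repertoire fits in 15 bits with only 515 code values to spare, while the full set of roughly 55,000 characters would need a 16th bit. That is consistent with the guess above, though it is only circumstantial evidence.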

I should point out, in fairness, that my casual search didn’t locate the second character of Ma Cheng’s name in the Unihan database, so perhaps she would be out of luck even with a Unicode database. Or maybe I just suck at recognizing Chinese characters.