You have to render a string in a (Chinese|Japanese) font if you believe the string is meant to be (Chinese|Japanese). Literally that. That's the official, Consortium-sanctioned way to handle Chinese/Japanese/Korean characters.

There are no particularly good ways or ready-made frameworks for that, as it wasn't a huge issue pre-Internet, because most speakers of these languages are monolingual: you picked a language in the OS (or bought a computer with the language in ROM), and everything the user saw was in the user's language.

It's a giant pain today: there's no "Arial in Chinese", no easy way to mesh multiple fonts together in a UI, no good way to determine the intended language of a string, and the fallback default is the lowest common denominator of Simplified Chinese (PRC) for some reason. Not much is being done on those fronts.
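
To make the "intended language" problem concrete, here is a minimal sketch of the kind of script-based guess an app can make. The function name and return values are my own, not from any library. Kana or Hangul gives the language away, but a Han-only string stays ambiguous, which is exactly the case that hurts:

```python
from typing import Optional


def guess_cjk_language(text: str) -> Optional[str]:
    """Rough guess at the language of a CJK string from its scripts alone.

    Hypothetical helper for illustration; the ranges cover only the main
    Unicode blocks, not extensions or compatibility forms.
    """
    has_han = False
    for ch in text:
        cp = ord(ch)
        if 0x3040 <= cp <= 0x30FF:    # Hiragana and Katakana blocks
            return "ja"
        if 0xAC00 <= cp <= 0xD7A3:    # Hangul syllables
            return "ko"
        if 0x4E00 <= cp <= 0x9FFF:    # CJK Unified Ideographs (main block)
            has_han = True
    # Han characters only: could be Chinese or Japanese, no way to tell here.
    return "und-Hani" if has_han else None
```

A short Japanese string written entirely in Kanji (a name, a place, a menu label) comes back as `"und-Hani"`, so this cannot replace real language metadata.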






I realized I can add a bit more here in the hope it'll be useful to someone at some point, all [citation needed]:

It seems that there is a Chinese regulation(?) that requires Chinese text to _always_ be displayed in an _appropriate_ font, which causes the "wrong font" problem for Japanese users. There are no equivalent requirements elsewhere, so software faces no hard dilemma of following two regulations for the same code point: one side is law, and the other is just unanimous customer complaints.

The rationale for that Chinese regulation(?), I think, is that Chinese computer users long had an exaggerated version of the same problem because of how Kanji were adopted into Unicode: the Kanji map was created by first enumerating common-use Japanese Kanji from reasonable existing tables, then merging in additional common-use Chinese characters not found in Japanese texts, of which there are plenty. Duplicates were merged on loose, pragmatic-at-the-time judgements: some by shape, some by meaning(!), and some were left as duplicates.

It seems to me that this led to a situation, lasting at least some period of time, in which Chinese UTF-8 strings on a computer were displayed in a mishmash of Japanese and Chinese fonts, with the Chinese font filling the gaps of the Japanese one rather than the other way around. That was frustrating (imagine the top 10 most-used letters of the alphabet in Comic Sans and the rest in Arial), and it was solved by that regulation (the regulatory setup is a Chinese national GB or GB/T standard plus enforcement of "applicable industrial standards", I believe).

There have been attempted solutions to this problem of same-system Chinese-Japanese text coexistence, such as registering all the Japanese Kanji as first choices in the Unicode IVS map alongside legitimate Japanese variants, so that every Japanese character followed by its IVS suffix sequence would be rendered in Japanese form. Obviously there's no font that this is going to work well with, and it's also unnecessary bloat and just a backstage committee influence war.
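
For anyone unfamiliar with what an IVS looks like in a string: it is just the base ideograph followed by one code point from the Variation Selectors Supplement (U+E0100 to U+E01EF). A minimal sketch, with helper names of my own invention, that only shows the mechanics; it does not check whether a sequence is actually registered or whether any font supports it:

```python
VS17 = 0xE0100   # first code point of the Variation Selectors Supplement
VS256 = 0xE01EF  # last code point of the Variation Selectors Supplement


def with_ivs(base: str, index: int) -> str:
    """Append the index-th supplementary variation selector to one character.

    Hypothetical helper: index 0 maps to VS17 (U+E0100).
    """
    if len(base) != 1:
        raise ValueError("base must be a single character")
    if not 0 <= index <= VS256 - VS17:
        raise ValueError("index out of range for VS17..VS256")
    return base + chr(VS17 + index)


def strip_ivs(text: str) -> str:
    """Drop supplementary variation selectors, leaving the base characters."""
    return "".join(ch for ch in text if not VS17 <= ord(ch) <= VS256)
```

This is where the bloat comes from: every character tagged this way doubles in code points, and the selector takes four bytes in UTF-8 on top of the three for the ideograph itself.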

The moral of the story is, the "it's all Kanji after all" approach never worked, simultaneous Chinese-and-Japanese support in Unicode and Unicode-based apps is a mess, and it has to be fixed higher up at some point in the future.


I don't disagree. The only practical strategy now is that if you know the text/user is Japanese, you use a specifically Japanese font (Windows and macOS have a few built-in options), not just one that has the CJK codepoints.
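
A rough sketch of what that looks like in app code, assuming you already have a language tag for the text or the user. The table and function name are hypothetical, and the font names are only examples of fonts commonly present on macOS and Windows or installable via Noto, so check what your target systems actually ship:

```python
from typing import List, Optional

# Illustrative table only: language tag -> preferred font families, in order.
FONT_STACKS = {
    "ja": ["Hiragino Sans", "Yu Gothic", "Noto Sans CJK JP"],
    "zh-hans": ["PingFang SC", "Microsoft YaHei", "Noto Sans CJK SC"],
    "zh-hant": ["PingFang TC", "Microsoft JhengHei", "Noto Sans CJK TC"],
}


def font_stack_for(lang_tag: str) -> Optional[List[str]]:
    """Return the font stack for the longest matching prefix of lang_tag.

    Returns None when nothing matches, so the caller has to decide the
    fallback explicitly instead of silently getting some default CJK font.
    """
    parts = lang_tag.lower().split("-")
    while parts:
        stack = FONT_STACKS.get("-".join(parts))
        if stack is not None:
            return list(stack)
        parts.pop()  # drop the last subtag and retry, e.g. ja-JP -> ja
    return None
```

Note that a region-only tag like `zh-CN` matches nothing in this table, because the sketch does not infer the script from the region; a real implementation would need that mapping too.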


