The term internationalization is usually abbreviated i18n, short for "i,
18 letters, and then n." Similarly, "localization" is abbreviated L10n.
To avoid ambiguity, i18n is always written with a lowercase i, while
L10n always uses an uppercase L. I will use this convention throughout
this chapter.
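The arithmetic behind these abbreviations can be checked with a short Python sketch. The helper name numeronym is hypothetical and used only for illustration; it keeps the first and last letters and replaces everything in between with a count.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + number of inner letters + last letter.

    Hypothetical helper for illustration; words shorter than four letters
    are returned unchanged because abbreviating them would not save anything.
    """
    if len(word) < 4:
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("internationalization"))  # i18n
print(numeronym("localization"))          # l10n
```

The function preserves the case of its input, so the uppercase L in L10n remains a writing convention applied by hand rather than something the sketch produces.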
Locale
Although language translation gets the lion's share of attention in this field, it is but
one part of i18n. A human language may have significant regional differences or variants
between countries where the language is spoken. Dialects aside, there can be
large differences in currency, collation (sort order), number and date format, and
even writing system across regional or political divisions within a country.
These differences are encapsulated in the concept of locale. A locale is usually
defined as a language plus a country or region. It includes not only language but also
regional and local preferences and possibly a character encoding. A POSIX-style
locale identifier looks like en_US.UTF-8 (English, United States, UTF-8 character
encoding).
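To make the structure of such an identifier concrete, here is a minimal Python sketch that splits a POSIX-style locale string into its parts. The names Locale and parse_locale are hypothetical, and the pattern is deliberately simplified: it assumes a two- or three-letter language, an optional two-letter territory, and an optional encoding, and it ignores the @modifier suffix that some identifiers carry.

```python
import re
from typing import NamedTuple, Optional


class Locale(NamedTuple):
    """Parts of a POSIX-style locale identifier (illustrative only)."""
    language: str
    territory: Optional[str]
    encoding: Optional[str]


# Simplified pattern: language[_TERRITORY][.encoding]
# Assumption: no "@modifier" suffix and no numeric region codes.
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<encoding>[A-Za-z0-9-]+))?$"
)


def parse_locale(identifier: str) -> Locale:
    """Split an identifier such as 'en_US.UTF-8' into its components."""
    match = _LOCALE_RE.match(identifier)
    if match is None:
        raise ValueError(f"not a recognized locale identifier: {identifier!r}")
    return Locale(match["language"], match["territory"], match["encoding"])


print(parse_locale("en_US.UTF-8"))
# Locale(language='en', territory='US', encoding='UTF-8')
print(parse_locale("fr"))
# Locale(language='fr', territory=None, encoding=None)
```

Only the language is required in this sketch; the territory and encoding come back as None when the identifier omits them.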
Character Encodings
One of the most fundamental topics in i18n is the concept of a character encoding or
character set.* Computers work with numbers; people work with characters. A character
encoding maps one to the other. This is simple enough. The difficulty comes,
as it usually does, because of history.
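A short Python sketch can illustrate the mapping between characters and numbers, and what goes wrong when two encodings disagree. The choice of the character é and of the UTF-8 and Latin-1 encodings is an assumption made for the example, not something taken from the text above.

```python
text = "é"

# The character's Unicode code point: the abstract number behind the character.
code_point = ord(text)
print(code_point)  # 233

# The same character becomes different bytes under different encodings.
utf8 = text.encode("utf-8")
latin1 = text.encode("latin-1")
print(utf8)    # b'\xc3\xa9'
print(latin1)  # b'\xe9'

# Decoding bytes with the wrong encoding does not fail; it silently
# produces different characters.
garbled = utf8.decode("latin-1")
print(garbled)  # Ã©
```

The last step shows why the encoding has to travel with the data: the bytes alone do not say which mapping produced them.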