
Brad Ediger

"Advanced Rails"

In addition, the Internet began to develop in the
1990s, connecting people and allowing them to exchange digital information with a
far greater reach than before.
So, in 1991, the Unicode Consortium published the first Unicode standard. Unicode
sought to be the “one true character set” in which all text would eventually be represented.
In large part, that goal is well on the way to being accomplished. Unicode is
a widely known, well-supported standard that is used extensively on the Internet and
in other forms of data exchange today.
Unicode supports all of the world’s writing systems currently in use and many
archaic ones, with very few exceptions. There is no “code page” switching as there
was under the old character-set systems. All of the scripts can be used interchangeably
within a document, and the encodings are universal; they can be exchanged
over the Internet without worrying too much about differing encodings.
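As a rough illustration of scripts mixing freely in one piece of text, the following sketch puts Latin, Greek, and Japanese characters in a single string. It assumes Ruby 1.9 or later, where strings carry an encoding; the variable names are illustrative only.

```ruby
# Three scripts in one string, written with \u escapes so the example does
# not depend on the encoding of this source file.
text = "a\u03B1\u3042" # Latin a, Greek alpha, Hiragana a

# Collect each character's code point in U+XXXX notation.
labels = text.each_char.map { |ch| format("U+%04X", ch.ord) }

puts text.encoding     # UTF-8
puts labels.join(" ")  # U+0061 U+03B1 U+3042
puts text.length       # 3 characters
puts text.bytesize     # 6 bytes in UTF-8
```

No switching between character sets is needed; every character is identified by its own code point.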
Unicode deals with the world in Platonic ideals. Rather than representing glyphs (the
rendering of a character), each Unicode code point represents a grapheme (the character
abstracted from its representation).* This is consistent with the purpose of a
character encoding: to encode text without specifying presentation. For example, the
following two characters are the same grapheme and would be represented by the
same Unicode code point (U+0061, LATIN SMALL LETTER A), even though they
are different glyphs (see Figure 8-1).
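The same point can be seen from code. In this minimal sketch, the helper `code_point_label` is a hypothetical name introduced for illustration. The string holds only the code point, whichever font eventually draws the letter.

```ruby
# Hypothetical helper: report the first code point of a string in U+XXXX form.
# unpack("U*") decodes the string's UTF-8 bytes into integer code points.
def code_point_label(str)
  format("U+%04X", str.unpack("U*").first)
end

puts code_point_label("a")  # U+0061
puts "a" == "\u0061"        # true: the literal and the escape name the same character
```

Nothing in the string records whether the letter is drawn in a serif face, a sans-serif face, or by hand; that choice belongs to the rendering layer.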

