Be very careful when converting a table that has existing data. If you have been using Rails 1.2
or later (which supports UTF-8 by default) and have not converted your tables to UTF-8,
you may have UTF-8 data stored in the database as Latin1. If you then convert the
table to UTF-8, MySQL will treat those bytes as Latin1 and encode them as UTF-8 a second time, which will corrupt your
data. The standard procedure in this case is to dump the data as Latin1, piping the
dump through sed to change the output character set to UTF-8:
mysqldump -uusername -p --default-character-set=latin1 mydb \
| sed -e 's/SET NAMES latin1/SET NAMES utf8/g' \
| sed -e 's/CHARSET=latin1/CHARSET=utf8/g' >mydb.sql
Then, load the dump back into MySQL as UTF-8:
mysql -uusername -p --default-character-set=utf8 mydb <mydb.sql
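To see why a second conversion corrupts the data, the following sketch reproduces the effect in plain Ruby, with no database involved. It assumes Ruby 1.9 or later (for string encodings) and uses ISO-8859-1 as a stand-in for MySQL's latin1 character set, which is close but not identical. The variable names are purely illustrative.

```ruby
# Simulate UTF-8 text that was stored in a Latin1 column and then
# converted to UTF-8 by the database.
original = "caf\u00E9"   # "café" as UTF-8 bytes: 63 61 66 C3 A9

# Step 1: the UTF-8 bytes sit in a Latin1 column unchanged, so the
# database believes every byte is one Latin1 character.
as_latin1 = original.dup.force_encoding("ISO-8859-1")

# Step 2: converting the table re-encodes each of those "characters"
# as UTF-8, so the two bytes of the accented letter become four.
double_encoded = as_latin1.encode("UTF-8")

puts double_encoded
puts double_encoded.bytes.map { |b| "%02X" % b }.join(" ")

# Reversing the extra step recovers the original text. This works only
# when every byte survived the round trip.
repaired = double_encoded.encode("ISO-8859-1").force_encoding("UTF-8")
puts repaired == original
```

Dumping the data as Latin1 and relabeling the dump as UTF-8 avoids the second encoding step altogether, which is why that procedure is preferable to repairing the data afterward.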
The last step in this process is to set up the client connection to support UTF-8. Even if
all of the data is properly configured and using UTF-8, if MySQL thinks the client wants
Latin1 data, that is what it will send. The SQL command to set the client encoding in
MySQL is the following:
SET NAMES utf8;
The Rails MySQL connection adapter has an encoding option that sets the client
encoding as well; in lieu of sending the preceding command, just add the following
to your database.yml:
production:
  adapter: mysql
  (...)
  encoding: utf8
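The effect of a mismatched client encoding can be approximated in plain Ruby. This is only a simulation under stated assumptions: `send_to_client` is a hypothetical helper, not part of Rails or any MySQL driver, and it mimics the server converting a result to the client's character set and substituting a question mark for anything that character set cannot represent. It assumes Ruby 1.9 or later and uses ISO-8859-1 in place of MySQL's latin1.

```ruby
# Hypothetical helper: convert a stored value to the encoding the client
# asked for, replacing unrepresentable characters with "?".
def send_to_client(text, client_encoding)
  text.encode(client_encoding, undef: :replace, replace: "?")
end

# "naïve" followed by three Japanese characters, stored correctly as UTF-8.
stored = "na\u00EFve \u65E5\u672C\u8A9E"

latin1_result = send_to_client(stored, "ISO-8859-1")
utf8_result   = send_to_client(stored, "UTF-8")

# The accented letter exists in Latin1 and survives; the Japanese
# characters do not and are lost.
puts latin1_result.count("?")
puts utf8_result == stored
```

The data on disk is untouched in this scenario; only what the client receives is degraded, which is why setting the connection encoding is a separate step from converting the tables.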
At this time, MySQL's utf8 character set does not support 4-byte UTF-8 characters (those outside the Basic Multilingual Plane).
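If this limitation is a concern, incoming text can be checked for such characters before it is saved. The sketch below is one possible approach, assuming Ruby 1.9 or later; `four_byte_chars` is a hypothetical helper name, not a Rails method.

```ruby
# Hypothetical helper: return the characters in a UTF-8 string that need
# four bytes to encode.
def four_byte_chars(text)
  text.each_char.select { |ch| ch.bytesize == 4 }
end

# U+2603 (snowman) needs three bytes; U+1D11E (musical G clef) needs four.
sample = "snow \u2603 and clef \u{1D11E}"

offenders = four_byte_chars(sample)
puts offenders.length
```

A check like this could run in a model validation so that unsupported characters are rejected or stripped rather than reaching the database.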