
Brad Ediger

"Advanced Rails"


Be very careful converting a table that has existing data. If you have been using Rails 1.2
or later (which support UTF-8 by default) and have not converted your tables to UTF-8,
you may have UTF-8 data stored in the database as Latin1. If you then convert the
table to UTF-8, the conversion will be performed twice, which will corrupt your
data. The standard procedure in this case is to dump the data as Latin1, piping the
dump through sed to change the output character set to UTF-8:
mysqldump -uusername -p --default-character-set=latin1 mydb \
| sed -e 's/SET NAMES latin1/SET NAMES utf8/g' \
| sed -e 's/CHARSET=latin1/CHARSET=utf8/g' >mydb.sql
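As an aside, the double conversion described above can be reproduced in a few lines of Ruby. This is only an illustrative sketch: the helper names `double_encode` and `repair` are hypothetical, and ISO-8859-1 stands in for MySQL's latin1 (an assumption that holds for the sample character used here).

```ruby
# Simulate the corruption: UTF-8 bytes are mislabeled as Latin1 and then
# transcoded to UTF-8 a second time.
def double_encode(str)
  str.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

# Reverse it: transcode back to Latin1, then relabel the bytes as UTF-8.
def repair(str)
  str.encode("ISO-8859-1").force_encoding("UTF-8")
end

original = "caf\u00E9"              # "café", 5 bytes in UTF-8
corrupted = double_encode(original)

puts corrupted                      # the é now shows up as two characters
puts corrupted.bytesize             # 7
puts repair(corrupted) == original  # true
```

The dump-and-reload procedure has the same effect as `repair`: the data leaves MySQL byte-for-byte as Latin1 and is relabeled as UTF-8 on the way back in.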
Then, load the dump back into MySQL as UTF-8:
mysql -uusername -p --default-character-set=utf8 mydb <mydb.sql
The last step in this process is to set up the client connection to support UTF-8. Even if
all of the data is properly configured and using UTF-8, if MySQL thinks the client wants
Latin1 data, that is what it will send. The SQL command to set the client encoding in
MySQL is the following:
SET NAMES utf8;
The Rails MySQL connection adapter has an encoding option that sets the client
encoding as well; in lieu of sending the preceding command, just add the following
to your database.yml:
production:
  adapter: mysql
  (...)
  encoding: utf8
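To confirm that the setting is actually present for a given environment, the configuration can be read back with Ruby's standard YAML library. The following is a minimal sketch: `connection_encoding` and `SAMPLE_CONFIG` are hypothetical names, and it assumes a plain YAML file with no ERB tags or YAML aliases.

```ruby
require "yaml"

# Returns the encoding configured for the given environment,
# or nil if the environment or the key is missing.
def connection_encoding(yaml_text, env)
  config = YAML.safe_load(yaml_text) || {}
  config.fetch(env, {})["encoding"]
end

# Hypothetical stand-in for the contents of database.yml.
SAMPLE_CONFIG = <<~YAML
  production:
    adapter: mysql
    encoding: utf8
YAML

puts connection_encoding(SAMPLE_CONFIG, "production")  # prints utf8
```

For a real application, the text would come from something like `File.read("config/database.yml")`.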
At this time, MySQL does not support 4-byte UTF-8 characters.
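Characters that need four bytes in UTF-8 are those above U+FFFF, that is, outside the Basic Multilingual Plane. One way to find out whether a string would run into this limitation is to check for such characters before saving. The helper below is a sketch under the assumption that the input is already a valid UTF-8 string; `four_byte_utf8?` is a hypothetical name, not part of Rails.

```ruby
# Returns true if any character in the string occupies four bytes in UTF-8.
def four_byte_utf8?(str)
  str.each_char.any? { |ch| ch.bytesize == 4 }
end

puts four_byte_utf8?("caf\u00E9")   # false: é takes two bytes
puts four_byte_utf8?("\u20AC")      # false: the euro sign takes three bytes
puts four_byte_utf8?("\u{1F600}")   # true: U+1F600 takes four bytes
```

A check like this could be placed in a model validation, so that unsupported input is rejected with a clear message rather than reaching the database.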

