Say, you had a MySQL database that stores texts in languages such as Chinese, Japanese, Korean, Hebrew, Arabic, and so on, and recently there was a server move or upgrade, and all of sudden text in the above mentioned languages didn’t display properly. What do we do to fix it then?
That was what happened to me recently. My hosting company migrated this site to a new LAMP server, and managed to screw up the character set so my Chinese entries looked like gibberish. I spent some time researching, and was able to convert my WordPress database character set to utf8 successfully. Below is the notes for my own consumption. If it helps you, dear reader, all the better!
Please follow the procedures below step by step in the order presented here. If you need help running sql script using command line tools, I have a little discussion here. I am assuming you have access to command line tools or things like phpMyAdmin. Please remember to replace MyDb with your database name.
1. Switch to information_schema database by:
USE information_schema;
2. Run this statement below. It will generate sql statements to be executed later. Save the result in a sql script file.
SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', REPLACE(column_type, 'char', 'binary'), ';') FROM columns WHERE table_schema = 'MyDb' and data_type LIKE '%char%';
3. Run this statement below. It will generate sql statements to be executed later. Save the result in the same sql script file, below the sql statements generated earlier;
SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', REPLACE(column_type, 'text', 'blob'), ';') FROM columns WHERE table_schema = 'MyDb' and data_type LIKE '%text%';
4. Run this statement below. It will generate sql statements to be executed later. Save the result in the same sql script file, below the sql statements generated earlier;
SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', column_type, ' CHARACTER SET utf8;') FROM columns WHERE table_schema = 'MyDb' and data_type LIKE '%char%';
5. Run this statement below. It will generate sql statements to be executed later. Save the result in the same sql script file, below the sql statements generated earlier;
SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', column_type, ' CHARACTER SET utf8;') FROM columns WHERE table_schema = 'MyDb' and data_type LIKE '%text%';
6. Add the line below to the sql script file;
ALTER DATABASE MyDb CHARACTER SET utf8;
7. Switch to MyDb by:
USE MyDb;
8. Run the script file generated from the above steps, and you should be good.
Enjoy! My procedure above is especially useful for upgrading databases from MySQL 4 and earlier to MySQL 5 and later, as discussed here. It appears that for MySQL 5 and later, it is possible to convert character set of all columns in one command, like:
ALTER TABLE wp_posts CONVERT TO CHARACTER SET utf8;
However, I didn’t test it so I am not sure.
By the way, if you are interested in collation in MySQL, especially for Chinese, I wrote a little note here.