Comment 2 for bug 1171653

Daniel Nichter (daniel-nichter) wrote :

Ivan,

Brian's suggestion works: --alter "CHARACTER SET utf8, MODIFY email VARCHAR(128) CHARACTER SET utf8 COLLATE utf8_unicode_ci". This will make the charset of the new table utf8. I've tested and confirmed this.
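For reference, here is a minimal sketch of how the full invocation might be assembled. The database name `mydb`, the table name `users`, and the helper `build_pt_osc_command` are hypothetical, and connection options (host, user, password) are left out. It only builds and prints the command, defaulting to --dry-run, so nothing is changed until you run it yourself with --execute.

```python
import shlex

def build_pt_osc_command(database, table, alter, charset=None, execute=False):
    """Build an argv list for pt-online-schema-change (hypothetical helper)."""
    argv = ["pt-online-schema-change", "--alter", alter]
    if charset:
        # Connection charset for the tool, e.g. utf8.
        argv.append(f"--charset={charset}")
    # Default to a dry run; pass execute=True to really alter the table.
    argv.append("--execute" if execute else "--dry-run")
    # DSN: D is the database, t is the table.
    argv.append(f"D={database},t={table}")
    return argv

alter = ("CHARACTER SET utf8, MODIFY email VARCHAR(128) "
         "CHARACTER SET utf8 COLLATE utf8_unicode_ci")
argv = build_pt_osc_command("mydb", "users", alter, charset="utf8")

# Print a shell-quoted command line for review.
print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the arguments in a list avoids quoting problems with the spaces and commas inside the --alter value.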

Your original report says that the temp (i.e. the new) table is created with latin1 and that you want to run --alter "MODIFY email VARCHAR(128) CHARACTER SET utf8 COLLATE utf8_unicode_ci". That leads me to believe that you have a latin1 table (i.e. SHOW CREATE TABLE lists DEFAULT CHARSET=latin1) but want to make a single column utf8. There are two issues here:

1) pt-osc creates the new table with the original table's charset--latin1 in this case. I've also tested and confirmed this. This is how it should work.

2) Having mixed charsets is complicated, if even possible. If you can make the whole table utf8, either as per Brian's suggestion or with --alter "CONVERT TO CHARACTER SET charset_name" (which should work too; see http://dev.mysql.com/doc/refman/5.1/en/alter-table.html), then --charset=utf8 will also be appropriate. Otherwise, if the table is latin1 but a single column is utf8 (e.g. the email column in this case), then I'm not sure what --charset should be: latin1 or utf8? Any other program or tool using this database will probably have the same problem. I don't know if there's a solution to this; maybe http://mysql.rjweb.org/doc.php/charcoll has some insights.
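To see whether a table is in the mixed state described above, one option is to compare the table default against any per-column charsets in the SHOW CREATE TABLE output. The following is only a rough sketch: the function name `charset_summary` and the sample table definition are made up for illustration, and the regular expressions assume the usual layout of SHOW CREATE TABLE output (one backtick-quoted column per line).

```python
import re

def charset_summary(create_table_sql):
    """Return (table_default_charset, {column: charset}) for columns that
    declare their own CHARACTER SET in SHOW CREATE TABLE output."""
    default = re.search(r"DEFAULT CHARSET=(\w+)", create_table_sql)
    table_charset = default.group(1) if default else None
    columns = {}
    for line in create_table_sql.splitlines():
        # Column definitions start with a backtick-quoted column name.
        match = re.match(r"\s*`(\w+)`\s.*?CHARACTER SET (\w+)", line)
        if match:
            columns[match.group(1)] = match.group(2)
    return table_charset, columns

# Hypothetical output for a latin1 table with a single utf8 column.
sample = """CREATE TABLE `users` (
  `id` int(11) NOT NULL,
  `name` varchar(64) DEFAULT NULL,
  `email` varchar(128) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1"""

table_charset, overrides = charset_summary(sample)
# Columns whose charset differs from the table default.
mixed = {col: cs for col, cs in overrides.items() if cs != table_charset}
print(table_charset, mixed)
```

If `mixed` comes back non-empty, the table is in the latin1-plus-utf8 situation where the right value for --charset is unclear.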

In any case, I don't see a bug here, just a case of mixed charsets or charset conversion that you'll need to research carefully. If I'm wrong and you can demonstrate that the tool isn't doing something it should, let me know and we'll re-open this issue.