Changing encoding of text in MySQL database
stukov
06-12-2008, 01:31 AM
Hola,
I have a database full of text. I had to change the encoding of the tables and of the database itself. Now all my French accents are screwed up. Does anyone know of, or have, a script that could de-corrupt my database and its poor accents?
Thanks!
ohauer
06-19-2008, 09:02 PM
Hopefully you have a good dump from before the changes were applied.
You can try to convert the dump with iconv.
Dump the tables without data
$> mysqldump -d --databases SourceDB -u YourMySQLAdmin -p > SourceDB_tables.dump
Dump the data only
$> mysqldump -n -t --databases SourceDB -u YourMySQLAdmin -p > SourceDB_data.dump
Replace the encoding in the SourceDB_tables.dump file with the new encoding, then create a new database from this file to test the next steps.
To list all encodings that can be used with iconv:
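The charset swap in the schema dump can be scripted with sed. This is only a minimal sketch: it assumes the dump declares its encoding as `CHARSET=latin1` or `CHARACTER SET latin1`, which is what mysqldump usually writes for a latin1 table. The helper name `set_dump_charset` is hypothetical, and `latin1` and `utf8` are example values that do not come from this thread.

```shell
#!/bin/sh
# set_dump_charset OLD NEW: hypothetical helper, reads a schema dump on stdin
# and writes it to stdout with the declared character set replaced.
# Assumption: the encoding appears as "CHARSET=OLD" or "CHARACTER SET OLD".
set_dump_charset() {
    sed -e "s/CHARSET=$1/CHARSET=$2/g" \
        -e "s/CHARACTER SET $1/CHARACTER SET $2/g"
}
```

It would be used as `set_dump_charset latin1 utf8 < SourceDB_tables.dump > SourceDB_tables.utf8.dump`. COLLATE clauses are not touched by this sketch, so check the result by hand before loading it.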
$> iconv -l
Now convert the data to the new encoding.
$> cat SourceDB_data.dump | iconv -f US-ASCII -t UTF-8 > SourceDB_data.utf8
Now import the converted data into the new database. If all went OK, you are fine.
I cannot describe the exact steps for the export/import since I don't use MySQL, but I hope the approach is clear now.
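Before picking the `-f` encoding, it can help to check what the data dump actually contains. The sketch below asks iconv to validate the file first as US-ASCII and then as UTF-8. `guess_encoding` is a hypothetical helper name, and the check relies on iconv returning a non-zero exit status for bytes it cannot convert. It cannot tell the various 8-bit encodings apart.

```shell
#!/bin/sh
# guess_encoding FILE: hypothetical helper that prints a rough guess.
#   ascii  - only 7-bit bytes, so any -f choice gives the same result
#   utf-8  - already valid UTF-8, converting it again would double-encode it
#   8-bit  - some single-byte encoding such as ISO-8859-1 (not identified here)
guess_encoding() {
    if iconv -f US-ASCII -t US-ASCII "$1" >/dev/null 2>&1; then
        echo "ascii"
    elif iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1; then
        echo "utf-8"
    else
        echo "8-bit"
    fi
}
```

For example, `guess_encoding SourceDB_data.dump` printing `utf-8` would suggest that the dump should not be run through iconv at all.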
stukov
06-25-2008, 04:53 PM
Thanks for the reply, ohauer. Sorry it took me some time to answer.
I followed your steps and understood the explanations. Thanks, it was very clear. However, I get the following error message when running "cat db.sql | iconv -f US-ASCII -t UTF-8 > db-utf8.sql":
iconv: (stdin):40:531: cannot convert
What are the possible causes of this error?
Thank you very much.
ohauer
06-25-2008, 06:18 PM
The error means the dump contains a character that is not in the US-ASCII table, so iconv cannot convert it.
The error is at line 40, column 531.
Take a look at this character with your favorite editor; maybe it is ISO-... or something else.
Since you are converting a database which has French characters, try the following and see if it works:
iconv -f ISO-8859-1 -t UTF-8 < source_dump.sql > target_dump.utf8
You may need to change ISO-8859-1 to something else.
stukov
07-15-2008, 09:52 PM
Thanks for the solution, ohauer. However, it looks like some parts of the database are encoded two or three times, or with another encoding. I had to rewrite the accents one by one in the DB.
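Text that was converted to UTF-8 twice (an é showing up as Ã©) can sometimes be repaired by converting it back one step. The sketch below assumes the extra layer came from reading UTF-8 bytes as ISO-8859-1; MySQL's latin1 is closer to Windows-1252, so that name may be the better choice for some data. `repair_double_utf8` is a hypothetical helper. It removes one layer only when the result is still valid UTF-8, and otherwise prints the input unchanged.

```shell
#!/bin/sh
# repair_double_utf8 FILE: hypothetical helper that undoes ONE layer of
# double UTF-8 encoding and writes the result to stdout.
# Assumption: the extra layer was created by treating UTF-8 as ISO-8859-1.
repair_double_utf8() {
    out=$(mktemp) || return 1
    if iconv -f UTF-8 -t ISO-8859-1 "$1" > "$out" 2>/dev/null &&
       iconv -f UTF-8 -t UTF-8 "$out" >/dev/null 2>&1; then
        cat "$out"   # one layer removed, result is still valid UTF-8
    else
        cat "$1"     # not double-encoded (or not repairable this way)
    fi
    rm -f "$out"
}
```

For text encoded three times, the function can be run again on its own output until nothing changes. It checks the whole file at once, so a dump that mixes correct and double-encoded rows would have to be split and processed row by row.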
lvlamb
07-15-2008, 10:48 PM
This is an application-level encoding problem. It is in no way related to the underlying OS.
Tables can be encoded in any code page you wish: locally, per user, per table, or system-wide.
Answers on http://mysql.org