Upgrade from 2.8.3 to 2.10.0-latin1 to utf8

contactjw:
OK. Thanks. Will do.

contactjw:
OK. I changed all DB entries from latin1 to utf8, but that doesn't seem to have resolved the problem. The problem seems to be limited to the news module pages and one page I use with a sortable list; the other pages are OK. When I open one of the affected pages in the editor and check the source code, I find â€™ in place of every ' (single quote) throughout the page. (See the first example of code below.)
The second problem is in the sortable table list: I get ï»¿ in some places, on one side or both sides of an entry, but not in others. (See the bottom portion of the second example below.)
It seems the editor is changing or adding these characters. Is there a way to resolve this without manually fixing each entry?

1.
<div class="alignCenter">When things go wrong, as they sometimes will, 
When the road youâ€™re trudging seems all uphill, 
When the funds are low and the debts are high, 
And you want to smile, but you have to sigh, 
When care is pressing you down a bit, 
Rest, if you mustâ€”but donâ€™t you quit. 
 

2.
<tr>
 <td width="135">
 <a href="http://en.wikipedia.org/wiki/Marathon_(2005_film)" target="_blank">Marathon</a></td>
 <td width="175">
 <a href="http://en.wikipedia.org/wiki/Jo_Seung-woo" target="_blank" title="Jo Seung-woo">Jo Seung-woo</a> 
 <a href="http://en.wikipedia.org/wiki/Kim_Mi-sook" target="_blank" title="Kim Mi-sook">Kim Mi-sook</a> 
 <a href="http://en.wikipedia.org/wiki/Lee_Ki-young" target="_blank" title="Lee Ki-young">Lee Ki-young</a></td>
 <td width="137"> based on (Korean w/ Engish subtitles)</td>
 <td width="111">courage</td>
 <td width="52">2005</td>
 <td width="85">ï»¿Amazon</td>
 <td width="72">ï»¿Netflix</td>
 </tr>

contactjw:
OK. I looked it over and there weren't that many entries that needed to be corrected so I went ahead and did them manually. Funny thing is there wasn't any consistency to the problem entries. Some places had them and others didn't. :?

hgs:
That's encouraging! I also have a website that urgently needs upgrading to the current version. :-D

Gast:
Question: are the examples in #6 from the SQL backup file?

To me it looks like a wrong encoding somewhere between the browser and the charset in use.
â€™ is the UTF-8 encoding of the typographic apostrophe (’) displayed in the wrong charset. It has to show in your browser as ’ if the browser uses charset utf8. If you see â€™ when you read the database with a tool like phpMyAdmin, that points to a wrong table collation.
But...
it's also possible that you used the wrong method when importing the repaired backup.
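
To make the point about â€™ concrete, here is a small Python illustration (not part of the original post). It assumes the wrong charset is Windows-1252, which is what MySQL calls latin1, and shows how one UTF-8 character becomes three wrong ones and how the wrong decoding can be reversed.

```python
# Illustration only: how a UTF-8 apostrophe turns into three "wrong" characters.
curly = "\u2019"                 # the typographic apostrophe
raw = curly.encode("utf-8")      # three bytes: E2 80 99
garbled = raw.decode("cp1252")   # the same bytes read as Windows-1252
print(garbled)

# The same thing happens to a UTF-8 byte order mark (BOM, U+FEFF).
bom_garbled = "\ufeff".encode("utf-8").decode("cp1252")
print(bom_garbled)

# Reversing the wrong decoding restores the original character.
restored = garbled.encode("cp1252").decode("utf-8")
print(restored == curly)
```

The first print gives the sequence seen in the poem example, the second the sequence seen in the table cells.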

1. First of all (see the post from Darkviper): never use plain Notepad (the Windows editor) or WordPad for a job like this. It is better to use an editor like Notepad++ or a good IDE like NetBeans with a built-in converter.
2. First look into the database with a tool like phpMyAdmin or MySQLDumper and check the table collation, and then the "Collation" column for the fields in the overview (see my example).

The whole database has a collation (picture 1), and so does every table (see picture 2). Use utf8_general_ci or utf8_unicode_ci here.
You can change this in your backup file. At the end of every table structure you will find settings like

--- Quote ---ENGINE=MyISAM AUTO_INCREMENT=28 DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;
--- End quote ---
DEFAULT CHARSET has to be changed to utf8.
COLLATE has to be changed to utf8_general_ci or utf8_unicode_ci.

3. Look into every table for the collations of the fields; see the next picture from my backup. During installation, every field gets its collation from the settings in the WB installer. You say your old WB ran with latin1, so I'm sure every single field now has the wrong collation, as in my screenshot.

You also have to change every field collation in the backup file, in this example from latin1_german1_ci to utf8_general_ci or utf8_unicode_ci :roll: If you don't, you write the wrong collation for these tables when you import the repaired backup, and you will see the problem in the output later.

The best method for a job like this is a PHP-based converter that works by search & replace. A lot of users have their own converter, but it gets very complex because there are too many possibilities. For example: if my converter code searches for latin1_german1_ci and my database uses latin1_german2_ci, the script finds nothing and replaces nothing. In the next job, the database uses latin1_swedish_ci instead of latin1_german1_ci. In simple words: with the search & replace method it is not possible to build a converter that works for everybody in every case :| My converter uses a long list of about 20 different collations and spellings, and every time I find a new combination in a backup file, I add a new line to my converter and start the job again.

It's also possible to replace the wrong collations with Notepad++, but you have to check every table structure (the same problem as with the PHP converter if you use different collations).
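
One way around the "long list of collations" problem is to match any latin1 collation with a pattern instead of listing each one. The following is only a sketch in Python, not the PHP converter mentioned above. The function name is made up for illustration, and it assumes the dump writes its declarations in the usual forms `CHARSET=latin1`, `CHARACTER SET latin1` and `COLLATE latin1_...` (with or without `=`).

```python
import re

# Assumption: declarations look like "DEFAULT CHARSET=latin1",
# "CHARACTER SET latin1" or "COLLATE[=]latin1_<anything>".
CHARSET_RE = re.compile(r"\b(CHARSET|CHARACTER SET)(\s*=\s*|\s+)latin1\b",
                        re.IGNORECASE)
COLLATE_RE = re.compile(r"\b(COLLATE)(\s*=\s*|\s+)latin1_\w+", re.IGNORECASE)


def latin1_to_utf8_declarations(sql, collation="utf8_general_ci"):
    """Rewrite latin1 charset/collation declarations to utf8 (hypothetical helper)."""
    # Keep the keyword and separator, swap only the charset name.
    sql = CHARSET_RE.sub(r"\g<1>\g<2>utf8", sql)
    # Any latin1_* collation (german1, german2, swedish, ...) gets the new one.
    sql = COLLATE_RE.sub(lambda m: m.group(1) + m.group(2) + collation, sql)
    return sql


line = ("ENGINE=MyISAM AUTO_INCREMENT=28 DEFAULT CHARSET=latin1 "
        "COLLATE=latin1_german1_ci;")
print(latin1_to_utf8_declarations(line))
```

Note that the patterns run over the whole text, so they would also change a matching string inside the stored page content. Checking the result by hand, as recommended above, still applies.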

Back to your code and to Notepad++.
Open the backup SQL file with Notepad++ and go to "Encoding" in the top menu. Set Notepad++ to UTF-8 without BOM here (important!). Your code shows ï»¿, and that is a sign of plain UTF-8 (with BOM).
When you open the file, the position of the black dot shows the current status, here "UTF8 ohne BOM" (UTF-8 without BOM).
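
If you prefer to check for the BOM outside of Notepad++, the test is simple: a UTF-8 file with BOM starts with the three bytes EF BB BF. A minimal Python sketch, with helper names invented for this example:

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """True if the raw file content begins with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the content without a leading UTF-8 BOM, if there is one."""
    if starts_with_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


sample = codecs.BOM_UTF8 + b"SET NAMES 'utf8';"
print(starts_with_utf8_bom(sample), strip_utf8_bom(sample))
```

For a real file you would read the bytes first, for example with `open("backup.sql", "rb").read()`, where `backup.sql` is only a placeholder name.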

If the current status is not UTF-8 without BOM, choose "Convert to UTF-8 without BOM" in the same menu.
Now check for some special characters:
- Replace all ï»¿ with nothing.
- Search for other combinations. Most garbled UTF-8 characters start with Â, Ã or Å. These are letters from languages such as Spanish or Swedish and are not used in English. Search for these letters and check every result; the best method is to edit every result individually, or to run a separate search for every UTF-8 character. See also a simple list from here. Search your file for the letters marked in blue, in both lower and upper case, and replace them with the real characters (â€™ == ', â€“ == -, etc.).
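
The two checks above can also be scripted. This is a hedged Python sketch, not a tool from this thread. It assumes the text was UTF-8 wrongly read as Windows-1252, only handles two- and three-byte sequences, and puts back the real Unicode character (for example the typographic apostrophe) rather than a plain ASCII one. `find_suspects` and `repair` are names made up for the example.

```python
import re


def _cp1252_chars(lo, hi):
    """Characters that cp1252 assigns to the bytes lo..hi."""
    chars = []
    for b in range(lo, hi + 1):
        try:
            chars.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            pass  # a few bytes (e.g. 0x81) are undefined in cp1252
    return "".join(chars)


# How UTF-8 continuation bytes (0x80-0xBF) look after the wrong decoding.
CONT = re.escape(_cp1252_chars(0x80, 0xBF))
# Lead character of a 2-byte sequence plus one continuation character,
# or lead character of a 3-byte sequence plus two of them.
SUSPECT = re.compile("[\u00c2-\u00df][" + CONT + "]"
                     "|[\u00e0-\u00ef][" + CONT + "]{2}")


def find_suspects(text):
    """List (line number, sequence) pairs so each hit can be checked by hand."""
    return [(n, m.group(0))
            for n, line in enumerate(text.splitlines(), start=1)
            for m in SUSPECT.finditer(line)]


def repair(text):
    """Undo the wrong decoding for suspect sequences and drop stray BOMs."""
    def fix(m):
        try:
            return m.group(0).encode("cp1252").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            return m.group(0)  # not valid UTF-8 after all: leave untouched
    return SUSPECT.sub(fix, text).replace("\ufeff", "")
```

Running `find_suspects` first and reading the hits matches the advice to check every result, because a legitimate word can occasionally look like a garbled sequence.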

As the last step, add this line somewhere near the top of the SQL file:

--- Code: ---SET NAMES 'utf8' COLLATE 'utf8_general_ci';
--- End code ---

Save the repaired file under a new file name, perhaps with the date and time at the end of the name. Now you can import the file as a UTF-8 formatted file.
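
The last two steps (add the SET NAMES line, save under a new name as UTF-8 without BOM) could look like this in Python. It is only a sketch: `save_repaired_dump` and the file name `backup.sql` are placeholders for this example, and the timestamp format is one arbitrary choice.

```python
from datetime import datetime
from pathlib import Path

SET_NAMES = "SET NAMES 'utf8' COLLATE 'utf8_general_ci';"


def save_repaired_dump(sql, original, now=None):
    """Write sql next to the original file under a timestamped name."""
    now = now or datetime.now()
    src = Path(original)
    # e.g. backup.sql -> backup_20240131_154500.sql
    target = src.with_name(f"{src.stem}_{now:%Y%m%d_%H%M%S}{src.suffix}")
    if SET_NAMES not in sql:
        sql = SET_NAMES + "\n" + sql  # put the line at the very top
    # encoding="utf-8" writes no BOM; the BOM-writing codec is "utf-8-sig".
    target.write_text(sql, encoding="utf-8")
    return target
```

Called as `save_repaired_dump(repaired_sql, "backup.sql")`, it leaves the original backup untouched and returns the path of the new file.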
