WebsiteBaker Community Forum

WebsiteBaker Support (2.8.x) => WebsiteBaker Language Files => Topic started by: scolombo on July 02, 2009, 10:58:18 PM

Title: Chinese Characters not working on my site
Post by: scolombo on July 02, 2009, 10:58:18 PM
I have a client who has asked if we can include some Chinese on our English site.  Coincidentally, I have another new client who wants a multilanguage site (English, Chinese and Spanish), so I really need to determine whether WB can handle this or whether I need to find another CMS for my new client!

So, back to my problem.  When I type in the Chinese characters (actually, I copy them from some text from my client), they appear just fine in the editor (FCKEditor), and the same goes for the source view.  When I save the page and view it, they show up as jumbled English symbols and characters -- "伯佩知" -- (almost as if they were encoded twice or something?).  I took a look in the database at the mod_wysiwyg table: in the content field the characters look like the jumbled text, but in the text field they are correctly encoded -- "&#20271;&#20329;&#30693;".  Unfortunately, the content field is what shows up on my web page!  Both fields use the same collation in the db, utf8_general_ci -- so is this a problem with the FCKEditor?

I have been looking all over the place for a solution to this, and nothing tells me exactly what to do.  One post on a forum for the FCKEditor told me to remove a call to utf8_encode in the PHP for the editor, but I don't want to break my client's site or mess up her content by changing that.

Can anyone help?  I would love to use WB for my other project!

I hope this editor doesn't encode the characters I included above!!!  I never did enter the Chinese characters...just the encoded and possibly doubly encoded ones.

Thanks much,
Sharon
Title: Re: Chinese Characters not working on my site
Post by: Ruud on July 03, 2009, 12:07:19 AM
For Chinese characters you must have your site set to UTF-8 (Unicode) encoding.
This is the default setting in WB. Could it be you changed this? (settings -> advanced options)

I played around a bit, and for me it is working fine (copy/paste/view Chinese).

Ruud
Title: Re: Chinese Characters not working on my site
Post by: scolombo on July 03, 2009, 01:29:11 AM
Unfortunately, I do have it set to Unicode (utf-8).  Any other thoughts?

Thx,
Sharon
Title: Re: Chinese Characters not working on my site
Post by: Ruud on July 03, 2009, 12:03:17 PM
And the data you pasted was also UTF-8?
Attached is what I see when pasting Chinese.

Do you have something online to look at?

R

[deleted by administrator]
Title: Re: Chinese Characters not working on my site
Post by: scolombo on July 03, 2009, 09:29:08 PM
Thanks for your help.  I solved the problem.  Because I created my own templates, I wasn't using the right charset on my pages.  Your question about my page encoding prompted me to dig a little deeper.  Here's what I had:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

Here's what you guys have in your templates:

<meta http-equiv="Content-Type" content="text/html; charset=<?php if(defined('DEFAULT_CHARSET')) { echo DEFAULT_CHARSET; } else { echo 'utf-8'; }?>" />

Changing mine to this fixed it. 
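
For anyone else with a custom template, here is a minimal sketch of how the declared charset of a rendered page could be checked. It is written in Python rather than PHP purely for illustration; `CharsetFinder` and `declared_charset` are hypothetical names, not part of WB, and the check only looks at <meta> tags, not at HTTP headers.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Remembers the first charset declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        if attrs.get("charset"):
            # HTML5 style: <meta charset="utf-8">
            self.charset = attrs["charset"]
        elif (attrs.get("http-equiv") or "").lower() == "content-type":
            # Older style: content="text/html; charset=utf-8"
            for part in (attrs.get("content") or "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.lower() == "charset":
                    self.charset = value.strip()


def declared_charset(html_text):
    """Return the declared charset in lower case, or None if there is none."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset.lower() if finder.charset else None


old_tag = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />'
print(declared_charset(old_tag))  # iso-8859-1
```

Running this against the HTML the browser actually receives (after PHP has filled in DEFAULT_CHARSET) should report utf-8 once the template is fixed.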

What I find curious is that the database has two fields in the MOD_WYSIWYG table, one called content and one called text.  When I look at that table using phpMyAdmin, I see two values: one has the characters encoded the way I was seeing them above (incorrectly, as "伯佩知") and one has them as what appear to be hex values, "&#20271;&#20329;&#30693;".  I am curious about how this works.  Is one field encoded twice, and then, if the page is set to the correct content type, "un-encoded" correctly for display?  Even with this switch the data is still stored this way.  I know this is outside the scope of my original question, but I'm still curious after all this digging around.  Actually, why are there two fields in the first place?  It seems like you'd only need the text field.

Thanks Ruud for the time and effort!

Sharon
Title: Re: Chinese Characters not working on my site
Post by: Ruud on July 03, 2009, 11:04:07 PM
The text field in the database is used for (the old) search, not for display.
It is stripped of all HTML, and special characters (including Chinese) are replaced by "readable" entities.

The new search, introduced in WB 2.7, actually searches the content field.
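
As a rough illustration of what the "readable" entities in the text field are: each value is a decimal numeric character reference, i.e. the Unicode code point of the character written in base 10. The sketch below is Python, not the PHP that WB runs, and `to_numeric_entities` is a hypothetical helper written for this example; it is not WB's actual conversion routine.

```python
import html


def to_numeric_entities(text):
    """Replace every non-ASCII character with a decimal reference such as &#20271;."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


chinese = "\u4f2f\u4f69\u77e5"      # the three characters quoted in this thread
encoded = to_numeric_entities(chinese)
print(encoded)                       # &#20271;&#20329;&#30693;

# Going the other way: a browser (or html.unescape) turns the references
# back into the original characters, whatever charset the page declares.
print(html.unescape(encoded) == chinese)  # True
```

Because the references are plain ASCII, they survive any charset mix-up, which is why the text field looked correct even while the page itself was garbled.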

The characters you cannot read in the content field are the actual data used for display, just shown without the correct decoding. The decoding is done by the browser, following the instructions in the <meta> tag that declares the correct encoding.
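
The effect can be reproduced outside WB. The following Python sketch assumes the content field holds the UTF-8 bytes of the three characters from this thread and that the viewer decodes them as windows-1252, which is how browsers treat a page labelled iso-8859-1. Nothing is encoded twice; the same bytes are simply read with the wrong table.

```python
chinese = "\u4f2f\u4f69\u77e5"          # the three characters from this thread

# What is stored: nine UTF-8 bytes, three per character.
raw = chinese.encode("utf-8")
print(raw.hex(" "))                      # e4 bc af e4 bd a9 e7 9f a5

# What a page declared as iso-8859-1 shows: one symbol per byte.
garbled = raw.decode("cp1252")
print(garbled)                           # ä¼¯ä½©çŸ¥

# The bytes themselves are intact, so decoding them as UTF-8 restores the text.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == chinese)               # True
```

This is also why fixing the <meta> tag was enough: the stored data never needed to change, only the way it was decoded.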

Anyway, good to hear it is working now.

Ruud