[isf-wifidog] Char encoding inconsistency

Francois Proulx fproulx at lecameleon.net
Fri Apr 1 09:07:16 EST 2005


Hello everyone!

As I was changing the mail headers in the validation, lost password and lost username e-mails, I realized there is an inconsistency in our translation scheme. The shell scripts that extract strings via xgettext are configured to read the sources as ISO-8859-1, which is fine because our codebase is encoded that way. However, xgettext automatically transcodes those strings to UTF-8 when writing the .po file. I checked the documentation and we can't really do anything about it; it's just the way gettext was designed. Anyway, it's the way to go, since you can easily create .po files with Japanese, Russian, French, Arabic ... translations, as long as you make sure your text editor reads and writes UTF-8. The problem is that the charset declaration in the Smarty templates (i.e. header_small, header...) says ISO-8859-1, while our gettext binding in language.php is set to UTF-8: bind_textdomain_codeset('messages', 'UTF-8');
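
To make the mismatch concrete, here is a small sketch in Python (not the project's PHP, just an illustration). The sample string is made up and is not taken from our .po files. It shows what happens when UTF-8 bytes coming out of gettext are displayed by a page that declares ISO-8859-1:

```python
# Hypothetical sample string, not taken from the WiFiDog .po files.
source_text = "Mot de passe oublié"

# Bytes as they would sit in an ISO-8859-1 encoded source file.
latin1_bytes = source_text.encode("iso-8859-1")

# Roughly what xgettext does when writing the .po file:
# read the input as ISO-8859-1, write it out as UTF-8.
utf8_bytes = latin1_bytes.decode("iso-8859-1").encode("utf-8")

# A page declaring ISO-8859-1 but receiving UTF-8 bytes
# shows two characters where the accented letter should be.
misread = utf8_bytes.decode("iso-8859-1")

# A page declaring UTF-8 reads the same bytes back correctly.
correct = utf8_bytes.decode("utf-8")

print(len(latin1_bytes), len(utf8_bytes))
print(misread)
print(correct)
```

The accented character takes one byte in ISO-8859-1 and two in UTF-8, which is why the declared charset has to match what gettext actually returns.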

That means we've had this inconsistency since the beginning, but we never noticed it because all the special characters in our French .po file are HTML entities... which is kind of bad if we want to use those strings outside of a web page (of course we could convert the entities through PHP, but that's not the way to go). I suggest that I change the charset declaration in the header files to UTF-8, to be consistent with our bind_textdomain_codeset declaration, and that from now on we make sure we edit our .po files in UTF-8. We have no choice, because I had to write the e-mail translations as text/plain UTF-8.
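
As a rough illustration of why entities are a problem outside of HTML, here is a Python sketch (again, not our PHP code; the msgstr value is a hypothetical example). An entity-encoded translation would show up literally as "&eacute;" in a plain-text mail unless it is decoded first, whereas a string stored directly in UTF-8 needs no extra step:

```python
import html
from email.message import EmailMessage

# Hypothetical msgstr as it might be stored today, using an HTML entity.
entity_msgstr = "Mot de passe oubli&eacute;"

# Extra step needed before the string can be used outside a web page.
plain = html.unescape(entity_msgstr)

# Build a text/plain message with a UTF-8 body.
msg = EmailMessage()
msg["Subject"] = plain
msg.set_content(plain, charset="utf-8")

print(msg.get_content_type())     # MIME type of the body
print(msg.get_content_charset())  # charset declared for the body
```

If the .po file holds real UTF-8 characters instead of entities, the unescape step disappears and the same string works for both the web pages and the e-mails.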

I don't think there is much to debate in choosing UTF-8 over ISO-8859-1: it lets us support many more languages, and it makes little difference to us because gettext does all the conversion.

See ya

