[Catalyst] Double encoding of UTF8 strings

Francisco Obispo fobispo at isc.org
Fri Oct 7 17:16:44 GMT 2011


JSON will encode its output as UTF-8, and if the data you hand it is already UTF-8 encoded bytes (rather than decoded Perl character strings), it will most likely end up double encoded.
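
To make the failure mode concrete, here is a minimal sketch that only uses the core Encode module (the variable names are mine, purely for illustration). Encoding a string that is already UTF-8 bytes turns the two bytes of "é" into four:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

my $chars = "\x{e9}";               # one character: é (U+00E9)
my $once  = encode_utf8($chars);    # UTF-8 bytes for é
my $twice = encode_utf8($once);     # those bytes encoded again, as if they were characters

# Print each string as a list of hex byte values.
printf "once:  %s\n", join ' ', map { sprintf '%02X', ord } split //, $once;
printf "twice: %s\n", join ' ', map { sprintf '%02X', ord } split //, $twice;
```

The second line of output has twice as many bytes as the first, which a browser reading UTF-8 shows as two junk characters instead of one "é".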

This could be fixed in two ways:

1) when loading your UTF-8 data, convert it to Perl's internal encoding with:

use Encode qw(decode_utf8);

my $perl_encoded = decode_utf8($utf_encoded_string);

and then use JSON's encode method to encode the data.


or:

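2) avoid converting a string that has already been converted, by checking
it with is_utf8() from the Encode module. Note that is_utf8() reports
whether Perl's internal UTF-8 flag is set on the string, not whether its
bytes happen to be valid UTF-8.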
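
Both options can be sketched roughly as follows. This is an illustration under assumptions, not a drop-in fix: it uses the core JSON::PP module in place of JSON (they share the same encoder interface), the sample string stands in for whatever your database returns, and to_characters() is a hypothetical helper name.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 is_utf8);
use JSON::PP;    # assumption: stands in for JSON, same new/utf8/encode interface

# "café" as raw UTF-8 bytes, e.g. as handed back by a database driver.
my $utf8_bytes = "caf\xC3\xA9";

my $json = JSON::PP->new->utf8;    # encoder that emits UTF-8 bytes

# Raw bytes passed straight in: each byte is treated as a character,
# so the é comes out double encoded.
my $double = $json->encode([$utf8_bytes]);

# Option 1: decode to Perl characters first, then encode exactly once.
my $good = $json->encode([ decode_utf8($utf8_bytes) ]);

# Option 2 (hypothetical helper): decode only strings that do not yet carry
# Perl's internal UTF-8 flag, so calling it twice does no harm.
sub to_characters {
    my ($string) = @_;
    return is_utf8($string) ? $string : decode_utf8($string);
}

my $also_good = $json->encode([ to_characters( to_characters($utf8_bytes) ) ]);

printf "%d bytes double encoded, %d bytes correct\n",
    length($double), length($good);
```

One caveat with the is_utf8() guard: a Latin-1 character string that Perl has not flagged looks the same as raw bytes to this check, so it is only safe when you know the unflagged input really is UTF-8.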


You might want to look at the documentation of DBD::mysql (I haven't used it in a while) and consult the section on encodings. It seems there might be flags that need to be set for the right encoding/decoding to happen on serialization.
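
For what it's worth, the attribute I have in mind is probably mysql_enable_utf8; check the DBD::mysql documentation for your version before relying on it. A rough sketch of how the connection arguments might be assembled (the helper name, database name and host are placeholders I made up):

```perl
use strict;
use warnings;

# Hypothetical helper: builds the DSN and attribute hash for a connection
# that asks DBD::mysql to treat text as UTF-8. Nothing is connected here.
sub utf8_connect_args {
    my ($database, $host) = @_;
    my $dsn  = "DBI:mysql:database=$database;host=$host";
    my %attr = (
        RaiseError        => 1,
        mysql_enable_utf8 => 1,    # DBD::mysql attribute; see its docs for exact behaviour
    );
    return ($dsn, \%attr);
}

my ($dsn, $attr) = utf8_connect_args('mydb', 'localhost');

# With DBI and DBD::mysql installed, the connection itself would then be:
#   my $dbh = DBI->connect($dsn, $user, $password, $attr);
```

If the driver already returns decoded character strings, the JSON view should only have to encode once and no decoding is needed on your side.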

Cheers,

Francisco





On Oct 7, 2011, at 2:21 AM, jul.gil at gmail.com wrote:

> Hi,
> 
> I have installed and successfully run the AutoCRUD plugin. I set up the
> MySQL database tables to use the UTF8 charset, and the charset in the Ajax
> requests is utf-8. Everything seems correct, except that the data in the
> grids are double encoded, that is, Ã© instead of é.
> 
> I am pretty sure that the data in the database are correct, and the
> json data are also correctly displayed (the raw data received contains
> the wrong characters). The double encoding therefore seems to appear in the
> View of AutoCRUD, which is just a Catalyst::View::JSON used
> without any customization.
> 
> I have hacked around this AutoCRUD JSON view, and ended up with a solution:
> - override the encode_json method in the view to call JSON::XS
> directly without the utf8 call, i.e. a simple
> JSON::XS->new->encode($data);
> - and override the process method in order to remove the $json =
> Encode::encode($encoding, $json);
> 
> In short : don't touch my data, they are already in utf8, just send
> them to the browser. It works, but I don't know why...
> 
> It is probably a bad solution, as I can't imagine that
> Catalyst::View::JSON is wrong, but I wonder what the correct way to
> do it is.
> Couldn't it be a problem in DBIx::Class not handling utf8 columns
> correctly?
> 
> --
> Julien Gilles.
> 

Francisco Obispo 
email: fobispo at isc.org
Phone: +1 650 423 1374 || INOC-DBA *3557* NOC
Key fingerprint = 532F 84EB 06B4 3806 D5FA  09C6 463E 614E B38D B1BE







