[Dbix-class] Wrong UTF-8 handling in DBIx::Class/DBD::mysql despite mysql_enable_utf8
Matias E. Fernandez
pisco at gmx.ch
Wed May 12 14:38:17 GMT 2010
Hello Marc
On 2010-05-12, at 15:40, Marc Mims wrote:
>> my $title = "\x{e4}\x{f6}\x{fc}"; # "äöü"
>
> This isn't a UTF-8 string.
>
> utf8::is_utf8($title); # false
>
> utf8::upgrade($title); # now it is
It is a string consisting of the three characters \x{e4}, \x{f6} and \x{fc}. That's about all I have to know as a Perl user; reread [1] if in doubt. The important thing to know is that you cannot rely on Perl holding strings internally in UTF-8!
Of course I could force Perl to hold this string internally in UTF-8 by using utf8::upgrade(), but the question is: where should I do that so as to cover all cases? As pointed out in [2], overriding get_columns and store_columns won't work reliably. That's why I suggested using the inflate/deflate subroutines, but will this work in all cases? Even then it would be a bad idea to use utf8::upgrade(), because that's not what it's meant for. As pointed out in [3], the flow should be as follows:
> 1. Receive and decode
> 2. Process
> 3. Encode and output
and as a matter of fact, neither DBIx::Class nor DBD::mysql does the 3rd step (encoding to UTF-8); if either did, the problem would not arise. Look at this:
my $title = "\x{e4}\x{f6}\x{fc}";
return Encode::encode('UTF-8', $title);
and
my $other_title = "\x{e4}\x{f6}\x{fc}";
utf8::upgrade($other_title);
return Encode::encode('UTF-8', $other_title);
Both yield the same result. Using utf8::upgrade() here is useless, and again: as pointed out in [1] you shouldn't care about the internal format.
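To make the comparison concrete, here is a small standalone sketch that runs both variants side by side. The variable names ($upgraded, $bytes_plain, $bytes_upgraded) are mine, chosen for illustration. It only shows that the internal flag changes while the encoded output does not; it says nothing about what DBD::mysql does with the string.

```perl
use strict;
use warnings;
use Encode ();

# The same three characters as above: U+00E4, U+00F6, U+00FC.
my $title = "\x{e4}\x{f6}\x{fc}";

# A copy that is forced into Perl's internal UTF-8 representation.
my $upgraded = $title;
utf8::upgrade($upgraded);

# The internal flag differs between the two copies ...
my $flag_plain    = utf8::is_utf8($title)    ? 1 : 0;
my $flag_upgraded = utf8::is_utf8($upgraded) ? 1 : 0;

# ... but as character strings they are still equal,
my $same_chars = $title eq $upgraded ? 1 : 0;

# and explicit encoding produces identical byte strings.
my $bytes_plain    = Encode::encode('UTF-8', $title);
my $bytes_upgraded = Encode::encode('UTF-8', $upgraded);

# Render a byte string as space-separated hex for inspection.
sub hex_dump { join ' ', map { sprintf '%02x', ord } split //, shift }

print "flag plain=$flag_plain upgraded=$flag_upgraded\n";
print "same characters: $same_chars\n";
print "plain:    ", hex_dump($bytes_plain),    "\n";
print "upgraded: ", hex_dump($bytes_upgraded), "\n";
```

Both hex dumps should show the same six bytes (two per character), which is the point: the explicit encode step decides what goes out, not the internal representation.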
My question remains: is deflate/inflate a safe place to do encoding, or will it suffer the same flaws as DBIx::Class::UTF8Columns?
Regards
Matias E. Fernandez
[1] http://perldoc.perl.org/5.12.0/perlunifaq.html#I-lost-track%3b-what-encoding-is-the-internal-format-really%3f
[2] http://search.cpan.org/~frew/DBIx-Class-0.08121/lib/DBIx/Class/UTF8Columns.pm#Warning_-_Module_does_not_function_properly_on_create/insert
[3] http://perldoc.perl.org/5.12.0/perlunitut.html#I%2fO-flow-(the-actual-5-minute-tutorial)