[Dbix-class] Wrong UTF-8 handling in DBIx::Class/DBD::mysql despite mysql_enable_utf8

Matias E. Fernandez pisco at gmx.ch
Wed May 12 14:38:17 GMT 2010


Hello Marc

On 2010-05-12, at 15:40, Marc Mims wrote:
>> my $title = "\x{e4}\x{f6}\x{fc}"; # "äöü"
> 
> This isn't a UTF-8 string.
> 
>    utf8::is_utf8($title); # false
> 
>    utf8::upgrade($title); # now it is

It is a string consisting of the three characters \x{e4}, \x{f6} and \x{fc}. That's about all I have to know as a Perl user; reread [1] if in doubt. The important thing to know is that you cannot rely on Perl holding strings internally in UTF-8! Of course I could force Perl to hold this string internally in UTF-8 by using utf8::upgrade(), but the question is: where should I do that so as to cover all cases? As pointed out in [2], overriding get_columns and store_columns won't work reliably. That's why I suggested using the inflate/deflate subroutines, but will this work in all cases? Even then it would be a bad idea to use utf8::upgrade(), because that's not what it's meant for. As pointed out in [3], the flow should be as follows:

> 1. Receive and decode
> 2. Process
> 3. Encode and output


As a matter of fact, neither DBIx::Class nor DBD::mysql does the third step (encoding to UTF-8); if either of them did, the problem would not arise. Look at this:

my $title = "\x{e4}\x{f6}\x{fc}";
return Encode::encode('UTF-8', $title);

and

my $other_title = "\x{e4}\x{f6}\x{fc}";
utf8::upgrade($other_title); 
return Encode::encode('UTF-8', $other_title);

Both yield the same result. Using utf8::upgrade() here is useless, and again: as pointed out in [1] you shouldn't care about the internal format.
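
To make the comparison above easy to check, here is a minimal, self-contained sketch of the same two cases. The variable names ($bytes_plain, $flag_upgraded and so on) are made up for illustration; the only calls used are utf8::upgrade(), utf8::is_utf8(), Encode::encode() and Encode::decode(). It is meant as a demonstration of the point, not as a fix for DBIx::Class or DBD::mysql.

```perl
use strict;
use warnings;
use Encode ();

# Three characters: U+00E4, U+00F6, U+00FC.
my $title = "\x{e4}\x{f6}\x{fc}";

# Same characters, but forced into Perl's internal UTF-8 representation.
my $other_title = "\x{e4}\x{f6}\x{fc}";
utf8::upgrade($other_title);

# The internal flag is an implementation detail and may differ ...
my $flag_plain    = utf8::is_utf8($title)       ? 1 : 0;
my $flag_upgraded = utf8::is_utf8($other_title) ? 1 : 0;

# ... while the two values are still the same character string.
my $same_chars = ($title eq $other_title) ? 1 : 0;

# Step 3 of the flow in [3]: encode explicitly on output.
my $bytes_plain    = Encode::encode('UTF-8', $title);
my $bytes_upgraded = Encode::encode('UTF-8', $other_title);
my $same_bytes     = ($bytes_plain eq $bytes_upgraded) ? 1 : 0;

# Step 1 of the flow in [3]: decode explicitly on input.
my $decoded = Encode::decode('UTF-8', $bytes_plain);

printf "is_utf8 plain/upgraded: %d/%d\n", $flag_plain, $flag_upgraded;
printf "same characters: %d, same bytes: %d\n", $same_chars, $same_bytes;
printf "encoded length in bytes: %d\n", length $bytes_plain;
printf "round trip ok: %d\n", ($decoded eq $title) ? 1 : 0;
```

Whatever the flag says, the encoded output is the same byte sequence, which is why the explicit encode step matters and utf8::upgrade() does not.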

My question remains: is deflate/inflate a safe place to do encoding, or will it suffer the same flaws as DBIx::Class::UTF8Columns?

Regards
Matias E. Fernandez

[1] http://perldoc.perl.org/5.12.0/perlunifaq.html#I-lost-track%3b-what-encoding-is-the-internal-format-really%3f
[2] http://search.cpan.org/~frew/DBIx-Class-0.08121/lib/DBIx/Class/UTF8Columns.pm#Warning_-_Module_does_not_function_properly_on_create/insert
[3] http://perldoc.perl.org/5.12.0/perlunitut.html#I%2fO-flow-(the-actual-5-minute-tutorial)

