[Dbix-class] DBIC, MySQL and UTF8

Dave Cross dave at dave.org.uk
Sun Jul 3 11:02:12 GMT 2011


I'm sure I had this working before but it's all gone very strange now.

I have a MySQL db where all the tables contain UTF8 data. Actually, as 
far as I can see, the only problematic character is the £ sign.

 From my understanding of MySQL, I think this select proves that the 
data is UTF8:

mysql> select ord(substring(description, 48, 1)) from invoice_line where 
invoice = 347;
+------------------------------------+
| ord(substring(description, 48, 1)) |
+------------------------------------+
|                              49827 |
|                              49827 |
|                              49827 |
|                              49827 |
|                              49827 |
+------------------------------------+
5 rows in set (0.03 sec)

49827 is (194 * 256) + 163 - hence 0xC2 0xA3[1].

My DBIC connect looks like this:

my $sch = MyData::DB->connect(
   'dbi:mysql:database=mydata;',
   'user', 'xxxx', { mysql_enable_utf8 => 1 }
);

But when I look at the data in the description attribute within the 
Perl, the £ sign is just 0xA3 - which is the Latin1 encoding. The 0xC2 
has been dropped.

   DB<2> x map { ord $_ } (split //, $_->description)[45 .. 50]
0  64
1  32
2  163
3  52
4  48
5  48

(Those characters in the string are '@ £400')

If I run:

$_->description(encode('utf8', decode('latin1', $_->description)));

Then I get:

   DB<4> x map { ord $_ } (split //, $_->description)[45 .. 51]
0  64
1  32
2  194
3  163
4  52
5  48
6  48

Which is what I was expecting in the first place.

I thought that { mysql_enable_utf8 => 1 } was supposed to be the 
solution to all these problems. But I'm obviously missing something.

This is one of those occasions where I'm hoping that I'm being really 
stupid and there's an obvious problem staring me in the face.

Can you see it? :-/

Cheers,

Dave...

[1] See http://www.fileformat.info/info/unicode/char/a3/index.htm



More information about the DBIx-Class mailing list