[Catalyst] utf8 / pg double encoding problem

Ash Berlin ash_cpan at firemirror.com
Sat Jan 5 23:41:16 GMT 2008


On Jan 5, 2008, at 11:28 PM, Andrew Rodland wrote:

> On Saturday 05 January 2008 04:54:59 pm Daniel McBrearty wrote:
>> well I'm damned, I thought I had this stuff working squeaky clean. But
>> I was wrong. I actually had two bugs cancelling each other out -
>> usually.
>> [snip]
>> --' [debug] abçöeü
>> [debug] $VAR1 = "ab\x{c3}\x{a7}\x{c3}\x{b6}e\x{c3}\x{bc}";
>> [debug] it's UTF8!
>>
> Looks like the problem is here... the utf8 flag is on, indicating that $edit
> is a string of characters, rather than bytes -- but the dumper output seems
> to show that these "characters" correspond to UTF-8 encoded bytes, instead of
> the actual characters of the data -- meaning that the bytes actually stored
> in the string are along the lines of "ab\x{c3}\x{83}\x{c2}\x{a7}"... not
> good. Somewhere, your data got the utf8 flag set "by assumption" instead of
> by decoding. $edit = decode("UTF-8", $edit) should clear it up, although
> finding the original problem is probably a better idea. :)
>
> Andrew
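
For anyone following along, here is a minimal sketch of the state described in the quoted text and of the suggested decode() fix. The utf8::upgrade() step is only an assumption about how a string could end up looking like the dumper output; the thread does not establish the real cause. The variable names $bytes, $double and $fixed are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw UTF-8 octets for "abçöeü", as they might arrive from a request or DB.
my $bytes = "ab\xc3\xa7\xc3\xb6e\xc3\xbc";

# Assumed cause, for illustration only: the octets get upgraded as if each
# one were a Latin-1 character. The utf8 flag is now on, but nothing was
# actually decoded.
my $edit = $bytes;
utf8::upgrade($edit);

printf "flag on: %s, length: %d\n",
    (utf8::is_utf8($edit) ? "yes" : "no"), length($edit);

# Encoding this string for output yields the double-encoded octets
# (c3 83 c2 a7 ... instead of c3 a7 ...).
my $double = encode("UTF-8", $edit);

# The fix suggested in the quoted text: decode the octets into characters.
my $fixed = decode("UTF-8", $edit);

printf "fixed length: %d\n", length($fixed);
```

Note that decode() only works here because every "character" in $edit is below 0x100; it treats them as octets again. It papers over the symptom and does not tell you where the data went wrong.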

ISTR that last time I looked at C::P::Unicode, it did things in a
manner that I didn't like. I can't remember if this is because I
thought it was wrong or if it just didn't work right for me, but maybe
some more eyes on C::P::Unicode might be a good idea.

-ash

