[Catalyst] TT and UNICODE: Garbled special characters
Matt Lawrence
matt.lawrence at ymogen.net
Fri Sep 7 16:33:21 GMT 2007
Matt Lawrence wrote:
> Stefan Kühn wrote:
>
>> GERMAN UMLAUT HERE: ___\xFC\xFC\xFC___
>>
>>
> AFAIK, single-byte-width \xxx escapes are always treated as bytes, not
> as characters. Even if they are outside the 7-bit range, and even in the
> presence of the utf8 pragma.
>
> Try inserting real Unicode characters into the string, explicitly
> upgrading the string using utf8::upgrade or utf8 or use encoding 'latin1'.
>
Oops, that last paragraph wasn't very clear, and utf8::upgrade was not a
good suggestion. I'll try again:
# Option 1
use utf8; # recognise unicode characters in program text
my $name = "Stefan Kühn"; # use a real UTF-8 character here!
# Option 2
use Encode qw( decode );
my $name = decode("latin-1", "Stefan K\xfchn");
# Option 3 (note: the encoding pragma is deprecated as of Perl 5.18)
use encoding 'latin1';
my $name = "Stefan K\xfchn";
Once you have a Unicode string that's internally marked as such,
C::P::Unicode (Catalyst::Plugin::Unicode) should do the right thing with it.
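If you want to check whether a string really is "internally marked" as
Unicode, something like the following rough sketch should show the
difference between the raw byte string and the decoded one. The variable
names $bytes and $chars are just for illustration, and I'm assuming the
input bytes are Latin-1:

```perl
use strict;
use warnings;
use Encode qw( decode );

# Raw byte string: \xfc is a single byte here, not a decoded character.
my $bytes = "Stefan K\xfchn";

# Decoding from Latin-1 (assumed input encoding) gives a character string
# that perl flags internally as UTF-8.
my $chars = decode("iso-8859-1", $bytes);

# utf8::is_utf8 reports the internal flag; it is a debugging aid only.
printf "bytes flagged: %s\n", utf8::is_utf8($bytes) ? "yes" : "no";
printf "chars flagged: %s\n", utf8::is_utf8($chars) ? "yes" : "no";
printf "length in characters: %d\n", length $chars;
```

The flag check is only meant for debugging; normal code shouldn't need to
branch on it.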
Matt