[Catalyst] Catalyst Unicode woes ...

Jonathan T. Rockway jon at jrock.us
Thu Aug 9 10:34:31 GMT 2007


On Thu, Aug 09, 2007 at 10:27:27AM +0200, Tobias Kremer wrote:

> month names which are displayed fine then. It looks like everything gets
> encoded twice when utilizing these plugins.

OK, so I changed my mind and I'll be a bit nicer.  :)

This is exactly the problem.  Currently, your "unicode" data is
sitting in memory as a bunch of octets.  If you read in octets and
then spit those out, things will appear to work.

The problem is that the Locale data you have has been properly decoded
into Perl characters.  When you concatenate those characters with your
octets, Perl treats the octet data as latin-1 and upgrades it to
utf8.  Since the data is really utf8 and not latin-1, you get your
double-encoded junk.  This is why you need to decode() your data
before you use it inside Perl.
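
Here is a minimal sketch of that concatenation problem and the decode()
fix.  The sample strings and the hex_of() helper are made up for
illustration; I'm assuming the input octets are UTF-8 and that the
Locale data looks something like a Czech month name.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: show a byte string as space-separated hex.
sub hex_of {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

# Assumption: undecoded UTF-8 input, the two octets of "ü".
my $octets = "\xC3\xBC";

# Assumption: stands in for Locale data, already Perl characters
# ("září", which contains U+0159).
my $chars = "z\x{E1}\x{159}\x{ED}";

# Mixing octets with characters: each octet is taken as a latin-1
# character, so encoding the result encodes the "ü" a second time.
my $bad = encode('UTF-8', $octets . ' ' . $chars);

# decode() first, so both halves are characters before they meet.
my $good = encode('UTF-8', decode('UTF-8', $octets) . ' ' . $chars);

print hex_of($bad),  "\n";   # C3 83 C2 BC 20 7A C3 A1 C5 99 C3 AD
print hex_of($good), "\n";   # C3 BC 20 7A C3 A1 C5 99 C3 AD
```

The month name comes out the same both times; only the undecoded
input gets mangled.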

Try this:

  $ recode latin-1..utf8
  <type in some utf8>

You'll notice the familiar double-encoded junk.  It turns out that
this is exactly what Perl is doing, because that's what you're telling
it to do.
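
The same experiment can be sketched in Perl itself.  This is only an
approximation of what recode does (read the input as latin-1, write it
out as utf8); latin1_to_utf8() is a name I made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: interpret octets as latin-1, then emit UTF-8.
sub latin1_to_utf8 {
    my ($octets) = @_;
    return encode('UTF-8', decode('ISO-8859-1', $octets));
}

# Input that is already UTF-8: the two octets of "é".
my $utf8_input = "\xC3\xA9";

my $result = latin1_to_utf8($utf8_input);

# Each input octet became its own two-byte sequence.
print unpack('H*', $result), "\n";   # c383c2a9
```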

> So I must admit I'm stuck with this. What is the best-practice for dealing with

The best practice is to tell your program what you want it to do,
rather than just type stuff and hope it works :)
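
One way to "tell your program", sketched under some assumptions:
decode at the point where data enters, work with characters inside,
and encode at the point where data leaves.  The example uses in-memory
filehandles and a made-up input string so it runs on its own; in a
real application the layers would go on your actual files or sockets.

```perl
use strict;
use warnings;

# Hypothetical input: UTF-8 octets as they might arrive from outside.
my $raw = "M\xC3\xA4rz\n";

# Decode on the way in: the :encoding layer turns octets into characters.
open my $in, '<:encoding(UTF-8)', \$raw or die "open for read: $!";
my $line = <$in>;
close $in;
chomp $line;

# Inside the program we now have characters: 4 of them, not 5 octets.
my $len = length $line;

# Encode on the way out: characters become UTF-8 octets again.
my $buf = '';
open my $out, '>:encoding(UTF-8)', \$buf or die "open for write: $!";
print {$out} "$line\n";
close $out;

print "characters: $len\n";
print "round trip ok\n" if $buf eq $raw;
```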

Regards,
Jonathan Rockway


