[Catalyst] Catalyst Unicode woes ...

Tatsuhiko Miyagawa miyagawa at gmail.com
Thu Aug 9 10:40:40 GMT 2007


On 8/9/07, Tobias Kremer <list at funkreich.de> wrote:
> I have the problem that up until now everything worked absolutely fine without
> C::P::Unicode, Template::Stash::ForceUTF8, Template::Provider::Encoding or any
> other unicode plugin because I believed that if everything is utf8 you don't
> really have to worry about it that much.

No, you do need to worry about it. DateTime::Locale month_names are
utf8-flagged (meaning Perl knows the string has been properly decoded
to Unicode characters), but Catalyst request parameters come from the
HTTP request as raw bytes. Catalyst doesn't know which encoding those
bytes are in, so they stay undecoded unless you use a module like
C::P::Unicode.
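
As a rough illustration of that decoding step, here is a minimal sketch
using only the core Encode module. The %raw_params hash is a
hypothetical stand-in for request parameters, and treating the bytes as
UTF-8 is an assumption made for the example (it is exactly the thing
Catalyst cannot know on its own). This is not how C::P::Unicode is
implemented.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for raw request parameters: UTF-8 encoded bytes.
my %raw_params = ( month => "M\xc3\xa4rz" );    # "März" as 5 bytes

# Decode each value once, at the boundary, assuming the client sent UTF-8.
my %params = map { $_ => decode('UTF-8', $raw_params{$_}) } keys %raw_params;

# The raw value is a byte string; the decoded value is a character string.
printf "raw:     %d bytes, utf8 flag %s\n",
    length($raw_params{month}),
    utf8::is_utf8($raw_params{month}) ? 'on' : 'off';
printf "decoded: %d chars, utf8 flag %s\n",
    length($params{month}),
    utf8::is_utf8($params{month}) ? 'on' : 'off';
```

Once every input is decoded like this, it can be safely combined with
other utf8-flagged strings and encoded once on output.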

Similarly, even if your templates are encoded in utf-8,
Template-Toolkit doesn't know which encoding they are in until you
add a BOM to your templates or use Template::Provider::Encoding to
explicitly specify the encoding used to decode the template.

Concatenating utf-8 flagged variables with a utf-8 encoded byte string
causes an automatic SV upgrade: Perl treats each byte as a Latin-1
character, so encoding the result for output gives you a double utf-8
encoded string.
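
The effect can be reproduced with the core Encode module alone. In
this sketch the variable names and the sample string are made up for
illustration: $chars plays the role of a utf8-flagged value such as a
month name, and $bytes plays the role of an undecoded request
parameter.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $bytes = "M\xc3\xa4rz";              # UTF-8 encoded byte string ("März")
my $chars = decode('UTF-8', $bytes);    # utf8-flagged character string

# Mixing the two: Perl upgrades $bytes by treating each byte as Latin-1.
my $mixed = $chars . '/' . $bytes;

# Decoding the bytes first keeps everything in characters.
my $clean = $chars . '/' . decode('UTF-8', $bytes);

# Encode once for output and show the resulting bytes as hex.
my $mixed_hex = unpack('H*', encode('UTF-8', $mixed));
my $clean_hex = unpack('H*', encode('UTF-8', $clean));
print "mixed: $mixed_hex\n";    # second copy contains c383c2a4 (double encoded)
print "clean: $clean_hex\n";    # both copies contain c3a4
```

In the mixed case the "ä" of the second copy comes out as four bytes
instead of two, which is the double encoding described above.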

You might want to look at the manpages of encoding::warnings and perlunitut.

I have a couple of hacks to work around that, like
Template::Stash::ForceUTF8 that you mentioned. Encode::DoubleEncodedUTF8
is probably the most evil one: it "fixes" double-encoded utf-8 strings
back to what you meant. It's too evil to use in production, but it
would still be useful for catching bugs like that in testing.
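
To show what "fixing" a double-encoded string involves, here is a naive
sketch with core Encode only. The undo_double_utf8 helper is
hypothetical and is not Encode::DoubleEncodedUTF8's interface or
implementation; it assumes the whole string went through exactly one
extra round of encoding, which is why it belongs in tests rather than
production code.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: undo one extra round of UTF-8 encoding.
# Only valid when the WHOLE string was double encoded.
sub undo_double_utf8 {
    my ($bytes) = @_;
    my $chars = decode('UTF-8', $bytes);         # remove the outer layer
    my $inner = encode('iso-8859-1', $chars);    # chars 0-255 back to bytes
    return decode('UTF-8', $inner);              # decode the real text
}

my $double = "M\xc3\x83\xc2\xa4rz";    # double-encoded "März"
my $fixed  = undo_double_utf8($double);

printf "%d chars, re-encoded as %s\n",
    length($fixed), unpack('H*', encode('UTF-8', $fixed));
```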

HTH,

-- 
Tatsuhiko Miyagawa


