[Catalyst] Re: Fix for content-length issue introduced with Catalyst 5.8.x

Aristotle Pagaltzis pagaltzis at gmx.de
Thu Feb 23 06:11:24 GMT 2012

* Dominic Germain <mailinglists at sogetel.com> [2012-02-16 04:00]:
> All our apps code and our DBs are in ISO-8859-1 encoding.  View::TT is
> configured to output stuff as UTF-8 and everything is working fine
> until the update.  It means that there is some re-encoding occurring
> somewhere in Catalyst View processing.
> The problem is quite simple:  Catalyst is unable to figure out the
> right content-length as soon as we have characters that require two
> bytes in UTF-8.  French accented characters like "é", "ê", "è", "à",
> etc. are good examples.  Previously, "bytes::length" was used and it
> worked fine but the code was changed to just "length".
> Because of that, if I have 100 accented characters in the body, the
> last 100 characters will be chopped by all browsers that are taking
> care of content-length (Chrome and Safari for example).  It seems that
> FF doesn't care about content-length, it displays everything.  Don't
> know about IE.
> reverting back to the old way does the trick...

Your code is broken and Catalyst used to be broken in such a way that it
hid your own breakage. Any occurrence of the bytes pragma or one of its
functions is *always* a bug: the bytes pragma violates encapsulation and
is broken as designed. (Except when it’s not, but understanding when it
isn’t comes when you understand why it is. It’s certainly never the best
way to do what you want to do.)

Your mistake is passing a character string to Catalyst as the response
body. Old versions of Catalyst would then break the encapsulation of
Perl’s internal character representation, which conveniently happens to
be UTF-8 in some cases (but can be Latin-1 in others), effectively giving
you an implicit encoding of characters to UTF-8; if your output was meant
to be UTF-8 also, then the whole misery would add up to something that
happened to work.

Instead of this implicit encoding from characters to UTF-8 that falls
out of the implementation details of an abstraction you cracked open,
what you want to do is simply do the encoding explicitly, manually.
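
To make the difference concrete, here is a minimal sketch using only
core Perl (the variable names are mine, purely for illustration). It
takes a body of 100 "é" characters, as in your example, and compares
what `length`, `bytes::length` and an explicit encode each report:

```perl
use strict;
use warnings;
use bytes ();              # loads bytes::length WITHOUT enabling the pragma
use Encode qw(encode);

# A character string: 100 times "é" (U+00E9).
my $body = "\x{e9}" x 100;

# Two copies of the same string, forced into Perl's two internal storage
# formats. As far as Perl-level string semantics go, they are identical.
my $downgraded = $body; utf8::downgrade($downgraded);   # stored as Latin-1
my $upgraded   = $body; utf8::upgrade($upgraded);       # stored as UTF-8

my $same_string = $downgraded eq $upgraded ? 1 : 0;     # 1: equal strings
my $char_count  = length $body;                         # 100 characters

# bytes::length peeks at the internal storage, so its answer depends on
# an implementation detail rather than on the string itself.
my $peek_latin1 = bytes::length($downgraded);           # 100
my $peek_utf8   = bytes::length($upgraded);             # 200

# The explicit route: encode to UTF-8 first, then measure the result.
my $wire_bytes     = encode('UTF-8', $body);
my $content_length = length $wire_bytes;                # 200

printf "same string: %d, characters: %d\n", $same_string, $char_count;
printf "bytes::length: %d (downgraded) vs %d (upgraded)\n",
    $peek_latin1, $peek_utf8;
printf "length of encoded body: %d\n", $content_length;
```

The two `bytes::length` answers differ for what is, semantically, one
and the same string. Only the length of the explicitly encoded body is
a number you can safely put in a Content-Length header.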

The proper fix is Catalyst::Plugin::Unicode::Encoding, as already
mentioned, but that is probably a big job for your codebase because you
will have to fix all other instances of character/byte confusion that
you have (and you have them, per your description above) in order to
make it work.

The surgical change to make your code work with Cat 5.9 with no deeper
changes and without hacking Catalyst itself is to encode the body in
your root controller’s `end` action (*after* forwarding to View::TT).
This is not a workaround – it is one of the things that the plugin will
do for you, and is the minimum you need to get your code going again.
But it is only one of the things the plugin does, and you should fix all
of your issues.
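
A rough sketch of what that could look like follows. The helper name
`encode_body_as_utf8` is hypothetical, and the wiring shown in the
comment assumes your root controller uses the RenderView action class;
adapt it to however your `end` currently forwards to the view. It also
assumes every string body you produce is a character string, so a body
that is already bytes (an image, say) would need to be skipped.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: replace a character-string response body with its
# UTF-8 encoded bytes. $c is anything that offers ->response->body as a
# getter/setter, which is what the Catalyst context provides.
sub encode_body_as_utf8 {
    my ($c) = @_;

    my $body = $c->response->body;

    # Leave empty bodies and non-string bodies (filehandles, objects) alone.
    return if !defined $body || ref $body;

    $c->response->body( encode('UTF-8', $body) );
    return;
}

# Assumed wiring in the root controller (not runnable outside Catalyst,
# hence shown as a comment):
#
#   sub render : ActionClass('RenderView') {}
#
#   sub end : Private {
#       my ($self, $c) = @_;
#       $c->forward('render');      # View::TT fills in the body here
#       encode_body_as_utf8($c);    # only afterwards do we encode it
#   }
```

With the body stored as bytes, plain `length` gives the right
Content-Length. Make sure the charset in your Content-Type header says
UTF-8 as well.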

Aristotle Pagaltzis // <http://plasmasturm.org/>
