[Catalyst] Re: Avoiding UTF8 in Catalyst

Aristotle Pagaltzis pagaltzis at gmx.de
Sun Nov 22 21:21:20 GMT 2009


Hi Marc,

* Marc SCHAEFER <schaefer at alphanet.ch> [2009-11-22 15:05]:
> On Sun, Nov 22, 2009 at 02:10:29PM +0100, Aristotle Pagaltzis wrote:
> > As a quick fix, you want to utf8::downgrade the $c->res->body
> > at the last moment before emitting the data to the wire.
>
> Interestingly, the data arrives on the other side as a stream
> of bytes, which are iso-8859-1. So it means that Perl knows or
> is taught to represent this internal representation correctly
> when print'ing. But not while counting (data is correct
> Content-Length: is too high).
>
> Very funny. I will wait a bit if there are any other comments
> on this issue, and maybe try to see what is really happening
> just before going on the wire, because as I can see, there must
> be something wrong there.

Ah, d’oh, I see. You wrote that the count is off, but didn’t say
whether it’s too big or too small, and I assumed the wrong way
around, even though it’s now obvious to me as well that this
doesn’t make any sense given your problem description.

So I went trawling through the Catalyst sources and found what appears
to be the offending line. From finalize_headers in Catalyst.pm:

    # everything should be bytes at this point, but just in case
    $response->content_length( bytes::length( $response->body ) );

I was shocked to discover this! Any code that uses bytes::length
is automatically broken.

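It looks like your response body is either upgraded at some point
during the request or starts life as a multibyte string. In that
case bytes::length counts the bytes of Perl’s internal UTF-8
representation rather than the bytes that end up on the wire, so
every character above 0x7F in your ISO-8859-1 data is counted
twice and the Content-Length comes out too high.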
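
Here is a minimal sketch of the effect outside Catalyst, in case it
helps to see it in isolation. The sample string is made up; the
only functions involved are the core utf8::upgrade, utf8::downgrade
and bytes::length.

```perl
use strict;
use warnings;
use bytes ();   # loads bytes::length without enabling the pragma

# Made-up sample: four characters, the last one is e-acute (0xE9
# in ISO-8859-1). As written, Perl stores it one byte per character.
my $body = "caf\xE9";

# Switch the internal representation to UTF-8. The string's value
# (its characters) does not change, only how Perl stores it.
utf8::upgrade($body);

my $chars    = length $body;           # characters in the string
my $internal = bytes::length($body);   # bytes in the internal buffer

printf "length: %d, bytes::length: %d\n", $chars, $internal;
# prints "length: 4, bytes::length: 5"

# Downgrading restores the one-byte-per-character representation.
utf8::downgrade($body);
my $after = bytes::length($body);

printf "after downgrade: %d\n", $after;
# prints "after downgrade: 4"
```

The 0xE9 character takes two bytes in UTF-8, which is where the
extra byte in the count comes from.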

To work around it for now, in your case, it should suffice to put
a `before` modifier on `finalize_headers` to utf8::downgrade the
response body.
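
Roughly like this, as an untested sketch. It assumes a Moose-based
application class (Catalyst 5.8), and `body_as_bytes` is a
hypothetical helper name of mine, not something Catalyst provides.
The modifier itself is shown as a comment so that the helper can be
tried on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a downgraded copy of a string body,
# or undef if there is nothing to do (undefined body, filehandle or
# other reference) or if the string contains characters above 0xFF
# and therefore cannot be represented one byte per character.
sub body_as_bytes {
    my ($body) = @_;    # a copy, so the caller's string is untouched
    return unless defined $body && !ref $body;

    # The true second argument (FAIL_OK) makes utf8::downgrade
    # return false on failure instead of dying.
    return utf8::downgrade($body, 1) ? $body : undef;
}

# In the application class (assumed to be Moose-based):
#
#   before finalize_headers => sub {
#       my $c     = shift;
#       my $bytes = body_as_bytes( $c->res->body );
#       $c->res->body($bytes) if defined $bytes;
#   };
```

If the body can contain characters above 0xFF, downgrading will
not work and the body is left alone by this sketch.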

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>


