[Catalyst] Patch for Catalyst::Plugin::Unicode::Encoding

Bill Moseley moseley at hank.org
Wed Mar 19 16:56:00 GMT 2008


On Wed, Mar 19, 2008 at 07:32:27AM -0700, Bill Moseley wrote:
> 
>     unless ( Encode::is_utf8( $c->response->body ) ) {
>         return $c->NEXT::finalize;
>     }

BTW Jonathan, I realize we are talking past each other on this a bit.

This isn't the issue I originally posted about (decoding), but you are
correct that the above is_utf8 test breaks for any string that has
latin-1 characters.

My database is utf8, as are my templates.  Any other string is
ascii, so I have no latin-1 and this doesn't affect me.  My code
simply always encodes without testing is_utf8.

In general, if you might have high-bit latin-1 strings, then finding
that is_utf8 is false won't tell you whether the string has already
been encoded or is a latin-1 string (and thus still needs encoding).
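
A minimal sketch of that ambiguity, assuming only the core Encode
module. The sample strings are made up for illustration and are not
from the plugin:

```perl
use strict;
use warnings;
use Encode ();

# Illustrative data (an assumption, not from the plugin): "cafe" with an
# e-acute as an unencoded latin-1 character string, and the same text
# already encoded to UTF-8 octets.
my $latin1  = "caf\xe9";
my $encoded = Encode::encode( 'UTF-8', $latin1 );

# Both strings have the UTF8 flag off, so the flag alone cannot tell
# "already encoded" apart from "latin-1, still needs encoding".
for my $pair ( [ 'latin-1 text', $latin1 ], [ 'UTF-8 octets', $encoded ] ) {
    my ( $label, $string ) = @$pair;
    printf "%-13s is_utf8=%d length=%d\n",
        $label, ( Encode::is_utf8($string) ? 1 : 0 ), length $string;
}

# Encoding the already-encoded octets a second time double-encodes them.
my $double = Encode::encode( 'UTF-8', $encoded );
printf "double-encoded length=%d\n", length $double;
```

Both strings report is_utf8 as false, yet only the first one should be
passed to encode. Running the second one through encode again grows it
from 5 to 7 octets.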

So, you could argue that we need a different flag for the plugin to
work in the general case to allow users to pre-encode the body
if they have a need to use a different encoding.

    unless ( $c->stash->{already_encoded_body} ) {
        $c->response->body(
            $c->encoding->encode( $c->response->body, $CHECK ),
        );
    }

That's trying to solve the problem that the body might already be
encoded. That the encoding can happen in more than one place is the
real problem.

A better option would be to allow a per-request forced encoding, or
fall back to what the request asked for.

    # Force encoding if specified, otherwise negotiate
    my $encoding = $c->stash->{encoding_charset}
        || $c->get_accept_charset;

    $c->response->body(
        encode( $encoding, $c->response->body, $CHECK )
    );

And then decide whether you need an eval or not, depending on what
you choose for $CHECK.
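
A rough, standalone sketch of both ideas, using only the core Encode
module. pick_charset and encode_body are hypothetical helpers written
for this example (they are not part of Catalyst or the plugin), the
Accept-Charset parsing ignores q-values, and the UTF-8 default is an
assumption:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: use the forced charset if one is given, otherwise the
# first charset named in the Accept-Charset header that Encode recognises,
# otherwise UTF-8. q-values are ignored to keep the sketch short.
sub pick_charset {
    my ( $forced, $accept_charset ) = @_;
    return $forced if defined $forced && length $forced;

    $accept_charset = '' unless defined $accept_charset;
    for my $item ( split /\s*,\s*/, $accept_charset ) {
        my ($name) = $item =~ /^\s*([^;\s]+)/ or next;
        next if $name eq '*';
        return $name if Encode::find_encoding($name);
    }
    return 'UTF-8';
}

# Hypothetical helper: encode strictly. FB_CROAK makes encode die on a
# character the charset cannot represent, so the call is wrapped in eval.
# LEAVE_SRC keeps encode from modifying the caller's string.
sub encode_body {
    my ( $charset, $body ) = @_;
    my $octets = eval {
        Encode::encode( $charset, $body,
            Encode::FB_CROAK | Encode::LEAVE_SRC );
    };
    return $octets;    # undef if the body could not be encoded
}

my $body    = "smile \x{263a}";
my $charset = pick_charset( undef, 'iso-8859-1, utf-8;q=0.8' );
my $octets  = encode_body( $charset, $body );

# Fall back to UTF-8 when the negotiated charset cannot carry the body.
$octets = encode_body( 'UTF-8', $body ) unless defined $octets;
printf "negotiated %s, sent %d octets\n", $charset, length $octets;
```

If $CHECK is left at its default, encode substitutes a replacement
character instead of dying, and the eval is not needed.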

-- 
Bill Moseley
moseley at hank.org



