[Catalyst] CSV / UTF-8 / Unicode

Bill Moseley moseley at hank.org
Tue Jul 2 15:58:57 GMT 2013


On Tue, Jul 2, 2013 at 2:59 AM, Craig Chant
<craig at homeloanpartnership.com> wrote:

>
>         # output header
>         $c->response->content_type('application/vnd.ms-excel');
>         $c->response->content_length(length($xls));
>         $c->response->header(Content_Disposition =>
> 'attachment;filename=NBCS_Export.csv');
>
>         # create an IO::File for Catalyst
>         use IO::File;
>         my $iof = IO::File->new;
>
>         $iof->open(\$xls, "r");
>         $iof->binmode(":encoding(UTF-8)");
>
>         # output XLS data
>         $c->response->body($iof);
>

All of the above seems like overkill.  I suspect what you want is closer to
this (but see the notes below):

        $c->response->content_type('text/csv');
        $c->response->body($xls);
        $c->response->header(
            Content_Disposition => 'attachment;filename=NBCS_Export.csv');

Then, with that content type, the Unicode::Encoding plugin would encode $xls
as UTF-8 and add ;charset=utf8 (or whatever it is configured to encode as).

Notes:

First, you are returning CSV, not Excel, so the content type should not be
the application/vnd.ms-excel you first listed above, right?

Second, be aware that $c->response->content_length(length($xls)); could be
very wrong.  If $xls is really CSV text AND it's decoded then length($xls)
is the length in characters, not octets.   Don't set the content length.
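
To make the characters-versus-octets point concrete, here is a small
standalone sketch.  The CSV payload is made up for illustration; only the
variable name $xls comes from the code above.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical CSV payload held as decoded (character) data.
# \x{eb} is e-diaeresis and \x{20ac} is the euro sign.
my $xls = "name,price\nZo\x{eb},\x{20ac}10\n";

# What actually goes over the wire once the body is encoded as UTF-8.
my $octets = encode('UTF-8', $xls);

my $char_count  = length($xls);       # counts characters
my $octet_count = length($octets);    # counts bytes

print "characters: $char_count, octets: $octet_count\n";
```

Here the two counts differ (19 characters, 22 octets), so a Content-Length
taken from the decoded string would be three bytes short.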


Third, Catalyst::Plugin::Unicode::Encoding, IMO, has some issues.

The plugin only encodes responses with these content types:

    return $c->next::method(@_)
      unless $c->response->content_type =~ /^text|xml$|javascript$/;

Then it does this:

    $c->response->body( $c->encoding->encode( $body, $CHECK ) )
        if ref(\$body) eq 'SCALAR';
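
To see what that content-type check lets through, the regex can be tried on
its own.  This is only a sketch: plugin_would_encode is a made-up helper name,
and the pattern is copied from the snippet quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern quoted from the plugin.
sub plugin_would_encode {
    my ($content_type) = @_;
    return $content_type =~ /^text|xml$|javascript$/ ? 1 : 0;
}

my @types = qw(
    text/csv
    application/xml
    application/javascript
    application/vnd.ms-excel
    application/json
);

for my $type (@types) {
    printf "%-26s %s\n", $type,
        plugin_would_encode($type) ? 'encoded' : 'left alone';
}
```

With that pattern text/csv is encoded, while application/vnd.ms-excel is
passed through untouched.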

Personally, I think the correct approach is to only encode *character* data
-- that is, check whether the utf8 flag is set before calling encode.

Maybe limit to the content types listed above, but throw an exception for
other content types where the body is a scalar AND has the utf8 flag on.
After all, we can only write out octets, or else we get the "Wide character"
error.
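
A rough sketch of that idea, outside of Catalyst.  encode_body is a
hypothetical helper, not anything the plugin provides, and UTF-8 is assumed
as the configured encoding.  Keep in mind the utf8 flag is internal state: a
character string containing only Latin-1 code points may not have it set.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of Catalyst or the plugin.
sub encode_body {
    my ($content_type, $body) = @_;

    # Undef, filehandles and plain octets pass through untouched.
    return $body if !defined $body || ref $body;
    return $body unless utf8::is_utf8($body);

    # Character data: encode it for the text-like content types ...
    if ($content_type =~ /^text|xml$|javascript$/) {
        return Encode::encode('UTF-8', $body,
            Encode::FB_CROAK | Encode::LEAVE_SRC);
    }

    # ... and refuse it elsewhere instead of writing wide characters.
    die "Body for '$content_type' has the utf8 flag set; expected octets\n";
}
```

So encode_body('text/csv', $xls) would return octets, while flagged
character data sent as, say, image/png would die instead of going out
half-broken.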






-- 
Bill Moseley
moseley at hank.org