[Catalyst] CSV / UTF-8 / Unicode

Craig Chant craig at homeloanpartnership.com
Tue Jul 2 16:29:03 GMT 2013


>> All the above seems overkill. I suspect what you want is closer to this (but see notes below):

Tried that, and it didn't work. It ended up in a long Catalyst discussion, where it was worked out that I needed to wrap any XLS output in an IO::File handle; otherwise Catalyst dies with an "out of memory" error, something to do with streaming data support issues in Catalyst. So the workaround is to wrap the output in an IO::File object.

>>Second, be aware that $c->response->content_length(length($xls));

Yes, I was doing the encode and then using length (I did read in the perldocs about requesting the length against the octets). Either way, the length was the least of my worries; keeping Catalyst from falling over with 'Wide Character' errors, and not getting garbage, was my main concern.

And yes, the output is CSV, not strictly XLS, but I have been told, and have looked it up on the net, that 'application/vnd.ms-excel' is the correct MIME type to pass for CSV that you want MS Excel to open.

Of course, if I have the wrong MIME type for CSV -> MS Excel, please can you provide the correct one? It took me a long time to find that one, as the bog-standard 'text/csv' does not work properly when opened in MS Excel.

Though as it appears DBI is corrupting my Unicode data, it might be related to that rather than CSV -> MS Excel per se!


From: Bill Moseley [mailto:moseley at hank.org]
Sent: 02 July 2013 16:59
To: The elegant MVC web framework
Subject: Re: [Catalyst] CSV / UTF-8 / Unicode


On Tue, Jul 2, 2013 at 2:59 AM, Craig Chant <craig at homeloanpartnership.com> wrote:

        # output header
        $c->response->content_type('application/vnd.ms-excel');
        $c->response->content_length(length($xls));
        $c->response->header(Content_Disposition => 'attachment;filename=NBCS_Export.csv');

        # create an IO::File for Catalyst
        use IO::File;
        my $iof = IO::File->new;

        $iof->open(\$xls, "r");
        $iof->binmode(":encoding(UTF-8)");

        # output XLS data
        $c->response->body($iof);

All the above seems overkill. I suspect what you want is closer to this (but see notes below):

        $c->response->content_type('text/csv');
        $c->response->body($xls);
        $c->response->header(Content_Disposition => 'attachment;filename=NBCS_Export.csv');

Then with that content type the plugin would encode $xls as utf8 and add ;charset=utf8 (or whatever it is configured to encode as).
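To illustrate why the content type matters here, the sketch below runs the plugin's content-type pattern (the same one quoted further down in this message) against a few types. The list of types is only an example, and the code is a stand-alone approximation of the check, not the plugin itself.

```perl
use strict;
use warnings;

# Same pattern as the plugin's content-type filter quoted in this message.
my $filter = qr/^text|xml$|javascript$/;

# Example content types only; record whether each one passes the filter.
my %encoded;
for my $type ('text/csv', 'application/vnd.ms-excel', 'application/xml') {
    $encoded{$type} = $type =~ $filter ? 1 : 0;
    printf "%-26s %s\n", $type,
        $encoded{$type} ? 'would be encoded' : 'left untouched';
}
```

Under that pattern 'text/csv' is picked up for encoding, while 'application/vnd.ms-excel' is skipped, so a body sent with the latter has to be octets already.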

Notes:

First, you are not returning Excel, so the content type is not what you first listed above, right?

Second, be aware that $c->response->content_length(length($xls)); could be very wrong.  If $xls is really CSV text AND it's decoded then length($xls) is the length in characters, not octets.   Don't set the content length.
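A small sketch of the difference, assuming the CSV is a decoded character string that will be sent as UTF-8. The sample row is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample data: \x{00e9} is e-acute, \x{20ac} is the euro sign.
my $csv = "name,price\ncaf\x{00e9},\x{20ac}5\n";

my $chars  = length($csv);                    # characters in the decoded string
my $octets = length(encode('UTF-8', $csv));   # bytes actually sent on the wire

print "characters: $chars\n";    # characters: 19
print "octets:     $octets\n";   # octets:     22
```

A Content-Length of 19 here would cut the response three bytes short, which is why it is safer to take the length of the encoded octets or not set it at all.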


Third, Catalyst::Plugin::Unicode::Encoding, IMO, has some issues.

The plugin limits encoding to just these content types:

    return $c->next::method(@_)
      unless $c->response->content_type =~ /^text|xml$|javascript$/;

Then it does this:

    $c->response->body( $c->encoding->encode( $body, $CHECK ) )
        if ref(\$body) eq 'SCALAR';

Personally, I think the correct approach is to only encode character data -- that is, check whether the utf8 flag is set before calling encode.

Maybe limit to the content types listed above, but throw an exception for other content types where the body is a scalar AND has the utf8 flag on.  After all, we can only write out octets or else we get the Wide Character error.
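Roughly, the behaviour suggested above could look like the sketch below. body_to_octets is a hypothetical helper, not part of Catalyst or the plugin, and UTF-8 is assumed as the output encoding. Note that the utf8 flag is an internal detail: a string holding only Latin-1 characters can have the flag off, so this check is a heuristic rather than a guarantee.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper illustrating the proposed rule; not the plugin's code.
sub body_to_octets {
    my ($body, $content_type) = @_;

    # Flag off: assume the body is already octets and leave it alone.
    return $body unless utf8::is_utf8($body);

    # Character data with a textual content type gets encoded.
    return encode('UTF-8', $body)
        if $content_type =~ /^text|xml$|javascript$/;

    # Character data under any other content type is treated as an error.
    die "character data in body for content type '$content_type'\n";
}

my $text  = "caf\x{00e9},\x{20ac}5\n";    # decoded characters (flag on)
my $bytes = encode('UTF-8', $text);       # octets (flag off)

my $from_text  = body_to_octets($text,  'text/csv');
my $from_bytes = body_to_octets($bytes, 'application/vnd.ms-excel');
my $ok         = eval { body_to_octets($text, 'application/vnd.ms-excel'); 1 };
print "rejected: $@" unless $ok;
```

The exception turns a silent 'Wide Character' failure at write time into an error that names the content type at the point where the body is set.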






--
Bill Moseley
moseley at hank.org