[Catalyst] Catalyst::Request::Upload->filename is not decoded.

John Napiorkowski jjn1056 at yahoo.com
Fri Dec 19 17:40:37 GMT 2014


Actually, you might need to check out and test the HEAD of the holland branch; there are fixes there that are not on CPAN yet.
It also looks like filename is right, but basename is using a regexp that is not Unicode friendly.  I'll take a look.
jnap 

     On Friday, December 19, 2014 11:15 AM, John Napiorkowski <jjn1056 at yahoo.com> wrote:
   

 Any chance you can test this on the current dev release on CPAN?  There's a ton of utf8 fixes there.  
Catalyst-Runtime-5.90079_003 - The Catalyst Framework Runtime - metacpan.org



If trouble remains, I'd love an issue or, ideally, a test case.  There's a big UTF8 test case over here:

perl-catalyst/catalyst-runtime


  Take a look and let me know if we need more here.  The file upload stuff is a bit confusing to me, so I'm not certain I got it all correct.

     On Wednesday, December 17, 2014 7:22 PM, Bill Moseley <moseley at hank.org> wrote:
   

 All my upload forms have accept-charset="utf-8".    We expect that uploaded filenames could have wide-characters.
The problem I hit was ->basename does this:
    $ perl -le 'use Catalyst::Request::Upload; my $upload = Catalyst::Request::Upload->new( { filename => q[документ обучения.pdf] } ); print $upload->basename;'
    _.pdf
That's pretty mangled.

The problem is that $upload->filename is not decoded so the substitution is working on octets not characters. 

sub _build_basename {
    my $self = shift;
    my $basename = $self->filename;
    $basename =~ s|\\|/|g;
    $basename = ( File::Spec::Unix->splitpath($basename) )[2];
    $basename =~ s|[^\w\.-]+|_|g;
    return $basename;
}

Obviously, we want \w to work on characters, not encoded octets.   Decoding the filename should be done -- it's character data.
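To illustrate the octets-versus-characters point, here is a minimal standalone sketch. It does not load Catalyst at all; the sanitize() helper is a hypothetical name that just repeats the last substitution from _build_basename, and the filename is spelled with \x{} escapes so the script does not depend on the source file's encoding.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: the same substitution _build_basename applies.
sub sanitize {
    my ($name) = @_;
    $name =~ s|[^\w\.-]+|_|g;
    return $name;
}

# The Cyrillic filename from the report, written as code points.
my $chars = "\x{434}\x{43e}\x{43a}\x{443}\x{43c}\x{435}\x{43d}\x{442} "
          . "\x{43e}\x{431}\x{443}\x{447}\x{435}\x{43d}\x{438}\x{44f}.pdf";

# UTF-8 octets, roughly what arrives in the request before any decoding.
my $octets = encode('UTF-8', $chars);

# On octets, every high byte fails \w, so the whole name collapses.
my $from_octets = sanitize($octets);

# On decoded characters, \w matches the Cyrillic letters; only the
# space is replaced.
my $from_chars = sanitize(decode('UTF-8', $octets));

binmode(STDOUT, ':encoding(UTF-8)');
print "$from_octets\n";   # _.pdf
print "$from_chars\n";    # the original name with "_" in place of the space
```

The first result matches the mangled output shown above, which suggests the regexp itself behaves sensibly once it is handed characters rather than bytes.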
Does it make sense to do it in Engine's prepare_uploads?
For example:
            my $u = Catalyst::Request::Upload->new(
               size => $upload->{size},
               type => scalar $headers->content_type,
               headers => $headers,
               tempname => $upload->{tempname},
               filename => $c->_handle_unicode_decoding($upload->{filename}),
            );
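
For anyone who wants to try the idea outside Catalyst, the decoding step might look roughly like the sketch below. decode_upload_filename() is a hypothetical helper, not part of Catalyst, and it is not a description of what _handle_unicode_decoding actually does; it assumes the browser sent UTF-8 and falls back to the raw octets when that assumption fails.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not a Catalyst API): turn UTF-8 octets into
# characters, returning the input unchanged if it is not valid UTF-8.
sub decode_upload_filename {
    my ($octets) = @_;
    return $octets unless defined $octets;

    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
    # decode() from modifying $octets in place.
    my $chars = eval {
        Encode::decode('UTF-8', $octets,
            Encode::FB_CROAK | Encode::LEAVE_SRC);
    };
    return defined $chars ? $chars : $octets;
}

# Valid UTF-8 for U+0434 followed by ".pdf" decodes to five characters.
my $decoded = decode_upload_filename("\xD0\xB4.pdf");

# 0xFF can never appear in UTF-8, so this name comes back untouched.
my $untouched = decode_upload_filename("\xFF.pdf");
```

Falling back to the raw octets is only one possible policy; rejecting the upload or substituting a replacement character would be equally defensible.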

-- 
Bill Moseley
moseley at hank.org
_______________________________________________
List: Catalyst at lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/


    

   