[Catalyst] Re: decoding in core

Zbigniew Lukasiak zzbbyy at gmail.com
Mon Feb 23 14:31:07 GMT 2009


On Mon, Feb 23, 2009 at 2:58 PM, Neo [GC] <neo at gothic-chat.de> wrote:
> Zbigniew Lukasiak wrote:
>>
>> Some more things to consider.
>>
>> - 'use utf8' in the code generated by the helpers?
>>
>
> Reasonable, but only if documented. It took us weeks to learn
> that this changes _nothing_ but the behaviour of several Perl functions such
> as regexps, sort and so on.

Hmm - in my understanding it only changes literals in the code ( $var
= 'ą' ).  So I looked into the pod and it says:

    Bytes in the source text that have their high-bit set will be
    treated as being part of a literal UTF-8 character.  This includes
    most literals such as identifier names, string constants, and
    constant regular expression patterns.
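
To make that concrete, here is a minimal sketch of the effect on a string
literal. It assumes the source file itself is saved as UTF-8; the variable
names are only for illustration:

```perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8, so the literal below is stored
# as the two bytes 0xC4 0x85.

# Without 'use utf8' Perl sees each byte of the literal as one character.
my $bytes = do { no utf8; 'ą' };

# With 'use utf8' the same literal is one character (U+0105).
my $chars = do { use utf8; 'ą' };

printf "without 'use utf8': length %d\n", length $bytes;
printf "with 'use utf8':    length %d\n", length $chars;
```

So the pragma only affects how the source text is read; it does not decode
or encode any data coming in or going out at run time.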

>>
>> - ENCODING: UTF-8 for the TT view helper?
>>
>> Maybe a global config option to choose the byte or character semantics?
>>
>> But with the DB it becomes a bit more complex - because BLOB columns
>> probably need to use byte semantics.
>>
>
> Uhm, of course, as BLOB is binary and CLOB is character. ;) This is even
> more complex, as the databases treat these datatypes differently
> and some of Perl's DBI drivers are somewhat broken when it comes to Unicode
> (according to our perl-saves-our-souls guru).
> UTF-8 is ok in Perl itself (not easy, not coherent, but ok); but in
> combination with many modules (and as far as I have learned, Perl is all
> about reusing modules) it is _hell_. Try to read UTF-8 from an HTTP request,
> store it in the database, select it in the correct order, write it to XLS,
> convert it to CSV, reimport it into the DB and output it to the browser, all
> with different subs in the same controller... and you know what I mean.
> Even our most euphoric Perl gurus don't have any clue how to handle UTF-8
> from beginning to end without hours of trial and error in their
> programs (and remember - we Germans only have those bloody umlauts - try
> to imagine this in China >_<).
>
> Maybe the best thing for all average-and-below users would be a _really_
> good tutorial about Catalyst+UTF-8. What to do, what not to do. How to read
> UTF-8 from HTTP-request / uploaded file / local file / database, how to
> write it to client / downloadable file / local file / database. What
> catalystish variable is UTF-8-encoded when and why. How to determine what
> encoding a given scalar has and how to encode/decode/whatevercode it to a
> bloody nice scalar with shiny UTF-8 chars in it.
> Short: -- Umlauts with Catalyst for dummies --
>
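
As a possible starting point for such a tutorial, here is a minimal sketch of
the usual decode-on-input, encode-on-output pattern using the core Encode
module. The sample bytes and the variable names are assumptions for
illustration only; nothing here is Catalyst-specific:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the raw UTF-8 bytes of "Grüße", as they might
# arrive from an HTTP request, a file or a database driver.
my $raw = "Gr\xC3\xBC\xC3\x9Fe";
my $bytes_in = length $raw;

# Decode once at the input boundary: bytes become Perl characters.
my $text = decode('UTF-8', $raw);

# Inside the program, work only with characters (length, regexps, sort).
my $chars = length $text;

# Encode once at the output boundary: characters become bytes again.
my $out = encode('UTF-8', $text);

printf "bytes in: %d, characters: %d, bytes out: %d\n",
    $bytes_in, $chars, length $out;
```

The point is that each string is decoded exactly once on the way in and
encoded exactly once on the way out; doing either step twice is the usual
source of broken umlauts.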

Hmm - maybe I'll add UTF-8 handling in InstantCRUD.  I am waiting for
good sentences showing off the national characters.


-- 
Zbigniew Lukasiak
http://brudnopis.blogspot.com/
http://perlalchemy.blogspot.com/


