[Catalyst] Re: decoding in core
Neo [GC]
neo at gothic-chat.de
Mon Feb 23 13:58:13 GMT 2009
Zbigniew Lukasiak wrote:
> Some more things to consider.
>
> - 'use utf8' in the code generated by the helpers?
>
Reasonable, but only if documented. It took us weeks to learn that this
changes _nothing_ about input or output: it only tells Perl that the
source file itself is UTF-8, which affects how literals in the code
behave in functions like regexps, sort and so on.
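To illustrate what `use utf8` does and does not cover, here is a minimal
sketch. It assumes the script file is saved as UTF-8 and that the Encode
module is available; the variable names are made up for illustration.

```perl
use strict;
use warnings;
use utf8;                      # declares that THIS source file is UTF-8
use Encode qw(decode);

# A literal written in the source: with `use utf8` it is one character.
my $literal = "ü";

# Bytes as they might arrive from a socket, file or DB driver
# (hypothetical input). `use utf8` does not touch these.
my $bytes = "\xC3\xBC";

# Data from outside has to be decoded explicitly.
my $chars = decode('UTF-8', $bytes);

# prints: literal: 1, raw bytes: 2, decoded: 1
printf "literal: %d, raw bytes: %d, decoded: %d\n",
    length $literal, length $bytes, length $chars;
```

So the pragma helps with strings typed into the code, but anything read
at run time still needs its own decoding step.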
> - ENCODING: UTF-8 for the TT view helper?
>
> Maybe a global config option to choose the byte or character semantics?
>
> But with the DB it becomes a bit more complex - because BLOB columns
> probably need to use byte semantics.
>
Uhm, of course, as BLOB is binary and CLOB is character. ;) It is even
more complex than that, as the databases treat these data types
differently and some of Perl's DBI drivers are somewhat broken when it
comes to Unicode (according to our perl-saves-our-souls guru).
UTF-8 is OK in Perl itself (not easy, not coherent, but OK); but in
combination with many modules (and as far as I have learned, Perl is all
about reusing modules) it is _hell_. Try to read UTF-8 from an HTTP
request, store it in a database, select it in the correct order, write
it to XLS, convert that to CSV, reimport it into the DB and output it to
the browser, all with different subs in the same controller... and you
know what I mean.
Even our most euphoric Perl gurus have no clue how to handle UTF-8 from
beginning to end without hours of trial and error in their programs (and
remember: we Germans only have those bloody umlauts. Try to imagine
this in China >_<).
Maybe the best thing for all average-and-below users would be a _really_
good tutorial about Catalyst+UTF-8. What to do, what not to do. How to
read UTF-8 from an HTTP request / uploaded file / local file / database,
how to write it to the client / a downloadable file / a local file / the
database. Which Catalyst variable is UTF-8-encoded, when, and why. How
to determine what encoding a given scalar has and how to
encode/decode/whatevercode it into a bloody nice scalar with shiny UTF-8
chars in it.
Short: -- Umlauts with Catalyst for dummies --
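As a rough starting point for such a tutorial, the usual advice is to
decode bytes into characters as soon as they enter the program and to
encode them again on the way out. The sketch below assumes the Encode
module; `to_chars` and `to_bytes` are hypothetical helper names, not
Catalyst API, and the sample string merely stands in for request data.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helpers, not part of Catalyst.

# Bytes from the outside world -> Perl characters.
# Dies on malformed UTF-8 instead of silently patching it.
sub to_chars {
    my ($bytes) = @_;
    return decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
}

# Perl characters -> UTF-8 bytes for the client, a file or the database.
sub to_bytes {
    my ($chars) = @_;
    return encode('UTF-8', $chars);
}

# Stand-in for request data: "Grüße" as raw UTF-8 bytes.
my $request_bytes = "Gr\xC3\xBC\xC3\x9Fe";

my $text = to_chars($request_bytes);   # work with this inside the program
my $out  = to_bytes($text);            # send this back out

printf "bytes in: %d, characters: %d, bytes out: %d\n",
    length $request_bytes, length $text, length $out;

# utf8::is_utf8 only reports Perl's internal flag; it does not tell you
# which encoding a byte string is in.
printf "internal flag: text=%d out=%d\n",
    utf8::is_utf8($text) ? 1 : 0, utf8::is_utf8($out) ? 1 : 0;
```

The point of the design is that every sub in between sees characters
only, so length, sorting and regexps work on umlauts rather than on the
individual bytes. As far as I know there is no reliable way to ask a
scalar which encoding its bytes are in; it has to be known at the
boundary where the data comes in.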
(Sorry for sounding so emotional... AFAIK our company burned man-weeks
on solving minor encoding bugs :-/ Every tutorial we found was like "you
can do it this way or that way or another way 'round the house, so it's
perfect, and if you don't understand it, you're a retard and should use
7-bit ASCII"... and lately even a colleague sounds like this, as he has
been enlightened by CPAN literature like "UTF-8 vs. utf8 vs. UTF8" ;)).
Greets and regards,
Tom Weber