[Catalyst] Re: Avoiding UTF8 in Catalyst

Aristotle Pagaltzis pagaltzis at gmx.de
Tue Dec 8 08:26:36 GMT 2009


* Jonathan Rockway <jon at jrock.us> [2009-12-08 06:40]:
> Basically, if you are doing things right, this code will cause
> no harm

Yes it will, in some cases.

> (as the string will be an octet stream

There is no such thing as an octet stream in Perl. There are only
strings, and strings are sequences of arbitrarily large integers.
You can store an octet stream in a string, which will then be
a string that just happens to be a sequence of integers < 256,
but it’s a string like any other, not specifically an octet
sequence, and any string in Perl can have either of the two
internal representations. *Usually* after encoding a string
will be a packed byte array… but that’s an implementation detail.
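
To illustrate, a minimal sketch (the sample string and variable names
are made up for this mail): the same four characters, once in each
internal form, are still the same string as far as Perl is concerned.

```perl
use strict;
use warnings;

# Four characters, all with code points below 256.
my $packed   = "caf\xE9";
my $upgraded = $packed;

# Changes only the internal representation, never the contents.
utf8::upgrade($upgraded);

my $packed_flag   = utf8::is_utf8($packed)   ? 1 : 0;   # 0
my $upgraded_flag = utf8::is_utf8($upgraded) ? 1 : 0;   # 1

# Both strings hold the same sequence of integers.
my $same = ($packed eq $upgraded) ? 1 : 0;              # 1

printf "flags: %d vs %d, equal: %d, lengths: %d vs %d\n",
    $packed_flag, $upgraded_flag, $same,
    length($packed), length($upgraded);
```

Only utf8::is_utf8 can tell the two apart, and that function exists to
inspect the internals, not the contents.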

> and bytes::length will return the length of the octet stream
> you are about to send).

This works only if the string happens to be in one of the two
internal representations; in the other it gives the wrong answer.

The case the OP had was that he wanted to send Latin-1 and his
strings contained sequences of Latin-1 characters, which happen
to be interchangeable with their octet representation. His
strings were getting upgraded in the course of the code, which is
hardly uncommon with Latin-1 strings and in fact is necessary in
some cases.

It should not have mattered that they were upgraded. Their
content was semantically correct. But it did matter, because
Catalyst::Engine used bytes::length, which forced the user to
care about the internal representation.
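
A rough sketch of the failure, assuming a made-up body string (this is
not the actual Catalyst::Engine code): once a Latin-1 string has been
upgraded, bytes::length counts the bytes of the internal form, not the
octets that would go over the wire.

```perl
use strict;
use warnings;
use Encode ();
use bytes ();   # makes bytes::length callable without byte semantics

# Ten characters, every one of them valid Latin-1.
my $body = "na\xEFve caf\xE9";

# Same string, other internal representation.
utf8::upgrade($body);

my $chars    = length $body;           # 10 characters
my $internal = bytes::length($body);   # 12: bytes of the internal form
my $wire     = length Encode::encode('ISO-8859-1', $body);   # 10 octets

print "characters: $chars, bytes::length: $internal, Latin-1 octets: $wire\n";
```

A Content-Length taken from bytes::length would be 12 here, although
only 10 octets of Latin-1 get sent. Encoding explicitly and taking the
plain length of the result gives the right number in either case.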

And you know what you said about the internal representation.

> HTTP is a binary protocol, but people need to send text, so
> there is an impedance mismatch.

HTTP is a red herring. *All* forms of I/O have this mismatch.
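
For instance, a small sketch with an in-memory filehandle standing in
for any output channel (the names and the sample text are invented for
illustration): the characters become octets at the I/O boundary, via
an encoding layer, whatever the string's internal form was.

```perl
use strict;
use warnings;

my $text = "caf\xE9";
utf8::upgrade($text);   # the internal form should not matter below

# An in-memory file stands in for a socket, pipe or file on disk.
my $buffer = '';
open my $fh, '>:encoding(ISO-8859-1)', \$buffer
    or die "open failed: $!";
print {$fh} $text;
close $fh or die "close failed: $!";

my $octets = length $buffer;   # 4 octets were written
print "wrote $octets octets\n";
```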

> But for now, trying hard to Do The Right Thing (instead of
> causing weird web browser errors) is what we're stuck with.

Nice ideal. Unfortunately you can’t. You can merely partially
paper over one set of problems – only by creating another.

I’m not saying that people who have broken apps should be told to
take a hike. It might be nice to provide the old workaround approach
as a plugin for people who depended on that behaviour. It can be
agonising to fix an app after the fact, as I know very well.
I only recently cleaned up $job app in that regard, which still
suffered the legacy of the days of Perl 5.6 and some very old
DBD::mysql versions… and therefore required cleaning a database
that contained arbitrarily mixed doubly- & triply-encoded data.
So I put it off for as long as it could wait; other things took
priority. It’s nice to have the option to wait until an opportune
moment.

But Catalyst shouldn’t in the meantime punish people who haven’t
done anything wrong for the mistakes of other people.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>


