[Catalyst] Re: Avoiding UTF8 in Catalyst
Carl Johnstone
catalyst at fadetoblack.me.uk
Mon Nov 23 17:38:29 GMT 2009
Aristotle Pagaltzis wrote:
> But there’s no room for “likelies” here: that’s programming by
> coincidence.
The "likely" was correct.
When using UTF-8, whether the length of a string differs between bytes and
characters depends entirely on the contents of the string. Given a
particular string I could tell you exactly whether they should match, but in
the general case all I can say is that it's *likely* to be different.
In any case that's an argument about English :-)
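To make the byte-versus-character point concrete, here is a minimal sketch. The helper name char_and_byte_len and the sample strings are made up for illustration; the only library call is encode from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: returns (character count, UTF-8 octet count) for a string.
sub char_and_byte_len {
    my ($str) = @_;
    my $nchars = length $str;                   # length in characters
    my $nbytes = length encode('UTF-8', $str);  # length once encoded as UTF-8
    return ($nchars, $nbytes);
}

# Illustrative inputs: one pure ASCII, one containing U+00E9 (e with acute).
for my $str ("hello", "h\x{e9}llo") {
    my ($nchars, $nbytes) = char_and_byte_len($str);
    printf "%d chars, %d bytes\n", $nchars, $nbytes;
}
# prints "5 chars, 5 bytes" then "5 chars, 6 bytes"
```

The two counts match for the ASCII-only string and differ as soon as one character falls outside ASCII, which is why the answer depends on the contents.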
> Either you want it or you don’t, and in this case
> you do. But bytes::length doesn’t do that.
>
> Please please don’t make statements like “not in this case”
> without knowing what the thing you are talking about, i.e.
> in this case bytes::length, does. There are enough misconceptions
> about Unicode in Perl already.
As for the usage of bytes::length: yes, I agree with you that the code is
wrong, as it's taking the byte length of perl's internal representation. That
happens to be utf-8, so the result is correct when the output is utf-8, but it
isn't for any other character set and shouldn't be relied upon.
You *do* have to take a byte length of the string in the destination
character set though, so I'm interested in what the correct solution would
be.
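One possible approach, offered as a sketch under assumptions rather than a settled answer from this thread: encode the string explicitly into the destination character set with Encode's encode, then take the plain length of the resulting octets. The helper name encoded_length and the sample string are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: number of octets $str occupies once encoded as $charset.
sub encoded_length {
    my ($str, $charset) = @_;
    my $octets = encode($charset, $str);  # explicit encode, independent of perl's internal form
    return length $octets;                # $octets is a byte string, so length counts bytes
}

my $body = "caf\x{e9}";  # 4 characters, the last one non-ASCII

print encoded_length($body, 'UTF-8'),      "\n";  # prints 5
print encoded_length($body, 'ISO-8859-1'), "\n";  # prints 4
print encoded_length($body, 'UTF-16BE'),   "\n";  # prints 8
```

Because the encoding step is explicit, the result does not depend on how perl happens to store the string internally.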
Carl