[Catalyst] Catalyst Unicode

Bill Moseley moseley at hank.org
Fri Jan 31 15:03:38 GMT 2014


On Fri, Jan 31, 2014 at 3:58 AM, Will Crawford
<billcrawford1970 at gmail.com> wrote:

>
> If the string has been decoded *from* UTF-8 to Perl's internal
> representation, it's *not* going to be marked as UTF8 internally; it
> *shouldn't* be. It's no longer a "UTF8" string but a "Unicode" string,
> complete with wide characters. If anything, the internal "UTF8" flag
> means "this string needs decoding" rather than "has been decoded".
>


$ perl -le 'use Encode;  my $chars = decode_utf8( "bytes" ); print
Encode::is_utf8( $chars ) ? "Is flagged utf8\n" : "not flagged\n"; use
Devel::Peek; Dump($chars)'
Is flagged utf8

SV = PV(0x7fb8c10023f0) at 0x7fb8c102b6a8
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0x7fb8c0e01170 "bytes"\0 [UTF8 "bytes"]
  CUR = 5
  LEN = 16

Everything is encoded.  The flag tells Perl that the string's internal
representation is encoded as utf8, so it knows to work with it as utf8
characters (e.g. length() is the length in chars, matching works on chars,
etc.)
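
To make the length() and matching point concrete, here is a minimal sketch.
The sample string "caf\xC3\xA9" is my own assumption (it is not from the
one-liner above); it is chosen only because it contains one non-ASCII
character.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# "\xC3\xA9" is the two-byte UTF-8 encoding of U+00E9 (e with acute).
my $bytes = "caf\xC3\xA9";
my $chars = decode_utf8($bytes);

# Before decoding, Perl sees five separate bytes and no flag.
printf "bytes: length=%d flagged=%s\n",
    length($bytes), Encode::is_utf8($bytes) ? "yes" : "no";

# After decoding, the flag is set and length() counts four characters.
printf "chars: length=%d flagged=%s\n",
    length($chars), Encode::is_utf8($chars) ? "yes" : "no";

# Matching works on characters too: "." consumes the whole e-acute.
print "matched one char after caf\n" if $chars =~ /\Acaf.\z/;
```

The same regex does not match $bytes, because there "." would have to
cover two bytes.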

$ perl -le 'use Encode;  my $chars = decode( "latin1", "bytes" ); print
Encode::is_utf8( $chars ) ? "Is flagged utf8\n" : "not flagged\n"; use
Devel::Peek; Dump($chars)'
Is flagged utf8
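
A slightly fuller sketch of the same thing with a non-ASCII byte. The input
"\xE9" is my own assumption for illustration; the point is only that the
flag describes the internal storage after decoding, whatever encoding the
input was in.

```perl
use strict;
use warnings;
use Encode qw(decode encode_utf8);

# In Latin-1, e with acute is the single byte 0xE9.
my $latin1_bytes = "\xE9";
my $chars = decode("latin1", $latin1_bytes);

# One character, and the flag is set even though the input was not UTF-8.
printf "length=%d flagged=%s\n",
    length($chars), Encode::is_utf8($chars) ? "yes" : "no";

# Encoding the character for output yields the two UTF-8 bytes c3 a9.
my $out = encode_utf8($chars);
printf "utf8 bytes: %s\n", unpack("H*", $out);
```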



-- 
Bill Moseley
moseley at hank.org