[Catalyst] Re: utf8 in regexes in Catalyst

Matt Lawrence matt.lawrence at ymogen.net
Mon Mar 3 12:45:19 GMT 2008


Aristotle Pagaltzis wrote:
> So for that one-liner, you do this:
>
>     echo 'é' | perl -MEncode -e '$_ = decode "UTF-8", scalar <>; print /\w/'
>
> Yes, this is tedious. So what you do is you find ways to get the
> parts of your program that speak to the outside world to decode
> input on receipt and encode output on emission. Then inside your
> program, you don’t need to think about it at all. E.g., for the
> one-liner, you would declare that your STDIN and STDOUT are in
> UTF-8 and then reading from and writing to them automatically
> does what it should. Handily, perl has a switch for that when it
> comes to UTF-8:
>
>     echo 'é' | perl -CS -e 'print <> =~ /\w/'
>
>   
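See also the PERL_UNICODE environment variable, documented in man perlrun.
It is equivalent to the -C switch, so it can be set once in the
environment instead of passing -CS on every invocation.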

Matt



