[Catalyst-commits] r14544 - trunk/examples/CatalystAdvent/root/2014
jnapiorkowski at dev.catalyst.perl.org
jnapiorkowski at dev.catalyst.perl.org
Thu Dec 11 02:43:19 GMT 2014
Author: jnapiorkowski
Date: 2014-12-11 02:43:19 +0000 (Thu, 11 Dec 2014)
New Revision: 14544
Added:
trunk/examples/CatalystAdvent/root/2014/12.pod
Log:
day 12
Added: trunk/examples/CatalystAdvent/root/2014/12.pod
===================================================================
--- trunk/examples/CatalystAdvent/root/2014/12.pod (rev 0)
+++ trunk/examples/CatalystAdvent/root/2014/12.pod 2014-12-11 02:43:19 UTC (rev 14544)
@@ -0,0 +1,147 @@
+=head1 UTF8 in Controller Actions
+
+All about stuff that is changing in the Holland (current development)
+release around content encoding and unicode support (part one,
+controllers and actions).
+
+=head1 Summary
+
+Starting in the upcoming L<Catalyst> release (holland, which is as of this
+writing dev003 on CPAN, and ready for your testing) Unicode encoding will be
+enabled by default. In addition we've made a ton of fixes around encoding
+and utf8 scattered throughout the codebase.
+
+This is part one of a three part series on UTF8 and content body encoding.
+In this part we will review changes to how UTF8 characters can be used in
+controller actions, how it looks in the debugging screens (and your logs)
+as well as how you construct L<URL> objects to actions with utf8 paths
+(or using utf8 args or captures).
+
+=head1 Unicode in Controllers and URLs
+
+ package MyApp::Controller::Root;
+
+ use uf8;
+ use base 'Catalyst::Controller';
+
+ sub heart_with_arg :Path('♥') Args(1) {
+ my ($self, $c, $arg) = @_;
+ }
+
+ sub base :Chained('/') CaptureArgs(0) {
+ my ($self, $c) = @_;
+ }
+
+ sub capture :Chained('base') PathPart('♥') CaptureArgs(1) {
+ my ($self, $c, $capture) = @_;
+ }
+
+ sub arg :Chained('capture') PathPart('♥') Args(1) {
+ my ($self, $c, $arg) = @_;
+ }
+
+=head1 Discussion
+
+In the example controller above we have constructed two matchable URL routes:
+
+ http://localhost/root/♥/{arg}
+ http://localhost/base/♥/{capture}/♥/{arg}
+
+The first one is a classic Path type action and the second uses Chaining, and
+spans three actions in total. As you can see, you can use unicode characters
+in your Path and PartPart attributes (remember to use the C<utf8> pragma to allow
+these multibyte characters in your source). The two constructed matchable routes
+would match the following incoming URLs:
+
+ (heart_with_arg) -> http://localhost/root/%E2%99%A5/{arg}
+ (base/capture/arg) -> http://localhost/base/%E2%99%A5/{capture}/%E2%99%A5/{arg}
+
+That path path C<%E2%99%A5> is url encoded unicode (assuming you are hitting this with
+a reasonably modern browser). Its basically what goes over HTTP when your type a
+browser location that has the unicode 'heart' in it. However we will use the unicode
+symbol in your debugging messages:
+
+ [debug] Loaded Path actions:
+ .-------------------------------------+--------------------------------------.
+ | Path | Private |
+ +-------------------------------------+--------------------------------------+
+ | /root/♥/* | /root/heart_with_arg |
+ '-------------------------------------+--------------------------------------'
+
+ [debug] Loaded Chained actions:
+ .-------------------------------------+--------------------------------------.
+ | Path Spec | Private |
+ +-------------------------------------+--------------------------------------+
+ | /base/♥/*/♥/* | /root/base (0) |
+ | | -> /root/capture (1) |
+ | | => /root/arg |
+ '-------------------------------------+--------------------------------------'
+
+And if the requested URL uses unicode characters in your captures or args (such as
+C<http://localhost:/base/♥/♥/♥/♥>) you should see the arguments and captures as their
+unicode characters as well:
+
+ [debug] Arguments are "♥"
+ [debug] "GET" request for "base/♥/♥/♥/♥" from "127.0.0.1"
+ .------------------------------------------------------------+-----------.
+ | Action | Time |
+ +------------------------------------------------------------+-----------+
+ | /root/base | 0.000080s |
+ | /root/capture | 0.000075s |
+ | /root/arg | 0.000755s |
+ '------------------------------------------------------------+-----------'
+
+Again, remember that we are display the unicode character and using it to match actions
+containing such multibyte characters BUT over HTTP you are getting these as URL encoded
+bytes. For example if you looked at the L<PSGI> C<$env> value for C<REQUEST_URI> you
+would see (for the above request)
+
+ REQUEST_URI => "/base/%E2%99%A5/%E2%99%A5/%E2%99%A5/%E2%99%A5"
+
+So on the incoming request we decode so that we can match and display unicode characters
+(after decoding the URL encoding). This makes it straightforward to use these types of
+multibyte characters in your actions and see them incoming in captures and arguments.
+
+=head1 UTF8 in constructing URLs.
+
+For the reverse (constructing meaningful URLs to actions that contain multibyte characters
+in their paths or path parts, or when you want to include such characters in your captures
+or arguments) L<Catalyst> will do the right thing (again just remember to use the C<utf8>
+pragma).
+
+ use utf8;
+ my $url = $c->uri_for( $c->controller('Root')->action_for('arg'), ['♥','♥']);
+
+When you stringyfy this object (for use in a template, for example) it will automatically
+do the right thing regarding utf8 encoding and url encoding.
+
+ http://localhost/base/%E2%99%A5/%E2%99%A5/%E2%99%A5/%E2%99%A5
+
+Since again what you want is a properly url encoded version of this. Ultimately what you want
+to send over the wire via HTTP needs to be bytes (not unicode characters).
+
+=head1 Conclusion
+
+Starting with the Holland release we've made a big effort to improve L<Catalyst> support
+for multibyte characters. You can use them in actions and in constructing URLs. Also
+we've updated the debugging screens to show you these types of characters correctly.
+
+In upcoming articles we will look at how L<Catalyst> deal with utf8 body encoding
+and how we handle HTML forms. So stay tuned!
+
+L<Catalyst> unicode is a work in progress; we are targeting the Holland release to
+make these fixes stable but you can play with it right now with the dev003 or better
+release on CPAN today! If you are a unicode master please help us get it right and
+review the code changes and test cases.
+
+Even if you don't consider yourself an expert we recommend you start testing this
+release since unicode is on by default going forward. I know this is a big change
+but it seems the only way to start getting this right is by getting everyone in the
+same conversation. But this is still development code and everything can change
+between now and stable release. So get your voice heard.
+
+=head1 Author
+
+John Napiorkowski L<jjnapiork at cpan.org|email:jjnapiork at cpan.org>
+
+=cut
More information about the Catalyst-commits
mailing list