[Catalyst] Escaping of "argument" of private path
Octavian Rasnita
orasnita at gmail.com
Wed Mar 16 07:02:10 GMT 2011
From: "John M. Dlugosz" <wxju46gefd at snkmail.com>
> On 3/15/2011 4:56 AM, Octavian Rasnita orasnita-at-gmail.com
> |Catalyst/Allow to home| wrote:
>>
>> uri_for() escapes only the chars which are not in the following list
>> (from URI.pm):
>>
>> $reserved = q(;/?:@&=+$,[]);
>> $mark = q(-_.!~*'()); #'; emacs
>> $unreserved = "A-Za-z0-9\Q$mark\E";
>>
>> The char "&" is a valid char in the URI, so it should not be escaped.
>> In other words, the following URL is OK:
>>
>> http://localhost/dir1/dir2/ham%20&%20eggs.jpg
>>
>> uri_for() generates the URI as it needs to be accessed on the server and
>> not as it should be printed in an HTML page. In order to be printed
>> correctly, the "&" char must be HTML-encoded, so the html TT filter must
>> be used:
>>
>> <a href="[% c.uri_for('/path', 'eggs & ham.jpg', {a=1, b=2}).path_query |
>> html%]">label</a>
>>
>> It will give:
>>
>> <a href="/path/eggs%20&amp;%20ham.jpg?a=1&amp;b=2">label</a>
>>
>
> In contrast, the 'uri' filter in TT "converting any characters outside of
> the permitted URI character set (as defined by RFC 2396)" and that
> includes |&|, |@|, |/|, |;|, |:|, |=|, |+|, |?| and |$|.
> The 'url' filter in TT is less aggressive, and does not include those.
Those chars must be escaped when they appear as data in a query string, but
they are permitted in the path part of a URL. The "?", "&", "=", "+" and ";"
signs are used to separate the path from the query string, to delimit the
parts of the query string, and to represent a space char.
They can also be used in the names of the files in the path. For example, the
following URL is valid:
http://localhost/static/a%20&%20@%20;%20$%20+%20=.txt
If you want, you can escape these chars everywhere, not only in the query
strings, but why would you want to do this?
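To illustrate the point outside of Catalyst, here is a minimal Python sketch
(standard library only). The PATH_SAFE set is my own assumption, modelled on
the URI.pm "reserved" list quoted above minus "/" and "?", which would change
the structure of the URL. It is not what uri_for() itself runs.

```python
from urllib.parse import quote, urlsplit

# Assumption for this sketch: reserved chars that are left unescaped
# inside a single path segment ("/" and "?" are deliberately excluded).
PATH_SAFE = ";:@&=+$,"

name = "a & @ ; $ + =.txt"

# Only the spaces get percent-encoded; the reserved chars stay as they are.
url = "http://localhost/static/" + quote(name, safe=PATH_SAFE)
print(url)

# The parser still sees one path and no query string, because there is
# no "?" in the URL to start a query component.
parts = urlsplit(url)
print(parts.path)
print(repr(parts.query))
```

The printed URL is the same as the one above, and the query component comes
back empty, so the "&", "=" and "+" in the file name are not mistaken for
query delimiters.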
> The '&' is a "Reserved Character" according to §2.2 of RFC 2396. That is
> what the code sample you quoted notes: the set of reserved characters.
> They may have specific meanings as delimiters within the overall URI, so
> should be escaped. Just skimming, I see that it's reserved within the
> query component.
Yes, but uri_for() escapes them in the query components (where they need to
be escaped).
For example:
[% file = 'a+b = c & $î @â'; a = 'a+b = c & $î @â'; b= 'a+b = c & $î @â' %]
<a href="[% c.uri_for('/path', file, {a=a, b=b}).path_query %]">label</a>
will display:
<a
href="/path/a+b%20=%20c%20&%20$%C3%AE%20@%C3%A2?a=a%2Bb+%3D+c+%26+%24%C3%AE+%40%C3%A2&b=a%2Bb+%3D+c+%26+%24%C3%AE+%40%C3%A2">label</a>
Note that I didn't HTML-encode the URL, so that the result is easier to read.
As you can see, the reserved chars are escaped by uri_for() only where they
need to be escaped.
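The same split between path escaping and query escaping can be sketched in
Python. The path_query() helper below is hypothetical (it is not the Catalyst
API, and not how uri_for() is implemented); it only shows the idea of using a
permissive escape for the path segment and a strict one for the query values.

```python
from urllib.parse import quote, urlencode

# Assumption for this sketch: chars left unescaped in a path segment.
PATH_SAFE = ";:@&=+$,"

def path_query(base, segment, params):
    """Hypothetical helper: escape a path segment and query values differently."""
    # Path segment: only chars outside the unreserved and PATH_SAFE sets are encoded.
    path = base.rstrip("/") + "/" + quote(segment, safe=PATH_SAFE)
    # Query string: every reserved char in a value is encoded, spaces become "+".
    return path + "?" + urlencode(params)

value = "a+b = c & $î @â"
result = path_query("/path", value, {"a": value, "b": value})
print(result)
```

With the same input as the template above, this prints the same path and
query string as the href shown: the "+", "=", "&", "$" and "@" chars stay
literal in the path and are percent-encoded in the query values.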
And of course, if you need to print this URL in an HTML document, you can
add the TT html filter and the "&" chars will be displayed as &amp;.
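As a rough equivalent of that last step, the Python sketch below applies HTML
escaping to an already-built URL just before it is placed in an attribute.
html.escape() stands in for the TT html filter here; the URL is the one from
the earlier 'eggs & ham.jpg' example.

```python
from html import escape

# URL as it should be requested from the server (already URI-escaped).
url = "/path/eggs%20&%20ham.jpg?a=1&b=2"

# HTML-encode once, at output time, so "&" becomes "&amp;" in the markup.
link = '<a href="%s">label</a>' % escape(url, quote=True)
print(link)
```

The browser decodes "&amp;" back to "&" when it reads the attribute, so the
request that reaches the server still uses the original URL.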
> Anyway, using the TT 'uri' filter on the dynamic path component means I
> don't have to use the html filter also!
Why would you want to escape every path component with the TT uri filter,
several times over, and escape the reserved chars even where they can be
used as they are, instead of using the html filter once?
If you want, you can uri-escape even the [a-zA-Z0-9] chars, but why would
you want to escape chars where they don't need to be escaped? :-)
Octavian