[Catalyst-dev] Request URI (path) normalisation

Sebastian Riedel sri-lists at labs.kraih.com
Sat Sep 27 20:36:04 BST 2008


27.09.2008 19:30 Matt S Trout:

> On Sun, Sep 14, 2008 at 08:04:31PM +0200, Florian Zumbiehl wrote:
>> Hi,
>>
>> this email basically arose from a discussion on #catalyst/irc.perl.org
>> where my (more or less) original question was for the format of the
>> string that Regex actions do match against.
>>
>> As nobody really seemed to know the answer, it got into a discussion
>> of basic URI semantics and finally kind of to the conclusion that
>> the current implementation of Regex (at least) probably is broken.
>> Part of that conclusion actually isn't from first-hand experience on
>> my part, but rather from Sebastian Riedel's examination of the source
>> of the current version, AFAICT - the Debian backport package (5.7006)
>> I am using behaves differently. So, please forgive me, should this
>> invalidate parts of the following.
>
> Yeah, sri implemented this broken in the first place.
>
>> The behaviour I would consider sensible would be the normalisation
>> of the path in such a way that any two URI paths that are mandated
>> by the RFC to be equivalent will result in the exact same string,
>> and any two URI paths that are not mandated by the RFC to be
>> equivalent will result in different strings.
>
> Backwards compatible patches to make it sane very welcome; I don't really
> ever use Regex actions so I've no itch to scratch.
>
>> IMO, in addition, as many characters as possible should be in
>> unescaped form after normalisation. For the path alone, that
>> would mean that only slashes in path components would really have
>> to be escaped.
>
> You also miss that things like () are marked by the URI standard as
> being allowed to be used for internal subhierarchies so these have to be
> kept intact as well.
>
> I think I had a play with this a while back and determined I didn't have
> time to do it right.

Mojo::ByteStream::url_sanitize() should fix this.
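
To make the behaviour discussed above concrete, here is a rough sketch of
the kind of path normalisation RFC 3986 describes (sections 6.2.2 and
5.2.4). It is written in Python purely for illustration; it is not
Catalyst or Mojo code and makes no claim about what url_sanitize()
actually does. The names normalise_path and remove_dot_segments are made
up for this example. It assumes the conservative reading: only
"unreserved" characters are unescaped, so an escaped slash and escaped
sub-delimiters such as () stay escaped and remain distinct from their
literal forms.

```python
import re

# RFC 3986 "unreserved" characters: decoding these never changes meaning.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def _fix_escape(match):
    """Decode an escape if it encodes an unreserved character,
    otherwise keep it and upper-case its hex digits."""
    char = chr(int(match.group(1), 16))
    if char in UNRESERVED:
        return char
    return "%" + match.group(1).upper()


def remove_dot_segments(path):
    """Remove "." and ".." segments as in RFC 3986 section 5.2.4."""
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]
            if out:
                out.pop()
        elif path == "/..":
            path = "/"
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading slash) to the output.
            start = 1 if path.startswith("/") else 0
            idx = path.find("/", start)
            if idx == -1:
                out.append(path)
                path = ""
            else:
                out.append(path[:idx])
                path = path[idx:]
    return "".join(out)


def normalise_path(path):
    """Hypothetical helper: fix escapes first, then drop dot segments,
    so that an escaped dot (%2E) is treated like a literal one."""
    return remove_dot_segments(_PCT.sub(_fix_escape, path))
```

With this sketch, "/a/%7Efoo/./b/../c%2fd" comes out as "/a/~foo/c%2Fd",
while "/(x)" and "/%28x%29" are both left unchanged and therefore stay
different strings. Whether a dispatcher should go further and unescape
more characters, as suggested in the quoted mail, is a separate design
decision this sketch does not take.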

--
sebastian
