[Catalyst-dev] RFC 3875, REQUEST_URI and PATH_INFO

Tomas Doran bobtfish at bobtfish.net
Sun May 2 22:42:58 GMT 2010

On 21 Apr 2010, at 23:30, <danny-catalyst-dev at sadinoff.com> wrote:

> per bobtfish, I'm writing about the RFC3875 paragraph of
> Catalyst/Engine/CGI.pm rev 13152 around line 152.
> This new bit of code is harming the current mapping REQUEST_URI +
> PATH_INFO => $base_uri, which is used for dispatch.
> In particular, I've got a set of mod_rewrite config clauses, which are
> getting misdirected by this new code.
> http://paste.scsys.co.uk/42123
> If I understand the situation correctly, bobtfish has broken my
> redirects in an attempt to address issues surrounding the encoding of
> slash-containing query parameters.

That'd be me then :)

Sorry about the delay in getting to this - I've been thinking about it
fairly deeply for a while, and I've come to the conclusion that there
is no easy way to solve it completely...

To make this entirely clear to everyone (hopefully), the difficulty is
that previously Catalyst entirely ignored the REQUEST_URI environment
variable, and instead constructed the request base ($c->req->base),
uri ($c->req->uri) and path ($c->req->path) from the combination of
SCRIPT_NAME and PATH_INFO.

However, the PATH_INFO is _always_ decoded, which means that %2F is  
decoded into /, meaning that you can't possibly get it right if you're  
just using PATH_INFO.
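
To illustrate the loss of information, here is a minimal sketch (in
Python rather than Perl, standard library only; the two paths are
made-up examples, and unquote() merely stands in for whatever decoding
the web server performs before setting PATH_INFO):

```python
from urllib.parse import unquote

# Two different request paths as a client might send them.
encoded_slash = "/foo%2Fbar"  # one segment containing an encoded slash
real_slash = "/foo/bar"       # two separate segments

# Once decoded, both collapse to the same string, so code that only
# sees the decoded value cannot tell which one the client sent.
path_info_a = unquote(encoded_slash)
path_info_b = unquote(real_slash)

print(path_info_a, path_info_b, path_info_a == path_info_b)
```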

> I have suggested to him that his
> changes are hard to prove correct without a good spec for the above
> mapping, and he's agreed to either remove the code or configure it off
> by default.

It's not to do with any spec - it's down to what web servers actually  
do in reality.

In the case where (for example), your app is at /cgi-bin/myapp.cgi,  
the request path is /foo%2Fbar, and mod_rewrite or mod_alias is used  
to map / into /cgi-bin/myapp.cgi, then the REQUEST_URI will reflect  
the path (/foo%2Fbar), SCRIPT_NAME will be /cgi-bin/myapp.cgi and the  
PATH_INFO contains /cgi-bin/myapp.cgi/foo/bar

This means that if you're using REQUEST_URI and SCRIPT_NAME, then the  
request base cannot be correctly determined. If you're using PATH_INFO  
and SCRIPT_NAME (as we used to) then everything works as expected,  
however you _cannot_ handle %2F correctly...
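
A rough sketch of why the REQUEST_URI + SCRIPT_NAME combination breaks
down under rewriting (Python, not the actual Catalyst code;
path_after_script_name is a hypothetical helper, and query strings are
ignored for brevity):

```python
def path_after_script_name(request_uri, script_name):
    """Return the part of request_uri that follows script_name,
    or None when request_uri does not start with script_name."""
    if request_uri == script_name or request_uri.startswith(script_name + "/"):
        return request_uri[len(script_name):]
    return None

script_name = "/cgi-bin/myapp.cgi"

# Direct request to the script: the prefix matches and %2F survives.
direct = path_after_script_name("/cgi-bin/myapp.cgi/foo%2Fbar", script_name)

# Request mapped from / onto the script: REQUEST_URI still shows what
# the client sent, so SCRIPT_NAME is not a prefix of it.
rewritten = path_after_script_name("/foo%2Fbar", script_name)

print(direct)     # /foo%2Fbar
print(rewritten)  # None
```

In the rewritten case there is nothing in these two variables alone
that says the application is really based at /.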

So basically, you're damned if you do and damned if you don't.

Given that many people are relying on being able to map arbitrary  
paths into the application using mod_(rewrite|alias|ssi), then I think  
we have to revert to the previous behavior by default, and provide the  
behavior of using REQUEST_URI as a configuration option.

I have the change in a branch now:

I've attached the current diff of the branch to this mail. Could
someone please review it?

I also plan to back out some of the heuristics in the REQUEST_URI
handling that I added after initially finding the issues, so that the
handling is simple. We also need to add a way to have your cake and
eat it too: telling Catalyst (by config) the place(s) the application
is based, so that it can both get $c->req->base correct AND use
REQUEST_URI to get %2F handling.
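
One possible shape for that idea, purely as an illustration (Python,
not Catalyst code; split_request_uri and the list of configured bases
are assumptions made up for this sketch, not an existing API or config
option):

```python
def split_request_uri(request_uri, configured_bases):
    """Split request_uri into (base, path) using the longest configured
    base that is a prefix of it. The path is left undecoded, so %2F is
    preserved. Returns None when no configured base matches."""
    path = request_uri.split("?", 1)[0]  # drop any query string
    for base in sorted(configured_bases, key=len, reverse=True):
        prefix = base.rstrip("/")
        if path == prefix or path.startswith(prefix + "/"):
            return prefix + "/", path[len(prefix):].lstrip("/")
    return None

# Hypothetical config: the app answers both at / and at the script URL.
bases = ["/", "/cgi-bin/myapp.cgi"]

print(split_request_uri("/foo%2Fbar", bases))
print(split_request_uri("/cgi-bin/myapp.cgi/foo%2Fbar?x=1", bases))
```

Trying the longest base first keeps a request for
/cgi-bin/myapp.cgi/... from being claimed by the / entry.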

However all of this can happen later - the main issues blocking merging
this code are (a) having tests for both cases (config option on and
off) - I'll get to these very shortly (but probably not tonight), and
(b) the name I've given the config option ('rfc3875_path'), which is
entirely rubbish. Can someone please suggest something better?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix_request_uri.patch
Type: application/octet-stream
Size: 6326 bytes
Desc: not available
Url : http://lists.scsys.co.uk/pipermail/catalyst-dev/attachments/20100502/8001334c/fix_request_uri.obj