[Catalyst] PageCache support for caching redirect responses

Paul Flow flowsbits at gmail.com
Sun Oct 31 08:38:37 GMT 2010


Hi everyone,

My Catalyst app supports requests using a variety of easily typed
shorthand URIs that redirect to a canonical URI.  Determining the
canonical URI from some shorthands can be expensive, but the mapping
rarely if ever changes, so I'm looking to cache the response (with its
Status: 301, Location: ..., no body).  Subsequent requests for that
shorthand would then get the same response pulled from cache instead of
re-evaluating the canonical URI and regenerating the redirect to it.

I've tried to set this up with Catalyst::Plugin::PageCache.  When it
returns the page from cache, the cache-related headers are correct,
but the resulting response has HTTP status 200 (OK) and there is no
Location header.


	First time request gets the desired redirect response:
	
		HTTP/1.1 301 Moved Permanently
		Date: Sun, 31 Oct 2010 07:01:35 GMT
		Server: Apache
		Cache-Control: max-age=1800
		Expires: Sun, 31 Oct 2010 07:31:36 GMT
		Last-Modified: Sun, 31 Oct 2010 07:01:36 GMT
		Set-Cookie: session=d9eb3632430335e064c98c6e9e38fa269e55157a; path=/; expires=Sun, 31-Oct-2010 09:01:36 GMT; HttpOnly
		Location: http://example.com/path/to/canonical?q=something
		Vary: Accept-Encoding
		Content-Encoding: gzip
		Keep-Alive: timeout=15, max=100
		Connection: Keep-Alive
		Transfer-Encoding: chunked
		Content-Type: text/html; charset=utf-8


It seems well formed and the browser does cache it correctly.  A
subsequent request from the same browser for that shortcut URI
redirects without touching the server.

However if the request is repeated after clearing the browser cache
(or by another client), the request goes to the server and the cached
response comes back as below, with status 200 and without Location.
Since there's no body (as expected), we get what effectively amounts
to an empty response.

	Subsequent request is served from PageCache:
	
		HTTP/1.1 200 OK
		Date: Sun, 31 Oct 2010 07:03:01 GMT
		Server: Apache
		Cache-Control: max-age=1715
		Expires: Sun, 31 Oct 2010 07:31:36 GMT
		Last-Modified: Sun, 31 Oct 2010 07:01:36 GMT
		Set-Cookie: session=d9eb3632430335e064c98c6e9e38fa269e55157a; path=/; expires=Sun, 31-Oct-2010 09:03:01 GMT; HttpOnly
		X-Catalyst: 5.80029
		X-PageCache: Catalyst
		Vary: Accept-Encoding
		Content-Encoding: gzip
		Keep-Alive: timeout=15, max=96
		Connection: Keep-Alive
		Transfer-Encoding: chunked
		Content-Type: text/html; charset=utf-8


Obviously, that's not going to do it.

Looking at the C::P::PageCache source, I can see that it stores
headers enumerated from $c->res->headers->header_field_names when its
cache_headers config is set (along with ones it generates).  It
applies these stored headers to cached responses by mapping them each
back into $c->res->headers->header() calls.  It seems that the
Location header _should_ be coming through.  Except that it's not.

PageCache isn't storing $c->res->status, nor setting it in most cases
(it does 304 on unchanged if_modified_since requests).  So that
explains why it's not redirecting.  Since it doesn't specify Status,
something defaults it to 200.

But I'm not sure what's happening to the Location header.  Something
might be discarding it when the status isn't 201 (Created) or 3xx, since
it might only be used in those cases.
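
To make the idea concrete, here is a minimal sketch of what I mean by
storing the status alongside the headers and replaying both.  It is not
PageCache code: the %cache hash, the plain-hash responses, and the
store_response/replay_response names are all hypothetical stand-ins so
the snippet runs on its own.

```perl
use strict;
use warnings;

# Hypothetical stand-ins: a plain hash for the cache backend and plain
# hashes for responses. None of these names come from PageCache itself.
my %cache;

# Store the status together with the headers and body.
sub store_response {
    my ( $key, $res ) = @_;
    $cache{$key} = {
        status  => $res->{status},
        headers => { %{ $res->{headers} } },    # shallow copy
        body    => $res->{body},
    };
    return;
}

# Rebuild a response from the cache entry, or return nothing on a miss.
sub replay_response {
    my ($key) = @_;
    my $entry = $cache{$key} or return;
    return {
        status  => $entry->{status} // 200,     # 200 only if none was stored
        headers => { %{ $entry->{headers} } },
        body    => $entry->{body},
    };
}

store_response(
    '/short',
    {
        status  => 301,
        headers => { Location => 'http://example.com/path/to/canonical?q=something' },
        body    => '',
    }
);

my $replayed = replay_response('/short');
printf "%d %s\n", $replayed->{status}, $replayed->{headers}{Location};
```

In this toy version the replayed response keeps both the 301 and the
Location header, which is the behaviour I'm after.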


Before I dig any further, I thought I'd ask if I'm approaching this correctly.

Is PageCache with cache_headers the right way or at least a reasonable
way to cache redirect responses in this case?  If so, is it reasonable
for PageCache to store the HTTP status and set it for cached responses
instead of forcing a 200 response?  Is there a technical reason
PageCache isn't doing this today?

I thought there might have been a conflict with needing to return 304
(Not Modified) for unchanged If-Modified-Since requests, but the HTTP
1.1 spec for If-Modified-Since
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.25 covers
it: "... If the request would normally result in anything other than a
200 (OK) status, ... the response is exactly the same as for a normal
GET...", so the redirect statuses would take precedence over the 304
(Not Modified).
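
As I read that section, the status decision for a cached entry could be
sketched roughly as below.  The status_for_cached helper and its
arguments are hypothetical (times are plain epoch seconds), and this is
only my interpretation of the spec, not anything PageCache does today.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PageCache: pick the status for a cached
# entry given an optional If-Modified-Since time. Times are epoch seconds.
sub status_for_cached {
    my ( $cached_status, $last_modified, $if_modified_since ) = @_;

    # Anything other than 200 is answered as for a normal GET.
    return $cached_status if $cached_status != 200;

    # A 200 entry that is unchanged since the client's copy becomes 304.
    return 304
        if defined $if_modified_since && $last_modified <= $if_modified_since;

    return 200;
}

print status_for_cached( 301, 1000, 2000 ),  "\n";    # cached redirect stays 301
print status_for_cached( 200, 1000, 2000 ),  "\n";    # unchanged page: 304
print status_for_cached( 200, 3000, 2000 ),  "\n";    # changed page: 200
print status_for_cached( 200, 1000, undef ), "\n";    # unconditional GET: 200
```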

Thoughts?


Here are simplified snips from my Cache and PageCache setup and an
affected controller to illustrate:


	package MyApp;
		
		# ...
	
		use Catalyst qw/... Cache PageCache/;
		
		__PACKAGE__->config->{page_cache} = {
			set_http_headers => 1,
			cache_headers => 1,
			busy_lock => 10,
			cache_hook => 'should_cache'
		};
		
		
		sub should_cache {
			my $c = shift;
		
			return ($c->user_exists || $c->req->cookies->{nocache}) ? 0 : 1;
		}
		
		
		__PACKAGE__->config->{'Plugin::Cache'}{backend} = {
			class   => "Cache::Memcached",
			servers => ['127.0.0.1:11211']
		};
	
		# ...
	
	
	
	package MyApp::Controller::Root;
	
		# ...
	
		sub default : Path {
			my ( $self, $c, @args ) = @_;
			
			# cached responses fail to redirect (wrong status, no Location header)
			$c->cache_page( 1800 );
	
			# expensive lookup of @args shortcut that finds its $canonicalURI
	
			if ($canonicalURI) {
				$c->res->redirect($c->uri_for($canonicalURI, $c->req->params), 301);
				$c->detach();
			}
	
			# ...
		}


Regards,

Paul Flow


