[Catalyst] Re: How are you handling multiformat URL

Wed Nov 29 03:20:25 GMT 2006

* John Napiorkowski <jjn1056 at yahoo.com> [2006-11-29 00:05]:
> I often find that the resource of a given URI can have
> alternative formats.  For example, I might have a table of
> information on an html webpage and I want to offer two
> alternative representations, a comma separated version and an
> ATOM feed.

It’s not ATOM, any more than Perl is PERL. :-) It’s just Atom.
The name doesn’t stand for anything.

> So, what are you all doing to handle this?  Has the 'cool urls
> don't have file extensions' school of thought fallen out of
> favor?  Has someone found a better way, or a compelling reason
> to choose one over the other?

HTTP has had an answer for this for a long time, although it
hasn’t seen much use. (Just like a lot of HTTP in general…) It’s
called Content Negotiation and revolves around the Accept header.

HTTP in general has a three-pronged architecture: URIs, resources
and representations. URIs identify resources, but each resource
can have multiple representations. Clients can use the Accept
header on a request to supply a list of MIME types they
understand, complete with a preference ranking. The server can
use this to decide which representation of the resource it will
serve.
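
To make the preference ranking concrete, here is a rough sketch in
Python of how a server might read an Accept header. The function
name `parse_accept` is made up for illustration, and the parsing is
deliberately simplified: it only looks at the `q` parameter and
ignores other media type parameters.

```python
def parse_accept(header):
    """Return (media_range, q) pairs, most preferred first."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_range, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # a range without an explicit q parameter defaults to 1
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as "not wanted"
        entries.append((media_range.lower(), q))
    # sorted() is stable, so ranges with equal q keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


ranked = parse_accept("text/csv;q=0.5, text/html, application/atom+xml;q=0.8")
```

Here `ranked` lists text/html first, then application/atom+xml, then
text/csv, which is the order in which the server would try to match
its available representations.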

In practice, conneg (content negotiation) suffers from both
conceptual problems and neglect. The conceptual problem is that
the resource/representation duality is often fuzzy, and conneg
makes it harder to use the main cornerstone of web architecture,
i.e. identifying specific things by URI, which is a powerful
concept.

My current thinking on how to handle things is based on the fact
that the form of response to a request with conneg is entirely at
the server’s discretion. So I think this makes the most sense:

• The resource is identified by <http://example.org/foo>.

• Requests with (explicit or implicit) `Accept: */*` directly
  return the default representation in the body of a 200 OK
  response.

• Requests with a more restrictive Accept header which the server
  deems it can serve are answered with a 302 redirect.

• Such redirects go to <http://example.org/foo.ext>.

• If a request to <http://example.org/foo.ext> has an Accept
  header that precludes the media type of that particular
  representation, the server responds 406 Not Acceptable.
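
The rules above could be sketched roughly like this in Python. This
is only an illustration under stated assumptions: the
`REPRESENTATIONS` table, the default of HTML, and the names
`respond` and `quality` are all made up, the Accept parsing is
minimal, and answering 406 when nothing at the generic URI matches
is my guess at the natural behaviour rather than part of the scheme.

```python
# Hypothetical representations the server offers, keyed by extension.
REPRESENTATIONS = {
    "html": "text/html",
    "csv": "text/csv",
    "atom": "application/atom+xml",
}
DEFAULT_EXT = "html"  # assumption: HTML is the default representation


def parse_accept(header):
    """Map each media range in an Accept header to its q-value."""
    ranges = {}
    for part in header.split(","):
        media_range, _, params = part.strip().partition(";")
        q = 1.0
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if media_range.strip():
            ranges[media_range.strip().lower()] = q
    return ranges


def quality(media_type, ranges):
    """q-value for a media type; the most specific matching range wins."""
    major = media_type.split("/")[0]
    for candidate in (media_type, major + "/*", "*/*"):
        if candidate in ranges:
            return ranges[candidate]
    return 0.0


def respond(path, accept="*/*"):
    """Return (status, detail): media type for 200, Location for 302."""
    ranges = parse_accept(accept or "*/*")
    base, dot, ext = path.rpartition(".")
    if dot and ext in REPRESENTATIONS:
        # A specific representation: serve it, or 406 if it is precluded.
        media_type = REPRESENTATIONS[ext]
        if quality(media_type, ranges) > 0:
            return (200, media_type)
        return (406, None)
    if not ranges or set(ranges) == {"*/*"}:
        # Unrestricted request: default representation in a 200 response.
        return (200, REPRESENTATIONS[DEFAULT_EXT])
    # Restrictive Accept header: redirect to the best match, if any.
    best = max(REPRESENTATIONS, key=lambda e: quality(REPRESENTATIONS[e], ranges))
    if quality(REPRESENTATIONS[best], ranges) > 0:
        return (302, path + "." + best)
    return (406, None)  # assumption: nothing acceptable at the generic URI
```

With this, `respond("/foo")` serves HTML directly,
`respond("/foo", "text/csv")` redirects to /foo.csv, and
`respond("/foo.csv", "text/html")` yields 406. Note that the sketch
serves /foo.csv directly when the Accept header allows it, which is
just one possible choice.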

An open question I have is whether requests to
<http://example.org/foo.ext> with `Accept: */*` should actually
serve the resource or redirect back to <http://example.org/foo>
– which would mean that unless you insist on a GIF image when you
ask for <http://example.org/foo.gif>, you might actually get
a PNG image from <http://example.org/foo>. Yeah, it sounds crazy,
but the issue at hand is “canonical URIs.” E.g. I don’t want
a search engine spider to index my content thrice, just because
I offer it as all of text/plain, text/html and
application/xhtml+xml.

Mind you, none of this has been tested in practice. It’s just my
current hypothesis about what is probably the best approach to
handling this.

I decided on the /foo.ext rather than the /foo?format=ext
approach because there’s no difference in information conveyed
between the two.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>