[Catalyst-dev] Request for Comments: Chained

Sun Jun 18 21:01:20 CEST 2006

Good morning everyone,

There are currently some discussions going on in #catalyst-dev about the
API of the new Chained DispatchType, so Matt asked me to forward this to
you for comments, notes and ideas.

You can find the current state of implementation in the Catalyst ChildOf
branch in the repository. The intro documentation for Chained follows
this paragraph, and I'll sum up the open questions afterwards.

----- Catalyst/Manual/Intro.pod - Chained Attribute -----

=item * B<Chained>

The C<Chained> attribute allows you to chain public path parts together
by their private names. A chain part's path can be specified with
C<PathPart> and can be declared to expect an arbitrary number of
arguments. The endpoint of the chain specifies how many arguments it
gets through the C<Args> attribute. C<:Args(0)> would be none at all,
C<:Args> without an integer would be unlimited. The path parts that
aren't endpoints use C<Captures> to specify how many parameters
they expect to receive. As an example setup:

  package MyApp::Controller::Greeting;
  use base qw/ Catalyst::Controller /;

  #   this is the beginning of our chain
  sub hello : PathPart('hello') Chained('/') Captures(1) {
      my ( $self, $c, $integer ) = @_;
      $c->stash->{ message } = "Hello ";
      $c->stash->{ arg_sum } = $integer;
  }

  #   this is our endpoint, because it has no :Captures
  sub world : PathPart('world') Chained('hello') Args(1) {
      my ( $self, $c, $integer ) = @_;
      $c->stash->{ message } .= "World!";
      $c->stash->{ arg_sum } += $integer;

      $c->response->body( join "<br/>\n" =>
          $c->stash->{ message }, $c->stash->{ arg_sum } );
  }

The debug output provides a separate table for chained actions, showing
the whole chain as it would match and the actions it contains. Here's
an example of the startup output with our actions above:

  ...
  [debug] Loaded Path Part actions:
  .-----------------------+------------------------------.
  | Path Spec             | Private                      |
  +-----------------------+------------------------------+
  | /hello/*/world/*      | /greeting/hello (1)          |
  |                       | => /greeting/world           |
  '-----------------------+------------------------------'
  ...

As you can see, Catalyst only deals with chains as whole paths and
builds one for each endpoint. Endpoints are the actions with C<:Chained>
but without C<:Captures>.

Let's assume this application gets a request at the path
C</hello/23/world/12>, what happens then? First, Catalyst will dispatch
to the C<hello> action and pass the value C<23> as argument to it after
the context. It does so because we have previously used C<:Captures(1)>
to declare that it has one path part after itself as its argument. We
told Catalyst that this is the beginning of the chain by specifying
C<:Chained('/')>. Also note that instead of saying C<:PathPart('hello')>
we could also just have said C<:PathPart>, as it defaults to the name of
the action.

After C<hello> has run, Catalyst goes on to dispatch to the C<world>
action. This is the last action to be called, as Catalyst knows this
is an endpoint because we specified no C<:Captures> attribute.
Nevertheless we specify that this action expects an argument, but at
this point we're using C<:Args(1)> to do that. We could also have said
C<:Args> or left it out altogether, which would mean this action gets
all arguments that are there. This action's C<:Chained> attribute says
C<hello> and tells Catalyst that the C<hello> action in the current
controller is its parent.

With this we have built a chain consisting of two public path parts.
C<hello> captures one part of the path as its argument, and also
specifies the path root as its parent. So this part is C</hello/$arg>.
The next part is the endpoint C<world>, expecting one argument. It sums
up to the path part C<world/$arg>. This leads to a complete chain of
C</hello/$arg/world/$arg> which is matched against the requested paths.

This example application would, if run and called by e.g.
C</hello/23/world/12>, set the stash value C<message> to C<Hello > and
the value C<arg_sum> to C<23>. The C<world> action would then append
C<World!> to C<message> and add C<12> to the stash's C<arg_sum> value.
For the sake of simplicity no view is shown. Instead we just put the
values of the stash into our body. So the output would look like:

  Hello World!
  35

And our test server would've given us this debugging output for the
request:

  ...
  [debug] "GET" request for "hello/23/world/12" from "127.0.0.1"
  [debug] Path is "/greeting/world"
  [debug] Arguments are "12"
  [info] Request took 0.164113s (6.093/s)
  .------------------------------------------+-----------.
  | Action                                   | Time      |
  +------------------------------------------+-----------+
  | /greeting/hello                          | 0.000029s |
  | /greeting/world                          | 0.000024s |
  '------------------------------------------+-----------'
  ...
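
To make the walk-through above a bit more concrete, here is a rough
plain-Perl model of how a request path could be checked against the
example chain. This is only a sketch and not the dispatcher's code: the
C<@chain> structure and the C<match_chain> function are made up for this
illustration, and path parts containing slashes are not handled.

```perl
use strict;
use warnings;

# Hypothetical model of the chain from the example: each part has a
# literal path part plus either a fixed capture count (non-endpoint)
# or an args count (endpoint; no 'args' key means "unlimited").
my @chain = (
    { name => '/greeting/hello', path_part => 'hello', captures => 1 },
    { name => '/greeting/world', path_part => 'world', args     => 1 },
);

# Try to match a request path against the chain. Returns undef on
# failure, otherwise a list of [ action name, [ values passed ] ].
sub match_chain {
    my ( $path, @parts ) = @_;
    my @segments = grep { length } split m{/}, $path;
    my @calls;

    for my $part (@parts) {
        # the literal path part has to come first
        return unless @segments && $segments[0] eq $part->{path_part};
        shift @segments;

        my @values;
        if ( exists $part->{captures} ) {
            # non-endpoint: take exactly the declared number of segments
            return if @segments < $part->{captures};
            @values = splice @segments, 0, $part->{captures};
        }
        else {
            # endpoint: takes the rest, which must fit a declared count
            my $want = $part->{args};
            return if defined $want && @segments != $want;
            @values = splice @segments;
        }
        push @calls, [ $part->{name}, \@values ];
    }

    # anything left over means the chain did not consume the whole path
    return if @segments;
    return \@calls;
}

my $calls = match_chain( '/hello/23/world/12', @chain );
print "$_->[0] gets (@{ $_->[1] })\n" for @$calls;
```

In this model C</hello/23/world/12> matches and hands C<23> to C<hello>
and C<12> to C<world>, while C</hello/23/world> does not match because
the endpoint was declared with exactly one argument.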

What would be common use cases of this dispatching technique? It makes
it possible to split up logic into steps that depend on each other. An
example would be a wiki path like C</wiki/FooBarPage/rev/23/view>. This
chain can be easily built with these actions:

  sub wiki : PathPart('wiki') Chained('/') Captures(1) {
      my ( $self, $c, $page_name ) = @_;
      #  load the page named $page_name and put the object
      #  into the stash
  }

  sub rev : PathPart('rev') Chained('wiki') Captures(1) {
      my ( $self, $c, $revision_id ) = @_;
      #  use the page object in the stash to get at its
      #  revision with number $revision_id
  }

  sub view : PathPart Chained('rev') Args(0) {
      my ( $self, $c ) = @_;
      #  display the revision in our stash. Another option
      #  would be to forward a compatible object to the action
      #  that displays the default wiki pages, unless we want
      #  a different interface here, for example restore
      #  functionality.
  }

It would now be possible to add other endpoints, for example C<restore>
to restore this specific revision as the current state.

Also, you of course don't have to put all the chained actions in one
controller. The specification of the parent through C<:Chained> also
takes an absolute action path as its argument. Just specify it with a
leading C</>.

If you want, for example, to have actions for the public paths
C</foo/12/edit> and C</foo/12>, just specify two actions with
C<:PathPart('foo')> and C<:Chained('/')>. The handler for the former
path needs a C<:Captures(1)> attribute and an endpoint with
C<:PathPart('edit')> that is chained to it by its private name
(C<:Chained('foo_load')> with the action names shown below). For the
latter path give the action just a C<:Args(1)> to mark it as an
endpoint. This sums up to this debugging output:

  ...
  [debug] Loaded Path Part actions:
  .-----------------------+------------------------------.
  | Path Spec             | Private                      |
  +-----------------------+------------------------------+
  | /foo/*                | /controller/foo_view         |
  | /foo/*/edit           | /controller/foo_load (1)     |
  |                       | => /controller/edit          |
  '-----------------------+------------------------------'
  ...
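
As a rough illustration of how the two rows in the table above come
about, the following plain-Perl sketch starts at each endpoint and walks
up the C<chained> links to the root. The C<%actions> table and the
C<path_spec> function are invented for this example and are not part of
Catalyst. It also assumes that an endpoint without C<:Args> adds nothing
to the path spec, as in the table.

```perl
use strict;
use warnings;

# Hypothetical action table mirroring the debug output above. Keys are
# private action paths; 'chained' is the parent's private path ('/' for
# the chain root). Endpoints are the entries without 'captures'.
my %actions = (
    '/controller/foo_view' =>
        { chained => '/', path_part => 'foo', args => 1 },
    '/controller/foo_load' =>
        { chained => '/', path_part => 'foo', captures => 1 },
    '/controller/edit' =>
        { chained => '/controller/foo_load', path_part => 'edit' },
);

# Walk from an endpoint up to the root and assemble the public path spec.
sub path_spec {
    my ($endpoint) = @_;
    my @parts;
    my $name = $endpoint;

    while ( $name ne '/' ) {
        my $action = $actions{$name} or die "unknown action $name";

        # captures for non-endpoints, args for endpoints
        my $count = exists $action->{captures}
            ? $action->{captures}
            : $action->{args};

        # one '*' per expected value; nothing if the count is undeclared
        my @stars = defined $count ? ('*') x $count : ();

        unshift @parts, $action->{path_part}, @stars;
        $name = $action->{chained};
    }
    return '/' . join '/', @parts;
}

# One chain per endpoint, i.e. per action without 'captures'.
my @endpoints = grep { !exists $actions{$_}{captures} } keys %actions;
for my $endpoint ( sort @endpoints ) {
    printf "%-14s %s\n", path_spec($endpoint), $endpoint;
}
```

Note that C<foo_load> never gets a row of its own here. It only shows up
as part of the chain that ends in C<edit>.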

Here's a more detailed specification of the attributes belonging to
C<:Chained>:

=over 8

=item PathPart

Sets the name of this part of the chain. If it is specified without
arguments, it takes the name of the action as default. So basically
C<sub foo :PathPart> and C<sub foo :PathPart('foo')> are identical.
This can also contain slashes to bind to a deeper level. An action
with C<sub bar :PathPart('foo/bar') :Chained('/')> would bind to
C</foo/bar/...>. Leaving C<:PathPart> out entirely has the same effect
as a bare C<:PathPart>: the part defaults to the action name.

=item Chained

Has to be specified for every child in the chain. Possible values are
absolute and relative private action paths, with the relative ones
resolved against the current controller, or a single slash C</> to tell
Catalyst that this is the root of a chain. The attribute C<:Chained>
without arguments also defaults to the C</> behaviour.

Because you can specify an absolute path to the parent action, it
doesn't matter to Catalyst where that parent is located. So, if your
design requires it, you can redispatch a chain through any controller
or namespace you want.

Another interesting possibility is C<:Chained('.')>, which chains the
action to an action whose path matches the current controller's
namespace. For example:

  #   in MyApp::Controller::Foo
  sub bar : Chained Captures(1) { ... }

  #   in MyApp::Controller::Foo::Bar
  sub baz : Chained('.') Args(1) { ... }

This builds up a chain like C</bar/*/baz/*>. The specification of C<.>
as argument to Chained here chains the C<baz> action to an action with
the path of the current controller namespace, namely C</foo/bar>. That
action chains directly to C</>, so the above chain comes out as the end
result.

=item Captures

Also has to be specified for every part of the chain that is not an
endpoint. With this attribute Catalyst knows how many of the following
parts of the path (separated by C</>) this action wants to capture as
its arguments. If it doesn't expect any, just specify C<:Captures(0)>.
The captures get passed to the action's C<@_> right after the context,
but you can also find them as an array reference in
C<$c-E<gt>request-E<gt>captures-E<gt>[$level]>. The C<$level> is the
level of the action in the chain that captured the parts of the path.

An action that is part of a chain (read: that has a C<:Chained>
attribute) but has no C<:Captures> attribute is treated by Catalyst as a
chain end.

=item Args

By default, endpoints receive the rest of the arguments in the path. You
can tell Catalyst through C<:Args> explicitly how many arguments your
endpoint expects, just like you can with C<:Captures>. Note that this
also influences whether this chain is invoked on a request. A chain with an
endpoint specifying one argument will only match if exactly one argument
exists in the path.

You can specify an exact number of arguments like C<:Args(3)>, including
C<0>. If you just say C<:Args> without any arguments, it is the same as
leaving it out altogether: the chain is matched regardless of the
number of path parts after the endpoint.

Just like with C<:Captures>, the arguments get passed to the action in
C<@_> after the context object. They can also be reached through
C<$c-E<gt>request-E<gt>arguments>.

=back
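
The different forms of the C<:Chained> value described above can be
summed up in a small plain-Perl sketch. The C<resolve_parent> helper is
hypothetical and only models the rules as documented here. It is not the
dispatcher's real code, and it takes the controller namespace without a
leading slash as an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the value of :Chained into the absolute
# private path of the parent action, given the namespace of the
# controller that declares the child action.
sub resolve_parent {
    my ( $namespace, $chained ) = @_;

    # bare :Chained and :Chained('/') both mean "root of a chain"
    return '/' if !defined $chained || $chained eq '' || $chained eq '/';

    # :Chained('.') points at the action named like the namespace
    return "/$namespace" if $chained eq '.';

    # absolute private paths are taken as they are
    return $chained if $chained =~ m{^/};

    # everything else is relative to the current controller
    return '/' . join '/', grep { length } $namespace, $chained;
}

print resolve_parent( 'greeting', 'hello' ), "\n";    # /greeting/hello
print resolve_parent( 'foo/bar',  '.' ),     "\n";    # /foo/bar
```

With these rules the C<baz> action in C<MyApp::Controller::Foo::Bar>
ends up with C</foo/bar> as its parent, which is the C<bar> action in
C<MyApp::Controller::Foo>.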

----- End of Chained Attribute documentation -----

And these are the currently ongoing discussions:

a) PathPart('foo') vs. Path('foo')

What name should the attribute get which defines the name of the current
part of the chain? Suggestions for other attribute names are welcome
too, of course.

Reasons for PathPart: Distinction between the path part specification
for chained actions and actual "Path" actions. "Path" could confuse
people into believing they can use "Global", "LocalRegex" & Co. for this.

Reasons for Path: Reads better and the functionality is similar to that
of the actual "Path" action.

b) Captures, Args and endpoints

Currently, the dispatcher regards a part of a chain without :Captures as
an endpoint. Endpoints use :Args (or none at all, which defaults to
unlimited arguments).

Note however, that the dispatcher needs to distinguish between:
- A loose chain with no endpoint and
- A complete chain with an endpoint

This distinction matters because, combined with :Chained('.'), it lets
us build modular applications pretty easily. In the current implementation,
a loose chain without an end is just not active and can't match. If we
drop in (e.g. through a config option) a controller with a namespace
that matches that loose end, it can link itself to the loose chain with
:Chained('.') and the loose end can provide the sub application with
the context it needs, as it is run before Catalyst dispatches to the
plugged in application.

The question is therefore, how to design the API specifying how many
arguments each chain part takes, and how to distinguish what's an
endpoint and what's not.

These are the ideas that came up in this regard:

a) CaptureArgs instead of Captures

This would keep the "Captures" keyword free for other uses. Also it
would come closer to :Args' name and it would be clearer that they have
similar (but not equal) semantics in a chain.

b) lose Captures altogether, use Args and add an Endpoint attribute

This would result in actions like (not literally):

  sub foo :Chained PathPart('foobar') Args(1) { ... }
  sub baz :Chained('foo') PathPart('foobaz') ChainEnd { ... }

Is this saner? What should the end of the chain be called?

Another issue closely tied to this is where to access the
captures/arguments. At the moment there's $c->req->captures for all
parts of the chain bound with :Captures, and $c->req->args as well as
the action's @_ for the current action's arguments, captured or otherwise.

So, I guess this is a rough overview of the current situation.
Comments on the API are as welcome as comments on the implementation
and documentation, of course.

phaylon