[Catalyst] Dispatching with Chained vs HTTP method

Wed May 7 10:10:37 BST 2008

On Wed, May 07, 2008 at 06:02:46PM +1000, Adam Clarke wrote:
> On 07/05/2008, at 3:57 PM, Toby Corkindale wrote:
>> On Wed, May 07, 2008 at 03:30:12PM +1000, Adam Clarke wrote:
>>>
>>> The solution suggested in "RESTful Web Services" is to POST to a "factory"
>>> resource which provides you with a transaction resource, e.g. "POST
>>> /transactions/account-transfer" returns "Location:
>>> /transactions/account-transfer/11a5", where 11a5 is a unique
>>> transaction identifier.
>>>
>>> Then "PUT /transactions/account-transfer/11a5/accounts/checking/11", where
>>> 11 is the account identifier. The body carries the transaction details; in
>>> the example the balances are adjusted absolutely, i.e. "balance=150". A
>>> similar PUT is sent for the other account.
>>>
>>> Once the required components of the transaction have been PUT it is
>>> possible to rollback by DELETEing the transaction resource or commit it by
>>> putting "committed=true" to the resource.
>>>
>>> While seeming a bit fiddly, it does keep the state on the client and allows
>>> the client to make (at least some of) the commit / rollback decision rather
>>> than (only) the server.
>>
>> I've read parts of RESTful Web Services, but not that bit.. I'll have to go
>> back and look.
>>
>> I wonder how one goes about implementing such a transaction on the server
>> side.. One would not want to lock DB rows indefinitely, waiting for the
>> client to finally complete the transaction. But if one just recorded the
>> queries and then executed them all (internally) at the end, then other risks
>> exist, eg:
>
> I haven't done this before, but I have thought about it a bit. I think I 
> would handle this as a two-phase commit. PostgreSQL has "PREPARE 
> TRANSACTION" which allows you to start a transaction and assign it a 
> "transaction_id" for use with a subsequent "COMMIT TRANSACTION". I would 
> also use Multi-Version Concurrency Control (MVCC) rather than any kind of 
> blocking locks to minimise the impact of the longer transaction lifetime.

I'm not sure PREPARE TRANSACTION does what we'd like it to: after running it, I
don't think you can add any further commands to the transaction; it's merely
held in stasis until you commit it or roll it back, and you can start another
transaction in the meantime. At least I *think* so, but I haven't used it either.

Regarding MVCC: that's a rather good idea, although my understanding of
Postgres is that it will still block when two transactions update the same
row. (Edit: just tested, and it seems to.)
That said, I don't know how other systems handle it, or whether setting the
transaction isolation level to SERIALIZABLE in Postgres can alter this
behaviour.
Are there other MVCC implementations which manage it?
You seem to have a good idea about using that rather than locking.
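
For illustration, here is a minimal sketch of one non-blocking alternative at
the application level: optimistic versioning, where each row carries a version
counter and an update only succeeds if the version is unchanged since the
client read it. This is not MVCC itself and not anything Postgres-specific; the
table, column and function names are made up for the example, and it uses
Python's bundled sqlite3 module purely so that it runs stand-alone.

```python
import sqlite3

def make_db():
    """Create an in-memory database with one hypothetical accounts table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE account "
        "(id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
    )
    db.execute("INSERT INTO account VALUES (11, 200, 1)")
    db.commit()
    return db

def read_account(db, account_id):
    """Return (balance, version) as seen by a client at read time."""
    return db.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()

def set_balance(db, account_id, new_balance, seen_version):
    """Write only if nobody changed the row since seen_version was read."""
    cur = db.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, seen_version),
    )
    db.commit()
    return cur.rowcount == 1  # False means a conflicting update got in first

db = make_db()
_, version_a = read_account(db, 11)  # client A reads the row
_, version_b = read_account(db, 11)  # client B reads the same version
first = set_balance(db, 11, 150, version_a)   # A's write is accepted
second = set_balance(db, 11, 120, version_b)  # B is rejected, not blocked
final_balance, final_version = read_account(db, 11)
```

The losing client gets an immediate "conflict" answer and has to re-read and
retry, so nothing waits on a lock while a slow client makes up its mind.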

Something else occurred to me: have you had any experience of trying to get DB
transactions to span connections? The HTTP requests could hit a different
process for each request (or possibly even different servers in a farm).
I suppose one could push the DB requests to a back-end processing daemon that
ensures a consistent connection is used, but that again seems to tie up
resources if there's a network drop-out. I haven't really looked into that at
all, though.
I was just assuming one might have to use locks, as they would span
connections, but that would be ugly, and it breaks the REST concept of not
storing state on the server.
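
A rough sketch of one way around the connection issue, under assumptions of my
own: treat the pending transaction as ordinary rows in a staging table, so each
request can use whatever connection it likes, and only open a real (short) DB
transaction when the client commits. The schema and the function names
(create_txn, put_change, commit_txn, rollback_txn) are hypothetical, sqlite3 is
used only so the example runs stand-alone, and it makes no attempt to detect
conflicting changes made by others between the PUTs and the commit.

```python
import os
import sqlite3
import tempfile
import uuid
from contextlib import closing

SCHEMA = """
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
CREATE TABLE pending_txn (txn_id TEXT PRIMARY KEY);
CREATE TABLE pending_change (
    txn_id TEXT, account_id INTEGER, new_balance INTEGER,
    PRIMARY KEY (txn_id, account_id)
);
"""

def open_db(path):
    # A fresh connection per call stands in for "a different process per
    # request"; closing() releases the connection when the block ends.
    return closing(sqlite3.connect(path))

def create_txn(path):
    """POST to the factory: record a pending transaction, return its id."""
    txn_id = uuid.uuid4().hex
    with open_db(path) as db, db:
        db.execute("INSERT INTO pending_txn VALUES (?)", (txn_id,))
    return txn_id

def put_change(path, txn_id, account_id, new_balance):
    """PUT one component: stage an absolute balance, touch no account row."""
    with open_db(path) as db, db:
        db.execute(
            "INSERT OR REPLACE INTO pending_change VALUES (?, ?, ?)",
            (txn_id, account_id, new_balance),
        )

def discard(db, txn_id):
    """Remove all staged rows belonging to one pending transaction."""
    db.execute("DELETE FROM pending_change WHERE txn_id = ?", (txn_id,))
    db.execute("DELETE FROM pending_txn WHERE txn_id = ?", (txn_id,))

def rollback_txn(path, txn_id):
    """DELETE the transaction resource: drop the staged changes."""
    with open_db(path) as db, db:
        discard(db, txn_id)

def commit_txn(path, txn_id):
    """PUT committed=true: apply every staged change in one DB transaction."""
    with open_db(path) as db, db:
        changes = db.execute(
            "SELECT account_id, new_balance FROM pending_change "
            "WHERE txn_id = ?",
            (txn_id,),
        ).fetchall()
        for account_id, new_balance in changes:
            db.execute(
                "UPDATE account SET balance = ? WHERE id = ?",
                (new_balance, account_id),
            )
        discard(db, txn_id)
    return len(changes)

def balances(path):
    """Return {account_id: balance} for every account."""
    with open_db(path) as db:
        return dict(db.execute("SELECT id, balance FROM account").fetchall())

# Set up two accounts in a throwaway database file.
path = os.path.join(tempfile.mkdtemp(), "bank.db")
with open_db(path) as db, db:
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO account VALUES (?, ?)", [(11, 200), (55, 50)])

txn = create_txn(path)
put_change(path, txn, 11, 150)
put_change(path, txn, 55, 100)
before = balances(path)          # staged only, accounts are untouched
applied = commit_txn(path, txn)  # both changes land together
after = balances(path)
```

No DB rows stay locked while the client dithers, and an abandoned transaction
is just a few stale staging rows that can be swept up later.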

> This would at least keep a good deal of the hard work in the DB.

*nods* I agree that's the best option.
I'm mainly familiar with Postgres (and a little MySQL), so I don't know whether
the commercial DBs have already added features to help with these issues.