[Dbix-class] Recursive update_or_create - RFC

luke saunders luke.saunders at gmail.com
Sun Jun 22 23:23:37 BST 2008


On Sun, Jun 22, 2008 at 3:21 PM, luke saunders <luke.saunders at gmail.com> wrote:
> On Sat, Jun 21, 2008 at 9:45 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
>> On Sat, Jun 21, 2008 at 10:09 PM, luke saunders <luke.saunders at gmail.com> wrote:
>>> On Sat, Jun 21, 2008 at 4:05 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
>>>> On Sat, Jun 21, 2008 at 4:44 PM, luke saunders <luke.saunders at gmail.com> wrote:
>>>>> On Sat, Jun 21, 2008 at 1:51 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
>>>>>> On Sat, Jun 21, 2008 at 1:51 PM, luke saunders <luke.saunders at gmail.com> wrote:
>>>>>>> On Thu, Jun 19, 2008 at 1:29 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
>>>>>>>> In the attachments you'll find DBIx::Class::ResultSet::RecursivePUT -
>>>>>>>> a base class for ResultSets that provides just one method:
>>>>>>>> recursive_PUT, which works just like update_or_create but can update
>>>>>>>> a whole tree of related records.  I wrote it as a method for REST PUT
>>>>>>>> operations, i.e. creating and updating complex objects.  There is one
>>>>>>>> more difference from update_or_create - it works on tables with
>>>>>>>> auto_increment primary keys.  You just need to set the key to 'undef'
>>>>>>>> and it will then be removed from the INSERT sent to the database (this
>>>>>>>> only happens for auto_increment keys - other columns are set to 'NULL'
>>>>>>>> as usual).  This adds some complexity to the semantics, but with the
>>>>>>>> benefit of more universal and uniform usage.
>>>>>>>>
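A rough sketch of how such a call might look - the schema, column names
and the nested-hash layout here are only my guess at the intended
format, not taken from the attached code:

    my $dvd_rs = $schema->resultset('Dvd');   # hypothetical schema:
                                              # Dvd belongs_to Owner, has_many Tags
    $dvd_rs->recursive_PUT( {
        dvd_id => 1,                            # existing row - gets updated
        name   => 'Picnic at Hanging Rock',
        owner  => { username => 'zby' },        # related row, updated recursively
        tags   => [
            { tag_id => 3,     name => 'drama' },   # existing related row
            { tag_id => undef, name => 'new' },     # undef auto_increment PK is
                                                    # stripped from the INSERT,
                                                    # so this row is created
        ],
    } );
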
>>>>>>>> I am waiting for comments - especially about the names - and also
>>>>>>>> about the interface.  I am thinking that perhaps the object-oriented
>>>>>>>> interface creates too much hassle for the programmer, as it requires
>>>>>>>> the ResultSet classes to be set up explicitly - while normally they
>>>>>>>> can be omitted.  So maybe I should change the interface to a simple
>>>>>>>> procedural one and change $rs->recursive_PUT( $update_hash ) to
>>>>>>>> recursive_PUT( $rs, $update_hash ).  What do you think?  Any other
>>>>>>>> comments are welcome as well.
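For reference, the two calling conventions being weighed up - this is
just the surface syntax, not a proposal for a final API:

    # object-oriented: the ResultSet class has to inherit from
    # DBIx::Class::ResultSet::RecursivePUT
    $rs->recursive_PUT( $update_hash );

    # procedural: works on any resultset, no custom ResultSet class needed
    recursive_PUT( $rs, $update_hash );
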
>>>>>>>
>>>>>>> So essentially this provides another $rs->update_or_create which can
>>>>>>> work recursively, right? If that's the case, then why not just
>>>>>>> override the normal $rs->update_or_create method rather than providing
>>>>>>> another one? Which raises the question: if this is worth having at
>>>>>>> all, maybe it should just be part of DBIC itself. Though if this
>>>>>>> were to be a separate ResultSet class, I wouldn't call it
>>>>>>> RecursivePUT - it has nothing to do with PUT requests really - so maybe
>>>>>>> RecursiveUpdateOrCreate or something.
>>>>>>
>>>>>> Certainly it would be an ideal solution for me if it were included in
>>>>>> the DBIC core - but unfortunately I don't think that can happen, because
>>>>>> it requires special treatment of 'undef' for auto_increment columns
>>>>>> - see below.
>>>>>>
>>>>>>>
>>>>>>> I don't quite follow the point about it working with auto_increment
>>>>>>> PKs. Are you just saying that if you provide { pk => undef } to the
>>>>>>> existing update_or_create implementation, it'll try to insert a row
>>>>>>> with a NULL PK? But presumably it won't do that if you don't set the
>>>>>>> pk at all, so I'm not sure I see the problem.
>>>>>>
>>>>>> Yes - the current update_or_create will try to insert a NULL PK and
>>>>>> this will fail in PostgreSQL.  Unfortunately you also cannot just
>>>>>> remove the pk from the query, because then the find part will not
>>>>>> work - I've described this at
>>>>>> http://perlalchemy.blogspot.com/2008/06/dbixclass-gotchas.html .  I
>>>>>> consider my arrangement, with special treatment of 'undef' for
>>>>>> auto_increment columns, to be the only way to have it working
>>>>>> universally - otherwise you cannot have just one method doing both the
>>>>>> update and the create; you need to tell the resultset whether you want
>>>>>> an update or a create.  That would also make it very complicated to do
>>>>>> recursively, because you would need additional parameters to say
>>>>>> whether each row in the structure is an update or a create.
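To make that concrete, a rough sketch of the two failure modes described
above (table and column names are invented, with 'id' an auto_increment PK):

    # 1. Passing the undef PK straight through: the find matches nothing,
    #    then the create sends INSERT ... (id, name) VALUES (NULL, 'foo'),
    #    which PostgreSQL rejects for the auto_increment column.
    $rs->update_or_create( { id => undef, name => 'foo' } );

    # 2. Simply leaving the PK out is no better: the find part now has no
    #    key to locate the existing row by, so an intended update would
    #    instead create a new row.
    $rs->update_or_create( { name => 'foo' } );
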
>>>>>
>>>>> If either all of the primary key columns or all of the columns
>>>>> from one of the unique constraints are provided, then it makes sense
>>>>> for update_or_create to attempt the find; but if that isn't the case,
>>>>> there's no point doing the find and it should skip straight to
>>>>> the create.
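In code, the check being suggested would look something like this (only a
sketch against the resultsource metadata; Zbigniew's reply below explains
why it is not enough):

    my %constraints = $rs->result_source->unique_constraints;   # includes 'primary'

    my $can_find = grep {
        my @cols = @{ $constraints{$_} };
        @cols == grep { exists $hash->{$_} } @cols;   # every column supplied?
    } keys %constraints;

    my $row = $can_find
        ? $rs->update_or_create($hash)   # safe to attempt the find
        : $rs->create($hash);            # skip straight to the create
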
>>>>
>>>> This cannot be done, because some of the columns can be set in the
>>>> resultset's internal condition and it is hard to determine which columns
>>>> are set there.  See the example from my blog article:
>>>>
>>>> $new_cd_rs = $cd_rs->search( { -and => [ id => [ 1, 2 ], id => { '>' => 1 } ] } );
>>>> $new_cd_rs->update_or_create( { artist => 1 } );
>>>>
>>>> This is important when accessing related rows (because part of the
>>>> key, or the whole of it, can be the foreign key - so it is already in
>>>> the resultset's condition and does not need to be in the query) and in
>>>> the case of RestrictedWithObject.  Both cases require that you take into
>>>> account the conditions internal to the resultset, but even without them
>>>> it would simply be more consistent with the rest of the resultset
>>>> methods (like search, which does take the internal resultset conditions
>>>> into account).
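The related-rows case is probably the clearest illustration of this: the
foreign key lives in the resultset's own condition rather than in the hash
you pass in. A sketch with an invented Artist/CD schema:

    # $artist->cds already carries { 'me.artist_id' => $artist->id } as an
    # internal condition on the resultset
    my $cd_rs = $artist->cds;

    # suppose (artist_id, title) is a unique constraint: artist_id never
    # appears in the supplied hash, so a check that only inspects the hash
    # would wrongly conclude the find cannot be attempted
    $cd_rs->update_or_create( { title => 'Forever Changes' } );
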
>>>
>>> Yeah, I stand corrected, we can't do this check. So given that, I guess
>>> we won't be able to have this in the DBIC core until after the SQLA
>>> refactor, but I think that's ultimately what *should* happen.
>>>
>>> So back to your current implementation. The undef thing seems like a
>>> bit of a hack - I'm not sure how well it's going to play with unique
>>> constraints. For example, if I have PK 'id' and a unique constraint on
>>> 'name', then ->find({ id => undef, name => 'luke' }) is going to return
>>> nothing even if there already exists a row with that name, and so when
>>> the subsequent ->create({ name => 'luke' }) happens we're going to get a
>>> nasty surprise. Right?
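Spelled out as code, the scenario meant here (a hypothetical table with
auto_increment PK 'id' and a unique constraint on 'name'):

    # a row with name => 'luke' already exists
    $rs->recursive_PUT( { id => undef, name => 'luke' } );

    # the worry: the find stage comes back empty, so the method falls
    # through to create({ name => 'luke' }), which blows up on the unique
    # constraint - the "nasty surprise" above
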
>>
>> Yes - I am assuming that the rows are validated elsewhere.  I have no
>> idea how to include validation in this method.  But this is actually not
>> related to the 'undef' thing - that query is an error and it should
>> fail.  If you let it update the row with name => 'luke', then a new
>> user would be able to change someone else's data.
>
> Right, so by specifying pk => undef you're basically saying "don't
> bother with the find, just do the create". I think we probably need a
> more elegant way to specify this.
>
> Other than that I think this could be useful.

Also, I currently need the update method of the
Catalyst::Controller::DBIC::API module we've talked about before to
allow updates to related rows, so it would be good if your module also
provided a recursive update method that errors if a related row is not
found, rather than trying to create it - then I could use that (see the
sketch below for the behaviour I mean). Would that be simple to do?
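
Roughly, the behaviour in question (the method name and data layout here
are only placeholders, not a proposed API):

    # strict variant: update the row and its related rows, but throw an
    # exception if any addressed row - top level or related - is not found,
    # instead of creating it
    my $dvd = $dvd_rs->recursive_update( {
        dvd_id => 1,
        name   => 'Picnic at Hanging Rock',
        tags   => [ { tag_id => 3, name => 'drama' } ],
    } );
    # if dvd_id 1 or tag_id 3 does not exist, this should die rather than
    # insert a new row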


