[Dbix-class] Recursive update_or_create - RFC

luke saunders luke.saunders at gmail.com
Sat Jun 21 21:09:40 BST 2008


On Sat, Jun 21, 2008 at 4:05 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
> On Sat, Jun 21, 2008 at 4:44 PM, luke saunders <luke.saunders at gmail.com> wrote:
>> On Sat, Jun 21, 2008 at 1:51 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
>>> On Sat, Jun 21, 2008 at 1:51 PM, luke saunders <luke.saunders at gmail.com> wrote:
>>>> On Thu, Jun 19, 2008 at 1:29 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
>>>>> In the attachments you'll find DBIx::Class::ResultSet::RecursivePUT -
>>>>> it is a base class for ResultSets that provides just one method:
>>>>> recursive_PUT, which works just like update_or_create but can update
>>>>> a whole tree of related records.  I wrote it as a method for REST PUT
>>>>> operations, i.e. creating and updating complex objects.  There is one
>>>>> more difference from update_or_create - it will work on tables with
>>>>> auto_increment primary keys.  You just need to set the key to 'undef'
>>>>> and it will be omitted from the INSERT sent to the database (this
>>>>> will only happen for auto_increment keys - other columns will be set
>>>>> to 'NULL' as usual).  This adds complexity to the semantics - but
>>>>> with the benefit of more universal and uniform usage.
>>>>>
>>>>> I am waiting for comments - especially about the names - and also
>>>>> about the interface.  I am thinking that perhaps the object-oriented
>>>>> interface creates too much hassle for the programmer, as it requires
>>>>> explicitly setting up the ResultSet classes, while normally they can
>>>>> be omitted.  So maybe I should change the interface to a simple
>>>>> procedural one and change $rs->recursive_PUT( $update_hash ) to
>>>>> recursive_PUT( $rs, $update_hash ).  What do you think?  Any other
>>>>> comments are welcome as well.
>>>>
>>>> so essentially this provides another $rs->update_or_create which can
>>>> work recursively, right? so if that's the case then why not just
>>>> override the normal $rs->update_or_create method rather than providing
>>>> another one? which then raises the question: if this is worth having at
>>>> all, then maybe it should just be part of DBIC itself. though if this
>>>> were to be a separate ResultSet class, then I wouldn't call it
>>>> RecursivePUT - it's nothing to do with PUT requests really - so maybe
>>>> RecursiveUpdateOrCreate or something.
>>>
>>> Certainly it would be an ideal solution for me if it were included in
>>> the DBIC core - unfortunately I don't think this can happen, because
>>> it requires special treatment of 'undef' for auto_increment columns
>>> - see below.
>>>
>>>>
>>>> i don't quite follow the point about it working with auto increment
>>>> PKs. are you just saying that if you provide { pk => undef } to the
>>>> existing update_or_create implementation, it'll try to insert a row
>>>> with a NULL PK? but presumably it won't do that if you don't set the
>>>> pk at all, so I'm not sure I see the problem.
>>>
>>> Yes - the current update_or_create will try to insert a NULL PK, and
>>> this will fail in PostgreSQL.  Unfortunately you also cannot just
>>> remove the pk from the query, because then the find part will not
>>> work - I've described this at
>>> http://perlalchemy.blogspot.com/2008/06/dbixclass-gotchas.html .  I
>>> consider my arrangement, with special treatment of 'undef' for
>>> auto_increment columns, the only way to have it working universally
>>> - otherwise you cannot have just one method to do the update and the
>>> create; you need to tell the resultset whether you want an update or
>>> a create.  This also means that it would be very complicated to do it
>>> recursively, because you would need additional parameters to say
>>> whether it is an update or a create for each row in the structure.
>>
>> if either all of the primary key columns or all of the columns
>> from one of the unique constraints are provided, then it makes sense
>> for update_or_create to attempt the find; but if this isn't the case
>> then there's no point doing the find and it should skip straight to
>> the create.
>
> This cannot be done, because some of the columns can be set in the
> resultset's internal condition and it is hard to determine which
> columns were set there.  See the example from my blog article:
>
> $new_cd_rs = $cd_rs->search( { -and => [ id => [1, 2], id => { '>' => 1 } ] } )
> and then $new_cd_rs->update_or_create( { artist => 1 } ).
>
> This is important when accessing related rows (because part of the
> key - or the whole of it - can be the foreign key, so it will be in the
> resultsource and does not need to be in the query) and in the case of
> RestrictedWithObject.  Both cases require that you take into account
> the conditions internal to the resultset, but even without them it
> would just be more consistent with the rest of the resultset methods
> (like search, which does take the internal resultset conditions into
> account).

Yeah, I stand corrected - we can't do this check. So given that, I
guess we won't be able to have this in DBIC core until after the SQLA
refactor, but I think that's ultimately what *should* happen.
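To make the restricted-resultset case concrete, here is a sketch of why a "were all key columns passed in?" check fails; the Artist/CD schema, column names, and unique constraint are illustrative assumptions, not from the thread:

```perl
# Sketch only - assumes a typical DBIx::Class schema where Artist
# has_many CDs via an artist_id foreign key, and CD has a unique
# constraint on (artist_id, title).  All names are hypothetical.
my $artist = $schema->resultset('Artist')->find(1);

# The related resultset carries { artist_id => 1 } as an internal
# condition - it is half of the unique constraint, but it never
# appears in the arguments passed below.
my $cd_rs = $artist->cds;

# update_or_create must merge the internal condition into its find:
# artist_id (from the resultset) plus title (from the call) complete
# the constraint.  A naive check of only the passed-in columns would
# wrongly conclude the key is incomplete and skip straight to create.
$cd_rs->update_or_create( { title => 'Some Title' } );
```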

So back to your current implementation. The undef thing seems like a
bit of a hack - I'm not sure how well it's going to play with unique
constraints. For example, if I have PK 'id' and a unique constraint on
'name', then ->find({ id => undef, name => 'luke' }) is going to return
nothing even if a row with that name already exists, and so when the
subsequent ->create({ name => 'luke' }) happens we're going to get a
nasty surprise. Right?
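Spelled out as a sketch (the table and column names here are assumed for illustration, not taken from the module):

```perl
# Sketch of the failure mode - assumes a table with auto_increment
# PK 'id' and a separate unique constraint on 'name'.
my $rs = $schema->resultset('User');

# recursive_PUT strips undef auto_increment keys before the INSERT,
# but the find still runs with id => undef and matches nothing ...
my $row = $rs->find( { id => undef, name => 'luke' } );

# ... so if a row named 'luke' already exists, the fallback create
# violates the unique constraint instead of updating that row.
$rs->create( { name => 'luke' } ) unless $row;
```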



More information about the DBIx-Class mailing list