[Dbix-class] Recursive update_or_create - RFC

luke saunders luke.saunders at gmail.com
Sun Jun 22 15:21:49 BST 2008


On Sat, Jun 21, 2008 at 9:45 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
> On Sat, Jun 21, 2008 at 10:09 PM, luke saunders <luke.saunders at gmail.com> wrote:
>> On Sat, Jun 21, 2008 at 4:05 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
>>> On Sat, Jun 21, 2008 at 4:44 PM, luke saunders <luke.saunders at gmail.com> wrote:
>>>> On Sat, Jun 21, 2008 at 1:51 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
>>>>> On Sat, Jun 21, 2008 at 1:51 PM, luke saunders <luke.saunders at gmail.com> wrote:
>>>>>> On Thu, Jun 19, 2008 at 1:29 PM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
>>>>>>> In the attachments you'll find DBIx::Class::ResultSet::RecursivePUT -
>>>>>>> it is a base class for ResultSets and it provides just one method:
>>>>>>> recursive_PUT, which works just like update_or_create - but can update
>>>>>>> a whole tree of related records.  I wrote it as a method for REST PUT
>>>>>>> operations, i.e. creating and updating complex objects.  There is one
>>>>>>> more difference from update_or_create - it works on tables with
>>>>>>> auto_increment primary keys: you just need to set the key to 'undef'
>>>>>>> and it will be removed from the INSERT sent to the database (this
>>>>>>> only happens for auto_increment keys - other columns will be set
>>>>>>> to NULL just as usual).  This adds some complexity to the
>>>>>>> semantics, but with the benefit of more universal and uniform usage.
>>>>>>>
>>>>>>> I am waiting for comments - especially about the names - and also
>>>>>>> about the interface.  I am thinking that perhaps the object oriented
>>>>>>> interface creates too much hassle for the programmer, as it requires
>>>>>>> explicitly setting up the ResultSet classes - while normally they can be
>>>>>>> omitted.  So maybe I should change the interface to a simple
>>>>>>> procedural interface and change $rs->recursive_PUT( $update_hash ) to
>>>>>>> recursive_PUT( $rs, $update_hash).  What do you think?  Any other
>>>>>>> comments are welcome as well.
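For illustration, a rough sketch of the kind of call being proposed - the
resultset, column and relationship names here are hypothetical, and the
nested-hash shape is only a guess at the interface, not taken from the
attachment:

    # Hypothetical schema: User has_many 'addresses'; both tables use an
    # auto_increment primary key named 'id'.
    my $user_rs = $schema->resultset('User');

    $user_rs->recursive_PUT({
        id        => 1,        # existing PK => find and update this row
        name      => 'zby',
        addresses => [
            { id => 7,     city => 'Warsaw' },  # update the related row with PK 7
            { id => undef, city => 'Krakow' },  # undef auto_increment PK => the key
                                                # is dropped from the INSERT, so a
                                                # new related row is created
        ],
    });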
>>>>>>
>>>>>> so essentially this provides another $rs->update_or_create which can
>>>>>> work recursively, right? so if that's the case then why not just
>>>>>> override the normal $rs->update_or_create method rather than providing
>>>>>> another one? which then begs the question, if this is worth having at
>>>>>> all, then maybe it should just be part of DBIC itself. though if this
>>>>>> were to be a separate ResultSet class, then I wouldn't call it
>>>>>> RecursivePUT - it's nothing to do with PUT requests really - so maybe
>>>>>> RecursiveUpdateOrCreate or something.
>>>>>
>>>>> Certainly it would be an ideal solution for me if it were included in
>>>>> the DBIC core - unfortunately I don't think that can happen, because
>>>>> it requires special treatment of 'undef' for auto_increment columns
>>>>> - see below.
>>>>>
>>>>>>
>>>>>> i don't quite follow the point about it working with auto increment
>>>>>> PKs. are you just saying that if you provide { pk => undef } to the
>>>>>> existing update_or_create implementation, it'll try to insert a row
>>>>>> with a NULL PK? but presumably it won't do that if you don't set the
>>>>>> pk at all, so I'm not sure I see the problem.
>>>>>
>>>>> Yes - the current update_or_create will try to insert a NULL PK and
>>>>> this will fail in PostgreSQL.  Unfortunately you also cannot just
>>>>> remove the pk from the query - because then the find part will not
>>>>> work - I've described this at
>>>>> http://perlalchemy.blogspot.com/2008/06/dbixclass-gotchas.html .  I
>>>>> consider my arrangement of treating 'undef' specially for
>>>>> auto_increment columns the only way to have it work universally -
>>>>> otherwise you cannot have just one method to do both the update and
>>>>> the create - you need to tell the resultset whether you want an
>>>>> update or a create.  This also means that it would be very complicated
>>>>> to do it recursively - because you'd need additional parameters to say
>>>>> whether it is an update or a create for each row in the structure.
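A rough sketch of the failure mode described above, assuming a table with a
serial/auto_increment primary key named 'id':

    # With the stock update_or_create, the undef PK is passed straight through:
    $rs->update_or_create({ id => undef, name => 'zby' });
    # the find on the primary key matches nothing (id IS NULL), so the
    # subsequent create runs INSERT ... (id, name) VALUES (NULL, 'zby'),
    # which PostgreSQL rejects for a serial 'id' column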
>>>>
>>>> if either all of the primary key columns or all of the columns
>>>> from one of the unique constraints are provided then it makes sense
>>>> for update_or_create to attempt the find but if this isn't the case
>>>> then there's no point doing the find and it should skip straight to
>>>> the create.
>>>
>>> This cannot be done - because some of the columns can be set in the
>>> resultset's internal condition and it is hard to determine which columns
>>> were set there.  See the example from my blog article:
>>>
>>> $new_cd_rs = $cd_rs->search( { -and => [ id => [1, 2], id => { '>' => 1 } ] } )
>>> and then $new_cd_rs->update_or_create( { artist => 1 } ).
>>>
>>> This is important when accessing related rows (because part of the
>>> key, or the whole of it, can be the foreign key - so it will be in the
>>> resultsource and does not need to be in the query) and in the case of
>>> RestrictedWithObject.  Both cases require that you take the conditions
>>> internal to the resultset into account, but even without them it
>>> would simply be more consistent with the rest of the resultset methods
>>> (like search - which does take the internal resultset conditions
>>> into account).
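As a sketch of the related-rows case, with hypothetical CD/Track classes
where (cd, position) is the unique key:

    # $cd->tracks returns a resultset whose internal condition already
    # pins the foreign key (cd = ?), so the caller never passes it:
    my $tracks_rs = $cd->tracks;
    $tracks_rs->update_or_create({ position => 1, title => 'Intro' });
    # 'cd' is part of the unique (cd, position) key but appears only in the
    # resultset's condition - a check limited to the supplied hash could not
    # tell whether the key is complete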
>>
>> Yeah, I stand corrected, we can't do this check. So given that, I guess
>> we won't be able to have this in DBIC core until after the SQLA
>> refactor, but I think that's ultimately what *should* happen.
>>
>> So back to your current implementation. The undef thing seems like a
>> bit of a hack - I'm not sure how well it's going to play with unique
>> constraints. For example if I have PK 'id' and a unique on 'name' then
>> ->find({ id => undef, name => 'luke' }) is going to return nothing
>> even if there already exists a row with that name, and so when the
>> subsequent ->create({ name => 'luke' }) happens we're going to get a
>> nasty surprise. Right?
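As a sketch of that scenario, assuming PK 'id' and a unique constraint
on 'name':

    $rs->find({ id => undef, name => 'luke' });  # matches nothing - id IS NULL
    $rs->create({ name => 'luke' });             # violates the unique 'name'
                                                 # constraint if 'luke' exists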
>
> Yes - I am assuming that the rows are validated elsewhere.  I have no
> idea how to include validation in this method.  But it is actually not
> related to the 'undef' thing - this query is an error - and it should
> fail.  If you let it update the row with name => 'luke' then a new
> user would be able to change someone else's data.

Right, so by specifying pk => undef you're basically saying "don't
bother with the find, just do the create". I think we probably need a
more elegant way to specify this.
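
One purely hypothetical shape for that - an explicit option instead of the
undef convention:

    # hypothetical: ask for a create explicitly rather than via pk => undef
    $rs->recursive_PUT( { name => 'luke' }, { force_create => 1 } );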

Other than that I think this could be useful.


