[Dbix-class] A more accurate redux of zby's comments

Thu Jan 24 12:43:36 GMT 2008

On Jan 24, 2008 12:39 PM, Matt S Trout <dbix-class at trout.me.uk> wrote:
> On Thu, Jan 24, 2008 at 08:53:08AM +0100, Zbigniew Lukasiak wrote:
> > This started with me trying to add some new features to create - they
> > were rejected because I was using update_or_create and it does not
> > always work, so I went and tried to fix update_or_create and I found
> > out that this bug is a result of two things:
> >
> > 1. that you need to delete the primary key when you have them
> > undefined and try to insert the row in PostgreSQL
> > 2. that when you delete the pk instead of leaving it undef then find
> > will not work as advertised
> >
> > My plea for a workaround for 1) - simply deleting the PK in the Pg
> > storage driver - was rejected, so those who use PostgreSQL are forced to
> > live with the consequences of using find with the PK deleted.
>
> That's because Postgres' behaviour in this situation is correct.
>
> The fact that MySQL and SQLite both silently accept bad input is not a
> reason to emulate the bug under other DBs that actually care about data
> integrity.

OK - I can agree with that.  My thinking was that you are not a slave to
the SQL semantics here - you can introduce your own (in fact you
already do, because you don't insert undef - you change it to NULL in
the query). But maybe you are right that this should not go into the
core - I am now looking into ways to introduce it as an add-on.
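
To make the difference between "PK left undefined" and "PK deleted"
concrete, here is a minimal sketch using Python's bundled sqlite3 module.
The table `cd` and the helper `insert_row` are hypothetical names made up
for illustration; this is not DBIx::Class code. It only shows the SQLite
side, where an explicit NULL for an INTEGER PRIMARY KEY is accepted and
replaced by a generated id. The PostgreSQL side, as described above in
this thread, rejects the explicit NULL and is not exercised here.

```python
import sqlite3


def insert_row(conn, data, drop_undefined_pk):
    """Insert `data` into the hypothetical table `cd`.

    If `drop_undefined_pk` is true, an undefined (None) `id` is removed
    from the column list instead of being sent as NULL.
    """
    if drop_undefined_pk and data.get("id") is None:
        data = {k: v for k, v in data.items() if k != "id"}
    cols = ", ".join(data)
    marks = ", ".join("?" for _ in data)
    cur = conn.execute(
        f"INSERT INTO cd ({cols}) VALUES ({marks})", list(data.values())
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cd (id INTEGER PRIMARY KEY, title TEXT)")

# Explicit NULL for the key: SQLite accepts it and generates an id.
id_null = insert_row(conn, {"id": None, "title": "a"}, drop_undefined_pk=False)

# Key removed from the insert: same outcome, but no NULL is ever sent.
id_dropped = insert_row(conn, {"id": None, "title": "b"}, drop_undefined_pk=True)
```

Both calls succeed on SQLite; the second form is the one that would also
be acceptable to a database that refuses NULL in a primary key column.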

>
> > First about the nature of the problem. When called on a ResultSet
> > object find tries to determine if the query it receives and the
> > internal conditions of the ResultSet object do include at least one
> > full unique constraint (for example a primary key). But it is not
> > always possible to do that. Let me quote from one of the core
> > developers about that problem:
> >
> > > Of course you can't always determine. But if you don't know, then you default
> > > to "no it doesn't produce a unique row" and provide some way for the user to
> > > say "actually I know it does and I accept you can't help me if I'm wrong" :)
>
> I was talking about join conditions there and introspection onto those.
>
> Please don't quote me out of context.
>
> My comments there were for the *_related stuff - that it couldn't -always-
> be possible to know that a join condition would result in a unique record
> on the other side, but that we should provide a flag to allow the user to
> specify that they know it -does- and DBIC should proceed as normal.

*_related can sneak in in many ways, for example when you filter the
ResultSet by some access condition (rows that a user owns, say).  But I
agree that this is the hard case.  And I think I agree with your
solution with that flag as well - but what would 'proceed as normal'
mean? Would it mean that uniqueness is not checked at all, the query is
reduced to the columns that belong to some unique constraint (or to the
chosen unique constraint), and then it is run and the first row is
returned?
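
One possible reading of that question, sketched in Python so the options
are explicit. Everything here is an assumption for illustration: the
function `reduce_to_unique`, the `assume_unique` flag, and the constraint
names are hypothetical and do not describe how DBIx::Class behaves.

```python
def reduce_to_unique(query, unique_constraints, assume_unique=False):
    """Return (constraint_name, reduced_query).

    query              -- dict mapping column name to value
    unique_constraints -- dict mapping constraint name to a tuple of columns
    assume_unique      -- hypothetical flag: the caller asserts uniqueness

    The query is reduced to the columns of the first unique constraint it
    covers completely. If it covers none, the caller's flag decides whether
    to pass the query through unchanged or to refuse.
    """
    for name, cols in unique_constraints.items():
        if all(query.get(col) is not None for col in cols):
            return name, {col: query[col] for col in cols}
    if assume_unique:
        # No check is possible; trust the caller and keep the full query.
        return None, dict(query)
    raise LookupError("query does not cover any unique constraint")


constraints = {"primary": ("id",), "artist_title": ("artist", "title")}

by_pk = reduce_to_unique({"id": 3, "year": 1999}, constraints)
by_uc = reduce_to_unique(
    {"artist": 1, "title": "x", "year": 1999}, constraints
)
trusted = reduce_to_unique({"year": 1999}, constraints, assume_unique=True)
```

Here `by_pk` keeps only `id`, `by_uc` keeps only `artist` and `title`, and
`trusted` keeps the whole query because the caller took responsibility.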

> > The following methods use find and will silently create a new row
> > instead of using an existing one in that case
> >
> > update_or_create
> >
> > update_or_create_related
> >
> > find_or_create
> >
> > find_or_create_related
> >
> > find_or_new
> >
> > find_or_new_related
>
> If you're key-ing off the PK and it isn't present this is correct behaviour.
>
> The bug there is that they'll pass it to find which can then randomly
> get the wrong row by its search() fallback.

I agree that the fallback makes things worse.

> > The consequence of that is that you should not use these methods in
> > libraries where you cannot say "actually I know it does and I accept
> > you can't help me if I'm wrong" because you don't know what ResultSet
> > and query you receive.
>
> Well, this is true. But it doesn't actually matter since if you -don't-
> know that you shouldn't be calling find() or any of the methods you
> enumerate in the first place.

When you know the PK you would call find; when you know that you have
no PK you would call create or new directly.  It is only when you don't
know which case you are in that you need any of the listed methods.
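
The three cases can be shown with a toy model in Python over a plain list
of dicts. The function `find_or_create` below is a hypothetical stand-in
written for this message, keyed on the primary key only; it is not the
DBIx::Class method and ignores unique constraints other than the PK.

```python
def find_or_create(rows, query, pk="id"):
    """Toy find_or_create over a list of dict rows.

    Returns (row, created). The lookup keys off the primary key only:
    if the key is missing from `query`, a new row is always created.
    """
    key = query.get(pk)
    if key is not None:
        for row in rows:
            if row[pk] == key:
                return row, False
    new_row = dict(query)
    # Use the given key if there is one, else generate the next id.
    new_row[pk] = key if key is not None else max(
        (r[pk] for r in rows), default=0
    ) + 1
    rows.append(new_row)
    return new_row, True


rows = [{"id": 1, "title": "a"}]

# PK known: behaves like find.
found, created_1 = find_or_create(rows, {"id": 1, "title": "a"})

# PK absent: behaves like create, even though a row titled "a" exists.
added, created_2 = find_or_create(rows, {"title": "a"})
```

The second call is the case under discussion: the caller did not know
whether a key was present, and a second row was created without complaint.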

--
Zbigniew
http://perlalchemy.blogspot.com

>
> As I say, my note of that is that DBIC should provide -warnings- if it
> can't determine for itself that the query is unique and the user hasn't
> specified that they know it is.
>
> But you clearly either didn't understand that or were just being a
> scaremongering sack of shit by misquoting me - I'm going to assume stupidity
> rather than malice because it's usually a good guess; everybody else can
> draw their own conclusions.
>
> --
>       Matt S Trout       Need help with your Catalyst or DBIx::Class project?
>    Technical Director                    http://www.shadowcat.co.uk/catalyst/
>  Shadowcat Systems Ltd.  Want a managed development or deployment platform?
> http://chainsawblues.vox.com/            http://www.shadowcat.co.uk/servers/