[Dbix-class] RFC - some DBMS features you use

Thu Feb 4 22:07:37 GMT 2010

Hello,

In order to help me with some prioritization on my Muldis database projects 
(which would typically be used as a DBMS wrapper above the DBI level but below 
the level of other non-trivial DBMS wrappers like yours; it has its own query 
language intended to be what I think SQL should have been in the first place), 
and decide what features to try and design into the first version versus putting 
off for later, I wanted to quickly survey what SQL DBMS features you are using 
or care about now.  I will just ask about some specific features that I haven't 
formally specced yet.

I may also be asking about what you know regarding the current feature set of 
existing SQL DBMSs, such as what they do or support already if one wanted to use 
them.

If I don't mention something, it could either be because I already specced it 
(probably 80-90% of everything you care about by now) or it isn't yet on my 
radar or I don't consider it important enough for now.

Some of these questions overlap, so I recommend reading all of them before 
answering any of them; you may be able to consolidate responses.

1.  Locking.  Do you explicitly place locks (including with "select ... for 
update" or otherwise) on either the whole or parts of the database to ensure no 
one else updates them for the duration of the locks?  What DBMSs do or don't 
support doing this?

1a.  At what levels of granularity do you care about being supported?  Examples:
- whole database at once
- whole database relvars/tables at once
- just specific tuples/rows with specific primary key or other key values (eg, 
where person id is 3 or 7)
- generic predicate locks, covering any tuples/rows that are or might be visible 
to a particular query (eg, where name starts with 'f' and date_a is before 
date_b and the person is a member of clubs started before this date)
- other kinds?

1b.  Do you want locks to be freed only explicitly or automatically at the end 
of a transaction commit or rollback?  What do current DBMSs do and what do you 
want to be able to do?  (Personally, I would think being able to maintain locks 
independently of transactions is useful, especially of inner transactions or 
savepoints.)

2.  Read consistency.  After you start a transaction, are you seeing a 
consistent database state besides any changes you make yourself, or if you 
repeated the same query without making changes, would you get the same results 
both times?  That is, are committed changes by other database users visible to 
you to affect the results.  Do you want to be able to choose a desired behavior 
and have it enforced, such as whether you want to see other committed changes 
immediately or not see them at all during your transaction, the latter 
especially if it is a read-only transaction?  For picking either behavior, how 
do you currently tell the DBMS to do this?  And do you do it natively with SQL 
or does your Perl wrapper handle it for you and if so then how?

2a.  Is there anything you do differently if you are using cursors versus if you 
are not?

2b.  Is your DBMS multi-versioned, so that someone can see a consistent database 
state for an extended period without blocking others that want to update it?  Do 
you desire such functionality or do you prefer that updates of a common-interest 
section will block other users?  What do current DBMSs do in this regard or 
which let you choose?

2c.  Do you explicitly concern yourself with the above at all, or do you just do 
your read and update queries without worrying about whether things might have 
changed in the middle, and just let the DBMS sort it out implicitly, and you let 
things fall as they may?  Note that conceivably this is the scenario easiest to 
support as I just don't do anything special, sometimes.  Do any Perl DBMS 
wrappers care about this?

3.  Transition constraints.  Can you currently or do you want to be able to 
declare declarative transition constraints that the DBMS enforces.  In contrast 
to state constraints, which look at the current state of a whole database and 
see if it is "consistent" with respect to itself, a transition constraint 
compares the state of the database before the change you made with the state 
afterwards and says whether the database may directly transition from the first 
to the second.  Any SQL TRIGGER which looks at both "OLD" and "NEW" together for 
the purpose of veto power on a database change or transaction would qualify.

3a.  Do you expect transition constraints to evaluate per statement or inner 
transaction / savepoint or only at the end of a main transaction, comparing the 
state before the transaction to after it?

4.  Do you use triggers to optionally cause a database update that isn't 
conceptually part of the triggering action, such as write to a log table, or do 
you only use them to define constraints that say can't be expressed with a CHECK 
clause or unique/foreign/etc key declaration?  Note that if the trigger is 
simply implementing an updateable view, where all of the values being written 
are from the user in the same statement/statement group, that doesn't count as I 
would consider that lot simultaneous, the cause of itself.

5.  Users.  Do you use multiple DBMS-recognized users in your database or just 
one?  I refer to what you give to DBI's connect() as its user/pass.  Do all 
users of your applications share the same user/pass or does each human user of 
your application have their own user/pass that is used there?  If you only use 
one, do you wish you could have multiple or does your Perl layer emulate that 
you are without you actually using multiple at the DBMS level?

5a.  Do you generate temporary database users at runtime, ostensibly for 
security purposes.  For example, the user/pass a user of your app logs in with 
isn't an actual DBMS user, but when they want to login, a DBMS user whose only 
privilege is to invoke a specific stored procedure is used to validate the 
user/pass that the user gave, and then if accepted a temporary user/pass is 
generated by the stored procedure where those credentials are returned by the 
procedure, and then the application logs into the DBMS with those to do all the 
other/normal work of the user for that session.  (Similar to this has actually 
been done in a major government database application.)

5b.  Do you define your users using plain SQL or can your Perl layer do it?

6.  User privileges.  Do you have differing ranges of privileges in your 
databases for what database users can see or change or invoke?  Is this enforced 
by the DBMS itself or just by your application, or is it not enforced at all but 
is just a "supposed to"?  Do you want this definable and enforceable at a lower 
level such that if say your higher level code is overly broad and makes a 
mistake, it won't end up showing or changing something the user shouldn't be 
allowed to?

6a.  What DBMSs currently support roles or per-user privileges other than simply 
whether they may login or not?  And what sorts of granularity do they have?

6b.  How important is it to you to be able to have per-user privileges in your 
databases?  Or how important is it to be able to define these without using SQL?

7.  Spatial types and operators.  Who uses spatial or GIS or such types or 
operators?  How valuable is it to be able to do so without writing SQL?

8.  Special values for numerics.  Who uses those IEEE float special values such 
as NaN or infinities or under/overflows or distinct +/- zero or whatever in 
their databases, or do you only care about storing or working with normal 
numbers?  If you do use any special values, then which ones and why?  Do you 
currently distinguish between multiple kinds of NaN (eg, 0/0 versus square root 
of -1 versus whatever), or is a NaN just a NaN?  Note that "not using" would 
mean that an operation that would otherwise produce one instead returns an error 
/ throws an exception.

9.  Full text search.  Who uses this versus a simple substring match?

10.  Who uses regular expressions of any kind in the database?  Either 
Perl-compatible or otherwise.  Basically any kind of pattern matching besides 
the trivial kind that LIKE supports (literals plus . plus .*).

11.  Out of band sequence generators.  If you read from / increment a sequence 
generator from within a transaction, do you want a rollback of that transaction 
to also rollback the sequence generator or not?  Do you want both possibilities 
to be supported.  What do current DBMSs support?

12.  Sequence generators that produce values other than integers?  Who uses this 
or what DBMSs support them?  And what would be the semantics for non-integers?

13.  Transactions and data definition.  If you start an explicit transaction, 
and then do a data definition operation, such as changing the type of a table 
column, do you want the latter to be subject to the transaction such that you 
can roll it back?  (I believe SQLite subjugates everything to transactions.)

14.  Implicit commit of explicit transaction.  If you have an explicit 
transaction, should any operation you do in it other than an explicit 'commit' 
cause it to commit, or should only 'commit' do that?  If someone attempts data 
definition within a transaction and the DBMS doesn't support subjugating those 
to transactions, then should the data definition attempt fail or should it cause 
an automatic commit.  (I think the subjugate-or-fail should be the case, and the 
implicit commit is always bad.)  What DBMSs currently support each behavior? 
(AFAIK, SQLite does not automatically commit, but MySQL does.  Know otherwise?)

I may have more questions, but that's for now.

Thank you in advance for any feedback.

-- Darren Duncan