[Dbix-class] RFC - some DBMS features you use
Darren Duncan
darren at darrenduncan.net
Thu Feb 4 22:07:37 GMT 2010
Hello,
In order to help me with some prioritization on my Muldis database projects
(which would typically be used as a DBMS wrapper above the DBI level but below
the level of other non-trivial DBMS wrappers like yours; it has its own query
language intended to be what I think SQL should have been in the first place),
and decide what features to try and design into the first version versus putting
off for later, I wanted to quickly survey what SQL DBMS features you are using
or care about now. I will just ask about some specific features that I haven't
formally specced yet.
I may also be asking about what you know regarding the current feature set of
existing SQL DBMSs, such as what they do or support already if one wanted to use
them.
If I don't mention something, it could either be because I already specced it
(probably 80-90% of everything you care about by now) or it isn't yet on my
radar or I don't consider it important enough for now.
Some of these questions overlap, so I recommend reading all of them before
answering any of them; you may be able to consolidate responses.
1. Locking. Do you explicitly place locks (including with "select ... for
update" or otherwise) on either the whole or parts of the database to ensure no
one else updates them for the duration of the locks? What DBMSs do or don't
support doing this?
1a. At what levels of granularity do you care about being supported? Examples:
- whole database at once
- whole database relvars/tables at once
- just specific tuples/rows with specific primary key or other key values (eg,
where person id is 3 or 7)
- generic predicate locks, covering any tuples/rows that are or might be visible
to a particular query (eg, where name starts with 'f' and date_a is before
date_b and the person is a member of clubs started before this date)
- other kinds?
1b. Do you want locks to be freed only explicitly or automatically at the end
of a transaction commit or rollback? What do current DBMSs do and what do you
want to be able to do? (Personally, I would think being able to maintain locks
independently of transactions is useful, especially of inner transactions or
savepoints.)
2. Read consistency. After you start a transaction, are you seeing a
consistent database state besides any changes you make yourself, or if you
repeated the same query without making changes, would you get the same results
both times? That is, are committed changes by other database users visible to
you to affect the results. Do you want to be able to choose a desired behavior
and have it enforced, such as whether you want to see other committed changes
immediately or not see them at all during your transaction, the latter
especially if it is a read-only transaction? For picking either behavior, how
do you currently tell the DBMS to do this? And do you do it natively with SQL
or does your Perl wrapper handle it for you and if so then how?
2a. Is there anything you do differently if you are using cursors versus if you
are not?
2b. Is your DBMS multi-versioned, so that someone can see a consistent database
state for an extended period without blocking others that want to update it? Do
you desire such functionality or do you prefer that updates of a common-interest
section will block other users? What do current DBMSs do in this regard or
which let you choose?
2c. Do you explicitly concern yourself with the above at all, or do you just do
your read and update queries without worrying about whether things might have
changed in the middle, and just let the DBMS sort it out implicitly, and you let
things fall as they may? Note that conceivably this is the scenario easiest to
support as I just don't do anything special, sometimes. Do any Perl DBMS
wrappers care about this?
3. Transition constraints. Can you currently or do you want to be able to
declare declarative transition constraints that the DBMS enforces. In contrast
to state constraints, which look at the current state of a whole database and
see if it is "consistent" with respect to itself, a transition constraint
compares the state of the database before the change you made with the state
afterwards and says whether the database may directly transition from the first
to the second. Any SQL TRIGGER which looks at both "OLD" and "NEW" together for
the purpose of veto power on a database change or transaction would qualify.
3a. Do you expect transition constraints to evaluate per statement or inner
transaction / savepoint or only at the end of a main transaction, comparing the
state before the transaction to after it?
4. Do you use triggers to optionally cause a database update that isn't
conceptually part of the triggering action, such as write to a log table, or do
you only use them to define constraints that say can't be expressed with a CHECK
clause or unique/foreign/etc key declaration? Note that if the trigger is
simply implementing an updateable view, where all of the values being written
are from the user in the same statement/statement group, that doesn't count as I
would consider that lot simultaneous, the cause of itself.
5. Users. Do you use multiple DBMS-recognized users in your database or just
one? I refer to what you give to DBI's connect() as its user/pass. Do all
users of your applications share the same user/pass or does each human user of
your application have their own user/pass that is used there? If you only use
one, do you wish you could have multiple or does your Perl layer emulate that
you are without you actually using multiple at the DBMS level?
5a. Do you generate temporary database users at runtime, ostensibly for
security purposes. For example, the user/pass a user of your app logs in with
isn't an actual DBMS user, but when they want to login, a DBMS user whose only
privilege is to invoke a specific stored procedure is used to validate the
user/pass that the user gave, and then if accepted a temporary user/pass is
generated by the stored procedure where those credentials are returned by the
procedure, and then the application logs into the DBMS with those to do all the
other/normal work of the user for that session. (Similar to this has actually
been done in a major government database application.)
5b. Do you define your users using plain SQL or can your Perl layer do it?
6. User privileges. Do you have differing ranges of privileges in your
databases for what database users can see or change or invoke? Is this enforced
by the DBMS itself or just by your application, or is it not enforced at all but
is just a "supposed to"? Do you want this definable and enforceable at a lower
level such that if say your higher level code is overly broad and makes a
mistake, it won't end up showing or changing something the user shouldn't be
allowed to?
6a. What DBMSs currently support roles or per-user privileges other than simply
whether they may login or not? And what sorts of granularity do they have?
6b. How important is it to you to be able to have per-user privileges in your
databases? Or how important is it to be able to define these without using SQL?
7. Spatial types and operators. Who uses spatial or GIS or such types or
operators? How valuable is it to be able to do so without writing SQL?
8. Special values for numerics. Who uses those IEEE float special values such
as NaN or infinities or under/overflows or distinct +/- zero or whatever in
their databases, or do you only care about storing or working with normal
numbers? If you do use any special values, then which ones and why? Do you
currently distinguish between multiple kinds of NaN (eg, 0/0 versus square root
of -1 versus whatever), or is a NaN just a NaN? Note that "not using" would
mean that an operation that would otherwise produce one instead returns an error
/ throws an exception.
9. Full text search. Who uses this versus a simple substring match?
10. Who uses regular expressions of any kind in the database? Either
Perl-compatible or otherwise. Basically any kind of pattern matching besides
the trivial kind that LIKE supports (literals plus . plus .*).
11. Out of band sequence generators. If you read from / increment a sequence
generator from within a transaction, do you want a rollback of that transaction
to also rollback the sequence generator or not? Do you want both possibilities
to be supported. What do current DBMSs support?
12. Sequence generators that produce values other than integers? Who uses this
or what DBMSs support them? And what would be the semantics for non-integers?
13. Transactions and data definition. If you start an explicit transaction,
and then do a data definition operation, such as changing the type of a table
column, do you want the latter to be subject to the transaction such that you
can roll it back? (I believe SQLite subjugates everything to transactions.)
14. Implicit commit of explicit transaction. If you have an explicit
transaction, should any operation you do in it other than an explicit 'commit'
cause it to commit, or should only 'commit' do that? If someone attempts data
definition within a transaction and the DBMS doesn't support subjugating those
to transactions, then should the data definition attempt fail or should it cause
an automatic commit. (I think the subjugate-or-fail should be the case, and the
implicit commit is always bad.) What DBMSs currently support each behavior?
(AFAIK, SQLite does not automatically commit, but MySQL does. Know otherwise?)
I may have more questions, but that's for now.
Thank you in advance for any feedback.
-- Darren Duncan
More information about the DBIx-Class
mailing list