[Dbix-class] rfc, treating NULL as a value like any other
Darren Duncan
darren at DarrenDuncan.net
Sun Jan 27 01:28:00 GMT 2008
Hello,
This email is primarily concerned with the possible future internals
of DBIx-Class with respect to an intermediate AST based on Muldis D
that you can convert to/from SQL, but this email is secondarily
concerned with a possible future enhancement to the public interface
of DBIx-Class.
In a nutshell, one issue that often trips up new (and experienced)
SQL users is its 3-valued logic such that you can't just say "foo =
NULL" to match null values, requiring "IS NULL" instead, or that both
"foo = 3" and "foo <> 3" will not match foo that are NULL. And AFAIK
in some situations SQL treats 2 NULL as matching and other times
non-matching.
My question for you is whether, from a typical or new or experienced
user's point of view, it would be preferable for the DBMS to treat
NULL like an actual single value, which is equal to itself while not
equal to every other normal value that is not NULL? Consequences of
this include that "foo = NULL" will DWIM (result in true when foo is
NULL), and "foo <> 3" will be true when foo is NULL (foo is anything
but 3).
(Other consequences are that null-allowing columns with a
single-column unique key constraint will allow a NULL value to appear
exactly once in the column. In a naive case, a foreign key
relationship would then match a NULL in one table with one in
another, or require one in the parent. Though if this isn't desired,
then the foreign key constraint could be defined to only apply to
child table values that do not equal NULL. Both of which I think SQL
already does and that doesn't necessarily need to change.)
Regardless of whether any updates to the DBIx-Class public interface
are made, I believe it is preferrable to think in terms of NULL =
NULL and structure the AST along those principles, so you get more
AST simplicity or DWIM when you want that to be the case, while you
may or may not get more AST complexity when you don't want that to be
the case. Or regardless of whether that happens, the AST is simpler
when dealing with not-nullable data, and marginally more complex when
dealing with nullable data.
The latest release 0.18.0 of the Muldis D spec,
http://search.cpan.org/dist/Language-MuldisD/ , clarifies that any 2
table ("relation") values are identical when they contain zero rows
("tuples") and they have the same column ("attribute") names,
regardless of the declared types of those columns. Also, new
subtypes Nothing,Just of Maybe were added. Now, Muldis D is
optimized towards not-nullable data, so in SQL terms, if you simply
say "foo Int" as a field type, that is implicitly not nullable, and
you have to explicitly say nullable to make it so, as "foo
Maybe{Int}". Defining a field nullable is actually defining the
field to contain a table value whose sole column name is 'value' and
which may contain only zero rows or 1 row; if it contains zero, it is
null, if it contains 1 row, namely the normal value, it is not null.
You could say that not-nullable means unboxed while nullable means
boxed. But those are details that users typically don't have to know
about. The single Nothing value matches the empty case, and Just
constructs a full case. Testing a not nullable field is simple as
"foo = 3" or "foo <> 3", while testing a nullable is "foo = Nothing"
(SQL "foo IS NULL") "foo = Just(3)" (SQL "foo = 3") or "foo <>
Just(3)" (SQL "foo <> 3 OR foo IS NULL") "foo <> Just(3) and foo <>
Nothing" (SQL "foo <> 3"). In this system, "Nothing = Nothing" is
true, "Nothing = 3" and "Nothing = Just(3)" are both false, "Nothing
<> 3" and "Nothing <> Just(3)" are both true, "3 = Just(3)" is false,
"Just(3) = Just(3)" is true, "3 = Just(3).attr" is true. To do math
with not-nullable, you just say eg "row.x / row.y", with nullable but
known not null you say "row.x.attr / row.y.attr".
So, assuming it is clearly documented, do you forsee any problems in
using an AST that is 2-valued-logic and assumes NULL = NULL, where
you have to explicitly say when you don't want that behaviour? It
should still be possible to generate SQL with the desired semantics,
so this is more a useability question.
Thank you. -- Darren Duncan
More information about the DBIx-Class
mailing list