[Catalyst] KiokuDB, MongoDB and the NoSQL thing

Tue Mar 2 08:30:02 GMT 2010

S.A. Kiehn wrote:
> I do not see many posts regarding uses of KiokuDB within Catalyst so I 
> was curious about the opinion of the community in regards to its usage.  
> Is it still too early in its development?
> 
> Also, I have been reading more about the increase in the NoSQL interest, 
> with a particular interest in the MongoDB database (it seems to be 
> similar in some respects to KiokuDB), but I do not find Perl people in 
> the discussion as much as others (Ruby, PHP).  Are there developers in 
> the Catalyst community who lean toward NoSQL concepts over traditional 
> RDBMSs, or is it best to view NoSQL as a tool to use at times?
> 
> How about MongoDB?  Am I being suckered by another bandwagon?
> 
> Thanks, Scott K.

Well I happen to be strongly opinionated on this topic, so here goes ...

While these other DBMSs have their uses, I believe that anyone who figures they 
are superior solutions for most uses of relational databases is misguided.

I believe that the relational model of data is still the single best general 
solution that we have come up with for organizing and querying any non-trivial 
amount of data that is the least bit structured.  Especially so when that data 
needs to be or could possibly be worked with by multiple kinds of users or 
applications that may have different needs, and need their own views of that 
data.  It is also good for persisting data, but persistence isn't the main 
point; easy and flexible organization and querying is.  Persistence is 
optional, just as persisting an array of data is optional.

That said, while it does the job well enough many times, SQL is deeply flawed 
and doesn't represent the relational model of data properly; it just 
approximates it to varying degrees.  How close the approximation is depends 
partly on the SQL DBMS in use, and these range in features quite a bit.

I believe that quite often when people complain about "relational databases", 
they are really complaining about "SQL databases", as if those were the same, 
and so various "solutions" are proposed like ORMs or these NoSQL concepts.

But the problem is that they are throwing out the baby with the bathwater.  Yes, 
SQL is quite flawed, but the relational model it approximates is not (or it is 
much less flawed, if you want to argue that having something actually flawless 
is impossible).

I believe that the best solution is not to ditch everything, but rather to 
provide a DBMS that actually implements the full relational model of data, 
rather than just approximating it or ditching the concept entirely.  If you do that, 
then a lot of these other kinds of DBMSs like so-called "object-relational" or 
"object" or "key-value" become redundant, because the full relational model 
incorporates their features.

For example, the full relational model supports having tuple/row and 
relation/table values as attributes/fields of other ones, and so you can 
natively model any arbitrarily complex data type that an object could model, 
without too much indirection (similarly to how many languages support having an 
array as an element of an array).  Hence ORMs are redundant and so are distinct 
concepts of object stores, or alternately they become a lot thinner.
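
To make the nesting idea concrete, here is a minimal sketch in Python (not 
Muldis code, and not any real DBMS API).  It assumes a relation can be modelled 
as a frozenset of named tuples; the names Order, LineItem, FlatRow, group and 
ungroup are all hypothetical and chosen only for illustration.  An order tuple 
carries its line items as a relation-valued attribute, and ungroup/group 
convert between the nested form and the flat form a typical SQL schema would use.

```python
from typing import FrozenSet, NamedTuple


class LineItem(NamedTuple):
    sku: str
    qty: int


class Order(NamedTuple):
    order_id: int
    customer: str
    items: FrozenSet[LineItem]  # relation-valued attribute (hypothetical)


class FlatRow(NamedTuple):
    order_id: int
    customer: str
    sku: str
    qty: int


def ungroup(orders):
    """Flatten the nested 'items' relation into one flat relation."""
    return frozenset(
        FlatRow(o.order_id, o.customer, i.sku, i.qty)
        for o in orders
        for i in o.items
    )


def group(rows):
    """Rebuild the nested form by collecting rows per (order_id, customer)."""
    by_order = {}
    for r in rows:
        key = (r.order_id, r.customer)
        by_order.setdefault(key, set()).add(LineItem(r.sku, r.qty))
    return frozenset(
        Order(oid, cust, frozenset(items))
        for (oid, cust), items in by_order.items()
    )


# A relation of orders, each holding its own relation of line items.
orders = frozenset({
    Order(1, "alice", frozenset({LineItem("A-100", 2), LineItem("B-200", 1)})),
    Order(2, "bob", frozenset({LineItem("A-100", 5)})),
})

flat = ungroup(orders)
```

In this sketch the nested and flat forms carry the same information as long as 
every order has at least one line item; an order with an empty items relation 
would vanish on ungroup, which is one reason native support for nested 
relations is more than a convenience.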

And so, as a model of good community behaviour, I'm not just sitting around 
talking about what people should do; I am going out and actually doing it 
right now, creating a DBMS that provides the full relational model (both as 
self-contained implementations and as implementations over existing DBMSs).  
It simultaneously takes what SQL should have been and all the good features of 
the SQL alternatives, elegantly integrated in one streamlined bloat-free package.

This project, with the umbrella name "Muldis" (see CPAN etc), is mostly 
pre-alpha right now, but I am filling in the gaps as soon as I can and I am 
confident it will be viable.  In fact, I released updates to 3 of 4 current 
sub-projects on CPAN earlier today.

So, to answer your question, go ahead and explore the alternatives you name, but 
I will say to anyone that the relational model is the single best general 
solution for data aggregation and processing.

-- Darren Duncan