[Dbix-class] Explicit ASTs (ping nate)
David E. Wheeler
david at kineticode.com
Mon Sep 4 20:04:49 CEST 2006
On Sep 3, 2006, at 23:10, Darren Duncan wrote:
> While your goals don't require as much attention to detail and/or
> "work" to implement, I raised the detail points I did because I know
> that people will keep wanting more out of this solution under
> discussion, and that having an eye on design principles used by more
> involved systems will let us design this good-enough smaller system
> in such a way that it is easier to scale up with new features later,
> rather than resorting to premature hacks and/or breaking backwards
> compatibility unnecessarily.
Oh, agreed. It's because of the insufficient design of the current
implementations that we're having this discussion (see mst's comments
about Object::Relation in the first post in this thread).
> Also, based on the discussions to date, I didn't/don't really know
> what the intended scope of the project is, so I just defaulted to
> assuming this may grow to resemble a "complete" solution. Perhaps
> more commentary about what people do or don't actually want in the
> explicit AST is helpful when going forward, such as where the line
> might be between good-enough and insufficient. Mind you, some of
> what you said in your reply does just that, assuming others agree.
Right.
> Fair enough, if that's actually true, and it does cut down the
> problem space by orders of magnitude.
Hell yeah! We'll leave creating an actual RDBMS to insane geniuses
like you. ;-)
> In fact, this means that every AST can simply be a single
> arbitrary-depth expression which represents the select statement, all
> generally in the functional language sense. Each node in said
> expression represents either a literal or variable name or an
> operator invocation that takes zero or more arguments and returns a
> value. The root node's value would be of a Table or Relation type,
> since that's what a SELECT returns.
Yes, exactly, although the return value from the root node is not
specified; that will be implementation-specific.
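
To make the node shapes concrete, here is a minimal sketch in Python (used purely for illustration; the thread itself is about Perl). The names Literal, Name, Op and depth are hypothetical and do not come from any existing module; the root's return type is deliberately left unspecified, as discussed above.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Literal:
    value: object  # a constant such as 18 or 'abc'


@dataclass(frozen=True)
class Name:
    ident: str  # a table, column, or variable name


@dataclass(frozen=True)
class Op:
    name: str                 # operator name, e.g. "ge" or "restrict"
    args: Tuple["Node", ...]  # zero or more child expressions


Node = Union[Literal, Name, Op]


def depth(node: Node) -> int:
    """Return the nesting depth of an expression tree."""
    if isinstance(node, Op):
        return 1 + max((depth(a) for a in node.args), default=0)
    return 1


# A whole SELECT as one nested expression:
# project(restrict(person, age >= 18), name)
tree = Op("project", (
    Op("restrict", (
        Name("person"),
        Op("ge", (Name("age"), Literal(18))),
    )),
    Name("name"),
))
```

Every node is either a leaf (Literal, Name) or an operator invocation, so the tree can nest to arbitrary depth without any special-case statement types.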
> Generally speaking, just use a collection of separate simple
> relational operators (take the "original 8" and/or D for inspiration)
> that together do what a SELECT does, and then compose them into a
> SELECT when generating the SQL.
FYI to others, the "original 8" are:
Restrict
Project
Join
Intersect
Union
Difference
Cartesian Product
Divide
See "Database in Depth" pp 86-92 for detailed descriptions of each.
http://www.amazon.com/exec/obidos/ASIN/0596100124/
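
For readers who find the operators hard to picture, a rough sketch of a few of them in Python may help (illustration only, not a proposed API). It assumes a relation is a set of rows and each row is a frozenset of (attribute, value) pairs; the function names relation, restrict, project and join are hypothetical. Union, Intersect and Difference then fall out of ordinary set operations (a | b, a & b, a - b), and Cartesian Product is a join of relations with no shared attributes. Divide is omitted here.

```python
def relation(*rows):
    """Build a relation: a set of rows, each a frozenset of (attr, value)."""
    return {frozenset(r.items()) for r in rows}


def restrict(rel, pred):
    """Keep only the rows for which pred(row_as_dict) is true."""
    return {r for r in rel if pred(dict(r))}


def project(rel, attrs):
    """Keep only the named attributes; duplicate rows collapse."""
    return {frozenset((k, v) for k, v in r if k in attrs) for r in rel}


def join(a, b):
    """Natural join: combine rows that agree on all shared attributes."""
    out = set()
    for ra in a:
        da = dict(ra)
        for rb in b:
            db = dict(rb)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out


people = relation({"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"})
orders = relation({"id": 1, "item": "pen"})
colors = relation({"color": "red"}, {"color": "blue"})

adults = restrict(people, lambda row: row["id"] > 1)
names = project(people, {"name"})
bought = join(people, orders)
product = join(people, colors)  # no shared attributes: Cartesian product
```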
> That makes something that is a lot more Perl-like and easier for
> programmers to understand, while people that know SQL already know
> what the parts of a SELECT do and can easily compose the analogous
> simpler functions.
Well, maybe. I had to really struggle to understand those operators,
myself. I still don't have much of a grasp (I am not a mathematician,
let alone a set theorist).
> Or just have a big "select" operator instead that is relatively
> complicated, though I would strongly suggest that several
> smaller functions are less work than the one big one. (I know from
> experience when making the defunct SQL::Routine, where much of the
> complexity was modelling an actual SELECT statement.)
Yes, smaller is definitely better.
> But however you do it, if you just deal with function operators
> everywhere, including for both the select and any
> math/string/whatever operations, your syntax will be straightforward
> and simple, and Perl-like, and easier to make work over a non-SQL
> backend like LDAP or whatever.
Yes, that's the ideal.
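
As a rough illustration of why operators-everywhere helps with non-SQL backends, the Python sketch below renders one expression tree two ways. The tuple encoding and the to_sql / to_ldap helpers are hypothetical, written only for this example; value quoting and escaping are deliberately naive and would need real handling in practice.

```python
def to_sql(node):
    """Render a boolean expression tree as a SQL condition (no escaping)."""
    op, *args = node
    if op in ("and", "or"):
        joined = f" {op.upper()} ".join(to_sql(a) for a in args)
        return f"({joined})"
    if op == "eq":
        column, value = args
        return f"{column} = '{value}'"
    raise ValueError(f"unknown operator: {op}")


def to_ldap(node):
    """Render the same tree as an LDAP search filter (no escaping)."""
    op, *args = node
    if op in ("and", "or"):
        symbol = "&" if op == "and" else "|"
        inner = "".join(to_ldap(a) for a in args)
        return f"({symbol}{inner})"
    if op == "eq":
        attr, value = args
        return f"({attr}={value})"
    raise ValueError(f"unknown operator: {op}")


expr = ("and",
        ("eq", "cn", "david"),
        ("or", ("eq", "ou", "dev"), ("eq", "ou", "ops")))

sql_text = to_sql(expr)
ldap_text = to_ldap(expr)
```

Because every construct is an operator applied to arguments, adding a backend means writing one more small renderer rather than translating SQL text.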
> Yes. But we are making an EXPLICIT AST, right? The DWIM wrapper
> would just take uc() of course, but I suggest the explicit version
> including a namespace will just make it less ambiguous in important
> ways. E.g., if a user defines a function that has the same name
> as one of our AST's built-ins, because it isn't the same as their
> choice of underlying DBMS' reserved word. Of course, the explicit
> AST should be easy to use, but one point of it being intended to use
> under a wrapper is that we can make it more verbose to aid clarity.
Fair enough; your point is well-taken. We'd need to have a way to
distinguish them when new core functions are added, though, where
they might conflict with user-defined functions.
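
One possible way to keep the two apart, sketched in Python for illustration only: every function call carries an explicit namespace, so a core function added later cannot silently shadow a user-defined one. The namespace labels "core" and "user", the CORE table, and render_call are all assumptions made for this example.

```python
# Hypothetical table of built-in functions and their SQL spellings.
CORE = {"upper": "UPPER", "lower": "LOWER"}


def render_call(namespace, name, arg_sql, user_functions=frozenset()):
    """Render a function call whose name is qualified by a namespace."""
    if namespace == "core":
        if name not in CORE:
            raise KeyError(f"no core function named {name!r}")
        return f"{CORE[name]}({arg_sql})"
    if namespace == "user":
        if name not in user_functions:
            raise KeyError(f"no user function named {name!r}")
        # User functions are emitted under the name the user registered.
        return f"{name}({arg_sql})"
    raise ValueError(f"unknown namespace: {namespace!r}")


builtin = render_call("core", "upper", "name")
custom = render_call("user", "upper", "name", user_functions={"upper"})
```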
> I fully agree that that code example is confusing. I would expect
> there to be explicit operators for both AND() and OR() at any time
> where they are intended; you should be constructing an expression
> where the root node returns a boolean value, which is what AND(),
> OR(), and any arguments to those return.
Yes, but the syntax in pure Perl is hard; see my recent post on p5p
for details.
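
For comparison, this is roughly what explicit AND() and OR() operators with a boolean-valued root look like as plain function calls, sketched in Python (not Perl, and not the syntax discussed on p5p). The builder names AND, OR, EQ and the evaluate helper are hypothetical.

```python
def AND(*terms):
    return ("AND", terms)


def OR(*terms):
    return ("OR", terms)


def EQ(column, value):
    return ("EQ", (column, value))


def evaluate(node, row):
    """Evaluate a condition tree against one row; always returns a bool."""
    op, args = node
    if op == "AND":
        return all(evaluate(a, row) for a in args)
    if op == "OR":
        return any(evaluate(a, row) for a in args)
    if op == "EQ":
        column, value = args
        return row[column] == value
    raise ValueError(f"unknown operator: {op}")


# No implicit AND/OR anywhere: the grouping is spelled out in the calls.
cond = AND(EQ("lang", "perl"), OR(EQ("level", "core"), EQ("level", "cpan")))

hit = evaluate(cond, {"lang": "perl", "level": "cpan"})
miss = evaluate(cond, {"lang": "perl", "level": "other"})
```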
Best,
David