[Dbix-class] Explicit ASTs (ping nate)

Wed Sep 6 16:46:08 CEST 2006

Nate Wiger wrote:
> Sam Vilain wrote:
>> Matt S Trout wrote:
>>>    foo = 'yay' AND bar LIKE 'wooho%'
>>>
>>> becoming
>>>
>>>    [ -op, 'AND',
>>>      [ -op, '=', [ -name, 'foo' ], [ -bind, 'yay' ] ],
>>>      [ -op, 'LIKE', [ -name, 'bar' ], [ -bind, 'wooho%' ] ],
>>>    ],
>> Would it hurt *that* much to bless these fragments, to make it easier in
>> places?
> 
> I'm curious about that myself. If we want to really be able to supply 
> callbacks, relations, etc, it seems that we need something like this 
> (and this is completely off the top of my head after a long weekend on 
> vacation):
> 
>     foo = 'yay' AND bar LIKE 'wooho%'
> 
> becoming:
> 
>     $s->and(
>        $s->eq(   $s->bind( $s->name('foo'), 'yay') ),
>        $s->like( $s->bind( $s->name('bar'), 'wooho%') ),
>     ),

I was thinking more

$s->op_and(
   $s->op_eq(...),
   ...
)

for namespacing

> and a function call like
> 
>     COUNT( * )
> 
> would become
> 
>     $s->count('*')

so this would check for the existence of the func_specific_count method and 
then call

$s->func_specific_count('*')

or

$s->func_generic('count', '*')

as appropriate

>  From there, a person could subclass SQL::Abstract and provide their own 
> custom site "count()" function that did something special. Or an XPath 
> version/etc. Or a $s->name() that does the weird quoting MySQL wants.

right, exactly - and the generic/specific stuff allows a fallback handler but 
still adding specific handlers where required.

> An intermediate parsing tree is a good idea, but then again I could see 
> as much benefit from being able to just insert into a node using Perl 
> methods, kinda like how JS handles the XML DOM (although less crappy). 
> Otherwise we pretty much have to rewrite SQL to support all the myriad 
> variations.

I was rather assuming we'd take the explicit AST and feed it to an object with 
methods like this to generate the final query.

The point of having the AST contain the full info is that we can reason on / 
transform it before we pass it to the generator. If people find the 
programmatic version easier to read it should be trivial to write a null 
generator that they can call that hands them the AST back (i.e. one that would 
be a no-op if set to traverse across its own output).

> Here's one expanded example I could think of:
> 
>     SELECT d.id, d.breed FROM dog_breeds d
>      WHERE d.size > 100 AND d.color IN ('brown','black','grey')
>      LIMIT 100;
> 
> That might be something like:
> 
>     $s->select(
>       $s->table('dog_breeds', 'd'),  # dog_breeds d
>       $s->list(                      # for commas
>         $s->name('id', 'd'),         # d.id, d.breed
>         $s->name('breed', 'd'),      # now we know d = d
>       ),
>       $s->where(                     # will give "WHERE"
>         $s->and(
>           $s->where(\%data_struct),
>           $s->in($s->name('color', 'd'), @members),
>         ),
>       ),
>     );

That's moving more towards DWIM-y syntax to me that would be user-used.

I see the $s in the first couple examples that's processing the explicit AST 
as something like SQL::Abstract::Generator::Generic (or ::MySQL or ::Oracle or 
etc.) and the $s in the immediately above example as SQL::Abstract::Query or 
something.

> Ooops, forgot the limit:
> 
>     if (my $l = $cgi->param('limit')) {
>        $s->append($s->limit($l));
>     }
> 
> And we really wanted an extra column back:
> 
>     my $n = $s->find('from');
>     $n->append($s->name('color', 'd'));
> 
> And why don't we get all the d members to set types:
> 
>     my @m = $s->members('d');
>     for (@m) {
>        if ($_->name eq 'id') {
>            $_->type('integer');
>        }
>     }

All this looks very much like a DWIM syntax to me, again.

> Finally, render the SQL:
> 
>     my($stmt, @bind) = $s->render;

And if the $s in these examples were to build up said AST internally, it would 
then become trivial for $s->render to pass it on to an appropriate generator.

Really quite like your syntax sketch there as a user-level API, although I 
think I still prefer resultset's pure-functional approach.

One thing that I am sure of is that the final generator should be 
pure-functional, no state - i.e. "my $sql = $gen->generate($ast)" - that way 
the generator can cache SQL fragments and even entire queries it's built, and 
we don't have to worry about selective deep copying and stuff (which is always 
a source of performance and weird-bug issues IME)

> Just a thought. Great thread BTW.

Looks interesting to me, you seem to have taken the stuff I was thinking about 
and started looking a bit further towards implementation from the AST on down 
and interfaces to build up the AST in the first place (along with of course 
the existing SQL::Abstract interface which I see no reason we can't continue 
to transparently support with this new stuff as a replacement backend).

What do you think of my comments above, would you be interested in 
SQL::Abstract being moved in this direction as part of a larger effort?

-- 
      Matt S Trout       Offering custom development, consultancy and support
   Technical Director    contracts for Catalyst, DBIx::Class and BAST. Contact
Shadowcat Systems Ltd.  mst (at) shadowcatsystems.co.uk for more information

+ Help us build a better perl ORM: http://dbix-class.shadowcatsystems.co.uk/ +