[Dbix-class] Explicit ASTs (ping nate)

Sun Sep 3 15:08:13 CEST 2006

David E. Wheeler wrote:
> On Sep 2, 2006, at 13:53, Matt S Trout wrote:
> 
>>    DWIM AST -> explicit AST -> SQL
> 
> Agreed. This way, the AST does not have to be specific to any one  
> back end (PostgreSQL, SQLite, MySQL, LDAP, a object iterator, etc.).

Precisely. And we can have as many front-ends as we want (we'll need at least 
an SQL::Abstract DWIM AST -> explicit AST for our and for Nate's usage for a 
start, and I think figuring out how to compile Jifty::Query to this is 
probably the way to enable its usage against DBIC, which I know audrey wanted 
to happen).

>> I reckon the basic query structure can be a hashref since we have
>> @lots_of_clauses (and since we're bound to miss something first  
>> time round)
>> but each clause should probably be in arrayref form, so we have  
>> something like
>>
>>    foo = 'yay' AND bar LIKE 'wooho%'
> 
> I'll just assume that you're using explicit SQL here, since we're not  
> discussing the DWIM part. IOW, You're saying, to get the above, one  
> has to create an AST like the below, yes?

Right.

>> becoming
>>
>>    [ -op, 'AND',
>>      [ -op, '=', [ -name, 'foo' ], [ -bind, 'yay' ] ],
>>      [ -op, 'LIKE', [ -name, 'bar' ], [ -bind, 'wooho%' ] ],
>>    ],
>>
>> and a function call like
>>
>>    COUNT( * )
>>
>> would become
>>
>>    [ -func, 'COUNT', '*' ]
> 
> Sure, that works for me.
> 
>> the basic principle here is that an op generally takes a specific  
>> number of
>> arguments, whereas a function is a containing clause that can take a
>> potentially arbitrary number. So a complete statement might be  
>> something like
>>
>>    SELECT my.a, my.b FROM my_table my WHERE spork = 1
>>
>> and that would become
>>
>>    {
>>      select => [ [ -name, 'my', 'a' ], [ -name, 'my', 'b' ] ],
> 
> The '-name' key might be unnecessary, no? Unless there's something  
> else that might be included? Oh, wait, there is. It should probably  
> be something like:
> 
>    [ [ -name => 'my', -alias => 'a' ],
>      [ -name => 'my', -alias => 'b' ],
>    ]

Err, no, that's going the wrong way -

   [ -name, 'my', 'a' ] => my.a
   [ -alias, [ -name, 'my', 'a' ], 'a' ] => my.a AS a

I think.

> If we're going to be precise (and pedantic). No? And since there  
> multiple keys per token, it should actually probably be:
> 
>    [ { -name => 'my', -alias => 'a' },
>      { -name => 'my', -alias => 'b' },
>    ]

No, not at all. The structure logically nests the way shown above - that 
should enable things like

   [ -alias,
     [ -func, 'COUNT',
       [ -func, 'DISTINCT',
         [ -name, 'my', 'a' ]
       ]
     ],
     'a_cnt'
   ]

for "COUNT(DISTINCT my.a) AS a_cnt"

>>      from => [ [ -alias, 'my_table', 'my' ] ],
> 
> That would have to be:
> 
>        from => [ { -name => 'my_table', -alias => 'my' } ]

More

   [ [ -alias, [ -name, 'my_table' ], 'my' ] ]

to preserve the nesting (again that -name could be replaced with a subselect, 
or a function invocation for SELECT * FROM my_sproc(...), or etc.)

>>      where => [ [ -op, '=', [ -name, 'spork' ], [ -bind, 1 ] ] ],
> 
> And here:
> 
>       where => [ [ -op, '=', { -name => 'spork', -bind => 1 } ] ],
> 
> No?

No, definitely not, that restricts us to <column reference> = <value> which is 
one of our big bugbears with SQL::Abstract on its own - for example keeping it 
as a pair of arrayrefs allows

   [ [ -op, '=', [ -name, 'spork' ], [ -name, 'spoon' ] ] ]

to produce "spork = spoon" in the final SQL - or having a function on the LHS 
or whatever.

>>    }
>>
>> which is ... spectacularly ugly, but I -think- explicit enough. Fun  
>> will be
>> handling specific things like
> 
> Yes, I was making it even more explicit. Whether that's over the top  
> or not I'm not sure. Also, I'm thinking that the -bind value ought to  
> support a code ref, so that it could potentially bind a different  
> value every time it's called.

I agree in places, but your hashref style results in data loss as expanded 
above so I'm pretty much certain that's wrong.

We'll have to think carefully about how -bind works; Scalar::Defer might be 
sufficient to the task but I think explicit support for subrefs and for named 
bind parameters somehow will also be worthwhile.

>>    MATCH (title,body) AGAINST ('database')
>>
>> but I guess that's just
>>
>>    [ -func, 'MATCH',
>>      [ [ -name, 'title' ], [ -name, 'body' ] ],
>>      [ [ -bind, 'database' ] ]
>>    ]
> 
> Right.
> 
>> and have something like "func_specific_$func" and "func_generic"  
>> methods on
>> the emitter, so MATCH gets handled by func_specific_MATCH whereas  
>> something
>> "normal" like COUNT can go through func_generic (we'd enumerate the  
>> specific
>> methods at instantiation time for performance, I think).
> 
> Well, I think that COUNT() is only in the select or group by clauses,  
> never in where. But I'd also like to explicitly include support for  
> MATCH(), so that one can use regular expressions in WHERE clauses,  
> just as one can use LIKE.

True re count, but the point here is more one of enabling easy writing of 
syntax generators, and COUNT and its ilk are a relatively standard syntax 
whereas MATCH isn't. The func_generic/func_specific stuff would be an 
implementation detail from an external POV.

As for supporting MATCH explicitly we'd have to do a run-through of what other 
databases support by way of regex syntax, but I guess you can answer that?

-- 
      Matt S Trout       Offering custom development, consultancy and support
   Technical Director    contracts for Catalyst, DBIx::Class and BAST. Contact
Shadowcat Systems Ltd.  mst (at) shadowcatsystems.co.uk for more information

+ Help us build a better perl ORM: http://dbix-class.shadowcatsystems.co.uk/ +