[Dbix-class] DBIx::Class::Index::Simple

Wed Apr 2 10:35:12 BST 2008

Mark Lawrence wrote:
> On Wed Apr 02, 2008 at 08:50:02AM +0200, Peter Rabbitson wrote:
>> Mark Lawrence wrote:
>>> On Tue Apr 01, 2008 at 08:35:38PM -0500, Jonathan Rockway wrote:
>>>> * On Tue, Apr 01 2008, Peter Rabbitson wrote:
>>>>
>>>>> __PACKAGE__->add_columns(
>>>>>   id => { data_type => 'integer', is_auto_increment => 1 },
>>>>>   starts_at => { data_type => 'datetime' },
>>>>> -  created_on => { data_type => 'timestamp' }
>>>>> +  created_on => { data_type => 'timestamp', index_as => 
>>>>> 'created_test_simple_idx' }
>>>>> );
>>>> The problem with this syntax is that you can only index one column.  Why
>>>> not do:
>>>>
>>>>  __PACKAGE__->add_index( idx_foo_bar => [qw/foo bar/] );
>>> How often do you need to know the name of an index? Why not go one
>>> simpler and do:
>>>
>>>    __PACKAGE__->add_index(qw/foo bar/);
>>>
>>> Which could return the autogenerated name of the index. If you need the
>>> name at a later (more dynamic stage perhaps) then maybe expose the
>>> name generating method as well.
>>>
>> When you generate several indexes per table, you need to be in full control 
>> of naming so no clashes will occur. In the above example what would be the 
>> index name? foo_bar? what if foo_bar is an indexed column too?
> 
> Of all the schemas I've worked on I've never had that name clash, but
> then I also find it a bit silly to have a column (indexed) named foo_bar
> if you are going to index foo and bar. My autogenerating
> index name code tends to prefix the table name, working around similarly
> named columns in different tables.
> 
> A better suggestion then is to make the simple case easy, and the
> complicated possible, al la this pseudo code:
> 
>     sub add_named_index { # usage: name => qw/columns/
>         'CREATE INDEX '. (shift) .' ON '.......'. join(',', @_);
>     }
> 
>     sub add_index { # usage: qw/columns/
>         my $name = 'idx_'. $table_name .'_'. join('_', @_);
>         $self->add_named_index($name, @_);
>     }
> 

This is arguably less readable, but this is beyond the point. The 
functionality you guys are asking for used to be in DBIC:
svn diff -r 3815:3814

I audited the diff line by line and it does exactly what you guys seem to 
want. Yet it gave way to the arcane sqlt_deploy_hook (which is really cool to 
overload, but a mess to use directly). So what gives?

As far as syntax. I agree that being able to add automatically named indexes, 
and multicolumn indexes would be cool. Also it would be cool to have different 
methods to set this. However for the simple case consider the following 
examples. I personally find the first one easier to follow as all info is in 
the place where one would expect:

--------------------------------------------------
__PACKAGE__->add_columns (
     id => {
         data_type => 'BIGINT',
         is_auto_increment => 1,
     },
     item_id => {
         data_type => 'BIGINT',
         is_foreign_key => 1,
     },
     warehouse_id => {
         data_type => 'SMALLINT',
         is_foreign_key => 1,
     },
     sku => {
         data_type => 'VARCHAR',
         size => 100,
         index_as => 'sku',
     },
     status => {
         data_type => 'TINYINT',
         default_value => 1,
         index_as => 'status',
     },
);

__PACKAGE__->set_primary_key ('id');

---------------------------------------------------

versus

---------------------------------------------------

__PACKAGE__->add_columns (
     id => {
         data_type => 'BIGINT',
         is_auto_increment => 1,
     },
     item_id => {
         data_type => 'BIGINT',
         is_foreign_key => 1,
     },
     warehouse_id => {
         data_type => 'SMALLINT',
         is_foreign_key => 1,
     },
     sku => {
         data_type => 'VARCHAR',
         size => 100,
     },
     status => {
         data_type => 'TINYINT',
         default_value => 1,
     },
);

__PACKAGE__->set_primary_key ('id');
__PACKAGE__->add_index (sku => [qw/sku/]);
__PACKAGE__->add_index (status => [qw/status/]);

--------------------------------------------------

As a matter of fact if this catches on, it would be neat to reduce

__PACKAGE__->add_unique_index (stuff => [qw/stuff/]);
to
stuff => {
	...
	uindex_as => 'stuff'.
},

and

__PACKAGE__->set_primary_key ('id');
to
id => {
	...
	is_primary_key => 1,
},

This way all multicolumn methods will still work, but for single column needs 
(more prevalent imo) simpler syntax will be also available. And if this 
_really_ catches on, it is trivial to group together all columns with the same 
index name, and create a multicol index (with random order of columns). For 
ordered indexes one would still use the clunky __PACKAGE__->add_...

Peter