[Dbix-class] DBIx::Class field metadata and validation
Dan Kubb
dan.kubb at autopilotmarketing.com
Fri Jul 29 22:30:12 CEST 2005
I've got an idea on how DBIx::Class could handle field metadata
and validation and I wanted to get some input on it -- I'm not
really sure what the best approach is, but I wanted to spark some
discussion anyway.
Some of frustrations I have with Class::DBI seem to stem from
the fact that field's metadata isn't stored in a organized way.
For example there are hacks like accessor_name() that map the
field name onto a database column name.
At the same time there's no easy way for me to attach other
metadata to a specific field, like specifying a default value,
or the data type, or some other constraints.
To try and solve this I've hacked up some basic code thats not
quite ready yet for posting to the list, but I wanted to outline
my approach to handling field metadata anyway.. I'd love to get
some feedback before I get too deeply into this:
- Each field is put in its own unique class with the naming
convention: $table_class::$field_name, so if I have a table
called "contact" specified in the class Local::Model::Contact,
the name_first field would be defined in the class
Local::Model::Contact::name_first
- Field classess can inherit from one or more classes that
define the metadata and validation rules for a specific type
of data. For example, there might be a Type class for strings,
which provides properties like length_min, length_max, format
(for a regex the column values must match), etc.
- The Type classes use Class::Data::Inheritable to specify the
properties they provide.
- A Field class is defined like so:
package Local::Model::Contact::name_first;
use base qw/
DBIx::Class:::Type::database_column
DBIx::Class:::Type::string
/;
__PACKAGE__->name('name_first');
__PACKAGE__->length_min(1);
__PACKAGE__->length_max(50);
__PACKAGE__->allow_null(0);
__PACKAGE__->table('Local::Model::Contact');
__PACKAGE__->database_column_name('name_first');
1;
- Normally typing all this in would be really tedious, so I'm
using an approach similar to Sebastian's Class::DBI::Loader
where I iterate over all the tables in a database, and create
the code like the above for each column. This would be
optional of course.
- You're free to add further properties for things that cannot
be inferred from the database in your model code, such as
regex patterns that must match, or a human readable label.
You could also specify inflate/deflate routines for the
column here as well.
- Each of the DBIx::Class::Type::* classes has a validate()
method that checks the data type. The string class'
validate() method looks like this:
sub validate : method {
my $self = shift;
return
$self->__validate_is_string,
$self->__validate_with_length_min,
$self->__validate_with_length_max,
$self->__validate_with_allowed_chars,
$self->__validate_with_disallowed_chars,
$self->__validate_with_format,
$self->__validate_with_encoding,
$self->NEXT::DISTINCT::validate;
}
If a validation rule fails, it returns an array of hashrefs
of errors. The hashrefs contain a name, a message, and
possibly some data to help explain what failed the validation
rule if it can't be found in another way.
I toyed with the idea of throwing an exception when a rule
fails, but I would rather have all the rules execute and know
everything thats wrong with the data in one go. What to do
with the errors is punted to the caller. ;)
- Fields don't necessarily need to be columns in a database;
but they can have most of the same properties.
- Foreign key fields can inherit from the primary key field
of the table they link to to show the relationship between
them.
Here's the DBIx::Class::Type::.* classes I've made so far and
the properties each provides:
base
name - the method name to use to access the field
label - the human-readable label (for GUIs mostly)
description - a description (for documentation purposes)
default - the default value to use if undef is
supplied
allowed_values - a list of values that are allowed
disallowed_values - a list of values that are not allowed
callback - a list subrefs (or method names) to use
in checking the value
allow_null - a flag that says if the field can be
undefined
read_only - a flag that says if the value is read-only
string
length_min - minimum length of the string
length_max - maximum length of the string
alllowed_chars - a list of characters allowed
disallowed_chars - a list of characters not allowed
format - a regex to match
encoding - character encoding that must match
numeric
range_min - the smallest the number can be
range_max - the largest the number can be
fractional_min - the smallest length the fractional part
of the number can be
fractional_max - the largest length the fractional part of
the number can be
database_column
table - the name of the table the field belongs to
database_column_name - the name of the column inside the table
inflate - the subref (or method name) to execute to
inflate the value
deflate - the subref (or method name) to execute to
deflate the value
object
roles - names of the methods the object value
must have
classes - names of the classes the object must
inherit from
So far most of this is working great. Of course it really doesn't
do much on its own, but it provides a lot of information that
DBIx::Class (and other classes) can use.
The one thing I'm really not sure about is that I've made it so
each column value is a blessed scalar in its Field class. That
way the data is stored, but I can easily get at the property
values if there are needed. I'm concerned about performance and
memory usage though, although it allows you to do nice things
like this once an object is instantiated:
# get the maximum length for the field
my $length_max = $obj->name_first->length_max;
Which you could use in an TT template like so:
<input type="text" name="contact.name_first" value="[%
contact.name_first %]" maxlength="[% contact.name_first.length_max
%]" />
Scalar objects are pretty light-weight so I'm not sure it would
make much of a difference either way, but I do like the idea of
keeping all the properties close to the column. Using a simple
AUTOLOAD would allow pass-through to underlying object calls as
long as there wasn't a collision with the method names.
(NOTE: I haven't quite gotten to working with objects yet).
Worst case though, as long as the table knew the class names for
each column, it could use the validation methods on its own. I
could handle not making the object a blessed Scalar; but I do
like the syntax it allows you ;)
Here's a few nice side effects I can think of with this system:
Default values:
We could know what the default value of a field should be without
having to create a corresponding record in the database first.
Query optimization:
If someone performs a search, where the string 'foo' is used for
an integer column, we can skip going to the database and just
return no results, since we know there can't be any.
Likewise if a character column has to match a regex like
qr/\A[A-Z]+\z/, and must be beween 5 and 10 characters in length,
and the supplied value is either "BAR" or "foobar" we can also
return immediately.
Also, if a field is NOT NULL, but undef is supplied for the
value, then we can return immediately.
Database Table Creation:
If tables and columns can be described in a rich enough way, then
it should be possible to make CREATE TABLE statements based on
the descriptions of the columns in the perl code. You should be
able to just change the DSN and rerun a script to re-create
everything with a different database engine.
This will require a way of richly describing things at the table
level, but I think it is do-able.
Simplified code:
I'm still getting my head wrapped around DBIx::Class, but I'm
pretty sure that when (if) I refactor DBIx::Class to use this
module some things will become simpler in the code.
Filters:
Once there is a way of handling things at the column level, it
should be fairly easy to make something that pipes the data
through a chain of filters that can remove leading/trailing
whitespace, or properly case words.
Anyway, thats all for now... The main point I want to get across
is that the level of detail this allows should be optional. I
think I can derive the critical information from interface
DBIx::Class provides right now. This just provides a way of
going deeper and specifying the properties each field should
have at a more granular level.
Sorry for the length of this post, I wanted to make sure I
described everything properly.. comments and suggestions are
welcomed and appreciated.
--
Thanks,
Dan
__________________________________________________________________
Dan Kubb Email: dan.kubb at autopilotmarketing.com
Autopilot Marketing Inc. Phone: 1 (604) 820-0212
Web: http://www.autopilotmarketing.com
__________________________________________________________________
More information about the Dbix-class
mailing list