[Dbix-class] DBI + DBD::Oracle LOB/LONG fetches and LongReadLen LongTruncOk

Leandro Hermida leandro.hermida at unibas.ch
Fri Feb 17 13:11:31 CET 2006


Hi everyone,

Just was looking at the internals of DBIx::Class::Storage::DBI and I have a
couple of questions/concerns.  I work with Oracle, MySQL and PostgreSQL
using DBI with their respective DBD::* drivers and an particular feature of
DBD::Oracle is the LongReadLen and LongTruncOk database handle attributes.
Both of these attributes are extremely important to set when
fetching/selecting LONG/LOB data using DBI and the underlying DBD::Oracle
driver.  LongReadLen is an unsigned integer value that needs to be set to at
least as large as the length of the LONG/LOB data you are selecting.
LongTrunkOk is a boolean value which tells DBI what to do if you try to
fetch LONG/LOB data that is longer than LongReadLen (1 - truncate data
silently to LongReadLen length, 0 - raise an error).  In almost all cases
you want LongTruncOk = 0 which I think is the DBI default.  Both attributes
are not implemented or used in DBD::mysql or DBD::Pg so no problems there.

I was trying to find where DBIx::Class sets or modifies the database handle
LongReadLen attribute.  It doesn't seem to be set or modified anywhere (I
grepped the entire DBIx tree) other than in the unrelated
Class::PK::Auto::Oracle package.  Now I think you guys are going to say
"well you just pass your value for LongReadLen in the database handle $attrs
hashref { ..., LongReadLen => xxxxx, ... } when you create your
DBIx::Class::Schema object, a la my $schema =
My::Schema->connect($dsn,$user,$pass,$attrs);".  But for many applications
this will not work because we cannot know ahead of time for our entire
database what the maximum length of any LONG/LOB column data might be.  Or
if we set it to some extremely high value to cover (hopefully) all
possibilties then there will be a big waste of memory since
$dbh->{LongReadLen} directly affects the memory used by resulting statement
handles.  The DBI docs say don't just set it to some huge value and they
advise doing a dynamic calculation of LongReadLen.  So the LongReadLen
attribute needs to get dynamically set on the $dbh before doing the
$dbh->prepare() or $dbh->prepare_cached() call for the SQL SELECT statement
that is fetching any LONG/LOB data.  What I do in my code to dynamically
calculate LongReadLen (based off of the idea presented in the DBI docs) is
the following:

As an example, suppose this is the SQL statement for the data I want to
fetch which contains two LOB columns:

SELECT  t1.col1, t1.col2, t1.lob_col3, t2.lob_col4, t2.col5
FROM    table1 as t1 INNER JOIN table2 as t2 ON t1.col1 = t2.col1
WHERE   t1.col2 = ?
        AND
        t2.col5 > ?

Before running the above SQL, I first need to run the following SQL
statement below to get the LongReadLen.  It finds the maximum length in
bytes of the LONG/LOB data I will return from the SQL above.  It is
important to note that between the fetch data SQL above and the get
LongReadLen SQL below all of the SQL clauses that determine what rows of
data we are getting back (FROM, WHERE, GROUP BY, HAVING) need to be exactly
the same.  Clauses like ORDER BY can be omitted in the get LongReadLen SQL
because they do not determine what rows come back and of course it improves
performance to omit it.

SELECT GREATEST(MAX(LENGTHB(t1.lob_col3)), MAX(LENGTHB(t2.lob_col4)))
FROM   table1 as t1 INNER JOIN table2 as t2 ON t1.col1 = t2.col1
WHERE  t1.col2 = ?
       AND
       t2.col5 > ?

In Oracle the GREATEST funtion will even work with one argument as in the
case if we have only one LONG/LOB column in the fetch data SELECT statement.
It will just return the same length as if it weren't used (i.e. SELECT
MAX(LENGTHB(lob_column)) ).

Now in my Perl code I do things very differently compared to DBIx::Class so
it is hard to just cut and paste what I do.  But when looking at
DBIx::Class::Storage::DBI code I will try to put what needs to go on in the
right places (below).  The sth method needs to accept two extra parameters,
the SQL operation (SELECT/INSERT/UPDATE/DELETE) and an array reference of
the bind values $bind_ref.  From what I can find in the entire DBIx::Class
tree, the sth method is only called from within DBIx::Class::Storage::DBI
inside the _execute method.  So I also show the required code changes within
the _execute method.

sub _execute {
    ... existing code ...
    
    # stringify @bind args operation is moved from just after sth 
    # creation to just before sth creation because now I need to 
    # pass \@bind to sth method
    @bind = map { ref $_ ? ''.$_ : $_ } @bind; # stringify args
    my $sth = $self->sth($sql, $op, \@bind);
    
    ... existing code ...
}

sub sth {
    my ($self, $sql, $op, $bind_ref) = @_;
    if (uc($self->dbh->{Driver}->{Name}) eq 'ORACLE' and uc($op) eq
'SELECT') {
        if ($self->_lobs_in_select_col_list()) {
            $self->dbh->{LongReadLen} =
$self->dbh->selectrow_array($self->_get_lrl_sql(), undef, @$bind_ref) || 0;
        } else {
            $self->dbh->{LongReadLen} = 0;
        }
    }
    # 3 is the if_active parameter which avoids active sth re-use
    return $dbh->prepare_cached($sql, {}, 3);
}    

In the _execute method I move the line of code which stringifies the @bind
array args to just above the line of code which calls the sth method to
create the sth since I now need to pass \@bind to the sth method.  Inside of
the sth method I first use the DBI database handle attribute
$self->dbh->{Driver}->{Name} to check if I have a driver which needs
LongReadLen attention (like Oracle) and I check if I have a SELECT
statement.  Then I check if any of the columns to return in the statement
are of the LONG/LOB type.  If yes, I need to generate the get LongReadLen
SQL statement and set $self->dbh->{LongReadLen} to the value returned from
executing the statement.  If the statement returns undef then I don't have
any data in the LONG/LOB columns so I should set LongReadLen to 0.  If I
don't have any LONG/LOB columns to return in my SQL statement then
LongReadLen should also be set to 0.

Now I am not familiar enough yet with all the internals of DBIx::Class and
all the modules it uses.  So I don't know yet how from within the sth method
in DBIx::Class::Storage::DBI I can find out if I have any LONG/LOB columns
(_lobs_in_select_col_list()) and also generate the get LongReadLen SQL
statement (_get_lrl_sql()).  I just put these dummy methods there to show
where it needs to be done.  I would prefer not to have to do some regex
stuff on the $sql statement passed in order to get the individual components
because I think somewhere in DBIx::Class they were all individual components
before they became $sql.  Where are these components?  ResultSource or
ResultSet?  Hmmm...  If I could get access from within the sth method to
some object which has the individual columns from $sql and the tables they
come from then I could call the "column_info_for" method and find whether a
column is LONG/LOB.  And I think generating the get LongReadLen SQL would
also be straightforward knowing this and from the same or other object the
FROM, WHERE, GROUP BY, HAVING components which were used to generate $sql.

I hope I am going in the right direction with this,

Leandro Hermida






More information about the Dbix-class mailing list