[Catalyst] Output as XML

Mitch Jackson perimus at gmail.com
Thu May 15 14:49:28 BST 2008


Russell,

Thanks for the suggestion.  I looked at that, however it basically
does what I'm already doing.  The bottleneck wasn't so much TT, but
the creation of thousands of DBIC objects and sticking them into an
array.  The same would need to be done with C::V::Rest::XML, as it
serializes the stash.  I needed an approach that generated the XML
while walking the query results, rather than caching them all into
memory first.

Here's what I ended up doing.  It needs more work to support joins or
complex queries, but the speed difference is insane.  Here's benchmark
results between pulling 100, 1000, 5000 and 15000 table rows using the
old way and the following function.  As you can see, sending a DBIC
array of 15,000 rows to TT took 228 seconds to render :-(  This xml()
method took 1.65 seconds.

$ perl xmlbench.pl
          Rate obj 100 xml 100
obj 100 3.53/s      --    -94%
xml 100 62.5/s   1669%      --

         s/iter obj 1000 xml 1000
obj 1000   3.38       --     -97%
xml 1000  0.112    2932%       --

         s/iter obj 5000 xml 5000
obj 5000   32.3       --     -98%
xml 5000  0.549    5779%       --

          s/iter obj 15000 xml 15000
obj 15000    228        --      -99%
xml 15000   1.65    13753%


## include in the schema class

# Use in place of ->search to return an XML document containing the
# records for the query
#
# my $xml = $schema->xml('table',{field => 'value'},{rows => 20});
sub xml {
    my ( $self, $model, @search_params ) = @_;

    croak 'xml( $model, @params ) requires a model parameter'
        unless defined $model and $model;

    my %xml_escape_map = (
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        '&' => '&amp;',
    );

    # Prepare the query
    my $rs  = $self->resultset($model)->search(@search_params);
    croak "xml() unable to prepare query" unless defined $rs;

    # Begin the XML document
    my $xml = '<?xml versiono="1.0" encoding="utf-8" ?>'."\n"
            . "<total_records>$rs</total_records>\n"
            . '<records>'."\n";

    # Add an xml block for each record in the set
    my @cols = $self->resultset($model)->result_source->columns;
    my $cursor = $rs->cursor;
    while ( my @rec = $cursor->next ) {
        $xml .= '<record>'."\n";
        for my $f ( @cols ) {
            my $v = shift @rec;
            $v =~ s/([\<\>\"\&])/$xml_escape_map{$1}/g;
            $xml .= "    <${f}>$v</${f}>\n";
        }
        $xml .= '</record>'."\n";
    }

    # Terminate the xml
    $xml .= '</records>'."\n";
    return $xml;
}

/Mitchell K. Jackson

On Wed, May 14, 2008 at 10:10 PM, Russell Jurney <rjurney at lucision.com> wrote:
> Have you thought about using
> this: http://search.cpan.org/~sri/Catalyst-View-REST-XML-0.01/XML.pm with
> raw data to achieve the desired speed?  Not sure where your bottleneck is,
> but if TT is a problem then I assume XML::Simple is faster than TT to
> serialize XML?
> Russell Jurney
> rjurney at lucision.com
>
>
> On May 14, 2008, at 10:02 AM, Mitch Jackson wrote:
>
> Good morning!
> I'm about to start working on some DBIC query to XML code, but before
> I do I was wondering if anybody out there has already done this, or if
> perhaps my approach is thick-headed.
> I'm generating XML from database queries in a catalyst app.  At the
> moment, I am doing it a bit like this (simplified for readability):
> ---------------------------------------------------------------------
> # controller.pm  /controller/action/parm1/parm2/parm3/something.xml
> sub action : Local {
>   ...
>   $c->stash->{records} = [ $c->model('table')->search( {}, { rows =>
> 20, page 2 } ) ];
>   $c->res->content_type('text/xml');
>   $c-.res->header('Content-disposition' => 'attachment;
> filename=action_${timestamp}.xml');
>   $c->res->template('xml/action.xml');
> }
> # xml/action.xml
> <?xml version="1.0" encoding="utf-8" ?>
> <records>
> [% FOREACH record IN records -%]
> <record id="[% record.id %]">
>   <field1>[% record.field1 %]</field1>
>   <field2>[% record.field2 %]</field2>
>   <field3>[% record.field3 %]</field3>
> </record>
> [% END # foreach record -%]
> </records>
> -------------------------------------------------------------------------------
> This approach works fine for paged record sets ( that get loaded into
> an ExtJS ajax grid ).  When I use this on a record set of 15k-16k
> records, the app goes to 100% CPU and cannot complete the request
> after several minutes.  There is a lot of overhead to generate 16k
> DBIC objects, dump them in an array, and then manipulate them through
> TT.
> This speed problem is unacceptable for my app, especially considering
> my users may be dealing with much larger datasets than this.
> One solution would be to write something proprietary to this
> implementation as a module that would throw away the overhead bloat
> and generate the XML file efficiently... but I want something reusable
> in the long term from a catalyst perespective.
> I am considering writing some sort of DBIC query to XML code that
> would use the DBI cursor directly to bypass object creation and build
> the XML while looping through the results.
> (http://search.cpan.org/~ash/DBIx-Class-0.08010/lib/DBIx/Class/Manual/Cookbook.pod#Get_raw_data_for_blindingly_fast_results)
> An interface like this:
> my $xml = $c->model('table')->xml( \%filter_parms, \%dbic_opts )
> That would generate output like this
> <?xml version="1.0" encoding="utf-8" ?>
> <records>
> <record id="42">
>   <field1>don't panic</field1>
>   <field2>vogon poetry</field2>
>   <field3>see if i don't</field3>
> </record>
> ...
> </records>
> The questions I pose are this:
> - Is there something already out there that does what I need?
> - Is there a big problem with my approach?
> - Would anybody else be interested in this if I get it working?
>
> Kind Regards,
> /Mitchell K. Jackson
> _______________________________________________
> List: Catalyst at lists.scsys.co.uk
> Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
> Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
> Dev site: http://dev.catalyst.perl.org/
>
>
> _______________________________________________
> List: Catalyst at lists.scsys.co.uk
> Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
> Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
> Dev site: http://dev.catalyst.perl.org/
>
>



More information about the Catalyst mailing list