[Catalyst] Output as XML
Mitch Jackson
perimus at gmail.com
Fri May 16 04:36:59 BST 2008
J,
My solution was better suited for the DBIx::Class list, I suppose, but
I posed the question here to see if there was already some sort of
Catalyst solution I had overlooked... a view, for example.
I took a look at that part of the cookbook before, but it seems to
only apply if you're pulling one table row, not a record set.
Thanks for the advice...
/Mitch
On Thu, May 15, 2008 at 9:36 PM, J. Shirley <jshirley at gmail.com> wrote:
> On Thu, May 15, 2008 at 10:49 PM, Mitch Jackson <perimus at gmail.com> wrote:
>> Russell,
>>
>> Thanks for the suggestion. I looked at that; however, it basically
>> does what I'm already doing. The bottleneck wasn't so much TT as the
>> creation of thousands of DBIC objects and sticking them into an
>> array. The same would need to be done with C::V::REST::XML, as it
>> serializes the stash. I needed an approach that generated the XML
>> while walking the query results, rather than caching them all in
>> memory first.
>>
>> Here's what I ended up doing. It needs more work to support joins or
>> complex queries, but the speed difference is insane. Here are
>> benchmark results comparing the old way against the following
>> function when pulling 100, 1,000, 5,000 and 15,000 table rows. As you
>> can see, sending an array of 15,000 DBIC rows to TT took 228 seconds
>> to render :-( This xml() method took 1.65 seconds.
>>
>> $ perl xmlbench.pl
>>                Rate  obj 100  xml 100
>> obj 100      3.53/s       --     -94%
>> xml 100      62.5/s    1669%       --
>>
>>              s/iter  obj 1000  xml 1000
>> obj 1000       3.38        --      -97%
>> xml 1000      0.112     2932%        --
>>
>>              s/iter  obj 5000  xml 5000
>> obj 5000       32.3        --      -98%
>> xml 5000      0.549     5779%        --
>>
>>              s/iter  obj 15000  xml 15000
>> obj 15000       228         --       -99%
>> xml 15000      1.65     13753%         --
>>
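>> (A comparison like that can be set up with the Benchmark module's
>> cmpthese() -- the following is only a rough sketch of such a harness,
>> not the actual xmlbench.pl; the schema class, DSN and resultset name
>> are made up, and the "obj" case stands in for the inflate-into-an-array
>> approach without the TT rendering step.)
>>
>> use strict;
>> use warnings;
>> use Benchmark qw( cmpthese );
>> use MyApp::Schema;    # hypothetical schema class carrying the xml() method
>>
>> my $schema = MyApp::Schema->connect('dbi:SQLite:myapp.db');
>>
>> for my $rows ( 100, 1000, 5000, 15000 ) {
>>     cmpthese( 10, {
>>         # old way: inflate every row into a DBIC object and collect
>>         # them all in an array before touching the data
>>         "obj $rows" => sub {
>>             my @recs = $schema->resultset('Table')
>>                               ->search( {}, { rows => $rows } )->all;
>>             my $out = '';
>>             $out .= $_->id . "\n" for @recs;
>>         },
>>         # new way: the xml() method, walking the cursor directly
>>         "xml $rows" => sub {
>>             my $xml = $schema->xml( 'Table', {}, { rows => $rows } );
>>         },
>>     } );
>> }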
>>
>> ## include in the schema class
>>
>> # Use in place of ->search to return an XML document containing the
>> # records for the query
>> #
>> # my $xml = $schema->xml('table',{field => 'value'},{rows => 20});
>> use Carp qw( croak );
>>
>> sub xml {
>>     my ( $self, $model, @search_params ) = @_;
>>
>>     croak 'xml( $model, @params ) requires a model parameter'
>>         unless defined $model and $model;
>>
>>     my %xml_escape_map = (
>>         '<' => '&lt;',
>>         '>' => '&gt;',
>>         '"' => '&quot;',
>>         '&' => '&amp;',
>>     );
>>
>>     # Prepare the query
>>     my $rs = $self->resultset($model)->search(@search_params);
>>     croak "xml() unable to prepare query" unless defined $rs;
>>
>>     # Begin the XML document
>>     my $xml = '<?xml version="1.0" encoding="utf-8" ?>' . "\n"
>>         . '<total_records>' . $rs->count . "</total_records>\n"
>>         . "<records>\n";
>>
>>     # Add an xml block for each record in the set, reading straight
>>     # from the cursor so no row objects are inflated
>>     my @cols   = $self->resultset($model)->result_source->columns;
>>     my $cursor = $rs->cursor;
>>     while ( my @rec = $cursor->next ) {
>>         $xml .= "<record>\n";
>>         for my $f ( @cols ) {
>>             my $v = shift @rec;
>>             $v = '' unless defined $v;    # guard against NULL columns
>>             $v =~ s/([<>"&])/$xml_escape_map{$1}/g;
>>             $xml .= "  <${f}>$v</${f}>\n";
>>         }
>>         $xml .= "</record>\n";
>>     }
>>
>>     # Terminate the xml
>>     $xml .= "</records>\n";
>>     return $xml;
>> }
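>>
>> (From a controller the result can then go straight into the response
>> body -- an untested sketch; the model name 'DB', the 'Table' source and
>> the action name are placeholders for whatever your app actually uses:)
>>
>> sub export : Local {
>>     my ( $self, $c ) = @_;
>>
>>     # $c->model('DB')->schema gets at the DBIC schema object when using
>>     # Catalyst::Model::DBIC::Schema
>>     my $xml = $c->model('DB')->schema->xml(
>>         'Table',
>>         {},                      # no filter
>>         { order_by => 'id' },    # hypothetical column
>>     );
>>
>>     $c->res->content_type('text/xml');
>>     $c->res->header(
>>         'Content-Disposition' => 'attachment; filename=export.xml' );
>>     $c->res->body($xml);
>> }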
>>
>> /Mitchell K. Jackson
>>
>> On Wed, May 14, 2008 at 10:10 PM, Russell Jurney <rjurney at lucision.com> wrote:
>>> Have you thought about using this:
>>> http://search.cpan.org/~sri/Catalyst-View-REST-XML-0.01/XML.pm
>>> with raw data to achieve the desired speed? I'm not sure where your
>>> bottleneck is, but if TT is the problem, then I assume XML::Simple
>>> would be faster than TT at serializing XML?
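>>>
>>> Something along these lines, say -- just a sketch of XMLout() over
>>> plain hashrefs (the record data here is made up):
>>>
>>> use XML::Simple qw( XMLout );
>>>
>>> # raw data -- plain hashrefs rather than inflated row objects
>>> my @records = (
>>>     { id => 1, field1 => 'foo' },
>>>     { id => 2, field1 => 'bar' },
>>> );
>>>
>>> my $xml = XMLout(
>>>     { record => \@records },
>>>     RootName => 'records',
>>>     NoAttr   => 1,    # emit child elements instead of attributes
>>>     XMLDecl  => '<?xml version="1.0" encoding="utf-8" ?>',
>>> );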
>>> Russell Jurney
>>> rjurney at lucision.com
>>>
>>>
>>> On May 14, 2008, at 10:02 AM, Mitch Jackson wrote:
>>>
>>> Good morning!
>>> I'm about to start working on some DBIC-query-to-XML code, but before
>>> I do, I was wondering if anybody out there has already done this, or
>>> if perhaps my approach is thick-headed.
>>> I'm generating XML from database queries in a Catalyst app. At the
>>> moment, I am doing it a bit like this (simplified for readability):
>>> ---------------------------------------------------------------------
>>> # controller.pm  /controller/action/parm1/parm2/parm3/something.xml
>>> sub action : Local {
>>>     ...
>>>     $c->stash->{records} = [ $c->model('table')->search(
>>>         {}, { rows => 20, page => 2 } ) ];
>>>     $c->res->content_type('text/xml');
>>>     $c->res->header( 'Content-Disposition'
>>>         => "attachment; filename=action_${timestamp}.xml" );
>>>     $c->stash->{template} = 'xml/action.xml';
>>> }
>>> # xml/action.xml
>>> <?xml version="1.0" encoding="utf-8" ?>
>>> <records>
>>> [% FOREACH record IN records -%]
>>> <record id="[% record.id %]">
>>> <field1>[% record.field1 %]</field1>
>>> <field2>[% record.field2 %]</field2>
>>> <field3>[% record.field3 %]</field3>
>>> </record>
>>> [% END # foreach record -%]
>>> </records>
>>> -------------------------------------------------------------------------------
>>> This approach works fine for paged record sets (which get loaded into
>>> an ExtJS AJAX grid). When I use it on a record set of 15k-16k
>>> records, the app goes to 100% CPU and cannot complete the request
>>> even after several minutes. There is a lot of overhead in generating
>>> 16k DBIC objects, dumping them into an array, and then manipulating
>>> them through TT.
>>> This speed problem is unacceptable for my app, especially considering
>>> my users may be dealing with much larger datasets than this.
>>> One solution would be to write something specific to this
>>> implementation as a module that throws away the overhead and
>>> generates the XML file efficiently... but I want something reusable
>>> in the long term from a Catalyst perspective.
>>> I am considering writing some sort of DBIC query to XML code that
>>> would use the DBI cursor directly to bypass object creation and build
>>> the XML while looping through the results.
>>> (http://search.cpan.org/~ash/DBIx-Class-0.08010/lib/DBIx/Class/Manual/Cookbook.pod#Get_raw_data_for_blindingly_fast_results)
>>> An interface like this:
>>> my $xml = $c->model('table')->xml( \%filter_parms, \%dbic_opts )
>>> That would generate output like this:
>>> <?xml version="1.0" encoding="utf-8" ?>
>>> <records>
>>> <record id="42">
>>> <field1>don't panic</field1>
>>> <field2>vogon poetry</field2>
>>> <field3>see if i don't</field3>
>>> </record>
>>> ...
>>> </records>
>>> The questions I pose are these:
>>> - Is there something already out there that does what I need?
>>> - Is there a big problem with my approach?
>>> - Would anybody else be interested in this if I get it working?
>>>
>>> Kind Regards,
>>> /Mitchell K. Jackson
>
>
> You probably would do better asking on the DBIx::Class mailing list.
> The other thing is that you don't need to inflate into DBIC objects,
> as the DBIC Cookbook states:
>
> http://search.cpan.org/~ash/DBIx-Class-0.08010/lib/DBIx/Class/Manual/Cookbook.pod#Skip_object_creation_for_faster_results
>
> That will handle the serialization out to just a hash.
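>
> Roughly like this (untested; the resultset name is made up) -- every
> ->next then hands back a plain hashref instead of a row object:
>
> my $rs = $schema->resultset('Table')->search( {}, { rows => 15000 } );
> $rs->result_class('DBIx::Class::ResultClass::HashRefInflator');
>
> while ( my $row = $rs->next ) {
>     # $row is a bare hashref of column => value pairs
>     print "<field1>$row->{field1}</field1>\n";
> }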
>
> -J
>
> _______________________________________________
> List: Catalyst at lists.scsys.co.uk
> Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
> Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
> Dev site: http://dev.catalyst.perl.org/
>