[Catalyst-commits] r14486 - trunk/examples/CatalystAdvent/root/2013

jnapiorkowski at dev.catalyst.perl.org jnapiorkowski at dev.catalyst.perl.org
Thu Dec 12 00:30:34 GMT 2013


Author: jnapiorkowski
Date: 2013-12-12 00:30:34 +0000 (Thu, 12 Dec 2013)
New Revision: 14486

Added:
   trunk/examples/CatalystAdvent/root/2013/24.pod
   trunk/examples/CatalystAdvent/root/2013/25.pod
Log:
two more articles

Added: trunk/examples/CatalystAdvent/root/2013/24.pod
===================================================================
--- trunk/examples/CatalystAdvent/root/2013/24.pod	                        (rev 0)
+++ trunk/examples/CatalystAdvent/root/2013/24.pod	2013-12-12 00:30:34 UTC (rev 14486)
@@ -0,0 +1,470 @@
+=head1 Using SOLR in a Catalyst Model with WebService::Solr
+
+=head1 OVERVIEW
+
+Using L<Solr|http://lucene.apache.org/solr/>, a Search Server from L<Apache's Lucene Project|http://lucene.apache.org> as a B<Catalyst Model>.
+
+=head1 INTRODUCTION
+
+Compared to conventional database search (and the full-text query extensions found in most modern SQL implementations), a search server provides better performance and richer search features. Since Solr is writeable as well as readable, it can also be used as a NoSQL datastore.
+
+=head2 Solr Basics
+
+Solr is a Java servlet implementing a web-based interface to Lucene, which supplies the underlying indexing and search technology along with spellchecking, hit highlighting, and advanced analysis/tokenization capabilities. Solr answers HTTP requests on a designated port, 8983 by default. Operations are carried out by GET requests and by POSTs of XML or JSON; data is returned in JSON by default, but XML, CSV, and a few other formats are also supported. Large data sets are usually imported through the Data Import Handler (DIH), which can, among other methods, load a CSV file or query a SQL database, rather than POSTing everything through individual web requests. The server and its collections (the equivalent of databases) are configured through XML files. Solr does not include a crawling capability; the L<Nutch|http://nutch.apache.org/> utility or custom scripts are used in conjunction with Solr for crawling.
+
+Requests to Solr can be made through a web browser or any utility that speaks HTTP, such as wget, curl, or an LWP script. Each request is independent of all others: there is no session or handshake; Solr receives a request and responds to it. Solr also provides a web interface, and anyone who can access that interface can access all of its features. This has obvious architectural implications for securing Solr: security must be implemented at the network level, or by placing Solr behind another webserver that enforces it. Normally Solr is not directly exposed to the public internet.
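For example, a select request for every document and the (abridged) JSON that comes back look roughly like this. The URL path assumes the example server's default collection1; the C<wt> parameter picks the response format:

```
GET http://localhost:8983/solr/collection1/select?q=*:*&rows=2&wt=json

{
  "responseHeader": { "status": 0, "QTime": 1 },
  "response": { "numFound": 32, "start": 0, "docs": [ ... ] }
}
```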
+
+=head2 Perl Modules for Solr
+
+There are a number of Perl modules available for Solr; the two that appear the most viable are Apache::Solr and WebService::Solr. Unfortunately, all of the modules have problems, both bugs and unimplemented features. Initially I had the best luck with Apache::Solr, but ran into some limitations there. After reading through the source code for several of the modules, I decided to work with WebService::Solr. At present Apache::Solr looks to be the more actively developed of the two, so in the future it could become the better choice.
+
+A lot of Perl's Solr users prefer to implement their own agent/model. Since the Solr interface is based on HTTP requests, JSON, and XML, this is not so much hard as potentially time-consuming.
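As a sketch of how little such a hand-rolled agent needs, here is a core-modules-only select helper. The helper name, parameter handling, and the C<SOLR_LIVE> guard are my own illustration, not an established API:

```perl
use strict;
use warnings;
use HTTP::Tiny;                    # core since Perl 5.14
use JSON::PP qw(decode_json);      # core since Perl 5.14

# Build a Solr select URL from key/value pairs, percent-encoding the values.
sub solr_select_url {
    my ( $base, @pairs ) = @_;
    my @kv;
    while ( my ( $k, $v ) = splice @pairs, 0, 2 ) {
        $v =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord $1)/ge;
        push @kv, "$k=$v";
    }
    return "$base/select?" . join '&', @kv, 'wt=json';
}

my $url = solr_select_url( 'http://localhost:8983/solr/collection1',
    q => 'hard drive', rows => 10 );
# $url is now .../collection1/select?q=hard%20drive&rows=10&wt=json

# Only contact the server when one is actually running.
if ( $ENV{SOLR_LIVE} ) {
    my $res  = HTTP::Tiny->new->get($url);
    my $data = decode_json( $res->{content} );
    print "Found $data->{response}{numFound} documents\n";
}
```

Everything beyond this (fq, sorting, and so on) is just more query parameters, which is why so many people are comfortable rolling their own.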
+
+=head1 Preparing the Environment
+
+You will need to have a Catalyst development environment ready; in addition you should install WebService::Solr and Catalyst::Model::WebService::Solr. Day One of this Advent Calendar discussed the enhancements in Catalyst 5.90050; for our purposes the improved UTF-8 handling is a critical feature and resolves a showstopper bug. When I first looked at WebService::Solr, "Wide character in print" errors really stymied my first sessions. If updating Catalyst isn't an option for you, the workaround I was using is detailed in rt.cpan.org bug 89288.
+
+You will also need to install a JVM such as OpenJDK, and then download a copy of Solr from the L<Solr Download Page|http://lucene.apache.org/solr/downloads.html>. Once it is downloaded and extracted you will need to load the example data, so open up two terminals. To save space I'll refer you to the L<Solr tutorial|http://lucene.apache.org/solr/tutorial.html>; to speed things up, use post.sh in the exampledocs folder to populate the test data, then skip ahead to querying to confirm that you have loaded the 32 documents. For the rest of the article it is assumed you have Solr running locally with the test data loaded, answering on the default port 8983.
+
+ Terminal 1
+ cd ..path_to../example
+ java -jar start.jar
+ 
+ Terminal 2
+ cd ..path_to../example/exampledocs
+ ./post.sh *.xml
+
+=head1 Create A Project with a Template Toolkit View
+
+ catalyst.pl SolrDemo
+ cd SolrDemo
+ ./script/solrdemo_create.pl view HTML TT
+
+Edit SolrDemo.conf
+
+ solrserver         http://localhost:8983/solr/collection1
+
+Edit the config section of lib/SolrDemo.pm and add B<encoding =E<gt> 'utf8',> to prevent wide character errors.
+
+ __PACKAGE__->config(
+    name => 'SolrDemo',
+    # Disable deprecated behavior needed by old applications
+    disable_component_resolution_regex_fallback => 1,
+    enable_catalyst_header => 1, # Send X-Catalyst header    
+    encoding => 'utf8', # prevents wide character explosions
+    'View::HTML' => {  #Set the location for TT files        
+         INCLUDE_PATH => [ SolrDemo->path_to( 'root' ), ], },    
+ );
+
+Create some additional files we'll need
+
+ touch lib/SolrDemo/Model/Solr.pm
+ touch lib/SolrDemo/Model/SolrModelSolr.pm
+ touch lib/SolrDemo/Controller/Thin.pm
+ touch lib/SolrDemo/Controller/Fat.pm
+ touch root/results.tt
+ touch t/model_solr.t
+
+In addition I recommend creating a reset script in the example/exampledocs folder of the Solr distribution. This script will delete your collection and replace it with the sample docs when you are testing CRUD operations. 
+
+ wget "http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>" -O /dev/null
+ wget "http://localhost:8983/solr/update?stream.body=<commit/>" -O /dev/null 
+ ./post.sh *.xml
+
+=head1 A Thin Model with Catalyst::Model::WebService::Solr
+
+There are only two options to configure: where to connect, and autocommit. By default Solr may not immediately reflect changes to data; the autocommit flag tells WebService::Solr to follow every delete/add operation with a commit request. You might want to turn it off when writing batches of data, committing once at the end of the batch for performance.
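If you were doing such a batch load with WebService::Solr directly, the shape would be something like this (a sketch only; C<@documents> is assumed to hold WebService::Solr::Document objects):

```perl
my $solr = WebService::Solr->new(
    'http://localhost:8983/solr',
    { autocommit => 0 },           # don't commit after every add
);
$solr->add($_) for @documents;     # many adds, no per-add commit
$solr->commit;                     # make the whole batch visible at once
```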
+
+File: lib/SolrDemo/Model/SolrModelSolr.pm
+
+    package SolrDemo::Model::SolrModelSolr;
+    use namespace::autoclean;
+    use Catalyst::Model::WebService::Solr;
+    use Moose;
+
+    extends 'Catalyst::Model::WebService::Solr';
+    __PACKAGE__->config(
+        server  => SolrDemo->config->{solrserver},
+        options => { autocommit => 1, }
+    );
+    1;
+
+File: lib/SolrDemo/Controller/Thin.pm 
+
+    package SolrDemo::Controller::Thin;
+    use namespace::autoclean;
+    use WebService::Solr::Query;
+    use Moose;
+
+    BEGIN { extends 'Catalyst::Controller' }
+
+    sub response2info {
+        my $response = shift;
+        my $raw      = $response->raw_response();
+        my $pre      = '';
+        $pre .= "\n_msg\n" . $raw->{'_msg'};
+        $pre .= "\n_headers";
+        my %hheaders = %{ $raw->{'_headers'} };
+        for ( keys %hheaders ) { $pre .= "\n    $_ = $hheaders{$_}"; }
+        $pre .= "\n_request";
+        my %rreq = %{ $raw->{'_request'} };
+        for ( keys %rreq ) { $pre .= "\n    $_ = $rreq{$_}"; }
+        $pre .= "\n_content</pre>\n" . $raw->{'_content'} . '<pre>';
+        $pre .= "\n_rc\n" . $raw->{'_rc'};
+        $pre .= "\n_protocol\n" . $raw->{'_protocol'};
+        $pre .= "\nRequest Status (via method)\n" . $response->solr_status();
+        my @docs = $response->docs;
+        $pre .= "\nDocument Count: " . scalar(@docs);
+        return $pre;
+    }
+
+    sub dump : Local : Args(0) {
+        my ( $self, $c ) = @_;
+        my $response =
+        $c->model('SolrModelSolr')
+        ->search( WebService::Solr::Query->new( { '*' => \'*' } ),
+            { rows => 10000 } );
+        my @docs = $response->docs;
+        $c->log->info( "\nDocument Count: " . scalar(@docs) );
+        my $pre = &response2info($response);
+        $c->response->body("<pre>$pre </pre>");
+    }
+
+    sub example : Local : Args(0) {
+        my ( $self, $c ) = @_;
+        my $response =
+        $c->model('SolrModelSolr')
+        ->search( WebService::Solr::Query->new( { text => ['hard drive'] } ),
+            { rows => 10000 } );
+        my $pre = &response2info($response);
+        $c->response->body("<pre>$pre </pre>");
+    }
+
+    __PACKAGE__->meta->make_immutable;
+
+    1;
+
+=head2 About the thin controller
+
+=head3 response2info
+
+This is Viewish code shared by two of the methods that puts the raw elements of the response into a string. 
+
+=head3 dump
+
+Executes a query for all records in the Solr database. WebService::Solr::Query generates the query: you pass it a hashref of the Solr fields you want and the values for those fields, and the backslash tells it to pass the second * through as a literal string. The second argument is a hashref of options; here we override Solr's default of returning 10 rows by specifying an arbitrarily high value.
+
+=head3 example
+
+This example query is hard-coded to find items matching the phrase 'hard drive', which, we see from the spew, gets translated as 'hard+drive'. Here we specified the field text (a catch-all field defined to hold everything searchable) and passed an array ref of values. If you copy and rename the method and then change the field list to B<['hard drive','maxtor']>, you will find that you still get the same 2 records; this is because of Solr's matching behaviour. If you want to filter for only Maxtor hard drives you'll need to use a filter query (fq), which is specified in the options.
+
+Add the following method to Thin.pm
+
+    sub maxtor :Local :Args(0) {
+        my ( $self, $c ) = @_;
+        my $maxq = WebService::Solr::Query->new( { manu => ['maxtor'] } );
+        my $response =
+        $c->model('SolrModelSolr')
+        ->search( WebService::Solr::Query->new( { text => ['hard drive'] } ),
+            { rows => 10000, fq => $maxq } );
+        my $pre = &response2info($response);
+        $c->response->body("<pre>$pre </pre>");
+    } 
+
+=head3 a real search
+
+Add this method to: Thin.pm
+
+    sub select : Local : Args(2) {
+        my ( $self, $c, $fieldname, $fieldvalue ) = @_;
+        my $response =
+        $c->model('SolrModelSolr')
+        ->search( WebService::Solr::Query->new( { $fieldname => [$fieldvalue] } ),
+            { rows => 10000 } );
+        my @docs = $response->docs;
+        $c->stash(
+            template  => 'results.tt',
+            field     => $fieldname,
+            value     => $fieldvalue,
+            docs      => \@docs,
+        );
+    }
+
+File: root/results.tt
+
+    <h1>Catalyst SolrDemo</h1>
+    <h2>Docs in this query [% docs.list.size %]</h2>
+    <h3>Field [% field %] value [%value %]</h3>
+    <table>
+    [% FOREACH doc IN docs %]
+    <tr><th>descriptor</th><th>field value</th></tr>
+    [% FOREACH fieldname IN doc.field_names.sort %]
+    <tr><td>[% fieldname %]</td><td>[% doc.value_for( fieldname ) %]</td></tr>
+    [% END %]
+    [% END %]
+    </table>
+
+Try some queries: 
+
+=over
+
+=item *
+http://localhost:3000/thin/select/text/ipod  
+
+=item *
+http://localhost:3000/thin/select/features/cache 
+
+=item *
+http://localhost:3000/thin/select/manu/maxtor
+
+=back
+
+The model returns a WebService::Solr::Response object; we use its docs method to extract an array of WebService::Solr::Document objects, which are then passed by reference to the view. The view iterates over the array, uses the B<field_names> method to get the list of fields in each document (not all documents in the test data have the same fields), and retrieves the individual values with the B<value_for> method.
+
+=head1 Moving to a Fat Model
+
+My Solr search queries typically require a lot of supporting code, which is easier to test in a model than in a controller and is generally more appropriate to the model anyway. Unlike a DBI-based model, which maintains a connection, each request to Solr is completely independent of all others and no connection is maintained between them, so instantiating a new WebService::Solr object is trivial. Additionally, if you work with multiple collections, you need to create a separate object for each one.
+
+lib/SolrDemo/Model/Solr.pm
+
+    package SolrDemo::Model::Solr;
+
+    use WebService::Solr;
+    use WebService::Solr::Query;
+    use WebService::Solr::Field ;
+    use WebService::Solr::Document ;
+    use namespace::autoclean;
+
+    use parent 'Catalyst::Model';
+
+    our $SOLR = WebService::Solr->new( SolrDemo->config->{solrserver} );
+
+    sub _GeoFilter {
+        my ( $location, $sfield, $distance ) = @_;
+        return qq/\{!geofilt pt=$location sfield=$sfield d=$distance\}/;
+    }
+
+    sub List {
+        my $self      = shift;
+        my $params    = shift;
+        my $mainquery = WebService::Solr::Query->new($params);
+        my %options   = ( rows => 100 );
+        my $response  = $SOLR->search( $mainquery, \%options );
+        return $response->docs;
+    }
+
+    sub Kimmel {
+        my $self         = shift;
+        my $distance     = shift;
+        my $kimmelcenter = '39.95,-75.16';
+        my $mainquery    = WebService::Solr::Query->new( { '*' => \'*' } );
+        my $geofilt      = &_GeoFilter( $kimmelcenter, 'store', $distance );
+        my %options      = ( rows => 100, fq => $geofilt, sort => 'price asc' );
+        my $response     = $SOLR->search( $mainquery, \%options );
+        return $response->docs;
+    }
+
+    1;
+
+t/model_solr.t
+
+    use Test::More;
+    BEGIN { use_ok 'SolrDemo' }
+
+    my $C = SolrDemo->new ;
+    my @docs = $C->model('Solr')->List( { cat => 'electronics', manu => 'corsair' } );
+    is( scalar(@docs), 2, 'We expect 2 docs' );
+
+    my $carnegiehall = '40.76,-73.98' ;
+    my $geofilt = &SolrDemo::Model::Solr::_GeoFilter( $carnegiehall, 'store', 400 ) ;
+    is( $geofilt, '{!geofilt pt=40.76,-73.98 sfield=store d=400}', 
+        'Test geofilter construction using carnegie hall as a testcase');
+    my @docs2 = $C->model('Solr')->Kimmel( 1600 ) ;
+    is( scalar(@docs2), 3, 'There are 3 items within 1600 km of the Kimmel Center' );
+
+    done_testing();
+
+lib/SolrDemo/Controller/Fat.pm
+
+    package SolrDemo::Controller::Fat;
+    use namespace::autoclean;
+    use Moose;
+
+    BEGIN { extends 'Catalyst::Controller' }
+
+    sub select : Local : Args(2) {
+        my ( $self, $c, $fieldname, $fieldvalue ) = @_;
+        my @docs = $c->model('Solr')->List( { $fieldname => $fieldvalue } );
+        $c->stash(
+            template => 'results.tt',
+            field    => $fieldname,
+            value    => $fieldvalue,
+            docs     => \@docs,
+        );
+    }
+
+    sub nearkimmel : Local : Args(0) {
+        my ( $self, $c ) = @_;
+        my $distance = 500;
+        my @docs     = $c->model('Solr')->Kimmel(500);
+        $c->stash(
+            template => 'results.tt',
+            field    => 'Distance from Kimmel Center in Philadelphia',
+            value    => $distance,
+            docs     => \@docs,
+        );
+    }
+
+    __PACKAGE__->meta->make_immutable;
+
+    1;
+
+The Model contains three methods. The private method generates a geofilter string, because that isn't currently implemented in WebService::Solr (I've proposed it for a future release). Of the other two, the first replicates the select we used in the thin model, and the second finds things near the Kimmel Center in Philadelphia as an example of geospatial search.
+
+A couple of times now we've seen B<WebService::Solr::Query-E<gt>new( { '*' =E<gt> \'*' } )>. If you go back to the first dump method you can see it ended up as B<(*%3A*)>; B<%3A> translates back to 'B<:>'. You could use the string B<'(*:*)'> instead of generating the value with Query; modify the Kimmel method to demonstrate this yourself. In this special case we wanted to preserve the value '*' as a literal, not risk having it converted to B<%2A>, which we accomplished by passing it as a reference. For this case it might be clearer to just use the string directly in your query, but generally I'd rather use WebService::Solr::Query and have it worry about the details of Solr grammar. WebService::Solr::Query is capable of generating complex queries with numerous options beyond the scope of this introduction.
+
+I also added a third option, sort, to the %options of the Kimmel method. The sort option takes a string value: a field name followed by either 'asc' or 'desc'.
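Solr will also accept several comma-separated sort clauses; an options hash like this one (field names from the sample data) would sort on price and break ties on name:

```perl
my %options = (
    rows => 100,
    fq   => $geofilt,
    sort => 'price asc, name desc',   # secondary clause breaks ties
);
```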
+
+You should now be able to run the tests; if they pass, then when you run the test server /fat/select/?/? will work as it did in the thin controller, and /fat/nearkimmel will show you the results of a geospatial filter.
+
+=head2 Adding, Updating, and Deleting
+
+We're now going to add a record, modify it, and then delete it. This is all going to be done in tests. 
+
+Two methods get added to the model, Delete and Add (which also serves as the update method). Both normally return B<1>, the value returned by the underlying WebService::Solr method, which in turn is determined by Solr's response; this is not necessarily an indicator that what you intended to happen is what happened.
+
+=head2 Add
+
+Add and Update are the same method. When a document is added with the same id as an existing document, Solr replaces the original record with the new one. This means whenever we update a record we need to send the entire new version. 
+
+The Add method takes a hashref of fieldnames and values which it uses to create a WebService::Solr::Document. There is an optional parameter to the WebService::Solr->add method for setting boost values on fields; this is not implemented in our Model. A WebService::Solr::Document can be created in three ways: it can be returned by a query to Solr, it can be constructed from arrays of WebService::Solr::Field objects, or we can pass an array of array references to the constructor.
+
+Here is an example of a data structure to create a WebService::Solr::Document. 
+
+ my @fields = (
+    [ id     => 'B0019032-02',                   { boost => 1.6 } ],
+    [ artist => 'Philadelphia Orchestra',        { boost => '7.1' } ],
+    [ format => 'CD Audio'                                         ],
+    [ title  => 'Yannick Conducts Stravinsky: The Rite of Spring'  ],
+ );
+
+=head2 Delete
+
+Delete takes a hashref that is used to construct a query. If we use { id => $VALUE } we will delete one record. A careless query could delete a lot of records; as the last test shows, { cat => 'electronics' } will delete half of them! After you run the last test you will need to reset your data using the script you created earlier for that purpose. The sprintf statement is in the method because when the output of WebService::Solr::Query is fed to the delete method, it may be received as an object rather than a string.
+
+Add to Solr.pm Model
+
+    sub Add {
+        my $self         = shift;
+        my $params       = shift;
+        my @fields_array = ();
+        foreach my $k ( keys %{$params} ) {
+            my @fields = ( $k, $params->{$k} );
+            push @fields_array, \@fields;
+        }
+        my $doc = WebService::Solr::Document->new(@fields_array)
+            or die "can't create document";
+        my $result = $SOLR->add($doc);
+        return $result;
+    }
+
+    sub Delete {
+        my $self      = shift;
+        my $params    = shift;
+        # If the query isn't forcibly stringified an exception may be thrown.
+        my $result = $SOLR->delete_by_query( 
+            sprintf( "%s", WebService::Solr::Query->new($params) ) ) ;
+        return $result ;
+    }
+
+Add to t/model_solr.t immediately above done_testing
+
+    note( "\n* CRUD Tests *\n" );
+    my $added1 = $C->model('Solr')->Add(
+        {
+            name     => 'Zune Player',
+            manu     => 'Microsoft',
+            features => 'Truly Obsolete',
+            price    => '300',
+            store    => '40.76,-73.98',
+            cat      => 'electronics',
+            id       => 'MSZUNE'
+        }
+    );
+    is( $added1, 1,
+        'successfully added a zune located at Carnegie Hall to inventory' );
+
+    my @docs3 = $C->model('Solr')->Kimmel(1600);
+
+    is( scalar(@docs3), 4,
+        'There are now 4 items within 1600 km of the Kimmel Center' );
+
+    # a subroutine to list a doc.
+    sub listdoc {
+        my $d      = shift;
+        my $string = '';
+        $string .=
+            ' ID: '
+        . $d->value_for('id') . ' -- '
+        . $d->value_for('name')
+        . ' Manu: '
+        . $d->value_for('manu') . "\n\t"
+        . $d->value_for('features');
+        return $string;
+    }
+
+    note( '* Display the 4 items within 1600km showing added record *');
+    for (@docs3) { note( &listdoc($_) ) }
+
+    my $updated1 = $C->model('Solr')->Add(
+        {
+            name     => 'Zune Player',
+            manu     => 'Microsoft',
+            features => 'Half price Closeout Sale on our last MS ZUNE! Save $150',
+            price    => '150',
+            store    => '40.76,-73.98',
+            cat      => 'electronics',
+            id       => 'MSZUNE'
+        }
+    );
+
+    is( $updated1, 1, 'Updated the Zune for Closeout!' );
+
+    my @zunedocs = $C->model('Solr')->List( { id => 'MSZUNE' } );
+    my $zunedoc = $zunedocs[0];
+    is( $zunedoc->value_for('price'), 150, 'Prove that zune now costs $150' );
+
+    note( '* Display the Documents showing modified record for ZUNE. *');
+    @docs3 = $C->model('Solr')->Kimmel(1600);
+    for (@docs3) { note( &listdoc($_) ) }
+
+    my $delete1 = $C->model('Solr')->Delete( { id => 'MSZUNE' } );
+    is( $delete1, 1, 'delete returned success' );
+    @zunedocs = $C->model('Solr')->List( { id => 'MSZUNE' } );
+    is( scalar( @zunedocs ) , 0, 'Confirm it is deleted' );
+
+    # This test deletes data, after running it you must reset your data
+    # Comment it or skip it to avoid.
+    my @before = $C->model('Solr')->List( { '*' => \'*' } );
+    is( scalar(@before), 32, "There were 32 documents before the delete" );
+    my $delete2 = $C->model('Solr')->Delete( { cat => 'electronics' } );
+    my @after = $C->model('Solr')->List( { '*' => \'*' } );
+    is( scalar(@after), 18, "There are now 18" );
+
+=head1 For More Information
+
+After following this how-to you'll want to read the WebService::Solr documentation. It is organized by sub-module, so you'll have to read all of the pieces, plus the tests from the distribution, which are where you'll find code examples. You'll also want to read the Solr documentation; there is a lot more on the web about Solr itself than about the module.
+
+If there are any errata to this article they will be posted on my L<technical blog|http://techinfo.brainbuz.org/?p=368>. You can download the entire contents and source for this article as well L<http://www.brainbuz.org/images/solrcattut.tgz>.
+
+=head1 Summary
+
+In this article we created both Thin and Fat Models for WebService::Solr. For the Fat Model we also Created, Updated, and Destroyed data, and wrote tests for everything we did.
+
+=head1 Author
+
+John Karr <brainbuz at brainbuz.org> brainbuz
+
+Thanks to Andy Lester for taking time to review this article and make a few helpful recommendations.
+
+=cut

Added: trunk/examples/CatalystAdvent/root/2013/25.pod
===================================================================
--- trunk/examples/CatalystAdvent/root/2013/25.pod	                        (rev 0)
+++ trunk/examples/CatalystAdvent/root/2013/25.pod	2013-12-12 00:30:34 UTC (rev 14486)
@@ -0,0 +1,177 @@
+=encoding utf8
+
+
+=head1 Spoilerific: a (semi-)practical example project with Catalyst
+
+This article introduces I<Spoilerific>, a simple but complete Catalyst-based web application I built to fill a specific (if somewhat frivolous) need in early 2013. I continue to host a live instance of it on my own webserver.
+
+When Perl Advent season came around, the topic arose on Catalyst's IRC channel that the framework could benefit from some more real-world example projects. It happened that I had already shared Spoilerific's source on GitHub, but hadn't really written much about it yet. This article, then, offers a brief tour through the codebase, and a short description of the process I used to build and deploy the project.
+
+
+=head1 What is Spoilerific?
+
+In the spring of 2013 I released L<Spoilerific|http://spoilerific.jmac.org>, a website that helps Twitter users discuss stuff they like in public without spoiling details for their friends. You can L<read the full apologia on the site's "about" page|http://spoilerific.jmac.org/about>, but the idea in essence involves my desire to take the trivial, two-way L<ROT13 encryption scheme|http://en.wikipedia.org/wiki/Rot13>, once ubiquitous in bygone Usenet conversations about books and movies, and reintroduce it to a modern social network.
+
+If you, say, want to tweet "I can't believe the butler did it!" about the latest I<Downton Abbey>, Spoilerific makes it easy to create a Twitter post reading "I can't believe I<gur ohgyre qvq vg!> #downtonabbey", ending with a URL that allows your friends to decipher what you've written -- after clicking past a I<spoilers ahead!> warning screen. Said friends can then use the resulting webpage to add their own thoughts, which will in turn post to Twitter, once again safely veiled by ROT13 and offering a linkback for the curious.
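The veiling itself is the easy part: ROT13 is a one-liner with Perl's C<tr> operator (the sub name here is mine, not from the Spoilerific source):

```perl
use strict;
use warnings;

# ROT13 rotates each ASCII letter 13 places; applying it twice round-trips.
sub rot13 {
    my $text = shift;
    ( my $veiled = $text ) =~ tr/A-Za-z/N-ZA-Mn-za-m/;
    return $veiled;
}

print rot13("I can't believe the butler did it!"), "\n";
# prints: V pna'g oryvrir gur ohgyre qvq vg!
```

Only ASCII letters are transformed; punctuation and digits pass through untouched.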
+
+I would describe Spoilerific's success as let's-say-I<modest>. As a strictly-for-fun project, it didn't really drive me to launch a marketing campaign larger than saying "hey y'all lookit" on L<my own Twitter feed|https://twitter.com/JmacDotOrg> and an IRC channel or two. It got L<a curious little writeup in Kill Screen|http://killscreendaily.com/articles/news/spoilerific-draft/>, and enough people made use of it that L<I can link to live examples of its use|https://twitter.com/search?q=spoilerific.jmac.org%2Fthread> without feeling embarrassed (as I figure folks probably don't peer too hard at the resulting tweets' timestamps). Beyond that, it succeeds in its primary goal of scratching my own itch for a tool allowing me to tweet sensitive plot points guilt-free, and in the end that's all I set out to make.
+
+More saliently to readers of a Catalyst blog, I later in the year decided to L<share the thing on GitHub|https://github.com/jmacdotorg/Spoilerific>, as part of a recent personal effort to play a more active role in open source by making the ol' graph a little greener. Now that it's Advent season, I thought I'd offer a bit of an annotated tour through the Spoilerific codebase. I certainly don't hold the project up as the epitome of tight coding practices, but I did my best to stay mindful of the modern Catalyst fat-model, thin-controller philosophy while I built it, and learned a lot.
+
+It was fun to make, and I hope its guts serve as a tidy, small, and focused example of how to build an interesting web application with Catalyst.
+
+
+=head2 Why Catalyst?
+
+The simplest answer as to why I chose Catalyst is the least interesting one: it's what I know! I've been using Catalyst both in L<my consulting day-job|http://appleseed-sc.com> and with hobby projects for several years.
+
+However, a small and well-defined hobby project like Spoilerific can be a perfect opportunity to explore new software-creation tools and techniques. Catalyst's feisty younger cousins L<Dancer|https://metacpan.org/pod/Dancer> and L<Mojolicious|https://metacpan.org/pod/Mojolicious> called to me to try them, learning their ways by building a simple but non-trivial project like this. 
+
+Someday, I will answer those calls! But in this case, it happened that I was still mere months into teaching myself L<Moose|https://metacpan.org/pod/Moose>, with only a single significant Moose-driven project under my belt -- and not even one that used a database. (Yes, I was already an accidental Moose user by dint of our ungulate friend powering Catalyst's core, but that's a world apart from boldly writing C<use Moose;> and then knowing what to do with it.) Spoilerific clearly wanted to run on a classic LAMP stack, a trough to which I had yet to truly lead a Moose of my own.
+
+I figured that would provide enough novelty for one project, so I stuck with Catalyst's familiar patterns, pledging to make the application's model code as Moosely as I could. Some major Moose features don't appear here -- I wouldn't grok roles, for one, until my next major project -- but I did end up pleased with my experimental use of lazily built object attributes, method modifiers, and other antler-bearing features.
+
+
+=head1 How I built Spoilerific
+
+This is the pattern I have fallen into when starting a new Catalyst-driven LAMP-stack project:
+
+=over
+
+=item 1.
+
+Use Catalyst's helper scripts to set up the app's workspace
+
+
+=item 2.
+
+Create the first draft of a database
+
+
+=item 3.
+
+Create the first draft of the data model, using Catalyst helper scripts again
+
+
+=item 4.
+
+Develop a complete draft of the project's business logic, rebuilding the database-based model modules as needed, but otherwise not thinking about Catalyst much
+
+
+=item 5.
+
+Build the application's controllers and templates, iterating further on the model as needed
+
+
+=item 6.
+
+Iterate and test until you can't stand it anymore and want to throw the whole project into a ditch. I<Ready for release!>
+
+
+=back
+
+
+=head2 Set up the app's skeleton
+
+B<Use C<catalyst.pl> to throw down an app-skeleton with an appropriate name.> In this case, I C<cd>ed over to a directory without too many loose objects lying around and typed C<catalyst.pl Spoilerific>. I always enjoy the few fleeting seconds of watching that script cut a broad new sheet of glittering blank code-canvas, pregnant with potential Perl.
+
+But before laying down code, I had to design the object model.
+
+
+=head2 Draft the database
+
+I have been working with MySQL for as long as I've known Perl. I just can't say no to it when it comes to whipping up a real database quickly.
+
+While aware of L<a growing dissent|http://grimoire.ca/mysql/choose-something-else> in the larger programming world regarding MySQL versus other, variously potent DB solutions, I would like to point out that L<Sequel Pro|http://www.sequelpro.com> not only makes editing MySQL on Mac OS X a delight, but its icon may be the most inspired thing to have ever graced my dock. I voice neither doubt nor shame that L<that buttery stack|http://www.sequelpro.com/blog/2013.01/hello-this-is-sequel-pro-1-0/> has all by itself extended MySQL's lifetime as a part of my programming workflow by several years.
+
+So, yeah, B<Spoilerific uses MySQL.>
+
+
+=head2 Wave magic wand to create the model; then wave it some more
+
+B<Implement the database, then use it to mold a corresponding Catalyst model via the (in this case) C<spoilerific_create.pl> script, which C<catalyst.pl> puts into the application's C<script> directory.> I'll freely admit that I see the command that performs this as an opaque incantation; I copy and paste it from a textfile I keep of such things, editing any obvious project-specific substrings to fit. For Spoilerific, I invoke the script like this, while standing in the app's directory:
+
+    script/spoilerific_create.pl model SpoilerificDB DBIC::Schema Spoilerific::Schema create=static dbi:mysql:dbname=spoilerific root
+
+Loosely, that scree of arguments says "Create a new Catalyst model class that hooks into a local MySQL database named C<spoilerific>, and use L<DBIx::Class::Schema::Loader|https://metacpan.org/pod/DBIx::Class::Schema::Loader> to generate (or update) a bunch of appropriate L<DBIx::Class|https://metacpan.org/pod/DBIx::Class> files for the tables you find there." However, I cast this particular spell so often, and without any need for further modification, that it feels rather a blur of Enochian sigils whose deeper workings I need not think terribly hard about. 
+
+I allow myself this cheerfully ignorant attitude towards this particular command because the output befits true sorcery: a full set of documented DBIx::Class modules, one per table, with their core column definitions and relationships all set up according to the columns and foreign-key relationships found in the database itself. Even better, once you begin adding your project's custom model code to these basic class definitions, you can re-run that same C<create.pl> invocation every time you make any iterative changes to your database. So long as you didn't change any of the pre-generated code (all located above a checksum comment warning you about it), DBIx::Class::Schema::Loader will safely update your database model classes to reflect the changes.
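+To make the checksum arrangement concrete, here is a hedged sketch of what such a generated Result class roughly looks like; the table, columns, and checksum text are illustrative assumptions, not Spoilerific's actual output:
+
+```perl
+package Spoilerific::Schema::Result::Thread;
+
+# Everything above the checksum line is managed by
+# DBIx::Class::Schema::Loader and regenerated on each re-run.
+use strict;
+use warnings;
+use base 'DBIx::Class::Core';
+
+__PACKAGE__->table('thread');
+__PACKAGE__->add_columns(qw( id subject hashtag ));
+__PACKAGE__->set_primary_key('id');
+
+# Created by DBIx::Class::Schema::Loader
+# DO NOT MODIFY THIS OR ANYTHING ABOVE! md5sum:(checksum here)
+
+# Custom code placed below the checksum line survives every
+# subsequent run of the create.pl incantation.
+
+1;
+```
+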
+
+I run this command many, many times over the development cycle of a typical Catalyst LAMP project. It's cool.
+
+
+=head2 Develop the model (and only the model)
+
+B<Proceed to ignore all the Catalyst-specific stuff for a while, focusing only on building the model classes.> I I<certainly> didn't work this way when new to Catalyst, instead diving right into controllers and templates, building the website from the outside in. But with Spoilerific, I took the opportunity to practice the more contemporary Catalyst philosophy of restricting the role of controllers and views as mere manipulators of the model, keeping logical code out of controllers as much as possible.
+
+To that end, Spoilerific has three main model classes. Each of them maps to an SQL table, and therefore each is a class originally created by that crazy C<create.pl> incantation; I merely extended each one, tucking all their custom code safely underneath their checksum lines, allowing me to re-run the DBIx::Class::Schema::Loader spell whenever I wanted to reflect SQL table definitions in the code.
+
+You can find these classes in the C<lib/Spoilerific/Schema/Result> directory within the Spoilerific source tree:
+
+B<User.pm>, unsurprisingly, defines a user of the system. It contains only a little custom code, just enough to transform an inert DB object describing a user into a live Twitter connection specific to that user. Namely, this is its C<twitter_ua> object attribute -- "ua" standing for I<user agent>, here -- which holds a L<Net::Twitter|https://metacpan.org/pod/Net::Twitter> object. 
+
+You can see this attribute has its Moose C<lazy> bit set, so it instantiates itself only when it needs to, and once it does it sticks around for the lifetime of this object. This is one of the L<Moose best practices|https://metacpan.org/pod/Moose::Manual::BestPractices> I tried mindfully to stick to while building Spoilerific.
+
+When it does build itself, the attribute calls on database-defined attributes like C<twitter_access_token>, which Spoilerific defines in adherence to L<Catalyst::Authentication::Credential::Twitter|https://metacpan.org/pod/Catalyst::Authentication::Credential::Twitter>. The values for these fields become magically populated through this module when the website user logs into Spoilerific via Twitter's OAuth. Spoilerific::Controller::Auth defines a bit of connective tissue, but that credential module provides most of the heavy lifting.
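+A lazily built attribute of this sort might be sketched as follows; the attribute, builder, and token-column names here are my assumptions rather than Spoilerific's exact code:
+
+```perl
+# In Spoilerific::Schema::Result::User, below the checksum line.
+# A hedged sketch: a lazy Moose attribute that builds a per-user
+# Net::Twitter agent from OAuth credentials stored by
+# Catalyst::Authentication::Credential::Twitter.
+has twitter_ua => (
+    is         => 'ro',
+    isa        => 'Net::Twitter',
+    lazy_build => 1,
+);
+
+sub _build_twitter_ua {
+    my ($self) = @_;
+    return Net::Twitter->new(
+        traits              => [qw( API::RESTv1_1 OAuth )],
+        consumer_key        => $self->result_source->schema->twitter_consumer_key,
+        consumer_secret     => $self->result_source->schema->twitter_consumer_secret,
+        access_token        => $self->twitter_access_token,
+        access_token_secret => $self->twitter_access_token_secret,
+    );
+}
+```
+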
+
+B<Thread.pm> defines a collection of Spoilerific posts on a single, named topic, with an associated Twitter hashtag. While a key concept to the user experience -- threads being what you browse, when you visit any Spoilerific discussion -- the concept is so simple that I didn't initially need to write a single bit of code outside of what DBIx::Class::Schema::Loader automatically provides, based only on the C<thread> table structure.
+
+I did end up tossing in one addition, an C<around> modifier to the C<hashtag> accessor method. It simply helps normalize the data stored in this column, prepending an octothorpe to the provided tag if the user didn't do so themselves.
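+The whole modifier amounts to only a few lines. A sketch, assuming the column accessor is named C<hashtag>:
+
+```perl
+# In Spoilerific::Schema::Result::Thread, below the checksum line.
+# On write, prepend an octothorpe if the user left it off; reads
+# pass through unchanged.
+around hashtag => sub {
+    my ($orig, $self, @args) = @_;
+    if (@args && defined $args[0] && $args[0] !~ /\A#/) {
+        $args[0] = '#' . $args[0];
+    }
+    return $self->$orig(@args);
+};
+```
+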
+
+B<Post.pm> contains most of the project's business logic. Some interesting features include:
+
+=over
+
+=item *
+
+C<url_length>, a class attribute that asks Twitter itself how much space out of a tweet's precious 140 characters to budget for each C<t.co>-shortened URL. Lazy building at the class level means that Spoilerific will ask Twitter about this when it needs to, and then retain the value for the rest of its life as a system process; Twitter doesn't adjust this value very often.
+
+
+
+=item *
+
+An C<around body_plaintext> method modifier, which reacts to any change to the post's text by generating the ROT13-enciphered text, then calling the internal C<_fill> method to decorate the text with the thread's hashtag as space allows. It updates the post object's various stored permutations of the plain, user-supplied tweet text before returning control back to the accessor.
+
+
+
+=item *
+
+A C<post_to_twitter> method, which does what you'd expect, via the Net::Twitter handle attached to the current user object.
+
+
+
+=back
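+The ROT13 step at the heart of that C<around body_plaintext> modifier is core Perl's C<tr///>. A minimal, self-contained sketch (the helper name is mine, not the module's):
+
+```perl
+# ROT13: rotate each ASCII letter 13 places, leaving digits,
+# punctuation, and whitespace untouched.
+sub rot13 {
+    my ($text) = @_;
+    (my $ciphered = $text) =~ tr/A-Za-z/N-ZA-Mn-za-m/;
+    return $ciphered;
+}
+```
+
+Because 13 is half the alphabet, applying the same transform twice round-trips the text, which is what lets readers "unspoil" a post with the identical operation.
+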
+
+I shall leave further exploration of Spoilerific's business logic, including its tests in the project's C<t/> directory and the little bit of extra ResultSet class magic I added, as an exercise for the reader. The point I wish to illustrate here is that, though Spoilerific is "a Catalyst application", I put most of my early thought and work into code that manipulates database-stored objects based on a combination of user input and messages from Twitter, and which has no intrinsic concept at all of being a web application per se. I find Catalyst's helper scripts invaluable for getting this process started quickly and keeping the DBIC-based model modules updated, but I otherwise don't think about Catalyst much until I've completed building the model's first draft.
+
+
+=head2 Create controllers and templates (and everything else)
+
+Spoilerific has only three controller modules, found (as expected) in C<lib/Spoilerific/Controller>. B<Auth.pm> and B<Root.pm> are both rather minimal; the former provides some project-specific interfacing between the application and L<Catalyst::Authentication::Credential::Twitter|https://metacpan.org/pod/Catalyst::Authentication::Credential::Twitter>, and the latter sets up a handful of defaults and (mostly-)static informational pages. B<Thread.pm>, on the other hand, contains a couple hundred lines of Catalyst action definitions, all detailing different things a Spoilerific user can do with a given thread object -- create it, add to it, or read it in its encrypted or "spoiled" forms. (And a C<random> action, just for fun, fetches a randomly chosen thread from the database for display.)
+
+Looking at it another way, Spoilerific::Controller::Thread contains I<only> a couple hundred lines of code, much of which is devoted to mundane tracking of the user's current state in the application's flow, or setting up the display of error and success messages. While all the verbs the user can invoke are defined in this module, none have terribly long definitions, as the more logically complex work of processing text and working with Twitter is all handled elsewhere. For the most part, the controller just fetches or manipulates model objects based on which action the web-user wishes to invoke, perhaps sets some values in the stash or session object, and then allows the view to do the rest. 
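+A thin action in this style might look like the following sketch; the action name, paths, and stash keys are assumptions for illustration, not Spoilerific's actual code:
+
+```perl
+package Spoilerific::Controller::Thread;
+use Moose;
+use namespace::autoclean;
+BEGIN { extends 'Catalyst::Controller' }
+
+# Fetch the requested thread, stash it, and let the view do the rest.
+sub view :Path('view') :Args(1) {
+    my ($self, $c, $thread_id) = @_;
+    my $thread = $c->model('SpoilerificDB::Thread')->find($thread_id);
+    unless ($thread) {
+        $c->response->status(404);
+        $c->stash( error_msg => 'No such thread.' );
+        $c->detach;
+    }
+    $c->stash( thread => $thread, template => 'thread/view.tt' );
+}
+
+__PACKAGE__->meta->make_immutable;
+1;
+```
+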
+
+Even the most painful part of form-handling is handled elsewhere, in Spoilerific::Form::Thread -- which, you'll note, is itself quite short, as it lets its grandparent class, the excellent L<HTML::FormHandler|https://metacpan.org/pod/HTML::FormHandler>, do all the hard work.
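+A form class in that style stays tiny. A hypothetical sketch, with assumed field names and an assumed intermediate parent class (the article notes HTML::FormHandler is the form's I<grandparent>):
+
+```perl
+package Spoilerific::Form::Thread;
+use HTML::FormHandler::Moose;
+extends 'HTML::FormHandler::Model::DBIC';
+
+# HTML::FormHandler handles rendering, validation, and (via its
+# DBIC model glue) saving values back to the thread row.
+has_field 'subject' => ( type => 'Text',   required => 1 );
+has_field 'hashtag' => ( type => 'Text',   required => 1 );
+has_field 'submit'  => ( type => 'Submit', value    => 'Save' );
+
+no HTML::FormHandler::Moose;
+1;
+```
+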
+
+In essence, the controller conceptually lies closer to the UI than it does to the business logic, even while functioning explicitly as the bond between them. I find Spoilerific::Controller::Thread quite easy to read and follow, even though I haven't touched it for half a year, precisely because it sticks only to defining the web application's actions (and reactions), without itself defining too deeply how any of these actions actually work.
+
+I've little to say about the templates, which comprise unsurprising examples of L<Template::Toolkit|https://metacpan.org/pod/Template::Toolkit> documents. C<root/thread/post_form.tt> is perhaps notable for its use of JavaScript -- including an AJAX call to the Catalyst backend -- in order to make a pleasantly interactive form, able to count characters remaining and offer a preview of what the ROT13-encoded tweet will look like prior to actual posting. 
+
+The one most regretful piece of Spoilerific code lurks in here, too: I ended up recasting the URL-detecting logic I created for Spoilerific::Schema::Result::Post as JavaScript, just so that the characters-remaining counter can rapidly and accurately update itself without having to consult with the backend on every keystroke. (This because Twitter replaces I<every> URL in a tweet with a shortened version of a fixed length.) A bit dirty, but a compromise I decided to make for the sake of a better user experience.
+
+
+=head2 Finishing up
+
+Once all this was laid out, of course, came quite a bit of iterative development of every bit of the app, followed by beta testing from some fine and patient friends. I finally launched the project via FCGI on Apache -- L<one of the standard Catalyst deployment schemes|http://search.cpan.org/~ether/Catalyst-Manual-5.9007/lib/Catalyst/Manual/Deployment/Apache/FastCGI.pod>. It's run since then without any further work from me, except for a bump we hit in midsummer when Twitter changed its API enough to have me spend several hours tracking strange protocol errors down. (An issue, certainly, that I could have staved off earlier had I only paid more attention to Twitter's well-publicized API machinations.)
+
+Sometime during the final stretch of development I discovered the delightful L<Catalyst::TraitFor::Model::DBIC::Schema::SchemaProxy|https://metacpan.org/pod/Catalyst::TraitFor::Model::DBIC::Schema::SchemaProxy>, which makes easy the passing of configuration information from Catalyst config files through Model modules, down to the underlying (and purposefully Catalyst-ignorant) logic classes. Spoilerific uses this trait to allow storage of Twitter credentials as class attributes on the database schema object, where the project's various other objects can read them as needed. This isn't the only or necessarily the best way to handle logic-class configuration -- I've since been introduced to tools like L<MooseX::SimpleConfig|https://metacpan.org/pod/MooseX::SimpleConfig> -- but in this case I found it does the job quite well.
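+With the trait in place, schema-level settings can ride along in the ordinary model configuration. A hedged sketch, with assumed attribute names:
+
+```perl
+# In Spoilerific.pm (or the app's config file): the SchemaProxy
+# trait forwards extra model-config keys to matching attributes
+# on Spoilerific::Schema itself.
+__PACKAGE__->config(
+    'Model::SpoilerificDB' => {
+        traits       => ['SchemaProxy'],
+        schema_class => 'Spoilerific::Schema',
+        twitter_consumer_key    => 'your-consumer-key-here',
+        twitter_consumer_secret => 'your-consumer-secret-here',
+    },
+);
+```
+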
+
+
+=head1 Unccl unpxvat!
+
+I hope you've found some value in this short tour through a small Catalyst application. I leave you with a reminder that L<I put it on GitHub|https://github.com/jmacdotorg/Spoilerific> in the full spirit of sharing, so whether you'd like to mess around with the code, pull-request an improving patch, or make something entirely of your own by scooping out my Usenet-obsessed nonsense and replacing it with something far more interesting, I heartily invite you to do so.
+
+
+=head1 Author
+
+Jason McIntosh L<jmac at jmac.org|mailto:jmac at jmac.org> (L<@jmacdotorg|http://twitter.com/jmacdotorg> on Twitter, jmac on IRC)



