[Catalyst-commits] r13804 - trunk/examples/CatalystAdvent/root/2010/pen

dhoss at dev.catalyst.perl.org dhoss at dev.catalyst.perl.org
Tue Dec 7 23:06:20 GMT 2010


Author: dhoss
Date: 2010-12-07 23:06:20 +0000 (Tue, 07 Dec 2010)
New Revision: 13804

Removed:
   trunk/examples/CatalystAdvent/root/2010/pen/catalyst-and-elasticsearch.pod
   trunk/examples/CatalystAdvent/root/2010/pen/catalyst-and-gearman.pod
Modified:
   trunk/examples/CatalystAdvent/root/2010/pen/TODO
Log:
updated TODO, removed published files of mine

Modified: trunk/examples/CatalystAdvent/root/2010/pen/TODO
===================================================================
--- trunk/examples/CatalystAdvent/root/2010/pen/TODO	2010-12-07 23:02:45 UTC (rev 13803)
+++ trunk/examples/CatalystAdvent/root/2010/pen/TODO	2010-12-07 23:06:20 UTC (rev 13804)
@@ -7,7 +7,7 @@
 
 Article ideas (2010 Calendar):
  - facebook general application development (Getty)
- - Catalyst + Gearman (dhoss)
+ - Catalyst + Gearman (dhoss) - **DONE**
  - Catalyst + DBIx::Class::InflateColumn::FS + HTML files + X-sendfile for a simple CMS (dhoss) - i think davewood is going to take care of this
  - Catalyst + ElasticSearch (dhoss) - **DONE**
  - DBIC::Fixtures - lukes (?)
@@ -58,12 +58,12 @@
  - 4: Test::DBIx::Class - dhoss
  - 5: t0m
  - 6: DONE ("Excel in Catalyst" - Caelum)
- - 7:      ("Catalyst and Gearman" - dhoss) 
- - 8:       Form::Sensible::Reflector::DBIC + Catalyst - dhoss 
+ - 7: DONE("Catalyst and Gearman" - dhoss) 
+ - 8:       ("OpsView" - Ton Voon)
  - 9:       jqgrid + Catalyst:Controller::REST - dhoss
- - 10:
+ - 10:      Form::Sensible::Reflector::DBIC + Catalyst - dhoss 
  - 11:
- - 12:
+ - 12:      ("jQueryUI +Catalyst" - Sir and friends)
  - 13:
  - 14:
  - 15: twitter authentication/applications + Catalyst - dhoss

Deleted: trunk/examples/CatalystAdvent/root/2010/pen/catalyst-and-elasticsearch.pod
===================================================================
--- trunk/examples/CatalystAdvent/root/2010/pen/catalyst-and-elasticsearch.pod	2010-12-07 23:02:45 UTC (rev 13803)
+++ trunk/examples/CatalystAdvent/root/2010/pen/catalyst-and-elasticsearch.pod	2010-12-07 23:06:20 UTC (rev 13804)
@@ -1,270 +0,0 @@
-=head1 Creating an Easy to Manage Search Engine with Catalyst and ElasticSearch
-
-=head1 Overview
-
-L<ElasticSearch|http://www.elasticsearch.com> is a search engine based on Lucene with a number of really cool features that, in my opinion, elevate it above a number of L<other|http://lucene.apache.org/solr/> L<search|http://sphinxsearch.com/> L<engines|http://www.rectangular.com/kinosearch/>.
-
-For instance, it's schema-less, which some would argue is a bad thing, but the way things are indexed (indexed "things" are called documents) in ElasticSearch allows the user to create a sort of per-document schema, much like you would with MongoDB or other document-based storage engines.  It also has an "autodiscovery" feature for other ElasticSearch instances on the network.  All you have to do is run C<bin/elasticsearch> on the machines you want to cluster and poof, you have a distributed and fault-tolerant index.
-
-So, moving forward, let's get into some code and setup.
-
-=head1 Getting ElasticSearch
-
-=over 12
-
-=item Step 1
-
-Download your desired version and build of ElasticSearch here: L<http://www.elasticsearch.com/download/> (you can also L<build from source|http://www.elasticsearch.com/download/master/>)
-
-=item Step 2
-
-Decompress (or build) ElasticSearch into your desired location.  It's really not important where you do this, but /opt/elasticsearch is where I put mine.
-
-=item Step 3
-
-Start your instances by typing C<bin/elasticsearch> in the root directory where you decompressed ElasticSearch.  You can also run with the C<-f> switch to have it run in the foreground and spit out debug information.
-
-=back
-
-=head1 A Simple API Introduction
-
-L<ElasticSearch|http://search.cpan.org/~drtech/ElasticSearch-0.27/lib/ElasticSearch.pm> is the Perl binding to the ElasticSearch REST API, and is written (marvelously) by Clinton Gormley.  It has a few key methods we will be using in this article.
-
-=over 12
-
-=item C<new>
-
-Creates your connection to your ElasticSearch instance(s).  
-
-=item C<index>
-
-Indexes your data.  Takes an index name, a document id (unique, autogenerated if you leave it out), and your data which should be in the form of a hashref.
-
-=item C<search>
-
-Searches your indexed data.  Takes an index name, a document type (you can assign a type to your documents when you index them, for instance an email or a tweet), and your query.  There are a number of query types you can use to search your data, but the one we'll use here is the C<field> query.
-
-=back
-
-Okay.  So that's the basic ElasticSearch API.  There are plenty of L<examples|http://www.elasticsearch.com/docs/elasticsearch/rest_api/> on the site you can check out if you feel you need to grok this more thoroughly.  Next, we figure out how to tie this thing to Catalyst.
-
-=head1 Catalyst::Model
-
-We will be creating a small model to hook ElasticSearch up to our Catalyst application.  
-
-Code:
-
-Search.pm:
-
-    package Search;
-    
-    use Moose;
-    use namespace::autoclean;
-    use ElasticSearch;
-    
-    has 'es_object' => (
-        is       => 'ro',
-        isa      => 'ElasticSearch',
-        required => 1,
-        lazy     => 1,
-        default  =>  sub {
-            ElasticSearch->new(
-                servers     => 'localhost:9200',
-                transport   => 'httplite',
-                trace_calls => 'log_file',
-            );
-        },
-
-    );
-
-    sub index_data {
-        my ($self,  %p) = @_;
-        $self->es_object->index(
-            index => $p{'index'},
-            type  => $p{'type'},
-            data  => $p{'data'},
-        );
-    }
-
-    sub results {
-        my ($self, %p) = @_;
-        my $results =  $self->es_object->search(
-            index => $p{'index'},
-            type  => $p{'type'},
-            query => {
-                field => {
-                    _all => $p{'terms'},
-                },
-            }
-        );
-        $results;
-    }
-
-
-
-    1;
-
-
-
-MyApp::Model::Search:
-
-    package MyApp::Model::Search;
-    use Moose;
-    use namespace::autoclean;
-    extends 'Search';    # assumption: inherit the wrapper class defined above
-
-    sub COMPONENT {
-        my ($class, $c, $config) = @_;
-        my $self = $class->new(%{ $config });
-
-        return $self;
-    }    
-
-    __PACKAGE__->meta->make_immutable;
-
-
-Okay.  So we have the search portion set up. This will be called like C<< my $results = $c->model('Search')->results(%opts) >> from inside our application.
-
-The next step is to set up an indexer.  My example uses DBIx::Class as the source of data to index, as that's what I originally wrote all this for.  However, you can use an arbitrary data source as long as you can break it up into the bits that ElasticSearch needs.
-
-The script:
-
-    use Search;
-    use My::Schema;
-    use Data::Dumper;
-    my $schema = My::Schema->connect("dbi:Pg:dbname=mydb", "user", "pass");
-    my $search_obj = Search->new;
-    my $rs = $schema->resultset('Entry')->search({ published => 1 });
-    print "Search obj: " . Dumper $search_obj;
-    print "Beginning indexing\n";
-    
-    while ( my $entry = $rs->next ) {
-        print "Indexing " . $entry->title . "\n";
-        my $result = $search_obj->index_data(
-            index => 'deimos',
-            type => $entry->type,
-            data => {
-                title       => $entry->title,
-                display_title => $entry->display_title,
-                author      => $entry->author->name,
-                created     => $entry->created_at ."",
-                updated     => $entry->updated_at ."",
-                body        => $entry->body,
-                attachments => \@attachments,
-            },
-        );
-
-    }
-
-That is a basic script to get our data indexed.  To confirm, we can run a few cURL searches: 
-
-    curl -XGET 'http://127.0.0.1:9200/_all/_search'  -d '
-    {
-       "query" : {
-          "field" : {
-             "_all" : "your search terms that you know will get you a document returned"
-          }
-       }
-    }'
- 
-This will return something like: 
-
-    {
-       "query" : {
-          "field" : {
-             "_all" : "test"
-          }
-       }
-    }'
-    {"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":4,"max_score":0.24368995,"hits":[{"_index":"ourindexdeimos","_type":"post","_id":"l_3Jw9PkRz2arFdHO3t5Pg","_score":0.24368995, "_source" : {
-    "thingy":"thingy data"
-    }
-
-If you get something looking like that, congrats! Your data is indexed properly.
-
-=head1 Executing searches within your application
-
-So here we go, what we all came here for.
-
-Here is the Search controller:
-
-    package MyApp::Controller::Search;
-    use Moose;
-    use namespace::autoclean;
-    BEGIN { extends 'Catalyst::Controller::REST'; }
-
-
-    sub base : Chained('/') PathPart('') CaptureArgs(0) {
-        my ($self, $c) = @_;
-        my $data = $c->req->data || $c->req->params;
-        my $results = $c->model('Search')->results( 
-            terms => $data->{'q'}, 
-            index => $data->{'index'} || "default", 
-            type => $data->{'type'} || "post" 
-        );
-        my @results;
-        for my $result ( @{$results->{'hits'}{'hits'}} ) {
-            my $r = $result->{'_source'};
-            my $body = substr($r->{'body'}, 0, 300);
-            $body .= "...";
-            push @results, {
-                display_title => $r->{'display_title'},
-                title   => $r->{'title'},
-                created => $r->{'created'},
-                updated => $r->{'updated'},
-                author  => $r->{'author'},
-                body    => $body,
-            };
-
-        }
-        $c->stash( results => \@results ); 
-
-    }
-
-
-    sub index :Chained('base') PathPart('search') Args(0) ActionClass('REST'){
-        my ($self, $c) = @_;
-    
-    }
-
-    sub index_GET {
-        my ($self, $c) = @_;
-        $self->status_ok($c, 
-            entity => {
-                results => $c->stash->{'results'} ,
-            },
-        );
-    }
-
-
-
-    __PACKAGE__->meta->make_immutable;
-    1;
-
-And a simple template to display them: 
-
-    <h2>Search results for <strong>"[% c.req.param('q') | html %]"</strong>:</h2>
-    <ul>
-    [% FOR result IN results %]
-    <li>
-    <div>By [% result.author %]</div>
-    <div><a href="[% c.uri_for_action('/url/to/your/document', [ result.title ]) %]">[% result.display_title %]</a></div>
-    <div>[% result.body %]</div>
-    </li>
-    [% END %]
-    </ul>
-
-And there you go.  A very simple, flexible, and relatively fast search engine, with the ability to use any data storage back end for your indexable data.
-
-=head1 Parting notes
-
-ElasticSearch is extremely customizable and tuneable.  You can get a GREAT deal of performance improvement by playing with the indexing options, ranking algorithms, storage and request transports.  All of this is documented at the L<ElasticSearch|http://www.elasticsearch.com> web site. 
-
-One final thought: you can put the portion of the indexer code that actually inserts the document into the search index right after the "commit" portion of your application's data store code.  This way, you get virtually instantaneous indexing of your document upon its creation.
-
-Enjoy folks, I hope you find this as useful as I did!
-
---Devin "dhoss" Austin, 2010.
-
-Created using Catalyst 5.80029 on a MacBook Pro with Perl 5.12.0
-=cut
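
Editor's sketch (not part of the deleted file): the controller in the article walks the hits list of the search response and trims each body to 300 characters. A minimal, dependency-free version of that step might look like the following. The helper name summarize_hits and the sample response are made up for illustration; only the hits -> hits -> _source shape comes from the article. Unlike the controller, it appends "..." only when the body was actually cut.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a decoded search response into display rows.
sub summarize_hits {
    my ($response, $max_len) = @_;
    $max_len = 300 unless defined $max_len;

    my @rows;
    for my $hit ( @{ $response->{hits}{hits} || [] } ) {
        my $src  = $hit->{_source} || {};
        my $body = defined $src->{body} ? $src->{body} : '';

        # Only add an ellipsis when something was actually cut off.
        $body = substr($body, 0, $max_len) . '...'
            if length($body) > $max_len;

        push @rows, {
            title         => $src->{title},
            display_title => $src->{display_title},
            author        => $src->{author},
            body          => $body,
        };
    }
    return \@rows;
}

# Made-up response with the same shape as the search output in the article.
my $response = {
    hits => {
        total => 2,
        hits  => [
            { _source => { title => 'short', display_title => 'Short',
                           author => 'dhoss', body => 'tiny body' } },
            { _source => { title => 'long', display_title => 'Long',
                           author => 'dhoss', body => 'x' x 500 } },
        ],
    },
};

my $rows = summarize_hits($response);
printf "%s: %d chars\n", $_->{title}, length( $_->{body} ) for @$rows;
```

Keeping this in a plain function rather than in the controller action makes it easy to unit test without a running ElasticSearch instance.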

Deleted: trunk/examples/CatalystAdvent/root/2010/pen/catalyst-and-gearman.pod
===================================================================
--- trunk/examples/CatalystAdvent/root/2010/pen/catalyst-and-gearman.pod	2010-12-07 23:02:45 UTC (rev 13803)
+++ trunk/examples/CatalystAdvent/root/2010/pen/catalyst-and-gearman.pod	2010-12-07 23:06:20 UTC (rev 13804)
@@ -1,319 +0,0 @@
-=head1 Using Gearman with Catalyst to Create a Simple Image Thumbnailer
-
-=head2 SYNOPSIS
-
-L<Gearman|http://search.cpan.org/~dormando/Gearman-1.11/> is a distributed job queue system that excels at running jobs quickly and asynchronously.  It is a good fit for work that needs batch processing, or that you don't want your web application to deal with directly, such as thumbnailing, which is what we will be talking about today.
-
-=head2 REQUIREMENTS
-
-=over 12
-
-=item L<Gearman::Server|http://search.cpan.org/~dormando/Gearman-Server-1.11/lib/Gearman/Server.pm> 
-
-This is the actual job server that keeps track of jobs, and what the worker processes connect to. Initialized with C<gearmand>.
-
-=item L<Gearman::Worker|http://search.cpan.org/~dormando/Gearman-1.11/lib/Gearman/Worker.pm>
-
-This is your worker instance, which actually does the job you want done.  HINT:  We want to do thumbnailing, so this will contain the code for creating thumbnails from images.
-
-=item L<Gearman::Client|http://search.cpan.org/~dormando/Gearman-1.11/lib/Gearman/Client.pm>
-
-This connects to the server(s) and communicates what jobs to execute etc.
-
-=item L<Catalyst>
-
-Duh.
-
-=back
-
-=head2 The Process
-
-So, here's how this goes down:
-
-The web browser uploads the image to your Catalyst app.  Catalyst writes out the image to your given directory on the file system, maybe inserts a pointer to that in the database, and then at the same time, creates a job in the Gearman worker pool to create a thumbnail of said image.  Gearman either queues it up because it's working on other images, or it takes care of it right away if it's just twiddling its thumbs.  Then, presto, you have a thumbnail of your image(s)!
-
-A small ASCII flowchart of this process should answer any remaining questions:
-
-<pre>
-Browser uploads image -> Catalyst writes this to the filesystem 
-                      -> Catalyst creates a job in the Gearman worker pool -> Gearman queues job, then takes care of it -> your thumbnail appears wherever needed!
-                                                                           |
-                                                                         < -
-                        Catalyst doesn't sit and wait for the thumbnail to be created, 
-                        instead, sends confirmation page back to the browser 
-</pre>
-
-
-=head2 Finally, Some Code
-
-Okay, so we need:
-
-=over 12
-
-=item A Server
-
-Simple enough: we just run C<gearmand --daemonize> and we have our server.
-
-
-=item A Worker
-
-Here's the code for the thumbnail job I came up with:
-
-    package Jobs::Worker;
-    use Moose;
-    use namespace::autoclean;
-    use Gearman::Worker;
-    use Deimos::ConfigContainer;
-    use Imager;
-    use IO::Scalar;
-    use File::Basename;
-
-    has worker => ( 
-        is => 'ro', 
-        isa => 'Gearman::Worker', 
-        required => 1, 
-        lazy => 1,
-        default => sub { Gearman::Worker->new }
-    );
-
-
-    has 'imager' => (
-        is         => 'ro',
-        required   => 1,
-        lazy => 1,
-        default => sub { Imager->new }
-    );
-
-    has 'config' => (
-       is => 'ro',
-       required => 1,
-       lazy => 1,
-       default => sub { +{ thumbnail_max_height => 150 } },  # placeholder value; your config object goes here
-    );
-
-    sub BUILD {
-        my $self = shift;
-        $self->worker->job_servers('127.0.0.1'); 
-        $self->worker->register_function(thumbinate => sub { $self->thumbinate($_[0]->arg) });
-    }
-
-
-    sub thumbinate {
-        my ($self, $file) = @_;
-
-        my $image = $self->imager;
-        $image->read( file => $file )
-            or die "file not read, " . $image->errstr;
-
-        my $scaled = $image->scale( ypixels => $self->config->{thumbnail_max_height} )
-            or die "scaling failed, " . $image->errstr;
-
-        ## write our thumbnail to disk, next to the original; the
-        ## "thumb_" prefix is only a placeholder naming scheme
-        my ($name, $dir) = fileparse($file);
-        my $thumb = "${dir}thumb_${name}.png";
-
-        $scaled->write( file => $thumb, type => 'png' )
-            or die $scaled->errstr;
-
-        return $thumb;
-    }
-    1;
-
-Quick breakdown: 
-
-    has worker => ( 
-        is => 'ro', 
-        isa => 'Gearman::Worker', 
-        required => 1, 
-        lazy => 1,
-        default => sub { Gearman::Worker->new }
-    );
-
-This creates our C<Gearman::Worker> object.
-
-    has 'imager' => (
-        is         => 'ro',
-        required   => 1,
-        lazy => 1,
-        default => sub { Imager->new }
-    );
-
-This creates our C<Imager> object, which we use to create the thumbnails.
-
-    sub BUILD {
-        my $self = shift;
-        $self->worker->job_servers('127.0.0.1'); 
-        $self->worker->register_function(thumbinate => sub { $self->thumbinate($_[0]->arg) });
-    }
-
-This tells our C<Gearman::Worker> object that our server (a list of server IPs can be passed here) resides at 127.0.0.1, and registers the function "thumbinate" with the method of the same name defined in this class, passing it the job's argument.
-
-    sub thumbinate {
-        my ($self, $file) = @_;
-
-        my $image = $self->imager;
-        $image->read( file => $file )
-            or die "file not read, " . $image->errstr;
-
-        my $scaled = $image->scale( ypixels => $self->config->{thumbnail_max_height} )
-            or die "scaling failed, " . $image->errstr;
-
-        ## write our thumbnail to disk, next to the original; the
-        ## "thumb_" prefix is only a placeholder naming scheme
-        my ($name, $dir) = fileparse($file);
-        my $thumb = "${dir}thumb_${name}.png";
-
-        $scaled->write( file => $thumb, type => 'png' )
-            or die $scaled->errstr;
-
-        return $thumb;
-    }
-
-This is where the thumbnailing takes place, and it is the job that we register with Gearman.
-
-=item A Worker Initializer
-
-    #!/usr/bin/env perl
-
-    use strict;
-    use Jobs::Worker;
-
-    my $worker = Jobs::Worker->new;
-    $worker->worker->work while 1;
-
-This acts as the daemon for the workers: it listens for jobs from the C<gearmand> instance.  Start it with C<< perl -Ilib script/jobs.pl 2>workerlog.log & >> or some such.
-
-
-=item A Catalyst app and Model to Glue it all Together
-
-    package MyApp::Model::Job;
-
-    use parent 'Catalyst::Model';
-    use Gearman::Client;
-    use Moose;
-    use namespace::autoclean;
-
-    has 'gearman' => (
-        is => 'ro',
-        isa => 'Gearman::Client',
-        required => 1,
-        lazy => 1,
-        default => sub { Gearman::Client->new }
-    );
-
-    has 'job_servers' => (
-        is => 'ro',
-        required => 1,
-        lazy => 1,
-        default => '127.0.0.1',
-    );
-
-
-
-
-    sub add {
-        my ($self, @tasks) = @_;
-        my $gm = $self->gearman;
-        $gm->job_servers($self->job_servers);
-        my $res = $gm->do_task($tasks[0] => $tasks[1]);
-        return $res;
-    }
-
-    1;
-
-Quick synopsis: 
-
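-The important method here is C<add>.  Basically, we set our job servers and call C<< ->do_task >> with the function name registered in our worker class and the argument to hand to it.  Note that C<do_task> waits for the job to finish; if you don't want the request to block, C<< ->dispatch_background >> takes the same arguments and returns immediately.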
-
-Finally, we need to have code that actually calls this when we upload an image.
-
-    package MyApp::Controller::Media;
-    # the usual Catalyst controller stuff goes here 
-    use Data::Dumper; use Try::Tiny; use MIME::Types qw/by_suffix/;
-    # this is actually a method using ActionClass('REST'), so that's where the ->status_* stuff comes from
-    sub add_asset : Local  {  # or Chained, if you prefer like I do
-        my ( $self, $c ) = @_;
-        my $data = $c->req->data || $c->req->params;
-        $c->log->debug( "uploads: " . Dumper $c->req->uploads );
-        return $self->status_bad_request( $c, message => "you must upload a file" )
-          unless $c->req->upload('file') || $c->req->upload('qqfile');
-        my $upload = $c->req->upload('file') || $c->req->upload('qqfile');
-        my $filename = $upload->filename;
-        my $media_type = ( by_suffix $data->{'file'} )[0];
-        try {
-
-            my $media = $c->model('Database::Attachment')->create(
-                {
-                    name => $data->{'name'} || $filename,
-                    owner     => $c->user->get("userid"),
-                    published => $data->{'published'},
-                    mediatype => $media_type, 
-                    file      => $upload->fh,
-                }
-            );
-            $c->log->debug("Thumbnail: " . $media->thumbnail . "");
-            my $thumb;
-            if ( $media_type =~ /^image\/.+/i ) {
-                $c->log->debug("matched image");
-                $thumb = $c->model('Job')->add('thumbinate' => $media->file . "");
-            }
-            if ( $data->{entry} ) {
-                $c->log->debug( "adding an attachment to " . $data->{entry} );
-                $media->add_to_entries( { entry => $data->{entry} } );
-                $c->model('CMS')->entry->publish($data->{entry}) or die "Can't publish entry; $!";
-            }
-
-            return $self->status_created(
-                $c,
-                location => $c->req->uri->as_string,
-                entity   => { message => "file uploaded" }
-            );
-        }
-        catch {
-            return $self->status_bad_request( $c, message => "Failed to save '$filename': $_" );
-        };
-
-    }
-
-There is some extra code in here that describes other things you might do, say, update a record with the thumbnail image's location, check to see if there IS a thumbnail, etc.
-
-The really important bits are: 
-
-=over 4
-
-=item Get the filename 
-
-
-    my $upload = $c->req->upload('file') || $c->req->upload('qqfile');
-    my $filename = $upload->filename;
-
-=item Get the MIME type:
-
-(Obviously you need to C<use MIME::Types qw/by_suffix/;> for this)
-
-     my $media_type = ( by_suffix $data->{'file'} )[0];
-
-=item Create the Job
-
-     if ( $media_type =~ /^image\/.+/i ) {
-            $c->log->debug("matched image");
-            $thumb = $c->model('Job')->add('thumbinate' => $media->file . "");
-     }
-
-=back 
-
-
-=back
-
-And that's that.  You now have the ability to create thumbnails cleanly and smoothly with Gearman and Catalyst.
-
-=head2 Final Notes
-
-Note that some of you may have done this with L<TheSchwartz|http://search.cpan.org/~sixapart/TheSchwartz-1.10/lib/TheSchwartz.pm>, which is all well and good.  However, TheSchwartz is more concerned with reliability than speed.  Gearman doesn't check that jobs completed successfully, but its speed makes it much better suited to creating thumbnails, which can also be trivially recreated if something fails.
-
-=head2 Author
-
-Devin Austin <dhoss at cpan.org>
-
-=cut
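
Editor's sketch (not part of the deleted file): the article passes a bare file name as the job argument. Gearman job arguments are opaque strings, so if a worker needs more than one value, one common approach is to serialize a small hash with the core Storable module. The helper names encode_job_args and decode_job_args and the sample values below are assumptions for illustration, not part of the article's code.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Hypothetical helpers: pack and unpack a hash of job arguments.
sub encode_job_args {
    my (%args) = @_;
    return nfreeze( \%args );    # portable, network-order byte string
}

sub decode_job_args {
    my ($frozen) = @_;
    return thaw($frozen);        # back to a hash reference
}

# Client side: in the article's setup this string would be handed to
#   $c->model('Job')->add( thumbinate => $frozen );
my $frozen = encode_job_args(
    file       => '/tmp/uploads/cat.jpg',    # made-up path
    max_height => 150,                       # made-up value
);

# Worker side: the registered function would call
#   decode_job_args( $job->arg )
my $args = decode_job_args($frozen);
print "$args->{file} -> $args->{max_height}px\n";
```

The trade-off is that both sides must agree on the payload format, so a worker written in another language would be better served by something like JSON.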



