[Catalyst] Newbie help

Wed Jan 14 18:34:53 GMT 2009

On Wed, 14 Jan 2009, Diego M. Vadell wrote:

>   The process may run for about 15 minutes, so I have to handle, somehow,
> the browser timing out because of lack of output. I thought about making the
> script output to a tmp file and using ajax to query that file.
>
>   What is the best way to do that? Is there a nice, magical CPAN module out
> there? :)

Not sure about the best way, but to send you in the right direction: a 
customary method for doing this is to have your server send chunks of data 
at regular intervals (shorter than the shortest browser time-out) until the 
request is complete. Each chunk has its own Content-type header, and the 
response starts by indicating that this is a multipart stream with a 
special multipart/x-mixed-replace header, in which it specifies a 
string to use as a chunk boundary. You will probably have to form that 
yourself in your Catalyst action.

Read more about this and other options here:

    http://en.wikipedia.org/wiki/Push_technology

Not having done that in Catalyst, I am not even sure whether its 
architecture supports this directly or not. You need to make sure that 
parts of your response are printed and flushed to stdout as soon as the 
data are available, and certainly before the action method completes.
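
To make the flushing point concrete, here is a minimal sketch in plain Perl 
(nothing Catalyst-specific, and not tried inside Catalyst). The helper name 
send_now is made up for illustration; it prints one piece of the response 
to a filehandle and flushes it right away, so the data does not sit in 
Perl's output buffer until the script ends.

```perl
use strict;
use warnings;
use IO::Handle;    # provides the flush method on filehandles

# Hypothetical helper: write one piece of the response and push it out
# immediately instead of waiting for the buffer to fill.
sub send_now {
    my ($fh, $text) = @_;
    print {$fh} $text or die "write failed: $!";
    $fh->flush;
    return length $text;    # number of characters handed to the filehandle
}
```

In a CGI script you would call it as send_now(\*STDOUT, $chunk) each time 
new data is ready.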

Here's the sequence that should take place:

1. The first response indicating that the process has been launched will 
have something along these lines:

Content-type: multipart/x-mixed-replace;boundary=NEXT

--NEXT
Content-type: text/html

<html>
   <head>
     <title>...</title>
   </head>
   <body>
     ... a message saying we're working to fulfill the request and
     advising the user against pushing browser buttons ...
   </body>
</html>

2. If you have any connection to your worker process and it tells you 
about its progress, stream that to the browser, perhaps converting it into 
something that can be used to drive a progress bar, etc.

--NEXT
Content-type: text/html

<html>
   .... xx% done so far ....
</html>

3. When done, send the results in the same way:

--NEXT
Content-type: text/html

... results ...
--NEXT--

Note the closing delimiter sent after the results chunk: the boundary 
string followed by two more dashes. It indicates that the response is 
complete.
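
The sequence above can be sketched in Perl roughly as follows. This is only 
an illustration: the function names (stream_header, stream_part, 
stream_end, demo_stream) are made up, the HTML bodies are placeholders, and 
plain "\n" line endings are assumed, as in a CGI script. It ends the stream 
with the closing delimiter --NEXT--, the form the close_cgi_shell routine 
further down uses.

```perl
use strict;
use warnings;

use constant BOUNDARY => 'NEXT';    # same boundary string as in the steps above

# Header that opens the multipart stream (step 1).
sub stream_header {
    return "Content-type: multipart/x-mixed-replace;boundary=" . BOUNDARY . "\n\n";
}

# One part: boundary line, the part's own header, a blank line, then the body.
sub stream_part {
    my ($body, $type) = @_;
    $type = 'text/html' unless defined $type;
    return "--" . BOUNDARY . "\nContent-type: $type\n\n$body\n";
}

# Closing delimiter that ends the whole response.
sub stream_end {
    return "--" . BOUNDARY . "--\n";
}

# Assemble a complete response: "working" page, progress updates, results.
sub demo_stream {
    my @progress = @_;
    my $out = stream_header();
    $out .= stream_part('<html><body>Working ...</body></html>');
    $out .= stream_part("<html><body>$_% done so far</body></html>") for @progress;
    $out .= stream_part('<html><body>Results</body></html>');
    $out .= stream_end();
    return $out;
}
```

In real use each piece would be printed and flushed as soon as it is 
available rather than collected into one string; demo_stream only shows 
what the complete byte stream looks like, e.g. print demo_stream(25, 50, 75).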

I can show you a very rough and ready "library" I use as a shell to run 
things on the server, piping their output verbatim to the browser. This 
method is not bullet-proof, but it lets you achieve what you want very 
quickly. It doesn't even send intermediate chunks to keep the browser 
alive; it pipes raw stdout to the browser to let the user know that the 
job is in progress.

You run it like this (this example shows multiple things done to a 
set of files in a directory):

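use CGI;
use Cwd;
use Util;

my $cgi = CGI->new;

# $dir is assumed to have been set earlier in the script

# print the multipart header and the top of the page
Util::open_cgi_shell("job title");

print "<b>&gt; cd data/$dir</b>\n";

chdir "data/$dir" or die "Couldn't call 'chdir data/$dir', reason: $!";

# run the tool on every matching file, piping its output to the browser
foreach my $file ( grep /pattern/, `ls .` ) {
   chomp $file;
   my $command = "tool -params $file > $file.output";
   Util::run_task(*STDOUT, $command);
}

# close the page and end the multipart stream
Util::close_cgi_shell();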

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Util library has these routines:

sub open_cgi_shell {
   my ($title) = @_;
   my $this_loc = "http://cci.uchicago.edu/~selkovjr/trflp";
   print <<END
Content-type: multipart/x-mixed-replace;boundary=NEXT

--NEXT
Content-type: text/html

<html>
   <head>
     <title>$title</title>
   </head>
   <body onLoad="top.opener.location.href='$this_loc'">

     <h2>$title</h2>

     <table cellpadding="5">
       <tr>
         <td valign="center"><h1><font color="brown">!</font></h1></td>
         <td valign="top">
           Pushing your browser's stop button will cancel this
           job, possibly leaving debris in your project directory on
           the server. Avoid stopping the browser while the job is
           running, even if the browser indicates the transaction as
           "stalled". Some tasks may take a long time to complete.
           It may also be a good idea to check the timeout value in
           your browser settings. It must not be smaller than the expected
           job completion time.
         </td>
       </tr>
     </table>

     <p>
       <em>Starting ...</em>
     </p>

     <pre style="color: #edb; background-color: #333">
END
;
   Util::run_task(*STDOUT, "date");
   print "\n";
}

sub close_cgi_shell {
   Util::run_task(*STDOUT, "date");
   print q(    </pre>

     <p>
       <em>Finished. Give the parent window a few seconds to update...</em>
     </p>
   </body>
</html>
--NEXT--
);
}

# this is a poor man's shell: fork a child, run $cmd through the shell in
# it, and copy whatever it prints to $FH as it arrives
sub run_task {
   my ($FH, $cmd, $silent) = @_;
   my $pid = open PIPE, "-|";        # implicit fork; child's stdout feeds PIPE
   die "cannot fork: $!" unless defined $pid;
   if ( $pid ) { # I'm the parent
     select(STDERR); $| = 1;         # make unbuffered
     select(STDOUT); $| = 1;         # make unbuffered
     my($stat, $data);
     while ($stat = sysread PIPE, $data, 1024) {
       print $FH $data;
     }
     die "sysread error: $!" unless defined $stat;
     close PIPE;
   }
   else { # I'm the child
     if ( $silent ) {
       exec qq(echo ""; $cmd 2>\&1);
     }
     else {
       exec qq(echo ""; echo "\<b\>&gt; $cmd\</b\>"; $cmd 2>\&1);
     }
     die "error running '$cmd': $!"; # reached only if exec itself fails
   }
}