[Catalyst] How to detect cancelled requests?
swatt at infobal.com
Wed Oct 29 14:11:35 GMT 2008
Thanks Wade, this is already very helpful, and it seems that a queue is
a sensible solution to the problem. We know some of the problem is hard;
what is frustrating is that so much of it is outside our control, and
outside that of Catalyst and even Perl.
To give context, the queries that are an issue are SQL queries against a
database that contains millions of components, where users may construct
wildcard queries of the form "*A*", with additional filtering
constraints. The *A* pattern typically generates a full table walk, and
if the database engine doesn't do a good job of optimising the query (we
are working to add appropriate hints), the filter may remove all but a
few rows. The effect is that SQL searches may return anywhere from a few
hits to tens of thousands, and these can take minutes to retrieve
from the database system. We handle timeouts at this stage, so we don't
burn up the database engine too badly.
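The timeout handling at the database stage is essentially an alarm-based
guard around the query. A generic sketch (the helper name and the way the
result is returned are my own; under a threaded or Windows Perl the alarm
semantics would need checking):

```perl
use strict;
use warnings;

# Run a code block (e.g. a DBI query) under a wall-clock timeout.
# Returns the block's result, or undef if the alarm fired first.
sub with_timeout {
    my ($timeout, $code) = @_;
    my $result;
    eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        $result = $code->();
        alarm 0;
    };
    alarm 0;    # make sure no alarm is left pending on error paths
    if ($@) {
        die $@ unless $@ eq "timeout\n";
        return undef;    # timed out
    }
    return $result;
}
```

A slow query then becomes something like
`with_timeout(60, sub { $dbh->selectall_arrayref($sql) })`, with undef
signalling a timeout.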
The problem is more at the boundary between the web server and Catalyst.
Under IIS, we don't seem to get any kill signal at all, certainly when
using ActivePerl/PerlEx. I had kind of hoped we might get SIGPIPE, but
we don't. For this reason, the only option under ISAPI/IIS is to let the
request run to completion and be ignored, which seems to be the
normal IIS/ISAPI behaviour. All our queries are searches, so there is no
problem letting them run to completion. Unfortunately, in a few cases,
this can involve some fairly complex rendering of database results,
which may itself take a good few minutes to generate. Our client has
explicitly requested that some searches should return *all* elements, no
matter how many there are. Most searches take <0.1 seconds, but a few
can take minutes (this was expected and accepted by the client). If a
few extreme searches are allowed to progress to completion, they can
clog up the queue (or even the server) to the point where the entire
system is busy. It is hard (impossible?) to determine in advance
whether a query will be fast or slow.
Our ideal would be a non-buffered model, where results are displayed
incrementally, and where user requests can cancel a query, at worst when
the next hit is about to be displayed. Since we fetch hits and render
them incrementally, this should just work.
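Stripped of the Catalyst plumbing, the incremental loop we have in mind
is roughly the following; the filehandle stands in for whatever the
engine hands us, and the open question is whether a cancelled request
ever shows up as a failed write under IIS (the sub and callback names
are my own invention):

```perl
use strict;
use warnings;

# Emit one rendered hit at a time and stop as soon as the client
# connection is gone.  With SIGPIPE ignored, a print to a closed
# handle returns false instead of killing the process, so the loop
# can bail out cleanly.  Returns the number of hits actually sent.
sub stream_hits {
    my ($out, $hits, $render) = @_;
    local $SIG{PIPE} = 'IGNORE';
    my $sent = 0;
    for my $hit (@$hits) {
        print {$out} $render->($hit) or last;   # client went away
        $sent++;
    }
    return $sent;
}
```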
It seems likely that IIS is most of the problem. Personally, I have
never been happy with it (especially with ISAPI/PerlEx), but we are
required to use IIS at least, and with much of the documentation
relating to Catalyst on IIS being slightly on the sketchy side, it is
hard to know how to improve the position. Specifically, I would welcome:
1. Any advice on FastCGI/Catalyst/IIS -- I assume it does work, but I
have yet to achieve it
2. Some way of detecting that a request has been cancelled, even one as
naive as polling, when using (1) above
3. Any experience with trying to get HTML passed to IIS in a
non-buffered manner -- it all seems buffered right now
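For (2), even a crude scheme would do: give each request an id, let a
separate cancel action set a flag, and poll the flag between hits. A
hypothetical file-based sketch (the flag directory and naming are made
up; a database row would serve equally well):

```perl
use strict;
use warnings;
use File::Spec;

# Crude cancellation polling: the "cancel" action touches a
# per-request flag file; the long-running search loop checks for
# it before rendering each hit.  Paths here are illustrative.
my $flag_dir = File::Spec->tmpdir;

sub cancel_flag_path {
    my ($request_id) = @_;
    return File::Spec->catfile($flag_dir, "cancel-$request_id");
}

sub request_cancel {     # called by the cancel action
    my ($request_id) = @_;
    open my $fh, '>', cancel_flag_path($request_id) or die $!;
    close $fh;
}

sub is_cancelled {       # polled by the search loop between hits
    my ($request_id) = @_;
    return -e cancel_flag_path($request_id);
}

sub clear_cancel {       # tidy up when the request completes
    my ($request_id) = @_;
    unlink cancel_flag_path($request_id);
}
```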
All the best
swatt at infobal.com
Wade.Stuart at fallon.com wrote:
> Stuart Watt <swatt at infobal.com> wrote on 10/28/2008 12:22:07 PM:
>> Hi all,
>> Has anyone found a neat way of detecting and handling cancelled
>> requests? We have a Catalyst app that dynamically generates SQL queries
>> for part of its search, some of which are long and complex, and users
>> are able to create queries that can take minutes to execute. This is OK,
>> except that we need users to be able to cancel those requests through
>> the browser.
> Hard(er) problem alert -- depending on how your app is structured. The
> only sane way I can think of to handle this is to submit the query to a
> queue outside of your Cat app. This queue would then need logic to fork
> out the query and check periodically for the "kill" tag in the queue entry.
> Upon completion, the status and output can be left in a database or file for
> the Cat app to recover. Maybe a POE-based queue would work. This may also
> help you prioritize and limit such queries.
> Otherwise, if the query is not destructive and usage is low -- is there a
> reason why you can't just take the easy way out and let it finish and
> ignore the output?
>> To add complexity to this, we are using IIS (client specification) as
>> the front end, although we are trying to get a FastCGI rather than CGI
>> (with ActiveState's PerlEx) engine in place. We're doing this because we
>> had to use our own Perl, simply because we were getting too many
>> DBI-based memory leaks in ActivePerl and Strawberry Perl for our indexing
>> system to be able to function effectively. (Essentially, this is a
>> large-scale IR type application).
>> The Perl we use is not threaded, essentially a "5.10 with all the
>> patches as of September 2008", although I'd be happy to make it threaded
>> if that would help. As far as I can tell, alarm is just about capable of
>> cancelling long-running database queries, and with polling, the database
>> no longer seems to be the issue. However, rendering the results can take
>> a while, and IIS seems to choose not to inform anyone (or us, at least)
>> when the user cancels a request and the connection close is initiated.
>> Does anyone have any experience or recommendations?
>> All the best
>> Stuart Watt
>> swatt at infobal.com
>> List: Catalyst at lists.scsys.co.uk
>> Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
>> Searchable archive:
>> Dev site: http://dev.catalyst.perl.org/