[Catalyst] Re: [OT] Search Solution

Octavian Rasnita orasnita at gmail.com
Fri Nov 9 21:47:36 GMT 2007


From: "Peter Karman" <peter at peknet.com>
>> Do you know if it can index the html documents without parsing them with
>> other tools, or possibly other type of files like pdf, doc?
>>
>
> Xapian is a library. The related Omega project has support for parsing 
> docs of various formats.

Oh yes, Omega looks nice. Too bad it can only index static content, not
auto-generated web pages.

Can you recommend a good Perl module that makes it easy to write a spider
for indexing a web site?

Octavian 



