[Catalyst] Re: [OT] Search Solution
Octavian Rasnita
orasnita at gmail.com
Fri Nov 9 21:47:36 GMT 2007
From: "Peter Karman" <peter at peknet.com>
>> Do you know if it can index the html documents without parsing them with
>> other tools, or possibly other type of files like pdf, doc?
>>
>
> Xapian is a library. The related Omega project has support for parsing
> docs of various formats.
Oh yes, Omega looks nice. Too bad it can only index static content and
not auto-generated web pages.
Can you recommend a good Perl module that makes it easy to write a
spider for indexing a web site?
Octavian