[Catalyst] Alien::Dojo uses regexes to parse HTML, so what?

Daniel McBrearty danielmcbrearty at gmail.com
Mon May 29 12:24:00 CEST 2006


Why not write your own damn parser and handle the whole problem
character by character, if you want? All you have to do is debug it and
make it work right.

Or you could just do what software engineers have been advocating for
decades and use an existing *reusable library*. That way you help
yourself by taking advantage of debugging that has already been done.
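
To make that concrete, here is a minimal sketch of the library approach. It
is in Python rather than Perl, it is not what Alien::Dojo actually does, and
the file names in the sample fragment are made up for illustration. It only
assumes the standard-library html.parser module. The point is that <a vs <A,
" vs ', attribute order and multiple links are all handled by the parser
rather than by a pattern you have to keep patching.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every anchor element in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names and strips the
        # quotes around values, so <a> and <A>, " and ' all look the same here.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return all anchor hrefs found in html, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical fragment: mixed case, mixed quoting, extra attributes, two links.
sample = (
    "<p><A HREF='dojo-0.3.tar.gz' alt=\"latest\">Download</A> or "
    '<a alt="old" href="dojo-0.2.tar.gz">older release</a></p>'
)
print(extract_links(sample))
```

If the page layout changes in a way that still produces valid anchors, this
keeps working; a regex written against one exact spelling of the tag would
not.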



On 5/29/06, phaylon <phaylon at dunkelheit.at> wrote:
>
> Dominique Quatravaux said:
>
> > I rest my case, unless someone can provide compelling reasons for
> > avoiding regexes *in general* for this task.
>
> mst gave only one to demonstrate the whole problem. It's like a big,
> light-sucking black hole. First it's just an alt attribute. Next it's <a vs.
> <A, next it's " vs. ', next it's one link vs. multiple.
>
> Asking the dojo site maintainer for a canonical link would be aeons easier
> and more future-proof. If the dojo guys change something your regular
> expression doesn't expect (which is a pretty broad field of
> possibilities), all your users have to update to get the new regex.
>
>
> p



-- 
Daniel McBrearty
email : danielmcbrearty at gmail.com
www.engoi.com : the multi - language vocab trainer
BTW : 0873928131