[Catalyst] Alien::Dojo uses regexes to parse HTML, so what?

phaylon phaylon at dunkelheit.at
Mon May 29 11:34:42 CEST 2006


Dominique Quatravaux said:

> I rest my case, unless someone can provide compelling reasons for
> avoiding regexes *in general* for this task.

mst gave just one example, but it demonstrates the whole problem. It's
like a big, light-sucking black hole. First it's just an alt attribute.
Next it's <a vs. <A, then " vs. ', then one link versus multiple.
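
To make that concrete, here is a rough sketch, in Python only because its
standard library ships an HTML parser. The regex, the file name
release.tar.gz and the markup variants are all made up for illustration;
none of it is taken from Alien::Dojo or the real Dojo site.

```python
import re
from html.parser import HTMLParser

# Hypothetical regex, written against one specific spelling of the link.
NAIVE_LINK = re.compile(r'<a href="([^"]+)">')


class LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag the parser sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names and strips the
        # quotes around attribute values before calling this method.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def regex_links(html):
    """Return whatever the naive regex manages to capture."""
    return NAIVE_LINK.findall(html)


def parser_links(html):
    """Return every anchor href found by the real parser."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Made-up variations of the same link, all equivalent to a browser.
VARIANTS = [
    '<a href="release.tar.gz">download</a>',
    '<A HREF="release.tar.gz">download</A>',
    "<a href='release.tar.gz'>download</a>",
    '<a title="latest" href="release.tar.gz">download</a>',
]

for variant in VARIANTS:
    print(regex_links(variant), parser_links(variant))
```

The regex only finds the link in the first variant; the parser finds it
in all four. You can patch the regex for each case, of course, but that
is exactly the black hole.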

Asking the Dojo site maintainers for a canonical link would be far easier
and more future-proof. If the Dojo guys change anything your regular
expression doesn't expect (and that covers a pretty broad range of
possibilities), all your users have to update just to get the new regex.


p



