[Catalyst] Alien::Dojo uses regexes to parse HTML, so what?
dom at idealx.com
Tue May 30 14:03:30 CEST 2006
Matt S Trout wrote:
>> my ($url) =
> <a href="http://download.dojotoolkit.org/release-notes.txt">
> Congratulations, you're toast.
This is beside the point for at least three reasons:
* this won't happen in real life, as the maintainer of
download.dojotoolkit.org apparently knows about directories,
* my regex can be cured (C<< ...\.zip)"}sx >> if I must); this was
just a throwaway example, not a snippet of something that I
intend to put in 0.02,
* I fail to see how a full-fledged HTML parser would make any
> Get a canonical address from the dojo maintainers,
I am not discussing that, because I wholeheartedly agree: this is IMO the
best thing proposed so far.
> or at the very least consider a lightweight SGMLish parsing job.
Still not sold on this one.
> Regexps are only sane for hacky one-off scripts,
I am very surprised to hear that from a top contributor to a framework
written *in Perl*: pardon me, but this particular statement just
sounds like flamebait from a Python or Ruby zealot. Hopefully there is
more to your opinion that you are willing to discuss on-list?
> at least certainly not for production use.
Despite all the respect I have for your work I simply cannot agree. I
*do* use regexes in production, sometimes even for parsing (not HTML),
they are //x, rife with comments, they are covered by a suitable
amount of unit tests (which amounts to more than for pure-OO code),
and they just do the job.
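
To make the above concrete, here is a minimal sketch of what a commented
//x regex with the "\.zip" cure might look like. The sample HTML, the
file name in it and the helper first_zip_url() are all made up for
illustration; none of this is taken from Alien::Dojo.

```perl
use strict;
use warnings;

# Hypothetical sample of a download page, NOT the real listing.
my $html = <<'HTML';
<a href="http://download.dojotoolkit.org/release-notes.txt">notes</a>
<a href="http://download.dojotoolkit.org/release-0.3.0/dojo-0.3.0-ajax.zip">zip</a>
HTML

# Return the first href on the expected host that ends in ".zip",
# or undef when the page contains no such link.
sub first_zip_url {
    my ($page) = @_;
    my ($url) = $page =~ m{
        <a \s+ href="                         # start of the anchor tag
        (
          http://download\.dojotoolkit\.org/  # the host we expect
          [^"]*                               # path, never past the closing quote
          \.zip                               # the link must end in .zip
        )
        "                                     # closing quote of the href
    }sx;
    return $url;
}

print first_zip_url($html), "\n";
```

Because the character class cannot run past the closing quote, the
release-notes.txt link from the objection above is skipped rather than
captured, and a couple of unit tests on first_zip_url() would pin that
behaviour down.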
Dominique QUATRAVAUX Ingénieur senior
01 44 42 00 08 IDEALX