[Catalyst] Alien::Dojo uses regexes to parse HTML, so what?

Tue May 30 13:53:18 CEST 2006

Dominique Quatravaux said:
>
> phaylon wrote:
>
>> Well, seeing how you don't seem to *want* to argue about it, but
>> rather just prove your point, I think it might be better if we end
>> this discussion?
>
> Feel free to do so, my version of Alien::Dojo is not even moving in
> the direction of long-term survival anyway.

Huh? You're releasing something you think will not last long? Why release
it then?

>> Just accept it, regular expressions were *not* made to parse HTML.
>
> And neither do I intend to use them to. Quoting myself:
>
>> We are not trying to address the problem of parsing HTML in
>> general,

You missed something from your post. Quoting yourself:
| ...we are trying to address the problem of parsing *one
| single page*

(Note the part highlighted by yourself.)

> For the record, what I am trying to do is
>
>     pick the first URI in the homepage that points to a Dojo zipball
>     on download.dojotoolkit.org

You didn't do that. You were looking for the first link to
http://download.dojotoolkit.org/release*. Those are two different things,
and both bring problems. That's why using canonical URIs is best: it's
based on a social agreement, not a technical assumption.
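
To make the difference concrete, here is a rough sketch. It is Python, not
the module's actual Perl code, and the sample markup, file names and
function names are all invented for illustration. It only assumes a page
where a non-zipball link under the /release prefix appears before the real
download link:

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical page content, invented for illustration only.
SAMPLE = """
<a href="http://download.dojotoolkit.org/release-notes.html">Notes</a>
<a href='http://download.dojotoolkit.org/dojo-example.zip'>Download</a>
"""


def first_release_prefix_link(html):
    """Return the first double-quoted href that starts with the /release prefix."""
    match = re.search(
        r'href="(http://download\.dojotoolkit\.org/release[^"]*)"', html
    )
    return match.group(1) if match else None


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def first_zipball_link(html):
    """Return the first link on download.dojotoolkit.org whose path ends in .zip."""
    collector = HrefCollector()
    collector.feed(html)
    for href in collector.hrefs:
        parts = urlparse(href)
        if (parts.hostname == "download.dojotoolkit.org"
                and parts.path.endswith(".zip")):
            return href
    return None


print(first_release_prefix_link(SAMPLE))
print(first_zipball_link(SAMPLE))
```

On this made-up page the prefix match returns the release notes, while the
second function returns the zip file. Neither survives the site moving its
downloads elsewhere, which is the point about canonical URIs.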

> Regexes are (as argued in the rest of the thread) one of the right
> tools for *that* job.

I'm sorry, but just repeating the same stuff over and over doesn't make it
any better. There are plenty of potential errors, and seeing that you
haven't replied to a single one of my examples (which *you* requested, by
the way), I believe you're pretty aware of that.

> You wouldn't use them, I did. It doesn't matter
> to the Catalyst community anyway since my module is going to be taken
> over. Can we settle on these terms?

"Feel free to do so." :)

p