[Bast-commits] r9606 - in ironman: IronMan-Web/trunk branches/mk-ii/IronMan-Web
idn at dev.catalyst.perl.org
Tue Jun 29 13:01:46 GMT 2010
Author: idn
Date: 2010-06-29 14:01:46 +0100 (Tue, 29 Jun 2010)
New Revision: 9606
Added:
ironman/IronMan-Web/trunk/Changes
ironman/IronMan-Web/trunk/Makefile.PL
ironman/IronMan-Web/trunk/README
ironman/IronMan-Web/trunk/ironman_web.conf
ironman/IronMan-Web/trunk/ironman_web_ironboy.conf
ironman/IronMan-Web/trunk/ironman_web_ironman.conf
ironman/IronMan-Web/trunk/ironman_web_localhost.conf
ironman/IronMan-Web/trunk/lib/
ironman/IronMan-Web/trunk/root/
ironman/IronMan-Web/trunk/script/
ironman/IronMan-Web/trunk/t/
ironman/IronMan-Web/trunk/todo.pod
Removed:
ironman/branches/mk-ii/IronMan-Web/lib/
ironman/branches/mk-ii/IronMan-Web/root/
ironman/branches/mk-ii/IronMan-Web/script/
ironman/branches/mk-ii/IronMan-Web/t/
Log:
Moving IronMan-Web from branches/mk-ii into its own trunk.
Copied: ironman/IronMan-Web/trunk/Changes (from rev 9605, ironman/branches/mk-ii/IronMan-Web/Changes)
===================================================================
--- ironman/IronMan-Web/trunk/Changes (rev 0)
+++ ironman/IronMan-Web/trunk/Changes 2010-06-29 13:01:46 UTC (rev 9606)
@@ -0,0 +1,4 @@
+This file documents the revision history for Perl extension IronMan::Web.
+
+0.01 2009-07-01 21:07:26
+ - initial revision, generated by Catalyst
Copied: ironman/IronMan-Web/trunk/Makefile.PL (from rev 9605, ironman/branches/mk-ii/IronMan-Web/Makefile.PL)
===================================================================
--- ironman/IronMan-Web/trunk/Makefile.PL (rev 0)
+++ ironman/IronMan-Web/trunk/Makefile.PL 2010-06-29 13:01:46 UTC (rev 9606)
@@ -0,0 +1,43 @@
+use inc::Module::Install;
+
+name 'IronMan-Web';
+all_from 'lib/IronMan/Web.pm';
+
+requires 'Catalyst::Runtime' => '5.7015';
+requires 'Catalyst::Plugin::ConfigLoader';
+requires 'Catalyst::Plugin::Static::Simple';
+requires 'Catalyst::Plugin::StackTrace';
+requires 'Catalyst::Action::RenderView';
+requires 'YAML'; # This should reflect the config file format you've chosen
+ # See Catalyst::Plugin::ConfigLoader for supported formats
+
+requires 'Catalyst::Controller::reCAPTCHA';
+requires 'Catalyst::View::TT';
+requires 'Catalyst::Model::DBIC::Schema';
+requires 'Data::UUID';
+requires 'Email::Valid';
+requires 'LWP::Simple';
+requires 'XML::Feed';
+requires 'DateTime';
+requires 'XML::OPML';
+requires 'DateTime::Format::HTTP';
+requires 'YAML::XS';
+requires 'IronMan::Schema';
+
+# We need DateTime::Format::SQLite for script/import_csv.pl if using SQLite
+recommends 'DateTime::Format::SQLite';
+
+# We need TryCatch::Error for script/pull_urls.pl
+recommends 'TryCatch::Error';
+
+# We need FCGI::ProcManager to run this as a FastCGI process.
+recommends 'FCGI::ProcManager';
+
+# Testing deps
+test_requires 'Catalyst::Test';
+
+catalyst;
+
+install_script glob('script/*.pl');
+auto_install;
+WriteAll;
Copied: ironman/IronMan-Web/trunk/README (from rev 9605, ironman/branches/mk-ii/IronMan-Web/README)
===================================================================
--- ironman/IronMan-Web/trunk/README (rev 0)
+++ ironman/IronMan-Web/trunk/README 2010-06-29 13:01:46 UTC (rev 9606)
@@ -0,0 +1 @@
+Run script/ironman_web_server.pl to test the application.
Copied: ironman/IronMan-Web/trunk/ironman_web.conf (from rev 9605, ironman/branches/mk-ii/IronMan-Web/ironman_web.conf)
===================================================================
--- ironman/IronMan-Web/trunk/ironman_web.conf (rev 0)
+++ ironman/IronMan-Web/trunk/ironman_web.conf 2010-06-29 13:01:46 UTC (rev 9606)
@@ -0,0 +1,9 @@
+name IronMan::Web
+frontpage_entries 20
+<branding>
+ page_title "Planet Perl Iron Man"
+ banner "<h1 class='title'>enlightened perl organisation</h1><p class='dict'><strong>enlightened</strong> |en'litnd|: <em>adjective</em>:<br />having or showing a rational, modern, and well-informed outlook</p>"
+ page_header "Planet Perl Iron Man"
+ page_footer "Planet Perl Iron Man"
+</branding>
+default_view TT
Copied: ironman/IronMan-Web/trunk/ironman_web_ironboy.conf (from rev 9605, ironman/branches/mk-ii/IronMan-Web/ironman_web_ironboy.conf)
===================================================================
--- ironman/IronMan-Web/trunk/ironman_web_ironboy.conf (rev 0)
+++ ironman/IronMan-Web/trunk/ironman_web_ironboy.conf 2010-06-29 13:01:46 UTC (rev 9606)
@@ -0,0 +1,10 @@
+<recaptcha>
+## share ironman "global" key
+ priv_key 6LdtfQwAAAAAAHPUBd4M0YquNYsAqjARPPO1jEXn
+ pub_key 6LdtfQwAAAAAANfQgaj4u9dACa8-mr6J18HCXWhf
+</recaptcha>
+<Model::FeedDB>
+ <connect_info>
+ dsn dbi:SQLite:/var/www/ironboy.enlightenedperl.org/ironman/subscriptions.db
+ </connect_info>
+</Model::FeedDB>
Copied: ironman/IronMan-Web/trunk/ironman_web_ironman.conf (from rev 9605, ironman/branches/mk-ii/IronMan-Web/ironman_web_ironman.conf)
===================================================================
--- ironman/IronMan-Web/trunk/ironman_web_ironman.conf (rev 0)
+++ ironman/IronMan-Web/trunk/ironman_web_ironman.conf 2010-06-29 13:01:46 UTC (rev 9606)
@@ -0,0 +1,10 @@
+<recaptcha>
+## ironman keys
+ priv_key 6Le7UQwAAAAAAFYPa1kYrjElL4IBEoACORjghDk5
+ pub_key 6Le7UQwAAAAAAMi84pCrncgnp0lYHEVZmB7h7yI6
+</recaptcha>
+<Model::FeedDB>
+ <connect_info>
+ dsn dbi:SQLite:/var/www/ironman.enlightenedperl.org/ironman/subscriptions.db
+ </connect_info>
+</Model::FeedDB>
Copied: ironman/IronMan-Web/trunk/ironman_web_localhost.conf (from rev 9605, ironman/branches/mk-ii/IronMan-Web/ironman_web_localhost.conf)
===================================================================
--- ironman/IronMan-Web/trunk/ironman_web_localhost.conf (rev 0)
+++ ironman/IronMan-Web/trunk/ironman_web_localhost.conf 2010-06-29 13:01:46 UTC (rev 9606)
@@ -0,0 +1,10 @@
+<recaptcha>
+## localhost keys
+ priv_key 6LcsbAAAAAAAANQQGqwsnkrTd7QTGRBKQQZwBH-L
+ pub_key 6LcsbAAAAAAAAPDSlBaVGXjMo1kJHwUiHzO2TDze
+</recaptcha>
+<Model::FeedDB>
+ <connect_info>
+ dsn dbi:SQLite:/home/castaway/plagger/subscriptions.db
+ </connect_info>
+</Model::FeedDB>
Copied: ironman/IronMan-Web/trunk/todo.pod (from rev 9605, ironman/branches/mk-ii/IronMan-Web/todo.pod)
===================================================================
--- ironman/IronMan-Web/trunk/todo.pod (rev 0)
+++ ironman/IronMan-Web/trunk/todo.pod 2010-06-29 13:01:46 UTC (rev 9606)
@@ -0,0 +1,216 @@
+=head1 IronMan-Web release issues.
+
+Remaining issues for the live deployment on the 14th of April 2010.
+
+=head2 Page banner. - DONE - IDN
+
+The banner reads as follows:
+
+ Are you flesh? Or are you Iron?
+ Take the challenge
+ http://ironman.enlightenedperl.org/signup/new_feed
+
+
+This is missing from the top of the page.
+
+=head2 Paging.
+
+The detail between the banner described above and the very first feed post:
+
+ Join the program | Learn about the program | Report a problem
+
+ Only showing posts tagged "perl", "cpan" or "ironman" (or containing those words).
+ Last updated: 21:25:27 13-Apr-2010 First Previous 1 2 3 4 5 Next Last
+
+The above is missing at the top of the page, though the Older and Newer posts
+links are shown at the bottom of the page.
+
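+A rough sketch of how the paging links could be driven (the C<FeedDB::Post>
+model name and C<posted_on> column are assumptions; the real schema may
+differ), using the standard L<DBIx::Class> pager:
+
+    # In the controller action that builds the front page.
+    my $page  = $c->req->param('page') || 1;
+    my $posts = $c->model('FeedDB::Post')->search(
+        {},
+        {
+            order_by => { -desc => 'posted_on' },
+            rows     => $c->config->{frontpage_entries},
+            page     => $page,
+        },
+    );
+    $c->stash->{posts} = [ $posts->all ];
+
+    # Data::Page object: first_page, last_page, next_page, previous_page, etc.
+    $c->stash->{pager} = $posts->pager;
+
+The template can then render the First/Previous/1 2 3/Next/Last links from the
+stashed pager.
+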
+=head2 Captcha on signup form.
+
+Castaway has added a recaptcha to the signup form on the existing
+site. This needs backporting to the dev site.
+
+=head2 Branding.
+
+These are minor inconsistencies with the existing UI. Whilst we need to fix
+them up, we also need to make provision for restoring the all.things.per.ly
+branding in the future. Please bear this in mind when making the changes so
+that this can be done simply.
+
+Page title should be "Planet Perl Iron Man"
+
+The epo banner at the top should read:
+
+ enlightened perl organisation
+ enlightened |en'litnd|: adjective:
+ having or showing a rational, modern, and well-informed outlook
+
+Header in the yellow bar at the top of the page should similarly be:
+
+ Planet Perl Iron Man
+
+Footer should be:
+
+ Perl Iron Man Planet
+
+or:
+
+ Planet Perl Iron Man
+
+The latter would be more consistent, though the former is the existing text.
+
+=head2 Images.
+
+For some reason images are being removed from posts. No idea why this might
+be, but it's a degradation of existing functionality if we can't figure it out.
+
+=head2 Article headers.
+
+Tags and such are missing from the article headers.
+
+I'm sure this was fixed once during the data move, but I don't recall how :(
+
+=head1 IronMan-Web feature requests.
+
+=head2 OPML file link.
+
+The existing site shows a link to download the OPML file.
+
+We need something similar here: a downloadable copy of the SQLite file with
+the email addresses removed. A rough sketch of stripping the addresses follows.
+
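+One possible approach (a sketch only; the C<feed> table and C<email> column
+names are guesses at the real L<IronMan::Schema> layout): copy the live
+database and blank the addresses in the copy before serving it.
+
+    use DBI;
+    use File::Copy qw(copy);
+
+    # Work on a copy of the live database, never the original.
+    copy( 'subscriptions.db', 'subscriptions_public.db' ) or die "copy: $!";
+
+    my $dbh = DBI->connect( 'dbi:SQLite:subscriptions_public.db', '', '',
+                            { RaiseError => 1 } );
+    $dbh->do('UPDATE feed SET email = NULL');
+    $dbh->disconnect;
+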
+=head2 Spam handling.
+
+The existing site has many spam handling issues. There's lots of crap in the
+database.
+
+We need:
+
+=over
+
+=item * Tools to remove the existing crap from the database.
+
+=item * Support in Perlanet-IronMan to try to limit the amount that then reappears.
+
+A number of people have suggested that SpamAssassin might be a starting point
+for scanning existing content, and possibly new content; a minimal sketch
+follows this list.
+
+=item * A quick and simple way to remove spam feeds once they're identified.
+
+This should probably be by feed or post URL.
+
+From IRC discussion:
+
+ 12:55 < mst> for the live one it's easy
+ 12:55 < mst> just cp the sqlite db first
+ 12:55 < castaway> on ironman I always do: login, cd plagger, cp subscriptions.db scubscriptions_pewdespam.db; sqlite3 subscriptions.db
+ 12:56 < castaway> any mess, re copy and start again ;)
+ 12:56 < castaway> backups++
+
+=item * A name should not appear on the index page until at least one post has been made.
+
+=back
+
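+A minimal sketch of the SpamAssassin idea mentioned in the list above (how the
+score is acted on, and when the check runs, are left open):
+
+    use Mail::SpamAssassin;
+
+    my $sa = Mail::SpamAssassin->new;
+
+    # $text is a collected post, wrapped up as a mail-like message.
+    my $status = $sa->check_message_text($text);
+    my $score  = $status->get_score;
+    my $spam   = $status->is_spam;
+    $status->finish;
+
+    # Flag the post (and eventually the feed) for moderation if $spam is true.
+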
+Note from robinsmidsrod in #epo-ironman:
+
+ 12:20 < robinsmidsrod> idn: I just wanted to suggest to you to use the http_bl support from https://www.projecthoneypot.org to reduce spam entering the ironman database - I've
+ successfully used it on my blog - now I barely have spam entering my blog, and I don't have a captcha installed
+ 12:21 < robinsmidsrod> I used the mod_httpbl apache implementation from https://www.projecthoneypot.org/httpbl_implementations.php
+ 12:21 < idn> robinsmidsrod: Thanks for the suggestion, I'll stick it in the todo list for investigation.
+ 12:23 < robinsmidsrod> sorry, my bad, I didn't use the mod_httpbl module - I actually used a b2evolution plugin - but projecthoneypot has a simple DNS-based API, so it shouldn't be to
+ hard to make a perl module to handle it
+ 12:23 < robinsmidsrod> it works just as any other DNS-based blacklist
+ 12:24 < robinsmidsrod> but the cool thing is that you can actually choose the threshold level for when you will block users
+
+Discussion followed:
+
+ 12:24 < idn> That's an interesting idea that I hadn't thought of.
+ 12:25 < idn> How would you look to implement it, at collection time or at signup time?
+ 12:25 < robinsmidsrod> actually, if you run your own DNS server (and http server) I would suggest to support the project - it is an awesome project (I've been a member for a bit over a
+ year)
+ 12:25 < robinsmidsrod> the new site is dynamic, right?
+ 12:25 < idn> Yes
+ 12:26 < idn> Hmm, I'd love to support the project, but my employer isn't very community minded or responsible in that respect.
+ 12:26 < robinsmidsrod> so just look up REMOTE_IP, do a lookup against projecthoneypot BL (via DNS) and check the response - if it shows something that looks bad, just block or redirect
+ the user to a page that explains the problem, or enable captcha
+ 12:26 < robinsmidsrod> idn: well, I support it with my private stuff
+ 12:27 < robinsmidsrod> anyone can donate a spare MX pointer ;)
+ 12:27 < robinsmidsrod> for some hostname you would probably never use
+ 12:27 < idn> Ah, I'm with you.
+ 12:28 < robinsmidsrod> mine is XXXmailserver.smidsrod.no which points to a honeypot
+ 12:29 < idn> So all I need is some domains rather than any actual kit ;) All of mine are on 123reg which neatly solves that problem.
+ 12:30 < robinsmidsrod> me alone have helped catching approx. 20 harvesters and spammers in the last year
+ 12:30 < robinsmidsrod> which is nice to know :)
+ 12:31 < robinsmidsrod> you can support them by donating an MX entry in your DNS, setting up a "hidden" link on your own websites linking to a honeypot, or you can setup an actual
+ honeypot - I've only done the two first
+ 12:32 < idn> That seems like a good idea, I'll put it forward to the boss too and see if we can't do something here at work.
+ 12:32 < robinsmidsrod> what I do in my blog is that if the remote_ip looks suspicies I redirect the user to my honeypot page, which means that those IPs that are already somewhat fishy
+ will be redirected to something that will make them more fishy if they harvest it :)
+ 12:33 < idn> I had been contemplating using spam assassin to scan content too
+ 12:34 < robinsmidsrod> idn: this is my honeypot page: http://minmailserver.smidsrod.no/
+ 12:34 < robinsmidsrod> if you look at the HTML content you'll see that there are some hidden links that harvesters will catch
+ 12:35 < idn> There are a couple of problems to address, one is the existing bad feeds (most of which don't appear because they don't use the right keywords) and secondly preventing new
+ bad feeds.
+ 12:36 < idn> The former is more of a problem due to the way in which the list of signed up users appears on the front page.
+ 12:36 < robinsmidsrod> honeypot will only help with the new bad feeds
+ 12:36 < idn> Possibly, the site hosting the spam might well be listed
+ 12:37 * mst still thinks "only appears in the right bar if they've got at least one post" would be a start
+ 12:37 < robinsmidsrod> if you have any IP-adresses linked with existing content you could of course manually run it throuh their BL and see what you find
+ 12:37 < robinsmidsrod> mst: I agree with that one
+ 12:37 < mst> actual spam posts will get nailed pretty quickly, I think
+ 12:37 < idn> mst: Yes, that's what's in the todo I think
+ 12:38 < robinsmidsrod> a "report spam" feature is available?
+ 12:38 < castaway> robinsmidsrod: no but that'd be handy, care to write one?
+ 12:38 < idn> I've thought about that and I've mixed opinions. It seems like it could be open to abuse and or create work for someone to deal with.
+ 12:38 < castaway> idn: we run a website, its gonna create work ;
+ 12:38 < castaway> ;)
+ 12:39 < idn> Yes. But I like to try my best to minimise that ;)
+ 12:39 < robinsmidsrod> castaway: I don't have any time available, 200% workload with work + full time studies, but I can explain how it could be created to mitigate moderator
+ intervention.
+ 12:40 < idn> I was contemplating some kind of scoring system that would blacklist a feed once so many reports have been received from different requesting hosts, but it's still a
+ little open to abuse. Coupled with administrative notification and oversight to re-enable if needed.
+ 12:40 < robinsmidsrod> castaway: create a form (POST) with a "Report spam" button so that behaving robots won't access it. When enough people have clicked that button the article will
+ be blacklisted until a moderator actually clears it from blacklist
+ 12:40 < castaway> idn: like, say, bayes? ;)
+ 12:41 < robinsmidsrod> idn: exactly the same as I thought
+ 12:41 < castaway> robinsmidsrod: makes sense.. (user moderation, yay)
+ 12:41 < robinsmidsrod> that way either the author needs to complain to a moderator that his post doesn't show up
+ 12:41 < idn> castaway: Erm, not quite, though my understanding of things statistical could be written on the back of a very small pin head....
+ 12:41 < robinsmidsrod> because it got blacklisted
+ 12:42 < idn> That wouldn't work in that each and every time the feed generated a spam, it would need to be black listed. Though I like where you're going.
+ 12:42 < robinsmidsrod> mst: do you have a suggestion on how to calculate how many reports should cause blacklisting to be triggered?
+ 12:42 < idn> Blacklist the individual post with some kind of hysteresis, then blacklist the feed once enough posts have been blacklisted.
+ 12:42 < mst> idn: er, what?
+ 12:43 < mst> why would you have to regen?
+ 12:43 < robinsmidsrod> idn: or enable reporting spam on both posts and feeds
+ 12:43 < mst> oh, each time. yes.
+ 12:43 < mst> idn: that's simple.
+ 12:43 < mst> two blacklists and the feed goes.
+ 12:43 < idn> There we go then :)
+ 12:44 < robinsmidsrod> I'd suggest to put the feed in a quarantine so that it is easy for a moderator to un-blacklist feeds - and once a feed has been un-blacklisted you would increase
+ the blacklist threshold
+ 12:45 < robinsmidsrod> sometimes member blogs get hijacked and start generating spam, but I guess that problem is much smaller than spammer blogs in general
+ 12:46 < idn> I wouldn't remove the feed, they could just sign it up again. I'd opt for blacklisting it and never collecting it again
+ 12:47 < robinsmidsrod> if the feed has been in the blacklist for , let's say a month or two, it will be automatically purged from the database
+ 12:47 < robinsmidsrod> what you said makes more sense yes :)
+ 12:47 < robinsmidsrod> and if they try to signup again you redirect them to a projecthoneypot page :)
+ 12:49 < robinsmidsrod> just make sure the report spam button is a form/post button, not a link, or else you'll gather report spam-reports en masse when misbehaving robots come in
+ 12:51 < idn> Hmm, that's a good place to use the honeypot stuff too.
+ 12:52 < robinsmidsrod> catch the spammers/harvesters by their own bad behaviour :)
+
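+A sketch of the http:BL lookup discussed above (the helper name and threshold
+handling are hypothetical; see the Project Honey Pot http:BL documentation for
+the exact response semantics):
+
+    use Net::DNS;
+
+    # Query <access_key>.<reversed-ip>.dnsbl.httpbl.org; a 127.x.y.z answer
+    # encodes days-since-last-activity, threat score and visitor type.
+    sub httpbl_threat_score {
+        my ( $access_key, $ip ) = @_;
+        my $name = join '.', $access_key, reverse( split /\./, $ip ),
+                   'dnsbl.httpbl.org';
+        my $reply = Net::DNS::Resolver->new->query( $name, 'A' )
+            or return undef;    # not listed
+        for my $rr ( $reply->answer ) {
+            next unless $rr->type eq 'A';
+            my ( undef, $days, $score, $type ) = split /\./, $rr->address;
+            return $score;
+        }
+        return undef;
+    }
+
+At signup time the controller could then refuse (or captcha-challenge) any
+request whose remote address scores above a chosen threshold.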
+
+=head2 Standards compliance.
+
+=head3 Atom feed validation.
+
+L<http://feedvalidator.org/check.cgi?url=http://ironman.enlightenedperl.org/atom.xml>
+
+I'm told this involves fixing L<XML::Feed>.
+
+=head2 Gravatar support?
+
+ 12:12 < poisonbit> gravatar support could be funny
+ 12:13 < idn> poisonbit: Cool. I like. I'll add it to the wishlist.
+
+http://en.gravatar.com/
+
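+A minimal sketch (assuming a contact email address is available for each
+feed): the Gravatar image URL is simply the hex MD5 of the trimmed,
+lowercased address.
+
+    use Digest::MD5 qw(md5_hex);
+
+    sub gravatar_url {
+        my ( $email, $size ) = @_;
+        $size ||= 80;
+        $email =~ s/^\s+|\s+$//g;    # trim whitespace
+        return 'http://www.gravatar.com/avatar/' . md5_hex( lc $email )
+             . "?s=$size&d=identicon";
+    }
+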
+=cut