[Catalyst] Production session issue - commercial support inquiry?

Matt Pitts mpitts at a3its.com
Thu Jan 8 15:20:29 GMT 2009


A Cat app for one of our clients has been experiencing random session
cross-over issues for the past several months. The first reported
instance was on Oct. 16 and I have spent countless hours looking at any
changes that took place around that time to try to isolate the problem
without success.

Here is an overview of the application setup:

 - Catalyst 5.70014
 - Catalyst::Plugin::Session 0.20
 - single Apache 2.2 load balancer front end handling remote SSL and
non-SSL requests
 - two backend servers running Cat app via HTTP::Prefork for non-SSL
traffic
 - each backend also runs Apache/mod_ssl + mod_fastcgi in order to
provide an SSL channel to the backends
 - session storage via DBIC (was originally memcached)

The symptoms of the session issues:

 - users "seeing" other users' items in their Cart
 - "club" members updating their information, but the update gets tied
to another member

I have personally experienced the issue myself while the site was under
a bit of traffic - I added an item to my Cart and I had other items in
there that I didn't touch. I grabbed my cookie and tracked down my
session and it was pointing to a cart_id that had all the items in it
that were showing on the page. The only conclusion I can draw from this
is that two separate sessions somehow got mapped to the same cart_id,
and I don't know where that could be happening. I'm fairly
confident that it's not my Cart code because the issue is also happening
on another part of the site - a member's area.
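
The manual check described above (following one cookie to its session and then to its cart_id) could also be run across every stored session at once. What follows is only a minimal sketch, not code from the application: it assumes the session rows have already been pulled out of the session store as (session_id, cart_id) pairs, and the function name and row shape are hypothetical.

```python
from collections import defaultdict


def find_shared_carts(session_rows):
    """Return {cart_id: [session ids]} for carts referenced by more than one session.

    session_rows is an iterable of (session_id, cart_id) pairs. This shape is
    an assumption made for illustration, not the application's real schema.
    """
    sessions_by_cart = defaultdict(set)
    for session_id, cart_id in session_rows:
        if cart_id is None:
            continue  # this session has no cart attached yet
        sessions_by_cart[cart_id].add(session_id)

    # Keep only the carts that two or more distinct sessions point at.
    return {
        cart_id: sorted(session_ids)
        for cart_id, session_ids in sessions_by_cart.items()
        if len(session_ids) > 1
    }
```

Running a check like this periodically and logging any non-empty result with a timestamp might help narrow down when a cross-over happens, even if it does not explain why.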

In an effort to track down the issue I have done the following (not
necessarily in chronological order):

 - put checks in place during Cart retrieval to help ensure a good
"load"
	Result: no change

 - applied both "finalize_race_condition" and "flash_in_stash" patches
from Sergio Salvi
(http://dev.catalyst.perl.org/svnweb/Catalyst/browse/branches/Catalyst-Plugin-Session/both)
	Result: no change

 - taken one of the backends out of the pool for several days
	Result: no change

 - switched session storage to FastMmap while using only a single
backend
	Result: no change

 - switched session storage to DBIC
	Result: no change

 - updated C::P::Session from 0.19 to 0.20
	Result: no change

 - changed deployment model from FastCGI to HTTP::Prefork and back
	Result: no change

 - set up SSH tunnel for SSL traffic to go directly to HTTP::Prefork
backend instances (completely eliminate FastCGI components)
	Result: no change

 - attempted to replicate problem in a test environment using a forking
site crawler I hacked up
 	Result: no success

All in all, I'm fairly confident that this issue is either an error in
my code/setup of Catalyst and C::P::Session or it's a bug somewhere in
Catalyst or C::P::Session. I know there's been some traffic on the list
about session issues and some talk of refactoring the session code, but
I haven't seen anything that is relevant to this problem.

Any obvious help or suggestions are *greatly* appreciated, but...

I'm to the point that I need additional help with this problem and since
I'm the only Linux/Perl guy in our shop I'm calling out. My company is
committed to getting this problem resolved and we want to get an idea of
what commercial support is available. Our preference would be on-site
support - we are located in Greensboro, NC, USA - but we can work
with remote support if needed. If you're currently providing commercial
support specifically for Catalyst applications and you're interested in
the work, please feel free to contact me privately with your information
and rates.

Much thanks and appreciation,

Matthew Pitts
Senior Engineer
Software and Linux Solutions
A3 IT Solutions, LLC
[c] 336.202.3913
[o] 336.389.1101 ex117
[e] mpitts at a3its.com




