[Catalyst-dev] Leaking files / file descriptors in HTTP::Body..

Tomas Doran bobtfish at bobtfish.net
Tue Oct 7 09:42:21 BST 2008


On 7 Oct 2008, at 00:17, Andy Grundman wrote:
>
> On Oct 6, 2008, at 5:49 PM, Tomas Doran wrote:
>>
>> As this is happening for me, I thought I'd knock up some tests of  
>> the unlink behavior (attached). They're not quite perfect yet, but  
>> a good start at testing this specific case.
>>
>> This test however steadfastly refuses to fail (on my laptop,  
>> haven't tested at work yet), however if I alter the File::Temp- 
>> >new call in HTTP::Body::OctetStream to add UNLINK => 0, it fails  
>> as I'd expect (so proving at least that it's testing what I think  
>> it is testing).
>>
>> The only things I can think of which could be causing this are:
>> * Really old File::Temp versions (is UNLINK new behavior) - I'll  
>> check this out when I get into work tomorrow.
>> * $File::Temp::KEEP_ALL = 1 - would it be sane to localise this  
>> variable to 0 in HTTP::Body to stop users shooting themselves in  
>> the foot?
>>
>> Anyone else got any suggestions or ideas about what may be causing  
>> this?
>
> Currently you need to manually clean up any temp files HTTP::Body  
> creates that are not the result of a form upload.  I think we'll  
> probably change this at some point, since it really should not be  
> your problem to clean up the files, but Catalyst's.
>

The temp files created by HTTP::Body::OctetStream _are_ correctly  
cleaned up at the end of the request in my test case (as they go out  
of scope at the end of the request), at least for the PUT use-case,  
which is what I care about.

I'm pretty damn sure that this also happens correctly in real  
applications (I'm going to do some more testing of this today).

_However_ - I have some production systems which are steadfastly  
refusing to clean up temp files, and so there is something wiggy  
going on. My current suspects (as previously noted) are really old  
versions of some CPAN library in /usr/local, or some other piece of  
code doing something nasty (like $File::Temp::KEEPALL = 1), which  
ruins this behavior.

My current personal problems with retarded Debian boxen aside - there  
is some logical inconsistency - some of the HTTP::Body sub-classes  
say UNLINK => 0, and rely on Catalyst to clean up, whereas some say  
nothing, rely on the default cleanup behavior. I guess that this  
should in some way be unified.

I'm happy to supply a patch & tests to unify this behavior - but  
which is the correct way to go?

If we allow the File::Temp objects to auto-cleanup, then if the user  
decides (for some mad reason) to keep the File::Temp object around,  
then the file doesn't get deleted (and that's obviously their  
problem, as they've probably had too many hits on the crack pipe if  
they want to do this) - behavior that won't work/works inconsistently  
currently, as Catalyst will come along and unlink your file for you  
(but only for certain request types)... Alternatively - we should  
make _all_ the File::Temp objects with UNLINK => 0, but get Catalyst  
to unlink the files manually (as it does currently for form uploads)  
- this approach also means you're safe from $File::Temp::KEEPALL=1.

I don't have a strong opinion about which of these is better,  
although the second feels less dangerous as the user is _forced_ to  
copy / hard link the file if they want it preserved, which I think is  
probably better behavior (i.e. less easy to shot your own foot).  
Letting File::Temp in some way do the cleanup may be superior, as  
when you look at it's code, it is dealing with a load of edge cases  
to get the cleanup right for you - but I can't think of any cases of  
the user having a legitimate reason to keep a reference to the  
temporary object around after the end of a hit...

Cheers
t0m




More information about the Catalyst-dev mailing list