[Catalyst] Large requests with JSON?

Bill Moseley moseley at hank.org
Sat Feb 6 16:28:12 GMT 2010


On Fri, Feb 5, 2010 at 8:56 PM, Tomas Doran <bobtfish at bobtfish.net> wrote:

>
> On 5 Feb 2010, at 20:54, Bill Moseley wrote:
>
>> AFAIK, there's no way to stream parse JSON (so that only part is in memory
>> at any given time).  What would be the recommended serialization for
>> uploaded files -- just use multipart/form-data for the uploads?
>>
>
> Don't?


> Why not just do a PUT request with all the data as unmangled binary?


As in don't provide a way to upload meta data along with the file (name,
date, description, author, title, reference id) like the web upload allows
with multipart/form-data?  Or invent some new serialization where the meta
data is embedded in the upload?  Or do a POST with the file, then flag the
new upload as incomplete until a PUT is done to set associated meta data?

The API is supposed to offer much of the same functionality as the web
interface.  JSON is somewhat nice because, well, customers have requested
it, and also because it lends itself to more complex (not flat) data
representations.  Of course, urlencoded doesn't have to be flat -- we have
some YUI-based AJAX code that sends JSON in $c->req->params->{data}.  But I
digress.

multipart/form-data is nice because, if the client is well behaved,
uploads are chunked to disk.  XML can do this, too (I have an
HTTP::Body subclass for XML-RPC that chunks base64 elements to disk).



>> BTW -- I don't see any code in HTTP::Body to limit body size.  Doesn't
>> that seem like a pretty easy DoS for Catalyst apps?  I do set a request size
>> limit in the web server, but if I need to allow 1/2GB uploads or so then
>> could kill the machine pretty easily, no?
>>
>
> Well, you set it at the web server.. That stops both overlarge
> content-length requests, and when the body exceeds the specified content
> length.
>

Yes, for example in Apache LimitRequestBody can be set and if you send a
content-length header larger than that value the request is rejected right
away.  And, IIRC, Apache will just discard any data over what is
specified in the content-length header (i.e. Catalyst won't see any data
past the content length from Apache).
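
A rough sketch of those two checks, written in Python purely for
illustration.  This is not Apache's code; the function names and the
chunk-iterator interface are made up for the example.

```python
def admit_request(declared_length, limit):
    """Return True if a request declaring this Content-Length may proceed.

    Mirrors the idea of rejecting the request up front when the declared
    length exceeds the configured limit.
    """
    return declared_length <= limit


def read_body(chunks, declared_length):
    """Yield body data, but never more than declared_length bytes.

    Anything the client sends past the declared length is dropped, so the
    application behind this never sees it.
    """
    remaining = declared_length
    for chunk in chunks:
        if remaining <= 0:
            break
        piece = chunk[:remaining]
        remaining -= len(piece)
        yield piece
```

Note that neither check says anything about where the accepted bytes end
up (memory or disk); it only bounds how many of them there are.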



> But yes, you have to provision temp file space for n files in flight x max
> file size...
>

You are making an assumption that the request body actually makes it to a
temp file.

Imagine you allow uploads of CD iso files, so say 700MB.  So, you set the
webserver's limit to that.  Normally, when someone uploads, you expect
HTTP::Body to see OctetStream or form-data posts, which end up buffered
to disk.

Now, if someone sets their content type to Urlencoded then HTTP::Body just
gathers up that 700MB in memory.   MaxClients is 50, so do the math.
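
Spelled out, with the two figures from the example above (decimal
megabytes assumed; the variable names are only for illustration):

```python
MB = 10**6

upload_limit = 700 * MB   # per-request body limit from the example
max_clients = 50          # concurrent requests (MaxClients) from the example

# Worst case: every client sends a full-size body that is held in memory.
worst_case = upload_limit * max_clients

print(worst_case / 10**9, "GB")  # prints: 35.0 GB
```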

Granted, someone would have to work very hard to send enough data at once
to the same web server, and if an attacker is that determined they could
find other equally damaging attacks.  And a good load balancer can monitor
memory and disk space on the web servers and stop sending requests to a
server low on resources.


Most applications don't have this problem since uploading a file that
large is likely rare.  Well, that assumes that everyone is using something in
front of Catalyst that limits upload size (like Apache's LimitRequestBody).

It's unusual to have a very large valid Urlencoded (or non-upload form-data)
body in a normal request (that's a lot of radio buttons and text to type!)
so would it not be wise for HTTP::Body to limit the size of
$self->{buffer} to something sane?  I suppose it could flush to disk after
getting too big, but that doesn't really help because some serializations
require reading the entire thing into memory to parse.
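
A minimal sketch of what such a cap could look like, in Python for
illustration only.  This is not HTTP::Body's code or API; the class name,
the exception, and the max_bytes parameter are all hypothetical.

```python
class BodyTooLarge(Exception):
    """Raised when an in-memory body would exceed the configured cap."""


class CappedBuffer:
    """Accumulate body chunks in memory, refusing to grow past max_bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.parts = []
        self.size = 0

    def add(self, chunk):
        # Check before storing, so the cap is never exceeded even briefly.
        if self.size + len(chunk) > self.max_bytes:
            raise BodyTooLarge(
                "body would reach %d bytes, limit is %d"
                % (self.size + len(chunk), self.max_bytes)
            )
        self.parts.append(chunk)
        self.size += len(chunk)

    def value(self):
        """Return everything buffered so far as one bytes object."""
        return b"".join(self.parts)
```

A parser that keeps its body in memory would call add() for each chunk it
reads and let the exception abort the request, while parsers that spool
to disk would not need the cap at all.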



-- 
Bill Moseley
moseley at hank.org