[Catalyst] More detailed proposal for changes related to content negotiation and REST

John Napiorkowski jjn1056 at yahoo.com
Fri Aug 9 21:37:30 GMT 2013






On Friday, August 9, 2013 4:52 PM, Bill Moseley <moseley at hank.org> wrote:


>
>
>On Fri, Aug 9, 2013 at 12:11 PM, John Napiorkowski <jjn1056 at yahoo.com> wrote:
>
>
>>
>>What's the use case you have in mind?  Something like first checking for 'application/vnd.mycompany.user+json' and then falling back to 'application/(?:vnd\..+\+)?json' if you don't find it?  Is that an actual case you've come across?
>>
>
>
>Ya, that's kind of what I was thinking.   Or also having a final fallback parser that tries to figure out the type by means other than just looking at the Content-Type provided in the request.  Or even a final '.' match-anything that does some special logging.
>
>
>It would be easy enough to find out if application/json was in the array more than once by mistake.
>
>

Seems like a reasonable use case then, although I would encourage future development to aim at putting more specificity in the controller rather than relying on the global application.  The primary reason to have anything here at all is to give people a base they can build on.  I do fear the globalness of it, but it seems like a not unreasonable compromise given how Catalyst actually works today.
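Just to make that matching idea concrete, here's a rough sketch of doing it by hand in a controller -- most specific type first, a generic +json fallback second, and a final match-anything entry that just logs, like you described.  The regexes, action name and JSON::MaybeXS choice are all illustrative, not proposed API:

use JSON::MaybeXS qw(decode_json);

my @body_parsers = (
    # most specific first: the vendor document gets extra checking
    [ qr{^application/vnd\.mycompany\.user\+json$} => sub {
          my $d = decode_json($_[0]);
          die "vendor document must contain a 'user' key\n"
              unless exists $d->{user};
          return $d;
      } ],
    # any other application/json or vnd.*+json falls back to a plain decode
    [ qr{^application/(?:vnd\.[^+]+\+)?json$} => \&decode_json ],
    # final match-anything entry that just logs and gives up
    [ qr{.} => sub { warn "no parser for this content type\n"; return undef } ],
);

sub update_user :Local {
    my ($self, $c) = @_;
    my $type = $c->req->content_type || '';

    my $raw = do {
        local $/;
        my $fh = $c->req->body;   # filehandle to the already-buffered body
        defined $fh ? scalar <$fh> : '';
    };

    for my $pair (@body_parsers) {
        my ($match, $parse) = @$pair;
        next unless $type =~ $match;
        my $data = $parse->($raw);
        $c->stash(user_data => $data) if defined $data;
        last;
    }
}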

> 
>
>>We've spoken before about parsing large incoming and chunked request bodies.  I would love to address this, but right now it seems like something we need better agreement on at the PSGI level.  For example, since Starman already buffers incoming input, it feels silly to me to have Catalyst then try to re-stream it.  You've already paid the full price of buffering in terms of memory and performance, right?  Or am I not understanding?
>>
>
>
>I added a Plack middleware to handle chunked encoded requests -- I needed it for the Catalyst dev server and for Apache/mod_perl.   Yes, Starman already de-chunks and buffers and works perfectly.
>
>
>Apache actually de-chunks the request, but doesn't update the Content-Length header and leaves the Transfer-Encoding: chunked header in place.  So, sadly, I do flush this to a temporary file only to get the Content-Length to make Catalyst happy.
>
>
Right, so I think in the end we all agreed it was PSGI that should be responsible for dealing with chunks or whatever (based on the HTTP-level support of the server).  The only open question is whether there could be some sane approach that exposes the input stream as a non-blocking filehandle that has not already been read into a buffer (memory or otherwise).  I do see the possible advantage there for efficiently processing large POST or PUT requests.  However, again, this needs to be done at the PSGI level, something like input.stream or similar.  That would smooth over chunked versus non-chunked and expose a readable stream of the input that has not yet been buffered.
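For reference, here's roughly the shape of the middleware you describe -- it assumes the server (Apache in your case) already stripped the chunk framing but left the headers inconsistent, so it just spools the body to a temp file to recover a Content-Length.  The package name and the 64k read size are made up:

package Plack::Middleware::DechunkedInput;   # hypothetical name
use strict;
use warnings;
use parent 'Plack::Middleware';
use File::Temp ();

sub call {
    my ($self, $env) = @_;

    # Only act when the body arrived chunked and no Content-Length is present
    if (($env->{HTTP_TRANSFER_ENCODING} || '') =~ /\bchunked\b/i
          && !defined $env->{CONTENT_LENGTH}) {

        my $fh  = File::Temp->new;   # spool to disk rather than memory
        my $len = 0;
        my $buf = '';
        while (my $read = $env->{'psgi.input'}->read($buf, 65536)) {
            print {$fh} $buf;
            $len += $read;
        }
        seek $fh, 0, 0;

        $env->{'psgi.input'}   = $fh;
        $env->{CONTENT_LENGTH} = $len;
        delete $env->{HTTP_TRANSFER_ENCODING};
    }

    return $self->app->($env);
}

1;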

> 
>I'd really like to have something at the Catalyst level that sanely achieves this end, but I think part of the price we paid when going to PSGI at the core is that most of the popular Plack handlers are preloading and buffering input, even large request input.  This seems to be an area where it might behoove us to work with the PSGI group to find something stable.  Even the optional psgix.io isn't always going to work out, since some people don't want to support that in the handler (it's a somewhat vague definition, I guess, and makes people uncomfortable).
>>
>>Until then, or until someone helps me understand that my thinking is totally wrong on this score, it seems the best thing to do is to put this out of scope for now.  That way we can move on to supporting a goodly number of real use cases.
>>
>
>
>Agreed.
> 
>
>>I intended to say that $_ equals a string that is the buffered request body.  This way we can reserve other args for handling the future streaming case.  I was actually pondering something where the sub ref returns a sub ref that gets called over and over to do the parse.
>>
>
>
>I just don't want file uploads in memory.   (Oh, I have another post coming on that -- thanks for the reminder.)
>
Well, Catalyst doesn't, but I think Starman might, depending on the size of the incoming request.  However, I think you can override that with a monkey patch.
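Going back to the parser signature I mentioned above, to pin down the calling convention I have in mind (nothing here is settled API -- the 'data_parsers' key and the idea that Catalyst would consult it are both hypothetical):

package MyApp;
use strict;
use warnings;
use Catalyst;
use JSON::MaybeXS qw(decode_json);

# Hypothetical: register a parser per content type; the coderef gets the
# already-buffered request body as a plain string in $_ and returns the
# parsed data structure.
__PACKAGE__->config(
    data_parsers => {
        'application/json' => sub { decode_json($_) },
    },
);

__PACKAGE__->setup();

# Later, $c->req->body_data (the proposed lazy accessor) would run the
# matching coderef the first time the data is asked for.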
>
> >
>>I'm not quite sure about $c->res->body( \%data );   I think body should be the raw body.   What does $c->res->body return?  The serialized JSON?  The original hashref?
>>>
>>>
>>
>>I'm not sure I like it either.  I would say body returns whatever you set it to, until the point where encoding happens.  It does feel a bit flaky, but I can't actually put my finger on a real code smell here.
>>
>>Any other suggestions?  This is certainly a part of the proposal that is going to raise doubt, but I can't think of something better, or assemble problematic use cases in my head over it either.
>>
>
>
>I don't really mind adding to $c->stash->{rest}.   It's kind of a staging area to put data in until it's ready to be encoded into the body.   I might get it partially loaded with data and then never use it and return some other body.   Nothing precludes that, of course.   Ya, tough one.
>

Well, I definitely don't want to stick this in the stash; you'll have to tie me down to get that one past me!  Given that body already allows a filehandle, I thought adding onto that would be the simplest thing, but let's give it more thought and maybe some other minds will come up with better ideas.  I'll bounce it off t0m and mst as well.
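To illustrate what I mean by body holding whatever you set until encoding time (the finalize-time encoder below is completely made up, just to show where the serialization would happen):

use JSON::MaybeXS qw(encode_json);

sub get_user :Local {
    my ($self, $c) = @_;
    # proposed: hand the response a plain data structure, not serialized text
    $c->res->body({ id => 42, name => 'Example User' });
}

# ...and somewhere late in the response cycle, a hypothetical encoding step:
sub _encode_body_data {
    my ($c) = @_;
    my $body = $c->res->body;
    if (ref($body) eq 'HASH' || ref($body) eq 'ARRAY') {
        $c->res->content_type('application/json');
        $c->res->body(encode_json($body));
    }
}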


>
>>
>>>If a parser dies what kind of exception is thrown?   You say they would not set any response status, but wouldn't we want to catch the error and then set a 400?  (I use exception objects that carry http status, a message to return in the body and a message used for logging at a given level.)
>>>
>>
>>How people do exceptions in Perl tends to be nearly religious, and I didn't want to hold this up based on figuring that stuff out :)  I was thinking to just raise an exception and let the existing Catalyst machinery do its thing.  In other words, don't add anything special for this type of error, just keep the existing behavior, for better or worse.
>>
>
>
>Agreed.  If I were to write everything from scratch again I'd be doing $c->throw_not_found or $c->throw_forbidden with exception objects, as the code ends up much cleaner and saner.   But, everyone has their own approaches.
>
>
One option is to have the response->from_psgi thing, which would make it easy to graft in something like https://metacpan.org/module/HTTP::Throwable:

use HTTP::Throwable::Factory qw(http_exception);

sub myaction :Local {
    my ($self, $c) = @_;
    # http_exception() returns the exception object rather than throwing it,
    # so we can hand its PSGI response to the proposed from_psgi helper.
    $c->res->from_psgi( http_exception({
        status_code => 500,
        reason      => 'Internal Server Error',
        message     => 'Something has gone very wrong!',
    })->as_psgi );
}

Something along those lines, I think.

> 
>since request->body_data is intended to be lazy, we won't run that parse code until you ask for the data.  We don't need to parse the data to do the basic match here; this is just based on the HTTP metadata, not the actual content.  I think for common cases this is fine (I realize that yet again this might not be the best approach for multipart uploads...)
>>
>
>
>Another tough one.    Just seems like PUT /user should accept the same data regardless of how it is serialized.   And GET /user would get the user data and then serialize that to JSON or whatever, but it's the same data.
>
>
>But, maybe you have a point.    I would worry that someone assumes JSON, adds that content type match, and then wonders why it's not working later for other request serializations.
>
>

Well, strictly speaking, RESTful content negotiation should tell the client what can and can't be accepted; for other purposes we have docs.  I think it's safe for the first go to just support JSON, since for one of the main use cases -- making it easy for people building websites with some Ajax forms -- that is all you need.  For more hardcore REST I could easily see returning data very differently based on what is asked for.  For instance, an endpoint could serve an image if you ask for PNG, but metadata about the image if you ask for JSON.
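For that PNG-versus-JSON case, a rough sketch of the kind of Accept-driven branching I mean (the path and metadata are placeholders):

use JSON::MaybeXS qw(encode_json);
use Path::Tiny qw(path);

sub image :Local :Args(1) {
    my ($self, $c, $id) = @_;
    my $accept = $c->req->header('Accept') || '';

    if ($accept =~ m{image/png}) {
        # hand back the raw image bytes (body already accepts a filehandle)
        $c->res->content_type('image/png');
        $c->res->body( path("/srv/images/$id.png")->openr_raw );
    }
    else {
        # default to JSON metadata about the image
        $c->res->content_type('application/json');
        $c->res->body( encode_json({ id => $id, format => 'png' }) );
    }
}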

>
>
>-- 
>Bill Moseley
>moseley at hank.org
>
>


