[Html-widget] file uploads and element_type combined patch

John Napiorkowski jjn1056 at yahoo.com
Tue Nov 21 15:39:05 GMT 2006


--- Carl Franks <fireartist at gmail.com> wrote:

> On 17/11/06, John Napiorkowski <jjn1056 at yahoo.com>
> wrote:
> > --- Carl Franks <fireartist at gmail.com> wrote:
> >
> > > John,
> > >
> > > Thanks for the patch.
> > > I'm going through it now, but there's a couple
> > > things I'm not sure about...
> >
> > Yeah, there are a couple of things I'm not sure
> about
> > either.  This patch scratches my itch but it's
> likely
> > to give others a rash :)
> >
> > >
> > > On 15/11/06, John Napiorkowski
> <jjn1056 at yahoo.com>
> > > wrote:
> > > > Hi,
> > > >
> > > > Here's a patch for the two changes I've been
> > > talking
> > > > about the last day anf a half.
> > > >
> > > > First it has the new accessor called
> > > 'element_type'
> > > > for the element object, which returns the
> scalar
> > > name
> > > > of whatever you typed into the type field when
> > > > creating the element.  It's just a nicer way
> of
> > > doing
> > > > a regex against ref $element to get the type
> name.
> > > > The only changes between this and the patch I
> sent
> > > > earlier today is some documentation about the
> new
> > > > accessor.
> > > >
> > > > The second patch is more involved, and changes
> the
> > > way
> > > > file uploads are handled.  Here's what I did:
> > > >
> > > > If you don't set $widget->uploads or don't set
> the
> > > > uploads during process
> ($widget->process($query,
> > > > $uploads) this works exactly like it did
> before,
> > > 100%
> > > >
> > > > If you DO set either the uploads accessor or
> pass
> > > a
> > > > second parameter to process, the following
> > > happens:
> > > >
> > > > First, whenever you used for $uploads is
> tested to
> > > see
> > > > if it is a ref and if it has a method called
> > > 'upload'.
> > > >  If it doesn't it croaks out and stops.  We
> might
> > > want
> > > > to have this do a warn or something instead,
> > > because I
> > > > imagine this might break some people's stuff. 
> Let
> > > me
> > > > know.
> > >
> > > I don't think it's necessary to change
> > > uploads()/process() to expect a
> > > Catalyst::Request object, rather than the output
> > > from
> > > $c->req->uploads()
> > > Doing this /will/ cause the application to die
> for
> > > everyone using
> > > Catalyst::Plugin::HTML::Widget.
> > > It looks like you're only doing it to use the
> > > upload() method to get
> > > the list of upload parameter names - which we
> should
> > > be able to get
> > > with just `keys %$uploads`.
> > > If there's something I'm missing, let me know -
> > > otherwise I'll change
> > > it back to expecting a hash-ref.
> >
> > Yes, this would exactly work, and give us back
> some
> > additional semantic meaning.  I wasn't too happy
> with
> > $w->process($c->request, $c->request) either, but
> just
> > wanted to make sure this was something you could
> 'turn
> > off' if you didn't want to process uploads.
> >
> > >
> > > > If everything is good, it will get the uploads
> and
> > > > merge them into the local params, not the
> > > > catalyst->request->parameters.
> > >
> > > I think I might delay this change, until I've
> > > managed to experiment
> > > with it in a couple of applications.
> > > I think it's reasonable for the user to have to
> use
> > > $result->uploads(), as uploaded files are really
> a
> > > special case from
> > > all other parameters.
> > > I'll experiment with Cat a bit more, and see how
> it
> > > handles uploads,
> > > with regards the param() method.
> > >
> > > I think it's also reasonable for there to be a
> > > special mechanism for
> > > filters and constraints to handle uploads, as
> again
> > > they're a special
> > > case, and the filter/constraint needs to know
> that
> > > it's getting a
> > > file.
> >
> > I was just concerned that it would make it hard to
> use
> > existing filters and constraints without having to
> add
> > a lot of "If this field is an upload use the
> upload
> > acessor otherwise use the param accessor" in all
> the
> > places that field value get used.
> >
> > Since as you say uploads are a special case, maybe
> I
> > could make a subclass of filter and constraint
> that
> > operated on the uploads and then use that as a
> base
> > class for upload constraints/filters?
> >
> > Since upload fields also get something stuffed
> into
> > params under the fieldname (typically the
> filename,
> > although this can vary by browser) we can already
> use
> > the All contraint and any of the other contraints
> that
> > look to see if param(field) is defined.  Obviously
> > length and regex type contraints don't work.
> >
> > >
> > > > This get's processed as normal for the rest.
> > > > Afterwards, the results object will give you
> > > access to
> > > > the upload object (that have been filtered)
> via
> > > the
> > > > $results->param(upload_field) method.  The
> only
> > > that
> > > > that is changed with this is that before the
> > > > $results->param(...) would give you the
> filename
> > > of
> > > > the uploaded file.
> > >
> > > As I said, I'll experiment and see what I think
> > > about this.
> > >
> > > > This is also behavior that could break
> existing
> > > stuff,
> > > > if you are looking for that filename. 
> Although
> > > > honestly it's pretty useless, so hopefully
> this
> > > won't
> > > > be an issue.
> > > >
> > > > When you make filters or constraints against
> an
> > > upload
> > > > type field, the value is not a scalar, but the
> > > upload
> > > > upload that catalyst or a compatible object
> > > presents.
> > > > So if you want to do some checking you'll have
> to
> > > call
> > > > those methods on the object.
> > > >
> > > > Here's an example of a filter I wrote that
> resizes
> > > an
> > > > uploaded image:
> > > >
> > > > my $mediafile_filter = $self->form->filter(
> > > > 'Callback', 'mediafile' );
> > > >
> > > >    $mediafile_filter->callback(sub {
> > > >
> > > >       my ($value) = @_;
> > > >
> > > >       if($value)
> > > >       {
> > > >          my $content_type = $value->type;
> > > >          my $mediafile_out;
> > > >
> > > >
> > > if($c->request->parameters->{media_type_id}
> > > > eq 'profile_photo')
> > > >          {
> > > >             my $mediafile_hd = $value->fh;
> > > >
> > > >             ## Convert to the right size and
> > > output as
> > > > a jpeg
> > > >
> > > >             my $upload_img = Imager->new;
> > > >
> > > >             if
> > > ($upload_img->read(fh=>$mediafile_hd))
> > > >             {
> > > >                my $resized_img   =
> > > > $upload_img->scale(xpixels => 120, ypixels =>
> 90,
> > > type
> > > > => 'min');
> > > >
> > > >                my $background   =
> > > Imager->new(xsize =>
> > > > 120, ysize => 90);
> > > >                $background->box(filled => 1,
> color
> > > =>
> > > > 'blue');
> > > >
> > > $background->paste(src=>$resized_img);
> > > >
> > > >                $background->write(data =>
> > > > \$mediafile_out, type => 'jpeg');
> > > >             }
> > > >          }
> > > >
> > > >          return $mediafile_out;
> > > >       }
> > > >
> > > >       return '';
> > > >    });
> > >
> > > I think it's wrong for the filter to take in a
> file
> > > upload object, and
> > > return a blob/string.
> > > What if you want to run multiple filters on an
> > > uploaded file? The
> > > first one will turn it into a normal string
> variable
> > > and the following
> > > filters will choke.
> > > The filter should write back to the $file->fh()
> > > file.
> >
> > Yeah, your totally right on that, I just didn't
> have
> > time yet to write a in/deflator for DBIx that
> > converted database values to/from filehandles.  So
> I
> > fudged and let the HTML::Widgets do it for me. 
> Bad
> > seperation of concerns, I know.
> >
> > I'm just wondering would do people think this
> should
> > be.  Is returning the Catalyst upload object good
> or
> > would we prefer something else?  I'd lean toward
> the
> > object, since it would let us build the ->accept
> and
> > max size constraints, size that is already part of
> the
> > object.
> >
> > >
> > > > I also added a test for this new behavior.  I
> had
> > > to
> > > > make some changes to the test mock object to
> > > support
> > > > uploads.
> > > >
> > > > Now that we have this, I can update the upload
> > > element
> > > > so that all those documented constrainsts
> (accept,
> > > > size, ect) will actually do something.
> > > >
> > > > let me know if you want a more generic form of
> the
> > > > image filter above.  I can imagine other
> people
> > > might
> > > > want it.  It's good for when you want to allow
> a
> > > > client to upload an image but you want to
> > > normalize it
> > > > to a certain size and file type.
> > > >
> > > > Anyway, let me know if this is the right tack
> to
> > > take.
> > > >  Awaiting your thoughts.
> > >
> > > I'll also take a look at how Catalyst handles
> > > multiple inputs with the
> > > same name, where one is a file, and another
> isn't -
> > > and make sure the
> > > changes to $widget->process() work appropriate
> to
> > > that too.
> >
> > It works just like if you have multiple post
> params
> > with the same name, you get a reference to an
> array:
> >
> > $c->request->upload('upload')->[0...];
> >
> > So should fit right in to how we handle this for
> > params.
> >
> > > In fact, I'll have to see if there's a HTTP/POST
> > > spec/rfc which states
> > > how it /should/ be handled, rather than just how
> Cat
> > > handles it.
> >
> > http://www.ietf.org/rfc/rfc1867.txt is the spec I
> look
> > at, but seems to be silent on the issue.
> >
> > Both Cat and CGI.pm specify that the uploads will
> > return a ref to an array if there are more than
> one
> > with the same name.
> >
> > >
> > > Thanks again for the patch, but I think you'll
> have
> > > to be patient to
> > > see it getting committed :)
> >
> > I don't want to break HTML::Widget either.  But
> I'm
> > glad to have a change to contribute something
> since
> > this package has really helped me in my work.  I'm
> > happy to keep hacking at it until it's correct,
> just
> > might need to bother you to make sure it's cool.
> >
> > > It's important to get it right first time, er,
> > > second time!
> >
> > Let me know what you think.  I have time aside for
> > working on this over the weekend and next week. 
> Until
> > then I'll just use my local revision for my
> project.
> 
> The more I think about this, the less keen I am on
> letting Filters get
> hold of file uploads.
> Constraints such as file-type, or max-size are
> trivial, but consider
> that filters and constraints are run during
> $w->process()
> Any type of constraint or filter that requires an
> image processor or
> any other external library, has a good chance of
> dying or at least
> generating an error.
> I think this would require that each call to a
> filter or constraint's
> process method needs to be wrapped in an eval - and
> then we'll need to
> figure out how to handle any errors.
> It might be best that we hand a new copy of
> filehandles to the filters
> and constraints, so that if there is an error, the
> file can be
> restored from the original.
> This has obvious implications for large files and
> disk/memory usage.
> 
> Does anyone else have any input on this?
> 
> Carl

I think for sure we have to hand off and/or create a
new file handle for the filters and constraints.  The
IO::File object in Catalyst is set to be read only so
we are going to have to convert it to something else
internally if we are going to be able to change it.

 Yesterday I was trying to set this up so that my blob
database columns return a IO::All object (which is
pretty much a crazy IO Handle with tons of extra
functionality) on access.  Also I have it now where it
deflates an IO::* type to a string that can go into
the database.  Inbetween I want to process the upload
through Imager.pm to normalize it in some way.

I can write some custom filters for this, inheriting
from H:W:Filter, right? so I can reuse the logic but
then it get's a bit messy, because I end up doing it
twice, once as a custom constraint to make sure that
Imager.pm can actually load the upload and a second
time as a filter to normalize it.  This is as you said
above, you need to validate carefully before passing
it to a filter that relies on an external lib that can
segfault if you send the wrong data.  Imager.pm is
pretty good about this, it's a pretty modern design
and doesn't usually completely die if you do something
wrong, but some older libraries are known to do this.

It really starts to come down to how much of the
business logic constraints am I going to expect HTML
widgets to enforce.  In the Java world they have this
distinction between syntactic validation and domain or
business validation.  Figuring out where to put
filters is a little tricky for me.  Particularly
because in Catalyst we don't typically create an
object for the request query on a per model basis, we
usually just use the $c->request object and pass that
around.

In jifty they put content normalization in the
database model, [field]_canonicalize or something like
that, not in the widgets.

One thing I was thinking is that we could provide some
ready to go filters for some of the basic things you
might want from filtering uploads.  For example we
could have a H::W:F::Imager that supplies an Imager
object to the filter and expects one in the return. 
Then we could isolate that code in the base filter,
use eval if we think it's needed, etc.  This would
cover a lot of the types of things you might need to
do with an image, such as size and type normalization.

The only other kind of upload filtering I am doing is
Regex based, for things like xxxing out curse words in
various languages or making sure illegal html tags are
removed, stuff like that.  I don't think we have to
worry too much about that kind of filter generating
die level errors, at least any more so than for
running these filters against more 'normal' query
params.

Handling the larger uploads is also a consideration as
you mentioned.  First of all there is a desire to
integrate with one of the process meter plugins
floating around, both for uploading and second for
running filters.  That get's messy!  For the video
conversion I am doing I ended up moving all of it to a
custom set of forms and filters outside of H::W, at
least until I understand it better.

Well, I rambled quite a bit, not sure if I said
anything useful. 

--john



 
____________________________________________________________________________________
Sponsored Link

Online degrees - find the right program to advance your career. 
www.nextag.com



More information about the Html-widget mailing list