Re: [Quixote-users] HTTP upload: anyone care?

HTTP upload: anyone care?
Greg Ward
2002-09-27
On 26 September 2002, To quixote-users@mems-exchange.org said:
> Would anyone be greatly inconvenienced if I either removed HTTP upload
> support, or replaced it with something that might be completely
> different from the current interface?  (For example, code that handles
> an upload might get a different type of request object.)

OK, I think I've come up with a pretty good HTTPUploadRequest class,
based on the code in my standalone upload.cgi script.  It derives from
the HTTPRequest already in Quixote, and has a pretty similar interface.

The basic idea is simple: when you process a form that includes
uploaded file(s), the "form value" for those files is an Upload object.
Upload objects just contain a couple of filenames; from the class
docstring:

class Upload:
    """
    Represents a single uploaded file.  Uploaded files live
    in the filesystem, *not* in memory -- this is not a file-like
    object!  It's just a place to store a couple of filenames.
    Specifically:

      orig_filename
        the complete filename supplied by the user-agent in the
        request that uploaded this file.  Depending on the browser,
        this might have the complete path of the original file
        on the client system, in the client system's syntax -- eg.
        "C:\foo\bar\upload_this" or "/foo/bar/upload_this" or
        "foo:bar:upload_this".
      base_filename
        the base component of orig_filename, shorn of MS-DOS,
        Mac OS, and Unix path components and with "unsafe"
        characters neutralized (see make_safe())
      tmp_filename
        where you'll actually find the file on the current system
    """

IOW, if you want a file-like object right now, you do this:

  upload = request.get_form_var("upload_file")
  file = open(upload.tmp_filename, "rb")

If you want to move the upload to somewhere more permanent, with
(roughly) the name supplied by the user-agent (ie. its original name on
the client system):

  upload = request.get_form_var("upload_file")
  filename = os.path.join(upload_dir, upload.base_filename)
  os.rename(upload.tmp_filename, filename)
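
A slightly more defensive variant of the move, as a sketch: os.rename()
fails if the temporary and permanent directories are on different
filesystems, and it can silently replace an existing file of the same
name.  The save_upload() helper and its "-N" suffix scheme are
hypothetical, not part of Quixote.

```python
import os
import shutil

def save_upload(tmp_filename, base_filename, upload_dir):
    # Hypothetical helper: move an uploaded file into upload_dir
    # without clobbering an existing file of the same name.
    dest = os.path.join(upload_dir, base_filename)
    root, ext = os.path.splitext(dest)
    n = 1
    # Not race-free: another process could create dest between the
    # existence check and the move.
    while os.path.exists(dest):
        dest = "%s-%d%s" % (root, n, ext)
        n += 1
    # shutil.move() falls back to copy-and-delete across filesystems.
    shutil.move(tmp_filename, dest)
    return dest
```

With an Upload object this would be called as
save_upload(upload.tmp_filename, upload.base_filename, upload_dir).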

I don't know what else you'd want to do with uploaded files, but the
information you need is all in the Upload instance.

Currently, there's one wart in HTTPUploadRequest: you have to do

  request.set_upload_dir("/my/upload/dir")

after the request object is created, but before it attempts to parse the
request body.  That's because the uploaded file is *in* the request
body, and HTTPUploadRequest needs to know where it's supposed to write
the upload first.  But the interval between creating the request and
parsing the body is under the control of Quixote's Publisher class, so
normal Quixote application code won't be able to call set_upload_dir().

And anyways, the upload_dir should probably be the same for all uploads
in an application -- my current application is a CGI script, so it
doesn't matter if upload_dir is defined as a module-level constant, a
class attribute, an instance attribute, or what.  But if
HTTPUploadRequest is to be useful for general Quixote apps, there needs
to be a better way to specify the (temporary) location for uploaded
files.

Oh duh, of course, this is what Quixote config variables are for.  Just
put your app's upload location in UPLOAD_DIR and be done with it.  If
UPLOAD_DIR is not set, HTTPUploadRequest will raise an exception.  In the
worst (?) case, an attacker would craft a bogus upload request for a
Quixote application that doesn't expect uploads.  HTTPUploadRequest will
barf, attacker gets an "Internal Server Error" page, and the upload will
be discarded.  Attacker still manages to steal bandwidth, but not disk
space.
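
The fail-early check could look roughly like this.  It is a sketch
under assumptions: the Config stand-in, the UploadConfigError exception,
and get_upload_dir() are invented for illustration, and only the
UPLOAD_DIR name comes from the proposal above.

```python
class UploadConfigError(Exception):
    """Hypothetical error for an application not configured for uploads."""

class Config:
    # Stand-in for an application's config object; None means the
    # application does not expect uploads.
    UPLOAD_DIR = None

def get_upload_dir(config):
    # Check the setting before any of the request body is written to
    # disk, so an unexpected upload costs bandwidth but no disk space.
    upload_dir = getattr(config, "UPLOAD_DIR", None)
    if not upload_dir:
        raise UploadConfigError("UPLOAD_DIR is not set; upload refused")
    return upload_dir
```

The request class would call something like get_upload_dir() before it
starts parsing the body, and the uncaught exception would surface to
the client as the "Internal Server Error" page.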

Does this sound good to everyone?

        Greg
--
Greg Ward - software developer                gward@mems-exchange.org
MEMS Exchange                            http://www.mems-exchange.org