durusmail: quixote-users: timestamp in upload.py prone to inaccuracy (patch)
timestamp in upload.py prone to inaccuracy (patch)
Greg Ward
2003-03-17
On 15 March 2003, Graham Fawcett said:
> I'm writing an application that takes multiple file uploads on a page. I
> discovered that the naming of uploaded files is based on a timestamp,
> and can only ensure unique filenames if no more than one file is
> uploaded per second.

Not true on Unix at least -- the files would have to come in at more
than one per *millisecond*, since the timestamp uses the floating-point
part of time.time() as well.  But maybe time.time() doesn't act like
that on all systems.  Can you fire up an interpreter and try this:

>>> from time import time
>>> time() ; time() ; time() ; time() ; time() ; time()

and tell us what you get?
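
For a slightly fuller check, something along these lines formats a few back-to-back readings the same way upload.py currently does and counts how many come out distinct. This is only a sketch: the helper name make_tstamp is made up for illustration, and it is written for a current Python interpreter.

```python
from time import time, strftime, localtime

def make_tstamp(now):
    # Same formatting as the existing upload.py code: whole seconds
    # followed by the fractional part to three decimal places.
    return (strftime("%Y%m%d.%H%M%S", localtime(now)) +
            ("%.3f" % (now % 1))[1:])

# Take six readings as fast as possible, like the interpreter test above.
readings = [time() for _ in range(6)]
stamps = [make_tstamp(t) for t in readings]

for s in stamps:
    print(s)
print("%d distinct out of %d" % (len(set(stamps)), len(stamps)))
```

If the count of distinct stamps is well below six, the clock (or the three-decimal formatting) is too coarse to tell rapid uploads apart on that system.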

> I tackled the problem by adding a 'counter' iterator, that tacks a
> unique number on the end of every filename. This should guarantee
> uniqueness. Though it's not threadsafe; maybe I should have tossed the
> thread id in there as well... ;-)
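
For reference, a counter of that sort can be made safe across threads with a lock. The following is a minimal sketch of the idea, not Graham's actual patch; the names _counter, _counter_lock and next_suffix are hypothetical.

```python
import itertools
import os
import threading

# One counter per process, guarded by a lock so two threads
# can never be handed the same number.
_counter = itertools.count()
_counter_lock = threading.Lock()

def next_suffix():
    """Return a suffix that is unique within this process."""
    with _counter_lock:
        n = next(_counter)
    # The pid keeps suffixes from different processes apart.
    return "%d.%d" % (os.getpid(), n)
```

Note that this only guarantees uniqueness for the lifetime of one process, so it still has to be combined with the timestamp.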

Please don't use absolute diffs -- I prefer unified ("diff -u") myself.
Easier to read, and much more likely to work if the baseline code has
changed since you prepared your patch.

Anyways, I think a better approach would be to tack on a random number,
and keep trying until the chosen filename does not exist.  Here's a
first crack; this is completely untested, not threadsafe (because the
standalone functions in random.py are not threadsafe), and it's subject
to a fairly obvious race condition:

--- upload.py   (revision 21160)
+++ upload.py   (working copy)
@@ -11,6 +11,7 @@
 __revision__ = "$Id$"

 import os, string
+import random
 from cgi import parse_header
 from rfc822 import Message
 from time import time, strftime, localtime
@@ -150,11 +151,15 @@
         return "<%s at %x: %s>" % (self.__class__.__name__, id(self), self)

     def receive (self, file, boundary, dir):
-        now = time()
-        tstamp = (strftime("%Y%m%d.%H%M%S", localtime(now)) +
-                  ("%.3f" % (now % 1))[1:])
-        filename = "upload.%s.%s" % (tstamp, os.getpid())
-        filename = os.path.join(dir, filename)
+        while 1:
+            now = time()
+            tstamp = strftime("%Y%m%d.%H%M%S", localtime(now))
+            fuzz = random.randint(0, 9999)
+            filename = "upload.%s.%s.%04d" % (tstamp, os.getpid(), fuzz)
+            filename = os.path.join(dir, filename)
+            if not os.path.exists(filename):
+                break
+
         ofile = open(filename, "wb")
         done = read_mime_part(file, boundary, ofile=ofile)
         ofile.close()

Can anyone improve on that?
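
One possible direction, sketched here under assumptions rather than offered as a finished fix: let the operating system do the existence check by creating the file with O_EXCL, so that checking and creating become a single step and the race goes away. The function name open_unique_upload and the attempts limit are made up for illustration, and the code is written for a current Python interpreter.

```python
import os
import random
from time import time, strftime, localtime

def open_unique_upload(dir, attempts=100):
    """Create a new, uniquely named upload file in dir.

    Returns (file object opened for binary writing, filename).
    """
    rng = random.Random()  # private generator instead of the module-level functions
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    flags |= getattr(os, "O_BINARY", 0)  # only defined on Windows
    for _ in range(attempts):
        tstamp = strftime("%Y%m%d.%H%M%S", localtime(time()))
        fuzz = rng.randint(0, 9999)
        filename = "upload.%s.%s.%04d" % (tstamp, os.getpid(), fuzz)
        filename = os.path.join(dir, filename)
        try:
            # O_EXCL makes the open fail if the name is already taken,
            # so no other process can slip in between check and create.
            fd = os.open(filename, flags, 0o600)
        except FileExistsError:
            continue  # name collision: pick new fuzz and try again
        return os.fdopen(fd, "wb"), filename
    raise RuntimeError("no unused upload filename found in %r" % dir)
```

The standard library's tempfile.mkstemp() does much the same job (exclusive creation with a generated name) and may be the simpler choice if the exact filename format does not matter.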

        Greg
--
Greg Ward - software developer                gward@mems-exchange.org
MEMS Exchange                            http://www.mems-exchange.org
