durusmail: qp: Problems with MIMEInput and boundaries.
Peter Wilkinson
2009-06-02
Hi all.

I've been doing some work with a Flash-based file uploader and I'm
running into a problem. The uploader is hosted on a third-party site and
posts a binary to us as multipart/form-data. The problem arises when
MIMEInput parses the input stream with readpart() and doesn't
get the boundary line it expects.

Trying to track this down, I've done some reading of what I think are
the appropriate RFCs. I think the underlying issue is that the pattern
used in MIMEInput has a \r\n at the end of the boundary line, whereas the
RFC says that the \r\n belongs at the beginning of the boundary. The Flash
uploader (using the standard Flash APIs for this, I presume) sends all
the data as we expect, but the last boundary doesn't end with \r\n: the
-- at the end of the boundary is the end of the input stream.
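
To illustrate the mismatch, here is a minimal sketch in Python. Both patterns and the sample body are hypothetical stand-ins written for this example; they are not the actual expression used in MIMEInput, and the body is only my reconstruction of what the uploader appears to send.

```python
import re

boundary = b"gc0pJq0M:08jU534c0p"

# Sample body (an assumption): the stream ends right after the closing
# "--", with no trailing CRLF.
body = (
    b"--" + boundary + b"\r\n"
    b'Content-Disposition: form-data; name="file"; filename="a.bin"\r\n'
    b"\r\n"
    b"\x00\x01binary data"
    b"\r\n--" + boundary + b"--"
)

# Hypothetical pattern that expects CRLF at the END of a boundary line.
trailing = re.compile(b"--" + re.escape(boundary) + b"(--)?\r\n")

# Hypothetical pattern in the spirit of RFC 2046: CRLF (or start of input)
# comes BEFORE the delimiter, then optional whitespace, then CRLF or end
# of input.
leading = re.compile(
    b"(?:\\A|\r\n)--" + re.escape(boundary) + b"(--)?[ \t]*(?:\r\n|\\Z)"
)

# group(1) is b"--" for a closing delimiter and None for an ordinary one.
trailing_hits = [m.group(1) for m in trailing.finditer(body)]
leading_hits = [m.group(1) for m in leading.finditer(body)]

print(trailing_hits)  # the closing delimiter is never seen
print(leading_hits)   # both delimiters are found
```

With the trailing-CRLF pattern only the first boundary matches, so a parser built on it would keep waiting for a closing boundary that never arrives.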

I believe the relevant part of the RFC
(http://www.faqs.org/rfcs/rfc2046.html) is:

The boundary delimiter MUST occur at the beginning of a line, i.e.,
following a CRLF, and the initial CRLF is considered to be attached to
the boundary delimiter line rather than part of the preceding part.
The boundary may be followed by zero or more characters of linear
whitespace. It is then terminated by either another CRLF and the
header fields for the next part, or by two CRLFs, in which case there
are no header fields for the next part. If no Content-Type field is
present it is assumed to be "message/rfc822" in a "multipart/digest"
and "text/plain" otherwise.

NOTE: The CRLF preceding the boundary delimiter line is conceptually
attached to the boundary so that it is possible to have a part that
does not end with a CRLF (line break). Body parts that must be
considered to end with line breaks, therefore, must have two CRLFs
preceding the boundary delimiter line, the first of which is part of
the preceding body part, and the second of which is part of the
encapsulation boundary.

Boundary delimiters must not appear within the encapsulated material,
and must be no longer than 70 characters, not counting the two leading
hyphens.

The boundary delimiter line following the last body part is a
distinguished delimiter that indicates that no further body parts will
follow. Such a delimiter line is identical to the previous delimiter
lines, with the addition of two more hyphens after the boundary
parameter value.

    --gc0pJq0M:08jU534c0p--

NOTE TO IMPLEMENTORS: Boundary string comparisons must compare the
boundary value with the beginning of each candidate line. An exact
match of the entire candidate line is not required; it is sufficient
that the boundary appear in its entirety following the CRLF.

There appears to be room for additional information prior to the first
boundary delimiter line and following the final boundary delimiter
line. These areas should generally be left blank, and implementations
must ignore anything that appears before the first boundary delimiter
line or after the last one.
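
Read that way, splitting a body into parts could look roughly like the sketch below. It assumes the whole body is already in memory as bytes, which is not how a streaming reader such as MIMEInput works, and split_multipart is a hypothetical helper name, not part of qp. It is also slightly stricter than the note to implementors, since it requires the delimiter line to end in CRLF or at the end of the input.

```python
import re


def split_multipart(body, boundary):
    """Return the raw parts (headers plus content) of a multipart body."""
    # The leading CRLF (or start of input) belongs to the delimiter.
    # The lookahead checks for CRLF or end of input without consuming it,
    # so the same CRLF can also introduce the next delimiter.
    delimiter = re.compile(
        b"(?:\\A|\r\n)--" + re.escape(boundary) + b"(--)?[ \t]*(?=\r\n|\\Z)"
    )
    parts = []
    start = None  # offset where the current part begins; None in the preamble
    for match in delimiter.finditer(body):
        if start is not None:
            parts.append(body[start:match.start()])
        if match.group(1):
            break  # closing delimiter: anything after it is ignored
        start = match.end() + 2  # skip the CRLF that ends the delimiter line
    return parts


# A body whose closing delimiter is the very end of the stream.
sample = (
    b"ignored preamble\r\n"
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="a"\r\n'
    b"\r\n"
    b"one"
    b"\r\n--XyZ\r\n"
    b"\r\n"
    b"two\r\n"
    b"\r\n--XyZ--"
)
parts = split_multipart(sample, b"XyZ")
```

The second part keeps its own trailing CRLF because two CRLFs precede the closing delimiter, and only the second one is attached to the boundary.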

Does this make sense?

Regards,
Peter W.

Peter Wilkinson
pfw@thirdfloor.com.au

Thirdfloor Software Works Pty Ltd
http://www.thirdfloor.com.au

Sydney       +61 2 8916 7366