How to POST multipart/form-data using Vegeta?

I want to POST an image file as multipart/form-data with Vegeta, but when I use this setup, it doesn't work well. As I expected, the 'mean of Bytes In' in the Vegeta report should be over 20000 because of the image size, but it was just 55.00.
I run the command like this because it is on Windows PowerShell:
PS >vegeta attack -duration=10s -rate 100 -targets .\targets_formdata.txt -output output\results.bin
targets_formdata.txt
POST http://url/to/request
Content-Type: multipart/form-data; boundary=Boundary+1234
#body.txt
body.txt
--Boundary+1234
Content-Disposition: form-data; name="file"; filename="DvBp50cVYAEIfxd.jpg"
Content-Type: image/jpeg
I wrote --Boundary+1234 literally, exactly as shown above. Could this be the problem? I don't know what the real problem is.
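Two things stand out here. First, if I read Vegeta's target format correctly, a line starting with # is a comment, so #body.txt likely sends no body at all (which would explain the tiny Bytes In); the body file is referenced as @body.txt. Second, a multipart body needs a blank line between each part's headers and its data, plus a closing --Boundary+1234-- terminator, and body.txt above has neither. A minimal Python sketch for generating such a body file (file names taken from the question):

import pathlib

BOUNDARY = "Boundary+1234"
image = pathlib.Path("DvBp50cVYAEIfxd.jpg").read_bytes()

head = (
    f"--{BOUNDARY}\r\n"
    'Content-Disposition: form-data; name="file"; filename="DvBp50cVYAEIfxd.jpg"\r\n'
    "Content-Type: image/jpeg\r\n"
    "\r\n"  # blank line between the part headers and the part data
).encode()
tail = f"\r\n--{BOUNDARY}--\r\n".encode()  # closing boundary

pathlib.Path("body.txt").write_bytes(head + image + tail)

With that file in place, the last line of the target entry would be @body.txt instead of #body.txt.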

Related

Robot Framework “multipart/form-data” REST request with multiple parameters

I am trying to use RequestsLibrary to upload some files; the goal is to achieve this:
------WebKitFormBoundary61N9vqJ7380nh6iv
Content-Disposition: form-data; name="files"; filename="photo-2.jpeg"
Content-Type: image/jpeg

------WebKitFormBoundary61N9vqJ7380nh6iv
Content-Disposition: form-data; name="fileId"

b3duLWZpbGVzL2ZmZmZmZmZmYTQyNDVmODAvMjAxNTY*
------WebKitFormBoundary61N9vqJ7380nh6iv
Content-Disposition: form-data; name="extract"

false
------WebKitFormBoundary61N9vqJ7380nh6iv--
and now I have this:
${data}= Evaluate {'files': open("C:/testautomation/resources/Assets/photo-2.jpeg", 'r+b'), 'extract': (None, 'false'), 'fileId': (None, 'b3duLWZpbGVzL2ZmZmZmZmZmYTQyNDVmODAvMjAxNTY*')}
log ${data}
${result}= Post Request rest ${url} headers=${HEADERS} files=${data}
I THINK that the only bit I am missing is the "Content-Type: image/jpeg" from the first part, but how on earth can I add that? Currently the file gets uploaded, but it is not considered to be an image file.
The answer was:
${data}= Evaluate {'files': ('photo-1.jpeg', open("C:/testautomation-robot/resources/Assets/photo-1.jpeg", 'r+b'), 'image/jpeg'), 'extract': (None, 'false'), 'fileId': (None, 'b3duLWZpbGVzL2ZmZmZmZmZmYTQyNDVmODAvMjAxNTY*')}
Found an example from here: https://code.i-harness.com/en/q/bcfb9b
>>> url = 'http://httpbin.org/post'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
In the above, the tuple is composed as follows:
(filename, data, content_type, headers)
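For completeness, posting that dict with requests looks like this; requests builds the boundary and the per-part headers itself, so you must not set a multipart Content-Type header of your own:

import requests

url = 'http://httpbin.org/post'
files = {'file': ('report.xls', open('report.xls', 'rb'),
                  'application/vnd.ms-excel', {'Expires': '0'})}
r = requests.post(url, files=files)
print(r.status_code)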
Another option: create a Python resource and call it through Robot Framework.
Call it with parameters from Robot Framework:
upload multipart files post request ${headers} ${url} resources/files/upload_file/testfile1_upload.pdf
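The Python resource itself was not included in the answer; here is a minimal sketch of what it might look like, built on the requests API shown above (the function name mirrors the keyword call, everything else is an assumption):

# upload_keywords.py - hypothetical Robot Framework resource
import os
import requests

def upload_multipart_files_post_request(headers, url, file_path):
    # the (filename, file object, content type) triple sets the part's
    # Content-Type, as in the requests example above
    with open(file_path, "rb") as f:
        files = {"files": (os.path.basename(file_path), f, "application/pdf")}
        # do not put a multipart Content-Type into headers; requests
        # generates it together with the boundary
        response = requests.post(url, headers=headers, files=files)
    return response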

nginx + fcgiwrapper sporadic problem: delivers application/octet-stream instead of text/plain

I am new to this forum. This is my first question.
I have an nginx server + fcgiwrapper set up to run programs on user request (no PHP).
For testing I have a simple bash script which displays the environment variables and sets two cookies, a second bash script which prints "Hello World" as text/plain, and another bash script which prints "Hello World" as text/html.
Another program, written in C, is supposed to read text from stdin, parse it, and print text based on the input to stdout, which should then be displayed as text/plain in the requesting web browser (the requesting browser needs to use POST).
However, sometimes the browser displays the returned text as text/plain (which it should do), but sometimes it wants to download the returned text, as if it were application/octet-stream.
But if I test the C program in a prepared environment with these environment variables:
CONTENT_LENGTH=30
REQUEST_METHOD=POST
HTTP_COOKIE=NAME=TEST; ID=200
it works every time, shows no errors and at the beginning it prints:
Content-type: text/plain (plus two newlines)
I have found that depending on the content's length it sometimes works and sometimes doesn't. (This only happens when the program is started through a web browser.)
In Firefox, using the dev tools, I could see that the answer's Content-Type was
application/octet-stream
and if I save the response, it turns out to be a text file which contains the text that should have been displayed in the browser directly.
What am I doing wrong?
Edit: I have already searched for similar problems, with no success.
+ All other things work perfectly.
+ This also happens with different browsers (Epiphany, Lynx, Internet Explorer on Windows).
Through trial and error (with curl + the Firefox dev tools) I have found that the character
0x11
in combination with:
Content-type: text/plain
makes nginx deliver Content-Type: application/octet-stream.
I don't know why this happens, but I found that the C program triggers the error because it prints 0x11 (^Q / DC1).
This phenomenon also happens with files containing this character.
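I have not verified why 0x11 trips the content-type switch, but one defensive fix is to strip control bytes from the body before printing it. A sketch in Python (the original program is C, so this only illustrates the filtering):

import sys

ALLOWED = {0x09, 0x0A, 0x0D}  # keep tab, LF and CR

def sanitize(data: bytes) -> bytes:
    # drop control bytes such as 0x11 (^Q / DC1) that can make the
    # response body look binary; keep ordinary text and line endings
    return bytes(b for b in data if b >= 0x20 or b in ALLOWED)

sys.stdout.write("Content-type: text/plain\r\n\r\n")
sys.stdout.flush()
sys.stdout.buffer.write(sanitize(b"Hello\x11World\n"))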

Is it necessary to check both '\r\n\r\n' and '\n\n' as HTTP Header/Content separator?

The accepted answer on this page says we should check HTTP server responses for both \r\n\r\n and \n\n as the sequence that separates the headers from the content.
Like:
HTTP/1.1 200 Ok\r\n
Server: AAA\r\n
Cache-Control: no-cache\r\n
Date: Fri, 07 Nov 2014 23:20:27 GMT\r\n
Content-Type: text/html\r\n
Connection: close\r\n\r\n <--------------
or:
HTTP/1.1 200 Ok\r\n
Server: AAA\r\n
Cache-Control: no-cache\r\n
Date: Fri, 07 Nov 2014 23:20:27 GMT\r\n
Content-Type: text/html\r\n
Connection: close\n\n <--------------
In all the responses I've seen in Wireshark, servers use \r\n\r\n.
Is it really necessary to check for both? Which servers/protocol versions would use \n\n?
I started off with \r\n\r\n but soon found some sites that used \n\n. And looking at some professional libraries like curl, they also handle \n\n even if it's not according to the standard.
I don't really know the curl code, but see for example here: https://github.com/curl/curl/blob/7a33c4dff985313f60f39fcde2f89d5aa43381c8/lib/http.c#L1406-L1413
/* find the end of the header line */
end = strchr(start, '\r'); /* lines end with CRLF */
if(!end) {
  /* in case there's a non-standard compliant line here */
  end = strchr(start, '\n');
  if(!end)
    /* hm, there's no line ending here, use the zero byte! */
    end = strchr(start, '\0');
}
Looking at that I think even \0\0 would be handled.
So:
If you want to handle "anything" out there, then yes.
If you want to strictly follow the standard, then no.
HTTP spec says:
The line terminator for message-header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers, recognize a single LF as a line terminator and ignore the leading CR.
In practice I've never seen a web server that uses a bare CR as the line separator.
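Put together, a lenient header/body split along the lines discussed above might look like this (a sketch, not production parsing):

def split_headers_body(raw: bytes):
    # accept the standard CRLFCRLF separator as well as a bare LFLF,
    # splitting at whichever appears first in the stream
    hits = []
    for sep in (b"\r\n\r\n", b"\n\n"):
        i = raw.find(sep)
        if i != -1:
            hits.append((i, sep))
    if not hits:
        return raw, b""  # no separator yet; everything is still headers
    i, sep = min(hits)
    return raw[:i], raw[i + len(sep):]

headers, body = split_headers_body(b"HTTP/1.1 200 Ok\nServer: AAA\n\nhello")
# headers == b"HTTP/1.1 200 Ok\nServer: AAA", body == b"hello"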

Understanding the "content type" for PDFs in crawling output

Using Heritrix, I have crawled a site which contained some PDF files. The crawl log shows that the content type for the PDF link is "application/pdf", whereas the response record in the .warc file (crawl output) shows the content type as both "application/http" and "application/pdf" (see the example below):
WARC/1.0^M
WARC-Type: response^M
WARC-Target-URI: http://example.com/b/c/files/abc.pdf^M
WARC-Date: 2014-05-29T10:48:03Z^M
WARC-Payload-Digest: sha1:JMRPMGSNIPHBPSBNPD2VJ2NIOGD75UUK^M
WARC-IP-Address: 86.36.67.50^M
WARC-Record-ID: <urn:uuid:00c8b80f-2851-42a1-a449-3cd9e238bfe9>^M
Content-Type: application/http; msgtype=response^M
Content-Length: 592173^M
WARC-Block-Digest: sha256:0a56d251257dbcbd6a54e19a528a56aae3e0c9e92a6702f4048e3b69bb3e0920^M
^M
HTTP/1.1 200 OK^M
Date: Thu, 29 May 2014 10:48:04 GMT^M
Server: Apache/2.4.4 (Unix) OpenSSL/0.9.7d PHP/5.3.12 mod_jk/1.2.35^M
Last-Modified: Wed, 20 Nov 2013 08:13:50 GMT^M
ETag: "90805-4eb975c6bcb80"^M
Accept-Ranges: bytes^M
Content-Length: 591877^M
Connection: close^M
Content-Type: application/pdf^M
followed by the content of the PDF file
I do not understand how this is happening. Can anyone please explain?
The WARC file contains:
First, the WARC-Header-Metadata, from the beginning to the first empty line. This header describes what follows, i.e. a full HTTP response with header and content. Hence the Content-Type of application/http.
Then comes the HTTP-Response-Metadata. This header is the actual HTTP header, and it describes what follows, i.e. a PDF document.
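A sketch of that nesting in code, assuming a single uncompressed response record (real WARC files hold many records and are usually gzipped, so this is illustrative only; the file name is hypothetical):

def split_block(raw: bytes):
    # both the WARC header and the HTTP header end at the first empty line
    header, _, rest = raw.partition(b"\r\n\r\n")
    return header.decode("latin-1"), rest

record = open("crawl.warc", "rb").read()
warc_header, http_response = split_block(record)      # Content-Type: application/http
http_header, pdf_bytes = split_block(http_response)   # Content-Type: application/pdf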

Proper Chunked Transfer Encoding Format

I'm curious about the proper format of chunked data in comparison to the spec and what Twitter returns from their activity stream.
When using curl to try to get a Chunked stream from Twitter, curl reports:
~$ curl -v https://stream.twitter.com/1/statuses/sample.json?delimited=length -u ...:...
< HTTP/1.1 200 OK
< Content-Type: application/json
< Transfer-Encoding: chunked
<
1984
{"place":null,"text":...
1984
{"place":null,"text":...
1984
{"place":null,"text":...
I've written a chunked data emitter based upon the Wikipedia info and the HTTP spec (essentially: chunk-size in hex, \r\n, chunk-data, \r\n), and my result looks like this:
~$ curl -vN http://localhost:7080/stream
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Transfer-Encoding: chunked
<
{"foo":{"bar":...
{"foo":{"bar":...
{"foo":{"bar":...
The difference is that Twitter appears to include the length of the string as a decimal integer in the body of the chunk (in addition to the hex value that must also be there), and I wanted to make sure that I wasn't missing something. The Twitter docs make no mention of the length value, it's not in their examples, nor do I see anything about it in the spec.
If your code does not emit length information then it is clearly incorrect. See http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.3.6.1.
RFC 2616, section 19.4.6, Introduction of Transfer-Encoding:
A process for decoding the "chunked" transfer-coding (section 3.6) can be represented in pseudo-code as:
length := 0
read chunk-size, chunk-extension (if any) and CRLF
while (chunk-size > 0) {
   read chunk-data and CRLF
   append chunk-data to entity-body
   length := length + chunk-size
   read chunk-size and CRLF
}
read entity-header
while (entity-header not empty) {
   append entity-header to existing header fields
   read entity-header
}
Content-Length := length
Remove "chunked" from Transfer-Encoding
As the RFC says, the chunk-size is not appended to the entity-body, so it is normal that you cannot see the chunk-size. I have also read the source code of curl (the function Curl_httpchunk_read) to make sure: it skips the chunk-size\r\n and just appends the chunk-size bytes behind it to the body.
As for why Twitter's reply shows lengths: the request above uses delimited=length, which asks the streaming API to prefix each JSON message with its length inside the body, so those numbers are part of the payload itself, not the chunked transfer-coding chunk sizes.
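For reference, an emitter that follows the pseudo-code above produces this wire format: each chunk is the size in hex, CRLF, the data, CRLF, terminated by a zero-size chunk. A short sketch:

def encode_chunked(chunks):
    out = b""
    for data in chunks:
        # "<size in hex>\r\n<data>\r\n" for every chunk
        out += format(len(data), "x").encode() + b"\r\n" + data + b"\r\n"
    return out + b"0\r\n\r\n"  # last-chunk plus empty trailer

wire = encode_chunked([b'{"foo":1}', b'{"bar":2}'])
# -> b'9\r\n{"foo":1}\r\n9\r\n{"bar":2}\r\n0\r\n\r\n'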
