In the HTTP request and response paradigm:
Is it possible to use chunked transfer encoding with multipart HTTP messages?
If not, why?
Yes, it is possible. Why do you think it's not?
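It looks like the entire request body is subject to chunking, and structuring multipart data (or providing alternative versions of it) is orthogonal to the chunking. Below, curl sends the same form first with Content-Length and then with Transfer-Encoding: chunked, where it wraps the whole multipart body into a single chunk (0x95 = 149 bytes):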
$ curl --trace-ascii - https://httpbin.org/post --form "field1=Bonjour." -H "Expect:" -H "User-Agent:" -H "Accept:" --http1.1
[...]
=> Send header, 149 bytes (0x95)
0000: POST /post HTTP/1.1
0015: Host: httpbin.org
0028: Content-Length: 149
003d: Content-Type: multipart/form-data; boundary=--------------------
007d: ----5cb7d87e27c5c1bf
0093:
== Info: TLSv1.2 (OUT), TLS header, Supplemental data (23):
=> Send SSL data, 5 bytes (0x5)
0000: .....
=> Send data, 149 bytes (0x95)
0000: --------------------------5cb7d87e27c5c1bf
002c: Content-Disposition: form-data; name="field1"
005b:
005d: Bonjour.
0067: --------------------------5cb7d87e27c5c1bf--
== Info: We are completely uploaded and fine
[...]
$ curl --trace-ascii - https://httpbin.org/post --form "field1=Bonjour." -H "Expect:" -H "User-Agent:" -H "Accept:" --http1.1 -H "Transfer-Encoding: chunked"
[...]
=> Send header, 156 bytes (0x9c)
0000: POST /post HTTP/1.1
0015: Host: httpbin.org
0028: Transfer-Encoding: chunked
0044: Content-Type: multipart/form-data; boundary=--------------------
0084: ----db8064ac2c1d04aa
009a:
== Info: TLSv1.2 (OUT), TLS header, Supplemental data (23):
=> Send SSL data, 5 bytes (0x5)
0000: .....
=> Send data, 155 bytes (0x9b)
0000: 95
0004: --------------------------db8064ac2c1d04aa
0030: Content-Disposition: form-data; name="field1"
005f:
0061: Bonjour.
006b: --------------------------db8064ac2c1d04aa--
0099:
== Info: Signaling end of chunked upload via terminating chunk.
[...]
=> Send data, 5 bytes (0x5)
0000: 0
0003:
[...]
It's possible to split the request body into arbitrary chunks.
$ cr=$'\r'; lf=$'\n'; delim="xyz:abc:foobar"; h="httpbin.org"; body="--${delim}${cr}${lf}Content-Disposition: form-data; name=\"field1\"${cr}${lf}${cr}${lf}Bonjour.${cr}${lf}--${delim}--${cr}${lf}"; cl=${#body}; { echo -ne "POST /post HTTP/1.1\r\nHost: ${h}\r\nContent-Type: multipart/form-data; boundary=${delim}\r\nContent-Length: ${cl}\r\n\r\n${body}"; sleep 2; } | tee /dev/stderr | openssl s_client -connect "${h}:443" -servername "${h}" -quiet -no_ign_eof
POST /post HTTP/1.1
Host: httpbin.org
Content-Type: multipart/form-data; boundary=xyz:abc:foobar
Content-Length: 97
--xyz:abc:foobar
Content-Disposition: form-data; name="field1"
Bonjour.
--xyz:abc:foobar--
depth=2 C = US, O = Amazon, CN = Amazon Root CA 1
verify return:1
depth=1 C = US, O = Amazon, OU = Server CA 1B, CN = Amazon
verify return:1
depth=0 CN = httpbin.org
verify return:1
HTTP/1.1 200 OK
Date: Thu, 27 Oct 2022 03:57:21 GMT
Content-Type: application/json
Content-Length: 388
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
{
"args": {},
"data": "",
"files": {},
"form": {
"field1": "Bonjour."
},
"headers": {
"Content-Length": "97",
"Content-Type": "multipart/form-data; boundary=xyz:abc:foobar",
"Host": "httpbin.org",
"X-Amzn-Trace-Id": "Root=1-635a01a1-569f3ccb69a8ad4d07704cc7"
},
"json": null,
"origin": "WWW.XXX.YYY.ZZZ",
"url": "https://httpbin.org/post"
}
DONE
$ cr=$'\r'; lf=$'\n'; delim="xyz:abc:foobar"; h="httpbin.org"; body="--${delim}${cr}${lf}Content-Disposition: form-data; name=\"field1\"${cr}${lf}${cr}${lf}Bonjour.${cr}${lf}--${delim}--${cr}${lf}"; chunked=""; while true; do b="${body:0:32}"; body="${body:32}"; chunked+="$(printf "%x" ${#b})${cr}${lf}${b}${cr}${lf}"; (( ${#b} )) || break; done; cl=${#chunked}; { echo -ne "POST /post HTTP/1.1\r\nHost: ${h}\r\nContent-Type: multipart/form-data; boundary=${delim}\r\nTransfer-Encoding: chunked\r\nContent-Length: ${cl}\r\n\r\n${chunked}"; sleep 2; } | tee /dev/stderr | openssl s_client -connect "${h}:443" -servername "${h}" -quiet -no_ign_eof
POST /post HTTP/1.1
Host: httpbin.org
Content-Type: multipart/form-data; boundary=xyz:abc:foobar
Transfer-Encoding: chunked
Content-Length: 125
20
--xyz:abc:foobar
Content-Dispos
20
ition: form-data; name="field1"
20
Bonjour.
--xyz:abc:foobar--
1
0
depth=2 C = US, O = Amazon, CN = Amazon Root CA 1
verify return:1
depth=1 C = US, O = Amazon, OU = Server CA 1B, CN = Amazon
verify return:1
depth=0 CN = httpbin.org
verify return:1
HTTP/1.1 200 OK
Date: Thu, 27 Oct 2022 04:07:47 GMT
Content-Type: application/json
Content-Length: 388
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
{
"args": {},
"data": "",
"files": {},
"form": {
"field1": "Bonjour."
},
"headers": {
"Content-Length": "97",
"Content-Type": "multipart/form-data; boundary=xyz:abc:foobar",
"Host": "httpbin.org",
"X-Amzn-Trace-Id": "Root=1-635a0412-6a407dba17a44c4b09c49a36"
},
"json": null,
"origin": "76.71.106.87",
"url": "https://httpbin.org/post"
}
DONE
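The chunk framing that the shell loop above builds by hand can also be sketched in Python. This is only a minimal illustration, not an HTTP client: chunk_encode and chunk_decode are hypothetical helper names, and the body and the 32-byte chunk size are copied from the example above. It shows that chunking works on the raw body bytes and pays no attention to the multipart boundaries.

```python
CRLF = b"\r\n"

# Same multipart body as in the shell example above (97 bytes).
BODY = (
    b"--xyz:abc:foobar\r\n"
    b'Content-Disposition: form-data; name="field1"\r\n'
    b"\r\n"
    b"Bonjour.\r\n"
    b"--xyz:abc:foobar--\r\n"
)


def chunk_encode(body: bytes, chunk_size: int) -> bytes:
    """Frame body as HTTP/1.1 chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    out = bytearray()
    for start in range(0, len(body), chunk_size):
        piece = body[start:start + chunk_size]
        # Each chunk: size in hex, CRLF, data, CRLF.
        out += format(len(piece), "x").encode("ascii") + CRLF + piece + CRLF
    # A zero-size chunk followed by an empty trailer section ends the body.
    out += b"0" + CRLF + CRLF
    return bytes(out)


def chunk_decode(data: bytes) -> bytes:
    """Reassemble the body; chunk extensions and trailers are ignored."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(CRLF, pos)
        size = int(data[pos:eol].split(b";")[0], 16)
        pos = eol + len(CRLF)
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + len(CRLF)
    return bytes(body)


encoded = chunk_encode(BODY, 32)
print(len(BODY), len(encoded))
```

The two printed lengths, 97 and 125, correspond to the Content-Length values visible in the two transcripts above, and decoding the chunked bytes gives back the original multipart body unchanged.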
I am trying to use sendgridr as described in the topic "How I can add cc when sending email from R with SendGrid",
and I am running into an HTTP error.
This is the result I get:
POST /v3/mail/send HTTP/1.1
Host: api.sendgrid.com
User-Agent: libcurl/7.64.1 r-curl/4.3.2 httr/1.4.2
Accept-Encoding: deflate, gzip
Accept: application/json, text/xml, application/xml, */*
Authorization: Bearer SG.Iu.............xxxx....
Content-Type: application/json
Content-Length: 470
>> {"personalizations":[{
>> "to": [ {"email": "mymail@mail.com"}],
>> "cc": [ {"email": "mymail@mail.com"}]
>> }],
>> "from": {"email": "mymail@mail.com"},
>> "subject": "Testing Sendgrid",
>> "content": [{"type": "text/plain", "value": "
>> Dear friend,
>>
>>
>>
>> I'm testing email.
>>
>>
>>
>> Kind regards,
>>
>> Mati"}]
>> }
HTTP/1.1 400 Bad Request
Server: nginx
Date: Wed, 29 Jun 2022 09:13:15 GMT
Content-Type: application/json
Content-Length: 63
Connection: keep-alive
Access-Control-Allow-Origin: https://sendgrid.api-docs.io
Access-Control-Allow-Methods: POST
Access-Control-Allow-Headers: Authorization, Content-Type, On-behalf-of, x-sg-elas-acl
Access-Control-Max-Age: 600
X-No-CORS-Reason: https://sendgrid.com/docs/Classroom/Basics/API/cors.html
Strict-Transport-Security: max-age=600; includeSubDomains
Kind regards
Mat
I'd like to send a HEAD request with a request body.
So I tried the below commands. But I got some errors.
$ curl -X HEAD http://localhost:8080 -d "test"
Warning: Setting custom HTTP method to HEAD with -X/--request may not work the
Warning: way you want. Consider using -I/--head instead.
curl: (18) transfer closed with 11 bytes remaining to read
or I tried this one:
$ curl -I http://localhost:8080 -d "test"
Warning: You can only select one HTTP request method! You asked for both POST
Warning: (-d, --data) and HEAD (-I, --head).
I think the RFC doesn't prohibit sending a HEAD request with a request body.
How can I send one?
By default, with -d/--data, the POST method is used.
With -I/--head you ask curl to use the HEAD method.
Which method (POST or HEAD) does your service accept?
I use the https://httpbin.org site for testing.
With cURL, you could use POST like this:
$ curl --silent --include https://httpbin.org/post -d "data=spam_and_eggs"
HTTP/2 200
date: Thu, 30 Sep 2021 18:57:02 GMT
content-type: application/json
content-length: 438
server: gunicorn/19.9.0
access-control-allow-origin: *
access-control-allow-credentials: true
{
"args": {},
"data": "",
"files": {},
"form": {
"data": "spam_and_eggs"
},
"headers": {
"Accept": "*/*",
"Content-Length": "18",
"Content-Type": "application/x-www-form-urlencoded",
"Host": "httpbin.org",
"User-Agent": "curl/7.71.1",
"X-Amzn-Trace-Id": "Root=1-6156087e-6b04f4645dce993909a95b24"
},
"json": null,
"origin": "86.245.210.158",
"url": "https://httpbin.org/post"
}
or the HEAD method:
$ curl --silent -X HEAD --include https://httpbin.org/headers -d "data=spam_and_eggs"
HTTP/2 200
date: Thu, 30 Sep 2021 18:58:30 GMT
content-type: application/json
content-length: 260
server: gunicorn/19.9.0
access-control-allow-origin: *
access-control-allow-credentials: true
I checked with strace (over plain HTTP) that the HEAD request is passed to the server together with its data:
sendto(5, "HEAD /headers HTTP/1.1\r\nHost: httpbin.org\r\nUser-Agent: curl/7.71.1\r\nAccept: */*\r\nContent-Length: 18\r\nContent-Type: application/x-www-form-urlencoded\r\n\r\ndata=spam_and_eggs", 170, MSG_NOSIGNAL, NULL, 0) = 170
Of course, without the --silent option, the warning message appears:
Warning: Setting custom HTTP method to HEAD with -X/--request may not work the
Warning: way you want. Consider using -I/--head instead.
My research is based on this very old post: https://serverfault.com/questions/140149/difference-between-curl-i-and-curl-x-head
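For comparison, the same kind of request can be sketched with Python's standard http.client, which also sends a body with HEAD when one is passed explicitly. This is an illustration under assumptions, not part of the curl answer: the throwaway local server, the Handler class, and the head_with_body helper are all hypothetical names used only to show that the body reaches the server.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = {}  # filled in by the test server below


class Handler(BaseHTTPRequestHandler):
    """Tiny local server that records the body of a HEAD request."""

    def do_HEAD(self):
        length = int(self.headers.get("Content-Length", 0))
        received["body"] = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet


def head_with_body(host: str, port: int, path: str, body: bytes) -> int:
    """Send HEAD with a request body and return the response status."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(
            "HEAD",
            path,
            body=body,
            headers={"Content-Type": "application/x-www-form-urlencoded"},
        )
        response = conn.getresponse()
        response.read()  # a HEAD response carries no body
        return response.status
    finally:
        conn.close()


# Port 0 lets the operating system pick a free port for the local server.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status = head_with_body("127.0.0.1", server.server_port, "/headers",
                        b"data=spam_and_eggs")
server.shutdown()
server.server_close()
print(status, received["body"])
```

Whether a real service does anything with the body of a HEAD request is up to that service; the sketch only shows that the client side can send it.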
I am trying to consume an API which requires NTLM authentication.
This curl command works fine:
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' --ntlm -u user:password -d '{ "key1": "100", "key2": "1"}' http://some/api/v/12
Now I am trying to do the same from a Go program:
package main

import (
	"bytes"
	"fmt"
	"net/url"
	"net/http"
	"io/ioutil"
	"log"
	"github.com/Azure/go-ntlmssp"
)

func main() {
	url_ := "http://some/api/v/12"
	client := &http.Client{
		Transport: ntlmssp.Negotiator{
			RoundTripper: &http.Transport{},
		},
	}
	data := url.Values{}
	data.Set("key1", "100")
	data.Set("key2", "1")
	b := bytes.NewBufferString(data.Encode())
	req, err := http.NewRequest("POST", url_, b)
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "application/json")
	req.SetBasicAuth("user", "password")
	resp, err := client.Do(req)
	if err != nil {
		fmt.Printf("Error : %s", err)
	} else {
		responseData, err := ioutil.ReadAll(resp.Body)
		if err != nil {
			log.Fatal(err)
		}
		responseString := string(responseData)
		fmt.Println(responseString)
		resp.Body.Close()
	}
}
When I execute this program I receive an "invalid credentials" error, which is the same error I get when I leave out the "--ntlm" flag in the curl command.
Can you please give me a hint on how I can accomplish this task with Go?
Update
printing the request from the curl command:
* About to connect() to www.xxx.xxx.com port xx (#0)
* Trying xxx.xxx.x.xxx...
* Connected to www.xxx.xxx.com (xxx.xxx.x.xx) port xx (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* Server auth using NTLM with user 'user'
> POST /some/api/v2 HTTP/1.1
> Authorization: NTLM xxxxx (44 characters)
> User-Agent: curl/7.29.0
> Host: www.xxx.xxx.com
> Content-Type: application/json
> Accept: application/json
> Content-Length: 0
>
< HTTP/1.1 401 Unauthorized
< Content-Type: text/html; charset=us-ascii
< Server: Microsoft-HTTPAPI/2.0
< WWW-Authenticate: NTLM xxxxx (312 characters)
< Date: Thu, xx Aug xxxx xx:xx:xx xxx
< Content-Length: 341
<
* Ignoring the response-body
* Connection
* Issue another request to this URL: 'http://some/api/v2'
* Found bundle for host www.xxx.xxx.com: 0x0000
* Re-using existing connection!
* Connected to www.xxx.xxx.com (xxx.xxx.x.xx) port xx (#0)
* Server auth using NTLM with user 'user'
> POST /api/v2 HTTP/1.1
> Authorization: NTLM xxx (176 characters)
> User-Agent: curl/7.29.0
> Host: www.xxx.xxx.com
> Content-Type: application/json
> Accept: application/json
> Content-Length: 39
>
* upload completely sent off: 39 out of 39 bytes
< HTTP/1.1 200 OK
< Cache-Control: no-cache
< Pragma: no-cache
< Content-Type: application/json; charset=utf-8
< Expires: -1
< Server: Microsoft-IIS/7.5
< X-AspNet-Version: 4.0.30319
< Persistent-Auth: true
< X-Powered-By: ASP.NET
< Date: Thu, 08 Aug 2019 06:49:41 GMT
< Content-Length: 1235
NTLM needs a fully qualified Domain\Username login. Email or simple username does not work. So for the username part, it has to look like this:
MYDOMAIN\[username]
where [username] is the actual windows user.
Apache Tika should be accessible from a Python program via HTTP, but I can't get it to work.
I am using this command to run the server (I tried it with and without the two options at the end):
java -jar tika-server-1.17.jar --port 5677 -enableUnsecureFeatures -enableFileUrl
And it works fine with curl:
curl -v -T /tmp/tmpsojwBN http://localhost:5677/tika
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 5677 (#0)
> PUT /tika HTTP/1.1
> Host: localhost:5677
> User-Agent: curl/7.47.0
> Accept: */*
> Accept-Encoding: gzip, deflate
> Content-Length: 418074
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Date: Sat, 07 Apr 2018 12:28:41 GMT
< Transfer-Encoding: chunked
< Server: Jetty(8.y.z-SNAPSHOT)
But when I try something like (tried different combinations for headers, here I recreated same headers as python-tika client uses):
with tempfile.NamedTemporaryFile() as tmp_file:
    download_file(url, tmp_file)
    payload = open(tmp_file.name, 'rb')
    headers = {
        'Accept': 'application/json',
        'Content-Disposition': 'attachment; filename={}'.format(
            os.path.basename(tmp_file.name))}
    response = requests.put(TIKA_ENDPOINT_URL + '/tika', payload,
                            headers=headers,
                            verify=False)
I've tried to use payload as well as fileUrl - with the same result of WARN javax.ws.rs.ClientErrorException: HTTP 406 Not Acceptable and java stack trace on the server. Full trace:
WARN javax.ws.rs.ClientErrorException: HTTP 406 Not Acceptable
at org.apache.cxf.jaxrs.utils.SpecExceptions.toHttpException(SpecExceptions.java:117)
at org.apache.cxf.jaxrs.utils.ExceptionUtils.toHttpException(ExceptionUtils.java:173)
at org.apache.cxf.jaxrs.utils.JAXRSUtils.findTargetMethod(JAXRSUtils.java:542)
at org.apache.cxf.jaxrs.interceptor.JAXRSInInterceptor.processRequest(JAXRSInInterceptor.java:177)
at org.apache.cxf.jaxrs.interceptor.JAXRSInInterceptor.handleMessage(JAXRSInInterceptor.java:77)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:274)
at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:261)
at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:76)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1088)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1024)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:973)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1035)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:641)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:231)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:748)
I've also tried to compare (with nc -l localhost 5677 | less) how the two requests differ (payload abbreviated):
From curl:
PUT /tika HTTP/1.1
Host: localhost:5677
User-Agent: curl/7.47.0
Accept: */*
Content-Length: 418074
Expect: 100-continue
%PDF-1.4
%<D3><EB><E9><E1>
1 0 obj
<</Creator (Chromium)
From Python requests library:
PUT /tika HTTP/1.1
Host: localhost:5677
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: application/json
User-Agent: python-requests/2.13.0
Content-type: application/pdf
Content-Length: 246176
%PDF-1.4
%<D3><EB><E9><E1>
1 0 obj
<</Creator (Chromium)
The question is, what is the correct way to call Tika server from Python?
I've also tried the python tika library in client-only mode and tika-app via jnius. Both the tika client and tika-app.jar with pyjnius simply freeze (the call never returns) when I use them in a celery worker. At the same time, pyjnius / tika-app and the tika-python script both work nicely in a standalone script, and I have not figured out what goes wrong inside the celery worker. I guess it has something to do with threading and/or initialization in the wrong place. But that is a topic for another question.
And here is what tika-python requests:
PUT /tika HTTP/1.1
Host: localhost:5677
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: application/json
User-Agent: python-requests/2.13.0
Content-Disposition: attachment; filename=tmpb3YkTq
Content-Length: 183234
%PDF-1.4
%<D3><EB><E9><E1>
1 0 obj
<</Creator (Chromium)
And now it seems like this is some kind of a problem with tika server:
$ tika-python --verbose --server 'localhost' --port 5677 parse all /tmp/tmpb3YkTq
2018-04-08 09:44:11,555 [MainThread ] [INFO ] Writing ./tmpb3YkTq_meta.json
(<open file '<stderr>', mode 'w' at 0x7f0b688eb1e0>, 'Request headers: ', {'Accept': 'application/json', 'Content-Disposition': 'attachment; filename=tmpb3YkTq'})
(<open file '<stderr>', mode 'w' at 0x7f0b688eb1e0>, 'Response headers: ', {'Date': 'Sun, 08 Apr 2018 06:44:13 GMT', 'Transfer-Encoding': 'chunked', 'Content-Type': 'application/json', 'Server': 'Jetty(8.y.z-SNAPSHOT)'})
['./tmpb3YkTq_meta.json']
Cf:
$ tika-python --verbose --server 'localhost' --port 5677 parse text /tmp/tmpb3YkTq
2018-04-08 09:43:38,326 [MainThread ] [INFO ] Writing ./tmpb3YkTq_meta.json
(<open file '<stderr>', mode 'w' at 0x7fc3eee4a1e0>, 'Request headers: ', {'Accept': 'application/json', 'Content-Disposition': 'attachment; filename=tmpb3YkTq'})
(<open file '<stderr>', mode 'w' at 0x7fc3eee4a1e0>, 'Response headers: ', {'Date': 'Sun, 08 Apr 2018 06:43:38 GMT', 'Content-Length': '0', 'Server': 'Jetty(8.y.z-SNAPSHOT)'})
2018-04-08 09:43:38,409 [MainThread ] [WARNI] Tika server returned status: 406
['./tmpb3YkTq_meta.json']
I would like to execute a fairly complex HTTP request with multipart/mixed boundaries from the command line.
POST /batch HTTP/1.1
Host: www.googleapis.com
Content-length: 592
Content-type: multipart/mixed; boundary=batch_0123456789
Authorization: Bearer authorization_token
--batch_0123456789
Content-Type: application/http
Content-ID: <item1:user@example.com>
Content-Transfer-Encoding: binary
POST /drive/v2/files/fileId/permissions
Content-Type: application/json
Content-Length: 71
{
"role": "reader",
"type": "user",
"value": "user@example.com"
}
--batch_0123456789
Content-Type: application/http
Content-ID: <item2:user@example.com>
Content-Transfer-Encoding: binary
POST /drive/v2/files/fileId/permissions
Content-Type: application/json
Content-Length: 71
{
"role": "reader",
"type": "user",
"value": "user@example.com"
}
--batch_0123456789--
Ideally I would like to put this request into a file and then simply call curl to execute that HTTP request.
curl myrequest.txt
Is there any easy straightforward way of doing this? I understand that there are client libraries that have their idiomatic ways of handling this, but I am interested to find out if there is a way to do this from the command line.
You can use the --config option (see the "CONFIG FILE" section of the manual for more details):
curl --config myrequest.txt
I don't think there's a clean way to embed a multiline POST body within the config file. You could replace each newline character with \r\n (CRLF newlines are required for multipart requests):
url = "http://www.googleapis.com/batch"
header = "Content-length: 592"
header = "Content-type: multipart/mixed; boundary=batch_0123456789"
header = "Authorization: Bearer authorization_token"
data-binary = "--batch_0123456789\r\nContent-Type: application/http\r\nContent-ID: <item1:user@example.com>\r\nContent-Transfer-Encoding: binary\r\n\r\n..."
but that's not very easy to read.
Alternatively, you could put the POST body in a separate file. For example:
myrequest.txt
url = "http://www.googleapis.com/batch"
header = "Content-length: 592"
header = "Content-type: multipart/mixed; boundary=batch_0123456789"
header = "Authorization: Bearer authorization_token"
data-binary = "@myrequestbody.txt"
myrequestbody.txt
--batch_0123456789
Content-Type: application/http
Content-ID: <item1:user@example.com>
Content-Transfer-Encoding: binary
POST /drive/v2/files/fileId/permissions
Content-Type: application/json
Content-Length: 71
...
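Because the body file needs CRLF line endings and the Content-length header has to match the bytes actually sent, it can help to normalize the file and measure it before filling in the config. The following Python sketch does that under assumptions: to_crlf and prepare_body are hypothetical helper names, the body is assumed to be UTF-8 text, and the demo uses a temporary file in place of myrequestbody.txt.

```python
import os
import tempfile


def to_crlf(text: str) -> bytes:
    """Convert LF, CR, and CRLF line endings to CRLF and encode as UTF-8."""
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", "\r\n").encode("utf-8")


def prepare_body(src_path: str, dst_path: str) -> int:
    """Write a CRLF-normalized copy of src_path and return its size in bytes."""
    # newline="" stops Python from translating line endings while reading.
    with open(src_path, encoding="utf-8", newline="") as src:
        body = to_crlf(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(body)
    return len(body)


# Demo with a temporary file standing in for myrequestbody.txt.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "body_lf.txt")
    dst = os.path.join(tmp, "body_crlf.txt")
    with open(src, "w", encoding="utf-8", newline="") as f:
        f.write("--batch_0123456789\nContent-Type: application/http\n\n")
    size = prepare_body(src, dst)
    with open(dst, "rb") as f:
        written = f.read()
    print(size)
```

The returned size is the value to put in the Content-length header of the config file, and the normalized copy is the file to reference from data-binary.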