Sample Code:
package main

import (
	"fmt"
	"net/http"
	"net/http/httputil"
)

func main() {
	client := &http.Client{
		Transport: &http.Transport{
			DisableCompression: true,
		},
	}
	url := "https://google.com"
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return
	}
	//req.Header.Set("Accept-Encoding", "*")
	//req.Header.Del("Accept-Encoding")
	requestDump, err := httputil.DumpRequestOut(req, false)
	if err != nil {
		fmt.Println(err)
	}
	fmt.Println(string(requestDump))
	client.Do(req)
}
Output:
GET / HTTP/1.1
Host: google.com
User-Agent: Go-http-client/1.1
Accept-Encoding: gzip
With only req.Header.Set("Accept-Encoding", "*") uncommented:
GET / HTTP/1.1
Host: google.com
User-Agent: Go-http-client/1.1
Accept-Encoding: *
With only req.Header.Del("Accept-Encoding") uncommented:
GET / HTTP/1.1
Host: google.com
User-Agent: Go-http-client/1.1
Accept-Encoding: gzip
With both lines uncommented:
GET / HTTP/1.1
Host: google.com
User-Agent: Go-http-client/1.1
Accept-Encoding: gzip
Does DisableCompression actually do anything to the HTTP Request itself?
According to the godocs:
// DisableCompression, if true, prevents the Transport from
// requesting compression with an "Accept-Encoding: gzip"
// request header when the Request contains no existing
// Accept-Encoding value. If the Transport requests gzip on
// its own and gets a gzipped response, it's transparently
// decoded in the Response.Body. However, if the user
// explicitly requested gzip it is not automatically
// uncompressed.
As per the documentation:
DumpRequestOut is like DumpRequest but for outgoing client requests.
It includes any headers that the standard http.Transport adds, such as User-Agent.
That means it adds "Accept-Encoding: gzip" to the printed wire format regardless of the DisableCompression setting on your own Transport: the dump is rendered with the standard transport's defaults, not with the client that will actually send the request.
To test what is actually written to the connection, you need to wrap Transport.Dial or Transport.DialContext to provide a connection that logs written data.
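A minimal sketch of that approach (loggingConn is a name made up here, and example.com is a placeholder host; use a plain-http URL, since for an https URL the bytes written to this connection are TLS records):
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"os"
)

// loggingConn is a hypothetical wrapper that prints everything the
// client writes to the underlying connection.
type loggingConn struct {
	net.Conn
}

func (c loggingConn) Write(p []byte) (int, error) {
	fmt.Printf("%s", p)
	return c.Conn.Write(p)
}

func main() {
	client := &http.Client{
		Transport: &http.Transport{
			DisableCompression: true,
			DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
				conn, err := (&net.Dialer{}).DialContext(ctx, network, addr)
				if err != nil {
					return nil, err
				}
				return loggingConn{conn}, nil
			},
		},
	}
	resp, err := client.Get("http://example.com") // placeholder URL
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	resp.Body.Close()
}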
If you are using a transport that supports httptrace (which all of the built-in and "x/http/..." transport implementations do), you can set up a WroteHeaderField callback to inspect the written header fields.
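For example, a minimal sketch using the same request as above; with DisableCompression set, the callback should report no Accept-Encoding field:
package main

import (
	"fmt"
	"net/http"
	"net/http/httptrace"
)

func main() {
	req, _ := http.NewRequest(http.MethodGet, "https://google.com", nil)

	// WroteHeaderField fires for each header field actually written
	// to the wire, so this reflects the real request, not a dump.
	trace := &httptrace.ClientTrace{
		WroteHeaderField: func(key string, value []string) {
			fmt.Printf("%s: %v\n", key, value)
		},
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	client := &http.Client{
		Transport: &http.Transport{DisableCompression: true},
	}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println(err)
		return
	}
	resp.Body.Close()
}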
If you just need to inspect the headers, however, you can spin up an httptest.Server.
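A minimal sketch of that approach:
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

func main() {
	// The handler prints the headers exactly as the server received them.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Printf("Accept-Encoding: %q\n", r.Header.Get("Accept-Encoding"))
	}))
	defer srv.Close()

	client := &http.Client{
		Transport: &http.Transport{DisableCompression: true},
	}
	resp, err := client.Get(srv.URL)
	if err != nil {
		fmt.Println(err)
		return
	}
	resp.Body.Close()
}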
Playground link provided by @EmilePels:
https://play.golang.org/p/ZPi-_mfDxI8
I have a strange situation. I want to return the content type application/json; charset=utf-8 from an http handler.
func handleTest() http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Accept") != "application/json" {
			w.WriteHeader(http.StatusNotAcceptable)
			return
		}
		w.WriteHeader(http.StatusOK)
		w.Header().Set("Content-Type", "application/json; charset=utf-8")
		json.NewEncoder(w).Encode(map[string]string{"foo": "bar"})
	}
}
When I check for this in my unit tests it is correct. This test does not fail.
func TestTestHandler(t *testing.T) {
	request, _ := http.NewRequest(http.MethodGet, "/test", nil)
	request.Header.Set("Accept", "application/json")
	response := httptest.NewRecorder()

	handleTest().ServeHTTP(response, request)

	contentType := response.Header().Get("Content-Type")
	if contentType != "application/json; charset=utf-8" {
		t.Errorf("Expected Content-Type to be application/json; charset=utf-8, got %s", contentType)
		return
	}
}
But when I try with curl (and other clients) it comes out as text/plain; charset=utf-8.
$ curl -H 'Accept: application/json' localhost:8080/test -v
* Trying 127.0.0.1:8080...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /test HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.68.0
> Accept: application/json
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Tue, 28 Dec 2021 13:02:27 GMT
< Content-Length: 14
< Content-Type: text/plain; charset=utf-8
<
{"foo":"bar"}
* Connection #0 to host localhost left intact
I have tried this with curl, insomnia and python. In all 3 cases the content type came out as text/plain; charset=utf-8.
What is causing this problem and how can I fix it?
From the http package docs:
WriteHeader sends an HTTP response header with the provided status code.
and
Changing the header map after a call to WriteHeader (or Write) has no effect unless the modified headers are trailers.
So you are setting the "Content-Type" header after the headers have already been sent to the client. Mocking this appears to work because httptest.ResponseRecorder keeps the headers in a map that can still be modified after the WriteHeader call, but over a real TCP connection the headers have already gone out on the wire.
So simply move your w.WriteHeader(http.StatusOK) call so it happens after the w.Header().Set(...).
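With that change the handler becomes:
func handleTest() http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Accept") != "application/json" {
			w.WriteHeader(http.StatusNotAcceptable)
			return
		}
		// Set the header first; WriteHeader flushes the header map.
		w.Header().Set("Content-Type", "application/json; charset=utf-8")
		w.WriteHeader(http.StatusOK)
		json.NewEncoder(w).Encode(map[string]string{"foo": "bar"})
	}
}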
I can't understand what is wrong. ioutil.ReadAll should handle gzip transparently, as it does for other URLs.
Can reproduce with URL: romboutskorea.co.kr
Error:
gzip: invalid header
Code:
resp, err := http.Get("http://" + url)
if err == nil {
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusOK {
		fmt.Printf("HTTP Response Status : %v\n", resp.StatusCode)
		bodyBytes, err := ioutil.ReadAll(resp.Body)
		if err != nil {
			fmt.Printf("HTTP Response Read error. Url: %v\n", url)
			log.Fatal(err)
		}
		bodyString := string(bodyBytes)
		fmt.Printf("HTTP Response Content Length : %v\n", len(bodyString))
	}
}
The response of this site is wrong. It is claiming gzip encoding but it does not actually compress the content. The response looks something like this:
HTTP/1.1 200 OK
...
Content-Encoding: gzip
...
Transfer-Encoding: chunked
Content-Type: text/html; charset=euc-kr
8000
<html>
<head>
...
The "8000" comes from the chunked transfer encoding but the "..." is the beginning of the unchunked response body. Obviously this is not compressed even though it is claimed so.
It looks like browsers simply work around this broken site by ignoring the wrong encoding specification. Browsers actually work around lot of broken stuff which does not really add motivation for the providers to fix these issues :( But you can see that curl will fail to:
$ curl -v --compressed http://romboutskorea.co.kr/main/index.php?
...
< HTTP/1.1 200 OK
< ...
< Content-Encoding: gzip
< ...
< Transfer-Encoding: chunked
< Content-Type: text/html; charset=euc-kr
<
* Error while processing content unencoding: invalid code lengths set
* Failed writing data
* Curl_http_done: called premature == 1
* Closing connection 0
curl: (23) Error while processing content unencoding: invalid code lengths set
And so does Python:
$ python3 -c 'import requests; requests.get("http://romboutskorea.co.kr/main/index.php?")'
...
requests.exceptions.ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check'))
I see
Content-Type: text/html; charset=euc-kr
Content-Encoding: gzip
Check the Body content: as in here, it could be an HTTP response where the body is first compressed with gzip and then encoded with chunked transfer encoding.
A httputil.NewChunkedReader would be needed, as in this example (sketch below).
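A minimal sketch of that idea; decodeChunkedGzip and rawBody are names invented here, and note that Go's http client normally strips the chunked encoding from resp.Body itself, so this only applies to a raw stream:
import (
	"compress/gzip"
	"io"
	"net/http/httputil"
)

// decodeChunkedGzip reads a body that is chunk-encoded on the outside
// and gzip-compressed on the inside. rawBody must be the raw stream,
// not an already de-chunked resp.Body.
func decodeChunkedGzip(rawBody io.Reader) ([]byte, error) {
	unchunked := httputil.NewChunkedReader(rawBody)
	gz, err := gzip.NewReader(unchunked)
	if err != nil {
		return nil, err
	}
	defer gz.Close()
	return io.ReadAll(gz)
}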
I had a similar issue, but I was dealing with a "hand-crafted" PHP script response which did something like this:
header('Content-Encoding: gzip');
echo @gzcompress($return);
I was trying to read the response from GO with:
gzip.NewReader(resp.Body)
But I should be doing:
zlib.NewReader(resp.Body)
From gzcompress PHP docs:
https://www.php.net/manual/en/function.gzcompress.php
'This function compresses the given string using the ZLIB data format.'
'This is not the same as gzip compression, which includes some header data. See gzencode() for gzip compression.'
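On the Go side, that case would look something like this sketch (the URL is a placeholder for the PHP endpoint; DisableCompression keeps the transport from trying to gzip-decode the body itself):
package main

import (
	"compress/zlib"
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Keep the transport from requesting gzip and transparently
	// (mis-)decoding the body for us.
	client := &http.Client{Transport: &http.Transport{DisableCompression: true}}

	resp, err := client.Get("http://example.com/script.php") // placeholder URL
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// PHP's gzcompress emits ZLIB (RFC 1950) data, not gzip (RFC 1952),
	// so compress/zlib is the decoder that matches it.
	r, err := zlib.NewReader(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	defer r.Close()

	body, err := io.ReadAll(r)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body))
}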
Is there a way to intercept a bad HEAD request in a Go HTTP server? A bad request here would be sending a JSON payload with a HEAD request. I call this a Bad Request because that is the error I get back when I attempt a HEAD request with a body via curl. However, nothing is logged in Go.
package main

import (
	"fmt"
	"log"
	"net/http"
)

func handler(w http.ResponseWriter, r *http.Request) {
	log.Println(r.Method, r.URL)
	_, _ = fmt.Fprintf(w, "Hello")
}

func main() {
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
If I send a curl request without a body, it works as expected and a log entry is generated 2019/11/28 10:58:59 HEAD / .
$ curl -i -X HEAD http://localhost:8080
Warning: Setting custom HTTP method to HEAD with -X/--request may not work the
Warning: way you want. Consider using -I/--head instead.
HTTP/1.1 200 OK
Date: Thu, 28 Nov 2019 16:03:22 GMT
Content-Length: 5
Content-Type: text/plain; charset=utf-8
However, if I send a curl request with a body, then I get a Bad Request status but no log is updated.
$ curl -i -X HEAD http://localhost:8080 -d '{}'
Warning: Setting custom HTTP method to HEAD with -X/--request may not work the
Warning: way you want. Consider using -I/--head instead.
HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=utf-8
Connection: close
400 Bad Request
I want to catch this error so I can send my own custom error message back. How can I intercept this?
You can't. The HTTP server of the standard lib does not provide any interception point or callback for this case.
The invalid request is "killed" before your handler would be called. You can see this in server.go, conn.serve() method:
w, err := c.readRequest(ctx)
// ...
if err != nil {
	switch {
	// ...
	default:
		publicErr := "400 Bad Request"
		if v, ok := err.(badRequestError); ok {
			publicErr = publicErr + ": " + string(v)
		}
		fmt.Fprintf(c.rwc, "HTTP/1.1 "+publicErr+errorHeaders+publicErr)
		return
	}
}
// ...
serverHandler{c.server}.ServeHTTP(w, w.req)
Go's HTTP server provides an implementation to handle incoming requests from clients that adhere to the HTTP protocol, which all browsers and notable clients do. It's not the implementation's goal to provide a fully customizable server.
We have a script that on a daily basis checks all of the web links in all of our database records (the users want notifications when a link becomes out of date).
There are a couple of sites that work fine through a web browser from this IP address, but when fetched through Go they either disconnect before completing the request or return an HTTP authorisation-denied message.
I am assuming some sort of firewall (F5) is filtering/blocking the request. This occurs even when I change the HTTP request to use a common user agent. What can we do to make a Go request look like it came from a standard browser?
func fetch_url(url string, d time.Duration) (int, error) {
	client := &http.Client{
		Timeout: d,
	}
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("User-Agent", "Mozilla/5.0 (iPad; CPU OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53")
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	status := resp.StatusCode
	resp.Body.Close()
	return status, nil
}
Try matching the exact headers from a request from your web browser to eliminate other factors. A smart firewall could have heuristics on what looks like a web browser versus a robot.
Notice that the Go http client sends only a minimal HTTP request:
GET /foo HTTP/1.1
Host: localhost:3030
User-Agent: Go 1.1 package http
Accept-Encoding: gzip
Whereas a web browser is more chatty:
GET /foo HTTP/1.1
Host: localhost:3030
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
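For example, a sketch of fetch_url sending those extra headers, with the values copied from the Chrome request above:
func fetch_url(url string, d time.Duration) (int, error) {
	client := &http.Client{
		Timeout: d,
	}
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return 0, err
	}
	// Mirror a real browser's headers as closely as possible.
	req.Header.Set("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36")
	req.Header.Set("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8")
	req.Header.Set("Accept-Language", "en-US,en;q=0.8")

	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}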
I'm trying to send a POST request that needs a modified header.
Here is my code:
import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

const API_URL = "https://api.site.com/api/"

func SendOne(str string) {
	v := url.Values{}
	v.Add("source", "12345678")
	v.Add("text", str)

	client := &http.Client{}
	req, err := http.NewRequest("POST", API_URL, strings.NewReader(v.Encode()))
	if err != nil {
		fmt.Println(err)
	}
	req.Header.Add("Authorization", "123456")
	res, err := client.Do(req)
	if err != nil {
		fmt.Println(err)
	}
	defer res.Body.Close()
}
I have no idea why the code doesn't work. Any clue?
Thanks in advance.
Edit: I forgot to say I was using OAuth 2.0 for authorization.
Using tcpdump, we can see that the request headers and body for the code you pasted look like this:
POST / HTTP/1.1
Host: example.com
User-Agent: Go 1.1 package http
Content-Length: 45
Authorization: 123456
Accept-Encoding: gzip
source=12345678&text=http%3A%2F%2Fexample.com
You mention in the comment above that if you add a Content-Type header it works. Doing the same process and dumping the communication between the two peers we get:
POST / HTTP/1.1
Host: example.com
User-Agent: Go 1.1 package http
Content-Length: 45
Authorization: 123456
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip
source=12345678&text=http%3A%2F%2Fexample.com
This is exactly the same as the prior payload, except that it now includes the provided Content-Type header. So, in terms of the behavior within the Go application itself, nothing special is happening beyond what you explicitly told it to do.
The reason it works when you add the Content-Type header, then, must be that the server you're talking to wants to know how the content body you're providing is encoded.
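So the fix is to declare the encoding that url.Values.Encode produces:
req, err := http.NewRequest("POST", API_URL, strings.NewReader(v.Encode()))
if err != nil {
	fmt.Println(err)
	return
}
req.Header.Add("Authorization", "123456")
// url.Values.Encode produces form-urlencoded data, so say so.
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
res, err := client.Do(req)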