How do you view exactly what an HTTP client sends over the wire, and how the connection is configured?
Go has two packages, httputil and httptrace, that help you inspect the HTTP lifecycle as well as what is actually sent over the wire: http-tracing blog post
httptrace Go doc
httputil Go doc
NOTE: httputil.DumpRequestOut is meant for outgoing requests on the client side, and httputil.DumpRequest is meant for incoming requests on the server side.
NOTE: httputil.DumpRequestOut adds the headers that the standard http.Transport would add (such as User-Agent and Accept-Encoding), so any customization of your own transport is not reflected in the dump. See: Why does the HTTP Client Force an Accept-Encoding header
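To make the difference between the two dump functions concrete, here is a minimal sketch that dumps the same request both ways. The helper names dumpBoth and hasHeader and the example.com URL are assumptions made for illustration; no network traffic is sent.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httputil"
	"strings"
)

// dumpBoth returns the client-side (DumpRequestOut) and server-side
// (DumpRequest) dumps of the same GET request.
func dumpBoth(url string) (outgoing, incoming string, err error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return "", "", err
	}
	out, err := httputil.DumpRequestOut(req, false)
	if err != nil {
		return "", "", err
	}
	in, err := httputil.DumpRequest(req, false)
	if err != nil {
		return "", "", err
	}
	return string(out), string(in), nil
}

// hasHeader reports whether a dump contains a header line with the given name.
func hasHeader(dump, name string) bool {
	return strings.Contains(dump, "\r\n"+name+": ")
}

func main() {
	out, in, err := dumpBoth("https://example.com/")
	if err != nil {
		fmt.Println("dump failed:", err)
		return
	}
	fmt.Printf("DumpRequestOut:\n%s\nDumpRequest:\n%s\n", out, in)
	fmt.Println("Accept-Encoding in outgoing dump:", hasHeader(out, "Accept-Encoding"))
	fmt.Println("Accept-Encoding in incoming dump:", hasHeader(in, "Accept-Encoding"))
}
```

Only the outgoing dump should show the transport-added headers, which is why DumpRequest can look misleadingly sparse when used on the client side.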
Sample Implementation:
package main
import (
"crypto/tls"
"fmt"
"net/http"
"net/http/httptrace"
"net/http/httputil"
"net/textproto"
"time"
)
func main() {
url := "https://www.google.com"
client := &http.Client{}
req, err := http.NewRequest(http.MethodGet, url, nil)
if err != nil {
fmt.Printf("%s: NEW REQUEST ERR: %s\n", time.Now(), err)
return
}
requestDump, err := httputil.DumpRequestOut(req, false)
if err != nil {
fmt.Printf("%s: REQUEST ERR: %s\n", time.Now(), err)
return
}
fmt.Printf("%s: REQUEST: \n%s\n", time.Now(), string(requestDump))
trace := &httptrace.ClientTrace{
// GetConn is called before a connection is created or
// retrieved from an idle pool. The hostPort is the
// "host:port" of the target or proxy. GetConn is called even
// if there's already an idle cached connection available.
GetConn: func(hostPort string) {
fmt.Printf("Get Conn: hostPort: %s\n", hostPort)
},
// GotConn is called after a successful connection is
// obtained. There is no hook for failure to obtain a
// connection; instead, use the error from
// Transport.RoundTrip.
GotConn: func(connInfo httptrace.GotConnInfo) {
fmt.Printf("Got Conn: connInfo: %+v\n", connInfo)
},
// PutIdleConn is called when the connection is returned to
// the idle pool. If err is nil, the connection was
// successfully returned to the idle pool. If err is non-nil,
// it describes why not. PutIdleConn is not called if
// connection reuse is disabled via Transport.DisableKeepAlives.
// PutIdleConn is called before the caller's Response.Body.Close
// call returns.
// For HTTP/2, this hook is not currently used.
PutIdleConn: func(err error) {
fmt.Printf("PutIdleConn: ERR: %v\n", err)
},
// GotFirstResponseByte is called when the first byte of the response
// headers is available.
GotFirstResponseByte: func() {
fmt.Println("GotFirstResponseByte")
},
// Got100Continue is called if the server replies with a "100
// Continue" response.
Got100Continue: func() {
fmt.Println("Got100Continue")
},
// Got1xxResponse is called for each 1xx informational response header
// returned before the final non-1xx response. Got1xxResponse is called
// for "100 Continue" responses, even if Got100Continue is also defined.
// If it returns an error, the client request is aborted with that error value.
Got1xxResponse: func(code int, header textproto.MIMEHeader) error {
fmt.Printf("Got1xxResponse: code: %d header: %+v\n", code, header)
return nil
},
// DNSStart is called when a DNS lookup begins.
DNSStart: func(dnsInfo httptrace.DNSStartInfo) {
fmt.Printf("DNS Start: dnsInfo: %+v\n", dnsInfo)
},
// DNSDone is called when a DNS lookup ends.
DNSDone: func(dnsInfo httptrace.DNSDoneInfo) {
fmt.Printf("DNS Done: dnsInfo: %+v\n", dnsInfo)
},
// ConnectStart is called when a new connection's Dial begins.
// If net.Dialer.DualStack (IPv6 "Happy Eyeballs") support is
// enabled, this may be called multiple times.
ConnectStart: func(network, addr string) {
fmt.Printf("Connect Start: Network Addr: %s %s\n", network, addr)
},
// ConnectDone is called when a new connection's Dial
// completes. The provided err indicates whether the
// connection completed successfully.
// If net.Dialer.DualStack ("Happy Eyeballs") support is
// enabled, this may be called multiple times.
ConnectDone: func(network, addr string, err error) {
fmt.Printf("Connect Done: Network Addr: %s %s ERR: %s\n", network, addr, err)
},
// TLSHandshakeStart is called when the TLS handshake is started. When
// connecting to an HTTPS site via an HTTP proxy, the handshake happens
// after the CONNECT request is processed by the proxy.
TLSHandshakeStart: func() {
fmt.Println("TLSHandshakeStart")
},
// TLSHandshakeDone is called after the TLS handshake with either the
// successful handshake's connection state, or a non-nil error on handshake
// failure.
TLSHandshakeDone: func(connState tls.ConnectionState, err error) {
fmt.Printf("TLSHandshakeDone: connState: %+v ERR: %s\n", connState, err)
},
// WroteHeaderField is called after the Transport has written
// each request header. At the time of this call the values
// might be buffered and not yet written to the network.
WroteHeaderField: func(key string, value []string) {
fmt.Printf("WroteHeaderField: key: %s val: %s\n", key, value)
},
// WroteHeaders is called after the Transport has written
// all request headers.
WroteHeaders: func() {
fmt.Println("WroteHeaders")
},
// Wait100Continue is called if the Request specified
// "Expect: 100-continue" and the Transport has written the
// request headers but is waiting for "100 Continue" from the
// server before writing the request body.
Wait100Continue: func() {
fmt.Println("Wait100Continue")
},
// WroteRequest is called with the result of writing the
// request and any body. It may be called multiple times
// in the case of retried requests.
WroteRequest: func(info httptrace.WroteRequestInfo) {
fmt.Printf("WroteRequest: %+v\n", info)
},
}
req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
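resp, err := client.Do(req)
if err != nil {
fmt.Printf("%s: RESPONSE ERR: %s\n", time.Now(), err)
return
}
defer resp.Body.Close()
fmt.Printf("%s: RESPONSE OBJ: \n%v\n", time.Now(), resp)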
}
Output:
2020-07-29 14:09:53.682167 -0700 PDT m=+0.000769969: REQUEST:
GET / HTTP/1.1
Host: www.google.com
User-Agent: Go-http-client/1.1
Accept-Encoding: gzip
Get Conn: hostPort: www.google.com:443
DNS Start: dnsInfo: {Host:www.google.com}
DNS Done: dnsInfo: {Addrs:[{IP:172.217.17.100 Zone:} {IP:2a00:1450:400e:806::2004 Zone:}] Err:<nil> Coalesced:false}
Connect Start: Network Addr: tcp 172.217.17.100:443
Connect Done: Network Addr: tcp 172.217.17.100:443 ERR: %!s(<nil>)
TLSHandshakeStart
TLSHandshakeDone: connState: {Version:772 HandshakeComplete:true DidResume:false CipherSuite:4865 NegotiatedProtocol:h2 NegotiatedProtocolIsMutual:true ServerName: PeerCertificates:[0xc0001d6000 0xc0001d6580] VerifiedChains:[[0xc0001d6000 0xc0001d6580 0xc000278b00]] SignedCertificateTimestamps:[] OCSPResponse:[] ekm:0x1226ae0 TLSUnique:[]} ERR: %!s(<nil>)
Got Conn: connInfo: {Conn:0xc0001a2000 Reused:false WasIdle:false IdleTime:0s}
WroteHeaderField: key: :authority val: [www.google.com]
WroteHeaderField: key: :method val: [GET]
WroteHeaderField: key: :path val: [/]
WroteHeaderField: key: :scheme val: [https]
WroteHeaderField: key: accept-encoding val: [gzip]
WroteHeaderField: key: user-agent val: [Go-http-client/2.0]
WroteHeaders
WroteRequest: {Err:<nil>}
GotFirstResponseByte
2020-07-29 14:09:54.620195 -0700 PDT m=+0.938796345: RESPONSE OBJ:
&{200 OK 200 HTTP/2.0 2 0 map[Alt-Svc:[h3-29=":443"; ma=2592000,h3-27=":443"; ma=2592000,h3-T050=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"] Cache-Control:[private, max-age=0] Content-Type:[text/html; charset=ISO-8859-1] Date:[Wed, 29 Jul 2020 21:09:54 GMT] Expires:[-1] P3p:[CP="This is not a P3P policy! See g.co/p3phelp for more info."] Server:[gws] Set-Cookie:[1P_JAR=2020-07-29-21; expires=Fri, 28-Aug-2020 21:09:54 GMT; path=/; domain=.google.com; Secure NID=204=qnJT-6IGam7-C1fTR8uIkbDPnfV7OwgOGn5-6tGCWLYmeaRMoSKgV1qSRfKGLghNgQVWY9N_o6hUWKm69I5KrdVqIEVVxRy6XSY6F4c1JyTJZZqEMxMlkpznu-PWOn9eAezKBONTxCZgsGZYboEeYZ5-qZBjUvd7BratNIPkTxU; expires=Thu, 28-Jan-2021 21:09:54 GMT; path=/; domain=.google.com; HttpOnly] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]] 0xc00018c1e0 -1 [] false true map[] 0xc000112100 0xc00007c000}
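The Got Conn line in the output above reports Reused:false because it was the first request on a fresh connection. As a small sketch of how that field behaves, the following issues two sequential requests against a local httptest server (an assumption chosen so the example runs offline) and records GotConnInfo.Reused for each. The helper names getReused and reuseDemo are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// getReused performs a GET and reports whether the transport served it
// on a connection taken from the idle pool.
func getReused(client *http.Client, url string) (bool, error) {
	var reused bool
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	// Drain the body so the connection can go back to the idle pool.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return false, err
	}
	return reused, nil
}

// reuseDemo sends two sequential requests with the same client.
func reuseDemo() (first, second bool, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()
	client := srv.Client()
	if first, err = getReused(client, srv.URL); err != nil {
		return false, false, err
	}
	second, err = getReused(client, srv.URL)
	return first, second, err
}

func main() {
	first, second, err := reuseDemo()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("first request reused a connection: ", first)
	fmt.Println("second request reused a connection:", second)
}
```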
Related
I am basically making a health check crawler to a huge list of domains. I have a Golang script that creates ~256 routines that make requests to the list of domains. I am using the same client with the following transport configuration:
# init func
this.client = &http.Client{
Transport: &http.Transport{
ForceAttemptHTTP2: true,
TLSHandshakeTimeout: TLSHandShakeTimeout,
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
MaxConnsPerHost: -1,
DisableKeepAlives: true,
},
Timeout: RequestTimeout,
}
...
# crawler func
req, err := http.NewRequestWithContext(this.ctx, "GET", opts.Url, nil)
if err != nil {
return nil, errors.Wrap(err, "failed to create request")
}
res, err := this.client.Do(req)
if err != nil {
return nil, err
}
defer res.Body.Close()
...
I ran netstat -anp | wc -l and can see more than 2000 connections in TIME_WAIT.
Each HTTP/1.x connection opened by the http.Client transport is served by two goroutines: one reads from the connection and the other writes to it. So with connections to thousands of domains, there can be thousands of goroutines here.
Because DisableKeepAlives is set to true, each connection is closed as soon as the HTTP response is done. TIME_WAIT is the normal TCP state on the side that closes a connection first.
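To illustrate the effect of DisableKeepAlives on connection churn, here is a rough sketch that counts how many TCP connections the transport dials for a few sequential requests, with and without keep-alives. It targets a local httptest server so it can run offline; the names countDials and dialDemo are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// countDials sends n sequential GET requests and returns how many TCP
// connections the transport had to dial to serve them.
func countDials(url string, n int, disableKeepAlives bool) (int64, error) {
	var dials int64
	dialer := &net.Dialer{}
	tr := &http.Transport{
		DisableKeepAlives: disableKeepAlives,
		DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
			atomic.AddInt64(&dials, 1)
			return dialer.DialContext(ctx, network, addr)
		},
	}
	defer tr.CloseIdleConnections()
	client := &http.Client{Transport: tr}
	for i := 0; i < n; i++ {
		resp, err := client.Get(url)
		if err != nil {
			return 0, err
		}
		// Drain and close the body so the connection may be pooled.
		_, err = io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		if err != nil {
			return 0, err
		}
	}
	return atomic.LoadInt64(&dials), nil
}

// dialDemo compares both settings against the same local server.
func dialDemo(n int) (withKeepAlive, withoutKeepAlive int64, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()
	if withKeepAlive, err = countDials(srv.URL, n, false); err != nil {
		return 0, 0, err
	}
	withoutKeepAlive, err = countDials(srv.URL, n, true)
	return withKeepAlive, withoutKeepAlive, err
}

func main() {
	ka, noKa, err := dialDemo(3)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("dials with keep-alives:   ", ka)
	fmt.Println("dials without keep-alives:", noKa)
}
```

A crawler that visits each domain only once gains little from keep-alives, but the sketch shows why every request ends in a closed socket when they are disabled.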
However, the default duration of the TIME_WAIT state on Linux is 60 seconds. A huge number of sockets in TIME_WAIT can cause connection issues on the host running the probe or crawler.
The SO_LINGER option can help with the TIME_WAIT issue. Enabling it with a linger time of zero disables the default TCP delayed-close behavior: closing the connection sends an RST to the peer instead of going through the normal shutdown sequence, which removes the TIME_WAIT state of the TCP connection. Any data that has not been sent yet is discarded.
More discussion can be found in: When is TCP option SO_LINGER (0) required?
Sample
dialer := &net.Dialer{
Control: func(network, address string, conn syscall.RawConn) error {
var opterr error
if err := conn.Control(func(fd uintptr) {
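// Onoff: 1 with Linger: 0 makes Close send an RST and skip TIME_WAIT.
// A zero-value Linger (Onoff: 0) would leave the default behavior in place.
l := &syscall.Linger{Onoff: 1, Linger: 0}
opterr = syscall.SetsockoptLinger(int(fd), syscall.SOL_SOCKET, syscall.SO_LINGER, l)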
}); err != nil {
return err
}
return opterr
},
}
client := &http.Client{
Transport: &http.Transport{
DialContext: dialer.DialContext,
},
}
Moreover, here is another SO_LINGER use case in EaseProbe. It is a simple, standalone, and lightweight tool that can do health/status checking.
What I am trying to achieve:
I want to receive an HTTP request from a client, then hijack the connection to monitor it on the server side (checking the health of the connection). I also want to send an HTTP response on that hijacked connection. In that order: receive the HTTP request, get the request body, hijack the connection, return a response to the client, monitor connection health on the server side.
What I've already achieved:
Here is the code of http request handler:
func (c *HandlerCtx) HijackHandler(w http.ResponseWriter, req *http.Request){
// Fetching request body skipped
// Hijacking the connection
h, _ := w.(http.Hijacker)
conn, br, err := h.Hijack()
if err != nil {
fmt.Println(err.Error())
return
}
responseBody := "Http response from hijacked connection"
hr := http.Response{
Status: "200 OK",
StatusCode: 200,
Proto: "HTTP/1.1",
ProtoMajor: 1,
ProtoMinor: 1,
Header: make(http.Header, 0),
Body: ioutil.NopCloser(bytes.NewBufferString(responseBody)),
ContentLength: int64(len(responseBody)),
TransferEncoding: nil,
Close: false,
Uncompressed: false,
Trailer: nil,
Request: req,
TLS: nil,
}
// Writing response
err = hr.Write(br)
if err != nil{
fmt.Println(err.Error())
return
}
// Sending EOF to allow io.ReadAll(resp.Body) without blocking
if v, ok := conn.(interface{ CloseWrite() error }); ok {
err = v.CloseWrite()
if err != nil {
fmt.Println(err.Error())
}
}
// Monitor connection health
}
This is the client code:
func main(){
// Body skipped for testing purposes
resp, err := http.Post("http://127.0.0.1:8085/hello", "application/json", nil)
if err != nil{
fmt.Println(err.Error())
return
}
b, err := io.ReadAll(resp.Body)
if err != nil{
fmt.Println(err.Error())
return
}
}
Now I receive an unexpected EOF on the client after calling err = v.CloseWrite(), but when I don't call CloseWrite the client code gets stuck on io.ReadAll(resp.Body).
Is there any way to force the client to read that HTTP response? Please help me find a solution.
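One likely cause, given the handler above: Hijack returns a *bufio.ReadWriter, so hr.Write(br) only fills a buffer and nothing reaches the client until Flush is called. CloseWrite then ends the stream before any bytes were sent, which would explain the unexpected EOF. The sketch below is a reduced variant of the handler with the flush added. It sets Close: true and closes the connection at the end, which is an assumption made to keep the example short (the monitoring step is left out), and hijackDemo is a hypothetical helper that uses a local httptest server.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// hijackHandler takes over the connection and writes a complete HTTP
// response through the buffered writer returned by Hijack.
func hijackHandler(w http.ResponseWriter, req *http.Request) {
	h, ok := w.(http.Hijacker)
	if !ok {
		http.Error(w, "hijacking not supported", http.StatusInternalServerError)
		return
	}
	conn, brw, err := h.Hijack()
	if err != nil {
		fmt.Println(err.Error())
		return
	}
	defer conn.Close() // a monitoring loop would go here instead

	body := "Http response from hijacked connection"
	resp := http.Response{
		StatusCode:    http.StatusOK,
		ProtoMajor:    1,
		ProtoMinor:    1,
		Header:        make(http.Header),
		Body:          io.NopCloser(bytes.NewBufferString(body)),
		ContentLength: int64(len(body)),
		Close:         true, // tell the client not to reuse this connection
		Request:       req,
	}
	if err := resp.Write(brw); err != nil {
		fmt.Println(err.Error())
		return
	}
	// Without this Flush the response stays in the bufio buffer.
	if err := brw.Flush(); err != nil {
		fmt.Println(err.Error())
	}
}

// hijackDemo posts to a local test server and returns the response body.
func hijackDemo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(hijackHandler))
	defer srv.Close()
	resp, err := srv.Client().Post(srv.URL+"/hello", "application/json", nil)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	body, err := hijackDemo()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(body)
}
```

Because Content-Length is set, io.ReadAll on the client returns once the declared number of bytes has arrived, so CloseWrite should not be needed to unblock it.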
So my HTTP client initialisation and send request code looks like this.
package http_util
import (
"context"
"crypto/tls"
"encoding/json"
"io/ioutil"
"net/http"
"time"
)
var httpClient *http.Client
func Init() {
tr := &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
MaxIdleConnsPerHost: 200,
IdleConnTimeout: 90 * time.Second,
TLSHandshakeTimeout: 10 * time.Second,
}
httpClient = &http.Client{Transport: tr, Timeout: 30 * time.Second}
}
func SendRequest(ctx context.Context, request *http.Request) (*SomeRespStruct, error) {
httpResponse, err := httpClient.Do(request)
if err != nil {
return nil, err
}
// Close the body on every return path so the connection can be reused.
defer httpResponse.Body.Close()
responseBody, err := ioutil.ReadAll(httpResponse.Body)
if err != nil {
return nil, err
}
response := &SomeRespStruct{}
err = json.Unmarshal(responseBody, response)
if err != nil {
return nil, err
}
return response, nil
}
When I launch my server, I call http_util.Init().
The issue arises when I receive multiple requests (20+) at once to call this external server. In one of my functions I do
package external_api
import (
"context"
"log"
)
func SomeAPICall(ctx context.Context) (*http_util.SomeRespStruct, error) {
// Build request
request := buildHTTPRequest(...)
log.Printf("Send request: %v", request)
response, err := http_util.SendRequest(ctx, request)
// Error checks
if err != nil {
log.Printf("HTTP request timed out: %v", err)
return nil, err
}
log.Printf("Received response: %v", response)
return response, nil
}
My issue is that I get a 15~20s lag in between the Send request and Received response logs based on the output timestamp when there is high request volume. Upon checking with the server that's handling my requests, I found out that on their end, processing time from end-to-end takes less than a second (the same exact request that had a long turnaround time according to my own logs), so I'm not too sure what is the root cause of this high turnaround time. I also did a traceroute and a ping to the server as well and there was no delay, so this should not be a network error.
I've looked around and it seems like the suggested solutions are:
to increase the MaxIdleConnsPerHost
to read the HTTP response body in full and close it
Both of which I have already done.
I'm not sure if there is more tuning to be done regarding the configuration of my HTTP client to resolve this issue, or if I should investigate other workarounds, for instance retry or perhaps scaling (but my CPU and memory utilisation are at the 2-3% range).
I am building a web server that must accept HTTP requests from a client, but must also accept requests over a raw TCP socket from peers. Since HTTP runs over TCP, I am trying to route the HTTP requests by the TCP server rather than running two separate services.
Is there an easy way to read in the data with net.Conn.Read(), determine if it is an HTTP GET/POST request and pass it off to the built in HTTP handler or Gorilla mux? Right now my code looks like this and I am building the http routing logic myself:
func ListenConn() {
listen, _ := net.Listen("tcp", ":8080")
defer listen.Close()
for {
conn, err := listen.Accept()
if err != nil {
logger.Println("listener.go", "ListenConn", err.Error())
continue // do not hand a nil conn to HandleConn
}
go HandleConn(conn)
}
}
func HandleConn(conn net.Conn) {
defer conn.Close()
// determines if it is an http request
scanner := bufio.NewScanner(conn)
for scanner.Scan() {
ln := scanner.Bytes()
fmt.Println(ln)
if strings.Fields(string(ln))[0] == "GET" {
http.GetRegistrationCode(conn, strings.Fields(string(ln))[1])
return
}
... raw tcp handler code
}
}
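For reference, the sniffing step the question describes can be sketched with bufio.Reader.Peek, which inspects the first bytes without consuming them, followed by http.ReadRequest to parse the request with the standard library instead of by hand. This is only an illustration of the detection step under simple assumptions (a four-byte method check, one message per connection); it does not hand the connection to http.Server or a mux. The names looksLikeHTTP, classify and classifyBytes are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"net/http"
	"strings"
)

// looksLikeHTTP peeks at the first four bytes without consuming them.
// It is a heuristic: a raw TCP payload starting with "POST" would also match.
func looksLikeHTTP(br *bufio.Reader) bool {
	head, _ := br.Peek(4) // may be shorter than 4 bytes if the peer closed early
	switch string(head) {
	case "GET ", "POST", "PUT ", "HEAD":
		return true
	}
	return false
}

// classify reads one message from conn and describes how it would be routed.
func classify(conn net.Conn) (string, error) {
	br := bufio.NewReader(conn)
	if looksLikeHTTP(br) {
		req, err := http.ReadRequest(br)
		if err != nil {
			return "", err
		}
		return "http " + req.Method + " " + req.URL.Path, nil
	}
	line, err := br.ReadString('\n')
	if err != nil && line == "" {
		return "", err
	}
	return "tcp " + strings.TrimSpace(line), nil
}

// classifyBytes feeds payload through an in-memory connection pair.
func classifyBytes(payload string) (string, error) {
	client, server := net.Pipe()
	defer server.Close()
	go func() {
		defer client.Close()
		client.Write([]byte(payload))
	}()
	return classify(server)
}

func main() {
	for _, p := range []string{
		"GET /code HTTP/1.1\r\nHost: localhost\r\n\r\n",
		"test",
	} {
		got, err := classifyBytes(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(got)
	}
}
```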
It is not a good idea to mix HTTP and raw TCP traffic on the same port.
Think about all the firewalls and routers between your application and its clients. They are all designed to enable safe HTTP(S) delivery. What will they do with your raw TCP traffic arriving on the same port as valid HTTP?
As a solution, you can split your traffic across two different ports in the same application.
With port separation you can route your HTTP and TCP traffic independently and configure appropriate network security for each channel.
Sample code that listens on two different ports:
package main
import (
"fmt"
"net"
"net/http"
"os"
)
type httpHandler struct {
}
func (m *httpHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
fmt.Println("HTTP request")
}
func main() {
// http
go func() {
http.ListenAndServe(":8001", &httpHandler{})
}()
// tcp
l, err := net.Listen("tcp", "localhost:8002")
if err != nil {
fmt.Println("Error listening:", err.Error())
os.Exit(1)
}
defer l.Close()
for {
conn, err := l.Accept()
if err != nil {
fmt.Println("Error accepting: ", err.Error())
os.Exit(1)
}
go handleRequest(conn)
}
}
// Handles incoming requests.
func handleRequest(conn net.Conn) {
// read/write from connection
fmt.Println("TCP connection")
conn.Close()
}
To test the listeners, open http://localhost:8001/ in a browser and run echo -n "test" | nc localhost 8002 on the command line.
The http.Request struct includes the remote IP and port of the request's sender:
// RemoteAddr allows HTTP servers and other software to record
// the network address that sent the request, usually for
// logging. This field is not filled in by ReadRequest and
// has no defined format. The HTTP server in this package
// sets RemoteAddr to an "IP:port" address before invoking a
// handler.
// This field is ignored by the HTTP client.
RemoteAddr string
The http.Response object has no such field.
I would like to know the IP address that responded to the request I sent, even when I sent it to a DNS name.
I thought that net.LookupHost() might be helpful, but 1) it can return multiple IPs for a single host name, and 2) it ignores the hosts file unless cgo is available, which it is not in my case.
Is it possible to retrieve the remote IP address for an http.Response?
Use the net/http/httptrace package and its GotConn hook: the GotConnInfo argument carries the net.Conn, and Conn.RemoteAddr() returns its remote address.
This will give you the address the Transport actually dialled, as opposed to what was resolved in DNSDoneInfo:
package main
import (
"log"
"net/http"
"net/http/httptrace"
)
func main() {
req, err := http.NewRequest("GET", "https://example.com/", nil)
if err != nil {
log.Fatal(err)
}
trace := &httptrace.ClientTrace{
GotConn: func(connInfo httptrace.GotConnInfo) {
log.Printf("resolved to: %s", connInfo.Conn.RemoteAddr())
},
}
req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
client := &http.Client{}
resp, err := client.Do(req)
if err != nil {
log.Fatal(err)
}
defer resp.Body.Close()
}
Outputs:
~ go run ip.go
2017/02/18 19:38:11 resolved to: 104.16.xx.xxx:443
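The same hook can be wrapped in a small helper that returns the address instead of logging it. The sketch below checks the result against a local httptest server so that it runs without network access; remoteAddrOf and remoteAddrDemo are hypothetical names.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// remoteAddrOf performs a GET and returns the remote address of the
// connection the transport actually used, new or reused.
func remoteAddrOf(client *http.Client, url string) (string, error) {
	var addr string
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			addr = info.Conn.RemoteAddr().String()
		},
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return "", err
	}
	return addr, nil
}

// remoteAddrDemo compares the traced address with the test server's listener.
func remoteAddrDemo() (got, want string, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()
	got, err = remoteAddrOf(srv.Client(), srv.URL)
	return got, srv.Listener.Addr().String(), err
}

func main() {
	got, want, err := remoteAddrDemo()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("traced remote address:", got)
	fmt.Println("server listen address:", want)
}
```

Unlike a dial hook, GotConn also fires when a pooled connection is reused, so the helper reports an address for every request.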
Another solution I came up with was to hook the DialContext function in the HTTP client's transport. This lets you modify the http.Client instead of the request, which may be useful.
We first create a function that returns a hooked dial context:
func remoteAddressDialHook(remoteAddressPtr *net.Addr) func(ctx context.Context, network string, address string) (net.Conn, error) {
hookedDialContext := func(ctx context.Context, network, address string) (net.Conn, error) {
originalDialer := &net.Dialer{
Timeout: 30 * time.Second,
KeepAlive: 30 * time.Second,
}
conn, err := originalDialer.DialContext(ctx, network, address)
if err != nil {
return nil, err
}
// conn was successfully created
*remoteAddressPtr = conn.RemoteAddr()
return conn, err
}
return hookedDialContext
}
We can then use this function to create a DialContext that writes to an out parameter:
var remoteAddr net.Addr
customTransport := &http.Transport{
Proxy: http.ProxyFromEnvironment,
DialContext: remoteAddressDialHook(&remoteAddr),
ForceAttemptHTTP2: true,
MaxIdleConns: 100,
IdleConnTimeout: 90 * time.Second,
TLSHandshakeTimeout: 10 * time.Second,
ExpectContinueTimeout: 1 * time.Second,
}
customHttpClient := http.Client{
Transport: customTransport,
}
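// Do what you normally would with the HTTP client; remoteAddr is set whenever the transport dials a new connection.
// It is not updated when a pooled connection is reused, and it is not safe to share across concurrent requests.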
fmt.Println(remoteAddr.String())