Retry http request RoundTrip - http

I made a server and a client that hits it over HTTP. I set up a retry mechanism in the client inside its Transport's RoundTrip method. Here is working example code for the server and the client:
server main.go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func test(w http.ResponseWriter, req *http.Request) {
	time.Sleep(2 * time.Second)
	fmt.Fprintf(w, "hello\n")
}

func main() {
	http.HandleFunc("/test", test)
	http.ListenAndServe(":8090", nil)
}
client main.go
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"time"
)

type Retry struct {
	nums      int
	transport http.RoundTripper
}

// RoundTrip retries the wrapped transport up to nums times.
func (r *Retry) RoundTrip(req *http.Request) (resp *http.Response, err error) {
	for i := 0; i < r.nums; i++ {
		log.Println("Attempt: ", i+1)
		resp, err = r.transport.RoundTrip(req)
		if resp != nil && err == nil {
			return
		}
		log.Println("Retrying...")
	}
	return
}

func main() {
	r := &Retry{
		nums:      5,
		transport: http.DefaultTransport,
	}
	c := &http.Client{Transport: r}

	// each request should time out after 1 second
	ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://localhost:8090/test", nil)
	if err != nil {
		panic(err)
	}
	resp, err := c.Do(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.StatusCode)
}
What's happening is that the retry only seems to work on the first iteration. The subsequent iterations don't wait one second each; instead, the debugging message is printed immediately, as many times as the retry count.
I expected each retry attempt to wait 1 second, since I set the timeout in the context to 1 second, but it seems to wait only 1 second for all the retries combined. What am I missing?
Besides that, how do I stop the server from processing a timed-out request? I saw that CloseNotifier is already deprecated.

The problem is with the context. Once the context is done, you cannot reuse it anymore. You have to re-create the context on every attempt: take the timeout from the parent context and use it to create a new context for each attempt.
func (r *retry) RoundTrip(req *http.Request) (resp *http.Response, err error) {
	var (
		duration time.Duration
		ctx      context.Context
		cancel   func()
	)
	if deadline, ok := req.Context().Deadline(); ok {
		duration = time.Until(deadline)
	}
	for i := 0; i < r.nums; i++ {
		if duration > 0 {
			ctx, cancel = context.WithTimeout(context.Background(), duration)
			req = req.WithContext(ctx)
		}
		resp, err = r.rt.RoundTrip(req)
		...
		// the rest of code
		...
	}
	return
}
This code creates a fresh context on every attempt, using the timeout taken from the parent context.
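For completeness, here is a minimal self-contained sketch of the whole retrying transport, using the field names from the question (nums, transport) and only imports the client main.go already has. Exactly where cancel gets called is my own choice, and the sketch assumes a request without a body (a body can only be consumed once):

func (r *Retry) RoundTrip(req *http.Request) (resp *http.Response, err error) {
	// Capture the per-attempt timeout from the original request's deadline.
	var perAttempt time.Duration
	if deadline, ok := req.Context().Deadline(); ok {
		perAttempt = time.Until(deadline)
	}
	for i := 0; i < r.nums; i++ {
		log.Println("Attempt:", i+1)
		attempt := req
		cancel := context.CancelFunc(func() {})
		if perAttempt > 0 {
			// Fresh context for this attempt only.
			var ctx context.Context
			ctx, cancel = context.WithTimeout(context.Background(), perAttempt)
			attempt = req.WithContext(ctx)
		}
		resp, err = r.transport.RoundTrip(attempt)
		if err == nil {
			// Don't cancel on success, or the caller could not read the
			// response body; the context is released when its timer fires.
			return resp, nil
		}
		cancel()
		log.Println("Retrying...")
	}
	return
}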

For the server, you can use Request.Context() to check whether the request has been cancelled.
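A minimal sketch of what that can look like in the /test handler from the question (the select on time.After stands in for the 2-second workload; add log to the server's imports):

func test(w http.ResponseWriter, req *http.Request) {
	select {
	case <-time.After(2 * time.Second):
		// The simulated work finished before the client gave up.
		fmt.Fprintf(w, "hello\n")
	case <-req.Context().Done():
		// The client cancelled or timed out; stop processing.
		log.Println("request cancelled:", req.Context().Err())
	}
}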
In the client, the request times out when the context times out after 1 second, so the context does not drive the per-attempt round-trip time. If you want the request to be retried before the context is done, you have to change the behaviour of the transport you are using. You are currently using http.DefaultTransport, which is defined as follows:
var DefaultTransport RoundTripper = &Transport{
	Proxy: ProxyFromEnvironment,
	DialContext: (&net.Dialer{
		Timeout:   30 * time.Second,
		KeepAlive: 30 * time.Second,
		DualStack: true,
	}).DialContext,
	ForceAttemptHTTP2:     true,
	MaxIdleConns:          100,
	IdleConnTimeout:       90 * time.Second,
	TLSHandshakeTimeout:   10 * time.Second,
	ExpectContinueTimeout: 1 * time.Second,
}
Transport has more timeout fields than the ones set here, so depending on when you want to retry you should set the appropriate one. In your case, for example, you could set Transport.ResponseHeaderTimeout to 1 second: when the server does not reply with response headers within 1 second, the attempt fails and the client retries. If you then make your context time out after 5 (or better, 6) seconds, you should see the client retry the number of times you specified (5).
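A sketch of that setup, dropping a 1-second ResponseHeaderTimeout into the Retry transport from the question's client main.go and giving the whole operation about 6 seconds so all 5 attempts fit (the exact values are illustrative):

func main() {
	r := &Retry{
		nums: 5,
		transport: &http.Transport{
			Proxy:                 http.ProxyFromEnvironment,
			ForceAttemptHTTP2:     true,
			MaxIdleConns:          100,
			IdleConnTimeout:       90 * time.Second,
			TLSHandshakeTimeout:   10 * time.Second,
			ExpectContinueTimeout: 1 * time.Second,
			// Each attempt fails if no response headers arrive within 1 second.
			ResponseHeaderTimeout: 1 * time.Second,
		},
	}
	c := &http.Client{Transport: r}

	// This is the budget for all retries combined, not per attempt.
	ctx, cancel := context.WithTimeout(context.Background(), 6*time.Second)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://localhost:8090/test", nil)
	if err != nil {
		panic(err)
	}
	resp, err := c.Do(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.StatusCode)
}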

Related

Diagnosing root cause long HTTP response turnaround in Golang

So my HTTP client initialisation and send request code looks like this.
package http_util

import (
	"context"
	"crypto/tls"
	"encoding/json"
	"io/ioutil"
	"net/http"
	"time"
)

var httpClient *http.Client

func Init() {
	tr := &http.Transport{
		TLSClientConfig:     &tls.Config{InsecureSkipVerify: true},
		MaxIdleConnsPerHost: 200,
		IdleConnTimeout:     90 * time.Second,
		TLSHandshakeTimeout: 10 * time.Second,
	}
	httpClient = &http.Client{Transport: tr, Timeout: 30 * time.Second}
}

func SendRequest(ctx context.Context, request *http.Request) (*SomeRespStruct, error) {
	httpResponse, err := httpClient.Do(request)
	if err != nil {
		return nil, err
	}
	defer httpResponse.Body.Close()
	responseBody, err := ioutil.ReadAll(httpResponse.Body)
	if err != nil {
		return nil, err
	}
	response := &SomeRespStruct{}
	err = json.Unmarshal(responseBody, response)
	if err != nil {
		return nil, err
	}
	return response, nil
}
When I launch my server, I call http_util.Init().
The issue arises when I receive multiple requests (20+) at once that each need to call this external server. In one of my functions I do:
package external_api

import (
	"context"
	"log"
)

func SomeAPICall(ctx context.Context) (*SomeRespStruct, error) {
	// Build request
	request := buildHTTPRequest(...)
	log.Printf("Send request: %v", request)
	response, err := http_util.SendRequest(ctx, request)
	// Error checks
	if err != nil {
		log.Printf("HTTP request timed out: %v", err)
		return nil, err
	}
	log.Printf("Received response: %v", response)
	return response, nil
}
My issue is that, under high request volume, there is a 15-20s lag between the "Send request" and "Received response" logs, judging by the log timestamps. Upon checking with the server that handles my requests, I found that on their end the end-to-end processing time is under a second for the very same request that had a long turnaround in my own logs, so I'm not sure what the root cause of the high turnaround time is. I also ran traceroute and ping against the server and saw no delay, so this should not be a network error.
I've looked around and it seems like the suggested solutions are:
to increase the MaxIdleConnsPerHost
to read the HTTP response body in full and close it
Both of which I have already done.
I'm not sure whether there is more tuning to be done on my HTTP client configuration to resolve this issue, or whether I should investigate other workarounds, for instance retries or scaling (but my CPU and memory utilisation are in the 2-3% range).
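Not part of the original post, but one way to see where the time actually goes is to attach a net/http/httptrace.ClientTrace to each request before calling Do. The hypothetical traceRequest helper below logs whether the wait is spent getting a connection from the pool, writing the request, or waiting for the first response byte:

func traceRequest(request *http.Request) *http.Request {
	start := time.Now()
	trace := &httptrace.ClientTrace{
		GetConn: func(hostPort string) {
			log.Printf("%v getting conn to %s", time.Since(start), hostPort)
		},
		GotConn: func(info httptrace.GotConnInfo) {
			log.Printf("%v got conn (reused=%v, wasIdle=%v)", time.Since(start), info.Reused, info.WasIdle)
		},
		WroteRequest: func(httptrace.WroteRequestInfo) {
			log.Printf("%v wrote request", time.Since(start))
		},
		GotFirstResponseByte: func() {
			log.Printf("%v first response byte", time.Since(start))
		},
	}
	return request.WithContext(httptrace.WithClientTrace(request.Context(), trace))
}

Calling httpClient.Do(traceRequest(request)) inside SendRequest would then show whether the 15-20s gap is spent queueing for a connection (pointing at pool or server limits) or on the wire.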

HTTP client returns random errors on timeout

I have an HTTP client with a custom RoundTripper which in turn uses the http.DefaultTransport to handle the request.
Now imagine I have a slow server that takes a long time to respond, which makes my HTTP client time out and cancel the request. Here is the code for the client:
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"time"
)

type rt struct {
	roundTripper func(req *http.Request) (*http.Response, error)
}

func (r rt) RoundTrip(req *http.Request) (*http.Response, error) {
	return r.roundTripper(req)
}

func main() {
	c := http.Client{
		Timeout:   3 * time.Second,
		Transport: rt{RoundTripper(http.DefaultTransport)},
	}
	resp, err := c.Get("http://127.0.0.1:9000")
	if err != nil {
		fmt.Println("err:", err)
	} else {
		body, err := ioutil.ReadAll(resp.Body)
		resp.Body.Close()
		fmt.Println(string(body), err)
	}
}

func RoundTripper(next http.RoundTripper) func(req *http.Request) (*http.Response, error) {
	return func(req *http.Request) (*http.Response, error) {
		resp, err := next.RoundTrip(req)
		if err != nil {
			return nil, fmt.Errorf("err: %w", err)
		}
		return resp, nil
	}
}
The problem here is that the error I'm receiving on timeout is randomly one of net/http: request canceled or context deadline exceeded.
Now, I know they should be semantically the same thing, but I'm failing to understand why each of them is returned, and when.
Here is the server code if you want to try it for yourself.
The function setRequestCancel() in net/http/client.go is used to set up cancellation of the request. There are three ways a request can be cancelled there.
The second returns: net/http: request canceled
The third returns: context deadline exceeded
Because both use the same deadline, time.Now() plus client.Timeout, which of the two paths fires first depends on the runtime scheduler, so the request ends up being cancelled randomly through one of these two methods.
https://github.com/golang/go/blob/master/src/net/http/transport.go#L2652
case <-cancelChan:
	// returns err: net/http: request canceled
	pc.t.CancelRequest(req.Request)
	cancelChan = nil
case <-ctxDoneChan:
	// returns err: context deadline exceeded
	pc.t.cancelRequest(req.Request, req.Context().Err())
	cancelChan = nil
	ctxDoneChan = nil
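If the randomness matters, one option (my suggestion, not part of the answer above) is to drop Client.Timeout and use only a request context deadline; then the only cancellation path left is the context one, so the error consistently reports the context deadline. A sketch reusing the rt type and RoundTripper helper from the question (it also needs "context" in the imports):

func main() {
	c := http.Client{
		Transport: rt{RoundTripper(http.DefaultTransport)},
	}
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://127.0.0.1:9000", nil)
	if err != nil {
		fmt.Println("err:", err)
		return
	}
	resp, err := c.Do(req)
	if err != nil {
		// With no Client.Timeout set, this is always the context-deadline path.
		fmt.Println("err:", err)
		return
	}
	body, err := ioutil.ReadAll(resp.Body)
	resp.Body.Close()
	fmt.Println(string(body), err)
}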

HTTP server HandleFunc loop on timeout?

I am working on a Go app that has a web server. I was trying to add timeouts and encountered an issue. Here's some sample code I made to reproduce it, because posting the actual code would be impossible:
package main

import (
	"fmt"
	"html/template"
	"net/http"
	"time"
)

var layout *template.Template

func main() {
	router := http.NewServeMux()
	server := &http.Server{
		Addr:         ":8888",
		Handler:      router,
		ReadTimeout:  5 * time.Second,
		WriteTimeout: 1 * time.Second,
		IdleTimeout:  15 * time.Second,
	}
	router.HandleFunc("/", home)
	var err error
	layout, err = template.ParseFiles("./layout.html")
	if err != nil {
		fmt.Printf("Error1: %+v\n", err)
	}
	server.ListenAndServe()
}

func home(w http.ResponseWriter, r *http.Request) {
	fmt.Println("responding")
	err := layout.Execute(w, template.HTML(`World`))
	if err != nil {
		fmt.Printf("Error2: %+v\n", err)
	}
	time.Sleep(5 * time.Second)
}
layout.html: Hello {{.}}!
When I run it and visit 127.0.0.1:8888, the browser keeps loading, and home(), which is what triggers the timeout, starts over again; it does this 10 times before it stops and the browser shows a connection reset error.
I was expecting that after a timeout the func would end immediately, the connection would be closed, and the browser would stop loading and show an error.
How can I achieve this?
To respond immediately, use a goroutine together with a context timeout:
package main

import (
	"context"
	"fmt"
	"html/template"
	"net/http"
	"time"
)

var layout *template.Template
var WriteTimeout = 1 * time.Second

func main() {
	router := http.NewServeMux()
	server := &http.Server{
		Addr:         ":8889",
		Handler:      router,
		ReadTimeout:  5 * time.Second,
		WriteTimeout: WriteTimeout + 10*time.Millisecond, // 10ms of redundant time
		IdleTimeout:  15 * time.Second,
	}
	router.HandleFunc("/", home)
	server.ListenAndServe()
}

func home(w http.ResponseWriter, r *http.Request) {
	fmt.Printf("responding\n")
	ctx, cancelCtx := context.WithTimeout(context.Background(), WriteTimeout)
	defer cancelCtx()
	worker, cancel := context.WithCancel(context.Background())

	var buffer string
	go func() {
		// do something
		time.Sleep(2 * time.Second)
		buffer = "ready all response\n"
		// do another thing
		time.Sleep(2 * time.Second)
		cancel()
		fmt.Printf("worker finished\n")
	}()

	select {
	case <-ctx.Done():
		// add a more friendly message here
		w.WriteHeader(http.StatusInternalServerError)
		return
	case <-worker.Done():
		w.Write([]byte(buffer))
		fmt.Printf("wrote response\n")
		return
	}
}

Go client program generates a lot a sockets in TIME_WAIT state

I have a Go program that generates a lot of HTTP requests from multiple goroutines. After running for a while, the program spits out an error: connect: cannot assign requested address.
When checking with netstat, I get a high number (28229) of connections in TIME_WAIT.
The high number of TIME_WAIT sockets happens when the number of goroutines is 3, and it is severe enough to cause a crash when it is 5.
I run Ubuntu 14.04 under Docker and Go version 1.7.
This is the Go program.
package main

import (
	"io/ioutil"
	"log"
	"net/http"
	"sync"
)

var wg sync.WaitGroup
var url = "http://172.17.0.9:3000/"

const num_coroutines = 5
const num_request_per_coroutine = 100000

func get_page() {
	response, err := http.Get(url)
	if err != nil {
		log.Fatal(err)
	} else {
		defer response.Body.Close()
		_, err = ioutil.ReadAll(response.Body)
		if err != nil {
			log.Fatal(err)
		}
	}
}

func get_pages() {
	defer wg.Done()
	for i := 0; i < num_request_per_coroutine; i++ {
		get_page()
	}
}

func main() {
	for i := 0; i < num_coroutines; i++ {
		wg.Add(1)
		go get_pages()
	}
	wg.Wait()
}
This is the server program:
package main

import (
	"fmt"
	"log"
	"net/http"
)

var count int

func sayhelloName(w http.ResponseWriter, r *http.Request) {
	count++
	fmt.Fprintf(w, "Hello World, count is %d", count) // send data to client side
}

func main() {
	http.HandleFunc("/", sayhelloName)       // set router
	err := http.ListenAndServe(":3000", nil) // set listen port
	if err != nil {
		log.Fatal("ListenAndServe: ", err)
	}
}
The default http.Transport is opening and closing connections too quickly. Since all connections are to the same host:port combination, you need to increase MaxIdleConnsPerHost to match your value for num_coroutines. Otherwise, the transport will frequently close the extra connections, only to have them reopened immediately.
You can set this globally on the default transport:
http.DefaultTransport.(*http.Transport).MaxIdleConnsPerHost = numCoroutines
Or when creating your own transport
t := &http.Transport{
	Proxy: http.ProxyFromEnvironment,
	DialContext: (&net.Dialer{
		Timeout:   30 * time.Second,
		KeepAlive: 30 * time.Second,
	}).DialContext,
	MaxIdleConnsPerHost:   numCoroutines,
	MaxIdleConns:          100,
	IdleConnTimeout:       90 * time.Second,
	TLSHandshakeTimeout:   10 * time.Second,
	ExpectContinueTimeout: 1 * time.Second,
}
Similar question: Go http.Get, concurrency, and "Connection reset by peer"

golang http timeout and goroutines accumulation

I use goroutines to implement a timeout for http.Get, and then I found that the number of goroutines rises steadily; when it reaches 1000 or so, the program exits.
Code:
package main

import (
	"errors"
	"io/ioutil"
	"log"
	"net"
	"net/http"
	"runtime"
	"time"
)

// timeout dialler
func timeoutDialler(timeout time.Duration) func(network, addr string) (net.Conn, error) {
	return func(network, addr string) (net.Conn, error) {
		return net.DialTimeout(network, addr, timeout)
	}
}

func timeoutHttpGet(url string) ([]byte, error) {
	// change dialler add timeout support && disable keep-alive
	tr := &http.Transport{
		Dial:              timeoutDialler(3 * time.Second),
		DisableKeepAlives: true,
	}
	client := &http.Client{Transport: tr}

	type Response struct {
		resp []byte
		err  error
	}
	ch := make(chan Response, 0)
	defer func() {
		close(ch)
		ch = nil
	}()

	go func() {
		resp, err := client.Get(url)
		if err != nil {
			ch <- Response{[]byte{}, err}
			return
		}
		defer resp.Body.Close()

		body, err := ioutil.ReadAll(resp.Body)
		if err != nil {
			ch <- Response{[]byte{}, err}
			return
		}
		tr.CloseIdleConnections()
		ch <- Response{body, err}
	}()

	select {
	case <-time.After(5 * time.Second):
		return []byte{}, errors.New("timeout")
	case response := <-ch:
		return response.resp, response.err
	}
}

func handler(w http.ResponseWriter, r *http.Request) {
	_, err := timeoutHttpGet("http://google.com")
	if err != nil {
		log.Println(err)
		return
	}
}

func main() {
	go func() {
		for {
			log.Println(runtime.NumGoroutine())
			time.Sleep(500 * time.Millisecond)
		}
	}()

	s := &http.Server{
		Addr:         ":8888",
		ReadTimeout:  15 * time.Second,
		WriteTimeout: 15 * time.Second,
	}
	http.HandleFunc("/", handler)
	log.Fatal(s.ListenAndServe())
}
http://play.golang.org/p/SzGTMMmZkI
Init your chan with 1 instead of 0:
ch := make(chan Response, 1)
And remove the defer block that closes and nils ch.
See: http://blog.golang.org/go-concurrency-patterns-timing-out-and
Here is what I think is happening:
after the 5s timeout, timeoutHttpGet returns
the defer statement runs, closing ch and then setting it to nil
the goroutine it started to do the actual fetch finishes and attempts to send its result on ch
but ch is now nil, so the send blocks forever, which prevents that statement from completing and thus prevents the goroutine from ever finishing
I assume you are setting ch = nil because, before you had that, you would get run-time panics, since that is what happens when you send on a closed channel, as described by the spec.
Giving ch a buffer of 1 means the fetching goroutine can send its result without needing a receiver. If the handler has already returned due to the timeout, everything simply gets garbage collected later on.
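A sketch of timeoutHttpGet with those two changes applied (lightly tidied otherwise):

func timeoutHttpGet(url string) ([]byte, error) {
	tr := &http.Transport{
		Dial:              timeoutDialler(3 * time.Second),
		DisableKeepAlives: true,
	}
	client := &http.Client{Transport: tr}

	type Response struct {
		resp []byte
		err  error
	}
	// Buffer of 1: the fetching goroutine can always deliver its result
	// and exit, even if nobody is left to receive it.
	ch := make(chan Response, 1)

	go func() {
		resp, err := client.Get(url)
		if err != nil {
			ch <- Response{nil, err}
			return
		}
		defer resp.Body.Close()
		body, err := ioutil.ReadAll(resp.Body)
		ch <- Response{body, err}
	}()

	select {
	case <-time.After(5 * time.Second):
		return nil, errors.New("timeout")
	case response := <-ch:
		return response.resp, response.err
	}
}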

Resources