Is the Go HTTP handler goroutine expected to exit immediately in this case? - http

I have one Go HTTP handler like this:
mux.HandleFunc("/test", func(w http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
if cn, ok := w.(http.CloseNotifier); ok {
go func(done <-chan struct{}, closed <-chan bool) {
select {
case <-done:
case <-closed:
fmt.Println("client cancelled....................!!!!!!!!!")
cancel()
}
}(ctx.Done(), cn.CloseNotify())
}
time.Sleep(5 * time.Second)
fmt.Println("I am still running...........")
fmt.Fprint(w, "cancellation testing......")
})
The API works fine, then with curl before the request finish I terminate the curl command deliberately with Control-C, and on server side I do see the client cancelled....................!!!!!!!!! get logged out, but after a while the I am still running........... get logged out also, I thought this goroutine will be terminated immediately!
So, is this desired behaviour, or I did something wrong?
If it is expected, since whatever the goroutine will complete its work, then what is the point of the early cancellation?
If I did something wrong, please help to point me out the correct way.

You create a contex.Context that can be cancelled, which you do cancel when the client closes the connection, BUT you do not check the context and your handler does nothing differently if it is cancelled. The context only carries timeout and cancellation signals, it does not have the power nor the intent to kill / terminate goroutines. The goroutines themselves have to monitor such cancellation signals and act upon it.
So what you see is the expected output of your code.
What you want is to monitor the context, and if it is cancelled, return "immediately" from the handler.
Of course if you're "sleeping", you can't monitor the context meanwhile. So instead use time.After(), like in this example:
mux.HandleFunc("/test", func(w http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
if cn, ok := w.(http.CloseNotifier); ok {
go func(done <-chan struct{}, closed <-chan bool) {
select {
case <-done:
case <-closed:
fmt.Println("client cancelled....................!!!!!!!!!")
cancel()
}
}(ctx.Done(), cn.CloseNotify())
}
select {
case <-time.After(5 * time.Second):
fmt.Println("5 seconds elapsed, client didn't close")
case <-ctx.Done():
fmt.Println("Context closed, client closed connection?")
return
}
fmt.Fprint(w, "cancellation testing......")
})

Related

Respond to HTTP request while processing in the background

I have an API that receives a CSV file to process. I'd like to be able to send back an 202 Accepted (or any status really) while processing the file in the background. I have a handler that checks the request, writes the success header, and then continues processing via a producer/consumer pattern. The problem is that, due to the WaitGroup.Wait() calls, the accepted header isn't sending back. The errors on the handler validation are sending back correctly but that's because of the return statements.
Is it possible to send that 202 Accepted back with the wait groups as I'm hoping (and if so, what am I missing)?
func SomeHandler(w http.ResponseWriter, req *http.Request) {
endAccepted := time.Now()
err := verifyRequest(req)
if err != nil {
w.WriteHeader(http.StatusBadRequest)
data := JSONErrors{Errors: []string{err.Error()}}
json.NewEncoder(w).Encode(data)
return
}
// ...FILE RETRIEVAL CLIPPED (not relevant)...
// e.g. csvFile, openErr := os.Open(tmpFile.Name())
//////////////////////////////////////////////////////
// TODO this isn't sending due to the WaitGroup.Wait()s below
w.WriteHeader(http.StatusAccepted)
//////////////////////////////////////////////////////
// START PRODUCER/CONSUMER
jobs := make(chan *Job, 100) // buffered channel
results := make(chan *Job, 100) // buffered channel
// start consumers
for i := 0; i < 5; i++ { // 5 consumers
wg.Add(1)
go consume(i, jobs, results)
}
// start producing
go produce(jobs, csvFile)
// start processing
wg2.Add(1)
go process(results)
wg.Wait() // wait for all workers to finish processing jobs
close(results)
wg2.Wait() // wait for process to finish
log.Println("===> Done Processing.")
}
You're doing all the processing in the background, but you're still waiting for it to finish. The solution would be to just not wait. The best solution would move all of the handling elsewhere to a function you can just call with go to run it in the background, but the simplest solution leaving it inline would just be
w.WriteHeader(http.StatusAccepted)
go func() {
// START PRODUCER/CONSUMER
jobs := make(chan *Job, 100) // buffered channel
results := make(chan *Job, 100) // buffered channel
// start consumers
for i := 0; i < 5; i++ { // 5 consumers
wg.Add(1)
go consume(i, jobs, results)
}
// start producing
go produce(jobs, csvFile)
// start processing
wg2.Add(1)
go process(results)
wg.Wait() // wait for all workers to finish processing jobs
close(results)
wg2.Wait() // wait for process to finish
log.Println("===> Done Processing.")
}()
Note that you elided the CSV file handling, so you'll need to ensure that it's safe to use this way (i.e. that you haven't defered closing or deleting the file, which would cause that to occur as soon as the handler returns).

How to interrupt an HTTP handler?

Say I have a http handler like this:
func ReallyLongFunction(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hello World!")
// run code that takes a long time here
// Executing dd command with cmd.Exec..., etc.
})
Is there a way I can interrupt this function if the user refreshes the page or kills the request some other way without running the subsequent code and how would I do it?
I tried doing this:
notify := r.Context().Done()
go func() {
<-notify
println("Client closed the connection")
s.downloadCleanup()
return
}()
but the code after whenever I interrupt it still runs anyway.
There's no way to forcibly tear a goroutine down from any code external to that goroutine.
Hence the only way to actually interrupt processing is to periodically check whether the client is gone (or whether there's another signal to stop processing).
Basically that would amount to structuring your handler something like this
func ReallyLongFunction(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hello World!")
done := r.Context().Done()
// Check wheteher we're done
// do some small piece of stuff
// check whether we're done
// do another small piece of stuff
// …rinse, repeat
})
Now a way to check whether there was something written to a channel, but without blocking the operation is to use the "select with default" idiom:
select {
case <- done:
// We're done
default:
}
This statemept executes the code in the "// We're done" block if and only if done was written to or was closed (which is the case with contexts), and otherwis the empty block in the default branch is executed.
So we can refactor that to something like
func ReallyLongFunction(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hello World!")
done := r.Context().Done()
closed := func () bool {
select {
case <- done:
return true
default:
return false
}
}
if closed() {
return
}
// do some small piece of stuff
if closed() {
return
}
// do another small piece of stuff
// …rinse, repeat
})
Stopping an external process started in an HTTP handler
To address the OP's comment…
The os/exec.Cmd type has the Process field, which is of type os.Process and that type supports the Kill method which forcibly brings the running process down.
The only problem is that exec.Cmd.Run blocks until the process exits,
so the goroutine which is executing it cannot execute other code, and if exec.Cmd.Run is called in an HTTP handler, there's no way to cancel it.
How to best handle running a program in such an asynchronous manner heavily depends on how the process itself is organized but I'd roll like this:
In the handler, prepare the process and then start it using exec.Cmd.Start (as opposed to Run).
Check the error value Start have returned: if it's nil
the process has managed to start OK. Otherwise somehow communicate the failure to the client and quit the handler.
Once the process is known to had started, the exec.Cmd value
has some of its fields populated with process-related information;
of particular interest is the Process field which is of type
os.Process: that type has the Kill method which may be used to forcibly bring the process down.
Start a goroutine and pass it that exec.Cmd value and a channel of some suitable type (see below).
That goroutine should call Wait on it and once it returns,
it should communicate that fact back to the originating goroutine over that channel.
Exactly what to communicate, is an open question as it depends
on whether you want to collect what the process wrote to its standard
output and error streams and/or may be some other data related to the process' activity.
After sending the data, that goroutine exits.
The main goroutine (executing the handler) should just call exec.Cmd.Process.Kill when it detect the handler should terminate.
Killing the process eventually unblocks the goroutine which is executing Wait on that same exec.Cmd value as the process exits.
After killing the process, the handler goroutine waits on the channel to hear back from the goroutine watching the process. The handler does something with that data (may be logs it or whatever) and exits.
You should cancel the goroutine from inside, so for a long calculation task, you may provide checkpoints, to stop and check for the cancelation:
Here is the tested code for the server which has e.g. long calculation task and checkpoints for the cancelation:
package main
import (
"fmt"
"io"
"log"
"net/http"
"time"
)
func main() {
http.HandleFunc(`/`, func(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
log.Println("wait a couple of seconds ...")
for i := 0; i < 10; i++ { // long calculation
select {
case <-ctx.Done():
log.Println("Client closed the connection:", ctx.Err())
return
default:
fmt.Print(".")
time.Sleep(200 * time.Millisecond) // long calculation
}
}
io.WriteString(w, `Hi`)
log.Println("Done.")
})
log.Println(http.ListenAndServe(":8081", nil))
}
Here is the client code, which times out:
package main
import (
"io/ioutil"
"log"
"net/http"
"time"
)
func main() {
log.Println("HTTP GET")
client := &http.Client{
Timeout: 1 * time.Second,
}
r, err := client.Get(`http://127.0.0.1:8081/`)
if err != nil {
log.Fatal(err)
}
defer r.Body.Close()
bs, err := ioutil.ReadAll(r.Body)
if err != nil {
log.Fatal(err)
}
log.Println("HTTP Done.")
log.Println(string(bs))
}
You may use normal browser to check for not canclation, or close it, refresh it , disconect it, or ..., for the cancelation.

Sending HTTP Put body from ReadCloser never ends

Target
I want to send to data to my server from a read closer. (In the example a NopCloser, later it will be the Stdout of an exec.Command)
Problem
The Request never ends. Even if I manually close the cmdOut the program nevers ends. Concrete: It never reaches the "Request Done" line and there by never calls wg.Done()
Gotchas
All the data is sent correctly to the server (even with the exec.Command Stdout). But the http.DefaultClient.Do seems to be still listening on the ReadCloser after it is empty (and closed in the main routine)
Code
cmdOut := ioutil.NopCloser(bytes.NewBuffer([]byte("Hallo DU")))
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
// localhost:1234 is a netcat server: "nc -l -p 1234"
req, _ := http.NewRequest("PUT", "http://localhost:1234", cmdOut)
resp, err := http.DefaultClient.Do(req)
if err != nil {
log.Println(err)
return
}
// Never reaches this line
log.Println("Request Done")
}()
cmdOut.Close()
log.Println("Wait for go routine")
wg.Wait()
log.Println("DONE")
The problem isn't the sending of your request--it's the receiving of the response.
You're sending your request to netcat, which simply discards the request, and does nothing else. This leaves the HTTP client library waiting for an HTTP response, which never comes.
The solution is to talk to an actual HTTP server.

Async work after response

I am trying to implement http server that:
Calculate farther redirect using some logic
Redirect user
Log user data
The goal is to achieve maximum throughput (at least 15k rps). In order to do this, I want to save log asynchronously. I'm using kafka as logging system and separate logging block of code into separate goroutine. Overall example of current implementation:
package main
import (
"github.com/confluentinc/confluent-kafka-go/kafka"
"net/http"
"time"
"encoding/json"
)
type log struct {
RuntimeParam string `json:"runtime_param"`
AsyncParam string `json:"async_param"`
RemoteAddress string `json:"remote_address"`
}
var (
producer, _ = kafka.NewProducer(&kafka.ConfigMap{
"bootstrap.servers": "localhost:9092,localhost:9093",
"queue.buffering.max.ms": 1 * 1000,
"go.delivery.reports": false,
"client.id": 1,
})
topicName = "log"
)
func main() {
siteMux := http.NewServeMux()
siteMux.HandleFunc("/", httpHandler)
srv := &http.Server{
Addr: ":8080",
Handler: siteMux,
ReadTimeout: 2 * time.Second,
WriteTimeout: 5 * time.Second,
IdleTimeout: 10 * time.Second,
}
if err := srv.ListenAndServe(); err != nil {
panic(err)
}
}
func httpHandler(w http.ResponseWriter, r *http.Request) {
handlerLog := new(log)
handlerLog.RuntimeParam = "runtimeDataString"
http.Redirect(w, r, "http://google.com", 301)
go func(goroutineLog *log, request *http.Request) {
goroutineLog.AsyncParam = "asyncDataString"
goroutineLog.RemoteAddress = r.RemoteAddr
jsonLog, err := json.Marshal(goroutineLog)
if err == nil {
producer.ProduceChannel() <- &kafka.Message{
TopicPartition: kafka.TopicPartition{Topic: &topicName, Partition: kafka.PartitionAny},
Value: jsonLog,
}
}
}(handlerLog, r)
}
The questions are:
Is it correct/efficient to use separate goroutine to implement async logging or should I use a different approach? (workers and channels for example)
Maybe there is a way to further improve performance of server, that I'm missing?
Yes, this is correct and efficient use of a goroutine (as Flimzy pointed in the comments). I couldn't agree more, this is a good approach.
The problem is that the handler may finish executing before the goroutine started processing everything and the request (which is a pointer) may be gone or you may have some races down the middleware stack. I read your comments, that it isn't your case, but in general, you shouldn't pass a request to a goroutine. As I can see from your code, you're really using only RemoteAddr from the request and why not to redirect straight away and put logging in the defer statement? So, I'd rewrite your handler a bit:
func httpHandler(w http.ResponseWriter, r *http.Request) {
http.Redirect(w, r, "http://google.com", 301)
defer func(runtimeDataString, RemoteAddr string) {
handlerLog := new(log)
handlerLog.RuntimeParam = runtimeDataString
handlerLog.AsyncParam = "asyncDataString"
handlerLog.RemoteAddress = RemoteAddr
jsonLog, err := json.Marshal(handlerLog)
if err == nil {
producer.ProduceChannel() <- &kafka.Message{
TopicPartition: kafka.TopicPartition{Topic: &topicName, Partition: kafka.PartitionAny},
Value: jsonLog,
}
}
}("runtimeDataString", r.RemoteAddr)
}
The goroutines unlikely improve performance of your server as you just send the response earlier and those kafka connections could pile up in the background and slow down the whole server. If you find this as the bottleneck, you may consider saving logs locally and sending them to kafka in another process (or pool of workers) outside of your server. This may spread the workload over time (like sending fewer logs when you have more requests and vice versa).

Go bufio.Scanner stops while reading TCP connection to Redis

Reading TCP connection between Redis-server by using bufio.Scanner
fmt.Fprintf(conn, "*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$7\r\nHello!!\r\n")
scanner := bufio.NewScanner(conn)
for {
// fmt.Println("marker00")
if ok := scanner.Scan(); !ok {
// fmt.Println("marker01")
break
}
// fmt.Println("marker02")
fmt.Println(scanner.Text())
}
"+OK" comes as the result for first scanning, but the second scanning stops just in invoking Scan method. (marker00 -> marker02 -> marker00 and no output any more)
Why does Scan stop and how can I know the end of TCP response (without using bufio.Reader)?
Redis does not close the connection for you after sending a command. Scan() ends after io.EOF which is not sent.
Check out this:
package main
import (
"bufio"
"fmt"
"net"
)
// before go run, you must hit `redis-server` to wake redis up
func main() {
conn, _ := net.Dial("tcp", "localhost:6379")
message := "*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\nb\r\n"
go func(conn net.Conn) {
for i := 0; i < 10; i++ {
fmt.Fprintf(conn, message)
}
}(conn)
scanner := bufio.NewScanner(conn)
for {
if ok := scanner.Scan(); !ok {
break
}
fmt.Println(scanner.Text())
}
fmt.Println("Scanning ended")
}
Old question, but I had the same issue. Two solutions:
1) Add a "QUIT\r\n" command to your Redis message. This will cause Redis to close the connection which will terminate the scan. You'll have to deal with the extra "+OK" that the quit outputs.
2) Add
conn.SetReadDeadline(time.Now().Add(time.Second*5))
just before you start scanning. This will cause the scan to stop trying after 5 seconds. Unfortunately, it will always take 5 seconds to complete the scan so choose this time wisely.

Resources