How to interrupt an HTTP handler? - http

Say I have a http handler like this:
func ReallyLongFunction(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hello World!")
// run code that takes a long time here
// Executing dd command with cmd.Exec..., etc.
}
Is there a way I can interrupt this function if the user refreshes the page or kills the request some other way without running the subsequent code and how would I do it?
I tried doing this:
notify := r.Context().Done()
go func() {
<-notify
println("Client closed the connection")
s.downloadCleanup()
return
}()
but the code after it still runs anyway whenever I interrupt the request.

There's no way to forcibly tear a goroutine down from any code external to that goroutine.
Hence the only way to actually interrupt processing is to periodically check whether the client is gone (or whether there's another signal to stop processing).
Basically that would amount to structuring your handler something like this
func ReallyLongFunction(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hello World!")
done := r.Context().Done()
// check whether we're done
// do some small piece of stuff
// check whether we're done
// do another small piece of stuff
// …rinse, repeat
}
Now, a way to check whether something was written to a channel without blocking is to use the "select with default" idiom:
select {
case <- done:
// We're done
default:
}
This statement executes the code in the "// We're done" block if and only if done was written to or was closed (which is the case with contexts); otherwise the empty default branch is executed.
So we can refactor that to something like
func ReallyLongFunction(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hello World!")
done := r.Context().Done()
closed := func () bool {
select {
case <- done:
return true
default:
return false
}
}
if closed() {
return
}
// do some small piece of stuff
if closed() {
return
}
// do another small piece of stuff
// …rinse, repeat
}
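A minimal, self-contained sketch of the checkpoint pattern above, run outside an HTTP handler so it can be tried on its own. The names isDone and processSteps, and the use of context.WithCancel to stand in for a client disconnect, are assumptions made for illustration; in a handler the context would come from r.Context().

```go
package main

import (
	"context"
	"fmt"
)

// isDone reports whether ctx has been cancelled, without blocking.
func isDone(ctx context.Context) bool {
	select {
	case <-ctx.Done():
		return true
	default:
		return false
	}
}

// processSteps runs up to n small steps, checking for cancellation before
// each one, and returns how many steps actually ran. cancelAt is a
// hypothetical knob: the context is cancelled once that many steps have run.
func processSteps(n, cancelAt int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	ran := 0
	for i := 0; i < n; i++ {
		if isDone(ctx) {
			break // the "client" is gone, skip the remaining work
		}
		ran++ // stands in for "do some small piece of stuff"
		if ran == cancelAt {
			cancel() // simulate the client going away
		}
	}
	return ran
}

func main() {
	fmt.Println(processSteps(5, 2))  // cancelled after two steps
	fmt.Println(processSteps(5, 10)) // never cancelled
}
```

The smaller each step is, the sooner the handler notices the cancellation; a single long blocking call between two checks cannot be interrupted this way.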
Stopping an external process started in an HTTP handler
To address the OP's comment…
The os/exec.Cmd type has the Process field, which is of type *os.Process, and that type supports the Kill method, which forcibly brings the running process down.
The only problem is that exec.Cmd.Run blocks until the process exits,
so the goroutine executing it cannot run any other code, and if exec.Cmd.Run is called in an HTTP handler, the handler has no chance to cancel it.
How best to run a program in such an asynchronous manner depends heavily on how the process itself is organized, but I'd do it like this:
In the handler, prepare the process and then start it using exec.Cmd.Start (as opposed to Run).
Check the error value Start has returned: if it's nil,
the process has managed to start OK. Otherwise communicate the failure to the client and quit the handler.
Once the process is known to have started, the exec.Cmd value
has some of its fields populated with process-related information;
of particular interest is the Process field, which is of type
*os.Process: that type has the Kill method, which may be used to forcibly bring the process down.
Start a goroutine and pass it that exec.Cmd value and a channel of some suitable type (see below).
That goroutine should call Wait on it and once it returns,
it should communicate that fact back to the originating goroutine over that channel.
Exactly what to communicate is an open question, as it depends
on whether you want to collect what the process wrote to its standard
output and error streams and/or maybe some other data related to the process's activity.
After sending the data, that goroutine exits.
The main goroutine (executing the handler) should just call exec.Cmd.Process.Kill when it detects that the handler should terminate.
Killing the process eventually unblocks the goroutine which is executing Wait on that same exec.Cmd value, as the process exits.
After killing the process, the handler goroutine waits on the channel to hear back from the goroutine watching the process. The handler does something with that data (maybe logs it) and exits.
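The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: runUntilDone and demo are hypothetical names, the channel carries just the error from Wait, and the demo assumes a Unix-like system with a sleep binary on the PATH.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// runUntilDone starts cmd, waits for it in a separate goroutine, and kills
// the process if ctx is cancelled before the process exits on its own.
func runUntilDone(ctx context.Context, cmd *exec.Cmd) error {
	if err := cmd.Start(); err != nil {
		return fmt.Errorf("start: %w", err)
	}

	// The watcher goroutine reports the result of Wait over this channel.
	waitErr := make(chan error, 1)
	go func() { waitErr <- cmd.Wait() }()

	select {
	case err := <-waitErr:
		return err // the process exited by itself
	case <-ctx.Done():
		// Kill can fail if the process has just exited; Wait returns either way.
		_ = cmd.Process.Kill()
		<-waitErr // hear back from the watcher so the process is reaped
		return ctx.Err()
	}
}

// demo runs "sleep 30" under a 100ms deadline and reports whether the
// command was cut short because of the deadline.
func demo() bool {
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()

	start := time.Now()
	err := runUntilDone(ctx, exec.Command("sleep", "30"))
	return errors.Is(err, context.DeadlineExceeded) && time.Since(start) < 5*time.Second
}

func main() {
	fmt.Println("interrupted early:", demo())
}
```

In a handler, passing r.Context() as ctx would tie the process to the request. The standard library's exec.CommandContext offers a comparable kill-on-cancel behaviour without the hand-written goroutine.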

You should cancel the goroutine from inside, so for a long calculation task you may provide checkpoints at which to stop and check for cancellation.
Here is tested code for a server which has a long calculation task and checkpoints for cancellation:
package main
import (
"fmt"
"io"
"log"
"net/http"
"time"
)
func main() {
http.HandleFunc(`/`, func(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
log.Println("wait a couple of seconds ...")
for i := 0; i < 10; i++ { // long calculation
select {
case <-ctx.Done():
log.Println("Client closed the connection:", ctx.Err())
return
default:
fmt.Print(".")
time.Sleep(200 * time.Millisecond) // long calculation
}
}
io.WriteString(w, `Hi`)
log.Println("Done.")
})
log.Println(http.ListenAndServe(":8081", nil))
}
Here is the client code, which times out:
package main
import (
"io/ioutil"
"log"
"net/http"
"time"
)
func main() {
log.Println("HTTP GET")
client := &http.Client{
Timeout: 1 * time.Second,
}
r, err := client.Get(`http://127.0.0.1:8081/`)
if err != nil {
log.Fatal(err)
}
defer r.Body.Close()
bs, err := ioutil.ReadAll(r.Body)
if err != nil {
log.Fatal(err)
}
log.Println("HTTP Done.")
log.Println(string(bs))
}
You may use a normal browser to check the non-cancelled case, or close it, refresh it, disconnect it, and so on, to check cancellation.

Related

Respond to HTTP request while processing in the background

I have an API that receives a CSV file to process. I'd like to be able to send back a 202 Accepted (or any status really) while processing the file in the background. I have a handler that checks the request, writes the success header, and then continues processing via a producer/consumer pattern. The problem is that, due to the WaitGroup.Wait() calls, the accepted header isn't being sent back. The errors from the handler validation are sent back correctly, but that's because of the return statements.
Is it possible to send that 202 Accepted back with the wait groups as I'm hoping (and if so, what am I missing)?
func SomeHandler(w http.ResponseWriter, req *http.Request) {
endAccepted := time.Now()
err := verifyRequest(req)
if err != nil {
w.WriteHeader(http.StatusBadRequest)
data := JSONErrors{Errors: []string{err.Error()}}
json.NewEncoder(w).Encode(data)
return
}
// ...FILE RETRIEVAL CLIPPED (not relevant)...
// e.g. csvFile, openErr := os.Open(tmpFile.Name())
//////////////////////////////////////////////////////
// TODO this isn't sending due to the WaitGroup.Wait()s below
w.WriteHeader(http.StatusAccepted)
//////////////////////////////////////////////////////
// START PRODUCER/CONSUMER
jobs := make(chan *Job, 100) // buffered channel
results := make(chan *Job, 100) // buffered channel
// start consumers
for i := 0; i < 5; i++ { // 5 consumers
wg.Add(1)
go consume(i, jobs, results)
}
// start producing
go produce(jobs, csvFile)
// start processing
wg2.Add(1)
go process(results)
wg.Wait() // wait for all workers to finish processing jobs
close(results)
wg2.Wait() // wait for process to finish
log.Println("===> Done Processing.")
}
You're doing all the processing in the background, but you're still waiting for it to finish. The solution would be to just not wait. The best solution would move all of the handling elsewhere to a function you can just call with go to run it in the background, but the simplest solution leaving it inline would just be
w.WriteHeader(http.StatusAccepted)
go func() {
// START PRODUCER/CONSUMER
jobs := make(chan *Job, 100) // buffered channel
results := make(chan *Job, 100) // buffered channel
// start consumers
for i := 0; i < 5; i++ { // 5 consumers
wg.Add(1)
go consume(i, jobs, results)
}
// start producing
go produce(jobs, csvFile)
// start processing
wg2.Add(1)
go process(results)
wg.Wait() // wait for all workers to finish processing jobs
close(results)
wg2.Wait() // wait for process to finish
log.Println("===> Done Processing.")
}()
Note that you elided the CSV file handling, so you'll need to ensure that it's safe to use this way (i.e. that you haven't deferred closing or deleting the file, which would cause that to occur as soon as the handler returns).
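A small sketch of the "move the handling elsewhere and call it with go" variant mentioned above, checked in memory with net/http/httptest. The names processInBackground, acceptHandler and demo, the done channel, and the file query parameter are all hypothetical stand-ins; the real pipeline would replace the one-line worker.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// processInBackground is a hypothetical stand-in for the producer/consumer
// pipeline; it reports on done once it has finished.
func processInBackground(data string, done chan<- string) {
	done <- "processed " + data
}

// acceptHandler replies 202 immediately and leaves the work running.
func acceptHandler(done chan<- string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		// Copy whatever the background work needs before the handler returns.
		data := r.URL.Query().Get("file")
		w.WriteHeader(http.StatusAccepted)
		go processInBackground(data, done)
	}
}

// demo calls the handler in memory and returns the status code the client
// would see together with the result of the background work.
func demo() (int, string) {
	done := make(chan string, 1)
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/upload?file=a.csv", nil)
	acceptHandler(done)(rec, req)
	return rec.Code, <-done
}

func main() {
	code, result := demo()
	fmt.Println(code, result)
}
```

The handler returns without waiting, so the status is recorded before the background work is known to have finished; the channel exists here only so the demo can observe that the work did run.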

Async work after response

I am trying to implement an HTTP server that:
Calculates a further redirect using some logic
Redirects the user
Logs user data
The goal is to achieve maximum throughput (at least 15k rps). In order to do this, I want to save the log asynchronously. I'm using Kafka as the logging system and have moved the logging block of code into a separate goroutine. An overall example of the current implementation:
package main
import (
"github.com/confluentinc/confluent-kafka-go/kafka"
"net/http"
"time"
"encoding/json"
)
type log struct {
RuntimeParam string `json:"runtime_param"`
AsyncParam string `json:"async_param"`
RemoteAddress string `json:"remote_address"`
}
var (
producer, _ = kafka.NewProducer(&kafka.ConfigMap{
"bootstrap.servers": "localhost:9092,localhost:9093",
"queue.buffering.max.ms": 1 * 1000,
"go.delivery.reports": false,
"client.id": 1,
})
topicName = "log"
)
func main() {
siteMux := http.NewServeMux()
siteMux.HandleFunc("/", httpHandler)
srv := &http.Server{
Addr: ":8080",
Handler: siteMux,
ReadTimeout: 2 * time.Second,
WriteTimeout: 5 * time.Second,
IdleTimeout: 10 * time.Second,
}
if err := srv.ListenAndServe(); err != nil {
panic(err)
}
}
func httpHandler(w http.ResponseWriter, r *http.Request) {
handlerLog := new(log)
handlerLog.RuntimeParam = "runtimeDataString"
http.Redirect(w, r, "http://google.com", 301)
go func(goroutineLog *log, request *http.Request) {
goroutineLog.AsyncParam = "asyncDataString"
goroutineLog.RemoteAddress = r.RemoteAddr
jsonLog, err := json.Marshal(goroutineLog)
if err == nil {
producer.ProduceChannel() <- &kafka.Message{
TopicPartition: kafka.TopicPartition{Topic: &topicName, Partition: kafka.PartitionAny},
Value: jsonLog,
}
}
}(handlerLog, r)
}
The questions are:
Is it correct/efficient to use separate goroutine to implement async logging or should I use a different approach? (workers and channels for example)
Maybe there is a way to further improve performance of server, that I'm missing?
Yes, this is a correct and efficient use of a goroutine (as Flimzy pointed out in the comments). I couldn't agree more, this is a good approach.
The problem is that the handler may finish executing before the goroutine has processed everything, and the request (which is a pointer) may be gone, or you may have some races down the middleware stack. I read in your comments that this isn't your case, but in general you shouldn't pass a request to a goroutine. As far as I can see from your code, you're really using only RemoteAddr from the request, so why not redirect straight away and put the logging in a defer statement? So, I'd rewrite your handler a bit:
func httpHandler(w http.ResponseWriter, r *http.Request) {
http.Redirect(w, r, "http://google.com", 301)
defer func(runtimeDataString, RemoteAddr string) {
handlerLog := new(log)
handlerLog.RuntimeParam = runtimeDataString
handlerLog.AsyncParam = "asyncDataString"
handlerLog.RemoteAddress = RemoteAddr
jsonLog, err := json.Marshal(handlerLog)
if err == nil {
producer.ProduceChannel() <- &kafka.Message{
TopicPartition: kafka.TopicPartition{Topic: &topicName, Partition: kafka.PartitionAny},
Value: jsonLog,
}
}
}("runtimeDataString", r.RemoteAddr)
}
The goroutines are unlikely to improve the performance of your server, as you just send the response earlier, and those Kafka connections could pile up in the background and slow down the whole server. If you find this to be the bottleneck, you may consider saving logs locally and sending them to Kafka in another process (or pool of workers) outside of your server. This may spread the workload over time (like sending fewer logs when you have more requests and vice versa).
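One way the "pool of workers" idea could look inside the same process is sketched below: handlers hand entries to a bounded queue and a fixed number of workers drain it. Everything here (entry, logPool, enqueue, shutdown, the choice to drop entries when the queue is full) is an assumption for illustration, and the Kafka call is replaced by a counter.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// entry is a hypothetical log record.
type entry struct{ RemoteAddr string }

// logPool is a fixed set of workers draining a bounded queue.
type logPool struct {
	queue chan entry
	wg    sync.WaitGroup
	sent  int64
}

func newLogPool(workers, size int) *logPool {
	p := &logPool{queue: make(chan entry, size)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for range p.queue {
				// A real worker would pass the entry to the Kafka producer here.
				atomic.AddInt64(&p.sent, 1)
			}
		}()
	}
	return p
}

// enqueue never blocks the caller: if the queue is full the entry is dropped
// and false is returned.
func (p *logPool) enqueue(e entry) bool {
	select {
	case p.queue <- e:
		return true
	default:
		return false
	}
}

// shutdown lets the workers drain the queue and returns how many entries
// were handled.
func (p *logPool) shutdown() int64 {
	close(p.queue)
	p.wg.Wait()
	return atomic.LoadInt64(&p.sent)
}

// demo enqueues 50 entries into a queue with room for 100.
func demo() (accepted int, sent int64) {
	p := newLogPool(4, 100)
	for i := 0; i < 50; i++ {
		if p.enqueue(entry{RemoteAddr: "127.0.0.1"}) {
			accepted++
		}
	}
	return accepted, p.shutdown()
}

func main() {
	fmt.Println(demo())
}
```

Compared with one goroutine per request, the queue size puts an upper bound on how much logging work can pile up; whether dropping entries under load is acceptable depends on the application.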

Is the Go HTTP handler goroutine expected to exit immediately in this case?

I have one Go HTTP handler like this:
mux.HandleFunc("/test", func(w http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
if cn, ok := w.(http.CloseNotifier); ok {
go func(done <-chan struct{}, closed <-chan bool) {
select {
case <-done:
case <-closed:
fmt.Println("client cancelled....................!!!!!!!!!")
cancel()
}
}(ctx.Done(), cn.CloseNotify())
}
time.Sleep(5 * time.Second)
fmt.Println("I am still running...........")
fmt.Fprint(w, "cancellation testing......")
})
The API works fine. Then, with curl, I deliberately terminate the request with Control-C before it finishes, and on the server side I do see client cancelled....................!!!!!!!!! logged, but after a while I am still running........... is logged as well. I thought this goroutine would be terminated immediately!
So, is this the desired behaviour, or did I do something wrong?
If it is expected, and the goroutine will complete its work regardless, then what is the point of the early cancellation?
If I did something wrong, please point me to the correct way.
You create a context.Context that can be cancelled, which you do cancel when the client closes the connection, BUT you do not check the context, and your handler does nothing differently if it is cancelled. The context only carries timeout and cancellation signals; it has neither the power nor the intent to kill / terminate goroutines. The goroutines themselves have to monitor such cancellation signals and act upon them.
So what you see is the expected output of your code.
What you want is to monitor the context, and if it is cancelled, return "immediately" from the handler.
Of course if you're "sleeping", you can't monitor the context meanwhile. So instead use time.After(), like in this example:
mux.HandleFunc("/test", func(w http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
if cn, ok := w.(http.CloseNotifier); ok {
go func(done <-chan struct{}, closed <-chan bool) {
select {
case <-done:
case <-closed:
fmt.Println("client cancelled....................!!!!!!!!!")
cancel()
}
}(ctx.Done(), cn.CloseNotify())
}
select {
case <-time.After(5 * time.Second):
fmt.Println("5 seconds elapsed, client didn't close")
case <-ctx.Done():
fmt.Println("Context closed, client closed connection?")
return
}
fmt.Fprint(w, "cancellation testing......")
})
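In current Go versions http.CloseNotifier is deprecated in favour of the request's own context, which is cancelled when the client's connection closes, so the same wait can be written against r.Context() directly. The helper below is a minimal sketch of that idea; the names sleepCtx and demo are assumptions, and the demo uses context.WithCancel in place of a real client disconnect.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sleepCtx waits for d or until ctx is cancelled, whichever comes first.
// It returns nil after a full wait and ctx.Err() when interrupted.
func sleepCtx(ctx context.Context, d time.Duration) error {
	t := time.NewTimer(d)
	defer t.Stop()
	select {
	case <-t.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// demo reports whether an uncancelled wait completed and whether a wait on
// an already cancelled context was interrupted.
func demo() (completed, interrupted bool) {
	completed = sleepCtx(context.Background(), 10*time.Millisecond) == nil

	ctx, cancel := context.WithCancel(context.Background())
	cancel() // stands in for the client closing the connection
	interrupted = sleepCtx(ctx, time.Hour) == context.Canceled
	return completed, interrupted
}

func main() {
	fmt.Println(demo())
}
```

Inside a handler this would be called as sleepCtx(r.Context(), 5*time.Second), returning from the handler when the result is non-nil.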

Why this simple web server is called even number times?

I'm trying to learn Go web programming, and here is a simple web server: it prints out the times being called.
package main
import (
"fmt"
"net/http"
)
var calls int
// HelloWorld print the times being called.
func HelloWorld(w http.ResponseWriter, r *http.Request){
calls++
fmt.Fprintf(w, "You've called me %d times", calls)
}
func main() {
fmt.Printf("Started server at http://localhost%v.\n", 5000)
http.HandleFunc("/", HelloWorld)
http.ListenAndServe(":5000", nil)
}
When I refresh the page, I got:
You've called me 1 times
You've called me 3 times
You've called me 5 times
....
Question: Why it is 1, 3, 5 times, rather than 1,2,3...? What's the order of the function HelloWorld being called?
It is because every incoming request is routed to your HelloWorld() handler function, and the browser makes multiple calls under the hood, specifically to /favicon.ico.
And since your web server does not send back a valid favicon, it will request it again when you refresh the page in the browser.
Try it with Chrome: open the Developer tools (CTRL+SHIFT+I), and choose the "Network" tab. Hit refresh, and you will see 2 new entries:
Name          Status   Type
--------------------------------
localhost     200      document
favicon.ico   200      text/plain
Since your counter starts with 0 (default value for type int), you increment it once and you send back 1. Then the request for favicon.ico increments it again (2), but the result is not displayed. Then if you refresh, it gets incremented again to 3 and you send that back, etc.
Also note that multiple goroutines can serve requests concurrently, so your solution has a race. You should synchronize access to the calls variable, or use the sync/atomic package to increment the counter safely, for example:
var calls int64
func HelloWorld(w http.ResponseWriter, r *http.Request) {
count := atomic.AddInt64(&calls, 1)
fmt.Fprintf(w, "You've called me %d times", count)
}
A simple "fix" to achieve what you want would be to check the request path, and if it is not the root "/", don't increment, e.g.:
func HelloWorld(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/" {
return
}
count := atomic.AddInt64(&calls, 1)
fmt.Fprintf(w, "You've called me %d times", count)
}
You may also choose to only exclude requests for favicon.ico, e.g.:
if r.URL.Path == "/favicon.ico" {
return
}
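The effect of the path check can be observed without a browser by calling the handler in memory. This is a sketch using net/http/httptest; the helpers get and demo are hypothetical, and the counter is reset at the start of demo so the result does not depend on earlier calls.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

var calls int64

// HelloWorld counts only requests for the root path.
func HelloWorld(w http.ResponseWriter, r *http.Request) {
	if r.URL.Path != "/" {
		return
	}
	count := atomic.AddInt64(&calls, 1)
	fmt.Fprintf(w, "You've called me %d times", count)
}

// get sends an in-memory GET request to the handler and returns the body.
func get(path string) string {
	rec := httptest.NewRecorder()
	HelloWorld(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

// demo mimics a browser: page load, favicon request, then a refresh.
func demo() []string {
	atomic.StoreInt64(&calls, 0)
	return []string{get("/"), get("/favicon.ico"), get("/")}
}

func main() {
	for _, body := range demo() {
		fmt.Printf("%q\n", body)
	}
}
```

With the check in place the favicon request gets an empty body and the second page load reports 2 rather than 3.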

Simulate a tcp connection in Go

In Go, a TCP connection (net.Conn) is a io.ReadWriteCloser. I'd like to test my network code by simulating a TCP connection. There are two requirements that I have:
the data to be read is stored in a string
whenever data is written, I'd like it to be stored in some kind of buffer which I can access later
Is there a data structure for this, or an easy way to make one?
No idea if this existed when the question was asked, but you probably want net.Pipe(), which provides you with two full-duplex net.Conn instances that are linked to each other.
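A short sketch of how net.Pipe might be used in a test: one end is handed to the code under test, the other is driven by the test itself. The functions serve and talk are hypothetical examples, not part of any library. Because the pipe is unbuffered, the code under test has to run in its own goroutine.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
)

// serve is a hypothetical piece of network code under test: it reads one
// line from conn and answers with a greeting.
func serve(conn net.Conn) {
	defer conn.Close()
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return
	}
	fmt.Fprintf(conn, "hello, %s", line)
}

// talk drives serve over an in-memory connection and returns its reply.
func talk(name string) (string, error) {
	client, server := net.Pipe()
	defer client.Close()

	// Writes on one end block until the other end reads, so serve must run
	// concurrently with the client side.
	go serve(server)

	if _, err := fmt.Fprintf(client, "%s\n", name); err != nil {
		return "", err
	}
	return bufio.NewReader(client).ReadString('\n')
}

func main() {
	reply, err := talk("gopher")
	fmt.Printf("%q %v\n", reply, err)
}
```

Both ends implement the full net.Conn interface, so deadlines and address methods are available without writing dummy methods.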
EDIT: I've rolled this answer into a package which makes things a bit simpler - see here: https://github.com/jordwest/mock-conn
While Ivan's solution will work for simple cases, keep in mind that a real TCP connection is actually two buffers, or rather pipes. For example:
Server | Client
---------+---------
reads <=== writes
writes ===> reads
If you use a single buffer that the server both reads from and writes to, you could end up with the server talking to itself.
Here's a solution that allows you to pass a MockConn type as a ReadWriteCloser to the server. The Read, Write and Close functions simply proxy through to the functions on the server's end of the pipes.
type MockConn struct {
ServerReader *io.PipeReader
ServerWriter *io.PipeWriter
ClientReader *io.PipeReader
ClientWriter *io.PipeWriter
}
func (c MockConn) Close() error {
if err := c.ServerWriter.Close(); err != nil {
return err
}
if err := c.ServerReader.Close(); err != nil {
return err
}
return nil
}
func (c MockConn) Read(data []byte) (n int, err error) { return c.ServerReader.Read(data) }
func (c MockConn) Write(data []byte) (n int, err error) { return c.ServerWriter.Write(data) }
func NewMockConn() MockConn {
serverRead, clientWrite := io.Pipe()
clientRead, serverWrite := io.Pipe()
return MockConn{
ServerReader: serverRead,
ServerWriter: serverWrite,
ClientReader: clientRead,
ClientWriter: clientWrite,
}
}
When mocking a 'server' connection, simply pass the MockConn wherever you would use the net.Conn. (This obviously implements the ReadWriteCloser interface only; you could easily add dummy methods for LocalAddr() etc. if you need to support the full net.Conn interface.)
In your tests you can act as the client by reading and writing to the ClientReader and ClientWriter fields as needed:
func TestTalkToServer(t *testing.T) {
/*
* Assumes that NewMockConn has already been called and
* the server is waiting for incoming data
*/
// Send a message to the server
fmt.Fprintf(mockConn.ClientWriter, "Hello from client!\n")
// Wait for the response from the server
rd := bufio.NewReader(mockConn.ClientReader)
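line, err := rd.ReadString('\n')
if err != nil {
t.Fatalf("Reading server response failed: %v", err)
}
// ReadString keeps the trailing delimiter, so it is part of the comparison
if line != "Hello from server!\n" {
t.Errorf("Server response not as expected: %q", line)
}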
}
Why not use bytes.Buffer? It's an io.ReadWriter and has a String method to get the stored data. If you need to make it an io.ReadWriteCloser, you could define your own type:
type CloseableBuffer struct {
bytes.Buffer
}
and define a Close method:
func (b *CloseableBuffer) Close() error {
return nil
}
In the majority of cases you do not need to mock net.Conn.
You only have to mock things that add time to your tests, prevent tests from running in parallel (by using shared resources such as a hardcoded file name), or can lead to outages (you could potentially exhaust the connection limit or ports, but in most cases that is not a concern when you run your tests in isolation).
Not mocking has the advantage of testing what you want to test more precisely, against the real thing.
https://www.accenture.com/us-en/blogs/software-engineering-blog/to-mock-or-not-to-mock-is-that-even-a-question
Instead of mocking net.Conn, you can write a mock server, run it in a goroutine in your test, and connect to it using a real net.Conn.
A quick and dirty example:
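port := someRandomPort() // placeholder helper; must yield a listen address such as ":8099"
srv := &http.Server{Addr: port}
go func() {
http.HandleFunc("/hello", myHandleFunc)
srv.ListenAndServe()
}() // the goroutine has to be invoked, not just declared
myTestCodeUsingConn(port)
srv.Shutdown(context.TODO())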
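For HTTP specifically, the standard library's net/http/httptest package can take care of picking a free port and running the server in the background. The sketch below assumes a hypothetical helloHandler and fetchHello; it starts a real server on the loopback interface, so it needs permission to listen locally.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// helloHandler is a hypothetical handler served by the mock server.
func helloHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "hello")
}

// fetchHello starts a throwaway server, requests /hello over a real
// connection, and returns the response body.
func fetchHello() (string, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/hello", helloHandler)

	srv := httptest.NewServer(mux) // listens on a free loopback port
	defer srv.Close()

	resp, err := srv.Client().Get(srv.URL + "/hello")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	fmt.Println(fetchHello())
}
```

Using a private ServeMux instead of http.HandleFunc keeps the handler registration local to the test, so several tests can register the same path without colliding.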
