I'm trying to send data (files or anything else) over HTTP from a client to a server and read it as a stream on the server.
But I noticed that when the request body is read, the chunk (buffer) size is fixed at 32 KB, even though I pass a larger buffer. When I tried this over plain TCP before using HTTP, reads filled the buffer size I assigned.
The data received from the request is written to a file.
Questions:
Is it possible to increase the chunk/buffer size?
If it is possible, will a bigger buffer improve performance by reducing the number of write calls to the file being created?
If it is not possible, should I worry about the performance cost of the extra write calls to the file?
Would it be better to use plain TCP? I really need the HTTP headers and response.
Here is some code for illustration:
client.go:
package main
import (
	"log"
	"net/http"
	"os"
)
func main() {
	addr := "http://localhost:8080"
	path := "path/to/file"
	sendHTTP(addr, path)
}
func sendHTTP(addr, path string) {
	f, err := os.Open(path)
	if err != nil {
		log.Fatal("Error opening file:", err)
	}
	client := &http.Client{}
	req, err := http.NewRequest("POST", addr, f)
	if err != nil {
		f.Close()
		log.Fatal("Error creating request:", err)
	}
	// client.Do closes the request body (the file) for us, even on error.
	resp, err := client.Do(req)
	if err != nil {
		log.Fatal("Error doing request:", err)
	}
	resp.Body.Close()
}
server.go:
package main
import (
	"fmt"
	"io"
	"log"
	"net/http"
)
func main() {
	addr := ":8080"
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(addr, nil))
}
func handler(_ http.ResponseWriter, r *http.Request) {
	buf := make([]byte, 512*1024) // 512 KB buffer
	for {
		br, err := r.Body.Read(buf)
		if br > 0 {
			fmt.Println(br) // always at most 32 KB (32768), regardless of the buffer size
		}
		// Read may return data together with io.EOF; consume it before breaking.
		if err == io.EOF {
			break
		} else if err != nil {
			log.Println("Error reading request:", err)
			break
		}
	}
}
The call r.Body.Read(buf) waits for data from the network and returns up to len(buf) bytes of the available data. The amount of available data at the time of the call depends on timing and buffer sizes on the client, server and network. It's not easy to control.
Since the data received from the request is written to a file, the most efficient approach is to copy from the request body to the file using io.Copy. Here's an example where f is the *os.File you want to write to:
_, err := io.Copy(f, r.Body)
if err != nil {
	// handle error
}
At the time I am writing this answer, the io.Copy function calls f.ReadFrom(r.Body) to copy the request body to a file.
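Putting it together, here is a minimal sketch of the receiving handler; uploadHandler and the output path "upload.dat" are placeholders, not from the question:
// A sketch: stream the request body straight into a file with io.Copy.
func uploadHandler(w http.ResponseWriter, r *http.Request) {
	f, err := os.Create("upload.dat") // placeholder path
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	defer f.Close()
	// io.Copy lets *os.File's ReadFrom do the work; buffer sizes are
	// handled internally, so the 32 KB reads are no longer your concern.
	if _, err := io.Copy(f, r.Body); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
}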
Related
When making an HTTP request with Go and receiving the response, I want to process the response as a stream, considering the case where the response body is huge (1 GB or more).
resp, err := client.Do(req)
In this case, if the body is huge, I cannot read the headers and I do not know the state of the response.
Is there any solution?
(Edit: if you're unable to get the Content-Length header from the response, it is possible that the web service you're hitting doesn't return that header. In such a case, there's no way to know the length of the response body without reading it completely. You can simulate that in the following example by removing the line that sets the Content-Length header in the response.)
The standard Go net/http package handles large responses very well. Here's a self contained example to demonstrate:
// Start a mock HTTP server that returns 2GB of data in the response. Make a
// HTTP request to this server and print the amount of data read from the
// response.
// Start a mock HTTP server that returns 2GB of data in the response. Make an
// HTTP request to this server and print the amount of data read from the
// response.
package main
import (
	"fmt"
	"io"
	"log"
	"net/http"
	"strings"
	"time"
)
const oneMB = 1024 * 1024
const oneGB = 1024 * oneMB
const responseSize = 2 * oneGB
const serverAddr = "localhost:9999"
func startServer() {
	// Mock HTTP server that always returns 2GB of data.
	go http.ListenAndServe(serverAddr, http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		w.Header().Set("Content-Length", fmt.Sprintf("%d", responseSize))
		// 1MB buffer that'll be copied multiple times to the response.
		buf := []byte(strings.Repeat("x", oneMB))
		for i := 0; i < responseSize/len(buf); i++ {
			if _, err := w.Write(buf); err != nil {
				log.Fatal("Failed to write to response. Error: ", err.Error())
			}
		}
	}))
	// Some grace period for the server to start.
	time.Sleep(100 * time.Millisecond)
}
func main() {
	startServer()
	// HTTP client
	req, err := http.NewRequest("GET", "http://"+serverAddr, nil)
	if err != nil {
		log.Fatal("Error creating HTTP request: ", err.Error())
	}
	client := http.Client{}
	resp, err := client.Do(req)
	if err != nil {
		log.Fatal("Error making HTTP request: ", err.Error())
	}
	defer resp.Body.Close()
	// The headers are available as soon as Do returns, before the body is read.
	fmt.Println("Response: Content-Length:", resp.Header.Get("Content-Length"))
	bytesRead := 0
	buf := make([]byte, oneMB)
	// Read the response body.
	for {
		n, err := resp.Body.Read(buf)
		bytesRead += n
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal("Error reading HTTP response: ", err.Error())
		}
	}
	fmt.Println("Response: Read", bytesRead, "bytes")
}
You wouldn't want to read the entire response in memory if it's too large. Write it to a temporary file instead and then process that.
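For example, here is a minimal sketch that streams the body from the example above into a temporary file (error handling shortened; os.CreateTemp needs Go 1.16+):
// Stream the response body to a temporary file instead of holding it in memory.
tmp, err := os.CreateTemp("", "download-*.bin")
if err != nil {
	log.Fatal("Error creating temp file: ", err)
}
defer tmp.Close()
if _, err := io.Copy(tmp, resp.Body); err != nil {
	log.Fatal("Error saving response: ", err)
}
fmt.Println("Response saved to", tmp.Name())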
If instead you're looking for options to do this reliably when the network isn't dependable, look into HTTP range requests, which let you resume a partially downloaded file.
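A sketch of what a resume looks like with a Range header; url and offset (the number of bytes already saved) are assumptions for illustration:
req, err := http.NewRequest("GET", url, nil)
if err != nil {
	log.Fatal(err)
}
// Ask for everything from byte offset onward.
req.Header.Set("Range", fmt.Sprintf("bytes=%d-", offset))
resp, err := http.DefaultClient.Do(req)
if err != nil {
	log.Fatal(err)
}
defer resp.Body.Close()
// 206 Partial Content means the server honored the range;
// a plain 200 means it ignored it and sent the whole body.
if resp.StatusCode != http.StatusPartialContent {
	log.Println("server ignored the range request")
}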
Hey there, I would like to parse an http.Request body twice, as below. When I parse the body the first time, it is consumed and closed. I need some help or a hint on the best way to handle this: do I have to create a copy of the request, or is there a better way?
func myfunc(w http.ResponseWriter, req *http.Request) {
	err := parseBody(req, &type1)
	.....
	err := parseBody(req, &type2)
	.....
}
Thanks for the help
It's true that you can read the body only once, and that's fine, because to parse the body more than once you don't have to read it more than once. Let's consider a simple example:
package main
import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"net/http"
)
type RequestData1 struct {
	Code   string `json:"code"`
	Status string `json:"status"`
}
type RequestData2 struct {
	Status  string `json:"status"`
	Message string `json:"message"`
}
func main() {
	http.HandleFunc("/post", post)
	http.ListenAndServe(":8080", nil)
}
If we use this code:
func post(w http.ResponseWriter, r *http.Request) {
	body1, err := ioutil.ReadAll(r.Body)
	if err != nil {
		panic(err)
	}
	rd1 := RequestData1{}
	err = json.Unmarshal(body1, &rd1)
	if err != nil {
		panic(err)
	}
	body2, err := ioutil.ReadAll(r.Body)
	if err != nil {
		panic(err)
	}
	rd2 := RequestData2{}
	err = json.Unmarshal(body2, &rd2)
	if err != nil {
		panic(err) // panic!!!
	}
	fmt.Printf("rd1: %+v \nrd2: %+v", rd1, rd2)
	w.WriteHeader(http.StatusOK)
	w.Write([]byte(`Look into console.`))
}
we get a panic: http: panic serving [::1]:54581: unexpected end of JSON input, because the body has already been drained by the first ReadAll. But with the next version:
func post(w http.ResponseWriter, r *http.Request) {
	body, err := ioutil.ReadAll(r.Body)
	if err != nil {
		panic(err)
	}
	rd1 := RequestData1{}
	err = json.Unmarshal(body, &rd1)
	if err != nil {
		panic(err)
	}
	rd2 := RequestData2{}
	err = json.Unmarshal(body, &rd2)
	if err != nil {
		panic(err)
	}
	fmt.Printf("rd1: %+v \nrd2: %+v", rd1, rd2)
	w.WriteHeader(http.StatusOK)
	w.Write([]byte(`Look into console.`))
}
everything works! You can test it by issuing a request:
curl -X POST 'http://localhost:8080/post' \
-H 'Content-Type: application/json' -d '{"code":"200", "status": "OK", "message": "200 OK"}'
Result will be:
rd1: {Code:200 Status:OK}
rd2: {Status:OK Message:200 OK}
When you read request.Body, you're reading the stream from the client (e.g. web browser). The client only sends the request once. If you want to parse it multiple times, read the whole thing out into a buffer (e.g. a []byte) and then parse that as many times as you want. Just be mindful of the potential memory use of many concurrent requests with large payloads, as you'll be holding the full payload in memory at least until you're fully done parsing it.
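A sketch of that approach, reusing RequestData1 and RequestData2 from the example above inside the same kind of handler; it also restores r.Body with a rereadable copy for any downstream reader (io.ReadAll and io.NopCloser need Go 1.16+, plus the bytes import):
func post(w http.ResponseWriter, r *http.Request) {
	// Read the whole body exactly once.
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// Replace the drained body with a rereadable copy so code that
	// expects an unread r.Body still works.
	r.Body = io.NopCloser(bytes.NewReader(body))
	// Parse the same bytes as many times as needed.
	rd1, rd2 := RequestData1{}, RequestData2{}
	if err := json.Unmarshal(body, &rd1); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	if err := json.Unmarshal(body, &rd2); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	fmt.Printf("rd1: %+v\nrd2: %+v", rd1, rd2)
}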
Here is the schema:
The client sends a POST request to server A.
Server A processes it and sends a GET to server B.
Server B sends a response back through A to the client.
I thought the best idea was to make a pipe that would read the response of the GET and write into the response of the POST, but I got many type errors.
func main() {
	r := mux.NewRouter()
	r.HandleFunc("/test/{hash}", testHandler)
	log.Fatal(http.ListenAndServe(":9095", r))
}
func handleErr(err error) {
	if err != nil {
		log.Fatalf("%s\n", err)
	}
}
func testHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Println("FIRST REQUEST RECEIVED")
	vars := mux.Vars(r)
	hash := vars["hash"]
	read, write := io.Pipe()
	// writing without a reader will deadlock so write in a goroutine
	go func() {
		write, _ = http.Get("http://localhost:9090/test/" + hash)
		defer write.Close()
	}()
	w.Write(read)
}
When I build this, I get the following error:
./ReverseProxy.go:61: cannot use read (type *io.PipeReader) as type []byte in argument to w.Write
Is there a way to properly feed an io.PipeReader into an HTTP response?
Or am I doing this in a totally wrong way?
You are not actually writing to the pipe; you're replacing the pipe's writer with the GET response.
Something along the lines of:
func testHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Println("FIRST REQUEST RECEIVED")
	vars := mux.Vars(r)
	hash := vars["hash"]
	read, write := io.Pipe()
	// writing without a reader will deadlock, so write in a goroutine
	go func() {
		defer write.Close()
		resp, err := http.Get("http://localhost:9090/test/" + hash)
		if err != nil {
			return
		}
		defer resp.Body.Close()
		io.Copy(write, resp.Body)
	}()
	io.Copy(w, read)
}
Although, I agree with @JimB: for this case the pipe isn't even needed; something like this is simpler and more efficient:
func testHandler(w http.ResponseWriter, r *http.Request) {
	vars := mux.Vars(r)
	hash := vars["hash"]
	resp, err := http.Get("http://localhost:9090/test/" + hash)
	if err != nil {
		// handle error
		return
	}
	defer resp.Body.Close()
	io.Copy(w, resp.Body)
}
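If server A mostly just relays traffic, note that the standard library already ships this pattern as httputil.ReverseProxy. A minimal sketch, with the backend address assumed to be localhost:9090 as in the question (imports: log, net/http, net/http/httputil, net/url):
backend, err := url.Parse("http://localhost:9090") // assumed backend
if err != nil {
	log.Fatal(err)
}
// NewSingleHostReverseProxy copies headers and streams the body for you.
proxy := httputil.NewSingleHostReverseProxy(backend)
http.Handle("/test/", proxy)
log.Fatal(http.ListenAndServe(":9095", nil))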
I use
resp, err := http.Get("http://example.com/")
to get an http.Response, and I want to write it verbatim to the client of an HTTP handler. Since the handler only gives me an http.ResponseWriter, I hijack the connection:
...
webConn, webBuf, err := hj.Hijack()
if err != nil {
	// handle error
}
defer webConn.Close()
// Write resp in its raw wire format (Response.Write).
resp.Write(webBuf)
...
But when I hijack, the HTTP connection can't be reused (keep-alive), so it is slow.
How can I solve this?
Thanks! Sorry for my poor English.
Update 12/9:
With keep-alive, the client keeps two TCP connections open and can reuse them. But when I hijack and then call conn.Close(), the old connection can't be reused, so a new TCP connection is created on each refresh.
Don't use hijack: once you hijack, the HTTP server library will not do anything else with the connection, so it can't be reused. I changed my approach to copying the header and body instead, like a reverse proxy does (http://golang.org/src/pkg/net/http/httputil/reverseproxy.go). It works.
Example:
func copyHeader(dst, src http.Header) {
	for k, vv := range src {
		for _, v := range vv {
			dst.Add(k, v)
		}
	}
}
func copyResponse(r *http.Response, w http.ResponseWriter) {
	copyHeader(w.Header(), r.Header)
	w.WriteHeader(r.StatusCode)
	io.Copy(w, r.Body)
}
func handler(w http.ResponseWriter, r *http.Request) {
	resp, err := http.Get("http://www.example.com")
	if err != nil {
		// handle error
		return
	}
	defer resp.Body.Close()
	copyResponse(resp, w)
}
It seems that once the hijacked connection is closed, the keep-alive connection closes as well.
One possible solution would be to prevent the connection from closing until desired, but I'm not sure that's good advice.
Maybe the correct solution involves creating an instance of net.TCPConn, copying the connection over it, then calling .SetKeepAlive(true).
Before running the below example, launch another terminal with netstat -antc | grep 9090.
Routes in example:
localhost:9090/ok is a basic (non-hijacked) connection
localhost:9090 is a hijacked connection, lasting for 10 seconds.
Example
package main
import (
	"fmt"
	"net/http"
	"time"
)
func checkError(e error) {
	if e != nil {
		panic(e)
	}
}
var ka_seconds = 10
var conn_id = 0
func main() {
	http.HandleFunc("/ok", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		conn_id++
		id := conn_id // capture the id; conn_id may change before the timer fires
		fmt.Printf("Connection %v: Keep-alive is enabled %v seconds\n", id, ka_seconds)
		hj, ok := w.(http.Hijacker)
		if !ok {
			http.Error(w, "webserver doesn't support hijacking", http.StatusInternalServerError)
			return
		}
		conn, bufrw, err := hj.Hijack()
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		// Don't forget to close the connection:
		time.AfterFunc(time.Second*time.Duration(ka_seconds), func() {
			conn.Close()
			fmt.Printf("Connection %v: Keep-alive is disabled.\n", id)
		})
		resp, err := http.Get("http://www.example.com")
		checkError(err)
		resp.Write(bufrw)
		bufrw.Flush()
	})
	fmt.Println("Listening on localhost:9090")
	http.ListenAndServe(":9090", nil)
}
Related issue: http://code.google.com/p/go/issues/detail?id=5645
I'm writing a simple web app in Go and I want my responses to be streamed to the client (i.e. not buffered and sent in blocks once the request is fully processed) :
func handle(res http.ResponseWriter, req *http.Request) {
	fmt.Fprintf(res, "sending first line of data")
	sleep(10) // not real code
	fmt.Fprintf(res, "sending second line of data")
}
From the client's point of view, the two lines arrive at the same time. Any suggestions are appreciated :)
Edit after @dystroy's answer:
It's possible to flush after each write I make myself, but in my use case it's not enough:
cmd := exec.Command("a long command that outputs lots of lines")
cmd.Stdout = res // where res is an http.ResponseWriter
cmd.Stderr = res
err := cmd.Run()
I want the output of my cmd to be flushed as well. Is there any way to "autoflush" the ResponseWriter?
Solution
I found help on golang's mailing list. There are two ways to achieve this: use a hijacker, which takes over the underlying TCP connection of HTTP, or pipe the command's stdout and stderr through a goroutine that writes and flushes:
pipeReader, pipeWriter := io.Pipe()
cmd.Stdout = pipeWriter
cmd.Stderr = pipeWriter
go writeCmdOutput(res, pipeReader)
err := cmd.Run()
pipeWriter.Close()
//---------------------
func writeCmdOutput(res http.ResponseWriter, pipeReader *io.PipeReader) {
	buffer := make([]byte, BUF_LEN) // BUF_LEN: a buffer size of your choosing
	for {
		n, err := pipeReader.Read(buffer)
		if n > 0 {
			// Only buffer[:n] is valid, so there is no need to zero the buffer.
			res.Write(buffer[:n])
			// Flush after every write so the client sees output immediately.
			if f, ok := res.(http.Flusher); ok {
				f.Flush()
			}
		}
		// Read can return data together with an error; handle the error last.
		if err != nil {
			pipeReader.Close()
			break
		}
	}
}
Last update
Even nicer: http://play.golang.org/p/PpbPyXbtEs
As implied in the documentation, some ResponseWriter may implement the Flusher interface.
This means you can do something like this:
func handle(res http.ResponseWriter, req *http.Request) {
	fmt.Fprintf(res, "sending first line of data")
	if f, ok := res.(http.Flusher); ok {
		f.Flush()
	} else {
		log.Println("Damn, no flush")
	}
	sleep(10) // not real code
	fmt.Fprintf(res, "sending second line of data")
}
Be careful that buffering can occur in many other places in the network or client side.
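To get the "autoflush" behavior asked about in the edit above, one option (a sketch, not from the original thread) is a tiny io.Writer wrapper that flushes after every write; it can be assigned directly to cmd.Stdout and cmd.Stderr:
// flushWriter flushes the underlying ResponseWriter after every Write,
// so each chunk of command output reaches the client immediately.
type flushWriter struct {
	w http.ResponseWriter
}
func (fw flushWriter) Write(p []byte) (int, error) {
	n, err := fw.w.Write(p)
	if f, ok := fw.w.(http.Flusher); ok {
		f.Flush()
	}
	return n, err
}
// Usage:
//   fw := flushWriter{w: res}
//   cmd.Stdout = fw
//   cmd.Stderr = fw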
Sorry if I've misunderstood your question, but would something like the below do the trick?
package main
import (
	"bytes"
	"fmt"
	"net/http"
)
func handler(w http.ResponseWriter, r *http.Request) {
	// Grow preallocates capacity; passing a non-empty slice to
	// bytes.NewBuffer would make the buffer echo that many zero
	// bytes before the request body.
	var b bytes.Buffer
	if r.ContentLength > 0 {
		b.Grow(int(r.ContentLength))
	}
	if _, err := b.ReadFrom(r.Body); err != nil {
		fmt.Fprintf(w, "%s", err)
	}
	if _, err := b.WriteTo(w); err != nil {
		fmt.Fprintf(w, "%s", err)
	}
}
func main() {
	http.HandleFunc("/", handler)
	if err := http.ListenAndServe(":8080", nil); err != nil {
		panic(err)
	}
}
$ curl --data "param1=value1&param2=value2" http://localhost:8080
returns:
param1=value1&param2=value2
You could always append whatever data you want to the buffer, or read more bytes into it from elsewhere, before writing it all out again.
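If you don't need to inspect or modify the data in between, the buffer can be skipped entirely with io.Copy; a minimal sketch (assumes the io import):
// Echo the request body straight to the response, streaming as it arrives.
func echoHandler(w http.ResponseWriter, r *http.Request) {
	if _, err := io.Copy(w, r.Body); err != nil {
		fmt.Fprintf(w, "%s", err)
	}
}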