Limit bytes to read from HTTP response - http

I need to read responses from user-provided URLs, and I don't want them to overload my server with links to huge files.
I want to read at most N bytes and return an error if there are more bytes to read.
I can read N bytes, but how do I detect that the file is incomplete (considering the corner case where the remote file is exactly N bytes long)?

In addition to Peter's answer, there is a ready-made solution in the net/http package: http.MaxBytesReader():
func MaxBytesReader(w ResponseWriter, r io.ReadCloser, n int64) io.ReadCloser
MaxBytesReader is similar to io.LimitReader but is intended for limiting the size of incoming request bodies. In contrast to io.LimitReader, MaxBytesReader's result is a ReadCloser, returns a non-EOF error for a Read beyond the limit, and closes the underlying reader when its Close method is called.
It was originally designed for limiting the size of incoming request bodies, but it can be used to limit incoming response bodies as well: simply pass nil for the ResponseWriter parameter.
Example using it:
{
    body := ioutil.NopCloser(bytes.NewBuffer([]byte{0, 1, 2, 3, 4}))
    r := http.MaxBytesReader(nil, body, 4)
    buf, err := ioutil.ReadAll(r)
    fmt.Println("When body is large:", buf, err)
}
{
    body := ioutil.NopCloser(bytes.NewBuffer([]byte{0, 1, 2, 3, 4}))
    r := http.MaxBytesReader(nil, body, 5)
    buf, err := ioutil.ReadAll(r)
    fmt.Println("When body is exact (OK):", buf, err)
}
{
    body := ioutil.NopCloser(bytes.NewBuffer([]byte{0, 1, 2, 3, 4}))
    r := http.MaxBytesReader(nil, body, 6)
    buf, err := ioutil.ReadAll(r)
    fmt.Println("When body is small (OK):", buf, err)
}
Output (try it on the Go Playground):
When body is large: [0 1 2 3] http: request body too large
When body is exact (OK): [0 1 2 3 4] <nil>
When body is small (OK): [0 1 2 3 4] <nil>
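Applied to a real HTTP response, this might look like the following sketch (the URL and the 1 MB limit are example values, not part of the original answer):
res, err := http.Get("http://example.com/some-file") // example URL
if err != nil {
    log.Fatal(err)
}
defer res.Body.Close()

r := http.MaxBytesReader(nil, res.Body, 1<<20) // cap the body at 1 MB
body, err := ioutil.ReadAll(r)
if err != nil {
    // A non-EOF error here means the body exceeded the limit (or the read failed).
    log.Fatal(err)
}
fmt.Println("read", len(body), "bytes")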

Simply try to read your maximum acceptable size plus 1 byte. For an acceptable size of 1MB:
var res *http.Response
b := make([]byte, 1<<20+1)
n, err := io.ReadFull(res.Body, b)
switch err {
case nil:
    log.Fatal("Response larger than 1MB")
case io.ErrUnexpectedEOF, io.EOF:
    // That's okay; the response is 1MB or smaller (io.EOF means the body was empty).
    b = b[:n]
default:
    log.Fatal(err)
}
You can also do the same thing with an io.LimitedReader:
var res *http.Response
r := &io.LimitedReader{
    R: res.Body,
    N: 1<<20 + 1,
}
// handle response body somehow
io.Copy(ioutil.Discard, r)
if r.N == 0 {
    log.Fatal("Response larger than 1MB")
}
Note that both methods limit the uncompressed size. Significantly fewer bytes may traverse the network if the response is compressed. You need to be clear about whether you want to limit network or memory usage and adjust the limit accordingly, possibly on a case-by-case basis.

You can check the Content-Length field in the response header to get the total file size. Note that the server may omit it (e.g. for chunked responses), so it cannot be relied on by itself.
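A minimal sketch of that pre-check (userURL and the 1 MB limit are placeholders):
res, err := http.Get(userURL)
if err != nil {
    log.Fatal(err)
}
defer res.Body.Close()

// ContentLength is -1 when the server doesn't send the header, and a server
// can lie about it, so treat this as a fast pre-check rather than a guarantee.
if res.ContentLength > 1<<20 {
    log.Fatal("Response larger than 1MB")
}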

Related

Alternative To ioutil.ReadAll in go?

For a program I'm making, this function is run as a goroutine in a for loop, once for each URL passed in (there is no set amount).
func makeRequest(url string, ch chan<- string, errors map[string]error) {
    res, err := http.Get(url)
    if err != nil {
        errors[url] = err
        close(ch)
        return
    }
    defer res.Body.Close()
    body, _ := ioutil.ReadAll(res.Body)
    ch <- string(body)
}
The entire body of the response has to be used, so ioutil.ReadAll seemed like the perfect fit, but with no restriction on the number of URLs that can be passed in, and ReadAll storing everything in memory, it's starting to feel less like the golden ticket. I'm fairly new to Go, so if you do decide to answer, an explanation behind your solution would be greatly appreciated!
One insight I gained as I learned Go is that ReadAll is often inefficient for large readers, and, as in your case, arbitrarily large input can exhaust memory. When I started out, I used to do JSON parsing like this:
data, err := ioutil.ReadAll(r)
if err != nil {
    return err
}
json.Unmarshal(data, &v)
Then, I learned of a much more efficient way of parsing JSON, which is to simply use the Decoder type.
err := json.NewDecoder(r).Decode(&v)
if err != nil {
    return err
}
Not only is this more concise, it is much more efficient, both memory-wise and time-wise:
The decoder doesn't have to allocate a huge byte slice to accommodate the data read - it can simply reuse a tiny buffer with the Read method to fetch the data in chunks and parse it. This saves a lot of allocations and removes pressure from the GC.
The JSON Decoder can start parsing data as soon as the first chunk of data comes in - it doesn't have to wait for everything to finish downloading.
Now, of course your question has nothing to do with JSON, but this example is useful to illustrate that if you can use Read directly and parse data chunks at a time, do it. Especially with HTTP requests, parsing is faster than reading/downloading, so this can lead to parsed data being almost immediately ready the moment the request body finishes arriving.
In your case, you seem not to be actually doing any handling of the data for now, so there's not much to suggest to aid you specifically. But the io.Reader and the io.Writer interfaces are the Go equivalent of UNIX pipes, and so you can use them in many different places:
Writing data to a file:
f, err := os.Create("file")
if err != nil {
    return err
}
defer f.Close()
// Copy will put all the data from Body into f, without creating a huge buffer in memory
// (moves chunks at a time)
io.Copy(f, resp.Body)
Printing everything to stdout:
io.Copy(os.Stdout, resp.Body)
Pipe a response's body to a request's body:
req, err := http.NewRequest("POST", "https://example.com", resp.Body)
In order to bound the amount of memory that your application is using, the common approach is to read into a fixed-size buffer, which should directly address your ioutil.ReadAll problem.
Go's bufio package also offers utilities (bufio.Scanner) that support reading until a delimiter, or reading a line from the input, which is highly related to #Howl's question.
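For example, a hedged bufio.Scanner sketch that handles the body line by line (the URL and the per-line limit are example values):
resp, err := http.Get("http://localhost:8080")
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()

sc := bufio.NewScanner(resp.Body)
sc.Buffer(make([]byte, 64*1024), 1<<20) // cap a single line/token at 1 MB
for sc.Scan() {
    fmt.Println(sc.Text()) // handle one line at a time
}
if err := sc.Err(); err != nil {
    log.Fatal(err)
}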
Reading in fixed-size chunks is also pretty simple in Go.
Here is the client program:
package main

import (
    "fmt"
    "net/http"
)

// data is a shared read buffer; this is fine here because only one
// goroutine reads into it.
var data []byte

func main() {
    data = make([]byte, 128)
    ch := make(chan string)
    go makeRequest("http://localhost:8080", ch)
    for v := range ch {
        fmt.Println(v)
    }
}

func makeRequest(url string, ch chan<- string) {
    res, err := http.Get(url)
    if err != nil {
        close(ch)
        return
    }
    defer res.Body.Close()
    defer close(ch) // don't forget to close the channel as well
    for {
        n, err := res.Body.Read(data)
        if n > 0 {
            ch <- string(data[:n]) // string() copies the chunk, so the buffer can be reused
        }
        // Read can return data together with io.EOF; stop only after handling the chunk.
        if err != nil {
            return
        }
    }
}
Here is the server program:
package main

import (
    "net/http"
)

func main() {
    http.HandleFunc("/", hello)
    http.ListenAndServe("localhost:8080", nil)
}

func hello(w http.ResponseWriter, r *http.Request) {
    http.ServeFile(w, r, "movie.mkv")
}

PUT upload a file's byte range with streams and progress

I just got started with Go and need some help. I would like to upload a certain range of bytes from a file.
I already accomplished this by reading the bytes into a buffer. But this increases memory usage.
Instead of reading bytes into memory, I want to stream them while uploading and have an upload progress. I did something like this in Node.js but struggle to get the puzzle pieces together for Go.
The code that I have now looks like this:
func uploadChunk(id, mimeType, uploadURL, filePath string, offset, size uint) {
    // open file
    file, err := os.Open(filePath)
    panicCheck(err, ErrorFileRead) // custom error handler
    defer file.Close()

    // move to the proper byte
    file.Seek(int64(offset), 0)

    // read byte chunk into buffer
    buffer := make([]byte, size)
    file.Read(buffer)
    fileReader := bytes.NewReader(buffer)

    request, err := http.NewRequest(http.MethodPut, uploadURL, fileReader)
    client := &http.Client{
        Timeout: time.Second * 10,
    }
    response, err := client.Do(request)
    panicCheck(err, ErrorFileRead)
    defer response.Body.Close()

    b, err := httputil.DumpResponse(response, true)
    panicCheck(err, ErrorFileRead)
    fmt.Println("response\n", string(b))
}
Could you guys help me to figure out how to stream and get progress for an upload?
Thanks
You can use an io.LimitedReader to wrap the file and only read the amount of data you want. The implementation returned by io.LimitReader is an *io.LimitedReader.
file.Seek(int64(offset), 0)
fileReader := io.LimitReader(file, int64(size))
request, err := http.NewRequest(http.MethodPut, uploadURL, fileReader)
And for S3 you will want to ensure that you don't use chunked encoding by explicitly setting the ContentLength:
request.ContentLength = int64(size)
As for upload progress, see: Go: Tracking POST request progress
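The linked question has the details, but the core idea is to wrap the reader in a type that counts bytes as they pass through. A minimal sketch (progressReader and its fields are made up for illustration):
// progressReader is a hypothetical wrapper that reports how many bytes
// have been read from the underlying reader so far.
type progressReader struct {
    r     io.Reader
    read  int64 // bytes read so far
    total int64 // expected total, used only for reporting
}

func (p *progressReader) Read(b []byte) (int, error) {
    n, err := p.r.Read(b)
    p.read += int64(n)
    fmt.Printf("\rupload progress: %d/%d bytes", p.read, p.total)
    return n, err
}
You would then pass &progressReader{r: io.LimitReader(file, int64(size)), total: int64(size)} as the request body instead of the plain LimitReader, and still set request.ContentLength explicitly as shown above.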

Proper way to add padding to byte slice in golang?

I'm trying to encrypt some data in go but it's hardly ever the correct cipher.BlockSize.
Is there a "built-in" way to add padding or should I be using a function to add it manually?
This is my solution now:
// encrypt() encrypts the message, but sometimes the
// message isn't the proper length, so we add padding.
func encrypt(msg []byte, key []byte) []byte {
    cipher, err := aes.NewCipher(key)
    if err != nil {
        log.Fatal(err)
    }
    if len(msg) < cipher.BlockSize() {
        var endLength = cipher.BlockSize() - len(msg)
        ending := make([]byte, endLength, endLength)
        msg = append(msg[:], ending[:]...)
        cipher.Encrypt(msg, msg)
    } else {
        var endLength = len(msg) % cipher.BlockSize()
        ending := make([]byte, endLength, endLength)
        msg = append(msg[:], ending[:]...)
        cipher.Encrypt(msg, msg)
    }
    return msg
}
Looking at Package cipher, it appears you have to add the padding yourself; see PKCS#7 padding.
Essentially, you append the required number of padding bytes, with the value of each padding byte set to the number of bytes added.
Note that you need to add padding consistently: if the data to be encrypted is an exact multiple of the block size, an entire block of padding must be added, since there is no way to tell from the data alone whether padding was added or not; it is a common mistake to try to out-smart this. Consider: if the last byte is 0x00, is that padding or data?
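A minimal PKCS#7 pad/unpad sketch along those lines (the function names are made up; blockSize would be cipher.BlockSize(), i.e. 16 for AES):
// pkcs7Pad appends 1..blockSize bytes, each equal to the number of bytes added.
func pkcs7Pad(data []byte, blockSize int) []byte {
    padLen := blockSize - len(data)%blockSize // always 1..blockSize, never 0
    padding := bytes.Repeat([]byte{byte(padLen)}, padLen)
    return append(data, padding...)
}

// pkcs7Unpad removes the padding added by pkcs7Pad.
func pkcs7Unpad(data []byte, blockSize int) ([]byte, error) {
    if len(data) == 0 || len(data)%blockSize != 0 {
        return nil, errors.New("invalid padded length")
    }
    padLen := int(data[len(data)-1])
    if padLen == 0 || padLen > blockSize || padLen > len(data) {
        return nil, errors.New("invalid padding")
    }
    for _, b := range data[len(data)-padLen:] {
        if int(b) != padLen {
            return nil, errors.New("invalid padding")
        }
    }
    return data[:len(data)-padLen], nil
}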
Here's my solution:
// padOrTrim returns (size) bytes from input (bb)
// Short bb gets zeros prefixed, Long bb gets left/MSB bytes trimmed
func padOrTrim(bb []byte, size int) []byte {
    l := len(bb)
    if l == size {
        return bb
    }
    if l > size {
        return bb[l-size:]
    }
    tmp := make([]byte, size)
    copy(tmp[size-l:], bb)
    return tmp
}

In Go, how do I write a streaming http response body to a seek position in a file effectively?

I have a program that combines multiple HTTP responses and writes each to its respective seek position in a file. I am currently doing this with:
client := new(http.Client)
req, _ := http.NewRequest("GET", os.Args[1], nil)
resp, _ := client.Do(req)
defer resp.Body.Close()
reader, _ := ioutil.ReadAll(resp.Body) //Reads the entire response to memory
//Some func that gets the seek value someval
fs.Seek(int64(someval), 0)
fs.Write(reader)
This sometimes results in large memory usage because of ioutil.ReadAll.
I tried bytes.Buffer as
buf := new(bytes.Buffer)
offset, _ := buf.ReadFrom(resp.Body) //Still reads the entire response to memory.
fs.Write(buf.Bytes())
but it was still the same.
My intention was to use a buffered write to the file, then seek to the offset again, and continue writing until the end of the stream is received (hence capturing the offset value from buf.ReadFrom). But it was also keeping everything in memory and writing it all at once.
What is the best way to write a similar stream directly to the disk, without keeping the entire content in buffer?
An example to understand would be much appreciated.
Thank you.
Use io.Copy to copy the response body to the file:
resp, _ := client.Do(req)
defer resp.Body.Close()
//Some func that gets the seek value someval
fs.Seek(int64(someval), 0)
n, err := io.Copy(fs, resp.Body)
// n is number of bytes copied
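Putting it together with error handling, a hedged sketch (the URL and file name are example values, and someval is the same offset as above):
resp, err := http.Get("http://example.com/part1") // example URL
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()

fs, err := os.OpenFile("out.bin", os.O_CREATE|os.O_WRONLY, 0644)
if err != nil {
    log.Fatal(err)
}
defer fs.Close()

// Seek to this chunk's position, then stream the body straight to disk;
// io.Copy moves fixed-size chunks, so the whole body is never held in memory.
if _, err := fs.Seek(int64(someval), io.SeekStart); err != nil {
    log.Fatal(err)
}
n, err := io.Copy(fs, resp.Body)
if err != nil {
    log.Fatal(err)
}
log.Printf("wrote %d bytes", n)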

Golang Read Bytes from net.TCPConn

Is there a version of ioutil.ReadAll that will read until it reaches an EOF OR if it reads n bytes (whichever comes first)?
I cannot just take the first n bytes from an ioutil.ReadAll dump for DOS reasons.
io.ReadFull
or
io.LimitedReader
or
http.MaxBytesReader.
If you need something different, first look at how those are implemented; it'd be trivial to roll your own with tweaked behavior.
There are a few ways to achieve your requirement. You can use any one of them.
func ReadFull
func ReadFull(r Reader, buf []byte) (n int, err error)
ReadFull reads exactly len(buf) bytes from r into buf. It returns the
number of bytes copied and an error if fewer bytes were read. The
error is EOF only if no bytes were read.
func LimitReader
func LimitReader(r Reader, n int64) Reader
LimitReader returns a Reader that reads from r but stops with EOF after n bytes. The underlying implementation is a *LimitedReader.
func CopyN
func CopyN(dst Writer, src Reader, n int64) (written int64, err error)
CopyN copies n bytes (or until an error) from src to dst. It returns
the number of bytes copied and the earliest error encountered while
copying. On return, written == n if and only if err == nil.
func ReadAtLeast
func ReadAtLeast(r Reader, buf []byte, min int) (n int, err error)
ReadAtLeast reads from r into buf until it has read at least min
bytes. It returns the number of bytes copied and an error if fewer
bytes were read. The error is EOF only if no bytes were read.
There are two options. Here n is the number of bytes you want to read and r is the connection.
Option 1:
p := make([]byte, n)
_, err := io.ReadFull(r, p)
Option 2:
p, err := io.ReadAll(&io.LimitedReader{R: r, N: n})
The first option is more efficient if the application typically fills the buffer.
If you are reading from an HTTP request body, then use http.MaxBytesReader.
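If you also need to detect that the connection carried more than n bytes (the DoS concern), one hedged approach, mirroring the ReadFull trick from the first answer above, is to ask for one extra byte:
buf := make([]byte, n+1)
m, err := io.ReadFull(r, buf) // r is the connection, n is your limit
switch err {
case nil:
    // n+1 bytes arrived, so the peer sent more than n.
    log.Fatal("peer sent more than n bytes")
case io.ErrUnexpectedEOF, io.EOF:
    buf = buf[:m] // n bytes or fewer: OK (io.EOF means nothing was sent)
default:
    log.Fatal(err)
}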
