I'm wondering what the best way is to upload a video to S3 via a presigned URL. I am primarily considering using a standard HTTP PUT request, placing video/mp4 as the Content-Type, and then attaching the video file as the body.
I'm wondering if there are more efficient approaches to doing this, such as using a third party library or possibly compressing the video before sending it via the PUT request?
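For reference, the plain PUT described above could look roughly like this in Go. This is only a sketch: putObject and demo are hypothetical names, and a local test server stands in for S3 so the example runs without credentials. In practice the URL would be the presigned URL issued by your backend.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// putObject streams body to url with a single HTTP PUT.
// size is the exact body length, sent as the Content-Length header.
func putObject(client *http.Client, url string, body io.Reader, size int64, contentType string) error {
	req, err := http.NewRequest(http.MethodPut, url, body)
	if err != nil {
		return err
	}
	req.ContentLength = size
	req.Header.Set("Content-Type", contentType)

	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		msg, _ := io.ReadAll(io.LimitReader(resp.Body, 1024))
		return fmt.Errorf("upload failed: %s: %s", resp.Status, msg)
	}
	return nil
}

// received records what the stand-in server saw.
type received struct{ method, contentType, body string }

// demo uploads a small payload to a local test server (an assumption
// standing in for S3) and returns the request the server received.
func demo() (received, error) {
	got := make(chan received, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		data, _ := io.ReadAll(r.Body)
		got <- received{r.Method, r.Header.Get("Content-Type"), string(data)}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	payload := "fake video bytes"
	err := putObject(srv.Client(), srv.URL, strings.NewReader(payload), int64(len(payload)), "video/mp4")
	if err != nil {
		return received{}, err
	}
	return <-got, nil
}

func main() {
	r, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.method, r.contentType, len(r.body))
}
```

For a real file you would pass an *os.File as the body and its size from Stat, so the video is read from disk as it is sent rather than loaded into memory. If the URL was presigned with a specific Content-Type, the header sent here typically has to match it.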
In general, when your object size reaches 100 MB, you should consider
using multipart uploads instead of uploading the object in a single
operation.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html
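To give a feel for what a multipart upload involves on the client side, here is a small Go sketch that splits an object into part ranges. planParts and the part type are hypothetical names, not part of any AWS SDK. The 5 MiB minimum part size (except for the last part) and the 10,000 part maximum are the S3 limits.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	minPartSize = 5 << 20 // S3 minimum part size (5 MiB), last part excepted
	maxParts    = 10000   // S3 maximum number of parts per upload
)

// part describes one byte range of the object to upload.
type part struct {
	Number int   // 1-based part number
	Offset int64 // byte offset of the part within the object
	Size   int64 // length of the part in bytes
}

// planParts splits an object of objectSize bytes into consecutive parts
// of partSize bytes; the last part holds whatever remains.
func planParts(objectSize, partSize int64) ([]part, error) {
	if objectSize <= 0 {
		return nil, errors.New("object size must be positive")
	}
	if partSize < minPartSize {
		return nil, errors.New("part size is below the 5 MiB minimum")
	}
	count := (objectSize + partSize - 1) / partSize
	if count > maxParts {
		return nil, errors.New("too many parts; choose a larger part size")
	}
	parts := make([]part, 0, count)
	for off := int64(0); off < objectSize; off += partSize {
		size := partSize
		if rem := objectSize - off; rem < size {
			size = rem
		}
		parts = append(parts, part{Number: len(parts) + 1, Offset: off, Size: size})
	}
	return parts, nil
}

func main() {
	// A 250 MiB video split into 100 MiB parts.
	parts, err := planParts(250<<20, 100<<20)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range parts {
		fmt.Printf("part %d: offset=%d size=%d\n", p.Number, p.Offset, p.Size)
	}
}
```

Each part would then be uploaded with its own request, which is what allows parts to be retried or sent in parallel.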
I had the most success using Uppy for this:
https://uppy.io/docs/aws-s3-multipart/
You will need to provide some backend endpoints though:
https://uppy.io/docs/aws-s3-multipart/#createMultipartUpload-file
https://uppy.io/docs/aws-s3-multipart/#listParts-file-uploadId-key
https://uppy.io/docs/aws-s3-multipart/#prepareUploadParts-file-partData
https://uppy.io/docs/aws-s3-multipart/#abortMultipartUpload-file-uploadId-key
https://uppy.io/docs/aws-s3-multipart/#completeMultipartUpload-file-uploadId-key-parts
On the compression part of your question: S3 does not have any compute. It will not modify the bytes you send; it just stores them. If you want to use compression, you need to compress before the upload, upload to the cloud, decompress there with some compute (EC2, Lambda, etc.), and then put the result in S3.
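The compress-then-decompress round trip could be sketched like this in Go, using gzip from the standard library. compress, decompress and demo are hypothetical names, and gzip is just one possible choice. Keep in mind that MP4 video is already compressed, so a general-purpose compressor usually gains little on it.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// compress gzips everything read from src into dst without holding the
// whole input in memory.
func compress(dst io.Writer, src io.Reader) error {
	zw := gzip.NewWriter(dst)
	if _, err := io.Copy(zw, src); err != nil {
		zw.Close()
		return err
	}
	// Close flushes the remaining data and writes the gzip footer.
	return zw.Close()
}

// decompress reverses compress; this is the step that would run on
// EC2, Lambda, or similar before the final put to S3.
func decompress(dst io.Writer, src io.Reader) error {
	zr, err := gzip.NewReader(src)
	if err != nil {
		return err
	}
	defer zr.Close()
	_, err = io.Copy(dst, zr)
	return err
}

// demo round-trips a highly repetitive payload and reports the raw size,
// the compressed size, and whether the restored data matches the input.
func demo() (int, int, bool, error) {
	input := strings.Repeat("abc", 1000)

	var packed bytes.Buffer
	if err := compress(&packed, strings.NewReader(input)); err != nil {
		return 0, 0, false, err
	}
	packedLen := packed.Len()

	var restored bytes.Buffer
	if err := decompress(&restored, &packed); err != nil {
		return 0, 0, false, err
	}
	return len(input), packedLen, restored.String() == input, nil
}

func main() {
	rawLen, packedLen, same, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("raw=%d compressed=%d identical=%v\n", rawLen, packedLen, same)
}
```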
I am very new to gRPC and I am refactoring some HTTP handlers to gRPC. Among them I found a handler that uploads a file. The request sends the file as http.FormFile using HTTP multipart form data.
What I found is that there is a way to upload a file by streaming the request data in chunks. But what I need is to avoid streaming and do it in a stateless manner.
I have searched for a way to solve this but couldn't find one. I would highly appreciate it if someone could give me a proper solution.
Tl;dr
gRPC was not designed to handle large file uploads in the same way that you would using http multipart form data uploads. gRPC has a (slightly) arbitrary 4MB message limit (also see this). In my experience, the proper solution is to not use gRPC for large file uploads. That being said, there may be a few options you can try.
Changing gRPC Call Options
You can manually override the default 4MB message limit using connection options when you dial the gRPC server. For example, see this:
// Sizes are in bytes. 4096 would lower the limit to 4 KiB; 16 MiB is an example value.
client, err := grpc.Dial("...",
    grpc.WithDefaultCallOptions(grpc.MaxCallRecvMsgSize(16*1024*1024)))
Using gRPC Streams
I have needed to use gRPC for file uploads without changing the default message limit options, and did so by implementing my own chunking package that handles it for me over a unary gRPC stream. You mentioned you wanted to avoid using a stream; however, I'm providing this as a resource for those who want to avoid changing their gRPC message limit. Using the library, you'll need to wrap your gRPC client. Once a wrapper is created, you can upload anything that is compatible with the io.Reader interface.
err := chunk.UploadFrom(reader, WrapUploadFileClient(client))
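To illustrate the general idea of chunking (this is not the code of the chunk package above, just a sketch under my own assumptions), the loop below cuts an io.Reader into pieces below a size limit and hands each piece to a callback. sendChunks and demo are hypothetical names; in a real client the callback would wrap the Send method of a generated gRPC stream.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// sendChunks reads r in pieces of at most chunkSize bytes and passes each
// piece to send. The slice given to send is reused between calls, so send
// must copy it if it needs to keep the data.
func sendChunks(r io.Reader, chunkSize int, send func([]byte) error) error {
	if chunkSize <= 0 {
		return errors.New("chunk size must be positive")
	}
	buf := make([]byte, chunkSize)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			if sendErr := send(buf[:n]); sendErr != nil {
				return sendErr
			}
		}
		// io.EOF: nothing left to read. io.ErrUnexpectedEOF: the final,
		// shorter chunk was just delivered.
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// demo chunks a 10-byte input with a 4-byte limit and returns the chunk
// sizes together with the reassembled data.
func demo() ([]int, string, error) {
	var sizes []int
	var joined strings.Builder
	err := sendChunks(strings.NewReader("0123456789"), 4, func(chunk []byte) error {
		sizes = append(sizes, len(chunk))
		joined.Write(chunk) // copies the bytes, so reuse of chunk is safe
		return nil
	})
	return sizes, joined.String(), err
}

func main() {
	sizes, joined, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sizes, joined)
}
```

With a chunk size comfortably below the message limit, each message stays small no matter how large the file is.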
We realize that if we want to produce a multipart request containing a 15 GB video file, it is impossible to allocate that much data in memory; most devices have only 2 or 3 GB of RAM.
It is therefore absolutely necessary to switch to the uploadTask method, which pushes the contents of a file to the server block by block, each block no larger than the maximum size allowed by the IP packets sent to the server.
This is a POST method. However, it does not contain parameters such as the folder id or the file name, so you need a way to transmit these parameters. The best way is to encode them in the URL.
I proposed an encoding format in the form of a path after the API endpoint, but we can just as well encode these two parameters in the classic way in the URL, e.g.:
/api/upload?id=123&filename=video.mp4
From what I read on Stack Overflow, it's trivial with Symfony to retrieve id and filename. All the data received in the body of the POST request can then be written raw, directly into a file, without passing through a buffer in server-side memory.
User data must always be streamed, on both the mobile side and the server side, for uploads and downloads alike. Loading user content into memory is also very dangerous in terms of security.
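To make the server-side idea concrete, here is a sketch of such an endpoint. It is written in Go rather than Symfony, purely to illustrate the technique. uploadHandler and demo are hypothetical names, and the id parameter is only validated here, whereas a real application would use it to pick the destination folder.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"strings"
)

// uploadHandler returns a handler that streams the raw POST body into a
// file under dir, taking id and filename from the query string.
func uploadHandler(dir string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "POST required", http.StatusMethodNotAllowed)
			return
		}
		id := r.URL.Query().Get("id")
		// filepath.Base drops directory components supplied by the client.
		name := filepath.Base(r.URL.Query().Get("filename"))
		if id == "" || name == "." || name == ".." || name == string(filepath.Separator) {
			http.Error(w, "id and filename are required", http.StatusBadRequest)
			return
		}
		dst, err := os.Create(filepath.Join(dir, name))
		if err != nil {
			http.Error(w, "cannot create file", http.StatusInternalServerError)
			return
		}
		defer dst.Close()
		// io.Copy moves the body through a small fixed-size buffer, so
		// memory use does not grow with the size of the upload.
		n, err := io.Copy(dst, r.Body)
		if err != nil {
			http.Error(w, "write failed", http.StatusInternalServerError)
			return
		}
		fmt.Fprintf(w, "stored %d bytes", n)
	}
}

// demo posts a small body to a local test server and returns the server
// reply together with the contents of the stored file.
func demo() (string, string, error) {
	dir, err := os.MkdirTemp("", "upload")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	srv := httptest.NewServer(uploadHandler(dir))
	defer srv.Close()

	resp, err := http.Post(srv.URL+"/api/upload?id=123&filename=video.mp4",
		"application/octet-stream", strings.NewReader("raw video bytes"))
	if err != nil {
		return "", "", err
	}
	defer resp.Body.Close()
	reply, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", "", err
	}
	stored, err := os.ReadFile(filepath.Join(dir, "video.mp4"))
	if err != nil {
		return "", "", err
	}
	return string(reply), string(stored), nil
}

func main() {
	reply, stored, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(reply, "/", stored)
}
```

Go's net/http server hands the handler the request body as a stream, which is why this works without extra configuration. Whether the same is possible in a PHP setup depends on the web server in front of it.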
In Symfony, how can I do that?
This goes way beyond Symfony and depends on the web server you are using.
By default, with Apache/nginx and PHP you receive an already buffered request, so you cannot stream it to a file.
However, there are solutions, for example with Apache you can stream requests, see http://hc.apache.org/httpclient-3.x/performance.html#Request_Response_entity_streaming
Probably nginx also has options for it, but I don't know about those.
Another option might be websockets, see http://en.wikipedia.org/wiki/WebSocket
I'm a beginner developer trying to understand how some 'backend' processes work when uploading files from a web app.
I'm using edgee:slingshot in a meteor project to upload images to Amazon S3. My understanding is that a file upload from slingshot makes a POST request to S3 in order to upload a file to the bucket. I confirm this from the Chrome console where I can see the POST request (after a preflight OPTIONS request) to S3.
However the POST request count in S3 increased by 4 after uploading the image. I did not upload anything else, and there is only one other file in the bucket which is not being used anywhere.
Is this normal behaviour? I do not know the nuts and bolts of HTTP requests so this is all a bit mystifying. I am interested because S3 prices according to (amongst other things) the number of requests made.
Bonus question: The number of GET requests also increased (by 3, not 4) after uploading the image. Is this normal behaviour? The slingshot upload function returns the download URL of the image in the bucket. I did not think I was making any GET requests.
Is there some kind of behind-the-scenes validation/batch upload going on that is causing this?
Thanks for the help.
My requirement is to encode data and send it across network via HTTP, but I am stuck trying to choose the best encoding technique. Which of the three above is best? Is there anything better?
The criteria for "best" should be small size and fast encoding/decoding.
yEnc has less overhead but has other problems mentioned here http://en.wikipedia.org/wiki/YEnc#Criticisms.
What is "best" depends on the criteria you have and how you plan to send data over the network. Are you using a web service, email, or some other means? You need to provide more details.
Edit:
Since you are uploading data via HTTP, you do not need to use any of Base64, yEnc or Uuencode. Just use the standard HTTP file upload facility built into both the browser and the web server. See this question as a reference:
How does HTTP file upload work?
Also this reference:
http://www.hanselman.com/blog/ABackToBasicsCaseStudyImplementingHTTPFileUploadWithASPNETMVCIncludingTestsAndMocks.aspx
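To put a number on the "small size" criterion: Base64 turns every 3 raw bytes into 4 output characters, so it adds about a third to the payload, whereas a plain HTTP upload sends the bytes as they are. A quick Go sketch (base64Size is a hypothetical helper name):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// base64Size reports how many bytes n raw bytes occupy after standard,
// padded Base64 encoding.
func base64Size(n int) int {
	return base64.StdEncoding.EncodedLen(n)
}

func main() {
	raw := 3_000_000
	enc := base64Size(raw)
	overhead := float64(enc-raw) / float64(raw) * 100
	fmt.Printf("%d raw bytes -> %d Base64 bytes (+%.0f%%)\n", raw, enc, overhead)

	// Cross-check against an actual encoding of a small buffer.
	sample := make([]byte, 3000)
	fmt.Println(len(base64.StdEncoding.EncodeToString(sample)) == base64Size(3000))
}
```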
I am using an Azure cloud storage solution, and as such, each file has its own URL. I need to allow users to download files such as PDFs from our ASP.NET website. Currently, we are using Response.TransmitFile to send the file to the client's browser, but this requires that we fetch the data from the cloud storage and then send it to the client (which seems like an inefficient way to do it).
I'm wondering whether we could just have a direct link to the file, and if so, how this would differ from the Response.TransmitFile method. That is, without the TransmitFile method, we cannot set the Content-Type header, etc. How does that affect anything?
Thanks
Usually I stay away from using Response.TransmitFile as it does require that you fetch the file and then stream it down to the client.
The only times I have used it was to protect files and only serve them to users that had permission to access them instead of just linking directly to the file.
If the files you are serving are public, then I would recommend just linking to them. If all you need is to set the Content-Type header, then simply make sure the .pdf extension is mapped to the correct MIME type (application/pdf).