meteor-slingshot/S3: How HTTP requests work between slingshot and S3

I'm a beginner developer trying to understand how some 'backend' processes work when uploading files from a web app.
I'm using edgee:slingshot in a meteor project to upload images to Amazon S3. My understanding is that a file upload from slingshot makes a POST request to S3 in order to upload a file to the bucket. I confirmed this in the Chrome console, where I can see the POST request (after a preflight OPTIONS request) to S3.
However the POST request count in S3 increased by 4 after uploading the image. I did not upload anything else, and there is only one other file in the bucket which is not being used anywhere.
Is this normal behaviour? I do not know the nuts and bolts of HTTP requests so this is all a bit mystifying. I am interested because S3 prices according to (amongst other things) the number of requests made.
Bonus question: The number of GET requests also increased (by 3, not 4) after uploading the image. Is this normal behaviour? The slingshot upload function returns the download URL of the image in the bucket. I did not think I was making any GET requests.
Is there some kind of behind-the-scenes validation/batch upload going on that is causing this?
Thanks for the help.

Related

Best way to upload video via a presigned URL to S3?

I'm wondering what the best way is to upload a video to S3 via a presigned URL. I am primarily considering using a standard HTTP PUT request, placing video/mp4 as the Content-Type, and then attaching the video file as the body.
I'm wondering if there are more efficient approaches to doing this, such as using a third party library or possibly compressing the video before sending it via the PUT request?
In general, when your object size reaches 100 MB, you should consider
using multipart uploads instead of uploading the object in a single
operation.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html
I had most success using Uppy for this
https://uppy.io/docs/aws-s3-multipart/
You will need to provide some backend endpoints though:
https://uppy.io/docs/aws-s3-multipart/#createMultipartUpload-file
https://uppy.io/docs/aws-s3-multipart/#listParts-file-uploadId-key
https://uppy.io/docs/aws-s3-multipart/#prepareUploadParts-file-partData
https://uppy.io/docs/aws-s3-multipart/#abortMultipartUpload-file-uploadId-key
https://uppy.io/docs/aws-s3-multipart/#completeMultipartUpload-file-uploadId-key-parts
On the compression part of your question: S3 does not have any compute. It will not modify the bytes you send; it will just store them. If you want to use compression, you need to compress before uploading, upload to the cloud, decompress there with some compute (EC2, Lambda, etc.), and then put the result to S3.
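A minimal sketch of the two ideas discussed above, a plain HTTP PUT to a presigned URL and optional client-side compression, using only the Python standard library. The function names and the example URL are hypothetical; a real presigned URL would come from your own backend. Note that already-compressed formats such as MP4 usually gain little from gzip.

```python
import gzip
import urllib.request


def build_presigned_put(presigned_url: str, body: bytes,
                        content_type: str = "video/mp4") -> urllib.request.Request:
    """Build (but do not send) a PUT request for a presigned URL."""
    return urllib.request.Request(
        presigned_url,
        data=body,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def compress_for_upload(raw: bytes) -> bytes:
    """Gzip the bytes on the client; S3 stores exactly what it receives."""
    return gzip.compress(raw)


if __name__ == "__main__":
    # Hypothetical URL and payload, for illustration only.
    url = "https://example.invalid/video.mp4?signature=placeholder"
    req = build_presigned_put(url, b"fake video bytes")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

The Content-Type sent with the PUT generally has to match whatever was specified when the URL was signed, if a content type was included in the signature.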

How to check that a file is being streamed and not downloaded?

Abstract: There is a page with a player that loads an audio file and plays it. The player used on the web page is jwplayer. I need to find a way to determine whether the audio file is being streamed to the player or not.
Background: In my research I found that if I use an nginx header like X-Accel-Redirect, the file will be streamed. I have set up the web server with an nginx + apache combination (nginx is a reverse proxy for apache); after that I pointed jwplayer to the mp3 file, and it is working: I am able to click anywhere on the audio timeline and it immediately starts playing sound. But since I haven't set that header yet and the player already works, I need to ask this question to know for sure.
Some of my own thoughts: JwPlayer itself supports some kind of buffering, so I have no idea whether it just downloads the mp3 file I am testing this on, or receives a stream and plays it.
Is there a way to check and know for sure? The only idea I have is to check the access logs, but I don't know what to look for, or whether I need a special log format to see the required data.
While researching the issue I found some odd download-related topics and something about HTTP headers with "Ranges" in them, but I am not sure whether that relates to streaming or not.
Please advise.
From the point of view of the server, there is no difference between download and streaming. A server just sends bits. What happens to those bits later is unknown to it. What you need is a player that sends reports back to the server, or a logging service such as Mixpanel.

Response.TransmitFile vs. Direct Link

I am using an Azure cloud storage solution, and as such, each file has its own URL. I need to allow users to download files such as PDFs from our ASP.NET website. Currently, we are using Response.TransmitFile to send the file to the client's browser, but this requires that we fetch the data from the cloud storage and then send it to the client (seems like an inefficient way to do it).
I'm wondering if we could just have a direct link to the file, and if so, how would this differ from the Response.TransmitFile method? That is, without the TransmitFile method, we cannot set the Content-Type header, etc. How does that affect anything?
Thanks
Usually I stay away from using Response.TransmitFile as it does require that you fetch the file and then stream it down to the client.
The only times I have used it was to protect files and only serve them to users that had permission to access them instead of just linking directly to the file.
If the files you are serving are public, then I would recommend just linking to them. If all you need is to set the Content-Type header, then simply make sure the .pdf extension is mapped to the correct MIME type (application/pdf).
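The extension-to-MIME-type lookup mentioned above can be illustrated with Python's standard `mimetypes` table. This is only a sketch of the idea, not how Azure storage or IIS is configured; the helper name `content_type_for` is hypothetical.

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Look up a Content-Type from the file extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


if __name__ == "__main__":
    print(content_type_for("report.pdf"))
```

If a server has no mapping for an extension, a generic type such as application/octet-stream is a common fallback, and browsers will typically offer such a file as a download instead of displaying it inline.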

HTTP PUT and POST alternatives for uploading content

Other than HTTP PUT and POST, what other methods can a web application designer use to allow users to upload content (either files or listbox text) from a page of his web app to a remote server?
On the same topic, I was wondering what technology/APIs does a service like Google Docs or Google Drive use? The reason I ask this is: Our Sys Admin has disabled file uploading (via Squid proxy), yet I was able to create and share a document using Google Docs / Google Drive.
Many thanks in advance,
/HS
EDIT Please see the strikeout above.
This depends on the server in question, as the standard set of HTTP commands can be expanded, and some may not be configured/allowed. One of the common commands is "OPTIONS", which asks "what can I do".
But to answer more helpfully: you generally have two main options:
POST (the one you probably want to use, as it's nearly always available).
GET. You could use GET (but I'm NOT advocating it, just saying you could use it; you should not use a GET to make changes to the server). There are problems with this approach (including the size of files, manually handling the encoding, etc.) but it's possible if you have to go this route.
PUT is often not enabled on servers for security reasons.
More reading: http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
Edit: if "file uploading" is prevented by the proxy, have you tried encoding the POST? I.e., as opposed to sending a multipart POST, try encoding the files yourself into a POST string and sending that instead? Or encode the file, split it into multiple small POSTs, and piece them together at the other end?
Google Docs uses a mixture of POST and GET. POST for the updates. Google Drive I don't know.
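The "encode the file and split it into small POSTs" idea from the edit above could look roughly like this. It is a sketch under assumptions: base64 is one possible encoding (the answer does not name one), the function names are hypothetical, and the actual sending of each chunk is left out.

```python
import base64
from typing import List


def encode_in_chunks(data: bytes, chunk_size: int) -> List[str]:
    """Base64-encode the data and split the text into fixed-size pieces."""
    encoded = base64.b64encode(data).decode("ascii")
    return [encoded[i:i + chunk_size] for i in range(0, len(encoded), chunk_size)]


def reassemble(chunks: List[str]) -> bytes:
    """Server side: join the pieces in order and decode back to bytes."""
    return base64.b64decode("".join(chunks))


if __name__ == "__main__":
    original = bytes(range(256)) * 4
    parts = encode_in_chunks(original, 100)
    # Each part would be sent as an ordinary form field in its own POST.
    print(len(parts), reassemble(parts) == original)
```

The receiving end needs the chunks in their original order, so in practice each POST would also carry an index or sequence number.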

Need to check uptime on a large file being hosted

I have a dynamically generated rss feed that is about 150M in size (don't ask)
The problem is that it keeps crapping out sporadically and there is no way to monitor it without downloading the entire feed to get a 200 status. Pingdom times out on it and returns a 'down' error.
So my question is, how do I check that this thing is up and running?
What type of web server, and server side coding platform are you using (if any)? Is any of the content coming from a backend system/database to the web tier?
Are you sure the problem is not with the client code accessing the file? Most clients have timeouts and downloading large files over the internet can be a problem depending on how the server behaves. That is why file download utilities track progress and download in chunks.
It is also possible that other load on the web server, or the number of users, is impacting the server. If you have little memory available, certain servers may not be able to serve a file of that size to many users. You should review how the server is sending the file and make sure it is chunking it up.
I would recommend that you do a HEAD request to check that the URL is accessible and that the server is responding at minimum. The next step might be to set up your download test inside or very close to the data center hosting the file to monitor further. This may reduce cost and is going to reduce interference.
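A minimal sketch of the HEAD check recommended above, using Python's standard `http.client`. The function name `head_status` and the example URL are hypothetical. A HEAD response carries the status line and headers only, so the 150 MB body is never transferred.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical feed URL, for illustration only.
    print(head_status("https://example.com/feed.rss"))
```

Keep in mind that for a dynamically generated feed, a successful HEAD only shows that the server answers for that URL; it does not prove that the full feed generates without errors.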
Found an online tool that does what I needed
http://wasitup.com uses head requests so it doesn't time out waiting to download the whole 150MB file.
Thanks for the help BrianLy!
Looks like pingdom does not support the head request. I've put in a feature request, but who knows.
I hacked this capability into mon for now (mon is a nice compromise between paying someone else to monitor and doing everything yourself). I have switched entirely to https, so I modified the https monitor to do it. I did it the dead-simple way: copied the https.monitor file and called it https.head.monitor. In the new monitor file I changed the line that says (you might also want to update the function name and the place where that's called):
get_https to head_https
Now in mon.cf you can call a head request:
monitor https.head.monitor -u /path/to/file
