Need to check uptime on a large file being hosted over HTTP

I have a dynamically generated rss feed that is about 150M in size (don't ask)
The problem is that it keeps crapping out sporadically, and there is no way to monitor it without downloading the entire feed to get a 200 status. Pingdom times out on it and returns a 'down' error.
So my question is: how do I check that this thing is up and running?

What type of web server, and server side coding platform are you using (if any)? Is any of the content coming from a backend system/database to the web tier?
Are you sure the problem is not with the client code accessing the file? Most clients have timeouts and downloading large files over the internet can be a problem depending on how the server behaves. That is why file download utilities track progress and download in chunks.
It is also possible that other load on the web server, or the number of concurrent users, is impacting the server. If you have little memory available, then with certain servers it may not be able to serve a file of that size to many users. You should review how the server is sending the file and make sure it is chunking it up.
I would recommend that you do a HEAD request to check, at a minimum, that the URL is accessible and that the server is responding. The next step might be to set up your download test inside, or very close to, the data center hosting the file to monitor further. This may reduce cost and is going to reduce interference.
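A HEAD-based probe can be sketched in a few lines of Python using only the standard library, so it can run from cron or any monitoring box. FEED_URL here is a placeholder; swap in your own feed address:

```python
# Minimal uptime probe using a HEAD request, so the 150 MB body is never
# transferred. Only the response headers and status line come back.
import urllib.error
import urllib.request

FEED_URL = "https://example.com/feed.rss"  # placeholder URL

def is_up(url, timeout=10):
    """Return (ok, status) for a HEAD request, without downloading the body."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200, resp.status
    except urllib.error.HTTPError as e:
        return False, e.code            # server answered with an error status
    except (urllib.error.URLError, OSError):
        return False, None              # DNS failure, refused connection, timeout

if __name__ == "__main__":
    ok, status = is_up(FEED_URL)
    print("UP" if ok else "DOWN", status)
```

Note that a 200 on HEAD only proves the server is answering; it does not prove the full 150 MB body would stream to completion, so this is a liveness check, not a full download test.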

Found an online tool that does what I needed
http://wasitup.com uses HEAD requests, so it doesn't time out waiting to download the whole 150MB file.
Thanks for the help BrianLy!

Looks like Pingdom does not support HEAD requests. I've put in a feature request, but who knows.
I hacked this capability into mon for now (mon is a nice compromise between paying someone else to monitor and doing everything yourself). I have switched entirely to HTTPS, so I modified the https monitor to do it. I did it the dead-simple way: I copied the https.monitor file and called it https.head.monitor. In the new monitor file I changed the line that says (you might also want to update the function name and the place where it's called):
get_https to head_https
Now in mon.cf you can call a head request:
monitor https.head.monitor -u /path/to/file

Related

Wordpress logging requests into a database

I am trying to create a plugin which logs HTTP requests from users into a database. So far I've logged the requests for PHP files by hooking my function to the init hook. But now I want to know if I can also log requests for files such as images, documents, etc. Is there any PHP code executed when someone requests those files? Thank you.
Not by default, no. The normal mod_rewrite rules WordPress uses (not to be confused with WP's own internal rewrite rules) specifically exclude any existing files such as images, CSS or JavaScript files. Those will be handled directly by Apache.
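You can see this in the stock WordPress .htaccess block, where the two RewriteCond lines skip index.php for any request that matches an existing file (-f) or directory (-d):

```apache
# Standard WordPress rewrite rules: requests for files or directories
# that already exist on disk bypass index.php (and thus all PHP) entirely.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
```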
You obviously could add a custom script that runs on each request, logs the access to the database, reads those files and prints their content to the client, but it would come at a considerable cost, I'm afraid.
Apache, albeit not the fastest web server around, is much, much faster at delivering a file to a client than running a PHP script, setting up a database connection, logging and so on would be.
You'd get much higher server load, and probably noticeably slower page loads.
Instead, I recommend that you parse the access logs. They'll most likely contain all of the data you're looking for, and if you have access to the configuration, you can also log specific headers sent by the client. You can easily do this with a cron job that runs once a day, and it doesn't even have to run on the same server.
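A daily cron job of that sort can be sketched roughly like this. The log path and the regex assume Apache's default "combined" log format; both are assumptions you would adapt to your own server configuration:

```python
# Sketch of a daily log-parsing job (run from cron): count successful
# requests per path from an Apache "combined"-format access log.
import re
from collections import Counter

# Matches: IP, two dashes, [timestamp], "METHOD /path PROTO", status, size
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def count_hits(lines):
    """Return a Counter of path -> number of 2xx responses."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("status").startswith("2"):
            hits[m.group("path")] += 1
    return hits

if __name__ == "__main__":
    with open("/var/log/apache2/access.log") as f:  # assumed log location
        for path, n in count_hits(f).most_common(20):
            print(n, path)
```

Filtering on the file extensions you care about (images, documents) and inserting the rows into a database instead of printing them is a small extension of the same loop.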

Disable disk cache on media foundation network source

I'm currently using MediaFoundation to read MP4 video from a remote server using HTTP. I create the media source using the source resolver method CreateObjectFromURL.
This works perfectly fine and the video is great, except for one huge pain point:
All data received by the source seems to be cached to disk, specifically in ...AppData\Local\Microsoft\Windows\INetCache\IE. This is not ideal as the streams can be open for an unlimited amount of time because the video is live.
At the moment the only real solution is to periodically disconnect, clean up temporary internet files, then reconnect again. This would need to be done every few hours to stop the machine from filling up and exploding. It is not really an acceptable long term solution for the end user of course.
I have tried to disable caching by setting the MFNETSOURCE_CACHEENABLED attribute on the media source to false, but it doesn't seem to stop it caching at all.
Is there a trick I'm missing somewhere?

How can I track downloads of files from remote websites

I am sharing the link of a file (e.g. pdf), which is stored in my server. Is it possible to track whenever some user is downloading the file? I don't have access to the script of the other page but I thought I could track the incoming requests to my server. Would that be computationally expensive? Any hints towards which direction to look?
You can use the Measurement Protocol, a language-agnostic description of an HTTP tracking request to the Google Analytics tracking server.
The problem in your case is that you do not have a script between the click and the download to send the tracking request. One possible workaround would be to use the server access log, provided you have some control over the server.
For example, the Apache web server can use piped logs, i.e. instead of being written directly to a file, each log entry is passed to a script or program. I'm reasonably sure that other servers have something similar.
You could pipe the logs to a script that evaluates if the log entry points at the URL of your pdf file, and if so breaks down the info into individual data fields and sends them via a programming language of your choice to the GA tracking server.
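The hit-sending part of such a script might look like the sketch below, using the classic Measurement Protocol "collect" endpoint. TRACKING_ID is a placeholder, and deriving an anonymous client id by hashing the visitor IP is purely an illustrative choice:

```python
# Sketch: turn a matching log entry into a Google Analytics Measurement
# Protocol event hit. Assumes the classic (Universal Analytics) endpoint.
import hashlib
import urllib.parse
import urllib.request

TRACKING_ID = "UA-XXXXXXX-1"  # placeholder property id
ENDPOINT = "https://www.google-analytics.com/collect"

def build_hit(ip, path):
    """Return the urlencoded payload for a 'download' event."""
    cid = hashlib.sha1(ip.encode()).hexdigest()[:16]  # pseudo client id
    return urllib.parse.urlencode({
        "v": "1",                # protocol version
        "tid": TRACKING_ID,      # property to credit the hit to
        "cid": cid,              # anonymous client id
        "t": "event",            # hit type
        "ec": "download",        # event category
        "ea": "pdf",             # event action
        "el": path,              # event label: which file was fetched
    })

def send_hit(ip, path):
    data = build_hit(ip, path).encode()
    urllib.request.urlopen(ENDPOINT, data=data, timeout=5)
```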
If you cannot control the server to that level, you'd need to place a script with the same name and location as the original file on the server, map the pdf extension to a script interpreter of your choice (in Apache via AddType, which with many hosts can be done via an .htaccess file) and have the script send the tracking request before delivering the original file.
Both solutions require a modicum of programming practice (the latter much less than the former). Piping logs might be expensive, depending on the number of requests to your server (you could create a separate log file for downloadable files, though). An intermediary script would not be an expensive operation.

WP-Engine 502 timeout- what options do I have to get around this limitation?

We have a plugin for WordPress that we've been using successfully with many customers - the plugin syncs stock numbers with our warehouse and exports orders to our warehouse.
We recently had a client move to WP-Engine, which seems to impose a hard 30-second limit on the length of a running request. Because sometimes we have many orders to export, the script simply hits a 502 Bad Gateway error.
According to WP-Engine documentation, this cannot be turned off on a client by client basis.
https://wpengine.com/support/troubleshooting-502-error/
My question is: what options do I have to get around a host's 30-second timeout limit? Setting set_time_limit has no effect (as expected, since it is the web server killing the request, not PHP). The only thing I can think of is to make heavy modifications to the plugin so that it acts as an API and we simply pull the data from the client's system, but this is a last resort.
The long-process timeout is 60 seconds.
This cannot be turned off on shared plans, only on plans with dedicated servers. You will not be able to get around this by attempting to modify it, as it runs directly on Apache outside of your particular install.
Your options are:
1. 'Chunk' the upload to be smaller
2. Upload the sql file to your sFTP _wpeprivate folder and have their support import it for you.
3. Optimize the import so the content is imported more efficiently.
I can see three options here:
1. Change the web host (easy option).
2. Modify the plugin to process the sync in batches. However, this won't give you a 100% guarantee with a hard script execution time limit - something may get lost in one or more batches and you won't even know.
3. Contact WP Engine and ask them to raise the limit for this particular client.
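The batching idea generalizes to any hard gateway timeout: do a bounded amount of work per request, persist a cursor, and have the client (or a scheduled ping) call the endpoint again until it reports completion. A language-neutral sketch of that loop, with hypothetical order data and a deliberately conservative time budget, might look like:

```python
# Illustration of "chunking" a long export under a hard 30 s proxy limit:
# process a few orders at a time, stop before the deadline, and return a
# cursor so the next request can resume where this one left off.
import time

TIME_BUDGET = 20   # seconds of work per request; stays well under 30 s
BATCH_SIZE = 25    # orders processed per inner batch

def run_export(orders, cursor, deadline=TIME_BUDGET, now=time.monotonic):
    """Export orders starting at `cursor`; return (new_cursor, done)."""
    start = now()
    i = cursor
    while i < len(orders):
        if now() - start >= deadline:
            return i, False              # out of time: caller re-invokes later
        batch = orders[i:i + BATCH_SIZE]
        for order in batch:
            pass                          # a real export_order(order) goes here
        i += len(batch)
    return i, True                        # everything exported
```

The caller stores the returned cursor (e.g. in a WordPress option row) and keeps re-requesting the endpoint until `done` is true, so no single request ever approaches the gateway limit.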

How to check that a file is being streamed and not downloaded?

Abstract: There is a page with a player that loads audio file and plays it. The player used on the web page is jwplayer. I need to find a way to determine if the audio file is being streamed to the player or not.
Background: In my research I found that if I use an nginx header like X-Accel-Redirect, the file will be streamed. I have set up the web server with an nginx + Apache combination (nginx is a reverse proxy for Apache), and then pointed jwplayer at the mp3 file - and it is working. I mean I am able to click anywhere on the audio timeline and it immediately starts playing sound. But since I haven't set that header yet, and the player already works, I need to verify what is actually happening and know for sure.
Some of my own thoughts: jwplayer itself supports some kind of buffering, so I have no idea whether it just downloads the mp3 file I am testing this on, or whether it receives a stream and plays it out.
Is there a way to check and know for sure? The only idea I have is to check the access logs, but I don't know what to look for, or whether I need a special log format to see the required data.
While researching the issue I came across some download-related topics and something about HTTP headers with "Range" in them, but I am not sure whether that relates to streaming or not.
Please advise.
From the point of view of the server, there is no difference between download and streaming. A server just sends bits; what happens to those bits later is unknown. What you need is a player that sends reports back to the server, or to a logging service such as Mixpanel.
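That said, the "Range" headers you found are exactly what lets the player seek instantly: if the server answers a byte-range request with 206 Partial Content (or advertises Accept-Ranges: bytes), the player can fetch arbitrary slices of the file instead of downloading it front to back. A small probe, with the URL as a placeholder, could check this:

```python
# Probe whether a server honors HTTP byte-range requests: ask for the
# first 100 bytes and look for a 206 status or an Accept-Ranges header.
import urllib.request

def supports_ranges(url, timeout=10):
    """Return True if the server appears to serve byte ranges for `url`."""
    req = urllib.request.Request(url, headers={"Range": "bytes=0-99"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return (resp.status == 206
                or resp.headers.get("Accept-Ranges", "").lower() == "bytes")
```

In the access logs this shows up as 206 responses with small byte counts scattered across the file, as opposed to a single 200 response for the full file size.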
