Wordpress logging requests into a database - wordpress

I am trying to create a plugin which logs http requests from users into a database. So far I've logged the requests for php files by hooking my function to the init function. But now I want to know if I can also log requests for files such as images, documents, etc. Is there any php code executed when someone requests files? Thank you.

Not by default, no. The normal mod_rewrite rules (not to be confused with WP's own rewrite rules) Wordpress uses specifically exclude any existing files such as images, css or javascript files. Those will be handled directly by Apache.
You obviously could add a custom script that runs on each request, logs the access to the database, reads those files and prints their content to the client, but it would come at a considerable cost, I'm afraid.
Apache, albeit not the fastest webserver around, is much, much faster in delivering a file to a client than running a php script, setting up a database connection, logging etc pp would be.
You'd get much higher server load, and probably noticeably slower page loads.
Instead, I recommend that you parse the access logs. They'll most likely contain all of the data you're looking for, and if you have access to the configuration, you can add specific headers sent by the client. You can easily do this with a cronjob that runs once a day, and it doesn't even have to run on the same server.

Related

accessing WordPress DB from remote server

Need some advice before starting develop some things.. I've 15 WordPress websites on different installs, and I've remote server which gets data 24/7 from those websites.
I've reached a point that I want the server to modify the websites based on his calculated data.
The things are this:
Should I allow the server the access the WP DB remotely and modify things without using WP on the circle?
Or, use WP REST API and supply some secured routes which provide data and accept data and make those changes?
My instinct is to use the WP API, but. After all its a PHP (nginx+apache) which have some limits (timeout for example) and I find it hard to run hard and long process on the WP itself.
I can divide the tasks to different levels, for example:
fetching data (simple get)
make some process on the remote server
loop and modify in small batches to another route
My concerns are that this circle require perfect match between remote server and WP API, and any change or fix on WP side brings plugins update on the websites which is not much fun.
Hope for any ideas and suggests to make it forward.
"use WP REST API and supply some secured routes which provide data and accept data and make those changes", indeed.
i don't know why timeout or another limits may cause a problem - but using API is the best way for such kind of cases. You can avoid timeout problems with some adjustments on web servers side.
Or you can increase memory, timeout limit exclusively for requested server.
f.e.
if ($_SERVER["remote_attr"]=='YOUR_MAIN_SERVER_IP') {
ini_set('max_execution_time',1000);
ini_set('memory_limit','1024M');
}

How can I track downloads of files from remote websites

I am sharing the link of a file (e.g. pdf), which is stored in my server. Is it possible to track whenever some user is downloading the file? I don't have access to the script of the other page but I thought I could track the incoming requests to my server. Would that be computationally expensive? Any hints towards which direction to look?
You can use the measurement protocol, a language agnostic description of a http tracking request to the Google Analytics tracking server.
The problem in your case is that you do not have a script between the click and the download to send the tracking request. One possible workaround would be to use the server access log, provided you have some control over the server.
For example the Apache web server can user piped logs, e.g. instead if being written directly to a file the log entry is passed to a script or program. I'm reasonably sure that other servers have something similar.
You could pipe the logs to a script that evaluates if the log entry points at the URL of your pdf file, and if so breaks down the info into individual data fields and sends them via a programming language of your choice to the GA tracking server.
If you cannot control the server to that level you'd need to place a script with the same name and location as the original file on the server, map the pdf extension to a script interpreter of your choice (in apache via addType, which with many hosts can be done via a htaccess file) and have the script sending the tracking request before delivering the original file.
Both solutions require a modicum of programming practice (the latter much less than the former). Piping logs might be expensive, depending on the number of requests to your server (you might create an extra log file for downloadable files, though). An intermediary script would not be an expensive operation.

Long waiting (TTFB) time for scripts / styles on Azure Website

I have this intriguing problem on Azure Website. My website uses 4 script files and 3 style files, each minified. They are not so big, bigest has near 200 KBs. Website had already started. Azure's Always On option is turned on. When I call to WebApi for data it returns in <50ms.
And when app is reloaded it needs 250 ms just to get first byte from tiniest script, and others needs much more. Initial Html is loaded in 60 ms. Scripts/styles are cached so they are not downloaded, but the TTFB time is killing the performance. This repeats every single reload. App is not containing any sophisticated configuration so it should run much faster than it.
What can cause such problems?
Although your static files are cached, the browser still issues requests with if-modifies-since header (which results in a 304).
While it doesn't need to download the actual content, it still needs to wait the RTT + server think time to continue.
I would suggest two things:
Adding Cache-Control and Expire headers - will help avoid 304 in some cases (pretty much unless you hit F5)
Using a proper CDN - such as Incapsula or others, that will minimize the RTT + think time. It can also be used to easily control cache settings for various resources.
More good stuff here.
Good Luck!
From here:
As you saw earlier, IIS 7 caches the compressed versions of static
files. So, if a request arrives for a static file whose compressed
version is already in the cache, it doesn’t need to be compressed
again.
But what if there is no compressed version in the cache? Will IIS 7
then compress the file right away and put it in the cache? The answer
is yes, but only if the file is being requested frequently. By not
compressing files that are only requested infrequently, IIS 7 saves
CPU usage and cache space.
By default, a file is considered to be requested frequently if it is
requested two or more times per 10 seconds.
So, the reason your users are being served an uncompressed version of the javascript file is because it didn't meet the default threshold for being compressed; in other words, the javascript file was not requested 2 times within 10 seconds.
To control this, there is one attribute we must change on the <serverRuntime> element, which controls compression: frequentHitThreshold. In order for your file to be compressed when it is requested once, change your <serverRuntime> element to look like this:
<serverRuntime enabled="true" frequentHitThreshold="1" />
This will slightly impact your CPU performance if you have many javascript files that are being served and you have users quite often, but likely if you have users often enough to impact CPU from compressing these files, then they are already compressed and cached!
My guess would be Azures always on.
If it works anything like the one CloudFlare provides, it essentially proxies the request and tries to cache it.
Depending on the exact implementation of this cache on the side of Azure, it might wait for the scripts output to complete to cache it/validate the cache and then pass it on to the browser.
You might have a chance checking the caching configuration and disable always on for your scripts if possible.
The scripts and styles are static files and by default are compressed (you can check this with HTTP header "content-encoding": gzip) before being sent to client. So, the TTFB consists of network latency, browser HTTP channel scheduling and the static file compression time from server.
On the other hand, your Web API data is dynamic data and by default is not compressed, so possible its TTFB is less than the TTFB for static files.
However, you don't need to switch off static compressing, otherwise TTFB is minimized but content transferring time will be extended. Actually, you don't need to worry about TTFB, see more info: https://blog.cloudflare.com/ttfb-time-to-first-byte-considered-meaningles/
I finished with storing files on Azure Storage and serving them by Azure CDN. It provides high speed of response and costs nothing. I add them to blob every publish, in Pre-build event by Gulp.
well... there are 2 main problems with your site:
you are using AZURE - a high priced service with a poor performance.... don't ask me why people think that this is a good service
you are storing client files side-by-side with the server files.. while server files should be stored in a specific server, client files can practically can be served from... everywhere
so - please use a CDN (or any other server) for your client side files (mainly css and js, you may consider moving fonts and images as well)

Securing large static files without hogging ASP.Net threads using IIS

While my ASP.NET code is streaming a large file, is it tying down a thread completely? In other words, if 8 people are downloading large files and I only have 8 threads available, will no further requests be processed?
In any case, I need to find an alternative way of securing large static files, preferably by letting IIS serve it directly after the user has been authorized, in order to free the application server from having to deal with something that IIS, Nginx, etc can do better without hitting any managed code.
I believe Nginx allows this if your app puts the "X-Accel-Redirect" header in its response: http://kovyrin.net/2006/11/01/nginx-x-accel-redirect-php-rails/.
Apache and Lighttpd have the same feature.
Any advice?
Returning the URL of the file is an appropriate solution.
You can prevent unauthorised users who have that URL from also downloading the file by using the standard authentication providers in asp.net. If you turn runAllManagedModulesForAllRequests on (see http://www.iis.net/configreference/system.webserver/modules) the users authentication will be verified when they hit the URL, if they are authorised, they will be allowed to access the file.
In either case, downloading doesn't lock threads, just execution. This is why the maxConnections settings has a default setting of 4294967295. (see http://www.iis.net/configreference/system.applicationhost/sites/site/limits)

Need to check uptime on a large file being hosted

I have a dynamically generated rss feed that is about 150M in size (don't ask)
The problem is that it keeps crapping out sporadically and there is no way to monitor it without downloading the entire feed to get a 200 status. Pingdom times out on it and returns a 'down' error.
So my question is, how do I check that this thing is up and running
What type of web server, and server side coding platform are you using (if any)? Is any of the content coming from a backend system/database to the web tier?
Are you sure the problem is not with the client code accessing the file? Most clients have timeouts and downloading large files over the internet can be a problem depending on how the server behaves. That is why file download utilities track progress and download in chunks.
It is also possible that other load on the web server or the number of users is impacting server. If you have little memory available and certain servers then it may not be able to server that size of file to many users. You should review how the server is sending the file and make sure it is chunking it up.
I would recommend that you do a HEAD request to check that the URL is accessible and that the server is responding at minimum. The next step might be to setup your download test inside or very close to the data center hosting the file to monitor further. This may reduce cost and is going to reduce interference.
Found an online tool that does what I needed
http://wasitup.com uses head requests so it doesn't time out waiting to download the whole 150MB file.
Thanks for the help BrianLy!
Looks like pingdom does not support the head request. I've put in a feature request, but who knows.
I hacked this capability into mon for now (mon is a nice compromise between paying someone else to monitor and doing everything yourself). I have switched entirely to https so I modified the https monitor to do it. The did it the dead-simple way: copied the https.monitor file, called it https.head.monitor. In the new monitor file I changed the line that says (you might also want to update the function name and the place where that's called):
get_https to head_https
Now in mon.cf you can call a head request:
monitor https.head.monitor -u /path/to/file

Resources