How to load Matomo faster with Drupal? - google-analytics

I have a site running Drupal 8.
I also have a second server, on a different domain name, for Matomo audience analysis.
My Drupal 8 site is slow to load. How does the following advice apply to Drupal 8? It is a LAMP server.
Preload DNS
Often a Matomo (Piwik) is hosted on a different domain, and when the browser loads the JavaScript tracker file, it first needs to perform a DNS lookup to find the IP address for this domain. By adding the snippet below for your Matomo domain, you can speed up the loading of the tracker file by 10ms to 50ms.
<link rel="dns-prefetch" href="//example.innocraft.cloud">

You can try adding a dns-prefetch hint; it should work and speed up the DNS lookup a bit (but keep in mind that this is only a tiny fraction of the 1.8s).
To do this, add the HTML tag as high up in the HTML of the site as possible (so somewhere in the <head>):
<link rel="dns-prefetch" href="http://analytics.yourdomain.example">
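As a rough illustration of how the tag relates to the Matomo URL, here is a minimal Python sketch that derives the hint from a tracker address. The function name `dns_prefetch_tag` and the example URL are assumptions for illustration only; on Drupal the tag would have to be added through the theme or a module rather than by a script like this.

```python
from urllib.parse import urlparse


def dns_prefetch_tag(matomo_url: str) -> str:
    """Return a dns-prefetch <link> tag for the host of the given URL."""
    host = urlparse(matomo_url).hostname
    if host is None:
        raise ValueError(f"no host found in {matomo_url!r}")
    # Protocol-relative form, as in the Matomo advice quoted in the question.
    return f'<link rel="dns-prefetch" href="//{host}">'


# Hypothetical tracker URL; replace with the real Matomo domain.
print(dns_prefetch_tag("https://analytics.yourdomain.example/matomo.js"))
```

Only the host name matters for the DNS lookup, so the path and scheme of the tracker URL are dropped.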
Also keep in mind that while your Matomo loads quite slowly, this shouldn't slow down your website, as Matomo only initialises after the page has finished loading and is usable by the visitor (the green and blue lines in your image).
You can look at https://matomo.org/docs/optimize-how-to/ for more tips on how to optimize Matomo to be faster (e.g. a newer PHP version, an SSD for the database, more RAM for the database, enabling opcache, etc.).
You could also take a look at this plugin: https://plugins.matomo.org/QueuedTracking
It dumps the raw visitor data directly into a MySQL or Redis table (which can bring the request down to 30ms) and then, once the queue is X entries long, it asynchronously runs Matomo on everything in the queue.
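The queueing idea can be sketched in a few lines of Python. This is not the plugin's code: the class `TrackingQueue`, its parameters, and the in-memory deque are assumptions standing in for the MySQL or Redis table, and the batch is processed synchronously here, whereas the plugin does it asynchronously.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class TrackingQueue:
    """Store raw tracking requests cheaply and process them in batches."""

    def __init__(self, batch_size: int,
                 process_batch: Callable[[List[Dict]], None]) -> None:
        self.batch_size = batch_size
        self.process_batch = process_batch
        self.pending: Deque[Dict] = deque()

    def track(self, raw_request: Dict) -> None:
        # Fast path: only record the raw data, no heavy processing.
        self.pending.append(raw_request)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Hand everything queued so far to the expensive processing step.
        batch = list(self.pending)
        self.pending.clear()
        if batch:
            self.process_batch(batch)


processed: List[List[Dict]] = []
queue = TrackingQueue(batch_size=3, process_batch=processed.append)
for i in range(7):
    queue.track({"url": f"/page/{i}"})
print(len(processed), len(queue.pending))  # prints "2 1"
```

With a batch size of 3 and seven requests, two full batches are processed and one request is still waiting, which is why the tracking request itself stays cheap.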

Related

WordPress logging requests into a database

I am trying to create a plugin which logs HTTP requests from users into a database. So far I've logged the requests for PHP files by hooking my function to the init action. But now I want to know if I can also log requests for files such as images, documents, etc. Is any PHP code executed when someone requests such files? Thank you.
Not by default, no. The normal mod_rewrite rules (not to be confused with WP's own rewrite rules) that WordPress uses specifically exclude any existing files such as images, CSS or JavaScript files. Those are handled directly by Apache.
You obviously could add a custom script that runs on each request, logs the access to the database, reads those files and prints their content to the client, but it would come at a considerable cost, I'm afraid.
Apache, albeit not the fastest web server around, is much, much faster at delivering a file to a client than it would be at running a PHP script, setting up a database connection, logging, and so on.
You'd get much higher server load, and probably noticeably slower page loads.
Instead, I recommend that you parse the access logs. They'll most likely contain all of the data you're looking for, and if you have access to the configuration, you can also log specific headers sent by the client. You can easily do this with a cron job that runs once a day, and it doesn't even have to run on the same server.
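A minimal sketch of the log-parsing approach, in Python. It assumes Apache's common/combined log format; the regular expression, the extension list, and the sample lines are illustrative assumptions, and a real cron job would read the actual log file and write the counts to the database.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the start of a common/combined log format line (assumed format).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)
STATIC_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".css", ".js", ".pdf")


def count_static_requests(lines: Iterable[str]) -> Counter:
    """Count requests per static file path, ignoring query strings."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like access log entries
        path = match.group("path").split("?", 1)[0]
        if path.lower().endswith(STATIC_EXTENSIONS):
            counts[path] += 1
    return counts


sample = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /wp-content/uploads/logo.png HTTP/1.1" 200 2326',
    '203.0.113.7 - - [10/Oct/2023:13:55:37 +0000] "GET /index.php HTTP/1.1" 200 5120',
    '198.51.100.2 - - [10/Oct/2023:13:56:01 +0000] "GET /wp-content/uploads/logo.png?v=2 HTTP/1.1" 304 -',
]
print(count_static_requests(sample))
```

Because the script only reads a file that Apache writes anyway, it adds no work to the request path.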

Long waiting (TTFB) time for scripts / styles on Azure Website

I have an intriguing problem on an Azure Website. My website uses 4 script files and 3 style files, each minified. They are not very big; the biggest is close to 200 KB. The website has already started, and Azure's Always On option is turned on. When I call the Web API for data, it returns in <50ms.
But when the app is reloaded, it needs 250 ms just to get the first byte of the tiniest script, and the others need much more. The initial HTML is loaded in 60 ms. Scripts and styles are cached so they are not downloaded, but the TTFB is killing the performance. This repeats on every single reload. The app does not contain any sophisticated configuration, so it should run much faster than this.
What can cause such problems?
Although your static files are cached, the browser still issues requests with an If-Modified-Since header (which results in a 304).
While it doesn't need to download the actual content, it still needs to wait for the RTT + server think time before it can continue.
I would suggest two things:
Adding Cache-Control and Expires headers - this will help avoid the 304 in some cases (pretty much unless you hit F5)
Using a proper CDN - such as Incapsula or others, which will minimize the RTT + think time. It can also be used to easily control cache settings for various resources.
More good stuff here.
Good Luck!
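For illustration, the two headers from the first suggestion can be built like this in Python. The helper name `static_cache_headers` and the one-day lifetime are assumptions; on Azure/IIS the headers would normally be set in the server configuration rather than in code.

```python
import time
from email.utils import formatdate
from typing import Dict, Optional


def static_cache_headers(max_age_seconds: int,
                         now: Optional[float] = None) -> Dict[str, str]:
    """Build Cache-Control and Expires headers for a static file."""
    if now is None:
        now = time.time()
    return {
        # Lets browsers and proxies reuse the file without revalidating.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # HTTP-date form of the same deadline, for older clients.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }


# Fixed "now" so the output is reproducible.
print(static_cache_headers(86400, now=0))
```

While the lifetime has not expired, the browser can use its cached copy without sending the conditional request at all, which removes the round trip described above.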
From here:
As you saw earlier, IIS 7 caches the compressed versions of static
files. So, if a request arrives for a static file whose compressed
version is already in the cache, it doesn’t need to be compressed
again.
But what if there is no compressed version in the cache? Will IIS 7
then compress the file right away and put it in the cache? The answer
is yes, but only if the file is being requested frequently. By not
compressing files that are only requested infrequently, IIS 7 saves
CPU usage and cache space.
By default, a file is considered to be requested frequently if it is
requested two or more times per 10 seconds.
So, the reason your users are being served an uncompressed version of the javascript file is because it didn't meet the default threshold for being compressed; in other words, the javascript file was not requested 2 times within 10 seconds.
To control this, there is one attribute we must change on the <serverRuntime> element, which controls compression: frequentHitThreshold. In order for your file to be compressed when it is requested once, change your <serverRuntime> element to look like this:
<serverRuntime enabled="true" frequentHitThreshold="1" />
This will slightly increase CPU usage if you serve many JavaScript files and have frequent visitors, but if visitors arrive often enough for compression to affect the CPU, then those files are most likely already compressed and cached!
My guess would be Azure's Always On.
If it works anything like the one CloudFlare provides, it essentially proxies the request and tries to cache it.
Depending on the exact implementation of this cache on Azure's side, it might wait for the script's output to complete in order to cache it or validate the cache, and only then pass it on to the browser.
You could try checking the caching configuration and disabling Always On for your scripts, if possible.
The scripts and styles are static files and by default are compressed before being sent to the client (you can check this with the HTTP header "content-encoding": gzip). So the TTFB consists of network latency, browser HTTP channel scheduling and the static file compression time on the server.
On the other hand, your Web API data is dynamic and by default is not compressed, so it is possible that its TTFB is lower than the TTFB for static files.
However, you don't need to switch off static compression; doing so would minimize TTFB but extend the content transfer time. Actually, you don't need to worry about TTFB at all; see https://blog.cloudflare.com/ttfb-time-to-first-byte-considered-meaningles/ for more info.
I ended up storing the files on Azure Storage and serving them through Azure CDN. It provides fast responses and costs nothing. I upload them to blob storage on every publish, in a pre-build event run by Gulp.
Well... there are 2 main problems with your site:
You are using Azure - a high-priced service with poor performance... don't ask me why people think that this is a good service.
You are storing client files side by side with the server files. While server files should be stored on a specific server, client files can be served from practically anywhere.
So please use a CDN (or any other server) for your client-side files (mainly CSS and JS; you may consider moving fonts and images as well).

Replace URLs in HTML files at the load balancer based on geolocation?

I have an HTTP web server providing static HTML pages.
Each page loads images and CSS from a fixed domain, like:
<img src="http://assets.mysite.com/1.jpg" />
Actually, there are several different domains serving the same files. For example:
assets-us.mysite.com
assets-eu.mysite.com
assets-asia.mysite.com
I want the load balancer to replace the domain "assets.mysite.com" with one of the others according to the visitor's geolocation.
For example, when I access the same URL from Europe, the HTML I get is:
<img src="http://assets-eu.mysite.com/1.jpg" />
When I access the same URL from Japan, the HTML I get is:
<img src="http://assets-asia.mysite.com/1.jpg" />
I would prefer to use NGINX (or G-WAN). Is it possible to achieve this on the load balancer with only some configuration or a script? How is performance affected by this replacement?
If your goal is to perform as well as possible, then you should do geo-IP load balancing at the DNS level, so that users are directed to the right server before they ever query the Web server. CDNs work this way.
But if you can't do that and want to manage the load-balancing from the Web server then the best way to scale is to use an AS (the networks used by ISPs) lookup table to find in which regions users are located.
Doing this as opposed to searching IP addresses, will immensely reduce the database size, and therefore speed-up operations. IP databases offer more details but are much larger.
For G-WAN, you would write a connection handler or a content-type handler if you want to implement a different logic for different MIME-types (the latter might also ease development as you won't have to parse the request to find the resource type).
If the database is stored locally (preferably in RAM), G-WAN C/C++/C# scripts, if properly implemented, won't increase the latency in a noticeable manner.
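To make the web-server-side approach concrete, here is a small Python sketch of the rewriting step only. It is not G-WAN or NGINX code; the region keys, the function name `rewrite_asset_host`, and the assumption that the visitor's region has already been resolved (for example from an in-memory AS lookup table) are all illustrative.

```python
# Region-specific asset hosts taken from the question.
REGION_TO_ASSET_HOST = {
    "us": "assets-us.mysite.com",
    "eu": "assets-eu.mysite.com",
    "asia": "assets-asia.mysite.com",
}
DEFAULT_ASSET_HOST = "assets.mysite.com"


def rewrite_asset_host(html: str, region: str) -> str:
    """Point asset URLs at the host for the visitor's region.

    Unknown regions keep the default host unchanged.
    """
    target = REGION_TO_ASSET_HOST.get(region, DEFAULT_ASSET_HOST)
    return html.replace(f"//{DEFAULT_ASSET_HOST}/", f"//{target}/")


page = '<img src="http://assets.mysite.com/1.jpg" />'
print(rewrite_asset_host(page, "eu"))
# prints: <img src="http://assets-eu.mysite.com/1.jpg" />
```

Since the pages are static, one option is to run this once per region ahead of time and serve the pre-rewritten copy, so that nothing is replaced at request time.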

Is WordPress suitable for a site which has 317k pageviews p/w

I had a meeting with the owner of a local newspaper company. They are planning to have a newly designed website. Their current website is static and doesn't have any kind of database, but their weekly pageview figure is around 317k. This figure will surely increase in the future.
The question is: if I build a WordPress system for them, will the website run smoothly with the new functionality (news, maybe galleries)? It is not necessary to use lots of plugins. Can their current server support WordPress without any upgrade?
Or should I consider building the website in plain PHP?
Yes - so long as the machinery for it is adequate, and you configure it properly.
If the company uses a CDN (like Akamai), ask them if this thing can piggyback on their account, then make them do it anyway when they throw up a political barrier. Then stop sweating it, turn keepalive on and ignore anything below this line. Otherwise:
If this is on a VPS, make sure it has guaranteed memory and I/O resources - otherwise host it on a hardware machine. If you're paranoid, something with a 10k RPM drive and 2-3 GB of RAM will do (memory for Apache and MySQL to have breathing room, and a hard drive to compensate for unexpected swap file use).
Make sure the 317k/w figure is accurate:
If it comes from GA/Omniture/another vendor tracking suite, increase the figures by about 33-50% to account for robots that they can't track.
If the number comes from house stats/httpd logs, assume it's 10-20% less (since robots don't typically hit you up for stylesheets and images.)
If it comes from combined reports by an analyst whose job it is to report on their own traffic performance, scratch your head and flip a coin.
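As a back-of-the-envelope check, the adjustment for a vendor-tracked figure can be worked out in Python. The 33-50% range comes from the advice above; the helper names are assumptions, and the per-second value is only a weekly average that says nothing about lunchtime or end-of-day bursts.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800


def adjusted_weekly_range(reported: int, low_factor: float, high_factor: float):
    """Scale a reported weekly figure by a low and a high correction factor."""
    return reported * low_factor, reported * high_factor


def average_per_second(weekly: float) -> float:
    """Average rate if the traffic were spread evenly over the week."""
    return weekly / SECONDS_PER_WEEK


reported = 317_000  # weekly pageviews from the question
low, high = adjusted_weekly_range(reported, 1.33, 1.50)
print(f"{low:,.0f} to {high:,.0f} pageviews per week")
print(f"{average_per_second(high):.2f} pageviews per second on average")
```

Even the upper estimate averages out to less than one pageview per second, so the sizing question is really about the peaks, and about the fact that each pageview triggers several requests.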
Apache: News sites in America have lunchtime and workday winddown traffic bursts around or about 11 am, and 4 pm, so you may want to turn Keepalive off (having it on will improve things during slow traffic periods, but during burst times the machine will spin into an unrecoverable state.)
PHP: Make sure some kind of opcode caching is enabled on the hosting machine (either APC or eAccelerator). With opcode caching, the memory footprint drops off significantly and the machine doesn't have to borrow as much from the swap file on the hard drive.
WP: Make sure you use WP 3.4, as ticket http://core.trac.wordpress.org/ticket/10964 was closed in favor of this ticket's fix: http://core.trac.wordpress.org/ticket/18536. Both long-standing issues address query performance on large-volume sites, but the overall improvements and fixes help everywhere else too.
Secondly, make sure to use something like the WP Super Cache caching plugin and configure it appropriately. If the volume of content on this site is going to be permanently small, you shouldn't have to take any special precautions - otherwise you may want to alter the plugin/rules so as to permanently archive older content into static files. There is no reason why 2-year-old content should be constantly respidered at full resource cost.
Robots.txt: prepare and properly register a dynamic sitemap with Google/Bing/etc. If you expect posts to be unnecessarily peppered with a bunch of tags and categories by people who don't understand what they actually do, you may want to Disallow /page/*, /category/* and /tag/*. Otherwise, when spider robots swarm the site, the load for every post will be multiplied by the number of tags and categories it has. And then some.
For several years The Baltimore Sun hosted their reader reward, sports and editorial database projects directly off a single collocated machine. Combined traffic volume was factors larger than what you mention, but adequately met.
Here's a video of httpd status w/keepalive on during a slow hour, at about 30 req./sec: http://www.youtube.com/watch?v=NAHz4GRY0WM#t=09
I would not rule out WordPress for this project based only on a weekly pageview count of less than a million. I have hosted WordPress sites that receive much, much more traffic and were still very functional. Whether or not WordPress is the best solution for this type of project, based on the other criteria you have, is completely up to you.
Best of luck and happy coding!
WP is capable of handling huge traffic. See this list of people who are using WP VIP services:
Time, Dow Jones, NBC Sports, CNN and many more.
Visit WordPress VIP site: http://vip.wordpress.com/clients/

Need to check uptime on a large file being hosted

I have a dynamically generated RSS feed that is about 150 MB in size (don't ask).
The problem is that it keeps failing sporadically, and there is no way to monitor it without downloading the entire feed to get a 200 status. Pingdom times out on it and returns a 'down' error.
So my question is: how do I check that this thing is up and running?
What type of web server, and server side coding platform are you using (if any)? Is any of the content coming from a backend system/database to the web tier?
Are you sure the problem is not with the client code accessing the file? Most clients have timeouts and downloading large files over the internet can be a problem depending on how the server behaves. That is why file download utilities track progress and download in chunks.
It is also possible that other load on the web server, or the number of users, is affecting the server. If little memory is available, certain servers may not be able to serve a file of that size to many users. You should review how the server is sending the file and make sure it is chunking it up.
I would recommend that you do a HEAD request to check that the URL is accessible and that the server is at least responding. The next step might be to set up your download test inside, or very close to, the data center hosting the file to monitor further. This may reduce cost and will reduce interference.
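A minimal sketch of such a HEAD check using only the Python standard library. The function name `head_status` and the 10-second timeout are assumptions; note also that, depending on the platform, a dynamically generated feed may still be built on the server for a HEAD request even though the body is never sent.

```python
import urllib.error
import urllib.request


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code.

    Only the response headers are transferred, so the size of the
    resource does not affect how long the check takes to download.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions; report their code.
        return error.code
```

A monitoring job could call `head_status` on the feed URL at intervals and alert when the result is not 200. Connection failures and timeouts raise `urllib.error.URLError` or `TimeoutError`, which the caller would treat as "down".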
Found an online tool that does what I needed.
http://wasitup.com uses HEAD requests, so it doesn't time out waiting to download the whole 150MB file.
Thanks for the help BrianLy!
Looks like Pingdom does not support HEAD requests. I've put in a feature request, but who knows.
I hacked this capability into mon for now (mon is a nice compromise between paying someone else to monitor and doing everything yourself). I have switched entirely to https, so I modified the https monitor to do it. I did it the dead-simple way: copied the https.monitor file and called it https.head.monitor. In the new monitor file I changed the following line (you might also want to update the function name and the place where it's called):
get_https to head_https
Now in mon.cf you can call a head request:
monitor https.head.monitor -u /path/to/file
