Recently, after the Lighthouse update, our site measurements started showing significant differences between runs. With successive measurements of the same project, the differences reached 30 points; before the update, they were at most 3 points.
We position ourselves as a company that delivers high PageSpeed scores as an advantage, but with differences this large we cannot rely on the numbers. We want to understand:
Why does such a difference in the resulting values appear?
How can this difference be reduced?
Could this be due to the incorrect operation of your algorithms?
The measurements were carried out on our project in production, which had constant values > 90 points before the update.
Check this.
I doubt this is a CLS (Cumulative Layout Shift) issue. Your CLS numbers are fine.
The variance in PageSpeed Insights could be due to your CDN (content delivery network). The first test is usually poor because your site is still being loaded into the CDN cache; subsequent tests will be faster. Always take the median score from 4-5 runs.
You can also try disabling the CDN temporarily. Platforms such as Cloudflare offer a development mode that temporarily bypasses the CDN cache. Try that; you will get more consistent results.
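As a minimal sketch of the "take the median" advice, the snippet below summarises several runs with the median rather than the mean, so a single cold-cache run does not drag the result down. The scores are hypothetical values, not measurements from the site in question.

```python
from statistics import mean, median


def median_score(scores):
    """Return the median of several performance scores from repeated runs."""
    if not scores:
        raise ValueError("need at least one run")
    return median(scores)


# Hypothetical scores from five consecutive runs; the first is a cold-cache outlier.
runs = [62, 91, 93, 90, 94]

print(median_score(runs))  # 91
print(mean(runs))          # 86, pulled down by the outlier
```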
I am noticing different TTFB values in the Chrome Network tab versus those logged by WebVitals. Ideally they should be exactly the same, but I sometimes see differences as large as 2-3 seconds in certain scenarios.
I am using Next.js and using reportWebVitals to log respective performance metrics.
Here is a sample repo, app url and screenshots for reference.
Using performance.timing.responseStart - performance.timing.requestStart returns a more appropriate value than relying on the WebVitals TTFB value.
Any idea what could be going wrong? Is it a bug in WebVitals, so that I shouldn't be using it, or a mistake on my end in consuming/logging the values?
The number provided by reportWebVitals (and the underlying library web-vitals) is generally considered the correct TTFB in the web performance community (though to be fair, there are some differences in implementation across tools).
I believe DevTools labels that smaller number "Waiting (TTFB)" as an informal hint to the user what that "waiting" is to give it context and because it usually is the large majority of the TTFB time.
However, from a user-centric perspective, time-to-first-byte should really include all the time from when the user starts navigating to a page to when the server responds with the first byte of that page--which will include time for DNS resolution, connection negotiation, redirects (if any), etc. DevTools does include at least some information about that extra time in that screenshot, just separated into various periods above the ostensible TTFB number (see the "Queueing", "Stalled", and "Request Sent" entries).
Generally the Resource Timing spec can be used as the source of truth for talking about web performance. It places time 0 as the start of navigation:
Throughout this work, all time values are measured in milliseconds since the start of navigation of the document [HR-TIME-2]. For example, the start of navigation of the document occurs at time 0.
And then defines responseStart as
The time immediately after the user agent's HTTP parser receives the first byte of the response
So performance.timing.responseStart - performance.timing.navigationStart by itself is the browser's measure of TTFB (or performance.getEntriesByType('navigation')[0].responseStart in the newer Navigation Timing Level 2 API), and that's the number web-vitals uses for TTFB as well.
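To make the difference between the two numbers concrete, here is a small sketch that computes both from one set of timestamps. The dictionary keys mirror the Navigation Timing attribute names, but the values are hypothetical milliseconds relative to the start of navigation, chosen only for illustration.

```python
def ttfb_from_navigation_start(timing):
    """TTFB as described above: responseStart measured from the start of navigation."""
    return timing["responseStart"] - timing["navigationStart"]


def waiting_time(timing):
    """The smaller 'waiting' figure: responseStart measured from requestStart."""
    return timing["responseStart"] - timing["requestStart"]


# Hypothetical timestamps in milliseconds (not taken from the sample app).
timing = {
    "navigationStart": 0,
    "requestStart": 2150,   # after redirects, DNS lookup, connection setup, queueing
    "responseStart": 2400,
}

print(ttfb_from_navigation_start(timing))  # 2400
print(waiting_time(timing))                # 250
```

With these assumed values the two figures differ by more than two seconds, even though both describe the same request.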
In PageSpeed Insights, I get the following message in Origin Summary: "Over the previous 28-day collection period, the aggregate experience of all pages served from this origin does not pass the Core Web Vitals assessment."
Does anyone know what % of URLs have to pass the test in order to change this? Or what the criteria are?
Explanation
Let's use Largest Contentful Paint (LCP) as an example.
Firstly, the pass / fail is not based on the percentage of URLs, it is based on the average time / score.
This is an important distinction as you could have 50% of the data fail, but if it only fails by 0.1s (2.6s) and the other 50% of data is passing by 1 second (1.5s) the average will be a pass (average of 2.05s which is a pass).
Obviously this is an over-simplified example but you hopefully get the idea that you could have 50% of your site in the red and still pass in theory, which is why the percentages in each category are more for diagnostics.
If the average time for LCP across all pages in the CrUX dataset is less than 2.5 seconds ("Good") then you will get a green score and that is a pass.
If the time is less than 4 seconds the score will be orange ("Needs improvement") but this will still count as a fail.
Over 4 seconds and it fails and will be red ("Poor").
Passing criteria
So you need the following to be true to pass the web vitals (at time of writing):
Largest Contentful Paint (LCP) average is less than 2.5 seconds
First Input Delay (FID) is less than 100ms
Cumulative Layout Shift is less than 0.1
If any one of those is over the threshold you will fail, even if the other two are within the green / passes.
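The rule above can be sketched as a simple check. The thresholds are the ones listed in this answer; the function name and the idea of passing in already-aggregated field values are assumptions made for illustration, not part of any PageSpeed Insights API.

```python
# Thresholds as listed above (at the time of writing of the answer).
LCP_GOOD_SECONDS = 2.5
FID_GOOD_MS = 100
CLS_GOOD = 0.1


def passes_core_web_vitals(lcp_seconds, fid_ms, cls):
    """Return True only if all three aggregated metrics are within the 'Good' range."""
    return (
        lcp_seconds < LCP_GOOD_SECONDS
        and fid_ms < FID_GOOD_MS
        and cls < CLS_GOOD
    )


# Hypothetical aggregated values for an origin.
print(passes_core_web_vitals(2.05, 40, 0.05))  # True: all three are good
print(passes_core_web_vitals(2.05, 40, 0.25))  # False: CLS alone fails the assessment
```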
FID - when running Lighthouse (or PageSpeed Insights) on a page you do not get the FID as part of the synthetic test (Lab Data).
Instead you get Total Blocking Time (TBT) - this is a close enough approximation for FID in most circumstances so use that (or run a performance trace).
Application Insights can collect dependencies as part of its log analytics, and recently this has been enabled by default. Of course, having information about dependencies is incredibly useful when you try to improve performance, but how can one use it when sampling is on and the rate of data is much higher than the sampling rate?
To give an example, MaxTelemetryItemsPerSecond is 5 in the documentation's example. With it enabled in production, the sum(itemCount) for my requests is around 400 and for dependencies is around 5000-6000. Regardless of the price, I wanted to have as much information as possible, so I tried increasing the limit and hit a performance problem at around 600. So I had to prioritize my events, exceptions and requests over the dependencies, which leaves a limit of at most 100 rows for sampled dependencies. That means each row in my sampled data will represent 50 dependencies, and I am at the performance limit. If I go for a limit of 10 rows for dependencies, each row will represent 500 items.
My question is, what would be the use of the data that is sampled with a rate of 1:500? What is the gain? How can this be even helpful?
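The ratios in the question can be checked with a few lines of arithmetic. This is only a sketch: the function name is hypothetical and the figure of 5000 dependencies is the lower bound quoted above.

```python
def items_per_sampled_row(total_items, retained_rows):
    """How many original telemetry items each retained (sampled) row stands for."""
    return total_items / retained_rows


total_dependencies = 5000  # lower end of the 5000-6000 range quoted above

print(items_per_sampled_row(total_dependencies, 100))  # 50.0
print(items_per_sampled_row(total_dependencies, 10))   # 500.0
```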
Sampling is done to reduce the cost of telemetry. (financial cost + performance cost)
Even with sampling, the built-in sampling takes care to retain or discard related events together. That is, if a RequestTelemetry is retained by sampling, then all the DependencyTelemetry within the context of that request is retained as well. This will give you enough to investigate requests more closely, and to see how dependencies contribute to the overall performance of a request.
You may also want to take a closer look at all the dependencies collected, and filter some of them out if you think they are not very useful. For example, some people may choose to drop all very fast dependencies.
Access to raw requests/dependencies is most needed on failures, so you may write a telemetry processor to retain all failed dependencies. This will mean you'll have more data to investigate failures, while still sampling the rest of the telemetry.
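The idea of "keep every failed dependency, sample the rest per operation" can be sketched in plain Python. This is not the Application Insights SDK: the item dictionaries, field names and hashing scheme below are all assumptions chosen to illustrate the decision logic only.

```python
import hashlib


def keep_by_operation(operation_id, sampling_percentage):
    """Decide retention from a hash of the operation id, so that every item
    belonging to the same operation gets the same keep/drop decision."""
    digest = hashlib.sha256(operation_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10000  # 0..9999
    return bucket < sampling_percentage * 100


def should_retain(item, sampling_percentage):
    """Always keep failed dependencies; sample everything else by operation."""
    if item["type"] == "dependency" and not item["success"]:
        return True
    return keep_by_operation(item["operation_id"], sampling_percentage)


# Hypothetical telemetry items sharing one operation id.
request = {"type": "request", "operation_id": "op-42", "success": True}
ok_dep = {"type": "dependency", "operation_id": "op-42", "success": True}
failed_dep = {"type": "dependency", "operation_id": "op-42", "success": False}

# The request and its successful dependency are kept or dropped together.
print(should_retain(request, 10) == should_retain(ok_dep, 10))  # True
# The failed dependency is kept even when nothing else is sampled in.
print(should_retain(failed_dep, 0))  # True
```

In a real application this logic would live in a telemetry processor registered with the SDK; the sketch only shows the retention rule.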
I'm testing my page speed several times every day. My page usually receives a score between 94 and 98, with the main problems being:
Eliminate render-blocking resources - Save 0.33 s
Defer unused CSS - Save 0.15 s
And in Lab data, all values are green.
Since yesterday, the score has suddenly dropped to the 80-91 range,
with the problems being:
Eliminate render-blocking resources - Save ~0.33 s
Defer unused CSS - Save ~0.60 s
It also says my First CPU Idle is slow (~4,5ms),
and so is Time to Interactive (~4.7).
Sometimes Speed Index is slow as well.
It has also started to show the "Minimize main-thread work" advice, which didn't appear earlier.
The thing is, I did not change anything on the page. It is still the same HTML, CSS and JS. This is also not a server issue; I don't have a CPU overuse problem.
On GTmetrix I'm still getting the same 100% score and the same 87% YSlow score, with the page being fully loaded somewhere between 1.1s and 1.7s, making 22 HTTP requests with a total size of 259kb, just like before.
On Pingdom I also get the same 91 grade as before, with page load speed around 622ms to 750ms.
Therefore, I can't understand this sudden change in the way Google analyzes my page.
I'm worried of course it will affect my rankings.
Any idea what is causing this?
It seems that this is a problem with the PageSpeed Insights site itself, as is now being reported in the pagespeed-insights-discuss Google Group:
https://groups.google.com/forum/#!topic/pagespeed-insights-discuss/luQUtDOnoik
The point is that if you test your performance directly with another Lighthouse web test, for example:
https://www.webpagetest.org/lighthouse
you will see your previous scores.
In our case, this site always scored 90+ on mobile, but its PageSpeed Insights score has now dropped to 65+:
https://developers.google.com/speed/pagespeed/insights/?url=https%3A%2F%2Fwww.test-english.com%2F&tab=mobile
but it still remains 90+ in webpagetest.org: https://www.webpagetest.org/result/190204_2G_db325d7f8c9cddede3262d5b3624e069/
This bug was acknowledged by Google and now has been fixed. Refer to https://groups.google.com/forum/#!topic/pagespeed-insights-discuss/by9-TbqdlBM
From Feb 1 to Feb 4 2019, PSI had a bug that led to lower performance
scores. This bug is now resolved.
The headless Chrome infrastructure used by PageSpeed Insights had a
bug that reported uncompressed (post-gzip) sizes as if they were the
compressed transfer sizes. This led to incorrect calculations of the
performance metrics and ultimately a lower score. The mailing list
thread titled [BUG] [compression] Avoid enormous network payloads /
Defer unused CSS doesn't consider compression covered this issue in
greater detail. Thanks for Raul and David for their help.
As of Monday Feb 4, 5pm PST, the bug is completely addressed via a new
production rollout.
I am working on a client proposal and they will need to upgrade their network infrastructure to support hosting an ASP.NET application. Essentially, I need to estimate peak usage for a system with a known quantity of users (currently 250). A simple answer like "you'll need a dedicated T1 line" would probably suffice, but I'd like to have data to back it up.
Another question referenced NetLimiter, which looks pretty slick for getting a sense of what's being used.
My general thought is that I'll fire the web app up and use the system the way I would anticipate it being used at the customer, at a leisurely pace, over a certain time span, and then multiply the bandwidth usage by the number of users and divide by the time.
This doesn't seem very scientific. It may be good enough for a proposal, but I'd like to see if there's a better way.
I know there are load tools available for testing web application performance, but it seems like these would not accurately simulate peak user load for bandwidth testing purposes (too much at once).
The platform is Windows/ASP.NET and the application is hosted within SharePoint (MOSS 2007).
In lieu of a good reporting tool for bandwidth usage, you can always do a rough guesstimate.
N = Number of page views in busiest hour
P = Average Page size
(N * P) / 3600 = Average traffic per second.
The server itself will have a lot more internal traffic for probably db server/NAS/etc. But outward facing that should give you a very rough idea on utilization. Obviously you will need to far surpass the above value as you never want to be 100% utilized, and to allow for other traffic.
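As a rough sketch of that guesstimate, the snippet below applies the formula and then adds headroom before comparing against a T1 line (1.544 Mbit/s). The page-view count, page size and the 2x headroom factor are hypothetical values picked for illustration, not figures from the question.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal capacity of a T1 line


def average_bytes_per_second(page_views_in_busiest_hour, average_page_bytes):
    """(N * P) / 3600: average outbound traffic during the busiest hour."""
    return page_views_in_busiest_hour * average_page_bytes / 3600


def required_bits_per_second(avg_bytes_per_second, headroom=2.0):
    """Convert to bits per second and add headroom so the link is never 100% utilized."""
    return avg_bytes_per_second * 8 * headroom


# Hypothetical inputs: 9000 page views in the busiest hour, 200 KB per page.
avg = average_bytes_per_second(9000, 200_000)
needed = required_bits_per_second(avg)

print(avg)                           # 500000.0 bytes per second
print(needed)                        # 8000000.0 bits per second
print(needed <= T1_BITS_PER_SECOND)  # False: a single T1 would not be enough here
```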
I would also not suggest using an arbitrary number like 250 users. Use the heaviest production day/hour as a reference. Double and triple if you like, but that will give you the expected distribution of user behavior if you have good log files/user auditing. It will help make your guesstimate more accurate.
As another commenter pointed out, a data center is a good idea when redundancy and bandwidth availability become a concern. Your needs may vary, but do not dismiss the suggestion lightly.
There are several additional questions that need to be asked here.
Is it 250 total users, or 250 concurrent users? If concurrent, is that 250 peak, or 250 typically? If it's 250 total users, are they all expected to use it at the same time (eg, an intranet site, where people must use it as part of their job), or is it more of a community site where they may or may not use it? I assume the way you've worded this that it is 250 total users, but that still doesn't tell enough about the site to make an estimate.
If it's a community or "normal" internet site, it will also depend on the usage - eg, are people really going to be using this intensely, or is it something that some users will simply log into once, and then forget? This can be a tough question from your perspective, since you will want to assume the former, but if you spend a lot of money on network infrastructure and no one ends up using it, it can be a very bad thing.
What is the site doing? At the low end of the spectrum, there is a "typical" web application, where you have reasonable size (say, 1-2k) pages and a handful of images. A bit more intense is a site that has a lot of media - eg, flickr style image browsing. At the upper end is a site with a lot of downloads - streaming movies, or just large files or datasets being downloaded.
This is getting a bit outside the scope of your question, but another thing to look at is the future of the site: is the usage possibly going to double in the next year, or month? Be wary of locking into a long-term contract with something like a T1 or fiber connection without having some way to upgrade.
Another question is reliability - do you need redundancy in connections? It can cost a lot up front, but there are ways to do multi-homed connections where you can balance access across a couple of links, and then just use one (albeit with reduced capacity) in the event of failure.
Another option to consider, which effectively lets you completely avoid this entire question, is to just host the application in a datacenter. You pay a relatively low monthly fee (low compared to the cost of a dedicated high-quality connection), and you get as much bandwidth as you need (eg, most hosting plans will give you something like 500GB transfer a month, to start with - and some will just give you unlimited). The datacenter is also going to be more reliable than anything you can build (short of your own 6+ figure datacenter) because they have redundant internet, power backup, redundant cooling, fire protection, physical security.. and they have people that manage all of this for you, so you never have to deal with it.