
How to allow PageSpeed Insights to access HTTP basic authentication sites?
It returns the following error even though I provided the username and password in the format below.
Lighthouse returned error: FAILED_DOCUMENT_REQUEST. Lighthouse was unable to reliably load the page you requested. Make sure you are testing the correct URL and that the server is properly responding to all requests. (Details: net::ERR_ACCESS_DENIED)
https://username:password@www.example.com
e.g. curl https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://username:password@www.example.com

This method does not seem to work anymore: https://username:password@website.com
However, you can run the same audit through Lighthouse, which provides the same results (see the sketch below).
GTMetrix is another way to run a page speed check.
You can pass the URL to it like this: https://username:password@website.com
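If you go the Lighthouse route, here is a minimal sketch of running it from the command line against a basic-auth-protected site (assuming the Lighthouse CLI is installed via npm; substitute your own credentials and URL):
# Base64-encode the credentials for the Authorization header
AUTH=$(printf 'username:password' | base64)
# Run Lighthouse and send the header with every request it makes
lighthouse https://www.example.com \
  --extra-headers "{\"Authorization\": \"Basic $AUTH\"}" \
  --output html --output-path ./report.html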

Our method is to send the URL as https://username:password@website.com, with the same results. Normally this is a sufficient bypass for other dev tools and scanners, but PageSpeed Insights cannot handle it.

Related

Azure Content Moderator Portal - Unable to load Azure Media Services Video

We are creating video reviews in the review tool using the code here and everything used to work before (months back).
Now the only problem we are facing is loading the video on the review tool.
From the console on Chrome, it says CORB blocked the response:
Cross-Origin Read Blocking (CORB) blocked cross-origin response https://REDACTED.streaming.media.azure.net/REDACTED/ignite_c_c.ism/manifest with MIME type application/vnd.ms-sstr+xml. See https://www.chromestatus.com/feature/5629709824032768 for more details.
I can also see 0 B responses, and a similar error appears on Firefox.
But if I paste the same video manifest URL into the Azure Media Test Tool, it works fine there.
Any help to fix the video loading issue would be greatly appreciated.
If you were able to use the same code without any changes until a few months ago, a browser update may be the cause (unless you have updated endpoints or headers in your cross-site access policies). Refer to Configure CDN profile.
However, "CORB" referred above seems similar to CORS (Cross Origin Resource Sharing).
It is an HTTP feature that enables a web application running under one domain to access resources in another domain. In order to reduce the possibility of cross-site scripting attacks, all modern web browsers implement a security restriction known as same-origin policy. This prevents a web page from calling APIs in a different domain. CORS provides a secure way to allow one origin (the origin domain) to call APIs in another origin.
CORS on Azure CDN works automatically with no additional configuration. When you create a new account, Azure CDN integration is enabled by default on the default streaming endpoint. If you later want to disable or enable the CDN, your streaming endpoint must be in the stopped state. It can take up to two hours for the Azure CDN integration to become enabled and for the changes to be active across all the CDN POPs.
You might want to start by using a wildcard (*) to set up the HTTP header, which effectively relaxes CORS and allows any origin to access the CDN endpoint.
Refer: Using Azure CDN with CORS
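As a quick sanity check, you can inspect which CORS headers the streaming endpoint actually returns (a sketch; the manifest URL is the redacted one from the question and the Origin value is a placeholder for wherever the review tool is hosted):
# Request the manifest with an Origin header and dump the response headers,
# then filter for any CORS-related headers
curl -s -D - -o /dev/null \
  -H "Origin: https://your-review-tool.example.com" \
  "https://REDACTED.streaming.media.azure.net/REDACTED/ignite_c_c.ism/manifest" \
  | grep -i "access-control"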
Caution: The Content Moderator Review tool is now deprecated and will be retired on 12/31/2021.
Video moderation enables detection of potential adult content in videos. The review tool internally calls the automated moderation APIs and presents the items for review right within your web browser.
There are multiple indications:
SameSite cookie flag error
No decoders for requested formats
CORB error
You can give this a try though:
Set the "SameSite by default cookies" flag to Disabled in Chrome 80 and later versions:
In your Chrome browser, go to chrome://flags/ and search for the flag "SameSite by default cookies".
Select Disabled.

Firebase Hosting returning 500 internal error for Googlebot user-agent when using Google Chrome's "network conditions" tab?

I've got the following set up on my Firebase web app (it's a Single Page App built with React):
I'm doing SSR for robot user agents, so they get fully rendered HTML and no JavaScript.
Regular users get the empty HTML and the JavaScript that runs the app.
firebase.json
"rewrites": [{
"source": "/**",
"function": "ssrApp"
}]
Basically every request should go into my ssrApp function, which detects robot crawlers' user agents and decides whether to respond with the SSR version for robots or the JS version for regular users.
It is working as intended. Google is indexing my pages, and I always log some info about the user agents from my ssrApp function. For example, when I share a URL on WhatsApp, I can see the WhatsApp crawler in my logs in the Firebase Console.
But the weird thing is that I'm not able to mimic Googlebot using Chrome's Network Conditions tab:
When I try to access my site using Googlebot's user agent, I get a 500 - Internal error,
and my ssrApp function isn't even triggered, since nothing is logged from it.
Is this a Firebase Hosting built-in protection to avoid fake Googlebots? What could be happening?
NOTE: I'm trying to mimic Googlebot's user agent because I want to inspect the SSR version of my app in production. I know that there are other ways to do that (including some Google Search Console tools, or simply requesting the page with a spoofed user agent from the terminal, as sketched below), but I thought that this would work.
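For reference, a minimal sketch of that terminal check, to see whether the 500 also happens outside Chrome (the hosting URL is a placeholder; substitute your own site):
# Fetch the page while presenting Googlebot's user agent; print status and headers
curl -s -D - -o body.html \
  -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" \
  "https://your-app.web.app/"
# body.html should contain the server-rendered HTML if the SSR branch was taken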
Could you check that your pages are still in the Google index? I have the exact same experience and 80% of my pages are now gone...
When I look up a page in Google Search Console https://search.google.com/search-console it indicates there was an issue during the last crawl. When I "Test it Live" it spins and reports the error 500 as well and asks to "try again later"...
I had a related issue, so I hope this could help anyone who encounters what I did:
I got the 500 - Internal error response for paths that were routed to a Cloud Run container when serving a Googlebot user agent. As it happens, if the Firebase Hosting CDN happened to have a cached response for the path, it would successfully serve the cached response; but if it didn't have a cached response, the request would not reach the Cloud Run container, it would fail at the Firebase Hosting CDN with 500 - Internal error.
It turns out that the Firebase Hosting CDN secretly probes and respects any robots.txt that is served by the Cloud Run container itself (not the Hosting site). My Cloud Run container served a robots.txt which disallowed all access to bots.
Seemingly, when the Hosting CDN attempts to serve a request from a bot from a path that is routed to a Cloud Run container, it will first probe any /robots.txt that is accessible at the root of that container itself, and then refuse to send the request to the container if disallowed by the rules therein.
Removing/adjusting the robots.txt file on the Cloud Run container itself immediately resolved the issue for me.
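A quick way to check what the container itself serves (a sketch; the *.run.app URL is a placeholder for your Cloud Run service's direct URL, which bypasses Firebase Hosting):
# Inspect the robots.txt served by the Cloud Run service directly
curl -s "https://your-ssr-service-abc123-uc.a.run.app/robots.txt"
# If it contains "Disallow: /" for all user agents, the Hosting CDN will
# refuse to forward bot traffic to the container, per the answer above.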

Chrome browser: find calls to APIs?

I am using Chrome for some web scraping. How can I manually find the calls to API endpoint URLs?
The way I do it now is going to the Network tab and looking link by link. Is there a better way to quickly identify the calls to backend endpoints?
Thanks
You can use Fiddler or similar network sniffers.
Start Fiddler, then as you navigate to the website of interest, Fiddler will show you all the URLs that are sent to the website, as well as the website's responses.

Linkedin API not working on any Google App Engine sites

When I try to use LinkedIn to log in to my site on Google App Engine, I get a 999 error. I think it must be blocked there, because the login works fine on my local machine.
Some other sites on App Engine seem to have the same problem. My only conclusion is that the IP range of App Engine has been banned by LinkedIn, on purpose or by accident. I think it must be by accident, given how many sites this would affect.
I do not think it is related only to Google App Engine; it is rather a fairly strict blocking policy at LinkedIn. Check out this post: 999 Error Code on HEAD request to LinkedIn
It seems that LinkedIn also blocks requests based on user agent.
and HTTP Error 999: Request denied
and How to avoid "HTTP/1.1 999 Request denied" response from LinkedIn?
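To see whether the block is user-agent-based in your case, here is a sketch for comparing status codes with and without a browser-like user agent (results will vary; LinkedIn may still block by IP range):
# Default curl user agent (reportedly often rejected with 999)
curl -s -o /dev/null -w "%{http_code}\n" https://www.linkedin.com/
# Browser-like user agent for comparison
curl -s -o /dev/null -w "%{http_code}\n" \
  -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
  https://www.linkedin.com/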

google analytics - Googlebot can't access site

Sorry if this is the incorrect area.
I develop websites using Yootheme and Rockettheme. They have an area in the backend where you simply enter the tracking code from Analytics.
However, lately I'm getting emails stating the following:
Message summary
Webmaster Tools sent you the following important messages about sites in your account. To keep your site healthy, we recommend regularly reviewing these messages and addressing any critical issues.
http://www.anigmabeauty.co.nz/: Googlebot can't access your site
Over the last 24 hours, Googlebot encountered 1 errors while attempting to connect to your site. Your site's overall connection failure rate is 50.0%.
You can see more details about these errors in Webmaster Tools.
I've deleted them and re-added the websites; it works for a while, then does the same thing. Any ideas on how to fix this?
When I open http://www.anigmabeauty.co.nz/ I get a 303 See Other redirect. That may cause the issue, since a 303 redirect is not good for SEO.
Try replacing the 303 redirect with a 302 redirect.
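A quick sketch for confirming which status code the site returns, both for a plain request and when presenting Googlebot's user agent (using the URL from the question):
# Status code and Location header for a plain request
curl -sI http://www.anigmabeauty.co.nz/ | head -n 5
# Same check presenting Googlebot's user agent
curl -sI -A "Googlebot/2.1 (+http://www.google.com/bot.html)" \
  http://www.anigmabeauty.co.nz/ | head -n 5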
