Always Got Lower Performance Score with Encoded URL Parameter - pagespeed

Can anyone explain why the PageSpeed Insights API returns a lower performance score when the url parameter is URL-encoded?
I called the API 100 times with an encoded url parameter versus an unencoded one, and the results were as follows:
Encoded URL parameter:
TP90 performance score: 74
pagespeed request: https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https%3A%2F%2Fm.ctrip.com%2Fwebapp%2Fflight%2Fschedule%2Fdetail.html%3FhideAddTrip%3Dtrue%26isHideNavBar%3DYES%26navBarStyle%3Dgray%26flightNo%3DNH7018%26queryDate%3D2019-11-10%26dcode%3DNRT%26acode%3DLAX&strategy=mobile&key=AIzaSyA-AeDYHQr1ufyzqpq2sbb2tWqPoS-tjTo
Normal URL parameter:
TP90 performance score: 90
pagespeed request: https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://m.ctrip.com/webapp/flight/schedule/detail.html?hideAddTrip=true&isHideNavBar=YES&navBarStyle=gray&flightNo=NH7018&queryDate=2019-11-10&dcode=NRT&acode=LAX&strategy=mobile&key=AIzaSyA-AeDYHQr1ufyzqpq2sbb2tWqPoS-tjTo
The other parameters are exactly the same.
Thanks in advance for any answers.

You get different results because you actually end up testing two different URLs:
https://m.ctrip.com/webapp/flight/schedule/detail.html?hideAddTrip=true&isHideNavBar=YES&navBarStyle=gray&flightNo=NH7018&queryDate=2019-11-10&dcode=NRT&acode=LAX
and
https://m.ctrip.com/webapp/flight/schedule/detail.html?hideAddTrip=true
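By not URL-encoding the url value in your request to PSI, you lose everything after the first &. Each unencoded & is parsed as a separator in the API request's own query string, so isHideNavBar, navBarStyle and the rest are sent to the PSI API as its own parameters instead of being part of the tested URL.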
I am unsure why this is the case as I would have expected the opposite behaviour.

Related

Added random parameter for XSHM fix. Is there a limit to the length of a URL on IIS/ASP.NET?

At my organization, we have implemented a suggestion for fixing Cross-Site History Manipulation by appending a random GUID to the end of the URL on a redirect.
For example:
Response.Redirect($"{path}&paramX={Guid.NewGuid():N}");
So if the user has visited the page https://www.example.com/default.aspx then the redirect behavior would be the following:
Response.Redirect("https://www.example.com/default.aspx?&paramX=d11712a771294de8a6fc0c66e92954fc");
The issue or question comes into play if the Redirect happens when the user has already been redirected once or multiple times. In that case, duplicate params will be appended each time such as the following:
Response.Redirect("https://www.example.com/default.aspx?&paramX=d11712a771294de8a6fc0c66e92954fc&paramX=ff4bc6a838684b198060c70091b300e2");
Is there a limit on the URL length this could run into if excessive redirects happen?
If so, my solution would be to use a regex to detect whether the param already exists and, if it does, replace it with Regex.Replace() rather than appending it each time.
Yes, there is a limit on URL length, and it depends on the browser the user is running. Here are some commonly cited browser limits for reference:
IE: no more than 2048 bytes
Chrome: no more than 8182 bytes
Firefox: no more than 65536 bytes
Safari: no more than 80000 bytes
So it is not easy to exceed the URL length limit, but I still suggest changing the code so that it does not keep appending the same param to the URL.
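The replace-instead-of-append idea can also be done by parsing the query string rather than with a regex. The sketch below is in Python for illustration only (the question uses ASP.NET); the helper name with_fresh_token is hypothetical, and uuid4().hex stands in for Guid.NewGuid().ToString("N").

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_fresh_token(url, name="paramX"):
    """Return url with exactly one `name` parameter holding a new random value."""
    parts = urlsplit(url)
    # Keep every existing parameter except earlier copies of the token.
    pairs = [(k, v)
             for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != name]
    pairs.append((name, uuid.uuid4().hex))  # 32 hex characters, no dashes
    return urlunsplit(parts._replace(query=urlencode(pairs)))


url = "https://www.example.com/default.aspx?a=1"
for _ in range(5):  # simulate five consecutive redirects
    url = with_fresh_token(url)
print(url)  # still contains a single paramX, so the length does not grow
```

Because old copies are dropped before the new value is added, the URL length stays constant no matter how many redirects occur.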

Google Optimize Experiment targeting wrong URL

I've set up an experiment on a specific URL to which I send no traffic
(same domain name that I use for other landing pages, but with a different parameter in the URL).
I started the experiment a few days ago without sending any traffic,
and now I see that the experiment was triggered around 5000 times.
I double-checked my analytics reports and I see no visits to the main page that is supposed to trigger the test. To explain with an example:
This is what I have running:
http://domain1/landingpages?id=1
http://domain1/landingpages?id=2
this is the test that I created:
http://domain1/landingpages?id=3
with a 50% redirect on:
http://domain1/landingpages?id=4
The experiment should only be triggered on the id=3 page, but it was also triggered on the id=1 and id=2 pages. Any idea how I can make the trigger fire only when "id=3" is in the URL?
Currently my configuration is as follows:
WHEN URL Matches "http://domain1/landingpages?id=3"
The URL targeting documentation explains your situation (emphasis mine):
Use matches when there are query string parameters in URLs that you
don’t want to include in the matching. Matches can be more flexible
than equals because it adheres to the following rules:
Ignores query string parameters and fragments.
Case insensitive.
Normalized to remove a www. prefix.
Normalized to remove a trailing slash.
HTTP and HTTPS are optional (HTTP will match HTTPS).
I verified this behaviour in Optimize.
The simplest fix is to select the Equals operator and use http://domain1/landingpages?id=3 as the value.
If other parameters might occur, you could instead build a regex that matches id=3 among various parameters, e.g.:
http:\/\/domain1\/landingpages\?(.*&)?id=3(&|$)
Optionally, you can use Query parameter targeting and build one rule for the base URL and a separate rule for the id parameter.
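Before putting the regex into Optimize, it can be checked offline. The sketch below runs the pattern from this answer through Python's re module against a few sample URLs; Optimize's own regex engine may differ in details, so treat this as a sanity check only. The sample URLs beyond those in the question are made up for illustration.

```python
import re

# The pattern suggested above, unchanged.
PATTERN = re.compile(r"http:\/\/domain1\/landingpages\?(.*&)?id=3(&|$)")


def triggers(url):
    """True if the URL would satisfy the regex rule."""
    return PATTERN.search(url) is not None


samples = [
    "http://domain1/landingpages?id=1",
    "http://domain1/landingpages?id=2",
    "http://domain1/landingpages?id=3",
    "http://domain1/landingpages?utm=x&id=3&foo=1",  # extra parameters (made up)
    "http://domain1/landingpages?id=30",             # must not match id=3
]
for url in samples:
    print(triggers(url), url)
```

The (.*&)? group allows other parameters before id, and (&|$) prevents id=30 or id=31 from matching.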

Meaning of =3D in malicious URLs

My server logs show many attempts to access non-existent pages. These are the "usual" bots scanning for known vulnerabilities. Many of the URLs contain =3D, e.g.
/?q=3Duser%2Fpassword&name%5B%23p=
/user/register/?element_parents=3Daccou=
/wp-admin/admin-post.php?swp_debug=3Dlo=
%3D is the url encoded value of = so I would expect to find %3D within the URL but not =3D. However, =3D can be found all over the logs. What is the meaning of this?
=3D is the Quoted-Printable encoding of ASCII 0x3D, the equals sign character (=).
You don't usually see this in URLs, because it is not the normal encoding to use there. Quoted-Printable is a standard MIME content transfer encoding, an alternative to base64. It looks like the request is expecting the app to decode the query string using Quoted-Printable, and then use the resulting path in some further redirect.
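To see the two encodings side by side, here is a minimal Python sketch using the standard quopri module. The input is the start of the first log line from the question; that the bots' payloads are meant to be decoded this way remains the answer's guess.

```python
import quopri
from urllib.parse import unquote

logged = b"q=3Duser%2Fpassword"  # fragment of the first logged URL

# Step 1: Quoted-Printable decoding turns "=3D" into "=".
qp_decoded = quopri.decodestring(logged)
print(qp_decoded)

# Step 2: ordinary URL decoding turns "%2F" into "/".
url_decoded = unquote(qp_decoded.decode("ascii"))
print(url_decoded)
```

Quoted-Printable uses = as its escape character, while URL encoding uses %, which is why the same character appears as =3D in one scheme and %3D in the other.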

Should HTTP redirects be used to correct misleading URLs?

I have some logic in an MVC controller which can result in a URL parameter being ignored, leaving the URL potentially misleading.
As an illustration:
If the GET handler logic in the controller for the following URL:
http://example.com/results?sortByField=10&search=full&locationId=1
ignores the value of sortByField and calculates a value from the other parameters,
e.g.
if (search == "full" && locationId == 1)
    // sort results by field 1
else
    // sort results by the sortByField parameter
This means that the URL implies that the results are sorted by field 10 while the actual sort field is 1.
One solution to this would be to do a 302 redirect to the original URL modified to have sortByField=1 from within the if statement above. This would lead to a clean URL which reflects the behaviour of the page but results in an additional round trip and also doesn't seem to fully fit the definition of a 302 redirect.
Any thoughts on whether this even matters and the pros and cons of using a redirect appreciated.
If you're canonicalising URLs, 301 is the status code to use. A client that has requested:
http://example.com/results?sortByField=10&search=full&locationId=1
and been redirected to:
http://example.com/results?sortByField=1&search=full&locationId=1
should always use the second URL in future.
Regarding "Any thoughts on whether this even matters":
If you're expecting your clients to examine the URLs they're using then it's a nice way to self-document. There can also be benefits for caching.
It's certainly REST-fully valid; whether the extra round-trip is worth it really depends on the performance concerns of the site you're running.
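The canonicalisation step can be sketched as a small function that computes the canonical URL and tells the handler whether a 301 is needed. This is written in Python for illustration rather than in the asker's MVC framework; the function names are hypothetical and the rule is the one from the question (search=full with locationId=1 forces sorting by field 1).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url):
    """Rewrite sortByField so the URL reflects the sort that is actually applied."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    params = dict(pairs)
    if params.get("search") == "full" and params.get("locationId") == "1":
        # The handler ignores the requested field here and sorts by field 1.
        pairs = [(k, "1" if k == "sortByField" else v) for k, v in pairs]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def redirect_for(url):
    """Return (301, target) if the URL is not canonical, otherwise None."""
    target = canonical_url(url)
    return None if target == url else (301, target)


print(redirect_for("http://example.com/results?sortByField=10&search=full&locationId=1"))
print(redirect_for("http://example.com/results?sortByField=1&search=full&locationId=1"))
```

The second call returns None, so a client that follows the redirect once is not redirected again.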

RCurl::url.exists() : how to get non-error for redirects (in the 300 range of HTTP status codes)

I have a bunch of URLs extracted by text-mining some PDF documents. Now I want to test the URLs for validity. Some URLs have junk characters inside or appended, or are truncated. One approach is to filter them by requesting each of them.
To do that, I use the url.exists() function from the RCurl package. The function makes HTTP HEAD requests to urls using curl and checks the status code.
From the documentation of ?url.exists
If ‘.header’ is ‘FALSE’, this returns ‘TRUE’ or ‘FALSE’ indicating
whether the request was successful (had a status with a value in
the 200 range).
How can I make it return TRUE for URLs that issue a redirect? Redirect status codes are in the 300 range, and they are not really errors.
Or is there a better way, such as grabbing the actual status codes and processing them manually? Should I use a system command here?
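The "grab the status code and process it manually" approach can be sketched as follows. This is Python rather than R and is not RCurl's API; it only illustrates the logic of sending a HEAD request without following redirects and accepting both the 200 and 300 ranges. The function names are hypothetical.

```python
import http.client
from urllib.parse import urlsplit


def counts_as_existing(status):
    """Treat success (2xx) and redirect (3xx) status codes as 'URL exists'."""
    return 200 <= status < 400


def head_status(url, timeout=10):
    """Send a single HEAD request (redirects are not followed) and return the status."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def url_exists(url):
    """False for malformed URLs, network errors, and 4xx/5xx responses."""
    try:
        return counts_as_existing(head_status(url))
    except (OSError, http.client.HTTPException, ValueError):
        return False
```

Some servers reject HEAD requests outright, so a URL reported as missing here may still be worth retrying with GET.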

Resources