How to bypass Google's cookie consent redirect when making a request with Guzzle - symfony

I am using Guzzle to pull content from the end location of a Google RSS feed link, e.g.
https://news.google.com/rss/articles/CBMiVWh0dHBzOi8vd3d3LmxvbmRvbi1maXJlLmdvdi51ay9pbmNpZGVudHMvMjAyMy9qYW51YXJ5L21haXNvbmV0dGUtZmlyZS1zdHJlYXRoYW0taGlsbC_SAQA?oc=5
When using curl with the -L (location) flag, it appears to bypass the consent redirect and pulls through the end location content.
I am using Drupal 10 with httpclient available, which I understand uses Guzzle 7. How do I do the same there?
When enabling Guzzle's 'track redirects' feature, I can see that it appears to get stuck redirecting to the Google consent page and never reaches the end location.
e.g.
An AJAX HTTP error occurred. HTTP Result Code: 200 Debugging information follows. Path: /batch?id=328&op=do_nojs&op=do StatusText: parsererror ResponseText: Redirecting https://news.google.com/rss/articles/CBMiTmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1ONIBUmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1OC5hbXA?oc=5 to https://consent.google.com/m?continue=https://news.google.com/rss/articles/CBMiTmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1ONIBUmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1OC5hbXA?oc%3D5&gl=GB&m=0&pc=n&hl=en-US&src=1 Redirecting https://consent.google.com/m?continue=https://news.google.com/rss/articles/CBMiTmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1ONIBUmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1OC5hbXA?oc%3D5&gl=GB&m=0&pc=n&hl=en-US&src=1 to https://news.google.com/rss/articles/CBMiTmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1ONIBUmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1OC5hbXA?oc=5&ucbcb=1 Redirecting https://news.google.com/rss/articles/CBMiTmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1ONIBUmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1OC5hbXA?oc=5&ucbcb=1 to https://news.google.com/rss/articles/CBMiTmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1ONIBUmh0dHBzOi8vd3d3Lm15bG9uZG9uLm5ld3MvbmV3cy9wcm9wZXJ0eS9pbS1lc3RhdGUtYWdlbnQtcmVudGluZy1zb3V0aC0yNjA2MDI1OC5hbXA?oc=5&ucbcb=1&hl=en-GB&gl=GB&ceid=GB:en
This appeared to work fine before updating to Drupal 10, which also includes a Symfony 4 to 6 update behind the scenes, so I am not sure whether that is related.

After looking into this a little more, I believe the issue I was having has more to do with Google using JavaScript to handle the redirect. I tested this by turning JavaScript off in the browser; with it disabled, the redirect does not work. This is in combination with all News links in RSS feeds now pointing to Google first rather than to the final source.
So to overcome this I have had to add an additional step that extracts the URL from this middle page, which I can then use to do the final lookup.
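A minimal sketch of that extraction step, written in Python rather than the Drupal/Guzzle code actually used. It assumes the middle page exposes the destination as an ordinary <a href> link, which the post does not confirm; the function name extract_target_url and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _LinkCollector(HTMLParser):
    """Collects every href value found on <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_target_url(html, skip_domain="google.com"):
    """Return the first absolute link that does not point at skip_domain.

    Assumption: the interstitial page links to the final source with a
    plain <a href="...">; the real page structure may differ.
    """
    parser = _LinkCollector()
    parser.feed(html)
    for link in parser.links:
        if not link.startswith(("http://", "https://")):
            continue  # ignore relative links and other schemes
        host = urlparse(link).hostname or ""
        if host == skip_domain or host.endswith("." + skip_domain):
            continue  # still on the intermediate site
        return link
    return None


# Hypothetical middle page used only to exercise the function.
sample_page = (
    '<html><body><a href="https://news.google.com/home">Home</a>'
    '<a href="https://www.example.org/story">Story</a></body></html>'
)
target = extract_target_url(sample_page)
```

The returned URL would then be fetched in a second request to get the final content.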

Related

WordPress: First login of a new browser session always fails

I'm working on a WordPress website: https://samarazakaz.ru/
The client discovered a strange bug. After freshly opening a browser, the first login always fails and the second one succeeds.
I tracked down the issue to a strange cookie with the name RCPC that is being set when the login form is submitted. If the cookie is missing then the login fails regardless of proper credentials.
I searched high and wide for any information about this cookie but could not find anything useful. The only thing remotely resembling my case was a discussion on a site called https://codeforces.com/ , but nothing there mentioned anything related to WordPress.
The site has a bare-bones setup with Elementor and my own plugin, and nothing in my code messes with cookies or the login process. I downloaded all the website files and searched them for "RCPC" but found nothing.
The site is behind an Nginx proxy, but I could not find any connection with this cookie and Nginx either.
I noticed that the value of this cookie is constant. So, as a workaround, I jury-rigged my plugin to set this cookie whenever it's not set. But, of course, I'm not very happy with that solution because I don't know whether it will just stop working one day.
Update:
I verified that this is coming from the hosting. I renamed the /wp-login.php file and made a request to it, and it didn't return a 404 error but a 200 page with the same redirect code and the header to set the cookie. The hosting is reg.ru.
As far as I can figure, this is a countermeasure against automated password guessing. Any request (POST, GET, etc.) to /wp-login.php gets the redirect script with the cookie-setting header. Only requests containing the correct RCPC cookie get forwarded.
Upon further testing, I found that the value of the RCPC cookie is some kind of hash generated from the request's IP address: all of our computers got the same value, but from other locations it's different.
This does not cause any problem if the standard WordPress login form is used because that lives at the /wp-login.php address, so the first GET request will generate the cookie. However, we had a custom login page which didn't access /wp-login.php until the form was submitted.
Based on these discoveries I made a workaround: a one-line JS script on the login page makes a (fetch) request to /wp-login.php and simply discards the result. This is enough to set the cookie in the browser so that the form works on the first try.
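Why the warm-up request helps can be illustrated with a toy model in Python. Everything here is hypothetical: the gate function, the hash and the IP addresses are made up for illustration, since the post only establishes that the cookie value depends on the client IP, not how the hosting computes it.

```python
import hashlib

COOKIE_NAME = "RCPC"


def expected_cookie(client_ip):
    # Hypothetical derivation: the real scheme used by the hosting is unknown.
    return hashlib.sha256(client_ip.encode()).hexdigest()[:16]


def gate(client_ip, cookies):
    """Model of the hosting filter in front of /wp-login.php.

    Returns (status, cookies_to_set). Requests without the right cookie
    get a challenge that sets it; requests with it are forwarded.
    """
    token = expected_cookie(client_ip)
    if cookies.get(COOKIE_NAME) == token:
        return "forwarded", {}
    return "challenge", {COOKIE_NAME: token}


def request(client_ip, jar):
    """Send one request with the browser's cookie jar and store any new cookies."""
    status, new_cookies = gate(client_ip, jar)
    jar.update(new_cookies)
    return status


jar = {}                               # fresh browser session, no cookies yet
first = request("203.0.113.7", jar)    # warm-up fetch (or the failed first login)
second = request("203.0.113.7", jar)   # the actual login POST
```

In this model the first request only collects the cookie, which matches the observation that the second login attempt succeeds.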
You need to disable test-cookie-module on the hosting.

The problem with the transfer of the website to symfony

After moving the site to a new hosting, there is a problem: the site issues a redirect to the old hosting. nginx is currently set up to send requests to the clean servers, but a redirect is still produced. The redirect itself happens on the way from app.php to HttpKernel.php: in handler (......) there is a call to events, $this->dispatcher->dispatch(.................), which produces the redirect and does not let the request continue. If I remove this call, the page is built but without data from the database, and there is a 404 "page not found" error. When the page loads, the kernel.request and security.authentication.success events are dispatched, and with these parameters a redirect is produced.
Check the kernel request events. You may have something hardcoded somewhere. I don't know much about your Symfony version, but you can debug events with php bin/console debug:event-dispatcher kernel.exception. After you do this, post the code here and we might be able to help. And yes, the question is worded very poorly.

linkedin : Invalid redirect_uri. This value must match a URL registered with the API Key

I am using 'omniauth-linkedin-oauth2'.
When I log in with LinkedIn, I get this error:
Invalid redirect_uri. This value must match a URL registered with the API Key.
These are my settings:
I went back to the LinkedIn developer site (https://www.linkedin.com/secure/developer ) to check my settings again. Everything matched: API Key, Secret Key and OAuth 2.0 Redirect URLs.
I searched the web for clues but couldn't find any. A crazy issue.
Then I saw that in the URL Owin was appending an extra string, "signin-linkedin", to the redirect_uri. When I decoded the URL I saw http://localhost:54307/signin-linkedin . I took this URL and placed it in the OAuth 2.0 Redirect URLs field on the LinkedIn developer site.
This link is helpful for me
https://naveengopisetty.wordpress.com/2014/09/15/linkedin-oauth-2-0-issue-invalid-redirect_uri-this-value-must-match-a-url-registered-with-the-api-key/
You can just look at the URL on which you are getting that error message.
E.g. if you are using Python's social auth, the URL would look like this:
https://www.linkedin.com/uas/oauth2/authorization?scope=r_basicprofile+r_emailaddress&state=XXXXXX&redirect_uri=http://example.com.au/sa/complete/linkedin-oauth2/&response_type=code&client_id=YYYYYYY
So you would use this part of the above URL as the redirect URL:
http://example.com.au/sa/complete/linkedin-oauth2/
Please check your redirect_uri. In my case it looks like this:
https://www.linkedin.com/uas/oauth2/authorization?response_type=code&client_id=77k93y0w31zaey&redirect_uri=http%3A%2F%2Flocalhost%3A1729%2Fsignin-linkedin&scope=r_basicprofile%2Cr_emailaddress&state=nhAC-nR-CgEwO3XS2ezANhuPBMz-IUmLPJYgGHlZvZ8B1pCfsGBU0PR0dZ5XxE4zbyeI0RLcKByqPLKkgQdqMm4s6DjFYqMCEehYA2iWT9MfioEHjPXGCt2USxUTF0wKBpflCUjG5URVlJa3qI7U3ydFOErZ4Hhnr9SVmKdf1bithYfbOqBx345o8LQLexbddQ687vP6y0szrIyCM6FHip1tCpOY3Hgg5FJQEFH1mCJ_yLunD5vDUN4VVfkQbcjk
For this I added the following URL to the OAuth 2.0 Authorized Redirect URLs:
http://localhost:1729/signin-linkedin
where http://localhost:1729 is the base URL and
signin-linkedin is the string that is appended after the base URL.
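Reading the redirect_uri out of the authorization URL can also be done programmatically. A small Python sketch using only the standard library; the helper name extract_redirect_uri is hypothetical, and the sample URL is a shortened version of the one above with placeholder client_id and state values.

```python
from urllib.parse import urlparse, parse_qs


def extract_redirect_uri(authorization_url):
    """Return the decoded redirect_uri query parameter, or None if absent."""
    query = parse_qs(urlparse(authorization_url).query)
    values = query.get("redirect_uri")
    return values[0] if values else None


# Shortened sample; client_id and state are placeholders.
auth_url = (
    "https://www.linkedin.com/uas/oauth2/authorization"
    "?response_type=code&client_id=XXXX"
    "&redirect_uri=http%3A%2F%2Flocalhost%3A1729%2Fsignin-linkedin"
    "&scope=r_basicprofile%2Cr_emailaddress&state=YYYY"
)
redirect_uri = extract_redirect_uri(auth_url)
print(redirect_uri)
```

The printed value is what has to be registered on the LinkedIn developer site.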
One more solution is to verify the client_id you've been using the whole time, because with every update to the list of redirect_uri values, the client_id gets updated.
Worth mentioning when one uses libraries to handle OAuth: some libraries fail to care about the protocol that is used (or at least require further parametrization). E.g., I gave LinkedIn https://example/callback as the OAuth2 URL, but the library sent the request with http://example/callback as the parameter.
I had this when trying to authorise from a zurb Reveal modal popup. In my case, the issue was the URL for the page that was being displayed in the popup was not in my OAuth2 Redirect URLs list on the LinkedIn developer site.
That was easy to miss because the page URL from the page in the modal is not the URL that was currently showing in the browser's address bar. Once I added the URL for the page being shown in the pop up it worked.
After spending hours I finally got to the solution. If you get this error, just check the URL and find redirect_uri. Copy its value and paste it into the OAuth2 redirect field of your LinkedIn dev account.
Make sure to add the redirect URL both with and without the trailing '/':
http://localhost:8000/oauth/complete/linkedin-oauth2
http://localhost:8000/oauth/complete/linkedin-oauth2/
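Producing both forms from one URL is a one-liner; a small Python sketch, where the helper name redirect_uri_variants is hypothetical:

```python
def redirect_uri_variants(uri):
    """Return the URI without and with a trailing slash, in that order."""
    base = uri.rstrip("/")
    return [base, base + "/"]


variants = redirect_uri_variants("http://localhost:8000/oauth/complete/linkedin-oauth2/")
for variant in variants:
    print(variant)
```

Both printed values would be registered as redirect URLs.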

Unknown 302 redirect at root. Crawlers cannot follow through

The whole thing started when I noticed that Facebook Debugger and other crawler tools are unable to parse my page. Facebook throws a critical error saying that it cannot follow the redirect. I believe search engine bots are hitting the same dead end. The website functions normally in all major web browsers.
It's probably worth mentioning that I am experimenting with ASP.NET Routing, using Web Forms under IIS8.
Given a website (http://example.com), here's what happens.
Case 1: Trying to access the root, this is what I get with a Web Sniffer simulator
Case 1 observations:
The first thing I noticed is a '302' redirect instead of '200 OK'. It gives a 302 redirect with or without the leading 'www'.
The second thing I noticed is that the Location header is simply "/", confirmed by the page from IIS that I cannot see with a regular browser, which says the page has moved to "/". I believe something goes wrong at this point and the crawler is unable to follow through for some reason.
Case 2: Trying to access a given category page with a Web Sniffer simulator
Case 2 observations:
As you might have figured out already, this is identical to case 1. Once again Facebook Debugger cannot get through it, resulting in a redirect it cannot follow.
Questions:
1: How can I force an absolute path in the Location headers instead of a relative one, and will this be sufficient for the crawlers to follow through?
2: What could be causing the 302 redirect in the first place, in both the www and non-www versions of the website?
Your web application is most likely depending on a cookie. The application sends a Set-Cookie header and redirects to the same page in order to receive a new request with the cookie data available. Search engines and bots, the Facebook bot and your Web Sniffer simulator will not send that cookie data, and hence the web application keeps sending 302 redirect responses.
The solution is to change your application so that it does not require cookies simply to view your web pages.
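A quick way to confirm that a response redirects to the page that was just requested is to resolve the Location header against the request URL, which also covers the relative "/" case from the question. A Python sketch; the function name is_self_redirect is hypothetical and the inputs are taken from the scenario described above.

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


def is_self_redirect(request_url, status, location):
    """True if the response redirects back to the URL that was requested.

    urljoin resolves a relative Location such as "/" against the request
    URL, so relative and absolute headers are compared the same way.
    """
    if status not in REDIRECT_CODES or not location:
        return False
    return urljoin(request_url, location) == request_url


# The root page answering 302 with Location: "/" points back at itself.
looping = is_self_redirect("http://example.com/", 302, "/")
print(looping)
```

A client that never sends the cookie back, such as a crawler, would see this same response on every attempt.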

Circular redirect path detected and wrong Open Graph data displayed

When sharing the following URL to Facebook
www.magicsoftware.com
you get outdated information. Facebook refers to the site (magicsoftware.com/en) and takes all the information from its cache.
I tried to clear the cache by going to the debugger:
https://developers.facebook.com/tools/debug/og/object?q=www.magicsoftware.com
But that didn't help much.
Does anyone have an idea what I can do?
P.S - if you checked the debugger link, you would see that there are two critical errors mentioned:
Errors That Must Be Fixed
Could Not Follow Redirect: URL requested a HTTP redirect, but it could not be followed.
Circular Redirect Path: Circular redirect path detected (see 'Redirect Path' section for details).
What does that mean?
Your server is issuing a redirect to the same URL that was visited, based on some condition. According to my tests, any request that comes without an Accept-Language header gets redirected.
Compare the response with an Accept-Language header and without any headers.
The Facebook linter doesn't seem to pass this header while crawling your Open Graph meta and hangs in a redirection loop.
You should avoid that redirection (or at least have some fallback) so that the Facebook linter can collect the updated data and refresh the cached version.
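What "circular redirect path" means can be shown with a small Python sketch that follows a chain of redirects and stops when a URL repeats. The next_hop callable is a hypothetical stand-in for issuing a request and reading its Location header, and the redirect table is made up for illustration.

```python
def follow_redirects(start_url, next_hop, max_hops=10):
    """Follow redirects from start_url and report whether they loop.

    next_hop(url) returns the redirect target for url, or None when the
    URL answers without redirecting. Returns (path, looped).
    """
    path = [start_url]
    seen = {start_url}
    current = start_url
    for _ in range(max_hops):
        target = next_hop(current)
        if target is None:
            return path, False      # reached a final, non-redirecting page
        path.append(target)
        if target in seen:
            return path, True       # a URL repeated: circular redirect path
        seen.add(target)
        current = target
    return path, False              # hop limit reached without a repeat


# Hypothetical redirect table for a client that sends no Accept-Language.
hops = {
    "http://example.com/": "http://example.com/en",
    "http://example.com/en": "http://example.com/",
}
path, looped = follow_redirects("http://example.com/", hops.get)
print(path, looped)
```

A crawler that hits such a loop has no page to read the Open Graph tags from, so it keeps serving the cached data.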
The same thing is happening to me now. I have no redirect in place, but I am getting the message "there was an error following the redirect path." when using the debugger on this URL: http://www.mmaid.co/cleaning-services/offers/coupons/social-discount.php I will give it time and see if it fixes itself.
I found the solution myself - and it's only patience :)
Facebook just needs time to remove their cache files. So the solution is simply to use the Facebook Debugger to enter your URL and then to wait. Facebook will automatically refresh this URL cache.
