How would I get a client ID from a response with the requests library?

I'm using the requests library in Python; my goal is to get a SoundCloud client ID. Here is my code:
r = requests.get("https://soundcloud.com/search?q=a")
This is the URL I'm trying to get:
https://api-v2.soundcloud.com/me?client_id=[REDACTED]
Using print(r.url) I tried to get that URL, but it shows the same one that's in the request. How can I get the final request URL, the way a browser does when loading the page, using just the requests library? I've also tried r.json(), which just prints out the page HTML and is still at the same URL. Please help - I'm stuck on this.
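The reason r.url never changes is that requests only fetches the raw HTML; it does not run the page's JavaScript, so the XHR to api-v2.soundcloud.com is never made. The client_id usually has to be dug out of one of the JS bundles the page references. A sketch of the extraction step, assuming the id appears as a `client_id:"…"` literal (the regex and the sample snippet below are illustrative assumptions, not SoundCloud's actual bundle format):

```python
import re

# A client_id-looking token: an alphanumeric literal assigned to client_id.
# This pattern is an assumption about the bundle's shape, not a guarantee.
CLIENT_ID_RE = re.compile(r'client_id\s*[:=]\s*"([A-Za-z0-9]{16,})"')

def extract_client_id(script_text):
    """Return the first client_id-looking token in a JS bundle, or None."""
    match = CLIENT_ID_RE.search(script_text)
    return match.group(1) if match else None

# Illustrative stand-in for a downloaded JS bundle:
sample_bundle = 'var opts={client_id:"AbCdEf1234567890AbCdEf12",env:"production"};'
print(extract_client_id(sample_bundle))  # AbCdEf1234567890AbCdEf12
```

In practice you would fetch the search page, pull the script src URLs out of the HTML, download each script with requests, and run the extraction over each one until it matches.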

Related

How to get items from headers by learning from request initiators, using requests in Python?

I am trying to get the fingerprint, as can be seen from this snapshot.
I tried searching for the fingerprint, but it's not in the response or the cookies. I am wondering how this fingerprintjs works, so that I can imitate it and return the fingerprint item.
The website is https://alfagift.id/
When you take a look at the Network tab, especially the categories request, there's a preflight and an XHR initiated by https://alfagift.id/_nuxt/ca268e7.js
I've tried making a request:
resp = requests.get("https://alfagift.id/")
resp.cookies
nothing seems to be returning the fingerprint that's needed.
Can anyone show me how you can get the fingerprint?
This file renders and executes the fingerprinting script on the client side: https://alfagift.id/_nuxt/f9d159c.js
Proof:
__fpjs_d_m||Math.random()>=.001))try{var t=new XMLHttpRequest;t.open("get","https://m1.openfpcdn.io/fingerprintjs/v3.3.3/npm-monitoring",!0),t.send()}catch(t){console.error(t)}}(),[4,vt(r)];case 1:return t.sent(),[2,gt(L(ft,{debug:n},
Used library: https://github.com/fingerprintjs/fingerprintjs

How to make a request to a site with reCAPTCHA with Python Requests

Goal
I want to make a request to a website with Python requests to scrape some information about container locations and times.
This is the website I'm trying to get data from, by inserting the container number: https://www.cma-cgm.com/ebusiness/tracking
I'm trying something simple, like:
import requests

url = "some_url_i_cant_find"
tracking_number = "ABCD1234567"  # needs quotes: it's a string, not a name
# requests has no `payload=` keyword; form data goes in `data=`
requests.post(url, data=tracking_number)
Problem
I cannot find in the Network tab how the request to get the container's data is being processed.
I assume this has something to do with reCAPTCHA, but I don't know much about this or how to handle it.
Solution
A link to another answer or topic covering this issue, or an explanation of how to make a request to this website and read the response.
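For context, sites behind reCAPTCHA typically expect a token minted by the widget running in the browser, commonly posted as a g-recaptcha-response field alongside the form data; a plain requests.post without a valid token is usually rejected. A sketch of what such a POST looks like, built but never sent (the URL, field names, and token below are placeholders, not this site's real API):

```python
import requests

url = "https://example.com/tracking/search"  # placeholder endpoint
payload = {
    "trackingNumber": "ABCD1234567",
    # Minted by the reCAPTCHA widget in the browser; cannot be forged here.
    "g-recaptcha-response": "<token-from-browser>",
}

# Prepare the request without sending it, to inspect what would go out.
prepared = requests.Request("POST", url, data=payload).prepare()
print(prepared.method, prepared.url)
print(prepared.body)  # urlencoded form fields, including the token
```

This is why the Network tab is still worth studying: the goal is to find which endpoint the site posts to and which token field it checks, and then decide whether a token can be obtained legitimately (e.g. via browser automation) at all.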

Scraping data from stats.nba.com, Getting Error in curl::curl_fetch_memory(url, handle = handle)

I'd like to scrape team advanced stats from stats.nba.com.
My current code to get the XHR file where the data is stored is:
library(httr)
library(jsonlite)
nba <- GET('https://stats.nba.com/stats/leaguedashteamstats?Conference=&DateFrom=11%2F12%2F2019&DateTo=&Division=&GameScope=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Advanced&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2019-20&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&TwoWay=0&VsConference=&VsDivision=')
I get the URL via these steps in Chrome:
Inspect -> Network -> XHR
The code throws this error:
Error in curl::curl_fetch_memory(url, handle = handle) :
LibreSSL SSL_read: SSL_ERROR_SYSCALL, errno 60
I also tried it with custom advanced filters on the website, which either results in the same error or the code running forever. I'm not that great at web scraping, so I would appreciate it if anyone could point out what the issue is here.
I have had a good look at this. It looks like this site goes to some lengths to prevent scraping, and won't give you the json from that url unless you provide it with cookies that are generated by a back-and-forth between your browser's javascript and their own servers. They also monitor request timings with New Relic technology and are therefore likely to block your IP if you scrape multiple pages. It wouldn't be impossible, but very, very hard.
If you are desperate for the data, you could look into using the NBA API, which requires a sign-up but is free to use for 1000 requests per day.
The other option is to automate a browser using RSelenium to get the html of the fully rendered pages.
Of course, if you only want this one page, you can just copy the html from your Chrome's inspector, then use rvest::read_html(readClipboard())
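One more thing that is sometimes worth trying before automating a browser: stats.nba.com is known to stall or refuse bare requests that lack browser-like headers. A sketch in Python (the header names are ones the site's own XHRs have historically sent; the parameter set is trimmed from the question's full URL, and the request is built but not sent, so this is no guarantee against the cookie checks described above):

```python
import requests

url = "https://stats.nba.com/stats/leaguedashteamstats"
# Trimmed parameter set; the question's full URL lists every field,
# and the endpoint may require all of them, even the empty ones.
params = {
    "MeasureType": "Advanced", "PerMode": "PerGame",
    "Season": "2019-20", "SeasonType": "Regular Season",
    "LeagueID": "00", "LastNGames": "0", "Month": "0",
    "OpponentTeamID": "0", "PaceAdjust": "N", "Period": "0",
    "PlusMinus": "N", "Rank": "N", "TeamID": "0",
}
headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15) AppleWebKit/537.36",
    "Referer": "https://stats.nba.com/",
    # Headers the site's own XHRs have been observed to send:
    "x-nba-stats-origin": "stats",
    "x-nba-stats-token": "true",
}

req = requests.Request("GET", url, params=params, headers=headers).prepare()
print(req.url)  # full query string, ready for requests.Session().send(req)
```

If headers alone don't help, the options above (the NBA API, RSelenium, or copying the rendered HTML) remain the practical fallbacks.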

How to make POST request using OAuth via Youtube API?

I have been trying to get this to work for a couple of days, since it's my first time working with the OAuth system, without any luck.
I have been experimenting here: https://developers.google.com/youtube/v3/docs/subscriptions/insert#try-it
With the following settings:
http://i.gyazo.com/5cd28f1194d5dfebee25d07bc0db965e.png
When I execute the code, it successfully subscribes the authorized account to the specified channelId.
I have tried to copy paste the shown POST URL into my browser without any luck. The plan was just to test it as I would like to implement this in PHP.
Now to my questions:
The {YOUR_API_KEY}: is this where I am supposed to put the access token? If so, do I need the &mine=true parameter at all?
I just realized that there are no IDs in the URL, but there is a JSON object in the request box example. Am I supposed to convert a string to a JSON object and pass it to the $fields= parameter?
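To the first question: as far as the API docs go, the OAuth access token does not replace {YOUR_API_KEY}. The key parameter is the API key, while the access token travels in the Authorization header, and the channelId is not a URL parameter at all; it goes in the JSON request body (mine=true belongs to subscriptions.list, not insert). A sketch of the POST, built but not sent (token and channel id are placeholders):

```python
import json
import requests

url = "https://www.googleapis.com/youtube/v3/subscriptions"
params = {"part": "snippet"}
headers = {
    # The OAuth access token goes here, not in the URL:
    "Authorization": "Bearer <access-token>",
    "Content-Type": "application/json",
}
# The channelId rides in the request body as a JSON object:
body = {
    "snippet": {
        "resourceId": {"kind": "youtube#channel", "channelId": "<channel-id>"}
    }
}

req = requests.Request("POST", url, params=params, headers=headers,
                       data=json.dumps(body)).prepare()
print(req.url)   # .../subscriptions?part=snippet
print(req.body)  # the serialized snippet object
```

The same shape carries over to PHP: send the token as an Authorization: Bearer header and the snippet as the JSON body of the POST.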

Iframe and http error when loading invalid or protected URL

I'm writing some HTML which provides a preview function for a list of URLs. I want to use an iframe for this functionality.
The issue arises when some of the URLs are broken (returning a 500 error) or when a page involves some authentication process which the requesting user cannot satisfy. In these situations the iframe tries to display the URL, but the content returned in the frame (a 500 error or an authentication error) is useless to the user.
Does iframe have any built-in error handling for these scenarios, or is there some other way I can display a generic error page if something happens when loading the iframe?
Thanks
AFAIK, there is no way to directly access the header of a response to a request initiated by an iframe (or indeed, any request) in client side script.
This is slightly convoluted, but I think it would work:
The iframe is initially loaded with a URL that refers to a script on your server, and you pass the actual URL as a GET parameter.
The server side script takes that URL, and sends a HEAD request to it (following 3xx redirects).
If the response code for the HEAD request is >= 200 and < 300, send some script back to the client which changes the iframe's src to the actual URL (you might be able to do something as simple as window.location.href = but I'm not sure without testing).
If the response code is >= 400, send a page that says "This page is not loading at the moment".
If you know PHP/have it available on your server, I can try and provide a code example.
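The same approach sketched in Python rather than PHP, in case that helps: the endpoint wiring (Flask routes, query-string parsing, etc.) is omitted, and this is just the HEAD-check decision logic the steps above describe.

```python
import requests

def verdict(status_code):
    """Map the HEAD response status to what the iframe should display."""
    if 200 <= status_code < 300:
        return "load"   # serve a snippet that points the iframe at the real URL
    return "error"      # serve the generic "this page is not loading" message

def check_url(url):
    # Follow 3xx redirects, as the steps above suggest, so verdict()
    # only ever sees the final status code.
    resp = requests.head(url, allow_redirects=True, timeout=5)
    return verdict(resp.status_code)

print(verdict(200), verdict(404), verdict(500))  # load error error
```

One caveat worth noting: some servers answer HEAD differently from GET (or reject it outright), so a fallback to a ranged GET request may be needed for stubborn hosts.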
