How to extract the original response header from a site before redirection using the requests module in python - python-requests

import re
import requests

def login_with_requests():
    url = "https://url/login/"
    login_data = {'csrfmiddlewaretoken': '', 'username': 'username', 'password': 'password'}
    response = requests.get(url)
    # print(response.headers)
    response_cookies = response.cookies  # Cookies issued to the not-yet-authenticated visitor.
    print(response_cookies)
    # Pull the hidden csrfmiddlewaretoken value out of the login form HTML.
    csrfmiddlewarepattern = re.compile(r'csrfmiddlewaretoken\W\svalue\W{2}([a-zA-Z0-9]+)\W')
    matches = csrfmiddlewarepattern.finditer(response.text)
    for match in matches:
        csrfmiddlewaretoken = match.group(1)
        # print(csrfmiddlewaretoken)
        login_data['csrfmiddlewaretoken'] = csrfmiddlewaretoken
    login_response = requests.post(url, cookies=response_cookies, data=login_data)
    print(login_response.headers)
    print(login_response.history)
I can log in to a site successfully with this code. The problem is that when I POST the required parameters to the login page, the site answers the successful login by redirecting to the home page. I therefore receive two responses: the first is the actual response to the login POST (status 302), which carries a redirect header pointing at the home page, and the second is the response containing the home page itself.
The first response contains a session-id token that I need before I can keep interacting with the website, but login_response.headers returns the headers of the final response, the one for the redirected request to the home page.
How can I extract the headers of the original response, received before the redirection, since they contain the session-id token I need for further interaction with the website?
I checked login_response.history; it seems to return only the status code of the previous request.
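Each entry in login_response.history is a full requests.Response object, not just a status code; it only prints as <Response [302]>. The following is a minimal offline sketch of reading the pre-redirect headers from it. It assumes the session-id arrives in a Set-Cookie header on the 302 response; the helper names first_hop and demo_response_chain, the cookie name sessionid and the fake header values are illustrative and not taken from the post.

```python
import requests


def first_hop(response):
    """Return the first response in the redirect chain.

    requests keeps intermediate responses, oldest first, in
    response.history; without a redirect the response itself
    is the first hop.
    """
    return response.history[0] if response.history else response


def demo_response_chain():
    """Build a fake 302 -> 200 chain offline to show where the headers live."""
    redirect = requests.Response()
    redirect.status_code = 302
    redirect.headers["Location"] = "/home/"
    redirect.headers["Set-Cookie"] = "sessionid=abc123; Path=/"  # made-up value

    final = requests.Response()
    final.status_code = 200
    final.history = [redirect]
    return final


if __name__ == "__main__":
    login_response = demo_response_chain()
    original = first_hop(login_response)
    print(original.status_code)            # status of the pre-redirect response
    print(original.headers["Set-Cookie"])  # header sent before the redirect
```

With a real login the same helper would be applied to the result of requests.post(...). Another option is to pass allow_redirects=False to requests.post, in which case the 302 response itself is returned and no redirect is followed.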

I found a solution and thought I should share it.
import re
import requests

def login_with_requests_PaymentSite():
    url = "<site.com/login/>"
    login_data = {'csrfmiddlewaretoken': '', 'username': '<username>', 'password': '<password>'}
    response = requests.get(url)
    csrf_token = response.cookies  # Cookies returned by the site for a non-authenticated user.
    # Extract the csrfmiddlewaretoken that is needed for the login POST request.
    csrfmiddlewarepattern = re.compile(r'csrfmiddlewaretoken\W\svalue\W{2}([a-zA-Z0-9]+)\W')
    matches = csrfmiddlewarepattern.finditer(response.text)
    for match in matches:
        csrfmiddlewaretoken = match.group(1)
        login_data['csrfmiddlewaretoken'] = csrfmiddlewaretoken  # Save csrfmiddlewaretoken in the login POST data.
    # Start a session
    session = requests.Session()
    login_session = session.post(url, cookies=csrf_token, data=login_data)  # Log in to the site
    sessionid_cookies = login_session.cookies  # sessionid cookies used for the subsequent requests.
    with open('Login_Response_Paymentsite.html', 'w') as login_response_file:
        login_response_file.write(login_session.text)
    transaction_history_url = "<site.com/transactions/>"
    transaction_history = requests.get(transaction_history_url, cookies=sessionid_cookies)
    print("\n Result returned for the transactions page: ")
    print(transaction_history.text)
    userinfomation_url = "<site.com/userinformation/>"
    userinformation = requests.get(userinfomation_url, cookies=sessionid_cookies)
    print('\n Result returned for userinformation page: ')
    print(userinformation.text)
To make consecutive requests to a site after a successful login with the requests module, use a requests.Session() object. A session keeps a cookie jar, so the session_id that the web application sets after login, including one set on the intermediate 302 response, is stored automatically in session.cookies. With a plain requests.post call, login_response.cookies holds only the cookies set by the final response in the redirect chain, so a session_id set on the 302 response does not show up there.
After making the POST request:
login_session = session.post(url, cookies=csrf_token, data=login_data)  # Log in to the site
you extract the session_id that will be used for consecutive requests with:
sessionid_cookies = login_session.cookies
The complete cookie jar collected by the session is also available as session.cookies.
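A minimal sketch of the same idea with every request going through the session, so the cookie jar is sent automatically and no cookies= argument is needed. It runs offline: the cookie is put into the jar by hand to stand in for what the login response would set, and example.com, sessionid and build_authenticated_request are placeholders, not taken from the site above.

```python
import requests


def build_authenticated_request(session, url):
    """Prepare (but do not send) a GET that carries the session's cookies."""
    return session.prepare_request(requests.Request("GET", url))


if __name__ == "__main__":
    session = requests.Session()
    # Stand-in for the Set-Cookie header the server sends during login. After a
    # real session.post(...) the jar is filled automatically, redirects included.
    session.cookies.set("sessionid", "abc123", domain="example.com", path="/")

    prepared = build_authenticated_request(
        session, "https://example.com/transactions/"
    )
    print(prepared.headers.get("Cookie"))  # the cookie header that would be sent
```

In the answer's code this would amount to calling session.get(transaction_history_url) instead of requests.get(transaction_history_url, cookies=sessionid_cookies).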

Related

Python requests post: data and json

I used the following Python code to retrieve a web page behind a login page successfully for some years:
import requests
from urllib.parse import quote

username = 'user'
password = 'pass'
login_url = 'https://company.com/login?url='
redirect_url = 'https://epaper.company.com/'
data = {'email': username, 'pass': password}
initial_url = login_url + quote(redirect_url)
response = requests.post(initial_url, data=data)
Then something changed at company.com about 2 months ago, and the request returned status code 400. I tried changing the data parameter to json (response = requests.post(initial_url, json=data)) which gave me a 200 response telling me a wrong password was provided.
Any ideas what I could try to debug?
Thanks,
Jan
Update: I just tried using a requests session to retrieve the csrf_token from the login page (as suggested here), so now my code reads:
from bs4 import BeautifulSoup

with requests.Session() as sess:
    response = sess.get(login_url)
    signin = BeautifulSoup(response.content, 'html.parser')
    data['csrf_token'] = signin.find('input', {'name': 'csrf_token'})['value']
    response = sess.post(initial_url, data=data)
Unfortunately, the response is still 400 (and 200/wrong password with the json parameter).
First: when you send data=data, requests sets the header {"Content-Type":"application/x-www-form-urlencoded"}; when you send json=data, it sets {"Content-Type":"application/json"}. The server has to accept the format you send, so check which one the login endpoint now expects.
Second: Perhaps redirects have been added. Try to add:
response = sess.post(url, data=data)
print("URL you expect", url)
print("Last request URL:", response.url)
Be sure to check:
print(sess.cookies.get_dict())
print(response.headers)
If you get an unexpected result when checking, change the code like this:
response = sess.post(url, data=data, allow_redirects=False)
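The first point above can be checked without touching the server by preparing the request and inspecting it. This is a small sketch; the helper name content_type_for, the URL and the credentials are placeholders.

```python
import requests


def content_type_for(**body_kwargs):
    """Prepare a POST without sending it; return its Content-Type and body."""
    prepared = requests.Request(
        "POST", "https://example.com/login", **body_kwargs
    ).prepare()
    return prepared.headers.get("Content-Type"), prepared.body


if __name__ == "__main__":
    payload = {"email": "user", "pass": "secret"}
    print(content_type_for(data=payload))  # form-encoded body
    print(content_type_for(json=payload))  # JSON body
```

Comparing this output with the request the browser sends (visible in its developer tools) shows whether the site expects a form-encoded or a JSON body.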

Blazor with ODataClient - Location Header is missing

I'm creating client side Blazor app with Microsoft.OData.Client. When I create new entity like this:
var dataServiceContext = this.ClientFactory.CreateClient<Container>(new Uri("http://localhost:5000/odata"));
var newAsset = new CreateAssetDto()
{
    TechnicalName = "from_client_4",
    DisplayNameFormat = "format from client",
    Icon = "client/icon",
    InheritedFrom = Guid.NewGuid(),
    IsActive = true,
    Translation = new AssetTranslationDto
    {
        Title = "Client Asset",
        Language = "en",
        Description = "This is asset from client"
    }
};
dataServiceContext.AddToAssets(newAsset);
await dataServiceContext.SaveChangesAsync();
I get an exception stating that the response to this POST request is missing the Location header. When I run Fiddler to see what's going on, I can see that it actually makes 2 requests.
The first request is a POST but doesn't include the body and receives a 204 response.
The second request is the one that actually contains the data for creating the new Asset, and its response contains the Location header as it should.
I guess OData Client is complaining about Location header missing in the response for the first request (since response for second request does contain the header). But why is it even making the first request?
Any idea how to deal with this problem?
It's possible that the first request is a preflight request sent by the browser. But normally CORS preflight requests are sent using OPTIONS method, not POST. So this case is curious.
I am a contributor to the project but do not have enough reputation to add comments here to get clarifications. Could you create an issue on https://github.com/OData/odata.net ?

Why do I get a 200 status code but then redirected back to the login page?

I am trying to log in to a website using the Python requests library; I get a 200 status code, but I just get redirected back to the login page.
I have tried many variations of the various post function variables including:
data = formData, form data includes the correct username, password and _csrf token
data = json.dumps(formData)
headers = headers (which included the "user-agent")
auth = HTTPBasicAuth(username,password) where username and password are the correct string variables for the site login
with requests.Session() as c:
    loginPage = c.get(LoginURL)
    loggedInPage = c.post(LoginURL,
                          data=formData,
                          headers=headers,
                          auth=HTTPBasicAuth(username, password)
                          )

API authentication in R - unable to pass auth token as header

I am looking to do a simple GET request (from the Aplos API) in R using the httr package. I'm able to obtain a temporary token by authenticating with an API key, but then I get a 401 "Token could not be located" when I try to use the token to make an actual GET request. Would appreciate any help! Thank you in advance.
AplosURL <- "https://www.aplos.com/hermes/api/v1/auth/"
AplosAPIkey <- "XYZ"
AplosAuth <- GET(paste0(AplosURL,AplosAPIkey))
AplosAuthContent <- content(AplosAuth, "parsed")
AplosAuthToken <- AplosAuthContent$data$token
#This is where the error occurs
GET("https://www.aplos.com/hermes/api/v1/accounts",
add_headers(Authorization = paste("Bearer:", AplosAuthToken)))
This is a Python snippet provided by the API documentation:
def api_accounts_get(api_base_url, api_id, api_access_token):
    # This should print a contact from Aplos.
    # Lets show what we're doing.
    headers = {'Authorization': 'Bearer: {}'.format(api_access_token)}
    print('getting URL: {}accounts'.format(api_base_url))
    print('With headers: {}'.format(headers))
    # Actual request goes here.
    r = requests.get('{}accounts'.format(api_base_url), headers=headers)
    api_error_handling(r.status_code)
    response = r.json()
    print('JSON response: {}'.format(response))
    return response
In the python example, the return of the auth code block is the api_bearer_token which is base64 decoded and rsa decrypted (using your key) before it can be used.
...
    api_token_encrypted = data['data']['token']
    # base64.decodestring no longer exists in current Python 3; b64decode does the same job here.
    api_bearer_token = rsa.decrypt(base64.b64decode(api_token_encrypted), api_user_key)
    return api_bearer_token
That decoded token is then used in the api call to get the accounts.
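The second thing to check is that your Authorization header matches the example's header exactly, including the single space after "Bearer:". Note that paste() in R separates its arguments with a space by default, so the two lines below should produce the same string; it is paste0() that would drop the space.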
headers = {'Authorization': 'Bearer: {}'.format(api_access_token)}
vs
add_headers(Authorization = paste("Bearer:", AplosAuthToken)))
Likely after addressing both of these you should be able to proceed.

HTTP Post Login failing

I'm writing a simple iOS app that logs in to a website, navigates to a page and parses it. I've managed to generate a POST request, but it doesn't seem to log in. I don't know why it could be failing, because I get status 200. Admittedly, I get this even if I deliberately enter wrong credentials. Any ideas? (The code is in Swift.)
var url = NSURL(string:"https://www.example.com/ps/signin.html")
var request = NSMutableURLRequest(URL: url)
request.HTTPMethod = "POST"
var dataString = "timezoneOffset=-600&userid1=xxx&userid=xxx&pwd=xxx&x=31&y=12"
let data = (dataString as NSString).dataUsingEncoding(NSUTF8StringEncoding)
request.HTTPBody = data
var connection = NSURLConnection(request: request, delegate: self, startImmediately: false)
println("sending request : \(request)")
connection.start()
So I get a response back, with status 200, and seemingly the html code for the login page again.
The problem is that you are POSTing to the login form: https://www.example.com/ps/signin.html. You need to POST to the same place the HTML form posts to, namely: https://www.example.com/psp/ps/?cmd=login&languageCd=ENG
Try changing your URL:
var url = NSURL(string:"https://www.example.com/psp/ps/?cmd=login&languageCd=ENG")
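The target of a login form can also be read from the page's HTML rather than guessed. Below is a minimal sketch in Python (the language of the other examples on this page) using only the standard library; FormActionParser, form_targets and the sample markup are illustrative and assume the sign-in page contains an ordinary <form> element.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormActionParser(HTMLParser):
    """Collect the action and method of every <form> in a page."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            attributes = dict(attrs)
            action = attributes.get("action") or ""
            method = (attributes.get("method") or "get").lower()
            self.forms.append((action, method))


def form_targets(page_url, html):
    """Return (absolute_action_url, method) for each form found in html."""
    parser = FormActionParser()
    parser.feed(html)
    # A missing or empty action means the form submits to the page URL itself.
    return [(urljoin(page_url, action), method) for action, method in parser.forms]


if __name__ == "__main__":
    # Hypothetical markup standing in for the real sign-in page.
    sample = '<form method="POST" action="/psp/ps/?cmd=login&amp;languageCd=ENG"></form>'
    print(form_targets("https://www.example.com/ps/signin.html", sample))
```

The URL it reports is the one the POST request should go to; the form's input names, hidden fields included, tell you which parameters to send.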
