I'm running into problems with this POST request for creating a delivery:
{'dropoff_name': 'stephen',
'pickup_address': '1234 Bancroft Way, Emeryville, CA',
'pickup_phone_number': '1231231234',
'dropoff_phone_number': '1231231234',
'dropoff_address': '200 Powell Street, Emeryville, CA',
'pickup_name': 'ShareTea',
'manifest': 'boba'
}
Here's my code:
def post_data(self):
    post_data = {}
    post_data["manifest"] = self.manifest
    # post_data['manifest_items'] = self.manifest_items
    post_data.update(self.pickup.post_data("pickup"))
    post_data.update(self.dropoff.post_data("dropoff"))
    if self.quote:
        post_data["quote_id"] = self.quote.quote_id
    return post_data
def _make_request(self, url, data=None, type='get'):
    if type == 'post':
        print(data)
        headers = {'Content-type': 'application/x-www-form-urlencoded'}
        response = requests.post(url, data=data, auth=(self.api_key, ''), headers=headers)
        return response

# ... and the call site:
params = delivery.post_data()
return self._make_request(url, data=params, type='post')
I'm getting a 400 exception that says "The parameters of your request were invalid."
Does it identify which parameters are invalid?
If it's just the phone numbers, I had success by formatting the phone numbers in my request as "123-123-1234".
I believe the manifest field should be an array.
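For what it's worth, here is a sketch of the payload with both suggestions applied (assuming the API accepts dashed phone numbers and a list for manifest; the field values are the ones from the question):

post_data = {
    'pickup_name': 'ShareTea',
    'pickup_address': '1234 Bancroft Way, Emeryville, CA',
    'pickup_phone_number': '123-123-1234',
    'dropoff_name': 'stephen',
    'dropoff_address': '200 Powell Street, Emeryville, CA',
    'dropoff_phone_number': '123-123-1234',
    'manifest': ['boba'],
}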
Related
I used the following Python code to retrieve a web page behind a login page successfully for some years:
import requests
from urllib.parse import quote

username = 'user'
password = 'pass'
login_url = 'https://company.com/login?url='
redirect_url = 'https://epaper.company.com/'

data = {'email': username, 'pass': password}
initial_url = login_url + quote(redirect_url)
response = requests.post(initial_url, data=data)
Then something changed at company.com about two months ago, and the request started returning status code 400. I tried changing the data parameter to json (response = requests.post(initial_url, json=data)), which gave me a 200 response telling me that a wrong password was provided.
Any ideas what I could try to debug?
Thanks,
Jan
Update: I just tried using a requests session to retrieve the csrf_token from the login page (as suggested here), so now my code reads:
from bs4 import BeautifulSoup

with requests.Session() as sess:
    response = sess.get(login_url)
    signin = BeautifulSoup(response.content, 'html.parser')
    data['csrf_token'] = signin.find('input', {'name': 'csrf_token'})['value']
    response = sess.post(initial_url, data=data)
Unfortunately, the response is still 400 (and 200/wrong password with the json parameter).
First: when you send data=data, the request should use {"Content-Type": "application/x-www-form-urlencoded"}; when you send json=data, it should use {"Content-Type": "application/json"}.
Second: perhaps redirects have been added. Try adding:
response = sess.post(url, data=data)
print("URL you expect", url)
print("Last request URL:", response.url)
Be sure to check:
print(sess.cookies.get_dict())
print(response.headers)
If you get an unexpected result when checking, change the code like this:
response = sess.post(url, data=data, allow_redirects=False)
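For example, with redirects disabled you can see where the server is trying to send you (a debugging sketch only; it reuses the session, URLs, and data from the snippets above):

with requests.Session() as sess:
    response = sess.post(initial_url, data=data, allow_redirects=False)
    print(response.status_code)              # a 3xx here means a redirect was added
    print(response.headers.get('Location'))  # where the server wants to send you
    print(sess.cookies.get_dict())           # cookies set by the login attempt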
GitHub allows you to send no more than 2500 requests per hour. If I have several accounts/tokens, how can I set up automatic token switching in Scrapy, either when a certain number of requests has been reached (for example 2500) or when the response is a 403?
import scrapy
from scrapy import Request

class GithubSpider(scrapy.Spider):
    name = 'github.com'
    start_urls = ['https://github.com']
    tokens = ['token1', 'token2', 'token3', 'token4']
    headers = {
        'Accept': 'application/vnd.github.v3+json',
        'Authorization': 'token ' + tokens[1],
    }

    def start_requests(self, **cb_kwargs):
        for lang in languages:
            cb_kwargs['lang'] = lang
            url = f'https://api.github.com/search/users?q=language:{lang}%20location:{country}&per_page=100'
            yield Request(url=url, headers=self.headers, callback=self.parse, cb_kwargs=cb_kwargs)
You could use the cycle function from the itertools module to create a generator from your list of tokens. You can then pull the next token for each request you send, so that all the tokens are used equally and the chance of any single token hitting its limit is reduced.
If you start receiving 403 responses, you will know that all the tokens have reached their limit. See the sample code below:
from itertools import cycle

import scrapy
from scrapy import Request

class GithubSpider(scrapy.Spider):
    name = 'github.com'
    start_urls = ['https://github.com']
    tokens = cycle(['token1', 'token2', 'token3', 'token4'])

    def start_requests(self, **cb_kwargs):
        for lang in languages:
            headers = {
                'Accept': 'application/vnd.github.v3+json',
                'Authorization': 'token ' + next(self.tokens)
            }
            cb_kwargs['lang'] = lang
            url = f'https://api.github.com/search/users?q=language:{lang}%20location:{country}&per_page=100'
            yield Request(url=url, headers=headers, callback=self.parse, cb_kwargs=cb_kwargs)
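For the 403 part of the question, one option is a small downloader middleware that swaps in the next token and retries the request. This is only a sketch under assumptions: the class name and the GITHUB_TOKENS setting are hypothetical, and you would still need to register the middleware in DOWNLOADER_MIDDLEWARES in your project settings.

from itertools import cycle

class RotateTokenOn403Middleware:
    """Hypothetical downloader middleware: on a 403, retry the request
    with the next token from the pool."""

    def __init__(self, tokens):
        self.tokens = cycle(tokens)

    @classmethod
    def from_crawler(cls, crawler):
        # GITHUB_TOKENS is an assumed custom setting holding the token list
        return cls(crawler.settings.getlist('GITHUB_TOKENS'))

    def process_response(self, request, response, spider):
        if response.status == 403:
            retry = request.replace(dont_filter=True)
            retry.headers['Authorization'] = 'token ' + next(self.tokens)
            return retry  # returning a Request re-schedules it with the new token
        return response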
I have this Python code that does not work as expected.
import requests
import json

API_ENDPOINT = "https://lkokpdvhc4.execute-api.us-east-1.amazonaws.com/mycall"
data = {'mnumber': '9819838466'}

r = requests.post(url=API_ENDPOINT, data=json.dumps(data))
print(r.text)
This will return an error:
{"stackTrace": [["/var/task/index.py", 5, "handler", "return
mydic[code]"]], "errorType": "KeyError", "errorMessage": "''"}
When I test the API using the Amazon API Gateway console, I get the expected output (i.e. a string like "mumbai"). That means this is a client-side issue. I have confirmed this with Postman as well, which returns the same error as above. How do I send the correct headers with the POST request?
You can create a dictionary with the headers, such as:
headers = {
    "Authorization": "Bearer 12345",
    "Content-Type": "application/json",
    "key": "value"
}
Then, at the point of making the request, pass it as a keyword argument to the request method, i.e. .post(), .get(), or .put(). This will be:
response = requests.post(API_ENDPOINT, data=json.dumps(data), headers=headers)
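Alternatively, requests will set the Content-Type: application/json header for you if you pass the payload through the json keyword, so an equivalent call (same endpoint and data as above) is:

response = requests.post(API_ENDPOINT, json=data)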
I'm not having success in scraping this website because it does not contain any forms.
My crawler always returns nothing when I dump response data to a file:
import scrapy

class LoginSpider(scrapy.Spider):
    name = 'mamega.org'
    start_urls = ['https://www.mamega.org/search/']

    def parse(self, response):
        return scrapy.Request('https://www.mamega.org/_searchm.php',
                              method="POST",
                              meta={'section': 'ebooks', 'datafill': 'musso'},
                              headers={'Content-Type': 'application/json; charset=UTF-8'},
                              callback=self.after_login
                              )

    def after_login(self, response):
        print("__________________________________________after_login______________________________________________________")
        page = response.url.split("/")[-2]
        filename = 'quotes-%s.html' % page
        with open(filename, 'wb') as f:
            f.write(response.body)
        self.log('Saved file %s' % filename)
        for title in response.xpath('//table[@style="width:93%;"]//tbody//tr//td/following-sibling::a[2]/@href'):
            yield {'roman': title.css('a ::text').extract_first(), 'url': title.css('a::attr(href)').extract_first()}
Your first POST request doesn't contain any body.
If you take a look at the website, you can see the real request includes three things you need to replicate to get a proper response from their server: the content-type header, the x-requested-with header, and a form-data style body.
You can replicate this in your crawler:
headers = {
    'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
    'x-requested-with': 'XMLHttpRequest'
}

Request(
    'https://www.mamega.org/_searchm.php',
    method='POST',
    body='section=ebooks&datafill=musso',
    headers=headers
)
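If the form values ever need escaping, the body string can also be built with urllib.parse.urlencode instead of being written by hand (same field names as above):

from urllib.parse import urlencode

body = urlencode({'section': 'ebooks', 'datafill': 'musso'})
# body == 'section=ebooks&datafill=musso'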
return scrapy.Request('https://www.mamega.org/_searchm.php',
                      method="POST",
                      meta={'section': 'ebooks', 'datafill': 'musso'},
                      headers={'Content-Type': 'application/json; charset=UTF-8'},
                      callback=self.after_login
                      )
The data you are passing as meta is actually the form data of the POST request; scrapy.Request has no formdata argument, but scrapy.FormRequest does.
Make your request as:
return scrapy.FormRequest('https://www.mamega.org/_searchm.php',
                          method="POST",
                          formdata={'section': 'ebooks', 'datafill': 'musso'},
                          callback=self.after_login
                          )
Is it possible to submit a Freebase mqlread request via POST in Python? I have tried to search for documentation but everything refers to GET. Thanks.
It is possible.
You will need to issue a POST and add a specific header: X-HTTP-Method-Override: GET (which basically tells the server to emulate a GET with the POST's content). Specifically, for the body I used Content-Type: application/x-www-form-urlencoded.
Here's the relevant part of my code (coffeescript) if it helps:
mqlread = (query, queryEnvelope, cb) ->
  ## build URL
  url = urlparser.format
    protocol: 'https'
    host: 'www.googleapis.com'
    pathname: 'freebase/v1/mqlread'

  ## build POST body
  queryEnvelope ?= {}
  queryEnvelope.key = config.GOOGLE_API_SERVER_KEY
  queryEnvelope.query = JSON.stringify query

  options =
    url: url
    method: 'POST'
    headers:
      'X-HTTP-Method-Override': 'GET'
      'User-Agent': config.wikipediaScraperUserAgent
    timeout: 3000
    form: queryEnvelope

  ## invoke API
  request options, (err, response, body) ->
    if err then return cb err
    if response.statusCode != 200
      try
        json = JSON.parse(body)
        errmsg = json?.error?.message or "(unknown JSON)"
      catch e
        errmsg = body?[..50]
      return cb "#{response.statusCode} #{errmsg}"
    r = JSON.parse response.body
    decodeStringsInResponse r
    cb null, r
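Since the question asks about Python, here is a rough requests-based sketch of the same approach; it assumes the mqlread endpoint and the key/query envelope parameters shown in the snippet above, with YOUR_API_KEY and the example query as placeholders:

import json
import requests

def mqlread(query, api_key):
    # POST to the mqlread endpoint, but ask the server to treat it as a GET
    url = 'https://www.googleapis.com/freebase/v1/mqlread'
    headers = {'X-HTTP-Method-Override': 'GET'}
    payload = {'key': api_key, 'query': json.dumps(query)}
    response = requests.post(url, data=payload, headers=headers, timeout=3)
    response.raise_for_status()
    return response.json()

# usage (hypothetical query):
# result = mqlread([{'id': None, 'name': 'San Francisco', 'type': '/location/citytown'}], 'YOUR_API_KEY')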
I don't think POST is supported for MQLread, but you could use the HTTP Batch facility.
Here's an example in Python:
https://github.com/tfmorris/freebase-python-samples/blob/master/client-library/mqlread-batch.py