Post request using cookies with cURL, RCurl and httr - r

In Windows cURL I can post a web request similar to this:
curl --dump-header cook.txt ^
--data "RURL=http://www.example.com/r&user=bob&password=hello" ^
--user-agent "Mozilla/5.0" ^
http://www.example.com/login
With type cook.txt I get a response similar to this:
HTTP/1.1 302 Found
Date: Thu, ******
Server: Microsoft-IIS/6.0
SERVER: ******
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
Location: ******
Set-Cookie: Cookie1=; domain=******; expires=****** ******
******
******
Cache-Control: private
Content-Type: text/html; charset=iso-8859-1
Content-Length: 189
I can manually read cookie lines like: Set-Cookie: AuthCode=ABC... (I could script this of course). So I can use AuthCode for subsequent requests.
I am trying to do the same in R with RCurl and/or httr (I still don't know which one is better for my task).
When I try:
library(httr)
POST("http://www.example.com/login",
body= list(RURL="http://www.example.com/r",
user="bob", password="hello"),
user_agent("Mozilla/5.0"))
I get a response similar to this:
Response [http://www.example.com/error]
Status: 411
Content-type: text/html
<h1>Length Required</h1>
I understand what a 411 error means and could try to fix the request, but I do not get it with cURL, so I must be doing something wrong in the POST call.
Can you help me in translating my cURL command to RCurl and/or httr?

httr automatically preserves cookies across calls to the same site, as illustrated by these two calls to http://httpbin.org
GET("http://httpbin.org/cookies/set?a=1")
# Response [http://httpbin.org/cookies]
# Status: 200
# Content-type: application/json
# {
# "cookies": {
# "a": "1"
# }
# }
GET("http://httpbin.org/cookies")
# Response [http://httpbin.org/cookies]
# Status: 200
# Content-type: application/json
# {
# "cookies": {
# "a": "1"
# }
# }
Perhaps the problem is the encoding of the body: cURL's --data option sends it as application/x-www-form-urlencoded, but the default in httr is multipart/form-data. Use multipart = FALSE in your POST call (in current httr versions, the equivalent is encode = "form").
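To make the encoding concrete, here is a small language-neutral sketch in Python (standard library only). It builds the kind of application/x-www-form-urlencoded body discussed above and the Content-Length a client would announce for it. The field values are the placeholders from the question; this only illustrates the encoding and is not a replacement for the httr call.

```python
from urllib.parse import urlencode

# Placeholder form fields taken from the question.
fields = {
    "RURL": "http://www.example.com/r",
    "user": "bob",
    "password": "hello",
}

# application/x-www-form-urlencoded: key=value pairs joined by "&",
# with reserved characters percent-encoded.
body = urlencode(fields)

# The byte length of the body is what goes into the Content-Length header.
content_length = len(body.encode("ascii"))

print(body)
print("Content-Length:", content_length)
```

A multipart/form-data body would instead wrap each field in its own boundary-delimited part, which is why the two encodings are not interchangeable on the server side.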

Based on Juba's suggestion, here is a working RCurl template.
The code emulates a browser behaviour, as it:
retrieves cookies on a login screen and
reuses them on the following page requests containing the actual data.
### RCurl login and browse private pages ###
library("RCurl")
loginurl ="http://www.*****"
mainurl ="http://www.*****"
agent ="Mozilla/5.0"
#User account data and other login pars
pars=list(
RURL="http://www.*****",
Username="*****",
Password="*****"
)
#RCurl pars
curl = getCurlHandle()
curlSetOpt(cookiejar="cookiesk.txt", useragent = agent, followlocation = TRUE, curl=curl)
#or simply
#curlSetOpt(cookiejar="", useragent = agent, followlocation = TRUE, curl=curl)
#post login form
web=postForm(loginurl, .params = pars, curl=curl)
#go to main url with real data
web=getURL(mainurl, curl=curl)
#parse/print content of web
#..... etc. etc.
#This has the side effect of saving cookie data to the cookiejar file
rm(curl)
gc()

Here is a way to create a POST request, then keep and reuse the resulting cookies with RCurl, for example to get web pages when authentication is required:
library(RCurl)
curl <- getCurlHandle()
curlSetOpt(cookiejar="/tmp/cookies.txt", curl=curl)
postForm("http://example.com/login", login="mylogin", passwd="mypasswd", curl=curl)
getURL("http://example.com/anotherpage", curl=curl)
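The cookie-jar idea itself is not specific to RCurl. As a rough, self-contained illustration, the Python sketch below starts a throwaway local server (a stand-in for the real site, entirely hypothetical) that sets a cookie on POST /login and echoes back whatever Cookie header it receives on GET. A client with a cookie jar then logs in and requests a second page. The cookie name AuthCode and its value are made up for the demo.

```python
import http.cookiejar
import threading
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Toy server: POST sets a cookie, GET echoes the Cookie header back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the submitted form body
        self.send_response(200)
        self.send_header("Set-Cookie", "AuthCode=ABC123; Path=/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        body = (self.headers.get("Cookie") or "no cookie").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# Port 0 lets the OS pick a free port for the local demo server.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

# The jar plays the role of RCurl's cookiejar option: one opener, shared cookies.
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({}),  # ignore any proxy settings for localhost
    urllib.request.HTTPCookieProcessor(jar),
)

form = urllib.parse.urlencode({"login": "mylogin", "passwd": "mypasswd"}).encode("ascii")
opener.open(base + "/login", data=form).close()  # POST, stores the cookie
with opener.open(base + "/anotherpage") as resp:  # GET, sends the cookie back
    echoed = resp.read().decode("utf-8")

server.shutdown()
server.server_close()
print(echoed)
```

The point of the design is that both requests go through the same opener, just as both RCurl calls above share the same curl handle.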

Related

How to get session token when authenticating to JSON REST API (in R)

I'm trying to access JSON data (in R) from a REST API.
To authenticate myself, I need to use a POST method in https://dashboard.server.eu/login. The data that needs to be sent are email and password:
library(httr)
login <- list(
email = "my@email.com",
password = "mypass"
)
res <- POST("https://dashboard.server.eu/login", body = login, encode = "form", verbose())
When executing the above, I get this output:
-> POST /login HTTP/1.1
-> Host: dashboard.server.eu
-> User-Agent: libcurl/7.59.0 r-curl/3.3 httr/1.4.1
-> Accept-Encoding: gzip, deflate
-> Cookie: session=10kq9qv1udf0107F4C70RY14fsum41sq50
-> Accept: application/json, text/xml, application/xml, */*
-> Content-Type: application/x-www-form-urlencoded
-> Content-Length: 53
->
>> email=my%40email.com&password=mypass
<- HTTP/1.1 200 OK
<- access-control-allow-headers: Accept, Authorization, Content-Type, If-None-Match
<- access-control-allow-methods: HEAD, GET, POST, PUT, DELETE
<- cache-control: no-cache
<- content-encoding: gzip
<- content-type: application/json; charset=utf-8
<- date: Mon, 09 Mar 2020 14:58:31 GMT
<- set-cookie: session=10kq9qv1udf0107F4C70RY14fsum41sq50; HttpOnly; SameSite=Strict; Path=/
<- vary: origin,accept-encoding
<- x-microserv: NS4yNi4xODQuMjE3
<- x-poweredby: Poetry
<- Content-Length: 2346
<- Connection: keep-alive
The doc of the site says that, in case of success, a JSON res is returned and contains a string token in res.data._id.
I can't find it, even after looking through every list (and sub-list) of res.
How am I supposed to find the token?
Following the doc, and an example in AngularJS, I'm then supposed to do:
// Create JSON Object with your token
let authorizeObject = {
'Authorization': 'Session ' + token,
'content-type': 'application/json;charset=UTF-8',
'accept': 'application/json,text/plain',
};
// Create header from the previous JSON Object
let header = {'headers':authorizeObject};
// Use the header in your http request...
$http.get('https://dashboard.server.eu/', header)
Any hints on making this dream come true?
UPDATE -- With cURL, I could check that there is a _id key/value returned…
With the command:
curl -k -X POST "https://dashboard.server.eu/login" \
-d '{ "email" : "my@email.com", "password" : "mypass" }' \
-H "Content-Type: application/json"
I get the output:
{
"_id": "697v2on4ll0107F4C70RYhosfgtmhfug",
"isAuthenticated": true,
"user": {
"_id": "5dd57868d83cfc000ebbb273",
"firstName": "me",
"lastName": "Me",
...
So, the session token is indeed somewhere...
Does this help you to help me?
Looking at the image of res in your question, the message is there, under content - it's just that the content is stored as a vector of raw bytes, which is why you didn't recognise it as json.
Since any file type can be sent by http, the contents in an httr response object are stored in raw format rather than a character string for various reasons - perhaps most importantly because many binary files will contain a 0x00 byte, which isn't allowed in a character string in R.
In your case, we can not only tell that res$content is text, but that it is your "missing" json. The first six bytes of res$content are shown in your image, and are 7b, 22, 5f, 69, 64, 22. We can convert these to a character string in R by doing:
rawToChar(as.raw(c(0x7b, 0x22, 0x5f, 0x69, 0x64, 0x22)))
[1] "{\"_id\""
This matches the first six characters of your expected json string.
Therefore if you do:
httr::content(res, "text")
or
rawToChar(res$content)
You will get your json as a character string.
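The same decode-then-parse step can be sketched in Python for comparison. The raw body below is a shortened stand-in built from the fields visible in the cURL output in the question, and the header layout follows the AngularJS snippet there; neither has been checked against the real API.

```python
import json

# The first six bytes shown in the question, written as hex.
first_six = bytes.fromhex("7b225f696422")
print(first_six.decode("utf-8"))

# Stand-in for the raw response body; values copied from the cURL output
# in the question, limited to the fields shown there.
raw_body = b'{"_id": "697v2on4ll0107F4C70RYhosfgtmhfug", "isAuthenticated": true}'

# Raw bytes -> text -> parsed JSON -> token.
payload = json.loads(raw_body.decode("utf-8"))
token = payload["_id"]

# Header layout taken from the AngularJS example in the question.
auth_headers = {
    "Authorization": "Session " + token,
    "Accept": "application/json,text/plain",
}
print(auth_headers["Authorization"])
```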

Multiple Authentication in CURL

Context
I want to access an API from R and have the correct credentials to do so.
The issue is that Authentication requires three arguments to be passed (Username, Password and Organisation).
I am not familiar with API calls from R and I do not know how to pass this "organisation" argument.
Example
An example of the API call to request a token is as follows (taken from the documentation):
curl -X POST "https://URL.com" \
-H "Content-Type: application/json" \
--data "{\"user_name\": \"<username>\", \
\"password\": \"<mypassword>\",\
\"organisation\": \"<organisation>\"}"
Using the crul package, I have tried:
require("crul")
z <- crul::HttpClient$new(url = "https://THEURL.com",
auth(user = "myuser",
pwd = "mypwd"))
z$get()
Which returns:
<crul response>
url: https://THEURL.com
request_headers:
User-Agent: libcurl/7.59.0 r-curl/3.3 crul/0.8.0
Accept-Encoding: gzip, deflate
Accept: application/json, text/xml, application/xml, */*
response_headers:
status: HTTP/1.1 403 Forbidden
content-type: application/json
content-length: 211
connection: keep-alive
x-amzn-errortype: IncompleteSignatureException
I'm assuming that x-amzn-errortype is referring to the missing organisation authentication variable.
Questions
How can I pass the variable "Organisation" to the function?
Or is there a better way to do this?
I think it will be way easier for you to use the package httr when you deal with HTTP requests. I think your example request could be:
library(httr)
POST("https://THEURL.com",
body = list(user_name = "<username>", password = "<mypassword>", organisation = "<organisation>"),
encode = "json")
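For comparison, the request that the documented cURL command describes can be assembled like this in Python. This is a sketch that only builds the request object and does not send it; the URL and the three credential values are the placeholders from the question.

```python
import json
import urllib.request

# Placeholder credentials, as in the documentation excerpt in the question.
payload = {
    "user_name": "<username>",
    "password": "<mypassword>",
    "organisation": "<organisation>",
}

# All three fields travel together in one JSON body; nothing is passed
# through HTTP basic authentication.
request = urllib.request.Request(
    "https://THEURL.com",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(request.get_method())
print(request.data.decode("utf-8"))
```

Seen this way, organisation is simply a third key in the JSON body, next to the username and password.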

python requests POST Multipart/form-data with additional parameters in Content-Disposition

The task is to send a POST request to a TTS (text-to-speech) endpoint and get audio from the response.
The endpoint is in a private network, so I cannot share it for you to test against directly, but my question is not domain specific and I think it is a general HTTP question.
There are existing working curl and Python 2 scripts, as follows:
curl -v -H "Content-Type:multipart/form-data;boundary=message_boundary_0001" -H "Accept:audio/ogg;codecs=opus;" --data-binary @request.txt ip:port/someother/ -m 10 -o response.txt
request.txt:
--message_boundary_0001--
Content-Disposition: form-data; name="RequestData"
Content-Type: application/json; charset=utf-8
{
jsondata1
}
--message_boundary_0001--
Content-Disposition: form-data; name="TtsParameter"; paramName="TEXT_TO_READ"
Content-Type: application/json; charset=utf-8
{
jsondata2
}
--message_boundary_0001--
The Python 2 script mainly constructs the request content, then calls httplib.HTTPConnection.request('POST', uri, some BytesIO(), headers). If needed, I can paste the code here.
Now I want to rewrite it using the Python 3 requests library.
I've searched the requests docs and one existing SO question, and wrote the following code, but got a 400 error:
import requests
from requests_toolbelt import MultipartEncoder
headers = {'Accept': 'audio/ogg;codecs=opus;',
'Connection': 'keep-alive',
'Content-Type': 'multipart/form-data;boundary=message_boundary_0001',
}
RequestData = '''{
jsondata1
}'''
TtsParameter_TEXT_TO_READ = '''{
jsondata2
}'''
# url_origin = 'https://httpbin.org/post' # for debugging
url = 'http://ip:port/someother/'
resp = requests.post(url, headers=headers,
files={'RequestData': (None, RequestData), 'TtsParameter': (None, TtsParameter_TEXT_TO_READ)},
timeout=10)
print(resp.status_code)
print(resp.content.decode('utf-8'))
Which is not surprising, because in my curl request.txt there's a special Content-Disposition: Content-Disposition: form-data; name="TtsParameter"; paramName="TEXT_TO_READ", which is rarely seen in any tutorials.
So my question is how to pass the paramName="TEXT_TO_READ" to requests?
Update
the latest python code is pushed to github now.
https://github.com/LeiYangGH/py3requeststts
There is no way to do this with vanilla requests.
There's a less than ideal way to do it with the toolbelt, though.
from requests_toolbelt.multipart import encoder
mpe = encoder.MultipartEncoder(fields={'RequestData': (None, RequestData), 'TtsParameter': (None, TtsParameter_TEXT_TO_READ)})
for part in mpe.parts:
    if 'name="TtsParameter"' in part.headers:
        part.headers = part.headers.replace('name="TtsParameter"',
                                            'name="TtsParameter"; paramName="TEXT_TO_READ"')
headers.update({'Content-Type': mpe.content_type})
requests.post(url, headers=headers, data=mpe)
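Another option, sketched here with the standard library only, is to assemble the multipart body by hand so that each part's Content-Disposition line can carry any extra parameter. build_multipart is a hypothetical helper written for this illustration, and the JSON payloads are made-up stand-ins for jsondata1 and jsondata2. The layout follows the usual multipart convention (an opening "--boundary" before each part and a closing "--boundary--" at the end), which may differ from what this particular endpoint expects.

```python
import json

CRLF = b"\r\n"


def build_multipart(parts, boundary):
    """Assemble a multipart body from (disposition, content_type, payload) tuples."""
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for disposition, content_type, payload in parts:
        chunks.append(delimiter)
        chunks.append(b"Content-Disposition: " + disposition.encode("utf-8"))
        chunks.append(b"Content-Type: " + content_type.encode("ascii"))
        chunks.append(b"")  # blank line separates part headers from payload
        chunks.append(payload.encode("utf-8"))
    chunks.append(delimiter + b"--")  # closing delimiter
    return CRLF.join(chunks) + CRLF


boundary = "message_boundary_0001"
json_type = "application/json; charset=utf-8"

# Made-up payloads standing in for jsondata1 and jsondata2.
body = build_multipart(
    [
        ('form-data; name="RequestData"', json_type, json.dumps({"example": 1})),
        ('form-data; name="TtsParameter"; paramName="TEXT_TO_READ"', json_type,
         json.dumps({"example": 2})),
    ],
    boundary,
)

multipart_headers = {
    "Content-Type": "multipart/form-data;boundary=" + boundary,
    "Accept": "audio/ogg;codecs=opus;",
}
print(body.decode("utf-8"))
```

The resulting bytes could then be passed as data=body, together with these headers, to requests.post; whether the endpoint accepts this layout is untested here.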

Authorization Error 401 using GET in httr (R).

I'm trying to make a GET call in R using httr and I keep getting an authorization 401 error.
R code:
testfunction2 <- function()
{
set_config(verbose())
locus_url <- "https://api.locusenergy.com/v3/clients/5599"
r <- GET(url = "https://api.locusenergy.com/v3/clients/5599",
query=list(authorization="Bearer c935845d8fc1124757e66ce04d2c75d0"),
Accept="application/json")
}
The results:
> print(testfunction2())
-> GET /v3/clients/5599?authorization=Bearer%20c935845d8fc1124757e66ce04d2c75d0 HTTP/1.1
-> User-Agent: libcurl/7.39.0 r-curl/0.9.1 httr/1.0.0
-> Host: api.locusenergy.com
-> Accept-Encoding: gzip, deflate
-> Cookie: AWSELB=D91FBFE1087EF6EBC125A126777051237474A8A060B6095B8E3C16151308453F8556B2A2E90CB2178F365FAA8AA8C29B124D15CA3EB859CFE615428E8D55C393ABB5B436BF
-> Accept: application/json, text/xml, application/xml, */*
->
<- HTTP/1.1 401 Unauthorized
<- Content-Type: application/json
<- Date: Sun, 16 Aug 2015 05:02:27 GMT
<- Server: Apache-Coyote/1.1
<- transfer-encoding: chunked
<- Connection: keep-alive
<-
I expect it to return a 200 code (rather than 401, which implies an authorization error).
I know the token is correct because it works if I use the Postman (google add-in) and Python. The token won't work for you because I changed it since I can't share it.
Python Code:
import http.client
conn = http.client.HTTPSConnection("api.locusenergy.com")
headers = {
'authorization': "Bearer 935845d8fc1124757e66ce04d2c75d0"
}
conn.request("GET", "/v3/clients/5599", headers=headers)
res = conn.getresponse()
data = res.read()
print(data)
results from Python
b'{"statusCode":200,"partnerId":4202,"tz":"US/Arizona","firstName":"xxx","lastName":"xxxx","email":"xxxx@aol.com","id":5599}'
So, again the question is what am I doing wrong in R or can you give me any hints? This won't be reproducible for you because the token expired and I can't share it.
Could it be the space in the authorization? authorization="Bearer 935845d8fc1124757e66ce04d2c75d0"? Are there any hints in the verbose output of the get call in R?
For reference, this is the site's API page:
https://developer.locusenergy.com/
The site requires OAUTH2 authentication to return the token. I didn't include that code but I verified the token works with Python and Postman.
Right now you are passing your authorization value in the query string of the httr call, not in an HTTP header as you are doing in the Python code. Instead use:
GET(url = "https://api.locusenergy.com/v3/clients/5599",
accept_json(),
add_headers(Authorization="Bearer c935845d8fc1124757e66ce04d2c75d0")
)
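The difference between the two placements can be made visible without contacting the API. The Python sketch below only builds two request objects, using the expired placeholder token from the question, and prints where the token ends up in each.

```python
import urllib.request
from urllib.parse import quote, urlencode, urlsplit

url = "https://api.locusenergy.com/v3/clients/5599"
token = "c935845d8fc1124757e66ce04d2c75d0"  # expired placeholder from the question

# Token in the query string: it becomes part of the URL, as in the failing call.
query = urlencode({"authorization": "Bearer " + token}, quote_via=quote)
in_query = urllib.request.Request(url + "?" + query)

# Token in the Authorization header: the URL stays clean, as in the Python code.
in_header = urllib.request.Request(
    url,
    headers={"Authorization": "Bearer " + token, "Accept": "application/json"},
)

print(urlsplit(in_query.full_url).query)
print(in_query.get_header("Authorization"))
print(in_header.get_header("Authorization"))
```

The first request carries no Authorization header at all, which is consistent with the 401 seen in the verbose output.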

How to get JSON back from HTTP POST Request (to another domain)

I'm trying to use the API on a website, here's the part of the manual:
Authenticated Sessions (taken from here)
To create an authenticated session, you need to request an authToken from the '/auth' API resource.
URL: http://stage.amee.com/auth (this is not my domain)
Method: POST
Request format: application/x-www-form-urlencoded
Response format: application/xml, application/json
Response code: 200 OK
Response body: Details of the authenticated user, including API
version.
Extra data: "authToken" cookie and header, containing the
authentication token that should be
used for subsequent calls.
Parameters: username / password
Example
Request
POST /auth HTTP/1.1
Accept: application/xml
Content-Type: application/x-www-form-urlencoded
username=my_username&password=my_password
Response
HTTP/1.1 200 OK
Set-Cookie: authToken=1KVARbypAjxLGViZ0Cg+UskZEHmqVkhx/Pm...;
authToken: 1KVARbypAjxLGViZ0Cg+UskZEHmqVkhx/PmEvzkPGp...==
Content-Type: application/xml; charset=UTF-8
QUESTION:
How do I get that to work?
I tried jQuery, but it seems to run into cross-domain (XSS) restrictions. An actual code snippet would be greatly appreciated.
p.s.
All I was looking for was WebClient class in C#
You need to put application/json in your Accept header. This tells the server that you want it to respond in that format rather than XML.
I am using Rails to extract the same authentication token cookie from stage.amee.com/auth as mentioned above. It took a bit of experimentation before I created and customised the correct request object that returned a 200 OK with the authToken as a cookie. I haven't found an effective method of reading the request object, or I would post exactly what it looks like. Here is my Ruby code from the app's controller:
require 'net/http'
require 'uri'
#define parameters
uri = URI.parse('http://stage.amee.com')
@path = '/auth'
@login_details = 'username=your_username&password=your_password'
@headers = {'Content-Type' => 'application/x-www-form-urlencoded', 'Accept' => 'application/json'}
#create request object
req = Net::HTTP.new(uri.host, uri.port)
#send the request using post, defining the path, body and headers
#(on current Ruby versions, post returns only the response; the body is read from resp.body)
resp = req.post(@path, @login_details, @headers)
#print response details to console
puts "response code = " << resp.code
puts "response inspect = " << resp.inspect
resp.each do |key, val|
  puts "response header key : " + key + " = " + val
end
puts "data: " + resp.body
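Once the response has arrived, the token can be read from either the authToken header or the authToken cookie, as the manual excerpt describes. Here is a rough Python sketch of that extraction step. The header block is hypothetical and modeled on the excerpt; because the real token is truncated in the source, EXAMPLE_TOKEN stands in for it.

```python
import http.client
import io
from http.cookies import SimpleCookie

# Hypothetical response headers modeled on the manual excerpt; the real
# token is truncated in the source, so EXAMPLE_TOKEN is a stand-in.
raw_headers = (
    b"Set-Cookie: authToken=EXAMPLE_TOKEN\r\n"
    b"authToken: EXAMPLE_TOKEN\r\n"
    b"Content-Type: application/json; charset=UTF-8\r\n"
    b"\r\n"
)

# Parse the header block the same way http.client does for live responses.
headers = http.client.parse_headers(io.BytesIO(raw_headers))

# Option 1: read the dedicated authToken header.
token_from_header = headers["authToken"]

# Option 2: parse the Set-Cookie header and read the cookie value.
cookie = SimpleCookie()
cookie.load(headers["Set-Cookie"])
token_from_cookie = cookie["authToken"].value

print(token_from_header, token_from_cookie)
```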
