I want to download secured data from LendingClub (a P2P lending company, please Google it if you're interested in what they do).
The secured data can only be downloaded if you have an account, so I now have a username and password, and I copied the file download link from the download page. How can I authenticate myself to download the data? I tried the following:
file <- 'lc1'
url <- "https://www.lendingclub.com/fileDownload.action?type=gen&file=LoanStats3a_securev1.csv.zip"
download.file(url, file)
But it throws a warning:
trying URL 'https://www.lendingclub.com/fileDownload.action?type=gen&file=LoanStats3a_securev1.csv.zip'
Content type 'text/html;charset=UTF-8' length 200 bytes
opened URL
downloaded 14 Kb
Warning message:
In download.file(url, file) :
downloaded length 14531 != reported length 200
The downloaded text file is not the zip file I want. I guess this is because no authentication step is involved: if you don't have an account you can still download the partial data, from a different link:
url <- "https://resources.lendingclub.com/LoanStats3a.csv.zip"
and the previous commands work fine with that one. So where can I add the authentication step?
You'll have to use their REST API with an API key that they give you.
Then you can build a URL to the resource you're looking to download, in the format you'd like (or a format you can manipulate in your code).
You can use curl to double-check your URL:
$ curl -v -H "Authorization: <api key>" -X GET https://api.lendingclub.com/api/investor/v1/accounts/<investor_id>/summary
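From R, the same check can be made with httr. A minimal sketch, where <api key> and <investor_id> are the placeholders from above:
library(httr)
# Same request as the curl call above; substitute your own key and investor id
res <- GET("https://api.lendingclub.com/api/investor/v1/accounts/<investor_id>/summary",
           add_headers(Authorization = "<api key>"))
stop_for_status(res)  # raises an R error on a non-2xx response
content(res)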
Using a Mac; language: .NET
Photoscene ID: krb1VCLBHop1AdbAdPtm96PWtCfsbncmfSXdSG5eUYM
I'm trying to upload images from a mobile device to create a 3D model of an object.
There were some issues uploading local images directly, so I uploaded the images to AWS S3 and generated pre-signed URLs. These URLs were used within the file-upload curl command:
curl -v $BASE'photo-to-3d/v1/file' \
  -H 'Authorization: Bearer '$AUTH \
  -F "photosceneid="$PID \
  -F "type=image" \
  -F "file[0]="$URL
The result shows no error message:
{"Usage":"2.1651990413666",
"Resource":"\/file",
"photosceneid":"krb1VCLBHop1AdbAdPtm96PWtCfsbncmfSXdSG5eUYM",
"Files":{"file":[{"filename":"YXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyMTEwMDdUMDY0NTU0WiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmWC1BbXotU2lnbmF0dXJlPWUyMDgyYWE2YjgyNzFmNzk4NTFmNzc3MzkwYmFhN2ExN2FjM2Y5MjJjNjM3NGIyMzg5YWFmN2IwNzE3OTY3MjU=.jpg",
"fileid":"AALxKyQX1tAhcEgbM6w6lITCOHAleS1Dumu6ocy3qaY=-krb1VCLBHop1AdbAdPtm96PWtCfsbncmfSXdSG5eUYM",
"filesize":"187126",
"msg":"No error"}
Processing ends with an error:
{"Usage":"0.51469397544861",
"Resource":"\/photoscene\/krb1VCLBHop1AdbAdPtm96PWtCfsbncmfSXdSG5eUYM\/progress",
"Photoscene":{"photosceneid":"krb1VCLBHop1AdbAdPtm96PWtCfsbncmfSXdSG5eUYM",
"progressmsg":"ERROR","progress":"100"}}
The file links were checked; the images display correctly and can be downloaded.
The format is JPG.
I checked the photoscene properties and found an error_msg_id:
"error_msg_id":"262"
The meaning of this error code was not found anywhere, including the error-handling documentation.
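For reference, a sketch of the properties check (assuming the endpoint follows the same pattern as the progress call above):
curl -v $BASE'photo-to-3d/v1/photoscene/'$PID'/properties' \
  -H 'Authorization: Bearer '$AUTH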
I don't know what to do from here, since the error message does not specify what exactly went wrong.
I have been trying for three days now to download a file from an FTP server with R, without success. I have really tried everything and read all the related questions, but still cannot manage.
The URL is:
u <- "ftp://user:password#109.2.160.55/AGLO/2020/10/AGLO_00001_03-0_GDBX_1000077_202010032206_860101.CSV.gz"
When I copy-paste this link into Firefox I can download the file, but with R I cannot. I tried download.file, GET, writeBin, and getURL; all failed. getURL gives the following error:
Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
embedded nul in string: '\037‹\b\bøÙx_\0\003AGLO_00001_03-0_GDBX_1000077_202010032206_860101.CSV\0...' [binary output truncated]
There is no proxy problem whatsoever, since I obtained the URL u by browsing the FTP directory itself.
How can I download this file?
Another way I could eventually work around this is using:
browseURL(u,
browser = "C:/Program Files (x86)/Mozilla Firefox/firefox.exe")
The issue with this is that it will open a Firefox browser that will:
Ask me if I am sure I want to go to this site, and then
Ask me what I want to do with the file (less of a problem, since I can set a default to always download, but I would still rather not).
Simply put, I do not want to open a browser and I do not want to be asked whether I want to visit the site or download the file. There are many files on this server, I want to do all of this automatically, and I will need to work in parallel, so having a browser pop up is not great; but if all else fails I can accept it.
I am so desperate that I can give you the user name and password in private.
Apparently downloading the file to disk using httr solved the problem. It is possible to combine write_disk and httr::GET to download files to disk in the following way:
library(httr)
to_download <- "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"
# Download pdf to disk
GET(to_download, write_disk("dummy.pdf"))
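For the original FTP case, the same download-to-disk idea works with the curl package, which handles ftp:// URLs and writes in binary mode by default. A minimal sketch, assuming the credentials and path from the question:
library(curl)
# curl_download() streams the response straight to disk in binary mode
u <- "ftp://user:password@109.2.160.55/AGLO/2020/10/AGLO_00001_03-0_GDBX_1000077_202010032206_860101.CSV.gz"
curl_download(u, destfile = "AGLO_00001_03-0_GDBX_1000077_202010032206_860101.CSV.gz")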
I have an application URL. I need to run a login test using JMeter. I recorded the login steps using the BlazeMeter extension for Chrome, but when I run the test I get the error below. I know there have been questions like this; I have tried a few of the suggestions, but it seems my case is different.
I have tried:
Added these two lines to jmeter.bat:
set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_65
set PATH=%JAVA_HOME%\bin;%PATH%
Ran JMeter using "Run as Administrator"
Downloaded the certificate from https://gist.github.com/borisguery/9ef114c53b83e553b635 and installed it as shown in https://www.youtube.com/watch?v=2k581jcWk9M
Restarted JMeter and tried again, but no luck.
When I expand the error in JMeter's View Results Tree listener, the error is on this particular CSS file: https://abcurl.xyzsample.com/assets/loginpage/css/okta-sign-in.min.7c7cfd15fa939095d61912dd8000a2a8.css
Error:
Thread Name: Thread Group 1-1
Load time: 268
Connect Time: 0
Latency: 0
Size in bytes: 2256
Headers size in bytes: 0
Body size in bytes: 2256
Sample Count: 1
Error Count: 1
Response code: Non HTTP response code: javax.net.ssl.SSLHandshakeException
Response message: Non HTTP response message: Received fatal alert: handshake_failure
Response headers:
HTTPSampleResult fields:
ContentType:
DataEncoding: null
If you are getting the error for only one .css file and it does not belong to the application under test (i.e. it is an external stylesheet), the best thing you can do is exclude it from the load test via the "URLs must match" section, which lives under the "Advanced" tab of the HTTP Request Defaults configuration element.
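For example, a pattern along these lines in "URLs must match" (an illustrative regex; adjust it to your page) makes JMeter download all embedded resources except URLs containing okta-sign-in:
^((?!okta-sign-in).)*$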
If you need to load this .css by any means, you could also try the following approaches:
Play with the https.default.protocol and https.socket.protocols properties (look for these lines in the jmeter.properties file); see the sketch after this list.
Install the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files into the /jre/lib/security folder of your JRE or JDK home (replace the existing files with the downloaded ones).
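A sketch of the relevant jmeter.properties lines (illustrative values; match them to the protocols the server actually supports):
https.default.protocol=TLSv1.2
https.socket.protocols=TLSv1 TLSv1.1 TLSv1.2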
If your URL needs a client certificate, copy your cert to the /bin folder; then, in the JMeter console, go to Options -> SSL Manager and select your cert. It will prompt you for the certificate password, and if you run your tests again, that should work.
Additionally, you can also set up a Keystore Configuration (http://jmeter.apache.org/usermanual/component_reference.html#Keystore_Configuration), if you haven't done so already.
Please note that my JMeter version is 4.0. Hope this helps.
I'm trying to download a file in R on a remote server which sits behind a number of proxies. Something - I can't figure out what - is causing the file to be returned cached whenever I try and access it on that server, whether I do so through R or just through a Web Browser.
I've tried using cacheOK=FALSE in my download.file call and this has had no effect.
Following "Is there a way to force browsers to refresh/download images?", I have tried adding a random suffix to the end of the URL:
download.file(url = paste("http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily.zip?",
format(Sys.time(), "%d%m%Y"),sep=""),
destfile = "F-F_Research_Data_Factors_daily.zip", cacheOK=FALSE)
This produces, e.g., the following URL:
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily.zip?17092012
When accessed from a web browser on the remote server, this URL indeed returns the latest version of the file. However, when accessed using download.file in R, it returns a corrupted zip archive. Both WinRAR and R's unzip function complain that the zip file is corrupt.
unzip("F-F_Research_Data_Factors_daily.zip")
1: In unzip("F-F_Research_Data_Factors_daily.zip") :
internal error in unz code
I can't see why downloading this file via R would cause a corrupted file to be returned, whereas downloading it via a Web Browser gives no problem.
Can anyone suggest either a way to beat the cache from R (about which I'm not hopeful), or a reason why download.file doesn't like my URL with ?someRandomString tacked onto the end of it?
It will work if you use mode = "wb". On Windows, download.file's default mode transfers the file as text, which translates line endings and corrupts binary files such as zip archives; "wb" writes the bytes to disk as-is:
download.file(url = paste("http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_daily.zip?",
                          format(Sys.time(), "%d%m%Y"), sep = ""),
              destfile = "F-F_Research_Data_Factors_daily.zip", mode = "wb", cacheOK = FALSE)
I am interested in opening an Excel 2007 file in R 2.11.1 using RODBC. The Excel file resides in the shared documents page of a MOSS 2007 website. I currently download the .xlsx file to my hard drive and then import it into R using the following code:
library(RODBC)
con <- odbcConnectExcel2007("C:/file location/file.xlsx")
data <- sqlFetch(con, "worksheet name")
close(con)
When I type the web URL for the document into the odbcConnectExcel2007 connection, an error message pops up:
ODBC Excel Driver Login Failed: Invalid internet Address.
followed by the following message in my R console:
ERROR: Could not SQLDriverConnect
Any insights you can provide would be greatly appreciated.
Thanks!
UPDATE:
The site I am attempting to download from is password protected. I tried another approach using getURL from the RCurl package:
x = getURL("http://website.com/file.xlsx", userpwd = "uname:pw")
The error that I receive is:
Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
embedded nul in string: 'PK\003\004\024\0\006\0\b\0\0\0!\0...' [binary output truncated]
I have no idea what this means. Any help would be appreciated. Thanks!
Two solutions worked for me.
If you do not need to automate the script that pulls the data, you can map a network drive pointing to the SharePoint folder from which you want to extract the Excel document.
If you need to automate a script to pull the Excel file every couple of minutes, I recommend sending your authentication credentials in a request that automatically saves the file to a local drive. From there you can read it into R for further data wrangling.
library("httr")
library("openxlsx")
user <- "<USERNAME>"
password <- "<PASSWORD>"
url <- "https://sharepoint.company/file_to_obtain.xlsx"
httr::GET(url,
authenticate(user, password, type="ntlm"),
write_disk("C:/tempfile.xlsx", overwrite = TRUE))
df <- openxlsx::read.xlsx("C:/tempfile.xlsx")
You can obtain the correct URL to the file by clicking on the SharePoint location and removing "?Web=1" after the file ending (xlsx, xlsb, xls, ...). USERNAME and PASSWORD are usually Windows credentials. It helps to store them in a key manager, for example with keyring:
library("keyring")
keyring::key_set_with_value(service = "Windows", username = "Key", password = "<PASSWORD>")
and then authenticating via
authenticate(user, keyring::key_get("Windows", "Key"), type = "ntlm")
In some instances it may be sufficient to pass
authenticate(":", ":", type="ntlm")
if only your Windows credentials are required and the code is running from your machine.