XLSX data upload with RestRserve

I would like to use RestRserve to have a .xlsx file uploaded for processing. I have tried the approach below with a .csv successfully, but slight modifications for .xlsx with get_file were not fruitful.
ps <- r_bg(function() {
  library(RestRserve)
  library(readr)
  library(xlsx)
  app = Application$new(content_type = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
  app$add_post(
    path = "/echo",
    FUN = function(request, response) {
      cnt <- request$get_file("xls")
      dt <- xlsx::read.xlsx(cnt, sheetIndex = 1, header = TRUE)
      response$set_body("some function")
    }
  )
  backend = BackendRserve$new()
  backend$start(app, http_port = 65080)
})

According to the documentation, the request$get_file() method returns a raw vector - a binary representation of the file. I'm not aware of R packages/functions that read an xls/xlsx file directly from a raw vector (such functions probably exist, I just don't know of them).
Here you can write the body to a file and then read it the normal way:
library(RestRserve)
library(readxl)

app = Application$new()
app$add_post(
  path = "/xls",
  FUN = function(request, response) {
    fl = tempfile(fileext = '.xlsx')
    xls = request$get_file("xls")
    # need to drop attributes as writeBin()
    # can't write an object with attributes
    attributes(xls) = NULL
    writeBin(xls, fl)
    xls = readxl::read_excel(fl, sheet = 1)
    response$set_body("done")
  }
)
backend = BackendRserve$new()
backend$start(app, http_port = 65080)
Also note that the content_type argument controls response encoding, not request decoding.
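To try the endpoint out, here is a quick client-side sketch (assuming the app above is running locally on port 65080 and you have a test.xlsx file at hand; the multipart field name must match the "xls" key passed to get_file()):

library(httr)

# send test.xlsx as a multipart form field named "xls"
resp <- POST(
  "http://localhost:65080/xls",
  body = list(xls = upload_file("test.xlsx")),
  encode = "multipart"
)
content(resp, as = "text")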

Related

curl_download doesn't append

I'm trying to take a set of DOIs and have the doi.org website return the information in .bib format. The code below is supposed to do that and, crucially, append each new result to a .bib file. mode = "a" is what I understand should do the appending, but it doesn't. The last line of code prints out the contents of outFile and it contains only the last .bib result.
What needs to be changed to make this work?
library(curl)
outFile <- tempfile(fileext = ".bib")
url1 <- "https://doi.org/10.1016/j.tvjl.2017.12.021"
url2 <- "https://doi.org/10.1016/j.yqres.2013.10.005"
h <- new_handle()
handle_setheaders(h, "accept" = "application/x-bibtex")
curl_download(url1, destfile = outFile, handle = h, mode = "a")
curl_download(url2, destfile = outFile, handle = h, mode = "a")
read_delim(outFile, delim = "\n")
It's not working for me either with curl_download(). Alternatively you could download with curl() and use write() with append = TRUE.
Here is a solution for that, which can easily be used for as many URLs as you want to download the BibTeX from. You can execute this after your line 7.
library(dplyr)
library(purrr)

urls <- list(url1, url2)

walk(urls, ~ {
  curl(., handle = h) %>%
    readLines(warn = FALSE) %>%
    write(file = outFile, append = TRUE)
})
library(readr)
read_delim(outFile, delim = "\n")
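Another option (an untested sketch) is to skip connections altogether and fetch each response into memory with curl_fetch_memory(), appending the text to outFile yourself:

library(curl)

for (u in c(url1, url2)) {
  res <- curl_fetch_memory(u, handle = h)  # response body as a raw vector
  cat(rawToChar(res$content), "\n", file = outFile, append = TRUE, sep = "")
}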

R loop to extract CSV files from FTP

I'm trying to loop through all the CSV files on an FTP site and upload the contents of CSVs with a certain filename to a database.
So far I've been able to
access the FTP using...
getURL(url, userpwd = userpwd, ftp.use.epsv = FALSE, dirlistonly = TRUE)
get a list of the filenames using...
unlist(strsplit(filenames, "\r\n"))
and create a dataframe with a list of the full urls (e.g ftp://sample#ftpserver.name.com/samplename.csv) using...
for (i in seq_along(myfiles)) {
url_list[i,] <- paste(url, myfiles[i], sep = '')
}
How do I loop through this dataframe, filtering for certain filenames, in order to create a new dataframe with all of data from the relevant CSVs? (half the files are named Type1SampleName and half are Type2SampleName)
I would then upload this data to the database.
Thanks!
Since RCurl::getURL returns the HTTP response directly (here, the content of the CSVs), consider extending your lapply call to pass the result into read.csv via its text argument:
# VECTOR OF URLs
urls <- paste0(url, myfiles[grep("Type1", myfiles)])

# LIST OF DATA FRAMES FROM EACH CSV
mydata <- lapply(urls, function(url) {
  resp <- getURL(url, userpwd = userpwd, connecttimeout = 60)
  read.csv(text = resp)
})
Alternatively, getURL supports a callback function via its write argument; from the documentation:
Alternatively, if a value is supplied for the write parameter, this is returned. This allows the caller to create a handler within the call and get it back. This avoids having to explicitly create and assign it and then call getURL and then access the result. Instead, the 3 steps can be inlined in a single call.
# USER DEFINED METHOD
import_csv <- function(resp) read.csv(text = resp)

# LONG FORM NOTATION
mydata <- lapply(urls, function(url)
  getURL(url, userpwd = userpwd, connecttimeout = 60, write = import_csv)
)

# SHORT FORM NOTATION
mydata <- lapply(urls, getURL, userpwd = userpwd, connecttimeout = 60, write = import_csv)
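In either case, since you ultimately want one data frame containing all of the data, you can stack the list afterwards (assuming all Type1 CSVs share the same columns):

# COMBINE LIST OF DATA FRAMES INTO ONE
mydata_all <- do.call(rbind, mydata)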
Just an update on how I finished this off and what worked for me in the end...
mydata <- lapply(urls, getURL, userpwd = userpwd, connecttimeout = 60)
Following on from above...
i <- 1
while (i <= length(mydata)) {
  mydata1 <- paste0(mydata[[i]])
  bin <- read.csv(text = mydata1, header = FALSE, skip = 1)
  # Column renaming and formatting here
  # Uploading to database using RODBC here
  i <- i + 1
}
Thanks for the pointers @Parfait - really appreciated.
Like most problems it looks straightforward after you've done it!
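For anyone landing here later, the RODBC placeholder in the loop could look roughly like this (a sketch only; "mydsn" and "my_table" are hypothetical names for your own DSN and target table):

library(RODBC)

con <- odbcConnect("mydsn")                  # hypothetical DSN for the target database
sqlSave(con, bin, tablename = "my_table",    # append the parsed rows to an existing table
        append = TRUE, rownames = FALSE)
odbcClose(con)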

Unable to read csv from S3 using R

I am trying to read a CSV from an AWS S3 bucket. It's the same file which I was able to write to the bucket. When I read it I get an error. Below is the code for reading the CSV:
s3BucketName <- "pathtobucket"

Sys.setenv("AWS_ACCESS_KEY_ID" = "aaaa",
           "AWS_SECRET_ACCESS_KEY" = "vvvvv",
           "AWS_DEFAULT_REGION" = "us-east-1")

bucketlist()

games <- aws.s3::get_object(object = "s3://path/data.csv", bucket = s3BucketName) %>%
  rawToChar() %>%
  readr::read_csv()
Below is the error I get
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>_data.csv</Key><RequestId>222</RequestId><HostId>333=</HostId></Error>
For reference, below is how I wrote the data to the bucket:
s3write_using(data, FUN = write.csv, object = "data.csv", bucket = s3BucketName)
You don't need to include the protocol (s3://) or the bucket name in the object parameter of the get_object function, just the object key (the filename with any prefixes).
Should be able to do something like
games <- aws.s3::get_object(object = "data.csv", bucket = s3BucketName)
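From there your original rawToChar() %>% read_csv() pipeline should work unchanged. As a side note (an untested sketch), aws.s3 also has s3read_using(), the read counterpart of the s3write_using() you used for writing:

games <- aws.s3::s3read_using(FUN = readr::read_csv,
                              object = "data.csv",
                              bucket = s3BucketName)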

Import file from environment instead of read.table

I am using someone else's package. As you can see, there is an ImportHistData argument in the function. I want to import the data from the environment (an object named rainfall) instead of from rainfall.txt. When I replace rainfall.txt with rainfall, I get this error:
Error in read.table(x, header = FALSE, fill = TRUE, na.strings = y) :
'file' must be a character string or connection
So, how can I pass the data from the environment rather than reading it from a text file?
Original shape of the function
DisagSimul(TimeScale = 1/4,
           BLpar = list(lambda = l, phi = f, kappa = k,
                        alpha = a, v = v, mx = mx, sx = NA),
           CellIntensityProp = list(Weibull = FALSE, iota = NA),
           RepetOpt = list(DistAllowed = 0.1, FacLevel1Rep = 20, MinLevel1Rep = 50,
                           TotalRepAllowed = 5000),
           NumOfSequences = 10,
           Statistics = list(print = TRUE, plot = FALSE),
           ExportSynthData = list(exp = TRUE, FileContent = c("AllDays"), file = "15min.txt"),
           ImportHistData = list("rainfall.txt", na.values = "NA", FileContent = c("AllDays"),
                                 DaysPerSeason = length(rainfall$Day)),
           PlotHyetographs = FALSE, RandSeed = 5)
Source of ImportHistData part in the function
ImportHistDataFun(mode = 1, x = ImportHistData$file,
y = ImportHistData$na.values, z = ImportHistData$FileContent[1],
w = TRUE, s = ImportHistData$DaysPerSeason, timescale = 1)
First, check the package documentation (?DisagSimul) to see if the method allows a data frame in memory to be used for the ImportHistData argument instead of reading from an external .txt file.
If the function is set up to only read a file from disk and you do not want to save your rainfall data frame permanently as a file, consider using a tempfile that exists only in the R session or until you use unlink():
# INITIALIZE TEMP FILE
tf <- tempfile(pattern = "", fileext = ".txt")

# EXPORT rainfall TO FILE
write.table(rainfall, tf, row.names = FALSE)

...

# USE TEMPFILE IN METHOD
DisagSimul(...,
           ImportHistData = list(tf, na.values = "NA", FileContent = c("AllDays"),
                                 DaysPerSeason = length(rainfall$Day)),
           ...)
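When the simulation has finished, you can remove the temp file explicitly (per the unlink() note above):

# CLEAN UP TEMP FILE (it would also vanish when the R session ends)
unlink(tf)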

RCurl - How to retrieve data from sftp?

I tried to retrieve data from an SFTP server with the code below:
library(RCurl)

protocol <- "sftp"
server <- "xxxx@sftp.xxxx.com"
userpwd <- "xxx:yyy"
tsfrFilename <- "cccccc.tsv"
ouptFilename <- "out.csv"

opts = list(
  # ssh.public.keyfile = "true", # file name
  ssh.private.keyfile = "xxxxx.ppk",
  keypasswd = "userpwd"
)

# Run #
## Download Data
url <- paste0(protocol, "://", server, tsfrFilename)
data <- getURL(url = url, .opts = opts, userpwd = userpwd)
and I received an error message:
Error in function (type, msg, asError = TRUE) : Authentication failure
What am I doing wrong?
Thanks
With a private key you do not need a password with your username. So your getURL statement will be:
data <- getURL(url = url, .opts = opts, username="username")
I had exactly the same problem and have just spent an hour trying different things out. What worked for me was changing the format of the private key to OpenSSH.
To do this, I used the key generator PuTTYgen. Go to the menu item "Conversions" to import the original private key and export it to the OpenSSH format. I exported the converted key to the same folder that my original key was in, with a new filename. I kept the *.ppk extension.
Then I used the following commands:
opts <- list(
  ssh.private.keyfile = "<path to my new OpenSSH Key>.ppk"
)
data <- getURL(url = URL, .opts = opts, username = username, verbose = TRUE)
This seemed to work fine.
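If the download succeeds, data contains the whole TSV file as one character string, so a rough sketch of the remaining steps (untested; reuses ouptFilename from the question) could be:

df <- read.delim(text = data, stringsAsFactors = FALSE)  # parse the tab-separated content
write.csv(df, ouptFilename, row.names = FALSE)           # save it as CSV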
