Amazon AWS: Passing append = TRUE to a s3write_using FUN = write_csv - r

I am trying to pass an argument to the write_csv function in R but I can't seem to pass it correctly. It does not append the data to the current csv file in the S3 Bucket.
Currently what I have is:
data %>%
s3write_using(.,
FUN = write_csv,
bucket = "myBUCKET",
object = "myLocationToSaveCSV.csv",
append = TRUE) # Append = TRUE is what I would like to pass to write_csv
I am able to write the data to the S3 bucket but when I want to append data to the current data it just re-writes the old data and no new data gets appended.
How can I pass extra parameters ... to the write_csv(...) function inside s3write_using?

R matches function arguments by position and also by name, which can be confusing,
so try putting the additional arguments directly after FUN:
data %>%
s3write_using(.,
FUN = write_csv,
append = TRUE,
bucket = "myBUCKET",
object = "myLocationToSaveCSV.csv"
)
If this doesn't work, you can try an anonymous function (lambda) that accepts both the data and the file path:
data %>%
s3write_using(.,
FUN = \(x, file) write_csv(x, file, append = TRUE),
bucket = "myBUCKET",
object = "myLocationToSaveCSV.csv")
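A possible reason the append still has no effect, offered as an assumption about s3write_using rather than something confirmed in the thread: the function appears to write to a fresh temporary file and upload that file as the whole object, so append = TRUE has nothing to append to. The base-R sketch below uses a hypothetical stand-in, write_using_sketch, to show the effect and the usual workaround of combining old and new rows before writing.

```r
# Hypothetical stand-in for s3write_using: writes x to a fresh temp file
# via FUN, forwarding extra arguments through `...`, and returns the file's
# lines instead of uploading them.
write_using_sketch <- function(x, FUN, ...) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  FUN(x, tmp, ...)
  readLines(tmp)
}

old <- data.frame(id = 1:2)  # rows assumed to be in the bucket already
new <- data.frame(id = 3:4)  # rows to be added

# append = TRUE reaches write.table, but the temp file starts empty,
# so only the new rows end up in the output.
appended <- write_using_sketch(new, write.table, append = TRUE, sep = ",",
                               row.names = FALSE, col.names = FALSE)

# Combining old and new rows before writing gives the intended content.
combined <- write_using_sketch(rbind(old, new), write.table, sep = ",",
                               row.names = FALSE, col.names = FALSE)
```

With aws.s3 this would presumably mean reading the current object first (for example with s3read_using), binding the new rows to it, and writing the combined data back.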

Related

Convert name of dataframe to string within a function

I have many dataframes which I want to run through a function which creates a directory with the name of that dataframe as the folder name.
I have tried:
Create_dir = function(data){
filename = deparse(substitute(data))
dir.create(filename, showWarnings = FALSE)
}
And
Create_dir = function(data){
list = lst(data, "x")
filename = names(list)[1]
dir.create(filename, showWarnings = FALSE)
}
And a few other methods which all work well outside functions, but inside the function they either name the folder "data" or don't work because filename equates to a list of strings of all the data in the dataframe.
Any help with this would be really appreciated.
The first function actually works, thanks, apologies.
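A minimal sketch of why the first version works: substitute() returns the expression the caller typed for the argument, and deparse() turns it into a string. The names name_of, create_dir, base_dir and sales_2020 are made up for illustration, and the folder is created under a temporary directory so the example leaves nothing behind in the working directory.

```r
# Return the name of the object passed as `data`, as typed by the caller
name_of <- function(data) deparse(substitute(data))

# Create a folder named after the data frame under base_dir
# (base_dir is a hypothetical extra argument, defaulting to a temp folder)
create_dir <- function(data, base_dir = tempdir()) {
  folder <- file.path(base_dir, deparse(substitute(data)))
  dir.create(folder, showWarnings = FALSE)
  folder
}

sales_2020 <- data.frame(x = 1:3)
obj_name <- name_of(sales_2020)
path <- create_dir(sales_2020)
```

One caveat: when the function is called through a wrapper such as lapply, substitute() sees the wrapper's own argument expression rather than the original object name, so this approach is best suited to direct calls.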

How to dynamically change names inside a for loop in usethis::use_data() R

This is my first time creating an R package. I am trying to include 39 different datasets as pre-loaded data in my package. However, the usethis::use_data() function, which creates the .rda files, takes only an unquoted object name and not a variable holding the name. Therefore, with
data = dynamic_name_from_for_loop
it keeps creating a file data.rda instead of dynamic_name_from_for_loop.rda.
library(usethis)
library(readtext)
library(tidyverse)
site_list = c('0034L','0081L','0089L','0166L','0220R','0236L','0307R',
'0333L','0414R','0434L','0445L','0450L','0476R','0501R','0515L',
'0566R','0629R','0651R','0688R','0701R','0817L','0846R','0876L',
'0917R','0938L','1044R','1194R','1227R','1233L','1377L','1396R',
'1459L','1726L','1833R','1946L','2023R','2133L','2201R','2255R')
for (i in 1:length(site_list)){
sitename = site_list[i]
filename = paste0('M:/Tools/GCsandbar/data-raw/',sitename,"sd.csv")
data = read.csv(filename, header = T)
df_name = paste0('RC',sitename,'sd')
assign(df_name,data)
usethis::use_data(data,name = df_name, overwrite = TRUE)
#file.rename(from = 'data/data.rda',to = paste('data/',df_name,'.rda')) ## this did not work
}
This just creates 39 instances of df_name.rda, each overwriting the previous one, instead of creating RC0034Lsd.rda, RC0081Lsd.rda, ....
use_data asks for unquoted names of the objects to be saved.
There is no argument called name in the function and, as far as I can see, name = df_name is not doing what you intend.
Try do.call instead.
library(usethis)
library(readtext)
library(tidyverse)
site_list = c('0034L','0081L','0089L','0166L','0220R','0236L','0307R',
'0333L','0414R','0434L','0445L','0450L','0476R','0501R','0515L',
'0566R','0629R','0651R','0688R','0701R','0817L','0846R','0876L',
'0917R','0938L','1044R','1194R','1227R','1233L','1377L','1396R',
'1459L','1726L','1833R','1946L','2023R','2133L','2201R','2255R')
for (i in 1:length(site_list)){
sitename = site_list[i]
filename = paste0('M:/Tools/GCsandbar/data-raw/',sitename,"sd.csv")
data = read.csv(filename, header = T)
df_name = paste0('RC',sitename,'sd')
assign(df_name, data)
do.call("use_data", list(as.name(df_name), overwrite = TRUE))
}
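The do.call trick can be checked without usethis. In the sketch below, capture_names is a hypothetical stand-in for any function that, like use_data, records the names of the objects passed through its dots; it is not part of usethis. Passing as.name(df_name) builds the call as if the object name had been typed literally.

```r
# Hypothetical stand-in: returns the argument names exactly as the call shows them
capture_names <- function(..., overwrite = FALSE) {
  exprs <- as.list(substitute(list(...)))[-1]
  unname(vapply(exprs, deparse, character(1)))
}

RC0034Lsd <- data.frame(a = 1)
data <- RC0034Lsd
df_name <- "RC0034Lsd"

# Calling directly records the literal symbol `data`
direct <- capture_names(data, overwrite = TRUE)

# do.call with a symbol built from the string records the dynamic name
via_do_call <- do.call(capture_names, list(as.name(df_name), overwrite = TRUE))
```

Here direct holds "data" while via_do_call holds "RC0034Lsd", which mirrors the difference between data.rda and the per-site file names.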
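Alternatively, in the loop, use_data can be changed to use_data_raw (note that use_data_raw sets up a script under data-raw/ rather than saving an .rda file, so it may not produce the files asked for):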
usethis::use_data_raw(df_name)

Unable to read csv from S3 using R

I am trying to read a csv from an AWS S3 bucket. It's the same file that I was able to write to the bucket. When I read it, I get an error. Below is the code for reading the csv:
s3BucketName <- "pathtobucket"
Sys.setenv("AWS_ACCESS_KEY_ID" = "aaaa",
"AWS_SECRET_ACCESS_KEY" = "vvvvv",
"AWS_DEFAULT_REGION" = "us-east-1")
bucketlist()
games <- aws.s3::get_object(object = "s3://path/data.csv", bucket = s3BucketName)%>%
rawToChar() %>%
readr::read_csv()
Below is the error I get:
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>_data.csv</Key><RequestId>222</RequestId><HostId>333=</HostId></Error>
For reference, this is how I wrote the data to the bucket:
s3write_using(data, FUN = write.csv, object = "data.csv", bucket = s3BucketName)
You don't need to include the protocol (s3://) or the bucket name in the object parameter of the get_object function, just the object key (the filename with any prefixes).
You should be able to do something like:
games <- aws.s3::get_object(object = "data.csv", bucket = s3BucketName)
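If the location is only available as a full S3 URI, the bucket and key can be separated first. The helper below, split_s3_uri, is a hypothetical base-R sketch and not part of aws.s3; the URI is made up for illustration.

```r
# Hypothetical helper: split "s3://bucket/prefix/file" into bucket and object key
split_s3_uri <- function(uri) {
  path <- sub("^s3://", "", uri)                    # drop the protocol
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]   # first element is the bucket
  list(bucket = parts[1], key = paste(parts[-1], collapse = "/"))
}

loc <- split_s3_uri("s3://my-bucket/folder/data.csv")
```

The resulting loc$key and loc$bucket would then be the values to pass as object and bucket to get_object.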

How to provide options when writing to S3 from R using s3write_using?

Writing to an S3 bucket from R using the aws.s3 library works like so
s3write_using(iris, FUN = write.csv, object = "iris.csv", bucket = "some-bucket")
But I cannot get an option to work. E.g. row.names=FALSE
What I've tried so far
I have tried the following
s3write_using(iris, FUN = write.csv, object = "iris.csv", bucket = "some-bucket", opts = list(row.names=FALSE))
s3write_using(iris, FUN = write.csv, object = "iris.csv", bucket = "some-bucket", opts = list("row.names"=FALSE))
# Seen here: https://github.com/cloudyr/aws.s3/issues/200
s3write_using(iris, FUN = write.csv, object = "iris.csv", bucket = "some-bucket", opts=list(headers = c('row.names' = 'FALSE')))
None of the above raises an error (or even a warning), but the resulting .csv has row names, which it shouldn't.
Question
How do I write to an S3 bucket from R using s3write_using and specify row.names = FALSE (or any other optional parameter)?
The manual for s3write_using says of the opts argument:
Optional additional arguments passed to put_object or save_object, respectively.
So anything given in opts is passed to other aws.s3 functions and not to write.csv.
But you can always define your own function:
mywrite <- function(x, file) {
write.csv(x, file, row.names=FALSE)
}
And then use this function in lieu of write.csv:
s3write_using(iris, FUN = mywrite, object = "iris.csv", bucket = "some-bucket")
You can even make a function generator, that is, a function that returns a function wrapping write.csv with the necessary arguments:
customize.write.csv <- function(row.names=TRUE, ...) {
function(x, file) {
write.csv(x, file, row.names=row.names, ...)
}
}
Which you can use like this (write.csv ignores dec with a warning, so a different option, na, is shown here):
s3write_using(iris,
FUN = customize.write.csv(row.names=FALSE, na=""),
object = "iris.csv",
bucket = "some-bucket")
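Before uploading anything, the generated writer can be tried against a temporary file. This sketch repeats the generator from the answer so that it runs on its own; the small data frame and the na = "" option are chosen only for illustration.

```r
# Function generator from the answer, repeated so the example is self-contained
customize.write.csv <- function(row.names = TRUE, ...) {
  function(x, file) {
    write.csv(x, file, row.names = row.names, ...)
  }
}

df <- data.frame(id = 1:2, value = c(1.5, NA))

# Build a writer with the options fixed, then call it as FUN(x, file)
writer <- customize.write.csv(row.names = FALSE, na = "")
tmp <- tempfile(fileext = ".csv")
writer(df, tmp)
lines <- readLines(tmp)
unlink(tmp)
```

The first element of lines is the header without a leading row-name column, and the missing value is written as an empty field.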
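It looks like you can also just add the options you want at the end, since s3write_using appears to forward extra arguments to FUN; give it a try (write.csv ignores dec with a warning, so na is used here instead):
s3write_using(iris,
FUN = write.csv,
object = "iris.csv",
bucket = "some-bucket",
row.names=FALSE, na=""
)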

Import file from environment instead of read.table

I am using someone else's package. As you can see, the function has an ImportHistData argument. I want it to use the rainfall object from my environment instead of the file rainfall.txt. When I replace "rainfall.txt" with rainfall, I get this error:
Error in read.table(x, header = FALSE, fill = TRUE, na.strings = y) :
'file' must be a character string or connection
So how can I pass the data from the environment rather than from a text file?
Original form of the function call:
DisagSimul(TimeScale=1/4,BLpar=list(lambda=l,phi=f,kappa=k,
alpha=a,v=v,mx=mx,sx=NA),CellIntensityProp=list(Weibull=FALSE,
iota=NA),RepetOpt=list(DistAllowed=0.1,FacLevel1Rep=20,MinLevel1Rep=50,
TotalRepAllowed=5000),NumOfSequences=10,Statistics=list(print=TRUE,plot=FALSE),
ExportSynthData=list(exp=TRUE,FileContent=c("AllDays"),file="15min.txt"),
ImportHistData=list("rainfall.txt",na.values="NA",FileContent=c("AllDays"),
DaysPerSeason=length(rainfall$Day)),PlotHyetographs=FALSE,RandSeed=5)
The part of the function's source that uses ImportHistData:
ImportHistDataFun(mode = 1, x = ImportHistData$file,
y = ImportHistData$na.values, z = ImportHistData$FileContent[1],
w = TRUE, s = ImportHistData$DaysPerSeason, timescale = 1)
First, check the package documentation (?DisagSimul) to see whether the ImportHistData argument accepts a data frame in memory instead of reading from an external .txt file.
If the function can only read a file from disk and you do not want to save your rainfall data frame permanently as a file, consider using a temporary file, which exists only for the R session or until you call unlink():
# INITIALIZE TEMP FILE
tf <- tempfile(pattern = "", fileext = ".txt")
# EXPORT rainfall to FILE
write.table(rainfall, tf, row.names=FALSE)
...
# USE TEMPFILE IN METHOD
DisagSimul(...
ImportHistData = list(tf, na.values="NA", FileContent=c("AllDays"),
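The temporary-file round trip can be checked on its own before involving the package. The sketch below uses a made-up rainfall data frame and reads the file back with the same read.table arguments that appear in the error message. Writing without a header line is an assumption here, made because the package reads with header = FALSE; the format the package actually expects should be checked against the original rainfall.txt.

```r
# Made-up stand-in for the rainfall data frame
rainfall <- data.frame(Day = 1:3, Depth = c(0.0, 2.5, NA))

# Write to a temporary file, without row names or a header line
tf <- tempfile(fileext = ".txt")
write.table(rainfall, tf, row.names = FALSE, col.names = FALSE)

# Read it back the way the package's read.table call does
back <- read.table(tf, header = FALSE, fill = TRUE, na.strings = "NA")

# Remove the temporary file once it is no longer needed
unlink(tf)
```

If back has the same shape and missing values as rainfall, the path in tf should be usable wherever the function expects a file name.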
