I have used googledrive functions successfully to access xlsx spreadsheets on my own google drive - so
drive_download(file = "DIRECTOR_TM/Faculty/Faculty Productivity/Faculty productivity.xlsx",
overwrite=TRUE)
works and saves a local copy of the file for me to run analyses on.
Mid year we switched to using team drives and the equivalent
drive_download(file = "Director/Faculty/Faculty Productivity/Faculty productivity.xlsx",
overwrite=TRUE)
doesn't work - I get an error that says "Error: 'file' does not identify at least one Drive file."
So I have tried using the team_drive_get function - and am confused
Director <- team_drive_get("Director")
does work - I get a tibble with one observation. But the file I want is in a subdirectory of the "Director" team drive. So I tried
TeamDrive <- team_drive_get("Director/Faculty/Faculty Productivity/")
but the result is a 0 obs tibble.
How do I get access to a file in a subdirectory on a team drive?
googledrive uses IDs to identify objects in a flattened file structure for your team, i.e., you don't need to know the subdirectory. If you know the name of your file, you just need to search the team drive and find the ID (your specific question, which is also how I found this, is addressed below).
# environment variables
FILENAME <- "your_file_name"
TEAM_DRIVE_NAME <- "your_team_name_here"
# get file(s)
gdrive_files_df <- drive_find(team_drive = TEAM_DRIVE_NAME)
drive_download(
as_id(gdrive_files_df[gdrive_files_df$name == FILENAME,]$id),
overwrite = TRUE
)
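The name filter above returns zero or several IDs if the file name is missing or duplicated on the drive. A minimal sketch of a guard, run here on a mock data frame standing in for the result of drive_find() (only its name and id columns are assumed; pick_file_id and the IDs shown are hypothetical, not part of googledrive):

```r
# Hypothetical helper: return the single id whose name matches,
# or stop with a clear message when there are no or several matches.
pick_file_id <- function(files, name) {
  hits <- files[files$name == name, , drop = FALSE]
  if (nrow(hits) == 0) {
    stop("No file named '", name, "' found.")
  }
  if (nrow(hits) > 1) {
    stop(nrow(hits), " files named '", name, "' found; filter further.")
  }
  hits$id
}

# Mock of the name/id columns that a drive_find() result carries (made-up IDs)
files <- data.frame(
  name = c("Faculty productivity.xlsx", "notes.txt", "notes.txt"),
  id = c("id-aaa", "id-bbb", "id-ccc"),
  stringsAsFactors = FALSE
)

file_id <- pick_file_id(files, "Faculty productivity.xlsx")
```

With a real search result, the returned ID would then be wrapped in as_id() and passed to drive_download() as in the snippet above.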
Alternatively, this is what you can do if you do need to find the specific ID of a subdirectory (perhaps for an upload where there is no existing ID for the file).
# environment variables
FILEPATH <- "your_file_path"
TEAM_SUBDIRECTORY <- "your_subdirectory"
# grab the ID of your subdirectory and upload to that directory
drive_upload(
FILEPATH,
path = as_id(gdrive_files_df[gdrive_files_df$name == TEAM_SUBDIRECTORY,]$id),
name = FILENAME
)
Let's say my colleagues and I have a shared directory, such as a SharePoint drive. Our file path to any given directory, say OurProject1 will be the same with the only difference being our username.
So for example my path will be: "C:/Users/JohnLennon/SharedDrive/SharedData/baseline_data"
While theirs will be: "C:/Users/RingoStarr/SharedDrive/SharedData/baseline_data"
I am trying to write a function that will allow any of my colleagues who has mapped the shared drive to run a script that accesses data in the shared data without them having to manually input their username. Keep in mind that the project directory is not the shared drive - that if I share this script with a colleague it will be kept outside of the shared directory and so relative file paths with regards to the project won't work.
I have been trying to approach this using an absolute file path set temporarily within the function that infers the first half of the directory path from getwd(). So the function looks a bit like this:
wd <- getwd() # get the users working dir
usr <- substr(wd, 1, 18) # extract the root down to the username
paste(usr, "SharedDrive/SharedData/baseline_data", sep = "") # prefix this onto the shared directory path
This works fine for RingoStarr, who has the same number of characters in his username as JohnLennon, but what about GeorgeHarrison, or all the other users? Counting characters on line two is clearly a limited approach.
I am looking for a modification to line two that will navigate "blindly" from the working directory, which we assume to be a subdirectory of "C:/Users/Username/", to two levels below the root directory (i.e. to the Username directory). ".." won't work here as we don't know whereabouts within the Username directory getwd() is.
I am also open to a different approach to the problem if one exists
Instead of substr, you can try strsplit and then paste with the collapse argument:
wd_split <- strsplit(wd, "\\/")
wd_split
# [[1]]
# [1] "C:" "Users" "JohnLennon" "SharedDrive" "SharedData"
usr <- paste(wd_split[[1]][1:3], collapse = "/")
usr
# "C:/Users/JohnLennon"
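The same idea can be wrapped in a small function so that the result does not depend on the length of the username. This is a sketch under the question's assumption that the working directory sits somewhere under C:/Users/<username>/; shared_path is a hypothetical name, not part of any package:

```r
# Hypothetical helper: keep the first three path components (drive, "Users",
# username) of `wd` and append the shared-drive components to them.
shared_path <- function(wd, ...) {
  # Normalise backslashes so Windows-style paths split the same way
  parts <- strsplit(gsub("\\\\", "/", wd), "/", fixed = TRUE)[[1]]
  user_root <- paste(parts[1:3], collapse = "/")
  file.path(user_root, ...)
}

p <- shared_path(
  "C:/Users/GeorgeHarrison/Documents/R",
  "SharedDrive", "SharedData", "baseline_data"
)
p
# "C:/Users/GeorgeHarrison/SharedDrive/SharedData/baseline_data"
```

In a script, the first argument would be getwd(). On Windows, Sys.getenv("USERPROFILE") usually points at the same user root and avoids relying on the working directory at all.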
I am trying to upload hundreds of googlesheets using the new R googlesheets4 package using the function gs4_create. I can successfully upload files in the root of the google drive but fail to see how I can send it inside a pre existing folder on google drive.
See the following reprex:
df <- data.frame(a=1:10,b=letters[1:10])
googlesheets4::gs4_create(name = "TEST_FOLDER/testsheet", sheets = df)
It creates a file named "TEST_FOLDER/testsheet" in the root folder,
while I want to create the file inside TEST_FOLDER.
I know I can use write_sheet() on files pre existing inside a folder but I want to create new files, not write in pre existing files. I also know the googledrive::drive_upload() will allow me to upload csv files but I do not like the format of the csv files when they are uploaded, as they go as plain text sheets with no frozen first row. This is possible only through the googlesheets4 package. So back to my question:
How do I create googlesheet files (in bulk) inside TEST_FOLDER?
First, create the folder with drive_mkdir(name = "TEST_FOLDER") from the googledrive package. Once it exists, I would recommend working with the IDs of the folder and the files. The next step is to find the folder's ID:
folder_id <- drive_find(n_max = 10, pattern = "TEST_FOLDER")$id
*This works if you have only one folder called TEST_FOLDER in your Google Drive. If you have more than one, I would recommend copying and pasting the ID directly, or identifying the ID you want before assigning it to the "folder_id" object.
*If you don't want to do this step, you can also copy and paste the ID directly from the Google Drive URL.
Once you have it, you can program a for loop in order to upload all files. For example, supposing your sheets are called sheet1, sheet2... sheet10:
a <- rep("sheet",10)
b <- 1:10
names <- paste0(a,b)
for(x in names){
gs4_create(name = x, sheets = list(sheet1 = get(x)))
sheet_id <- drive_find(type = "spreadsheet", n_max = 10,
pattern = x)$id
drive_mv(file = as_id(sheet_id), path = as_id(folder_id))
}
NOTE: If you have too many files in your root folder of Google Drive, the mkdir function will take too much time. That's why I recommend working with ids. If you have this problem, you could create this folder manually, copy the id and assign it to the "folder_id" object.
I am using the googledrive package from CRAN. But the drive_upload function only lets you upload a local file, not a data frame. Can anybody help with this?
Just save the data frame in question to a local file. The most basic options would be saving to CSV or saving as RData.
Example:
test <- data.frame(a = 1)
save(test, file = "test.RData")  # write the data frame to a local file
rm(test)
load("test.RData")               # restores the object named test
exists("test")
Since it has been clarified that a temporary file cannot be used, we could use a file connection instead.
test <- data.frame(a = 1)
tempFileCon <- file()
write.csv(test, file = tempFileCon)
And now we have the file connection in memory, which we can pass to other functions. Caveat: refer to it by its object name, not by a quoted string as you would with an actual file.
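To check what ended up in the connection, the lines can be read back from it. A small sketch using only base R (the names con and csv_lines are illustrative):

```r
test <- data.frame(a = 1)

con <- file()                 # anonymous file connection, open for reading and writing
write.csv(test, file = con)   # pass the connection object itself, not a quoted name
csv_lines <- readLines(con)   # read back the CSV text that was just written
close(con)                    # release the connection when finished

csv_lines
```

Whether a given upload function accepts a connection rather than a file path depends on that function, so this only shows the round trip within R.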
Unfortunately, I can find no way to push the data frame up directly. To document the basics for others who land on this question, the following code writes a local .csv and then uploads it through the tidyverse googledrive package so that it arrives as a googlesheet.
library(readr)        # provides write_csv()
library(googledrive)  # provides drive_upload()
write_csv(iris, 'df_iris.csv')
drive_upload('df_iris.csv', type='spreadsheet')
You can achieve this using gs_add_row from googlesheets package. This API accepts dataframes directly as input parameter and uploads data to the specified google sheet. Local files are not required.
From the help section of ?gs_add_row:
"If input is two-dimensional, internally we call gs_add_row once per input row."
This can be done in two ways. As others have mentioned, a local file can be created and then uploaded. It is also possible to create a new spreadsheet directly in your drive. This spreadsheet will be created in the main folder of your drive. If you want it stored somewhere else, you can move it after creation.
# install the packages
install.packages(c("googledrive", "googlesheets4"))
# load the libraries
library(googledrive)
library(googlesheets4)
## With local storage
# Locally store the file
write.csv(x = iris, file = "iris.csv")
# Upload the file
drive_upload(media = "iris.csv", type='spreadsheet')
## Direct storage
# Create an empty spreadsheet. It is stored as an object with a sheet_id and drive_id
ss <- gs4_create(name = "my_spreadsheet", sheets = "Sheet 1")
# Put the data.frame in the spreadsheet and provide the sheet_id so it can be found
sheet_write(data=iris, ss = ss, sheet ="Sheet 1")
# Move your spreadsheet to the desired location
drive_mv(file = ss, path = "my_creations/awesome location/")
I have a file in my google drive that is an xlsx. It is too big, so it is not automatically converted to a googlesheet (that's why using the googlesheets package did not work), and I can't even preview it by clicking on it in my google drive. The only way to see it is to download it as an .xlsx. While I could load it as an xlsx file, I am trying instead to use the googledrive package.
So far what I have is:
library(googledrive)
drive_find(n_max = 50)
drive_download("filename_without_extension.xlsx",type = "xlsx")
but I got the following error:
'file' does not identify at least one Drive file.
Maybe it is me not specifying the path where the file lives in the Drive. For example : Work\Data\Project1\filename.xlsx
Could you give me an idea on how to load in R the file called filename.xlsx that is nested in the drive like that?
I read the documentation but couldn't figure out how to do that. Thanks in advance.
You should be able to do this by:
library(googledrive)
drive_download("~/Work/Data/Project1/filename.xlsx")
The type parameter is only for Google native spreadsheets, and does not apply to raw files.
I want to share my approach.
I do it this way because I keep updating the xlsx file. It is a query result that comes from an ERP.
So, when I tried to do it by Google Drive ID, it gave me errors because each time the ERP updates the file, its ID changes.
This is my context; yours may be completely different. This file changes just two or three times a month. Even though it is a "big" xlsx file (78-80K records with 19 factors), I use it for just seconds to calculate some values and then I can trash it. It does not make sense to store it (storing is more expensive than uploading).
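library(googledrive)
library(googlesheets4) # watch out: this is the development version 0.1.1.9000, not yet on CRAN
library(glue)          # provides glue(), used below
library(readxl)        # provides read_xlsx(), used below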
drive_folder_owner<-"carlos.sxxx#xxxxxx.com" # this is my account in this gDrive folder.
drive_auth(email =drive_folder_owner) # previously authorized account
googlesheets4::sheets_auth(email =drive_folder_owner) # Yes, I know, should be the same, but they are different.
d1<-drive_find(pattern = "my_file.xlsx",type = drive_mime_type("xlsx")) # find the file created by the ERP; the type narrows the search
meta<-drive_get(id=d1$id)[["drive_resource"]] # get the file's metadata from Google Drive
n_id<-glue("https://drive.google.com/open?id=",d1$id[[1]]) # build a URL to read the file from
meta_name<- paste(getwd(),"/Files/",meta[[1]]$originalFilename,sep = "") # and a local path to save it to temporarily
drive_download(file=as_id(n_id),overwrite = TRUE, path = meta_name) # download and save locally
V_CMV<-data.frame(read_xlsx(meta_name)) # read into a data frame
file.remove(meta_name) # delete the local copy from the R server
rm(d1,n_id) # delete temporary variables
I've been working on an R project (projectA) that I want to hand over to a colleague. What would be the best way to handle workspace references in the scripts? To illustrate, let's say projectA consists of several R scripts that each read input from and write output to certain directories (dirs). All dirs are contained within my local Dropbox. The I/O part of the scripts looks as follows:
# Script 1.
# Give input and output names and dirs:
dat1Dir <- "D:/Dropbox/ProjectA/source1/"
dat1In <- "foo1.asc"
dat2Dir <- "D:/Dropbox/ProjectA/source2/"
dat2In <- "foo2.asc"
outDir <- "D:/Dropbox/ProjectA/output1/"
outName <- "fooOut1.asc"
# Read data
setwd(dat1Dir)
dat1 <- read.table(dat1In)
setwd(dat2Dir)
dat2 <- read.table(dat2In)
# do stuff with dat1 and dat2 that result in new data foo
# Write new data foo to file
setwd(outDir)
write.table(foo, outName)
# Script 2.
# Give input and output names and dirs
dat1Dir <- "D:/Dropbox/ProjectA/output1/"
dat1In <- "fooOut1.asc"
outDir <- "D:/Dropbox/ProjectA/output2/"
outName <- "fooOut2.asc"
Etc. Each script reads and writes data from/to file, and subsequent scripts read the output of previous scripts. The question is: how can I ensure that the directory strings remain valid after transfer to another user?
Let's say we copy the ProjectA folder, including subfolders, to another PC, where it is stored at, e.g., C:/Users/foo/my documents/. Ideally, I would have a function FindDir() that finds the location of the lowest common folder in the project, here "ProjectA", so that I can replace every directory string with:
dat1Dir <- paste(FindDir(), "ProjectA/source1", sep= "")
So that:
# At my own PC
dat1Dir <- paste(FindDir(), "ProjectA/source1", sep= "")
> "D:/Dropbox/ProjectA/source1/"
# At my colleague's PC
dat1Dir <- paste(FindDir(), "ProjectA/source1", sep= "")
> "C:/Users/foo/my documents/ProjectA/source1/"
Or perhaps there is a different way? Our work IT infrastructure currently does not allow using a shared disc. I'll put helper-functions in an 'official' R project (ie, hosted on R forge), but I'd like to use scripts when many I/O parameters are required and because the code can easily be viewed and commented.
Many thanks in advance!
You should be able to do this by using relative directory paths. This is what I do for my R projects that I have in Dropbox and that I edit/run on both my Windows and OS X machines where the Dropbox folder is D:/Dropbox and /Users/robin/Dropbox respectively.
To do this, you'll need to:
1. Set the current working directory in R (either in the first line of your script, or interactively at the console before running), using setwd('/Users/robin/Dropbox') (see the full docs for that command).
2. Change your paths to relative paths, which means they contain just the part of the path below the current directory: in this case 'ProjectA/source1' if you've set your current directory to your Dropbox folder, or just 'source1' if you've set your current directory to the ProjectA folder (which is a better idea).
Then everything should just work!
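As a rough sketch of the second step, here is the question's first script rewritten with relative paths. A temporary folder stands in for the real ProjectA directory so the example runs anywhere; the folder layout and file names follow the question, while the data and the transformation are made up for illustration:

```r
# Stand-in for the ProjectA folder; in practice this is wherever the
# project was copied to on each machine.
project_dir <- file.path(tempdir(), "ProjectA")
dir.create(file.path(project_dir, "source1"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project_dir, "output1"), showWarnings = FALSE)

# The only machine-specific line: set the working directory once.
old_wd <- setwd(project_dir)

# Made-up input so the example is self-contained
write.table(data.frame(x = 1:3), file.path("source1", "foo1.asc"))

# Everything below uses paths relative to ProjectA
dat1 <- read.table(file.path("source1", "foo1.asc"))
foo <- transform(dat1, y = x * 2)
write.table(foo, file.path("output1", "fooOut1.asc"))
result <- read.table(file.path("output1", "fooOut1.asc"))

setwd(old_wd)  # restore the previous working directory
```

Only the setwd() call differs between machines; the repeated setwd() calls before each read and write in the original scripts are no longer needed.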
You may also be interested in an R library that I love called ProjectTemplate - it gives you really nice functionality for making self-contained projects for this sort of work in R, and they're entirely reproducible, moveable between computers and so on. I've written an introductory blog post which may be useful.