Subdirectory in R package - r

I am creating an R package and would like to organize my R directory into subdirectories. Since only the functions defined in R files at the root of that directory are exported, I added this code to one file at the root:
sourceDir <- function(path, trace = TRUE, ...) {
  for (nm in list.files(path, pattern = "\\.[RrSsQq]$")) {
    print(nm)
    if (trace) cat(nm, ":")
    source(file.path(path, nm), ...)
    if (trace) cat("\n")
  }
}
sourceDir("R/DataGenerator")
When I use "CTRL+SHIFT+B" in RStudio, I see that the files are sourced. But once the package is loaded, none of the functions defined in the subdirectory R/DataGenerator are accessible, neither with :: nor with :::.
How can I export functions defined in subdirectories of R? Is it even possible?

As indicated in the discussion between Martin Morgan and me in the comments on the accepted answer, this does not seem to work in current R versions. My workaround for somewhat better file organisation is to prefix the file names with what would have been the subdirectory names.
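The prefixing workaround can be sketched as below. This is only an illustration, not part of the original answer: the helper name flatten_r_subdir and the "-" separator are hypothetical, and the demo runs on a throwaway directory rather than a real package.

```r
# Hypothetical helper: copy R/<subdir>/<file>.R to R/<subdir>-<file>.R so that
# every source file sits directly in R/.
flatten_r_subdir <- function(r_dir, subdir, sep = "-") {
  src <- list.files(file.path(r_dir, subdir), pattern = "\\.[Rr]$",
                    full.names = TRUE)
  if (length(src) == 0) return(invisible(character(0)))
  dest <- file.path(r_dir, paste0(subdir, sep, basename(src)))
  ok <- file.copy(src, dest, overwrite = FALSE)
  invisible(dest[ok])
}

# Demo on a temporary, made-up package layout.
pkg_r <- file.path(tempfile("pkg"), "R")
dir.create(file.path(pkg_r, "DataGenerator"), recursive = TRUE)
writeLines("gen <- function() 1", file.path(pkg_r, "DataGenerator", "gen.R"))

flatten_r_subdir(pkg_r, "DataGenerator")
flat_files <- list.files(pkg_r, pattern = "\\.R$")
flat_files
```

After checking the copies, the original subdirectory would still need to be removed or excluded by hand; the sketch deliberately does not delete anything.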

Use the Collate: field in the DESCRIPTION file to specify paths to files to be included
Collate: foo.R bar/baz.R
A helper to generate the collate line might be something like
fls = paste(dir(pattern="\\.[Rr]$", recursive=TRUE), collapse=" ")
cat(strwrap(sprintf("Collate: %s", fls), exdent=4), sep="\n")

Related

Parsing issue, unexpected character when loading a folder

I am using this answer to load in a folder of Excel Files:
# Get the list of files
#----------------------------#
folder <- "path/to/files"
fileList <- dir(folder, recursive=TRUE) # grep through these, if you are not loading them all
# use platform appropriate separator
files <- paste(folder, fileList, sep=.Platform$file.sep)
So far, so good.
# Load them in
#----------------------------#
# Method 1:
invisible(sapply(files, source, local=TRUE))
#-- OR --#
# Method 2:
sapply(files, function(f) eval(parse(text=f)))
But the source function (Method 1) gives me the error:
Error in source("C:/Users/Username/filename.xlsx") :
C:/Users/filename :1:3: unexpected input
1: PK
^
For Method 2 I get the error:
Error in parse(text = f) : <text>:1:3: unexpected '/'
1: C:/
^
EDIT: I tried circumventing the issue by setting the working directory to the directory of the folder, but that did not help.
Any ideas why this happens?
EDIT 2: It works when I follow the approach from "How can I read multiple (excel) files into R?":
setwd("...")
library(readxl)
file.list <- list.files(pattern = "\\.xlsx$")
df.list <- lapply(file.list, read_excel)
Just to provide a proper answer outside of the comment section...
If your goal is to read many Excel files, you shouldn't use source.
source is meant for running external R code.
If you need to read many Excel files, you can use the following code with the support of one of these libraries: readxl, openxlsx, tidyxl (with unpivotr).
filelist <- dir(folder, recursive = TRUE, full.names = TRUE, pattern = "\\.xlsx?$", ignore.case = TRUE)
l_df <- lapply(filelist, readxl::read_excel)
Note that we are using dir to list the full paths (full.names = TRUE) of all the files whose names end with .xlsx or .xls (pattern = "\\.xlsx?$"), in any letter case such as .XLSX or .XLS (ignore.case = TRUE), in the folder folder and all its subfolders (recursive = TRUE).
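To see what such a file listing actually picks up, here is a small base R sketch. The folder and file names are made up for the demonstration, and the files are empty placeholders, so nothing is read with readxl here.

```r
# Build a throwaway folder with a mix of file names (contents are irrelevant).
folder <- tempfile("excel_demo")
dir.create(file.path(folder, "sub"), recursive = TRUE)
file.create(file.path(folder, c("a.xlsx", "b.XLS", "notes.txt",
                                "xlsx_readme.md", "sub/c.xlsx")))

# "\\.xlsx?$" matches a literal dot followed by "xls" or "xlsx" at the end of
# the file name; ignore.case = TRUE also accepts upper-case extensions.
excel_files <- dir(folder, pattern = "\\.xlsx?$", recursive = TRUE,
                   full.names = TRUE, ignore.case = TRUE)
basename(excel_files)

# With real workbooks, each path could then be handed to a reader, e.g.
# lapply(excel_files, readxl::read_excel)
```

The escaped dot matters: an unescaped "." matches any character, so a looser pattern could also pick up names that merely end in "xls" or "xlsx" without a real extension.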
readxl is integrated with tidyverse. It is pretty easy to use. It is most likely what you're looking for.
Personally, I advise using openxlsx if you need to write (rather than read) customized Excel files with many specific features.
tidyxl is the best package I've seen for reading Excel files, but it may be rather complicated to use. However, it is very careful about preserving types.
With the support of unpivotr, it lets you handle complicated Excel structures, for example multiple header rows and multiple left index columns.

Error in zip_internal(zipfile, files, recurse, compression_level, append = FALSE, : Some files do not exist

I received this error message from R when using the openxlsx package, and I am not sure where to look.
Error in zip_internal(zipfile, files, recurse, compression_level, append = FALSE, :
Some files do not exist
Does anyone have any suggestions? thanks.
The code is simple:
library(openxlsx)
df1 <- cars
write.xlsx(df1, file = 'cars.xlsx')
The error resolves itself if you add a TMP environment variable. As I suspected, this is because the saveWorkbook function relies on the function tempdir, which creates a temporary directory and uses the TMP variable (among a list of other options) to choose its location (see the documentation for more info).
The relevant lines of code are on line 223 here. Note that tempdir is not called directly, but through the function tempfile which uses tempdir() as default value for its tmpdir argument.
Related Github issue
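A quick way to inspect the temporary-directory setup from within R is sketched below. It only uses base R; whether it reproduces the zip error depends on the machine, so treat it as a diagnostic aid rather than a fix.

```r
# R consults TMPDIR, TMP and TEMP (in that order) once, at startup,
# to decide where the per-session temporary directory lives.
Sys.getenv(c("TMPDIR", "TMP", "TEMP"))

# The per-session temporary directory, and whether it currently exists.
tmp <- tempdir()
tmp
dir.exists(tmp)

# tempfile() builds its paths inside tempdir() by default.
f <- tempfile(fileext = ".xlsx")
dirname(f)
```

Because the location is fixed at startup, changing TMP with Sys.setenv() inside a running session does not move tempdir(); the variable has to be set before R is launched.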

How do I get the absolute path of an input file in R

I am using Rscript to plot some figures from a given CSV file in some directory, which is not necessarily my current working directory. I can call it as follows:
./script.r ../some_directory/inputfile.csv
Now I want to output my figures in the same directory (../some_directory), but I have no idea how to do that. I tried to get the absolute path for the input file because from this I could construct the output path, but I couldn't find out how to do that.
normalizePath() #Converts file paths to canonical user-understandable form
or
library(tools)
file_path_as_absolute()
The question is very old, but it still lacks a working solution. So here is my answer:
Use normalizePath(dirname(f)).
The example below lists all the files and directories in the current directory.
dir <- "."
allFiles <- list.files(dir)
for (f in allFiles) {
  print(paste(normalizePath(dirname(f)), f, sep = .Platform$file.sep))
}
Where:
normalizePath(dirname(f)) gives the absolute path of the parent directory, so the individual file names must be appended to it.
.Platform is used to keep the code portable across operating systems.
file.sep gives "the file separator used on your platform: "/" on both Unix-alikes and on Windows (but not on the former port to Classic Mac OS)."
Warning: This may cause some problems if not used with caution. For instance, say the path is A/B/a_file and the working directory is A. Then the code below:
dir <- "B"
allFiles <- list.files(dir)
for (f in allFiles) {
  print(paste(normalizePath(dirname(f)), f, sep = .Platform$file.sep))
}
would give:
> A/a_file
however, it should be:
> A/B/a_file
Here is the solution:
args = commandArgs(TRUE)
results_file = args[1]
output_path = dirname(normalizePath(results_file))
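Building on the three lines above, a minimal sketch of writing a figure next to the input file could look like this. The helper name output_path_for and the file name figure.pdf are assumptions for illustration, and a temporary CSV stands in for the command-line argument.

```r
# Hypothetical helper: place `output_name` in the same directory as `input_file`.
output_path_for <- function(input_file, output_name) {
  input_dir <- dirname(normalizePath(input_file, mustWork = TRUE))
  file.path(input_dir, output_name)
}

# Stand-in for commandArgs(TRUE)[1]: a CSV created in a temporary directory.
input_file <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3, y = c(2, 4, 6)), input_file, row.names = FALSE)

# Write the plot next to the input file.
out <- output_path_for(input_file, "figure.pdf")
pdf(out)
plot(read.csv(input_file))
dev.off()
file.exists(out)
```

In the real script, input_file would come from commandArgs(TRUE)[1]; mustWork = TRUE makes a mistyped path fail early instead of silently producing a wrong output location.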
To get the absolute path(s) from file(s)
Why not combine the base R function file.path() with the answer that @Marius gave? This appears marginally simpler, works with a vector of files (files), and takes care of system-specific separators:
file.path(normalizePath(dirname(files)), basename(files))
And wrapped inside a function (abspath):
abspath <- function(files) file.path(normalizePath(dirname(files)), basename(files))
For instance:
> setwd("~/test")
> list.files()
[1] "file1.txt" "file2.txt"
And then:
> abspath(list.files())
[1] "/home/myself/test/file1.txt" "/home/myself/test/file2.txt"
I see that people gave pieces of the solution, but not all of it.
I have used this:
outputFile = file.path(normalizePath(dirname(inputFile)), "my_file.ext")
Hope it helps.
fs::path_abs() is my preferred way. It avoids the backslashes of normalizePath().
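For those who would rather stay in base R, normalizePath() has a winslash argument that controls the separator used on Windows. The wrapper name abs_path below is hypothetical; this is a sketch of a base R alternative, not a drop-in replacement for fs::path_abs().

```r
# Base R sketch: absolute path with forward slashes on every platform.
# mustWork = FALSE lets the call succeed even if the file does not exist yet.
abs_path <- function(path) {
  normalizePath(path, winslash = "/", mustWork = FALSE)
}

p <- abs_path(file.path(tempdir(), "plot.png"))
p
```

One caveat: for a relative path that does not exist yet, normalizePath() may return its input unchanged on some platforms, so it is safest to normalize an existing directory and append the new file name with file.path().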

R: sourcing files using a relative path

Sourcing files using a relative path is useful when dealing with large codebases. Other programming languages have well-defined mechanisms for sourcing files using a path relative to the directory of the file being sourced into. An example is Ruby's require_relative. What is a good way to implement relative path sourcing in R?
Below is what I pieced together a while back using various recipes and R forum posts. It's worked well for me for straight development but is not robust. For example, it breaks when the files are loaded via the testthat library, specifically auto_test(). rscript_stack() returns character(0).
# Returns the stack of RScript files
rscript_stack <- function() {
Filter(Negate(is.null), lapply(sys.frames(), function(x) x$ofile))
}
# Returns the current RScript file path
rscript_current <- function() {
stack <- rscript_stack()
r <- as.character(stack[length(stack)])
first_char <- substring(r, 1, 1)
if (first_char != '~' && first_char != .Platform$file.sep) {
r <- file.path(getwd(), r)
}
r
}
# Sources relative to the current script
source_relative <- function(relative_path, ...) {
source(file.path(dirname(rscript_current()), relative_path), ...)
}
Do you know of a better source_relative implementation?
After a discussion with @hadley on GitHub, I realized that my question goes against the common development patterns in R.
It seems that, in R, files that are sourced often assume that the working directory (getwd()) is set to the directory they are in. To make this work, source has a chdir argument whose default value is FALSE. When set to TRUE, it will change the working directory to the directory of the file being sourced.
In summary:
Assume that source is always relative because the working directory of the file being sourced is set to the directory where the file is.
To make this work, always set chdir=TRUE when you source files from another directory, e.g., source('lib/stats/big_stats.R', chdir=TRUE).
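The effect of chdir can be seen in a small self-contained sketch. The directory layout and file names (lib/main.R, lib/helper.R) are invented for the demonstration and live in a temporary folder.

```r
# Throwaway project: lib/main.R sources its sibling lib/helper.R by bare name.
proj <- tempfile("proj")
dir.create(file.path(proj, "lib"), recursive = TRUE)
writeLines("helper_value <- 42", file.path(proj, "lib", "helper.R"))
writeLines("source('helper.R')", file.path(proj, "lib", "main.R"))

# With chdir = TRUE, R switches to lib/ while main.R runs, so 'helper.R'
# resolves relative to main.R; the previous working directory is restored
# afterwards.
wd_before <- getwd()
source(file.path(proj, "lib", "main.R"), chdir = TRUE)

helper_value
identical(getwd(), wd_before)
```

Without chdir = TRUE, the inner source('helper.R') would be resolved against the caller's working directory and would typically fail with a "cannot open file" error.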
For convenient sourcing of entire directories in a predictable way I wrote sourceDir, which sources files in a directory in alphabetical order.
sourceDir <- function (path, pattern = "\\.[rR]$", env = NULL, chdir = TRUE)
{
files <- sort(dir(path, pattern, full.names = TRUE))
lapply(files, source, chdir = chdir)
}

How to obtain a list of directories within a directory, like list.files(), but instead "list.dirs()"

I am able to use list.files() to obtain a list of files in a given directory, but if I want to get a list of directories, how would I do this? Is it somehow right in front of me as an option within list.files()?
Also, I'm using Windows, so if the answer is to shell out to some Linux/unix command, that won't work for me.
.NET, for example, has a Directory.GetFiles() method and a separate Directory.GetDirectories() method, so I figured R would have an analogous pair.
Update: A list.dirs function was added to the base package in revision 54353, which was included in the R-2.13.0 release in April, 2011.
list.dirs(path = ".", full.names = TRUE, recursive = TRUE)
So my function below was only useful for a few months. :)
I couldn't find a base R function to do this, but it would be pretty easy to write your own using:
dir()[file.info(dir())$isdir]
Update: here's a function (now corrected for Timothy Jones' comment):
list.dirs <- function(path=".", pattern=NULL, all.dirs=FALSE,
full.names=FALSE, ignore.case=FALSE) {
# use full.names=TRUE to pass to file.info
all <- list.files(path, pattern, all.dirs,
full.names=TRUE, recursive=FALSE, ignore.case)
dirs <- all[file.info(all)$isdir]
# determine whether to return full names or just dir names
if(isTRUE(full.names))
return(dirs)
else
return(basename(dirs))
}
base R now includes a list.dirs function, so home-brewed variants are no longer necessary.
For example:
list.dirs('.', recursive=FALSE)
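A short sketch of how the recursive and full.names arguments of the base list.dirs change the result, using a made-up directory tree in a temporary folder:

```r
# Throwaway tree: root/a/b, root/c, plus one regular file that must be ignored.
root <- tempfile("tree")
dir.create(file.path(root, "a", "b"), recursive = TRUE)
dir.create(file.path(root, "c"))
file.create(file.path(root, "readme.txt"))

# Immediate subdirectories only, as bare names.
top_dirs <- list.dirs(root, full.names = FALSE, recursive = FALSE)
top_dirs

# Every directory below root, as paths relative to root; the result also
# contains an entry for root itself.
all_dirs <- list.dirs(root, full.names = FALSE, recursive = TRUE)
all_dirs
```

Since recursive = TRUE is the default, passing recursive = FALSE explicitly is what makes list.dirs behave like a directory-only counterpart of list.files().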
Just to update this thread:
I see that in the newer version of R (currently I'm using 2.5.1), there is now a list.dirs function included in the base install:
list.dirs implicitly has all.files = TRUE, and if recursive = TRUE,
the answer includes path itself (provided it is a readable directory).
list.dirs <- function(...) {
x <- dir(...)
x[file_test("-d", x)]
}
might be of use?
How might we do this recursively? (the recursive argument of dir breaks these functions because it never returns directory names, just the files within each directory, etc...).
What about something like this, give it a try:
dir('.')[file.info(dir('.',full.names=T))$isdir]
You mention that you don't want to shell out to a Linux/UNIX command, but I assume it's OK to shell out to a Windows command. In that case this would do it:
shell("dir/ad/b", intern = TRUE)
and this would do it recursively:
shell("dir/ad/b/s", intern = TRUE)
Normally I would prefer the platform-independent solutions of others here, but for interactive use, where you just want the answer as simply and directly as possible, this may be less work.
I had this problem a while back and used this recursive code to find all directories. Perhaps this can be of use?
list.dirs <- function(parent=".") # recursively find directories
{
if (length(parent)>1) # work on first and then rest
return(c(list.dirs(parent[1]), list.dirs(parent[-1])))
else { # length(parent) == 1
if (!is.dir(parent))
return(NULL) # not a directory, don't return anything
child <- list.files(parent, full=TRUE)
if (!any(is.dir(child)))
return(parent) # no directories below, return parent
else
return(list.dirs(child)) # recurse
}
}
is.dir <- function(x) # helper function
{
ret <- file.info(x)$isdir
ret[is.na(ret)] <- FALSE
ret
}
