Open file generated by loading .rda files in R

I followed your advice about creating a loop that loads files in R and did:
dataFiles<-lapply(Sys.glob("kwo*.rda*"), load)
Now I have my dataFiles which contains the files I wanted to load
head(dataFiles)
[[1]]
[1] "kw"
[[2]]
[1] "kw"
[[3]]
[1] "kw"
Now I need to work with the information contained in the files I loaded. What should I do to open the files and to 'identify' them?

The standard behavior of load() in this kind of loop is that each lapply() iteration evaluates load() in its own temporary frame: the data is loaded there and that temporary environment is discarded when the call returns. If you want the objects in the global environment, you need to load them there explicitly, e.g. by passing envir = .GlobalEnv; see this SO post for more info. That will load all the objects contained in all the .rda files into your global environment, a.k.a. your workspace.
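A minimal sketch of the explicit version, assuming the same kwo*.rda file pattern from the question and that each file stores an object named kw (as the question's output shows):

```r
# Variant 1: load everything straight into the workspace.
invisible(lapply(Sys.glob("kwo*.rda"), load, envir = .GlobalEnv))

# Variant 2: since each file here stores an object named "kw", variant 1
# overwrites it on every iteration. To keep all of them, load each file
# into its own environment and collect the objects in a list:
files <- Sys.glob("kwo*.rda")
kw_list <- lapply(files, function(f) {
  e <- new.env()
  load(f, envir = e)   # load() returns the names of the loaded objects
  e$kw
})
names(kw_list) <- basename(files)
```

Variant 2 is usually the safer choice when several files contain identically named objects.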
Could you provide some more information about what you are doing exactly? What generated the .rda files, and what do you want to do with the data you read in? More information helps us help you. Also, you refer to an earlier SO question of yours ("I followed your advice about creating a loop that loads files in R"), but I cannot find that question in your profile.

Related

How to delete temporary files of deleted objects in R package raster

Since I work with large RasterBrick objects, the objects can't be stored in memory but are stored as temporary files in the current temporary-files directory tempdir(), or to be exact in its subfolder "raster".
Due to the large file sizes it would be very nice to delete the temporary files of the unused objects.
If I delete objects I no longer need by
rm(list=ls(pattern="xxx"))
the temporary files still exist.
Garbage collection gc() will have no effect on that to my understanding since it has no effect on the hard drive.
The automatically given names of the temporary files don't show any relation to the object names.
Therefore it is not possible to delete them by a code like
raster_temp_dir <- paste(tempdir(), "/raster", sep="")
files_to_be_removed <- list.files(raster_temp_dir, pattern="xxx", full.names=TRUE)
Unfortunately the files of objects still in use aren't read-only.
Therefore I would also delete objects I still need by running:
files_to_be_removed <- list.files(raster_temp_dir, full.names=TRUE)
Did somebody already solve this problem or has any ideas how to solve it?
It would be perfect if the code could somehow distinguish between used and unused objects.
Since that is unlikely to be feasible, a workaround could be naming the temporary files of the Raster objects manually, but I haven't found an option for this either, since the filename argument can only be used when explicitly writing files to the hard disk, not when temporary files are created (to my knowledge).
Thanks!
I think the function you're looking for is file.remove(). Just pass it a vector with the file names you want to delete.
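A sketch of that approach using the directory from the question; as an alternative, the raster package also ships removeTmpFiles(), which deletes raster temp files older than a given number of hours:

```r
library(raster)

# Directory where raster keeps its temporary files, as in the question.
raster_temp_dir <- file.path(tempdir(), "raster")

# Remove every temp file in it -- only safe after rm()-ing (and gc()-ing)
# the objects that still point at these files!
files_to_be_removed <- list.files(raster_temp_dir, full.names = TRUE)
file.remove(files_to_be_removed)

# Or let raster decide by age: delete temp files older than one hour.
removeTmpFiles(h = 1)
```

Note that file.remove() cannot tell which files are still backed by live objects, so the age-based removeTmpFiles() route is often the safer one.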

Can you save R code files with .RData objects for version control?

There is no version control at my work (we have an outdated, centralized system with sensitive patient information, so we can't save things outside of it). When I save a .RData file from a script, I would like to be able to save the exact version of that .R file with it at that time. Is there a way to do this?
E.g. if I have an R script "run_analysis.R" that has the line
save(data,file='foo.RData')
Is there a way I can do something like
save(data,run_analysis.R,file='foo.RData')
so that if I pull up the data file a year later I'll know exactly what code was used to create it?
You could zip the foo.RData file together with the run_analysis.R file and store the zipped file.
The CRAN package [zip](https://cran.r-project.org/web/packages/zip/zip.pdf) can be used to create the zip file from within R.
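A minimal sketch of that idea (the archive name analysis_snapshot.zip is just an illustration):

```r
library(zip)

save(data, file = "foo.RData")

# Bundle the data together with the exact script that produced it.
zip::zip("analysis_snapshot.zip", files = c("foo.RData", "run_analysis.R"))

# A year later:
# zip::unzip("analysis_snapshot.zip")
# load("foo.RData")            # the data
# file.edit("run_analysis.R")  # the code exactly as it was at save time
```

Running this at the end of run_analysis.R itself means every saved dataset automatically carries its generating script with it.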

How to open .rdb file using R

My question is quite simple, but I couldn't find the answer anywhere.
How do I open a .rdb file using R?
It is placed inside an R package.
I have been able to solve the problem, so I am posting the answer here in case someone needs it in the future.
#### Importing data from a .rdb file ####
setwd("path...\\Rsafd\\Rsafd\\data")  # set the working directory to the folder
                                      # that contains your .rds and .rdb files
readRDS("Rdata.rds")  # see the metadata contained in the .rds file

# lazyLoad() is the function we use to open a .rdb file:
lazyLoad(filebase = "path...\\Rsafd\\Rsafd\\data\\Rdata", envir = parent.frame())
# for filebase, "Rdata" is the name of the .rdb file (without the extension);
# envir is the environment into which the objects are loaded.
The result of using lazyLoad() is that every dataset contained in the .rdb file shows up in your environment as a "promise": the data is not actually read until you access the object. Accessing it is enough to force the promise:
find("HOWAREYOU") # locate the object named HOWAREYOU (note the quotes)
head(HOWAREYOU)   # forces the promise; look at the first entries, just to make sure
Edit: readRDS is not part of the process to open the .rdb file, it is just to look at the metadata. The lazyLoad function indeed opens .rdb files.
Posting a slightly more direct answer since I keep Googling to this Q&A when trying to examine .rdb objects inside an R package (in particular the help/package.rdb file) and not seeing the answer clearly enough.
R keeps the help Rd objects for the installed package pkg at help/$pkg.{rdb,rdx}.
We can load these Rd objects into environment e like so:
lazyLoad(
  file.path(system.file("help", package = pkg), pkg),
  envir = e
)
Note that we can't use system.file("help", pkg, package=pkg) because system.file() requires the file to exist or it returns "", and here we've truncated the .rdb/.rdx extension as required by lazyLoad().
We can skip supplying envir=e, but the objects will be loaded into the global environment (assuming you're running this interactively) and I wanted my default answer to avoid polluting it.
See ?lazyLoad for more.
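Putting that together, a runnable sketch ("stats" is used here only because it is always installed):

```r
pkg <- "stats"
e <- new.env()

# Load the package's help Rd objects from help/stats.{rdb,rdx},
# truncating the extension as lazyLoad() requires.
lazyLoad(
  file.path(system.file("help", package = pkg), pkg),
  envir = e
)

# Every help topic's Rd object is now a promise in e.
head(ls(e))
```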

Rstudio is deleting key files when I knit (both PDF and HTML)

So I am having an R nightmare. I've returned to a project I built under the previous iteration (or perhaps one more) of RStudio. I produced a workable report that I was asked to update, and my current bugbear wasn't around then. Here is what happens:
My report file is "ISS Time Series.Rmd". It calls three other files:
"mystyles.sty", which updates the LaTeX preamble to use some additional packages.
"functions.R" and "load.R". The former contains frequently used functions I've written, and the latter loads the data I'm using.
I source the two .R files from the .Rmd file. When I try to knit the report, whether I get an error or succeed, my two .R files and my one .sty file are deleted. And not just deleted -- gone for good.
I do not know what is up. I have ruined my previous work simply by returning to examine the original file.
Please, somebody has to help me here. My workflow is shot to hell if I have to write every last bit of code over and over again in each report.
UPDATE: Even copying the files to another directory doesn't help.
Here is the code block that calls the "load.R" file:
```{r loaddata}
#
# ------- Load Data
#
# This section loads the ISS survey files one at a time and saves them as
# read.SPSS objects within a list. It names these eleven objects as "ISS 2002",
# "ISS 2003", etc... until "ISS 2012". This file may be prohibitively large.
#
source("load.R") # Loads the ISS Survey files
```
Rename your file to ISS_Time_Series.Rmd and try again.
It is the spaces in the document name that make rmarkdown::render() delete the files that have been loaded or sourced.
An issue has already been filed; see https://github.com/rstudio/rmarkdown/issues/580

How to point to a directory in an R package?

I am making my first attempt at writing an R package. I am loading one csv file from the hard drive, and I am hoping to bundle up my R code and my csv files into one package later.
My question is: how can I load my csv file once my package is generated? Right now the file address is something like c:\R\mydirectory....\myfile.csv, but after I send the package to someone else, how can I use a relative path to that file?
Feel free to correct this question if it is not clear to others!
You can put your csv files in the data directory or in inst/extdata.
See the Writing R Extensions manual - Section 1.1.5 Data in packages.
To import the data you can use, e.g.,
R> data("achieve", package="flexclust")
or
R> read.table(system.file("data/achieve.txt", package = "flexclust"))
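For a csv file shipped in inst/extdata, the lookup looks like this (mypackage and myfile.csv are placeholder names):

```r
# After installation, inst/extdata becomes extdata/ inside the installed
# package; system.file() resolves its absolute path on the user's machine,
# so no hard-coded c:\R\... path is needed.
path <- system.file("extdata", "myfile.csv", package = "mypackage", mustWork = TRUE)
dat  <- read.csv(path)
```

mustWork = TRUE makes system.file() raise an error instead of silently returning "" when the file is missing.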
Look at the R help for package.skeleton: this function
automates some of the setup for a new source package. It creates directories, saves functions, data, and R code files to appropriate places, and creates skeleton help files and a ‘Read-and-delete-me’ file describing further steps in packaging.
The directory structure created by package.skeleton includes a data directory. If you put your data here it will be distributed with the package.
