getting the name of a dataframe from loading a .rda file in R - r

I am trying to load an .rda file in r which was a saved dataframe. I do not remember the name of it though.
I have tried
a<-load("al.rda")
which then does not let me do anything with a. I get the error
Error:object 'a' not found
I have also tried to use the = sign.
How do I load this .rda file so I can use it?
I restared R with load("al.rda) and I know get the following error
Error: C stack usage is too close to the limit

Use 'attach' and then 'ls' with a name argument. Something like:
attach("al.rda")
ls("file:al.rda")
The data file is now on your search path in position 2, most likely. Do:
search()
ls(pos=2)
for enlightenment. Typing the name of any object saved in al.rda will now get it, unless you have something in search path position 1, but R will probably warn you with some message about a thing masking another thing if there is.
However I now suspect you've saved nothing in your RData file. Two reasons:
You say you don't get an error message
load says there's nothing loaded
I can duplicate this situation. If you do save(file="foo.RData") then you'll get an empty RData file - what you probably meant to do was save.image(file="foo.RData") which saves all your objects.
How big is this .rda file of yours? If its under 100 bytes (my empty RData files are 42 bytes long) then I suspect that's what's happened.

I had to reinstall R...somehow it was corrupt. The simple command which I expected of
load("al.rda")
finally worked.

I had a similar issue, and it was solved without reinstall R. for example doing
load("al.rda) works fine, however if you do
a <- load("al.rda") will not work.

The load function does return the list of variables that it loaded. I suspect you actually get an error when you load "al.rda". What exactly does R output when you load?
Example of how it should work:
d <- data.frame(a=11:13, b=letters[1:3])
save(d, file='foo.rda')
a <- load('foo.rda')
a # prints "d"
Just to be sure, check that the load function you actually call is the original one:
find("load") # should print "package:base"
EDIT Since you now get an error when you load the file, it is probably corrupt in some way. Try this and say what it prints:
file.info("a1.rda") # Prints the file size etc...
readBin("a1.rda", "raw", 50) # reads first 50 bytes from the file
Without having access to the file, it's hard to investigate more... Maybe you could share the file somehow (http://www.filedropper.com or similar)?

I usually use save to save only a single object, and I then use the following utility method to retrieve that object into a given variable name using load, but into a temporary namespace to avoid overwriting existing objects. Maybe it will be helpful for others as well:
load_first_object <- function(fname){
e <- new.env(parent = parent.frame())
load(fname, e)
return(e[[ls(e)[1]]])
}
The method can of course be extended to also return named objects and lists of objects, but this simple version is for me the most useful.

Related

Unable to load external R object with load()

I saved R objects in .Rdata files with save(obj, filename) but now I'm unable to load these objects in my global environment with load(filename).
Files are created in my hard disk with a size around 4.2ko so I guess they contain the data but Rstudio can't load them and load() returns no error.
What can I do to load these data, or at least verify the data are present ?
EDIT: When redirecting load() I can see the data is only a character string and I'm very disappointed of not being able to restore these data. It is true that I should have tested the file restoration earlier, but is there still a hope ? a file of 4.2ko must contain data.
> MYR <- load("/home/R/data/MYR.Rdata")
> MYR
[1] "data"
> class(MYR)
[1] "character"
This is not how load works (see).
It loads the data in the environment and return only a list of names of objects created.
So in your case an object named data has been loaded in the environment. You can confirm with ls() it is there. And of course you can inspect the object data.

Importing a file only if it has been modified since last import and then save to a new object

I am trying to create a script that I run about once a week. The goal is that it will go out and check an MS Excel file that a co-worker manages. It then tests to see if the date the file was modified is newer then the last time it was imported. If it is newer, it will import the file (I am using readxl package - WONDERFUL!) into a new object that is a named with the date the original Excel file was last modified included in the object name. I have everything working except for the assignment of the imported data.frame to a new object that includes the date.
An example of the code I am using is:
First I create an object with a path pointing to the file of interest.
pfdFilePath <- file.path("H:", "3700", "3780", "002-00", "3.
Project Information", "Program", "RAH program.xls")
after testing to verify the file has been modified, I have tried simple assignment ("test" is just an example for simplification):
paste("df-", as.Date(file.info(pfdFilePath)$mtime), sep = "") <- "test"
But that code produces an error:
Error in paste("df-", as.Date(file.info(pfdFilePath)$mtime), sep = "") <- "test" :
target of assignment expands to non-language object
I then try the assign function:
assign(paste("df-", as.Date(file.info(pfdFilePath)$mtime), sep = ""), "test")
Running this code creates an object that looks to be okay, but when I evaluate it, or try using str() or class() I get the following error:
Error in df - df-2016-08-09 :
non-numeric argument to binary operator
I am pretty sure this is an error that has to do with the environment I am using assign, but being relatively new to R, I cannot figure it out. I understand that the assign function seems to be frowned upon, but those warnings seem to centered on for-loops vs. lapply functions. I am not really iterating within a function though. Just a dynamically named object whenever I run a script. I can't come up with a better way to do it. If there is another way to do this that doesn't require the assign function, or a better way to use assign function , I would love to know it.
Thank you in advance, and sorry if this is a duplicate. I have spent the entire evening digging and can't derive what I need.
Abdou provided the key.
assign(paste0("df.", "pfd.", strftime(file.info(pfdFilePath)$mtime, "%Y%m%d")), "test01")
I also converted to the cleaner paste0 function and got rid of the dashes to avoid confusion. Lesson learned.
Works perfectly.

How to preserve changes to function with fix() between R sessions?

If I edit a function with R v2.14.0 using fix(), those fixes are applied during the session.
For example, I might make the following edit to get a white background in a hive plot:
> library(HiveR)
> fix(plotHive)
... :%s/black/white/g
... :w
... :q
> plotHive(myHiveData)
I then get a white background in the hive plot, as expected.
But if I quit and reopen R, I have lost those changes, and the plot has a black background again.
How do I preserve the edits I make with fix() between R sessions?
EDIT
If I source() the modified plotHive() function, I get the following error:
> modifiedPlotHive <- source("modifiedPlotHive.R")
Error in source("modifiedPlotHive.R") :
modifiedPlotHive.R:1160:1: unexpected '<'
1159: }
1160: <
^
In addition: Warning message:
In readLines(file) : incomplete final line found on 'modifiedPlotHive.R'
The final line in the modified plotHive() function is:
<environment: namespace:HiveR>
If I remove this line before source()-ing, then the function no longer works.
Sorry I missed this when it came out, but the latest version of HiveR has the option to control the background color (available on CRAN 0.2-1) Bryan
Here's the safer way of doing what you want, referenced by #joran.
The sink/source pair is fine for dealing with R code files. But saving to text files and then reading back in other types of objects can strip them of important attributes, especially those relating to environments. That's what you just experienced.
The save/load pair stores objects in R's own binary format, so is much less liable to lose important information/environments attached to functions.
In this example, I define a personal version of ls, which differs from the base function in that it by default lists objects that start with a dot/period:
my_ls <- ls
fix(my_ls)
# 1) On the first line, change 'all.names=FALSE' to 'all.names=TRUE'
# 2) Say "Yes", I want to save the changes
save("my_ls", file="my_ls.Rdata")
# Then, in a later session, test that it works
load("my_ls.Rdata")
.TrysToHide <- 99
my_ls()
# [1] ".TrysToHide" "my_ls"
One more note: it's much cleaner to give your modified function a name of its own. To really edit a packaged function, and have the changes persist, you'd need to edit the sources and recompile the package. But if you do that, beware, as you may well break the function for other packaged functions that depend on it.
There are a couple of options:
Save your workspace before quiting and load it again when you reopen R.
Save the modified function to script file and source it:
sink("modified_plotHive.r")
plotHive
sink()
In the next session:
plotHive <- source("modified_plotHive.r")
HTH

Examining contents of .rdata file by attaching into a new environment - possible?

I am interested in listing objects in an RDATA file and loading only selected objects, rather than the whole set (in case some may be big or may already exist in the environment). I'm not quite clear on how to do this when there are conflicts in names, as attach() doesn't work as nicely.
1: For examining the contents of an R data file without loading it: This question is similar, but different from, the one asked at listing contents of an R data file without loading
In that case, the solution offered was:
attach(filename)
ls(pos = 2)
detach()
If there are naming conflicts between objects in the file and those in the global environment, this warning appears:
The following object(s) are masked _by_ '.GlobalEnv':
I tried creating a new environment, but I cannot seem to attach into that.
For instance, this produces the same error:
lsfile <- function(filename){
tmpEnv <- new.env()
evalq(attach(filename), envir = tmpEnv)
tmpls <- ls(pos = 2)
detach()
return(tmpls)
}
lsfile(filename)
Maybe I've made a mess of things with evalq (or eval). Is there some other way to avoid the naming conflict?
2: If I want to access an object - if there are no naming conflicts, I can just work with the one from the .rdat file, or copy it to a new one. If there are conflicts, how does one access the object in the file's namespace?
For instance, if my file is "sample.rdat", and the object is surveyData, and a surveyData object already exists in the global environment, then how can I access the one from the file:sample.rdat namespace?
I currently solve this problem by loading everything into a temporary environment, and then copy out what's needed, but this is inefficient.
Since this question has just been referenced let's clarify two things:
attach() simply calls load() so there is really no point in using it instead of load
if you want selective access to prevent masking it's much easier to simply load the file into a new environment:
e = local({load("foo.RData"); environment()})
You can then use ls(e) and access contents like e$x. You can still use attach on the environment if you really want it on the search path.
FWIW .RData files have no index (the objects are stored in one big pairlist), so you can't list the contained objects without loading. If you want convenient access, convert it to the lazy-load format instead which simply adds an index so each object can be loaded separately (see Get specific object from Rdata file)
I just use an env= argument to load():
> x <- 1; y <- 2; z <- "foo"
> save(x, y, z, file="/tmp/foo.RData")
> ne <- new.env()
> load(file="/tmp/foo.RData", env=ne)
> ls(env=ne)
[1] "x" "y" "z"
> ne$z
[1] "foo"
>
The cost of this approach is that you do read the whole RData file---but on the other hand that seems to be unavoidable anyway as no other method seems to offer a list of the 'content' of such a file.
You can suppress the warning by setting warn.conflicts=FALSE on the call to attach. If an object is masked by one in the global environment, you can use get to retreive it from your attached data.
x <- 1:10
save(x, file="x.rData")
#attach("x.rData", pos=2, warn.conflicts=FALSE)
attach("x.rData", pos=2)
(x <- 1)
# [1] 1
(x <- get("x", pos=2))
# [1] 1 2 3 4 5 6 7 8 9 10
Thanks to #Dirk and #Joshua.
I had an epiphany. The command/package foreach with SMP or MC seems to produce environments that only inherit, but do not seem to conflict with, the global environment.
lsfile <- function(list_files){
aggregate_ls = foreach(ix = 1:length(list_files)) %dopar% {
attach(list_files[ix])
tmpls <- ls(pos = 2)
return(tmpls)
}
return(aggregate_ls)
}
lsfile("f1.rdat")
lsfile(dir(pattern = "*rdat"))
This is useful to me because I can now parallelize this. This is a bare-bones version, and I will modify it to give more detailed information, but so far it seems to be the only way to avoid conflicts, even without ignore.
So, question #1 can be resolved by either ignoring the warnings (as #Joshua suggested) or by using whatever magic foreach summons.
For part 2, loading an object, I think #Joshua has the right idea - "get" will do.
The foreach magic can also work, by using the .noexport option. However, this has risks: whatever isn't specifically excluded will be inherited/exported from the global environment (I could do ls(), but there's always the possibility of attached datasets). For safety, this means that get() must still be used to avoid the risk of a naming conflict. Loading into a subenvironment avoids the naming conflict, but doesn't avoid the loading of unnecessary objects.
#Joshua's answer is far simpler than my foreach detour.

Loading someone else's .rdata file, can't access the data

My professor has sent me an .rdata file and wants me to do some analysis on the contents. Although I'm decent with R, I've never saved my work in .rdata files, and consequently haven't ever worked with them.
When I try to load the file, it looks like it's working:
> load('/home/swansone/Desktop/anes.rdata')
> ls()
[1] "25383-0001-Data"
But I can't seem to get at the data:
> names("25383-0001-Data")
NULL
I know that there is data in the .rdata file (it's 13 MB, there's definitely a lot in there) Am I doing something wrong? I'm at a loss.
Edit:
I should note, I've also tried not using quotes:
> names(25383-0001-Data)
Error: object "Data" not found
And renaming:
> ls()[1] <- 'nes'
Error in ls()[1] <- "nes" : invalid (NULL) left side of assignment
You're going to run into a lot of issues with an object that doesn't begin with a letter or . and a letter (as mentioned in An Introduction to R).
Use backticks to access this object (the "Names and Identifiers" section of help("`") explains why this works) and assign the object to a new, syntactically validly named object.
Data <- `25383-0001-Data`
Maybe it has to do with the unusual use of dashes in the name and backquotes work:
names(`25383-0001-Data`)
Edit:
More for reference (since Joshua already answered the main question perfectly), you can also reassign an object from ls() (what Wilduck tried in the question) using get(). This might be useful if the object of the name contains very weird characters:
foo <- 1:5
bar <- get(ls()[1])
bar
[1] 1 2 3 4 5
This of course requires the index of foo in ls() to be [1], but looking up the index of the required object is not too hard.

Resources