Unable to load external R object with load() - r

I saved R objects in .Rdata files with save(obj, filename), but now I'm unable to load these objects into my global environment with load(filename).
The files were created on my hard disk with a size of around 4.2 KB, so I guess they contain the data, but RStudio can't load them and load() returns no error.
What can I do to load the data, or at least verify that the data are present?
EDIT: When I assign the result of load(), I can see it is only a character string, and I'm very disappointed at not being able to restore the data. It is true that I should have tested restoring the file earlier, but is there still hope? A 4.2 KB file must contain data.
> MYR <- load("/home/R/data/MYR.Rdata")
> MYR
[1] "data"
> class(MYR)
[1] "character"

This is not how load() works (see ?load).
It loads the objects into the environment and returns only a character vector with the names of the objects it created.
So in your case an object named data has been loaded into the environment. You can confirm it is there with ls(), and of course you can inspect the object data.
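A minimal sketch of that behaviour, assuming a small made-up data frame and a temporary file standing in for MYR.Rdata:

```r
# Save a data frame under the object name `data`, as in the question.
data <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))
path <- tempfile(fileext = ".RData")   # hypothetical path, stands in for MYR.Rdata
save(data, file = path)
rm(data)

# load() restores the object under its saved name and returns that name.
loaded_names <- load(path)
loaded_names          # "data"
ls()                  # the listing now includes "data"
str(data)             # the restored data frame itself
```

So the value assigned from load() is only the name; the data frame is sitting in the environment under that name.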

Related

Is there a way to get a list of the R objects in a file you're using load() on? If so, is it possible to load them individually?

Trying to load() a file that ends up giving me an error:
load('.RDataTmp', verbose = T)
Loading objects:
object1
object2
object3
object4
object5
Error in load(".RDataTmp", verbose = T) : error reading from connection
That is what I get when I try to load it. Obviously, I messed up the name when I originally saved this, but while messing around I was able to save and load other objects with similarly messed-up names. It appears to load some of the objects before failing on a later one (secondary question: does this mean object5 loaded successfully, or is that where the error occurred?). Is there a way to get a list of all the objects in the file? And is there a way to load only some of the objects?
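For a file that reads cleanly, one possible approach is to load it into a scratch environment and pick out what you need. The helper name load_some below is made up for illustration. Note that load() still reads the whole file, so this sketch does not get around a file that fails with "error reading from connection".

```r
# Hypothetical helper: load a file into a scratch environment and
# return only the requested objects (all of them by default).
load_some <- function(file, wanted = NULL) {
  scratch <- new.env(parent = emptyenv())
  available <- load(file, envir = scratch)   # names of the objects in the file
  if (is.null(wanted)) wanted <- available
  mget(intersect(wanted, available), envir = scratch)
}

# Build a small example file with three objects.
path <- tempfile(fileext = ".RData")
object1 <- 1:3
object2 <- letters
object3 <- pi
save(object1, object2, object3, file = path)

all_names <- names(load_some(path))                 # everything in the file
picked <- load_some(path, c("object1", "object3"))  # only two of the objects
```

Nothing is written into the global environment; the objects come back as a named list.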

'data' is not an exported object from 'namespace:my_package'

I'm writing a function that uses external data as follows:
First, it checks whether the data is in the data/ folder; if it is not, it creates the data/ folder and then downloads the file from GitHub.
If the data is already in the data/ folder, it reads it and performs the calculations.
The question is, when I run:
devtools::check()
it returns:
Error: 'data' is not an exported object from 'namespace:my_package'
Should I manually put something in NAMESPACE?
An example:
my_function <- function(x){
  if(file.exists("data/data.csv")){
    my_function_calculation(x = x)
  } else {
    print("Downloading source data...")
    require(RCurl)
    url_base <-
      getURL("https://raw.githubusercontent.com/my_repository/data.csv")
    dir.create(paste0(getwd(),"/data"))
    write.table(url_base,"data/data.csv", sep = ",", quote = FALSE)
    my_function_calculation(x = x)
  }
}
my_function_calculation <- function(x = x){
  data <- NULL
  data <- suppressMessages(fread("data/data.csv"))
  #Here, I use data...
  return(data)
}
This may not apply in every case, but I solved the problem by removing the data.R file from the R/ folder.
data.R is a file documenting all the data shipped with the package. I had it from a previous version of my code, which had the data built in rather than remote (to be downloaded).
Removing the file solved my problem.
Example of data.R:
#' Name_of_the_data
#'
#' Description_of_the_Data
#'
#' @format A data frame with 10000 rows and 2 variables:
#' \describe{
#' \item{Col1}{description of Col1}
#' \item{Col2}{description of Col2}
#' }
"data_name"
There is no need to remove data.R from the R/ folder; you just need to attach the documentation to the NULL keyword, as follows:
#' Name_of_the_data
#'
#' Description_of_the_Data
#'
#' @format A data frame with 10000 rows and 2 variables:
#' \describe{
#' \item{Col1}{description of Col1}
#' \item{Col2}{description of Col2}
#' }
NULL
Generally, this happens when there is a mismatch between the name of one of the rda files in the data folder and what is described in R/data.R.
In this case, the data reference in the error message is for data.csv, not the data folder. You need to have rda files in the data folder of an R package. If you want to download csv files, you need to put them in inst/extdata.
This being said, you might want to consider using tempdir() to save those files in the temp folder of your session instead.
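A rough sketch of the tempdir() idea, not a drop-in replacement for the function in the question. The helper name get_source_data and its download argument are made up for illustration; the default downloader is utils::download.file.

```r
# Hypothetical helper: cache a remote CSV in the session's temp folder
# instead of writing into the package's data/ directory.
get_source_data <- function(url, cache_dir = tempdir(),
                            download = utils::download.file) {
  dest <- file.path(cache_dir, basename(url))
  if (!file.exists(dest)) {
    message("Downloading source data...")
    download(url, destfile = dest, quiet = TRUE)
  }
  utils::read.csv(dest)
}

# Usage (URL as given in the question):
# dat <- get_source_data("https://raw.githubusercontent.com/my_repository/data.csv")
```

Because the file lives under tempdir(), it is downloaded at most once per session and nothing is written into the package tree, so the check no longer sees an unexpected file in data/.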
There are 3 things to check:
1. The documentation is appropriately named:
#' Name_of_the_data
#'
#' Description_of_the_Data
#'
#' @format A data frame with 10000 rows and 2 variables:
#' \describe{
#' \item{Col1}{description of Col1}
#' \item{Col2}{description of Col2}
#' }
"data"
2. The RData file is appropriately named for export in the data/ folder.
3. The object in the RData file is loaded under the name data.
If the documentation (1) is named A and the RData file is A.RData (2), but the object (when loaded with load()) is named B, you're going to get exactly this error.
The problem is probably due to how your object was named when you saved it.
Suppose I load a file and call the object "d", then save it (as suggested) with save() in the data/ directory as "data":
save(d, file = "data/data.rda")
Then, when you clean and install the package, you will get the following error:
Error: 'data' is not an exported object from 'namespace:YourPackage'
It looks like it does not matter how you declare your object in the roxygen documentation. I guess the OBJECT must have the same name as the file you save it to and load it from.
For example, load your dataset as a "pib" object, save it as "pib.rda", and declare "pib" in a roxygen file ("loadData.R", for example).
#' Datos del PIB
#'
#' @docType data
#'
#' @usage data(pib)
#'
#' @format An object of class ...
#'
#' @keywords datasets
#'
#' @references ----
#'
#' @source ----
#'
#' @examples
#' data(pib)
"pib"
I had this issue because I copied the .rda file into the R\data folder.
Issue was resolved by using usethis::use_data(DataObject) which automatically takes the raw-data (DataObject) file and adds it to the R\data folder within the R package directory.
When I was stumped by the error
Error: 'data' is not an exported object from 'namespace:my_package'
MrFlick's comment above saved me. I had simply changed the name of an .rda file in my data folder. I was unable to get devtools::document() to recreate the NAMESPACE file. The solution was to re-save the data into the .rda file. (Of course I should have remembered that when one loads from an .rda file the name of the R object(s) has nothing to do with the name of the .rda file so renaming the .rda file doesn't do much.)
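To illustrate the point about file names versus object names, here is a small sketch using a temporary folder in place of the package's data folder (the names d and pib are borrowed from the answers above):

```r
d <- data.frame(x = 1:3)
data_dir <- tempfile("data")          # stands in for the package's data/ folder
dir.create(data_dir)
old_file <- file.path(data_dir, "d.rda")
new_file <- file.path(data_dir, "pib.rda")
save(d, file = old_file)

# Renaming the file does not rename the object stored inside it.
file.rename(old_file, new_file)
after_rename <- load(new_file)        # still "d"

# Re-saving under the new object name makes file and object agree.
pib <- d
save(pib, file = new_file)
rm(pib)
after_resave <- load(new_file)        # "pib"
```

After the re-save, the documented name, the file name and the object name can all be pib.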
I spent a few hours trying to fix this. Finally got it to work.
Notes:
Data files have to be of type "rda". "rds" won't work.
File names had to be lower case.
NULL in documentation name didn't work for me. Had to be a lower case string.
In general, it seems the same error message is caused by several things. Anything the checker doesn't like related to data files, it will issue the same error. Hard to debug under those circumstances.
I will add another trap. Working in RStudio
I have assigned a string to MyString and saved in the data folder of my package project:
save(MyString, file="./data/MyString.RData")
My ./R/data.R file contains documentation for this:
#' A character string
#'
"MyString"
This works. But you must use one file per object and not do save(X, Y, Z, file="BitsAndPieces.RData") and then document BitsAndPieces. If you do then you will get the error of this question. Which I did, needless to say.
I had the same error and was able to overcome it as follows.
The data file is located at: data/df.RData
The R documentation file is located at: R/df.R
I have created the df.RData file by importing the df.txt file into R and using the save() function to create the .RData file. I used the following code block to create .RData file.
x=read.table("df.txt")
save(x,file="df.RData")
Then, after running R CMD check, I got the same error: df is not an exported object from namespace "package name".
I overcame the error by changing the name of the variable stored in the df.RData file:
df=read.table("df.txt")
save(df,file="df.RData")
Restarting the session solved the problem for me. Somehow the environment was empty, and after the restart all objects were back, which resolved the issue.
I had the same issue with one of my packages, and I needed to add
LazyData: true
to my DESCRIPTION file.
I had this problem, and even renaming the variables and uninstalling the problematic packages didn't work.
Here is what happened:
I was trying to carry out the process in an R session (tab) that had already been in use, where the terra package had already been loaded. This session was not saved explicitly, but it was being saved automatically to an image in ~/.RData every time RStudio was closed. So every time I opened RStudio, it retrieved that session (image) and reloaded the previous state, causing the conflict between packages.
I solved it by creating a new blank R Markdown document and closing all previously opened sessions, as well as clearing all saved data in the RStudio "Global Environment".
I encountered this "Error: 'weekly' is not an exported object from 'namespace:ISLR'" when I was trying the following:
library(ISLR)
w <- ISLR::weekly
The problem is somehow fixed by changing it to:
w = ISLR::weekly
The = sign made all the difference here.

An error while trying to use glm model for prediction on another computer

I would like to save a glm object on one machine and use it for prediction on another data set, located on another machine that has newer data. I tried to use save and load, but with no success. What am I doing wrong?
Here is a toy example:
# on machine 1:
glm<-glm(y~x1+x2,data=dat1, family=binomial(link="logit"))
save(glm,file="glm.Rdata") # the file is stored in a folder.
# on machine 2:
load(glm.RData) # got an error:"Error in load(glm.RData) : object 'glm.RData' not found"
#I tried :
load(file='glm.RData') # no error was displayed
print(glm) # got an error:"Error in load(glm.RData) : object 'glm.RData' not found"
Any help will be great.
As per @user3710546's advice, I would avoid saving your model under the name glm, as it'll mask (i.e. block) the glm() function, making it difficult for you to use it in your session.
Using save() and load()
save() is generally used to save several objects to a file, rather than a single object. Besides naming objects directly, save() takes a list argument, 'A character vector containing the names of objects to be saved.' (Emphasis mine.) So you could use it like this:
# On machine 1:
save(list = 'glm', file = '/path/to/glm.RData')
# On machine 2:
load(file = '/path/to/glm.RData')
Note that file extensions are often case-sensitive: you saved to a file with the extension .Rdata but loaded from one with the extension .RData, which is different. This may explain why the file isn't found.
Using saveRDS() and readRDS()
An alternative to using save() and load() is to use saveRDS() and readRDS(), which are designed to be used with one object. They're used slightly differently:
# On machine 1
saveRDS(glm, file = '/path/to/glm.rds')
# On machine 2
glm = readRDS(file = '/path/to/glm.rds')
Note the .rds file extension and the fact that the object returned by readRDS() isn't automatically put in the environment (it needs to be assigned to something).
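Here is a self-contained sketch of the round trip. It assumes the built-in mtcars data in place of dat1 and a temporary file in place of /path/to/glm.rds, and it names the model fit rather than glm:

```r
# "Machine 1": fit a logistic regression on a built-in dataset and save it.
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial(link = "logit"))
model_file <- tempfile(fileext = ".rds")   # stands in for /path/to/glm.rds
saveRDS(fit, file = model_file)

# "Machine 2": read the model back under any name and predict on new data.
restored <- readRDS(model_file)
new_data <- data.frame(wt = c(2.5, 3.5), hp = c(110, 180))
predicted <- predict(restored, newdata = new_data, type = "response")
```

The only thing that has to travel between machines is the .rds file; the new data just needs columns with the same names as the predictors.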
Saving parts of a GLM
If you just want the formula saved (that is, the actual text string), you can find it in glm$formula, where glm is the name of your object. It comes back as a formula object, but you can convert it to text with deparse(glm$formula) (as.character(glm$formula) splits it into its parts instead), to then be written to a text file or whatever.
If, however, you want the model itself without the dataset it was created from (to cut down on disk space), have a look at this article, which discusses which parts of a glm object can be safely deleted.

Employ environments to handle package-data in package-functions

I recently wrote an R extension. The functions use data contained in the package and must therefore load them. Subroutines also need to access the data.
This is the approach taken:
main<- function(...){
data(data)
sub <- function(...,data=data){...}
...
}
I'm unhappy with the fact that the data resides in .GlobalEnv, so it still hangs around after the function has terminated (which also undermines the idea of passing it down via an argument).
Please put me on the right track! How do you employ environments, when you have to handle package-data in package-functions?
It looks like you are looking for the LazyData directive in your DESCRIPTION file:
LazyData: yes
Otherwise, data() has an envir argument that you can use to control which environment the data is loaded into. For example, if you wanted the data to be loaded inside main, you could use:
main<- function(...){
data(data, envir = environment() )
sub <- function(...,data=data){...}
...
}
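To check that this keeps the data out of .GlobalEnv, here is a small sketch that assumes the built-in mtcars dataset from the datasets package in place of the package's own data:

```r
main <- function() {
  here <- environment()
  # Load the built-in `mtcars` dataset into this function's environment only.
  data("mtcars", package = "datasets", envir = here)
  list(
    rows  = nrow(mtcars),
    local = exists("mtcars", envir = here, inherits = FALSE)
  )
}

result <- main()
# Was a copy left behind in the global environment?
leaked <- exists("mtcars", envir = globalenv(), inherits = FALSE)
```

result$local reports whether the copy was created inside main, and leaked reports whether a copy remained in .GlobalEnv after the call.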
If the data is needed for your functions, not for the user of the package, it should be saved in a file called sysdata.rda located in the R directory.
From R extensions:
Two exceptions are allowed: if the R subdirectory contains a file
sysdata.rda (a saved image of R objects: please use suitable
compression as suggested by tools::resaveRdaFiles) this will be
lazy-loaded into the namespace/package environment – this is intended
for system datasets that are not intended to be user-accessible via
data.
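As a rough sketch of how such a file could be produced: the package root below is a temporary folder and the object names are made up, so in a real package you would run the save() call from the package source directory.

```r
# Hypothetical package root; in a real package this is the package source directory.
pkg_root <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg_root, "R"), recursive = TRUE, showWarnings = FALSE)

# Internal objects used by the package's functions (names are made up).
lookup_table <- data.frame(code = c("a", "b"), value = c(1, 2))
default_params <- list(tol = 1e-6, max_iter = 100L)

# All internal objects go into the single file R/sysdata.rda.
sysdata_file <- file.path(pkg_root, "R", "sysdata.rda")
save(lookup_table, default_params, file = sysdata_file)
```

Once the package is installed, its functions can refer to lookup_table and default_params directly, without calling data(). If you use usethis, use_data() with internal = TRUE writes the same file for you.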

getting the name of a dataframe from loading a .rda file in R

I am trying to load an .rda file in R which was a saved dataframe. I do not remember the name of it though.
I have tried
a<-load("al.rda")
which then does not let me do anything with a. I get the error
Error:object 'a' not found
I have also tried to use the = sign.
How do I load this .rda file so I can use it?
I restarted R and ran load("al.rda"), and I now get the following error
Error: C stack usage is too close to the limit
Use 'attach' and then 'ls' with a name argument. Something like:
attach("al.rda")
ls("file:al.rda")
The data file is now on your search path in position 2, most likely. Do:
search()
ls(pos=2)
for enlightenment. Typing the name of any object saved in al.rda will now get it, unless you have something in search path position 1, but R will probably warn you with some message about a thing masking another thing if there is.
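A self-contained version of the attach approach, assuming a small example file written to a temporary path in place of al.rda:

```r
d <- data.frame(a = 11:13, b = letters[1:3])
rda_file <- tempfile(fileext = ".rda")   # stands in for al.rda
save(d, file = rda_file)
rm(d)

attach(rda_file)                 # adds the file at position 2 of the search path
attached_as <- search()[2]       # "file:" followed by the path
saved_names <- ls(pos = 2)       # names of the objects stored in the file
restored <- get(saved_names[1], pos = 2)
detach(pos = 2)                  # remove it from the search path again
```

This way you find out the name of the saved data frame without writing anything into your workspace.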
However I now suspect you've saved nothing in your RData file. Two reasons:
You say you don't get an error message
load says there's nothing loaded
I can duplicate this situation. If you do save(file="foo.RData") then you'll get an empty RData file - what you probably meant to do was save.image(file="foo.RData") which saves all your objects.
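A quick way to see the difference yourself, using temporary files and two made-up objects:

```r
x <- 1:10
y <- letters

empty_file <- tempfile(fileext = ".RData")
# No object names given: nothing is saved (R warns about this).
suppressWarnings(save(file = empty_file))
empty_names <- load(empty_file)          # character(0)

full_file <- tempfile(fileext = ".RData")
# Naming the objects (or using save.image() for the whole workspace) stores them.
save(x, y, file = full_file)
full_names <- load(full_file)            # "x" "y"

# Compare the two file sizes in bytes.
sizes <- file.size(c(empty_file, full_file))
```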
How big is this .rda file of yours? If it's under 100 bytes (my empty RData files are 42 bytes long) then I suspect that's what's happened.
I had to reinstall R...somehow it was corrupt. The simple command which I expected of
load("al.rda")
finally worked.
I had a similar issue, and it was solved without reinstalling R. For example, doing
load("al.rda") works fine; however,
a <- load("al.rda") will not work.
The load function does return the list of variables that it loaded. I suspect you actually get an error when you load "al.rda". What exactly does R output when you load?
Example of how it should work:
d <- data.frame(a=11:13, b=letters[1:3])
save(d, file='foo.rda')
a <- load('foo.rda')
a # prints "d"
Just to be sure, check that the load function you actually call is the original one:
find("load") # should print "package:base"
EDIT Since you now get an error when you load the file, it is probably corrupt in some way. Try this and say what it prints:
file.info("al.rda") # Prints the file size etc...
readBin("al.rda", "raw", 50) # reads first 50 bytes from the file
Without having access to the file, it's hard to investigate more... Maybe you could share the file somehow (http://www.filedropper.com or similar)?
I usually use save to save only a single object, and I then use the following utility method to retrieve that object into a given variable name using load, but into a temporary namespace to avoid overwriting existing objects. Maybe it will be helpful for others as well:
load_first_object <- function(fname){
  e <- new.env(parent = parent.frame())
  load(fname, e)
  return(e[[ls(e)[1]]])
}
The method can of course be extended to also return named objects and lists of objects, but this simple version is for me the most useful.
