"Error opening SHP file" when loading from package data - r

I want to shift some of my most commonly used shapefiles to GitHub. I have a new package in development. It worked when built in RStudio, but when I pull the package from GitHub and run it, I get the error "Error in getinfo.shape(filen) : Error opening SHP file". When I look inside the package, the shapefiles are in place in the data folder. They're being called by individual functions, e.g.
load_lon = function() {
  require(maptools)
  lon <<- readShapePoly('data/london_outline_simple.shp',
                        proj4string = CRS('+init=epsg:27700'))
}
Presumably this approach means R is wrongly looking for a 'data' subfolder of the working directory rather than of the installed package. But I can't think how else to call them, as data() doesn't support .shp files. Grateful for advice on how to load them in.

In line with @josh-obrien's comment, the answer is to put data files in a subfolder of inst, whereupon they'll be installed alongside the code when the package is built. I've put the shapefiles in:
$SOURCEDIR/inst/external/
These are then loaded via functions such as:
load_lon = function() {
  require(maptools)
  path = system.file("external/london_outline_simple.shp",
                     package = "londonShapefiles")
  lon <<- readShapePoly(path, proj4string = CRS('+init=epsg:27700'))
}
Check out the working GitHub package for a full working example.
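One thing to watch: a shapefile is really a set of sidecar files (.shp, .shx, .dbf and friends), so all of them need to be copied into inst/external/, not just the .shp. A slightly more defensive loader might look like this (a sketch; system.file() returns "" when the file isn't found in the installed package):
load_lon = function() {
  require(maptools)
  path = system.file("external/london_outline_simple.shp",
                     package = "londonShapefiles")
  # system.file() returns "" rather than an error when the file is missing
  if (path == "") stop("london_outline_simple.shp not found in installed package")
  lon <<- readShapePoly(path, proj4string = CRS('+init=epsg:27700'))
}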

Related

Run a library in R while having the data in a different folder

I have what I'm sure is a very simple question, but I can't find the solution; please, someone help me!
I installed MixSIAR for R, which is a package for statistical analysis. I can see that the library is in C:/Users/me/Documents/R/win-library/4.0/MixSIAR, and the script works perfectly as long as my data is in that same MixSIAR folder. But if I want to keep the data in my Dropbox or any other folder on my PC, it doesn't work, because I can't figure out how to write the path to "call" the package MixSIAR.
#library(MixSIAR) # Works but needs data to be in the same MixSIAR folder.
library("C:/Users/pingry/Documents/R/win-library/4.0/MixSIAR") # doesn't work

# Load consumer
mix.filename <- system.file("data", "contemp_turtles_Oahu_consumer.csv", package = "MixSIAR")
mix <- load_mix_data(filename = mix.filename,
                     iso_names = c("d13C", "d15N", "d34S"),
                     factors = NULL,
                     fac_random = NULL,
                     fac_nested = NULL,
                     cont_effects = NULL)
The error I get:
Error in library("C:/Users/pingry/Documents/R/win-library/4.0/MixSIAR") :
  there is no package called ‘C:/Users/pingry/Documents/R/win-library/4.0/MixSIAR’
Mr. Flick gave me the answer, so here is the fixed code that lets me keep my data in my Dropbox and use the MixSIAR package:
library(MixSIAR)

# Load mix data
mix <- load_mix_data(filename = "data/contemp_turtles_2islands_consumer.csv",
                     iso_names = c("d13C", "d15N", "d34S"),
                     factors = "island",
                     fac_random = TRUE,
                     fac_nested = FALSE,
                     cont_effects = NULL)
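Note that "data/contemp_turtles_2islands_consumer.csv" is resolved relative to the current working directory, so the same pattern works for a Dropbox folder by passing a full path instead (a sketch; the Dropbox path here is hypothetical):
library(MixSIAR)

# Hypothetical full path to the CSV stored in Dropbox
mix.filename <- "C:/Users/pingry/Dropbox/contemp_turtles_2islands_consumer.csv"
mix <- load_mix_data(filename = mix.filename,
                     iso_names = c("d13C", "d15N", "d34S"),
                     factors = "island",
                     fac_random = TRUE,
                     fac_nested = FALSE,
                     cont_effects = NULL)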

I am trying to make an interactive map using the leafletR package, following a ZevRoss blog post, but there is an error in the code

The ZevRoss blog post is as follows:
http://zevross.com/blog/2014/04/11/using-r-to-quickly-create-an-interactive-online-map-using-the-leafletr-package/
The code with error is:
# ----- Write data to GeoJSON
leafdat <- paste(downloaddir, "/", filename, ".geojson", sep = "")
writeOGR(subdat, leafdat, layer = "", driver = "GeoJSON")
And the error is:
Error in writeOGR(subdat, leafdat, layer = "", driver = "GeoJSON") :
GDAL Error 3: Cannot open file 'd:/Leaflet/County_2010Census_DP1.geojson'
Because I am new to R, I searched for this problem a lot but didn't find any good answer.
I am using RStudio with R version 3.1.1 (2014-07-10) on Windows 7 32-bit.
My rgdal version is 0.9-1.
The rest of the code in the blog post runs successfully; this line seems to be the only sticking point.
You could create the GeoJSON using the leafletR package instead:
library('leafletR')
Your_GeoJSON <- toGeoJSON(data=YourData, dest=getwd())
I've tried to find a solution for this mysterious error for some time.
Eventually I found a post on the GDAL bug tracker that clarified the problem and gave a solution.
Basically, the problem is in the interface between rgdal and GDAL (GDAL changed how it works and the latest version of rgdal hasn't caught up yet):
writeOGR() calls ogrCheckExists("foo.geojson") to check first if the file exists before creating a new dataset.
In version 1.11 the OGR GeoJSON driver emits an error message that the file doesn't exist, whereas previous versions didn't emit an error message.
rgdal catches this error as a fatal one and doesn't go on to the writing step. This should be fixed in rgdal.
Meanwhile you have an easy workaround: add check_exists = FALSE as a parameter to writeOGR().
Therefore the following code will work:
writeOGR(spDf, 'foo.geojson', 'spDf', driver = 'GeoJSON', check_exists = FALSE)
Of course, if a GeoJSON file with the chosen name already exists at that location, writeOGR() will still fail.
Even if you already have a 'd:' drive on your computer and permission to write to it, try the following:
leafdat <- paste(downloaddir, "/", ".geojson", sep = "")
leafdat
# [1] "d:/Leaflet/.geojson"
writeOGR(subdat, leafdat, layer = "", driver = "GeoJSON")
You should then get a ".geojson" file in "d:/Leaflet". Rename the file from ".geojson" to "County_2010Census_DP1.geojson".
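The rename can be done from R as well (a sketch reusing the same directory variables as above):
# Write to the bare ".geojson" name, then rename the result
writeOGR(subdat, leafdat, layer = "", driver = "GeoJSON")
file.rename(file.path(downloaddir, ".geojson"),
            file.path(downloaddir, "County_2010Census_DP1.geojson"))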

issue with get_rollit_source

I tried to use get_rollit_source from the RcppRoll package as follows:
library(RcppRoll)
get_rollit_source(roll_max, edit = TRUE, RStudio = TRUE)
I get an error:
Error in get("outFile", envir = environment(fun)) :
object 'outFile' not found
I tried
outFile="C:/myDir/Test.cpp"
get_rollit_source(roll_max,edit=TRUE,RStudio=FALSE,outFile=outFile)
I get an error:
Error in get_rollit_source(roll_max, edit = TRUE, RStudio = FALSE, outFile = outFile) :
File does not exist!
How can I fix this issue?
I noticed that the RcppRoll folder in the R library doesn't contain any src directory. Should I download it?
get_rollit_source only works for 'custom' functions. For things baked into the package, you can just download and read the source code (you can download the source tarball from CRAN, or go to the GitHub repo).
Anyway, something like the following should work:
rolling_sqsum <- rollit(final_trans = "x * x")
get_rollit_source(rolling_sqsum)
(I wrote this package quite a while back when I was still learning R / Rcpp so there are definitely some rough edges...)
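If you want the generated source in an editor rather than the console, the edit arguments from the question should then work on such a custom function (an untested sketch):
library(RcppRoll)

# Generate a custom rolling function, then open its generated source in RStudio
rolling_sqsum <- rollit(final_trans = "x * x")
get_rollit_source(rolling_sqsum, edit = TRUE, RStudio = TRUE)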

Package that downloads data from the internet during installation

Is anyone aware of a package that downloads a dataset from the internet during the installation process and then prepares and saves it, so that it is available when the package is loaded with library(packageName)? Are there any drawbacks to this approach (besides the obvious one that package installation will fail if the data source is unavailable or the data format has changed)?
EDIT: Some background. The data is three tab-separated files in a ZIP archive, maintained by a federal statistics office and generally freely accessible. I have R code which downloads, extracts and prepares the data; in the end, three data frames are created which could be saved in .RData format.
I am thinking about creating two packages: A "data" package that provides the data, and a "code" package that operates on it.
I did this mockup earlier, while you were posting your edit. I presume it would work, but it's not tested. I've commented it so you can see what you would need to change. The idea here is to check whether an expected object is available in the current working environment. If it is not, check whether the file the data can be found in is in the current working directory. If that is not found, prompt the user to download the file, then proceed from there.
myFunction <- function(this, that, dataset = NULL) {
  # We're giving the user a chance to specify the dataset.
  # Maybe they have already downloaded it and saved it.
  if (is.null(dataset)) {
    # Check to see if the object is already in the workspace.
    # If it is not, check whether the .RData file that
    # contains the object is in the current working directory.
    if (!exists("OBJECTNAME", where = 1)) {
      if (isTRUE(list.files(
        pattern = "^DATAFILE.RData$") == "DATAFILE.RData")) {
        load("DATAFILE.RData")
        # If neither of those is successful, prompt the user
        # to download the dataset.
      } else {
        ans = readline(
          "DATAFILE.RData dataset not found in working directory.
OBJECTNAME object not found in workspace. \n
Download and load the dataset now? (y/n) ")
        if (ans != "y")
          return(invisible())
        # I usually use RCurl in case the URL is https
        require(RCurl)
        baseURL = c("http://some/base/url/")
        # Here, we actually download the data
        temp = getBinaryURL(paste0(baseURL, "DATAFILE.RData"))
        # Here we load the data
        load(rawConnection(temp), envir = .GlobalEnv)
        message("OBJECTNAME data downloaded from \n",
                paste0(baseURL, "DATAFILE.RData \n"),
                "and added to your workspace\n\n")
        rm(temp, baseURL)
      }
    }
    dataset <- OBJECTNAME
  }
  TEMP <- dataset
  ## Other fun stuff with TEMP, this, and that.
}
Two packages, hosted at GitHub
Here's another approach, building on the comments between @juba and me. The basic concept is to have, as you describe, one package for the code and one for the data. This function would be part of the package that contains your code. It will:
Check to see if the data package is installed
Check to see if the version of the data package you have installed matches the version at GitHub, which we are going to assume is the most up-to-date version.
If either check fails, ask the user whether they want to update their installation of the package. In this case, for demonstration, I've linked to one of my packages in progress at GitHub. This should give you an idea of what you need to substitute to get it to work with your own package once you've hosted it there.
CheckVersionFirst <- function() {
  # Check to see if installed
  if (!"StataDCTutils" %in% installed.packages()[, 1]) {
    Checks <- "Failed"
  } else {
    # Compare version numbers
    require(RCurl)
    temp <- getURL("https://raw.github.com/mrdwab/StataDCTutils/master/DESCRIPTION")
    CurrentVersion <- gsub("^\\s|\\s$", "",
                           gsub(".*Version:(.*)\\nDate.*", "\\1", temp))
    if (packageVersion("StataDCTutils") == CurrentVersion) {
      Checks <- "Passed"
    }
    if (packageVersion("StataDCTutils") < CurrentVersion) {
      Checks <- "Failed"
    }
  }
  switch(
    Checks,
    Passed = { message("Everything looks OK! Proceeding!") },
    Failed = {
      ans = readline(
        "'StataDCTutils' is either outdated or not installed. Update now? (y/n) ")
      if (ans != "y")
        return(invisible())
      require(devtools)
      install_github("StataDCTutils", "mrdwab")
    })
  # Some cool things you want to do after you are sure the data is there
}
Try it out with CheckVersionFirst().
Note: This will succeed only if you religiously remember to update the version number in your DESCRIPTION file every time you push a new version of the data to GitHub!
So, to clarify/recap/expand, the basic idea would be to:
Periodically push the updated version of your data package to GitHub, being sure to change the version number of the data package in its DESCRIPTION file when you do so.
Integrate this CheckVersionFirst() function as an .onLoad event in your code package (obviously, modify the function to match your account and package name); see the sketch after this list.
Change the commented line that reads # Some cool things you want to do after you are sure the data is there to reflect the cool things you actually want to do, which would probably start with library(YOURDATAPACKAGE) to load the data.
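A minimal sketch of that .onLoad wiring, placed in something like R/zzz.R of your code package (names follow the example above):
.onLoad <- function(libname, pkgname) {
  # Run the version check whenever the code package is loaded
  CheckVersionFirst()
}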
This method may not be efficient, but it is a good workaround. If you are making a package that needs regularly updated data, first make a package which holds that data. It does not need any functions, but I like the concept of a setter (which you might not need in this case) and a getter.
Then, when you make your main package, have the 'data' package as a dependency. This way, whenever someone installs your package, they will always have the latest data.
On your part, you'll just have to swap out the data in your 'data' package and upload it to the repo you want.
If you don't know how to build a package, check ?package.skeleton, R CMD build, and R CMD check.
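For instance, the code package's DESCRIPTION might declare the dependency like this (both package names here are hypothetical):
Package: myCodePackage
Version: 0.1.0
Depends: myDataPackage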

extracting source code from an R package

I am trying to install the R package sowas, but unfortunately it is too old to work with new versions of R.
According to the author, you can use the package by gaining access to the code via the source() function, but I have not been able to figure out how to do that.
Any help is appreciated.
Here is a link to the package I described as it is not a CRAN package: http://tocsy.pik-potsdam.de/wavelets/
The .zip file is a Windows binary and as such won't be too interesting. What you'll want to look at is the contents of the .tar.gz archive. You can extract those contents and then look at the code in the R subdirectory.
You could also update the package to work with new versions of R so that you can actually build and install it. To do so, unpack the .tar.gz as before, but now you'll need to add a NAMESPACE file. This is just a plain-text file at the top of the package directory that has a form like:
export(createar)
export(createwgn)
export(criticalvaluesWCO)
export(criticalvaluesWSP)
export(cwt.ts)
export(plot.wt)
export(plotwt)
export(readmatrix)
export(readts)
export(rk)
export(wco)
export(wcs)
export(writematrix)
export(wsp)
Where you have an export statement for any function in the package that you actually want to be able to use. If a function isn't exported then the functions in the package still have access to that function but the user can't use it (as easily). Once you do that you should be able to build and install the package.
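With the NAMESPACE in place, building and installing from the command line would look something like this (the tarball name will depend on the version in the DESCRIPTION file):
R CMD build sowas
R CMD INSTALL sowas_1.0.tar.gz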
I took the liberty of doing some of this already. I haven't actually taken the time to figure out which functions are useful and should be exported and just assumed that if a help page was written for the function that it should be exported and if there wasn't a help page then I didn't export it. I used Rd2roxygen to convert the help pages to roxygen code (because that's how I roll) and had to do a little bit of cleanup after that but it seems to install just fine.
So if you have the devtools package installed you should actually be able to install the version I modified directly by using the following commands
library(devtools)
install_github("SOWAS", "Dasonk")
Personally I would recommend that you go the route of adding the NAMESPACE file and what not directly as then you'll have more control over the code and be more able to fix any problems that might occur when using the package. Or if you use git you could fork my repo and continue fixing things from there. Good luck.
If you want to see the source code of a particular function, just type the name of the function without parentheses and press Enter; you will see the code.
For example, type var at the prompt to see its code.
> var
function (x, y = NULL, na.rm = FALSE, use)
{
    if (missing(use))
        use <- if (na.rm)
            "na.or.complete"
        else "everything"
    na.method <- pmatch(use, c("all.obs", "complete.obs", "pairwise.complete.obs",
        "everything", "na.or.complete"))
    if (is.na(na.method))
        stop("invalid 'use' argument")
    if (is.data.frame(x))
        x <- as.matrix(x)
    else stopifnot(is.atomic(x))
    if (is.data.frame(y))
        y <- as.matrix(y)
    else stopifnot(is.atomic(y))
    .Call(C_cov, x, y, na.method, FALSE)
}
<bytecode: 0x0000000008c97980>
<environment: namespace:stats>
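For objects a package doesn't export, getAnywhere() or the ::: operator will still show them; for example:
getAnywhere("var")   # searches attached packages and loaded namespaces
stats:::C_cov        # ::: reaches non-exported objects in a namespace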
