I'm writing my own R package and would like to plot a spatialPolygonsDataFrame object. If I were writing it as a script I would simply load the necessary packages (maptools, rgdal, and rgeos) with library() and plot with plot(x).
When writing a package to build using library() is not advised, instead it is usual to load the package by adding it to Imports: in the NAMESPACE. If I do this I receive the following error:
Error in as.double(y) :
cannot coerce type 'S4' to vector of type 'double'
This is is corrected by loading the maptools package with library() if writing a script.
I know you can load individual methods with ImportMethodsFrom in the NAMESPACE so have tried to import a plot method from maptools using this approach but have had no luck. When I looked in the NAMESPACE of the maptools package I couldn't find a plot method exported. I've seen there is a plot.Spatial function which I have tried to import to my NAMESPACE without success:
No methods found in "maptools" for requests: plot.Spatial
Finally, I have tried adding maptools to Depends: instead of Imports: in my NAMESPACE and this does work. Is this the canonical way to do this? It seems overkill to attach a whole package for one method (plus I don't know what functions have been masked, etc.). What is the best way to load the necessary tools to plot maps within a self-authored function?
Edit 1: In response to #Hack-R's question, I don't know if plot.Spatial is the only method I need, or even if it's the correct one. It's my educated guess that this will enable me to plot spatial objects.
plot.Spatial is internal and is in sp and not maptools, which I think is the answer here. You are looking at the wrong package.
As discussed in the comments, you can simply use sp::plot.
For developing a package, there's a bit more to it.
If you import the methods for plot so that your functions can use it internally, but it won't be available to users unless they library(sp). You could re-export it, so your users don't have to attach sp - but you'll need to document it and perhaps explain why, and also check there's no issues if sp is attached.
This is a bit of a challenging topic that is well explained here: http://r-pkgs.had.co.nz/namespace.html I was pretty comfortable with namespaces but only recently realized you could re-export a function that you import from another - so you could provide sp's plot.Spatial without Depends: sp.
I override the print methods for Spatial in a package I use, and that in in turn overrides the overrides that raster provides - there's no stopping you doing this, it's a matter of managing the user expectations and hopefully not making things hard/er. You probably don't want to override a generic like plot for normal use, it's clearer if you have a myPlot that does that specifically, or add your own classes.
It's another level complicated though, since plot.Spatial is internal, and it's source is used to define an S4 method for plot. You can see the methods with showMethods("plot") and then get the internal functions that provide those with findMethods("plot")[["Spatial#missing"]] or findMethods("plot")[["SpatialPolygons#missing"]].
#mdsumner's answer pointed me in the right direction and was a useful discussion in its own right.
The answer to my specific query to plot spatialPolygonsDataFrame objects was to add sp to Imports: and call sp::plot()
Related
When making your own package for R, one often wants to make use of functions from a different package.
Maybe it's a plotting library like ggplot2, dplyr, or some niche function.
However, when making a function that depends on functions in other packages, what is the appropriate way to call them? In particular, I am looking for examples of when to use
myFunction <- function(x) {
example_package::function(x)
}
or
require(example_package)
myFunction <- function(x) {
function(x)
}
When should I use one over the other?
If you're actually creating an R package (as opposed to a script to source, R Project, or other method), you should NEVER use library() or require(). This is not an alternative to using package::function(). You are essentially choosing between package::function() and function(), which as highlighted by #Bernhard, explicitly calling the package ensures consistency if there are conflicting names in two or more packages.
Rather than require(package), you need to worry about properly defining your DESCRIPTION and NAMESPACE files. There's many posts about that on SO and elsewhere, so won't go into details, see here for example.
Using package::function() can help with above if you are using roxygen2 to generate your package documentation (it will automatically generate a proper NAMESPACE file.
The douple-colon variant :: has a clear advantage in the rare situations, when the same function name is used by two packages. There is a function psych::alpha to calculate Cronbach's alpha as a measure of internal consistency and a function scales::alpha to modify color transparency. There are not that many examples but then again, there are examples. dplyr even masks functions from the stats and base package! (And the tidyverse is continuing to produce more and more entries in our namespaces. Should you use dyplr you do not know, if the base function you use today will be masked by a future version of dplyr thus leading to an unexpected runtime problem of your package in the future.)
All of that is no problem if you use the :: variant. All of that is not a problem if in your package the last package opened is the one you mean.
The require (or library) variant leads to overall shorter code and it is obvious, at what time and place in the code the problem of a not-available package will lead to an error and thus become visible.
In general, both work well and you are free to choose, which of these admittedly small differences appears more important to you.
I'm developing a package that uses ggmap as a dependency.
ggmap: https://github.com/dkahle/ggmap
Within my package, I'm calling ggmap function using the recommended approach of including ggmap in the Imports section of the Description file, and calling functions using the :: operator (e.g. ggmap::get_map()). My issue is that ggmap assumes that some options are set upon initialization in .onLoad().
https://github.com/dkahle/ggmap/blob/master/R/attach.R
I believe that, since I'm not calling library() or require(), .onAttach() never gets called, and thus these options never get set. I can't call .onAttach() within my package, because it is not exported.
What is the best practice for initializing a dependent package?
This seems like a general problem in R package development, but I can't find the answer anywhere. And my apologies, this doesn't seem like the kind of question that can have a reproducible example.
I've written a package in which the following is in the imports:
Rcpp (>= 0.11.0),ggplot2,grid,gridExtra,png,methods,ape,Biostrings
I have read this artcle about how R searches: how R searches, and I figured basically unless there's a very good reason not to - it's is safer to import packages that mine depends on, and not to put them in the dependencies.
However I'm seeing the error whe I use my package:
could not find function "rasterGrob"
My suspicion is that ggplot Depends on Grid, so I have to make my package Depend on grid too, so as the grid is attached and so 'package:grid' will be seen when executing search().
A). Is my understanding correct? and B). So do I simply have to also Depend on Grid, or is it a better idea to Depend on ggplot2 also?
Thanks,
Ben.
I realize there are generic functions like plot, predict for a list of packages. I am wondering how could I get the R script of these generic functions for a specific package, like the lme4::predict. I tried lme4::predict, but it comes with error:
> lme4::predict
Error: 'predict' is not an exported object from 'namespace:lme4'
Since you state that my suggestion above was helpful I will tell you my process. I used my own co-authored package called pacman. This package was developed because we had a hard time remembering all the obscurely named functions to get information on and work with add on packages.
I used this to figure out what you wanted:
library(pacman)
p_funs(lme4, all=TRUE)
I set all = TRUE as predict is a method for a specific class (like print, summary and plot). Generally, these methods are not exported so p_funs won't return them unless you set all = TRUE. Then I scrolled down to the p section and found only a single predict method: predict.merMod
Next I realized it wasn't exported so :: won't show me the stuff and extra colon power is needed, hence: lme4:::predict.merMod
As pointed out by David and rawr above, some functions can be suborn little snippets (methods etc.), thus methods and getAnywhere are helpful.
Here is an example of that:
library(tm)
dissimilarity #The good stuff is hid
methods(dissimilarity) #I want the good stuff
getAnywhere("dissimilarity.DocumentTermMatrix")
Small end note
And of course you don't need pacman to look at functions for packages, it's what I use and helpful but it just wraps base R things. Use THE SOURCE to figure out exactly what.
[Revised based on suggestion of exporting names.]
I have been working on an R package that is nearing about 100 functions, maybe more.
I want to have, say, 10 visible functions and each may have 10 "invisible" sub-functions.
Is there an easy way to select which functions are visible, and which are not?
Also, in the interest of avoiding 'diff', is there a command like "all.equal" that can be applied to two different packages to see where they differ?
You can make a file called NAMESPACE in the base directory of your package. In this you can define which functions you want to export to the user, and you can also import functions from other packages. Exporting will make a function usable, and import will transfer a function from another package to you without making it available to the user (useful if you just need one function and don't want to require your users to load another package when they load yours).
A trunctuated part of my packages NAMESPACE :
useDynLib(qgraph)
export(qgraph)
(...)
importFrom(psych,"principal")
(...)
import(plyr)
which respectively loads the compiled functions, makes the function qgraph() available, imports from psych the principal function and imports from plyr all functions that are exported in plyr's NAMESPACE.
For more details read:
http://cran.r-project.org/doc/manuals/R-exts.pdf
I think you should organise your package and code the way you feel most comfortable with; it is your package after all. NAMESPACE can be used to control what gets exposed or not to the user up-front, as other's have mentioned, and you don't need to document all the functions, just the main user-called functions, by adding \alias{} tags to the Rd files for all the support functions you don't want people to know too much about, or hide them on an package.internals.Rd man page.
That being said, if you want people to help develop your package, or run with it and do amazing things, the better organised it is the easier that job will be. So lay out your functions logically, perhaps one file per function, named after the function name, or group all the related functions into a single R file for example. But be consistent in which approach you do.
If you have generic functions that have more general use, consider splitting those functions out into a separate package that others can use, without having to depend on your mega package with the extra cruft that is more specific. Your package can then depend on this generic package, as can packages of other authors. But don't split packages up just for the sake of making them smaller.
The answer is almost certainly to create a package. Some rules of thumb may help in your design choice:
A package should solve one problem
If you have functions that solve a different problem, put them in a separate package
For example, have a look at the ggplot2 package:
ggplot2 is a package that creates wonderful graphics
It imports plyr, a package that gives a consistent syntax and approach to solve the Split, Apply, Combine problem
It depends on reshape2, a package with only few functions that turns wide data into long, and vice-versa.
The point is that all of these packages were written by a single author, i.e. Hadley Wickham.
If you do decide to make a package, you can control the visibility of your functions:
Only functions that are exported are directly visible in the namespace
You can additionally mark some functions with the keyword internal, which will prevent them appearing in automatically generated lists of functions.
If you decide to develop your own package, I strongly recommend the devtools package, and reading the devtools wiki
If your reformulated question is about 'how to organise large packages', then this may apply:
NAMESPACE allows for very fine-grained exporting of functions: your user would see 10 visisble functions
even the invisible function are accessible if you or the users 'known', that is done via the ::: triple colon operator
packages do come in all sizes and shapes; one common rule about 'when to split' may be that as soon as you have functionality of use in different contexts
As for diff on packages: Huh? Packages are not usually all that close so that one would need a comparison function. The diff command is indeed quite useful on source code. You could use a hash function on binary code if you really wanted to but I am still puzzled as to why one would want to.