How do people learn about giving an R package a namespace? I find the documention in "R Extensions" fine, but I don't really get what is happening when a variable is imported or exported - I need a dummy's guide to these directives.
How do you decide what is exported? Is it just everything that really shouldn't required the pkg:::var syntax? What about imports?
Do imports make it easier to ensure that your use of other package functions doesn't get confused when function names overlap?
Are there special considerations for S4 classes?
Packages that I'm familiar with that use namespaces such as sp and rgdal are quite complicated - are there simple examples that could make things clearer?
I have a start on an answer on the devtools wiki: https://r-pkgs.org/Metadata.html
Few years later here....
I consolidated findings from Chambers, other StackOverflow posts, and lots of tinkering in R:
https://blog.thatbuthow.com/how-r-searches-and-finds-stuff/
This is less about implementing NAMESPACE/IMPORTS/DEPENDS and more about the purpose of these structures. Answers some of your questions.
The clearest explanation I've read is in John Chambers' Software for Data Analysis: Programming with R, page 103. I don't know of any free online explanations that are better than what you've already found in the R Extensions manual.
You could also pick an easy, small package and follow it.
I semi-randomly looked at digest which is one of my smaller packages. I loads a (small) dynamic library and exports one symbol, the digest() function. Here is the content of the NAMESPACE file:
## package has dynamic library
useDynLib(digest)
## and one and only one core function
export(digest)
Have a look at the rest of the source files and maybe try to read Writing R Extensions alongside looking at the example, and do some experiments.
http://www.stat.uiowa.edu/~luke/R/namespaces/morenames.pdf
Related
i imported exel to R now i do not know how to solve the question, as it is my 1st time with R
As this looks like an assignment/homework question, and you mention this is your 1st time with R, I think you would benefit more from looking at an in-depth introduction to R than a quick answer here. This site seems to be a good introduction: https://intro2r.com/index.html . The site recommend RStudio which is far more intuitive and easy to use than base R.
There is also often good documentation on basic functions within R itself. Type ? into the console before any command and it will direct you to some helpful information. For example, you may find these useful to get started.
?hist
?plot
?min
?max
I'm in the process of creating a small R package containing a set of functions that should be useful in a specialized area of Biology. I currently have the package on GitHub, but want to submit it to CRAN soon. One thing I have noticed when digging around in other packages, is that the code often includes no comments at all (e.g. short comments describing what different parts of the code does), which makes it more difficult to understand. I'm not a programmer or expert in R, so I don't understand why comments are often not included, and Hadley Wickham's "R packages" book makes not mention of this.
Edit: I'm not referring to the object documentation, that one accesses with ?function(), but to comments that are interspersed within the function code, which a normal user wouldn't see, but that could be helpful for people trying to figure out exactly how a function works.
Is there a specific reason to not include comments within the functions of an R package? If so, should I remove all the comments from my code before submitting to CRAN?
I'm writing an R package and I'd like to use one function from another package (plotKML). This external package has so many dependencies that I don't want my users to be required to download etc. If I use importFrom(plotKML, readGPX) in the NAMESPACE file it will load all of plotKML into the namespace and download all dependencies which I don't want.
So the question is: is it appropriate to copy the code for the one function I need (ensuring that all the dependencies in that one function are included)? If so what is appropriate for the attribution/documentation -- do I copy the documentation from the original?
There is a great discussion of this issue in this post and the answer by Brian Diggs is very helpful. But he ends with "For your example, you may be better off copying the code for memisc::describe into your package, although that approach has its own problems and caveats" so I'm left with some uncertainty about what the problems are and whether it's appropriate from a attribution perspective.
Questions about the appropriate attribution would probably be best resolved by contacting the package author directly. As noted in the comments above, that package appears to use GPL-3, which should mean that you can include the function in your package but your package must then also be GPL-3 licensed. (As always, probably no one here is a lawyer so that's on you to check...)
The primary downside to copying just the function you need is that then you are responsible for maintaining it. This probably also means maintaining it in a way that keeps it in sync with the original version from plotKML. Depending on the package, surrounding code and how often it is updated that could be fairly simple or it could be horrible.
Does anybody know where I can download the R package "cart" that can help create Gastner's
"Mapping with Diffusion-based Cartograms" ? I tried a install.package on R and says it's not available
for R 2.15. There is a page on R-forge about it but it doesn't explain how to download the package.
Thanks.
Way late to the game, but from what I can tell there's not much happening for the cart package; my recent efforts with cartogramming in R have pushed me towards two alternatives: Rcartogram within R (available from the GitHub repository) and ScapeToad, a program written in JS.
Advantage of the former is that you don't have to leave R (better for long-term project management), however it's a bit arcane to use (requires converting your shapefile to a density grid & then figuring out how to use an interpolation method, etc.).
Advantage of the latter is that it's got a very simple point-and-click GUI--add shapefile, create cartogram wizard, export shapefile, voila.
Both are based on the Gastner-Newman diffusion-based algorithm.
If you check the build page you'll see that at the moment the package fails to build. I thought it might be something minor but I've put in a little bit of work so far and it's still failing to build on my machine.
You might want to email the authors and ask them. You could also try their forum but it looks like it hasn't seen much activity lately.
I'm new to R and having a hard time piecing together information from various sources online related to what is considered a "good" practice with writing R code. I've read basic guides but I've been having a hard time finding information that is definitely up to date.
What are some examples of well written/documented S3 classes?
How about corresponding S4 classes?
What conventions do you use when commenting .R classes/functions? Do you put all of your comments in both .Rd files and .R files? Is synchronization of these files tiresome?
Whether to use S3, S4, or a package at all is mostly a style issue (as Dirk says), but I would suggest using one of those if you want to have a very well structured object (just as you would in any OOP language). For instance, all the time series classes have time series objects (I believe that they're all S3 with the exception of its) because it allows them to enforce certain behavior around the construction and usage of those objects. Similarly with the question about creating a package: it's a good idea to do this if you will be re-using your code frequently or if the code will be useful to someone else. It requires a little more effort, but the added organizational structure can easily make up for the cost.
Regarding S3 vs. S4 (discussed on R-Help here and here), the basic guideline is that S3 classes are more "quick and dirty" while S4 classes place more rigid control over objects and types. If you're working on Bioconductor, you typically will use S4 (see, for instance, "S4 classes and methods").
I would recommend reading some of the following:
"A (Not So) Short Introduction to S4" by Christophe Genolini
"Programmers' niche: A simple class, in S3 and S4" by Thomas Lumley
"Brobdingnag: a ''hello world'' package using S4 methods" by Robin K. S. Hankin
"Converting packages to S4" by Douglas Bates
"How S4 Methods Work" by John Chambers
For documentation, Hadley's suggestion is spot on: Roxygen will make life easier and puts the documentation right next to the code. That aside, you may still want to provide other comments in your code beyond what Roxygen or the man files require, in which case it's a good practice to comment your code for other developers. Those comments will not end up in your package; they will only be visible in the source code.
For 3. Use roxygen - it works like javadoc to take comments in your source files and build Rd files.
That's half a dozen or more questions bundled into one, which makes it difficult to answer.
So let's try from the inside out: First try to solve your RODBC wrapper problem. A code representation will suggest itself. I would start with simple functions, and then maybe build a package around it. That already gives you some encapsulation.
Much of the rest is style. Some prominent R codes swear by S4, while other swear about it. You can always read the packages of others as well as code in R itself. And you can always re-implement your RODBC wrapper in different ways and the compare your own approaches.
Edit: Reflecting you updated and much shortened question: Pick some packages from CRAN, in particular among those you use. I think you will quickly find some more or less interesting according to your style.
somewhat more style related than substance, but the Google R style guide is worth reading: