Shall I request author permission to modify R functions?

I am working on a project and would like to use some functions from an R package. However, for my project's requirements, I must modify these functions and then use them for the project's purposes only. Of course, I would like to publish my work. The modifications are to these functions only, and I will use the new functions in my project, so I will not change the package itself. My question: shall I request the author's permission for these modifications? I want to modify these functions because they are so close to what I am doing; I just need to adapt them. I do not plan to write a package.

As far as I know, if the pkg is on CRAN and its licence is GPL (>= 2), you are allowed to copy and modify the content as long as the modified content stays under the GPL and you state that you modified it. So you don't need to ask the package creator for permission.
A good practice would be to create your own package, calling it 'pkgextra' (where pkg is the name of the original package) and stating in the DESCRIPTION that your package is built on top of another package, e.g. tidystringdist, which is built on top of stringdist, or ggExtra, which is built on top of ggplot2. Also, as R packages declare their dependencies, you clearly state in the DESCRIPTION that your package depends on the other package.
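For illustration, a minimal DESCRIPTION along these lines might look like this (the package names are placeholders, not real packages):

Package: pkgextra
Title: Extensions Built on Top of pkg
Version: 0.1.0
License: GPL (>= 2)
Imports:
    pkg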
To wrap up: no, you don't need permission from the package author, as long as you distribute the derived work under the same licence and state that you depend on that package.


[R][package] Load all functions in a package's NAMESPACE

Tl;dr
I have a complete NAMESPACE file for the package I am building. I want to execute all the importFrom(x, y) clauses in that file so that every function the package needs is available. Is there an automated way to do that?
Full question
I'm currently working on building a package that in turn depends on a bunch of other packages. Think: both Hmisc and dplyr.
The other contributors to the package don't really like using devtools::build() every 5 minutes when they debug, which I can understand. They want to be able to toggle a dev_mode boolean which, when set to true, loads all dependencies and sources all scripts in the R/ and data-raw/ folders, so that all functions and data objects are loaded in memory. I don't necessarily think that's a perfectly clean solution, but they're technically my clients and it's a big help to them so please don't offer a frame challenge unless it's the most user-friendly solution imaginable.
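For context, a minimal sketch of that toggle, assuming the working directory is the package root and that deps is a hypothetical list of the dependencies:

dev_mode <- TRUE
if (dev_mode) {
  deps <- c("Hmisc", "dplyr")  # hypothetical dependency list
  lapply(deps, library, character.only = TRUE)
  # source every script so all functions and data objects are in memory
  scripts <- list.files(c("R", "data-raw"), pattern = "\\.R$", full.names = TRUE)
  invisible(lapply(scripts, source))
}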
The problem with this solution is that when I load two libraries whose namespace clash, functions in the package that would perfectly work otherwise start throwing errors. I thus need to improve my import system.
Thanks to thorough documentation (and the help of devtools::check()), my NAMESPACE is complete. I guess I could split it into pieces and eval() some well-chosen parts of it, but it sort of feels like I'm not the first person to encounter this difficulty. Is there an automated way to parse NAMESPACE and import the proper functions from the proper packages?
Thanks!
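One possible approach, sketched under the assumption that the NAMESPACE file contains only standard importFrom() directives (no conditional blocks): NAMESPACE directives are syntactically valid R calls, so you can parse() the file and evaluate just the importFrom() entries, assigning each imported function into an environment of your choice.

import_from_namespace <- function(path = "NAMESPACE", envir = globalenv()) {
  exprs <- parse(path)  # NAMESPACE directives parse as ordinary R calls
  for (e in exprs) {
    if (identical(e[[1]], as.name("importFrom"))) {
      pkg  <- as.character(e[[2]])
      funs <- vapply(as.list(e)[-(1:2)], as.character, character(1))
      for (f in funs) {
        assign(f, getExportedValue(pkg, f), envir = envir)
      }
    }
  }
}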

Failure to create a network object from an edge list with RSienaTest

I am trying to use RSienaTest's sienaDataCreateFromSession() function to create a network object from an edge list. However, the available documentation for the function does not provide any instructions on how to do so: https://www.rdocumentation.org/packages/RSiena/versions/1.1-232/topics/sienaDataCreateFromSession
I would be grateful for any advice or an example of how to do this.
While the function mentioned, sienaDataCreateFromSession(), is still available in the {RSienaTest} package from R-Forge, it is effectively obsolete.
Users are recommended to instead import data into their R session in whatever form they have it in, and then wrangle it into the correct shape for sienaDataCreate() and sienaDependent() as outlined in the manual.
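As a hedged illustration of that wrangling step (the file name and the column names from, to, and wave are hypothetical, and at least two waves are assumed): convert the edge list into the adjacency array that sienaDependent() expects, then pass the result to sienaDataCreate().

library(RSiena)

edges  <- read.csv("edges.csv")  # hypothetical edge list: from, to, wave
actors <- sort(unique(c(edges$from, edges$to)))
waves  <- sort(unique(edges$wave))
n <- length(actors)

# build one n x n adjacency matrix per wave
adj <- array(0, dim = c(n, n, length(waves)))
for (w in seq_along(waves)) {
  ew <- edges[edges$wave == waves[w], ]
  adj[cbind(match(ew$from, actors), match(ew$to, actors), w)] <- 1
}

friendship <- sienaDependent(adj)
mydata <- sienaDataCreate(friendship)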
If there is a continued need for additional helper functions in the main package, please raise this as an issue on the RSiena GitHub page.

How to manage legacy dependencies in an R package?

I'm working on a fork of a package that depends upon the ReporteRs library.
However, this library has been deprecated by its owner for a few years, in favour of the officer and flextable libraries.
One of the main reasons for this deprecation was to drop the dependency on rJava, which may cause installation problems and bugs.
In my package, how should I manage this case?
So far, my package has processed data to return a ReporteRs object. If I change my functions to return an officer object, I will break backward compatibility.
But if I don't, and keep the old ReporteRs-returning functions as legacy backward-compatibility functions, I have to keep ReporteRs in my dependencies, and my package will remain rJava-dependent.
Is there a win-win solution?
Here is what I would do:
Make your best attempt to re-implement your functions with the officer library while keeping your old API. Make sure that you warn users that these functions are deprecated (see the sketch after this list). At the same time, make new functions fully compliant with officer/flextable syntax. Note that you might change the behavior of the functions slightly (as in, not ensuring all parameters are properly evaluated), as long as they take the same parameters and return the same type of objects.
If that is really not possible, just add a compatibility warning to your old functions.
Create a transitional package version that you would keep around for a few weeks or months with both versions of these functions. If the package still needs to depend on rJava, tough luck.
Keep track of the packages that depend on your package. If there are not too many, you can contact their developers directly. Maybe the issue is not as serious as you think it is?
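To illustrate the first point, here is a minimal sketch of a legacy wrapper re-implemented on officer; add_title() and the Word style name are hypothetical, not part of any real API:

add_title <- function(doc, text) {
  .Deprecated("body_add_par")  # warn users that the old API is going away
  # same parameters and return type as before, new implementation underneath
  officer::body_add_par(doc, text, style = "heading 1")
}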
EDIT: As discussed above, you can make your dependency on ReporteRs conditional on the availability of ReporteRs. Then you can put ReporteRs into the Suggests field of the DESCRIPTION file rather than Depends, and in the package you can use code like this:
if(requireNamespace("ReportR")) {
warning("This function is deprecated, better use MyNewFunction instead")
ReportR::whatever() ...
} else {
warning("To run this (deprecated) function, please install the ReportR package")
}

How to install an R package from GitHub that is built on R-3.4.0 into R-3.3.0?

We have R-3.3.0 in our university's computing system. For some reason, the IT staff do not want to update to R-3.4.0 soon. However, I need to install an R package from GitHub that is built on R-3.4.0. Is there any way to install that package into R-3.3.0?
@patrick's answer may work just fine. The benefit (if it works) is that you get all the recent changes and functionality of the package. However, you may never know whether one of the changes requiring 3.4 concerns accuracy or correctness, meaning you may still get a return value without knowing that it is correct.
For this answer, I'm going to assume that there is a valid reason not to use the current version of the package, and trick R into installing it anyway.
Grab from a specific commit
Go to the repo, https://github.com/mshasan/OPWeight in this case.
Open the DESCRIPTION file and click on the "Blame" button on the right. This brings up the commit message header and timeframe for each group of lines, with their most recent commit. In this case, it shows "Update DESCRIPTION".
Click on the description, and you're taken to the specific commit. Seeing that this is a single-line change, it is likely that an earlier commit was what actually made R (>= 3.4.0) a hard requirement. Take note of the commit hash (5c0a43c in this case).
Go back to the repo main page and click on "Commits". If you now search for that 7-char hash-substring, you'll see it happened on June 20, 2017. Unfortunately, the commit descriptions and timeline do not give a great indication of where the version-dependent change happened.
If you can find "the commit" that did it, then take that hash-substring and use that as your ref="..." argument to install_github. If not, however, you are either stuck (1) trying them iteratively or randomly, or (2) asking the author at which commit they started using 3.4-specific code.
Once you know a ref to use (or want to try), then run
devtools::install_github("mshasen/OPWeight", ref="5c0a43c")
(This is obviously the wrong ref to use, since that's the first commit at which we are certain the dependency exists.)
Using tests to know which to use
Since the repo contains a tests/ subdir, one can hope/assume that the tests will accurately catch if things are not working correctly in your R-3.3. This alternative method involves you testing each commit on your specific version of R (before the DESCRIPTION file was changed) until the tests no longer fail.
Using git (command-line or GUI), clone the repo to your local computer.
$ git clone https://github.com/mshasan/OPWeight
Now, iterate through the references (found using the above method or perhaps with git log) with something like:
$ git checkout --detach <hash_substring>
... and in R, run
devtools::test("path/to/your/copy/of/OPWeight")
If the tests have been set up correctly and you chose a worthy version, then stick with it.
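A rough sketch of that loop from R, assuming you collected candidate hashes beforehand (the hashes and the path below are placeholders):

repo <- "path/to/your/copy/of/OPWeight"
candidates <- c("abc1234", "def5678")  # hypothetical hashes, newest first
for (h in candidates) {
  system(sprintf("git -C %s checkout --detach %s", repo, h))
  message("Running tests at commit ", h)
  # inspect the printed results; stop once a commit's tests pass on R-3.3
  devtools::test(repo)
}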
You can find the DESCRIPTION file for OPWeight here.
Change this part
Depends:
R (>= 3.4.0),
to whatever R version you are using and see if things break. The logic of the DESCRIPTION file is explained here. Obviously this is a last resort.

Where in R do I permanently store my custom functions?

I have several custom functions that I use frequently in R. Rather than source this file (or parts thereof) in each script, is there some way to add these functions to a base R file so that they are always available when I use R?
Yes, create a package. There are numerous tutorials as well as the Writing R Extensions manual that came with your copy of R.
It may seem like too much work at first, but you will probably be glad that you did this in the longer run.
PS And you can then load that package from ~/.Rprofile. For really short code, you can also define it there.
A package may be overkill for a few useful functions. I'd argue there's nothing wrong with explicitly source()ing them as you need them - at least it is explicit, so that if you email someone your code, you won't forget to include those other scripts.
Another option is to use the .Rprofile file. You can read about the details in ?Startup. Basically, the idea is that:
...a file called ‘.Rprofile’ is searched for in the current directory or
in the user's home directory (in that order). The user profile file is
sourced into the workspace.
You can read here about how many people use this functionality.
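For example, a minimal ~/.Rprofile that pulls in a personal functions file could look like this (the path is hypothetical):

if (interactive()) {
  source("~/R/my_functions.R")  # your frequently used custom functions
}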
The accepted answer is best long-term: Make a package.
Luckily, the learning curve for doing this has been dramatically reduced by the devtools package: it automates package creation (a nice assist in getting off on the right foot), encourages good practices (like documenting with roxygen2), and helps with using online version control (Bitbucket, GitHub, or other) and sharing your package with others. It's also very helpful for smoothing your way to CRAN submission.
Good docs at http://adv-r.had.co.nz and http://r-pkgs.had.co.nz.
To create your package, for instance, you can:
install.packages("devtools")
devtools::create("path/to/package/pkgname")
You could also look at the 'mvbutils' package: it lets you set up a hierarchical set of "tasks" (folders with workspace ".RData" files in them) such that you can always see what's in the ancestral tasks (i.e. the ancestors are on the search() path). So you can put your custom functions in the "starting task" where you always start R, and then change to whatever project-specific task you require. That way you avoid cluttered workspaces, but you can still use (and edit) your custom functions because the starting task is always ancestral. Objects (including functions) are stored in ".RData" files and are thus loaded/saved automatically, but there are separate text-backup facilities for functions.
There are lots of different ways of working in R, and no "one-size-fits-all" best solution. It's also not easy to find an overview! Speaking just for myself:
I'm not a fan of having to 'source' everything every time; for one thing, it simply doesn't work with big data sets and/or results of model runs.
I think packages are hard to create and maintain; there is a really significant overhead. After the first 5 packages you write, it does get a bit easier provided you do it on at least a weekly basis so you don't forget how, but really...
In fact, 'mvbutils' also has a bunch of tools for facilitating the creation and (especially) maintenance of packages, designed to interface smoothly with the task-hierarchy system. I use & edit my own packages all the time (including editing mvbutils itself); but if it wasn't for the tools in 'mvbutils', I'd be grinding my teeth in frustration most days of the week.
