I'm developing R packages for my company of not so tech savy employees.
To facilitate the life of the users I'm thinking of having a set of scripts at a global location that can get sourced at startup in order to update and load relevant packages.
My idea is to have a script for each custom package that only gets sourced if that particalar package is installed. For this to work I need to modify the .First() function when the user installs the different packages, such that after packages A is installed:
.First() <- function(){
source('script_to_package_A')
}
and if package B is then installed:
.First() <- function(){
source('script_to_package_A')
source('script_to_package_B')
}
Thus i am interested to add a line inside the .Rprofile file if .First() is already defined and the line is not in there already or, if no .Rprofile exists, create it.
What would be the best way to achieve this?
best wishes
Thomas
EDIT: Just to clarify - what I am looking for is a safe way to modify the .First() function that gets called during startup. That means that if a user already has defined a .First() function the additions will just be appended instead of replacing the old .First(). Preferably this addition to first will happen when a specific package is installed but this is not necessary. I'm fairly confident in what to add to the .First() function so this is not in question.
I suspect that the easiest way to get what you want is to have a single package installed on all machines. If you can alter the .First function on everyones machine, then you can install a package.
Your .First function would be:
.First() <- function(){
library(mypkg)
run_mypkg_fun()
##Could also check for package updates
##update.packages("mypkg"...)
}
The function run_mypkg_fun can do all the checks you want, including:
installing packages
running different functions conditional on what's installed.
Related
I’ve written some R functions and dropped them into a script file using RStudio. These are bits of code that I use over and over, so I’m wondering how I might most easily create an R package out of them (for my own private use).
I’ve read various “how to” guides online but they’re quite complicated. Can anyone suggest an “idiot’s guide” to doing this please?
I've been involved in creating R packages recently, so I can help you with that. Before proceeding to the steps to be followed, there are some pre-requisites, which include:
RStudio
devtools package (for most of the functions involved in creation of a package)
roxygen2 package (for roxygen documentation)
In case you don't have the aforementioned packages, you can install them with these commands respectively:
install.packages("devtools")
install.packages("roxygen2")
Steps:
(1) Import devtools in RStudio by using library(devtools).
(devtools is a core package that makes creating R packages easier with its tools)
(2) Create your package by using:
create_package("~/directory/package_name") for a custom directory.
or
create_package("package_name") if you want your package to be created in current workspace directory.
(3) Soon after you execute this function, it will open a new RStudio session. You will observe that in the old session some lines will be auto-generated which basically tells R to create a new package with required components in the specified directory.
After this, we are done with this old instance of RStudio. We will continue our work on the new RStudio session window.
By far the package creation part is already over (yes, that simple) however, a package isn't directly functionable just by its creation plus the fact that you need to include a function in it requires some additional aspects of a package such as its documentation (where the function's title, parameters, return types, examples etc as mentioned using #param, #return etc - you would be familiar if you see roxygen documentation like in some github repositories) and R CMD checks to get it working.
I'll get to that in the subsequent steps, but just in case you want to verify that your package is created, you can look at:
The top right corner of the new RStudio session, where you can see the package name that you created.
The console, where you will see that R created a new directory/folder in the path that we specified in create_package() function.
The files panel of RStudio session, where you'll notice a bunch of new files and directories within your directory.
(4) As you mentioned in your words, you drop your functions in a script file - hence you will need to create the script first, which can be done using:
use_r("function_name")
A new R script will pop up in your working session, ready to be used.
Now go ahead and write your function(s) in it.
(5) After your done, you need to load the function(s) you have written for your package. This is accomplished by using the devtools::load_all() function.
When you execute load_all() in the console, you'll get to know that the functions have been loaded into your package when you'll see Loading package_name displayed in console.
You can try calling your functions after that in the console to verify that they work as a part of the package.
(6) Now that your function has been written and loaded into your package, it is time to move onto checks. It is a good practice to check the whole package as we make changes to our package. The function devtools::check() offers an easy way to do this.
Try executing check() in the console, it will go through a number of procedures checking your package for warnings/errors and give details for the same as messages on the screen (pertaining to what are the errors/warnings/notes). The R CMD check results at the end will contain the vital logs for you to see what are the errors and warnings you got along with their frequency.
If the functions in your package are written well, (with additional package dependencies taken care of) it will give you two warnings upon execution of check:
The first warning will be regarding the license that your package uses, which is not specified for a new pacakge.
The second should be the one for documentation, warning us that our code is not documented.
To resolve the first issue which is the license, use the use_mit_license("license_holder_name") command (or any other license which suits your package - but then for private use as you mentioned, it doesn't really matter what you specify if only your going to use it or not its to be distributed) with your name as in place of license_holder_name or anything which suits a license name.
This will add the license field in the .DESCRIPTION file (in your files panel) plus create additional files adding the license information.
Also you'll need to edit the .DESCRIPTION file, which have self-explanatory fields to fill-in or edit. Here is an example of how you can have it:
Package: Your_package_name
Title: Give a brief title
Version: 1.0.0.0
Authors#R:
person(given = "Your_first_name",
family = "Your_surname/family_name",
role = c("package_creator", "author"),
email = "youremailaddress#gmail.com",
comment = c(ORCID = "YOUR-ORCID-ID"))
Description: Give a brief description considering your package functionality.
License: will be updated with whatever license you provide, the above step will take care of this line.
Encoding: UTF-8
LazyData: true
To resolve the documentation warning, you'll need to document your function using roxygen documentation. An example:
#' #param a parameter one
#' #param b parameter two
#' #return sum of a and b
#' #export
#'
#' #examples
#' yourfunction(1,2)
yourfunction <- function(a,b)
{
sum <- a+b
return(sum)
}
Follow the roxygen syntax and add attributes as you desire, some may be optional such as #title for specifying title, while others such as #import are required (must) if your importing from other packages other than base R.
After your done documenting your function(s) using the Roxygen skeleton, we can tell our package that we have documented our functions by running devtools::document(). After you execute the document() command, perform check() again to see if you get any warnings. If you don't, then that means you're good to go. (you won't if you follow the steps)
Lastly, you'll need to install the package, for it to be accessible by R. Simply use the install() command (yes the same one you used at the beginning, except you don't need to specify the package here like install("package") since you are currently working in an instance where the package is loaded and is ready to be deployed/installed) and you'll see after a few lines of installation a statement like "Done (package_name)", which indicates the installation of our package is complete.
Now you can try your function by first importing your package using library("package_name") and then calling your desired function from the package. Thats it, congrats you did it!
I've tried to include the procedure in a lucid way (the way I create my R packages), but if you have any doubts feel free to ask.
my R package has a function (my_func) that uses function(bar) from another package (foo) which only available on Unix. So I write my code this way, following the suggested packages in Writing R extension:
my_func = function(){
....
if (requireNamespace("foo", quietly = TRUE)) {
foo::bar(..) # also tried bar(...)
} else {
# do something else without `bar`
}
...
}
However, I keep getting such warning message when I run R CMD check:
'loadNamespace' or 'requireNamespace' call not declared from: foo
Is there any way to use platform specific packages without gives such warnings? I tried to include foo in the Description file and the warning disappeared, but then the package cannot be installed on windows because foo is not available for the win.
I think you misread that comment, though it is easy to do because the logic is not clear and incontrovertible.
In that first paragraph that talked about calling packages using require(package) they said it was OK for a test environment or vignette but not from inside a finished function to use library(package) or require(package).
This is considered bad form, because the packages you call inside a function will usurp the order of access your user has set up in her work environment when loading packages.
The method you use above:
package::function()
is the approved of way to use a function within a function without altering the current environment.
But in order to use that function you must still have that package installed on your current machine or in your current working environment (such as a virtualenv or ipython or jupytr..and yes R will work in all those environments).
You cannot run a function that R cannot see from the working environment...so you must have it installed, but not necessarily loaded, to call it.
I'm building a package that uses two main functions. One of the functions model.R requires a special type of simulation sim.R and a way to set up the results in a table table.R
In a sharable package, how do I call both the sim.R and table.R files from within model.R? I've tried source("sim.R") and source("R/sim.R") but that call doesn't work from within the package. Any ideas?
Should I just copy and paste the codes from sim.R and table.R into the model.R script instead?
Edit:
I have all the scripts in the R directory, the DESCRIPTION and NAMESPACE files are all set. I just have multiple scripts in the R directory. ~R/ has premodel.R model.R sim.R and table.R. I need the model.R script to use both sim.R and table.R functions... located in the same directory in the package (e.g. ~R/).
To elaborate on joran's point, when you build a package you don't need to source functions.
For example, imagine I want to make a package named TEST. I will begin by generating a directory (i.e. folder) named TEST. Within TEST I will create another folder name R, in that folder I will include all R script(s) containing the different functions in the package.
At a minimum you need to also include a DESCRIPTION and NAMESPACE file. A man (for help files) and tests (for unit tests) are also nice to include.
Making a package is pretty easy. Here is a blog with a straightforward introduction: http://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/
As others have pointed out you don't have to source R files in a package. The package loading mechanism will take care of losing the namespace and making all exported functions available. So usually you don't have to worry about any of this.
There are exceptions however. If you have multiple files with R code situations can arise where the order in which these files are processed matters. Often it doesn't matter or the default order used by R happens to be fine. If you find that there are some dependencies within your package that aren't resolved properly you may be faced with a situation where a custom processing order for the R files is required. The DESCRIPTION file offers the optional Collate field for this purpose. Simply list all your R files in the order they should be processed to satisfy the dependencies.
If all your files are in R directory, any function will be in memory after you do a package build or Load_All.
You may have issues if you have code in files that is not in a function tho.
R loads files in alphabetical order.
Usually, this is not a problem, because functions are evaluated when they are called for execution, not at loading time (id. a function can refer another function not yet defined, even in the same file).
But, if you have code outside a function in model.R, this code will be executed immediately at time of file loading, and your package build will fail usually with a
ERROR: lazy loading failed for package 'yourPackageName'
If this is the case, wrap the sparse code of model.R into a function so you can call it later, when the package has fully loaded, external library too.
If this piece of code is there for initialize some value, consider to use_data() to have R take care of load data into the environment for you.
If this piece of code is just interactive code written to test and implement the package itself, you should consider to put it elsewhere or wrap it to a function anyway.
if you really need that code to be executed at loading time or really have dependency to solve, then you must add the collate line into DESCRIPTION file, as already stated by Peter Humburg, to force R to load files order.
Roxygen2 can help you, put before your code
#' #include sim.R table.R
call roxygenize(), and collate line will be generate for you into the DESCRIPTION file.
But even doing that, external library you may depend are not yet loaded by the package, leading to failure again at build time.
In conclusion, you'd better don't leave code outside functions in a .R file if it's located inside a package.
Since you're building a package, the reason why you're having trouble accessing the other functions in your /R directory is because you need to first:
library(devtools)
document()
from within the working directory of your package. Now each function in your package should be accessible to any other function. Then, to finish up, do:
build()
install()
although it should be noted that a simple document() call will already be sufficient to solve your problem.
Make your functions global by defining them with <<- instead of <- and they will become available to any other script running in that environment.
Suppose I'm currently developing a package called mypackage. As time goes by, many different functions have landed in there, and I want to reorganize it. So I'd like to create a new package called newpackage in which I would move some of the functions of mypackage (and include new ones later).
The problem is that I don't want original users of mypackage to get object not found errors when they want to use one of the moved functions.
So, I thought about doing the following :
create newpackage and move the functions
add into mypackage DESCRIPTION file : Depends: newpackage
As such, when people would install, upgrade or load mypackage, newpackage would be installed or loaded too, and all the functions would be available.
Do you think it would work, or would there be some problems I don't think about ?
Thanks !
Isn't it so that it is not recommended to remove functions from a package without labeling them first to be depreciated?! So, maybe you proceed as you planned but before removing them from the mypackage, you could first mark them there as depreciated and then remove them from it finally in the next version of the package. And during the migrating phase you could use the namespace of the packages to refer already to the function in newpackage as you planned.
I have built an R package but I do not want my users to have to install it before using it.
Is there a way to load a package without having to install it?
For example, if I have a package mypackage.tar.gz, is there something like
library("mypackage.tar.gz")
?
I'll join in "the chorus" of suggesting you should really install the package.
That having been said, you can take a look at Hadley's devtools package, which will let you load packages into the workspace without dumping in your global workspace.
The package will have to be untar'd/unzipped and follow the standard R package structure.
In order for this to work, though, your users would have to have the devtools package installed, so ... I'm not sure that this is any type of win for you.
If you only need the code to be loaded without it being installed, take the raw R script and source it:
source(myScript.R)
If you have different functions, you can create an R script that just loads all the necessary source files. What I sometimes do when developing, is name all my functions F_some_function.R and my classes Class_some_function.R. This allows me to source a main file containing following code :
funcdir <- "C:/Some/Path"
files <- dir(funcdir)
srcfiles <- c(grep("^Class_",dir(funcdir),value=T),
grep("^F_",dir(funcdir),value=T)
)
for( i in paste(funcdir,srcfiles,sep="/")) source(i)
If you present them with the tarred file, they can untar themselves using untar() before sourcing the main file.
But honestly, please use a package. You load everything in the global environment (or in a specified environment if you use local=T), but you lose all functionality of a package. Installing a package is no hassle, and removing neither.
If it's a matter of writing rights on the C drive (which is the only possible reason not to use a package that I met in my carreer), one can easily set another library location. R 2.12 actually does this by itself on Windows. See ?.libPaths()