How to import sf to package to run a function that depends on lwgeom? - r

I'm building a package that imports {sf}, and more specifically I use st_length() in one of my functions.
I initially added only {sf} to my package "Imports", but when I checked it I got a few {lwgeom} related errors:
Running examples in 'gtfstools-Ex.R' failed
The error most likely occurred in:
> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: get_trip_speed
> ### Title: Get trip speed
> ### Aliases: get_trip_speed
>
> ### ** Examples
>
> data_path <- system.file("extdata/spo_gtfs.zip", package = "gtfstools")
>
> gtfs <- read_gtfs(data_path)
>
> trip_speed <- get_trip_speed(gtfs)
Error in sf::st_length(trips_geometries) :
package lwgeom required, please install it first
This error happens when the examples are running, but some similar errors happen with the tests.
Then I added {lwgeom} to Imports. The check runs fine, but in the end I get a note: NOTE: Namespaces in Imports field not imported from: 'lwgeom'
What's the best practice when dealing with cases like this? Should I just keep track of this note and send it as a comment to CRAN during the package submission process?

You can consider adding the {lwgeom} package in Suggests field of your package DESCRIPTION file. It should do the trick.

The Suggests != Depends article by Dirk Eddelbuettel refers to a relevant bit of Writing R Extensions (WRE) that might be useful to this case.
Section 1.1.3.1 (suggested packages) reads (as of 2021-03-12):
Note that someone wanting to run the examples/tests/vignettes may not have a suggested package available (and it may not even be possible to install it for that platform). The recommendation used to be to make their use conditional via if(require("pkgname")): this is OK if that conditioning is done in examples/tests/vignettes, although using if(requireNamespace("pkgname")) is preferred, if possible.
However, using require for conditioning in package code is not good practice as it alters the search path for the rest of the session and relies on functions in that package not being masked by other require or library calls. It is better practice to use code like
if (requireNamespace("rgl", quietly = TRUE)) {
rgl::plot3d(...)
} else {
## do something else not involving rgl.
}
So while just adding {lwgeom} to Suggests works, we may stumble upon the issue where someone that runs a "lean installation" (i.e. without suggested packages) of my package won't be able to use the functions that rely on {lwgeom}.
More importantly, if an author of a package that I am importing decides to run a reverse dependency check on my package while not installing suggested packages, the check would fail because I'd have a few examples, tests and vignettes bits failing due to not having {lwgeom} available.
Thus, in addition to listing it in Suggests, I added some checks on examples and vignettes like suggested by WRE:
*examples/vignette context*
# the examples below require the 'lwgeom' package to be installed
if (requireNamespace("lwgeom", quietly = TRUE)) {
... do something ...
}
In the functions that require {lwgeom} I added:
if (!requireNamespace("lwgeom", quietly = TRUE))
stop(
"The 'lwgeom' package is required to run this function. ",
"Please install it first."
)
And added this bit to the tests of such functions (using {testthat}):
if (!requireNamespace("lwgeom", quietly = TRUE)) {
expect_error(
set_trip_speed(gtfs, "CPTM L07-0", 50),
regexp = paste0(
"The \\'lwgeom\\' package is required to run this function\\. ",
"Please install it first\\."
)
)
skip("'lwgeom' package required to run set_trip_speed() tests.")
}

Related

Why require() command is preferred within functions in R? [duplicate]

What is the difference between require() and library()?
There's not much of one in everyday work.
However, according to the documentation for both functions (accessed by putting a ? before the function name and hitting enter), require is used inside functions, as it outputs a warning and continues if the package is not found, whereas library will throw an error.
Another benefit of require() is that it returns a logical value by default. TRUE if the packages is loaded, FALSE if it isn't.
> test <- library("abc")
Error in library("abc") : there is no package called 'abc'
> test
Error: object 'test' not found
> test <- require("abc")
Loading required package: abc
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
there is no package called 'abc'
> test
[1] FALSE
So you can use require() in constructions like the one below. Which mainly handy if you want to distribute your code to our R installation were packages might not be installed.
if(require("lme4")){
print("lme4 is loaded correctly")
} else {
print("trying to install lme4")
install.packages("lme4")
if(require(lme4)){
print("lme4 installed and loaded")
} else {
stop("could not install lme4")
}
}
In addition to the good advice already given, I would add this:
It is probably best to avoid using require() unless you actually will be using the value it returns e.g in some error checking loop such as given by thierry.
In most other cases it is better to use library(), because this will give an error message at package loading time if the package is not available. require() will just fail without an error if the package is not there. This is the best time to find out if the package needs to be installed (or perhaps doesn't even exist because it it spelled wrong). Getting error feedback early and at the relevant time will avoid possible headaches with tracking down why later code fails when it attempts to use library routines
You can use require() if you want to install packages if and only if necessary, such as:
if (!require(package, character.only=T, quietly=T)) {
install.packages(package)
library(package, character.only=T)
}
For multiple packages you can use
for (package in c('<package1>', '<package2>')) {
if (!require(package, character.only=T, quietly=T)) {
install.packages(package)
library(package, character.only=T)
}
}
Pro tips:
When used inside the script, you can avoid a dialog screen by specifying the repos parameter of install.packages(), such as
install.packages(package, repos="http://cran.us.r-project.org")
You can wrap require() and library() in suppressPackageStartupMessages() to, well, suppress package startup messages, and also use the parameters require(..., quietly=T, warn.conflicts=F) if needed to keep the installs quiet.
Always use library. Never use require.
tl;dr: require breaks one of the fundamental rules of robust software systems: fail early.
In a nutshell, this is because, when using require, your code might yield different, erroneous results, without signalling an error. This is rare but not hypothetical! Consider this code, which yields different results depending on whether {dplyr} can be loaded:
require(dplyr)
x = data.frame(y = seq(100))
y = 1
filter(x, y == 1)
This can lead to subtly wrong results. Using library instead of require throws an error here, signalling clearly that something is wrong. This is good.
It also makes debugging all other failures more difficult: If you require a package at the start of your script and use its exports in line 500, you’ll get an error message “object ‘foo’ not found” in line 500, rather than an error “there is no package called ‘bla’”.
The only acceptable use case of require is when its return value is immediately checked, as some of the other answers show. This is a fairly common pattern but even in these cases it is better (and recommended, see below) to instead separate the existence check and the loading of the package. That is: use requireNamespace instead of require in these cases.
More technically, require actually calls library internally (if the package wasn’t already attached — require thus performs a redundant check, because library also checks whether the package was already loaded). Here’s a simplified implementation of require to illustrate what it does:
require = function (package) {
already_attached = paste('package:', package) %in% search()
if (already_attached) return(TRUE)
maybe_error = try(library(package, character.only = TRUE))
success = ! inherits(maybe_error, 'try-error')
if (! success) cat("Failed")
success
}
Experienced R developers agree:
Yihui Xie, author of {knitr}, {bookdown} and many other packages says:
Ladies and gentlemen, I've said this before: require() is the wrong way to load an R package; use library() instead
Hadley Wickham, author of more popular R packages than anybody else, says
Use library(x) in data analysis scripts. […]
You never need to use require() (requireNamespace() is almost always better)
?library
and you will see:
library(package) and require(package) both load the package with name
package and put it on the search list. require is designed for use
inside other functions; it returns FALSE and gives a warning (rather
than an error as library() does by default) if the package does not
exist. Both functions check and update the list of currently loaded
packages and do not reload a package which is already loaded. (If you
want to reload such a package, call detach(unload = TRUE) or
unloadNamespace first.) If you want to load a package without putting
it on the search list, use requireNamespace.
My initial theory about the difference was that library loads the packages whether it is already loaded or not, i.e. it might reload an already loaded package, while require just checks that it is loaded, or loads it if it isn't (thus the use in functions that rely on a certain package). The documentation refutes this, however, and explicitly states that neither function will reload an already loaded package.
Here seems to be the difference on an already loaded package.
While it is true that both require and library do not load the package. Library does a lot of other things before it checks and exits.
I would recommend removing "require" from the beginning of a function running 2mil times anyway, but if, for some reason I needed to keep it. require is technically a faster check.
microbenchmark(req = require(microbenchmark), lib = library(microbenchmark),times = 100000)
Unit: microseconds
expr min lq mean median uq max neval
req 3.676 5.181 6.596968 5.655 6.177 9456.006 1e+05
lib 17.192 19.887 27.302907 20.852 22.490 255665.881 1e+05

How to write a function that can load R package, if this package is not installed, install it automatically [duplicate]

I seem to be sharing a lot of code with coauthors these days. Many of them are novice/intermediate R users and don't realize that they have to install packages they don't already have.
Is there an elegant way to call installed.packages(), compare that to the ones I am loading and install if missing?
Yes. If you have your list of packages, compare it to the output from installed.packages()[,"Package"] and install the missing packages. Something like this:
list.of.packages <- c("ggplot2", "Rcpp")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
Otherwise:
If you put your code in a package and make them dependencies, then they will automatically be installed when you install your package.
Dason K. and I have the pacman package that can do this nicely. The function p_load in the package does this. The first line is just to ensure that pacman is installed.
if (!require("pacman")) install.packages("pacman")
pacman::p_load(package1, package2, package_n)
You can just use the return value of require:
if(!require(somepackage)){
install.packages("somepackage")
library(somepackage)
}
I use library after the install because it will throw an exception if the install wasn't successful or the package can't be loaded for some other reason. You make this more robust and reuseable:
dynamic_require <- function(package){
if(eval(parse(text=paste("require(",package,")")))) return(TRUE)
install.packages(package)
return(eval(parse(text=paste("require(",package,")"))))
}
The downside to this method is that you have to pass the package name in quotes, which you don't do for the real require.
A lot of the answers above (and on duplicates of this question) rely on installed.packages which is bad form. From the documentation:
This can be slow when thousands of packages are installed, so do not use this to find out if a named package is installed (use system.file or find.package) nor to find out if a package is usable (call require and check the return value) nor to find details of a small number of packages (use packageDescription). It needs to read several files per installed package, which will be slow on Windows and on some network-mounted file systems.
So, a better approach is to attempt to load the package using require and and install if loading fails (require will return FALSE if it isn't found). I prefer this implementation:
using<-function(...) {
libs<-unlist(list(...))
req<-unlist(lapply(libs,require,character.only=TRUE))
need<-libs[req==FALSE]
if(length(need)>0){
install.packages(need)
lapply(need,require,character.only=TRUE)
}
}
which can be used like this:
using("RCurl","ggplot2","jsonlite","magrittr")
This way it loads all the packages, then goes back and installs all the missing packages (which if you want, is a handy place to insert a prompt to ask if the user wants to install packages). Instead of calling install.packages separately for each package it passes the whole vector of uninstalled packages just once.
Here's the same function but with a windows dialog that asks if the user wants to install the missing packages
using<-function(...) {
libs<-unlist(list(...))
req<-unlist(lapply(libs,require,character.only=TRUE))
need<-libs[req==FALSE]
n<-length(need)
if(n>0){
libsmsg<-if(n>2) paste(paste(need[1:(n-1)],collapse=", "),",",sep="") else need[1]
print(libsmsg)
if(n>1){
libsmsg<-paste(libsmsg," and ", need[n],sep="")
}
libsmsg<-paste("The following packages could not be found: ",libsmsg,"\n\r\n\rInstall missing packages?",collapse="")
if(winDialog(type = c("yesno"), libsmsg)=="YES"){
install.packages(need)
lapply(need,require,character.only=TRUE)
}
}
}
if (!require('ggplot2')) install.packages('ggplot2'); library('ggplot2')
"ggplot2" is the package. It checks to see if the package is installed, if it is not it installs it. It then loads the package regardless of which branch it took.
TL;DR you can use find.package() for this.
Almost all the answers here rely on either (1) require() or (2) installed.packages() to check if a given package is already installed or not.
I'm adding an answer because these are unsatisfactory for a lightweight approach to answering this question.
require has the side effect of loading the package's namespace, which may not always be desirable
installed.packages is a bazooka to light a candle -- it will check the universe of installed packages first, then we check if our one (or few) package(s) are "in stock" at this library. No need to build a haystack just to find a needle.
This answer was also inspired by #ArtemKlevtsov's great answer in a similar spirit on a duplicated version of this question. He noted that system.file(package=x) can have the desired affect of returning '' if the package isn't installed, and something with nchar > 1 otherwise.
If we look under the hood of how system.file accomplishes this, we can see it uses a different base function, find.package, which we could use directly:
# a package that exists
find.package('data.table', quiet=TRUE)
# [1] "/Library/Frameworks/R.framework/Versions/4.0/Resources/library/data.table"
# a package that does not
find.package('InstantaneousWorldPeace', quiet=TRUE)
# character(0)
We can also look under the hood at find.package to see how it works, but this is mainly an instructive exercise -- the only ways to slim down the function that I see would be to skip some robustness checks. But the basic idea is: look in .libPaths() -- any installed package pkg will have a DESCRIPTION file at file.path(.libPaths(), pkg), so a quick-and-dirty check is file.exists(file.path(.libPaths(), pkg, 'DESCRIPTION').
This solution will take a character vector of package names and attempt to load them, or install them if loading fails. It relies on the return behaviour of require to do this because...
require returns (invisibly) a logical indicating whether the required package is available
Therefore we can simply see if we were able to load the required package and if not, install it with dependencies. So given a character vector of packages you wish to load...
foo <- function(x){
for( i in x ){
# require returns TRUE invisibly if it was able to load package
if( ! require( i , character.only = TRUE ) ){
# If package was not able to be loaded then re-install
install.packages( i , dependencies = TRUE )
# Load package after installing
require( i , character.only = TRUE )
}
}
}
# Then try/install packages...
foo( c("ggplot2" , "reshape2" , "data.table" ) )
Although the answer of Shane is really good, for one of my project I needed to remove the ouput messages, warnings and install packages automagically. I have finally managed to get this script:
InstalledPackage <- function(package)
{
available <- suppressMessages(suppressWarnings(sapply(package, require, quietly = TRUE, character.only = TRUE, warn.conflicts = FALSE)))
missing <- package[!available]
if (length(missing) > 0) return(FALSE)
return(TRUE)
}
CRANChoosen <- function()
{
return(getOption("repos")["CRAN"] != "#CRAN#")
}
UsePackage <- function(package, defaultCRANmirror = "http://cran.at.r-project.org")
{
if(!InstalledPackage(package))
{
if(!CRANChoosen())
{
chooseCRANmirror()
if(!CRANChoosen())
{
options(repos = c(CRAN = defaultCRANmirror))
}
}
suppressMessages(suppressWarnings(install.packages(package)))
if(!InstalledPackage(package)) return(FALSE)
}
return(TRUE)
}
Use:
libraries <- c("ReadImages", "ggplot2")
for(library in libraries)
{
if(!UsePackage(library))
{
stop("Error!", library)
}
}
# List of packages for session
.packages = c("ggplot2", "plyr", "rms")
# Install CRAN packages (if not already installed)
.inst <- .packages %in% installed.packages()
if(length(.packages[!.inst]) > 0) install.packages(.packages[!.inst])
# Load packages into session
lapply(.packages, require, character.only=TRUE)
Use packrat so that the shared libraries are exactly the same and not changing other's environment.
In terms of elegance and best practice I think you're fundamentally going about it the wrong way. The package packrat was designed for these issues. It is developed by RStudio by Hadley Wickham. Instead of them having to install dependencies and possibly mess up someone's environment system, packrat uses its own directory and installs all the dependencies for your programs in there and doesn't touch someone's environment.
Packrat is a dependency management system for R.
R package dependencies can be frustrating. Have you ever had to use trial-and-error to figure out what R packages you need to install to make someone else’s code work–and then been left with those packages globally installed forever, because now you’re not sure whether you need them? Have you ever updated a package to get code in one of your projects to work, only to find that the updated package makes code in another project stop working?
We built packrat to solve these problems. Use packrat to make your R projects more:
Isolated: Installing a new or updated package for one project won’t break your other projects, and vice versa. That’s because packrat gives each project its own private package library.
Portable: Easily transport your projects from one computer to another, even across different platforms. Packrat makes it easy to install the packages your project depends on.
Reproducible: Packrat records the exact package versions you depend on, and ensures those exact versions are the ones that get installed wherever you go.
https://rstudio.github.io/packrat/
This is the purpose of the rbundler package: to provide a way to control the packages that are installed for a specific project. Right now the package works with the devtools functionality to install packages to your project's directory. The functionality is similar to Ruby's bundler.
If your project is a package (recommended) then all you have to do is load rbundler and bundle the packages. The bundle function will look at your package's DESCRIPTION file to determine which packages to bundle.
library(rbundler)
bundle('.', repos="http://cran.us.r-project.org")
Now the packages will be installed in the .Rbundle directory.
If your project isn't a package, then you can fake it by creating a DESCRIPTION file in your project's root directory with a Depends field that lists the packages that you want installed (with optional version information):
Depends: ggplot2 (>= 0.9.2), arm, glmnet
Here's the github repo for the project if you're interested in contributing: rbundler.
You can simply use the setdiff function to get the packages that aren't installed and then install them. In the sample below, we check if the ggplot2 and Rcpp packages are installed before installing them.
unavailable <- setdiff(c("ggplot2", "Rcpp"), rownames(installed.packages()))
install.packages(unavailable)
In one line, the above can be written as:
install.packages(setdiff(c("ggplot2", "Rcpp"), rownames(installed.packages())))
The current version of RStudio (>=1.2) includes a feature to detect missing packages in library() and require() calls, and prompts the user to install them:
Detect missing R packages
Many R scripts open with calls to library() and require() to load the packages they need in order to execute. If you open an R script that references packages that you don’t have installed, RStudio will now offer to install all the needed packages in a single click. No more typing install.packages() repeatedly until the errors go away!
https://blog.rstudio.com/2018/11/19/rstudio-1-2-preview-the-little-things/
This seems to address the original concern of OP particularly well:
Many of them are novice/intermediate R users and don't realize that they have to install packages they don't already have.
Sure.
You need to compare 'installed packages' with 'desired packages'. That's very close to what I do with CRANberries as I need to compare 'stored known packages' with 'currently known packages' to determine new and/or updated packages.
So do something like
AP <- available.packages(contrib.url(repos[i,"url"])) # available t repos[i]
to get all known packages, simular call for currently installed packages and compare that to a given set of target packages.
The following simple function works like a charm:
usePackage<-function(p){
# load a package if installed, else load after installation.
# Args:
# p: package name in quotes
if (!is.element(p, installed.packages()[,1])){
print(paste('Package:',p,'Not found, Installing Now...'))
install.packages(p, dep = TRUE)}
print(paste('Loading Package :',p))
require(p, character.only = TRUE)
}
(not mine, found this on the web some time back and had been using it since then. not sure of the original source)
I use following function to install package if require("<package>") exits with package not found error. It will query both - CRAN and Bioconductor repositories for missing package.
Adapted from the original work by Joshua Wiley,
http://r.789695.n4.nabble.com/Install-package-automatically-if-not-there-td2267532.html
install.packages.auto <- function(x) {
x <- as.character(substitute(x))
if(isTRUE(x %in% .packages(all.available=TRUE))) {
eval(parse(text = sprintf("require(\"%s\")", x)))
} else {
#update.packages(ask= FALSE) #update installed packages.
eval(parse(text = sprintf("install.packages(\"%s\", dependencies = TRUE)", x)))
}
if(isTRUE(x %in% .packages(all.available=TRUE))) {
eval(parse(text = sprintf("require(\"%s\")", x)))
} else {
source("http://bioconductor.org/biocLite.R")
#biocLite(character(), ask=FALSE) #update installed packages.
eval(parse(text = sprintf("biocLite(\"%s\")", x)))
eval(parse(text = sprintf("require(\"%s\")", x)))
}
}
Example:
install.packages.auto(qvalue) # from bioconductor
install.packages.auto(rNMF) # from CRAN
PS: update.packages(ask = FALSE) & biocLite(character(), ask=FALSE) will update all installed packages on the system. This can take a long time and consider it as a full R upgrade which may not be warranted all the time!
Today, I stumbled on two handy function provided by the rlang package, namely, is_installed() and check_installed().
From the help page (emphasis added):
These functions check that packages are installed with minimal side effects. If installed, the packages will be loaded but not attached.
is_installed() doesn't interact with the user. It simply returns TRUE or FALSE depending on whether the packages are installed.
In interactive sessions, check_installed() asks the user whether to install missing packages. If the user accepts, the packages are installed [...]. If the session is non interactive or if the user chooses not to install the packages, the current evaluation is aborted.
interactive()
#> [1] FALSE
rlang::is_installed(c("dplyr"))
#> [1] TRUE
rlang::is_installed(c("foobarbaz"))
#> [1] FALSE
rlang::check_installed(c("dplyr"))
rlang::check_installed(c("foobarbaz"))
#> Error:
#> ! The package `foobarbaz` is required.
Created on 2022-03-25 by the reprex package (v2.0.1)
I have implemented the function to install and load required R packages silently. Hope might help. Here is the code:
# Function to Install and Load R Packages
Install_And_Load <- function(Required_Packages)
{
Remaining_Packages <- Required_Packages[!(Required_Packages %in% installed.packages()[,"Package"])];
if(length(Remaining_Packages))
{
install.packages(Remaining_Packages);
}
for(package_name in Required_Packages)
{
library(package_name,character.only=TRUE,quietly=TRUE);
}
}
# Specify the list of required packages to be installed and load
Required_Packages=c("ggplot2", "Rcpp");
# Call the Function
Install_And_Load(Required_Packages);
Quite basic one.
pkgs = c("pacman","data.table")
if(length(new.pkgs <- setdiff(pkgs, rownames(installed.packages())))) install.packages(new.pkgs)
Thought I'd contribute the one I use:
testin <- function(package){if (!package %in% installed.packages())
install.packages(package)}
testin("packagename")
Regarding your main objective " to install libraries they don't already have. " and regardless of using " instllaed.packages() ". The following function mask the original function of require. It tries to load and check the named package "x" , if it's not installed, install it directly including dependencies; and lastly load it normaly. you rename the function name from 'require' to 'library' to maintain integrity . The only limitation is packages names should be quoted.
require <- function(x) {
if (!base::require(x, character.only = TRUE)) {
install.packages(x, dep = TRUE) ;
base::require(x, character.only = TRUE)
}
}
So you can load and installed package the old fashion way of R.
require ("ggplot2")
require ("Rcpp")
48 lapply_install_and_load <- function (package1, ...)
49 {
50 #
51 # convert arguments to vector
52 #
53 packages <- c(package1, ...)
54 #
55 # check if loaded and installed
56 #
57 loaded <- packages %in% (.packages())
58 names(loaded) <- packages
59 #
60 installed <- packages %in% rownames(installed.packages())
61 names(installed) <- packages
62 #
63 # start loop to determine if each package is installed
64 #
65 load_it <- function (p, loaded, installed)
66 {
67 if (loaded[p])
68 {
69 print(paste(p, "loaded"))
70 }
71 else
72 {
73 print(paste(p, "not loaded"))
74 if (installed[p])
75 {
76 print(paste(p, "installed"))
77 do.call("library", list(p))
78 }
79 else
80 {
81 print(paste(p, "not installed"))
82 install.packages(p)
83 do.call("library", list(p))
84 }
85 }
86 }
87 #
88 lapply(packages, load_it, loaded, installed)
89 }
source("https://bioconductor.org/biocLite.R")
if (!require("ggsci")) biocLite("ggsci")
Using lapply family and anonymous function approach you may:
Try to attach all listed packages.
Install missing only (using || lazy evaluation).
Attempt to attach again those were missing in step 1 and installed in step 2.
Print each package final load status (TRUE / FALSE).
req <- substitute(require(x, character.only = TRUE))
lbs <- c("plyr", "psych", "tm")
sapply(lbs, function(x) eval(req) || {install.packages(x); eval(req)})
plyr psych tm
TRUE TRUE TRUE
I use the following which will check if package is installed and if dependencies are updated, then loads the package.
p<-c('ggplot2','Rcpp')
install_package<-function(pack)
{if(!(pack %in% row.names(installed.packages())))
{
update.packages(ask=F)
install.packages(pack,dependencies=T)
}
require(pack,character.only=TRUE)
}
for(pack in p) {install_package(pack)}
completeFun <- function(data, desiredCols) {
completeVec <- complete.cases(data[, desiredCols])
return(data[completeVec, ])
}
Here's my code for it:
packages <- c("dplyr", "gridBase", "gridExtra")
package_loader <- function(x){
for (i in 1:length(x)){
if (!identical((x[i], installed.packages()[x[i],1])){
install.packages(x[i], dep = TRUE)
} else {
require(x[i], character.only = TRUE)
}
}
}
package_loader(packages)
library <- function(x){
x = toString(substitute(x))
if(!require(x,character.only=TRUE)){
install.packages(x)
base::library(x,character.only=TRUE)
}}
This works with unquoted package names and is fairly elegant (cf. GeoObserver's answer)
In my case, I wanted a one liner that I could run from the commandline (actually via a Makefile). Here is an example installing "VGAM" and "feather" if they are not already installed:
R -e 'for (p in c("VGAM", "feather")) if (!require(p, character.only=TRUE)) install.packages(p, repos="http://cran.us.r-project.org")'
From within R it would just be:
for (p in c("VGAM", "feather")) if (!require(p, character.only=TRUE)) install.packages(p, repos="http://cran.us.r-project.org")
There is nothing here beyond the previous solutions except that:
I keep it to a single line
I hard code the repos parameter (to avoid any popups asking about the mirror to use)
I don't bother to define a function to be used elsewhere
Also note the important character.only=TRUE (without it, the require would try to load the package p).
Let me share a bit of madness:
c("ggplot2","ggsci", "hrbrthemes", "gghighlight", "dplyr") %>% # What will you need to load for this script?
(function (x) ifelse(t =!(x %in% installed.packages()),
install.packages(x[t]),
lapply(x, require)))
There is a new-ish package (I am a codeveloper), Require, that is intended to be part of a reproducible workflow, meaning the function produces the same output the first time it is run or subsequent times, i.e., the end-state is the same regardless of starting state. The following installs any missing packages (I include require = FALSE to strictly address the original question... normally I leave this on the default because I will generally want them loaded to the search path).
These two lines are at the top of every script I write (adjusting the package selection as necessary), allowing the script to be used by anybody in any condition (including any or all dependencies missing).
if (!require("Require")) install.packages("Require")
Require::Require(c("ggplot2", "Rcpp"), require = FALSE)
You can thus use this in your script or pass it anyone.

Automatically install list of packages in R if necessary

I would like to check, at the beginning of my R script, whether the required packages are installed and, if not, install them.
I would like to use something like the following:
RequiredPackages <- c("stockPortfolio","quadprog")
for (i in RequiredPackages) { #Installs packages if not yet installed
if (!require(i)) install.packages(i)
}
However, this gives me error messages because R tries to install a package named 'i'. If instead I use...
if (!require(i)) install.packages(get(i))
...in the relevant line, I still get error messages.
Anybody know how to solve this?
Although the problem has been solved by #Thomas's answer, I would like to point out that pacman might be a better yet simple choice:
First install pacman:
install.packages("pacman")
Then load packages. Pacman will check whether each package has been installed, and if not, will install it automatically.
pacman::p_load("stockPortfolio","quadprog")
That's it.
Relevant links:
pacman GitHub page
Introduction to pacman
I think this is close to what you want:
requiredPackages <- c("stockPortfolio","quadprog")
for (package in requiredPackages) { #Installs packages if not yet installed
if (!requireNamespace(package, quietly = TRUE))
install.packages(package)
}
HERE is the source code and an explanation of the requireNamespace function.
Both library and require use non-standard evaluation on their first argument by default. This makes them hard to use in programming. However, they both take a character.only argument (Default is FALSE), which you can use to achieve your result:
RequiredPackages <- c("stockPortfolio","quadprog")
for (i in RequiredPackages) { #Installs packages if not yet installed
if (!require(i, character.only = TRUE)) install.packages(i)
}
I have by now written the following function (and put it into a package), which essentially does what #Thomas and #federico propose:
SPLoadPackages<-function(packages){
for(fP in packages){
eval(parse(text="if(!require("%_%fP%_%")) install.packages('"%_%fP%_%"')"))
eval(parse(text="library("%_%fP%_%")"))
}
}

What is the difference between require() and library()?

What is the difference between require() and library()?
There's not much of one in everyday work.
However, according to the documentation for both functions (accessed by putting a ? before the function name and hitting enter), require is used inside functions, as it outputs a warning and continues if the package is not found, whereas library will throw an error.
Another benefit of require() is that it returns a logical value by default. TRUE if the packages is loaded, FALSE if it isn't.
> test <- library("abc")
Error in library("abc") : there is no package called 'abc'
> test
Error: object 'test' not found
> test <- require("abc")
Loading required package: abc
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
there is no package called 'abc'
> test
[1] FALSE
So you can use require() in constructions like the one below. Which mainly handy if you want to distribute your code to our R installation were packages might not be installed.
if(require("lme4")){
print("lme4 is loaded correctly")
} else {
print("trying to install lme4")
install.packages("lme4")
if(require(lme4)){
print("lme4 installed and loaded")
} else {
stop("could not install lme4")
}
}
In addition to the good advice already given, I would add this:
It is probably best to avoid using require() unless you actually will be using the value it returns e.g in some error checking loop such as given by thierry.
In most other cases it is better to use library(), because this will give an error message at package loading time if the package is not available. require() will just fail without an error if the package is not there. This is the best time to find out if the package needs to be installed (or perhaps doesn't even exist because it it spelled wrong). Getting error feedback early and at the relevant time will avoid possible headaches with tracking down why later code fails when it attempts to use library routines
You can use require() if you want to install packages if and only if necessary, such as:
if (!require(package, character.only=T, quietly=T)) {
install.packages(package)
library(package, character.only=T)
}
For multiple packages you can use
for (package in c('<package1>', '<package2>')) {
if (!require(package, character.only=T, quietly=T)) {
install.packages(package)
library(package, character.only=T)
}
}
Pro tips:
When used inside the script, you can avoid a dialog screen by specifying the repos parameter of install.packages(), such as
install.packages(package, repos="http://cran.us.r-project.org")
You can wrap require() and library() in suppressPackageStartupMessages() to, well, suppress package startup messages, and also use the parameters require(..., quietly=T, warn.conflicts=F) if needed to keep the installs quiet.
Always use library. Never use require.
tl;dr: require breaks one of the fundamental rules of robust software systems: fail early.
In a nutshell, this is because, when using require, your code might yield different, erroneous results, without signalling an error. This is rare but not hypothetical! Consider this code, which yields different results depending on whether {dplyr} can be loaded:
require(dplyr)
x = data.frame(y = seq(100))
y = 1
filter(x, y == 1)
This can lead to subtly wrong results. Using library instead of require throws an error here, signalling clearly that something is wrong. This is good.
It also makes debugging all other failures more difficult: If you require a package at the start of your script and use its exports in line 500, you’ll get an error message “object ‘foo’ not found” in line 500, rather than an error “there is no package called ‘bla’”.
The only acceptable use case of require is when its return value is immediately checked, as some of the other answers show. This is a fairly common pattern but even in these cases it is better (and recommended, see below) to instead separate the existence check and the loading of the package. That is: use requireNamespace instead of require in these cases.
More technically, require actually calls library internally (if the package wasn’t already attached — require thus performs a redundant check, because library also checks whether the package was already loaded). Here’s a simplified implementation of require to illustrate what it does:
require = function (package) {
already_attached = paste('package:', package) %in% search()
if (already_attached) return(TRUE)
maybe_error = try(library(package, character.only = TRUE))
success = ! inherits(maybe_error, 'try-error')
if (! success) cat("Failed")
success
}
Experienced R developers agree:
Yihui Xie, author of {knitr}, {bookdown} and many other packages says:
Ladies and gentlemen, I've said this before: require() is the wrong way to load an R package; use library() instead
Hadley Wickham, author of more popular R packages than anybody else, says
Use library(x) in data analysis scripts. […]
You never need to use require() (requireNamespace() is almost always better)
?library
and you will see:
library(package) and require(package) both load the package with name
package and put it on the search list. require is designed for use
inside other functions; it returns FALSE and gives a warning (rather
than an error as library() does by default) if the package does not
exist. Both functions check and update the list of currently loaded
packages and do not reload a package which is already loaded. (If you
want to reload such a package, call detach(unload = TRUE) or
unloadNamespace first.) If you want to load a package without putting
it on the search list, use requireNamespace.
My initial theory about the difference was that library loads the packages whether it is already loaded or not, i.e. it might reload an already loaded package, while require just checks that it is loaded, or loads it if it isn't (thus the use in functions that rely on a certain package). The documentation refutes this, however, and explicitly states that neither function will reload an already loaded package.
Here seems to be the difference on an already loaded package.
While it is true that both require and library do not load the package. Library does a lot of other things before it checks and exits.
I would recommend removing "require" from the beginning of a function running 2mil times anyway, but if, for some reason I needed to keep it. require is technically a faster check.
microbenchmark(req = require(microbenchmark), lib = library(microbenchmark),times = 100000)
Unit: microseconds
expr min lq mean median uq max neval
req 3.676 5.181 6.596968 5.655 6.177 9456.006 1e+05
lib 17.192 19.887 27.302907 20.852 22.490 255665.881 1e+05

Elegant way to check for missing packages and install them?

I seem to be sharing a lot of code with coauthors these days. Many of them are novice/intermediate R users and don't realize that they have to install packages they don't already have.
Is there an elegant way to call installed.packages(), compare that to the ones I am loading and install if missing?
Yes. If you have your list of packages, compare it to the output from installed.packages()[,"Package"] and install the missing packages. Something like this:
list.of.packages <- c("ggplot2", "Rcpp")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
Otherwise:
If you put your code in a package and make them dependencies, then they will automatically be installed when you install your package.
Dason K. and I have the pacman package that can do this nicely. The function p_load in the package does this. The first line is just to ensure that pacman is installed.
if (!require("pacman")) install.packages("pacman")
pacman::p_load(package1, package2, package_n)
You can just use the return value of require:
if(!require(somepackage)){
install.packages("somepackage")
library(somepackage)
}
I use library after the install because it will throw an exception if the install wasn't successful or the package can't be loaded for some other reason. You make this more robust and reuseable:
dynamic_require <- function(package){
if(eval(parse(text=paste("require(",package,")")))) return(TRUE)
install.packages(package)
return(eval(parse(text=paste("require(",package,")"))))
}
The downside to this method is that you have to pass the package name in quotes, which you don't do for the real require.
A lot of the answers above (and on duplicates of this question) rely on installed.packages which is bad form. From the documentation:
This can be slow when thousands of packages are installed, so do not use this to find out if a named package is installed (use system.file or find.package) nor to find out if a package is usable (call require and check the return value) nor to find details of a small number of packages (use packageDescription). It needs to read several files per installed package, which will be slow on Windows and on some network-mounted file systems.
So, a better approach is to attempt to load the package using require and and install if loading fails (require will return FALSE if it isn't found). I prefer this implementation:
using<-function(...) {
libs<-unlist(list(...))
req<-unlist(lapply(libs,require,character.only=TRUE))
need<-libs[req==FALSE]
if(length(need)>0){
install.packages(need)
lapply(need,require,character.only=TRUE)
}
}
which can be used like this:
using("RCurl","ggplot2","jsonlite","magrittr")
This way it loads all the packages, then goes back and installs all the missing packages (which if you want, is a handy place to insert a prompt to ask if the user wants to install packages). Instead of calling install.packages separately for each package it passes the whole vector of uninstalled packages just once.
Here's the same function but with a windows dialog that asks if the user wants to install the missing packages
using<-function(...) {
libs<-unlist(list(...))
req<-unlist(lapply(libs,require,character.only=TRUE))
need<-libs[req==FALSE]
n<-length(need)
if(n>0){
libsmsg<-if(n>2) paste(paste(need[1:(n-1)],collapse=", "),",",sep="") else need[1]
print(libsmsg)
if(n>1){
libsmsg<-paste(libsmsg," and ", need[n],sep="")
}
libsmsg<-paste("The following packages could not be found: ",libsmsg,"\n\r\n\rInstall missing packages?",collapse="")
if(winDialog(type = c("yesno"), libsmsg)=="YES"){
install.packages(need)
lapply(need,require,character.only=TRUE)
}
}
}
if (!require('ggplot2')) install.packages('ggplot2'); library('ggplot2')
"ggplot2" is the package. It checks to see if the package is installed, if it is not it installs it. It then loads the package regardless of which branch it took.
TL;DR you can use find.package() for this.
Almost all the answers here rely on either (1) require() or (2) installed.packages() to check if a given package is already installed or not.
I'm adding an answer because these are unsatisfactory for a lightweight approach to answering this question.
require has the side effect of loading the package's namespace, which may not always be desirable
installed.packages is a bazooka to light a candle -- it will check the universe of installed packages first, then we check if our one (or few) package(s) are "in stock" at this library. No need to build a haystack just to find a needle.
This answer was also inspired by #ArtemKlevtsov's great answer in a similar spirit on a duplicated version of this question. He noted that system.file(package=x) can have the desired affect of returning '' if the package isn't installed, and something with nchar > 1 otherwise.
If we look under the hood of how system.file accomplishes this, we can see it uses a different base function, find.package, which we could use directly:
# a package that exists
find.package('data.table', quiet=TRUE)
# [1] "/Library/Frameworks/R.framework/Versions/4.0/Resources/library/data.table"
# a package that does not
find.package('InstantaneousWorldPeace', quiet=TRUE)
# character(0)
We can also look under the hood at find.package to see how it works, but this is mainly an instructive exercise -- the only ways to slim down the function that I see would be to skip some robustness checks. But the basic idea is: look in .libPaths() -- any installed package pkg will have a DESCRIPTION file at file.path(.libPaths(), pkg), so a quick-and-dirty check is file.exists(file.path(.libPaths(), pkg, 'DESCRIPTION').
This solution will take a character vector of package names and attempt to load them, or install them if loading fails. It relies on the return behaviour of require to do this because...
require returns (invisibly) a logical indicating whether the required package is available
Therefore we can simply see if we were able to load the required package and if not, install it with dependencies. So given a character vector of packages you wish to load...
foo <- function(x){
for( i in x ){
# require returns TRUE invisibly if it was able to load package
if( ! require( i , character.only = TRUE ) ){
# If package was not able to be loaded then re-install
install.packages( i , dependencies = TRUE )
# Load package after installing
require( i , character.only = TRUE )
}
}
}
# Then try/install packages...
foo( c("ggplot2" , "reshape2" , "data.table" ) )
Although the answer of Shane is really good, for one of my project I needed to remove the ouput messages, warnings and install packages automagically. I have finally managed to get this script:
InstalledPackage <- function(package)
{
available <- suppressMessages(suppressWarnings(sapply(package, require, quietly = TRUE, character.only = TRUE, warn.conflicts = FALSE)))
missing <- package[!available]
if (length(missing) > 0) return(FALSE)
return(TRUE)
}
CRANChoosen <- function()
{
return(getOption("repos")["CRAN"] != "#CRAN#")
}
UsePackage <- function(package, defaultCRANmirror = "http://cran.at.r-project.org")
{
if(!InstalledPackage(package))
{
if(!CRANChoosen())
{
chooseCRANmirror()
if(!CRANChoosen())
{
options(repos = c(CRAN = defaultCRANmirror))
}
}
suppressMessages(suppressWarnings(install.packages(package)))
if(!InstalledPackage(package)) return(FALSE)
}
return(TRUE)
}
Use:
libraries <- c("ReadImages", "ggplot2")
for(library in libraries)
{
if(!UsePackage(library))
{
stop("Error!", library)
}
}
# List of packages for session
.packages = c("ggplot2", "plyr", "rms")
# Install CRAN packages (if not already installed)
.inst <- .packages %in% installed.packages()
if(length(.packages[!.inst]) > 0) install.packages(.packages[!.inst])
# Load packages into session
lapply(.packages, require, character.only=TRUE)
Use packrat so that the shared libraries are exactly the same and not changing other's environment.
In terms of elegance and best practice I think you're fundamentally going about it the wrong way. The package packrat was designed for these issues. It is developed by RStudio by Hadley Wickham. Instead of them having to install dependencies and possibly mess up someone's environment system, packrat uses its own directory and installs all the dependencies for your programs in there and doesn't touch someone's environment.
Packrat is a dependency management system for R.
R package dependencies can be frustrating. Have you ever had to use trial-and-error to figure out what R packages you need to install to make someone else’s code work–and then been left with those packages globally installed forever, because now you’re not sure whether you need them? Have you ever updated a package to get code in one of your projects to work, only to find that the updated package makes code in another project stop working?
We built packrat to solve these problems. Use packrat to make your R projects more:
Isolated: Installing a new or updated package for one project won’t break your other projects, and vice versa. That’s because packrat gives each project its own private package library.
Portable: Easily transport your projects from one computer to another, even across different platforms. Packrat makes it easy to install the packages your project depends on.
Reproducible: Packrat records the exact package versions you depend on, and ensures those exact versions are the ones that get installed wherever you go.
https://rstudio.github.io/packrat/
This is the purpose of the rbundler package: to provide a way to control the packages that are installed for a specific project. Right now the package works with the devtools functionality to install packages to your project's directory. The functionality is similar to Ruby's bundler.
If your project is a package (recommended) then all you have to do is load rbundler and bundle the packages. The bundle function will look at your package's DESCRIPTION file to determine which packages to bundle.
library(rbundler)
bundle('.', repos="http://cran.us.r-project.org")
Now the packages will be installed in the .Rbundle directory.
If your project isn't a package, then you can fake it by creating a DESCRIPTION file in your project's root directory with a Depends field that lists the packages that you want installed (with optional version information):
Depends: ggplot2 (>= 0.9.2), arm, glmnet
Here's the github repo for the project if you're interested in contributing: rbundler.
You can simply use the setdiff function to get the packages that aren't installed and then install them. In the sample below, we check if the ggplot2 and Rcpp packages are installed before installing them.
unavailable <- setdiff(c("ggplot2", "Rcpp"), rownames(installed.packages()))
install.packages(unavailable)
In one line, the above can be written as:
install.packages(setdiff(c("ggplot2", "Rcpp"), rownames(installed.packages())))
The current version of RStudio (>=1.2) includes a feature to detect missing packages in library() and require() calls, and prompts the user to install them:
Detect missing R packages
Many R scripts open with calls to library() and require() to load the packages they need in order to execute. If you open an R script that references packages that you don’t have installed, RStudio will now offer to install all the needed packages in a single click. No more typing install.packages() repeatedly until the errors go away!
https://blog.rstudio.com/2018/11/19/rstudio-1-2-preview-the-little-things/
This seems to address the original concern of OP particularly well:
Many of them are novice/intermediate R users and don't realize that they have to install packages they don't already have.
Sure.
You need to compare 'installed packages' with 'desired packages'. That's very close to what I do with CRANberries as I need to compare 'stored known packages' with 'currently known packages' to determine new and/or updated packages.
So do something like
AP <- available.packages(contrib.url(repos[i,"url"])) # available t repos[i]
to get all known packages, simular call for currently installed packages and compare that to a given set of target packages.
The following simple function works like a charm:
usePackage<-function(p){
# load a package if installed, else load after installation.
# Args:
# p: package name in quotes
if (!is.element(p, installed.packages()[,1])){
print(paste('Package:',p,'Not found, Installing Now...'))
install.packages(p, dep = TRUE)}
print(paste('Loading Package :',p))
require(p, character.only = TRUE)
}
(not mine, found this on the web some time back and had been using it since then. not sure of the original source)
I use following function to install package if require("<package>") exits with package not found error. It will query both - CRAN and Bioconductor repositories for missing package.
Adapted from the original work by Joshua Wiley,
http://r.789695.n4.nabble.com/Install-package-automatically-if-not-there-td2267532.html
install.packages.auto <- function(x) {
x <- as.character(substitute(x))
if(isTRUE(x %in% .packages(all.available=TRUE))) {
eval(parse(text = sprintf("require(\"%s\")", x)))
} else {
#update.packages(ask= FALSE) #update installed packages.
eval(parse(text = sprintf("install.packages(\"%s\", dependencies = TRUE)", x)))
}
if(isTRUE(x %in% .packages(all.available=TRUE))) {
eval(parse(text = sprintf("require(\"%s\")", x)))
} else {
source("http://bioconductor.org/biocLite.R")
#biocLite(character(), ask=FALSE) #update installed packages.
eval(parse(text = sprintf("biocLite(\"%s\")", x)))
eval(parse(text = sprintf("require(\"%s\")", x)))
}
}
Example:
install.packages.auto(qvalue) # from bioconductor
install.packages.auto(rNMF) # from CRAN
PS: update.packages(ask = FALSE) & biocLite(character(), ask=FALSE) will update all installed packages on the system. This can take a long time and consider it as a full R upgrade which may not be warranted all the time!
Today, I stumbled on two handy function provided by the rlang package, namely, is_installed() and check_installed().
From the help page (emphasis added):
These functions check that packages are installed with minimal side effects. If installed, the packages will be loaded but not attached.
is_installed() doesn't interact with the user. It simply returns TRUE or FALSE depending on whether the packages are installed.
In interactive sessions, check_installed() asks the user whether to install missing packages. If the user accepts, the packages are installed [...]. If the session is non interactive or if the user chooses not to install the packages, the current evaluation is aborted.
interactive()
#> [1] FALSE
rlang::is_installed(c("dplyr"))
#> [1] TRUE
rlang::is_installed(c("foobarbaz"))
#> [1] FALSE
rlang::check_installed(c("dplyr"))
rlang::check_installed(c("foobarbaz"))
#> Error:
#> ! The package `foobarbaz` is required.
Created on 2022-03-25 by the reprex package (v2.0.1)
I have implemented the function to install and load required R packages silently. Hope might help. Here is the code:
# Function to Install and Load R Packages
Install_And_Load <- function(Required_Packages)
{
Remaining_Packages <- Required_Packages[!(Required_Packages %in% installed.packages()[,"Package"])];
if(length(Remaining_Packages))
{
install.packages(Remaining_Packages);
}
for(package_name in Required_Packages)
{
library(package_name,character.only=TRUE,quietly=TRUE);
}
}
# Specify the list of required packages to be installed and load
Required_Packages=c("ggplot2", "Rcpp");
# Call the Function
Install_And_Load(Required_Packages);
Quite basic one.
pkgs = c("pacman","data.table")
if(length(new.pkgs <- setdiff(pkgs, rownames(installed.packages())))) install.packages(new.pkgs)
Thought I'd contribute the one I use:
testin <- function(package){if (!package %in% installed.packages())
install.packages(package)}
testin("packagename")
Regarding your main objective " to install libraries they don't already have. " and regardless of using " instllaed.packages() ". The following function mask the original function of require. It tries to load and check the named package "x" , if it's not installed, install it directly including dependencies; and lastly load it normaly. you rename the function name from 'require' to 'library' to maintain integrity . The only limitation is packages names should be quoted.
require <- function(x) {
if (!base::require(x, character.only = TRUE)) {
install.packages(x, dep = TRUE) ;
base::require(x, character.only = TRUE)
}
}
So you can load and installed package the old fashion way of R.
require ("ggplot2")
require ("Rcpp")
48 lapply_install_and_load <- function (package1, ...)
49 {
50 #
51 # convert arguments to vector
52 #
53 packages <- c(package1, ...)
54 #
55 # check if loaded and installed
56 #
57 loaded <- packages %in% (.packages())
58 names(loaded) <- packages
59 #
60 installed <- packages %in% rownames(installed.packages())
61 names(installed) <- packages
62 #
63 # start loop to determine if each package is installed
64 #
65 load_it <- function (p, loaded, installed)
66 {
67 if (loaded[p])
68 {
69 print(paste(p, "loaded"))
70 }
71 else
72 {
73 print(paste(p, "not loaded"))
74 if (installed[p])
75 {
76 print(paste(p, "installed"))
77 do.call("library", list(p))
78 }
79 else
80 {
81 print(paste(p, "not installed"))
82 install.packages(p)
83 do.call("library", list(p))
84 }
85 }
86 }
87 #
88 lapply(packages, load_it, loaded, installed)
89 }
source("https://bioconductor.org/biocLite.R")
if (!require("ggsci")) biocLite("ggsci")
Using lapply family and anonymous function approach you may:
Try to attach all listed packages.
Install missing only (using || lazy evaluation).
Attempt to attach again those were missing in step 1 and installed in step 2.
Print each package final load status (TRUE / FALSE).
req <- substitute(require(x, character.only = TRUE))
lbs <- c("plyr", "psych", "tm")
sapply(lbs, function(x) eval(req) || {install.packages(x); eval(req)})
plyr psych tm
TRUE TRUE TRUE
I use the following which will check if package is installed and if dependencies are updated, then loads the package.
p<-c('ggplot2','Rcpp')
install_package<-function(pack)
{if(!(pack %in% row.names(installed.packages())))
{
update.packages(ask=F)
install.packages(pack,dependencies=T)
}
require(pack,character.only=TRUE)
}
for(pack in p) {install_package(pack)}
completeFun <- function(data, desiredCols) {
completeVec <- complete.cases(data[, desiredCols])
return(data[completeVec, ])
}
Here's my code for it:
packages <- c("dplyr", "gridBase", "gridExtra")
package_loader <- function(x){
for (i in 1:length(x)){
if (!identical((x[i], installed.packages()[x[i],1])){
install.packages(x[i], dep = TRUE)
} else {
require(x[i], character.only = TRUE)
}
}
}
package_loader(packages)
library <- function(x){
x = toString(substitute(x))
if(!require(x,character.only=TRUE)){
install.packages(x)
base::library(x,character.only=TRUE)
}}
This works with unquoted package names and is fairly elegant (cf. GeoObserver's answer)
In my case, I wanted a one liner that I could run from the commandline (actually via a Makefile). Here is an example installing "VGAM" and "feather" if they are not already installed:
R -e 'for (p in c("VGAM", "feather")) if (!require(p, character.only=TRUE)) install.packages(p, repos="http://cran.us.r-project.org")'
From within R it would just be:
for (p in c("VGAM", "feather")) if (!require(p, character.only=TRUE)) install.packages(p, repos="http://cran.us.r-project.org")
There is nothing here beyond the previous solutions except that:
I keep it to a single line
I hard code the repos parameter (to avoid any popups asking about the mirror to use)
I don't bother to define a function to be used elsewhere
Also note the important character.only=TRUE (without it, the require would try to load the package p).
Let me share a bit of madness:
c("ggplot2","ggsci", "hrbrthemes", "gghighlight", "dplyr") %>% # What will you need to load for this script?
(function (x) ifelse(t =!(x %in% installed.packages()),
install.packages(x[t]),
lapply(x, require)))
There is a new-ish package (I am a codeveloper), Require, that is intended to be part of a reproducible workflow, meaning the function produces the same output the first time it is run or subsequent times, i.e., the end-state is the same regardless of starting state. The following installs any missing packages (I include require = FALSE to strictly address the original question... normally I leave this on the default because I will generally want them loaded to the search path).
These two lines are at the top of every script I write (adjusting the package selection as necessary), allowing the script to be used by anybody in any condition (including any or all dependencies missing).
if (!require("Require")) install.packages("Require")
Require::Require(c("ggplot2", "Rcpp"), require = FALSE)
You can thus use this in your script or pass it anyone.

Resources