How to setup a local repository for R package? - r

I want to setup local repository for R package, I'd like the repository works like sonatype nexus(it can proxy the central repository, and cache the artifacts after downloading the artifact from central repository).
Currently nexus does not support R repository format, so it doesn't suite what I needed to do.
Is that any existing solution for creating this repository? I don't want to create a CRAN mirror, which is too heavy for me.

First, you'll want to make sure you have the following path and its directories in your system: "/R/src/contrib". If you don't have this path and these directories, you'll need to create them. All of your R packages files will be stored in the "contrib" directory.
Once you've added package files to the "contrib" directory, you can use the setRepositories function from the utils package to create the repository. I'd recommend adding the following code to your .Rprofile for a local repository:
utils::setRepositories(ind = 0, addURLs = c(WORK = "file://<your higher directories>/R"))
After editing your .Rprofile, restart R.
ind = 0 will indicate that you only want the local repository. Additional repositories can be included in the addURLs = option and are comma separated within the character vector.
Next, create the repository index with the following code:
tools::write_PACKAGES("/<your higher directories>/R/src/contrib", verbose = TRUE)
This will generate the PACKAGE files that serve as the repository index.
To see what packages are in your repository, run the following code and take a look at the resulting data frame: my_packages <- available.packages()
At this point, you can install packages from the repo without referencing the entire path to the package installation file. For example, to install the dplyr package, you could run the following:
install.packages("dplyr", dependencies = TRUE)
If you want to take it a step further and manage a changing repository, you could install and use the miniCRAN package. Otherwise, you'll need to execute the write_PACKAGES function whenever your repository changes.
After installing the miniCRAN package, you can execute the following code to create the miniCRAN repo:
my_packages <- available.packages()
miniCRAN::makeRepo(
pkgs = my_packages[,1,
path = "/<your higher directories>/R",
repos = getOption("repos"),
type = "source",
Rversion = R.version,
download = TRUE,
writePACKAGES = TRUE,
quiet = FALSE
)
You only need to execute the code above once for each repo.
Then, check to make sure each miniCRAN repo has been created. You only need to do this once for each repo:
pkgAvail(
repos = getOption("repos"),
type = "source",
Rversion = R.version,
quiet = FALSE
)
Whenever new package files are placed into the local repo you can update the local repo's index as follows:
miniCRAN::updateRepoIndex("/<your higher directories>/R/")
Finally, as an optional step to see if the new package is in the index, create a data frame of the available packages and search the data frame:
my_packages <- available.packages(repos = "file://<your higher directories>/R/")
This approach has worked pretty well for me, but perhaps others have comments and suggestions that could improve upon it.

Related

How to install a series of Julia packages from a file

In Python, if I install mambaforge or conda then I can make a file with extension .yml and then inside it list the name of packages I want to install alongside their specific versions. How can I do a similar way of installing packages in Julia?
I understand that if I have already installed Julia packages by addcommand in package manager, then I have a file named Project.toml which I can use to install the same packages later. However, this still does not look as good as Python's way of installing packages.
Upon further investigation I realized that to install Julia packages from an empty Prokect.tomlfile, I should add [deps]in the file followed by the name of packages I want and then give each package a uuidwhich can be found here. For example:
[deps]
Images = "916415d5-f1e6-5110-898d-aaa5f9f070e0"
After all , this is still tedious as it needs to find all those uuids.
How can I install packages in Julia the same way I described for Python?
Is there a particular reason that you want to write package names to a .yml file and then read the packages from there? After all, you can generate the Project file and add multiple dependencies automatically:
(#v1.8) pkg> generate MyProject # or whatever name you like
(#v1.8) pkg> activate MyProject
(MyProject) pkg> add Countries Crayons CSV # some example packages
(In recent versions of Julia, an installation prompt will appear if a package isn't already installed).
Speaking from experience, learning to use environments in Julia can be challenging to a new user, but rewarding! The documentation for Pkg.jl are helpful here.
If you are just assembling an environment for your own code, there is probably no need for you to manually edit Project.toml. On the other hand, if you are maintaining a package, you might wish to edit the file directly (e.g., for specifying compatability).
Maybe you can use this:
using TOML
using HTTP
using DataStructures
## Get the Registry.toml file if it already doesn't exist in the current directory
function raed_reg()
if !isfile("Registry.toml")
HTTP.download(
"https://raw.githubusercontent.com/JuliaRegistries/General/master/Registry.toml",
"Registry.toml"
)
end
reg = TOML.parsefile("Registry.toml")["packages"]
return reg
end
## Find the UUID of a specific package from the Registry.toml file
function get_uuid(pkg_name)
reg = raed_reg()
for (uuid, pkg) in reg
if pkg["name"] == pkg_name
return uuid
end
end
end
## Create a dictionary for the gotten UUIDs, by setting the name = UUID and convert it to a project.toml file
function create_project_toml(Pkgs::Vector{String})
reg = raed_reg()
pkgs_names_uuid = OrderedDict{AbstractString, AbstractString}()
pkgs_names_uuid["[deps]"] = ""
for pkg in Pkgs
_uuid = get_uuid(pkg)
pkgs_names_uuid[pkg] = _uuid
end
open("project.toml", "w") do io
TOML.print(io, pkgs_names_uuid)
end
end
## Test on the packages "ClusterAnalysis" and "EvoTrees"
create_project_toml(["ClusterAnalysis", "EvoTrees"])

How to use install_github() for .tar.gz files hosted on Github

A friend has created this file: "their_tar_0.1.tar.gz". He told me that i can install it locally from this file alone, so i did via:
install.packages("their_tar_0.1.tar.gz", repos = NULL)
This worked absolutely fine with no issues. My issue is: I'd like to be able to host this on github publicly. I've saved this file within my github account in the regular format as: "author/package", i.e. "my_github_acc_name/their_tar_0.1". However, it seems that .tar.gz file is actually in the 'master' branch of this repo. Either way, when i try to install it via:
install_github("my_github_acc_name/their_tar_0.1")
or
install_github("my_github_acc_name/their_tar_0.1.tar.gz")
or
install_github("my_github_acc_name/their_tar_0.1\their_tar_0.1.tar.gz")
Which gives me the following error when i run it in RStudio:
*Downloading GitHub repo my_github_acc_name/their_tar_0.1#master
Error: Failed to install 'their_tar_0.1' from GitHub:
Does not appear to be an R package (no DESCRIPTION)*
Any ideas what my problem is? I've looked at the package and various questions online, however none of the questions seem to relate specifically .tar.gz file types.
Any help would be greatly appreciated.
Thanks, Dave
https://rawgit.com/rstudio/cheatsheets/master/package-development.pdf
From help("install.packages"):
Usage
install.packages(pkgs, lib, repos = getOption("repos"),
[... some content omitted ...]
pkgs character vector of the names of packages
whose current versions should be downloaded from the repositories.
If repos = NULL, a character vector of file paths,
on windows,
file paths of ‘.zip’ files containing binary builds of packages.
(http:// and file:// URLs are also accepted
So, you can just do
user <- "my_github_acc_name"
repo <- "their_tar_0.1"
branch <- "master"
fname <- "their_tar_0.1.tar.gz"
pkg_url <- paste("https://raw.github.com", user, repo, branch, fname, sep = "/")
install.packages(pkg_url, repos = NULL)

RStudio Connect, Packrat, and custom packages in local repos

We have recently got RStudio Connect in my office. For our work, we have made custom packages, which we have updated amongst ourselves by opening the project and build+reloading.
I understand the only way I can get our custom packages to work within apps with RSConnect is to get up a local repo and set our options(repos) to include this.
Currently I have the following:
library(drat)
RepoAddress <- "C:/<RepoPath>" # High level path
drat::insertPackage(<sourcePackagePath>, repodir = RepoAddress)
# Add this new repo to Rs knowledge of repos.
options(repos = c(options("repos")$repos,LocalCurrent = paste0("file:",RepoAddress)))
# Install <PackageName> from the local repo :)
install.packages("<PackageName>")
Currently this works nicely and I can install my custom package from the local repo. This indicates to me that the local repo is set up correctly.
As an additional aside, I have changed the DESCRIPTION file to have an extra line saying repository:LocalCurrent.
However when I try to deploy a Shiny app or Rmd which references , I get the following error on my deploy:
Error in findLocalRepoForPkg(pkg, repos, fatal = fatal) :
No package '<PackageName> 'found in local repositories specified
I understand this is a problem with packrat being unable to find my local repos during the deploy process (I believe at a stage where it uses packrat::snapshot()).This is confusing since I would have thought packrat would use my option("repos") repos similar to install.packages. If I follow through the functions, I can see the particular point of failure is packrat:::findLocalRepoForPkg("<PackageName", repos = packrat::get_opts("local.repos")), which fails even after I define packrat::set_opts("local.repos" = c(CurrentRepo2 = paste0("file:",RepoAddress)))
If I drill into packrat:::findLocalRepoForPkg, it fails because it can't find a file/folder called: "C://". I would have thought this is guaranteed to fail, because repos follow the C://bin/windows/contrib/3.3/ structure. At no point would a repo have the structure it's looking for?
I think this last part is showing I'm materially misunderstanding something. Any guidance on configuring my repo so packrat can understand it would be great.
One should always check what options RStudio connect supports at the moment:
https://docs.rstudio.com/connect/admin/r/package-management/#private-packages
Personally I dislike all options for including local/private packages, as it defeats the purpose of having a nice easy target for deploying shiny apps. In many cases, I can't just set up local repositories in the organization because, I do not have clearance for that. It is also inconvenient that I have to email IT-support to make them manually install new packages. Overall I think RS connect is great product because it is simple, but when it comes to local packages it is really not.
I found a nice alternative/Hack to Rstudio official recommendations. I suppose thise would also work with shinyapps.io, but I have not tried. The solution goes like:
add to global.R if(!require(local_package) ) devtools::load_all("./local_package")
Write a script that copies all your source files, such that you get a shinyapp with a source directory for a local package inside, you could call the directory for ./inst/shinyconnect/ or what ever and local package would be copied to ./inst/shinyconnect/local_package
manifest.
add script ./shinyconnect/packrat_sees_these_dependencies.R to shiny folder, this will be picked up by packrat-manifest
Hack rsconnet/packrat to ignore specifically named packages when building
(1)
#start of global.R...
#load more packages for shiny
library(devtools) #need for load_all
library(shiny)
library(htmltools) #or what ever you need
#load already built local_package or for shiny connection pseudo-build on-the-fly and load
if(!require(local_package)) {
#if local_package here, just build on 2 sec with devtools::load_all()
if(file.exists("./DESCRIPTION")) load_all(".") #for local test on PC/Mac, where the shinyapp is inside the local_package
if(file.exists("./local_package/DESCRIPTION")) load_all("./local_package/") #for shiny conenct where local_package is inside shinyapp
}
library(local_package) #now local_package must load
(3)
make script loading all the dependencies of your local package. Packrat will see this. The script will never be actually be executed. Place it at ./shinyconnect/packrat_sees_these_dependencies.R
#these codelines will be recognized by packrat and package will be added to manifest
library(randomForest)
library(MASS)
library(whateverpackageyouneed)
(4) During deployment, manifest generator (packrat) will ignore the existence of any named local_package. This is an option in packrat, but rsconnect does not expose this option. A hack is to load rsconnect to memory and and modify the sub-sub-sub-function performPackratSnapshot() to allow this. In script below, I do that and deploy a shiny app.
library(rsconnect)
orig_fun = getFromNamespace("performPackratSnapshot", pos="package:rsconnect")
#packages you want include manually, and packrat to ignore
ignored_packages = c("local_package")
#highjack rsconnect
local({
assignInNamespace("performPackratSnapshot",value = function (bundleDir, verbose = FALSE) {
owd <- getwd()
on.exit(setwd(owd), add = TRUE)
setwd(bundleDir)
srp <- packrat::opts$snapshot.recommended.packages()
packrat::opts$snapshot.recommended.packages(TRUE, persist = FALSE)
packrat::opts$ignored.packages(get("ignored_packages",envir = .GlobalEnv)) #ignoreing packages mentioned here
print("ignoring following packages")
print(get("ignored_packages",envir = .GlobalEnv))
on.exit(packrat::opts$snapshot.recommended.packages(srp,persist = FALSE), add = TRUE)
packages <- c("BiocManager", "BiocInstaller")
for (package in packages) {
if (length(find.package(package, quiet = TRUE))) {
requireNamespace(package, quietly = TRUE)
break
}
}
suppressMessages(packrat::.snapshotImpl(project = bundleDir,
snapshot.sources = FALSE, fallback.ok = TRUE, verbose = FALSE,
implicit.packrat.dependency = FALSE))
TRUE
},
pos = "package:rsconnect"
)},
envir = as.environment("package:rsconnect")
)
new_fun = getFromNamespace("performPackratSnapshot", pos="package:rsconnect")
rsconnect::deployApp(appDir="./inst/shinyconnect/",appName ="shinyapp_name",logLevel = "verbose",forceUpdate = TRUE)
The problem is one of nomenclature.
I have set up a repo in the CRAN sense. It works fine and is OK. When packrat references a local repo, it is referring to a local git-style repo.
This solves why findlocalrepoforpkg doesn't look like it will work - it is designed to work with a different kind of repo.
Feel free to also reach out to support#rstudio.com
I believe the local package code path is triggered in packrat because of the missing Repository: value line in the Description file of the package. You mentioned you added this line, could you try the case-sensitive version?
That said, RStudio Connect will not be able to install the package from the RepoAddress as you've specified it (hardcoded on the Windows share). We'd recommend hosting your repo over https from a server that both your Dev environment and RStudio Connect have access to. To make this type of repo setup much easier we just released RStudio Package Manager which you (and IT!) may find a compelling alternative to manually managing releases of your internal packages via drat.

Packrat with local binary repository

I want to use packrat on a Windows 7 machine with no internet connection.
I have downloaded all binary packages from http://cran.r-project.org/bin/windows/contrib/3.1/ into the local folder C:/xyz/CRAN_3_1.
The problem is now that
packrat::init(options=list(local.repos="C:/xyz/CRAN_3_1"))
throws a bunch of warnings and errors like
Warning: unable to access index for repository http://cran.rstudio/bin/...
Warning: unable to access index for repository http://cran.rstudio/src/...
Fetching sources for Rcpp (0.11.4) ... Failed
Package Rcpp not available in repository or locally
As it seems packrat tries to find
the binary version of Rcpp on CRAN (fails since there is no internet connection)
the source of Rcpp on CRAN (fails since there is no internet connection)
the local source of the package (fails since I only have the binaries)
What I don't understand is why packrat does not also search for the local binary package...
Question 1: I could download the source CRAN repository to get around this problem. But I would like to know from you guys whether there is an easier solution to this, i.e., whether it is possible to make packrat accept a local binary repo.
Question 2: When I create my own package myPackage with packrat enabled, will the myPackage-specific local packrat library also be included in the package? That is, assume that I give the binary myPackage zip File to one of my colleagues who does not have one of the packages that myPackage depends on (let's say Rcpp). Will Rcpp be included in myPackage when I use packrat? Or does my colleague have to install Rcpp himself?
I managed to hack around this problem. Please bear in mind that I have never used packrat before and that I do not know its "proper" behaviour. But my impression is that the hack works.
Here is how I did it:
Open your project, load packrat via library(packrat)
type fixInNamespace("snapshotImpl",ns="packrat") - a window opens - copy its content into the clipboard
Go to /yourProjDir/ and create a file snapshotImplFix.R
Copy the clipboard's content into this file ...
... but change the first line to
snapshotImplFix=function (project, available = NULL, lib.loc = libDir(project),
dry.run = FALSE, ignore.stale = FALSE, prompt = interactive(),
auto.snapshot = FALSE, verbose = TRUE, fallback.ok = FALSE,
snapshot.sources = FALSE)
Note snapshot.sources = FALSE! Save and close the file.
Create /yourProjDir/.Rprofile and add
setHook(packageEvent("packrat","onLoad"),function(...) {
source("./snapshotImplFix.R");
tmpfun=get("snapshotImpl",envir=asNamespace("packrat"));
environment(snapshotImplFix)=environment(tmpfun);
utils::assignInNamespace(x="snapshotImpl",value=snapshotImplFix,ns="packrat");})
Points 2-6 fix the problem with the snapshot.sources argument being TRUE by default (I did not find a better way to change that...)
Finally, we have to tell packrat to take our local repository. It's important that you have the right folder structure. Therefore I moved the repo from C:/xyz/CRAN_3_1 to C:/xyz/CRAN_3_1/bin/windows/contrib/3.1. Do not forget to run library(tools);write_PACKAGES("C:/xyz/CRAN_3_1/bin/windows/contrib/3.1"); if you also have to move your files.
Open yourProjDir/.Rprofile again and add at the end
local({r=getOption("repos");r["CRAN"]="file:///C:/xyz/CRAN_3_1";r["CRANextra"]=r["CRAN"];options(repos=r)})
Note the 3 / right after file! Save and exit file.
Close the project and re-open.
Now you can execute packrat::init() and it should run without errors.
It would be great if someone with more experience regarding packrat could give his/her input so that I can be sure that this hack works. Any pointers to proper solutions are highly appreciated, of course.

Installing R packages remotely with no special privileges. A.k.a. corruptions from installing R packages in /tmp/

I'm distributing jobs over a cluster and I'd rather not go to each machine and manually install the right packages. The job controller runs scripts as nobody, so I have to specify uncontroversial writeable paths for the installations. I actually had this working solution:
`%ni%` = Negate(`%in%`) ### "not in"
.libPaths("/tmp/") ### for local (remote non super user) install of packages
if ("xxx" %ni% installed.packages()) {install.packages("xxx", repos = "http://cran.r-project.org", lib="/tmp/")}
# ... and more
library(xxx)
# ... and more
It worked at first, but a week later I've got a strange problem.
> library(xxx)
Error in library(xxx) : there is no package called 'xxx'
xxx (and other packages) is in the manifest of installed.packages(), .libPaths is reporting /tmp/ on the path, and ls shows a folder for the package in /tmp/. Reinstalling with install.packages throws an error, as does remove.package, update.package, and find.package.
Two questions:
Is there a different way that I ought to have managed the remote install?
Any ideas what has caused my problem with the failure to load the package?
Please save me from having to implement a kludge like
locdir <- paste("/tmp/", as.integer(runif(1, 1, 100000)), sep='')
system(paste("mkdir", locdir))
.libPaths(locdir)
install.packages("xxx", repos = "http://cran.r-project.org", lib=locdir)
library(xxx)
You might need option character.only = TRUE, although it is weird that your code worked before but not anymore. Anyway, try this function:
packageLoad<-function(libName){
# try to load the package
if (!require(libName,character.only = TRUE)){
# if package is not available, install it
install.packages(libName,dep=TRUE,
repos="http://cran.r-project.org",lib="/tmp/",destdir="/tmp/")
# try again
if(!require(libName,character.only = TRUE))
stop(paste("Package ",libName,"
not found and its installation failed."))
}
}

Resources