How to create a local cache for the choco package manager - Artifactory

I am building a Windows Docker image and use a package manager called choco to automate package installation. I would like to avoid downloading packages from the internet every time I modify and rebuild the Dockerfile. Locally we run an Artifactory server for exactly that purpose, but the package I am installing has a hardcoded URL value pointing to microsoft.com.
The package installation script (https://community.chocolatey.org/packages/visualcpp-build-tools#files) contains:
$packageName = 'visualcpp-build-tools'
$installerType = 'EXE'
$url = 'https://download.microsoft.com/download/5/A/8/5A8B8314-CA70-4225-9AF0-9E957C9771F7/vs_BuildTools.exe'
$checksum = 'E77D433C44F3D0CBF7A3EFA497101DE93918C492C2EBCAEC79A1FAF593D419BC'
How can I overcome this hardcoded URL value and force the install script to download the package from my local Artifactory server?
Before discovering the hardcoded URL value inside the package installation script, I tried using the --source option with choco install, as described in the guide at https://jfrog.com/blog/artifactory-as-a-caching-mechanism-for-package-managers/, but I noticed that the package was still downloaded from the internet.

Packages on the Chocolatey Community Repository often can't include the original binaries due to licenses and distribution rights, which leads to each user downloading the file from the source URL.
However, you can manually re-create the package with the downloaded file inside. To do so, you should:
1. Extract the relevant content of the nupkg file (e.g. the tools directory and the nuspec file).
2. Download the file specified in $url (the file being downloaded repeatedly during package installation) to the tools directory (or somewhere else within the package).
3. Replace the value of $url with the relative path to the downloaded file, as in the sketch below.
4. Run choco pack $PathToNuspecFile (in the root of the extracted files).
5. Upload the resulting nupkg file to your Artifactory instance.
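For illustration, a minimal sketch of what the edited chocolateyInstall.ps1 could look like (the $toolsDir line is the standard Chocolatey idiom for locating the package's tools directory, and Install-ChocolateyInstallPackage is the built-in helper that installs from a local file rather than a URL; keep any silent arguments the original script passes):
$packageName = 'visualcpp-build-tools'
$installerType = 'EXE'
# locate the tools directory of the installed package
$toolsDir = Split-Path -Parent $MyInvocation.MyCommand.Definition
# point $url at the binary bundled in step 2 instead of download.microsoft.com
$url = Join-Path $toolsDir 'vs_BuildTools.exe'
Install-ChocolateyInstallPackage -PackageName $packageName -FileType $installerType -File $url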
Alternatively, you could host the binary file on your Artifactory server and update $url to point to that. This has the potential benefit of not downloading the full-size file when the package contains logic that short-circuits the installation before the actual install step, e.g. a gigantic MSU file that you only want to download if you need to apply it.
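For example, assuming a generic Artifactory repository named choco-binaries (a hypothetical name), the variable would become:
$url = 'https://artifactory.example.com/artifactory/choco-binaries/vs_BuildTools.exe'
The $checksum value can stay as it is, since the file itself is unchanged.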
Finally, one of the parts of a licensed copy of Chocolatey for Business is the Package Internalizer.
This automates the process described above, allowing you to simply call choco download visualcpp-build-tools --internalize.
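A hedged sketch of the end-to-end flow, assuming an Artifactory NuGet repository named choco-local (a hypothetical name; Artifactory exposes NuGet repositories under /artifactory/api/nuget/<repo-name>, and choco push may also need an API key via --api-key):
choco download visualcpp-build-tools --internalize
choco push visualcpp-build-tools.<version>.nupkg --source=https://artifactory.example.com/artifactory/api/nuget/choco-local
choco install visualcpp-build-tools --source=https://artifactory.example.com/artifactory/api/nuget/choco-local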

Related

How can I copy an entire renv-based project to a new PC (which does not have internet access)?

I have been given access to a beefy machine on which to run a large simulation.
I have developed the code in an RStudio project with renv. renv makes a local copy of all the packages and stores their versions in a lock file.
The target machine (which runs Windows) does not have access to the internet. I have copied the project file, the code files, the renv folder (which includes all the local copies of the packages), the lock file, and the .Rprofile file to a folder on the target machine.
When I open the project on the target machine, the .Rprofile executes source("renv/activate.R"). However, this fails to load the project, instead giving me the following message:
The following package(s) are missing their DESCRIPTION files:
... Long list of packages ...
These may be left over from a prior, failed installation attempt.
Consider removing or re-installing these packages.
The trouble is that I can't reinstall them, since this machine does not have access to the internet. I could manually go through each package, download the binaries on my work machine, transfer them to the target machine, and install them one by one, but this seems like a very painful thing to do.
Is there a way for me to convince renv, or R itself, to just use the packages in the local renv folder?
From the Cache section of https://rstudio.github.io/renv/articles/renv.html:
When using renv with the global package cache, the project library is instead formed as a directory of symlinks (or, on Windows, junction points) into the renv global package cache. Hence, while each renv project is isolated from other projects on your system, they can still re-use the same installed packages as required.
I think that implies that copying the renv folder means copying those junction points (which are something like shortcuts/symlinks), not the actual underlying folders.
It looks like you already have a solution, but another option would be to call renv::isolate() to ensure that your project doesn't link to packages within the global cache, and instead just maintains an isolated library directly.
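A minimal sketch of that approach, run on the source machine with the project open, before copying it to the target machine:
# replace links into the global cache with real package copies in renv/library
renv::isolate()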
In the end I just wrote a small script to copy the files over.
sourceFolder = "some/path"
targetFolder = "some/other/path"
subFolders = list.files(sourceFolder)
for (subFolder in subFolders) {
  file.copy(
    from = file.path(sourceFolder, subFolder),  # file.path() adds the path separator that paste0() would omit
    to = targetFolder,
    overwrite = TRUE,
    recursive = TRUE
  )
  message(subFolder)  # report progress
}

R Packrat Fails to load private library

I have developed a solution using R and want to transfer it to the production server (CentOS 7), which has no internet connection for installing packages. To facilitate installation of packages, I used packrat to bundle the packages used in my R script into the project.
Using packrat::bundle(), I created a tar file of the project, moved the file to the server, and untarred it.
According to a blog post, once I open the project, "When R is started from that directory, Packrat will do its magic and make sure the software environment is the same as on the source machine."
However, when I open the project on the server (using RStudio Server 0.99), nothing happens and it throws errors about unknown packages.
When I manually execute the packrat/init.R file, the error below is thrown:
Error in ensurePackageSymlink(source, target) :
Target '/home/R_Projects/prjName/packrat/lib-R/base' already exists and is not a symlink
Well, I found the problem and solved it. The symlink error is related to CentOS (it is not related to R). I simply removed all the folders inside
/home/R_Projects/prjName/packrat/lib-R
Because these folders exist, packrat is unable to create symlinks with the same names inside the lib-R folder. Once they are removed, it creates links (shortcuts) to the actual folders where the R packages are located.
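On CentOS that amounts to something like the following (this deletes the stale directories so packrat can recreate them as symlinks the next time R starts in the project; double-check the path before running it):
rm -rf /home/R_Projects/prjName/packrat/lib-R/*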
Hope it helps future readers.

Using R with git and packrat

I have been using git for a while but just recently started using packrat. I would like my repository to be self-contained, but at the same time I do not want to include CRAN packages, as they are readily available. It seems that once R is opened in a project with packrat, it will try to use packages from the project library; if they are not available, it will try to install them from src in the project library; if those are not available, it will look at the libraries installed on that computer. If a package is not available on the computer, will it look at CRAN next?
What files should I include in my git repo as a minimum (e.g., packrat.lock)?
You can choose to set up an external CRAN-like repository with the source tarballs of the packages and versions that you'd like available for your project. The default behaviour, however, is to look to CRAN next, as you've identified in your question. Check out the packrat.lock file: you will see that for each package you use in packrat, there is an option called source: CRAN (if you've downloaded the file from CRAN, that is).
When you have a locally stored package source file, the contents of the lock file for said package change to the following:
Package: FooPackage
Source: source
Version: 0.4-4
Hash: 44foo4036fb68e9foo9027048d28
SourcePath: /Users/MyName/Documents/code/myrepo/RNetica
I'm a bit unclear on your final question ("What files should I include in my git repo as a minimum (e.g., packrat.lock)?"), but I'm going to take it as a combination of a) what files must be present for packrat to run, and b) which of those files should be committed to the git repo. To answer the first part, I'll illustrate what happens when you initialise packrat on an existing R project.
When you run packrat::init(), two important things happen (among others):
1. All the packrat scaffolding, including source tarballs etc are created under: PackageName/packrat/.
2. packrat/lib*/ is added to your .gitignore file.
So from this, we can see that anything under packrat/lib*/ doesn't need to be committed to your git-repo. This leaves the following 3 files to be committed:
packrat/init.R
packrat/packrat.lock
packrat/packrat.opts
packrat.lock is needed for collaborating with others through a version control system; it helps keep your private libraries in sync. packrat.opts allows you to specify project-specific options for packrat. The file is automatically generated using get_opts and set_opts, and committing it to the git repo ensures that any options you specify are maintained for all collaborators. A final file to commit to the repo is .Rprofile. This file tells R to use the private package library when it is started from the project directory.
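For reference, the .Rprofile that packrat generates is essentially a one-liner (exact contents vary slightly between packrat versions):
source("packrat/init.R")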
Depending on your needs, you can choose to commit the source tarballs to the repository, or not. If you don't want them available in your git repo, you simply add packrat/src/ to the .gitignore. But this will mean that anyone accessing the git repo will not have access to the package source code, and the files will be downloaded from CRAN, or from wherever the Source line dictates within the packrat.lock file.
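For example, a .gitignore that excludes both the private library and the source tarballs might look like this (the packrat/lib*/ line is added by packrat::init(); the packrat/src/ line is the optional addition just described):
packrat/lib*/
packrat/src/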
From your question, it sounds like committing the packrat/src/ folder contents to your repo might be what you need.

How to install stringi from local file (ABSOLUTELY no Internet Access)

I am working on a remote server using RStudio. This server has no access to the internet. I would like to install the package "stringi". I have looked at this Stack Overflow article, but whenever I use the command
install.packages("stringi_0.5-5.tar.gz",
configure.vars="ICUDT_DIR=/my/directory/for/icudt.zip")
It simply tries to access the Internet, which it cannot do. Up until now I have been using Tools -> Install Packages -> Install from Packaged Archive File. However, due to this error, I can no longer use this method.
How can I install this package?
If you have no internet access on local machines, you can build a distributable source package that includes all the required ICU data files (for off-line use) by omitting some relevant lines in the .Rbuildignore file. The following command sequence should do the trick:
wget https://github.com/gagolews/stringi/archive/master.zip -O stringi.zip
unzip stringi.zip
sed -i '/\/icu..\/data/d' stringi-master/.Rbuildignore
R CMD build stringi-master
Assuming the most recent development version is 1.3.1, a file named stringi_1.3.1.tar.gz is created in the current working directory. The package can now be installed (the source bundle may be propagated via scp etc.) by executing:
R CMD INSTALL stringi_1.3.1.tar.gz
or by calling install.packages("stringi_1.3.1.tar.gz", repos=NULL) from within an R session.
For a Linux machine, the easiest way from my point of view is:
Download the release you need from Rexamine as a tar.gz archive to your local PC. Unlike the version on CRAN, it already contains the icu55\data\ folder.
Move the archive to your target Linux machine without internet access.
Run R CMD INSTALL stringi-1.0-1.tar.gz (in the case of release 1.0-1).
You provided the wrong value for configure.vars: it expects the name of the directory, not a final file name.
Correct your code to the following:
install.packages("stringi_0.5-5.tar.gz",
configure.vars="ICUDT_DIR=/my/directory/for/")
Follow the steps below:
Download icudt55l.zip separately, on a server where you have internet access, with
wget http://www.mini.pw.edu.pl/~gagolews/stringi/icudt55l.zip
Copy the downloaded file, together with the stringi source package, to the server where you want to install stringi.
Execute the following command (here icudt55l.zip was copied to /tmp/ALL):
R CMD INSTALL --configure-vars='ICUDT_DIR=/tmp/ALL' stringi_1.1.6.tar.gz
The suggestion from @gagolews almost worked for me. Here's what actually did the trick with RStudio.
Download the master.zip file, which will save as stringi-master.zip.
Unzip the file onto your desktop. The unzipped folder should be stringi-master.
Edit the .Rbuildignore file by removing ^src/icu55/data and ^src/icu61/data or similar lines.
Move the folder from your desktop to the home directory of your server.
Create a New Project in RStudio with ~/stringi-master as the Existing Directory
From RStudio's menu, select Build and Build Source Package. (You may need to first select Configure Build Tools. For Project build tools choose Package then select OK.)
It should create a tar.gz file, in the following format: stringi_x.x.(x+1).tar.gz. For example, if the current version of stringi is 1.5.3, it will create version 1.5.4. (I received a few warnings that didn't seem to affect the outcome.)
Move the newly created package to your local repository, update the repository index, and install the package.

Move R packages to a new computer which has no internet

Normally I install packages using:
install.packages("foo")
and a repo over the internet. But I have a new machine now where I want to replicate the packages from my existing installation without having to pull everything off the internet all over again. (I have a ton of packages and slow internet access.)
Both machines are Windows and run the same R version. (2.13.1)
Is there a way to do this? The closest I've got is knowing that I can install from local zip files using:
install.packages("pathtozip", repos = NULL)
But does R store all the zips somewhere? I found a few in locations like
C:\Documents and Settings\foouser\Local Settings\Temp\RtmpjNKkyp\downloaded_packages
but not all of them.
Any tips?
The function .libPaths() will give you a vector of all the libraries on your machine. Run it on your old machine to find all of them. You can simply copy all these files into the libraries on your new machine (run .libPaths() on it too to find out where).
Alternatively, if you want to set up a real repository (i.e. basically a CRAN mirror) on your computer or on a network drive you can update, you can put binary or source packages into a folder and run tools::write_PACKAGES on that folder. You can then run install.packages using the contriburl argument and point it at your repository folder.
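A minimal sketch of that approach, assuming the package binaries have been collected in C:/miniCRAN (a hypothetical folder):
# index the folder of Windows binary packages (creates a PACKAGES file)
tools::write_PACKAGES("C:/miniCRAN", type = "win.binary")
# install from the local folder instead of the internet
install.packages("foo", contriburl = "file:///C:/miniCRAN", type = "win.binary")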
All packages that you have installed are stored in a folder called win-library\<r-version>, for example
C:\Users\Ehsan\Documents\R\win-library\2.15
so it is enough to copy all the folders inside 2.15 to the same folder on your new machine. Because you have the same version of R, you do not need to update them with update.packages().
On your original computer, run
write.csv(unique(data.frame(installed.packages())[,1]), "packages.csv", row.names = FALSE)
Save this .csv into the working directory of your new computer, then run
install.packages(as.character(read.csv("packages.csv")[,1]))
You can check what your working directory is using getwd().
