immutable artifactory PyPI: prevent re-uploads and deletes - artifactory

Currently I have a problem where if a package is accidentally re-uploaded at the same version the hashes change, and so packages pinned with --generate-hashes will fail to install.
Is there a way to prevent re-uploads to the same version, just like pypi.org?
Also a way to prevent deletes?

Related

Non-manual solution to "cannot remove prior installation of package" when re-installing R packages

I recently began receiving warnings that prior installations of R packages cannot be removed when I try to re-install packages:
install.packages("gtools")
#> Warning: cannot remove prior installation of package ‘gtools’
#> Warning: restored ‘gtools’
I found solutions to this issue encouraging me to delete the packages manually from my library folder, which I could find with .libPaths(). However, (a) this seems like a way of addressing symptoms rather than the underlying issue (which remains unclear) and (b) there are two paths for seemingly different versions of R and I'm not sure which to delete from anyway:
.libPaths()
#> [1] "C:/Users/foo/Documents/R/win-library/4.1"
#> [2] "C:/Program Files/R/R-4.1.2/library"
How can I fix the problem so I don't have to manually delete package folders every time I want to re-install a package? If there is no alternative, do I need to delete the subdirectories for the package from one of those folders or both? FWIW, I'm working in RStudio.
The problem is that you have installed packages using different permissions. On Windows, you need elevated permissions to write to Program Files. At some point you (or an admin) probably used "Run as admin" to install gtools there, and now using regular permissions you can't delete that.
You should be able to delete the Users/foo copy, if you are running as user foo, but even that one may have had permissions changed. But I'd guess the issue is that gtools is in the Program files location.
The error message from R doesn't tell you which location it is trying to delete from, which is unfortunate. In fact, allowing installations of different versions in those two locations is a bad design feature in R that just leads to confusion, because you don't necessarily always use the same version each time you load packages. (The rule for which one you use is the first acceptable one found in the .libPaths list, but since you can change .libPaths, and since packages can load other packages, it's hard to predict which one you'll have loaded at any given time.)
To fix this, you can delete both copies (if you have two) and start over, but that's risky because other packages might be depending on gtools. If you are the only user on your computer, you could instead delete the entire "C:/Users/foo/Documents/R/win-library/4.1" library, and then do all your installs using "Run as admin", but that's also easy to mess up.
(On a Mac, that's effectively what happens, because most single user systems put the user in the "admin" group, so they can always install packages to the system location. It causes a lot less confusion, but some "purists" think the Windows way is better.)
So I don't have any good advice for you, but maybe this explains the situation, and you can work out for yourself the best way forward.

Pinning R package versions

How do you best pin package versions in R?
Rejected strategy 1: Pin to CRAN source tar.gzs
Doesn't work if you want to pin it at the latest version since CRAN does not put the tip version in the archive (duh)
Rejected strategy 2: Use devtools
Don't want to, because it takes ages to compile and adds lots of stuff I don't want to use
Rejected strategy 3: Vendor
Would rather avoid having to copy all source
To provide a little bit more information on packrat, which I use for this purpose. From the website.
R package dependencies can be frustrating. Have you ever had to use
trial-and-error to figure out what R packages you need to install to
make someone else’s code work–and then been left with those packages
globally installed forever, because now you’re not sure whether you
need them? Have you ever updated a package to get code in one of your
projects to work, only to find that the updated package makes code in
another project stop working?
We built packrat to solve these problems. Use packrat to make your R
projects more:
Isolated: Installing a new or updated package for one project won’t
break your other projects, and vice versa. That’s because packrat
gives each project its own private package library. Portable: Easily
transport your projects from one computer to another, even across
different platforms. Packrat makes it easy to install the packages
your project depends on. Reproducible: Packrat records the exact
package versions you depend on, and ensures those exact versions are
the ones that get installed wherever you go.
Packrat stores the version of the packages you use in the packrat.lock file, and then downloads that version from CRAN whenever you packrat::restore(). It is much lighter weight than devtools, but can still take some time to re-download all of the packages (depending on the packages you are using).
If you prefer to store all of the sources in a zip file, you can use packrat::snapshot() to pull down the sources / update the packrat.lock and then packrat::bundle() to "bundle" everything up. The aim for this is to make projects / research reproducible and portable over time by storing the package versions and dependencies used on the original design (along with the source code so that the OS dependency on a binary is avoided).
There is much more information on the website I linked to, and you can see current activity on the git repo. I have encountered a few cases that work in a less-than-ideal way (packages not on CRAN have some issues at times), but the git repo still seems to be pretty active with issues/patches which is encouraging.

What's a good strategy for saving old R package versions on GitHub?

The development of RStudio and the packages devtools and roxygen2 has made R package creation pretty easy. I use GitHub for version control and devtools allows others to easily install directly from my account.
As my package gradually changes with each version, I'm wondering if I should be maintaining .zip files (or other format) of my past stable builds, in case anyone would ever want to use a previous version.
It's easy to download a .zip of an R package directly from GitHub, but I'm wondering if I should add this to the same GitHub directory (e.g. https://github.com/myaccount/mypackage/previous_versions/mypackage_0.1.zip) without messing up somebody's installation via install_github("myaccount/mypackage").
So, the main Qs are:
Should I keep an old package version at all?
Should I keep old package versions in a sub-folder of my GitHub R package directory?
Should I save .zip files downloaded from GitHub as my old version, or produce a Source or Binary file during the package build itself (i.e. in RStudio)?
Is this a superfluous activity if one isn't yet willing to publish to CRAN?!
When you think your package is at a good solid place, you should tag a release. This archives the branch at that point in time and stores the zip file with the source code, and the tar.gz file.
I tend to mark my CRAN packages as a release each time I release it to CRAN (for example, see https://github.com/nutterb/pixiedust/releases) and with some intemediary tags that I consider noteworthy.
Another good strategy for managing changes in between tagged releases is to maintain a development branch below your main branch. That way your development changes won't pollute or break anything being used by those pulling from your main branch. It makes you free to experiment in the dev branch while always having a clean, working copy to push to and restore from.
1. Should I keep an old package version at all?
It's subjective, but I'd definitely say "yes" unless there's a space constraint, which is probably unlikely.
This serves 2 purposes. One is for your own convenience, such as if you want to make sure that you always have a quick way to test the results of older versions versus a newer version.
The other is that people often need older versions of packages, such as if someone wants to use your package but they're using an older version of R on a server where the policies prevent an update to R. Perhaps a newer version of your package includes a new dependency which only works with a package that depends on a certain version of R or higher.
Of course, packages can always be installed without the compressed or binary files, but it's a nice convenience.
2. Should I keep old package versions in a sub-folder of my GitHub R package directory?
I would put it in a trunk or special subfolder that won't be automatically downloaded when someone tries to install_github or clone your master branch. Having a separate branch is a good idea.
3. Should I save .zip files downloaded from GitHub as my old version, or produce a Source or Binary file during the package build itself (i.e. in RStudio)?
As the package author you're in a position to know if these differ significantly and which if either is better, but by default I'd recommend the RStudio build because I assume (if you're like me) that you're less likely to include unnecessary files this way.
4. Is this a superfluous activity if one isn't yet willing to publish to CRAN?!
No, not necessarily. If people rely on your package then it really doesn't matter if it's on CRAN or not. In fact, not being on CRAN may be a reason to be more proactive like this to ensure that your users will always have access to the needed version of your package.

Is there a quick way to debug an external meteor package?

You just installed a meteor package, and for some reason it isn't working. You suspect that it's the package itself that has a bug. You want to investigate that. How do you do that?
Optimally, you'd be able to run a command that forks the original package repository with the right version and replaces the original in your meteor application, ready for you to debug it and, once fixed, possibly generate a pull request.
I don't expect something like this to exist as a single command, but is there a workflow that you follow to do exactly that? Or do you approach the problem in a different way?
Do a git clone of the package into your local packages folder. Fix any bugs you need to. Commit them. And make a pull request. Once the pull request is accepted, you can remove the local package and use the regular package.
From when I've asked in the past, there isn't really an easier way to do this it seems. But to be honest, this approach isn't too much work.
Also, if you just want to debug, you can step through the package code while it's running without cloning the repo locally. (Assuming it's running in development mode and hasn't been minified by Meteor).

Disabling the default library in R

The default R library, .Library, is normally not writeable under Windows.
You need to run R as Administrator. For new packages you can set and use a personal library, but this doesn't work when updating packages in the base setup (e.g. by update.packages()).
If you forget (or don't know you need) to run as Administrator, you get duplicate versions of the same packages, messing the installation.
I think one solution could be copying all packages to a personal library and disabling the default one. I know how to add a new library path to R, i.e. .libPaths ("my/path"), but how to remove the default library from .libPaths ()?
Update for non-Windows users
Some clarifications might help mostly non-Windows R users to understand the mentioned problem.
In Windows "Log on as Administrator" (or better as a user belonging to administrators' group) and "Run as Administrator" are quite different things.
In the former case you just give your credentials at logon, much like in Linux; in the latter you are already logged as a "superuser", but in order to carry out a potentially dangerous action, you have to ask an ad hoc permission to Windows (proving that it's you and not a malware acting).
That's said, programs (and developers), before accessing known Windows' protected objects (i.e. C:\Program Files folder), ask permission to the user to avoid being blocked by the OS.
Even when they don't ask (because they assume the knowledgeable user should give this permission in advance), failure to access is normally reported like "Permission denied to access to folder etc.".
As for R version 3.0.2, update.packages() involves one of the situations, where an elevated permission request should be triggered, because this might involve writing to protected program folders. Unfortunately R doesn't ask and cannot update the directory with old packages.
What about the second safe net: user notifications? While install.packages() gives messages like:
stop ... "'lib' element %s is not a writable directory" ...
and you get the idea of a permission problem, with others functions, such as update.packages(), you get:
warning ... "package '%s' in library '%s' will not be updated"
whose causes can be everything.
Can this scenario be even worse? Yes. Besides not asking for permission to write to "Program Folders", besides not issuing a notification of the permission error, update.packages(), when unable to update packages in protected folders, actually installs them to the personal user folder, but without notifying this. This is similar to what install.packages() does, except that the latter notifies and asks permission to do this.
So you end up with two versions of the same packages in different folders! Your calculations will be therefore dependent on library priorities.
Can this scenario be even worse? Yes. You are clever (or Google) enough to understand that you need to "Run as Administrator", when you want to update packages. You restart R as Administrator and hope this will fix everything. Not at all. R sees the updated packages in the personal library and does not act. So you remain with two versions of the same packages.
To solve this you have to detect duplicate packages and remove them manually, then restart R as administrator and update again (or write a script to do this).
Clearly the solution would be R conforming to Windows apps expected behaviour, or at least do nothing when prevented to act (instead of taking non-notified decisions).
In the meantime I think that totally disabling the default library (located in a protected area) would be a temporary workaround.
A final note. Packages and package updating are crucial for using R, so my humble opinion is that the topic should deserve specific careful attention even for less GNU-blessed systems like Windows.
One solution is to change R_LIBS environment variable. You can see for example this question.
But If you don't have admin rights, you can specify location when you load the package:
library(my_package, lib.loc="my/path")

Resources