How do I use Artifactory to mirror Linux distributions?

I configured external yum and apt repositories in Artifactory for CentOS, Debian and Ubuntu, and it seems to be working, but Artifactory does not cache/mirror them in advance. Artifacts appear to be cached only the first time they are requested, and I want to be sure that I pre-cache them.
I imagined this would be done with the replication option, but it seems that option requires an Artifactory server on the other side, which I obviously do not have, as these are just public HTTP mirrors, like:
http://mirror.bytemark.co.uk/centos/
http://ftp.uk.debian.org/debian/
http://mirror.bytemark.co.uk/ubuntu/
How do I perform the caching/mirroring?

All your observations and assumptions are correct.
Artifactory remote repositories are lazy proxies and download artifacts only on demand.
Replication can pre-populate the caches, but it requires Artifactory instances on both sides (because of the checksum-based replication algorithm it uses).
If you're sure you want to pre-populate Artifactory with all the artifacts from those repositories (we usually don't see this demand as justified), the easiest way is to use a web crawler to build a list of all the packages and then issue a HEAD request for each of them through Artifactory.
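For illustration, a minimal sketch of the second step (the host name, the centos-remote repository key, and packages.txt are assumptions, not part of your setup). Assuming packages.txt holds repository-relative package paths collected by the crawler, a loop of HEAD requests through the remote repository will pull each package into the cache:
# issue a HEAD request per package path so Artifactory fetches and caches it
while read -r path; do
  curl -sI "https://artifactory.example.com/artifactory/centos-remote/${path}" > /dev/null
done < packages.txt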

Related

Practical Use of Artifactory Repositories

In the near future I will start using Artifactory in my project. I have been reading about local and remote repositories and I am a bit confused about their practical use. In general, as far as I understand:
Local repositories are for pushing and pulling artifacts. They have no connection to a remote repository (e.g. the npm registry at https://www.npmjs.com/).
Remote repositories are for pulling and caching artifacts on demand. They work only one way; it is not possible to push artifacts to them.
If I am right up to this point, then practically it means you only need a remote repository for npm if you do not develop npm modules but only use them to build your application. In contrast, if you need to both pull and push Docker container images, you need one local repository for pushing and pulling custom images and one remote repository for pulling official images.
Question #1
I am confused because our Artifactory admin created a local npm repository for our project. When I discussed the topic with him he told me that I need to first get packages from the internet to my PC and push them to the Artifactory server. This does not make any sense to me because I have seen some remote repositories on the same server and what we need is only to pull packages from npm. Is there a point that I am missing?
Question #2
Are artifacts in the remote repository cache kept until intentionally deleted? Is there a default retention policy (e.g. delete packages older than 6 months)? I ask because it is important to keep packages until a meteor hits the servers (the company's archiving policy).
Question #3
We will need to get official Docker images and customize them for CI. It would be a bit hard to maintain one local repo for pulling and pushing custom images and one remote repo for pulling official images. Let's say I need to pull the official Ubuntu latest image, modify it, push it, and finally pull the custom image back. In this case it would be pulled via the remote repository, pushed to the local repo, and pulled again from the local repo. Is it possible to use virtual repositories to do this seamlessly as one repo?
Question #1 This does not make any sense to me because I have seen some remote repositories on the same server and what we need is only to pull packages from npm. Is there a point that I am missing?
Generally, you would want to use a remote repository for this. You would then point your client to this remote repository, and JFrog Artifactory would fetch the packages from the remote site and cache them locally, as needed.
In some very secure environments, corporate policies do not allow this (the servers may not even be connected to the internet); instead, teams manually download, vet, and then upload third-party libraries to a local repository. I don't think that is your case; your admin may just not understand the intended usage of each repository type.
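For example, a hedged sketch of the usual setup (the host and the npm-remote repository key are hypothetical): create a remote npm repository pointing at https://registry.npmjs.org and point your client at it; packages are fetched from npmjs.org on the first request and served from the cache afterwards.
npm config set registry https://artifactory.example.com/artifactory/api/npm/npm-remote/
npm install lodash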
Question #2 Are artifacts in the remote repository cache kept until intentionally deleted? Is there a default retention policy?
They will not be deleted unless you actively configure it to do so.
Some repository types have built-in retention mechanisms, such as a maximum number of snapshots or tags, but not all of them do, and even where such mechanisms exist they must be actively turned on. Different organizations have different policies for how long artifacts must be retained. There are many ways to clean up old artifacts, but ultimately it depends on your own requirements.
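As a rough sketch of what manual cleanup can look like (the host, credentials, and the npm-remote-cache path are assumptions about your layout), a cached artifact can be removed with the REST delete-item call, and a scheduled script can build on the same request:
curl -u admin:password -X DELETE "https://artifactory.example.com/artifactory/npm-remote-cache/lodash/-/lodash-4.17.21.tgz"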
Question #3 Is it possible to use virtual repositories to do this seamlessly as one repo?
A virtual repository lets you aggregate your local and remote repositories so they appear as a single source. So you can do something like:
docker pull myarturl/docker/someimage:sometag
... docker build ...
docker push myarturl/docker/someimage:sometag-my-modified-version
docker pull myarturl/docker/someimage:sometag-my-modified-version
It is also security-aware: if a user only has access to the local repository and not the remote one, they will only be able to access the local artifacts even though they are using the virtual repository that contains both.
That said, I don't see why it would be any harder to explicitly use different repositories:
docker pull myarturl/docker-remote/someimage:sometag
... docker build ...
docker push myarturl/docker-local/someimage:sometag-my-modified-version
docker pull myarturl/docker-local/someimage:sometag-my-modified-version
This also has the added advantage that you know consumers can only pull your modified version of the image and not the remote one (though you can also accomplish that by setting up the correct permissions).

Artifactory not syncing org tree with jcenter

I'm setting up a new Artifactory installation for the first time in my life. I downloaded the tar and extracted it OK, and got some firewall rules in place to allow HTTPS to jcenter.bintray.com. After an initial refresh I see loads of artifacts in the com tree that must come from jcenter, so all seems fine, but when I perform simple Maven tasks like mvn help:active-profiles I only get warnings and errors indicating that none of the relevant artifacts are available from my Artifactory.
I have checked the firewall logs and found no outgoing traffic from my Artifactory server to anything that is not permitted. What have I missed? My Artifactory is OSS version 7.5.7 rev 705070900.
Artifactory remote repositories do not work as a mirror of the external repository they point at.
Remote Artifactory repositories proxy the external repository, which means that you have to actively request artifacts. When an artifact is requested, Artifactory fetches it from the external repository and caches it inside Artifactory. Further requests for a cached artifact are served from Artifactory without the need to go out to the external repository.
The list of artifacts you are seeing consists of those available in the external repository. This feature is called remote browsing and is available for some of the package types supported by Artifactory.
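As a hedged way to check this (the Artifactory URL and the libs-release repository key are assumptions about your configuration), you can actively resolve a single artifact through Artifactory; if it succeeds, the artifact will also land in the remote repository cache:
mvn dependency:get -Dartifact=org.apache.commons:commons-lang3:3.12.0 -DremoteRepositories=https://artifactory.example.com/artifactory/libs-release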
I found the issue, sort of. For reasons I now understand I have plugin repositories. I added the true source for the plugins to my list of plugin repositories, and that solved the issue for me.

Upgrading Artifactory setup with Remote Repositories

I have an Artifactory server with a bunch of remote repositories.
We are planning to upgrade from 5.11.0 to 5.11.6 to take advantage of a security patch in that version.
Questions are:
do all repositories need to be on exactly the same version?
is there anything else I need to think about when upgrading multiple connected repositories? (there is nothing specific about this in the manual)
do I need to do a system-level export just on the primary server, or should I be doing it on all of the remote repository servers?
Lastly, our repositories are huge... a full System Export for backup will take too long...
is it enough to just take the config files/dirs?
do I get just the config files/dirs by ticking "Exclude Content"?
If you have an Artifactory instance that points to other Artifactory instances via smart remote repositories, then you will not have to upgrade all of the instances, as they will be able to communicate with each other even if they are not on the same version. With that said, it is always recommended to use the latest version of Artifactory (for all of your instances) in order to enjoy the latest features, bug fixes, and the best compatibility between instances. You may find further information about the upgrade process in this wiki page.
In addition, it is always recommended to keep backups of your Artifactory instance, especially when attempting an upgrade. You may use the built-in backup mechanism, or you may manually back up your filestore (by default located in $ARTIFACTORY_HOME/data/filestore) and take database snapshots.
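As a rough sketch of such a manual backup (the default $ARTIFACTORY_HOME layout and a PostgreSQL database are assumptions; your installation may differ):
# archive the filestore and dump the database before upgrading
tar -czf filestore-backup-$(date +%F).tar.gz "$ARTIFACTORY_HOME/data/filestore"
pg_dump -U artifactory artifactory > artifactory-db-$(date +%F).sql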
What do you mean by
do all repositories need to be on exactly the same version?
Are you asking about Artifactory instances? Artifactory HA nodes?
Regarding the full system export:
https://www.jfrog.com/confluence/display/RTF/Managing+Backups
https://jfrog.com/knowledge-base/how-should-we-backup-our-data-when-we-have-1tb-of-files/
For more info, you might want to contact JFrog's support.

RPM Remote Repository - Package does not match intended download

We're making use of a remote repository and are storing artifacts locally. However, we are running into a problem because the remote repository regularly rebuilds all the artifacts it hosts. In our current state, the metadata (e.g. repodata/repomd.xml) is updated, but the artifacts are not.
We have to continually clear out our local remote-repository cache in order to allow it to download the rebuilt artifacts.
Is there any way we can configure Artifactory to re-cache new artifacts as well as the new artifact metadata?
In our current state, the error we regularly run into is
https://artifactory/artifactory/remote-repo/some/path/package.rpm:
[Errno -1] Package does not match intended download.
Suggestion: run yum --enablerepo=artifactory-newrelic_infra-agent clean metadata
Unfortunately, there is no good answer to that. Artifacts under a version should be immutable; it's dependency management 101.
I'd put as much effort as possible into convincing the team producing the artifacts to stop overriding versions. It's true that it can sometimes be cumbersome to change dependency versions in the metadata, but there are ways around it (like resolving the latest patch during development, as supported by the semver spec), and in any case that's not a good excuse.
If that's not possible, I'd look into enabling direct repository-to-client streaming (i.e. disabling artifact caching) to prevent the problem of stale artifacts.
Another solution might be cleaning up the cache using a user plugin or a JFrog CLI script once you learn that newer artifacts have been published to the remote repository.
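For instance, a minimal sketch of such a cleanup step with JFrog CLI (the remote-repo-cache key and the path pattern are assumptions about your layout); deleting the stale cached RPMs forces the next request to fetch the rebuilt packages from the remote site:
jfrog rt del "remote-repo-cache/some/path/*.rpm" --quiet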

Setting up a local npm cache using Artifactory

I am using Artifactory to set up a local npm registry cache.
I did
npm config set registry https://example.com/artifactory/api/npm/npm-virtual/
and have jenkins run
npm install
Unfortunately, there doesn't seem to be any difference between using Artifactory and using the normal npm registry (npm install takes the same amount of time with both approaches).
Am I doing something wrong?
The difference, of course, is not in the install time. Most of the install time is spent on the network, so even if one of the solutions (a local registry or Artifactory) is faster than the other, the difference won't be noticeable.
Here's a short, but not complete list of benefits of Artifactory over the simple local registry:
Artifactory works with a very broad set of technologies, not only npm, allowing you to use a single tool for all your development and operational binaries (including Vagrant, Docker, and more).
Artifactory supports multiple repositories, allowing you to control access, visibility and build promotion pipelines on top of them. That's the correct way to manage binaries.
Artifactory is priced per server, not per user, allowing you to bring more people in the organization on board without additional cost.
I am with JFrog, the company behind Bintray and Artifactory; see my profile for details and links.
