Practical Use of Artifactory Repositories - artifactory

In a near future I will start using Artifactory in my project. I have been reading about local and remote repositories and I am a bit confused of their practical use. In general as far as I understand
Local repositories are for pushing and pulling artifacts. They have no connection to a remote repository (i.e. npm repo at https://www.npmjs.com/)
Remote repositories are for pulling and caching artifacts on demand. It works only one way, it is not possible to push artifacts.
If I am right up to this point, then practically it means you only need a remote repository for npm if you do not develop npm modules but only use them to build your application. In contrast, if you need to both pull and push Docker container images, you need to have one local repository for pushing&pulling custom images and one remote repository for pulling official images.
Question #1
I am confused because our Artifactory admin created a local npm repository for our project. When I discussed the topic with him he told me that I need to first get packages from the internet to my PC and push them to Artifactory server. This does not make any sense to me because I have seen some remote repositories on the same server and what we need is only to pull packages from npm. Is there a point that I miss?
Question #2
Are artifacts at remote repository cache saved until intentionally deleted? Is there a default retention policy (i.e. delete packages older than 6 months)? I ask this because it is important to keep packages until a meteor hits the servers (for archiving policy of the company).
Question #3
We will need to get official Docker images and customize them for CI. It would be a bit hard to maintain one local repo for pulling&pushing custom images and one remote repo for pulling official images. Let's say I need to pull official Ubuntu latest, modify it, push and finally pull the custom image back. In this case it should be pulled using remote repository, pushed to local repo and pulled again from local repo. Is it possible to use virtual repositories to do this seamlessly as one repo?

Question #1 This does not make any sense to me because I have seen some remote repositories on the same server and what we need is only to pull packages from npm. Is there a point that I miss?
Generally, you would want to use a remote repository for this. You would then point your client to this remote repository and JFrog Artifactory would grab them from the remote site and cache them locally, as needed.
In some very secure environments, corporate policies do not even allow this (they may not even be connected to the internet) and instead manually download, vet, and then upload those third-party libraries to a local repository. I don't think that is your case and they may just not understand their intended usages.
Question #2 Are artifacts at remote repository cache saved until intentionally deleted? Is there a default retention policy?
They will not be deleted unless you actively configure it to do so.
For some repo types there are built-in retention mechanisms like the number of snapshots or maximum tags but not for all of them and even in those that have it, they must be actively turned on. Different organizations have different policies for how long artifacts must be maintained. There are a lot of ways to cleanup those old artifacts but ultimately it will depend on your own requirements.
Question #3 Is it possible to use virtual repositories to do this seamlessly as one repo?
A virtual repository will let you aggregate your local and remote sites and appear as a single source. So you can do something like:
docker pull myarturl/docker/someimage:sometag
... docker build ...
docker push myarturl/docker/someimage:sometag-my-modified-version
docker pull myarturl/docker/someimage:sometag-my-modified-version
It is also security-aware so if the user only has access to the local stuff and not the remote stuff, they will only be able to access the local stuff even though they are using the virtual repository that contains both of them.
That said, I don't see why it would be any harder to explicitly use different repositories:
docker pull myarturl/docker-remote/someimage:sometag
... docker build ...
docker push myarturl/docker-local/someimage:sometag-my-modified-version
docker pull myarturl/docker-local/someimage:sometag-my-modified-version
This also has the added advantage that you know they can only pull your modified version of the image and not the remote (though you can also accomplish that by creating the correct permissions).

Related

Upgrading Artifactory setup with Remote Repositories

I have an artifactory server, with a bunch of remote repositories.
We are planning to upgrade from 5.11.0 to 5.11.6 to take advantage of a security patch in that version.
Questions are:
do all repositories need to be on exactly the same version?
is there anything else i need to think about when upgrading multiple connected repositories (there is nothing specific about this in the manual)
do i need to do a system-level export just on the primary server? or should i be doing it on all of the remote repository servers
Lastly, our repositories are huge... a full System Export to backup will take too long...
is it enough to just take the config files/dirs
do i get just the config files/dirs by hitting "Exclude Content"
If you have an Artifactory instance that points to other Artifactory instances via smart remote repositories, then you will not have to upgrade all of the instances as they will be able to communicate with each other even if they are not on the same version. With that said, it is always recommended to use the latest version of Artifactory (for all of your instances) in order to enjoy all the latest features and bug fixes and best compatibility between instances. You may find further information about the upgrade process in this wiki page.
In addition, it is also always recommended to keep backups of your Artifactory instance, especially when attempting an upgrade. You may use the built-in backup mechanism or you may manually backup your filestore (by default located in $ARTIFACTORY_HOME/data/filestore) and take DataBase snapshots.
What do you mean by
do all repositories need to be on exactly the same version?
Are you asking about Artifactory instances? Artifactory HA nodes?
Regarding the full system export:
https://www.jfrog.com/confluence/display/RTF/Managing+Backups
https://jfrog.com/knowledge-base/how-should-we-backup-our-data-when-we-have-1tb-of-files/
For more info, you might want to contact JFrog's support.

RPM Remote Repository - Package does not match intended download

We're making use of a remote repository and are storing artifacts locally. However, we are running into a problem because of the fact the remote repository regularly rebuilds all artifacts that it hosts. In our current state, we update metadata (e.x. repodata/repomd.xml), but artifacts are not updated.
We have to continually clear our local remote-repository-cache out in order to allow it to download the rebuilt artifacts.
Is there any way we can configure artifactory to allow it to recache new artifacts as well as the new artifact metadata?
In our current state, the error we regularly run into is
https://artifactory/artifactory/remote-repo/some/path/package.rpm:
[Errno -1] Package does not match intended download.
Suggestion: run yum --enablerepo=artifactory-newrelic_infra-agent clean metadata
Unfortunately, there is no good answer to that. Artifacts under a version should be immutable; it's dependency management 101.
I'd put as much effort as possible to convince the team producing the artifacts to stop overriding versions. It's true that it might be sometimes cumbersome to change versions of dependencies in metadata, but there are ways around it (like resolving the latest patch during development, as supported in the semver spec), and in any way, that's not a good excuse.
If that's not possible, I'd look into enabling direct repository-to-client streaming (i.e. disabling artifact caching) to prevent the problem of stale artifacts.
Another solution might be cleaning up the cache using a user plugin or a script using JFrog CLI once you learn about newer artifacts being published in the remote repository.

Connect one Artifactory to another Artifactory

Our setup includes a company wide Artifactory that holds in-house-built artifacts as well as goes out and fetches publicly available artifacts. I’m trying to setup a local Artifactory at our location that would fetch publicly available artifacts through the regular internet, but would connect to the company wide Artifactory for our in-house-built artifacts. Is this possible?
In my local Artifactory setup, I put the company wide Artifactory URL as a Remote Repository. I can hit the Test button and it tells me that it successfully connected. However, when I go to download an artifact it does not work. I would like to say that publicly available artifacts can be fetched through my local Artifactory, so at least I can get to jcenter.bintray.
Can one Artifactory be connected to another Artifactory? If yes, is there a way to test if this connection works
I don’t think we would be using all the contents of the company wide Artifactory, so I don’t want to do an export and import to the local or do replication. I would prefer if we could fetch on demand. Is this possible?
Edit: Thanks to #DarthFennec pointing me to Smart Remote Repositories I have solved my problem. To others who have the same problem
Please follow the steps mentioned on the previously mentioned page to set up the Smart Remote Repository. In my case Artifactory did not detect that the remote was another instance of Artifactory and did not give me any options to set, but I was not interested in these anyway.
Note You can always click the Test button to make sure that your connection to the Remote Repository works.
Next, go to the Admin -> Virtual Repositories select your Repository Key and select your Smart Repository from the Available Repositories so that it moves into the Selected Repositories. Click Save & Finish at the bottom and you should be good to go.
I'm not sure exactly what your problem ended up being, but if you want to remote one Artifactory repository from another, it should be a smart remote repository. This is when Artifactory detects that a remote is pointing at another Artifactory, and it enables a number of extra features, like download statistics, property replication, and remote browsing.
An important thing to keep in mind when configuring a smart remote repository is that depending on the package type, you might need to point the remote at <artifactory>/api/<type>/<repo>, rather than just <artifactory>/<repo>. This is the case for Bower, Chef, CocoaPods, Docker, Go, NuGet, Npm, Php Composer, Puppet, Pypi, RubyGems, and Vagrant repositories. Other repository types should use the standard <artifactory>/<repo> URL.

Can Artifactory age artifacts to S3?

We have an Artifactory solution deployed and I am trying to figure out if it can meet my use case. The normal use case is that artifacts are deleted within a week or so and can normally fit in X GB of local storage, but we'd like to be able to:
Keep some artifacts around much longer, and since they are accessed infrequently, store them in AWS S3.
Sometimes artifacts aren't able to be cleaned up in time, so we'd like to burst to the cloud when local storage is overflowed.
I was thinking I could do the following:
Local repository of X GB
Repo pointing to S3
Virtual repo in front of both of these
Setup a plugin to move artifacts from local->S3 via our policies
However, I can't figure out what a Filestore is in Artifactory, and how you'd have two Repositories backed by different filestores.
Anyone have pointers to documentation or anything that can help? The docs I can find are rather slim on the high level details of filestores and repositories.
The Artifactory binary provider does not support configuring multiple storage backends, so it is impossible to use S3 and NFS in parallel. The main reason for this limitation is that Artifactory has a checksum based storage which stores each binary only once and keeps pointers from all relevant repositories. For that reason Artifactory does not manage separate storage per repository.
For archiving purposes, one of the possible solutions is setting up another Artifactory instance which will take care of archiving. This instance can be connected to an S3 storage backend.
You can use replication to synchronize between the two instances (without syncing deletes). You can have a repository(s) in your master Artifactory which contains artifacts which should be archived, those artifacts will be replicated to the archive Artifactory and later on can be deleted from the master.
You can use a user plugin to decide which artifacts should be moved to the archive repository.

How to sync 2 remotes using Phabricator Diffusion

I have to use 2 remotes for my repositories. For eg.
One is my local git server (gitblit)
One is Github/bitbucket
Additionally, I have to use Phabricator to manage all this. So the workflow i am thinking is:
I push the changes to my local git server, and my friends push to github. Phabricator Observe the changes from local git server + Github and sync it with the other remote changes. I have tried Mirror option, but it deleted the changes from one of remote, because that's what mirror is supposed to do.
So I need to know a way which i can use to sync these 2 remotes using Phabricator.
Apart from creating a (read-only, as you discovered) mirror, Phabricator doesn't really have any ability to push to other servers. It assumes one of the following workflows:
Phabricator is the master copy of the repository - everyone pushes to Phabricator (Phabricator can push to mirrors in this scenario).
Some other server is the master copy of the repository - Phabricator will monitor the remote master and keep a read-only copy of the repository locally.
It might be possible to implement a respository merging task in Harbormaster, but you'll have to be prepared for frequent manual intervention in any workflow that has users pushing to different repositories and expecting automation to sync them together. Probably this syncing task would be easier if you were to get rid of the gitblit server from the equation, and just use Phabricator locally.

Resources