How to debug Artifactory slowness?

Installing PyPI or NPM packages from our company's Artifactory instance is about 5-10 times slower than installing from the public PyPI server or a simple NFS share. The network infrastructure is the same in all cases and seems fine.
Is Artifactory supposed to be slower (because of extra security checks or something)?
How can I debug and fix the slowness?

You should not be seeing such a difference between resolving from the public pypi/npm repositories and Artifactory.
There are a couple of things which may impact performance:
Location of your Artifactory server - if your Artifactory server is located outside your network (for example, if your client is on-prem and Artifactory is hosted in the cloud), make sure you have a good network connection to Artifactory.
In case you are self-hosting Artifactory - make sure that the resources provisioned to Artifactory meet the minimum system requirements. An overloaded database or slow storage can affect download speed. If your Artifactory is under heavy load, you can take a look at some tuning best practices.
In case Artifactory is using LDAP/SAML for authentication, any latency in the communication with those systems will add to the download time. More information about debugging LDAP issues can be found in the knowledge base.
The type of repository you are resolving from can have an effect: if you are resolving from a remote repository, the connection to the remote URL can affect the download speed. You can get useful debug information by using the trace capability. Try downloading an artifact directly (using a browser or curl) and add the trace parameter, for example: http://localhost:8081/artifactory/npm-local/drorb/craftyjs-npm-example/-/drorb/craftyjs-npm-example-1.0.0.tgz?trace
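For illustration, a minimal Python sketch of the same trace request; the URL is the placeholder from the example above, so substitute your own server, repository, and artifact path:

    # Fetch an artifact with Artifactory's ?trace parameter and print the
    # server-side trace, which breaks the request down into timed steps.
    import requests

    # Placeholder URL - substitute your own server, repository, and artifact.
    ARTIFACT_URL = (
        "http://localhost:8081/artifactory/npm-local/"
        "drorb/craftyjs-npm-example/-/drorb/craftyjs-npm-example-1.0.0.tgz"
    )

    # With ?trace appended, Artifactory returns a plain-text execution trace
    # instead of the artifact itself.
    response = requests.get(ARTIFACT_URL + "?trace")
    response.raise_for_status()
    print(response.text)

The trace output shows how much time was spent in each step of serving the request, which helps separate network latency from Artifactory-side processing.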

Related

Artifactory not syncing org tree with jcenter

I'm setting up a new Artifactory installation for the first time in my life. Downloaded the tar and extracted it OK. Got some firewall rules in place to allow https to jcenter.bintray.com. After an initial refresh I see loads of artifacts in the com tree that must come from jcenter, so all seems fine, but when I perform simple Maven tasks like mvn help:active-profiles I only get warnings and errors indicating that none of the relevant stuff is available from my Artifactory.
I have checked the firewall logs and found no outgoing traffic from my Artifactory server to anything that's not permitted. What have I missed? My Artifactory is OSS version 7.5.7 rev 705070900.
Artifactory remote repositories do not work as a mirror of the external repository they point at.
Remote Artifactory repositories proxy the external repository, which means that you have to actively request artifacts. When you request an artifact, Artifactory fetches it from the external repository and caches it. Further requests for a cached artifact are served from Artifactory without the need to go out to the external repository.
The list of artifacts you are seeing are ones which are available in the external repository. This feature is called remote browsing and is available for some of the package types supported by Artifactory.
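A minimal sketch of that lazy-caching behaviour, assuming a remote repository with key jcenter on a local Artifactory (URL and artifact path are placeholders): the first request is a cache miss that goes out to the external repository, the second is served from Artifactory's cache and should be noticeably faster.

    # Time the same download twice: cache miss, then cache hit.
    import time
    import requests

    # Placeholder base URL, repository key, and artifact path.
    ARTIFACT = ("http://localhost:8081/artifactory/jcenter/"
                "junit/junit/4.13/junit-4.13.jar")

    for attempt in ("first (cache miss)", "second (cache hit)"):
        start = time.monotonic()
        response = requests.get(ARTIFACT)
        response.raise_for_status()
        print(f"{attempt}: {time.monotonic() - start:.2f}s, "
              f"{len(response.content)} bytes")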
I found the issue, sort of. For reasons I now understand, I have plugin repositories. I added the true source for the plugins to my list of plugin repositories, and that solved the issue for me.

Upgrading Artifactory setup with Remote Repositories

I have an Artifactory server with a bunch of remote repositories.
We are planning to upgrade from 5.11.0 to 5.11.6 to take advantage of a security patch in that version.
Questions are:
Do all repositories need to be on exactly the same version?
Is there anything else I need to think about when upgrading multiple connected repositories? (There is nothing specific about this in the manual.)
Do I need to do a system-level export just on the primary server, or should I be doing it on all of the remote repository servers?
Lastly, our repositories are huge... a full System Export as a backup will take too long...
Is it enough to just take the config files/dirs?
Do I get just the config files/dirs by hitting "Exclude Content"?
If you have an Artifactory instance that points to other Artifactory instances via smart remote repositories, then you will not have to upgrade all of the instances, as they will be able to communicate with each other even if they are not on the same version. With that said, it is always recommended to use the latest version of Artifactory (for all of your instances) in order to enjoy the latest features and bug fixes and the best compatibility between instances. You may find further information about the upgrade process in this wiki page.
In addition, it is always recommended to keep backups of your Artifactory instance, especially when attempting an upgrade. You may use the built-in backup mechanism, or you may manually back up your filestore (by default located in $ARTIFACTORY_HOME/data/filestore) and take database snapshots.
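Regarding the "Exclude Content" question: a hedged sketch of triggering a content-free system export over REST, roughly the API equivalent of ticking that checkbox in the UI. The host, credentials, and export path are placeholders, and the payload fields are assumptions based on the Export System endpoint, so they may differ between versions:

    # Trigger a system export that skips binaries (configuration and
    # metadata only). Requires admin credentials.
    import requests

    ARTIFACTORY = "http://localhost:8081/artifactory"  # placeholder host
    AUTH = ("admin", "password")  # placeholder admin credentials

    payload = {
        "exportPath": "/var/backup/artifactory",  # path on the server itself
        "includeMetadata": True,
        "createArchive": False,
        "excludeContent": True,  # skip binaries, like "Exclude Content" in the UI
    }

    response = requests.post(f"{ARTIFACTORY}/api/export/system",
                             json=payload, auth=AUTH)
    response.raise_for_status()
    print(response.text)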
What do you mean by
do all repositories need to be on exactly the same version?
Are you asking about Artifactory instances? Artifactory HA nodes?
Regarding the full system export:
https://www.jfrog.com/confluence/display/RTF/Managing+Backups
https://jfrog.com/knowledge-base/how-should-we-backup-our-data-when-we-have-1tb-of-files/
For more info, you might want to contact JFrog's support.

RPM Remote Repository - Package does not match intended download

We're making use of a remote repository and are storing artifacts locally. However, we are running into a problem because the remote repository regularly rebuilds all artifacts that it hosts. In our current state, the metadata (e.g. repodata/repomd.xml) gets updated, but the artifacts are not.
We have to continually clear out our local remote-repository cache in order to allow it to download the rebuilt artifacts.
Is there any way we can configure Artifactory to allow it to re-cache new artifacts as well as the new artifact metadata?
In our current state, the error we regularly run into is:
https://artifactory/artifactory/remote-repo/some/path/package.rpm:
[Errno -1] Package does not match intended download.
Suggestion: run yum --enablerepo=artifactory-newrelic_infra-agent clean metadata
Unfortunately, there is no good answer to that. Artifacts under a version should be immutable; it's dependency management 101.
I'd put as much effort as possible into convincing the team producing the artifacts to stop overriding versions. It's true that it can sometimes be cumbersome to change the versions of dependencies in metadata, but there are ways around it (like resolving the latest patch version during development, as supported by the semver spec), and in any case, that's not a good excuse.
If that's not possible, I'd look into enabling direct repository-to-client streaming (i.e. disabling artifact caching) to prevent the problem of stale artifacts.
Another solution might be cleaning up the cache using a user plugin or a script using JFrog CLI once you learn about newer artifacts being published in the remote repository.
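For the cache-cleaning route, a hedged sketch of evicting a single stale artifact through the REST API; remote repository caches are addressable as "<repo-key>-cache", and the host, credentials, repository key, and path below are placeholders:

    # Delete one cached artifact so the next request re-fetches the
    # rebuilt RPM from the remote repository.
    import requests

    ARTIFACTORY = "https://artifactory/artifactory"  # placeholder host
    AUTH = ("admin", "password")  # placeholder credentials with delete permission

    repo_cache = "remote-repo-cache"  # "<repo-key>-cache" of the remote repo
    path = "some/path/package.rpm"

    response = requests.delete(f"{ARTIFACTORY}/{repo_cache}/{path}", auth=AUTH)
    response.raise_for_status()  # expect 204 No Content on success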

Dependency resolution against local Artifactory takes very long

I have an Artifactory Pro (without support) server installed in my local network.
One major use case for this Artifactory was to use it as a local cache for remote artifacts from e.g. the repo1 Maven repository or the Lightbend ivy2 repository. The hope was that I could speed up the resolution of dependencies hosted on repo1 by caching them on my local Artifactory.
I am pretty sure my development machine is configured correctly to exclusively resolve artifacts against my local Artifactory.
However, every once in a while (suspiciously close to the interval configured as Metadata Retrieval Cache Period (Sec) in the Advanced tab of the remote repository settings), the resolution of dependencies originally hosted on repo1 takes far longer than usual.
I suspect that at these times Artifactory refreshes the artifact metadata (pom, ivy.xml) of remote artifacts. But this takes far longer than I would expect; a simple pom or ivy.xml download should not take several seconds but rather a few milliseconds.
I am currently requesting root access to the server from OPs so I can attempt a tcpdump, which may take time...
So my question is:
Does anyone have an idea what might actually be happening that takes several seconds per dependency of a remote repository to refresh metadata files, or am I looking in the wrong direction?
Update
My Artifactory version is
Artifactory Professional 5.1.3 rev 50019
We had a similar issue, but with npm repos, where the metadata recalculation was taking quite some time. Eventually we found out that it was a bug in Artifactory that was resolved in version 6.1.0. It is worth checking the Artifactory JIRAs for any such bugs. Hope this helps!
Artifactory Jira Link
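Before digging through JIRA, it may be worth confirming the correlation with the cache period. A hedged sketch that polls a single pom through the local Artifactory and logs the latency of each probe; the URL, repository key, and interval are placeholders:

    # Poll one pom and log latency per request. If the spikes line up with
    # the configured "Metadata Retrieval Cache Period", the metadata
    # refresh is the likely culprit.
    import time
    import requests

    # Placeholder URL: a pom resolved through a remote repo named "repo1".
    POM = ("http://artifactory.local:8081/artifactory/repo1/"
           "junit/junit/4.12/junit-4.12.pom")
    INTERVAL = 60  # seconds between probes (placeholder)

    while True:
        start = time.monotonic()
        response = requests.get(POM)
        print(f"{time.strftime('%H:%M:%S')} status={response.status_code} "
              f"latency={time.monotonic() - start:.2f}s")
        time.sleep(INTERVAL)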

How do I use Artifactory to mirror linux distributions?

I configured external yum and apt repositories in Artifactory for CentOS, Debian and Ubuntu, and they seem to be working, but Artifactory does not cache/mirror them in advance. It seems that the artifacts are cached the first time they are requested, and I want to be sure that I pre-cache them.
I imagined that this would be done with the replication option, but it seems that this option requires an Artifactory server on the other side, which I obviously do not have, as these are just public HTTP mirrors, like:
http://mirror.bytemark.co.uk/centos/
http://ftp.uk.debian.org/debian/
http://mirror.bytemark.co.uk/ubuntu/
How do I perform the caching/mirroring?
All your observations and assumptions are correct.
Artifactory remote repositories are lazy proxies and download the artifacts only on demand.
Replication can pre-populate the caches, but it requires Artifactory instances on both sides (because of the checksum-based replication algorithm it uses).
If you're sure you want to pre-populate Artifactory with all the artifacts from those repositories (we usually don't see this demand as justified), the easiest way is to use a web crawler to build the list of all the packages and then issue a HEAD request for each of them via Artifactory.
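A minimal sketch of the second half of that approach, assuming you already have a list of package paths from the crawl; the base URL and repository key are placeholders:

    # Issue a HEAD request for each package path through the remote
    # repository so Artifactory fetches and caches the package.
    import requests

    ARTIFACTORY = "http://localhost:8081/artifactory"  # placeholder host
    REPO = "centos-remote"  # placeholder remote repository key

    # Paths collected by the crawler (placeholders).
    package_paths = [
        "7/os/x86_64/Packages/zlib-1.2.7-18.el7.x86_64.rpm",
        "7/os/x86_64/Packages/wget-1.14-18.el7.x86_64.rpm",
    ]

    for path in package_paths:
        response = requests.head(f"{ARTIFACTORY}/{REPO}/{path}")
        print(response.status_code, path)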
