Artifactory: Package management strategy to support n-2 releases

I am new to Artifactory and trying to figure out the best strategy for my company's needs. We've been using an in-house package management system so far and want to move to a more industry-standard solution.
Our situation:
We have on-prem deployed software. Since each customer has their own story and strategy for software updates, we need to support more than just the latest release (say, the last two releases).
We have 40+ git repos that together form a single installer. Some of those repos produce npm or NuGet packages, and others consume them (and sometimes produce their own NuGet/npm packages in turn).
Every release gets its own branch and CI pipeline, so that updating a package in the release-1.1 pipeline cannot accidentally leak to consumers of that package in release-1.0. This applies across all 40+ git repos.
A new release branch/CI pipeline is spawned about twice a year.
I see that Artifactory supports multiple repositories. Its best-practices document on structuring and naming repositories, https://jfrog.com/whitepaper/best-practices-structuring-naming-artifactory-repositories/, suggests using a separator for maturity, such as dev vs. prod.
To apply this to our situation, one idea is to treat each release as a maturity level, so we would have dev, release-1.0, release-1.1, etc. Artifactory repositories, each tied to its own branch. This would work, but it requires more automation on the Artifactory side. I can see creating separate Artifactory repositories to manage permissions, but creating new repositories just to filter packages feels like overkill for us; there is no permission difference between releases.
Alternatively, we could go with a single Artifactory repository and label each package with its release. For example, the CI pipeline for release 1.0 would publish packages with the label release-1.0. With tooling like GitVersion guaranteeing that each CI pipeline produces a unique version number, this would provide a nice filtering/grouping mechanism for all the packages without the burden of per-release repositories, if only nuget update or npm update could update package versions with label-based filtering.
The JFrog CLI provides a way to download files from a given Artifactory repository filtered by label. To build one git repo's package, I could download all the packages produced by the other repos using label filtering, and then run nuget update against a local folder (roughly like the command below). It doesn't sound ideal.
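A sketch of what that download step might look like; the repository name, local folder, and property key/value are assumptions for illustration:

```bash
# Download every package carrying the release-1.0 property into a local
# folder that nuget can use as an offline feed (names are placeholders).
jfrog rt dl "nuget-local/" ./release-1.0-packages/ \
  --props "release=release-1.0" \
  --flat
```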
I am surprised that nuget and npm don't already have an update-with-label-filtering feature. They support labels, but only for searching. The only approach I can think of is a custom script that walks each package reference in packages.config (for NuGet) or package.json (for npm), queries the JFrog CLI (or REST API) for the latest version of the package carrying the label, and updates them one by one (roughly like the sketch below). It would work, but I wonder whether this is the right solution, since it involves a fair amount of custom work.
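Such a script might look something like this; the repository, property, feed URL, and package ids are hypothetical, and the version parsing assumes packages are named <Id>.<version>.nupkg:

```bash
#!/usr/bin/env bash
# Hypothetical sketch: for each package id, ask Artifactory for the newest
# artifact carrying the release label, then update that reference via nuget.
set -euo pipefail

REPO="nuget-local"                   # assumed repository name
LABEL="release=release-1.0"          # assumed property set by the CI pipeline
FEED="https://artifactory.my-server.com/api/nuget/nuget-local"

for PKG in My.Core My.Utils; do      # example package ids we depend on
  # Newest matching artifact; sort by creation date and keep only one result.
  LATEST=$(jfrog rt s "$REPO/*${PKG}.*.nupkg" --props "$LABEL" \
             --sort-by=created --sort-order=desc --limit=1 | jq -r '.[0].path')
  if [ -z "$LATEST" ] || [ "$LATEST" = "null" ]; then
    echo "no package found for $PKG with $LABEL" >&2
    continue
  fi

  # Extract "1.2.3" from something like "nuget-local/.../My.Core.1.2.3.nupkg".
  VERSION=$(basename "$LATEST" .nupkg | sed "s/^${PKG}\.//")

  nuget update packages.config -Id "$PKG" -Version "$VERSION" -Source "$FEED"
done
```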
Any advice from package management gurus is much appreciated.

My problem turned out to be resolvable by using a path, as described in this article (see baseRev):
https://www.jfrog.com/confluence/display/JFROG/Repository+Layouts
It was important to recognize that our releases are not maturity levels (as in dev vs. prod) but long-living branches. The difference is that a long-living branch's artifacts are compiled again, whereas prod artifacts are promoted from dev artifacts as-is. So when I tried to solve a long-living-branch problem by applying the maturity practice, it created awkward flows here and there.
Each long-living branch's set of 40+ repos provides and consumes its own NuGet packages internally. To address this without creating new repositories for each release, we can use a path within the local repository, such as artifactory.my-server.com/api/nuget/nuget-local/master/* vs. artifactory.my-server.com/api/nuget/nuget-local/release-1/*.
Unfortunately, you can use a path for push, but not for install.
So on the consumption side, you need to create one virtual repository per release, which is not too big a deal for us. Roughly, it looks like this:
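A sketch with assumed repository and package names (the per-release virtual repo would aggregate nuget-local with an include pattern like release-1/**):

```bash
# Publish: upload the package into the branch-specific path of the local repo.
jfrog rt u MyPackage.1.2.3.nupkg nuget-local/release-1/

# Consume: register the per-release virtual repo (assumed name) as the NuGet
# source used by that release's pipelines.
nuget sources Add -Name artifactory-release-1 \
  -Source https://artifactory.my-server.com/api/nuget/nuget-release-1-virtual
```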

Related

Best practices for developing own custom operators in Airflow 2.0

We are currently in the process of developing custom operators and sensors for our Airflow (>2.0.1) on Cloud Composer. We use the official Docker image for testing/developing.
As of Airflow 2.0, the recommended way is not to put them in the plugins directory of Airflow but to build them as a separate Python package. This approach, however, seems quite complicated when developing DAGs and testing them on Airflow in Docker.
To use Airflow's recommended approach we would use two separate repos for our DAGs and the operators/sensors; we would then mount the custom operators/sensors package into Docker to quickly test it there and edit it on the local machine. For further use on Composer we would need to publish our package to our private PyPI repo and install it on Cloud Composer.
The old approach, however, of putting everything in the local plugins folder is quite straightforward and doesn't have these problems.
Based on your experience, what is your recommended way of developing and testing custom operators/sensors?
You can put the "common" code (custom operators and such) in the dags folder and exclude it from being processed by the scheduler via an .airflowignore file. This allows for rather quick iteration when developing.
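For instance, if the shared code lives in a common/ subfolder of the dags folder (the folder name is just an example):

```bash
# Keep shared operators/sensors next to the DAGs, but tell the Airflow
# scheduler not to parse that folder for DAG definitions.
mkdir -p dags/common
cat > dags/.airflowignore <<'EOF'
common
EOF
```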
You can still keep the DAGs and the "common" code in separate repositories to make things easier. You can use a "submodule" pattern for that: add the "common" repo as a submodule of the DAG repo, so you can check them out together. You can even keep different DAG directories (for different teams) with different versions of the common packages this way, simply by pointing the submodule at different versions.
I think the "package" pattern is more of a production deployment thing than a development one. Once you have developed the common code locally, it is worth bundling it into a common package and versioning it accordingly (the same as any other Python package). Then you can test it, release it, version it, and so on.
In "development" mode you can check out the code with a recursive submodule update and add the "common" subdirectory to PYTHONPATH (see the sketch below). In production, even if you use git-sync, your ops team could deploy your custom operators via a custom image (installing the appropriate released version of your package), while your DAGs would be git-synced separately WITHOUT the submodule checkout. The submodule would only be used for development.
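A rough sketch of that development setup, with made-up repository URLs and paths:

```bash
# One-time: link the shared-code repo into the DAG repo as a submodule.
git submodule add https://example.com/acme/airflow-common.git common

# Day-to-day: check out DAGs and common code together...
git clone --recurse-submodules https://example.com/acme/airflow-dags.git
cd airflow-dags

# ...and make the common package importable without installing it.
export PYTHONPATH="$PWD/common:$PYTHONPATH"
```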
It would also be worth running CI/CD on the DAGs you push to your DAG repo, to check that they keep working with the "released" custom code on the "stable" branch, while running the same CI/CD with the common code synced via the submodule on the "development" branch (this way you can check the latest development DAG code against the linked common code).
This is what I'd do. It allows quick iteration during development while also turning the common code into "freezable" artifacts that provide a stable environment in production; your DAGs can still be developed and evolve quickly, and CI/CD helps keep the "stable" things really stable.

How can R script determine it's version, from a GitHub release tag?

I have an R project that generates some Solr and RDF output. The project is in a GitHub repo, and I have a pre-release tagged 1.0.0.
My team has decided that any knowledge artifacts we create should have an internal indication of the version of the software that created them.
Currently, I manually enter the release tag into a JSON configuration file after manually/interactively making the release on the GitHub website.
Could I either
automatically enter the release number into the config file when the release is built
automatically determine the release version when running the R scripts
And is either of those approaches a good idea? A user could theoretically make local changes to the R code and get out of sync with the cited release, right?
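For what it's worth, the first option can be a small step in the release automation. A sketch, assuming a GitHub Actions release workflow (where the tag is exposed as GITHUB_REF_NAME) and a made-up config file and key; the second command is an alternative for deriving the version at run time from a git checkout:

```bash
# Release-time: write the tag that triggered the build into the JSON config
# the R scripts read (file name and key are hypothetical).
jq --arg v "$GITHUB_REF_NAME" '.software_version = $v' config.json > config.tmp \
  && mv config.tmp config.json

# Run-time alternative: derive the version from the nearest tag; --dirty flags
# local modifications, which addresses the out-of-sync concern above.
git describe --tags --dirty
```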

RPM Remote Repository - Package does not match intended download

We're making use of a remote repository and are caching artifacts locally. However, we are running into a problem because the upstream repository regularly rebuilds all of the artifacts it hosts. In our current state, the metadata (e.g. repodata/repomd.xml) gets updated, but the artifacts themselves are not.
We have to continually clear out our local remote-repository cache in order to let it download the rebuilt artifacts.
Is there any way to configure Artifactory so that it re-caches the rebuilt artifacts as well as the new artifact metadata?
In our current state, the error we regularly run into is
https://artifactory/artifactory/remote-repo/some/path/package.rpm:
[Errno -1] Package does not match intended download.
Suggestion: run yum --enablerepo=artifactory-newrelic_infra-agent clean metadata
Unfortunately, there is no good answer to that. Artifacts under a version should be immutable; it's dependency management 101.
I'd put as much effort as possible into convincing the team producing the artifacts to stop overwriting versions. It's true that changing dependency versions in metadata can sometimes be cumbersome, but there are ways around it (like resolving the latest patch during development, as supported by the semver spec), and in any case, that's not a good excuse.
If that's not possible, I'd look into enabling direct repository-to-client streaming (i.e. disabling artifact caching) to prevent the problem of stale artifacts.
Another solution might be to clean up the cache with a user plugin or a JFrog CLI script whenever you learn that newer artifacts have been published in the remote repository.
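For example, with the JFrog CLI (the cached copies of a remote repository live in the <repo-key>-cache repository; the path below is taken from the error message and would need adjusting):

```bash
# Remove the stale cached copies so the next request re-fetches them from
# the upstream repository, then refresh yum's local metadata.
jfrog rt del "remote-repo-cache/some/path/" --quiet
yum --enablerepo=artifactory-newrelic_infra-agent clean metadata
```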

Guide to Repository Layout and Build Promotion with Artifactory

Usual story: new to Artifactory and looking for a jump start.
Can anyone point me at a decent how-to post for using Artifactory (free version) with Jenkins in a deployment pipeline?
I figure I'm likely to want to:
set up several repos for dev through production (any standards for this?)
have Jenkins publish artifacts to the first repo using the Artifactory plugin, limiting the number of builds kept in Artifactory
promote builds from one repo to the next as they are released to the next environment, again deleting older builds
I just need a good guide/example to get started...
Did you check the User Guide? It covers all your questions. Here are the relevant sections:
Creating repositories (re the standards - the preconfigured repos reflect the standards).
Working with Jenkins
Jenkins Release Management
Keep in mind that all the promotion functionality is available in the Pro version only. That includes both the simple move operation (not recommended for promotion pipelines) and the extremely powerful release management based on build integration with the Jenkins Artifactory plugin.
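For reference, the move-based promotion boils down to something like this with the JFrog CLI (repository names and layout are made up, and, as noted above, the move operation requires Pro):

```bash
# Move a build's artifacts from the dev repo to the staging repo.
jfrog rt mv "libs-dev-local/myapp/1.0.0/" libs-staging-local/myapp/1.0.0/
```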

What is the point of Artifactory or Nexus, and how might I use them?

While investigating CI tools, I've found that many CI installations also integrate with artifact repositories like Sonatype Nexus and JFrog Artifactory.
Those tools sound tightly integrated with Maven. We do not use Maven, nor do we even compile Java. We compile C++ using Qt/qmake/make, and this build works really well for us. We are still investigating CI tools.
What is the point of using an Artifact repository?
Is archiving to Nexus or Artifactory (or Archiva) supposed to be a step in our make chain, or part of the CI chain, or could it be either?
How might I make our "make" builds or perl/bash/batch scripts interact with them?
An artifact repository serves several purposes. The main purpose is to keep a copy of Maven Central (or any other Maven repo) for faster download times and so that you can use Maven even if the internet is down. Since you're not using Maven, this is irrelevant for you.
The second purpose is to store files that you want to use as dependencies but cannot download freely from the internet. So you buy them or get them from your vendors and put them in your repo. This is also more applicable to Maven users and their dependency mechanism.
The third important purpose is to have a central place where you can store your releases. If you build a release v1.0 you can upload it to such a repository, and with the clean naming conventions borrowed from Maven it's easy to find v1.0 and use it with other tools. For example, you could write a script that downloads your release with wget and installs it on a host.
Most of the time these repos support some kind of staging process: you store v1.0 in a staging repo, someone tests it, and when it's fine they promote it to the release repo where everybody can find and use it.
It's simple to integrate them with Maven projects, and many other build tools and frameworks can connect to them easily, such as Ant/Ivy, Groovy Grape, and so on. Because of the naming scheme, there is also nothing stopping you from using bash or perl to download/upload files, for example:
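A sketch with nothing but curl; the server URL, repository name, and credentials are placeholders:

```bash
# Upload a release artifact to a path that encodes its name and version...
curl -u "$USER:$API_KEY" -T myapp-1.0.tar.gz \
  "https://artifactory.example.com/artifactory/releases-local/myapp/1.0/myapp-1.0.tar.gz"

# ...and later fetch it on any host with a plain HTTP GET.
curl -fL -o myapp-1.0.tar.gz \
  "https://artifactory.example.com/artifactory/releases-local/myapp/1.0/myapp-1.0.tar.gz"
```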
So if you have releases or files that should be shared between projects and you do not yet have a good solution for that, an artifact repository could be a good starting point to see how this could work.
As mentioned here:
Providing stable and reliable access to repositories
Supporting a large number of common binaries across different environments
Security and access control
Tracing any action done to a file back to the user
Transferring a large number of binaries to a remote location
Managing infrastructure configuration across different environments
