Artifactory retention of artifacts with the Jenkins Artifactory plugin

We configure artifact retention on the build info:
def buildInfo = server.upload(uploadSpec)
buildInfo.retention maxDays: 5, doNotDiscardBuilds: Eval.me(env.param), deleteBuildArtifacts: true
The artifact is deployed to a snapshot repository (the uploadSpec points to the snapshot repository) and is then promoted to further repositories. Here is the question: if we turn on retention, does it affect the snapshot repository only, or will the artifacts in the repositories they were promoted to also be cleaned up after 5 days?

Airflow ECS-Operator not fetching CloudWatch Logs

I'm using Airflow's EcsOperator, with ECS tasks writing to CloudWatch.
Sometimes the Airflow log fetcher collects the logs from CloudWatch and sometimes it does not.
On the CloudWatch console, I always see the logs.
For tasks that take a long time, I usually see the log, or at least part of it.
Has anyone had the same issue with EcsOperator?
First, ECSOperator is deprecated and removed in provider version 5.0.0.
You should switch to EcsRunTaskOperator.
EcsRunTaskOperator has an awslogs_fetch_interval parameter, which controls the interval at which logs are fetched from ECS. The default is 30 seconds. If you want more frequent polling, set the parameter accordingly.
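A minimal sketch of how the interval could be set, assuming the parameter takes a timedelta. The helper name ecs_log_kwargs, the log group, the stream prefix, and the 5-second value are all hypothetical; only the awslogs_* parameter names come from the operator.

```python
from datetime import timedelta


def ecs_log_kwargs(log_group: str, stream_prefix: str, fetch_seconds: int = 5) -> dict:
    """Build the CloudWatch-related keyword arguments for EcsRunTaskOperator.

    awslogs_fetch_interval is passed as a timedelta; a smaller value means
    the operator polls CloudWatch more often than the 30-second default.
    """
    if fetch_seconds <= 0:
        raise ValueError("fetch_seconds must be positive")
    return {
        "awslogs_group": log_group,
        "awslogs_stream_prefix": stream_prefix,
        "awslogs_fetch_interval": timedelta(seconds=fetch_seconds),
    }


# Hypothetical values; replace them with your own log group and prefix.
kwargs = ecs_log_kwargs("/ecs/my-task", "ecs/my-container", fetch_seconds=5)
print(kwargs["awslogs_fetch_interval"])

# In a DAG file (assumes the Amazon provider is installed):
# from airflow.providers.amazon.aws.operators.ecs import EcsRunTaskOperator
# EcsRunTaskOperator(task_id="run", cluster="my-cluster",
#                    task_definition="my-task", overrides={}, **kwargs)
```

Keeping the log settings in one place makes it easy to try different intervals without touching the rest of the task definition.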
You didn't mention which provider version you are on, but this part of the code was refactored in version 5.0.0 (PR), so upgrading the Amazon provider might also resolve your issue.

Perform Git PULL using Java [JGIT] with user consent (Yes/No)

Requirement
Suppose we have already cloned a repository locally, and some files in the remote repository have since been updated. On application startup, I would like to find out whether the remote repository has updates that the local repository lacks. If updates are available, the application must notify the user and ask for consent to download them. If the user consents, the application has to pull those changes; otherwise it skips the download for this run/launch. The next time the user launches the application, the same check has to be performed again.
Problem Statement
As suggested in Git Remote Updates finder in Java, I fetched the remote branch and found the updates. This worked fine the first time. When the same steps were repeated, the FetchResult no longer contained the updates from the remote repository.
Already Tried
I tried git.reset().setMode(ResetType.HARD).call() and git.revert().call() when the user clicks NO, so that the next start of the application gets the same delta when it performs a git fetch.
PSEUDO Code
if(alreadyCloned){
    check if updates available in remote - as per the [suggestions in above link](https://stackoverflow.com/questions/69250170/git-remote-updates-finder-in-java)
    if(updatesAvailable){
        fetch user consent to download
        if(user permits to download updates) {
            perform git pull
        } else {
            clear local repository
        }
    } else {
        do nothing
    }
}
There seems to be a mismatch between your understanding and how Git works with respect to remote updates.
For each local branch (e.g. refs/heads/foo) there is usually a 'tracking branch' (e.g. refs/remotes/origin/foo). The fetch command updates the tracking branch. Hence, if the remote repo didn't change since the last fetch, subsequent fetches won't change the local repo, and their FetchResult reports nothing new.
To find out whether a branch has changed, you should first fetch the branch (which updates the tracking branch if there are changes) and then compare the local branch with its corresponding remote tracking branch (e.g. using BranchTrackingStatus).
Finally, if the commits of the remote should be applied ('user permits to download updates' in your program flow), then reset, rebase, or merge the local branch to/with the tracking branch.
Note that pulling a branch in Git is a composite operation that first fetches commits from the remote repo (which updates the remote tracking branch) and then rebases or merges the local branch with the tracking branch.
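The fetch-then-compare check can be sketched as follows. This is an illustration in Python that shells out to the git command line, not JGit code; the function names and the default branch and remote names are assumptions. The point is that the comparison between the local branch and its tracking branch gives the same answer on every launch until the user accepts the update, regardless of what the fetch itself reported.

```python
import subprocess
from typing import Tuple


def parse_ahead_behind(output: str) -> Tuple[int, int]:
    """Parse the output of `git rev-list --left-right --count local...remote`.

    The output is two integers separated by whitespace: commits only in the
    local branch (ahead) and commits only in the tracking branch (behind).
    """
    ahead, behind = output.split()
    return int(ahead), int(behind)


def remote_has_updates(repo_dir: str, branch: str = "master",
                       remote: str = "origin") -> bool:
    """Fetch, then report whether the tracking branch has commits
    that the local branch lacks."""
    subprocess.run(["git", "-C", repo_dir, "fetch", remote], check=True)
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-list", "--left-right", "--count",
         f"{branch}...{remote}/{branch}"],
        check=True, capture_output=True, text=True,
    )
    _, behind = parse_ahead_behind(result.stdout)
    return behind > 0
```

If the user declines, nothing needs to be reset: the local branch simply stays behind, and the next launch detects the same difference again.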

Google Cloud Build - how to do a build/deploy to Firebase with a stateful container that "spins down" to 0 (like cloud run)

Most, but not all, of our build artifacts are in a Git repo (Bitbucket).
Our current build is described below. It takes 30+ minutes to build and deploy to Firebase, and we would like to reduce that time.
We are not using Google Cloud Build at the moment, but before heading down that path, I want to find out whether it would even be fruitful.
All of the code is cloned from the Git repo (Bitbucket) to a GCE VM.
Then 1 TB of static data, artifacts that are needed for the deploy, is copied into a directory under the Git repo area.
We do not want to check that 1 TB of data into the Git repo: it comes from a 3rd party, it is rarely updated, and it would be too heavy a directory to pull into developers' IDE environments.
We launch a build script (bash) on the GCE VM to build the code and deploy to Firebase; it takes about 30 minutes.
We want the builds to go faster, and possibly to use Cloud Build.
With this:
a git repo
external files that need to remain in a stateful container, not copied over each time, due to the time it would take
how do we create a stateful container that would only require a git update (pull origin master), and then to fire off a build/deploy to Firebase?
We want to avoid external build services that would send the same 1 TB of data to Firebase on every deploy, since we would be billed for that ingress traffic.
Cloud Run containers are not stateful. GCE VMs are stateful, but we would have to keep them running 24x7x365 so that any developer anywhere can run a build. A build may take only 30 minutes out of any day, and we don't know when it will happen, so leaving a VM up 24x7x365 is mostly wasteful.
We want to avoid a stateless container where the code is checked out fresh every time (a git pull origin master will do) and where the 1 TB of artifacts has to be copied into the container every time, which takes time.
We just want to do:
git pull origin master
Fire off the build as the next step in the script
Spin down the container and have it save its state for the next build, keeping the artifacts updated by the previous 'git pull origin master' and preserving the 1 TB of files we copied to the container, so that each build takes as little time as possible.
The ideal situation would be to have a container that is stateful, that spins down when not in use, and "spins up", or is made active for use when we need to do a build.
It would retain the previous git update (git pull origin master), and would retain all artifacts outside the git repo that we copy over. We also need shell access to the container (ssh, scp) etc.
A stateful 'Cloud Run' option would be ideal, but I don't know of such a thing (stateful containers with GCP that we can run and only be billed for runtime/compute time)
One solution is to use a VM for this. Add a startup script that does the following:
git pull origin master
Fire off the build as the next step in the script
Add this line, which stops the instance:
gcloud compute instances stop $(curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/id) --zone=$(curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/zone)
This way, each time you start your VM, it runs the startup script and shuts down automatically. You keep your persistent disk, and thus your 1 TB of data, and you pay little thanks to the automatic stop.
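One detail worth checking when scripting the stop command: the instance/zone metadata endpoint returns a path of the form projects/PROJECT_NUMBER/zones/ZONE rather than the bare zone name, so the value may need to be reduced to its last segment before it is passed to --zone. A small sketch of that step, with hypothetical helper names, instance name, and project number:

```python
def zone_name(metadata_zone: str) -> str:
    """Return the short zone name from the metadata server's zone value.

    The instance/zone endpoint returns a path such as
    "projects/123456789/zones/us-central1-a"; only the last segment
    is the zone name.
    """
    return metadata_zone.strip().rsplit("/", 1)[-1]


def stop_command(instance: str, metadata_zone: str) -> list:
    """Build the argument list for `gcloud compute instances stop`."""
    return ["gcloud", "compute", "instances", "stop", instance,
            f"--zone={zone_name(metadata_zone)}"]


# Hypothetical instance name and project number, for illustration only.
cmd = stop_command("build-vm", "projects/123456789/zones/us-central1-a")
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run at the end of the build step.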
If you have to wait for an external build, there are 2 solutions:
Either set a sleep timer in the startup script and shut down afterwards
Or customize this tutorial: at the end of your build, publish a message to Pub/Sub, which triggers a function that stops your instance.
EDITED to reply to comments
Here again, 2 solutions:
You can create a custom role with only the permissions required. You can see all compute roles here. If you provide access to the console, I recommend list (to view the VMs), start and stop. Otherwise, grant only start and/or stop if you write a script.
You can create a private function or Cloud Run service. Assign it a service account as its identity, with a role sufficient to start the VM (granting more permissions than required is not a good practice; prefer least privilege with a custom role). Then grant the user the role cloudfunctions.invoker or run.invoker (depending on whether you use Functions or Cloud Run) so that they can call this private endpoint and start the VM without any rights on the VM itself (only the right to perform an HTTP call).

How to handle multiple code checkins in Concourse pipeline?

One of my GitHub repositories is a resource for my pipeline. I have 3 parallel jobs in my Concourse pipeline, which are triggered on any check-in to the GitHub repository. The other jobs in the pipeline run in sequence. I am having the issues below:
1) I want the pipeline to complete its full execution before a new run starts. I am using a pool resource to make sure that execution completes before a new run is triggered. Is there a better way to do this?
2) If there are multiple check-ins while the pipeline is in progress, is there a way to execute the pipeline only on the last check-in? For example, the 1st instance of the pipeline is running, and before it completes there are 6 check-ins to the repository. Can the pipeline pick up only the 6th version of the repo and skip the runs for the previous five check-ins?
Using the lock pool resource is almost the perfect option, but as you have rightly noticed, there will be a trigger for each git commit and jobs will start to queue.
It sounds like you want this pipeline to be serialised. Have you considered serial_groups? http://concourse-ci.org/single-page.html#job-serial-groups

git push a huge repository to a server with limited memory

The server has only 64MB of memory. I'm trying to push a huge git repository to it. Initially the target directory contains an empty bare repository. The push fails:
$ git push server:/tmp/repo master
Counting objects: 3064514, done.
Compressing objects: 100% (470245/470245), done.
fatal: Out of memory, calloc failed
error: pack-objects died of signal 13
error: failed to push some refs to 'server:/tmp/repo'
$ ssh server cat /tmp/repo.git/config
[pack]
threads = 1
deltaCacheSize = 8m
windowMemory = 32m
[core]
repositoryformatversion = 0
filemode = true
bare = true
I get the same error message after changing git config pack.windowMemory 16m on the server.
The same push succeeds to localhost:
$ git push 127.0.0.1:/tmp/repo master
Password:
Counting objects: 3064514, done.
Compressing objects: 100% (470245/470245), done.
Writing objects: 100% (3064514/3064514), 703.02 MiB | 10.84 MiB/s, done.
Total 3064514 (delta 2569775), reused 3059081 (delta 2565342)
To 127.0.0.1:/tmp/repo
* [new branch] master -> master
Is there a remote git config setting which can make the push succeed? Or do I have to repack the repo locally before pushing (with what settings)?
Please note that using a different server with more memory is not an option. Adding memory to the existing server is an option, up to 96MB. It's OK for me to use more disk space than usual on the server if the memory limit is met.
Similar question without a working solution: https://serverfault.com/questions/372899/git-fails-to-push-with-error-out-of-memory
Repacking the repository locally didn't help, git push prints the same error. Repack settings in the local repo:
git config core.packedgitlimit 32m
git config core.packedgitwindowsize 32m
git config pack.threads 1
git config pack.deltacachesize 8m
git config pack.windowmemory 32m
git config pack.packsizelimit 500m
My guess is that it fails because the total number of objects is too large: even the SHA-1 hashes alone would barely fit (20 * 3064514 bytes is about 61 MB, close to the 64 MB limit).
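The estimate can be checked with a few lines of arithmetic. This only counts the raw 20-byte hashes and ignores any per-object bookkeeping that git keeps on top of them:

```python
SHA1_BYTES = 20
OBJECT_COUNT = 3_064_514  # from the "Counting objects" line of the push output

total_bytes = SHA1_BYTES * OBJECT_COUNT
total_mib = total_bytes / 2**20

print(total_bytes)          # 61290280
print(round(total_mib, 2))  # 58.45
```

So the hashes alone take roughly 58.5 MiB, which leaves very little headroom on a 64 MB machine.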
Possible other causes
As @torek pointed out in his comment, this may not be an indication of the server running out of memory, but an indication that something is going wrong locally. Perhaps something changed between when you were pushing to server and to local host that freed up memory on your local machine?
It's also plausible that git is figuring out that you're pushing to localhost, and bypassing the "Git aware" transport mechanism and/or using hardlinks, which might reduce the memory needed. I don't see any indication in the docs that it WOULD do this, and I'm not sure off the top of my head how you could test this, or force it not to do that, but it's a possibility.
Another possible issue is that the host.xz:path/to/repo.git/ url syntax is only recognized if there are no slashes before the first colon, so depending on what server is, that could be causing problems.
If none of these are the case, and the problem is in fact that it's running out of memory on the server, you might have a few options here, depending on the circumstances. I don't know if any of these will work, but they're worth a try.
Solution 1: don't push all the commits at once
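I'm assuming you've got many commits in the commit history of master. Try pushing them in stages. Since an expression like master~500 is not a branch name, spell out the destination ref in each staged push. E.g.
git push server:/tmp/repo master~500:refs/heads/master
git push server:/tmp/repo master~400:refs/heads/master
git push server:/tmp/repo master~300:refs/heads/master
git push server:/tmp/repo master~200:refs/heads/master
git push server:/tmp/repo master~100:refs/heads/master
git push server:/tmp/repo master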
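For a long history, the staged pushes can be generated rather than typed. The sketch below is only an illustration: the helper name staged_refspecs and the depth and step values are assumptions. It writes each refspec with an explicit destination (refs/heads/master), because a source such as master~500 is not a ref name and git cannot guess where to push it.

```python
def staged_refspecs(branch: str, depth: int, step: int) -> list:
    """Return refspecs that advance the remote branch from the oldest
    slice to the tip.

    Each entry has the form "<branch>~<n>:refs/heads/<branch>", with n
    decreasing by `step`; the last entry pushes the tip of the branch.
    """
    if depth < 0 or step <= 0:
        raise ValueError("depth must be >= 0 and step must be > 0")
    specs = [f"{branch}~{n}:refs/heads/{branch}"
             for n in range(depth, 0, -step)]
    specs.append(f"{branch}:refs/heads/{branch}")
    return specs


specs = staged_refspecs("master", depth=500, step=100)
for spec in specs:
    print("git push server:/tmp/repo", spec)
```

A smaller step means more pushes but a smaller pack per push, which is the quantity that matters on a memory-constrained server.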
Solution 2: Push individual objects one at a time
This is going to be incredibly tedious and DEFINITELY needs to be automated/scripted on your local machine. However, you don't actually need to push whole commits all at once.
Instead, you can push individual objects one at a time as long as you push them to a tag ref instead of a branch ref. E.g. if we were working with https://github.com/llvm/llvm-project and wanted to push the tree object 0082ee0b3ad78ff55b2a3a65ef5bfdb8cd9713a1 from it (this is the tree object pointed to by commit faf5e0ec737a676088649d7c13cb50f3f91a703a), we could do git push server:/tmp/repo 0082ee0b3ad78ff55b2a3a65ef5bfdb8cd9713a1:refs/tags/test. Using this we can push individual objects one at a time, starting with blobs, then the tree objects, then finally commit objects. We'd end up with a TON of tags to clean up later, but I'll leave that to you to figure out.
For the rest of these solutions, I'm working under a couple of assumptions:
Given the limitations you described, and the way you specified the url as server:/tmp/repo instead of something ending with .git, I'm assuming this remote repository isn't going to be managed with any service like GitHub or GitLab, which should give you a little more room to use some unconventional techniques.
I'm also assuming you probably have the ability to log on to/run commands on the server.
If either of these is not the case, and the above didn't work, I'm out of ideas at the moment.
Solution 3: backwards push using fetch or clone
There's actually nothing special about a server; it's just another git repository that you can trade commits with. The only difference is that a server usually hosts what's called a bare repository: it doesn't typically keep a working tree of its own (in other words, it only keeps the contents of the .git folder).
So, try performing the push in reverse using fetch/clone from server:
Push to a third, intermediate server (let's call it server2). Ideally, one with a lot more performance, like a GitHub-hosted repo.
Log onto/ssh into server, and from there, clone the repo into /tmp/repo: git clone --bare git@github.com:path/to/your/repo.git.
I would be surprised if this solved anything on its own, but it's worth trying, and step 1 will still set us up for solutions 4 and 5. If by chance it does work, you can tidy up by removing server2 as a remote on server: git remote remove origin, then setting up your remotes on your local machine to point towards server instead of server2.
Solution 4: backwards push, but without fetching all the commits at once
Like solution 3, push to an intermediate server, but this time, instead of using clone and fetching everything all at once, fetch the commits in stages:
Log onto/ssh into server, and from there, initialize /tmp/repo as a bare repo:
cd /tmp/repo
git init --bare
git remote add origin git@github.com:path/to/your/repo.git
Still on server, fetch commits one at a time:
git fetch origin 569d84fe99e63e830ea036598f7fa7a5f9899d7c
git fetch origin 9aaba9d9bb4fc3648a9417820858086b14b6b73e
git fetch origin faf5e0ec737a676088649d7c13cb50f3f91a703a
Solution 5: backwards push, but using partial and/or shallow clones
Instead of fetching individual commits, we can use partial and/or shallow clones to restrict how much we fetch at once. There is a good write-up explaining what those are on the GitHub blog: https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/. This time, we won't use a bare repository: we want to be able to check out commits to fill in the missing objects later. You can follow the instructions here to convert it to a bare repository when you're done. Alternatively, instead of using a regular (non-bare) repository, explicitly fetching the objects might also work, but I don't know for sure off the top of my head.
I think everything I've already written, combined with that write-up should give you all the pieces you need to figure out how to do this. I've already spent hours writing this up, it's late, this solution's kind of complicated, and it's an esoteric question that hasn't been touched in years. If somebody comes across this and needs a more complete answer for this, leave a comment and I'll fill it in, but this is as far as I'm willing to go right now for some potential internet points if nobody actually needs this answer XD.
