Perform Git PULL using Java [JGIT] with user consent (Yes/No) - jgit

Requirement
Suppose we have already cloned a remote repository to a local repository, and the remote repository has since received updates. On application startup, I would like to find out whether the remote repository has updates that the local repository lacks. If updates are available, the application must notify the user and ask for consent to download them. If the user consents, the application has to pull those changes; otherwise it skips the download for this run. The next time the user launches the application, the same check has to be performed again.
Problem Statement
As suggested in "Git Remote Updates finder in Java", I fetched the remote branch and found the updates. This worked fine the first time. When the same steps were repeated, the FetchResult no longer contained the updates from the remote repository.
Already Tried
I tried git.reset().setMode(ResetType.HARD).call() and git.revert().call() when the user clicks NO, so that the next start of the application gets the same delta when it performs a git fetch.
PSEUDO Code
if(alreadyCloned){
    check if updates available in remote - as per the [suggestions in above link](https://stackoverflow.com/questions/69250170/git-remote-updates-finder-in-java)
    if(updatesAvailable){
        fetch user consent to download
        if(user permits to download updates) {
            perform git pull
        } else {
            clear local repository
        }
    } else {
        do nothing
    }
}

There seems to be a mismatch between your understanding and how Git handles remote updates.
For each local branch (e.g. refs/heads/foo) there is usually a 'tracking branch' (e.g. refs/remotes/origin/foo). The fetch command updates the tracking branch. Hence, if the remote repo didn't change since the last fetch, subsequent fetches won't change the local repo. That is why the second FetchResult was empty: the first fetch had already advanced the tracking branch, so there was nothing left to report.
To find out if a branch has changed, you should first fetch the branch (which updates the tracking branch if there are changes) and then compare the local branch with its corresponding remote tracking branch (e.g. using the BranchTrackingStatus).
Finally, if the commits of the remote should be applied ('user permits to download updates' in your program flow), then reset, rebase, or merge the local branch to/with the tracking branch.
Note that pulling a branch in Git is a composite operation that first fetches commits from the remote repo (which updates the remote tracking branch) and then rebases or merges the local branch with the tracking branch.
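A minimal sketch of that flow, written with the git command line rather than JGit (in JGit the corresponding calls would be git.fetch(), BranchTrackingStatus.of(...).getBehindCount(), and git.pull() or a merge). The function names commits_behind and apply_updates_if_consented and the yes/no argument are hypothetical. The sketch assumes the current branch has an upstream configured, as it does after a normal clone, and that only fast-forward updates are wanted.

```shell
#!/bin/sh
# Print how many commits the upstream tracking branch has that the
# checked-out local branch lacks. 0 means the local branch is up to date.
commits_behind() {
    repo_dir=$1
    # Fetch only moves the tracking branch (e.g. refs/remotes/origin/foo).
    git -C "$repo_dir" fetch --quiet || return 1
    # Compare the local branch with its tracking branch, not with the
    # result of the fetch, so the answer is the same on every launch.
    git -C "$repo_dir" rev-list --count 'HEAD..@{upstream}'
}

# Apply the already-fetched commits only if the user agreed.
# $2 is "yes" or "no", gathered by the application's own dialog (assumption).
apply_updates_if_consented() {
    repo_dir=$1
    answer=$2
    if [ "$answer" = "yes" ]; then
        git -C "$repo_dir" merge --quiet --ff-only '@{upstream}'
    fi
    # On "no" nothing is reset or reverted: the local branch simply stays
    # behind, and the next launch reports the same pending commits.
}
```

Because the check compares branches instead of inspecting what the last fetch changed, declining the update needs no cleanup step at all.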

Related

Google Cloud Build - how to do a build/deploy to Firebase with a stateful container that "spins down" to 0 (like cloud run)

We have most, but not all, of our build artifacts in a Git repo (Bitbucket).
Our current build, described below, takes 30+ minutes to build and deploy to Firebase, and we would like to reduce that time.
We are not using Google Cloud Build at the moment, but before heading down that path, I want to find out if that would even be fruitful.
We have all of the code cloned from the git repo (Bitbucket), to a GCE VM.
Then 1 TB of static data, artifacts that are needed for the deploy, is copied into a directory under the git repo area.
We do not want to check that 1TB of data into the git repo: it comes from a 3rd party, it is rarely updated, and it would be too heavy a directory to pull into developers' IDE environments.
We launch a build script (bash) on the GCE VM to build the code and deploy to Firebase; it takes about 30 minutes.
We want the builds to go faster, and possibly to use Cloud Build.
With this:
a git repo
external files that need to remain in a stateful container, not copied over each time, due to the time it would take
how do we create a stateful container that would only require a git update (pull origin master), and then to fire off a build/deploy to Firebase?
We want to avoid external build services that would send the same unchanged 1TB of data to Firebase on every deploy, since we would be billed for that ingress traffic.
Cloud Run containers are not stateful. GCE VMs are stateful, but they have to be kept running 24x7x365 so that any developer anywhere can run a build. A build may take only 30 minutes out of any day, and we don't know when that will be, so leaving a VM up 24x7x365 is mostly wasteful.
We want to avoid a stateless container where the code is checked out fresh every time (a git pull origin master will do) and the 1TB of artifacts has to be copied into the container every time, which takes time.
We just want to do:
git pull origin master
Fire off the build as the next step in the script
Spin down the container and have it save its state for the next build, minimizing time on every build by keeping the artifacts updated by the previous 'git pull origin master' and preserving the 1TB of files we copied to the container.
The ideal situation would be to have a container that is stateful, that spins down when not in use, and "spins up", or is made active for use when we need to do a build.
It would retain the previous git update (git pull origin master), and would retain all artifacts outside the git repo that we copy over. We also need shell access to the container (ssh, scp) etc.
A stateful 'Cloud Run' option would be ideal, but I don't know of such a thing (stateful containers with GCP that we can run and only be billed for runtime/compute time)
One solution is to use a VM for this. Add a startup script that does the following:
git pull origin master
Fire off the build as the next step in the script
Add this line, which stops the instance:
gcloud compute instances stop $(curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/id) --zone=$(curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/zone)
This way, each time you start your VM, it runs the startup script and then shuts down automatically. You keep your persistent disk, and thus your 1TB of data, and you pay little because of the automatic stop.
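Put together, such a startup script might look like the sketch below. The function names, the repository path, and the build command are hypothetical placeholders. The sketch also assumes the VM's service account is allowed to stop the instance. One detail worth knowing: the zone metadata value has the form projects/<number>/zones/<zone>, so the sketch keeps only the last path segment before passing it to --zone, and it uses the instance name rather than the numeric id.

```shell
#!/bin/sh
# Base URL of the instance metadata server.
METADATA=http://metadata.google.internal/computeMetadata/v1/instance

# Read one metadata value, e.g. "name" or "zone".
metadata() {
    curl -s -H "Metadata-Flavor: Google" "$METADATA/$1"
}

# The zone value looks like "projects/<number>/zones/<zone>";
# keep only the part after the last slash.
zone_name() {
    printf '%s\n' "${1##*/}"
}

# $1: repository directory on the persistent disk (hypothetical path)
# $2: build/deploy command to run inside it (hypothetical script)
run_build_and_stop() {
    repo_dir=$1
    build_cmd=$2
    (
        cd "$repo_dir" &&
        git pull origin master &&
        "$build_cmd"
    )
    status=$?
    # Stop the VM whether or not the build succeeded, so it never idles.
    gcloud compute instances stop "$(metadata name)" \
        --zone="$(zone_name "$(metadata zone)")"
    return "$status"
}

# Example call, left commented out so that sourcing this file is harmless:
# run_build_and_stop /srv/repo ./build_and_deploy.sh
```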
If you have to wait for an external build, there are two options:
Either set a sleep timer in the startup script and shut down after it expires
Or customize this tutorial: at the end of your build, publish a message to Pub/Sub that triggers a function, which stops your instance.
EDITED to reply to comments
Here again, 2 solutions:
You can create a custom role with only the permissions required. You can see all the Compute roles here. If you give users access to the console, I recommend list (to view the VMs), start, and stop. Otherwise, if you write a script, start and/or stop is enough.
You can create a private Cloud Function or Cloud Run service. Assign it a service account as its identity, with a role sufficient to start the VM (a predefined role with more permissions than required would work, but that is not good practice; prefer least privilege with a custom role). Then grant the user the role cloudfunctions.invoker or run.invoker (depending on whether you use Cloud Functions or Cloud Run) so that they can call this private endpoint and start the VM without any rights on the VM itself (only the right to perform an HTTP call).

HCM Full Data Sync to FSCM not publishing data

I am setting up Integration Broker messaging from HCM 9.2 to FSCM 9.2 using the PERSON_BASIC_FULLSYNC service operation (the delivered process) to sync data from HCM to FSCM. I have activated the service operation, handler, queue, and routing on both sides, however when I run the Full Data Publish process, it runs to No Success with the following error:
Fetching array element 0: index is not in range 1 to 3.
(180,252) EOL_PUBLISH.PUBDTL.GBL.default.1900-01-01.Step05.OnExecute PCPC:16088 Statement:266
I had initially run this process, and it ran to success, however it did not publish any new data in PS_PERSONAL_DATA in FSCM, so I updated the service operation version in HCM from 'INTERNAL' to 'VERSION_1', as the corresponding service operation in FSCM only had the 'VERSION_1' version available. But after I change the version so they match, and run the process it goes to No Success.
If I set the version of the service operation in HCM back to 'INTERNAL' and run the process, then it is successful but no data gets published in PS_PERSONAL_DATA. Any thoughts on what I should look at?
Sounds like a service operation routing problem. Confirm the routing directions and ensure that any aliases that are set don't cause issues. The service operations on each side need to be the same.

Meteor 0.7.2 + OplogObserveDriver not updating under certain circumstances

This is pretty cutting-edge as 0.7.2 was just released today, but I thought I'd ask in case somebody can shed some light.
I didn't report this to MDG because I can't reproduce this on my dev environment and thus I wouldn't have a recipe to give them.
I've set up oplog tailing in my production environment, which was deployed exactly as my dev environment was, except it's on a remote server.
The server runs Ubuntu + node 0.10.26 and I'm running the bundled version of the app with forever. Mongo reports its replSet is working in order.
The problem is that some collection updates made in server code don't make it to the client. This is the workflow the code is following:
Server publishes the collection using a very simple user_id: this.userId selector.
Client subscribes
Client calls a server method using Meteor.call()
Client starts observing a query on that collection using a specific _id: "something" selector. It will echo on "changed"
Server method calls .update() on the document matching that "something" _id, after doing some work.
If I run the app without oplog tailing (by not setting MONGO_OPLOG_URL), the above workflow works every time. However, if I run it with oplog tailing, the client doesn't echo any changes and if I query the collection directly from the JS console on the browser I don't see the updated version of the collection.
To add to the mystery, if I go into the mongo console and update the document manually, I see the change on the client immediately. Or if I refresh the browser after the Meteor.call() and then query the collection manually from the JS console, the changes are there, as I'd expect.
As mentioned before, if I run the app on my dev environment with oplog tailing (verified using the facts package) it all works as expected and I can't reproduce the issue. The only difference here would be latency between client and server? (my dev environment is in my LAN).
Maybe if somebody is running into something similar we can isolate the issue and make it reproducible..

git push a huge repository to a server with limited memory

The server has only 64MB of memory. I'm trying to push a huge git repository to it. Initially the target directory contains an empty bare repository. The push fails:
$ git push server:/tmp/repo master
Counting objects: 3064514, done.
Compressing objects: 100% (470245/470245), done.
fatal: Out of memory, calloc failed
error: pack-objects died of signal 13
error: failed to push some refs to 'server:/tmp/repo'
$ ssh server cat /tmp/repo.git/config
[pack]
    threads = 1
    deltaCacheSize = 8m
    windowMemory = 32m
[core]
    repositoryformatversion = 0
    filemode = true
    bare = true
I get the same error message after changing git config pack.windowMemory 16m on the server.
The same push succeeds to localhost:
$ git push 127.0.0.1:/tmp/repo master
Password:
Counting objects: 3064514, done.
Compressing objects: 100% (470245/470245), done.
Writing objects: 100% (3064514/3064514), 703.02 MiB | 10.84 MiB/s, done.
Total 3064514 (delta 2569775), reused 3059081 (delta 2565342)
To 127.0.0.1:/tmp/repo
* [new branch] master -> master
Is there a remote git config setting which can make the push succeed? Or do I have to repack the repo locally before pushing (with what settings)?
Please note that using a different server with more memory is not an option. Adding memory to the existing server is an option, up to 96MB. It's OK for me to use more disk space than usual on the server if the memory limit is met.
Similar question without a working solution: https://serverfault.com/questions/372899/git-fails-to-push-with-error-out-of-memory
Repacking the repository locally didn't help, git push prints the same error. Repack settings in the local repo:
git config core.packedgitlimit 32m
git config core.packedgitwindowsize=32m
git config pack.threads 1
git config pack.deltacachesize 8m
git config pack.windowmemory 32m
git config pack.packsizelimit 500m
My idea is that it fails because the total number of objects is too large: even the SHA-1 hashes alone barely fit (20 * 3064514 bytes is about 61 MB, almost the full 64MB).
Possible other causes
As @torek pointed out in his comment, this may not be an indication of the server running out of memory, but an indication that something is going wrong locally. Perhaps something changed between when you were pushing to server and to localhost that freed up memory on your local machine?
It's also plausible that git is figuring out that you're pushing to localhost, and bypassing the "Git aware" transport mechanism and/or using hardlinks, which might reduce the memory needed. I don't see any indication in the docs that it WOULD do this, and I'm not sure off the top of my head how you could test this, or force it not to do that, but it's a possibility.
Another possible issue is that the host.xz:path/to/repo.git/ url syntax is only recognized if there are no slashes before the first colon, so depending on what server is, that could be causing problems.
If none of these are the case, and the problem is in fact that it's running out of memory on the server, you might have a few options here, depending on the circumstances. I don't know if any of these will work, but they're worth a try.
Solution 1: don't push all the commits at once
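I'm assuming you've got many commits in the commit history of master. Try pushing them in stages. Because an expression like master~500 is not a branch name, the destination ref has to be given explicitly. E.g.
git push server:/tmp/repo master~500:refs/heads/master
git push server:/tmp/repo master~400:refs/heads/master
git push server:/tmp/repo master~300:refs/heads/master
git push server:/tmp/repo master~200:refs/heads/master
git push server:/tmp/repo master~100:refs/heads/master
git push server:/tmp/repo master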
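The staged push could be automated along these lines. This is only a sketch: push_in_stages and its arguments are hypothetical names, it assumes it runs inside the local repository, and whether smaller pushes actually keep the server within its memory limit is untested.

```shell
#!/bin/sh
# Push a branch to a remote in stages of roughly $3 commits each.
# $1: remote (URL or name), $2: branch, $3: commits per stage
push_in_stages() {
    remote=$1
    branch=$2
    step=$3
    # branch~N walks first parents, so count along the first-parent chain.
    total=$(git rev-list --count --first-parent "$branch") || return 1
    offset=$((total - step))
    while [ "$offset" -gt 0 ]; do
        # branch~N is not a ref name, so the destination must be explicit.
        git push --quiet "$remote" "$branch~$offset:refs/heads/$branch" || return 1
        offset=$((offset - step))
    done
    # The final push moves the remote branch to the real tip.
    git push --quiet "$remote" "$branch:refs/heads/$branch"
}
```

For the case in the question, this would be called as push_in_stages server:/tmp/repo master 100.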
Solution 2: Push individual objects one at a time
This is going to be incredibly tedious and DEFINITELY need to be automated/scripted on your local machine. However, you don't actually need to push whole commits all at once.
Instead, you can push individual objects one at a time as long as you push them to a tag ref instead of a branch ref. E.g. if we were working with https://github.com/llvm/llvm-project and wanted to push the tree object 0082ee0b3ad78ff55b2a3a65ef5bfdb8cd9713a1 from it (this is the tree object pointed to by commit faf5e0ec737a676088649d7c13cb50f3f91a703a), we could do git push server:/tmp/repo 0082ee0b3ad78ff55b2a3a65ef5bfdb8cd9713a1:refs/tags/test. Using this we can push individual objects one at a time, starting with blobs, then the tree objects, then finally commit objects. We'd end up with a TON of tags to clean up later, but I'll leave that to you to figure out.
For the rest of these solutions, I'm working under a couple of assumptions:
Given the limitations you described, and the way you specified the url as server:/tmp/repo instead of something ending with .git, I'm assuming this remote repository isn't going to be managed with any service like github or gitlab, which should give you a little more room to use some unconventional techniques.
I'm also assuming you probably have the ability to log on to/run commands on the server.
If either of these is not the case, and the above didn't work, I'm out of ideas at the moment.
Solution 3: backwards push using fetch or clone
There's actually nothing special about a server, it's just another git repository that you can trade commits with. The only difference is that a server is usually hosting what's called a bare repository: it doesn't typically keep a working tree of its own (in other words, it only keeps the contents of the .git folder).
So, try performing the push in reverse using fetch/clone from server:
Push to a third, intermediate server (let's call it server2). Ideally, one with a lot more performance, like a github hosted repo.
Log onto/ssh into server, and from there, clone the repo into /tmp/repo: git clone --bare git@github.com:path/to/your/repo.git.
I would be surprised if this solved anything on its own, but it's worth trying, and step 1 will still set us up for solutions 4 and 5. If by chance it does work, you can tidy up by removing server2 as a remote on server: git remote remove origin, then setting up your remotes on your local machine to point towards server instead of server2.
Solution 4: backwards push, but without fetching all the commits at once
Like solution 3, push to an intermediate server, but this time, instead of using clone and fetching everything all at once, fetch the commits in stages:
Log onto/ssh into server, and from there, initialize /tmp/repo as a bare repo:
cd /tmp/repo
git init --bare
git remote add origin git@github.com:path/to/your/repo.git
Still on server, fetch commits one at a time:
git fetch origin 569d84fe99e63e830ea036598f7fa7a5f9899d7c
git fetch origin 9aaba9d9bb4fc3648a9417820858086b14b6b73e
git fetch origin faf5e0ec737a676088649d7c13cb50f3f91a703a
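The commit ids for such a staged fetch do not have to be picked by hand. A possible sketch, run in the local repository, lists every Nth commit of a branch from oldest to newest and always ends with the branch tip. The function name stage_commits and the file name stages.txt are hypothetical.

```shell
#!/bin/sh
# Print every $2-th commit on the first-parent history of branch $1,
# oldest first, followed by the branch tip if it was not already printed.
stage_commits() {
    branch=$1
    step=$2
    git rev-list --first-parent --reverse "$branch" |
        awk -v step="$step" '
            NR % step == 0 { print; last = $0 }
            { tip = $0 }
            END { if (tip != last) print tip }
        '
}

# Possible use (commented out; the second part runs on the server):
#   stage_commits master 100 > stages.txt      # on the local machine
#   while read -r id; do git fetch origin "$id" || break; done < stages.txt
```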
Solution 5: backwards push, but using partial and/or shallow clones
Instead of fetching individual commits, we can use partial and/or shallow clones to restrict how much we are fetching at once. There is a good write-up explaining what those are on the github blog: https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/. This time, we won't use a bare repository: we want to be able to check out commits to fill in the missing objects later. You can follow the instructions here to convert it to a bare repository when you're done. Alternatively, instead of using a regular (non-bare) repository, explicitly fetching the objects might also work, but I don't know for sure off the top of my head.
I think everything I've already written, combined with that write-up should give you all the pieces you need to figure out how to do this. I've already spent hours writing this up, it's late, this solution's kind of complicated, and it's an esoteric question that hasn't been touched in years. If somebody comes across this and needs a more complete answer for this, leave a comment and I'll fill it in, but this is as far as I'm willing to go right now for some potential internet points if nobody actually needs this answer XD.

build queue issues in CC.net

I have a question about how the build queue is configured in CC.net.
I believe we have an issue: when we try to “force” build a scheduled project, the server tries to run several builds at the same time, and most of them fail except the one that started first.
We need to get to a state where, regardless of how many builds are scheduled or how many we “force” start at about the same time, all build requests are placed into a build queue and executed one after another in the order they were placed, and no extra requests are generated.
A Build Failed email is sent even though the build was actually successful.
In short, the erroneous email is likely due to an error in the build server’s build scheduler/queue, which tries to run 2 builds instead of one when asked for a “forced” build; as a result the first one succeeds and the second one fails.
How can we correct/resolve this issue?
Thanks
Nilesh
To specify your projects' queue you need to set the queue property like this:
<project name="MyFirstProject" queue="Q1" queuePriority="1">
The default is one queue per project. If you manually set the same queue (for example Q1) for all your projects, they will share a single queue.
As for queuePriority, the projects in the queue that have not yet started are ordered by queuePriority; projects with a low queuePriority start first.
It's all described in the cc net documentation which is now offline due to a problem at sourceforge.

Resources