I'm using JGit to run some commits, but I notice the commit time in the log is the same for all my commits, even though the process takes several seconds. After some testing, I can confirm the commits are created with commitTime equal to the time the Repository was opened.
Sounds like a bug, don't you think?
I just figured out that the commit time is set via the committer's PersonIdent object.
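For what it's worth, a minimal sketch of the fix (the repo path and identity are placeholders): build a fresh PersonIdent for each commit so its timestamp is captured at commit time, instead of reusing one created when the repository was opened.

import java.io.File;
import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.lib.PersonIdent;

public class CommitNow {
    public static void main(String[] args) throws Exception {
        try (Git git = Git.open(new File("/path/to/repo"))) {
            // A PersonIdent built with just name + email captures the
            // current time; reusing one built when the Repository was
            // opened is what freezes commitTime.
            PersonIdent committer = new PersonIdent("Alice", "alice@example.com");
            git.commit()
               .setMessage("my change")
               .setAuthor(committer)
               .setCommitter(committer)
               .call();
        }
    }
}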
When I changed the start time of a coordinator job in job.properties in Oozie, the job did not pick up the changed time; instead it keeps running at the old scheduled time.
Old job.properties:
startMinute=08
startTime=${startDate}T${startHour}:${startMinute}Z
New job.properties:
startMinute=07
startTime=${startDate}T${startHour}:${startMinute}Z
The job is not running at the changed time (the 7th minute); it is still running at the 8th minute of every hour.
Can you please let me know how I can make the job pick up the updated properties (the changed timing) without restarting or killing the job?
You can't really change the timing of the coordinator via any method provided by Oozie (v3.3.2). When you submit a job, its properties are stored in the database, whereas the actual workflow lives in HDFS.
Every time the coordinator runs, the workflow must be present at the path specified in the properties at submission time, but the properties file itself is not needed. What I mean to imply is that the properties file does not come into the picture after the job has been submitted.
One hack is to update the time directly in the database using an SQL query, but I am not sure about the implications of that; the property might become inconsistent across the database.
You have to kill the job and resubmit a new one.
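For reference, the kill-and-resubmit cycle looks like this (the coordinator ID and Oozie URL below are placeholders):

oozie job -oozie http://oozie-host:11000/oozie -kill 0000123-200101000000000-oozie-oozi-C
oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run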
Note: Oozie does provide a way to change the concurrency, end time, and pause time, as specified in the official docs.
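Those supported changes go through the -change sub-command; a sketch with a placeholder coordinator ID (note the escaped semicolons between values):

oozie job -oozie http://oozie-host:11000/oozie -change 0000123-200101000000000-oozie-oozi-C \
    -value endtime=2015-12-31T00:00Z\;concurrency=2\;pausetime=2015-06-01T00:00Z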
The server has only 64MB of memory. I'm trying to push a huge git repository to it. Initially the target directory contains an empty bare repository. The push fails:
$ git push server:/tmp/repo master
Counting objects: 3064514, done.
Compressing objects: 100% (470245/470245), done.
fatal: Out of memory, calloc failed
error: pack-objects died of signal 13
error: failed to push some refs to 'server:/tmp/repo'
$ ssh server cat /tmp/repo.git/config
[pack]
threads = 1
deltaCacheSize = 8m
windowMemory = 32m
[core]
repositoryformatversion = 0
filemode = true
bare = true
I get the same error message after changing git config pack.windowMemory 16m on the server.
The same push succeeds to localhost:
$ git push 127.0.0.1:/tmp/repo master
Password:
Counting objects: 3064514, done.
Compressing objects: 100% (470245/470245), done.
Writing objects: 100% (3064514/3064514), 703.02 MiB | 10.84 MiB/s, done.
Total 3064514 (delta 2569775), reused 3059081 (delta 2565342)
To 127.0.0.1:/tmp/repo
* [new branch] master -> master
Is there a remote git config setting which can make the push succeed? Or do I have to repack the repo locally before pushing (with what settings)?
Please note that using a different server with more memory is not an option. Adding memory to the existing server is an option, up to 96MB. It's OK for me to use more disk space than usual on the server, as long as memory stays within that limit.
Similar question without a working solution: https://serverfault.com/questions/372899/git-fails-to-push-with-error-out-of-memory
Repacking the repository locally didn't help, git push prints the same error. Repack settings in the local repo:
git config core.packedgitlimit 32m
git config core.packedgitwindowsize 32m
git config pack.threads 1
git config pack.deltacachesize 8m
git config pack.windowmemory 32m
git config pack.packsizelimit 500m
My idea is that the reason why it fails is that the total number of objects is too large: even the SHA-1 hashes won't fit (20 * 3064514 bytes is almost 64MB).
Possible other causes
As @torek pointed out in his comment, this may not be an indication of the server running out of memory, but an indication that something is going wrong locally. Perhaps something changed between when you were pushing to server and to localhost that freed up memory on your local machine?
It's also plausible that git is figuring out that you're pushing to localhost and bypassing the "Git aware" transport mechanism and/or using hardlinks, which might reduce the memory needed. I don't see any indication in the docs that it would do this, and I'm not sure off the top of my head how you could test this, or force it not to do that, but it's a possibility.
Another possible issue is that the host.xz:path/to/repo.git/ url syntax is only recognized if there are no slashes before the first colon, so depending on what server is, that could be causing problems.
If none of these are the case, and the problem is in fact that it's running out of memory on the server, you might have a few options here, depending on the circumstances. I don't know if any of these will work, but they're worth a try.
Solution 1: don't push all the commits at once
I'm assuming you've got many commits in the commit history of master. Try pushing them in stages; note that an older commit needs an explicit destination refspec. E.g.
git push server:/tmp/repo master~500:refs/heads/master
git push server:/tmp/repo master~400:refs/heads/master
git push server:/tmp/repo master~300:refs/heads/master
git push server:/tmp/repo master~200:refs/heads/master
git push server:/tmp/repo master~100:refs/heads/master
git push server:/tmp/repo master
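The same idea as a loop (a sketch; the step of 100 is a guess, and master~0 is just master itself):

for n in 500 400 300 200 100 0; do
    git push server:/tmp/repo "master~$n:refs/heads/master"
done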
Solution 2: Push individual objects one at a time
This is going to be incredibly tedious and DEFINITELY need to be automated/scripted on your local machine. However, you don't actually need to push whole commits all at once.
Instead, you can push individual objects one at a time, as long as you push them to a tag ref instead of a branch ref.
E.g. if we were working with https://github.com/llvm/llvm-project and wanted to push the tree object 0082ee0b3ad78ff55b2a3a65ef5bfdb8cd9713a1 from it (this is the tree object pointed to by commit faf5e0ec737a676088649d7c13cb50f3f91a703a), we could do:
git push server:/tmp/repo 0082ee0b3ad78ff55b2a3a65ef5bfdb8cd9713a1:refs/tags/test
Using this we can push individual objects one at a time, starting with blobs, then the tree objects, then finally commit objects. We'd end up with a TON of tags to clean up later, but I'll leave that to you to figure out.
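Here's a rough, untested sketch of how that automation might look (server:/tmp/repo is from the question; everything else is an assumption). It uses git rev-list to enumerate reachable objects and git cat-file to classify them, then pushes blobs, trees, and commits in that order:

# Classify every object reachable from master.
git rev-list --objects master | cut -d' ' -f1 |
    git cat-file --batch-check='%(objecttype) %(objectname)' > /tmp/objects.txt

# Push blobs first, then trees, then commits, each to a throwaway tag.
# Expect this to be very slow.
for type in blob tree commit; do
    awk -v t="$type" '$1 == t {print $2}' /tmp/objects.txt | while read sha; do
        git push server:/tmp/repo "$sha:refs/tags/tmp-$sha"
    done
done

# Clean up the temporary tags on the server afterwards:
ssh server 'cd /tmp/repo && git tag -l "tmp-*" | xargs -r git tag -d'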
For the rest of these solutions, I'm working under a couple assumptions:
Given the limitations you described, and the way you specified the url as server:/tmp/repo instead of something ending with .git, I'm assuming this remote repository isn't going to be managed with any service like github or gitlab, which should give you a little more room to use some unconventional techniques.
I'm also assuming you probably have the ability to log on to/run commands on the server.
If either of these are not the case, and the above didn't work, I'm out of ideas at the moment.
Solution 3: backwards push using fetch or clone
There's actually nothing special about a server; it's just another git repository that you can trade commits with. The only difference is that a server usually hosts what's called a bare repository: it doesn't typically keep a working tree of its own (in other words, it only keeps the contents of the .git folder).
So, try performing the push in reverse using fetch/clone from server:
1. Push to a third, intermediate server (let's call it server2). Ideally, one with a lot more performance, like a GitHub-hosted repo.
2. Log onto/ssh into server, and from there, clone the repo into /tmp/repo: git clone --bare git@github.com:path/to/your/repo.git.
I would be surprised if this solved anything on its own, but it's worth trying, and step 1 will still set us up for solutions 4 and 5. If by chance it does work, you can tidy up by removing server2 as a remote on server (git remote remove origin), then setting up your remotes on your local machine to point towards server instead of server2.
Solution 4: backwards push, but without fetching all the commits at once
Like solution 3, push to an intermediate server, but this time, instead of using clone and fetching everything all at once, fetch the commits in stages:
Log onto/ssh into server, and from there, initialize /tmp/repo as a bare repo:
cd /tmp/repo
git init --bare
git remote add origin git@github.com:path/to/your/repo.git
Still on server, fetch commits one at a time:
git fetch origin 569d84fe99e63e830ea036598f7fa7a5f9899d7c
git fetch origin 9aaba9d9bb4fc3648a9417820858086b14b6b73e
git fetch origin faf5e0ec737a676088649d7c13cb50f3f91a703a
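Once those per-commit fetches have populated the object store, a final cheap fetch should be able to create the branch itself (the branch name is from the question; this last step is my assumption):

git fetch origin master:master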
Solution 5: backwards push, but using partial and/or shallow clones
Instead of fetching individual commits, we can use partial and/or shallow clones to restrict how much we fetch at once. There is a good write-up explaining what those are on the GitHub blog: https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/. This time we won't use a bare repository: we want to be able to check out commits to fill in the missing objects later. You can follow the instructions here to convert it to a bare repository when you're done. Alternatively, instead of using a regular (non-bare) repository, explicitly fetching the objects might also work, but I don't know for sure off the top of my head.
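A hedged sketch of this approach on the server (the repo URL and work directory are hypothetical): clone shallowly, then deepen in increments until the history is complete:

git clone --depth 1 git@github.com:path/to/your/repo.git /tmp/repo-work
cd /tmp/repo-work
git fetch --deepen=1000     # repeat as needed, watching memory usage
git fetch --unshallow       # final step once most objects are local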
I think everything I've already written, combined with that write-up should give you all the pieces you need to figure out how to do this. I've already spent hours writing this up, it's late, this solution's kind of complicated, and it's an esoteric question that hasn't been touched in years. If somebody comes across this and needs a more complete answer for this, leave a comment and I'll fill it in, but this is as far as I'm willing to go right now for some potential internet points if nobody actually needs this answer XD.
I want to make a report of the start and end times of an Autosys job over the last three months.
How can I get this? Do I need to check archived history or logs?
If yes, please let me know the details.
TIA
Autosys internally uses an Oracle or Sybase database. As long as the data is available in the DB, you can fetch it using the autorep command. To get past run times, use the -r option.
For example: autorep -J JobA -r -30
The above will give you the job's run time from 30 runs back.
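A rough sketch of turning that into a three-month report (assuming roughly one run per day, so ~90 runs; adjust the range and any post-processing to your autorep output format):

for i in $(seq 1 90); do
    autorep -J JobA -r -$i
done > JobA_last_90_runs.txt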
However, because historical data in the DB can become a performance bottleneck, DBAs generally purge it after a while. I have seen retention periods of 1 to 7 days, depending on the number of jobs and the power of the database instance.
Another, approximate, way would be to use the log files created by Autosys, if the std_out option is specified with unique filenames.
For example, you can have the attribute as std_out: $JOB_NAME.out.`date +%m.%s`
In this case the log file is created as soon as the job starts, so you can get the start time from the filename using the usual text utilities on Unix, etc.
For the end time, you can use the file's last-modified time; this is where the "approximate" part comes in, as that time depends on whether your job echoed anything to the log file near the end. It can be close or far off, depending on what the script's commands write.
This method will not give you the times for box jobs, as they never have a log attribute; for those you can rely on the first job in the box.
Autosys job retail_daily_job runs at 8:00 GMT. It is dependent on success of runner_daily_job.
The condition: if runner_daily_job is not SUCCESS by 7:30 GMT, then the status of retail_daily_job should be set to failure, i.e. retail_daily_job should fail.
How can I do this in Autosys? What is the command to be used in the JIL file?
Thanks and Regards,
Simi
Not easy to do. Autosys doesn't support a negation condition, for example NOT SUCCESS. I would try creating a job that runs at 07:30 and changes the status of retail_daily_job to FAILURE if runner_daily_job is FAILURE or TERMINATED or RUNNING.
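A hedged sketch of what that watchdog might look like. The job name, machine, and script path are hypothetical, and the awk field position for the status column is a guess (autorep output varies by version); sendevent with CHANGE_STATUS is the standard way to force a job's status:

/* hypothetical JIL for the 07:30 watchdog job */
insert_job: retail_fail_watchdog
job_type: CMD
machine: yourmachine
start_times: "07:30"
command: /opt/scripts/fail_retail_if_runner_late.sh

#!/bin/sh
# fail_retail_if_runner_late.sh: force retail_daily_job to FAILURE unless
# runner_daily_job has already succeeded (status SU). Check the field
# position against your own autorep output before relying on it.
status=$(autorep -J runner_daily_job | awk '/runner_daily_job/ {print $(NF-2)}')
if [ "$status" != "SU" ]; then
    sendevent -E CHANGE_STATUS -s FAILURE -J retail_daily_job
fi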
Agree with clmccomas: not easy, due to the lack of certain operators.
We get around this by having our jobs create "status" files that other jobs can reference if need be. I've found that trying to parse/depend on output from autorep can be troublesome, but it is doable. So we usually have a job-status directory and create dated folders:
/yyyymmdd/jobname_
Then your command job would check the existing status file that's there. Often you may only want to write the status on completion.
The suggestion of having a job to alter statuses might work out well too.
It seems like you are making the situation a bit complex.
If you want retail_daily_job to run based on the success of runner_daily_job, then simply put the condition as success(runner_daily_job).
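As a JIL fragment, that dependency would look something like this:

/* add the dependency to the downstream job */
update_job: retail_daily_job
condition: success(runner_daily_job)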
So, as you said, if runner_daily_job is not successful by 7:30 GMT (which means it failed), then the status of retail_daily_job will automatically become failed.
Hope this clarifies your question.
We have scheduled a script to run every 5 minutes.
How can we check whether the script is actually running every 5 minutes in Linux?
If anyone knows, please reply.
Thanks in advance.
We created the script, but we don't run it ourselves. Our support team runs it on that schedule, and if they find any error they will report it to us.
How can we confirm whether they are running this script properly or not?
If you don't want to modify your script and you've scheduled it in cron, you could change your cron line to:
*/5 * * * * /home/me/myscript.sh; date >> /tmp/mylog
And check /tmp/mylog - a new line should be added with the date and time every run.
Maybe by making the script log to a logfile with a timestamp, so you can verify that the timestamps are OK; something like:
date >>/tmp/myprocess.log
at the top of the script (or in the loop, if that's how you're "running" it); then you can examine the log file to check.
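A small variant (my suggestion, not from the answers above): log epoch seconds instead, which makes gap-checking trivial; anything over ~6 minutes between entries means a missed run:

date +%s >> /tmp/myprocess.log
# ...later, to audit the log:
awk 'NR > 1 && $1 - prev > 360 {print "missed run around line " NR} {prev = $1}' /tmp/myprocess.log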
Maybe you could just add a log line with a timestamp in your script?
Then you could see whether the script effectively ran every 5 minutes.
Without changing the script?
atop is a process-monitoring package that can record the history of programs being run and, when run as root, will catch process terminations as well. See http://www.atcomputing.nl/Tools/atop/whyatop.html
Also consider the process accounting tools http://www.faqs.org/docs/Linux-mini/Process-Accounting.html