Efficient serverside batch deletion via Artifactory REST API - artifactory

I'm looking for an efficient way to remove a large set of artifacts, spread across various locations from Artifactory (by retrievable with a search query).
I've tried using the JFrog CLI 'rt del' command (along with an AQL file) to search and then remove results, and this works. However, I am finding the removal rate is pretty slow for our instance -- around 1 artifact removal/sec. I will need to remove several hundred thousand artifacts, and this will take way too long. So I am looking for a batch removal mechanism which executes entirely serverside.
I noticed the Artifactory UI supports a 'search stash' feature where a search can be performed, then saved off and results can be acted upon (including deletion action). Is this available via the REST API? This seems like it would be a good match for this use-case.
Alternatively, is there a way to perform a search by creation date in the UI? If so, I could presumably use the search stash feature and perform the removal on the search stash.
Last option I can think of is to write a custom plugin to do this work, but I am hoping there is an easier way as it seems like a semi-common case.
Thanks in advance!

Deleting from search stash would delete the artfifacts from the stash results but won't delete from the artifactory(as per my understanding).
There is groovy plugin available that will clear your artifacts depending on few conditions(link below)
Groovy Clean Up
I found Artifactory AQL quite helpful in searching and deleting artifacts.
Also I had written a custom clean up script that in turn used aql to delete the artifacts for repo regex match and also checks for the artifacts promotion status

Related

Search for the node just created Alfresco JAVA

In one requirement I need to query just created document. If I use lucene search then it will take few seconds to do the indexing and may not come in the search result.
The query should be executing from some alfresco webscript or a scheduler which runs every 5 seconds.
Right now I am doing it by using NodeService and finding child by name which is not the efficient way to do. I am using JAVA API.
Is there any other way to do it?
Thank you
You don't mention what version of Alfresco you are using, but it looks like you are using Solr.
If you just created the document, the recommendation is to keep the reference to it, so you don't have to search for it again.
However, sometimes it is not possible to have the document reference. For example, client1 is not aware that client2 just created a document. If you are using Alfresco version 4.2 or later, you can probably enable Transactional Metadata Queries (TMQ), which allows you to perform searches against the database, so there is no Solr latency. Please review the whole section, because you need to comply with four conditions to use TMQ:
Enable the TMQ patch, so the nodes properties tables get indexed in the database.
Enable searches using the database, whenever possible (TRANSACTION_IF_POSSIBLE).
Make sure that you use the correct query language (CMIS, AFTS, db-lucene, etc.)
Your query must be supported by TMQ.

git build number c#

I'm trying to embed git describe-generated version info into AssemblyInfo.cs plus some label within ASP.NET website.
I already tried using git-vs-versionino but this assumes Git executable on PATH. However default install of msysgit on Windows does not set this up; it uses git bash. This caused problems.
Now I am looking for a way to utilize libgit2sharp library (for zero external dependencies) to use as build number generator. However this library has no describe command...
Thanks!
git-describe is a UI feature that nobody has implemented in the library or bindings yet (or at least nobody's contributed it), but you can do it yourself fairly easily.
You get a list of the tags and what commits they point to, do a walk down the commits and count how many steps it took to get to a commit that you have in the list you built. This already gives you the information you need. If the steps were zero, then your description would be the tag name only; otherwise you append the number of steps and the current commit's id to it.
There's a work in progress libgit2 pull request that proposes an implementation of git-describe functionalities.
See #1066 for more information.
It's not finished yet. Make sure to subscribe to it in order to be notified of its future progress.
Once it's done, it should be quite easy to bind it and make it available through LibGit2Sharp.

Alfresco failed to find new folder

I'm using alfresco throw cmis.
On one of our environment, we have an issue.
We want to create a folder and put some docs in it.
This works fines in all our env except one.
In this one, we can create the folder.
But when we do a search to find the folder, the folder isn't found.
After that i can find it with the share gui.
I have no error message in the share app.
Does any one have an idea on what could be the issue?
Promoting a comment to an answer...
When using Alfresco with SOLR, you need to be aware that the SOLR index isn't quite real-time. Close to real time, sure, but it's asynchronous so there's always a lag. (It's an eventually consistent index, not a fully realtime one)
There's a lot of information on the Alfresco and SOLR Wiki, including the way you can query what the current lag is.
If the lag is very low (eg a lightly loaded system), you can find that SOLR will catch up almost instantly, and newly created items will show instantly in the search results. However, it's more normal to expect to have to wait a little bit, especially on more loaded systems.
If no new results are showing up even after several minutes, you'll want to follow the instructions on the wiki or the SOLR Monitoring and Troubleshooting docs to work out why and fix.

fossil dvcs difference between update and checkout commands

After reading the builtin help, it seems to me that both commads can be used for modifying the workspace to match a certain revision. But I don't understand the differences between update and checkout. Please include some trivial workflows in your answer which show when update/checkout are appropriate.
First major difference is that if you have a remote url set, update will pull first latest artifacts from the remote repository.
Another difference is that if you have uncomitted changes, checkout will not run (unless you force it), whereas update will retain your changes and reapply them. With update you can therefore integrate changes from other users before committing.
So:
Update is what you need when you collaborate on a project, in order to prevent forks.
Checkout lets you deploy a particular version.

DVCS - How often and when to commit changes

There is another thread here on StackOverflow, dealing wih how often to commit changes to source control. I want to put that in the context of using a DVCS like git or mercurial.
How often and when do you commit?
Do you only commit changes when they
build correctly?
How often and when do you push your changes (or file a pull request or similar)?
How do you approac developing a complex feature / doing a complex refactoring requiring many places to be touched? Are "private commits" that won't build ok? When finished, do you push them also to the master repository or do you bundle all your changes into a single changeset before pushing?
It depends on the nature of the branch ("line of development") you are working on.
The main advantage with those DVCS (git or mercurial) is the ease you can:
branch
merge
So:
1/ How often and when do you commit?
2/ Do you only commit changes when they build correctly?
As many time as necessary on a private branch (for instance, if it compiles).
The practice to only commit if unit tests pass is a good one, but should only apply to an "official" (as in "could be published or 'pushed'") branch: in your private branch, you merge a gazillon times if you need to.
The only thing is: do some merge --interactive to reorganize your many commits on your private branch, before replaying them on your main development branch, where you can pass some tests.
3/ How often and when do you push your changes (or file a pull request or similar)?
Publication is another matter and should be done with a "clear" history (coherent merges, representing a content which compile and pass some tests).
The branch you publish should be one where the history is never rewritten, always updated.
The pace of the publications depends on the nature of the remote branch and of the population pulling that branch. For instance, if it is for another team, you could push quite often. If it is for a system-wide integration testing team, you will push a lot less often.
4/ How do you approach developing a complex feature / doing a complex refactoring requiring many places to be touched? Are "private commits" that won't build ok? When finished, do you push them also to the master repository or do you bundle all your changes into a single changeset before pushing?
See 1. and 2.: patch first in your own private branch, then reorganize your commits on an official (published) patch branch. One single commit is not always the best option if the patch involves several different "activities" (or bug fix).
I'd commit changes to my DVCS (my own topic or task branch) very, very often, this way I can use it not only for "delivering changes" but also to help me while I work: like "why this was working 5 minutes ago and it's not working anymore?" If you commit often you can just run a diff.
Also, a technique I found very, very good is using it to "self-document refactors". Let me explain: if you've to do a big refactor on a topic branch and then review the change as a whole (having modified a nice set of files), you'd probably get lost. But, suppose you checkin on every "intermediate step" and document it with a comment, then you're creating some sort of "movie" of your own changes helping to describe what you've done! Huge for reviewers.
I commit a lot; when adding functions or even reformatting my sources.
I use git and do most of my work on non-shared branches. And when I've added enough little changes that count as a block, I use git rebase to collect the smaller related changes into larger chunks and commit that to the main branches.
This way, I have all the advantages of committed code that I can go backwards or forwards in, but I don't have to commit all my mistakes, and bactracks to the main history.
How often and when do you commit?
Very frequently. It could be as much as a few times in a hour, if the changes I've made work and make up a nice patch. Or it could be every few hours, depending on whether I am spending longer debugging things, or experimenting with risky changes.
Do you only commit changes when they build correctly?
Yes, almost always. I can't think of a reason right to check in code that didn't build correctly. There's plenty of reasons you might check in code that doesn't run correctly (in a branch) though.
How often and when do you push your changes (or file a pull request or similar)?
Normally only when a feature is complete and ready for integration testing. That means it has passed unit tests and other relevant tests, and the code/patch is clean enough that I consider it ready for review.
How do you approac developing a complex feature / doing a complex refactoring requiring many places to be touched? Are "private commits" that won't build ok? When finished, do you push them also to the master repository or do you bundle all your changes into a single changeset before pushing?
I would create a named branch for the feature (which would have traceability across design docs and Issue Tracking system). Commits that don't build would only really be ok on a private branch as an intermediate step, but would still be exceptional. Currently I don't rebase and merge the entire feature branch into a single changeset, though it is something I'm looking at doing in future. Changes are only pushed when appropriate tests are all passed.
I follow this kind of flow
alt text http://img121.imageshack.us/img121/3272/versioncontrolsysbestpr.png
(Here is the original image url)
I guess this says pretty everything. Basically I would do a check-in after implementing a full working use case / user story (depends on your software process). The major important thing is that you check-in things that work in the sense that they compile. Never break the build!
Doing a commit after each user story/use case has the advantage that you have a better tracking of past versions and undoing changes is much easier.

Resources