Say I have a cmd.wait script that watches a managed git repository for changes. What’s the best way to trigger that script even if the repo hasn’t changed?
Here's the scenario:
my-repo:
  git.latest:
    - name: git@github.com:my/repo.git
    - rev: master
    - target: /opt/myrepo
    - user: me
    - require:
      - pkg: git

syncdb:
  cmd.run:
    - name: /opt/bin/syncdb.sh

load-resources:
  cmd.wait:
    - name: /opt/bin/load_script.py /opt/myrepo/resources.xml
    - require:
      - cmd: syncdb
    - watch:
      - git: my-repo

index-resources:
  cmd.wait:
    - name: /opt/bin/indexer.sh
    - watch:
      - cmd: load-resources
Say that I run this state, but syncdb fails. load-resources and index-resources fail as well because of missing prerequisites. But my-repo succeeded, and now has the latest checkout of the repository.
So I go ahead and fix the problem that was causing syncdb to fail, and it succeeds. But now my cmd.wait scripts won't run, because my-repo isn't reporting any changes.
I need to trigger, just once, load-resources, and going forward I want it to only trigger when the repo changes. Now, I could just change it to use cmd.run, but in actuality I have a bunch of these cmd.wait scripts in a similar state, and I really don't want to have to go through and switch them all and then switch them back. The same goes for introducing artificial changes into the git repo. There are multiple repos involved and that's annoying in many ways. Finally, I can foresee something like this happening again, and I want a sure solution for handling this case that doesn't involve a bunch of error-prone manual interventions.
So, is there a way to trigger those cmd.wait scripts manually? Or is there a clever way to rearrange the dependencies so that manual triggering is possible?
Assuming the above SLS lives in /srv/salt/app.sls, you should be able to execute load-resources by doing this:
$: salt '*appserver*' state.sls_id load-resources app base
That said, there are surely many better ways to do this, so that you don't have to manually handle failures.
You could change your load-resources to use cmd.run with an unless command that actually checks whether the resources have been loaded or not. If there's no easy way to check that in business terms, something generic will do; it can be as simple as a file that load_script.py creates when it finishes. The file can contain the commit ID of the git repo at the time of the import; if the file doesn't exist, or the commit ID in it differs from the current commit of the repo, you know you have to re-import.
A better variation would be to bake the unless logic into load_script.py itself, which would make that script idempotent, just like Salt modules. Your SLS file would be even simpler then.
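For instance, here is a rough sketch of the unless approach (the marker file /var/tmp/resources.loaded and the exact check are assumptions, not part of the original setup):

load-resources:
  cmd.run:
    # re-import and record which commit was imported (marker path is just an example)
    - name: /opt/bin/load_script.py /opt/myrepo/resources.xml && git -C /opt/myrepo rev-parse HEAD > /var/tmp/resources.loaded
    # skip when the recorded commit matches the current checkout
    - unless: test "$(cat /var/tmp/resources.loaded 2>/dev/null)" = "$(git -C /opt/myrepo rev-parse HEAD)"
    - require:
      - cmd: syncdb
      - git: my-repo

With something like that in place, the state re-runs whenever the recorded commit and the checked-out commit diverge, regardless of whether git.latest reported a change in this particular run.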
Here is the step I'm trying to improve - it removes all files/directories except .git.
- name: Clear Working directory
  continue-on-error: true
  run: |
    rm -fr * .eslintrc.json .gitignore .stylelintrc .github
The small problem here is that any time I add a hidden .file, the action needs to be updated, and I would like a more permanent solution.
My attempts have used ! and (), which have proven unsuccessful because they are not allowed as part of a bash command even after running shopt -s extglob.
I have considered writing a bash script, but that would need to be committed to the repo, which seems less than optimal, or writing my own action, which also seems unnecessary when this should be doable with one line in a Unix bash shell.
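One way to sidestep the globbing limitation entirely (a sketch, not taken from the original workflow) is to let find delete every top-level entry except .git:

- name: Clear Working directory
  continue-on-error: true
  run: |
    # delete every top-level entry, including dotfiles, except the .git directory
    find . -mindepth 1 -maxdepth 1 -not -name '.git' -exec rm -rf {} +

Alternatively, putting shopt -s extglob dotglob on its own line inside the run: block and rm -rf !(.git) on the next line should also work, since the option takes effect before bash parses the following line (dotglob makes the pattern cover hidden files too).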
Original Question - See Answer at End
We've been using Firebase Functions for 2+ years and have amassed well over 120 HTTP, callable, triggered, and scheduled functions, all being called from a single functions/index and managed by a single package.json, probably like a lot of you. As you can imagine, we have some old dependencies in there that we're hesitant to update because it's an awful lot of code to go through, test, etc. So it got me thinking, and I'm asking if any of you have done this or know why this wouldn't work...
Looking at the GCP dashboard, each function is a separate, stand-alone service. But if you download the code from there, you end up with the full build of all 120+ functions, node modules, etc. So if I run npm deploy on my single functions directory (if quotas weren't an issue), it looks like:
- Firebase Tools grabs my single build on my machine or CI tool,
- copies it 120+ times, and then
- pushes one full copy of the entire build into each of those functions.
That got me thinking - considering I can't and don't want to build my entire project and deploy all functions at once, do I have to have them all in a single functions directory, sharing a single package.json and dependencies, and exported from a single functions/index?
Is there any reason I couldn't have, for example:
- functions
  - functionSingleA (running on node 10)
    - lib/index.js
    - package.json (stripe 8.92 and joi)
    - src/index.ts
    - node_modules
  - functionGroupB (running on node 12)
    - lib/index.js
    - package.json (stripe 8.129 and @hapi/joi)
    - src/index.ts
    - node_modules
I know that I lose the ability to deploy all at once, but I don't have that luxury any more due to quotas. Beyond that, is there any reason this wouldn't work? After all, as best as I can tell, Firebase Functions are just individual serverless Cloud Functions with Firebase credentials built in. Am I missing something, or do you do this and it works fine (or breaks everything)?
Answer from Google Firebase Team
A Firebase engineer confirmed through support that this is absolutely possible, but also check out the discussion between me and @samthecodingman. You can break up your functions into completely self-contained modules or groups with different package.json files and dependencies, and deploy each one (individually or as groups) without affecting other functions.
What you lose in return is the ability to deploy everything with a single firebase deploy command (though @samthecodingman presented a solution), and you lose the ability to emulate functions locally. I don't have a workaround for that yet.
It should be possible by tweaking the file structure to this:
- functionProjects
  - deployAll.sh
  - node10
    - deploy.sh
    - firebase.json
    - functions
      - lib/index.js
      - package.json (stripe 8.92 and joi)
      - src/index.ts
      - node_modules
  - node12
    - deploy.sh
    - firebase.json
    - functions
      - lib/index.js
      - package.json (stripe 8.129 and @hapi/joi)
      - src/index.ts
      - node_modules
As a rough idea, you should be able to use a script to perform targeted deployments. Using the targeted deploy commands, it should leave the other functions untouched (i.e. it won't ask you to delete missing functions).
Each deploy.sh should change the working directory to where it is located, and then execute a targeted deploy command.
#!/bin/bash
# update current working directory to where the script resides
SCRIPTPATH=$(readlink -f "$0")
SCRIPTPARENT=$(dirname "$SCRIPTPATH")
pushd "$SCRIPTPARENT"
firebase deploy --only functions:function1InThisFolder,functions:function2InThisFolder,functions:function3InThisFolder,...
popd
The deployAll.sh file just executes each 'child' folder's deploy.sh.
#!/bin/bash
/bin/bash ./node10/deploy.sh
/bin/bash ./node12/deploy.sh
This requires maintaining the list of functions in deploy.sh, but I don't think that's too tall an ask. You could mock the firebase-functions library so that calls to functions.https.onRequest() (along with the other function exports) just return true, and use that to get a dynamic list of functions if you so desire (a rough sketch of that idea follows).
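Here is one way that mocking idea could look (a hypothetical helper, assuming a CommonJS build at ./lib/index.js and that nothing at module scope inspects the stubbed values too closely):

#!/usr/bin/env node
// list-functions.js - prints a --only list based on the exports of the built index
const Module = require('module');
const realRequire = Module.prototype.require;

// Stub firebase-functions so that requiring the index has no side effects:
// every property access or call on the stub just returns the stub again.
Module.prototype.require = function (id) {
  if (id === 'firebase-functions') {
    const stub = new Proxy(function () {}, {
      get: (_target, prop) => (prop === Symbol.toPrimitive ? () => 'stub' : stub),
      apply: () => stub,
    });
    return stub;
  }
  return realRequire.apply(this, arguments);
};

const exported = require('./lib/index.js');
// Each exported name corresponds to one deployable function.
console.log(Object.keys(exported).map((name) => `functions:${name}`).join(','));

Its output could then be substituted into the firebase deploy --only command inside deploy.sh.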
You could also flatten the file structure so that ./node10 and ./node12 are the deployed function directories (instead of the nested functions folders) by adding "functions": { "source": "." } to their respective firebase.json files.
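For example, node10/firebase.json could then be as small as:

{
  "functions": {
    "source": "."
  }
}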
I have just removed a DVC tracking file by mistake using the command dvc remove training_data.dvc -p, which means my entire training dataset is now gone. I know that in Git we can easily revert a deleted branch based on its hash. Does anyone know how to recover all my lost data in DVC?
Most likely you are safe (at least the data is not gone). From the dvc remove docs:
Note that it does not remove files from the DVC cache or remote storage (see dvc gc). However, remember to run dvc push to save the files you actually want to use or share in the future.
So, if you created training_data.dvc with dvc add and/or dvc run, and dvc remove -p didn't ask or warn you about anything, it means the data is cached in .dvc/cache, similar to the way Git keeps objects.
There are ways to retrieve it, but I would need a few more details - how exactly did you add your dataset? Did you commit training_data.dvc, or is it completely gone? Was it the only data you had added so far? (Happy to help in the comments.)
Recovering a directory
First of all, here is the document that describes briefly how DVC stores directories in the cache.
What we can do is to find all .dir files in the .dvc/cache:
find .dvc/cache -type f -name "*.dir"
outputs something like:
.dvc/cache/20/b786b6e6f80e2b3fcf17827ad18597.dir
.dvc/cache/00/db872eebe1c914dd13617616bb8586.dir
.dvc/cache/2d/1764cb0fc973f68f31f5ff90ee0883.dir
(If the local cache is lost and we are restoring data from remote storage, the same logic applies, but the commands look different - e.g. you would search the remote for files with the .dir extension.)
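With an S3 remote, for example, something along these lines should list them (bucket name and prefix are placeholders):

aws s3 ls s3://my-dvc-remote/ --recursive | grep '\.dir$'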
Each .dir file is a JSON file with the content of one version of a directory (file names, hashes, etc.). It has all the information needed to restore it. The next thing we need to do is figure out which one we need. There is no single rule for that; here is what I would recommend checking (pick whatever fits your use case):
- Check the date modified (if you remember when this data was added).
- Check the content of those files - if you remember a specific file name that was present only in the directory you are looking for, just grep for it (see the example commands below).
- Try to restore them one by one and check the directory content.
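For the first two checks, commands like these can help (train.tsv is just an example file name):

# newest .dir files first - useful if you remember roughly when the data was added
ls -lt .dvc/cache/*/*.dir
# find the .dir file that mentions a file you remember
grep -l "train.tsv" .dvc/cache/*/*.dir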
Okay, now let's imagine we decided that we want to restore .dvc/cache/20/b786b6e6f80e2b3fcf17827ad18597.dir (e.g. because its content looks like:
[
    {"md5": "6f597d341ceb7d8fbbe88859a892ef81", "relpath": "test.tsv"},
    {"md5": "32b715ef0d71ff4c9e61f55b09c15e75", "relpath": "train.tsv"}
]
and we want to get a directory with train.tsv).
The only thing we need to do is to create a .dvc file that references this directory:
outs:
- md5: 20b786b6e6f80e2b3fcf17827ad18597.dir
  path: my-directory
(Note that the cache path 20/b786b6e6f80e2b3fcf17827ad18597.dir becomes the hash value 20b786b6e6f80e2b3fcf17827ad18597.dir - the directory separator is dropped.)
And run dvc pull on this file.
That should be it.
I have a patch file that needs to be applied to the working directory of a codebase checked out using SVN, and I need to write a program to do this. I have used the SVNKit jar for checking out from the repository, updating the code base, and reverting any local changes, but I could not figure out a way to apply a patch to the code base. Is there any way to do this?
With SVNKit, use the "doPatch(java.io.File, java.io.File, boolean, int)" method of the SVNDiffClient class:
Arguments:
- the source patch file
- the target directory
- "dryRun" - if "true", the patching process is carried out, and full notification feedback is provided, but the working copy is not modified
- "stripCount" - specifies how many leading path components should be stripped from paths obtained from the patch (usually "0")
I'm just learning saltstack to start automating provisioning and deployment. One thing I'm having trouble finding is how to recursively set ownership on a directory after extracting an archive. When I use the user and group properties, I get a warning that says this functionality will be dropped in archive.extracted in a future release (carbon).
This seems so trivial, but I can't find a good way to do the equivalent of chown -R user:user on the dir that's extracted from the tar I'm unpacking.
The only thing I could find via googling was to add a cmd.run statement in the state file that runs chown and requires the statement that unpacks the tar. There's gotta be a better way, right?
EDIT: the cmd.run approach works perfectly, btw; it just seems like a workaround.
Here's how I have used it: I extract the file and then have a file.directory state which sets the ownership and permissions.
/path/to/extracted/dir:
  file.directory:
    - user: <someuser>
    - group: <group>
    - mode: 755  # some permission
    - recurse:
      - user
      - group
    - require:
      - archive: <State id of `archive.extracted`>