What happens if orchestration triggers a salt-master service restart?

Suppose I have an orchestration file that runs the salt-formula's salt.master state, among others. Suppose also that I've made some pillar change that results in an update to the master's config file, which in turn causes the salt-master service to restart.
What happens to the rest of the orchestration run? In particular, what happens if the config change is to something like GitFS remotes, where new files may be available to minions after the salt.master state runs?

Once the salt-master service restarts, a highstate stops dead in its tracks. There is no built-in way for a highstate to keep state across salt-master restarts. There are some workarounds where you set a flag on the file system or in a grain and have your highstate check for those flags.
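As a rough sketch of that workaround (the state IDs, file path, and grain name below are made up for illustration), the state run that changes the master config could also record a grain, and a later run could key off it:

# Sketch only: set a flag grain when the master config changes.
mark_master_restart:
  grains.present:
    - name: master_restarted
    - value: True
    - onchanges:
      - file: /etc/salt/master.d/custom.conf    # whatever file your pillar change touches

# In a later run: do the follow-up work only if the flag is set, then clear it.
{% if salt['grains.get']('master_restarted', False) %}
post_restart_followup:
  cmd.run:
    - name: salt-run fileserver.update          # e.g. refresh GitFS remotes on the master

clear_restart_flag:
  grains.absent:
    - name: master_restarted
    - require:
      - cmd: post_restart_followup
{% endif %}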
That being said, if you're using the state.orchestrate or state.over runners, those aren't necessarily dependent on the salt-master daemon. I haven't tested this, but the state.orchestrate should most likely continue even if the salt-master daemon restarts.
I may have some time this afternoon to test, but I'd recommend just testing this in your environment.

Related

How to implement a state that waits for other minions to finish certain jobs and then executes a certain state?

For example, I have a cluster of minions called minion-aha1 to minion-aha3, and I install hadoop and hbase on these 3 minions. Now, I would like to convert them to HA mode. Suppose minion-aha1 is the leader. So the logic flow would be:
Start hadoop and hbase on all 3 minions
-> minion-aha1 waits till hadoop and hbase on the rest of the minions are up and healthy
-> minion-aha1 calls join (e.g. stop namenode, hdfs namenode -initializeSharedEdits, start namenode)
-> the rest of the minions call nn2 (e.g. hdfs namenode -bootstrapStandby, start namenode)
I already know how to convert hbase to HA mode, and I can set the leader in a grain; I am just curious how to shrink the above procedure into a single line, i.e.
salt 'minion-aha*' state.apply hadoop.hbase_to_ha
Even a salt.orch state would be acceptable. The above would fail because minion-aha1 never knows the state of the rest of the minions. In other words, it might run successfully once if the developer is lucky, but I am looking for a solution that runs successfully every time.
Thank you.
If you want to solve this without making use of an Orchestration SLS, you could look at one of the following approaches:
use the Salt Mine to publish information from a Minion to the Master, which can then be retrieved by another one
use Peer Communication to allow one Minion to generate a job to be executed on another one
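For the Salt Mine route, a rough sketch could look like the following; the mine function name hbase_health and the health-check script are placeholders I made up, not something from the question:

# Pillar (or minion config) for minion-aha*: publish a health check into the mine.
mine_functions:
  hbase_health:
    - mine_function: cmd.run
    - /usr/local/bin/check_hbase_health.sh      # hypothetical health-check script

# Inside an SLS applied to minion-aha1: read what the other minions last reported.
# (A real check would also inspect the reported values, not just count them.)
{% set peers = salt['mine.get']('minion-aha*', 'hbase_health') %}
{% if peers | length == 3 %}
leader_join:
  cmd.run:
    - name: hdfs namenode -initializeSharedEdits
{% endif %}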
Basically I benefited from this answer, with some modification:
Trigger event on Master and wait for "response event" on Salt Minion
For a custom event sent to the salt master, e.g. mycompany/hbase/status/*/start, I have to send the event, then run saltutil.sync_all, and then the wait_for_event.
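For reference, here is a rough sketch of how that flow could be expressed with the orchestrate runner; the SLS names are made up, and the event tag is simplified to a fixed string instead of the wildcard pattern above:

# Run with: salt-run state.orchestrate hadoop.hbase_to_ha_orch
start_services:
  salt.state:
    - tgt: 'minion-aha*'
    - sls: hadoop.hbase_start          # starts hadoop/hbase and, once healthy, fires
                                       # salt-call event.send mycompany/hbase/status/start

wait_for_followers:
  salt.wait_for_event:
    - name: mycompany/hbase/status/start
    - id_list:
      - minion-aha2
      - minion-aha3
    - timeout: 600
    - require:
      - salt: start_services

leader_join:
  salt.state:
    - tgt: 'minion-aha1'
    - sls: hadoop.hbase_ha_join        # stop namenode, -initializeSharedEdits, start namenode
    - require:
      - salt: wait_for_followers

followers_standby:
  salt.state:
    - tgt: 'minion-aha[23]'
    - sls: hadoop.hbase_ha_standby     # -bootstrapStandby, start namenode
    - require:
      - salt: leader_join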

Openresty dynamic spawning of worker process

Is it possible to spawn a new worker process and gracefully shut down an existing one dynamically using Lua scripting in OpenResty?
Yes but No
OpenResty itself doesn't really offer this kind of functionality directly, but it does give you the necessary building blocks:
nginx workers can be terminated by sending a signal to them
OpenResty allows you to read the PID of the current worker process
LuaJIT's FFI allows you to use the kill() system call, or
using os.execute you can just call kill directly.
Combining those, you should be able to achieve what you want :D
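As a minimal Lua sketch of combining those pieces (the signal choice and the idea of a worker signalling itself are illustrative assumptions, not an OpenResty API):

-- Gracefully stop the current worker; the nginx master will fork a replacement,
-- so the total worker count stays the same (see the note below).
local ffi = require("ffi")
ffi.cdef[[ int kill(int pid, int sig); ]]

local SIGQUIT = 3                 -- graceful-shutdown signal for an nginx worker
local pid = ngx.worker.pid()      -- PID of the worker handling the current request

ffi.C.kill(pid, SIGQUIT)          -- via the kill() syscall ...
-- os.execute("kill -QUIT " .. pid)   -- ... or by shelling out instead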
Note: After reading the question again, I noticed that I really only answered the second part.
nginx uses a set number of worker processes, so you can only shut down running workers which the master process will then restart, but the number will stay the same.
If you just want to change the number of worker processes, you would have to restart the nginx instance completely (I just tried nginx -s reload -g 'worker_processes 4;' and it didn't actually spawn any additional workers).
However, I can't see a good reason why you'd ever do that. If you need additional threads, there's a separate API for that; other than that, you'll probably just have to live with a hard restart.

Can a salt master provide a state to unavailable minions?

Background: I have several servers which run a service I develop. All of them should have the same copy of the service.
To ensure deployment and up-to-dateness I use Ansible, with an idempotent playbook which deploys the service. Since the servers are on an unreliable network, I have to run the playbook periodically (in a cron job) to reach the servers which may not have been available before.
Problem: I was under the impression that the SaltStack philosophy is different: I thought I could just "set a state, compile it and offer it to a set of minions. These minions would then, at their own leisure, come to the master and get whatever they need to do".
This does not seem to be the case: the minions which were not available at deployment time are skipped.
Question: is there a mechanism which would allow for an asynchronous deployment, in the sense that a state set on the master one time only would then be pulled and applied by the minions (to themselves) once they are ready / can reach the master?
Specifically, without the need to continuously re-offer the same state to all minions in the hope that the ones which were unavailable in the past are now able to get the update.
Each time a minion connects to the master there is an event on the event bus which you can react upon (see the Reactor system).
This is the main difference between Ansible and SaltStack.
In order to do what you want, I would react on each minion's reconnect and try to apply a state which is idempotent.
You could also set up a scheduled task in SaltStack that runs the state every X minutes to apply the desired configuration.
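As a minimal sketch of the reactor route (the file paths below are assumptions): a reactor mapping on the master reacts to the minion-start event and applies the idempotent state (here simply the highstate) to whichever minion just connected.

# /etc/salt/master.d/reactor.conf
reactor:
  - 'salt/minion/*/start':              # fired whenever a minion (re)connects
    - /srv/reactor/apply_on_start.sls

# /srv/reactor/apply_on_start.sls
apply_on_start:
  local.state.apply:
    - tgt: {{ data['id'] }}             # the minion that just came online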
The answer from Daniel Wallace (Salt developer):
That is not possible.
The minions connect to the publish port/bus and the master puts new jobs on that bus. Then the minion picks it up and runs the job. If the minion is not connected when the job is published, then it will not see the job.

Salt multi master: does it work with multiple masters offline

I am trying to run a multi-master setup in our dev environment.
The idea is that every dev team has their own salt master. However, all minions in the entire dev environment should be able to receive salt commands from all salt master servers.
Since not every team needs their salt master 24/7, most of them are turned off for several days during the week.
I'm running 2016.11.4 on the masters, as well as on the minions.
However, I run into the following problem: if one of the hosts listed in the minion's config file is shut down, the minion will not always report back on a 'test.ping' command (not even with -t 60).
My experience is that the more master servers are offline, the longer the minion lags in answering requests.
This is especially true if you execute a 'test.ping' on MasterX while the minion's log is at this point:
2017-05-19 08:31:44,819 [salt.minion ][DEBUG ][5336] Connecting to master. Attempt 4 (infinite attempts)
If I trigger a 'test.ping' at this point, chances are 50/50 that I will get a 'minion did not return' on my master.
Obviously though, I always want a return to my 'test.ping', regardless from which master I send it.
Can anybody tell me if what I am trying is feasible with salt? All the articles about salt multi-master setups that I could find only say: 'put a list of master servers into the minion config and that's it!'
The comment from gtmanfred solved my question:
That is not really the way multi master is meant to work. It is supposed to be used more for failover and not for separating out teams.
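If failover is what you actually need, a hedged sketch of the minion config could look like this (hostnames and the interval are illustrative); with master_type: failover the minion connects to one reachable master at a time instead of trying to stay connected to all of them:

# /etc/salt/minion (sketch)
master:
  - salt-master-team-a.example.com
  - salt-master-team-b.example.com
  - salt-master-team-c.example.com
master_type: failover
master_alive_interval: 30     # seconds between checks that the current master is still up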

How to get instances back after reboot in OpenStack

After successfully installing devstack and launching instances, once I reboot the machine I have to start all over again and I lose all the instances that were launched back then. I tried rejoin-stack but it did not work. How can I get the instances back after a reboot?
You might set resume_guests_state_on_host_boot = True in nova.conf. The file should be located at /etc/nova/nova.conf
I've found some old discussion http://www.gossamer-threads.com/lists/openstack/dev/8772
AFAIK, at the present time OpenStack (Icehouse) is still not completely aware of the environments inside it, so it can't restore them completely after a reboot. The instances will be there (virsh domains), but even if you start them manually or using nova flags, I'm not sure whether the other facilities will handle this correctly (e.g. whether neutron will correctly configure all L3 rules according to DB records, etc.). Honestly, I'm pretty sure they won't...
The answer depends on what you need to achieve:
If you need a template environment (e.g. a similar set of instances and networks each time after reboot), you may just script everything. In other words, just make a bash script creating everything you need and run it each time after stack.sh. Make sure you're starting with a clean environment, since OpenStack DB state remains between ./unstack and ./stack.sh or ./rejoin-stack.sh (you might try to just clean the DB, or delete it; stack.sh will build it back).
If you need a persistent environment (e.g. you don't want to lose VMs and the whole infrastructure state after reboot), I'm not aware of a way to do this using OpenStack. For example, neutron agents (which configure iptables, dhcp, etc.) do not save state and are driven by events from the Neutron service. They will not restore after a reboot, so the network will be dead. I'd be very glad if someone shared a method for such a recovery.
In general, I think OpenStack is not focusing on this and will not during the nearest release cycles. The common approach is to have a multi-node environment where each node is replaceable.
See http://docs.openstack.org/high-availability-guide/content/ch-intro.html for reference
devstack is an ephemeral environment. It is not supposed to survive a reboot; this is not supported behavior.
That being said, you might find success in re-initializing the environment by running
./unstack.sh
followed by
./stack.sh
again.
Again, devstack is an ephemeral environment. Its primary purpose is to run gate testing for OpenStack's CI infrastructure.
Or try ./rejoin-stack.sh to re-join the previous screen sessions.
