Trouble running multiple salt-masters

Why is this not working?
I want to run multiple masters to serve different file sets.
I created a salt state that launches a number of secondary salt-masters, and it appears to work until the masters actually start. Then the ZeroMQ worker processes die with exit status None and are restarted, roughly three times per second.
Version: 2015.8.8+ds-1 on Ubuntu 16.04.
The log fills with entries like the following:
2016-08-25 23:23:02,461 [salt.transport.zeromq][INFO ][12120] Worker binding to socket ipc:///var/run/salt3/master/workers.ipc
2016-08-25 23:23:02,475 [salt.utils.process][INFO ][12116] Process <bound method ZeroMQReqServerChannel.zmq_device of <salt.transport.zeromq.ZeroMQReqServerChannel object at 0xb3ddae0c>>
(12553) died with exit status None, restarting...
Configuration of a secondary master instance:
publish_port: 502
ret_port: 503
pidfile: "/var/run/salt-master2.pid"
pki_dir: "/etc/salt2/pki/master"
cachedir: "/var/cache/salt2/master"
sock_dir: "/var/run/salt2/master"
roster_file: "/etc/salt2/roster"
worker_threads: "3"
file_roots:
  base:
    - /home/data/salt/file2
fileserver_backend:
  - roots
  - git
gitfs_remotes:
  - file:///home/data/salt/saltgit/salt-formula
pillar_roots:
  base:
    - /home/data/salt/pillar
ext_pillar:
  - file_tree:
      root_dir: /home/data/salt/filepillar
      follow_dir_links: True
log_level: "info"
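For reference, a secondary master with this configuration would typically be started against its own configuration directory so that it reads this file instead of /etc/salt/master. A minimal sketch (the /etc/salt2 path is an assumption, mirroring the pki_dir and roster paths above):

# assumes the config above is saved as /etc/salt2/master
salt-master --config-dir=/etc/salt2 --daemon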

Related

DynamoDB local behaving erratically

This is a very strange situation that's driving me nuts, and I would really appreciate some help here.
I am using CDK to define the DynamoDB table and associated indices. To test them locally, I installed cdklocal and DynamoDB local using localstack. When the computer (Mac running Ventura 13.1) is restarted, everything works as expected. Here is the script I use to bootstrap and start the stack (this is in a file called startStack.sh):
docker-compose up -d
echo "Waiting for 5 seconds"
sleep 5
cd test-app
cdklocal bootstrap
echo "Waiting for 5 seconds"
sleep 5
cdklocal deploy TestAppStack
#cdklocal deploy TestAppStack/ops-table
DYNAMO_ENDPOINT="http://localhost:4566/" dynamodb-admin &
open http://0.0.0.0:8001
cd ..
The test-app directory contains a local copy of the DynamoDB (and associated indices) definition. I do not encounter any errors running the cdklocal (or cdk) deploy commands so I am assuming that the CDK definition is not an issue.
The docker-compose looks like this:
version: "3.8"
services:
  localstack:
    container_name: AWS-DEVELOPMENT-WITH-LOCALSTACK
    image: localstack/localstack:latest
    network_mode: bridge
    ports:
      - "127.0.0.1:53:53"
      - "127.0.0.1:53:53/udp"
      - "127.0.0.1:443:443"
      - "127.0.0.1:4566:4566"
      - "127.0.0.1:4571:4571"
      - "127.0.0.1:${PORT_WEB_UI-8080}:${PORT_WEB_UI-8080}"
    environment:
      - DYNAMODB_SHARE_DB=1
      - DISABLE_CORS_CHECKS=1
      - SERVICES=s3,dynamodb,sns,sqs,firehose,kinesis,ses,sts,cloudformation
      - DEBUG=1
      - DATA_DIR=/tmp/localstack/data
      - PORT_WEB_UI=8080
      - LAMBDA_EXECUTOR=local
      - KINESIS_ERROR_PROBABILITY=1.0
      - DOCKER_HOST=unix:///var/run/docker.sock
      - HOST_TMP_FOLDER=./.localstack
    volumes:
      - './.localstack:/var/lib/localstack'
      - '/var/run/docker.sock:/var/run/docker.sock'
Everything works as expected when I first run the startStack.sh file - the dynamodb-admin window opens up correctly and other interfaces can interact with the local DynamoDB table. But after some time (and I have not been able to pinpoint the cause), all interactions with local DynamoDB start failing with the following error(s):
Bootstrapping environment aws://000000000000/us-west-2...
❌ Environment aws://000000000000/us-west-2 failed bootstrapping: UnknownEndpoint: Inaccessible host: `localhost' at port `4566'. This service may not be available in the `us-west-2' region.
at Request.ENOTFOUND_ERROR (/usr/local/lib/node_modules/aws-sdk/lib/event_listeners.js:611:46)
at Request.callListeners (/usr/local/lib/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
at Request.emit (/usr/local/lib/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
at Request.emit (/usr/local/lib/node_modules/aws-sdk/lib/request.js:686:14)
at error2 (/usr/local/lib/node_modules/aws-sdk/lib/event_listeners.js:443:22)
at ClientRequest.<anonymous> (/usr/local/lib/node_modules/aws-sdk/lib/http/node.js:99:9)
at ClientRequest.emit (node:events:513:28)
at ClientRequest.emit (node:domain:489:12)
at Socket.socketErrorListener (node:_http_client:494:9)
at Socket.emit (node:events:513:28) {
code: 'UnknownEndpoint',
region: 'us-west-2',
hostname: 'localhost',
retryable: true,
originalError: [Error],
time: 2023-01-15T06:46:40.614Z
}
Inaccessible host: `localhost' at port `4566'. This service may not be available in the `us-west-2' region.
The script hangs at the following message:
[16:52:01] Retrieved account ID 000000000000 from disk cache
[16:52:01] Assuming role 'arn:aws:iam::000000000000:role/cdk-hnb659fds-deploy-role-000000000000-us-west-2'.
[16:52:01] Assuming role failed: Inaccessible host: `localhost' at port `4566'. This service may not be available in the `us-west-2' region.
[16:52:01] Could not assume role in target account using current credentials Inaccessible host: `localhost' at port `4566'. This service may not be available in the `us-west-2' region. . Please make sure that this role exists in the account. If it doesn't exist, (re)-bootstrap the environment with the right '--trust', using the latest version of the CDK CLI.
current credentials could not be used to assume 'arn:aws:iam::000000000000:role/cdk-hnb659fds-deploy-role-000000000000-us-west-2', but are for the right account. Proceeding anyway.
[16:52:01] Waiting for stack CDKToolkit to finish creating or updating...
Restarting the computer fixes it, but it's not clear what causes the issue in the first place. Restarting Docker does not help either.
Any thoughts on what could be causing the problem and how I can avoid it?
I'm adding this as an answer, although I don't have a definitive answer; I thought I would try to help.
I believe the port is already occupied, so the process you are running cannot bind to it, which results in the error. Before running the script, check whether the port is in use:
sudo lsof -i :4566
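If a stale process from an earlier run is still holding the port, one option is to stop it before re-running the script (a sketch; verify that the reported process is actually safe to kill first):

# print only the PID(s) bound to port 4566
sudo lsof -t -i :4566
# stop the stale process (replace <pid> with the value printed above)
sudo kill <pid>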

This job depends on other jobs with expired / erased artifact

I have a Python project which uses the GitLab job retry API to retry a job in a pipeline.
But the retried job fails with the error "This job depends on other jobs with expired / erased artifact". What could be the reason for this error?
stages:
  - build

build:
  tags: [kubernetes, linux, default]
  image: #image-url
  stage: build
  script:
    - python3 setup.py sdist bdist wheel
  artifacts:
    paths:
      - $CI_PROJECT_DIR/dist
      - ${CI_PROJECT_DIR}/job
      - ${CI_PROJECT_DIR}/*.egg-info/PKG-INFO
    expire_in: 600 mins
Your artifacts expire after 600 minutes, so if you re-run a pipeline stage later than that, the artifacts are no longer present. If the stage you re-ran depends on a previous stage's artifacts, the error you are seeing occurs.
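If retried jobs need to keep working beyond that window, one option is simply to give the artifacts a longer lifetime in .gitlab-ci.yml (a sketch; "1 week" is an arbitrary example value, so pick a retention that matches how late you expect to retry):

artifacts:
  paths:
    - $CI_PROJECT_DIR/dist
  # keep artifacts around long enough for later retries
  expire_in: 1 week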

Airflow Log file does not exist:

Airflow worked fine for several weeks, then suddenly started throwing errors a few days ago.
DAGs fail randomly with this error:
Log file does not exist: airflow_path/1.log
Fetching from: http://:8793/airflow_path/1.log
*** Failed to fetch log file from worker. The request to ':///' is missing either an 'http://
I had a similar issue. In my case the worker node (I was using the Celery executor) was exhausted and therefore unavailable to execute any DAGs. Check the CPU and memory used by the worker node (or its equivalent if you are not using the Celery executor).
You can try increasing the CPU and memory for that node.
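If the workers run as Docker containers, a quick way to watch their live CPU and memory usage is docker stats (an assumption about the deployment; use the equivalent tooling for Kubernetes or bare metal):

# live CPU / memory usage per container; look for the Airflow worker container(s)
docker stats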
Happened to me as well using LocalExecutor and an Airflow setup on Docker Compose. Eventually, I figured that the webserver would fail to fetch old logs whenever I recreated my Docker containers. Digging deeper, I realized that the webserver was failing to fetch the logs because it didn't have access to the filesystem of the scheduler (where the logs live).
The fix was to ensure that both the scheduler and the webserver services in docker-compose.yml share a volume with the logs, i.e.:
# docker-compose.yml
version: "3.9"
services:
  scheduler:
    image: ...
    volumes:
      - airflow_logs:/airflow/logs
    ...
  webserver:
    image: ...
    volumes:
      - airflow_logs:/airflow/logs
    ...
volumes:
  airflow_logs:

No matching sls found for 'swapfile' in env 'base'

I have been successfully using SaltStack to manage virtual and bare-metal Ubuntu 14.04 servers for about a year.
On master, I have the following /srv/salt/top.sls:
base:
  '*':
    - common
    - users
    - openvpn          # openvpn-formula
    - openvpn.config   # openvpn-formula
    - fail2ban         # fail2ban-formula
    - fail2ban.config  # fail2ban-formula
    - swapfile         # swapfile-formula
    - ntp              # ntp-formula to set up and configure the ntp client or serv
In /etc/salt/master I have included the following:
gitfs_remotes:
  - https://github.com/srbolle/openvpn-formula.git
  - https://github.com/srbolle/postgres-formula.git
  - https://github.com/srbolle/fail2ban-formula.git
  - https://github.com/srbolle/ntp-formula.git
  - https://github.com/srbolle/swapfile-formula.git
I have had no problems with SaltStack formulas until now, but since I included the swapfile-formula I get the following when running
salt '*' state.highstate:
servername:
Data failed to compile:
----------
No matching sls found for 'swapfile' in env 'base'
Also, I get the same error message when running:
salt 'servername' state.show_sls swapfile
servername:
- No matching sls found for 'swapfile' in env 'base'
When I run:
salt servername state.show_top
'swapfile' is listed. I have tried clearing the cache, restarting servers, recreating servers, and validating my top.sls file with 'kwalify -m top.sls'. I have spent days on this error and don't know how to debug it further (the logs don't show anything suspicious).
Thankful for any clues on how to proceed.
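One check worth running on the master (a sketch, assuming the git fileserver backend is enabled alongside roots) is whether the fileserver actually serves an SLS for swapfile, and forcing a gitfs refresh if the formula repository was added only recently:

# list what the master's fileserver serves and look for the swapfile formula
salt-run fileserver.file_list | grep -i swap
# force the fileserver backends (including gitfs) to refresh
salt-run fileserver.update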

What are "states" when using SaltStack?

I'm trying SaltStack after using Puppet for a while, but I can't understand their use of the word "state".
My understanding is that, for example, a light switch has 2 possible states - on or off. When I write my SLS configuration I am describing what state a server should be in. When I ask SaltStack to provision a server I issue the command salt '*' state.highstate. I understand that a server can be in a highstate (as described in my config) or not. All good so far.
But this page describes other states. It describes lowstate, highstate and overstate (amongst others) as layers. Does this mean a server passes through several states to get to a highstate? Or all states are maintained simultaneously as layers? Or can I configure multiple possible states in my SLS and have SaltStack switch between them? Or are they just layers to SaltStack that have 'state' in the name and I'm confused?
I'm probably missing something obvious, if anyone can nudge me in the right direction I think a lot of the documentation will become clear to me!
Here is a top.sls, which contains:
# cat top.sls
base:
  '*':
    - httpd_require
and,
# cat httpd_require.sls
install_httpd:
  pkg.installed:
    - name: httpd
  service.running:
    - name: httpd
    - enable: True
    - require:
      - file: install_httpd
  file.managed:
    - name: /var/www/html/index.html
    - source: salt://index1.html
    - user: root
    - group: root
    - mode: 644
    - require:
      - pkg: install_httpd
High State:
We can see all the aspects of the highstate system while working with state files (.sls). There are three specific components:
High data
SLS file
High State
Each individual State represents a piece of high data (the pkg.installed: block). Salt will compile all relevant SLS files referenced in top.sls. When these files are tied together using includes, and further glued together for use inside an environment using a top.sls file, they form a High State.
# salt 'remote_minion' state.show_highstate --out yaml
remote_minion:
  install_httpd:
    __env__: base
    __sls__: httpd_require
    file:
      - name: /var/www/html/index.html
      - source: salt://index1.html
      - user: root
      - group: root
      - mode: 644
      - require:
        - pkg: install_httpd
      - managed
      - order: 10002
    pkg:
      - name: httpd
      - installed
      - order: 10000
    service:
      - name: httpd
      - enable: true
      - require:
        - file: install_httpd
      - running
      - order: 10001
First, an order is declared. All States that are set to run first will have their order adjusted accordingly. Salt will then add 10000 to the last defined number (which is 0 by default), and add any States that are not explicitly ordered.
Salt will also add some variables that it uses internally, to know which environment (__env__) to execute the State in, and which SLS file (__sls__) the State declaration came from. Remember that the order is still no more than a starting point; the actual High State will be executed based first on requisites, and then on order.
In other words, "high" data refers generally to data as it is seen by the user.
Low States:
"Low" data refers generally to data as it is ingested and used by Salt.
Once the final High State has been generated, it will be sent to the State compiler. This will reformat the State data into a format that Salt uses internally to evaluate each declaration, and feed data into each State module (which will in turn call the execution modules, as necessary). As with high data, low data can be broken into individual components:
Low State
Low chunks
State module
Execution module(s)
# salt 'remote_minion' state.show_lowstate --out yaml
remote_minion:
- __env__: base
__id__: install_httpd
__sls__: httpd_require
fun: installed
name: httpd
order: 10000
state: pkg
- __env__: base
__id__: install_httpd
__sls__: httpd_require
enable: true
fun: running
name: httpd
order: 10001
require:
- file: install_httpd
state: service
- __env__: base
__id__: install_httpd
__sls__: httpd_require
fun: managed
group: root
mode: 644
name: /var/www/html/index.html
order: 10002
require:
- pkg: install_httpd
source: salt://index1.html
state: file
user: root
Together, all this comprises a Low State. Each individual item is a Low Chunk. The first Low Chunk on this list looks like this:
- __env__: base
  __id__: install_httpd
  __sls__: httpd_require
  fun: installed
  name: httpd
  order: 10000
  state: pkg
Each low chunk maps to a State module (in this case, pkg) and a function inside that State module (in this case, installed). An ID is also provided at this level (__id__). Salt will map relationships (that is, requisites) between States using a combination of the State and the __id__. If a name has not been declared by the user, then Salt will automatically use the __id__ as the name. Once a function inside a State module has been called, it will usually map to one or more execution modules which actually do the work.
salt '*' state.highstate
'*' refers to all the minions connected to the master.
state.highstate runs all the states referenced in the top.sls defined on the master.
To apply a specific state on all minions, use the following salt command, where the state for apache in the example below is defined in apache.sls:
salt '*' state.sls apache
To run the same call only on a specific minion, use the command below:
salt 'minion-name' state.sls apache
I don't know all the levels of state, but when you run:
salt '*' state.highstate
SaltStack applies the states you list in /srv/salt/top.sls.
If you write nothing in it, you can't apply a highstate.
You can apply another state with this command:
salt '*' state.sls state.example
A highstate is just the collection of states that is applied to your server. There is a process in the background where Salt's "state compiler" goes through several stages preparing the data in order to produce the highstate, but you don't really need to worry about those.
Things like the lowstate can help with debugging, but aren't necessary for day to day usage. The highstate is only applied once.
