I'm trying to run a Vault instance on AWS. When I run the command vault operator init -key-shares=5 -key-threshold=3 -format json from an Ansible role, I get this error:
fatal: [vault]: FAILED! => {"changed": true, "cmd": "vault operator init -key-shares=5 -key-threshold=3 -format json", "delta": "0:00:00.054870", "end": "2021-12-12 14:30:50.956504", "msg": "non-zero return code", "rc": 2, "start": "2021-12-12 14:30:50.901634", "stderr": "Error initializing: Put \"http://127.0.0.1:8200/v1/sys/init\": dial tcp 127.0.0.1:8200: connect: connection refused", "stderr_lines": ["Error initializing: Put \"http://127.0.0.1:8200/v1/sys/init\": dial tcp 127.0.0.1:8200: connect: connection refused"], "stdout": "", "stdout_lines": []}
When I log in to my Vault server and run service vault status, I get this result:
vault.service - a tool for managing secrets
Loaded: loaded (/etc/systemd/system/vault.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sun 2021-12-12 14:19:47 UTC; 6min ago
Docs: https://vaultproject.io/docs/
Process: 5152 ExecStart=/usr/local/bin/vault server -config=/etc/vault.hcl (code=exited, status=213/SECUREBITS)
Main PID: 5152 (code=exited, status=213/SECUREBITS)
Dec 12 14:19:47 ip-172-31-37-194 systemd[1]: Started a tool for managing secrets.
Dec 12 14:19:47 ip-172-31-37-194 systemd[5152]: vault.service: Failed to set process secure bits: Operation not perm
Dec 12 14:19:47 ip-172-31-37-194 systemd[5152]: vault.service: Failed at step SECUREBITS spawning /usr/local/bin/vau
Dec 12 14:19:47 ip-172-31-37-194 systemd[1]: vault.service: Main process exited, code=exited, status=213/SECUREBITS
Dec 12 14:19:47 ip-172-31-37-194 systemd[1]: vault.service: Failed with result 'exit-code'.
Here are my two config files:
vault.hcl :
disable_mlock = true
listener "tcp" {
address = "http://{{ listener_address }}"
tls_disable = 1
}
backend "file" {
path = "/var/lib/vault"
}
my vault.service :
[Unit]
Description=a tool for managing secrets
Documentation=https://vaultproject.io/docs/
After=network.target
ConditionFileNotEmpty=/etc/vault.hcl
[Service]
User=vault
Group=vault
ExecStart=/usr/local/bin/vault server -config=/etc/vault.hcl
ExecReload=/usr/local/bin/kill --signal HUP $MAINPID
CapabilityBoundingSet=CAP_SYSLOG CAP_IPC_LOCK
Capabilities=CAP_IPC_LOCK+ep
SecureBits=keep-caps
NoNewPrivileges=yes
KillSignal=SIGINT
[Install]
WantedBy=multi-user.target
I haven't found anything yet that resolves this situation. Does anyone have an idea?
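The "connection refused" error in the Ansible output is consistent with nothing listening on 127.0.0.1:8200, which matches the systemd status showing that the service failed to start. A minimal sketch of a pre-check that could run before vault operator init is shown here; the helper name port_is_open is hypothetical, and the address is the one taken from the error message.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:8200 is the address from the error message in the question.
    print(port_is_open("127.0.0.1", 8200))
```

If this returns False, the init command cannot succeed, and the service failure itself is the thing to fix first.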
I'm having issues with SSL certificate verification. When I try to send logs to the nginx server, I get an error message that says:
Feb 14 21:38:53 username td-agent-bit[31178]: [2022/02/14 21:38:53] [error] [tls] /tmp/fluent-bit-1.8.12/src/tls/mbedtls.c:380 X509 - Certificate verification failed, e.g. CRL, CA or signature check
Feb 14 21:38:53 username td-agent-bit[31178]: [2022/02/14 21:38:53] [error] [output:http:http.0] no upstream connections available to 127.0.0.1:443
Feb 14 21:38:53 username td-agent-bit[31178]: [2022/02/14 21:38:53] [ warn] [engine] failed to flush chunk '31025-1644867441.221825565.flb', retry in 32 seconds: task_id=20, input=storage_backlog.6 > output=http.0 (out_id=0)
Feb 14 21:38:53 username td-agent-bit[31178]: [2022/02/14 21:38:53] [ info] [output:http:http.0] 127.0.0.1:443, HTTP status=200
Feb 14 21:38:53 username td-agent-bit[31178]: {"status":200}
Feb 14 21:38:54 username td-agent-bit[31178]: [2022/02/14 21:38:54] [error] [tls] /tmp/fluent-bit-1.8.12/src/tls/mbedtls.c:380 X509 - Certificate verification failed, e.g. CRL, CA or signature check
Feb 14 21:38:54 username td-agent-bit[31178]: [2022/02/14 21:38:54] [error] [output:http:http.0] no upstream connections available to 127.0.0.1:443
Feb 14 21:38:54 username td-agent-bit[31178]: [2022/02/14 21:38:54] [ warn] [engine] failed to flush chunk '31025-1644867401.174594241.flb', retry in 37 seconds: task_id=12, input=storage_backlog.6 > output=http.0 (out_id=0)
Feb 14 21:38:54 username td-agent-bit[31178]: [2022/02/14 21:38:54] [error] [tls] /tmp/fluent-bit-1.8.12/src/tls/mbedtls.c:380 X509 - Certificate verification failed, e.g. CRL, CA or signature check
Feb 14 21:38:54 username td-agent-bit[31178]: [2022/02/14 21:38:54] [error] [output:http:http.0] no upstream connections available to 127.0.0.1:443
Feb 14 21:38:54 username td-agent-bit[31178]: [2022/02/14 21:38:54] [ warn] [engine] failed to flush chunk '31025-1644867416.136883568.flb', retry in 12 seconds: task_id=15, input=storage_backlog.6 > output=http.0 (out_id=0)
Feb 14 21:38:54 username td-agent-bit[31178]: [2022/02/14 21:38:54] [error] [tls] /tmp/fluent-bit-1.8.12/src/tls/mbedtls.c:380 X509 - Certificate verification failed, e.g. CRL, CA or signature check
Feb 14 21:38:54 username td-agent-bit[31178]: [2022/02/14 21:38:54] [error] [output:http:http.0] no upstream connections available to 127.0.0.1:443
Feb 14 21:38:54 username td-agent-bit[31178]: [2022/02/14 21:38:54] [ warn] [engine] failed to flush chunk '31025-1644867481.167299560.flb', retry in 10 seconds: task_id=28, input=storage_backlog.6 > output=http.0 (out_id=0)
Feb 14 21:38:54 username td-agent-bit[31178]: [2022/02/14 21:38:54] [ info] [output:http:http.0] 127.0.0.1:443, HTTP status=200
Feb 14 21:38:54 username td-agent-bit[31178]: {"status":200}
Feb 14 21:38:55 username td-agent-bit[31178]: [2022/02/14 21:38:55] [error] [tls] /tmp/fluent-bit-1.8.12/src/tls/mbedtls.c:380 X509 - Certificate verification failed, e.g. CRL, CA or signature check
Feb 14 21:38:55 username td-agent-bit[31178]: [2022/02/14 21:38:55] [error] [output:http:http.0] no upstream connections available to 127.0.0.1:443
Feb 14 21:38:55 username td-agent-bit[31178]: [2022/02/14 21:38:55] [ warn] [engine] failed to flush chunk '31178-1644867522.155353155.flb', retry in 19 seconds: task_id=3, input=tail.2 > output=http.0 (out_id=0)
Feb 14 21:38:55 username td-agent-bit[31178]: [2022/02/14 21:38:55] [ info] [output:http:http.0] 127.0.0.1:443, HTTP status=200
Feb 14 21:38:55 username td-agent-bit[31178]: {"status":200}
The CRL, CA, or signature verification fails for some reason. Verification passes only after a certain number of attempts.
How can I fix it?
td-agent-bit.conf:
[SERVICE]
# Flush
# =====
# set an interval of seconds before to flush records to a destination
flush 5
# Daemon
# ======
# instruct Fluent Bit to run in foreground or background mode.
daemon Off
# Log_Level
# =========
# Set the verbosity level of the service, values can be:
#
# - error
# - warning
# - info
# - debug
# - trace
#
# by default 'info' is set, that means it includes 'error' and 'warning'.
log_level info
# Parsers File
# ============
# specify an optional 'Parsers' configuration file
parsers_file parsers.conf
# Plugins File
# ============
# specify an optional 'Plugins' configuration file to load external plugins.
plugins_file plugins.conf
# HTTP Server
# ===========
# Enable/Disable the built-in HTTP Server for metrics
http_server Off
http_listen 0.0.0.0
http_port 2020
# Storage
# =======
# Fluent Bit can use memory and filesystem buffering based mechanisms
#
# - https://docs.fluentbit.io/manual/administration/buffering-and-storage
#
# storage metrics
# ---------------
# publish storage pipeline metrics in '/api/v1/storage'. The metrics are
# exported only if the 'http_server' option is enabled.
#
# storage.metrics on
# storage.path
# ------------
# absolute file system path to store filesystem data buffers (chunks).
#
storage.path /tmp/fluent-bit-storage/
# storage.sync
# ------------
# configure the synchronization mode used to store the data into the
# filesystem. It can take the values normal or full.
#
storage.sync normal
# storage.checksum
# ----------------
# enable the data integrity check when writing and reading data from the
# filesystem. The storage layer uses the CRC32 algorithm.
#
storage.checksum off
# storage.backlog.mem_limit
# -------------------------
# if storage.path is set, Fluent Bit will look for data chunks that were
# not delivered and are still in the storage layer, these are called
# backlog data. This option configure a hint of maximum value of memory
# to use when processing these records.
#
storage.backlog.mem_limit 2M
[INPUT]
name tail
tag log.development.production
path /home/username/production.log
Buffer_Max_Size 2mb
Refresh_interval 5
Offset_Key offset
Path_Key path
storage.type filesystem
DB /tmp/production.db
DB.sync normal
DB.locking false
DB.journal_mode wal
# Read interval (sec) Default: 1
#interval_sec 1
[INPUT]
name tail
tag log.development.nginx
path /home/username/nginx.log
Buffer_Max_Size 2mb
Refresh_interval 5
Offset_Key offset
Path_Key path
storage.type filesystem
DB /tmp/nginx.db
DB.sync normal
DB.locking false
DB.journal_mode wal
# Read interval (sec) Default: 1
#interval_sec 1
[INPUT]
name tail
tag log.development.apache
path /home/username/apache.log
Buffer_Max_Size 2mb
Refresh_interval 5
Offset_Key offset
Path_Key path
storage.type filesystem
DB /tmp/apache.db
DB.sync normal
DB.locking false
DB.journal_mode wal
# Read interval (sec) Default: 1
#interval_sec 1
[INPUT]
name tail
tag log.development.syslog
path /home/username/syslog.log
Buffer_Max_Size 2mb
Refresh_interval 5
Offset_Key offset
Path_Key path
storage.type filesystem
DB /tmp/syslog.db
DB.sync normal
DB.locking false
DB.journal_mode wal
# Read interval (sec) Default: 1
#interval_sec 1
[INPUT]
name tail
tag log.development.postgres
path /home/username/postgres.log
Buffer_Max_Size 2mb
Refresh_interval 5
Offset_Key offset
Path_Key path
storage.type filesystem
DB /tmp/postgres.db
DB.sync normal
DB.locking false
DB.journal_mode wal
# Read interval (sec) Default: 1
#interval_sec 1
[INPUT]
name tail
tag log.development.zabbix
path /home/username/zabbix.log
Buffer_Max_Size 2mb
Refresh_interval 5
Offset_Key offset
Path_Key path
storage.type filesystem
DB /tmp/zabbix.db
DB.sync normal
DB.locking false
DB.journal_mode wal
# Read interval (sec) Default: 1
#interval_sec 1
[OUTPUT]
Name http
Match *
Host 127.0.0.1
Port 443
http_User fluentbit
http_Passwd fluentbit
tls on
tls.verify on
tls.debug 4
tls.ca_file /home/username/cert/ca_1/CA.pem
tls.crt_file /home/username/cert/ca_1/signed_certificates/server.crt
tls.key_file /home/username/cert/ca_1/signed_certificates/server.key
Format json
Header_tag header_tag_is_here
Header Location localhost
Retry_Limit no_limits
nginx.conf:
server {
listen 443 ssl default_server;
listen [::]:443 ssl default_server;
ssl on;
ssl_certificate /home/username/cert/ca_1/signed_certificates/server.crt;
ssl_certificate_key /home/username/cert/ca_1/signed_certificates/server.key;
ssl_session_cache builtin:1000 shared:SSL:10m;
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4;
server_name _;
location / {
proxy_pass http://localhost:3000/;
}
}
**Apache Airflow version:** 1.10.9-composer
Kubernetes Version : Client Version: version.Info{Major:"1", Minor:"15+", GitVersion:"v1.15.12-gke.6002", GitCommit:"035184604aff4de66f7db7fddadb8e7be76b6717", GitTreeState:"clean", BuildDate:"2020-12-01T23:13:35Z", GoVersion:"go1.12.17b4", Compiler:"gc", Platform:"linux/amd64"}
Environment: Airflow, running on top of Kubernetes - Linux version 4.19.112
OS : Linux version 4.19.112+ (builder@7fc5cdead624) (Chromium OS 9.0_pre361749_p20190714-r4 clang version 9.0.0 (/var/cache/chromeos-cache/distfiles/host/egit-src/llvm-project c11de5eada2decd0a495ea02676b6f4838cd54fb) (based on LLVM 9.0.0svn)) #1 SMP Fri Sep 4 12:00:04 PDT 2020
Kernel : Linux gke-europe-west2-asset-c-default-pool-dc35e2f2-0vgz
4.19.112+ #1 SMP Fri Sep 4 12:00:04 PDT 2020 x86_64 Intel(R) Xeon(R) CPU @ 2.20GHz GenuineIntel GNU/Linux
What happened?
A running task is marked as a zombie once the current time passes its latest heartbeat + 5 minutes.
The task runs in the background on another application server and is triggered using SSHOperator.
[2021-01-18 11:53:37,491] {taskinstance.py:888} INFO - Executing <Task(SSHOperator): load_trds_option_composite_file> on 2021-01-17T11:40:00+00:00
[2021-01-18 11:53:37,495] {base_task_runner.py:131} INFO - Running on host: airflow-worker-6f6fd78665-lm98m
[2021-01-18 11:53:37,495] {base_task_runner.py:132} INFO - Running: ['airflow', 'run', 'dsp_etrade_process_trds_option_composite_0530', 'load_trds_option_composite_file', '2021-01-17T11:40:00+00:00', '--job_id', '282759', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/dsp_etrade_trds_option_composite_0530.py', '--cfg_path', '/tmp/tmpge4_nva0']
Task Executing time:
dag_id dsp_etrade_process_trds_option_composite_0530
duration 7270.47
start_date 2021-01-18 11:53:37,491
end_date 2021-01-18 13:54:47.799728+00:00
Scheduler Logs during that time:
[2021-01-18 13:54:54,432] {taskinstance.py:1135} ERROR - <TaskInstance: dsp_etrade_process_etrd.push_run_date 2021-01-18 13:30:00+00:00 [running]> detected as zombie
{
textPayload: "[2021-01-18 13:54:54,432] {taskinstance.py:1135} ERROR - <TaskInstance: dsp_etrade_process_etrd.push_run_date 2021-01-18 13:30:00+00:00 [running]> detected as zombie"
insertId: "1ca8zyfg3zvma66"
resource: {
type: "cloud_composer_environment"
labels: {3}
}
timestamp: "2021-01-18T13:54:54.432862699Z"
severity: "ERROR"
logName: "projects/asset-control-composer-prod/logs/airflow-scheduler"
receiveTimestamp: "2021-01-18T13:54:55.714437665Z"
}
Airflow-webserver log :
X.X.X.X - - [18/Jan/2021:13:54:39 +0000] "GET /_ah/health HTTP/1.1" 200 187 "-" "GoogleHC/1.0"
{
textPayload: "172.17.0.5 - - [18/Jan/2021:13:54:39 +0000] "GET /_ah/health HTTP/1.1" 200 187 "-" "GoogleHC/1.0"
"
insertId: "1sne0gqg43o95n3"
resource: {2}
timestamp: "2021-01-18T13:54:45.401670481Z"
logName: "projects/asset-control-composer-prod/logs/airflow-webserver"
receiveTimestamp: "2021-01-18T13:54:50.598807514Z"
}
Airflow Info logs :
2021-01-18 08:54:47.799 EST
{
textPayload: "NoneType: None
"
insertId: "1ne3hqgg47yzrpf"
resource: {2}
timestamp: "2021-01-18T13:54:47.799661030Z"
severity: "INFO"
logName: "projects/asset-control-composer-prod/logs/airflow-scheduler"
receiveTimestamp: "2021-01-18T13:54:50.914461159Z"
}
[2021-01-18 13:54:47,800] {taskinstance.py:1192} INFO - Marking task as FAILED.dag_id=dsp_etrade_process_trds_option_composite_0530, task_id=load_trds_option_composite_file, execution_date=20210117T114000, start_date=20210118T115337, end_date=20210118T135447
{
textPayload: "[2021-01-18 13:54:47,800] {taskinstance.py:1192} INFO - Marking task as FAILED.dag_id=dsp_etrade_process_trds_option_composite_0530, task_id=load_trds_option_composite_file, execution_date=20210117T114000, start_date=20210118T115337, end_date=20210118T135447"
insertId: "1ne3hqgg47yzrpg"
resource: {2}
timestamp: "2021-01-18T13:54:47.800605248Z"
severity: "INFO"
logName: "projects/asset-control-composer-prod/logs/airflow-scheduler"
receiveTimestamp: "2021-01-18T13:54:50.914461159Z"
}
Airflow Database shows the latest heartbeat as:
select state, latest_heartbeat from job where id=282759
--------------------------------------
state | latest_heartbeat
running | 2021-01-18 13:48:41.891934
Airflow Configurations:
celery
worker_concurrency=6
scheduler
scheduler_health_check_threshold=60
scheduler_zombie_task_threshold=300
max_threads=2
core
dag_concurrency=6
Kubernetes Cluster :
Worker nodes : 6
What was expected to happen?
The backend process takes around 2 hours 30 minutes to finish. During
such long-running jobs the task is detected as a zombie, even though the
worker node is still processing the task. The state of the job is
still marked as 'running'. The state of the task is not known during the
run.
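The timing in this report can be checked with a short worked example. This is a simplified sketch of the "latest heartbeat + threshold" rule described above, not Airflow's actual implementation; the function name is_zombie is hypothetical, and the timestamps are copied from the database query and the "Marking task as FAILED" log line.

```python
from datetime import datetime, timedelta

# Values copied from the report above.
latest_heartbeat = datetime(2021, 1, 18, 13, 48, 41, 891934)  # job.latest_heartbeat
zombie_threshold = timedelta(seconds=300)                     # scheduler_zombie_task_threshold
marked_failed_at = datetime(2021, 1, 18, 13, 54, 47, 800000)  # "Marking task as FAILED" log line


def is_zombie(now, heartbeat, threshold):
    """Simplified rule: the last heartbeat is older than the threshold."""
    return now - heartbeat > threshold


# Earliest moment at which the job can be treated as a zombie.
zombie_limit = latest_heartbeat + zombie_threshold
print(zombie_limit)
print(is_zombie(marked_failed_at, latest_heartbeat, zombie_threshold))
```

With these values the limit falls at 13:53:41, about a minute before the task was marked FAILED, so the observed behaviour is consistent with the heartbeat having stopped updating rather than with the task itself having ended.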
I followed the following guides on installing JFrog Artifactory OSS using RPM/Yum and using an external PostgreSQL database.
https://www.jfrog.com/confluence/display/JFROG/Installing+Artifactory
https://www.jfrog.com/confluence/display/RTF6X/PostgreSQL
SELinux is disabled and jfrog-artifactory-oss is installed from the JFrog repository [https://jfrog.bintray.com/artifactory-rpms].
Check the service:
[root@jfrog ~]# systemctl status artifactory -l
● artifactory.service - Artifactory service
Loaded: loaded (/usr/lib/systemd/system/artifactory.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2020-08-08 01:56:50 +08; 11min ago
Process: 9714 ExecStop=/opt/jfrog/artifactory/app/bin/artifactoryManage.sh stop (code=exited, status=0/SUCCESS)
Process: 10268 ExecStart=/opt/jfrog/artifactory/app/bin/artifactoryManage.sh start (code=exited, status=0/SUCCESS)
Main PID: 12388 (java)
CGroup: /system.slice/artifactory.service
‣ 12388 /opt/jfrog/artifactory/app/third-party/java/bin/java -Djava.util.logging.config.file=/opt/jfrog/artifactory/app/artifactory/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djdk.tls.ephemeralDHKeySize=2048 -Djava.protocol.handler.pkgs=org.apache.catalina.webresources -Dorg.apache.catalina.security.SecurityListener.UMASK=0027 -server -Xss256k -XX:+UseG1GC -XX:OnOutOfMemoryError=kill -9 %p --add-opens java.base/java.util=ALL-UNNAMED --add-opens java.base/java.lang.reflect=ALL-UNNAMED --add-opens java.base/java.lang.invoke=ALL-UNNAMED --add-opens java.base/java.text=ALL-UNNAMED --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.desktop/java.awt.font=ALL-UNNAMED -Dfile.encoding=UTF8 -Djruby.compile.invokedynamic=false -Djruby.bytecode.version=1.8 -Dorg.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true -Djava.security.egd=file:/dev/./urandom -Dartdist=rpm -Djf.product.home=/opt/jfrog/artifactory -Xms512m -Xmx3g -Djruby.bytecode.version=1.8 -Dartifactory.metadata.native.ui=true -Dignore.endorsed.dirs= -classpath /opt/jfrog/artifactory/app/artifactory/tomcat/bin/bootstrap.jar:/opt/jfrog/artifactory/app/artifactory/tomcat/bin/tomcat-juli.jar -Dcatalina.base=/opt/jfrog/artifactory/app/artifactory/tomcat -Dcatalina.home=/opt/jfrog/artifactory/app/artifactory/tomcat -Djava.io.tmpdir=/opt/jfrog/artifactory/var/work/artifactory/tomcat/temp org.apache.catalina.startup.Bootstrap start
Aug 08 01:56:50 jfrog artifactoryManage.sh[10268]: 2020-08-07T17:56:50.027Z [shell] [INFO ] [] [systemYamlHelper.sh:462 ] [main] - Resolved shared.logging.consoleLog.enabled (true) from /opt/jfrog/artifactory/var/etc/system.yaml
Aug 08 01:56:50 jfrog artifactoryManage.sh[10268]: JF_METADATA_ACCESSCLIENT_URL: http://localhost:8081/access
Aug 08 01:56:50 jfrog artifactoryManage.sh[10268]: metadata started. PID: 12988
Aug 08 01:56:50 jfrog su[13048]: (to artifactory) root on none
Aug 08 01:56:50 jfrog artifactoryManage.sh[10268]: Starting frontend...
Aug 08 01:56:50 jfrog artifactoryManage.sh[10268]: frontend not running. Proceed to start it up.
Aug 08 01:56:50 jfrog artifactoryManage.sh[10268]: 2020-08-07T17:56:50.317Z [shell] [INFO ] [] [systemYamlHelper.sh:462 ] [main] - Resolved shared.logging.consoleLog.enabled (true) from /opt/jfrog/artifactory/var/etc/system.yaml
Aug 08 01:56:50 jfrog artifactoryManage.sh[10268]: frontend started. PID: 13147
Aug 08 01:56:50 jfrog systemd[1]: Started Artifactory service.
Aug 08 01:56:51 jfrog artifactoryManage.sh[10268]: 2020-08-07T17:56:51.003Z [shell] [INFO ] [] [systemYamlHelper.sh:462 ] [main] - Resolved shared.logging.consoleLog.enabled (true) from /opt/jfrog/artifactory/var/etc/system.yaml
[root@jfrog ~]#
Test:
[root@jfrog ~]# curl -I http://localhost:8082/ui/
HTTP/1.1 503 Service Unavailable
Date: Fri, 07 Aug 2020 18:08:50 GMT
Content-Length: 19
Content-Type: text/plain; charset=utf-8
[root@jfrog ~]#
/opt/jfrog/artifactory/var/log/console.log shows the following errors:
[DEBUG] Resolved system configuration file path: /opt/jfrog/artifactory/var/etc/system.yaml
No ssl parameter found, falling back to sslmode=disable
2020-08-07T17:56:50.179Z [jfmd ] [INFO ] [1462831a45a25233] [database_bearer.go:84 ] [main ] - Connecting to (db config: {postgresql user='jfroguser' password='***' dbname=jfrogdb host=dbserver.example.com port= sslmode=disable}) [database]
2020-08-07T17:56:50.216Z [jfmd ] [ERROR] [1462831a45a25233] [database_bearer.go:68 ] [main ] - Could not initialize database (db config: {postgresql user='jfroguser' password='***' dbname=jfrogdb host=dbserver.example.com port= sslmode=disable}): error connecting to database
jfrog.com/metadata/services/common/db.(*databaseBearer).init
/src/jfrog.com/metadata/services/common/db/database_bearer.go:114
jfrog.com/metadata/services/common/db.NewDatabaseBearer
/src/jfrog.com/metadata/services/common/db/database_bearer.go:66
main.main
/src/jfrog.com/metadata/metadata.go:38
runtime.main
/src/runtime/proc.go:203
runtime.goexit
/src/runtime/asm_amd64.s:1373
goroutine 1 [running]:
runtime/debug.Stack(0x38, 0xc00015c040, 0xc00032c080)
/src/runtime/debug/stack.go:24 +0x9d
jfrog.com/jfrog-go-commons/pkg/log.(*standardLogger).Panicfc(0xc00043bda0, 0x166e420, 0xc000142750, 0x13eb133, 0x32, 0xc00032c080, 0x2, 0x2)
/src/jfrog.com/go-commons/pkg/log/standard_logger.go:42 +0x6a
jfrog.com/metadata/services/common/db.NewDatabaseBearer(0x166e420, 0xc000142750, 0x166f220, 0xc00007f770, 0x1673460, 0xc0000c97c0, 0x1666260, 0xc000011098, 0x16489c0, 0xc00043bd70, ...)
/src/jfrog.com/metadata/services/common/db/database_bearer.go:68 +0x2d4
main.main()
/src/jfrog.com/metadata/metadata.go:38 +0x5b7
[database]
panic: Could not initialize database (db config: {postgresql user='jfroguser' password='***' dbname=jfrogdb host=dbserver.example.com port= sslmode=disable}): error connecting to database
jfrog.com/metadata/services/common/db.(*databaseBearer).init
/src/jfrog.com/metadata/services/common/db/database_bearer.go:114
jfrog.com/metadata/services/common/db.NewDatabaseBearer
/src/jfrog.com/metadata/services/common/db/database_bearer.go:66
main.main
/src/jfrog.com/metadata/metadata.go:38
runtime.main
/src/runtime/proc.go:203
runtime.goexit
/src/runtime/asm_amd64.s:1373
goroutine 1 [running]:
runtime/debug.Stack(0x38, 0xc00015c040, 0xc00032c080)
/src/runtime/debug/stack.go:24 +0x9d
jfrog.com/jfrog-go-commons/pkg/log.(*standardLogger).Panicfc(0xc00043bda0, 0x166e420, 0xc000142750, 0x13eb133, 0x32, 0xc00032c080, 0x2, 0x2)
/src/jfrog.com/go-commons/pkg/log/standard_logger.go:42 +0x6a
jfrog.com/metadata/services/common/db.NewDatabaseBearer(0x166e420, 0xc000142750, 0x166f220, 0xc00007f770, 0x1673460, 0xc0000c97c0, 0x1666260, 0xc000011098, 0x16489c0, 0xc00043bd70, ...)
/src/jfrog.com/metadata/services/common/db/database_bearer.go:68 +0x2d4
main.main()
/src/jfrog.com/metadata/metadata.go:38 +0x5b7
goroutine 1 [running]:
github.com/rs/zerolog.(*Logger).Panic.func1(0xc000358500, 0x4bb)
/pkg/mod/github.com/rs/zerolog@v1.18.0/log.go:338 +0x4f
github.com/rs/zerolog.(*Event).msg(0xc0000be240, 0xc000358500, 0x4bb)
/pkg/mod/github.com/rs/zerolog@v1.18.0/event.go:146 +0x200
github.com/rs/zerolog.(*Event).Msgf(0xc0000be240, 0xc000961dc0, 0x35, 0xc00015c0c0, 0x3, 0x4)
/pkg/mod/github.com/rs/zerolog@v1.18.0/event.go:126 +0x83
jfrog.com/jfrog-go-commons/pkg/log.(*standardLogger).logMessage(0xc00043bda0, 0x166e420, 0xc000142750, 0xc0000be240, 0xc000961dc0, 0x35, 0xc00015c0c0, 0x3, 0x4)
/src/jfrog.com/go-commons/pkg/log/standard_logger.go:61 +0x197
jfrog.com/jfrog-go-commons/pkg/log.(*standardLogger).Panicfc(0xc00043bda0, 0x166e420, 0xc000142750, 0x13eb133, 0x32, 0xc00015c0c0, 0x3, 0x4)
/src/jfrog.com/go-commons/pkg/log/standard_logger.go:43 +0x1df
jfrog.com/metadata/services/common/db.NewDatabaseBearer(0x166e420, 0xc000142750, 0x166f220, 0xc00007f770, 0x1673460, 0xc0000c97c0, 0x1666260, 0xc000011098, 0x16489c0, 0xc00043bd70, ...)
/src/jfrog.com/metadata/services/common/db/database_bearer.go:68 +0x2d4
main.main()
/src/jfrog.com/metadata/metadata.go:38 +0x5b7
Any ideas what to check? The server is an up-to-date CentOS 7 server. Logging in to the external database is also possible:
[root@jfrog ~]# psql -h dbserver.example.com -p 5432 -U jfrog
Password for user jfrog:
psql (11.8)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.
jfrog=> SHOW server_version;
server_version
----------------
11.8
(1 row)
jfrog=> \q
[root@jfrog ~]#
I am trying to add an additional compute node, on a different virtual machine, to a pre-installed OpenStack deployment. I disabled the firewall services and am able to ping the other virtual machine, but the compute node is still unable to register with the RabbitMQ service running on the controller node.
Here is my nova.conf file:
[DEFAULT]
dhcpbridge_flagfile=/etc/nova/nova.conf
dhcpbridge=/usr/bin/nova-dhcpbridge
state_path=/var/lib/nova
lock_path=/var/lock/nova
force_dhcp_release=True
iscsi_helper=tgtadm
libvirt_use_virtio_for_bridges=True
connection_type=libvirt
root_helper=sudo nova-rootwrap /etc/nova/rootwrap.conf
verbose=True
ec2_private_dns_show_ip=True
api_paste_config=/etc/nova/api-paste.ini
volumes_path=/var/lib/nova/volumes
enabled_apis=ec2,osapi_compute,metadata
rpc_backend = rabbit
auth_strategy = keystone
use_neutron = True
firewall_driver = nova.virt.firewall.NoopFirewallDriver
my_ip = #compute node ip
rabbit_host= #controller_node_ip
rabbit_port = 5672
rabbit_userid = stackrabbit
rabbit_password = devstack
rabbit_use_ssl = False
rabbit_virtual_host=/
[keystone_authtoken]
auth_uri = http://controller_node_ip:5000
auth_url = http://controller_node_ip:35357
memcached_servers = controller_node_ip:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = nova
password = devstack
auth_host = controller_node_ip
auth_port = 35357
auth_protocol = http
[vnc]
enabled = True
vncserver_listen = 0.0.0.0
vncserver_proxyclient_address = $my_ip
novncproxy_base_url = http://controller_node_ip:6080/vnc_auto.html
[glance]
api_servers = http://controller_node_ip:9292
[oslo_concurrency]
lock_path = /var/lib/nova/tmp
Here is my nova-compute.log:
2016-09-20 19:08:57.701 7201 INFO oslo.messaging._drivers.impl_rabbit [-] Reconnecting to AMQP server on localhost:5672
2016-09-20 19:08:57.701 7201 INFO oslo.messaging._drivers.impl_rabbit [-] Delaying reconnect for 1.0 seconds...
2016-09-20 19:08:58.708 7201 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on localhost:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 30 seconds...
Please suggest how I can resolve this issue.
Thank you in advance.
I encountered this when expanding my nova-compute estate (although I'm not using Devstack).
On my newly created compute server, the following appeared in /var/log/nova/nova-compute.log:
2017-11-14 11:40:53.287 52408 ERROR oslo.messaging._drivers.impl_rabbit [req-adfd6dc7-fe8c-4de5-8401-58d325c3b4a8 - - - - -] [be6e0302-dfc8-4512-8b48-0d824fc6ea14] AMQP server on 127.0.0.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds. Client port: None
The solution was quite simple. I checked /var/log/syslog (I run Ubuntu; use /var/log/messages on Red Hat systems) and could see the following lines:
Nov 14 12:01:48 compute2 systemd[1]: Started OpenStack Compute.
Nov 14 12:01:49 compute2 nova-compute[3222]: Traceback (most recent call last):
Nov 14 12:01:49 compute2 nova-compute[3222]: File "/usr/bin/nova-compute", line 10, in <module>
Nov 14 12:01:49 compute2 nova-compute[3222]: sys.exit(main())
Nov 14 12:01:49 compute2 nova-compute[3222]: File "/usr/lib/python2.7/dist-packages/nova/cmd/compute.py", line 42, in main
Nov 14 12:01:49 compute2 nova-compute[3222]: config.parse_args(sys.argv)
Nov 14 12:01:49 compute2 nova-compute[3222]: File "/usr/lib/python2.7/dist-packages/nova/config.py", line 52, in parse_args
Nov 14 12:01:49 compute2 nova-compute[3222]: default_config_files=default_config_files)
Nov 14 12:01:49 compute2 nova-compute[3222]: File "/usr/lib/python2.7/dist-packages/oslo_config/cfg.py", line 2355, in __call__
Nov 14 12:01:49 compute2 nova-compute[3222]: self._namespace._files_permission_denied)
Nov 14 12:01:49 compute2 nova-compute[3222]: oslo_config.cfg.ConfigFilesPermissionDeniedError: Failed to open some config files: /etc/nova/nova.conf
Nov 14 12:01:49 compute2 systemd[1]: nova-compute.service: Main process exited, code=exited, status=1/FAILURE
This shows that my /etc/nova/nova.conf file was unreadable. It turned out that I had used scp to copy nova.conf from my first compute node to the new machine, so the file was readable only by the root user. The fix was to run the following on the new compute node:
cd /etc/nova/
chown nova:nova nova.conf
service nova-compute restart
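Before restarting the service, the ownership problem described above can be confirmed from the file's permission bits. The following is a minimal sketch under simplifying assumptions: the helper readable_by is hypothetical, and it ignores ACLs, root's override, and permissions on parent directories. The demo uses a temporary file rather than /etc/nova/nova.conf.

```python
import os
import stat
import tempfile


def readable_by(path, uid, gids):
    """Return True if the permission bits on path let the given uid/gids read it."""
    st = os.stat(path)
    if st.st_uid == uid:
        # The owner is judged by the owner bits only.
        return bool(st.st_mode & stat.S_IRUSR)
    if st.st_gid in gids:
        return bool(st.st_mode & stat.S_IRGRP)
    return bool(st.st_mode & stat.S_IROTH)


if __name__ == "__main__":
    # Demo on a temporary file with mode 0600 (readable by its owner only).
    fd, demo_path = tempfile.mkstemp()
    os.close(fd)
    os.chmod(demo_path, 0o600)
    owner = os.stat(demo_path).st_uid
    print(readable_by(demo_path, owner, set()))      # owner can read
    print(readable_by(demo_path, owner + 1, set()))  # a different user cannot
    os.remove(demo_path)
```

For the real file, the uid and group ids of the nova user could be looked up with the standard pwd and grp modules and passed in.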
I'm new to DynamoDB. I have created a table and am trying to insert data into it. It works well when I connect from my home internet, but when I try from my office network, I get the error below:
I suspect this is due to proxy issues. Can you please help me resolve this issue? Thank you.
[UnknownEndpoint: Inaccessible host: `dynamodb.ap-southeast-2.amazonaws.com'. This service may not be available in the `ap-southeast-2' region.]
message: 'Inaccessible host: `dynamodb.ap-southeast-2.amazonaws.com\'. This service may not be available in the `ap-southeast-2\' region.',
code: 'UnknownEndpoint',
region: 'ap-southeast-2',
hostname: 'dynamodb.ap-southeast-2.amazonaws.com',
retryable: true,
originalError:
{ [NetworkingError: getaddrinfo ENOTFOUND dynamodb.ap-southeast-2.amazonaws.com dynamodb.ap-southeast-2.amazonaws.com:443]
message: 'getaddrinfo ENOTFOUND dynamodb.ap-southeast-2.amazonaws.com dynamodb.ap-southeast-2.amazonaws.com:443',
code: 'NetworkingError',
errno: 'ENOTFOUND',
syscall: 'getaddrinfo',
hostname: 'dynamodb.ap-southeast-2.amazonaws.com',
host: 'dynamodb.ap-southeast-2.amazonaws.com',
port: 443,
region: 'ap-southeast-2',
retryable: true,
time: Mon Sep 21 2015 11:19:58 GMT+1000 (AUS Eastern Standard Time) },
time: Mon Sep 21 2015 11:19:58 GMT+1000 (AUS Eastern Standard Time) }
Thank you for the pointers. I managed to solve the issue using the code snippet below.
var AWS = require('aws-sdk');
var proxy = require('proxy-agent');

// Route all SDK requests through the office HTTP proxy.
AWS.config.update({
  httpOptions: {
    agent: proxy('http://{user_name}:{password}@<proxy>:<port>')
  }
});
This is documented on Amazon's aws-sdk configuration page: http://docs.aws.amazon.com/AWSJavaScriptSDK/guide/node-configuring.html
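One practical detail when filling in a proxy URL of the form http://user:password@host:port is that credentials containing characters such as "@" or ":" generally need to be percent-encoded, otherwise the URL may be split in the wrong place. The sketch below only builds the string and is independent of the AWS SDK; the helper build_proxy_url and the sample values (alice, proxy.example.com, 8080) are hypothetical.

```python
from urllib.parse import quote


def build_proxy_url(user, password, host, port, scheme="http"):
    """Build scheme://user:password@host:port with percent-encoded credentials."""
    # safe="" makes quote() encode "/" as well, so it cannot be read as a path separator.
    enc_user = quote(user, safe="")
    enc_password = quote(password, safe="")
    return f"{scheme}://{enc_user}:{enc_password}@{host}:{port}"


# Hypothetical credentials; the "@" and ":" in the password are encoded.
print(build_proxy_url("alice", "p@ss:word", "proxy.example.com", 8080))
```

The resulting string can then be passed wherever the proxy URL is expected.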