logspout error like pump.Run() logs are coming in Kibana logs - kibana

We are getting the logspout error logs in the kibana alongside the application logs
[DATE_TIME] [SOME_UNIQUE_NAME] [LOG_LEVEL] [MQTT Call: UUID] [UUID] [APPlication class] - This is the actual logs
2022/12/05 05:44:05 pump.Run() event: 4618d7523b66 exec_create: /agent --healthcheck
2022/12/05 05:44:05 pump.Run() event: 4618d7523b66 exec_start: /agent --healthcheck
2022/12/05 05:44:05 pump.Run() event: cd52bcae1bea exec_create: /bin/sh -c curl -f http://localhost:5000/ || exit 1
The application log is just the first line then on we are getting this logspout errors.
We tried changing the format in which we write the logs but we are not sure why this is breaking and also why the errors are coming in.

Related

Not able to download selenium-server log from the saucelab, getting FilenotFound Exception

I am not able to download selenium-server log from the saucelab
This is the error:
Sep 07, 2021 9:27:43 PM com.saucelabs.saucerest.SauceREST saveFile
WARNING: Error downloading Sauce Results
java.io.FileNotFoundException :
This is the code:
SauceREST sauceREST = new SauceREST(username, accesskey);
sauceREST.downloadLog(sessionid, "/target/saucelab");
This usually happens because either:
Your network infrastructure is blocking requests to the REST API
The test you're grabbing logs for hasn't finalised asset upload
Check whether you can download the logs manually (docs here):
curl -u "$SAUCE_USERNAME:$SAUCE_ACCESS_KEY" --location \
--request GET 'https://api.us-west-1.saucelabs.com/rest/v1/{USERNAME}/jobs/{JOB_ID}/assets/log.json'
If you can, try wrapping your REST code in a try/catch block, and if it fails, wait a small amount of time and try again.

Setting repmgr witness node on Debian

I am trying to set up repmgr version 5 on Debian with PostgtrSql 11.
Seems like the documentation is more oriented towards centos/RHEL.
When I am trying to setup the witnes node to start the repmgr daemon, I get an error without any idea where to look for for seeing what is the cause of the error.
This is my repmgr.conf file:
node_id=3
node_name='PG-Node-Witness'
conninfo='host=10.97.7.140 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/11/main'
failover='automatic'
promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'
priority=60
monitor_interval_secs=2
connection_check_type='ping'
reconnect_attempts=6
reconnect_interval=8
primary_visibility_consensus=true
standby_disconnect_on_failover=true
repmgrd_service_start_command='sudo /etc/init.d/repmgrd start' #??????
repmgrd_service_stop_command='sudo //etc/init.d/repmgrd stop'#??????
service_start_command='sudo /usr/bin/systemctl start postgresql#11-main.service'
service_stop_command='sudo /usr/bin/systemctl stop postgresql#11-main.service'
service_restart_command='sudo /usr/bin/systemctl restart postgresql#11-main.service'
service_reload_command='sudo /usr/bin/systemctl relaod postgresql#11-main.service'
monitoring_history=yes
log_status_interval=60
register is OK:
repmgr -f /etc/repmgr.conf witness register -h 10.97.7.97
INFO: connecting to witness node "PG-Node-Witness" (ID: 3)
INFO: connecting to primary node
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
INFO: witness registration complete
NOTICE: witness node "PG-Node-Witness" (ID: 3) successfully registered
repmgr daemon dry-run OK too:
$repmgr -f /etc/repmgr.conf daemon start --dry-run
INFO: prerequisites for starting repmgrd met
DETAIL: following command would be executed:
sudo /usr/bin/systemctl start postg...#11-main.service
I setup /etc/default/repmgrd with:
REPMGRD_ENABLED=yes
and
REPMGRD_CONF="/etc/repmgr.conf"
But still get error when trying to run the daemon start:
$ repmgr -f /etc/repmgr.conf daemon start
I get:
NOTICE: executing: "sudo /etc/init.d/repmgrd start"
ERROR: repmgrd does not appear to have started after 15 seconds
HINT: use "repmgr service status" to confirm that repmgrd was successfully started
It is recommended to run repmgrd as a systemd service,
According to the docs (for debian) you may first need to configure /etc/default/repmgrd,
My configuration looks like this:
# default settings for repmgrd. This file is source by /bin/sh from
# /etc/init.d/repmgrd
# disable repmgrd by default so it won't get started upon installation
# valid values: yes/no
REPMGRD_ENABLED=yes
# configuration file (required)
REPMGRD_CONF="/etc/repmgr/12/repmgr.conf"
# additional options
REPMGRD_OPTS="--daemonize=false"
# user to run repmgrd as
REPMGRD_USER=postgres
# repmgrd binary
REPMGRD_BIN=/bin/repmgrd
# pid file
REPMGRD_PIDFILE=/var/run/repmgrd.pid
Secondly, I would revisit sudoers (visudo) in order to check whether the non-root user can execute sudo /etc/init.d/repmgrd start.
Further, the user who runs repmgr commands should be able to write logs depending on your configuration.
Apparently the correct command to start the repmgr daemon is:
repmgrd -f /etc/prepmgr.conf

Task fails due to not being able to read log file

Composer is failing a task due to it not being able to read a log file, it's complaining about incorrect encoding.
Here's the log that appears in the UI:
*** Unable to read remote log from gs://bucket/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** 'ascii' codec can't decode byte 0xc2 in position 6986: ordinal not in range(128)
*** Log file does not exist: /home/airflow/gcs/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Fetching from: http://airflow-worker-68dc66c9db-x945n:8793/log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-68dc66c9db-x945n', port=8793): Max retries exceeded with url: /log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1c9ff19d10>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I try viewing the file in the google cloud console and it also throws an error:
Failed to load
Tracking Number: 8075820889980640204
But I am able to download the file via gsutil.
When I view the file, it seems to have text overriding other text.
I can't show the entire file but it looks like this:
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------
#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,313] {models.py:1569} INFO - Executing <Task(BigQueryOperator): merge_campaign_exceptions> on 2019-08-03T10:00:00+00:00#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,314] {base_task_runner.py:124} INFO - Running: ['bash', '-c', u'airflow run __campaign_exceptions_0_0_1 merge_campaign_exceptions 2019-08-03T10:00:00+00:00 --job_id 22767 --pool _bq_pool --raw -sd DAGS_FOLDER//-campaign-exceptions.py --cfg_path /tmp/tmpyBIVgT']#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:24,658] {base_task_runner.py:107} INFO - Job 22767: Subtask merge_campaign_exceptions [2019-08-04 10:01:24,658] {settings.py:176} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
Where the #-#{} pieces seems to be "on top of" the typical log.
I faced the same problem. In my case the problem was that I removed the google_gcloud_default connection that was being used to retrieve the logs.
Check the configuration and look for the connection name.
[core]
remote_log_conn_id = google_cloud_default
Then check the credentials used for that connection name has the right permissions to access the GCS bucket.
I'm having a similar problem with viewing logs in GCP Cloud Composer. It doesn't appear to be preventing the failing DAG task from running though. What it looks like is a permissions error between the GKE and Storage Bucket where the log files are kept.
You can still view the logs by going into your cluster's storage bucket in the same directory as your /dags folder where you should also see a logs/ folder.
Your helm chart should setup global env:
- name: AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT
value: "google-cloud-platform://"
Then, you should deploy a Dockerfile with root account only (not airflow account), additionaly, you set up your helm uid, gid as:
uid: 50000 #airflow user
gid: 50000 #airflow group
Then upgrade helm chart with new config
*** Unable to read remote log from gs://bucket
1)Found the solution after assigning the roles to the service account
2)The SA key(json or txt) to be added and configured to the connection in the
remote_log_conn_id = google_cloud_default
3)restart the scheduler and webserver of the airflow
4)restart the dags on the airflow
you can find the logs on the GCS bucket where its configured

ICp 2.1.0.1: Installation failed with error TASK [master: Waiting for MariaDB service to start]

I am installing ICp 2.1.0.1 and I received an error at the TASK
[master: Waiting for MariaDB service to start] msg: The MariaDB
component failed to start.
After this msg the installation completed with failed status.
We are installing ICp with 3 Masters, 3 Proxies and 2 Workers. We have 1 IP for VIP master and 1 for VIP proxy.
I tried to install multiple times and all installations got the same error.
For prior issues with that error, the correct db admin password was not used. So check the db user and password to resolve issue.
Would you validate whether each master host was able to access port 3306 on the other hosts?
If you run with .. install -vv | tee -a install-log.txt, do you get additional details as well?
The error was solved by following the steps below.
Check whether kubelet is running:
Log in to your master node.
Run the following command to check kubelet status:
systemctl status kubelet
If kubelet is not running, run the following command to get the logs:
journalctl -u kubelet &> kubelet.log
We found the error in the kubelet.log log:
Error: failed to run Kubelet: Running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false.
We found this troubleshoot in this link, and the solution at the ICP issue 4651.
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_2.1.0/troubleshoot/etcd_fails.html
https://github.ibm.com/IBMPrivateCloud/roadmap/issues/4651

Authentication failed with capifony

I'm trying to do a Symfony 2 project deployment web app based on capifony and Symfony2.
It uses Process to trigger my "cap deploy" task and display my output in a web browser.
When in a shell, if I run my "cap deploy" as user www-data (the same as used by Process) , my deployement works fine so there's nothing wrong either with my deploy task nor with my authentication keys.
Though, when I call my task from my web app, capifony tells me it can't authenticate on the remote server.
triggering start callbacks for `deploy'
* executing `deploy:setdomain'
* executing `deploy'
* executing `deploy:update'
** transaction: start
* executing `deploy:update_code'
triggering before callbacks for `deploy:update_code'
[32m--> Updating code base with checkout strategy[0m
executing locally: "git ls-remote [ myrepo ]"
command finished in 2068ms
* executing "git clone -q -o [ remote server ] [ my repo ]
/var/www/spinfony/releases/20121211100449 && cd /var/www/spinfony/releases/20121211100449 && git checkout -q -b deploy be53233e51a4c542c3bc8603b424e57f988898a4 && (echo be53233e51a4c542c3bc8603b424e57f988898a4 > /var/www/spinfony/releases/20121211100449/REVISION)"
servers: ["[ remote server ]"]
Password: stty: standard input: Invalid argument
stty: standard input: Invalid argument
stty: standard input: Invalid argument
*** [deploy:update_code] rolling back
* executing "rm -rf /var/www/spinfony/releases/20121211100449; true"
servers: ["[ remote server ]"]
** [deploy:update_code] exception while rolling back: Capistrano::ConnectionError, connection failed for: [ remote server ] (Net::SSH::AuthenticationFailed: [ user ])
connection failed for: [ remote server ] (Net::SSH::AuthenticationFailed: [ user ])
I'm trying to figure out why capifony seems to expect a password I can't provide since i'm not running it from a shell, whereas when I do run it from a shell, it works fine without asking me anything.
Once again, the same file is called from the same user.
This is a known "bug"
You need to tell capistrano wich key to use
Try adding this to your deploy.rb :
ssh_options[:keys] = %w(/what/ever/.ssh/id_rsa)
Source : http://adam.goucher.ca/?p=1253
When you call your task from a web app, you need to tell it what user to use.
set :user, "www-data"
set :domain, "webserverdomainname.com"

Resources