Upload Stemcell Error - Keystone connection timed out - openstack

I am getting a connection timeout error while uploading the stemcell to the BOSH director. I am using BOSH CLI v2. The following is my error log.
> bosh -e sdp-bosh-env upload-stemcell https://bosh.io/d/stemcells/bosh-openstack-kvm-ubuntu-trusty-go_agent?v=3541.12 --fix
Using environment '10.82.73.8' as client 'admin'
Task 13
Task 13 | 05:02:40 | Update stemcell: Downloading remote stemcell (00:00:51)
Task 13 | 05:03:31 | Update stemcell: Extracting stemcell archive (00:00:03)
Task 13 | 05:03:34 | Update stemcell: Verifying stemcell manifest (00:00:00)
Task 13 | 05:03:35 | Update stemcell: Checking if this stemcell already exists (00:00:00)
Task 13 | 05:03:35 | Update stemcell: Uploading stemcell bosh-openstack-kvm-ubuntu-trusty-go_agent/3541.12 to the cloud (00:10:41)
L Error: CPI error 'Bosh::Clouds::CloudError' with message 'Unable to connect to the OpenStack Keystone API http://10.81.102.5:5000/v2.0/tokens
Connection timed out - connect(2) for 10.81.102.5:5000 (Errno::ETIMEDOUT)' in 'create_stemcell' CPI method
Task 13 | 05:14:16 | Error: CPI error 'Bosh::Clouds::CloudError' with message 'Unable to connect to the OpenStack Keystone API http://10.81.102.5:5000/v2.0/tokens
Connection timed out - connect(2) for 10.81.102.5:5000 (Errno::ETIMEDOUT)' in 'create_stemcell' CPI method
Task 13 Started Sat Apr 7 05:02:40 UTC 2018
Task 13 Finished Sat Apr 7 05:14:16 UTC 2018
Task 13 Duration 00:11:36
Task 13 error
Uploading remote stemcell 'https://bosh.io/d/stemcells/bosh-openstack-kvm-ubuntu-trusty-go_agent?v=3541.12':
Expected task '13' to succeed but state is 'error'
Exit code 1

Check OpenStack's security group for the BOSH Director machine.
The security group should contain an ALLOW IPv4 to 0.0.0.0/0 egress rule; if it doesn't, add at least an egress TCP rule to 10.81.102.5 on port 5000.
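If the rule is missing, a minimal sketch with the OpenStack CLI could look like this (the group name bosh-director-sg is a placeholder for whatever security group is attached to the director VM):
# Hypothetical example: allow egress from the director to the Keystone endpoint only
openstack security group rule create \
  --egress --protocol tcp --dst-port 5000 \
  --remote-ip 10.81.102.5/32 \
  bosh-director-sg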
Check the connection from the director over SSH:
bbl ssh --director
nc -tvn 10.81.102.5 5000
If that doesn't help, check the network/firewall configuration.
https://bosh.io/docs/uploading-stemcells/
https://github.com/cloudfoundry/bosh-bootloader/blob/master/terraform/openstack/templates/resources.tf

Related

Error on Starting MySQL Cluster 8.0 Data Node on Ubuntu 22.04 LTS

When I start data node 1 (10.1.1.103) of MySQL Cluster 8.0 on Ubuntu 22.04 LTS, I get the following error:
# ndbd
Failed to open /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list: No such file or directory
2023-01-02 17:16:55 [ndbd] INFO -- Angel connected to '10.1.1.102:1186'
2023-01-02 17:16:55 [ndbd] INFO -- Angel allocated nodeid: 2
When I start data nodeid 2 (10.1.1.105) I get the following error:
# ndbd
Failed to open /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list: No such file or directory
2023-01-02 11:10:04 [ndbd] INFO -- Angel connected to '10.1.1.102:1186'
2023-01-02 11:10:04 [ndbd] ERROR -- Failed to allocate nodeid, error: 'Error: Could not alloc node id at 10.1.1.102:1186: Connection done from wrong host ip 10.1.1.105.'
The management node log file (/var/lib/mysql-cluster/ndb_1_cluster.log) reports:
2023-01-02 11:28:47 [MgmtSrvr] INFO -- Node 2: Initial start, waiting for 3 to connect, nodes [ all: 2 and 3 connected: 2 no-wait: ]
What is the relevance of failing to open: /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list: No such file or directory?
Why is data node on 10.1.1.105 unable to allocate a nodeid?
I initially installed a single Management Node on 10.1.1.102:
wget https://dev.mysql.com/get/Downloads/MySQL-Cluster-8.0/mysql-cluster_8.0.31-1ubuntu22.04_amd64.deb-bundle.tar
tar -xf mysql-cluster_8.0.31-1ubuntu22.04_amd64.deb-bundle.tar
dpkg -i mysql-cluster-community-management-server_8.0.31-1ubuntu22.04_amd64.deb
mkdir /var/lib/mysql-cluster
vi /var/lib/mysql-cluster/config.ini
The configuration set up in config.ini:
[ndbd default]
# Options affecting ndbd processes on all data nodes:
NoOfReplicas=2 # Number of replicas
[ndb_mgmd]
# Management process options:
hostname=10.1.1.102 # Hostname of the manager
datadir=/var/lib/mysql-cluster # Directory for the log files
[ndbd]
hostname=10.1.1.103 # Hostname/IP of the first data node
NodeId=2 # Node ID for this data node
datadir=/usr/local/mysql/data # Remote directory for the data files
[ndbd]
hostname=10.1.1.105 # Hostname/IP of the second data node
NodeId=3 # Node ID for this data node
datadir=/usr/local/mysql/data # Remote directory for the data files
[mysqld]
# SQL node options:
hostname=10.1.1.102 # In our case the MySQL server/client is on the same Droplet as the cluster manager
I then started the management server, killed the running process, and created a systemd unit file for the cluster manager:
ndb_mgmd -f /var/lib/mysql-cluster/config.ini
pkill -f ndb_mgmd
vi /etc/systemd/system/ndb_mgmd.service
Adding the following configuration:
[Unit]
Description=MySQL NDB Cluster Management Server
After=network.target auditd.service
[Service]
Type=forking
ExecStart=/usr/sbin/ndb_mgmd -f /var/lib/mysql-cluster/config.ini
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target
I then reloaded the systemd daemon to apply the changes, started and enabled the Cluster Manager and checked its active status:
systemctl daemon-reload
systemctl start ndb_mgmd
systemctl enable ndb_mgmd
Here is the status of the Cluster Manager:
# systemctl status ndb_mgmd
● ndb_mgmd.service - MySQL NDB Cluster Management Server
Loaded: loaded (/etc/systemd/system/ndb_mgmd.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2023-01-01 08:25:07 CST; 27min ago
Main PID: 320972 (ndb_mgmd)
Tasks: 12 (limit: 9273)
Memory: 2.5M
CPU: 35.467s
CGroup: /system.slice/ndb_mgmd.service
└─320972 /usr/sbin/ndb_mgmd -f /var/lib/mysql-cluster/config.ini
Jan 01 08:25:07 nuc systemd[1]: Starting MySQL NDB Cluster Management Server...
Jan 01 08:25:07 nuc ndb_mgmd[320971]: MySQL Cluster Management Server mysql-8.0.31 ndb-8.0.31
Jan 01 08:25:07 nuc systemd[1]: Started MySQL NDB Cluster Management Server.
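At this point the management client, if installed, can show which node IDs the manager expects and which nodes have actually connected; a quick check, assuming ndb_mgm is available and the manager listens on the default port 1186:
ndb_mgm -c 10.1.1.102 -e show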
I then set up a data node on 10.1.1.103, installing dependencies, downloading the data node and setting up its config:
apt update && apt -y install libclass-methodmaker-perl
wget https://dev.mysql.com/get/Downloads/MySQL-Cluster-8.0/mysql-cluster_8.0.31-1ubuntu22.04_amd64.deb-bundle.tar
tar -xf mysql-cluster_8.0.31-1ubuntu22.04_amd64.deb-bundle.tar
dpkg -i mysql-cluster-community-data-node_8.0.31-1ubuntu22.04_amd64.deb
vi /etc/my.cnf
I entered the address of the Cluster Management Node in the configuration:
[mysql_cluster]
# Options for NDB Cluster processes:
ndb-connectstring=10.1.1.102 # location of cluster manager
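Before starting the data node, it may be worth confirming that the management server is reachable on its default port 1186, for example with a quick probe (assuming nc is installed on the data node host):
nc -vz 10.1.1.102 1186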
I then created a data directory and started the node:
mkdir -p /usr/local/mysql/data
ndbd
This is when I got the "Failed to open" error result on data node 1 (10.1.1.103):
# ndbd
Failed to open /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list: No such file or directory
2023-01-02 17:16:55 [ndbd] INFO -- Angel connected to '10.1.1.102:1186'
2023-01-02 17:16:55 [ndbd] INFO -- Angel allocated nodeid: 2
UPDATED (2023-01-02)
Thank you @MauritzSundell. I corrected the (private) IP addresses above and no longer get the following error:
# ndbd
Failed to open /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list: No such file or directory
ERROR: Unable to connect with connect string: nodeid=0,10.1.1.2:1186
Retrying every 5 seconds. Attempts left: 12 11 10 9 8 7 6 5 4 3 2 1, failed.
2023-01-01 14:41:57 [ndbd] ERROR -- Could not connect to management server, error: ''
Also @MauritzSundell, in order to use the ndbmtd process rather than the ndbd process, does any alteration need to be made to any of the configuration files (e.g. /etc/systemd/system/ndb_mgmd.service)?
What is the appropriate reference/tutorial documentation for MySQL Cluster 8.0? Is it the "MySQL NDB Cluster 8.0" guide at:
https://downloads.mysql.com/docs/mysql-cluster-excerpt-8.0-en.pdf
Or is it "MySQL InnoDB Cluster" on:
https://dev.mysql.com/doc/refman/8.0/en/mysql-innodb-cluster-introduction.html
Not sure I understand the difference.
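For what it's worth, the MySQL documentation describes ndbmtd as a drop-in, multi-threaded version of ndbd that accepts the same options and configuration, so the ndb_mgmd.service unit (which only runs the management server) would presumably not need changes. A data-node unit analogous to the one above might look like this rough sketch (the /usr/sbin/ndbmtd path is an assumption based on the data-node package; the connect string comes from /etc/my.cnf):
[Unit]
Description=MySQL NDB Cluster Data Node (multi-threaded)
After=network.target
[Service]
Type=forking
# ndbmtd reads ndb-connectstring from /etc/my.cnf, like ndbd
ExecStart=/usr/sbin/ndbmtd
Restart=on-failure
[Install]
WantedBy=multi-user.target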

Connection string for MariaDB

I'm running CentOS v7.9 with MariaDB v5.5.68. I'm trying to access the MariaDB databases from a Win10 machine using Visual Studio Code with SQLTools & MySQL/MariaDB extensions.
I have configured MariaDB for remote access per this link: Configuring MariaDB for Remote Client Access
[mysqld]
skip-networking=0
skip-bind-address
I created the users and added the privileges - tested by logging in locally with 'bob' and viewing permissions in mysql.user. (BTW, in case not readily apparent, the UID, host, and PWD aren't real.)
CREATE USER 'bob'@'1.2.3.%' IDENTIFIED BY 'myPWD';
GRANT ALL PRIVILEGES ON *.* TO 'bob'@'1.2.3.%' IDENTIFIED BY 'myPWD';
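To confirm the grant landed on the intended host pattern, the grants can be inspected from a local session (standard MariaDB syntax):
SHOW GRANTS FOR 'bob'@'1.2.3.%';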
However, when I try to log in remotely (from another Linux box) using mysql -u userID -h hostIP -p, I get the error:
ERROR 2003 (HY000): Can't connect to MySQL server on '1.2.3.4' (110)
When I try to make the database connection using VS Code, SQLTools tells me I've connected, but it won't show any tables, I'm not able to make any queries, and I get this error: Request connection/GetChildrenForTreeItemRequest failed with message: Handshake inactivity timeout.
I have reviewed this SO page and others, but still can't get the connection to work.
UPDATED for clarity - provides mysql.user and netstat info:
MariaDB [(none)]> select user, host from mysql.user;
+------+-------------+
| user | host        |
+------+-------------+
| bob  | 10.0.2.15   |  # Can't connect
| rob  | 127.0.0.1   |  # Logs in locally via command line
| root | 127.0.0.1   |  # Logs in locally via command line
| bob  | 192.168.0.% |  # Can't connect
| root | 192.168.0.% |  # Can't connect
| root | ::1         |  # Logs in locally via command line
| rob  | localhost   |  # Logs in locally via command line
| root | localhost   |  # Logs in locally via command line
+------+-------------+
8 rows in set (0.00 sec)
$ netstat -tulpen
Active Internet connections (only servers)
Proto  Recv-Q  Send-Q  Local Address  Foreign Address  State   User  Inode  PID/Program name
tcp    0       0       0.0.0.0:3306   0.0.0.0:*        LISTEN  27    33813  -
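Since the server is listening on 0.0.0.0:3306 and error 2003 with errno 110 is a connection timeout, it may also be worth checking the CentOS 7 firewall; a hedged sketch with firewalld (only applicable if firewalld is the active firewall on this box):
firewall-cmd --list-all                       # check whether the mysql service / port 3306 is already allowed
firewall-cmd --permanent --add-service=mysql  # open the standard MySQL/MariaDB port
firewall-cmd --reload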
Any help is much appreciated as I've been working this problem for 2+ days and have not made any headway.

Airflow does not update the progress of a dag/task to be completed even though the dag/task has actually completed

I have set up Airflow to run in distributed mode with 10 worker nodes. I tried to assess the performance of parallel workloads by triggering a test DAG which contains just one task that sleeps for 3 seconds and then exits.
I triggered the dag using the command
airflow backfill test_dag -s 2015-06-20 -e 2015-07-10
The scheduler kicks off the jobs/DAGs in parallel, and I frequently see the output below:
[2017-06-27 09:52:29,611] {models.py:4024} INFO - Updating state for considering 1 task(s)
[2017-06-27 09:52:29,647] {models.py:4024} INFO - Updating state for considering 1 task(s)
[2017-06-27 09:52:29,664] {jobs.py:1983} INFO - [backfill progress] | finished run 19 of 21 | tasks waiting: 0 | succeeded: 19 | kicked_off: 2 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
Here kicked_off: 2 indicates that 2 tasks have been kicked off, but when I check the UI for the status of the DAG runs, I see 2 DAG instances still shown as running. When I look into the respective task instance logs, they indicate that the tasks completed successfully, yet the message above keeps being printed to the command prompt indefinitely:
[2017-06-27 09:52:29,611] {models.py:4024} INFO - Updating state for considering 1 task(s)
[2017-06-27 09:52:29,647] {models.py:4024} INFO - Updating state for considering 1 task(s)
[2017-06-27 09:52:29,664] {jobs.py:1983} INFO - [backfill progress] | finished run 19 of 21 | tasks waiting: 0 | succeeded: 19 | kicked_off: 2 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
Is it that the messages being sent by the worker are getting dropped, and hence the status is not getting updated?
Is there any parameter in the airflow.cfg file which allows stuck jobs like these to be retried on other worker nodes instead of waiting indefinitely for the message from the worker node responsible for executing the above tasks?

nova ERROR: [Errno 111] Connection refused

I'm using CentOS 6.5 x86_64 to set up OpenStack Havana, and all services worked well. But after rebooting the operating system, I found that the nova service no longer works properly; the following error is triggered:
nova flavor-list
ERROR: [Errno 111] Connection refused
Reviewing the log files in /var/log/nova gives the following error:
2014-03-24 12:24:04.293 6275 INFO nova.osapi_compute.wsgi.server [-] (6275) wsgi starting up
2014-03-24 12:24:04.297 6267 CRITICAL nova [-] [Errno 98] Address already in use
2014-03-24 12:24:04.412 6275 INFO nova.openstack.common.service [-] Parent process has died unexpectedly, exiting
2014-03-24 12:24:04.412 6274 INFO nova.openstack.common.service [-] Parent process has died unexpectedly, exiting
2014-03-24 12:24:04.412 6275 INFO nova.wsgi [-] Stopping WSGI server.
2014-03-24 12:24:04.412 6274 INFO nova.wsgi [-] Stopping WSGI server.
The state of my OpenStack server
nova-manage service list
Binary            Host        Zone      Status   State  Updated_At
nova-cert         controller  internal  enabled  :-)    2014-03-24 14:28:03
nova-consoleauth  controller  internal  enabled  :-)    2014-03-24 14:28:01
nova-scheduler    controller  internal  enabled  :-)    2014-03-24 14:28:00
nova-conductor    controller  internal  enabled  :-)    2014-03-24 14:27:59
nova-compute      controller  nova      enabled  :-)    2014-03-24 14:28:06
nova-network      controller  internal  enabled  :-)    2014-03-24 14:27:58
keystone service-list
+----------------------------------+----------+----------+---------------------------+
| id                               | name     | type     | description               |
+----------------------------------+----------+----------+---------------------------+
| 7ce108d652ee48d7897127045a371795 | cinder   | volume   | Cinder Volume Service     |
| 9452b875328f4763b7766eb533bd75c4 | cinderv2 | volumev2 | Cinder Volume Service V2  |
| e9607d1a308140298f8364fd2a0e62a8 | glance   | image    | Glance Image Service      |
| b7ac07f69e2e41f684d6470c69db4781 | keystone | identity | Keystone Identity Service |
| cbdfa73329094d7d94c7464b9bf0ef7d | nova     | compute  | Nova Compute service      |
+----------------------------------+----------+----------+---------------------------+
ps -ef | grep "nova-api"
nova 2522 1 0 11:22 ? 00:00:00 /usr/bin/python /usr/bin/nova-api-metadata --logfile /var/log/nova/metadata-api.log
root 11909 6217 0 15:11 pts/1 00:00:01 gedit nova-api.log
root 12644 3832 0 15:31 pts/0 00:00:00 grep nova-api
netstat -napo | grep 877
tcp 0 0 0.0.0.0:8775 0.0.0.0:* LISTEN 2522/python off (0.00/0/0)
Any pointers would be extremely helpful.
Thanks
Firstly, I strongly recommend that you look for or ask for an answer on ask.openstack.org.
From what you described, it may be caused by having both the nova-api-metadata and nova-api services enabled at the same time.
From the default configuration we know that ['ec2', 'osapi_compute', 'metadata'] are enabled; see https://github.com/openstack/nova/blob/stable/havana/nova/service.py#L55
So nova-api starts each of these services one by one when it is launched; see https://github.com/openstack/nova/blob/stable/havana/nova/cmd/api.py#L45
Since the nova-api-metadata service is already running, port 8775 is in use, so one of the services launched by nova-api dies; because that exception is not caught, the other two die as well, and you get what you see in the log.
If what I've assumed is right, please disable the nova-api-metadata service and use the nova-api service only, which means 'chkconfig openstack-nova-api-metadata off; chkconfig openstack-nova-api on'. I'm not sure about the exact service names on your system, but they should be something like that; correct me if I'm wrong.
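As a sketch of that suggestion on CentOS 6 (the service names are the answer's guesses and may differ on your system):
chkconfig openstack-nova-api-metadata off
service openstack-nova-api-metadata stop    # frees port 8775
chkconfig openstack-nova-api on
service openstack-nova-api start            # serves ec2, osapi_compute and metadata together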
Connection refused is a commonly encountered error. One of the cases is Keystone refusing the connection for the nova service.
Make sure the SERVICE_PASSWORD for nova and quantum is the same as the one used when creating the Keystone services. Go to the quantum and nova config files and verify that the SERVICE_PASSWORD values match.
Njoy!!

WebLogic OBIEE Scheduler Component Down

I have an OBIEE 11g installation on a Red Hat machine, but I'm having problems getting it running. I can start WebLogic and its services, so I'm able to enter the WebLogic console and Enterprise Manager, but problems arise when I try to start the OBIEE components with the opmnctl command.
The steps I’m performing are the following:
1) Start WebLogic
cd /home/Oracle/Middleware/user_projects/domains/bifoundation_domain/bin/
./startWebLogic.sh
2) Start NodeManager
cd /home/Oracle/Middleware/wlserver_10.3/server/bin/
./startNodeManager.sh
3) Start Managed WebLogic
cd /home/Oracle/Middleware/user_projects/domains/bifoundation_domain/bin/
./startManagedWebLogic.sh bi_server1
4) Set up OBIEE Components
cd /home/Oracle/Middleware/instances/instance1/bin/
./opmnctl startall
The result is:
opmnctl startall: starting opmn and all managed processes...
================================================================================
opmn id=JustiziaInf.mmmmm.mmmmm.9999
Response: 4 of 5 processes started.
ias-instance id=instance1
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ias-component/process-type/process-set:
coreapplication_obisch1/OracleBISchedulerComponent/coreapplication_obisch1/
Error
--> Process (index=1,uid=1064189424,pid=4396)
failed to start a managed process after the maximum retry limit
Log:
/home/Oracle/Middleware/instances/instance1/diagnostics/logs/OracleBISchedulerComponent/
coreapplication_obisch1/console~coreapplication_obisch1~1.log
5) Check the status of components
cd /home/Oracle/Middleware/instances/instance1/bin/
./opmnctl status
Processes in Instance: instance1
---------------------------------+--------------------+---------+---------
ias-component | process-type | pid | status
---------------------------------+--------------------+---------+---------
coreapplication_obiccs1 | OracleBIClusterCo~ | 8221 | Alive
coreapplication_obisch1 | OracleBIScheduler~ | N/A | Down
coreapplication_obijh1 | OracleBIJavaHostC~ | 8726 | Alive
coreapplication_obips1 | OracleBIPresentat~ | 6921 | Alive
coreapplication_obis1 | OracleBIServerCom~ | 7348 | Alive
Read the log file at /home/Oracle/Middleware/instances/instance1/diagnostics/logs/OracleBISchedulerComponent/coreapplication_obisch1/console~coreapplication_obisch1~1.log.
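To dig further, it may help to tail that log and then try restarting only the scheduler component; a sketch using the instance paths from the question (the ias-component name is taken from the opmnctl status output above):
tail -100 /home/Oracle/Middleware/instances/instance1/diagnostics/logs/OracleBISchedulerComponent/coreapplication_obisch1/console~coreapplication_obisch1~1.log
cd /home/Oracle/Middleware/instances/instance1/bin/
./opmnctl startproc ias-component=coreapplication_obisch1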
I would recommend trying the steps in the link below, as this is a common issue when upgrading OBIEE.
http://www.askjohnobiee.com/2012/11/fyi-opmnctl-failed-to-start-managed.html
Not sure what your log says, but try the steps below and check whether it works:
Log in as superuser:
cd $ORACLE_HOME/Apache/Apache/bin
chmod 6750 .apachectl
Log out, log back in as the ORACLE user, and run:
opmnctl startproc process-type=OracleBIScheduler
