I just want to know if it is possible to configure Cinder to use LVM backends on different servers.
Example:
-server1 (10.10.0.1) VG-cinder
-Server2 (10.10.0.2) VG-cinder2
I know that it is possible to configure multiple backends such as Ceph, LVM, etc., but I haven't been able to find out how to configure multiple LVM backends on different servers.
I have tried the following configuration, but I get an error when I create a volume on that backend:
[lvm2]
target_helper = lioadm
target_protocol = iscsi
iscsi_ip_address = xx.xx.xx.x
volume_group = VG-cinder
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volumes_dir = $state_path/volumes
The error:
ERROR cinder.scheduler.flows.create_volume [req-58f3e687-a50b-499a-97b5-ec93a0b1e9eb 2278c149f52d44d890c3e6506168bf6a 6e7c315ff9ac432595c4260ab74967e0 - default default] Failed to run task cinder.scheduler.flows.create_volume.ScheduleCreateVolumeTask;volume:create: No valid backend was found. No weighed backends available: NoValidBackend: No valid backend was found. No weighed backends available
Thanks a lot.
I am using Ansible to create a server in the Hetzner Cloud, the playbook reads:
- name: create the server at Hetzner
  hetzner.hcloud.hcloud_server:
    name: "{{server_hostname}}"
    enable_ipv4: false
    enable_ipv6: false
    server_type: cx11
    location: "{{server_location}}"
    image: ubuntu-22.04
    ssh_keys:
      - "mykey"
    state: present
    api_token: "{{hetzner_secret}}"
    private_networks: ipfire
  register: server
My aim is to integrate the new server into the private network named 'ipfire' that I have previously created. The server should not be accessible via the internet, so I have disabled ipv4 and ipv6. Rather, I'd like to access the server by connecting via OpenVPN to the private network 'ipfire' and connect by use of ssh from there.
Unfortunately, I get an error message as follows:
PLAY [Order servers] ********************************************************************************************************
TASK [hetznerserver : create the server at Hetzner] *************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Unsupported parameters for (hetzner.hcloud.hcloud_server) module: private_networks. Supported parameters include: rebuild_protection, api_token, location, enable_ipv6, upgrade_disk, ipv4, endpoint, ipv6, firewalls, server_type, state, force, labels, ssh_keys, delete_protection, image, id, name, enable_ipv4, placement_group, force_upgrade, user_data, datacenter, rescue_mode, allow_deprecated_image, volumes, backups."}
PLAY RECAP ******************************************************************************************************************
localhost : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
The private_networks parameter does not seem to work like this?
Error messages like Unsupported parameters for (<moduleName>) module: <givenParameter>. Supported parameters include: <supportedParametersList> usually mean that the task passes a parameter that the installed version of the module does not know.
Therefore one may need to look up the respective documentation, in the example case hcloud_server module – Create and manage cloud servers on the Hetzner Cloud.
If the documentation shows that the parameter in question is available, this indicates
either a version mismatch, meaning the installed version of the module is too old and an update is necessary,
or a bug within the module code, in which case further debugging and investigation of the module code is necessary.
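The error message itself already lists what the installed module accepts, so it can be compared against the documentation. A minimal sketch of that check, assuming the message has the shape shown above; the function name is made up for illustration, and the message below is shortened from the one in the question:

```python
import re


def unsupported_and_supported(msg):
    """Split an 'Unsupported parameters' message into (rejected, accepted)."""
    match = re.search(
        r"Unsupported parameters for \((?P<module>[^)]+)\) module: (?P<bad>.+?)\. "
        r"Supported parameters include: (?P<good>.+?)\.?$",
        msg,
    )
    if match is None:
        raise ValueError("not an 'Unsupported parameters' message")
    rejected = [p.strip() for p in match.group("bad").split(",")]
    accepted = {p.strip() for p in match.group("good").split(",")}
    return rejected, accepted


# Shortened copy of the message from the question.
msg = (
    "Unsupported parameters for (hetzner.hcloud.hcloud_server) module: private_networks. "
    "Supported parameters include: rebuild_protection, api_token, location, enable_ipv6, "
    "firewalls, server_type, state, ssh_keys, image, name, enable_ipv4, volumes, backups."
)
rejected, accepted = unsupported_and_supported(msg)
print(rejected)
print("private_networks" in accepted)
```

If a parameter is documented but missing from the accepted set, the installed module is most likely older than the documentation.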
Code and Documentation Links
Community Authors> hetzner> hcloud
ansible-collections / hetzner.hcloud
After further investigation it might turn out that the parameter in question was introduced only recently, for example
GitHub hetzner.hcloud Issue #150 "Unable to create cloud server without public ipv4 and ipv6"
GitHub hetzner.hcloud Pull #160 "Add possibility to specify private network when creating or updating servers"
which indicates that in your case you'll need to update the hetzner.hcloud Ansible Collection, since the parameter is not available in the version you are using; it was introduced in v1.9.0.
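Whether an installed collection is new enough can be checked by comparing version numbers numerically rather than as strings. A small sketch, assuming plain X.Y.Z version strings; the helper names and the two sample versions are hypothetical, and the installed version itself would have to be looked up separately (for example with ansible-galaxy collection list):

```python
def version_tuple(version):
    """Turn a plain 'X.Y.Z' version string into a tuple of ints."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def has_private_networks(installed, introduced="1.9.0"):
    """True if the installed collection is at least the version that added the parameter."""
    return version_tuple(installed) >= version_tuple(introduced)


# Hypothetical installed versions, for illustration only.
print(has_private_networks("1.8.2"))
print(has_private_networks("1.10.0"))
```

Comparing tuples of integers avoids the trap of string comparison, where "1.10.0" would sort before "1.9.0".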
Just created a managed 2-node Kubernetes (ver. 1.22.8) cluster on DigitalOcean (DO).
After installing WordPress using Bitnami Helm chart, and then installing a WP plugin, the site became unreachable.
Looking into DO K8s dashboard in the deployment section, the wordpress deployment shows the following error:
0/2 nodes are available: 2 pod has unbound immediate PersistentVolumeClaims.
AttachVolume.Attach failed for volume "pvc-c859847e-f250-4e71-9ed3-63c92cc01f50" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
MountVolume.MountDevice failed for volume "pvc-c859847e-f250-4e71-9ed3-63c92cc01f50" : rpc error: code = Internal desc = formatting disk failed: exit status 1 cmd: 'mkfs.ext4 -F /dev/disk/by-id/scsi-0DO_Volume_pvc-c859847e-f250-4e71-9ed3-63c92cc01f50' output: "mke2fs 1.45.5 (07-Jan-2020)\nThe file /dev/disk/by-id/scsi-0DO_Volume_pvc-c859847e-f250-4e71-9ed3-63c92cc01f50 does not exist and no size was specified.\n"
Readiness probe failed: HTTP probe failed with statuscode: 404
As I'm quite new to K8s, I don't know how to troubleshoot this.
Any help would be much appreciated.
UPDATE
I was able to reproduce the error and found what triggers it.
The Bitnami WordPress chart installs several WP plugins by default. As soon as I try to delete them, the error shows up and the persistent volume gets corrupted...
Is this a bug, or is it standard behavior?
As the title says. I cannot run h2o.init().
I have already downloaded the 64 bit version of the Java SE Development Kit 8u291. I also downloaded the xgboost library in R (install.packages("xgboost") ). Finally, I have updated all my NVIDIA drivers and downloaded the latest CUDA (although, tbh I don't even know what that does). I followed the steps described in the NVIDIA forums to avoid the crash I had when installing (i.e. remove integration with visual studio). FWIW I'm using a DELL Inspiron 15 Gaming and it has a NVIDIA GTX 1050 with 4GB.
Here's the full code I'm using (straight from the h2o download instructions except for the first line):
library(xgboost)
library(h2o)
localH2O = h2o.init()
demo(h2o.kmeans)
Any help would be much appreciated.
The full message I get when running the above code chunk:
H2O is not running yet, starting it now...
Note: In case of errors look at the following log files:
C:\Users\<my username>\AppData\Local\Temp\RtmpcdvCce\file1a106074110b/h2o_<my username>_started_from_r.out
C:\Users\<my username>\AppData\Local\Temp\RtmpcdvCce\file1a10253139db/h2o_<my username>_started_from_r.err
java version "15.0.2" 2021-01-19
Java(TM) SE Runtime Environment (build 15.0.2+7-27)
Java HotSpot(TM) 64-Bit Server VM (build 15.0.2+7-27, mixed mode, sharing)
Starting H2O JVM and connecting: ............................................................Diagnostic HTTP Request:
HTTP Status Code: -1
HTTP Error Message: Failed to connect to localhost port 54321: Connection refused
Cannot load library from path lib/windows_64/xgboost4j_gpu.dll
Cannot load library from path lib/xgboost4j_gpu.dll
Failed to load library from both native path and jar!
Cannot load library from path lib/windows_64/xgboost4j_omp.dll
Cannot load library from path lib/xgboost4j_omp.dll
Failed to load library from both native path and jar!
Cannot load library from path lib/windows_64/xgboost4j_minimal.dll
Cannot load library from path lib/xgboost4j_minimal.dll
Failed to load library from both native path and jar!
Failed to add native path to the classpath at runtime
java.io.IOException: Failed to get field handle to set library path
at ai.h2o.xgboost4j.java.NativeLibLoader.addNativeDir(NativeLibLoader.java:229)
at ai.h2o.xgboost4j.java.NativeLibLoader.initXGBoost(NativeLibLoader.java:43)
at ai.h2o.xgboost4j.java.NativeLibLoader.getLoader(NativeLibLoader.java:66)
at hex.tree.xgboost.XGBoostExtension.initXgboost(XGBoostExtension.java:70)
at hex.tree.xgboost.XGBoostExtension.isEnabled(XGBoostExtension.java:51)
at water.ExtensionManager.isEnabled(ExtensionManager.java:189)
at water.ExtensionManager.registerCoreExtensions(ExtensionManager.java:103)
at water.H2O.main(H2O.java:2203)
at water.H2OStarter.start(H2OStarter.java:22)
at water.H2OStarter.start(H2OStarter.java:48)
at water.H2OApp.main(H2OApp.java:12)
Cannot initialize XGBoost backend! Xgboost (enabled GPUs) needs:
- CUDA 8.0
XGboost (minimal version) needs:
- GCC 4.7+
For more details, run in debug mode: `java -Dlog4j.configuration=file:///tmp/log4j.properties -jar h2o.jar`
ERROR: Unknown argument (<my username>/AppData/Local/Temp/RtmpcdvCce)
Usage: java [-Xmx<size>] -jar h2o.jar [options]
(Note that every option has a default and is optional.)
-h | -help
Print this help.
-version
Print version info and exit.
-name <h2oCloudName>
Cloud name used for discovery of other nodes.
Nodes with the same cloud name will form an H2O cloud
(also known as an H2O cluster).
-flatfile <flatFileName>
Configuration file explicitly listing H2O cloud node members.
-ip <ipAddressOfNode>
IP address of this node.
-port <port>
Port number for this node (note: port+1 is also used by default).
(The default port is 54321.)
-network <IPv4network1Specification>[,<IPv4network2Specification> ...]
The IP address discovery code will bind to the first interface
that matches one of the networks in the comma-separated list.
Use instead of -ip when a broad range of addresses is legal.
(Example network specification: '10.1.2.0/24' allows 256 legal
possibilities.)
-ice_root <fileSystemPath>
The directory where H2O spills temporary data to disk.
-log_dir <fileSystemPath>
The directory where H2O writes logs to disk.
(This usually has a good default that you need not change.)
-log_level <TRACE,DEBUG,INFO,WARN,ERRR,FATAL>
Write messages at this logging level, or above. Default is INFO.
-max_log_file_size
Maximum size of INFO and DEBUG log files. The file is rolled over after a specified size has been reached.
(The default is 3MB. Minimum is 1MB and maximum is 99999MB)
-flow_dir <server side directory or HDFS directory>
The directory where H2O stores saved flows.
(The default is 'C:\Users\<my username>\h2oflows'.)
-nthreads <#threads>
Maximum number of threads in the low priority batch-work queue.
(The default is.)
-client
Launch H2O node in client mode.
-notify_local <fileSystemPath>
Specifies a file to write when the node is up. The file contains one line with the IP and
port of the embedded web server. e.g. 192.168.1.100:54321
-context_path <context_path>
The context path for jetty.
Authentication options:
-jks <filename>
Java keystore file
-jks_pass <password>
(Default is 'h2oh2o')
-jks_alias <alias>
(Optional, use if the keystore has multiple certificates and you want to use a specific one.)
-hostname_as_jks_alias
(Optional, use if you want to use the machine hostname as your certificate alias.)
-hash_login
Use Jetty HashLoginService
-ldap_login
Use Jetty Ldap login module
-kerberos_login
Use Jetty Kerberos login module
-spnego_login
Use Jetty SPNEGO login service
-pam_login
Use Jetty PAM login module
-login_conf <filename>
LoginService configuration file
-spnego_properties <filename>
SPNEGO login module configuration file
-form_auth
Enables Form-based authentication for Flow (default is Basic authentication)
-session_timeout <minutes>
Specifies the number of minutes that a session can remain idle before the server invalidates
the session and requests a new login. Requires '-form_auth'. Default is no timeout
-internal_security_conf <filename>
Path (absolute or relative) to a file containing all internal security related configurations
Cloud formation behavior:
New H2O nodes join together to form a cloud at startup time.
Once a cloud is given work to perform, it locks out new members
from joining.
Examples:
Start an H2O node with 4GB of memory and a default cloud name:
$ java -Xmx4g -jar h2o.jar
Start an H2O node with 6GB of memory and a specify the cloud name:
$ java -Xmx6g -jar h2o.jar -name MyCloud
Start an H2O cloud with three 2GB nodes and a default cloud name:
$ java -Xmx2g -jar h2o.jar &
$ java -Xmx2g -jar h2o.jar &
$ java -Xmx2g -jar h2o.jar &
So... after a lot of poking around I found the answer. Windows Defender (ugh) was blocking access to h2o.jar. The solution was to open PowerShell in the h2o java folder and start H2O manually with java -jar h2o.jar. You'll then get a security prompt asking you to authorize the program (I've had to do it every time, so you might want to check your settings). Once you do that, h2o.init() runs very smoothly in R.
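The "Connection refused" on port 54321 in the output above is the visible symptom of the JVM never coming up. A quick way to tell whether anything is listening on that port, sketched in Python (the function name is made up; 54321 is the default port from the H2O usage text):

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Default H2O port, taken from the usage text above.
print(port_is_open("localhost", 54321))
```

If this prints False after starting h2o.jar by hand, the process is still being blocked or has exited, and the .out/.err log files are the next place to look.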
I am attempting to create various instances and Compute is failing to spawn some of them.
My instance has the following characteristics:
Name: ThirdInstance
Created from image: CentOS-7-x86_64
Flavor: m1.medium (2 VCPU, 4GB RAM, 40GB Disk)
I have two other instances running. I was unable to spawn these instances unless I used the flavor m1.small (1VCPU, 2GB RAM, 20GB Disk). Any deviation from that flavor and the instance spawning failed.
Unfortunately, my ThirdInstance fails to spawn regardless of the flavor used. I have tried creating it with m1.small and it fails consistently.
I looked at the Nova logs, and am noting that when I attempt to create this instance I am consistently getting the following message in the nova-conductor.log file:
2020-08-29 13:21:09.637 98391 ERROR nova.conductor.manager
2020-08-29 13:21:09.637 98391 ERROR nova.conductor.manager
2020-08-29 13:21:09.890 98391 WARNING nova.scheduler.utils [req-30539015-22f1-4d46-b8b7-63f9c679eed1 4c4c7de6dd134250972958ce260530f2 166dc91ccec24f21963c71a437380ee9 - default default] Failed to compute_task_build_instances: No valid host was found.
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 241, in inner
return func(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/nova/scheduler/manager.py", line 200, in select_destinations
raise exception.NoValidHost(reason="")
nova.exception.NoValidHost: No valid host was found.
: nova.exception_Remote.NoValidHost_Remote: No valid host was found.
2020-08-29 13:21:09.891 98391 WARNING nova.scheduler.utils [req-30539015-22f1-4d46-b8b7-63f9c679eed1 4c4c7de6dd134250972958ce260530f2 166dc91ccec24f21963c71a437380ee9 - default default] [instance: fe54feaf-ecb6-4725-97e9-7d208066ddb0] Setting instance to ERROR state.: nova.exception_Remote.NoValidHost_Remote: No valid host was found.
What am I missing here? What causes these "No valid host was found" failures when I attempt to use flavors other than m1.small, and why does a third instance fail to spawn regardless of the flavor used? How (if possible) can I get these instances to run properly?
NOTE: I am using an installation created from Packstack on CentOS 8. My machine is a 2-core with 32G of RAM and 3 Terabytes of disk space. The OpenStack version is Ussuri.
It seems to me that you don't have enough resources, especially CPU cores. You wrote that your node has only two cores and that you have already spawned two VMs with the small flavor, which requires 1 core each. The No valid host was found error also appears when no compute host has enough resources for the selected flavor.
You can check this yourself:
Run openstack hypervisor list to list your hypervisors, then openstack hypervisor show <ID> with the ID of your hypervisor. In the output you will find vcpus and vcpus_used. vcpus is the total number of CPU cores on the selected compute host, and vcpus_used is the number already assigned to instances. Based on the information in your question, I think both values are 2 in your case, which would mean that you don't have enough resources for your third VM.
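The comparison described above comes down to simple arithmetic. A rough sketch, assuming the two values have been copied by hand from the hypervisor output; the function names are made up, and the cpu_allocation_ratio parameter is an assumption standing in for Nova's overcommit setting of the same name (1.0 here means no overcommit, which may not match your deployment):

```python
def free_vcpus(hypervisor, cpu_allocation_ratio=1.0):
    """vCPUs still available, given the 'vcpus' and 'vcpus_used' fields."""
    capacity = int(hypervisor["vcpus"] * cpu_allocation_ratio)
    return capacity - hypervisor["vcpus_used"]


def flavor_fits(hypervisor, flavor_vcpus, cpu_allocation_ratio=1.0):
    """True if the flavor's vCPU count fits into what is left on the host."""
    return free_vcpus(hypervisor, cpu_allocation_ratio) >= flavor_vcpus


# Values expected for the host in the question: 2 cores, both used by two m1.small VMs.
host = {"vcpus": 2, "vcpus_used": 2}
print(flavor_fits(host, 1))  # m1.small needs 1 vCPU
print(flavor_fits(host, 2))  # m1.medium needs 2 vCPUs
```

With both values at 2 and no overcommit, neither flavor fits, which matches the behaviour in the question. If your scheduler is configured with a higher allocation ratio, the CPU count alone would not explain the failure, and memory and disk should be checked the same way.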
I am running Airflow v1.9 with the Celery Executor. I have 5 Airflow workers running on 5 different machines. The Airflow scheduler is also running on one of these machines. I have copied the same airflow.cfg file across these 5 machines.
I have daily workflows set up in different queues like DEV, QA etc. (each worker runs with an individual queue name), which are running fine.
While scheduling a DAG on one of the workers (no other DAG has been set up for this worker/machine previously), I am seeing the following error in the first task, and as a result the downstream tasks are failing:
*** Log file isn't local.
*** Fetching here: http://<worker hostname>:8793/log/PDI_Incr_20190407_v2/checkBCWatermarkDt/2019-04-07T17:00:00/1.log
*** Failed to fetch log file from worker. 404 Client Error: NOT FOUND for url: http://<worker hostname>:8793/log/PDI_Incr_20190407_v2/checkBCWatermarkDt/2019-04-07T17:00:00/1.log
I have configured MySQL for storing the DAG metadata. When I checked task_instance table, I see proper hostnames are populated against the task.
I also checked the log location and found that the log is getting created.
airflow.cfg snippet:
base_log_folder = /var/log/airflow
base_url = http://<webserver ip>:8082
worker_log_server_port = 8793
api_client = airflow.api.client.local_client
endpoint_url = http://localhost:8080
What am I missing here? What configurations do I need to check additionally for resolving this issue?
Looks like the worker's hostname is not being correctly resolved.
Add a file hostname_resolver.py:
import os
import socket
import requests


def resolve():
    """
    Resolves Airflow external hostname for accessing logs on a worker
    """
    if 'AWS_REGION' in os.environ:
        # On EC2, return the instance's private IPv4 address
        # from the instance metadata service:
        return requests.get(
            'http://169.254.169.254/latest/meta-data/local-ipv4').text
    # Connect a UDP socket towards a public address (no packet is sent)
    # to find out which local IP is used for outbound traffic:
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.connect(('1.1.1.1', 53))
    external_ip = s.getsockname()[0]
    s.close()
    return external_ip
And export: AIRFLOW__CORE__HOSTNAME_CALLABLE=airflow.hostname_resolver:resolve
The webserver on the master has to fetch the log from the worker in order to display it in the UI, and to do that it must resolve the worker's hostname. Here the hostname obviously cannot be resolved, so add the hostname-to-IP mapping to /etc/hosts on the master.
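To see what the webserver is actually trying to reach, the log URL can be rebuilt from the pattern in the error message and the hostname tested for resolution on the webserver host. A minimal sketch; both function names and the hostname worker-1.example.com are placeholders, and the URL layout is only inferred from the "Fetching here" line in the question:

```python
import socket


def worker_log_url(hostname, dag_id, task_id, execution_date, try_number, port=8793):
    """Rebuild the log URL in the shape shown in the 'Fetching here' line."""
    return (
        f"http://{hostname}:{port}/log/"
        f"{dag_id}/{task_id}/{execution_date}/{try_number}.log"
    )


def can_resolve(hostname):
    """True if the local resolver (including /etc/hosts) knows the hostname."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:
        return False


# Placeholder hostname; DAG, task and date are taken from the question.
url = worker_log_url(
    "worker-1.example.com",
    "PDI_Incr_20190407_v2",
    "checkBCWatermarkDt",
    "2019-04-07T17:00:00",
    1,
)
print(url)
```

Running can_resolve with the hostname stored in the task_instance table, on the machine that hosts the webserver, shows whether the /etc/hosts entry is in effect.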
If this happens as part of a Docker Compose Airflow setup, the hostname resolution needs to be passed to the container hosting the webserver, e.g. through extra_hosts:
# docker-compose.yml
version: "3.9"
services:
  webserver:
    extra_hosts:
      - "worker_hostname_0:192.168.xxx.yyy"
      - "worker_hostname_1:192.168.xxx.zzz"
    ...
  ...