live migration in openstack-ansible [closed] - openstack

When I try to migrate an instance from one compute host to another, I get an error. What is the reason for this error? I get the same error every time.
compute2
2019-09-17 10:29:27.009 2371 ERROR nova.virt.libvirt.driver [-] [instance: ab64119d-d075-4c99-8687-788695711b32] Live Migration failure: Unsafe migration: Migration without shared storage is unsafe: libvirtError: Unsafe migration: Migration without shared storage is unsafe
2019-09-17 10:29:27.506 2371 ERROR nova.virt.libvirt.driver [-] [instance: ab64119d-d075-4c99-8687-788695711b32] Migration operation has aborted
2019-09-17 10:29:27.533 2371 INFO nova.compute.manager [-] [instance: ab64119d-d075-4c99-8687-788695711b32] Swapping old allocation on 0002f629-1480-4c71-b74a-eb9ca16f87d1 held by migration ae674faa-49f0-4139-8eb9-966d842d8370 for instance
compute1
2019-09-17 10:29:25.626 2261 INFO nova.virt.libvirt.imagecache [req-7455f1fa-1821-4760-a38c-80ed4a7aa95a - - - - -] image e0d82262-e5dd-46f3-8747-8bb451a11f3d at (/var/lib/nova/instances/_base/993dda6ef2a8133a22deb14a205ae0d791dbd070): checking
2019-09-17 10:29:25.627 2261 INFO os_vif [req-7dfec421-606d-4923-a8f8-b4796ffdc155 b2223e6724d441dc9ceb01e2d93c42e2 a4d7dd39e119424781ff6cc62874381e - default default] Successfully plugged vif VIFBridge(active=True,address=fa:16:3e:78:d1:a2,bridge_name='brqb8d9540b-30',has_traffic_filtering=True,id=65ef51ba-8e72-44a2-9f45-ac3aa0ad2225,network=Network(b8d9540b-307c-490d-a99b-7ce565065a11),plugin='linux_bridge',port_profile=,preserve_on_delete=False,vif_name='tap65ef51ba-8e')
2019-09-17 10:29:25.665 2261 INFO nova.virt.libvirt.imagecache [req-7455f1fa-1821-4760-a38c-80ed4a7aa95a - - - - -] Active base files: /var/lib/nova/instances/_base/993dda6ef2a8133a22deb14a205ae0d791dbd070
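The error itself names the cause: the two hosts do not share /var/lib/nova/instances, and libvirt refuses a plain live migration without shared storage. If shared storage (NFS, Ceph, ...) is not set up, block migration, which copies the local disks over the network, is the usual way around this. A minimal sketch, assuming the legacy nova CLI and, going by the logs, compute1 as the destination:

# Retry as a block migration so the disks are copied to the target host
# instead of requiring shared storage (instance ID taken from the log):
nova live-migration --block-migrate ab64119d-d075-4c99-8687-788695711b32 compute1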

Related

AcquireJobsRunnableImpl trows PSQLException: SSL error: readHandshakeRecord

With a successfully migrated Alfresco repository (5.2 to 7.0), I get PSQLException: SSL error: readHandshakeRecord (see stack trace below) every morning at 4:00, which then causes the repository to stop responding.
Could someone please help me decipher this stack trace? Why is this job running around 4 am? I can't find a matching Quartz job. Does anyone know how to force this call manually to track the problem down? At first I thought it might be related to the contentStoreCleaner running at 4:00 am, but disabling that job doesn't change anything.
The only workaround I have found so far is to disable the Activiti workflow engine.
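For reference, disabling the engine comes down to one property in alfresco-global.properties; this is, to my knowledge, the documented switch, but verify it against your Alfresco version:

# alfresco-global.properties (verify the property name for your version)
system.workflow.engine.activiti.enabled=false

The stack trace: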
2021-08-08 04:35:39,396 ERROR [org.activiti.engine.impl.jobexecutor.AcquireJobsRunnableImpl] [Thread-46] exception during job acquisition: Could not open JDBC Connection for transaction; nested exception is org.postgresql.util.PSQLException: SSL error: readHandshakeRecord
org.springframework.transaction.CannotCreateTransactionException: Could not open JDBC Connection for transaction; nested exception is org.postgresql.util.PSQLException: SSL error: readHandshakeRecord
at org.springframework.jdbc.datasource.DataSourceTransactionManager.doBegin(DataSourceTransactionManager.java:309)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.startTransaction(AbstractPlatformTransactionManager.java:400)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:373)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:137)
at org.activiti.spring.SpringTransactionInterceptor.execute(SpringTransactionInterceptor.java:45)
at org.activiti.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:31)
at org.activiti.engine.impl.cfg.CommandExecutorImpl.execute(CommandExecutorImpl.java:40)
at org.activiti.engine.impl.cfg.CommandExecutorImpl.execute(CommandExecutorImpl.java:35)
at org.activiti.engine.impl.jobexecutor.AcquireJobsRunnableImpl.run(AcquireJobsRunnableImpl.java:54)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.postgresql.util.PSQLException: SSL error: readHandshakeRecord
at org.postgresql.ssl.MakeSSL.convert(MakeSSL.java:43)
at org.postgresql.core.v3.ConnectionFactoryImpl.enableSSL(ConnectionFactoryImpl.java:534)
at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:149)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:213)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:51)
at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:223)
at org.postgresql.Driver.makeConnection(Driver.java:465)
at org.postgresql.Driver.connect(Driver.java:264)
at org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1188)
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
at org.springframework.jdbc.datasource.DataSourceTransactionManager.doBegin(DataSourceTransactionManager.java:265)
... 9 more
Caused by: javax.net.ssl.SSLException: readHandshakeRecord
at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1335)
at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:440)
at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:411)
at org.postgresql.ssl.MakeSSL.convert(MakeSSL.java:41)
... 22 more
Suppressed: java.net.SocketException: Broken pipe (Write failed)
at java.base/java.net.SocketOutputStream.socketWrite0(Native Method)
at java.base/java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:110)
at java.base/java.net.SocketOutputStream.write(SocketOutputStream.java:150)
at java.base/sun.security.ssl.SSLSocketOutputRecord.encodeAlert(SSLSocketOutputRecord.java:81)
at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:380)
at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:292)
at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:450)
... 24 more
Caused by: java.net.SocketException: Broken pipe (Write failed)
at java.base/java.net.SocketOutputStream.socketWrite0(Native Method)
at java.base/java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:110)
at java.base/java.net.SocketOutputStream.write(SocketOutputStream.java:150)
at java.base/sun.security.ssl.SSLSocketOutputRecord.flush(SSLSocketOutputRecord.java:251)
at java.base/sun.security.ssl.HandshakeOutStream.flush(HandshakeOutStream.java:89)
at java.base/sun.security.ssl.Finished$T13FinishedProducer.onProduceFinished(Finished.java:679)
at java.base/sun.security.ssl.Finished$T13FinishedProducer.produce(Finished.java:658)
at java.base/sun.security.ssl.SSLHandshake.produce(SSLHandshake.java:436)
at java.base/sun.security.ssl.Finished$T13FinishedConsumer.onConsumeFinished(Finished.java:1011)
at java.base/sun.security.ssl.Finished$T13FinishedConsumer.consume(Finished.java:874)
at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:392)
at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:443)
at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:421)
at java.base/sun.security.ssl.TransportContext.dispatch(TransportContext.java:182)
at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:171)
at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1418)
at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1324)
... 25 more
The stack trace was misleading: the JDBC connection problem was caused by the Alfresco trashcan cleaner module filling up the heap, an OutOfMemoryError: Java heap space due to getChildAssocs being called with no limit.
We have ~4 million nodes in the trashcan, and on every batch run the module retrieves all of them again and again until the heap is exhausted.
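If you hit the same thing, the community alfresco-trashcan-cleaner module exposes settings in alfresco-global.properties that bound each run; the property names below are assumptions based on that module and may differ between versions:

# alfresco-global.properties (names assumed from the community module)
trashcan-cleaner.cron=0 30 * * * ?
trashcan-cleaner.deleteBatchCount=1000
trashcan-cleaner.keepPeriod=P1D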

Live migration failure: not all arguments were converted to strings

Live migration to another compute node fails. I receive an error in the nova-compute log on the source compute node:
2020-10-21 15:15:52.496 614454 DEBUG nova.virt.libvirt.driver [-] [instance: bc41148a-8fdd-4be1-b8fa-468ee17a4f5b] About to invoke the migrate API _live_migration_operation /usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py:7808
2020-10-21 15:15:52.497 614454 ERROR nova.virt.libvirt.driver [-] [instance: bc41148a-8fdd-4be1-b8fa-468ee17a4f5b] Live Migration failure: not all arguments converted during string formatting
2020-10-21 15:15:52.498 614454 DEBUG nova.virt.libvirt.driver [-] [instance: bc41148a-8fdd-4be1-b8fa-468ee17a4f5b] Migration operation thread notification thread_finished /usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py:8149
2020-10-21 15:15:52.983 614454 DEBUG nova.virt.libvirt.migration [-] [instance: bc41148a-8fdd-4be1-b8fa-468ee17a4f5b] VM running on src, migration failed _log /usr/lib/python3/dist-packages/nova/virt/libvirt/migration.py:361
2020-10-21 15:15:52.984 614454 DEBUG nova.virt.libvirt.driver [-] [instance: bc41148a-8fdd-4be1-b8fa-468ee17a4f5b] Fixed incorrect job type to be 4 _live_migration_monitor /usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py:7978
2020-10-21 15:15:52.985 614454 ERROR nova.virt.libvirt.driver [-] [instance: bc41148a-8fdd-4be1-b8fa-468ee17a4f5b] Migration operation has aborted
Please help me with a solution to this issue.
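"not all arguments converted during string formatting" is not a libvirt message at all: it is a plain Python TypeError raised when a %-format string receives more arguments than it has placeholders, which means a formatting bug is swallowing the real migration error. A minimal illustration of the mechanism, plus one hedged place to look (this is not nova's code, just the Python behaviour and a config guess):

# The exact message comes from Python's %-formatting:
"migrating to %s" % ("compute1", "compute2")
# TypeError: not all arguments converted during string formatting
#
# One common trigger (an assumption, verify your nova.conf): the [libvirt]
# live_migration_uri option must keep its %s placeholder, because nova
# inserts the destination hostname with exactly this kind of %-formatting:
#   live_migration_uri = qemu+ssh://nova@%s/system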

Riak is Not Starting After Reboot

I'm seeing an IO error on the Riak console. I'm not sure what the cause is, since the directory is owned by the riak user. Here's the error:
2018-01-25 23:18:06.922 [info] <0.2301.0>#riak_kv_vnode:maybe_create_hashtrees:234 riak_kv/730750818665451459101842416358141509827966271488: unable to start index_hashtree: {error,{{badmatch,{error,{db_open,"IO error: lock /var/lib/riak/anti_entropy/v0/730750818665451459101842416358141509827966271488/LOCK: already held by process"}}},[{hashtree,new_segment_store,2,[{file,"src/hashtree.erl"},{line,725}]},{hashtree,new,2,[{file,"src/hashtree.erl"},{line,246}]},{riak_kv_index_hashtree,do_new_tree,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,712}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{riak_kv_index_hashtree,init_trees,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,565}]},{riak_kv_index_hashtree,init,1,[{file,"src/riak_kv_index_hashtree.erl"},{line,308}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}}
2018-01-25 23:18:06.927 [info] <0.2315.0>#riak_kv_vnode:maybe_create_hashtrees:234 riak_kv/890602560248518965780370444936484965102833893376: unable to start index_hashtree: {error,{{badmatch,{error,{db_open,"IO error: lock /var/lib/riak/anti_entropy/v0/890602560248518965780370444936484965102833893376/LOCK: already held by process"}}},[{hashtree,new_segment_store,2,[{file,"src/hashtree.erl"},{line,725}]},{hashtree,new,2,[{file,"src/hashtree.erl"},{line,246}]},{riak_kv_index_hashtree,do_new_tree,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,712}]},{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},{riak_kv_index_hashtree,init_trees,3,[{file,"src/riak_kv_index_hashtree.erl"},{line,565}]},{riak_kv_index_hashtree,init,1,[{file,"src/riak_kv_index_hashtree.erl"},{line,308}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,304}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}}
2018-01-25 23:18:06.928 [error] <0.27284.0> CRASH REPORT Process <0.27284.0> with 0 neighbours exited with reason: no match of right hand value {error,{db_open,"IO error: lock /var/lib/riak/anti_entropy/v0/890602560248518965780370444936484965102833893376/LOCK: already held by process"}} in hashtree:new_segment_store/2 line 725 in gen_server:init_it/6 line 328
Any ideas on what the problem could be?
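The "LOCK ... already held by process" message comes from LevelDB: some Erlang VM already has the anti-entropy store open, typically a stale or duplicate Riak node surviving the reboot rather than a file-ownership problem. A hedged way to check and recover (paths taken from the log; AAE trees are rebuilt automatically):

# Is a stale or duplicate Riak VM still running and holding the LOCK files?
ps aux | grep beam.smp
riak stop          # stop the node cleanly, or kill the stale VM
# If startup is still blocked with no process holding the locks, the AAE
# data can be deleted; Riak rebuilds the hash trees on the next start:
rm -rf /var/lib/riak/anti_entropy/v0/*
riak start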

Unknown option 'auth_opts' issue in ejabberd_http_auth plugin

I am currently using the ejabberd_http_auth module to authenticate users against an external HTTP API. However, after I set up the following configuration in ejabberd.yml, I got an Unknown option 'auth_opts' error.
I have already installed that plugin in ejabberd from the command prompt, and I have disabled the register module.
Configuration:
auth_method: http
auth_opts:
  host: "http://localhost:8080"
  connection_pool_size: 10
  connection_opts: []
  basic_auth: ""
  path_prefix: "/test/auth/"
Error Message:
2015-12-15 00:16:16.268 [error] <0.37.0>#ejabberd_config:validate_opts:794 unknown option 'auth_opts' will be likely ignored
2015-12-15 00:16:16.366 [info] <0.37.0>#cyrsasl_digest:start:60 FQDN used to check DIGEST-MD5 SASL authentication: MY_SERVER
2015-12-15 00:16:16.367 [info] <0.37.0>#ejabberd_app:add_windows_nameservers:195 Adding machine's DNS IPs to Erlang system:
[]
2015-12-15 00:16:16.373 [error] <0.36.0> CRASH REPORT Process <0.36.0> with 0 neighbours exited with reason: call to undefined function ejabberd_auth_http:start(<<"localhost">>) in application_master:init/4 line 133
2015-12-15 00:16:16.373 [info] <0.7.0> Application ejabberd exited with reason: call to undefined function ejabberd_auth_http:start(<<"localhost">>)
Thanks a lot.
The unknown option is not the problem here.
The problem is on this line:
2015-12-15 00:16:16.373 [info] <0.7.0> Application ejabberd exited with reason: call to undefined function ejabberd_auth_http:start(<<"localhost">>)
It means ejabberd_auth_http.beam is not on your Erlang code path: the module is either not installed or installed outside the Erlang VM's path.
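A quick way to confirm, sketched under the assumption of a standard package install (paths vary):

# Is the compiled module anywhere ejabberd's Erlang VM can see it?
find / -name "ejabberd_auth_http.beam" 2>/dev/null
# If nothing is found, the module was never compiled/installed; if it is
# found outside ejabberd's lib/.../ebin directory, copy or symlink it
# there and restart ejabberd.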

Cloudify : "Could not determine the type of file "sftp://root:***@192.168.10.xxx/root/gs-files"."

I am attempting to bootstrap a private OpenStack cloud using Cloudify 2.7.1. It boots the Linux instance correctly but fails at "Uploading files to 192.168.10.XXX" with an SFTP problem: "Could not determine the type of file "sftp://root:***@192.168.10.xxx/root/gs-files".".
I can access the instance using ssh (there is no problem with the connection). I tried other images (CentOS, Ubuntu, CirrOS, ...) but always get the same error!
Can anyone help me, please?
I attached a screenshot of the network topology created by Cloudify, and the stack trace.
Full stack trace:
2015-04-30 10:26:27,470 INFO [org.cloudifysource.shell.commands.AbstractGSCommand] - Setting security profile to "nonsecure".
2015-04-30 10:26:27,589 INFO [org.cloudifysource.shell.commands.AbstractGSCommand] - Bootstrapping cloud openstack-havana. This may take a few minutes.
2015-04-30 10:26:27,677 INFO [org.cloudifysource.esc.driver.provisioning.BaseProvisioningDriver] - Setup network configuration for managers
2015-04-30 10:26:27,677 INFO [org.cloudifysource.esc.driver.provisioning.BaseProvisioningDriver] - Using management network : Cloudify-Management-Network
2015-04-30 10:26:51,536 INFO [org.cloudifysource.esc.shell.listener.CliAgentlessInstallerListener] - Attempting to access Management VM 192.168.10.241.
2015-04-30 10:27:10,551 INFO [org.cloudifysource.esc.shell.listener.CliAgentlessInstallerListener] - Uploading files to 192.168.10.241.
2015-04-30 10:27:15,708 WARNING [com.jcraft.jsch] - Permanently added '192.168.10.241' (RSA) to the list of known hosts.
2015-04-30 10:27:25,998 INFO [org.cloudifysource.esc.shell.installer.CloudGridAgentBootstrapper] - Failed accessing management VM 192.168.10.241 Reason: Failed to set up file transfer: Unknown message with code "Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".".; Caused by: org.cloudifysource.esc.installer.InstallerException: Failed to set up file transfer: Unknown message with code "Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".".
2015-04-30 10:27:26,210 INFO [org.cloudifysource.esc.driver.provisioning.openstack.OpenStackCloudifyDriver] - Deleting Floating ip: FloatingIp[floatingNetworkId=15578898-5e6b-44d9-a73a-1328ca6ea140,floatingIpAddress=192.168.10.241,portId=4b8dc211-12e8-4383-8799-f783d2786e98,id=593d8424-cfec-41ed-8204-ed8609366416]
2015-04-30 10:27:29,607 SEVERE [org.cloudifysource.shell.commands.AbstractGSCommand] - Failed to set up file transfer: Unknown message with code "Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".". : org.cloudifysource.esc.installer.InstallerException: Failed to set up file transfer: Unknown message with code "Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".".
at org.cloudifysource.esc.installer.filetransfer.VfsFileTransfer.initialize(VfsFileTransfer.java:206)
at org.cloudifysource.esc.installer.AgentlessInstaller.uploadFilesToServer(AgentlessInstaller.java:306)
at org.cloudifysource.esc.installer.AgentlessInstaller.installOnMachineWithIP(AgentlessInstaller.java:210)
at org.cloudifysource.esc.shell.installer.CloudGridAgentBootstrapper$1.call(CloudGridAgentBootstrapper.java:865)
at org.cloudifysource.esc.shell.installer.CloudGridAgentBootstrapper$1.call(CloudGridAgentBootstrapper.java:860)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.commons.vfs2.FileSystemException: Unknown message with code "Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".".
at org.apache.commons.vfs2.provider.sftp.SftpFileObject.refresh(SftpFileObject.java:95)
at org.apache.commons.vfs2.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:366)
at org.apache.commons.vfs2.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:317)
at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:85)
at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:65)
at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:693)
at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:621)
at org.cloudifysource.esc.installer.filetransfer.VfsFileTransfer.resolveTargetDirectory(VfsFileTransfer.java:218)
at org.cloudifysource.esc.installer.filetransfer.VfsFileTransfer.initialize(VfsFileTransfer.java:203)
... 8 more
Caused by: org.apache.commons.vfs2.FileSystemException: Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".
at org.apache.commons.vfs2.provider.AbstractFileObject.getType(AbstractFileObject.java:505)
at org.apache.commons.vfs2.provider.sftp.SftpFileObject.refresh(SftpFileObject.java:91)
... 16 more
Caused by: org.apache.commons.vfs2.FileSystemException: Could not connect to SFTP server at "sftp://cirros@192.168.10.241/".
at org.apache.commons.vfs2.provider.sftp.SftpFileSystem.getChannel(SftpFileSystem.java:153)
at org.apache.commons.vfs2.provider.sftp.SftpFileObject.statSelf(SftpFileObject.java:151)
at org.apache.commons.vfs2.provider.sftp.SftpFileObject.doGetType(SftpFileObject.java:114)
at org.apache.commons.vfs2.provider.AbstractFileObject.getType(AbstractFileObject.java:496)
... 17 more
Caused by: com.jcraft.jsch.JSchException: java.io.IOException: Pipe closed
at com.jcraft.jsch.ChannelSftp.start(ChannelSftp.java:288)
at com.jcraft.jsch.Channel.connect(Channel.java:152)
at com.jcraft.jsch.Channel.connect(Channel.java:145)
at org.apache.commons.vfs2.provider.sftp.SftpFileSystem.getChannel(SftpFileSystem.java:130)
... 20 more
Caused by: java.io.IOException: Pipe closed
at java.io.PipedInputStream.read(PipedInputStream.java:308)
at java.io.PipedInputStream.read(PipedInputStream.java:378)
at com.jcraft.jsch.ChannelSftp.fill(ChannelSftp.java:2665)
at com.jcraft.jsch.ChannelSftp.header(ChannelSftp.java:2691)
at com.jcraft.jsch.ChannelSftp.start(ChannelSftp.java:257)
... 23 more
Looks like you are trying to sftp into a cirros instance - I am not sure cirros even supports sftp. You can try this by using the sftp command line utility.
In general, sftp has to be configured and available on the target machine.
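A quick manual check from the Cloudify manager host, using the credentials from the log:

# ssh succeeding while sftp fails means the image ships no sftp-server
# subsystem, which matches the "Pipe closed" at the bottom of the trace:
ssh cirros@192.168.10.241
sftp cirros@192.168.10.241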
You can try using the SCP file transfer mode by setting this in your compute template:
fileTransfer org.cloudifysource.domain.cloud.FileTransferModes.SCP
If you are really using cirros, I suspect bootstrapping will fail. Cloudify was never tested on cirros. I think cirros is lacking some very basic utilities (I think it is not running bash. Not sure if it has wget). Cirros was never meant as a generic distribution - it is meant for testing your cloud's basic functionality.
One more thing - Cloudify 2 has reached End-of-Life - it is no longer supported. You should check out Cloudify 3.
