Solr cloud shard splitting fails when index lock type is native or simple - solrcloud

I am using Solr 4.10.2. I tried to perform a shard split on my SolrCloud test cluster. It fails every time the index lock type is set to "native" or "simple".
Is that normal? I can perform shard splitting if the lock type is set to "single" or "none".
Shard splitting is advertised as something that can be done while Solr is running, and I can hardly imagine poking around and changing the lock type on a production server...
Here is the test environment:
1 shard, 2 nodes, 1 collection.
Initially the collection was empty. I added a few documents and verified that they had been replicated. All worked.
I issued the split shard command:
server1:port/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1&async=myhandle
I then verified that the operation had finished by calling:
server1:port/solr/admin/collections?action=REQUESTSTATUS&requestid=myhandle
The status was "complete".
Here is the log:
OverseerCollectionProcessor.processMessage : splitshard , {
"operation":"splitshard",
"shard":"shard1",
"collection":"mycollection",
"async":"myhandle"}
1/26/2015, 1:49:02 PM
ERROR
CoreContainer
Error creating core [mycollection_shard1_0_replica1]: Error opening new searcher
org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:873)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:646)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:491)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:466)
at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:575)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:199)
at org.apache.solr.handler.admin.CoreAdminHandler$ParallelCoreAdminHandlerThread.run(CoreAdminHandler.java:1234)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:845)
... 9 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock#/nfs/solr/index/write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:89)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:753)
at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:279)
at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:111)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1528)
... 11 more
org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:873)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:646)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:491)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:466)
at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:575)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:199)
at org.apache.solr.handler.admin.CoreAdminHandler$ParallelCoreAdminHandlerThread.run(CoreAdminHandler.java:1234)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:845)
... 9 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock#/nfs/solr/index/write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:89)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:753)
at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:279)
at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:111)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1528)
... 11 more
1/26/2015, 1:49:26 PM
ERROR
SolrIndexWriter
SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
1/26/2015, 1:49:26 PM
ERROR
SolrIndexWriter
Error closing IndexWriter
java.lang.NullPointerException
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3230)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3203)
at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:907)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:984)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:954)
at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:129)
at org.apache.solr.update.SolrIndexWriter.finalize(SolrIndexWriter.java:182)
at java.lang.System$2.invokeFinalize(System.java:1213)
at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98)
at java.lang.ref.Finalizer.access$100(Finalizer.java:34)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210)

I fixed the problem. Here is how:
When I created the SolrCloud environment, I used the -Dsolr.data.dir property to map the collection storage to a different file system, because I was running VMs with limited storage capacity. Once I removed this property everything started working.
I think Solr tries to use the same solr.data.dir path for the new cores created by the shard split, causing the lock problem.
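For reference, here is roughly what the change amounted to in my startup command (the ZooKeeper address and paths are placeholders, not my real setup):
# Before (broken): every core, including the sub-shard cores created by SPLITSHARD,
# resolves solr.data.dir to the same directory and fights over write.lock
java -DzkHost=zk1:2181 -Dsolr.data.dir=/nfs/solr/index -jar start.jar
# After (working): drop the shared data dir so each core keeps its own data/ directory
# under its instance directory
java -DzkHost=zk1:2181 -jar start.jar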

Related

SFTP On New or Update throws error Pipe close

I'm using the "On New or Update" source of the Mule 4 SFTP Connector, to process files from an SFTP server directory. The process works fine, however while reading the last file the SFTP connector throws an error as shown below and the file remains in directory waiting next schedule time to be picked up and the same thing will happen for the last file of other new set of files.
Any thoughts on how to fix this issue?
ERROR:
11:20:45.315 05/04/2022 Worker-0 [MuleRuntime].uber.27: [sftp-demo-app].prcsACKFiles-Error-SuccessFlow.CPU_INTENSIVE #1648077b ERROR
event:c458bc90-cbbd-11ec-85e2-06a565d43154
********************************************************************************
Message : "org.mule.weave.v2.module.reader.ReaderParsingException: org.mule.runtime.api.exception.MuleRuntimeException - Exception was found trying to retrieve the contents of file /home/messages/file_8ddb7674.json
org.mule.runtime.api.exception.MuleRuntimeException: Exception was found trying to retrieve the contents of file /home/messages/file_8ddb7674.json
at org.mule.extension.sftp.internal.connection.SftpClient.exception(SftpClient.java:427)
at org.mule.extension.sftp.internal.connection.SftpClient.exception(SftpClient.java:423)
at org.mule.extension.sftp.internal.connection.SftpClient.getFileContent(SftpClient.java:349)
at org.mule.extension.sftp.internal.connection.SftpFileSystem.retrieveFileContent(SftpFileSystem.java:117)
at org.mule.extension.sftp.internal.SftpInputStream$SftpFileInputStreamSupplier.getContentInputStream(SftpInputStream.java:111)
at org.mule.extension.sftp.internal.SftpInputStream$SftpFileInputStreamSupplier.getContentInputStream(SftpInputStream.java:93)
at org.mule.extension.file.common.api.AbstractConnectedFileInputStreamSupplier.getContentInputStream(AbstractConnectedFileInputStreamSupplier.java:81)
at org.mule.extension.file.common.api.AbstractFileInputStreamSupplier.get(AbstractFileInputStreamSupplier.java:65)
at org.mule.extension.file.common.api.AbstractFileInputStreamSupplier.get(AbstractFileInputStreamSupplier.java:33)
at org.mule.extension.file.common.api.stream.LazyStreamSupplier.lambda$new$1(LazyStreamSupplier.java:29)
at org.mule.extension.file.common.api.stream.LazyStreamSupplier.get(LazyStreamSupplier.java:42)
at org.mule.extension.file.common.api.stream.AbstractNonFinalizableFileInputStream.lambda$createLazyStream$0(AbstractNonFinalizableFileInputStream.java:48)
at $java.io.InputStream$$EnhancerByCGLIB$$55e4687e.read(<generated>)
at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:102)
at org.mule.runtime.core.internal.streaming.bytes.AbstractInputStreamBuffer.consumeStream(AbstractInputStreamBuffer.java:111)
at com.mulesoft.mule.runtime.core.internal.streaming.bytes.FileStoreInputStreamBuffer.consumeForwardData(FileStoreInputStreamBuffer.java:239)
at com.mulesoft.mule.runtime.core.internal.streaming.bytes.FileStoreInputStreamBuffer.consumeForwardData(FileStoreInputStreamBuffer.java:202)
at com.mulesoft.mule.runtime.core.internal.streaming.bytes.FileStoreInputStreamBuffer.doGet(FileStoreInputStreamBuffer.java:125)
at org.mule.runtime.core.internal.streaming.bytes.AbstractInputStreamBuffer.get(AbstractInputStreamBuffer.java:93)
at org.mule.runtime.core.internal.streaming.bytes.BufferedCursorStream.assureDataInLocalBuffer(BufferedCursorStream.java:126)
at org.mule.runtime.core.internal.streaming.bytes.BufferedCursorStream.doRead(BufferedCursorStream.java:101)
at org.mule.runtime.core.internal.streaming.bytes.AbstractCursorStream.read(AbstractCursorStream.java:124)
at org.mule.runtime.core.internal.streaming.bytes.BufferedCursorStream.read(BufferedCursorStream.java:26)
at java.io.InputStream.read(InputStream.java:101)
at org.mule.runtime.core.internal.streaming.bytes.ManagedCursorStreamDecorator.read(ManagedCursorStreamDecorator.java:96)
at org.mule.weave.v2.el.SeekableCursorStream.read(MuleTypedValue.scala:306)
at org.mule.weave.v2.module.reader.UTF8StreamSourceReader.handleBOM(SeekableStreamSourceReader.scala:179)
at org.mule.weave.v2.module.reader.UTF8StreamSourceReader.readAscii(SeekableStreamSourceReader.scala:163)
at org.mule.weave.v2.module.json.reader.JsonTokenizer.$init$(JsonTokenizer.scala:21)
at org.mule.weave.v2.module.json.reader.indexed.IndexedJsonTokenizer.<init>(IndexedJsonTokenizer.scala:15)
at org.mule.weave.v2.module.json.reader.indexed.IndexedJsonParser.parser(IndexedJsonParser.scala:17)
at org.mule.weave.v2.module.json.reader.JsonReader.readValue(JsonReader.scala:40)
at org.mule.weave.v2.module.json.reader.JsonReader.doRead(JsonReader.scala:30)
at org.mule.weave.v2.module.reader.Reader.read(Reader.scala:35)
at org.mule.weave.v2.module.reader.Reader.read$(Reader.scala:33)
at org.mule.weave.v2.module.json.reader.JsonReader.read(JsonReader.scala:20)
at org.mule.weave.v2.el.MuleTypedValue.value(MuleTypedValue.scala:147)
at org.mule.weave.v2.model.values.wrappers.DelegateValue.valueType(DelegateValue.scala:17)
at org.mule.weave.v2.model.values.wrappers.DelegateValue.valueType$(DelegateValue.scala:16)
at org.mule.weave.v2.el.MuleTypedValue.valueType(MuleTypedValue.scala:177)
at org.mule.weave.v2.model.types.ObjectType$.accepts(Type.scala:1068)
Caused by: org.mule.extension.sftp.api.SftpConnectionException: Error occurred while trying to connect to host
... 112 more
Caused by: org.mule.runtime.api.connection.ConnectionException:
at org.mule.extension.sftp.api.SftpConnectionException.<init>(SftpConnectionException.java:38)
... 112 more
Caused by: org.mule.runtime.api.connection.ConnectionException:
... 112 more
Caused by: 4:
at com.jcraft.jsch.ChannelSftp.get(ChannelSftp.java:1540)
at com.jcraft.jsch.ChannelSftp.get(ChannelSftp.java:1290)
at org.mule.extension.sftp.internal.connection.SftpClient.getFileContent(SftpClient.java:347)
... 110 more
Caused by: java.io.IOException: Pipe closed
The "Pipe closed" error in SFTP indicates a communication failure that the SFTP connector cannot resolve, so the operation fails. I don't believe there is much you can do about it on the application side. You might try a newer version of the connector if you are using an older one, just in case.
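If the connector version is managed through Maven, upgrading it is just a matter of bumping the dependency in the application's pom.xml (the version below is only an example; check the latest available release):
<dependency>
    <groupId>org.mule.connectors</groupId>
    <artifactId>mule-sftp-connector</artifactId>
    <!-- example version only; use the latest available release -->
    <version>1.4.1</version>
    <classifier>mule-plugin</classifier>
</dependency>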

Including Multiple Attachments In Transaction Kills Node

Setup
Corda 4.6
Working from Java template
I have been experimenting with adding up to 10 Attachments of small (1 KB) zip files to a transaction.
Error when testing with StartedMockNodes:
io.github.classgraph.ClassGraphException: Uncaught exception during scan
at io.github.classgraph.ClassGraphException.newClassGraphException(ClassGraphException.java:89) ~[classgraph-4.8.90.jar:4.8.90]
at io.github.classgraph.ClassGraph.scan(ClassGraph.java:1555) ~[classgraph-4.8.90.jar:4.8.90]
...
Caused by: java.lang.OutOfMemoryError: Java heap space
at nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler.readAllBytesWithSpilloverToDisk(NestedJarHandler.java:815) ~[classgraph-4.8.90.jar:4.8.90]
at nonapi.io.github.classgraph.fastzipfilereader.PhysicalZipFile.<init>(PhysicalZipFile.java:161) ~[classgraph-4.8.90.jar:4.8.90]
at nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler.downloadJarFromURL(NestedJarHandler.java:576) ~[classgraph-4.8.90.jar:4.8.90]
...
Error when testing local nodes built with CordForm and connecting with RPC:
The node stops suddenly, with no errors in the log. In the directory of the failed node there will be two files:
hs_err_pid20400.log
java_pid20400.hprof
The hs_err log file shows errors similar to the StartedMockNode failures:
j nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler.readAllBytesWithSpilloverToDisk(Ljava/io/InputStream;Ljava/lang/String;JLnonapi/io/github/classgraph/utils/LogNode;)Lnonapi/io/github/classgraph/fileslice/Slice;+65
j nonapi.io.github.classgraph.fastzipfilereader.PhysicalZipFile.<init>(Ljava/io/InputStream;JLjava/lang/String;Lnonapi/io/github/classgraph/fastzipfilereader/NestedJarHandler;Lnonapi/io/github/classgraph/utils/LogNode;)V+25
j nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler.downloadJarFromURL(Ljava/lang/String;Lnonapi/io/github/classgraph/utils/LogNode;)Lnonapi/io/github/classgraph/fastzipfilereader/PhysicalZipFile;+428
j nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler.access$000(Lnonapi/io/github/classgraph/fastzipfilereader/NestedJarHandler;Ljava/lang/String;Lnonapi/io/github/classgraph/utils/LogNode;)Lnonapi/io/github/classgraph/fastzipfilereader/PhysicalZipFile;+3
j nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler$4.newInstance(Ljava/lang/String;Lnonapi/io/github/classgraph/utils/LogNode;)Ljava/util/Map$Entry;+124
Clarification #1: The error occurs during transaction execution, not when originally uploading the files to the node using CordaRPCOps.uploadAttachmentWithMetadata (that works fine).
Clarification #2: The first node to fail is the one constructing the transaction. If you try restarting this node it will fail on restart, and it takes several restarts to get it up and running again. Then any node that was receiving the transaction will fail; they also require several restarts to get up and running again. As a testament to Corda's flow framework, after enough restarts the transaction will eventually succeed and the Attachments will be transmitted.
Clarification #3: I can pre-upload the Attachments to all the nodes before executing the transaction and the failures still occur.
StartedMockNodes:
Found this
Added the following to my workFlows build.gradle file to stop the errors:
test {
maxHeapSize = "4096m"
}
Local Nodes & RPC:
??? - haven't found a solution yet
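One avenue I still want to try, by analogy with the mock-node fix, is simply giving each node's JVM a larger heap when it is started (a sketch, assuming the nodes are launched directly from corda.jar; I haven't verified that it helps yet):
java -Xmx4096m -jar corda.jar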

Does Nexus accept Groovy strings in script?

After running a Groovy script as a task to create a role with:
security.addRole(// id
roleDeveloper,
// name
roleDeveloper,
// description
"A developer on ${repoCap} group",
// privileges
["nx-repository-view-maven2-${repo}-dependencies-browse",
"nx-repository-view-maven2-${repo}-dependencies-read"],
// roles
["dw-all-public-repos"])
I can't access the Roles menu anymore. I get the following error:
com.orientechnologies.orient.core.exception.ODatabaseException: Error on deserialization of Serializable DB name="security"
[...]
Caused by: java.lang.ClassNotFoundException: org.codehaus.groovy.runtime.GStringImpl
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) [na:1.8.0_91]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) [na:1.8.0_91]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) [na:1.8.0_91]
at org.apache.felix.framework.BundleWiringImpl.doImplicitBootDelegation(BundleWiringImpl.java:1782) [na:na]
at org.apache.felix.framework.BundleWiringImpl.searchDynamicImports(BundleWiringImpl.java:1717) [na:na]
at org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1552) [na:na]
at org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:79) [na:na]
at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:2018) [na:na]
After running several tests (with and without string interpolation) on several versions of Nexus 3.x, it looks like string interpolation is supported for some parameters but not for the privileges parameter.
Is it a known issue?
Now that my Roles menu is inaccessible due to the error above, is there a way to fix it? (I tried to remove the role with a script, but it failed because the delete performs a load first.)
Sorry for the problems Alexandre. It looks like you'll have to connect to the database directly in order to fix the problematic records. Instructions for how to do this with Nexus offline are here: https://support.sonatype.com/hc/en-us/articles/115002930827-Accessing-the-OrientDB-Console
In particular the database you're looking to connect to is 'security':
connect plocal:data/db/security admin admin
And the tables you will need to inspect/delete from are 'privilege' and 'role'.
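Once connected, you can inspect the records with OrientDB SQL and delete the ones that fail to deserialize, along these lines (the 'id' field name is an assumption based on the addRole arguments, so check the actual record structure first):
select from role
delete from role where id = '<id of the broken role>'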
I'll keep an eye out here in case you run into problems or have any followup questions.
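To avoid the problem in the first place, one option (not verified against every Nexus version) is to convert the GStrings to plain java.lang.String before handing them to the security API, so nothing Groovy-specific gets serialized into the database:
security.addRole(roleDeveloper,
                 roleDeveloper,
                 // description: call toString() so a java.lang.String is stored, not a GStringImpl
                 "A developer on ${repoCap} group".toString(),
                 // privileges: same conversion for the interpolated privilege names
                 ["nx-repository-view-maven2-${repo}-dependencies-browse".toString(),
                  "nx-repository-view-maven2-${repo}-dependencies-read".toString()],
                 // roles
                 ["dw-all-public-repos"])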

When upgrade to oracle11g-x64,we get into trouble

We run a Java web application on WebLogic 10.3 with the BEA JDK 1.6, Hibernate 3, c3p0 0.9.1.2 and Oracle 9.2.8. When we upgraded the database to an Oracle 11g x64 cluster with ojdbc6, we ran into many errors.
First, the following error message appeared and the application lost its database connections every few hours:
com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector#2a01aa -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
2016-01-28 18:09:55 com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector#2a01aa -- APPARENT DEADLOCK!!! Complete Status:
Managed Threads: 3
Active Threads: 3
Active Tasks:
Then we changed the configuration to hibernate.c3p0.max_statements=0 and this error disappeared, but another OutOfMemoryError arose:
Caused by: javassist.CannotCompileException: by java.lang.OutOfMemoryError: class allocation, 188463944 loaded, 187957248 footprint JVM#check_alloc (src/jvm/model/classload/classalloc.c:118). 67744 bytes
at javassist.util.proxy.FactoryHelper.toClass(FactoryHelper.java:169)
at org.jboss.seam.util.ProxyFactory.createClass3(ProxyFactory.java:350)
... 77 more
Caused by: java.lang.OutOfMemoryError: class allocation, 188463944 loaded, 187957248 footprint JVM#check_alloc (src/jvm/model/classload/classalloc.c:118). 67744 bytes
Can anyone help me? Thanks in advance!
Update to the latest c3p0 (now 0.9.5.2).
Continue to use statement caching if that works for you, but to avoid the deadlocks, use the following setting:
c3p0.statementCacheNumDeferredCloseThreads=1
See the docs.
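c3p0 also picks up configuration from a c3p0.properties file at the root of the classpath, so one way to apply the setting (assuming it is not already being overridden elsewhere) is:
# c3p0.properties, placed at the top level of the classpath
c3p0.statementCacheNumDeferredCloseThreads=1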

Cloudify: uninstall-application fails

To uninstall an application I called uninstall-application app-name from the Cloudify prompt in a local cloud environment. However, the uninstall is unsuccessful. The log file shows the following exception.
2013-10-14 13:06:50,537 rest [1] INFO [org.cloudifysource.rest.controllers.ServiceController] - Removing all application scope attributes for application
2013-10-14 13:06:50,542 rest [1] WARNING [org.openspaces.admin.internal.admin.DefaultAdmin] - Failed to execute: org.openspaces.admin.internal.gsm.DefaultGridServiceManager$3#70b1ec8b - org.openspaces.admin.AdminException: Failed to undeploy processing unit [app-name]; Caused by: org.openspaces.admin.AdminException: Failed to undeploy processing unit [app-name]
at org.openspaces.admin.internal.gsm.DefaultGridServiceManager.undeployProcessingUnit(DefaultGridServiceManager.java:279)
at org.openspaces.admin.internal.gsm.DefaultGridServiceManager$3.run(DefaultGridServiceManager.java:799)
at org.openspaces.admin.internal.admin.DefaultAdmin$LoggerRunnable.run(DefaultAdmin.java:2077)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.jini.rio.core.OperationalStringException: GSM not found
at org.jini.rio.monitor.ProvisionMonitorImpl.undeploy(ProvisionMonitorImpl.java:601)
at org.jini.rio.monitor.ProvisionMonitorAdminImpl.undeploy(ProvisionMonitorAdminImpl.java:126)
at org.jini.rio.monitor.DeployAdminGigaspacesMethodinternalInvoke7.internalInvoke(Unknown Source)
at com.gigaspaces.internal.reflection.fast.AbstractMethod.invoke(AbstractMethod.java:41)
at com.gigaspaces.lrmi.LRMIRuntime.invoked(LRMIRuntime.java:450)
at com.gigaspaces.lrmi.nio.Pivot.consumeAndHandleRequest(Pivot.java:557)
at com.gigaspaces.lrmi.nio.Pivot.handleRequest(Pivot.java:658)
at com.gigaspaces.lrmi.nio.Pivot$ChannelEntryTask.run(Pivot.java:196)
... 3 more
2013-10-14 13:06:51,544 rest [1] INFO [org.cloudifysource.rest.util.RestPollingRunnable] - undeployAndWait for processing unit has not finished yet
Eventually the operation times out. After that I cannot even tear down the local cloud; the only way to recover is to reboot the system. I would appreciate some help on this one.
The following error:
Caused by: org.jini.rio.core.OperationalStringException: GSM not found at org.jini.rio.monitor.ProvisionMonitorImpl.undeploy
indicates that one of the Cloudify management components was missing. It may have crashed earlier, or perhaps the local machine was running at 100% CPU, causing local components to not respond to each other.
In an actual cloud deployment, this would cause the Cloudify agent to restart the failed component, but in the local-cloud environment the agent and the other management components run in the same process to conserve memory and speed up start-up time.
