Including multiple attachments in a transaction kills the node - Corda

Setup
Corda 4.6, working from the Java CorDapp template.
I have been experimenting with adding up to 10 attachments of small (1 KB) zip files to a single transaction.
Error when testing with StartedMockNodes:
io.github.classgraph.ClassGraphException: Uncaught exception during scan
at io.github.classgraph.ClassGraphException.newClassGraphException(ClassGraphException.java:89) ~[classgraph-4.8.90.jar:4.8.90]
at io.github.classgraph.ClassGraph.scan(ClassGraph.java:1555) ~[classgraph-4.8.90.jar:4.8.90]
...
Caused by: java.lang.OutOfMemoryError: Java heap space
at nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler.readAllBytesWithSpilloverToDisk(NestedJarHandler.java:815) ~[classgraph-4.8.90.jar:4.8.90]
at nonapi.io.github.classgraph.fastzipfilereader.PhysicalZipFile.<init>(PhysicalZipFile.java:161) ~[classgraph-4.8.90.jar:4.8.90]
at nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler.downloadJarFromURL(NestedJarHandler.java:576) ~[classgraph-4.8.90.jar:4.8.90]
...
Error when testing local nodes built with Cordform and connecting over RPC:
The node stops suddenly, with no errors in its log. In the failed node's directory there will be two files:
hs_err_pid20400.log
java_pid20400.hprof
The hs_err log file shows a stack similar to the StartedMockNode failures:
j nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler.readAllBytesWithSpilloverToDisk(Ljava/io/InputStream;Ljava/lang/String;JLnonapi/io/github/classgraph/utils/LogNode;)Lnonapi/io/github/classgraph/fileslice/Slice;+65
j nonapi.io.github.classgraph.fastzipfilereader.PhysicalZipFile.<init>(Ljava/io/InputStream;JLjava/lang/String;Lnonapi/io/github/classgraph/fastzipfilereader/NestedJarHandler;Lnonapi/io/github/classgraph/utils/LogNode;)V+25
j nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler.downloadJarFromURL(Ljava/lang/String;Lnonapi/io/github/classgraph/utils/LogNode;)Lnonapi/io/github/classgraph/fastzipfilereader/PhysicalZipFile;+428
j nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler.access$000(Lnonapi/io/github/classgraph/fastzipfilereader/NestedJarHandler;Ljava/lang/String;Lnonapi/io/github/classgraph/utils/LogNode;)Lnonapi/io/github/classgraph/fastzipfilereader/PhysicalZipFile;+3
j nonapi.io.github.classgraph.fastzipfilereader.NestedJarHandler$4.newInstance(Ljava/lang/String;Lnonapi/io/github/classgraph/utils/LogNode;)Ljava/util/Map$Entry;+124
Clarification #1: The error occurs during transaction execution, not when originally uploading the files to the node with CordaRPCOps.uploadAttachmentWithMetadata (that works fine).
Clarification #2: The first node to fail is the one constructing the transaction. If you restart this node, it fails again on startup; it takes several restarts to get it up and running again. Then any node that was receiving the transaction fails, and those nodes also require several restarts before they recover. As a testament to Corda's flow framework, after enough restarts the transaction eventually succeeds and the attachments are transmitted.
Clarification #3: I can pre-upload the attachments to all the nodes before executing the transaction and the failures still occur.
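For reference, the flow assembles the transaction roughly like this (a minimal sketch; MyContract, notary, outputState, and attachmentIds are illustrative stand-ins, not the actual CorDapp code):

import java.util.List;
import net.corda.core.crypto.SecureHash;
import net.corda.core.transactions.TransactionBuilder;

// Inside the initiating flow's call() method.
TransactionBuilder builder = new TransactionBuilder(notary)
        .addOutputState(outputState, MyContract.ID)
        .addCommand(new MyContract.Commands.Create(),
                getOurIdentity().getOwningKey());

// Each hash was returned earlier by CordaRPCOps.uploadAttachmentWithMetadata,
// so the attachments already sit in the node's attachment storage.
for (SecureHash attachmentId : attachmentIds) {
    builder.addAttachment(attachmentId);
}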

StartedMockNodes:
Found a workaround: adding the following to my workflows module's build.gradle stops the errors:
test {
    maxHeapSize = "4096m"
}
Local Nodes & RPC:
Haven't found a solution yet.
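One avenue I have not verified (an assumption mirroring the mock-node fix, not a confirmed solution) is raising each local node's heap through the custom.jvmArgs entry in its node.conf, which the capsule launcher passes to the node JVM:

custom = {
    // Assumption: 4 GB mirrors the maxHeapSize that fixed the mock nodes.
    jvmArgs = [ "-Xmx4096m" ]
}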

Related

Google Cloud Composer (Airflow) - Scalability issues

I'm facing some issues on my Cloud Composer instance that result in failed tasks.
Details of the instance configuration:
Composer image: composer-2.0.29-airflow-2.3.3 (Airflow 2.3.3)
airflow.cfg:
parallelism = 32
dag_concurrency = 100
worker_concurrency = 24
In terms of workload:
I have 60 DAGs, each of which can contain up to 55 tasks that need to run in parallel.
They don't do any heavy compute, only light PythonOperator/GCSOperator/BigQueryOperator work.
I often encounter this type of error:
*** Log file is not found: gs://xxx/xxx/attempt=2.log.
*** The task might not have been executed or worker executing it might have finished abnormally (e.g. was evicted).
*** Please, refer to https://cloud.google.com/composer/docs/how-to/using/troubleshooting-dags#common_issues hints to learn what might be possible reasons for a missing log.
All of my tasks have 3 retries, but when this happens the task stops at try 2 and sends a failure email, and I don't understand why. Example of the error in the mail sent:
Try 2 out of 3
Exception:
Executor reports task instance finished (failed) although the task says its queued. (Info: None) Was the task killed externally?
I also receive random zombie tasks (Detected as zombie).
When I clear the task, it succeeds as it should.
(I don't have access to GKE, but I can ask for access if that helps.)
Any advice to prevent these errors and understand what is happening?
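Not part of the original question, but since the log mentions worker eviction, one knob worth reviewing is Composer 2's workload sizing. A hedged sketch, assuming the gcloud flag names for Composer 2 environments (the environment name, location, and values are placeholders; verify the flags with gcloud composer environments update --help):

# Sketch only - flag names and values are assumptions, not a tested fix.
gcloud composer environments update my-composer-env \
    --location europe-west1 \
    --worker-cpu 2 \
    --worker-memory 8GB \
    --max-workers 6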

firebase-admin fails with a maxretry error without much information about the underlying cause, and also stops the node script

I have a mobile and web app that use the Firebase Realtime Database, and some long-running tasks that are served on servers with the help of firebase-queue and firebase-admin. The long-running task is to find out who else in a person's contact book is using the mobile app: when you install the app, it sends a task to the server with your contact book data and asks it to find the people in the contact book who are also using the app. Every now and then I see two types of errors in the logs. The first, below, also causes the node process to stop.
/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:15373
return queue[i].onComplete(new Error(abortReason), false, null);
^
Error: maxretry
at /Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:15373:52
at exceptionGuard (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:4018:9)
at repoRerunTransactionQueue (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:15386:9)
at repoRerunTransactions (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:15279:5)
at /Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:15260:13
at /Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:7061:17
at PersistentConnection.onDataMessage_ (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:7088:17)
at Connection.onDataMessage_ (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:5882:14)
at Connection.onPrimaryMessageReceived_ (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:5876:18)
at WebSocketConnection.onMessage (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:5778:27)
at WebSocketConnection.appendFrame_ (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:4491:18)
at WebSocketConnection.handleIncomingFrame (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:4539:22)
at Client.mySock.onmessage (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:4438:19)
at Client.dispatchEvent (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:2883:30)
at Client._receiveMessage (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:3042:10)
at Client$2.<anonymous> (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:2924:49)
at Client$2.emit (node:events:539:35)
at Client$2.emit (node:domain:475:12)
at Client$2.<anonymous> (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:2186:14)
at pipe (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:1503:40)
at Pipeline$1._loop (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:1510:3)
at Pipeline$1.processIncomingMessage (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:1479:8)
at Extensions$1.processIncomingMessage (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:1645:20)
at Client$2._emitMessage (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:2177:22)
at Client$2._emitFrame (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:2137:19)
at Client$2.parse (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:1863:18)
at Client$2.parse (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:2369:60)
at IO.write (/Users/varungupta/Projects/myapp-server/node_modules/@firebase/database-compat/dist/index.standalone.js:186:16)
at TLSSocket.ondata (node:internal/streams/readable:754:22)
at TLSSocket.emit (node:events:527:28)
at TLSSocket.emit (node:domain:475:12)
at addChunk (node:internal/streams/readable:315:12)
at readableAddChunk (node:internal/streams/readable:289:9)
at TLSSocket.Readable.push (node:internal/streams/readable:228:10)
at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23)
There isn't much information about what exactly caused the maxretry error. It happens at random after a few days of running the script; it doesn't happen right away.
The second error I see, which isn't as disruptive as the one above, is:
[2022-06-01T10:30:49.722Z] @firebase/database: FIREBASE WARNING: transaction at /queue/tasks/-N3TDQCAdt4y-akb0_MK failed: disconnect
This doesn't stop the node process, and I can see that a transaction failed, but I'm not sure why it disconnected or how I can resolve the problem.
I am using firebase-admin 9.6.0.

Error staging application: App staging failed in the buildpack compile phase in HWC Buildpack

I am trying to deploy my application built on ASP.NET 4.6.1, so I am using the HWC buildpack.
Below is my manifest.yml:
---
applications:
- name: DRSN
  random-route: true
  memory: 128M
  buildpack: https://github.com/cloudfoundry/hwc-buildpack.git
  env:
    DOTNET_CLI_TELEMETRY_OPTOUT: 1
    DOTNET_SKIP_FIRST_TIME_EXPERIENCE: true
The error that I am receiving is below.
Waiting for API to complete processing files...
Staging app and tracing logs...
Cell 0f7012eb-9e32-4fdf-ba92-85aee4639139 creating container for instance 34107c3c-1acb-4aa5-b435-b06516abcfcb
Cell 0f7012eb-9e32-4fdf-ba92-85aee4639139 successfully created container for instance 34107c3c-1acb-4aa5-b435-b06516abcfcb
Downloading app package...
Downloading build artifacts cache...
Downloaded build artifacts cache (231B)
Downloaded app package (19.5M)
Failed to compile droplet: Failed to compile droplet: fork/exec /tmp/buildpackdownloads/6c6dca8d638ac0d145d6581f9eb9a96a/bin/compile: permission denied
Exit status 223
Cell 0f7012eb-9e32-4fdf-ba92-85aee4639139 stopping instance 34107c3c-1acb-4aa5-b435-b06516abcfcb
Cell 0f7012eb-9e32-4fdf-ba92-85aee4639139 destroying container for instance 34107c3c-1acb-4aa5-b435-b06516abcfcb
Error staging application: App staging failed in the buildpack compile phase
Can anyone help me resolve this issue? Is something wrong in my manifest.yml, or is it something else?
I believe the problem is that you're telling the system to use the HWC buildpack, but you're not setting the Windows stack (at least based on what info I can see). That means the app defaults to the Linux stack, which is why you're seeing the fork/exec /tmp/buildpackdownloads/... permission denied error.
Try adding stack: windows to your manifest.yml, or -s windows to your cf push command (for future reference, when you need help, always include the full cf push command you're running).
PS: you shouldn't use https://github.com/cloudfoundry/hwc-buildpack.git - that tells the system to grab the master branch in whatever state it's currently in, which is (a) not reproducible and (b) not guaranteed to be in a working state. Either use the platform-provided buildpack names (from cf buildpacks) or append #<branch_or_tag> to the end of the URL so it picks a specific branch or tag. All CF buildpacks have tags for each release; it's strongly recommended you use a tagged release.
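Putting both suggestions together, the corrected manifest would look roughly like this (the tag shown is a placeholder, not a real release; pick an actual tag from the hwc-buildpack releases page):

---
applications:
- name: DRSN
  stack: windows
  random-route: true
  memory: 128M
  # Placeholder tag - replace with an actual release tag from the repo
  buildpack: https://github.com/cloudfoundry/hwc-buildpack.git#v1.2.3
  env:
    DOTNET_CLI_TELEMETRY_OPTOUT: 1
    DOTNET_SKIP_FIRST_TIME_EXPERIENCE: true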

Corda throws error trying to generate the basic nodes

I am trying to generate the basic nodes (PartyA, PartyB, and a Notary) on Ubuntu 14 by running ./gradlew deployNodes (or ./gradlew clean deployNodes). The error reads:
... still waiting. If this is taking longer than usual, check the node logs.
Error while generating node info file /cordapp-template-java/build/nodes/Notary/logs
Error while generating node info file /cordapp-template-java/build/nodes/PartyB/logs
Error while generating node info file /cordapp-template-java/build/nodes/PartyA/logs
Task :deployNodes FAILED
FAILURE: Build failed with an exception.
What went wrong:
Execution failed for task ':deployNodes'.
Error while generating node info file. Please check the logs in /cordapp-template-java/build/nodes/Notary/logs.
Error while generating node info file. Please check the logs in /cordapp-template-java/build/nodes/Notary/logs.
The error logs do not provide any indication of the cause.
I have personally run into this question myself. From what I saw, it seemed to be a random incident on the Unix-based machine.
The issue was resolved after I moved the project to a different location. It is absurd, but I have never run into this issue again.

When upgrading to Oracle 11g x64, we get into trouble

We ran a Java web application server with WebLogic 10.3 + BEA JDK 1.6 + Hibernate 3 + c3p0 0.9.1.2 + Oracle 9.2.8. When we upgraded the database to an Oracle 11g x64 cluster with ojdbc6, we hit many errors.
First, the following error message appeared, and every few hours the application could not connect to the database:
com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector#2a01aa -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
2016-01-28 18:09:55 com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector#2a01aa -- APPARENT DEADLOCK!!! Complete Status:
Managed Threads: 3
Active Threads: 3
Active Tasks:
Then we changed the config hibernate.c3p0.max_statements=0 and this error disappeared, BUT another OutOfMemoryError arose:
Caused by: javassist.CannotCompileException: by java.lang.OutOfMemoryError: class allocation, 188463944 loaded, 187957248 footprint JVM#check_alloc (src/jvm/model/classload/classalloc.c:118). 67744 bytes
at javassist.util.proxy.FactoryHelper.toClass(FactoryHelper.java:169)
at org.jboss.seam.util.ProxyFactory.createClass3(ProxyFactory.java:350)
... 77 more
Caused by: java.lang.OutOfMemoryError: class allocation, 188463944 loaded, 187957248 footprint JVM#check_alloc (src/jvm/model/classload/classalloc.c:118). 67744 bytes
Can anyone help me? Thanks in advance!
Update to the latest c3p0 (now 0.9.5.2).
Continue to use statement caching if that works for you, but to avoid the deadlocks, use the following setting.
c3p0.statementCacheNumDeferredCloseThreads=1
See the docs.
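For example (an assumption about placement on this stack, not something stated in the original answer): since Hibernate's built-in hibernate.c3p0.* bridge only maps a handful of settings, this property normally goes in a c3p0.properties file at the root of the classpath:

# c3p0.properties (classpath root)
# A dedicated thread handles deferred Statement.close() calls, which
# avoids the APPARENT DEADLOCK seen when statement caching is enabled.
c3p0.statementCacheNumDeferredCloseThreads=1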
