I want to run my existing R script from Spark.
I have set up R and Spark on my machine and am trying to execute the code, but I am getting an exception that is not very helpful.
Spark Code-
String file = "/home/MSA2.R";
SparkConf sparkConf = new SparkConf().setAppName("First App")
.setMaster("local[1]");
@SuppressWarnings("resource")
JavaSparkContext sparkContext = new JavaSparkContext(sparkConf);
JavaRDD<String> rdd = sparkContext.textFile("/home/test.csv")
.pipe(file);
R code -
f1 <- read.csv("/home/testing.csv")
Exception -
Exception in thread "main" org.apache.spark.SparkException: Job
aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most
recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost):
java.lang.IllegalStateException: Subprocess exited with status 2.
Command ran: /home/MSA2.R
java.util.NoSuchElementException: key not found: 1
org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in 120 seconds. This timeout is controlled by spark.rpc.askTimeout at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
The exception does not give much to go on when debugging the issue.
Can anyone say whether this approach is correct? If so, can anyone help with the issue? If not, please suggest another approach.
Note: I don't want to use SparkR.
Reference for the above code: https://www.linkedin.com/pulse/executing-existing-r-scripts-from-spark-rutger-de-graaf
The actual error is:
java.lang.IllegalStateException: Subprocess exited with status 2.
Command ran: /home/MSA2.R
Make sure MSA2.R exists at the given location, on the same cluster where you are running the Spark jobs.
Exit status 2 generally occurs when the script is not able to access the device.
I have fixed the issue. I added
#!/usr/bin/Rscript
as the first line of the R script, and it worked.
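The fix fits what the error shows: the "Command ran: /home/MSA2.R" line indicates that pipe() runs the script path itself as the command, so the file has to be directly executable. Below is a minimal Python sketch of a pre-flight check. The function name check_pipe_script and the three checks are illustrative assumptions, not part of Spark, and passing them does not guarantee that the script will succeed.

```python
import os


def check_pipe_script(path):
    """Return a list of reasons why `path` may fail to run as a standalone command."""
    if not os.path.isfile(path):
        return ["file does not exist: " + path]
    problems = []
    # A script launched directly needs an interpreter line such as #!/usr/bin/Rscript.
    with open(path, "rb") as handle:
        first_two = handle.read(2)
    if first_two != b"#!":
        problems.append("no shebang line (e.g. #!/usr/bin/Rscript)")
    # The user running the Spark job must also be allowed to execute the file.
    if not os.access(path, os.X_OK):
        problems.append("file is not executable (try chmod +x)")
    return problems
```

Running check_pipe_script("/home/MSA2.R") on every machine that executes tasks, before submitting the job, reports these basic problems up front instead of leaving them to surface as a subprocess exit status.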
I have set up a streaming job using the Auto Loader feature; the input is located in Azure ADLS Gen2 in Parquet format. Below is the code.
df = spark.readStream.format("cloudFiles")\
.options(**cloudfile)\
.schema(schema).load(staging_path)
df.writeStream\
.trigger(processingTime="10 minutes")\
.outputMode("append")\
.option("checkpointLocation", checkpoint_path)\
.foreachBatch(writeBatchToADXandDelta)\
.start()
This code throws the error below:
py4j.Py4JException: An exception was raised by the Python Proxy. Return Message: Traceback (most recent call last):
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 11.0 failed 4 times, most recent failure: Lost task 5.3 in stage 11.0 (TID 115) (172.20.58.133 executor 1): com.databricks.sql.io.FileReadException: Error while reading file /mnt/adl2/environment=production/data_lake=main/tier=ingress/area=transient/domain=iotdata/entity=screens/topic=sensor/vendor=abc/source_system=iot_hub/parent=external/dataset=screens/kind=data/evolution=2/file_format=parquet/source=kevitsa/ingestion_date=2022/08/03/13/-136567710_c96a862c2aaf43cfbd62025cd3db4a48_1.parquet.
..
Caused by: java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:208)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$2.apply(ParquetFileFormat.scala:397)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$2.apply(ParquetFileFormat.scala:373)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.getNext(FileScanRDD.scala:333)
... 18 more
What could be the reason for this?
Thanks in advance!
From the error message, it looks like there is a broken file in your location. You can use the ignoreCorruptFiles option (see the documentation) to skip broken files instead of failing.
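Before skipping files silently, it can help to find out which ones are broken. Here is a minimal Python sketch, assuming the files are unencrypted Parquet and reachable through a local file path (both are assumptions about your environment). It relies on the fact that a Parquet file starts and ends with the 4-byte magic PAR1. The names looks_like_parquet and find_suspect_files are hypothetical helpers, and the check only catches truncated or non-Parquet files, not every kind of corruption.

```python
import os

PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Cheap sanity check: the file starts and ends with the Parquet magic bytes."""
    size = os.path.getsize(path)
    # Smallest plausible layout: 4-byte header magic, 4-byte footer length, 4-byte trailer magic.
    if size < 12:
        return False
    with open(path, "rb") as handle:
        head = handle.read(4)
        handle.seek(-4, os.SEEK_END)
        tail = handle.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


def find_suspect_files(root):
    """Walk `root` and return the .parquet files that fail the magic check."""
    suspects = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            if name.endswith(".parquet"):
                full = os.path.join(folder, name)
                if not looks_like_parquet(full):
                    suspects.append(full)
    return sorted(suspects)
```

A file that passes this check can still be unreadable, so treat the result as a first filter rather than proof that the remaining files are healthy.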
I am trying to generate the basic nodes (PartyA, PartyB and Notary) on Ubuntu 14 by running ./gradlew deployNodes or even ./gradlew clean deployNodes. The error reads:
... still waiting. If this is taking longer than usual, check the node logs.
Error while generating node info file /cordapp-template-java/build/nodes/Notary/logs
Error while generating node info file /cordapp-template-java/build/nodes/PartyB/logs
Error while generating node info file /cordapp-template-java/build/nodes/PartyA/logs
Task :deployNodes FAILED
FAILURE: Build failed with an exception.
What went wrong:
Execution failed for task ':deployNodes'.
Error while generating node info file. Please check the logs in /cordapp-template-java/build/nodes/Notary/logs.
The logs do not give any indication of what went wrong.
I have run into this issue myself. From what I saw, it seemed to be a random incident on a Unix-based machine.
The issue was resolved after I moved the project to a different location. It is absurd, but I have never run into this issue again.
I tried to migrate yo-cordapp from version 2.0 to 3.0 but got this error:
FAILURE: Build failed with an exception.
Where:
Build file '/home/atul/Documents/mg/IdeaProjects/yo-cordapp/build.gradle' line: 36
What went wrong:
A problem occurred evaluating root project 'yo'.
Failed to apply plugin [id 'net.corda.plugins.cordformation']
Could not create plugin of type 'Cordformation'.
Could not initialize class net.corda.plugins.Cordformation
Try:
Run with --stacktrace option to get the stack trace. Run with --debug option to get more log output.
BUILD FAILED
Total time: 0.513 secs
Stopped 0 worker daemon(s).
Received result Failure[value=org.gradle.initialization.ReportedException: org.gradle.internal.exceptions.LocationAwareException: Build file '/home/atul/Documents/mg/IdeaProjects/yo-cordapp/build.gradle' line: 36
A problem occurred evaluating root project 'yo'.] from daemon DaemonInfo{pid=1439, address=[1bb69a7c-e166-4da4-be23-025402c62d96 port:36544, addresses:[/0:0:0:0:0:0:0:1, /127.0.0.1]], state=Idle, lastBusy=1527564107106, context=DefaultDaemonContext[uid=dbe9d9f3-b86b-448f-8d35-648c4aad50fd,javaHome=/usr/lib/jvm/java-8-oracle,daemonRegistryDir=/root/.gradle/daemon,pid=1439,idleTimeout=10800000,daemonOpts=-XX:MaxPermSize=256m,-XX:+HeapDumpOnOutOfMemoryError,-Xmx1024m,-Dfile.encoding=UTF-8,-Duser.country=IN,-Duser.language=en,-Duser.variant]} (build should be done).
Why am I getting this error?
[PS: I know this migration has already been done, but I get this error when I try it myself.]
You need to apply the new cordapp plugin. See https://github.com/corda/cordapp-example/blob/release-V3/kotlin-source/build.gradle#L11.
Good day. I have an issue creating an Android project. I'm currently using Windows 7 with JDK 8u40 installed, and I'm using the latest dalvik sdk. But when I attempted to create an Android project, an error was thrown:
* What went wrong:
Execution failed for task ':deleteSrcAndLayout'.
> Directory does not exist: C:\AndroidFX\CodeGenerator\src
Here's the complete error log:
C:\dalvik-sdk\samples\Ensemble8>./gradlew --info createProject -PDEBUG -PDIR=C:/
AndroidFX -PPACKAGE="hello" -PNAME="CodeGenerator" -PANDROID_SDK=C:/AndroidSDK/s
dk -PJFX_SDK=C:/dalvik-sdk -PJFX_APP=C:/Jar -PJFX_MAIN="hello.Hello"
Starting Build
Settings evaluated using empty settings script.
Projects loaded. Root project using build file 'C:\dalvik-sdk\samples\Ensemble8\
build.gradle'.
Included projects: [root project 'Ensemble8']
Evaluating root project 'Ensemble8' using build file 'C:\dalvik-sdk\samples\Ense
mble8\build.gradle'.
Starting file lock listener thread.
All projects evaluated.
Selected primary task 'createProject'
Tasks to be executed: [task ':conf', task ':androidCreateProject', task ':delete
SrcAndLayout', task ':writeAntProperties', task ':updateManifest', task ':update
StringsXml', task ':updateBuildXml', task ':createProject']
:conf (Thread[main,5,main]) started.
:conf
Executing task ':conf' (up-to-date check took 0.0 secs) due to:
Task has not declared any outputs.
====================================================
Android SDK: [C:/AndroidSDK/sdk]
Target: [android-21]
Project name: [CodeGenerator]
Package: [hello]
JavaFX application: [C:/Jar]
JavaFX sdk: [C:/dalvik-sdk]
JavaFX main.class: [hello.Hello]
Workdir: [C:/AndroidFX]
debug: [true]
===================================================
:conf (Thread[main,5,main]) completed. Took 0.078 secs.
:androidCreateProject (Thread[main,5,main]) started.
:androidCreateProject
Executing task ':androidCreateProject' (up-to-date check took 0.0 secs) due to:
Task has not declared any outputs.
Starting process 'command 'C:/AndroidSDK/sdk/tools/android.bat''. Working direct
ory: C:\AndroidFX Command: C:/AndroidSDK/sdk/tools/android.bat create project -n
CodeGenerator -p CodeGenerator -t android-21 -k hello -a Activity
An attempt to initialize for well behaving parent process finished.
Successfully started process 'command 'C:/AndroidSDK/sdk/tools/android.bat''
Error: Package name 'hello' contains invalid characters.
A package name must be constitued of two Java identifiers.
Each identifier allowed characters are: a-z A-Z 0-9 _
Proces
s 'command 'C:/AndroidSDK/sdk/tools/android.bat'' finished with exit value 0 (st
ate: SUCCEEDED)
:androidCreateProject (Thread[main,5,main]) completed. Took 1.375 secs.
:deleteSrcAndLayout (Thread[main,5,main]) started.
:deleteSrcAndLayout
Executing task ':deleteSrcAndLayout' (up-to-date check took 0.0 secs) due to:
Task has not declared any outputs.
:deleteSrcAndLayout FAILED
:deleteSrcAndLayout (Thread[main,5,main]) completed. Took 0.594 secs.
FAILURE: Build failed with an exception.
* Where:
Build file 'C:\dalvik-sdk\samples\Ensemble8\build.gradle' line: 203
* What went wrong:
Execution failed for task ':deleteSrcAndLayout'.
> Directory does not exist: C:\AndroidFX\CodeGenerator\src
* Try:
Run with --stacktrace option to get the stack trace. Run with --debug option to
get more log output.
BUILD FAILED
Total time: 6.531 secs
Please help me, I'm stuck!
I also tried JDK 7u75, but it didn't work.
I successfully created an android project by editing the createHelloWorld.bat under android-tools in the dalvik-sdk.
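The log above also hints at a likely underlying cause: android.bat rejects the package name 'hello', so it appears that no project (and therefore no src directory) is created before :deleteSrcAndLayout runs. A minimal Python sketch of the rule quoted in the log follows. It assumes "two Java identifiers" means at least two dot-separated identifiers and that an identifier may not start with a digit. The function name is_valid_android_package is hypothetical and not part of any SDK.

```python
import re

# Assumption: an identifier starts with a letter or underscore,
# followed by any of the characters named in the log: a-z A-Z 0-9 _
_IDENTIFIER = r"[A-Za-z_][A-Za-z0-9_]*"
# At least two identifiers separated by dots, e.g. "com.hello".
_PACKAGE = re.compile(r"{0}(\.{0})+".format(_IDENTIFIER))


def is_valid_android_package(name):
    """Return True if `name` has two or more dot-separated identifiers."""
    return _PACKAGE.fullmatch(name) is not None
```

Under these assumptions the name hello from the command line above is rejected, while a two-part name such as com.hello (a made-up example) passes.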
I wrote an sbt task to run cssLint for my project using Rhino. cssLint returns an exit code to my sbt task.
My question is: how do I make the task fail if the exit code is non-zero?
I don't want to throw any exceptions. I want the last line of the task result to show [Failed] instead of [success], and the exit code of my sbt task to be non-zero.
SAMPLE
MyTask {
val exitcode = //rhino functions
//what to do??
}
The actual intent is to fail the build if css errors are present.
The way to fail the build without printing a stack trace on the console is to use the exceptions that sbt handles specially:
for sbt.MessageOnlyException, the error message is logged twice (first without the task name, then with it) and the build is stopped;
mix in sbt.FeedbackProvidedException or sbt.UnprintableException to implement custom exceptions for which sbt does not print a stack trace. A string with the task name and the exception's toString is logged once at the top level and the build is stopped. Essential information for the user is expected to have been logged already before these are thrown.
Disclaimer: I have not seen this information in the sbt manual; it is extracted from the sources of sbt 0.13.16. sbt.FeedbackProvidedException is used this way by the sbt compiler, sbt tests, and the sbt-web and Play sbt plugins.
My understanding is that the success message is always printed unless
the showSuccess setting is set to false, or
a task throws an exception.
In your particular case you want to report an error, so you should either throw an exception or return a value of a result type that can signal failure, such as None or Failure.
Say, you've got the following task defined in build.sbt:
lazy val tsk = taskKey[Unit]("Task that always fails")
tsk := {
throw new IllegalStateException("a message")
}
When you execute the tsk task, the exception is printed out with no [success] afterwards.
[no-success]> tsk
[trace] Stack trace suppressed: run last *:tsk for the full output.
[error] (*:tsk) java.lang.IllegalStateException: a message
[error] Total time: 0 s, completed Feb 15, 2014 11:45:27 PM
I would rather avoid this style of programming and rely on Option as a way to report an issue with processing.
With the following tskO definition:
lazy val tskO = taskKey[Option[String]]("Task that reports failure as None")
tskO := None
you'd then check the result and if it's None you'd know it's a failure.