Errors running HPL Linpack with MPI

I'm trying to run HPL Linpack on my personal laptop, using CentOS 8 in a VM.
Allocated cores: 6
Memory: 12.5 GB
Nodes: 1
When I run with smaller values of N it runs fine, but when I try to maximise CPU usage with bigger values of N (trying to go up to 75-80% usage), I get a different error each time.
Errors (each of the following appeared on a separate run):
[1617771807.179752] [localhost:3301 :0] sock.c:344 UCX ERROR recv(fd=28) failed: Bad address
[1617771807.188129] [localhost:3298 :0] sock.c:344 UCX ERROR recv(fd=27) failed: Connection reset by peer
[1617771807.249456] [localhost:3298 :0] sock.c:344 UCX ERROR sendv(fd=-1) failed: Bad file descriptor
[localhost:03298] *** An error occurred in MPI_Send
[localhost:03298] *** reported by process [3696427009,2]
[localhost:03298] *** on communicator MPI COMMUNICATOR 5 SPLIT FROM 3
[localhost:03298] *** MPI_ERR_OTHER: known error not in list
[localhost:03298] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[localhost:03298] *** and potentially your MPI job)
_________________________________________________________________________________________
malloc(): corrupted top size
[localhost:06009] *** Process received signal ***
[localhost:06009] Signal: Aborted (6)
[localhost:06009] Signal code: (-6)
[localhost:06009] [ 0] /lib64/libpthread.so.0(+0x12b20)[0x7f230e65cb20]
[localhost:06009] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f230e2be7ff]
[localhost:06009] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f230e2a8c35]
[localhost:06009] [ 3] /lib64/libc.so.6(+0x7a987)[0x7f230e301987]
[localhost:06009] [ 4] /lib64/libc.so.6(+0x81d8c)[0x7f230e308d8c]
[localhost:06009] [ 5] /lib64/libc.so.6(+0x851f5)[0x7f230e30c1f5]
[localhost:06009] [ 6] /lib64/libc.so.6(__libc_malloc+0x1e2)[0x7f230e30d412]
[localhost:06009] [ 7] ./xhpl[0x4232e3]
[localhost:06009] [ 8] ./xhpl[0x4202cd]
[localhost:06009] [ 9] ./xhpl[0x41168e]
[localhost:06009] [10] ./xhpl[0x408eff]
[localhost:06009] [11] ./xhpl[0x4018aa]
[localhost:06009] [12] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f230e2aa7b3]
[localhost:06009] [13] ./xhpl[0x401cae]
[localhost:06009] *** End of error message ***
_________________________________________________________________________________________
corrupted size vs. prev_size
[localhost:05847] *** Process received signal ***
[localhost:05847] Signal: Aborted (6)
[localhost:05847] Signal code: (-6)
[localhost:05847] [ 0] /lib64/libpthread.so.0(+0x12b20)[0x7f07c812eb20]
[localhost:05847] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f07c7d907ff]
[localhost:05847] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f07c7d7ac35]
[localhost:05847] [ 3] /lib64/libc.so.6(+0x7a987)[0x7f07c7dd3987]
[localhost:05847] [ 4] /lib64/libc.so.6(+0x81d8c)[0x7f07c7ddad8c]
[localhost:05847] [ 5] /lib64/libc.so.6(+0x825e6)[0x7f07c7ddb5e6]
[localhost:05847] [ 6] /lib64/libc.so.6(+0x83a1b)[0x7f07c7ddca1b]
[localhost:05847] [ 7] ./xhpl[0x423596]
[localhost:05847] [ 8] ./xhpl[0x4202a6]
[localhost:05847] [ 9] ./xhpl[0x41168e]
[localhost:05847] [10] ./xhpl[0x408eff]
[localhost:05847] [11] ./xhpl[0x4018aa]
[localhost:05847] [12] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f07c7d7c7b3]
[localhost:05847] [13] ./xhpl[0x401cae]
[localhost:05847] *** End of error message ***
I size N using the formula:
N = int((round(sqrt((memory_per_node * 1024 * 1024 * 1024 * nodes) / 8))) * percentage_usage)
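For reference, here is that sizing rule as a small runnable Python sketch for this exact setup (12.5 GB, 1 node). The final rounding of N down to a multiple of the block size NB is a common HPL tuning convention and an assumption of mine, not part of the formula above (nb=192 is just an illustrative block size):

from math import sqrt

def hpl_n(memory_per_node_gb, nodes, percentage_usage, nb=192):
    # An N x N matrix of 8-byte doubles should fill the requested
    # fraction of total memory: 8 * N^2 ~= usage * total_bytes.
    n = int(round(sqrt(memory_per_node_gb * 1024**3 * nodes / 8)) * percentage_usage)
    return n - n % nb  # round down to a multiple of NB (common HPL practice)

print(hpl_n(12.5, 1, 0.80))  # 32640 (32768 before the NB rounding)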

Related

ZeroMQ and Dash on Linux cause "Address already in use" error

I'm puzzled by this error message, as nothing is running on the port Dash uses (I ran sudo netstat -nlp to make sure), and ZMQ works perfectly on its own, as does the Dash functionality. Put them together, which doesn't seem unreasonable, and they should work well. I don't see how the bindings are causing the issue.
I'm using Linux, and I was told that ZMQ and Dash are using the same port. My problem is that I don't know how to address this issue.
Here's the error:
dave@deepthought:~/tontine_2022/just_messing$ julia 6_25_min_dash.jl
starting IN Socket 5555
Received request: END
dying
ERROR: LoadError: TaskFailedException
Stacktrace:
  [1] wait
    @ ./task.jl:334 [inlined]
  [2] hot_restart(func::Dash.var"#72#74"{Dash.DashApp, String, Int64}; check_interval::Float64, env_key::String, suppress_warn::Bool)
    @ Dash ~/.julia/packages/Dash/yscRy/src/utils/hot_restart.jl:23
  [3] run_server(app::Dash.DashApp, host::String, port::Int64; debug::Bool, dev_tools_ui::Nothing, dev_tools_props_check::Nothing, dev_tools_serve_dev_bundles::Nothing, dev_tools_hot_reload::Nothing, dev_tools_hot_reload_interval::Nothing, dev_tools_hot_reload_watch_interval::Nothing, dev_tools_hot_reload_max_retry::Nothing, dev_tools_silence_routes_logging::Nothing, dev_tools_prune_errors::Nothing)
    @ Dash ~/.julia/packages/Dash/yscRy/src/server.jl:64
  [4] build_dash()
    @ Main ~/tontine_2022/just_messing/6_25_min_dash.jl:46
  [5] top-level scope
    @ ~/tontine_2022/just_messing/6_25_min_dash.jl:68

    nested task error: LoadError: StateError("Address already in use")
    Stacktrace:
      [1] bind(socket::Socket, endpoint::String)
        @ ZMQ ~/.julia/packages/ZMQ/R3wSD/src/socket.jl:58
      [2] top-level scope
        @ ~/tontine_2022/just_messing/6_25_min_dash.jl:10
      [3] include(mod::Module, _path::String)
        @ Base ./Base.jl:418
      [4] include(x::String)
        @ Main.##274 ~/.julia/packages/Dash/yscRy/src/utils/hot_restart.jl:21
      [5] top-level scope
        @ ~/.julia/packages/Dash/yscRy/src/utils/hot_restart.jl:21
      [6] eval
        @ ./boot.jl:373 [inlined]
      [7] eval
        @ ./Base.jl:68 [inlined]
      [8] (::Dash.var"#21#22"{String, Symbol})()
        @ Dash ./task.jl:423
    in expression starting at /home/dave/tontine_2022/just_messing/6_25_min_dash.jl:10
in expression starting at /home/dave/tontine_2022/just_messing/6_25_min_dash.jl:68
dave@deepthought:~/tontine_2022/just_messing$
Can someone look at my code and see what I am doing wrong, please?
Here's the ZMQ pull code:
using ZMQ
using Dash
using DataFrames

context = Context()
in_socket = Socket(context, PULL)
ZMQ.bind(in_socket, "tcp://*:5555")
println("starting IN Socket 5555")

dash_columns = ["sym","price","sdmove","hv20","hv10","hv5","iv","iv%ile","prc%ile","volume"]
# empty table: a String column for sym, Float64 for the rest
df_dash_table = DataFrame([col => (col == "sym" ? String : Float64)[] for col in dash_columns])

function build_dash()
    app = dash()
    app.layout = html_div() do
        html_h1("tontine2"),
        dash_datatable(id="table",
            columns=[Dict("name" => i, "id" => i) for i in names(df_dash_table)],
            data=Dict.(pairs.(eachrow(df_dash_table))),
            editable=false,
            filter_action="native",
            sort_action="native",
            sort_mode="multi",
            row_selectable="multi",
            row_deletable=false,
            selected_rows=[],
            ## page_action="native",
            ## page_current= 0,
            ## page_size= 10,
        ) # end dash_datatable
    end
    run_server(app, "0.0.0.0", debug=true)
end

while true
    message = String(ZMQ.recv(in_socket))
    println("Received request: $message")
    if message == "END"
        println("dying")
        break
    end
end

build_dash()
And here's the code that triggers the event:
using ZMQ
context = Context()
stk_socket = Socket(context, PUSH)
ZMQ.connect(stk_socket, "tcp://localhost:5555")
ZMQ.send(stk_socket,"END")
ZMQ.close(stk_socket)
ZMQ.close(context)
This works here:
using ZMQ
using Dash
using Distributed

app = dash(external_stylesheets = ["https://codepen.io/chriddyp/pen/bWLwgP.css"])
app.layout = html_div() do
    html_h1("Hello Dash"),
    html_div("Dash.jl: Julia interface for Dash"),
    dcc_graph(
        id = "example-graph",
        figure = (
            data = [
                (x = [1, 2, 3], y = [4, 1, 2], type = "bar", name = "SF"),
                (x = [1, 2, 3], y = [2, 4, 5], type = "bar", name = "Montréal"),
            ],
            layout = (title = "Dash Data Visualization",)
        )
    )
end

@spawn run_server(app, "0.0.0.0", 8080)

function testpush()
    context = Context()
    stk_socket = Socket(context, PUSH)
    ZMQ.connect(stk_socket, "tcp://localhost:5555")
    ZMQ.send(stk_socket, "END")
    ZMQ.close(stk_socket)
    ZMQ.close(context)
end

context = Context()
in_socket = Socket(context, PULL)
ZMQ.bind(in_socket, "tcp://*:5555")
sleep(1)
@spawn testpush()
sleep(5)
If you are running the program in an editor task, note that the spawned process will keep running until you restart the editor. Perhaps a spawned process is clinging to the port?
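One quick way to test that hypothesis is to check whether anything is still bound to port 5555 before the next run. A minimal sketch, in Python purely as a neutral diagnostic (a shell command such as lsof -i :5555 gives the same answer):

import socket

def port_in_use(port, host="127.0.0.1"):
    # If bind() fails, some process is still holding the port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return False
        except OSError:
            return True

print(port_in_use(5555))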

Clickhouse distributed table node stops accepting TCP connections

Clickhouse version: (version 20.3.9.70 (official build))
(I know this is no longer supported, we have plans to upgrade but it takes time)
The Setup
We are running three query nodes (nodes with distributed tables only), we spun up the third one yesterday. All nodes point to the same storage nodes and tables.
The Problem
The node serves requests just fine over TCP and HTTP for up to 11 hours. After that, the clickhouse server starts to close TCP connections. HTTP still works just fine when this happens.
Extra Information/Evidence
The system.metrics.tcp_connection number steadily drops over time for the new node.
netstat shows a lot of TIME_WAIT connections:
netstat -ntp | tail -n+3 | awk '{print $6}' | sort | uniq -c | sort -n
2 LAST_ACK
380 CLOSE_WAIT
386 ESTABLISHED
29279 TIME_WAIT
Normal node for comparison:
1199 CLOSE_WAIT
1292 ESTABLISHED
186 TIME_WAIT
Opening clickhouse-client is not possible:
user@server:~$ clickhouse-client
ClickHouse client version 20.3.9.70 (official build).
Connecting to localhost:9000 as user default.
Code: 32. DB::Exception: Attempt to read after eof
The following shows up in logs:
2021.12.15 19:00:29.215048 [ 25146 ] {e2f742e013b7d83f5d1d6e524afc5d2b} <Warning> ConnectionPoolWithFailover: Connection failed at try №1, reason: Code: 32, e.displayText() = DB::Exception: Attempt to read after eof (version 20.3.9.70 (official build))
2021.12.15 19:03:32.098881 [ 25536 ] {} <Error> ServerErrorHandler: Poco::Exception. Code: 1000, e.code() = 107, e.displayText() = Net Exception: Socket is not connected, Stack trace (when copying this message, always include the lines below):
0. /build/obj-x86_64-linux-gnu/../contrib/poco/Foundation/src/Exception.cpp:27: Poco::IOException::IOException(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x1053e380 in /usr/lib/debug/usr/bin/clickhouse
1. /build/obj-x86_64-linux-gnu/../contrib/poco/Net/src/NetException.cpp:26: Poco::Net::NetException::NetException(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0xe38f6ed in /usr/lib/debug/usr/bin/clickhouse
2. /build/obj-x86_64-linux-gnu/../contrib/libcxx/include/string:2134: Poco::Net::SocketImpl::error(int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) (.cold) @ 0xe3a5093 in /usr/lib/debug/usr/bin/clickhouse
3. /build/obj-x86_64-linux-gnu/../contrib/libcxx/include/string:2134: Poco::Net::SocketImpl::peerAddress() @ 0xe3a0633 in /usr/lib/debug/usr/bin/clickhouse
4. /build/obj-x86_64-linux-gnu/../src/IO/ReadBufferFromPocoSocket.cpp:66: DB::ReadBufferFromPocoSocket::ReadBufferFromPocoSocket(Poco::Net::Socket&, unsigned long) @ 0x902ffd7 in /usr/lib/debug/usr/bin/clickhouse
5. /build/obj-x86_64-linux-gnu/../contrib/libcxx/include/type_traits:3696: DB::TCPHandler::runImpl() @ 0x9023905 in /usr/lib/debug/usr/bin/clickhouse
6. /build/obj-x86_64-linux-gnu/../programs/server/TCPHandler.cpp:1235: DB::TCPHandler::run() @ 0x9025470 in /usr/lib/debug/usr/bin/clickhouse
7. /build/obj-x86_64-linux-gnu/../contrib/poco/Net/src/TCPServerConnection.cpp:57: Poco::Net::TCPServerConnection::start() @ 0xe3ac69b in /usr/lib/debug/usr/bin/clickhouse
8. /build/obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:856: Poco::Net::TCPServerDispatcher::run() @ 0xe3acb1d in /usr/lib/debug/usr/bin/clickhouse
9. /build/obj-x86_64-linux-gnu/../contrib/poco/Foundation/include/Poco/Mutex_STD.h:132: Poco::PooledThread::run() @ 0x105c3317 in /usr/lib/debug/usr/bin/clickhouse
10. /build/obj-x86_64-linux-gnu/../contrib/poco/Foundation/include/Poco/AutoPtr.h:205: Poco::ThreadImpl::runnableEntry(void*) @ 0x105bf11c in /usr/lib/debug/usr/bin/clickhouse
11. /build/obj-x86_64-linux-gnu/../contrib/libcxx/include/memory:2615: void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void* (*)(void*), Poco::ThreadImpl*> >(void*) @ 0x105c0abd in /usr/lib/debug/usr/bin/clickhouse
12. start_thread @ 0x8184 in /lib/x86_64-linux-gnu/libpthread-2.19.so
13. __clone @ 0xfe03d in /lib/x86_64-linux-gnu/libc-2.19.so
(version 20.3.9.70 (official build))
Attempted Remedies/Debugging
Restarting clickhouse on the host temporarily fixes the problem. We have tried it once. This state happens again after 10-11 hours of operation.
There are no helpful logs at INFO level before the TCP connection count starts dwindling.
# this returns nothing
cat clickhouse-server.log.18-43-to-18-52 | grep -vE 'Done processing|Client has not sent any data|executeQuery|Processed in'
HTTP still works just fine when this happens

Error using AHP for R cran.r-project.org/web/packages/ahp

I am using the ahp package for R (cran.r-project.org/web/packages/ahp). I built a new .ahp document with the alternatives and criteria:
Version: 2.0
#########################
# Alternatives Section
#
Alternatives: &alternatives
  # Here, we list all the alternatives, together with their attributes.
  A:
    hectareas: 1.88
    ninos: 1
    adultos: 12
  B:
    hectareas: 21.06
    ninos: 14
    adultos: 19
#
# End of Alternatives Section
#####################################
#####################################
# Goal Section
#
Goal:
  # The goal spans a tree of criteria and the alternatives
  name: Zona Verde
  description: >
    This is a classic single decision maker problem.
  author: unknown
  preferences:
    # preferences are typically defined pairwise
    # 1 means: A is equal to B
    # 9 means: A is highly preferable to B
    # 1/9 means: B is highly preferable to A
    pairwise:
      - [hectareas, ninos, 3]
      - [hectareas, adultos, 7]
      - [ninos, adultos, 3]
  children:
    hectareas:
      preferences:
        pairwise:
          - [A, B, 9]
        children: *alternatives
    ninos:
      preferences:
        pairwise:
          - [A, B, 1/3]
        children: *alternatives
    adultos:
      preferences:
        pairwise:
          - [A, B, 1/4]
        children: *alternatives
#
# End of Goal Section
#####################################
The other document is ahp.R, with the library and the analysis:
library(ahp)
#list example files provided by the package
list.files(system.file("extdata", package="ahp"))
#zonas verdes example
ahpFile <- system.file("extdata", "zonasverdes.ahp", package="ahp")
zonasverdesAhp <- Load(ahpFile)
Calculate(zonasverdesAhp)
Analyze(zonasverdesAhp)
AnalyzeTable(zonasverdesAhp)
When I run the code to analyze the AHP, this error appears:
Error in value[[3L]](cond) :
Could not load ahp model. Exception caught when converting into a data.tree: Error in preferences[[type]]: attempt to select less than one element in get1index
I do not know if the mistake is in the indentation or in one of the functions.
Thanks
This is an indentation problem.
On lines 49, 54, and 59 of your .ahp file, you need to reduce the indentation, like this:
  children:
    hectareas:
      preferences:
        pairwise:
          - [A, B, 9]
      children: *alternatives
    ninos:
      preferences:
        pairwise:
          - [A, B, 1/3]
      children: *alternatives
    adultos:
      preferences:
        pairwise:
          - [A, B, 1/4]
      children: *alternatives

How to build a logistic regression model in SparkR

I am new to Spark as well as SparkR. I have successfully installed Spark and SparkR.
When I tried to build a logistic regression model with R and Spark over a CSV file stored in HDFS, I got the error "incorrect number of dimensions".
My code is:
points <- cache(lapplyPartition(textFile(sc, "hdfs://localhost:54310/Henry/data.csv"), readPartition))
collect(points)
w <- runif(n=D, min = -1, max = 1)
cat("Initial w: ", w, "\n")

# Compute logistic regression gradient for a matrix of data points
gradient <- function(partition) {
  partition = partition[[1]]
  Y <- partition[, 1]  # point labels (first column of input file)
  X <- partition[, -1] # point coordinates

  # For each point (x, y), compute gradient function
  dot <- X %*% w
  logit <- 1 / (1 + exp(-Y * dot))
  grad <- t(X) %*% ((logit - 1) * Y)
  list(grad)
}

for (i in 1:iterations) {
  cat("On iteration ", i, "\n")
  w <- w - reduce(lapplyPartition(points, gradient), "+")
}
The error message is:
On iteration 1
Error in partition[, 1] : incorrect number of dimensions
Calls: do.call ... func -> FUN -> FUN -> Reduce -> <Anonymous> -> FUN -> FUN
Execution halted
14/09/27 01:38:13 ERROR Executor: Exception in task 0.0 in stage 181.0 (TID 189)
java.lang.NullPointerException
at edu.berkeley.cs.amplab.sparkr.RRDD.compute(RRDD.scala:125)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
at org.apache.spark.scheduler.Task.run(Task.scala:54)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:701)
14/09/27 01:38:13 WARN TaskSetManager: Lost task 0.0 in stage 181.0 (TID 189, localhost): java.lang.NullPointerException:
edu.berkeley.cs.amplab.sparkr.RRDD.compute(RRDD.scala:125)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:701)
14/09/27 01:38:13 ERROR TaskSetManager: Task 0 in stage 181.0 failed 1 times; aborting job
Error in .jcall(getJRDD(rdd), "Ljava/util/List;", "collect") : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 181.0 failed 1 times, most recent failure: Lost task 0.0 in stage 181.0 (TID 189, localhost): java.lang.NullPointerException: edu.berkeley.cs.amplab.sparkr.RRDD.compute(RRDD.scala:125) org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) org.apache.spark.rdd.RDD.iterator(RDD.scala:229) org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62) org.apache.spark.scheduler.Task.run(Task.scala:54) org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) java.lang.Thread.run(Thread.java:701) Driver stacktrace:
Dimensions of the data (sample):
data <- read.csv("/home/Henry/data.csv")
dim(data)
[1] 17 541
What could be the possible reason for this error?
The problem is that textFile() reads text data and returns a distributed collection of strings, each of which corresponds to one line of the text file. That is why partition[, -1] fails later in the program. The program's real intent seems to be to treat points as a distributed collection of data frames. We are working on providing data frame support in SparkR soon (SPARKR-1).
To resolve the issue, manipulate your partition with string operations to extract X and Y correctly. Another way (which I think you have probably seen before) is to produce a different type of distributed collection from the beginning, as is done in examples/logistic_regression.R.
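To illustrate that string-parsing step in isolation, here is a minimal sketch (Python for brevity; the function name and sample lines are made up). The same split-then-convert step is what the partition function needs before any column indexing like partition[, -1]:

import numpy as np

def parse_partition(lines):
    # Each element is a raw CSV line; turn it into numbers first,
    # so that column slicing becomes meaningful.
    rows = [[float(v) for v in line.split(",")] for line in lines]
    m = np.array(rows)
    y = m[:, 0]   # labels: first column
    x = m[:, 1:]  # features: remaining columns
    return y, x

y, x = parse_partition(["1,0.5,2.0", "-1,1.5,0.25"])
print(y, x.shape)  # [ 1. -1.] (2, 2)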

Create an output file in Matlab containing numeric and string cells

I am currently working on a project where I have to program the same tool in both Matlab and R and compare the two.
I started in R, and now I am translating the code to Matlab, but I am stuck at the most important part: the output file that the tool creates after the analysis.
Basically, my tool runs an analysis that loops n times, and after each loop I get many variables that go into an output table. To be clear, after each loop I get the variables:
A = 123
B = 456
C = 'string1'
D = 'string2'
E = 789
The values of each variable change after each loop; I just want to make clear that the variables hold both numeric and string values, since this is what causes my problem.
In R what I do after each loop is:
outp <- cbind(A,B,C,D,E)
to create a data frame containing each variable in one cell, arranged horizontally, and afterwards append the result of each loop vertically to a new data frame:
outp2 <- rbind(outp2,outp)
So in the end I get a data frame (outp2) with columns A, B, C, D, E and n rows containing the values of each variable after each loop. At the end of the looping process I can use the write.csv function to create an output file from outp2 that contains both numeric and string columns.
I tried to do this in Matlab, but I cannot find a function that joins the data the way I do it in R, because square brackets [] only let me join numeric variables. So basically my question is: how can I replicate in Matlab what I am doing in R?
I hope I was clear enough, I found it a bit hard to explain.
You can append to your output with a cell array: first use curly braces to declare your cell contents (empty {} or containing your data {...}), then use square brackets [...] to concatenate the output (one row after another, using ;):
out_array = {}; % initialize empty
% vertical concatenation with ";"
for ii = 1:3
    out_array = [out_array; {123, 456, 'string1', 'string2', 789}];
end
This gives
out_array =
[123] [456] 'string1' 'string2' [789]
[123] [456] 'string1' 'string2' [789]
[123] [456] 'string1' 'string2' [789]
Don't know if this solves your problem, but in Matlab you can do things like
outp = {123, 456, 'string1', 'string2', 789}
Just use curly braces instead of square brackets.
As others have said before, use curly braces to create a cell array. I imagine A, B, C, D, and E are your table headers and you already have the data that goes under them, so I'd do it like this:
outp = {A, B, C, D, E};
% This next step is only to have some data...
outp2 = magic(5);
outp2 = num2cell(outp2);
output = [outp; outp2]
output =
[123] [456] 'string1' 'string2' [789]
[ 17] [ 24] [ 1] [ 8] [ 15]
[ 23] [ 5] [ 7] [ 14] [ 16]
[ 4] [ 6] [ 13] [ 20] [ 22]
[ 10] [ 12] [ 19] [ 21] [ 3]
[ 11] [ 18] [ 25] [ 2] [ 9]
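A note on the final step, since the original goal was the output file: assuming each column of the resulting cell array holds a consistent type (numeric A, B, E; string C, D, as described in the question), one way to mirror R's write.csv is cell2table followed by writetable, e.g. writetable(cell2table(out_array), 'output.csv').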
