Why is the Erlang riak client crashing?

I have followed the riak installation instructions for Ubuntu and the readme for the riak erlang client. At the point in the client instructions where a ping is sent to test the connection, I get the error report below indicating a missing module. In the report, I've replaced my server IP with IP_Of_Server.
How do I make sure the module is included in the compile of the client? Where should it be located?
riakc_pb_socket:ping(Pid).
=ERROR REPORT==== 28-Apr-2022::20:04:49.468604 ===
** Generic server <0.81.0> terminating
** Last message in was {req,rpbpingreq,60000}
** When Server state == {state,"IP_Of_Server",8087,false,false,#Port<0.7>,
false,gen_tcp,undefined,
{[],[]},
1,[],infinity,undefined,undefined,undefined,
undefined,[],100,false,
{false,0}}
** Reason for termination ==
** {'module could not be loaded',
[{riak_pb_codec,encode,[rpbpingreq],[]},
{riakc_pb_socket,encode_request_message,1,
[{file,"/home/pseudo/riak-erlang-client/src/riakc_pb_socket.erl"},
{line,3297}]},
{riakc_pb_socket,send_request,2,
[{file,"/home/pseudo/riak-erlang-client/src/riakc_pb_socket.erl"},
{line,3269}]},
{riakc_pb_socket,handle_call,3,
[{file,"/home/pseudo/riak-erlang-client/src/riakc_pb_socket.erl"},
{line,2089}]},
{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,661}]},
{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,690}]},
{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
** Client <0.78.0> stacktrace
** [{gen,do_call,4,[{file,"gen.erl"},{line,167}]},
{gen_server,call,3,[{file,"gen_server.erl"},{line,219}]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,684}]},
{shell,exprs,7,[{file,"shell.erl"},{line,686}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]
=CRASH REPORT==== 28-Apr-2022::20:04:49.469247 ===
crasher:
initial call: riakc_pb_socket:init/1
pid: <0.81.0>
registered_name: []
exception error: undefined function riak_pb_codec:encode/1
in function riakc_pb_socket:encode_request_message/1 (/home/pseudo/riak-erlang-client/src/riakc_pb_socket.erl, line 3297)
in call from riakc_pb_socket:send_request/2 (/home/pseudo/riak-erlang-client/src/riakc_pb_socket.erl, line 3269)
in call from riakc_pb_socket:handle_call/3 (/home/pseudo/riak-erlang-client/src/riakc_pb_socket.erl, line 2089)
in call from gen_server:try_handle_call/4 (gen_server.erl, line 661)
in call from gen_server:handle_msg/6 (gen_server.erl, line 690)
ancestors: [<0.78.0>]
message_queue_len: 0
messages: []
links: [<0.78.0>,#Port<0.7>]
dictionary: []
trap_exit: false
status: running
heap_size: 6772
stack_size: 27
reductions: 13740
neighbours:
neighbour:
pid: <0.78.0>
registered_name: []
initial_call: {erlang,apply,2}
current_function: {gen,do_call,4}
ancestors: []
message_queue_len: 0
links: [<0.77.0>,<0.81.0>]
trap_exit: false
status: waiting
heap_size: 610
stack_size: 34
reductions: 4167
current_stacktrace: [{gen,do_call,4,[{file,"gen.erl"},{line,167}]},
{gen_server,call,3,[{file,"gen_server.erl"},{line,219}]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,684}]},
{shell,exprs,7,[{file,"shell.erl"},{line,686}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]
** exception exit: undef
in function riak_pb_codec:encode/1
called as riak_pb_codec:encode(rpbpingreq)
in call from riakc_pb_socket:encode_request_message/1 (/home/pseudo/riak-erlang-client/src/riakc_pb_socket.erl, line 3297)
in call from riakc_pb_socket:send_request/2 (/home/pseudo/riak-erlang-client/src/riakc_pb_socket.erl, line 3269)
in call from riakc_pb_socket:handle_call/3 (/home/pseudo/riak-erlang-client/src/riakc_pb_socket.erl, line 2089)
in call from gen_server:try_handle_call/4 (gen_server.erl, line 661)
in call from gen_server:handle_msg/6 (gen_server.erl, line 690)
in call from proc_lib:init_p_do_apply/3 (proc_lib.erl, line 249)

Here is what I have found.
When starting the Erlang shell, the command from the readme,
erl -pa ./riak-erlang-client/_build/default/lib/riakc/ebin ./riak-erlang-client/_build/default/lib/riakc/deps/*/ebin
was missing the directory that holds the riak_pb_codec module: rebar3 builds the riak_pb dependency into _build/default/lib/riak_pb/ebin, not into a deps directory under riakc. I changed the command to
erl -pa ./riak-erlang-client/_build/default/lib/riakc/ebin ./riak-erlang-client/_build/default/lib/riakc/deps/*/ebin ./riak-erlang-client/_build/default/lib/riak_pb/ebin/
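A quick way to verify the fix before retrying the ping is to ask the code server where it would load the missing module from: code:which/1 returns the path of the .beam file it would use, or the atom non_existing if the module still cannot be found.

1> code:which(riak_pb_codec).
%% Should print a path ending in riak_pb_codec.beam; 'non_existing'
%% would mean the -pa flags still miss the riak_pb ebin directory.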

Related

How to deal with airflow error taskinstance.py line 983, in _run_raw_task / psycopg2?

I have a couple of DAGs running which execute python scripts that basically copy data from A to B. One of the DAGs throws an error, but the data still gets copied, so the error apparently has no influence on the execution of the python script.
The only "special" thing about this DAG is that the python script builds up a connection to a postgres database using psycopg2~=2.8.5, but I'm not sure if this is somehow the root cause.
I also checked the permissions for the user, which seem to be fine, at least in the dags folder.
Is there any specific timeout value I have to adjust in the config?
[2021-05-19 12:53:42,036] {taskinstance.py:1145} ERROR - Bash command failed 255
Traceback (most recent call last):
File "/hereisthepath/venv/lib64/python3.6/site-packages/airflow/models/taskinstance.py", line 983, in _run_raw_task
result = task_copy.execute(context=context)
File "/hereisthepath/venv/lib64/python3.6/site-packages/airflow/operators/bash_operator.py", line 134, in execute
raise AirflowException("Bash command failed")
airflow.exceptions.AirflowException: Bash command failed
Update: This is the passage of the operator that fails. I copied the entire function; the error is raised at line 134 (raise AirflowException("Bash command failed")).
def execute(self, context):
    """
    Execute the bash command in a temporary directory
    which will be cleaned afterwards
    """
    self.log.info("Tmp dir root location: \n %s", gettempdir())

    # Prepare env for child process.
    env = self.env
    if env is None:
        env = os.environ.copy()
    airflow_context_vars = context_to_airflow_vars(context, in_env_var_format=True)
    self.log.debug('Exporting the following env vars:\n%s',
                   '\n'.join(["{}={}".format(k, v)
                              for k, v in airflow_context_vars.items()]))
    env.update(airflow_context_vars)

    self.lineage_data = self.bash_command

    with TemporaryDirectory(prefix='airflowtmp') as tmp_dir:
        with NamedTemporaryFile(dir=tmp_dir, prefix=self.task_id) as f:
            f.write(bytes(self.bash_command, 'utf_8'))
            f.flush()
            fname = f.name
            script_location = os.path.abspath(fname)
            self.log.info(
                "Temporary script location: %s",
                script_location
            )

            def pre_exec():
                # Restore default signal disposition and invoke setsid
                for sig in ('SIGPIPE', 'SIGXFZ', 'SIGXFSZ'):
                    if hasattr(signal, sig):
                        signal.signal(getattr(signal, sig), signal.SIG_DFL)
                os.setsid()

            self.log.info("Running command: %s", self.bash_command)
            self.sub_process = Popen(
                ['bash', fname],
                stdout=PIPE, stderr=STDOUT,
                cwd=tmp_dir, env=env,
                preexec_fn=pre_exec)

            self.log.info("Output:")
            line = ''
            for line in iter(self.sub_process.stdout.readline, b''):
                line = line.decode(self.output_encoding).rstrip()
                self.log.info(line)
            self.sub_process.wait()
            self.log.info(
                "Command exited with return code %s",
                self.sub_process.returncode
            )

            if self.sub_process.returncode:
                raise AirflowException("Bash command failed")

    if self.xcom_push_flag:
        return line
Update 2: It really seems that this behavior is related to psycopg2: I tested all other possible error sources, and the error occurs only when I test with the postgres datasource using the psycopg2 package. Meanwhile I have also upgraded to the most recent version of psycopg2 (2.8.6), but without success.
Maybe this helps with further investigation.

How to solve executing problem of MPICH: error code (10049)

I'm new to MPICH2 and I am trying to execute a little program. The program builds without errors, but when I try to run it, it shows the error I've attached below. Could you please help me solve this?
[01:17268]..ERROR:Error while connecting to host, The requested address is not valid in its context. (10049)
[01:17268]..ERROR:Connect on sock (host=localhost, port=0) failed, exhaused all end points
SMPDU_Sock_post_connect failed.
[2] PMI_ConnectToHost failed: unable to post a connect to localhost:0, error: Undefined dynamic error code
uPMI_ConnectToHost returning PMI_FAIL
[2] PMI_Init failed.
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(377): Initialization failed
MPID_Init(90)........: channel initialization failed
MPID_Init(357).......: PMI_Init returned -1
[01:10548]..ERROR:Error while connecting to host, The requested address is not valid in its context. (10049)
[01:10548]..ERROR:Connect on sock (host=localhost, port=0) failed, exhaused all end points
SMPDU_Sock_post_connect failed.
[1] PMI_ConnectToHost failed: unable to post a connect to localhost:0, error: Undefined dynamic error code
uPMI_ConnectToHost returning PMI_FAIL
[1] PMI_Init failed.
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(377): Initialization failed
MPID_Init(90)........: channel initialization failed
MPID_Init(357).......: PMI_Init returned -1
[01:11576]..ERROR:Error while connecting to host, The requested address is not valid in its context. (10049)
[01:11576]..ERROR:Connect on sock (host=localhost, port=0) failed, exhaused all end points
SMPDU_Sock_post_connect failed.
[0] PMI_ConnectToHost failed: unable to post a connect to localhost:0, error: Undefined dynamic error code
uPMI_ConnectToHost returning PMI_FAIL
[0] PMI_Init failed.
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(377): Initialization failed
MPID_Init(90)........: channel initialization failed
MPID_Init(357).......: PMI_Init returned -1
#include <mpi.h>
#include <stdio.h>

int main() {
    int np;
    int pid;
    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &pid);
    printf("Hi parallel...\n");
    MPI_Finalize();
    return 0;
}

Airflow on_success_callback() does not run

I am running a pipeline using Airflow which contains multiple Bash Operators to be executed.
Each operator has on_failure_callback and on_success_callback attributes that call a function to send an email with the status of the task (success/fail) and upload the generated log file from a directory to HDFS. The following code snippets show a sample of the operator I am using and the callable function.
Bash Operator:
op = BashOperator(
    task_id='test_op',
    bash_command='python3 run.py',
    on_failure_callback=fail_email,
    on_success_callback=success_email,
    retries=3,
    dag=dag)
success_email:
def success_email(contextDict, **kwargs):
    """Send custom email alerts."""
    # email title.
    title = "Task {} SUCCEEDED. Execution date: {}".format(contextDict['task'].task_id, contextDict['execution_date'])
    # email contents
    body = """
    <br>
    The correspondent log file:
    <br>
    {}
    """.format(hdfs_log)
    print("Uploading log to hdfs")
    subprocess.check_call(["hdfs", "dfs", "-mkdir", "-p", hdfs_log_folder])
    subprocess.check_call(["hdfs", "dfs", "-put", local_log, hdfs_log])
    send_email('email@domain.com', title, html_content=body)
The success_callback always fails when calling the hdfs commands and gives the following error:
[2018-12-28 09:13:29,727] INFO - Uploading log to hdfs
[2018-12-28 09:13:30,344] INFO - [2018-12-28 09:13:30,342] WARNING - State of this instance has been externally set to success. Taking the poison pill.
[2018-12-28 09:13:30,381] INFO - Sending Signals.SIGTERM to GPID 11515
[2018-12-28 09:13:30,382] ERROR - Received SIGTERM. Terminating subprocesses.
[2018-12-28 09:13:30,382] INFO - Sending SIGTERM signal to bash process group
[2018-12-28 09:13:30,390] ERROR - Failed when executing success callback
[2018-12-28 09:13:30,390] ERROR - [Errno 3] No such process
Traceback (most recent call last):
File "/opt/hadoop/airflow/python/lib/python3.6/site-packages/airflow/models.py", line 1687, in _run_raw_task
task.on_success_callback(context)
File "/usr/local/airflow/dags/Appl_FUMA.py", line 139, in success_email
subprocess.check_call(["hdfs", "dfs", "-mkdir", "-p", hdfs_log_folder])
File "/usr/lib64/python3.6/subprocess.py", line 286, in check_call
retcode = call(*popenargs, **kwargs)
File "/usr/lib64/python3.6/subprocess.py", line 269, in call
return p.wait(timeout=timeout)
File "/usr/lib64/python3.6/subprocess.py", line 1457, in wait
(pid, sts) = self._try_wait(0)
File "/usr/lib64/python3.6/subprocess.py", line 1404, in _try_wait
(pid, sts) = os.waitpid(self.pid, wait_flags)
File "/opt/hadoop/airflow/python/lib/python3.6/site-packages/airflow/models.py", line 1611, in signal_handler
task_copy.on_kill()
File "/opt/hadoop/airflow/python/lib/python3.6/site-packages/airflow/operators/bash_operator.py", line 125, in on_kill
os.killpg(os.getpgid(self.sp.pid), signal.SIGTERM)
ProcessLookupError: [Errno 3] No such process
[2018-12-28 09:13:30,514] INFO - Process psutil.Process(pid=11515 (terminated)) (11515) terminated with exit code 0
[2018-12-28 09:13:30,514] INFO - Process psutil.Process(pid=20649 (terminated)) (20649) terminated with exit code None
[2018-12-28 09:13:30,514] INFO - Process psutil.Process(pid=11530 (terminated)) (11530) terminated with exit code None
[2018-12-28 09:13:30,515] INFO - [2018-12-28 09:13:30,515] INFO - Task exited with return code 0
However, it manages to send emails (sometimes) when I comment out the two subprocess lines.
Any idea how to fix this issue?

Issues with reading file in erlang

So, I am trying to read and write into a file.
While writing into the file, I need to check whether a particular index already exists in the file; if it does, I don't write and I throw an error instead.
The data in file will look like this:
{1,{data,dcA,1}}.
{2, {data, dcA, 2}}.
{3,{data,dcA,3}}.
I added the dot at the end of each line because file:consult/1 needs the file like this.
Each entry is in this format:
{Index, {Data, Node, Index}}
When I have to add a new entry, I check against this Index.
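For example, since the file is a series of dot-terminated terms, the existence check can consult the file and scan the parsed list. A minimal sketch (index_exists/2 is an illustrative name, not taken from the pastebin code):

index_exists(Fname, Index) ->
    {ok, Terms} = file:consult(Fname),
    %% Terms is a list of {Index, {Data, Node, Index}} tuples, so
    %% keymember/3 compares Index against the first tuple element.
    lists:keymember(Index, 1, Terms).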
Here's what I have tried so far - https://pastebin.com/apnWLk45
And I run it like this:
193> {ok, P9} = poc:start(test1, self()).
{ok,<0.2863.0>}
194> poc:add(P9, Node, {6, data}).
In poc:add/3, P9 is the process id from file:open/2.
Node I defined earlier in the shell as dcA.
And the third argument is the data, which is in this format: {Index, data}.
Since file:consult/1 takes the filename as a parameter, and at that point I only have the process id, I get the filename from
file:pid2name(_Server).
This runs perfectly when I run it for the first time.
When I run it again - poc:add(P9, Node, {6, data2}) - I get an error on the line with file:pid2name(_Server):
exception error: no match of right hand side value undefined
How can I solve this issue?
I am new to Erlang; it's just been a week since I started learning.
I am trying to read and write into a file. While writing into the
file, I need to check if a particular index exist in file then I don't
write and throw error.
A DETS table can easily do what you want:
-module(my).
-compile(export_all).

open_table() ->
    dets:open_file(my_data, [{type, set}, {file, "./my_data.dets"}]).

close_table() ->
    dets:close(my_data).

clear_table() ->
    dets:delete_all_objects(my_data).

insert({Key, _Rest}=Data) ->
    case dets:member(my_data, Key) of
        true  -> throw(index_already_exists);
        false -> dets:insert(my_data, Data)
    end.

all_items() ->
    dets:match(my_data, '$1').
In the shell:
~/erlang_programs$ erl
Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V9.2 (abort with ^G)
1> c(my).
my.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,my}
2> my:open_table().
{ok,my_data}
3> my:clear_table().
ok
4> my:all_items().
[]
5> my:insert({1, {data, a, b}}).
ok
6> my:insert({2, {data, c, d}}).
ok
7> my:insert({3, {data, e, f}}).
ok
8> my:all_items().
[[{1,{data,a,b}}],[{2,{data,c,d}}],[{3,{data,e,f}}]]
9> my:insert({1, {data, e, f}}).
** exception throw: index_already_exists
in function my:insert/1 (my.erl, line 15)
When I run this again - poc:add(P9, Node, {6, data2}), I get an error
in this line file:pid2name(_Server):
exception error: no match of right hand side value undefined
When a process opens a file, it becomes linked to a process that handles the file I/O, which means that if the process that opens the file terminates abnormally, the I/O process will also terminate. Here is an example:
-module(my).
-compile(export_all).

start() ->
    {ok, Pid} = file:open('data.txt', [read, write]),
    spawn(my, add, [Pid, x, y]),
    exit("bye").

add(Pid, _X, _Y) ->
    timer:sleep(1000), %Let start() process terminate.
    {ok, Fname} = file:pid2name(Pid),
    io:format("~s~n", [Fname]).
In the shell:
1> c(my).
my.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,my}
2> my:start().
** exception exit: "bye"
in function my:start/0 (my.erl, line 7)
3>
=ERROR REPORT==== 25-Jun-2018::13:28:48 ===
Error in process <0.72.0> with exit value:
{{badmatch,undefined},[{my,add,3,[{file,"my.erl"},{line,12}]}]}
According to the pid2name() docs:
pid2name(Pid) -> {ok, Filename} | undefined
the function can return undefined, which is what the error message is saying happened.
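So the caller has to be prepared for both return values instead of asserting {ok, Fname} with a match. A minimal sketch of a safer lookup (consult_by_pid/1 and the error tuple are illustrative names, not part of the poc module):

consult_by_pid(Server) ->
    case file:pid2name(Server) of
        {ok, Fname} ->
            %% The I/O process is alive and has a known filename.
            file:consult(Fname);
        undefined ->
            %% The process that opened the file has died, or Server is
            %% not a file I/O process, so there is no name to recover.
            {error, file_process_gone}
    end.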

pexpect python throw error

Although this is my first attempt at using pexpect, the python3 script is pretty simple; yet it fails.
#!/usr/bin/env python3
import sys
import pexpect

SSH_NEWKEY = r'Are you sure you want to continue connecting \(yes/no\)\?'
child = pexpect.spawn("ssh -i /user/aws/key.pem ec2-user@xxx.xxx.xxx.xxx date")
i = child.expect([pexpect.TIMEOUT, SSH_NEWKEY])
if i == 1:
    child.sendline('yes')
print(child.before)
SSH_NEWKEY is the only response I'm expecting, but the example showed a list containing pexpect.TIMEOUT, so I included it.
$ ./test.py
Traceback (most recent call last):
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 144, in read_nonblocking
s = os.read(self.child_fd, size)
OSError: [Errno 5] Input/output error
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.4/site-packages/pexpect/expect.py", line 97, in expect_loop
incoming = spawn.read_nonblocking(spawn.maxread, timeout)
File "/usr/local/lib/python3.4/site-packages/pexpect/pty_spawn.py", line 455, in read_nonblocking
return super(spawn, self).read_nonblocking(size)
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 149, in read_nonblocking
raise EOF('End Of File (EOF). Exception style platform.')
pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./min.py", line 15, in <module>
i = child.expect( [ pexpect.TIMEOUT, SSH_NEWKEY ] )
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 315, in expect
timeout, searchwindowsize, async)
File "/usr/local/lib/python3.4/site-packages/pexpect/spawnbase.py", line 339, in expect_list
return exp.expect_loop(timeout)
File "/usr/local/lib/python3.4/site-packages/pexpect/expect.py", line 102, in expect_loop
return self.eof(e)
File "/usr/local/lib/python3.4/site-packages/pexpect/expect.py", line 49, in eof
raise EOF(msg)
pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.
<pexpect.pty_spawn.spawn object at 0x7f70ea4fbcf8>
command: /usr/bin/ssh
args: ['/usr/bin/ssh', '-i', '/user/aws/key.pem', 'ec2-user@xxx.xxx.xxx.xxx', 'date']
searcher: None
buffer (last 100 chars): b''
before (last 100 chars): b'Fri May 6 13:50:18 EDT 2016\r\n'
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: 0
flag_eof: True
pid: 31293
child_fd: 5
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
What am I missing?
CentOS 6.4
python 3.4.3
An EOF exception is being raised during your expect call. This means that the response received does not match SSH_NEWKEY, and the output reaches end of file within the timeout period. To catch this exception, you should change your expect line to read:
i = child.expect([pexpect.TIMEOUT, SSH_NEWKEY, pexpect.EOF])
You can then make your if more robust:
if i == 1:
    child.sendline('yes')
elif i == 0:
    print("Timeout")
elif i == 2:
    print("EOF")
print(child.before)
This doesn't address the underlying reason why you are not receiving a response containing the expected string - that's hard to know without seeing more code, but it's likely because you have the expected response slightly wrong. If you manually run the SSH command, you should be able to see the response to expect, and enter that response into your code.
You can also print child.before after your expect call, or print child.read() instead of your expect call to see what is being sent back as a response.
