How to log from command line to a specific systemd log namespace?
When using systemd to start a service, one can set
LogNamespace=myNamespace in combination with
StandardOutput=journal as part of the configuration in the unit file.
To see only the output of this namespace, it is sufficient to call journalctl --namespace=myNamespace.
With systemd-cat it is possible to print directly from the command line into the journal:
echo "Hello Journal!" | systemd-cat
The message "Hello Journal!" shows up in the default (anonymous?) namespace, which is visible with journalctl. It is not within any named namespace and is not visible when using journalctl --namespace=myNamespace.
To be more specific on the initial question:
When looking at journalctl -f --namespace=myNamespace, how can I make the output of a process started from the command line (without systemd, just a plain binary) visible in this view?
Something like echo "Hello Journal!" | systemd-cat --log-namespace=myNamespace
I use LogNamespaces to separate different application logs. If this is not the intended use, an answer that explains how to do this in another (better) way is also accepted.
First, find out the PID of the journald instance that serves your namespace:
ps aux|grep jou
root 243 0.0 0.6 56176 26212 ? Ss 10:06 0:01 /lib/systemd/systemd-journald
root 10214 0.0 0.3 42308 14740 ? Ss 11:34 0:00 /lib/systemd/systemd-journald DEBUG
Our instance is DEBUG.
Next, find out which socket that instance uses:
lsof -p 10214 |grep dev-log
systemd-j 10214 root 5u unix 0x0000000003787334 0t0 61560 /run/systemd/journal.DEBUG/dev-log type=DGRAM
Now we are ready to log to the desired namespace:
logger -u /run/systemd/journal.DEBUG/dev-log HAHAHAHA
journalctl --namespace=DEBUG
Jan 22 11:35:46 myportal-deb root[10339]: HAHAHAHA
Jan 22 11:38:50 myportal-deb root[10559]: HAHAHAHA
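What logger does here can also be sketched in Python: a syslog-style datagram is sent to the dev-log socket of the namespace instance. This is only a minimal sketch, assuming the socket path follows the /run/systemd/journal.<namespace>/dev-log pattern seen in the lsof output above; the function names and the "demo" tag are made up for illustration.

```python
import socket

USER_FACILITY = 1   # syslog facility "user"
INFO_SEVERITY = 6   # syslog severity "info"

def namespace_socket_path(namespace):
    """Path of the syslog socket of a journald namespace instance.

    Assumption: the pattern matches the lsof output shown above.
    """
    return f"/run/systemd/journal.{namespace}/dev-log"

def format_syslog(message, tag="demo"):
    """Build a minimal BSD-syslog style datagram: <priority>tag: message."""
    priority = USER_FACILITY * 8 + INFO_SEVERITY
    return f"<{priority}>{tag}: {message}".encode()

def log_to_namespace(namespace, message, tag="demo"):
    """Send one message to the journald instance of the given namespace."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_syslog(message, tag), namespace_socket_path(namespace))
```

Calling log_to_namespace("DEBUG", "Hello Journal!") should then make the message visible in journalctl --namespace=DEBUG, provided the namespace instance is running and the caller is allowed to write to its socket.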
I am trying to run a Julia script in parallel on a cluster.
The cluster uses Moab and Torque for the scheduler and resource manager.
Since SSH seems to be restricted, I use MPI for multiprocessing.
I submit the following job, requesting 3 nodes:
#!/bin/bash
#PBS -l walltime=1:00:00
#PBS -l pmem=10gb
#PBS -l nodes=3:ppn=1
#PBS -j oe
#PBS -A open
#PBS -o (some path)
#PBS -e (some path)
cd (some path)
echo ""
echo "JOB Started on $(hostname -s) at $(date)"
echo ""
module purge
module use (some path)/modules
module load julia
module load openmpi
mpirun -np 3 -display-allocation julia --project=. "(some path)/test.jl"
echo ""
echo "JOB ended at $(date)"
But if I look at the job output, it seems that only one node, comp-bc-0384, is recognized:
JOB Started on comp-bc-0384 at Sat Mar 19 22:05:12 EDT 2022
====================== ALLOCATED NODES ======================
comp-bc-0384: slots=24 max_slots=0 slots_inuse=0 state=UP
=================================================================
--------------------------------------------------------------------------
[[12308,1],2]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: comp-bc-0384
Another transport will be used instead, although this may result in
lower performance.
NOTE: You can disable this warning by setting the MCA parameter
btl_base_warn_component_unused to 0.
--------------------------------------------------------------------------
[comp-bc-0384.acib.production.int.aci.ics.psu.edu:10656] 2 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[comp-bc-0384.acib.production.int.aci.ics.psu.edu:10656] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
10.214858 seconds (116.21 k allocations: 6.110 MiB)
JOB ended at Sat Mar 19 22:05:36 EDT 2022
I was expecting the ALLOCATED NODES section to display the other node(s) I was assigned to.
A similar question in the past (openMPI/mpich2 doesn't run on multiple nodes) suggests that it has something to do with host file.
Therefore I also tried with mpirun -hostfile $PBS_NODEFILE -np 3 -display-allocation julia --project=. "(some path)/test.jl" . It then returns the following:
JOB Started on comp-bc-0384 at Sat Mar 19 22:16:15 EDT 2022
Host key verification failed.
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------
JOB ended at Sat Mar 19 22:16:16 EDT 2022
What could be the cause here?
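One way to narrow this down is to check what the scheduler itself allocated, independently of what mpirun reports. The following is a minimal sketch, assuming the usual Torque nodefile layout of one hostname per allocated slot; the function names are made up for illustration.

```python
from collections import Counter

def slots_per_host(nodefile_text):
    """Count how many slots each host contributes in a Torque nodefile.

    Assumption: the file holds one hostname per allocated slot, one per line.
    """
    hosts = [line.strip() for line in nodefile_text.splitlines() if line.strip()]
    return dict(Counter(hosts))

def read_nodefile(path):
    """Read the file that $PBS_NODEFILE points to inside a running job."""
    with open(path) as handle:
        return slots_per_host(handle.read())
```

Inside the job script, read_nodefile(os.environ["PBS_NODEFILE"]) would show whether three distinct hosts were actually assigned. If they were, the allocation is fine and the problem lies in how mpirun launches processes on the other nodes.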
I am trying to set up repmgr version 5 on Debian with PostgreSQL 11.
The documentation seems to be more oriented towards CentOS/RHEL.
When I try to set up the witness node and start the repmgr daemon, I get an error and have no idea where to look to find its cause.
This is my repmgr.conf file:
node_id=3
node_name='PG-Node-Witness'
conninfo='host=10.97.7.140 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/11/main'
failover='automatic'
promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'
priority=60
monitor_interval_secs=2
connection_check_type='ping'
reconnect_attempts=6
reconnect_interval=8
primary_visibility_consensus=true
standby_disconnect_on_failover=true
repmgrd_service_start_command='sudo /etc/init.d/repmgrd start' #??????
repmgrd_service_stop_command='sudo /etc/init.d/repmgrd stop' #??????
service_start_command='sudo /usr/bin/systemctl start postgresql@11-main.service'
service_stop_command='sudo /usr/bin/systemctl stop postgresql@11-main.service'
service_restart_command='sudo /usr/bin/systemctl restart postgresql@11-main.service'
service_reload_command='sudo /usr/bin/systemctl reload postgresql@11-main.service'
monitoring_history=yes
log_status_interval=60
register is OK:
repmgr -f /etc/repmgr.conf witness register -h 10.97.7.97
INFO: connecting to witness node "PG-Node-Witness" (ID: 3)
INFO: connecting to primary node
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
INFO: witness registration complete
NOTICE: witness node "PG-Node-Witness" (ID: 3) successfully registered
repmgr daemon dry-run OK too:
$repmgr -f /etc/repmgr.conf daemon start --dry-run
INFO: prerequisites for starting repmgrd met
DETAIL: following command would be executed:
sudo /usr/bin/systemctl start postg...#11-main.service
I setup /etc/default/repmgrd with:
REPMGRD_ENABLED=yes
and
REPMGRD_CONF="/etc/repmgr.conf"
But still get error when trying to run the daemon start:
$ repmgr -f /etc/repmgr.conf daemon start
I get:
NOTICE: executing: "sudo /etc/init.d/repmgrd start"
ERROR: repmgrd does not appear to have started after 15 seconds
HINT: use "repmgr service status" to confirm that repmgrd was successfully started
It is recommended to run repmgrd as a systemd service.
According to the docs (for Debian), you may first need to configure /etc/default/repmgrd.
My configuration looks like this:
# default settings for repmgrd. This file is sourced by /bin/sh from
# /etc/init.d/repmgrd
# disable repmgrd by default so it won't get started upon installation
# valid values: yes/no
REPMGRD_ENABLED=yes
# configuration file (required)
REPMGRD_CONF="/etc/repmgr/12/repmgr.conf"
# additional options
REPMGRD_OPTS="--daemonize=false"
# user to run repmgrd as
REPMGRD_USER=postgres
# repmgrd binary
REPMGRD_BIN=/bin/repmgrd
# pid file
REPMGRD_PIDFILE=/var/run/repmgrd.pid
Secondly, I would revisit sudoers (visudo) in order to check whether the non-root user can execute sudo /etc/init.d/repmgrd start.
Further, the user who runs repmgr commands should be able to write logs depending on your configuration.
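The settings in /etc/default/repmgrd can be sanity-checked with a few lines of Python. This is a minimal sketch, assuming the file only contains simple KEY=value lines like the ones shown above; the function names are hypothetical and the two checks only cover the settings mentioned in this thread.

```python
def parse_defaults(text):
    """Parse simple KEY=value lines, ignoring blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes such as REPMGRD_CONF="/etc/repmgr.conf"
        settings[key.strip()] = value.strip().strip("\"'")
    return settings

def check_repmgrd_defaults(settings):
    """Return a list of human-readable problems found in the settings."""
    problems = []
    if settings.get("REPMGRD_ENABLED") != "yes":
        problems.append("REPMGRD_ENABLED is not set to yes")
    if not settings.get("REPMGRD_CONF"):
        problems.append("REPMGRD_CONF is empty or missing")
    return problems
```

Running check_repmgrd_defaults(parse_defaults(open("/etc/default/repmgrd").read())) returns an empty list when both settings look plausible.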
Apparently the correct command to start the repmgr daemon is:
repmgrd -f /etc/repmgr.conf
I assume I simply have to insert an entry into an nginx.conf file to resolve the error that is plaguing me (see below), but so far I haven’t had any luck figuring out the syntax. Any help would be appreciated.
I want to run nginx as a regular user while having installed it using homebrew as a user with administrative privileges. nginx is trying to write to the error.log file at /usr/local/var/log/nginx/error.log, which it cannot because my regular user lacks write privilege there.
Another wrinkle is coming from the fact that there are two nginx.conf files, a global and a local, and as far as I can tell they are both being read. They are in the default homebrew location /usr/local/etc/nginx/nginx.conf and my local project directory $BASE_DIR/nginx.conf.
Here is the error that is generated as nginx attempts to start up:
[WARN] No ENV file found
10:08:18 PM web.1 | DOCUMENT_ROOT changed to 'public/'
10:08:18 PM web.1 | Using Nginx server-level configuration include 'nginx.conf'
10:08:18 PM web.1 | 4 processes at 128MB memory limit.
10:08:18 PM web.1 | Starting php-fpm...
10:08:20 PM web.1 | Starting nginx...
10:08:20 PM web.1 | Application ready for connections on port 5000.
10:08:20 PM web.1 | nginx: [alert] could not open error log file: open() "/usr/local/var/log/nginx/error.log" failed (13: Permission denied)
10:08:20 PM web.1 | 2017/03/04 22:08:20 [emerg] 19557#0: "http" directive is duplicate in /usr/local/etc/nginx/nginx.conf:17
10:08:20 PM web.1 | Process exited unexpectedly: nginx
10:08:20 PM web.1 | Going down, terminating child processes...
[DONE] Killing all processes with signal null
10:08:20 PM web.1 Exited with exit code 1
Any help figuring out how to get nginx up and running so I can get back to the development side of this project will be much appreciated.
The whole problem stems from trying to run nginx as my ordinary user self despite the fact that nginx was installed by my user self with administrative privileges. I was able to resolve both the errors shown here with the following commands executed as a user with administrative privileges:
sudo chmod a+w /usr/local/var/log/nginx/*.log
sudo chmod a+w /usr/local/var
sudo chmod a+w /usr/local/var/run
Note that the /usr/local/var directory appears to have been created by homebrew upon installing nginx and this machine is my laptop so I can’t see any reason not to open it up. You might have greater security concerns in other scenarios.
I admit that when I wrote this question I thought it was about moving the error.log file to another directory. Now I see that that is not a full solution, so instead the solution I present here is about giving ordinary users write privileges in the necessary directories.
The reason I changed my mind is that nginx can (and in this case does) generate errors before (or while) reading the nginx.conf files and needs to be able to report those errors to a log file. Modifying the nginx.conf file was never going to solve my problem. What woke me up to this issue was reading this post: How to turn off or specify the nginx error log location?
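Before changing permissions, it can help to see which of the paths nginx needs are actually blocked for the current user. The following is a minimal sketch using os.access; the function name is made up, and the paths to check are whatever your own installation uses.

```python
import os

def unwritable_paths(paths):
    """Return the paths the current user could not write to.

    For a file that does not exist yet, the parent directory is checked,
    since that is where the file would have to be created.
    """
    blocked = []
    for path in paths:
        if os.path.exists(path):
            target = path
        else:
            target = os.path.dirname(path) or "."
        if not os.access(target, os.W_OK):
            blocked.append(path)
    return blocked
```

For the setup described here, unwritable_paths(["/usr/local/var/log/nginx/error.log", "/usr/local/var/run"]) run as the ordinary user would list the locations that still need write access.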
Did you try to set the log path to a custom location by editing nginx.conf?
access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;
Change the path to somewhere the user has write privileges.
I was in riak-shell when ssh lost its connection to the server. After reconnecting, I do the following:
sudo riak-shell
and get:
An instance of riak-shell is already running
So, I restarted the riak node in question. This did not seem to solve the problem. I do not see anything using ps -aux to kill. According to the docs, only one instance can run at a time. That makes sense, but when I run riak-shell from another node and try to connect to any node, I now get the following:
Error: invalid function call : connection_EXT:connect ["riak@<<<ip_address_elided>>>"]
You can connect to a specific node (whether in your riak_shell.config
or not) by typing 'connect "dev1@127.0.0.1";' substituting your
node name for dev1.
You may need to change the Erlang cookie to do this.
See also the 'reconnect' command.
Unhandled message received is {#Ref<0.0.0.135>,disconnected}
riak-shell(3)>
I have not changed the cookies during this process, and the cookie appears to be the same (at least in /etc/riak/riak_shell.config). (I am running the Riak TS AMI on AWS.)
riak-shell runs in its own Erlang VM - entirely separate from the riak node
(You don't need to run riak-shell from the machine your node is on - it uses the normal riak-erlang-client to talk to riak)
If you are on Linux, ps aux | grep riak_shell_app will give you the process number you need to kill that instance:
08:30:45:~ $ ps aux | grep riak_shell_app
vagrant 4671 0.0 0.3 493260 34884 pts/4 Sl+ Aug17 0:03 /home/vagrant/riak_ee/dev/dev1/erts-5.10.3/bin/beam.smp -- -root /home/vagrant/riak_ee/dev/dev1 -progname erl -- -home /home/vagrant -- -boot /home/vagrant/riak_ee/dev/dev1/releases/2.1.1/start_clean -run riak_shell_app boot debug_off /home/vagrant/riak_ee/dev/dev1/bin/../log/riak_shell/riak_shell -noshell -config /home/vagrant/riak_ee/dev/dev1/bin/../etc/riak
I wrote a good chunk of it so let me know how you got on:
https://github.com/basho/riak_shell/graphs/contributors
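The ps aux | grep step above can also be scripted. This is a minimal sketch that extracts the PIDs from ps aux output, assuming the standard eleven-column layout (USER, PID, ..., COMMAND); the function name is made up for illustration.

```python
def find_pids(ps_output, needle="riak_shell_app"):
    """Return the PIDs of `ps aux` lines whose command contains `needle`."""
    pids = []
    for line in ps_output.splitlines():
        # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) != 11 or not fields[1].isdigit():
            continue  # header line or something unexpected
        command = fields[10]
        # Skip the grep process itself if the output came from a pipeline
        if needle in command and not command.startswith("grep "):
            pids.append(int(fields[1]))
    return pids
```

Each returned PID could then be passed to os.kill(pid, signal.SIGTERM) to stop the stale riak-shell instance.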
I'm trying to code a daemon in Unix. I understand how to get a daemon up and running. Now I want the daemon to respond when I type commands in the shell that are targeted at the daemon.
For example:
Let us assume the daemon name is "mydaemon"
In terminal 1 I type mydaemon xxx.
In terminal 2 I type mydaemon yyy.
"mydaemon" should be able to receive the argument "xxx" and "yyy".
If I interpret your question correctly, then you have to do this as an application-level construct. That is, this is something specific to your program you're going to have to code up yourself.
The approach I would take is to write "mydaemon" with the idea of it being a wrapper: it checks the process table or a pid file to see if a "mydaemon" is already running. If not, then fork/exec your new daemon. If so, then send the arguments to it.
For "send the arguments to it", I would use named pipes, as explained here: What are named pipes? Essentially, you can think of named pipes as being like "stdin", except they appear as a file to the rest of the system, so you can open them in your running "mydaemon" and check them for inputs.
Finally, it should be noted that all of this check-if-running-send-to-pipe stuff can either be done in your daemon program, using the API of the *nix OS, or it can be done in a script by using e.g. 'ps', 'echo', etc...
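The named-pipe part of this approach can be sketched in Python as follows. This is only an illustration under stated assumptions: the pipe location and function names are made up, the pid-file check is left out, and opening the FIFO with O_RDWR relies on Linux behaviour (POSIX leaves it undefined).

```python
import os
import tempfile

def open_command_pipe(fifo_path):
    """Daemon side: create the named pipe if needed and open it for reading.

    O_RDWR keeps one writer attached, so the daemon does not see EOF every
    time a client disconnects (Linux behaviour; POSIX leaves it undefined).
    """
    if not os.path.exists(fifo_path):
        os.mkfifo(fifo_path, 0o600)
    return os.fdopen(os.open(fifo_path, os.O_RDWR), "r")

def send_command(fifo_path, command):
    """Client side: what `mydaemon xxx` would do when a daemon is running."""
    with open(fifo_path, "w") as pipe:
        pipe.write(command + "\n")

def demo():
    """Send two commands through a temporary pipe and read them back."""
    fifo_path = os.path.join(tempfile.mkdtemp(), "mydaemon.fifo")
    with open_command_pipe(fifo_path) as pipe:
        send_command(fifo_path, "xxx")
        send_command(fifo_path, "yyy")
        return [pipe.readline().strip(), pipe.readline().strip()]
```

In a real daemon, the loop around pipe.readline() would run for the lifetime of the process, and the wrapper would call send_command only after confirming that the daemon is alive.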
The easiest, most common, and most robust way to do this in Linux is using a systemd socket service.
Example contents of /usr/lib/systemd/system/yoursoftware.socket:
[Unit]
Description=This is a description of your software
Before=yoursoftware.service
[Socket]
ListenStream=/run/yoursoftware.sock
Service=yoursoftware.service
# E.g.: use SocketMode=0666 to give rw access to everyone
# E.g.: use SocketMode=0640 to give rw access to root and read-only to SocketGroup
SocketMode=0660
SocketUser=root
# Use socket group to grant access only to specific processes
SocketGroup=root
[Install]
WantedBy=sockets.target
NOTE: If you are creating a local-user daemon instead of a root daemon, then your systemd files go in /usr/lib/systemd/user/ (see pulseaudio.socket for example) or ~/.config/systemd/user/ and your socket is at /run/user/$(id -u)/yoursoftware.sock (note that you can't actually use command substitution in pathnames in systemd).
Example contents of /lib/systemd/system/yoursoftware.service
[Unit]
Description=This is a description of your software
Requires=yoursoftware.socket
[Service]
ExecStart=/usr/local/bin/yoursoftware --daemon --yourarg yourvalue
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target
Also=yoursoftware.socket
Run systemctl daemon-reload && systemctl enable yoursoftware.socket yoursoftware.service as root.
Use systemctl --user daemon-reload && systemctl --user enable yoursoftware.socket yoursoftware.service if you're creating the service to run as a local user.
A functional example of the software in C would be way too long, so here's an example in NodeJS. Here is /usr/local/bin/yoursoftware:
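#!/usr/bin/env node
var net = require("net");
var SOCKET_PATH = "/run/yoursoftware.sock";
function errorHandle(e) {
  if (e) console.error(e), process.exit(1);
}
// process.argv[0] is node and argv[1] is this script; real arguments start at argv[2]
if (process.argv[2] === "--daemon") {
  var logFile = require("fs").createWriteStream(
    "/var/log/yoursoftware.log", {flags: "a"});
  // Append each client's data to the log; end: false keeps the log open between clients
  net.createServer(s => s.pipe(logFile, {end: false}))
    .on("error", errorHandle)
    // With socket activation, systemd has already bound SOCKET_PATH and passes it as fd 3
    .listen({fd: 3});
} else process.stdin.pipe(
  net.createConnection(SOCKET_PATH).on("error", errorHandle)
);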
In the example above, you can run many yoursoftware instances at the same time, and the stdin of each of the instances will be piped through to the daemon, which appends all the stuff it receives to a log file.
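For a daemon written in Python rather than NodeJS, the socket that systemd opened on behalf of the service can be picked up from the environment. This is a minimal sketch of the socket-activation handshake (LISTEN_PID, LISTEN_FDS, descriptors starting at 3); the function names are made up, and a real service might prefer a library that wraps sd_listen_fds.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first descriptor systemd hands to an activated service

def listen_fds(environ=None, pid=None):
    """Return the descriptors passed by systemd socket activation, if any."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # systemd sets LISTEN_PID to the service's PID and LISTEN_FDS to the count
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))

def activated_server(fd):
    """Wrap an inherited listening descriptor in a socket object."""
    return socket.socket(socket.AF_UNIX, socket.SOCK_STREAM, fileno=fd)
```

When listen_fds() returns an empty list, the process was started by hand rather than by systemd, and it would have to create and bind the socket itself.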
For non-Linux OSes and distros without systemd, you would use the (typically shell-scripted) startup system to begin your process at boot and the user would receive an error like could not connect to socket /run/yoursoftware.sock when something goes wrong with your daemon.