JMeter Distributed Testing: Master won't shut down - networking

I have a simple 4-server JMeter setup (3 slaves, 1 master):
Slave 1: 10.135.62.18 running ./jmeter-server -Djava.rmi.server.hostname=10.135.62.18
Slave 2: 10.135.62.22 running ./jmeter-server -Djava.rmi.server.hostname=10.135.62.22
Slave 3: 10.135.62.20 running ./jmeter-server -Djava.rmi.server.hostname=10.135.62.20
Master: 10.135.62.11 with remote_hosts=10.135.62.18,10.135.62.22,10.135.62.20
I start the test with ./jmeter -n -t /root/jmeter/simple.jmx -l /root/jmeter/result.jtl -r
With the following output:
Writing log file to: /root/apache-jmeter-3.0/bin/jmeter.log
Creating summariser <summary>
Created the tree successfully using /root/jmeter/simple.jmx
Configuring remote engine: 10.135.62.18
Configuring remote engine: 10.135.62.22
Configuring remote engine: 10.135.62.20
Starting remote engines
Starting the test # Mon Aug 29 11:22:38 UTC 2016 (1472469758410)
Remote engines have been started
Waiting for possible Shutdown/StopTestNow/Heapdump message on port 4445
The Slaves print:
Starting the test on host 10.135.62.22 # Mon Aug 29 11:22:39 UTC 2016 (1472469759257)
Finished the test on host 10.135.62.22 # Mon Aug 29 11:22:54 UTC 2016 (1472469774871)
Starting the test on host 10.135.62.18 # Mon Aug 29 11:22:39 UTC 2016 (1472469759519)
Finished the test on host 10.135.62.18 # Mon Aug 29 11:22:57 UTC 2016 (1472469777173)
Starting the test on host 10.135.62.20 # Mon Aug 29 11:22:39 UTC 2016 (1472469759775)
Finished the test on host 10.135.62.20 # Mon Aug 29 11:22:56 UTC 2016 (1472469776670)
Unfortunately, the master waits for messages on port 4445 indefinitely, even though all slaves have finished the test.
Is there anything I have missed?

I figured it out myself just before submitting the question. I guess the solution could be useful nonetheless:
Once I start the test (on the main server) with this:
./jmeter -n -t /root/jmeter/simple.jmx -l /root/jmeter/result.jtl -r -Djava.rmi.server.hostname=10.135.62.11 -Dclient.rmi.localport=4001
It works just fine. I wonder why the documentation doesn't mention something like this.
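For what it's worth, those two properties seem to control the RMI callback the slaves use to send results back to the master: java.rmi.server.hostname tells the slaves which address to call back, and client.rmi.localport pins that callback to a fixed port instead of a random one, which matters when a firewall sits between the machines. To avoid passing them on every run, a minimal sketch (assuming a stock JMeter 3.0 layout; adjust the IP and port to your environment):
# in bin/user.properties on the master
client.rmi.localport=4001
# in bin/system.properties on the master (loaded as JVM system properties)
java.rmi.server.hostname=10.135.62.11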

Related

NSS + PAM + TACACS+: first session fails

I have a device that I want to authorize against a TACACS+ server.
I am running tac_plus version F4.0.4.26.
The TACACS+ server has the following configuration:
accounting file = /var/log/tac_plus.acct
key = testing123
default authentication = file /etc/passwd
user = sf {
default service = permit
login = cleartext 1234
}
user = DEFAULT {
# login = PAM
service = ppp protocol = ip {}
}
On the device, NSS is configured as follows:
/etc/nsswitch.conf
passwd: files rf
group: files
shadow: files
hosts: files dns
networks: files dns
protocols: files
services: files
ethers: files
rpc: files
and in pam.d, an sshd file with the following contents:
# SERVER 1
auth required /lib/security/pam_rf.so
auth [success=done auth_err=die default=ignore] /lib/security/pam_tacplus.so server=172.18.177.162:49 secret=testing123 timeout=5
account sufficient /lib/security/pam_tacplus.so server=172.18.177.162:49 service=ppp protocol=ip timeout=5
session required /lib/security/pam_rf.so
session sufficient /lib/security/pam_tacplus.so server=172.18.177.162:49 service=ppp protocol=ip timeout=5
password required /lib/security/pam_rf.so
# PAM configuration for the Secure Shell service
# Standard Un*x authentication.
auth include common-auth
# Disallow non-root logins when /etc/nologin exists.
account required pam_nologin.so
# Standard Un*x authorization.
account include common-account
# Set the loginuid process attribute.
session required pam_loginuid.so
# Standard Un*x session setup and teardown.
session include common-session
# Standard Un*x password updating.
password include common-password
The problem: when I connect to the device for the first time via TeraTerm, I can see that the entered user name is added to /etc/passwd and /etc/shadow at session start,
but the login does not succeed, and in the TACACS+ server logs I see:
Mon Dec 17 19:00:05 2018 [25418]: session.peerip is 172.17.236.2
Mon Dec 17 19:00:05 2018 [25418]: forked 5385
Mon Dec 17 19:00:05 2018 [5385]: connect from 172.17.236.2 [172.17.236.2]
Mon Dec 17 19:00:05 2018 [5385]: Found entry for alex in shadow file
Mon Dec 17 19:00:05 2018 [5385]: verify
IN $6$DUikjB1i$4.cM87/pWRZg2lW3gr3TZorAReVL7JlKGA/2.BRi7AAyHQHz6bBenUxGXsrpzXkVvpwp0CrtNYAGdQDYT2gaZ/
Mon Dec 17 19:00:05 2018 [5385]:
IN encrypts to $6$DUikjB1i$AM/ZEXg6UAoKGrFQOzHC6/BpkK0Rw4JSmgqAc.xJ9S/Q7n8.bT/Ks73SgLdtMUAGbLAiD9wnlYlb84YGujaPS/
Mon Dec 17 19:00:05 2018 [5385]: Password is incorrect
Mon Dec 17 19:00:05 2018 [5385]: Authenticating ACLs for user 'DEFAULT' instead of 'alex'
Mon Dec 17 19:00:05 2018 [5385]: pap-login query for 'alex' ssh from 172.17.236.2 rejected
Mon Dec 17 19:00:05 2018 [5385]: login failure: alex 172.17.236.2 (172.17.236.2) ssh
After that, if I close TeraTerm, open it again and try to connect, the connection is established successfully; if I then close and reopen TeraTerm once more, the same problem appears, so every second attempt fails.
What could the problem be? It is driving me crazy already.
After digging deeper into the problem, I found out that it was my own fault: I had compiled my name service module with g++ instead of gcc.
The name service uses
#include <pwd.h>
which declares the interface for functions like _nss_<service>_getpwnam_r and others, written in C, so I either had to wrap it:
extern "C" {
#include <pwd.h>
}
or compile the module with gcc instead. I hope this helps someone who runs into the same problem one day. Good luck!
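A quick way to confirm this kind of problem (a hedged sketch; the library name below assumes the NSS service is called rf, as in the nsswitch.conf above, and the path may differ on your system) is to inspect the module's exported symbols:
nm -D /lib/libnss_rf.so.2 | grep getpwnam
With gcc (or with the extern "C" wrapper) you should see the plain C symbol _nss_rf_getpwnam_r; with a plain g++ build the names come out C++-mangled (they start with _Z), so glibc's NSS loader cannot find them.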

Shiny server Connection closed. Info: {"type":"close","code":4503,"reason":"The application unexpectedly exited","wasClean":true}

I've encountered a problem deploying my Shiny app on Ubuntu 16.04 LTS.
After I run sudo systemctl start shiny-server and point my browser at http://192.168..*:3838/StockVis/, the web page greys out within a second.
I found the warnings below in the web console, and have been searching the web for about two weeks, but still have no solution. :(
Thu Feb 16 2017 12:20:49 GMT+0800 (CST) [INF]: Connection opened. http://192.168.**.***:3838/StockVis/
Thu Feb 16 2017 12:20:49 GMT+0800 (CST) [DBG]: Open channel 0
The application unexpectedly exited.
Diagnostic information is private. Please ask your system admin for permission if you need to check the R logs.
Thu Feb 16 2017 12:20:50 GMT+0800 (CST) [INF]: Connection closed. Info: {"type":"close","code":4503,"reason":"The application unexpectedly exited","wasClean":true}
Thu Feb 16 2017 12:20:50 GMT+0800 (CST) [DBG]: SockJS connection closed
Thu Feb 16 2017 12:20:50 GMT+0800 (CST) [DBG]: Channel 0 is closed
Thu Feb 16 2017 12:20:50 GMT+0800 (CST) [DBG]: Removed channel 0, 0 left
Please give me some suggestions on how to move forward.
This usually indicates that something in your R code is causing an error. Since that error could be anything, this answer is about how to gather that information; the browser console messages will not tell you what it is. To get at the error, you need to configure Shiny Server not to delete the log when the application exits.
Assuming you have sudo access:
$ sudo vi /etc/shiny-server/shiny-server.conf
Place the following line in the file, after the run_as shiny; line:
preserve_logs true;
Restart shiny:
sudo systemctl restart shiny-server
Reload your Shiny app.
In the /var/log/shiny-server/ directory there will be a log file named after your application. Viewing that file will give you more information on what is going on.
Warning: after you are done, take the preserve_logs true; line back out of the conf file and restart Shiny Server; otherwise you will keep generating a pile of log files you don't want.
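For example (assuming the app is called StockVis, as in the question, and the default log location; the exact file name will differ because it contains the run date and a random suffix):
sudo ls -lt /var/log/shiny-server/
sudo tail -n 50 /var/log/shiny-server/StockVis-shiny-<timestamp>.log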

uWSGI, Nginx, Flask app service keeps failing

Going to my app produces a 502 Bad Gateway error. I found out that it is because my how_lit.service is failing, but I am having trouble finding out why.
I have tried editing the application and the .ini file, but cannot figure out what's wrong.
The Nginx and uWSGI services are up and running fine.
Service Status:
lit@digitalocean:~/howlit$ sudo service how_lit status
[sudo] password for lit:
● how_lit.service - uWSGI instance to serve how lit rest api
Loaded: loaded (/etc/systemd/system/how_lit.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2016-08-04 00:30:44 EDT; 5 days ago
Process: 14294 ExecStart=/home/lit/howlit/env/bin/uwsgi --ini /home/lit/howlit/howlit.ini (code=exited, status=1/FAILURE)
Main PID: 14294 (code=exited, status=1/FAILURE)
Aug 04 00:30:44 digitalocean systemd[1]: Started uWSGI instance to serve how lit rest api.
Aug 04 00:30:44 digitalocean uwsgi[14294]: [uWSGI] getting INI configuration from /home/lit/howlit/howlit.ini
Aug 04 00:30:44 digitalocean systemd[1]: how_lit.service: Main process exited, code=exited, status=1/FAILURE
Aug 04 00:30:44 digitalocean systemd[1]: how_lit.service: Unit entered failed state.
Aug 04 00:30:44 digitalocean systemd[1]: how_lit.service: Failed with result 'exit-code'.
Directory and Permissions:
lit@digitalocean:~/howlit$ ls -l .
total 16
drwx---r-x 6 lit www-data 4096 Jul 29 11:47 env
-rwx---r-x 1 lit www-data 202 Aug 3 23:29 howlit.ini
-rwx---r-x 1 lit www-data 1203 Aug 3 23:01 how_lit_restapi.py
-rwxr-xr-x 1 lit www-data 72 Aug 3 23:27 wsgi.py
/etc/systemd/system/how_lit.service:
lit@digitalocean:~/howlit$ cat /etc/systemd/system/how_lit.service
[Unit]
Description=uWSGI instance to serve how lit rest api
After=network.target
[Service]
User=lit
Group=www-data
WorkingDirectory=/home/lit/howlit/
Environment="PATH=/home/lit/howlit/env/bin"
ExecStart=/home/lit/howlit/env/bin/uwsgi --ini /home/lit/howlit/howlit.ini
[Install]
WantedBy=multi-user.target
howlit.ini file:
lit@digitalocean:~/howlit$ cat howlit.ini
[uwsgi]
module = wsgi:app
uid = lit
gid = www-data
master = true
processes = 5
socket = how_lit_restapi.sock
chmod-sock = 666
vacum = true
die-on-term = true
gto = /var/log/uwsgi/%n.log
Tried running it by hand:
lit@digitalocean:~/howlit$ /home/lit/howlit/env/bin/uwsgi --ini /home/lit/howlit/howlit.ini
[uWSGI] getting INI configuration from /home/lit/howlit/howlit.ini
*** Starting uWSGI 2.0.13.1 (64bit) on [Tue Aug 9 18:28:25 2016] ***
compiled with version: 5.4.0 20160609 on 29 July 2016 11:48:08
os: Linux-4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016
nodename: digitalocean
machine: x86_64
clock source: unix
detected number of CPU cores: 1
current working directory: /home/lit/howlit
detected binary path: /home/lit/howlit/env/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
your processes number limit is 1896
your memory page size is 4096 bytes
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
bind(): Permission denied [core/socket.c line 230]
Permission error again?
SOLVED IT: by moving my socket into /tmp. But I was still getting the bad gateway error when I navigated to my site :(
Solved my own problem.
First I checked my services.
sudo service nginx status
sudo service uwsgi status
sudo service how_lit status
They were all up and running, but I was still getting the bad gateway error. The logs showed no errors, so I had to assume the problem was in my configs.
Then I realized my mistake: I had never restarted all of them, only certain parts at certain times. So I restarted every single one, like this:
sudo service nginx restart
sudo service uwsgi restart
sudo service how_lit restart
Now it works.
As for the permission issue, I worked around it by putting the socket into the /tmp directory, so that users in the www-data group can access it as well as root. I learned that the process needs to be able to create the socket, and the web server needs to be able to reach it.
By the way, I later moved it out of /tmp for production, as I was told that was not best practice.
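For reference, a common production alternative to /tmp (just a sketch, not part of the original fix; names and paths assume the layout from the question) is a dedicated runtime directory that the lit user owns and the www-data group can reach:
sudo mkdir -p /run/uwsgi
sudo chown lit:www-data /run/uwsgi
# in howlit.ini
socket = /run/uwsgi/how_lit.sock
chmod-socket = 660
# in the nginx location block
uwsgi_pass unix:/run/uwsgi/how_lit.sock;
Since /run is a tmpfs that is cleared on reboot, the directory has to be recreated at boot, for example with RuntimeDirectory=uwsgi in the systemd unit.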

I want all the records till the end of the file which are greater than a given date, in Unix

I want all the records up to the end of the log file that are greater than a given date.
Suppose the given date is Mon Dec 14 22:00:03 2015; then I want every line from the first occurrence of a date greater than this one through to the end of the file.
For example:
awk ' { if ( $0 > "Tue Dec 15 08:00:00 2015") print } ' file.out
only prints the lines that compare greater than the date, but I want all lines after the first match until the end of the file.
Please note that:
1. I cannot simply match the date with a regex, since I don't know whether an entry for that exact date and time (H:M:S) is present in the log file, so I have to use a greater-than comparison.
2. The date is not present on every line of the log file; it appears here and there, on its own line.
Please help.
Sample log file:
Mon Dec 14 02:00:00 2015
Clearing Resource Manager plan via parameter
Mon Dec 14 07:02:57 2015
***********************************************************************
Fatal NI connect error 12504, connecting to:
(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=)(CID=(PROGRAM=oracle)(HOST=ltest8)(USER=oracle)))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.115)(PORT=1521)))
VERSION INFORMATION:
TNS for Linux: Version 11.1.0.6.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.6.0 - Production
Time: 14-DEC-2015 07:02:57
Tracing not turned on.
Tns error struct:
ns main err code: 12564
TNS-12564: TNS:connection refused
ns secondary err code: 0
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Mon Dec 14 08:01:37 2015
***********************************************************************
Fatal NI connect error 12504, connecting to:
(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=)(CID=(PROGRAM=oracle)(HOST=ltest8)(USER=oracle)))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.115)(PORT=1521)))
VERSION INFORMATION:
TNS for Linux: Version 11.1.0.6.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.6.0 - Production
Time: 14-DEC-2015 08:01:37
Tracing not turned on.
Tns error struct:
ns main err code: 12564
TNS-12564: TNS:connection refused
ns secondary err code: 0
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Mon Dec 14 08:54:33 2015
***********************************************************************
Fatal NI connect error 12504, connecting to:
(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=)(CID=(PROGRAM=oracle)(HOST=ltest8)(USER=oracle)))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.115)(PORT=1521)))
VERSION INFORMATION:
TNS for Linux: Version 11.1.0.6.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.6.0 - Production
Time: 14-DEC-2015 08:54:33
Tracing not turned on.
Tns error struct:
ns main err code: 12564
TNS-12564: TNS:connection refused
ns secondary err code: 0
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Mon Dec 14 08:57:18 2015
Thread 1 advanced to log sequence 232
Current log# 2 seq# 232 mem# 0: /u04/app/oracle/oradata/kcom/redo02.log
Mon Dec 14 08:57:19 2015
Errors in file /u01/app/oracle/diag/rdbms/kcom/Rialto/trace/Rialto_arc3_3953.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 268435456 bytes is 100.00% used, and has 0 remaining bytes available.
************************************************************************
You have following choices to free up space from flash recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
system command was used to delete files, then use RMAN CROSSCHECK and
DELETE EXPIRED commands.
************************************************************************
Mon Dec 14 09:17:45 2015
***********************************************************************
Fatal NI connect error 12504, connecting to:
(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=)(CID=(PROGRAM=oracle)(HOST=ltest8)(USER=oracle)))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.115)(PORT=1521)))
VERSION INFORMATION:
TNS for Linux: Version 11.1.0.6.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.1.0.6.0 - Production
Time: 14-DEC-2015 09:17:45
Tracing not turned on.
Tns error struct:
ns main err code: 12564
TNS-12564: TNS:connection refused
ns secondary err code: 0
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Mon Dec 14 10:25:24 2015
QKSRC: ViewText[ecode=942] = SELECT /*+ result_cache */ ID, 'PLUGIN_'||NAME AS NAME, STANDARD_ATTRIBUTES, SQL_MIN_COLUMN_COUNT, NVL(SQL_MAX_COLUMN_COUNT, 999) AS SQL_MAX_COLUMN_COUNT, SQL_EXAMPLES FROM WWV_FLOW_PLUGINS WHERE FLOW_ID = :B2 AND PLUGIN_TYPE = :B1
Mon Dec 14 14:31:14 2015
Thread 1 advanced to log sequence 233
Current log# 3 seq# 233 mem# 0: /u04/app/oracle/oradata/kcom/redo03.log
Mon Dec 14 14:31:15 2015
Errors in file /u01/app/oracle/diag/rdbms/kcom/Rialto/trace/Rialto_arc0_3947.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 268435456 bytes is 100.00% used, and has 0 remaining bytes available.
************************************************************************
You have following choices to free up space from flash recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
system command was used to delete files, then use RMAN CROSSCHECK and
DELETE EXPIRED commands.
************************************************************************
Mon Dec 14 20:28:23 2015
Thread 1 advanced to log sequence 234
Current log# 4 seq# 234 mem# 0: /u04/app/oracle/oradata/kcom/redo04.log
Mon Dec 14 20:28:24 2015
Errors in file /u01/app/oracle/diag/rdbms/kcom/Rialto/trace/Rialto_arc1_3949.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 268435456 bytes is 100.00% used, and has 0 remaining bytes available.
************************************************************************
You have following choices to free up space from flash recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
system command was used to delete files, then use RMAN CROSSCHECK and
DELETE EXPIRED commands.
************************************************************************
Mon Dec 14 22:00:00 2015
Setting Resource Manager plan SCHEDULER[0x2C09]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Mon Dec 14 22:00:03 2015
Mon Dec 14 22:00:03 2015
Logminer Bld: Lockdown Complete. DB_TXN_SCN is UnwindToSCN (LockdownSCN) is 18957974
Tue Dec 15 02:00:00 2015
Clearing Resource Manager plan via parameter
Tue Dec 15 02:00:02 2015
Here is a simple Python 2 parser with a hard-coded datetime and input file, tested against the log you posted. It has plenty of room for optimization, but it works as a start.
#!/usr/bin/env python
import re
from datetime import datetime

# year, month, day, hour, minute
filter_from = datetime(2015, 12, 13, 23, 40)

with open("tmp.log") as log:
    match = False
    for line in log:
        if match:
            print line
        else:
            candidate = re.match(r'[A-Z][a-z][a-z] [A-Z][a-z][a-z] \d\d \d\d:\d\d:\d\d \d\d\d\d', line)
            if candidate:
                parsed_date = datetime.strptime(candidate.group(0), "%a %b %d %X %Y")
                if parsed_date > filter_from:
                    match = True
                    print line
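If you prefer not to hard-code the date and file name, a small variation (just a sketch; the script name and argument order are made up here) could take them from the command line:
#!/usr/bin/env python
# Sketch: same approach, but the cut-off date and the log file come from the
# command line, e.g.  ./print_since.py "Mon Dec 14 22:00:03 2015" file.out
import re
import sys
from datetime import datetime

filter_from = datetime.strptime(sys.argv[1], "%a %b %d %H:%M:%S %Y")
date_re = re.compile(r'[A-Z][a-z][a-z] [A-Z][a-z][a-z] \d\d \d\d:\d\d:\d\d \d\d\d\d')

match = False
with open(sys.argv[2]) as log:
    for line in log:
        if not match:
            candidate = date_re.match(line)
            if candidate and datetime.strptime(candidate.group(0), "%a %b %d %H:%M:%S %Y") > filter_from:
                match = True
        if match:
            # write() instead of print, so the line's own newline is kept as-is
            sys.stdout.write(line)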

How can I get more precise log sources from my Deis apps/containers?

I have a Deis cluster running in a (hopefully-soon-to-be) Production environment, with quite a few different apps using the Dockerfile deployment method. Everything's running fine, but promoting this system to a true Production environment (that is, converting the DNS over) isn't really possible unless I can get some worthwhile log output. Using the standard Deis logging platform, here's some sample output of a Web hit (with a bit more output, for context):
Feb 10 01:46:04 ip-10-21-2-154.ec2.internal systemd[1]: Starting Generate /run/coreos/motd...
Feb 10 01:46:04 ip-10-21-2-154.ec2.internal systemd[1]: Started Generate /run/coreos/motd.
Feb 10 01:46:08 ip-10-21-2-154.ec2.internal docker[1867]: [info] GET /containers/json
Feb 10 01:46:08 ip-10-21-2-154.ec2.internal docker[1867]: [215084df] +job containers()
Feb 10 01:46:08 ip-10-21-2-154.ec2.internal docker[1867]: [215084df] -job containers() = OK (0)
Feb 10 01:46:09 ip-10-21-2-154.ec2.internal sh[1316]: 2015/02/10 01:46:09 set /deis/services/production-web/production-web_v8.cmd.1 -> 10.21.2.154:49409
Feb 10 01:46:12 ip-10-21-2-154.ec2.internal sh[9844]: 2015-02-10 01:46:12.302721 7f213ae14700 0 mon.ip-10-21-2-154.ec2.internal#4(peon).data_health(58) update_stats avail 80% total 102400 MB, used 17621 MB, avail 82542 MB
Feb 10 01:46:18 ip-10-21-2-154.ec2.internal docker[1867]: [info] GET /containers/json
Feb 10 01:46:18 ip-10-21-2-154.ec2.internal docker[1867]: [215084df] +job containers()
Feb 10 01:46:18 ip-10-21-2-154.ec2.internal docker[1867]: [215084df] -job containers() = OK (0)
Feb 10 01:46:19 ip-10-23-1-151.ec2.internal sh[1521]: [INFO] - [10/Feb/2015:01:46:27 +0000] - 10.21.2.179 - - - 200 - "GET / HTTP/1.1" - 4927 - "-" - "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.111 Safari/537.36" - "~^production-web\x5C.(?<domain>.+)$" - 10.21.2.154:49409
Feb 10 01:46:19 ip-10-21-2-154.ec2.internal sh[8468]: ===========
Feb 10 01:46:19 ip-10-21-2-154.ec2.internal sh[8468]: HIT TRACKER
Feb 10 01:46:19 ip-10-21-2-154.ec2.internal sh[8468]: SLUG: public/javascripts/bundle.js
Feb 10 01:46:19 ip-10-21-2-154.ec2.internal sh[8468]: ===========
That contains a lot of platform information – which is fine to have, if only I could filter it out. The problem is the lines whose source is shown as sh, each with a different PID. Those are all completely different containers:
1316 is deis-publisher
9844 is deis-store-monitor
1521 is deis-router
8468 is my web application, production-web
The only way for me to find that out is to ssh into the box and run ps. What's worse, if I had any logs from my other containers at the same time, they would have also shown up as sh – in a production environment with several active apps all logging to the same stream, this situation is obviously untenable. The ideal situation would have sh replaced by the name of the Docker container or, preferably, the Deis app.
I've pored over the documentation and dug into the logspout and logger source code, but I can't find anything to fix this. Any chance I could get some pointers here?
In order to get at the name of the Deis container that logged a line, the best way I've found is to do one of the following:
Run the output of journalctl -f -o short through netcat to a fluentd or logstash TCP listener. You can use those tools to pick out the fields, such as _SYSTEMD_UNIT, that suit your needs.
Use ianblenke/fluentd with LOG_DOCKER_JSON defined, or fork and modify the autobuild source docker-ianblenke/fluentd. This uses the fluentd-docker plugin to follow the raw Docker container JSON logs.
If you're using CoreOS, I use this fluentd.cloud-init to auto-feed my logs to a local Elasticsearch instance on TCP 9200. You will find other useful CoreOS cloud-init configs in that project as well.
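As a rough illustration of the first option (a sketch only; it assumes journald is collecting the output on the host, and the CONTAINER_NAME field is only present if Docker is using the journald log driver), you can read the journal as JSON and label each message yourself instead of relying on the generic "sh" source:
#!/usr/bin/env python
# Sketch: follow the journal as JSON and prefix each message with its
# _SYSTEMD_UNIT (or CONTAINER_NAME, when the docker journald log driver is in use).
import json
import subprocess

proc = subprocess.Popen(["journalctl", "-f", "-o", "json"], stdout=subprocess.PIPE)
for raw in iter(proc.stdout.readline, b""):
    try:
        entry = json.loads(raw.decode("utf-8", "replace"))
    except ValueError:
        continue
    source = entry.get("CONTAINER_NAME") or entry.get("_SYSTEMD_UNIT", "?")
    print("%s: %s" % (source, entry.get("MESSAGE", "")))
You may need to run it as root, or as a member of the systemd-journal group, to see all units.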
