I recently started a job that will involve a lot of performance tweaking.
I was wondering whether tools like eBPF and perf can be used with RBAC, or will full root access be required? Getting root access might be difficult. We're mainly using fairly old Linux machines - RHEL 6.5. I'm not too familiar with RBAC. At home I have used DTrace on Solaris, macOS and FreeBSD, but there I had the root password.
Red Hat lists several profiling and tracing solutions for RHEL 6, including perf, in its
Performance Tuning Guide and Developer Guide:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/s-analyzperf-perf
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/developer_guide/perf-using
Chapter 3, "Monitoring and Analyzing System Performance", of the Performance Tuning Guide mentions several tools: GNOME System Monitor, KDE System Guard, Performance Co-Pilot (PCP), top/ps/vmstat/sar, tuned and ktune, MRG Tuna, and the application profilers SystemTap, OProfile, Valgrind (which is not a true profiler, but a CPU emulator with instruction and cache-event counting), and perf.
Chapter 5, "Profiling", of the Developer Guide lists Valgrind, OProfile, SystemTap, perf, and ftrace.
Profiling the kernel or the whole system is usually allowed only for root, or for a user with the CAP_SYS_ADMIN capability. Some profiling is further limited by sysctl variables:
kernel.perf_event_paranoid (documented in https://www.kernel.org/doc/Documentation/sysctl/kernel.txt):
perf_event_paranoid:
Controls use of the performance events system by unprivileged
users (without CAP_SYS_ADMIN). The default value is 2.
 -1: Allow use of (almost) all events by all users
     Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>=0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
     Disallow raw tracepoint access by users without CAP_SYS_ADMIN
>=1: Disallow CPU event access by users without CAP_SYS_ADMIN
>=2: Disallow kernel profiling by users without CAP_SYS_ADMIN
kernel.kptr_restrict (https://www.kernel.org/doc/Documentation/sysctl/kernel.txt), which also affects perf's ability to profile the kernel:
kptr_restrict:
This toggle indicates whether restrictions are placed on
exposing kernel addresses via /proc and other interfaces.
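You can check the current values of both sysctls before deciding whether you need to ask for changes, for example:
sysctl kernel.perf_event_paranoid kernel.kptr_restrict
# or, equivalently:
cat /proc/sys/kernel/perf_event_paranoid /proc/sys/kernel/kptr_restrict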
More recent versions of Ubuntu and RHEL (7.4+) also have kernel.yama.ptrace_scope: http://security-plus-data-science.blogspot.com/2017/09/some-security-updates-in-rhel-74.html
... use kernel.yama.ptrace_scope to set who can ptrace. The different
values have the following meaning:
# 0 - Default attach security permissions.
# 1 - Restricted attach. Only child processes plus normal permissions.
# 2 - Admin-only attach. Only executables with CAP_SYS_PTRACE.
# 3 - No attach. No process may call ptrace at all. Irrevocable until next boot.
You can temporarily set it like this:
echo 2 > /proc/sys/kernel/yama/ptrace_scope
To profile a program you should be able to debug it, for example to attach to it with gdb (the ptrace capability) or strace. I don't know RHEL or its RBAC, so you should check what is available to you. Generally, perf profiling of your own userspace programs on software events is available in the widest range of configurations. Access to per-process CPU hardware counters, to profiling other users' programs, and to profiling the kernel is more limited. I would expect that a correctly configured RBAC setup will not allow you (or even root) to profile the kernel, since perf can inject tracing probes and leak information from the kernel or from other users.
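As a rough illustration of that last point, something like the following usually works on your own process without root, subject to the perf_event_paranoid setting quoted above (./myprog is a placeholder for your own binary):
perf stat -e task-clock,context-switches,page-faults ./myprog
perf record -e cpu-clock -g ./myprog
perf report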
Qeole notes in a comment that eBPF is not implemented for RHEL 6 (it was added in RHEL 7.6, with XDP, eXpress Data Path, in RHEL 8), so on RHEL 6 you can only try ftrace for tracing or stap (SystemTap) for more advanced tracing.
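For reference, minimal examples of each (these need root, or, for SystemTap on RHEL, membership in the stapusr/stapdev groups, so check what your RBAC profile permits):
# ftrace (mount debugfs first if it is not already mounted):
# mount -t debugfs nodev /sys/kernel/debug
echo function_graph > /sys/kernel/debug/tracing/current_tracer
head -50 /sys/kernel/debug/tracing/trace
echo nop > /sys/kernel/debug/tracing/current_tracer
# SystemTap "hello world": print a message on the first vfs read, then exit
stap -v -e 'probe vfs.read { printf("read performed\n"); exit() }'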
Related
What's the UNIX command to see the process table? Remember that the table contains:
process status
pointers
process size
user ids
process ids
event descriptors
priority
etc
The "process table" as such lives in the kernel's memory. Some systems (such as AIX, Solaris and Linux--which is not "unix") have a /proc filesystem which makes those tables visible to ordinary programs. Without that, programs such as ps (on very old systems such as SunOS 4) required elevated privileges to read the /dev/kmem (kernel memory) special device, as well as having detailed knowledge about the kernel memory layout.
Your question is open-ended, and an answer to the specific question you may have had can be looked up in the relevant man page, as @Alfasin suggests in his answer. A lot depends on what you are trying to do.
As @ThomasDickey points out in his response, in UNIX and most of its derivatives the command for viewing processes running in the background or foreground is in fact the ps command.
ps stands for 'process status', answering your first bullet item. The command takes over 30 options, and depending on what information you seek and the permissions granted to you by the system administrator, you can get various types of information from it.
For example, for the second bullet item on your list, depending on what you are looking for, you can get information on three different types of pointers: the session pointer (with option 'sess'), the terminal session pointer ('tsess'), and the process pointer ('uprocp').
The rest of your items that you have listed are mostly available as standard output of the command.
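For example, on Linux with procps you can pick output columns that cover most of the items in your list (the exact keywords vary slightly between UNIX flavours):
ps -eo pid,ppid,user,stat,pri,vsz,etime,time,comm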
Some UNIX variants implement a view of the system process table inside the file system to support the running of programs such as ps. This is normally mounted on /proc (see @ThomasDickey's response above).
Typical reasons for understanding the workings of the command include system-administration responsibilities such as tracking the origin of initiated processes, killing runaway or orphaned processes, examining the size of a process and setting limits where necessary, etc. UNIX developers can also use it in conjunction with IPC features. An understanding of the process table and process status will also help with associated UNIX features such as the kvm interface for examining crash dumps, or for getting and setting the kernel state.
Hope this helps
Is there an MPI implementation that allows nodes to be dynamically added/removed at runtime? Do any recover from complete hardware failure of a node, allowing the node to be repaired and relaunched without restarting the program?
Is there an MPI implementation that allows nodes to be dynamically added/removed at runtime?
This is actually two questions. Nodes can usually be dynamically added at runtime using calls like MPI_Comm_spawn. As @Hristo pointed out in the comments, you should set the correct info key in Open MPI; it may also be possible in other implementations.
As for removing nodes, that's a big area of research at the moment. Most MPI implementations currently have varying levels of success surviving a total node failure. In the current releases of Open MPI, I don't believe there is any support for that sort of failure [citation needed], though work to improve that is ongoing. In the current version of MPICH, you can pass the flag -disable-auto-cleanup to mpiexec and it will not automatically clean up your application after a process/node failure; however, you'll still have to modify your MPI application to handle this situation. The various derivatives of MPICH (Intel MPI, Cray MPI, IBM MPI, MVAPICH, etc.) don't support this feature AFAIK.
There are other research implementations that extend the support of the MPI Standard. User Level Failure Mitigation is currently being considered by the standardization body as a way of letting the user handle process failures. There is a research implementation based on Open MPI available at the website linked, and an experimental prototype will also be in the next version of MPICH (3.2).
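A minimal sketch of the MPICH invocation described above (./my_app is a placeholder; the application itself must still detect failures, e.g. by checking the return codes of its communication calls):
mpiexec -disable-auto-cleanup -n 16 ./my_app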
Do any recover from complete hardware failure of a node, allowing the node to be repaired and relaunched without restarting the program?
This is essentially the same process as above. You would need to use the APIs to remove a process and then somehow find out that it's available and add it back using spawn. These calls have to be made from inside the application though, not externally.
At what level do daemon processes like init, httpd, ftpd, dhcpd, etc. run? Is it at kernel level, or at user level like the shell, library functions and applications?
I have read several Unix books and internet articles, but none of them mention where they run.
They run in userspace but with root privileges for some of them. There is no requirement for a daemon (in general) to run in kernel space. Kernel space is restricted for tasks that handle the lowest level of interaction with the hardware (drivers) and back the vital functions of the OS (memory management, file system, etc.).
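You can see this for yourself: daemons appear in the ordinary process list like any other userspace program, typically owned by root or a dedicated service account (the process names below are just examples and may differ on your system):
ps -eo pid,user,stat,comm | grep -E 'init|httpd|sshd|dhcpd'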
Our company is working on integrating Guidewire (a claims-processing system) into the existing claims system. We will be executing performance tests on the integrated system shortly. I wanted to know if there is some way to monitor the integration points specific to Guidewire.
The system is connected through Web Services. We have access to Loadrunner and Sitescope, and are comfortable with using other open source tools also.
I realize monitoring WSDL files is an option; could you suggest additional methods to monitor the integration points?
Look at the architecture of Guidewire. You have OS monitoring points and you have application monitoring points. The OS is straightforward, using SiteScope, SNMP (with SiteScope or LoadRunner), Hyperic, native OS tools, or a tool like Splunk.
You likely have a database involved: that monitoring case is well known and understood.
Monitoring the services? Ask the application experts inside your organization what they look at to determine whether the application is healthy and running well. You might end up implementing a set of terminal users (RTE) with data points, log monitoring through SiteScope, or custom monitors scheduled to run on the host, piping their output through sed into a standard form that can be imported into Analysis at the end of the test.
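A hedged sketch of such a custom monitor, run from cron; the output path and the sampled fields are placeholders, so adapt them to whatever your application experts consider meaningful:
#!/bin/sh
# Sample CPU/memory once and append a timestamped CSV row that can later be
# imported into LoadRunner Analysis.
OUT=/var/tmp/gw_monitor.csv
vmstat 1 2 | tail -1 | \
    sed "s/^[[:space:]]*//; s/[[:space:]]\{1,\}/,/g; s/^/$(date +%Y-%m-%dT%H:%M:%S),/" >> "$OUT"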
Think architecturally. Decompose each host in the stack into OS and services. Map your known monitors to the hosts and layers. Where you run into gaps, grab the application experts and have them write down the monitors they use (they will have more faith in your results and analysis as a result).
I manage Unix systems where, sometimes, programs like CGI scripts run forever, sometimes eating a lot of CPU time and wasting resources.
I want a program (typically invoked from cron) which can kill these runaways, based on the following criteria (combined with AND and OR):
Name (given by a regexp)
CPU time used
elapsed time (for programs which are blocked on an I/O)
I do not really know what to type into a search engine for this sort of program. I certainly could write it myself in Python, but I'm lazy, and maybe a good program already exists?
(I did not tag my question with a language name since a program in Perl or Ruby or whatever would work as well)
Try using system-level quota enforcement instead. Most systems will allow you to set a per-process CPU time limit for different users.
Examples:
Linux: /etc/security/limits.conf
FreeBSD: /etc/login.conf
CGI scripts can usually be run under their own user ID, for example using mod_suid for Apache.
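For instance, on Linux a line like the following in /etc/security/limits.conf caps every process of a hypothetical dedicated "cgiuser" account at 10 minutes of CPU time (the "cpu" limit is expressed in minutes):
cgiuser    hard    cpu    10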
This might be something more like what you were looking for:
http://devel.ringlet.net/sysutils/timelimit/
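If I read timelimit(1) correctly, an invocation along these lines sends SIGTERM after a given wall-clock time and SIGKILL a little later if the process is still around (the path is a placeholder):
timelimit -t 300 -T 10 /var/www/cgi-bin/suspect.cgi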
Most of the watchdog-like programs or libraries just check whether a given process is still running, so I'd say you're better off writing your own, using the existing libraries that expose process information.
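As a hedged sketch of what "writing your own" could look like with nothing but standard Linux tools (the process-name pattern and the 10-minute CPU threshold are placeholders, and command names containing spaces are not handled):
#!/bin/sh
# Kill processes whose command name matches a pattern and whose cumulative
# CPU time exceeds a threshold. Intended to be run from cron.
ps -eo pid=,time=,comm= | awk -v pat='\\.cgi$' '
    $3 ~ pat {
        n = split($2, t, /[-:]/)              # TIME is [[dd-]hh:]mm:ss
        secs = t[n] + 60*t[n-1] + 3600*t[n-2] + 86400*t[n-3]
        if (secs > 600) print $1              # 600 s = 10 min of CPU time
    }' | xargs -r kill                        # -r (no run if empty) is GNU xargs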