This commit is based on exploratory work by Sohaib Mohamed.
The end goal is two-fold - to support addition of Meters we
build via configuration files for both the PCP platform and
for scripts ( https://github.com/htop-dev/htop/issues/526 )
Here, we focus on generic code and the PCP support. A new
class DynamicMeter is introduced - it uses the special case
'param' field handling that previously was used only by the
CPUMeter, such that every runtime-configured Meter is given
a unique identifier. Unlike with the CPUMeter this is used
internally only. When reading/writing to htoprc instead of
CPU(N) - where N is an integer param (CPU number) - we use
the string name for each meter. For example, if we have a
configuration for a DynamicMeter for some Redis metrics, we
might read and write "Dynamic(redis)". This identifier is
subsequently matched (back) up to the configuration file so
we're able to re-create arbitrary user configurations.
The PCP platform configuration file format is fairly simple.
We expand configs from several directories, including the
users homedir alongside htoprc (below htop/meters/) and also
/etc/pcp/htop/meters. The format will be described via a
new pcp-htop(5) man page, but its basically ini-style and
each Meter has one or more metric expressions associated, as
well as specifications for labels, color and so on via a dot
separated notation for individual metrics within the Meter.
A few initial sample configuration files are provided below
./pcp/meters that give the general idea. The PCP "derived"
metric specification - see pmRegisterDerived(3) - is used
as the syntax for specifying metrics in PCP DynamicMeters.
Reading and parsing /proc/<pid>/maps is quite expensive.
Do not check for deleted libraries if the main binary has been deleted;
in this case the deleted binary takes precedence.
Do not check in threads. The check is void for kernel threads and user-
land threads can just inherit the state from the main process structure.
O_PATH is available since Linux 2.6.39, but we are using fstat(2) on the
returned file descriptor in LinuxProcessList_statProcessDir(), which
is only supported since Linux 3.6.
Fixes#534
Shared libraries can be replaced by an upgrade, highlight processes
using deleted shared libraries.
Link with highlightDeletedExe setting, enabled by default.
Currently only checked on Linux.
The following assert failure might happen on short running threads with
an empty comm value in /proc/${pid}/stat:
htop: Process.c:1159: void Process_updateCmdline(Process *, const char *, int, int): Assertion `(cmdline && basenameStart < (int)strlen(cmdline)) || (!cmdline && basenameStart == 0)' failed.
The specific task is:
comm=''
exe='(null)'
cmdline='/usr/bin/ruby /usr/bin/how-can-i-help --apt'
So basenameStart is 0, while strlen(cmdline) is also 0.
- fix header width of IO_READ_RATE
- save data in bytes (not kilobytes) to better compute rate
- fix rate data: multiply with 1000 to compensate time difference in
milliseconds
- rename unit less variable now into realtimeMs
- use Process_printBytes(..., data * pageSize, ...) instead of
Process_printKBytes(..., data * pageSizeKB, ...) to avoid wrapper
* Rename internal identifier from TTY_NR to just TTY
* Unify column header on platforms
* Use devname(3) on BSD derivate to show the actual terminal,
simplifies current FreeBSD implementation.
* Use 'unsigned long int' as id type, to fit dev_t on Linux.
Only on Solaris the terminal path is not yet resolved.
Refactor the sample time code to make one call to gettimeofday
(aka the realtime clock in clock_gettime, when available) and
one to the monotonic clock. Stores each in more appropriately
named ProcessList fields for ready access when needed. Every
platform gets the opportunity to provide their own clock code,
and the existing Mac OS X specific code is moved below darwin
instead of in Compat.
A couple of leftover time(2) calls are converted to use these
ProcessList fields as well, instead of yet again sampling the
system clock.
Related to https://github.com/htop-dev/htop/pull/574
The end goal is to consolidate all the points in htop that can only work in
live-only mode today, so that will be able to inject PCP archive mode and have
a chance at it working.
The biggest problem we've got at this moment is all the places that are
independently asking the kernel to 'give me the time right now'.
Each of those needs to be audited and ultimately changed to allow platforms to
manage their own idea of time.
So, all the calls to gettimeofday(2) and time(2) are potential problems.
Ultimately I want to get these down to just one or two.
Related to https://github.com/htop-dev/htop/pull/574
This prefers the `#if defined()` syntax over the `#ifdef` variant
whenever there's also a `#elif defined()` clause, thus making the
multiple branching structure more obvious and the overall use
more consistent.
Combine reading CPU count and CPU usage, only open the file once.
Do not separately initialize totalPeriod and totalTime, cause the value
0 is handled in Platform_setCPUValues().
Take the number of currently running process from the entry
procs_running in /proc/stat instead of counting all scanned process
with state 'R', to include hidden tasks, e.g. threads.
Use similar calculation than procps.
Show AvailableMemory in text mode.
Use total minus available memory instead of manually computed used-
memory as fraction part in bar mode (if available).
pgrp and session might be -1
linux/LinuxProcessList.c:312:20: runtime error: implicit conversion from type 'unsigned long' of value 18446744073709551615 (64-bit, unsigned) to type 'unsigned int' changed the value to 4294967295 (32-bit, unsigned)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior linux/LinuxProcessList.c:312:20 in
linux/LinuxProcessList.c:314:23: runtime error: implicit conversion from type 'unsigned long' of value 18446744073709551615 (64-bit, unsigned) to type 'unsigned int' changed the value to 4294967295 (32-bit, unsigned)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior linux/LinuxProcessList.c:314:23 in
- avoid UBSAN conversions
- print N/A on no data (i.e. as unprivileged user)
- fix rate calculation to show bytes (instead of a thousandth)
- print bytes as human number (i.e. 8MB) instead of 8388608
- stabilize sorting by adjusting NAN values to very tiny negative number