This commit is based on exploratory work by Sohaib Mohamed.
The end goal is two-fold - to support addition of Meters we
build via configuration files for both the PCP platform and
for scripts ( https://github.com/htop-dev/htop/issues/526 )
Here, we focus on generic code and the PCP support. A new
class DynamicMeter is introduced - it uses the special case
'param' field handling that previously was used only by the
CPUMeter, such that every runtime-configured Meter is given
a unique identifier. Unlike with the CPUMeter this is used
internally only. When reading/writing to htoprc instead of
CPU(N) - where N is an integer param (CPU number) - we use
the string name for each meter. For example, if we have a
configuration for a DynamicMeter for some Redis metrics, we
might read and write "Dynamic(redis)". This identifier is
subsequently matched (back) up to the configuration file so
we're able to re-create arbitrary user configurations.
The PCP platform configuration file format is fairly simple.
We expand configs from several directories, including the
users homedir alongside htoprc (below htop/meters/) and also
/etc/pcp/htop/meters. The format will be described via a
new pcp-htop(5) man page, but its basically ini-style and
each Meter has one or more metric expressions associated, as
well as specifications for labels, color and so on via a dot
separated notation for individual metrics within the Meter.
A few initial sample configuration files are provided below
./pcp/meters that give the general idea. The PCP "derived"
metric specification - see pmRegisterDerived(3) - is used
as the syntax for specifying metrics in PCP DynamicMeters.
Reading and parsing /proc/<pid>/maps is quite expensive.
Do not check for deleted libraries if the main binary has been deleted;
in this case the deleted binary takes precedence.
Do not check in threads. The check is void for kernel threads and user-
land threads can just inherit the state from the main process structure.
O_PATH is available since Linux 2.6.39, but we are using fstat(2) on the
returned file descriptor in LinuxProcessList_statProcessDir(), which
is only supported since Linux 3.6.
Fixes#534
Shared libraries can be replaced by an upgrade, highlight processes
using deleted shared libraries.
Link with highlightDeletedExe setting, enabled by default.
Currently only checked on Linux.
Add process columns showing the elapsed time since the process was
started.
Similar to STARTTIME, but shows the time passed since the process start
instead of the fixed start time of the process.
Closes https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=782636
The following assert failure might happen on short running threads with
an empty comm value in /proc/${pid}/stat:
htop: Process.c:1159: void Process_updateCmdline(Process *, const char *, int, int): Assertion `(cmdline && basenameStart < (int)strlen(cmdline)) || (!cmdline && basenameStart == 0)' failed.
The specific task is:
comm=''
exe='(null)'
cmdline='/usr/bin/ruby /usr/bin/how-can-i-help --apt'
So basenameStart is 0, while strlen(cmdline) is also 0.
- fix header width of IO_READ_RATE
- save data in bytes (not kilobytes) to better compute rate
- fix rate data: multiply with 1000 to compensate time difference in
milliseconds
- rename unit less variable now into realtimeMs
- use Process_printBytes(..., data * pageSize, ...) instead of
Process_printKBytes(..., data * pageSizeKB, ...) to avoid wrapper
Make functions formatting data for a process field column less error
prone, unify interfaces and improve some internals.
* Process_printBytes
- rename from Process_humanNumber
- take number in bytes, not kilobytes
- handle petabytes
- increase buffer to avoid crashes when the passed value is
~ ULLONG_MAX
* Process_printKBytes
- add wrapper for Process_printBytes taking kilobytes keeping -1 as
special value
* Process_printCount
- rename from Process_colorNumber
* Process_printTime
- add coloring parameter as other print functions
- improve coloring and formatting for larger times
* Process_printRate
- rename from Process_outputRate
- use local buffer instead of passed one; this function prints to the
RichString after all
`RichString_appendWide()` is more expensive than
`RichString_appendAscii()` due to the calls to `mbstowcs(3)` and
`iswprint(3)`.
Use the latter to print the process field buffer by default.
For the following fields this theoretically can corrupt the output:
- SECATTR
- CGROUP
- CTID
`RichString_appendnAscii()` avoids a `strlen(3)` call over
` RichString_appendAscii()`.
Use the former where the length is available from a previous checked
`snprintf(3)` call.
Keep `RichString_appendAscii()` when passing a string literal and
rely on compilers to optimize the `strlen(3)` call away.
* Rename internal identifier from TTY_NR to just TTY
* Unify column header on platforms
* Use devname(3) on BSD derivate to show the actual terminal,
simplifies current FreeBSD implementation.
* Use 'unsigned long int' as id type, to fit dev_t on Linux.
Only on Solaris the terminal path is not yet resolved.
Refactor the sample time code to make one call to gettimeofday
(aka the realtime clock in clock_gettime, when available) and
one to the monotonic clock. Stores each in more appropriately
named ProcessList fields for ready access when needed. Every
platform gets the opportunity to provide their own clock code,
and the existing Mac OS X specific code is moved below darwin
instead of in Compat.
A couple of leftover time(2) calls are converted to use these
ProcessList fields as well, instead of yet again sampling the
system clock.
Related to https://github.com/htop-dev/htop/pull/574
The end goal is to consolidate all the points in htop that can only work in
live-only mode today, so that will be able to inject PCP archive mode and have
a chance at it working.
The biggest problem we've got at this moment is all the places that are
independently asking the kernel to 'give me the time right now'.
Each of those needs to be audited and ultimately changed to allow platforms to
manage their own idea of time.
So, all the calls to gettimeofday(2) and time(2) are potential problems.
Ultimately I want to get these down to just one or two.
Related to https://github.com/htop-dev/htop/pull/574
The libcap code is Linux-specific so move it all below
the linux/ platform subdirectory. As this feature has
custom command-line long options I provide a mechanism
whereby each platform can add custom long options that
augment the main htop options. We'll make use this of
this with the pcp/ platform in due course to implement
the --host and --archive options there.
Related to https://github.com/htop-dev/htop/pull/536
This prefers the `#if defined()` syntax over the `#ifdef` variant
whenever there's also a `#elif defined()` clause, thus making the
multiple branching structure more obvious and the overall use
more consistent.
Do not read driver depended labels, just count the number of
temperatures given:
on #CPU:
platform temp = max cpu temp
CPU temps = first to last
on #CPU + 1:
platform temp = first temp
CPU temps = second to last
on #CPU / 2:
platform temp = max cpu temp
CPU temps = first to last concat first to last
(with SMT core x + cpu count is the logical core of the physical
core x)
on #CPU / 2 + 1:
platform temp = first temp
CPU temps = second to last concat second to last
(with SMT core x + cpu count is the logical core of the physical
core x)
Closes: #529Closes: #538
Combine reading CPU count and CPU usage, only open the file once.
Do not separately initialize totalPeriod and totalTime, cause the value
0 is handled in Platform_setCPUValues().
Take the number of currently running process from the entry
procs_running in /proc/stat instead of counting all scanned process
with state 'R', to include hidden tasks, e.g. threads.
Code that is shared across some (but not all) platforms
is moved into a 'generic' home. Makefile.am cleanups to
match plus some minor alphabetic reordering/formatting.
As discussed in https://github.com/htop-dev/htop/pull/553
Several of our newer meters have merged coding concerns in terms
of extracting values and displaying those values. This commit
rectifies that for the SysArch and Hostname meters, allowing use
of this code with alternative front/back ends. The SysArch code
is also refined to detect whether the platform has an os-release
file at all and/or the sys/utsname.h header via configure.ac.
On Linux kernels the size of the values exported for block
device bytes has used a 64 bit integer for quite some time
(2.6+ IIRC). Make the procfs value extraction use correct
types and change internal types used to rate convert these
counters (within the DiskIO Meter) 64 bit integers, where
appropriate.
On Linux kernels the size of the values exported for network
device bytes and packets has used a 64 bit integer for quite
some time (2.6+ IIRC). Make the procfs value extraction use
correct types and change internal types used to rate convert
these counters (within the NetworkIO Meter) 64 bit integers,
where appropriate.
Use similar calculation than procps.
Show AvailableMemory in text mode.
Use total minus available memory instead of manually computed used-
memory as fraction part in bar mode (if available).
The State struct holds a pointer to the main process panel.
Use the distinct MainPanel type, to improve maintainability regrading
its usage.
This avoids usages of down-casts from Panel to MainPanel, only up-casts
from MainPanel to Panel are now required.
At start, SysArchMeter calls the uname function to obtain the kernel
version and architecture. If available, the distro version is obtained
by calling lsb_release. The obtained values are stored in static
variables and used when updating the meter.
pgrp and session might be -1
linux/LinuxProcessList.c:312:20: runtime error: implicit conversion from type 'unsigned long' of value 18446744073709551615 (64-bit, unsigned) to type 'unsigned int' changed the value to 4294967295 (32-bit, unsigned)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior linux/LinuxProcessList.c:312:20 in
linux/LinuxProcessList.c:314:23: runtime error: implicit conversion from type 'unsigned long' of value 18446744073709551615 (64-bit, unsigned) to type 'unsigned int' changed the value to 4294967295 (32-bit, unsigned)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior linux/LinuxProcessList.c:314:23 in
- avoid UBSAN conversions
- print N/A on no data (i.e. as unprivileged user)
- fix rate calculation to show bytes (instead of a thousandth)
- print bytes as human number (i.e. 8MB) instead of 8388608
- stabilize sorting by adjusting NAN values to very tiny negative number
If no terminal name can be found, fall back to generic display method
with major and minor device numbers.
Print special value '(none)' in case both are zero.
On some AMD and Intel CPUs read()ing scaling_cur_freq is quite slow
(> 1ms). This delay accumulates for every core.
If the read on CPU 0 takes longer than 500us bail out and fall back to
reading the frequencies from /proc/cpuinfo.
Once the condition has been met, bail out early for the next couple of
scans.
Closes: #471
According to the Linux kernel documentation, "SwapCached" tracks "memory
that once was swapped out, is swapped back in but still also is
in the swapfile (if memory is needed it doesn't need to be swapped out
AGAIN because it is already in the swapfile. This saves I/O)."
It is only used on Linux to optimize memory handling in case the command
changes to a smaller-or-equal string.
This "optimization" however causes more code bloat and maintenance cost
on string handling issues than it gains.