We have various options for strerror(), with ups and downsides:
- strerror()
- returns pointer that is overwritten on next call. It's convenient
to use, but dangerous.
- not thread-safe.
- not guaranteed to be UTF-8.
- strerror_r()
- takes input buffer and is less convenient to use. At least, we
are in control of when the buffer gets overwritten.
- there is a Posix/XSI and a glibc variant, making it sligthly
inconvenient to used. This could be solved by a wrapper we implement.
- thread-safe.
- not guaranteed to be UTF-8.
- g_strerror()
- convenient and safe to use. Also the buffer is never released for the
remainder of the program.
- passing untrusted error numbers to g_strerror() can result in a
denial of service, as the internal buffer grows until out-of-memory.
- thread-safe.
- guaranteed to be UTF-8 (depending on locale).
Add our own wrapper nm_strerror_native(). It is:
- convenient to use (returning a buffer that does not require
management).
- slightly dangerous as the buffer gets overwritten on the next call
(like strerror()).
- thread-safe.
- guaranteed to be UTF-8 (depending on locale).
- doesn't keep an unlimited cache of strings, unlike g_strerror().
You can't have it all. g_strerror() is leaking all generated error messages.
I think that is unacceptable, because it would mean we need to
keep track where our error numbers come from (and trust libraries we
use to only set a restricted set of known error numbers).
The native error numbers (from <errno.h>) and our nmerr extention on top
of them are almost the same. But there are peculiarities.
Both errno and nmerr must be positive values. That is because some API
(systemd) like to return negative error codes. So, a positive errno and
its negative counter part indicate the same error. We need normalization
functions that make an error number positive (these are nm_errno() and
nm_errno_native()).
This means, G_MININT needs special treatment, because it cannot be
represented as a positive integer. Also, zero needs special
treatment, because we want to encode an error, and zero already encodes
no-error. Take care of these special cases.
On top of that, nmerr reserves a range within native error numbers for
NetworkManager specific failure codes. So we need to transition from native
numbers to nmerr numbers via nm_errno_from_native().
Take better care of some special cases and clean them up.
Also add NM_ERRNO_NATIVE() macro. While nm_errno_native() coerces a
value in the suitable range, NM_ERRNO_NATIVE() asserts that the number
is already positive (and returns it as-is). It's use is only for
asserting and implicitly documenting the requirements we have on the
number passed to it.
Platform had it's own scheme for reporting errors: NMPlatformError.
Before, NMPlatformError indicated success via zero, negative integer
values are numbers from <errno.h>, and positive integer values are
platform specific codes. This changes now according to nm-error:
success is still zero. Negative values indicate a failure, where the
numeric value is either from <errno.h> or one of our error codes.
The meaning of positive values depends on the functions. Most functions
can only report an error reason (negative) and success (zero). For such
functions, positive values should never be returned (but the caller
should anticipate them).
For some functions, positive values could mean additional information
(but still success). That depends.
This is also what systemd does, except that systemd only returns
(negative) integers from <errno.h>, while we merge our own error codes
into the range of <errno.h>.
The advantage is to get rid of one way how to signal errors. The other
advantage is, that these error codes are compatible with all other
nm-errno values. For example, previously negative values indicated error
codes from <errno.h>, but it did not entail error codes from netlink.
This will be our extension on top of <errno.h>.
We want to use (integer) error numbers, that can both
contain native errors from <errno.h> and our own defines,
both merge in one domain. That is, we will reserve a small
range of integers for our own defines (that hopefully won't
clash with errors from <errno.h>).
We can use this at places where GError is too cumbersome to use.
The advantage is, that our error numbers extend <errno.h> and can
be mixed.
This is what "src/platform/nm-netlink.h" already does with nl_errno(). Next,
the netlink errors from there will be merged into "nm-errno.h".
Also, platform has NMPlatformError, which are a distinct set of error
numbers. But these work differently in the sense that negative values
represent codes from <errno.h> and positive numbers are our own platform
specific defines. NMPlatformError will also be merged into "nm-errno.h".
"nm-errno.h" will unify the error handling of platform and netlink,
making it more similar to what we are used to from systemd, and give
room to extend it for our own purpose.