When the logging level is DEBUG or TRACE, we keep all the sysctl
values we read in a cache to log how they change. Currently there is
no limit on the size of this cache and it can take a large amount of
memory.
Implement a LRU cache where the oldest entries are deleted to make
space for new ones.
https://github.com/NetworkManager/NetworkManager/pull/294
Previously, _nm_logging_clear_platform_logging_cache was an extern variable,
and NMLinuxPlatform would set it to a function pointer at certain points.
That's unnecessary complex, also when trying to make nm-logging thread-safe,
it's just more global variables that need to be considered. Don't do it
that way, but just link in a regular function.
Since commit 9ecdba316 ('platform: create netlink messages directly
without libnl-route-3') we're unconditionally setting IFA_ADDRESS to
the peer address, even if there's no peer and it's all zeroes.
The kernel actually stopped caring somewhere around commit caeaba790
('ipv6: add support of peer address') in v3.10, but Ubuntu Touch likes
to run Android's v3.4 on some poorly supported hardware.
Fixes: 9ecdba316chttps://gitlab.freedesktop.org/NetworkManager/NetworkManager/merge_requests/77
The caller may not wish to replace existing peers, but only update/add
the peers explicitly passed to nm_platform_link_wireguard_change().
I think that is in particular interesting, because for the most part
NetworkManager will configure the same set of peers over and over again
(whenever we resolve the DNS name of an IP endpoint of the WireGuard
peer).
At that point, it seems disruptive to drop all peers and re-add them
again. Setting @replace_peers to %FALSE allows to only update/add.
Add cmp/hash functions that correctly honor the well known fields, instead
of doing memcmp/memcpy of the entire sockaddr structure.
Also, move the set function to nm_sock_addr_union_cpy() and
nm_sock_addr_union_cpy_untrusted(). This also gets it right
to ensure all bytes of the union are initialized (to zero).
We need more information what failed. Don't only return success/failure,
but an error number.
Note that we still don't actually return an error number. Only
the link_add() function is changed to return an nm-error integer.
Platform had it's own scheme for reporting errors: NMPlatformError.
Before, NMPlatformError indicated success via zero, negative integer
values are numbers from <errno.h>, and positive integer values are
platform specific codes. This changes now according to nm-error:
success is still zero. Negative values indicate a failure, where the
numeric value is either from <errno.h> or one of our error codes.
The meaning of positive values depends on the functions. Most functions
can only report an error reason (negative) and success (zero). For such
functions, positive values should never be returned (but the caller
should anticipate them).
For some functions, positive values could mean additional information
(but still success). That depends.
This is also what systemd does, except that systemd only returns
(negative) integers from <errno.h>, while we merge our own error codes
into the range of <errno.h>.
The advantage is to get rid of one way how to signal errors. The other
advantage is, that these error codes are compatible with all other
nm-errno values. For example, previously negative values indicated error
codes from <errno.h>, but it did not entail error codes from netlink.
While nm_utils_inet*_ntop() accepts a %NULL buffer to fallback
to a static buffer, don't do that.
I find the possibility of using a static buffer here error prone
and something that should be avoided. There is of course the downside,
that in some cases it requires an additional line of code to allocate
the buffer on the stack as auto-variable.
I think this is preferred over memset(), because it allows the
compiler to better unstand what is happening.
Also, strictly speaking in the C language, %NULL pointers are not
guaranteed to have an all zero bit pattern. Of course, that is already
required on any architecture where NetworkManager is running.
- previously, parsing wireguard genl data resulted in memory corruption:
- _wireguard_update_from_allowedips_nla() takes pointers to
allowedip = &g_array_index (buf->allowedips, NMWireGuardAllowedIP, buf->allowedips->len - 1);
but resizing the GArray will invalidate this pointer. This happens
when there are multiple allowed-ips to parse.
- there was some confusion who owned the allowedips pointers.
_wireguard_peers_cpy() and _vt_cmd_obj_dispose_lnk_wireguard()
assumed each peer owned their own chunk, but _wireguard_get_link_properties()
would not duplicate the memory properly.
- rework memory handling for allowed_ips. Now, the NMPObjectLnkWireGuard
keeps a pointer _allowed_ips_buf. This buffer contains the instances for
all peers.
The parsing of the netlink message is the complicated part, because
we don't know upfront how many peers/allowed-ips we receive. During
construction, the tracking of peers/allowed-ips is complicated,
via a CList/GArray. At the end of that, we prettify the data
representation and put everything into two buffers. That is more
efficient and simpler for user afterwards. This moves complexity
to the way how the object is created, vs. how it is used later.
- ensure that we nm_explicit_bzero() private-key and preshared-key. However,
that only works to a certain point, because our netlink library does not
ensure that no data is leaked.
- don't use a "struct sockaddr" union for the peer's endpoint. Instead,
use a combintation of endpoint_family, endpoint_port, and
endpoint_addr.
- a lot of refactoring.
Move NMLinuxPlatformPrivate earlier.
In the past, I moved the declaration of NMLinuxPlatformPrivate
after utility functions which are independent from platform
instance.
However, parsing netlink messages actually requires
NMLinuxPlatformPrivate, because we want to access the "genl"
socket.
So, move the types to the beginning of the file, like we do
for most other source files.
The _lookup_cached_link() function, should not skip over links which are
currently in the cache, but not in netlink. Instead, let the callers
skip them, as they see fit.
No change in behavior, because the few callers now explicitly check
for this.
- drop "goto error_label" in favor of returning right away.
At most places, there was no need to do any cleanup or
the cleanup is handled via nm_auto().
- adjust the return types of wireguard functions to return
a boolean success/failure, instead of some error code which
we didn't use.
- the change to _wireguard_get_link_properties() is intentional.
This was wrong previously, because in _wireguard_get_link_properties()
obj is always a newly created instance, and never has a genl
family ID set. This will be improved later.
When reading a file, we may allocate intermediate buffers (realloc()).
Also, reading might fail halfway through the process.
Add a new flag that makes sure that this memory is cleared. The
point is when reading secrets, that we don't accidentally leave
private sensitive material in memory.
Using '#ifdef' is generally error prone. It's better to always define
a define and check for it explicitly. This way, the compiler can issue
a warning if the define does not exist.
Also, note how meson would always define NM_MORE_LOGGING, possibly to
"0". That means, for meson, we unintentionally always enabled more
logging because the define was always present.
Fix that.
These should be logged on DEBUG level:
<warn> platform-linux: do-change-link[2]: failure changing link: failure 97 (Address family not supported by protocol)
<warn> device (wlo1): failed to enable userspace IPv6LL address handling (unspecified)
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/issues/10
We commonly don't use the glib typedefs for char/short/int/long,
but their C types directly.
$ git grep '\<g\(char\|short\|int\|long\|float\|double\)\>' | wc -l
587
$ git grep '\<\(char\|short\|int\|long\|float\|double\)\>' | wc -l
21114
One could argue that using the glib typedefs is preferable in
public API (of our glib based libnm library) or where it clearly
is related to glib, like during
g_object_set (obj, PROPERTY, (gint) value, NULL);
However, that argument does not seem strong, because in practice we don't
follow that argument today, and seldomly use the glib typedefs.
Also, the style guide for this would be hard to formalize, because
"using them where clearly related to a glib" is a very loose suggestion.
Also note that glib typedefs will always just be typedefs of the
underlying C types. There is no danger of glib changing the meaning
of these typedefs (because that would be a major API break of glib).
A simple style guide is instead: don't use these typedefs.
No manual actions, I only ran the bash script:
FILES=($(git ls-files '*.[hc]'))
sed -i \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\>\( [^ ]\)/\1\2/g' \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\> /\1 /g' \
-e 's/\<g\(char\|short\|int\|long\|float\|double\)\>/\1/g' \
"${FILES[@]}"
Add platform support for IP6GRE and IP6GRETAP tunnels. The former is a
virtual tunnel interface for GRE over IPv6 and the latter is the L2
variant.
The platform code internally reuses and extends the same structure
used by IPv6 tunnels.
Certain platform operations are logged both in nm-platform.c and
nm-linux-platform.c, resulting in duplicate messages. Drop log prints
from the latter.
Add support for a new wireguard link type to the platform code. For now
this only covers querying existing links via genetlink and parsing them
into platform objects.