The base image for the "check-tree" test got bumped to Fedora 39. This
brings a new python-black version (23.7.0 vs. 22.8.0) and requires
reformatting.
Maybe we should stick to 22.8.0, via `pip install`. But it seems better
to just follow the latest black version (the one from current Fedora).
So do the reformatting instead.
https://black.readthedocs.io/en/stable/change_log.html#id38
Let's not have unexpected, non-thread-safe functions somewhere deep down.
NML3ConfigData -- as a data structure -- is not thread-safe, nor aims it to
be. However, our code(!) should be thread-safe. That means, it should be
possible to call our code on separate data from multiple threads.
Violating that is a code smell and a foot gun.
This basically means that code should not access global data (unless
thread-local) or that the access to global-data needs to be
synchronized.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1806
- after modifying .gitlab-ci, the template must be regenerated by
running `ci-fairy generate-template`
- when swapping tier1 from f38 to f39, the list in tier2 must be updated
too.
Fixes: e2f04f0d2cc3 ('device: fix generated 'wifi.cloned-mac-address="stable-ssid"' for stable-id')
As Fedora 39 is officially out, we should use it in our gitlab-ci.
Please notice that this change will also update the
nm-code-format-container.sh to use Fedora 39 aswell.
This patch add support to HSR/PRP interface. Please notice that PRP
driver is represented as HSR too. They are different drivers but on
kernel they are integrated together.
HSR/PRP is a network protocol standard for Ethernet that provides
seamless failover against failure of any network component. It intends
to be transparent to the application. These protocols are useful for
applications that request high availability and short switchover time
e.g electrical substation or high power inverters.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1791
Dynamic added routes i.e ECMP single-hop routes, are not managed by l3cd
as the other ones. Therefore, they need to be tracked properly and
marked as dirty when they are.
Without this, the state of those ECMP single hop routes is not properly
tracked, and they are for example not removed by NML3Cfg when they
should.
Usually, routes to be configured originate from the combined
NML3ConfigData and are resolved early during a commit. For example,
_obj_states_update_all() creates for each such route an obj_state_hash
entry. Let's call those static, or "non-dynamic".
Later, we can receive additional routes. We get them back from NMNetns
during nm_netns_ip_route_ecmp_commit() (_commit_collect_routes()).
Let's call them "dynamic".
For those routes, we also must track an obj-state.
Now we have two reasons why an obj-state is tracked. The "non-dynamic"
and the "dynamic" one. Add two flags "os_dynamic" and "os_non_dynamic"
to the ObjStateData and consider the flags at the necessary places.
Co-authored-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
When a single-hop route is merged with another single-hop route from a
different interface creating a new ECMP route, NetworkManager needs to
schedule a commit on the l3cfg that is managing the first single-hop
route. Otherwise, the single-hop will continue to be there until a new
commit is scheduled.
Fixes: d9d33e2acc ('netns: fix configuring onlink routes for ECMP routes')
nm_netns_ip_route_ecmp_commit() returns the ECMP routes that didn't find
a merge-partner. It expects NML3Cfg to configure such routes in
platform.
Note that these routes have a positive "weight", which was used for
tracking the ECMP information in NML3Cfg and NMNetns.
But in kernel, single-hop routes don't have a weight. All routes in
NMPlatform will have a zero weight. While it somewhat works to call
nm_platform_ip_route_add() with a single-hop route of non-zero weight,
the result will be a different(!) route. This means for example that
nm_platform_ip_route_sync() will search the NMPlatform cache for whether
the routes exist, but in the cache single-hop routes with positive weight
never exist. Previously this happened and nm_platform_ip_route_sync()
would not delete those routes when it should.
It's also important because NML3Cfg tracks the object states of routes
that it wants to configure, vs routes in kernel (NMPlatform cache). The
route in kernel has a weight of zero. The route we want to configure
must also have a weight of zero.
The solution is that nm_netns_ip_route_ecmp_commit() does not tell
NML3Cfg to add a route, that cannot be added. Instead, it must first
normalize the weight to zero, to have a "regular" single-hop route.
Fixes: 5b5ce42682 ('nm-netns: track ECMP routes')
It wouldn't work otherwise. The object state is used to track routes
and compare them to what we find in platform.
A "metric_any" is useful at higher layers, to track a route where the
metric is decided by somebody else. But at the point when we add such an
object to the object-state, a fixed metric must be chosen.
Assert for that.
The hash/cmp functions were wrong with respect to IPv4 routes. Fix that.
- the weight was cast to a guint8, which is too small to hold all
values.
- NM_PLATFORM_IP_ROUTE_CMP_TYPE_ID comparison would normalize a zero
weight to one
NM_CMP_DIRECT(NM_MAX(a->weight, 1u), NM_MAX(b->weight, 1u));
That was very wrong.
- the handling of the weight depends on the n_nexthops parameter.
See _ip4_route_weight_normalize().
The remarkable thing is that upper layers find it useful to track IPv4
single-hop routes with a non-zero weight. Consequently, this is honored
by NM_PLATFORM_IP_ROUTE_CMP_TYPE_ID to treat single-hop routes
different, when their weight differs. However, adding such a route in
kernel will not work. To kernel, single-hop routes have no weight. While
the route exists as distinct in our hash tables, according to the
implemented identity, it never exists in kernel (or NMPlatform cache).
Well, you can call nm_platform_ip_route_add() with such a route, but the
result will have a weight of zero (making it a different route). See also
nm_platform_ip_route_normalize().
This works all mostly fine. The only thing to take care is that you
cannot look into the NMPlatform cache and ever find a IPv4 single-hop
route with a non-zero weight. If you preform such a lookup, realize that
such routes don't exist in platform. You can however normalize the
weight to zero first (nm_platform_ip_route_normalize()) and look for a
similar route with a zero weight.
Fixes: 1bbdecf5e1 ('platform: manage ECMP routes')
Users should not be allowed to start or stop a wifi-p2p scan unless
they have some kind of permission. Since we already have the
"org.freedesktop.NetworkManager.wifi.scan" permission for wifi scans,
check that.
Fixes: dd0c59c468 ('core/devices: Add DBus methods to start/stop a P2P find')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1795
It also doesn't make much sense to have this. We may use a
GPtrArray to construct and keep track of a (dynamic) strv list.
Then we add the strings to the GPtrArray one by one.
We almost never will want to create a GPtrArray based on a strv array.
It doesn't scale to export all addresses/routes on D-Bus as properties.
In particular not combined with PropertiesChanged signal. On a busy
system, this causes severe performance issues. It also doesn't seem very
useful. Routes and addresses are complex things (e.g. policy routing).
If you want to do anything serious, you must check netlink (or find
another way to get the information).
Note that NMPlatform already ignores routes of certain protocols
(ip_route_is_alive()). It also does not expose most route attributes,
making the output only useful for very limited cases (e.g. displaying to
the user for information).
Limit the number of exported entries to 100.
Try adding 100K routes one-by-one. Run a `nmcli monitor` instance.
Re-nice the nmcli process and/or keep the CPUs busy. Then start a script
that adds 100k routes. Observe. Glib's D-Bus worker thread receives the
messages and queues them for the main thread. The main thread is too
slow to process them, the memory consumption grows very quickly in Giga
bytes. Afterwards, the memory also is not returned to the operation
system, either because of fragmentation or because the libc allocator
does anyway not return heap memory.
It doesn't work to expose an unlimited number of objects on D-Bus. At
least not with an API, that sends the full list of all routes, whenever
a route changes. Nobody can use that feature either, because the only
use is a quick overview in `nmcli` output or a GUI. If you see 100+
routes there, that becomes unmanageable anyway. Instead use netlink if
you want to handle the full list of addresses/routes (or some other
API).
It doesn't scale. If you add 100k routes one-by-one, then upon each
change from platform, we will send the growing list of the routes on
D-Bus.
That is too expensive. Especially, if you imagine that the receiving end
is a NMClient instance. There is a D-Bus worker thread that queues the
received GVariant messages, while the main thread may not be able to
process them fast enough. In that case, the memory keeps growing very
fast and due to fragmentation it is not freed.
Instead, rate limit updates to 3 per second.
Note that the receive buffer of the netlink socket can fill up and we
loose messages. Therefore, already on the lowest level, we may miss
addresses/routes. Next, on top of NMPlatform, NMIPConfig listens to
NM_L3_CONFIG_NOTIFY_TYPE_PLATFORM_CHANGE_ON_IDLE events. Thereby it
further will miss intermediate state (e.g. a route that exists only for
a short moment).
Now adding another delay and rate limiting on top of that, does not make
that fundamentally different, we anyway didn't get all intermediate states
from netlink. We may miss addresses/routes that only exist for a short
amount of time. This makes "the problem" worse, but not fundamentally
new.
We can only get a (correct) settled state, after all events are
processed. And we never know, whether there isn't the next event
just waiting to be received.
Rate limiting is important to not overwhelm D-Bus clients. In reality,
none of the users really need this information, because it's also
incomplete. Users who really need to know addresses/routes should use
netlink or find another way (a way that scales and where they explicitly
request this information).
If you add a large number of addresses/routes, then the output of
`nmcli` is unusable. It also doesn't seem too useful.
Limit the number to show up to 10 addresses and 10 routes.
If there are more than 10 addresses, then print an 11th line with
inet4 ... N more
Actually, if there are exactly 11 addresses, then don't waste an extra
line to print "1 more". Instead, still print the 11th address. Same for
routes.
The user can set "$NMTST_LIBTOOL" to empty or the path to libtool.
If unset, we want to detect it. However, previously we would always
use "$SCRIPT_PATH/../libtool", even if that file doesn't exit. Improve
that.
For example, when running on Ubuntu with the ".gitlab-ci/debian-install.sh"
script, we don't have libtool. Entering such a container and running the
"run-nm-test.sh" script will fail as "$SCRIPT_PATH/../libtool" doesn't
exist.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1800
This was forgotten to implement. But we cannot just forget about it.
Libnm emits a warning about unknown properties, exactly to catch such
bugs. Properties that are not implemented, must be marked to be ignored.
Next, support for this property will be added. But that introduces new
API, which cannot be backported. Hence, first fix the problem by marking
the property as ignored. This is a backportable change.
$ LIBNM_CLIENT_DEBUG="warning" G_DEBUG=fatal-warnings nmcli
(process:270215): nm-WARNING **: 15:22:56.125: libnm-dbus: <warn > nmclient[8094a8c217aae461]: get-managed-objects: [/org/freedesktop/NetworkManager/Devices/5]: ignore unknown property org.freedesktop.NetworkManager.Device.IPTunnel.FwMark
Trace/breakpoint trap (core dumped)
Fixes: 351c562491 ('devices: support VTI tunnels')
This commit fixes the build process for the documentation that was previously
unable to build separately via meson due to a dependency issue.
Previously, trying to build the API documentation via `ninja NetworkManager-doc`
failed due to missing dependencies (for example, `nm-dbus-types.xml` was not built).
I believe this happens due to some different handling of static paths vs. custom_target
by meson in this case.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1801
Fixes: 03637ad8b5 ('build: add initial support for meson build system')
README.md:
As this document is mainly for normal users, and in Gitlab it's
displayed by default, it's the main entry point to get information when
someone get into the repo, either via web or cloned.
Added explanations about how to install and use NM and how to find the
documentation for users. Added brief info about how to report issues
and ask for help, with links to CONTRIBUTING.md to get more details.
Added brief link for potential contributors to read CONTRIBUTING.md.
People not familiar with open source projects might not be aware of it.
Deleted the "moved to freedesktop" message. After 15 years, people might
know yet.
Added brief explanation about the free software license.
CONTRIBUTING.md:
Added a link to the list of all communication channels, only mailing
list and IRC were listed.
Added detailed explanation about how to report issues and attach logs.
It also references the new tool anonymize-logs.py.
Added brief guidelines about how to start contributing choosing issues
from the tracker.
Fixed some small formatting issues and added a reference to nm-in-vm,
fixing the link to nm-in-container too.
MAINTAINERS.md:
Added explanation about how to triage and properly label the issues.
This is basically based on the kind-of-workflow that we already have,
but maybe it's a good time to check that it's correct or to propose
small changes (so, please propose changes in review).
Script to do some anonymization to NetworkManager logs. It does
very basic stuff so it shouldn't be trusted without manually
reviewing the logs, but it can still be useful to replace lot
of potentially sensitive data.
What it masks by default:
- MAC addresses
- Public IP addresses
- Hostnames detected from `hostname` command and some known
log messages from NM.
- Hostnames ending in some common domains such as .com or .org
- Hostnames specified via --hostname argument
What it can mask but it doesn't by default:
- Private IPs
Options like --show-macs and --hide-private-ips can override the default
behaviour.
Note that masking IP addresses can make difficult to analyze routing
problems, and trying to be smart analyzing the defined routes from the
logs or from `ip route` can lead to even worse results. Because of this,
if routing problems need to be analyzed, --show-public-ips need to be
passed.
For most strv or string properties, we cannot distinguish between
NULL/unset/default and empty.
It would be difficult to enter in nmcli or grasp how it differs. There
are probably many bugs, where we accept empty strings, and fail to
handle them correctly.
Anyway. For most strv arrays, and empty array and NULL/unset/default are
treated the same. That means, g_object_get() tends to always return NULL
(never an empty strv array) and g_object_set() of an empty strv array
will internally leave the GArray at NULL.
For a few properties, there is a difference. See "ipv[46].dns-options".
See also "clear_emptyunset_fcn" hook in libnm-setting.
Add a way to handle such strv properties with the "direct" mechanism.
Unfortunately, there are several possibilities how to handle NULL and
empty arrays. Therefore we have different variants.
Clean this up, and add a way to preserve whether the array is empty
(previous variants could not distinguish that).
Functions are also renamed, so that if you backport a user of the new
API, you'll get a compiler error if this patch is missing.
Also, nm_strvarray_get_strv_notnull() no longer takes a pointer to a
"GArray*". Previously, it used that to fake an empty strv array. Now
this returns NM_STRV_EMPTY_CC().
_nm_utils_dns_option_validate() allows specifying the address family,
and filters based on that. Note that all options are valid for IPv6,
but some are not valid for IPv4.
It's not obvious, that such filtering is only performed if
"option_descs" argument is provied. Otherwise, the "ipv6" argument is
ignored.
Regardless, it's also confusing to have a boolean "ipv6". When most
callers don't want a filtering based on the address family. They
actually don't want any filtering at all, as they don't pass an
"option_descs". At the same time passing a TRUE/FALSE "ipv6" is
redundant and ignored. It should be possible, to explicitly not select
an address family (as it's ignored anyway).
Instead, make the "gboolean ipv6" argument an "int addr_family".
Selecting AF_UNSPEC means clearly to accept any address family.
This valgrind error is raised by Ubuntu 23:10, with valgrind 1:3.21.0-0ubuntu1 and
glibc 2.38-1ubuntu6:
==141967== Source and destination overlap in memcpy_chk(0x1ffefffe98, 0x1ffefffe92, 8)
==141967== at 0x4851042: __memcpy_chk (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==141967== by 0x502A9CC: memmove (string_fortified.h:36)
==141967== by 0x502A9CC: inet_pton6 (inet_pton.c:226)
==141967== by 0x502A9CC: __inet_pton_length (inet_pton.c:56)
==141967== by 0x502A9CC: inet_pton (inet_pton.c:69)
==141967== by 0x114FA3: nmtst_inet6_from_string_p (nm-test-utils.h:1764)
==141967== by 0x114FA3: _nmtst_assert_ip6_address (nm-test-utils.h:1856)
==141967== by 0x114FA3: test_stable_privacy (test-utils.c:26)
==141967== by 0x4DEC85D: ??? (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7800.0)
==141967== by 0x4DEC78A: ??? (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7800.0)
==141967== by 0x4DECD2A: g_test_run_suite (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7800.0)
==141967== by 0x4DECE07: g_test_run (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7800.0)
==141967== by 0x4F070CF: (below main) (libc_start_call_main.h:58)
==141967==
{
<insert_a_suppression_name_here>
Memcheck:Overlap
fun:__memcpy_chk
fun:memmove
fun:inet_pton6
fun:__inet_pton_length
fun:inet_pton
fun:nmtst_inet6_from_string_p
fun:_nmtst_assert_ip6_address
fun:test_stable_privacy
obj:/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7800.0
obj:/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.7800.0
fun:g_test_run_suite
fun:g_test_run
fun:(below main)
}
This is because memmove() can call __memcpy_chk(), which triggers the
false positive in valgrind.
It probably affects other places too, but for us it's sufficient to
restrict the valgrind suppression to calls from inet_pton(). This
suppression will probably break with LTO enabled.
See-also: https://bugs.kde.org/show_bug.cgi?id=402833https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1802
Instead of doing the broken `podman run` and `podman start` approach,
build an image ("nm-code-format:f38"), cache it, and use it to run
"nm-code-format.sh" via `podman run`. We should build and keep a
container image, not a container.
The benefit is that this allows to hand over the command line arguments
to "nm-code-format.sh". In particular the "-u" and "-F" options, which
are life savers.
This means,
$ contrib/scripts/nm-code-format-container.sh -u
works.
Try also
$ contrib/scripts/nm-code-format-container.sh -h
which tells you that you are running inside the container, and how to
delete/renew the container image.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1798
When connection down is explicitly called on the controller, the port
connection should also be deactivated with the reason user-requested,
otherwise any following connection update on the controller profile
will unblock the port connection and unnessarily make the port to
autoconnet again.
Fixes: 645a1bb0ef ('core: unblock autoconnect when master profile changes')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager-ci/-/merge_requests/1568