We have many static helper libraries, like libnm-glib-aux or libnm-core.
These can be statically linked in any end-binary as internal API. However, they
must only be linked once.
Also, we have various plugins (device, settings, ppp, wwan) which are
dlopened by NetworkManager. They should use the symbols from
NetworkManager core. It is important that they do not link with the
static libraries themselves, because NetworkManager core already links
with them, so these symbols would be duplicated.
As the symbols are internal, you might think that it is not a real
problem to duplicate them. However, there are also global variables,
like the hash tables for NMRefStr or the seed for NMHash. These global
variables must exist only once, and hence these symbols must not be
duplicated either.
Fix that by adding a new dependency that is for the core plugins. This
dependency only has "include_directories" but not "link_with".
Naming for us is hard, because everything is an "nm". However, let's
standardize on the term "core" for the daemon, and not "daemon".
Eventually I would like to move the daemon from "src/" to "src/core/",
so rename the dep in anticipation of that.
A timeout for tests should not be reached anyway. It's only
a fail-safe for not running indefinitely (and for meson not killing
the test too early). We don't need to run test-libnm with a shorter
timeout.
This is basically all of libnm as a static-library. The name
liblibnm isn't great. Arguably, we do have quite a lot of
libnmxyz, so finding a good name is hard. But libnm-static seems
a better name. Rename.
Drop some "helper" variables that are only used once. These variables
spread out what is defined, and only make the meson file more complicated
to follow.
OVS system interfaces can start to connect even before the ovsdb is
ready. However, the connection attempt is doomed to fail and the
NMSettingsConnection gets blocked with reason FAILED.
Unblock them once the ovsdb is ready.
Ideally, NMPolicy should subscribe to the NMOvsdb::ready signal,
however NMOvsdb is in a plugin, so it's easier if NMOvsdb directly
calls a function of the core.
Sometimes during boot the device starts an activation with reason
'assumed', then the OVS interface gets removed from the ovsdb. At
this point the device has an active-connection associated; it has the
activate_stage1_device_prepare() activation source scheduled, but it
is still in disconnected state.
Currently the factory doesn't recognize that the device is activating
and allows the device to proceed even if the interface was removed
from the ovsdb. Fix this by checking the presence of an
active-connection instead.
During shutdown, NM always tries to remove from ovsdb all bridges,
ports, interfaces that it previously added. Currently NM doesn't run
the main loop during shutdown and so it's not possible to perform
asynchronous operations. In particular, the NMOvsdb singleton is
disposed in a destructor where it's not possible to send out all the
queued deletions.
The result is that NM deletes only one OVS interface, keeping the
others. This needs to be fixed, but requires a rework of the shutdown
procedure that involves many parts of NM.
Even once a better shutdown procedure is implemented, we should
support an unclean shutdown caused by e.g. a kernel panic or an NM
crash. In these cases, the interfaces added by NM would still linger
in the ovsdb.
Delete all those interfaces at NM startup. If there are connection
profiles for them, NM will create them again.
Also, NMOvsdb now emits a NM_OVSDB_READY signal and provides a
nm_ovsdb_is_ready() to allow other parts of the daemon to order
actions after the initial OVS cleanup.
https://bugzilla.redhat.com/show_bug.cgi?id=1861296
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/700
If an OVS system interface that is active in NM gets removed
externally from the ovsdb, NM doesn't notice it and keeps the
connection active.
Let the OVS factory also handle the removal message for system
interfaces. In that case, we need to match the removed interface name
to the actual device, which can be of any kind.
I don't think that the way we track RA data is correct. For example,
all data (like DNSSL) is tracked regardless of their router. That means,
if a router sends an update, we cannot prematurely expire a DNSSL list
which is no longer announced by that router. Anyway, that should be
reviewed and considered another time.
However, we also must rate limit the number of elements we track in
general. So far we would only do that for the addresses.
The limit for max_addresses is already read from sysctl. We only limit
it further to at most 300 addresses. The 300 is arbitrarily chosen.
The limits for DNSSL and RDNSS are taken from systemd-networkd
(NDISC_DNSSL_MAX, NDISC_RDNSS_MAX), which are 64.
The limits for gateways and routes are not based on anything and are
just made up.
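As a rough illustration of the limits described above, here is a minimal
sketch in C. The macro and function names are made up for this example and
are not the ones used in the NetworkManager sources. Only the values 300
and 64 come from the text, and the sysctl value is assumed to have been
read already.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; only the numeric values are taken from the text. */
#define SKETCH_MAX_ADDRESSES_CAP 300 /* arbitrary cap on top of the sysctl value */
#define SKETCH_MAX_DNSSL         64  /* value of NDISC_DNSSL_MAX in systemd-networkd */
#define SKETCH_MAX_RDNSS         64  /* value of NDISC_RDNSS_MAX in systemd-networkd */

/* Limit a configured maximum (e.g. max_addresses from sysctl) to a hard cap. */
size_t
sketch_clamp_limit(size_t configured, size_t cap)
{
    return configured < cap ? configured : cap;
}

/* Whether one more element may be tracked without exceeding the limit. */
bool
sketch_may_track_more(size_t n_tracked, size_t limit)
{
    return n_tracked < limit;
}
```

With this, a sysctl value of 16 stays at 16, while a value of 1000 is
reduced to 300.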
RFC4861 describes how to solicit routers, but this is later extended
by RFC7559 (Packet-Loss Resiliency for Router Solicitations).
Rework the scheduling of router solicitations to follow RFC7559.
Differences from RFC7559:
- the initial random delay before sending the first RS is only
up to 250 milliseconds (instead of 1 second).
- we never completely stop sending RS, even after we receive a valid
RA. We will still send RS every hour.
We no longer honor the sysctl variables
- /proc/sys/net/ipv6/conf/$IFNAME/router_solicitations
- /proc/sys/net/ipv6/conf/$IFNAME/router_solicitation_interval
I don't think having the autoconf algorithm configurable is useful.
At least not, if the configuration happens via sysctl variables.
https://tools.ietf.org/html/rfc4861#section-6.3.7
https://tools.ietf.org/html/rfc7559#section-2.1
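For reference, the retransmission timing from RFC7559 can be sketched as
follows. This is only an illustration of the formula in the RFC (IRT of
4 seconds, MRT of 1 hour, RAND between -0.1 and +0.1), not the code added
to NetworkManager. The function name is hypothetical, and the random
factor is passed in as a parameter (scaled by 1000) so that the result is
deterministic.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RS_IRT_MSEC ((int64_t) 4000)    /* initial retransmission time: 4 s */
#define SKETCH_RS_MRT_MSEC ((int64_t) 3600000) /* maximum retransmission time: 1 h */

/*
 * Compute the next retransmission timeout in milliseconds.
 *
 * prev_rt_msec:  the previous timeout, or 0 before the first retransmission.
 * rand_permille: RAND scaled by 1000, so it must be within [-100, 100].
 */
int64_t
sketch_rs_next_timeout_msec(int64_t prev_rt_msec, int rand_permille)
{
    int64_t rt;

    assert(prev_rt_msec >= 0);
    assert(rand_permille >= -100 && rand_permille <= 100);

    if (prev_rt_msec == 0) {
        /* First timeout: RT = IRT + RAND * IRT */
        return SKETCH_RS_IRT_MSEC + (SKETCH_RS_IRT_MSEC * rand_permille) / 1000;
    }

    /* Subsequent timeouts: RT = 2 * RTprev + RAND * RTprev */
    rt = 2 * prev_rt_msec + (prev_rt_msec * rand_permille) / 1000;

    if (rt > SKETCH_RS_MRT_MSEC) {
        /* Cap at MRT: RT = MRT + RAND * MRT */
        rt = SKETCH_RS_MRT_MSEC + (SKETCH_RS_MRT_MSEC * rand_permille) / 1000;
    }

    return rt;
}
```

Once the timeout reaches the cap, it stays around one hour, which matches
the "send RS every hour" behavior mentioned above.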
Elements of RAs have a lifetime. Previously we would track both
the timestamp (when we received the RA) and the lifetime.
However, we are mainly interested in the expiry time. So tracking the
expiry in form of timestamp and lifetime is redundant and cumbersome
to use.
Consider also the cases in nm_ndisc_add_address() where we mangle the expiry.
In that case, the timestamp becomes meaningless or it's not clear what
the timestamp should be.
Also, there are no real cases where we actually need the receive timestamp.
Note that when we convert the times to NMPlatformIP6Address, we again need
to synthesize a base timestamp. But here too, it is NMPlatformIP6Address's
fault for doing this pointless split of timestamp and lifetime.
While at it, increase the precision to milliseconds. As we receive
lifetimes with seconds precision, one might think that seconds precision
is enough for tracking the timeouts. However it just leads to ugly
uncertainty about rounding, when we can track times with sufficient
precision without downside. For example, before configuring an
address in kernel, we also need to calculate a remaining lifetime
with a lower precision. By having the exact values, we can do so
more accurately. At least, in theory. Of course NMPlatformIP6Address
itself has only a precision of seconds, so the information is already lost
at that point. However, NMNDisc no longer has that problem.
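To illustrate the idea of tracking a single expiry in milliseconds, here
is a minimal sketch. The names are hypothetical and do not correspond to
the actual NMNDisc fields. It assumes a monotonic clock in milliseconds, a
lifetime of 0xffffffff seconds meaning "infinity", and that the remaining
lifetime is rounded up to whole seconds (another rounding policy would
work just as well).

```c
#include <stdint.h>

#define SKETCH_LIFETIME_INFINITY_SEC UINT32_MAX /* assumed "never expires" lifetime */
#define SKETCH_EXPIRY_INFINITY_MSEC  INT64_MAX  /* assumed "never expires" expiry */

/* Convert a lifetime received now (in seconds) to an absolute expiry (in msec). */
int64_t
sketch_expiry_from_lifetime(int64_t now_msec, uint32_t lifetime_sec)
{
    if (lifetime_sec == SKETCH_LIFETIME_INFINITY_SEC)
        return SKETCH_EXPIRY_INFINITY_MSEC;
    return now_msec + ((int64_t) lifetime_sec) * 1000;
}

/* Convert an absolute expiry back to a remaining lifetime in seconds, rounding up. */
uint32_t
sketch_remaining_lifetime_sec(int64_t now_msec, int64_t expiry_msec)
{
    int64_t diff_msec;
    int64_t sec;

    if (expiry_msec == SKETCH_EXPIRY_INFINITY_MSEC)
        return SKETCH_LIFETIME_INFINITY_SEC;
    if (expiry_msec <= now_msec)
        return 0;

    diff_msec = expiry_msec - now_msec;
    sec       = diff_msec / 1000;
    if (diff_msec % 1000 != 0)
        sec++;

    /* Keep finite values distinct from the infinity marker. */
    if (sec >= (int64_t) SKETCH_LIFETIME_INFINITY_SEC)
        return SKETCH_LIFETIME_INFINITY_SEC - 1;
    return (uint32_t) sec;
}
```

Only the conversion to a seconds-based lifetime needs a rounding decision.
Everything before that works on the exact expiry.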
This was done since NDisc code was added to NetworkManager in
commit c3a4656a68 ('rdisc: libndp implementation').
Note what it does: in clean_dns_*() we will call solicit_router()
if the half-life of any entity is expired. That doesn't seem right.
Why only for dns_servers and dns_domains, but not routes, addresses
and gateways?
Also, why would the timings for when we solicit depend on when
elements expire? It is "normal" that some of them will expire.
We should solicit based on other parameters, like keeping track
of when and how to solicit.
Note that there is a change in behavior here: if we stopped
soliciting (either because we received our first RA or because
we ran out of retries), then we now will never start again.
Previously this was a mechanism so that we would eventually
start soliciting again. This will be fixed in a follow-up
commit soon.
Otherwise, if there is a problem with the test they will run
indefinitely. Sure, meson will kill them after a while, but I
don't think autotools does, does it? Anyway, give a maximum
time to wait.
We have g_idle_add() and g_timeout_add(). But these return
those odd guint source ids. That is totally pointless. The
only potential benefit is that a guint is only 4 bytes while
a pointer is 8 bytes (on 64 bit systems). Otherwise, it seems
always preferable to have an actual GSource instance instead
of an integer. It also saves the overhead in g_source_remove()
which first needs to do a hash lookup to find the GSource.
A GSource instance would theoretically work with multiple
GMainContext instances, while g_source_remove() only works
with g_main_context_default().
On the other hand we have helper API like nm_g_idle_source_new()
and nm_g_timeout_source_new(), which is fully flexible and sensible
because it returns a reference to the GSource instance. However, it
is a bit verbose to use in the common case.
Add helper functions that simplify the use and are conceptually
similar to g_{idle,timeout}_add() (hence the name).
This locks NM into dhcpcd-9.3.3 as that is the first version to support
the --noconfigure option. Older versions are no longer supported by NM
because they modify the host, which is undesirable.
Due to the way dhcpcd-9 uses privilege separation and that it re-parents
itself to PID 1, the main process cannot be reaped or waited for.
So we rely on dhcpcd correctly cleaning up after itself.
A new function nm_dhcp_client_stop_watch_child() has been added
so that dhcpcd can perform similar cleanup to the equivalent stop call.
As part of this change, the STOP and STOPPED reasons are mapped to
NM_DHCP_STATE_DONE and PREINIT is mapped to a new state NM_DHCP_STATE_NOOP
which means NM should just ignore this state.
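The mapping of reasons to states described above could look roughly like
this. It is a sketch only: the enum and function names are hypothetical
stand-ins (SKETCH_STATE_DONE and SKETCH_STATE_NOOP stand in for
NM_DHCP_STATE_DONE and NM_DHCP_STATE_NOOP), and it assumes that the
reason arrives as an uppercase string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the DHCP states named in the text. */
typedef enum {
    SKETCH_STATE_UNKNOWN = 0, /* any reason not covered by this sketch */
    SKETCH_STATE_DONE,        /* stands in for NM_DHCP_STATE_DONE */
    SKETCH_STATE_NOOP,        /* stands in for NM_DHCP_STATE_NOOP: ignore it */
} SketchDhcpState;

/* Map a dhcpcd reason string to a state. */
SketchDhcpState
sketch_reason_to_state(const char *reason)
{
    assert(reason);

    if (strcmp(reason, "STOP") == 0 || strcmp(reason, "STOPPED") == 0)
        return SKETCH_STATE_DONE;
    if (strcmp(reason, "PREINIT") == 0)
        return SKETCH_STATE_NOOP;
    return SKETCH_STATE_UNKNOWN;
}
```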
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/668
The prepare() step of the GSource is called frequently when the outer
GMainContext is preparing.
In the common case we have
- few file descriptors in the inner context to track
- the file descriptors don't change
We also need to consider whether the file descriptors change, because
some book-keeping is necessary when they do. But usually they don't
change.
Hence, let's optimize the prepare step to avoid a heap allocation
in the common case.
Also, we use nm_utils_g_main_context_create_integrate_source()
with the internal GMainContext of NMClient. That context always has
a small number of file descriptors to track and it doesn't see much
change. In the vast majority of cases, the heap allocation can be
avoided.
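The general technique can be sketched like this: use a small buffer on
the stack and only fall back to the heap when the number of file
descriptors exceeds it. This is not the actual implementation. The
callback type is a hypothetical simplification, loosely modeled on how
g_main_context_query() reports the number of required entries, and the
sketch assumes that the file descriptors are reported in a stable order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_N_STACK_FDS 16 /* assumed to be enough for the common case */

/* Hypothetical query callback: fills up to n_fds entries and returns the
 * total number of file descriptors that exist. */
typedef size_t (*SketchQueryFn)(int *fds, size_t n_fds, void *user_data);

/* Simple list-backed query callback, used here as an example data source. */
typedef struct {
    const int *fds;
    size_t     n;
} SketchFdList;

size_t
sketch_query_from_list(int *fds, size_t n_fds, void *user_data)
{
    const SketchFdList *list   = user_data;
    size_t              n_copy = list->n < n_fds ? list->n : n_fds;

    if (n_copy > 0)
        memcpy(fds, list->fds, n_copy * sizeof(int));
    return list->n;
}

/* Check whether the current file descriptors differ from the tracked ones.
 * The common case (few descriptors) does not allocate on the heap. */
bool
sketch_fds_changed(const int *tracked, size_t n_tracked, SketchQueryFn query, void *user_data)
{
    int    fds_stack[SKETCH_N_STACK_FDS];
    int   *fds_heap = NULL;
    int   *fds      = fds_stack;
    size_t n;
    bool   changed;

    n = query(fds_stack, SKETCH_N_STACK_FDS, user_data);
    if (n > SKETCH_N_STACK_FDS) {
        size_t n2;

        /* Uncommon case: the stack buffer is too small, query again. */
        fds_heap = malloc(n * sizeof(int));
        assert(fds_heap);
        n2 = query(fds_heap, n, user_data);
        assert(n2 == n);
        (void) n2;
        fds = fds_heap;
    }

    changed = (n != n_tracked)
              || (n > 0 && memcmp(fds, tracked, n * sizeof(int)) != 0);

    free(fds_heap);
    return changed;
}
```

The book-keeping for changed file descriptors would then only run when
this check reports a difference.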
When "tools/run-nm-test.sh" is called from the build scripts,
it has as first argument "--called-from-make". Then all arguments
must follow in a well defined order, which autotools/meson understand
and follow.
Another main use is however to call "tools/run-nm-test.sh" from the command
line. In that case, we want to have the command line parsing convenient.
Some of the parameters to the script are interpreted by the script, and
some are passed on to the test. The user can use "--" to separate the
parameters:
./tools/run-nm-test.sh -m shared/nm-glib-aux/tests/test-shared-general -- -p /general/test_strv_cmp
Otherwise, on the first unknown argument "tools/run-nm-test.sh" would
assume all following arguments are for the test. So this worked too:
./tools/run-nm-test.sh -m shared/nm-glib-aux/tests/test-shared-general -p /general/test_strv_cmp
However, if you now want to run the test with valgrind, you need to edit
the command line before the test arguments, like
./tools/run-nm-test.sh -m shared/nm-glib-aux/tests/test-shared-general -v -p /general/test_strv_cmp
That is inconvenient because I call the script from the shell history and
the cursor is at the end of the line. Instead, assume that all unknown parameters
are for the test (until "--" is encountered).
Now this works:
./tools/run-nm-test.sh -m shared/nm-glib-aux/tests/test-shared-general -p /general/test_strv_cmp -v
Arguably, now also
./tools/run-nm-test.sh -m shared/nm-glib-aux/tests/test-shared-general -p -v /general/test_strv_cmp
works, which is a bit odd.
nm_device_get_effective_ip_config_method() must be called only on a
device with an applied connection. Fix assertion failure [1]:
nm_device_get_effective_ip_config_method: assertion 'NM_IS_CONNECTION(connection)' failed
[1] http://faf.lab.eng.brq.redhat.com/faf/reports/20217/
Fixes: 09c8387114 ('policy: use the hostname setting')