By default, bond/bridge/team devices ignore carrier, and so do their
ports. However, it can make sense to set '[device*].ignore-carrier' for
the controller device. Meaningfully support that.
This is a follow-up to commit 8c91422954 ('device: handle carrier
changes for master device differently'), which didn't fully solve the
problem.
What already works is that when you set ignore-carrier=no for the
controller, then after loss of carrier and a carrier wait timeout, the
controller and ports go down. If both the controller and port profiles
have autoconnect disabled, they stay down and that's it. That works as
expected, but is not very useful, because when we want to automatically
react to carrier loss, we also want to automatically reconnect.
For controller profiles, carrier only makes sense when ports are
attached. However, we can (auto)activate controller profiles without
ports. So when the user enables autoconnect for the controller profile,
the profile will eagerly reconnect: after loss of carrier, the device
goes down and reconnects right away. That means, when configuring a
bond with ignore-carrier=no and autoconnect=yes, the sensible thing
happens (an immediate reconnect), it's just not a useful configuration.
To usefully configure ignore-carrier=no for a controller device,
autoconnect on the controller must be disabled while being enabled on
the ports. After all, it's the ports that will autoconnect based on the
carrier state and bring up the controller with them.
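For illustration, such a setup could look like this (a sketch only;
"bond0" and the profile names are just examples):

    # /etc/NetworkManager/conf.d/99-bond-carrier.conf
    [device-bond0]
    match-device=interface-name:bond0
    ignore-carrier=no

    # controller profile: no autoconnect
    $ nmcli connection modify bond0 connection.autoconnect no
    # port profiles: autoconnect enabled
    $ nmcli connection modify bond0-port1 connection.autoconnect yes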
Note that at the moment when a port decides to autoconnect, the
controller profile is not yet selected. That only happens later during
_internal_activate_device() after searching it with find_master(). At
that point, the port profile checks whether it should autoconnect based
on its own carrier state, and aborts if not.
If autoconnect is aborted due to lack of carrier, the profile gets
blocked from autoconnect with reason "failed". Hence, when the carrier
returns, we need to clear any "failed" blocked reasons and schedule
another autoconnect check.
Note that this really only works if the port is itself a simple device,
like an ethernet. If the port is itself a software device (like a bond,
or a VLAN), then the carrier state in _internal_activate_device() is
unknown, and we cannot avoid autoconnect. It's unclear how that could
make sense, if at all.
This setup can be combined with "connection.autoconnect-slaves=yes". In
that case, the first port autoconnects when it gets carrier, bringing
up the controller too. Usually the other ports that don't have carrier
would not autoconnect, but with autoconnect-slaves they will. The
effect is that we autoconnect whenever any of the ports has carrier,
and then we immediately also bring up the ports that don't have carrier
(which we usually would not).
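For example (the profile name is again only an example):

    $ nmcli connection modify bond0 connection.autoconnect-slaves yes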
https://bugzilla.redhat.com/show_bug.cgi?id=2156684
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1658
Between ppp 2.4.8 and 2.4.9, "rp-pppoe.so" was renamed to "pppoe.so" (and a
symlink created). Between 2.4.9 and 2.5.0, the symlink was dropped.
See-also: b2c36e6c0e
I guess NetworkManager always meant to use ppp's "(rp-)pppoe.so"
plugin, and never the library from the rp-pppoe project.
If a user actually wants to use the plugin from the rp-pppoe project,
then this is going to break.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1312
Fixes: afe80171b2 ('ppp: move ppp code to "nm-pppd-compat.c"')
Previously, the ppp version was only detected (and used) in one place,
in "nm-pppd-compat.c", after including the ppp headers. That was nice
and easy.
However, that way we could only detect it after including the ppp
headers, and given the ugliness of those headers, we only want to
include them in "nm-pppd-compat.c" (and nowhere else).
In particular, "nm-pppd-compat.c" uses symbols from the ppp daemon and
thus can only be linked into a ppp plugin, not into NetworkManager core
itself. But in some places we need to know the ppp version outside of
the ppp plugin and "nm-pppd-compat.c".
Therefore, additionally detect the version at configure time and place
it in "config.h". A static assert checks that the two ways of detection
agree.
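A minimal sketch of that cross-check, with placeholder macro names (the
actual macros in "config.h" and in the ppp headers are named
differently):

    /* sketch only: NM_PPPD_VERSION_* stand for the configure-time result
     * stored in "config.h", PPPD_VERSION_* for whatever the ppp headers
     * expose; all names here are placeholders, not the real macros. */
    #include "config.h"
    #include <pppd/pppd.h>

    _Static_assert(NM_PPPD_VERSION_MAJOR == PPPD_VERSION_MAJOR
                       && NM_PPPD_VERSION_MINOR == PPPD_VERSION_MINOR,
                   "configure-time and header-based ppp versions disagree");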
This makes the code more generic: _match_section_infos_lookup() now
accepts a match_data argument. Previously, it supported two hard-coded
approaches (from-device, from-platform).
Note that _match_section_infos_lookup() still accepts either a
"match_data" or a "device" argument. It might be nicer if
_match_section_infos_lookup() only accepted "match_data"; a caller that
wants to look up by "device" would then call
nm_match_spec_device_data_init_from_device() first. However, it's done
this way with a separate "device" argument because often we don't have
any configuration to search for, and the initialization of the
match-data can be saved. So for the common case where we look up by
"device", we initialize the "match-data" lazily and on demand.
A struct allows named arguments, which seems easier to maintain than a
function with many arguments. Also, adding a new parameter does not
require changes to most of the callers.
The real advantage of this is that we encode all the search parameters
in one argument. And we can add that argument to
_match_section_infos_lookup(), alongside lookup by NMDevice or
NMPlatformLink.
All callers eventually want a boolean instead of an NMMatchSpecMatchType.
I think the NMMatchSpecMatchType enum still has value at the lower
layers, where the enum values are clearer (when reading the code). So
don't drop NMMatchSpecMatchType entirely.
However, let's add nm_match_spec_match_type_to_bool() to convert the
match-type to a boolean, so that the conversion is not duplicated in
every caller.
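The conversion itself is small; a sketch of its shape (the enum values
are the existing NMMatchSpecMatchType ones, but the exact prototype is
an assumption):

    #include <stdbool.h>

    typedef enum {
        NM_MATCH_SPEC_NO_MATCH,
        NM_MATCH_SPEC_MATCH,
        NM_MATCH_SPEC_NEG_MATCH,
    } NMMatchSpecMatchType;

    static inline bool
    match_spec_match_type_to_bool(NMMatchSpecMatchType match_type, bool no_match_value)
    {
        switch (match_type) {
        case NM_MATCH_SPEC_MATCH:
            return true;            /* a positive match */
        case NM_MATCH_SPEC_NEG_MATCH:
            return false;           /* explicitly excluded via "except:" */
        case NM_MATCH_SPEC_NO_MATCH:
        default:
            return no_match_value;  /* nothing matched: use the caller's default */
        }
    }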
Arguably, a kernel link is needed for DHCP, and so the interface name
unambiguously identifies a device (for example, the OVS interface). But
for consistency and clarity, store the device type to be used for
logging.
When logging, messages include the interface name to specify what
device they refer to. In most cases the interface name is unique.
There are some devices that don't have a kernel link associated, and
their interface name is not guaranteed to be unique. This is currently
the case for OVS bridges and OVS ports. When reading a log with
duplicate interface names, it is difficult to understand what is
happening. And this is made worse by the fact that it is common
practice to assign the same name to all devices in an OVS hierarchy
(bridge, port, interface).
To make logs unambiguous, we want to print the device type together
with the name; however we don't want to *always* print the type
because in most cases it's not useful and it would consume valuable
real estate on the screen. Adopt a simple heuristic of showing the
type only for OVS devices.
This commit adds a helper function to return the device type to show
in logs, when it is needed.
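A sketch of such a helper (the function name is made up; the
NMDeviceType values are the public libnm ones, and which exact types
get the extra label is just the heuristic described above):

    #include <NetworkManager.h>   /* libnm, for NMDeviceType */

    /* Return a type string to include in log messages, or NULL when the
     * interface name alone is unambiguous enough. */
    static const char *
    device_type_to_log_string(NMDeviceType device_type)
    {
        switch (device_type) {
        case NM_DEVICE_TYPE_OVS_BRIDGE:
            return "ovs-bridge";
        case NM_DEVICE_TYPE_OVS_PORT:
            return "ovs-port";
        case NM_DEVICE_TYPE_OVS_INTERFACE:
            return "ovs-interface";
        default:
            /* other device types have a kernel link and a unique name */
            return NULL;
        }
    }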
This is rather bad, because if we reach the "goto again" case,
the error variable is not cleared. Subsequently passing the
error location to nm_device_reapply_finish() will trigger a glib
warning.
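In pattern form (a generic illustration of the bug class with
hypothetical helpers, not the actual nm-cloud-setup code):

    #include <glib.h>

    /* hypothetical helpers */
    gboolean try_reapply(GError **error);
    gboolean should_retry(const GError *error);

    static gboolean
    reapply_with_retry(void)
    {
        GError *error = NULL;

    again:
        if (!try_reapply(&error)) {
            if (should_retry(error)) {
                /* the fix: clear the error before retrying, otherwise the
                 * next attempt receives an already-set GError and glib
                 * emits a warning. */
                g_clear_error(&error);
                goto again;
            }
            g_warning("reapply failed: %s", error->message);
            g_clear_error(&error);
            return FALSE;
        }
        return TRUE;
    }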
Fixes: 29b0420be7 ('nm-cloud-setup: set preserve-external-ip flag during reapply')
Once we start reconfiguring the system, we need to finish on all
interfaces. Otherwise, we might reconfigure some interfaces, abort
and leave the network broken. When that happens, a subsequent run
might also be unable to recover, because we are unable to reach the
HTTP metadata service.
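One way to read this requirement, sketched under the assumption that
failures are per-interface (the actual code and its abort conditions
are more involved):

    #include <glib.h>

    /* hypothetical per-interface step */
    gboolean configure_one_iface(const char *iface, GError **error);

    static gboolean
    configure_all_ifaces(const char *const *ifaces)
    {
        gboolean success = TRUE;

        for (gsize i = 0; ifaces[i]; i++) {
            GError *error = NULL;

            if (!configure_one_iface(ifaces[i], &error)) {
                /* log and continue: aborting here could leave the remaining
                 * interfaces unconfigured and the network broken. */
                g_warning("configuring %s failed: %s", ifaces[i], error->message);
                g_clear_error(&error);
                success = FALSE;
            }
        }
        return success;
    }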
https://bugzilla.redhat.com/show_bug.cgi?id=2207812
Fixes: 69f048bf0c ('cloud-setup: add tool for automatic IP configuration in cloud')
network-online.target should not be reached before nm-cloud-setup
completes configuring the network. Otherwise, user services ordered
after network-online.target may get started before the network is fully
configured.
Setting nm-cloud-setup.service as "Before=network-online.target" might
already have achieved that. However, also use a pre-up dispatcher
script, so that device activation in NetworkManager also waits for
nm-cloud-setup to complete.
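Schematically, the two pieces are the unit ordering and a pre-up
dispatcher hook (a sketch of the idea, not the shipped files):

    # nm-cloud-setup.service (excerpt)
    [Unit]
    Before=network-online.target

    # plus a dispatcher script under
    # /usr/lib/NetworkManager/dispatcher.d/pre-up.d/
    # that blocks the device's pre-up stage until nm-cloud-setup has run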
https://bugzilla.redhat.com/show_bug.cgi?id=2151040
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1653
dnsmasq since 2.80 properly forwards all incoming queries that have the
DO bit set. That ensures that even if dnsmasq does not do validation,
it will always serve all DNSSEC records if the upstream server provides
them. Regardless of whether local validation is enabled or disabled, it
will always offer its clients all the data required for validation.
However, dnsmasq does not set the AD bit on local responses unless it
did the actual validation itself.
If users trust the connection to their validating DNS server, they have
to declare that explicitly by adding the proxy-dnssec option to a file
in dnsmasq's conf.d directory. Because there is no negated
"no-proxy-dnssec" option, it cannot be turned off. I think there is no
good reason for it to be on in all cases, and it is still possible to
enable it if wanted. Move the decision to the user.
That makes the behavior conform to RFC 4035, Section 3.2.3.
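For example, a drop-in like the following would opt back in (the file
name is arbitrary; the directory depends on how dnsmasq is started,
e.g. NetworkManager's own dnsmasq instance reads
/etc/NetworkManager/dnsmasq.d):

    # e.g. /etc/NetworkManager/dnsmasq.d/proxy-dnssec.conf
    proxy-dnssec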
Signed-off-by: Petr Menšík <pemensik@redhat.com>
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1639