It's utterly useless: the textual version of the reason if logged only if
the plugin fails; but the plugin failure already logs the plugin state
change reason which is directly translated to the connection one.
Track both device and active connections at the same time and decide
whether activation finished (either successfully or not) in a single
place, check_activated().
This makes the already too complex code path a bit more straightforward
and also makes it possible to be more reasonable about the diagnostics.
Now that the active connection signals the reason, we include it; but if
the failure is due to the device disconnection while we're activating,
include a device reason instead, since it's often more useful. Like:
Before:
Error: Connection activation failed.
Without considering the device:
Error: Connection activation failed: the base network connection was interrupted
After:
Error: Connection activation failed: The Wi-Fi network could not be found
Note that the reason tracking starts as soon as the object exists (which
is immediately after GDBusObject is created), not when the asynchronous
NMObject initialization finishes. That is so that we the reason changes
in between are not lost.
The vpn-connection should probably be doing the same.
It includes a reason code that makes it possible for the clients to be
more reasonable about error messages.
The reason code is essentially copied from the VPN, plus three more
reasons that were useful for non-VPN connections.
When deciding whether to touch a device we sometimes look at whether
the active connection is external/assumed. In many cases however,
there is no active connection around (e.g. while moving the device
from state unmanaged to disconnected before assuming).
So in most cases we instead look at the device-state-reason to decide
whether to touch the interface (see nm_device_state_reason_check()).
Often it's desirable to have no state and passing data as function
arguments. However, the state reason has to be passed along several hops
(e.g. a queued state change). Or a change to a master/slave can affect
the slave/master, where we pass on the state reason. Or an intermediate
event might invalidate a previous state reason. Passing the state
whether to touch a device or not as a state-reason is cumbersome
and limited.
Instead, the device should be aware of whats going on. Add a
sys-iface-state with:
- SYS_IFACE_STATE_EXTERNAL: meaning, NM should not touch it
- SYS_IFACE_STATE_ASSUME: meaning, NM is gracefully taking over
- SYS_IFACE_STATE_MANAGED: meaning, the device is managed by NM
- SYS_IFACE_STATE_REMOVED: the device no longer exists
This replaces most checks of nm_device_state_reason_check() and
nm_active_connection_get_activation_type() by instead looking at
the sys-iface-state of the device.
This patch probably has still issues, but the previous behavior was
not very clear either. We will need to identify those issues in future
tests and tweak the behavior. At least, now there is one flag that
describes how to behave.
Now we only search for a candiate with matching UUID. No need to
first lookup all activatable connections, just find the candidate
by UUID and see if it is activatable.
- rename find_ac_for_connection() to
active_connection_find_first_by_connection().
This function has the unexpected(?) behavior to either
search by the @connection using pointer identity, or
by looking up the UUID (depending on whether @connection
is a NMSettingsConnection).
- Split out a active_connection_find_first() part.
The name "find-first" makes it clear that we walk the list until
a matching active-connection is found. That's why I added the
max_state argument. Otherwise, it would have to be called
"find-first-non-disconnected".
- drop nm_manager_get_connection_device(). It only had two callers.
It also used the ambiguity of the @connection argument, but
only one of the two callers cared about that. Meaning,
_internal_activate_device() now explicitly does a lookup by
the NMSettingsConnection.
There is only one caller of assume_connection().
The tasks there are not clearly separate and it is clearer just to
have one large recheck_assume_connection() function which proceeds
step by step, instead of breaking it into separate parts and move
them apart in the source code. The latter implies that -- unless
we forward-declare assume_connection() -- the order of definition
with assume_connection,recheck_assume_connection is contrary the
flow of the code.
Having a separate function in this case is not a simplification
so merge it.
An EXTERNAL activation type comes with a nm-generated, volatile,
in-memory connection. When the user actively modifies that connection, we
shall mark the active connection as fully managed.
Before, we would have the concept of assumed connections, which is used
for (1) externally configured device that NetworkManager should not
touch and (2) connections that NetworkManager should gracefully take
over after a restart (seamlessly, non-destructively).
The behavior was unclear and mixed. It wasn't clear whether the device
is in no-touch mode (1) or gracefully take-over (2).
Previous commits already introduce separate activation types EXTERNAL (1)
and ASSUME (2).
Also, previously, we would for both (1) and (2) try to find a matching
connection and use it. That doesn't make sense for either.
In the external case (1), we should not pretend that an existing connection
is active. Let's always create a new in-memory connection for these
cases. Note that this means, external devices now will always generate
a connection, instead of pretending an existing one is active.
For the assume case (2), we shall not use nm_utils_match_connection() to
guess which connection might be active. It can only the one that was
active on a previous run of NetworkManager. So, use the information from
the state file and try to activate it. If that fails, it is not an
assume activation type. Note, that this means we now most of the time
don't do ASSUME anymore. Most of the time we do EXTERNAL activation
That is because the state information is only available after restart
of NetworkManager.
We need a distinction between external activations and assuming
connections. The former shall have the meaning of devices that are
*not* managed by NetworkManager, the latter are configurations that
are gracefully taken over after restart (but fully managed).
Express that in the activation-type of the active connection.
Also, no longer use the settings NM_SETTINGS_CONNECTION_FLAGS_VOLATILE
flag to determine whether an assumed connection is "external". These
concepts are entirely orthogonal (although in pratice, external
activations are in-memory and flagged as volatile, but the inverse
is not necessarily true).
Also change match_connection_filter() to consider all connections.
Later, we only call nm_utils_match_connection() for the connection
we want to assume -- which will be a regular settings connection,
not a generated one.
nm_device_uses_assumed_connection() basically called
nm_active_connection_get_assumed() on the device.
Rename those functions to be closer to the activation-type
flags.
The concepts of "assume", "external", and "assume_or_external"
will make sense with the following commits.
The concept of assumed-connection will change. Currently we mark
connections that are generated and assumed as "nm-generated-assumed".
That has several consequences, one of them being that such a settings
connection gets deleted when the device disconnects.
That is, such a settings connection lingers around as long as it's active,
but once it deactivates it gets automatically deleted. As such, it's
a more volatile concept of an in-memory connection.
The concept of such automatically cleaned up connections is useful beyond
generated-assumed. See the related bug rh#1401515.
Currently, we determine NMD_CONNECTION_PROPS_EXTERNAL based
on the settings connection. That is not optimal, because whether
a connection is assumed or externally managed, should be really a
property of the active-connection. So, in the this will change soon
and we would need yet another argument to nm_dispatcher_call().
Instead, drop the settings-connection and applied-connection
arguments and fetch them from the device as needed (but allow
to pass a specific act-request argument to explicitly state
which active connection to use).
Also, rename nm_dispatcher_call() to nm_dispatcher_call_device(),
it this is not a generic dispatcher call, but it is particularly
related to device events. Likewise, rename nm_dispatcher_call_sync()
to nm_dispatcher_call_device_sync().
Remove the redundant action argument. It is clear, that
nm_dispatcher_call_connectivity() is called with action
NM_DISPATCHER_ACTION_CONNECTIVITY_CHANGE.
On the other hand, add the async callbacks. Altough they are
not used at the moment, it seems more correct that an async
API has a callback and a call-id to cancel the invocation.
If the connection candidate contains a static route
without route-metric, but overrides the default
route-metric, matching would have failed:
ipv4.route-metric: 200
ipv4.routes: { ip = 192.168.5.0/32 }
Then, we must not compare existing routes using the device's
metric, but must use the overwrite from the connection.
When hostname changes, resolv.conf should be rewritten to update the
"search" option with the new domain parameters. If no device is
active nor going to activate, skip triggering resolv.conf update.