When a new device gets added or when an existing one changes name,
virtual connections that refer to it as parent device may become
ready, so let's try to activate them.
Previously we tried to activate virtual connections only on startup
and, after commit d8e1590c50 ("manager: retry device creation for
connection that would use a newly created device"), also when a
connection was added/changed, but this doesn't cover the case in which
a parent device appears or changes name at runtime.
https://bugzilla.redhat.com/show_bug.cgi?id=1275875
With DHCPv6 it might be quite some time (we may even time out) until we get all
the information to finish activation, but we already have the addresses and
routes.
With the final removal the reason of NOW_UNMANAGED causes the cleanup on the
device to be run, which downs the device:
#0 nm_platform_link_set_down (self=0x555555a29bb0, ifindex=1711) at platform/nm-platform.c:1111
#1 0x00005555555d6ccf in nm_device_take_down (self=self@entry=0x555555c07c70, block=block@entry=1) at devices/nm-device.c:8175
#2 0x00005555555df0c7 in _set_state_full (self=0x555555c07c70, state=NM_DEVICE_STATE_UNMANAGED, reason=NM_DEVICE_STATE_REASON_NOW_UNMANAGED, quitting=quitting@entry=0) at devices/nm-device.c:9825
#3 0x00005555555dfa97 in nm_device_state_changed (self=<optimized out>, state=<optimized out>, reason=<optimized out>) at devices/nm-device.c:10084
#4 0x00005555555e472c in nm_device_set_unmanaged_flags (self=<optimized out>, flag=flag@entry=NM_UNMANAGED_INTERNAL, unmanaged=unmanaged@entry=1, reason=reason@entry=NM_DEVICE_STATE_REASON_NOW_UNMANAGED)
at devices/nm-device.c:8745
#5 0x00005555555e54a9 in nm_device_set_unmanaged_quitting (self=<optimized out>) at devices/nm-device.c:8806
#6 0x000055555565b1aa in remove_device (manager=manager@entry=0x555555a4a2c0, device=0x555555c07c70, quitting=quitting@entry=1, allow_unmanage=allow_unmanage@entry=1) at nm-manager.c:833
#7 0x0000555555660b81 in nm_manager_stop (self=0x555555a4a2c0) at nm-manager.c:4389
#8 0x00005555555b3f9b in main (argc=1, argv=0x7fffffffdba8) at main.c:493
The function returns early when autoconnect is off, so there's no reason to
branch for that case below. The signal is only generated for autoconnect=true.
When there's a slave that allows autoconnection and an unrealized master this
would cause the master activation to fail.
For the actual auto-activations the proper check is already done in NMPolicy's
auto_activate_device().
complete_request() possibly frees the request, making the @script and @request
a dangling pointer. We must not use those pointers, except a plain pointer-
comparison.
The request could have been completed by the call to complete_request() just above.
In that case we should just return instead of looking for nowait scripts to run.
Core was generated by `/usr/libexec/nm-dispatcher'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f6c916ad0fb in complete_script (script=<error reading variable: value has been optimized out>, script=<error reading variable: value has been optimized out>) at nm-dispatcher.c:349
349 if ( script->request->num_scripts_nowait == 0
(gdb) bt
#0 0x00007f6c916ad0fb in complete_script (script=<error reading variable: value has been optimized out>, script=<error reading variable: value has been optimized out>) at nm-dispatcher.c:349
#1 0x00007f6c8f02d484 in g_child_watch_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at gmain.c:5148
#2 0x00007f6c8f03079a in g_main_context_dispatch (context=0x7f6c9358d9a0) at gmain.c:3109
#3 0x00007f6c8f03079a in g_main_context_dispatch (context=context@entry=0x7f6c9358d9a0) at gmain.c:3708
#4 0x00007f6c8f030ae8 in g_main_context_iterate (context=0x7f6c9358d9a0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3779
#5 0x00007f6c8f030dba in g_main_loop_run (loop=0x7f6c9358da80) at gmain.c:3973
#6 0x00007f6c916abcd4 in main (argc=1, argv=0x7ffee4326768) at nm-dispatcher.c:935
(gdb)
Fixes: e97a334e37https://bugzilla.redhat.com/show_bug.cgi?id=1297826
The idea of NMDevice:realize() was to
(1) update the device properties
(2) fail realization if some critical properties are missing
(1) is already done during nm_device_setup_start().
(2) was only implemented by NMDeviceVlan:realize(), but it
basically was just checking whether such a platform device exists.
Other implementations don't do that either and it opens up for a race
when the device gets deleted externally.
Extract a function update_properties() similar to other device
implementations. It reads the device specific data from platform
and raises property-changed notifications.
update_properties() is now called by realize_start_notify().
NMDeviceVlan:realize() -- which previously implemented something like
update_properties() -- now does nothing. Note that previously
realize() might have failed, but this is different from other device
implementations and it was unclear what a failure really meant.
Ok, it might fail because the link was not found in the platform cache.
But other implementations don't check that either, so why vlan? And
how to handle that properly anyway?
Therefore realize()'s implementation is no longer needed because
nm_device_realize() already calls realize_start_setup(), which
calls realize_start_notify() and update_properties().
update_connection() no longer refreshes the device properties.
Instead it only modifies the passed-in NMConnection. It's a bit
ugly that update_connection() uses both cached properties from
NMDeviceVlan (vlan_id) and platform properties (xgress maps).
Also, update_connection() doesn't return early on error but continues
trying to update the NMConnection. The reason is that update_connection()
cannot return a failure status.
The virtual function NMDevice:realize() is only called by
nm_device_realize() and immediately followed by nm_device_setup_start().
Devices already overwrite setup_start_notify() to update their properties.
No need to duplicate that in realize().
All implementations of NMDevice:setup_start() in derived classes
invoke the parent implementation first. Enforce that by moving
NMDevice:setup_start() to realize_start_setup() and only notify
derived classes afterwards via NMDevice:realize_start_notify().
Reapplying a connection should not be done by iterating over and
(unsorted) @diffs array. Instead the order matters! E.g. first layer 2
before IP settings. Thus extracting those individual updates on a per-setting
base to different reapply_*() functions is more complicated, albeit incorrect
in complex cases. We need full control over how to reapply changes, one
after the other.
Also, once we start applying changes, we cannot really abort on error.
We can only continue best-effort and hope for the best.
Also, always reapply certain settings, even if the configuration doesn't
change. That means, if the user externally deletes a static IP address,
he can call reapply() to restore it. Even though he doesn't provide a
different setting to apply.
Also revert the changes to nm_device_reapply_settings_immediately().
Effectively there is little code that can be reused.
Add audit logging.
The main reason to introduce the "no-wait.d" dispatcher directory was
"10-ifcfg-rh-routes.sh", which (as a pre-up script) delays activation.
We even extracted the script to a separate package on RHEL to avoid
delays by default.
Invoke the script via no-wait.d.