Randomly choosing one between the two link creation methods (platform
and external) makes it more difficult to immediately detect when a bug
in introduced. Always execute them using both methods to have a better
code coverage, but still try a third time with a random one to test
different combinations of methods.
Older version of iproute2 fail to add the vxlan (e.g. on Ubuntu 12.04)
Running command: ip link add nm-test-device type vxlan id 42 dev nm-test-parent local 23.1.2.164 group 239.1.2.134 ttl 0 tos 00 dstport 4789 srcport 0 0 ageing 1245
Garbage instead of arguments "id ...". Try "ip link help".
Fallback using only platform.
Strangely on Ubuntu 12.04, when not setting the port range for a vxlan
device, kernel chooses
5: nm-test-device: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
link/ether 72:09:79:62:9c:45 brd ff:ff:ff:ff:ff:ff promiscuity 0
vxlan id 42 group 239.1.2.134 local 23.1.2.164 dev nm-test-parent srcport 32768 61000 dstport 4789 ageing 1245
The macro EWOULDBLOCK is another name for EAGAIN; they are always the
same in the GNU C Library.
https://www.gnu.org/savannah-checkouts/gnu/libc/manual/html_node/Error-Codes.html
Otherwise, we would need a workaround for EWOULDBLOCK too, because
libnl maps that to NLE_FAILURE. So we would have to detect EAGAIN
as (nle == -NLE_FAILURE && errno == EWOULDBLOCK).
When the errno was accidentally set to EAGAIN or EWOULDBLOCK,
we would only read one single message and return that there is
nothing to read.
This means, if there were more then one messages ready to read,
we would only read the first one and return to the main-loop
(which then again calls back to platform as more data is ready
to be read).
The explainations no longer hold for the most part. Documentation
that is not directly the code is bound the expire.
The code is the best documentation!! :)
With g_return() we try to catch wrong invocations of a function
or passing invalid arguments.
Having ~forgot~ to overwrite a virtual function is such a serious
and fundamental issue, that checking for it at runtime is wasteful
and unnessesary.
If we ever hit such a situation, just crash.
Link related functions should have a "nm_platform_link" prefix. Rename.
Naming is a subjective matter and one might argue that omitting
the "link" part from the name is shorter and even preferred.
However, I think functions related to links should have a common
prefix as the underlyings are strongly related.
Let the link-add functions return the internal pointer to the platform
link object. Similar to link-get, which doesn't copy the link either.
Also adjust the sole users of the add-functions (create-and-realize)
to take the pointer.
Eventually we still copy the returned data, because accessing platform can
invalidate the returned pointer. Thus we don't actually safe any copying
by this (at least every use of the function currently leads to the data
being copied).
Still change it, because I think the API of NMPlatform should look like that.
Check if 'ipip' and 'ip6_tunnel' modules are loaded when trying to
perform link tests that require them and skip the tests if the modules
are not available.
Fixes: 133724d958
Fixes: 1a3448b43b
We potentially emit a lot of signals. Don't look up the
signal by name because that adds quite some additional
overhead, like peeking for a GQuark.
Instead pass the numeric signal-id directly.
This enum was unused and meaningless because the platform signals
are emitted as a consequence of netlink messages. It is not clear
whether a netlink message was received due to an external event
or an internal action.
Unslaving from a bridge causes a wrong RTM_DELLINK event for
the former slave.
# ip link add dummy0 type dummy
# ip link add bridge0 type bridge
# ip link set bridge0 up
# ip link set dummy0 master bridge0
# ip monitor link &
# ip link set dummy0 nomaster
18: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop master bridge0 state DOWN group default
link/ether 76:44:5f:b9:38:02 brd ff:ff:ff:ff:ff:ff
18: dummy0: <BROADCAST,NOARP> mtu 1500 master bridge0 state DOWN
link/ether 76:44:5f:b9:38:02
Deleted 18: dummy0: <BROADCAST,NOARP> mtu 1500 master bridge0 state DOWN
link/ether 76:44:5f:b9:38:02
18: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 76:44:5f:b9:38:02 brd ff:ff:ff:ff:ff:ff
19: bridge0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
19: bridge0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
Previously, during do_request_link() we would remember the link that is
about to be requested (delayed_deletion) and delay processing a new
RTM_DELLINK message until the end of do_request_link() -- and possibly
forget about about the deletion, if RTM_DELLINK was followed by a
RTM_NEWLINK.
However, this hack does not catch the case where an external command
unslaves the link.
Instead just accept the wrong event and raise a "removed" signal right
away. This brings the cache in an externally visible, wrong state that
will be fixed by a following "added" signal.
Still do that because working around the kernel bug is complicated. Also,
we already might emit wrong "added" signals for devices that are already
removed. As a consequence, a user should not consider the platform signals
until all events are processed.
Listeners to that signal should accept that added/removed link changes
can be wrong and should preferably handle them idly, when the events
have settled.
It can even be worse, that a RTM_DELLINK is not fixed by a following
RTM_NEWLINK:
...
# ip link set dummy0 nomaster
36: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop master bridge0 state DOWN
link/ether e2:f2:20:98:3a:be brd ff:ff:ff:ff:ff:ff
36: dummy0: <BROADCAST,NOARP> mtu 1500 master bridge0 state DOWN
link/ether e2:f2:20:98:3a:be
Deleted 36: dummy0: <BROADCAST,NOARP> mtu 1500 master bridge0 state DOWN
link/ether e2:f2:20:98:3a:be
37: bridge0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
37: bridge0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
So, when a slave is deleted, we have to refetch it too.
https://bugzilla.redhat.com/show_bug.cgi?id=1285719
On some kernels (at least RHEL-7.2) we receive a spurious RTM_NEWLINK
message after the RTM_DELLINK message for deleting a bond master.
On RHEL-7, the following commands give:
# ip link add dummy0 type dummy
# ip link add bond0 type bond
# ip link set bond0 up
# ip link set dummy0 master bond0
# ip monitor link &
# ip link del bond0
21: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN
link/ether 1e:a6:6c:81:c1:8d brd ff:ff:ff:ff:ff:ff
Deleted 21: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
link/ether 1e:a6:6c:81:c1:8d brd ff:ff:ff:ff:ff:ff
20: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
link/ether 1e:a6:6c:81:c1:8d brd ff:ff:ff:ff:ff:ff
21: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
link/ether da:ee:58:70:6f:e5 brd ff:ff:ff:ff:ff:ff
^^^^^^^^^^^^^^^ RTM_NEWLINK after RTM_DELLINK (and there follows no
RTM_DELLINK afterwards)
21: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
link/ether da:ee:58:70:6f:e5 brd ff:ff:ff:ff:ff:ff
20: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noqueue state DOWN
link/ether 1e:a6:6c:81:c1:8d brd ff:ff:ff:ff:ff:ff
20: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noqueue state DOWN
link/ether 1e:a6:6c:81:c1:8d brd ff:ff:ff:ff:ff:ff
Fix that by reverting clear_REFRESH_LINK(). This fix has two downsides:
- on kernels where this hack is not necessary, we unnecessarily refetch
a link
- the platform cache first removes the link, adds it again and removes
it. This is ugly, but should have no real consequences because all
listeners to the platform signals delay processing the signals to an
idle handler.
https://bugzilla.redhat.com/show_bug.cgi?id=1285719
This reverts commit f4f4e1cf09.
The related bug rh#1262908 in kernel causes missing netlink notifications
when moving a IFA_LINK interface to another netns.
Add a test for our workaround.
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1262908
The related bug rh#1285827 in kernel causes a missing IFLA_LINK/parent
attribute when creating a veth pair:
# ip monitor link &
[1] 6745
# ip link add dev vm1 type veth peer name vm2
30: vm2@NONE: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
link/ether be:e3:b7:0e:14:52 brd ff:ff:ff:ff:ff:ff
31: vm1@vm2: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN
link/ether da:e6:a6:c5:42:54 brd ff:ff:ff:ff:ff:ff
Add a workaround and test.
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1285827