Coccinelle:
@@
expression a, b;
@@
-a ? a : b
+a ?: b
Applied with:
spatch --sp-file ternary.cocci --in-place --smpl-spacing --dir .
With some manual adjustments on spots that Cocci didn't catch for
reasons unknown.
Thanks to the marvelous effort of the GNU compiler developers we can now
spare a couple of bits that could be used for more important things,
like this commit message. Standards committees have yet to catch up.
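For illustration, a minimal sketch of what the rewrite changes, assuming a
compiler that supports the GNU "omitted middle operand" extension (gcc,
clang). The names ex_next_value(), ex_classic() and ex_gnu() are
hypothetical and not part of the tree. One difference worth keeping in
mind: the classic form evaluates the first operand twice when it is
non-zero, the GNU form evaluates it once.

```c
/* Counts how often ex_next_value() runs, to make evaluation visible. */
static int ex_calls;

static int
ex_next_value(void)
{
    ex_calls++;
    return 7;
}

/* Classic form: the first operand is spelled (and evaluated) twice. */
static int
ex_classic(void)
{
    return ex_next_value() ? ex_next_value() : -1;
}

/* GNU form: the middle operand is omitted, the first operand is
 * evaluated once and reused as the result when it is non-zero. */
static int
ex_gnu(void)
{
    return ex_next_value() ?: -1;
}
```

For side-effect free expressions, which is what the semantic patch is
meant for, both forms yield the same value.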
For completeness, extend the API to support non-persistent
devices. That requires that nm_platform_link_tun_add()
returns the file descriptor.
While NetworkManager doesn't create such devices itself,
it recognizes the IFLA_TUN_PERSIST / IFF_PERSIST flag.
Since ip-tuntap (obviously) cannot create such devices,
we cannot add a test for how non-persistent devices look
in the platform cache. Well, we could instead add them
with ioctl directly, but instead, just extend the platform
API to allow for that.
Also, use the function from test-lldp.c to (optionally) use
nm_platform_link_tun_add() to create the tap device.
Due to a bug, the current rc-kernel will emit the first netlink
notification about tun devices before the device is initialized.
Hence, the content of the message is bogus. If the message
looks like this is the case, re-request it right away.
Now that kernel supports providing information about tun/tap devices
via netlink, make use of it.
Also, enable the hack that:
- when we first see a link that has no lnk data, we refetch
it on the assumption that kernel just didn't send it
the first time.
For old kernels that do not yet support tun properties on netlink,
this means that we will always refetch the link once, the first
time we see it. I think that is acceptable, and the more correct
behavior for newer kernels that do support it.
Rework the code to if-else-if, to not schedule the same
DELAYED_ACTION_TYPE_REFRESH_LINK instance multiple times.
Note that delayed_action_schedule() already would check that
no duplicates are scheduled, but we can avoid that.
It doesn't make sense that NetworkManager adds non-persist tun
devices. Likewise, only the type IFF_TUN or IFF_TAP is supported.
Assert that the values are as expected.
Kernel recently got support for exposing TUN/TAP information on netlink
[1], [2], [3]. Add support for it to the platform cache.
The advantage of using netlink is ordering: querying sysctl bypasses the
order of events of the netlink socket, so it is out of sync and racy. For
example, platform cache might still think that a tun device exists, but
a subsequent lookup at sysfs might fail because the device was deleted
in the meantime. Another point is that we don't get change
notifications via sysctl and that it requires various extra syscalls
to read the device information. If the tun information is present on
netlink, put it into the cache. This bypasses checking sysctl while
we keep looking at sysctl for backward compatibility until we require
support from kernel.
Notes:
- we had two link types NM_LINK_TYPE_TAP and NM_LINK_TYPE_TUN. This
deviates from the model of how kernel treats TUN/TAP devices, which
makes it more complicated. The link type of a NMPlatformLink instance
should match what kernel thinks about the device. Case in point,
when parsing RTM_NEWLINK messages, we very early need to determine
the link type (_linktype_get_type()). However, to determine the
type of a TUN/TAP at that point, we need to look into nested
netlink attributes which in turn depend on the type (IFLA_INFO_KIND
and IFLA_INFO_DATA), or even worse, we would need to look into
sysctl for older kernel versions. Now, the TUN/TAP type is a property
of the link type NM_LINK_TYPE_TUN, instead of determining two
different link types.
- various parts of the API (both kernel's sysctl vs. netlink) and
NMDeviceTun vs. NMSettingTun disagree whether the PI is positive
(NM_SETTING_TUN_PI, IFLA_TUN_PI, NMPlatformLnkTun.pi) or inverted
(NM_DEVICE_TUN_NO_PI, IFF_NO_PI). There is no consistent way,
but prefer the positive form for internal API at NMPlatformLnkTun.pi.
- previously NMDeviceTun.mode could not change after initializing
the object. Allow for that to happen, because forcing some properties
that are reported by kernel to not change is wrong, in case they
might change. Of course, in practice kernel doesn't allow the device
to ever change its type, but the type property of the NMDeviceTun
should not make that assumption, because, if it actually changes, what
would it mean?
- note that as of now, the new netlink API is not yet merged to Linus'
mainline tree. Shortcut _parse_lnk_tun() to not accidentally use unstable
API for now.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1277457
[2] https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=1ec010e705934c8acbe7dbf31afc81e60e3d828b
[3] https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=118eda77d6602616bc523a17ee45171e879d1818
https://bugzilla.redhat.com/show_bug.cgi?id=1547213
https://github.com/NetworkManager/NetworkManager/pull/77
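As a rough sketch of the PI polarity note above: the struct ExTunProps and
both helper functions are hypothetical and only illustrate converting
between the inverted kernel flag (IFF_NO_PI) and a positive "pi" field.
IFF_TUN, IFF_TAP and IFF_NO_PI come from <linux/if_tun.h>; the actual
NMPlatformLnkTun layout is not reproduced here.

```c
#include <stdbool.h>
#include <linux/if_tun.h>

/* Hypothetical stand-in for the internal representation: "pi" is
 * stored in the positive form, the TUN/TAP mode as a property. */
typedef struct {
    bool pi;
    bool is_tap;
} ExTunProps;

/* Kernel flags (inverted IFF_NO_PI) -> internal positive form. */
static ExTunProps
ex_props_from_iff_flags(unsigned flags)
{
    ExTunProps props = {
        .pi     = !(flags & IFF_NO_PI),
        .is_tap = (flags & IFF_TAP) != 0,
    };

    return props;
}

/* Internal positive form -> kernel flags for TUNSETIFF. */
static unsigned
ex_iff_flags_from_props(ExTunProps props)
{
    unsigned flags = props.is_tap ? IFF_TAP : IFF_TUN;

    if (!props.pi)
        flags |= IFF_NO_PI;
    return flags;
}
```

Keeping the inversion in one pair of helpers means the rest of the code
only ever sees the positive form.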
Return the extended ack message from the WaitForNlResponse delayed
action so that the caller can print a detailed reason with the
appropriate logging level.
From v4.12 the kernel appends some attributes to netlink acks
containing a textual description of the error and other fields (see
commit [1]). Parse those attributes and print the error message.
Examples:
platform-linux: netlink: recvmsg: error message from kernel: Network is unreachable (101) "Nexthop has invalid gateway" for request 12
platform-linux: netlink: recvmsg: error message from kernel: Invalid argument (22) "Local address cannot be multicast" for request 21
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d4bc93368f5a0ddb57c8c885cdad9c9b7a10ed5
IPv4 routes that are a response to RTM_GETROUTE must have the cloned
flag while IPv6 routes don't have to. Don't check the flag for IPv6
routes and add a test case to verify that RTM_GETROUTE works for IPv6.
https://bugzilla.gnome.org/show_bug.cgi?id=793962
- refactor the loop in event_handler_read_netlink() to mark pending
requests as answered by adding a new helper function
delayed_action_wait_for_nl_response_complete_check()
- delayed_action_wait_for_nl_response_complete_all() can be implemented
in terms of delayed_action_wait_for_nl_response_complete_check()
- if nm_platform_netns_push() fails, also complete all pending requests
with a new error code WAIT_FOR_NL_RESPONSE_RESULT_FAILED_SETNS.
Now that we cleaned up nl_recv(), we have full control over which error
variables are returned when. We no longer need to check "errno"
directly, and we no longer need the NLE_USER_* workaround.
Glib is not out of memory safe, meaning it always aborts the program
when an allocation fails. It is not possible to meaningfully handle
out of memory when using glib.
Replace all allocation functions for netlink messages with their glib
counterparts and remove the NULL checks.
Originally, these were error numbers from libnl3. These error numbers
are separate from errno, which is unfortunate, because sometimes we
care about the native errno returned from kernel.
Now, refactor them so that the error numbers are in the shared realm
of errno, but failures from kernel or underlying API are still returned
via their native errno.
- NLE_INVAL doesn't exist anymore. Passing invalid arguments to a function
is commonly a bug. g_return_*(NLE_BUG) is the right answer to that.
- NLE_NOMEM and NLE_AGAIN are replaced by their errno counterparts.
- drop several error numbers. If nobody cares about these numbers,
there is no reason to have a specific error number for them.
NLE_UNSPEC is sufficient.
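One way to picture "library errors in the shared realm of errno" is the
sketch below. It is an assumption for illustration only: the EX_NLE_*
names, the base value 100000 and ex_strerror() are made up here and do not
claim to match the values or helpers in the tree. The idea is merely that
library-specific codes live in a range that no native errno uses, so a
single int can carry either kind.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical base, chosen well above any native errno value. */
#define EX_NLE_BASE   100000
#define EX_NLE_UNSPEC (EX_NLE_BASE + 0)
#define EX_NLE_BUG    (EX_NLE_BASE + 1)

/* Describe an error number that is either a native errno (possibly
 * negated, as commonly returned) or one of the library codes. */
static const char *
ex_strerror(int err)
{
    if (err < 0)
        err = -err;

    if (err >= EX_NLE_BASE) {
        switch (err) {
        case EX_NLE_UNSPEC:
            return "unspecified failure";
        case EX_NLE_BUG:
            return "internal bug";
        default:
            return "unknown library error";
        }
    }

    /* Anything below the base is a native errno from kernel/libc. */
    return strerror(err);
}
```

A caller can then test for ENOMEM or EAGAIN directly, without a second
error domain to translate from.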
From libnl3, we only used the helper function to parse/generate netlink
messages and the socket functions to send/receive messages. We don't
need an external dependency to do that, it is simple enough.
Drop the libnl3 dependency, and replace all missing code by directly
copying it from libnl3 sources. At this point, I mostly tried to
import the required bits to make it working with few modifications.
Note that this increases the binary size of NetworkManager by 4736 bytes
for contrib/rpm build on x86_64. In the future, we can simplify the code
further.
A few modifications from libnl3 are:
- netlink errors NLE_* are now in the domain of regular errno.
The distinction of having to bother with two kinds of error
number domains was annoying.
- parts of the callback handling are copied partially and unused parts
are dropped. Especially, the verbose/debug handlers are not used.
In following commits, the callback handling will be significantly
simplified.
- the complex handling of selecting ports was simplified. We now always
let kernel choose the right port automatically.
Add a function that allows to re-request all objects of a certain type.
Usually, the cache is supposed to keep itself in a consistent state and
this function is not useful.
It is however useful during testing and debugging to explicitly reload
an object type.
If you ever think you need this function in non-testing code, then
something else is probably wrong with the cache implementation.
Add a WifiDataClass struct, that is immutable and contains all the
function pointers that were previously embedded in WifiData directly.
They are not ever modified after creation, hence this allows to have
a "static const" allocated instance of the VTable.
Also rename wifi_data_deinit() to wifi_data_unref(). It not only
deinitializes the instance, it also frees it. Hence, rename it
to "unref()".
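The pattern can be sketched roughly as follows. All Ex* names are
hypothetical; the real WifiDataClass has different members. The point is
only that the function pointers move into one immutable, statically
allocated class struct that every instance references, and that unref()
both deinitializes and frees.

```c
#include <stdlib.h>

typedef struct ExWifiData ExWifiData;

/* Immutable table of function pointers, shared by all instances. */
typedef struct {
    int  (*get_rate)(ExWifiData *data);
    void (*deinit)(ExWifiData *data);
} ExWifiDataClass;

/* Per-instance data only carries a pointer to its class. */
struct ExWifiData {
    const ExWifiDataClass *klass;
    int                    ifindex;
};

static int
ex_backend_get_rate(ExWifiData *data)
{
    (void) data;
    return 54000; /* made-up value for the example */
}

/* One "static const" instance per backend; never modified. */
static const ExWifiDataClass ex_backend_class = {
    .get_rate = ex_backend_get_rate,
    .deinit   = NULL,
};

static ExWifiData *
ex_wifi_data_new(const ExWifiDataClass *klass, int ifindex)
{
    ExWifiData *data = calloc(1, sizeof(*data));

    if (!data)
        return NULL;
    data->klass   = klass;
    data->ifindex = ifindex;
    return data;
}

/* Deinitializes via the class hook (if any) and frees the instance. */
static void
ex_wifi_data_unref(ExWifiData *data)
{
    if (!data)
        return;
    if (data->klass->deinit)
        data->klass->deinit(data);
    free(data);
}
```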
Kernel (as of 4.14) merely ACKs our RTM_DELQDISC and RTM_DELTFILTER, not
bothering to signal the full RTM_DEL* message unless the removal is
external to NetworkManager.
https://bugzilla.redhat.com/show_bug.cgi?id=1527197
NM_FLAGS_HAS() uses a static-assert that the second argument is a
single flag (power of two). With a single flag, NM_FLAGS_HAS(),
NM_FLAGS_ANY() and NM_FLAGS_ALL() are all identical.
The second argument must be a compile time constant, and if that is
not the case, one must not use NM_FLAGS_HAS().
Use NM_FLAGS_ANY() in these cases.
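To make the constraint concrete, here is a simplified sketch with
hypothetical EX_* macros. They are not the actual NM_FLAGS_*()
definitions, and the sketch assumes a C11 compiler for _Static_assert.
Because the check sits inside a static assertion, the second argument of
EX_FLAGS_HAS() has to be an integer constant expression; a runtime mask
only works with the ANY/ALL variants.

```c
/* True if exactly one bit is set. */
#define EX_IS_SINGLE_FLAG(f) ((f) > 0 && (((f) & ((f) - 1)) == 0))

#define EX_FLAGS_ANY(flags, check) (((flags) & (check)) != 0)
#define EX_FLAGS_ALL(flags, check) (((flags) & (check)) == (check))

/* Like EX_FLAGS_ANY(), but rejects at compile time any "check" that
 * is not a constant with a single bit set. */
#define EX_FLAGS_HAS(flags, check)                                   \
    (sizeof(struct {                                                 \
         _Static_assert(EX_IS_SINGLE_FLAG(check),                    \
                        "check must be a single constant flag");     \
         int unused;                                                 \
     })                                                              \
     && EX_FLAGS_ANY((flags), (check)))

enum {
    EX_FLAG_A = 1 << 0,
    EX_FLAG_B = 1 << 1,
    EX_FLAG_C = 1 << 2,
};

/* Constant single flag: EX_FLAGS_HAS() is fine. */
static int
ex_has_b(unsigned flags)
{
    return EX_FLAGS_HAS(flags, EX_FLAG_B);
}

/* "mask" is only known at runtime: EX_FLAGS_HAS() would not compile
 * here, so use EX_FLAGS_ANY() instead. */
static int
ex_has_any(unsigned flags, unsigned mask)
{
    return EX_FLAGS_ANY(flags, mask);
}
```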
We're going to need that one for TC filter & action support.
<linux/tc_act/tc_defact.h> was moved to user-space API only in 2013
by commit 5bc3db5c9ca8407f52918b6504d3b27230defedc. Our travis CI currently
fails to build due to that.
Re-implement the header.
It only makes sense to call delete() with NMPObjects that
we obtained from the platform cache. Otherwise, if we didn't
get it from the cache in the first place, we wouldn't know
what to delete.
Hence, the input argument is (almost) always an NMPObject
in the first place. That is different from add(), where
we might create a new specific NMPlatform* instance on the
stack. For add() it makes slightly more sense to have different
functions depending on the type. For delete(), it doesn't.
We also do this for libnm, where it causes visible changes
in behavior. But if somebody would rely on the hashing implementation
for hash tables, it would be seriously flawed.
The file descriptor is owned by the netlink socket instance,
which we close in finalize. We must not close it when destroying
the IO channel, otherwise the file descriptor gets closed twice.
Closing an invalid file descriptor (or a descriptor that is already closed)
is a serious bug, because the integer values are re-used, so there is a race
that the close might affect an innocent file descriptor instead of just
failing with EBADF.
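The reuse problem can be reproduced with plain POSIX calls. The function
below is a hypothetical demonstration, unrelated to the actual platform
code, and assumes a single-threaded process so that no other thread opens
descriptors in between. POSIX open() returns the lowest free descriptor
number, so the number released by the first close() is handed out again
right away, and a stale second close() then succeeds on the wrong
descriptor.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a stale double close() silently closed an unrelated
 * descriptor, 0 if it did not, and -1 if open() failed. */
static int
ex_stale_close_hits_innocent_fd(void)
{
    int owned;
    int innocent;
    int stale_result;

    owned = open("/dev/null", O_RDONLY);
    if (owned < 0)
        return -1;

    /* The legitimate owner closes its descriptor. */
    close(owned);

    /* Unrelated code opens a file and gets the same number back. */
    innocent = open("/dev/null", O_RDONLY);
    if (innocent < 0)
        return -1;

    /* The bug: a second close() of the old number. It does not fail
     * with EBADF, it closes the innocent descriptor instead. */
    stale_result = close(owned);

    if (innocent != owned) {
        /* No reuse happened; clean up and report that. */
        close(innocent);
        return 0;
    }

    /* The innocent descriptor is now invalid behind its user's back. */
    return stale_result == 0 && fcntl(innocent, F_GETFD) == -1;
}
```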