The routing-rule tests generate a number of routing rules and tries to
add and delete them.
For that, _rule_create_random() sets random fields of the rule.
Note that especially interesting are rules that leave most fields
unset (at zero), because they trigger kernel issues rh#1686075 and
rh#1685816.
But a rule has many fields, so in order to generate rules that have most
fields unset, we need to use low probabilities when rolling the dice for
setting a field. Otherwise, most rules end up with several fields set
and don't reproduce the kernel issue (especially the test failed to hit
rh#1686075).
Routing rules are unlike addresses or routes not tied to an interface.
NetworkManager thinks in terms of connection profiles. That works well
for addresses and routes, as one profile configures addresses and routes
for one device. For example, when activating a profile on a device, the
configuration does not interfere with the addresses/routes of other
devices. That is not the case for routing rules, which are global, netns-wide
entities.
When one connection profile specifies rules, then this per-device configuration
must be merged with the global configuration. And when a device disconnects later,
the rules must be removed.
Add a new NMPRulesManager API to track/untrack routing rules. Devices can
register/add there the routing rules they require. And the sync method will
apply the configuration. This is be implemented on top of NMPlatform's
caching API.
Add and implement NMPlatformRoutingRule types and let the platform cache
handle rules.
Rules are special in two ways:
- they don't have an ifindex. That makes them different from all other
currently existing NMPlatform* types, which have an "ifindex" field and
"implement" NMPlatformObjWithIfindex.
- they have an address family, but contrary to addresses and routes, there
is only one NMPlatformRoutingRule object to handle both address
families.
Both of these points require some special considerations.
Kernel treats routing-rules quite similar to routes. That is, kernel
allows to add different rules/routes, as long as they differ in certain
fields. These "fields" make up the identity of the rules/routes. But
in practice, it's not defined which fields contribute to the identity
of these objects. That makes using the netlink API very hard. For
example, when kernel gains support for a new attribute which
NetworkManager does not know yet, then users can add two rules/routes
that look the same to NetworkManager. That can easily result in cache
inconsistencies.
Another problem is, that older kernel versions may not yet support all
fields, which NetworkManager (and newer kernels) considers for identity.
The older kernel will not simply reject netlink messages with these unknown
keys, instead it will proceed adding the route/rule without it. That means,
the added route/rule will have a different identity than what NetworkManager
intended to add.
Currently, there is a directy one to one relation between
- DELAYED_ACTION_TYPE_REFRESH_ALL_*
- REFRESH_ALL_TYPE_*
- NMP_OBJECT_TYPE_*
For IP addresses, routes and routing policy rules, when we request
a refresh-all (NLM_F_DUMP), we want to specify the address family.
For addresses and routes that is currently solved by having two
sets of NMPObject types, for each address family one.
I think that is cumbersome because the implementations of both address
families are quite similar. By implementing both families as different
object types, we have a lot of duplicate code and it's hard to see where
the families actually differ. It would be better to have only one NMPObject
type, but then when we "refresh-all" such types, we still want to be able
to dump all (AF_UNSPEC) or only a particular address family (AF_INET, AF_INET6).
Decouple REFRESH_ALL_TYPE_* from NMP_OBJECT_TYPE_* to make that
possible.
While these numbers are strongly related to DELAYED_ACTION_TYPE_REFRESH_ALL_*,
they differ in their meaning.
These are the refresh-all-types that we support. While some of the delayed-actions
are indeed for refresh-all, they are not the same thing.
Rename the enum.
The function is unused. It would require redesign to work with
future changes, and since it's unused, just drop it.
The long reasoning is:
Currently, a refresh-all is tied to an NMPObjectType. However, with
NMPObjectRoutingRule (for policy-routing-rules) that will no longer
be the case.
That is because NMPObjectRoutingRule will be one object type for
AF_INET and AF_INET6. Contrary to IPv4 addresses and routes, where
there are two sets of NMPObject types.
The reason is, that it's preferable to treat IPv4 and IPv6 objects
similarly, that is: as the same type with an address family property.
That also follows netlink, which uses RTM_GET* messages for both
address families, and the address family is expressed inside the
message.
But then an API like nm_platform_refresh_all() makes little sense,
it would require at least an addr_family argument. But since the
API is unused, just drop it.
This allows the compiler to see that this function does nothing for %NULL.
That is not so unusual, as we use nm_auto_nmpobj to free objects. Often
the compiler can see that these pointers are %NULL.
Until now, all implemented NMPObject types have an ifindex field (from
links, addresses, routes, qdisc to tfilter).
The NMPObject structure contains a union of all available types, that
makes it easier to down-case from an NMPObject pointer to the actual
content.
The "object" field of NMPObject of type NMPlatformObject is the lowest
common denominator.
We will add NMPlatformRoutingRules (for policy routing rules). That type
won't have an ifindex field.
Hence, drop the "ifindex" field from NMPlatformObject type. But also add
a new type NMPlatformObjWithIfindex, that can represent all types that
have an ifindex.
Based on Ubuntu's "Modify NMDeviceModem's available logic" patch by
Tony Espy <espy@canonical.com>. The original commit message:
This patch modifies NMDeviceModem's available logic such that the device
is only considered available if the modem_state is
>= NM_MODEM_STATE_REGISTERED. NMDevice defines 'available' as meaning the
device is in such a state that it can be activated. This change prevents
NM from trying to activate a modem which is not yet ready to be activated.
Bug-Ubuntu: https://bugs.launchpad.net/bugs/1445080https://github.com/NetworkManager/NetworkManager/pull/312
On Fedora rawhide we get the following build failure:
In file included from shared/systemd/src/basic/alloc-util.c:3:
./shared/systemd/sd-adapt-shared/nm-sd-adapt-shared.h:114:21: error: static declaration of 'gettid' follows non-static declaration
114 | static inline pid_t gettid(void) {
| ^~~~~~
In file included from /usr/include/unistd.h:1170,
from /usr/include/glib-2.0/gio/gcredentials.h:32,
from /usr/include/glib-2.0/gio/gio.h:46,
from ./shared/nm-utils/nm-macros-internal.h:31,
from ./shared/nm-default.h:293,
from ./shared/systemd/sd-adapt-shared/nm-sd-adapt-shared.h:22,
from shared/systemd/src/basic/alloc-util.c:3:
/usr/include/bits/unistd_ext.h:34:16: note: previous declaration of 'gettid' was here
34 | extern __pid_t gettid (void) __THROW;
| ^~~~~~
glibc supports now gettid() call ([1]) which conflicts with our compat
implementation. Rename it.
[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=1d0fc213824eaa2a8f8c4385daaa698ee8fb7c92
(cherry picked from commit 10276322bd)
On Fedora rawhide we get the following build failure:
In file included from shared/systemd/src/basic/alloc-util.c:3:
./shared/systemd/sd-adapt-shared/nm-sd-adapt-shared.h:114:21: error: static declaration of 'gettid' follows non-static declaration
114 | static inline pid_t gettid(void) {
| ^~~~~~
In file included from /usr/include/unistd.h:1170,
from /usr/include/glib-2.0/gio/gcredentials.h:32,
from /usr/include/glib-2.0/gio/gio.h:46,
from ./shared/nm-utils/nm-macros-internal.h:31,
from ./shared/nm-default.h:293,
from ./shared/systemd/sd-adapt-shared/nm-sd-adapt-shared.h:22,
from shared/systemd/src/basic/alloc-util.c:3:
/usr/include/bits/unistd_ext.h:34:16: note: previous declaration of 'gettid' was here
34 | extern __pid_t gettid (void) __THROW;
| ^~~~~~
glibc supports now gettid() call ([1]) which conflicts with our compat
implementation. Rename it.
[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=1d0fc213824eaa2a8f8c4385daaa698ee8fb7c92
Using test-networkmanager-servic.py, I get the following error when
trying to add manual config with a dns address:
Error: g-io-error-quark: Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/dbus/service.py", line 707, in _message_cb
retval = candidate_method(self, *args, **keywords)
File "tools/test-networkmanager-service.py", line 1727, in AddConnection
return self.add_connection(con_hash)
File "tools/test-networkmanager-service.py", line 1731, in add_connection
con_inst = Connection(self.c_counter, con_hash, do_verify_strict)
File "tools/test-networkmanager-service.py", line 1601, in __init__
NmUtil.con_hash_verify(con_hash, do_verify_strict=do_verify_strict)
File "tools/test-networkmanager-service.py", line 497, in con_hash_verify
BusErr.raise_nmerror(e)
File "tools/test-networkmanager-service.py", line 419, in raise_nmerror
raise e
Exception: Unsupported value ipv4.dns = dbus.Array([dbus.UInt32(168430090L), dbus.UInt32(218893066L)], signature=dbus.Signature('u'), variant_level=1) (Cannot convert array element to type 'u': Must be number, not Variant)
https://mail.gnome.org/archives/networkmanager-list/2019-March/msg00013.html
(cherry picked from commit 9a71d7d273)
When the link goes down the kernel removes IPv6 addresses from the
interface. In update_ext_ip_config() we detect that addresses were
removed externally and drop them from various internal
configurations. Don't do that if the link is down so that those
addresses will be restored again on link up.
(cherry picked from commit 505d2adbc2)
Add a new argument to nm_ip_config_* helpers to also ignore addresses
similarly to what we already do for routes. This will be used in the
next commit; no change in behavior here.
(cherry picked from commit 39b7257208)
We can detect false DAD failures if the link goes down. Don't try to
prevent them, but just reset the counter if the link goes down.
(cherry picked from commit 056470a4ba)
When the interface is down DAD failures becomes irrelevant and we
shouldn't try to add a link-local address even if the configuration
contains other IPv6 addresses.
(cherry picked from commit 72385f363c)
When the link goes down the kernel removes IPv6 addresses from the
interface. In update_ext_ip_config() we detect that addresses were
removed externally and drop them from various internal
configurations. Don't do that if the link is down so that those
addresses will be restored again on link up.
Add a new argument to nm_ip_config_* helpers to also ignore addresses
similarly to what we already do for routes. This will be used in the
next commit; no change in behavior here.
When the interface is down DAD failures becomes irrelevant and we
shouldn't try to add a link-local address even if the configuration
contains other IPv6 addresses.