Kernel allows to add the same IPv4 address that only differs by
peer-address (IFL_ADDRESS):
$ ip link add dummy type dummy
$ ip address add 1.1.1.1 peer 1.1.1.3/24 dev dummy
$ ip address add 1.1.1.1 peer 1.1.1.4/24 dev dummy
RTNETLINK answers: File exists
$ ip address add 1.1.1.1 peer 1.1.2.3/24 dev dummy
$ ip address show dev dummy
2: dummy@NONE: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 52:58:a7:1e:e8:93 brd ff:ff:ff:ff:ff:ff
inet 1.1.1.1 peer 1.1.1.3/24 scope global dummy
valid_lft forever preferred_lft forever
inet 1.1.1.1 peer 1.1.2.3/24 scope global dummy
valid_lft forever preferred_lft forever
We must also consider peer-address, otherwise platform will treat
two different addresses as one and the same.
https://bugzilla.gnome.org/show_bug.cgi?id=756356
The peer-address seems less important then the prefix-length.
Also, nm_platform_ip4_address_delete() has the peer-address
argument as last.
Soon ip4_address_get() also receives a peer-address argument,
so get the order right first.
We get a lot of these debugging message, although the event is entirely
internal to NMLinuxPlatform and only interesting when debugging a problem
in platform itself.
Downgrade to TRACE level.
When debug-logging for platform is enabled, every access to sysctl
is cached (to log the last values).
This cache can grow quite large if the system has a large number of
interfaces (e.g. docker creating veth pairs for each container).
We already used to clear the cache, when we were about to access
sysctl *and* logging was disabled in the meantime.
Now, when logging setup changes, immediately clear the cache.
Having "nm-logging.c" call into platform code is a bit of a hack
and a better design would be to have logging code emit a signal to
which platform would subscribe. But that seems to involve much
more code (especially, as no other users care about such a signal
and because nm-logging is not a GObject).
Also, log a warning when the cache grows large to inform the user
about the cache and what he can do to clear it. The extra effort to
clear the cache when changing logging setup is done so that we do
what we tell the user: changing the logging level, will clear the
cache -- right away, not some time later when the next message is
logged.
When we receive an update for a link, cancel a scheduled
REFRESH_LINK delayed-action for that ifindex. At the point when we
scheduled refrehing the link, we only cared about receiving a
notification that was newer then the current state.
We scheduled requesting this new notification to resync the cache.
It is not necessary to actually request a new update, any update we
receive *after* requesting a new update will suffice.
This potentially saves extra round-trips re-requesting the link.
When moving a link to another netns, it gets removed from
NMPlatform's view.
Currently kernel does not sent a notification to inform about
that change (see related bug rh#1262908).
Ensure that we reload all linked interfaces which now might
have an invisible parent.
Defect type: CHECKED_RETURN
3. NetworkManager-1.0.6/src/platform/nm-linux-platform.c:1145: check_return: Calling "clock_gettime" without checking return value (as is done elsewhere 6 out of 7 times).
The naming of these logging macros is unexpected, as we use such
macros only here in platform.
For these messages we cannot use the default _LOGD() set of macros,
because there is no @platform instance around. So let's introduce an
alternative set of logging macros (_LOG2D(), etc) and use it.
The parent of a link (IFLA_LINK) can be in another network namespace and
thus invisible to NM.
This requires the netlink attribute IFLA_LINK_NETNSID which is supported
by recent versions of kernel and libnl.
In this case, set the parent field to NM_PLATFORM_LINK_OTHER_NETNS
and properly handle this special case.
It can happen on a regular basis when many events get raised.
It is probalby not avoidable and most likely not an issue, so
downgrade the warning to info level.
The logging macros _LOGD(), etc. are specific to each
file as they format the message according to their context.
Still, they were cumbersome to define and their implementation
was repeated over and over (slightly different at times).
Move the declaration of these macros to "nm-logging.h".
The source file now only needs to define _NMLOG(), and either
_NMLOG_ENABLED() or _NMLOG_DOMAIN.
This reduces code duplication and encourages a common implementation
and usage of these macros.
For delete-events, we only need a shallow object with the key fields
set. That sufficies to lookup in the cache and find the object to
delete.
One other issue is that _nmp_vt_cmd_plobj_init_from_nl_link() and
link_extract_type() might call to ethtool for the already deleted
instance. Just avoid that.
https://bugzilla.redhat.com/show_bug.cgi?id=1247156
Coverity detected that it was always-true:
src/platform/nm-linux-platform.c:4035: dead_error_line: Execution cannot reach the expression "preferred != 0U" inside this statement: "if (lifetime != 0U || lifet...".
When adding an IPv4 address, kernel will also add a device-route.
We don't want that route because it has the wrong metric. Instead,
we add our own route (with a different metric) and remove the
kernel-added one.
This could be avoided if kernel would support an IPv4 address flag
IFA_F_NOPREFIXROUTE like it does for IPv6 (see related bug rh#1221311).
One important thing is, that we want don't want to manage the
device-route on assumed devices. Note that this is correct behavior
if "assumed" means "do-not-touch".
If "assumed" means "seamlessly-takeover", then this is wrong.
Imagine we get a new DHCP address. In this case, we would not manage
the device-route on the assumed device. This cannot be fixed without
splitting unmanaged/assumed with related bug bgo 746440.
This is no regression as we would also not manage device-routes
for assumed devices previously.
We also don't want to remove the device-route if the user added
it externally. Note that here we behave wrongly too, because we
don't record externally added kernel routes in update_ip_config().
This still needs fixing.
Let IPv4 device-routes also be managed by NMRouteManager. NMRouteManager
has a list of all routes and can properly add, remove, and restore
the device route as needed.
One problem is, that the device-route does not get added immediately
with the address. It only appears some time later. This is solved
by NMRouteManager watching platform and if a matchin device-route shows up
within a short time after configuring addresses, remove it.
If the route appears after the short timeout, assume they were added for
other reasons (e.g. by the user) and don't remove them.
https://bugzilla.gnome.org/show_bug.cgi?id=751264https://bugzilla.redhat.com/show_bug.cgi?id=1211287
build_rtnl_addr() has two parameters "lifetime" and "preferred". Both
count from *now*.
Fix nmp_object_to_nl() to properly set these timestamps. This bug had
not real consequences, because the only place where we use
nmp_object_to_nl() the arguments are 0.
Change nm_platform_link_get() to return the cached NMPlatformLink
instance. Now what all our implementations (fake and linux) have such a
cache internal object, let's just expose it directly.
Note that the lifetime of the exposed link object is possibly quite
short. A caller must copy the returned value if he intends to preserve
it for later.
Also add nm_platform_link_get_by_ifname() and modify nm_platform_link_get_by_address()
to return the instance.
Certain functions, such as nm_platform_link_get_name(),
nm_platform_link_get_ifindex(), etc. are solely implemented based
on looking at the returned NMPlatformLink object. No longer implement
them as virtual functions but instead implement them in the base class
(nm-platform.c).
This removes code and eliminates the redundancy of the exposed
NMPlatformLink instance and the nm_platform_link_get_*() accessors.
Thereby also fix a bug in NMFakePlatform that tracked the link address
in a separate "address" field, instead of using "link.addr". That was
a case where the redundancy actually led to a bug in fake platform.
Also remove some stub implementations in NMFakePlatform that just
bail out. Instead allow for a missing virtual functions and perform
the "default" action in the accessor.
An example for that is nm_platform_link_get_permanent_address().
We must check if a conflicting route/address is configured on
*any* interface, not only on the target ifindex.
This at least restores the previous behavior. Note that this whole
check_reinstall_device_route() still has problems and it might
be better to move it all to NMRouteManager.
Fixes: 470bcefa5f
After doing all the refactoring, rename the ObjectType enum to NMPObjectType.
I didn't do this earlier to avoid problems with rebasing. But since I will
backport the platform changes to nm-1-0, this is the right time to get
the name right.
After refactoring platform, nm_platform_ipx_route_get_all() and
nm_platform_ipx_address_get_all() was broken for calling with a
non-posititive ifindex (which has the meaning: ignore ifindex).
While fixing that, also refactor the NMPCacheIdType so that it matches
better the supported id-types.
Fixes: 470bcefa5f
For NMPlatform instances we had an error reporting mechanism
which stores the last error reason in a private field. Later we
would check it via nm_platform_get_error().
Remove this. It was not used much, and it is not a great way
to report errors.
One problem is that at the point where the error happens, you don't
know whether anybody cares about an error code. So, you add code to set
the error reason because somebody *might* need it (but in realitiy, almost
no caller cares).
Also, we tested this functionality which is hardly used in non-testing code.
While this was a burden to maintain in the tests, it was likely still buggy
because there were no real use-cases, beside the tests.
Then, sometimes platform functions call each other which might overwrite the
error reason. So, every function must be cautious to preserve/set
the error reason according to it's own meaning. This can involve storing
the error code, calling another function, and restoring it afterwards.
This is harder to get right compared to a "return-error-code" pattern, where
every function manages its error code independently.
It is better to return the error reason whenever due. For that we already
have our common glib patterns
(1) gboolean fcn (...);
(2) gboolean fcn (..., GError **error);
In few cases, we need more details then a #gboolean, but don't want
to bother constructing a #GError. Then we should do instead:
(3) NMPlatformError fcn (...);