Note: here I refer to the numbers in a version as MAJOR.MINOR.MICRO.
Having stable and development releases do make sense for the MINOR
version, because we maintain separate branches for them and they
evolve separately. We have 1.47.z where we put all the changes so
anyone can pick the latest development release and test it. At the
same time, we have 1.46.z with the latest stable released version.
However, it does not make sense to have 1.46.2 and 1.46.3-dev because
the latter is not a development version. It is identical to 1.46.2,
only the version number has been bumped, there are no changes to test.
When we add commits, we will be actually testing 1.46.3-dev + some
commits, which is exactly the same as testing 1.46.2 + some commits.
So, basically, someone can use the releases of a development BRANCH,
like 1.47.4, to test the development version of NM. But using a
development MICRO version is exactly the same as using a
non-development one.
From now on, we will just increment the MICRO version each time we do a
release on a stable branch and won't create the '-dev' tag. Update
release.sh to do it this way.
Although VPN plugins are developed separately, better to document here
how to do the release of those that we maintain to avoid having to do
the changes on every repository each time.
Usually, when the method is "auto" we want to avoid configuring routes
until the automatic method completes. To achieve that, we clear the
"allow_routes_without_address" flag of l3cds when the method is "auto".
For VPNs, IP configurations with only routes are perfectly valid,
therefore set the flag.
The name "dhcp_enabled" is misleading because the flag is set for
method=auto, which doesn't necessarily imply DHCP. Also, it doesn't
convey what the flag is used for. Rename it to
"allow_routes_without_address".
An IPv4-over-IPv6 (or vice-versa) IPsec VPN can return IP
configurations with routes and without addresses. For example, in this
scenario:
+---------------+ +---------------+
| fd01::10/64 <-- VPN --> fd02::20/64 |
| host1 | | host2 |
+-------^-------+ +-------^-------+
| |
+-------v-------+ +-------v-------+
| subnet1 | | subnet2 |
| 172.16.1.0/24 | | 172.16.2.0/24 |
+---------------+ +---------------+
host1 and host2 establish a IPv6 tunnel which encapsulates packets
between the two IPv4 subnets. Therefore, in routed mode, host1 will
need to configure a route like "172.16.2.0/24 via ipsec1" even if the
host doesn't have any IPv4 address on the VPN interface.
Accept IP configurations without address from the VPN; only check that
the address and prefix are sane if they are provided.
Translatable strings were added to nm-setting-generic.c. Add this file to
POTFILES.in.
Fixes: 9322c3e9db ('libnm: add generic.device-handler property')
Connection timestamps are updated (saved to disk) on connection up and
down. This way, the last used connection will take precedence for
autoconnect if they have the same priority.
But as we don't actually do connection down when NM stops, the last
connection timestamp of all active connections is the timestamp of when
they were brought up. Then, the activation order might be wrong on next
start.
One case where timestamps are wrong (although it is not clear how
important it is because the connections are activated on different
interfaces):
1. Activate con1 <- timestamp updated
2. Activate con2 <- timestamp updated
3. Deactivate con2 <- timestamp updated
4. Stop NM <- timestamp of con2 is higher than con1, but con1 was still
active when con2 was brought down.
Other case that is reproducible (from
https://issues.redhat.com/browse/RHEL-35539):
1. Activate con1
2. Activate con2 on same interface:
- As a consequence con1 is deactivated and its timestamp updated
- The timestamp of con2 is also updated
3. Stop NM <- timestamp of con1 and con2 is the same, next activation
order will be undefined.
Fix by saving the timestamps on NM shutdown.
Problem:
Given a OVS port with `autoconnect-ports` set to default or false,
when reactivation required for checkpoint rollback,
previous activated OVS interface will be in deactivate state after
checkpoint rollback.
The root cause:
The `activate_stage1_device_prepare()` will mark the device as
failed when controller is deactivating or deactivated.
In `activate_stage1_device_prepare()`, the controller device is
retrieved from NMActiveConnection, it will be NULL when NMActiveConnection
is in deactivated state. This will cause device been set to
`NM_DEVICE_STATE_REASON_DEPENDENCY_FAILED` which prevent all follow
up `autoconnect` actions.
Fix:
When noticing controller is deactivating or deactivated with reason
`NM_DEVICE_STATE_REASON_NEW_ACTIVATION`, use new function
`nm_active_connection_set_controller_dev()` to wait on controller
device state between NM_DEVICE_STATE_PREPARE and
NM_DEVICE_STATE_ACTIVATED. After that, use existing
`nm_active_connection_set_controller()` to use new
NMActiveConnection of controller to move on.
Resolves: https://issues.redhat.com/browse/RHEL-31972
Signed-off-by: Gris Ge <fge@redhat.com>
Commit 797f3cafee ('device: fall back to saved use_tempaddr value
instead of rereading /proc') changed the behaviour of how to get the
last resort default value for ip6-privacy property.
Previously we read it from /proc/sys/net/ipv6/conf/default, buf after
this commit we started to read /proc/sys/net/ipv6/conf/<iface> instead,
because the user might have set a different value specific for that device.
As NetworkManager changes that value on connection activation, we used
the value read at the time that NetworkManager was started.
Commit 6cb14ae6a6 ('device: introduce ipv6.temp-valid-lifetime and
ipv6.temp-preferred-lifetime properties') introduced 2 new IPv6 privacy
related properties relying on the same mechanism.
However, this new behaviour is problematic because it's not predictable
nor reliable:
- NetworkManager is normally started at boot time. That means that, if a
user wants to set a new value to /proc/sys/net/ipv6/conf/<iface>,
NetworkManager is likely alread running, so the change won't take
effect.
- If NetworkManager is restarted it will read the value again, but this
value can be the one set by NetworkManager itself in the last
activation. This means that different values can be used as default in
the same system boot depending on the restarts of NetworkManager.
Moreover, this weird situation might happen:
- Connection A with ip6-privacy=2 is activated
- NetworkManager is stopped. The value in
/proc/sys/net/ipv6/conf/<iface>/use_tempaddr remains as 2.
- NetworkManager starts. It reads from /proc/sys/... and saves the value
'2' as the default.
- Connection B with no ip6-privacy setting is activated. The '2' saved
as default value is used. The connection didn't specify any value for
it, and the value '2' was set by another connection for that specific
connection only, not manually by a user that wanted '2' to be the
default.
A user shouldn't have to think on when NetworkManager starts or restarts
to known in an easy and predictable way what the default value for
certain property is. It's totally counterintuitive.
Revert back to the old behaviour of reading from
/proc/sys/net/ipv6/conf/default. Although this value is used by the
kernel only for newly created interfaces, and not for already existing
ones, it is reasonable to think on these settings as "systemwide
defaults" that the user has chosen.
Note that setting a different default in NetworkManager.conf still takes
precedence.
We are planning on completely dropping Autotools in the future.
This breaks the build process with an argument to ignore the deprecation,
so that anyone building NM is warned of this change.
This job was supposed to run periodically. However, it stopped working
when a "workflow" section was added to .gitlab-ci.yml because it
prevented pipelines of the type "scheduled" to be created.
7fa72645e5 ('gitlab-ci: make detached MR pipeline for external contributor's pipelines to run')
Now, if it's run, it fails with error:
multi_xml requires Ruby version >= 3.1.4. The current ruby version is 2.7.8.225.
Let's disable the job until we fix it and we decide what triage we want
to do. When we do it, we will need to adapt the jobs to be run with the
right periodicity, maybe using custom pipeline variables.
This will force to regenerate the various images of the distributions
that we want to test so we get an updated snapshot on them. Otherwise we
might be testing a months old version of them.
A bug in ci-fairy was making the deletion of the images to partially
fail. It is fixed in the latest version of ci-fairy, so we need to
update the value of templates_sha to pick it.
The task will run only on pipelines of type "scheduled". Then we can
create a weekly scheduled pipeline in Gitlab.
If a connection is in-memory (i.e. has flag "unsaved"), after a
checkpoint and rollback it can be wrongly persisted to disk:
- if the connection was modified and written to disk after the
rollback, during the rollback we update it again with persist mode
"keep", which keeps it on disk;
- if the connection was deleted after the rollback, during the
rollback we add it again with persist mode "to-disk".
Instead, remember whether the connection had the "unsaved" flag set
and try to restore the previous state.
However, this is not straightforward as there are 4 different possible
states for the settings connection: persistent; in-memory only;
in-memory shadowing a persistent file; in-memory shadowing a detached
persistent file (i.e. the deletion of the connection doesn't delete
the persistent file). Handle all those cases.
Fixes: 3e09aed2a0 ('checkpoint: add create, rollback and destroy D-Bus API')
When we recibe a Netlink message with a "route change" event, normally
we just ignore it if it's a route that we don't track (i.e. because of
the route protocol).
However, it's not that easy if it has the NLM_F_REPLACE flag because
that means that it might be replacing another route. If the kernel has
similar routes which are candidates for the replacement, it's hard for
NM to guess which one of those is being replaced (as the kernel doesn't
have a "route ID" or similar field to indicate it). Moreover, the kernel
might choose to replace a route that we don't have on cache, so we know
nothing about it.
It is important to note that we cannot just discard Netlink messages of
routes that we don't track if they has the NLM_F_REPLACE. For example,
if we are tracking a route with proto=static, we might receive a replace
message, changing that route to proto=other_proto_that_we_dont_track. We
need to process that message and remove the route from our cache.
As NM doesn't know what route is being replaced, trying to guess will
lead to errors that will leave the cache in an inconsistent state.
Because of that, it just do a cache resync for the routes.
For IPv4 there was an optimization to this: if we don't have in the
cache any route candidate for the replacement there are only 2 possible
options: either add the new route to the cache or discard it if we are
not interested on it. We don't need a resync for that.
This commit is extending that optimization to IPv6 routes. There is no
reason why it shouldn't work in the same way than with IPv4. This
optimization will only work well as long as we find potential candidate
routes in the same way than the kernel (comparing the same fields). NM
calls to this "comparing by WEAK_ID". But this can also happen with IPv4
routes.
It is worth it to enable this optimization because there are routing
daemons using custom routing protocols that makes tens or hundreds of
updates per second. If they use NLM_F_REPLACE, this caused NM to do a
resync hundreds of times per second leading to a 100% CPU usage:
https://issues.redhat.com/browse/RHEL-26195
An additional but smaller optimization is done in this commit: if we
receive a route message for routes that we don't track AND doesn't have
the NLM_F_REPLACE flag, we can ignore the entire message, thus avoiding
the memory allocation of the nmp_object. That nmp_object was going to be
ignored later, anyway, so better to avoid these allocations that, with
the routing daemon of the above's example, can happen hundreds of times
per second.
With this changes, the CPU usage doing `ip route replace` 300 times/s
drops from 100% to 1%. Doing `ip route replace` as fast as possible,
without any rate limitting, still keeps NM with a 3% CPU usage in the
system that I have used to test.
The D-Bus and C APIs admit setting the 802.1X certificates as blobs, as
the documentation of the properties explains. However, this is not
possible from nmcli, where only path to the certificates' files is possible.
This difference in nmcli was explained in the description message that
is shown in nmcli's editor, but this is a documentation that most users
won't ever see, and still the main documentation in nm-settings-nmcli is
missleading.
Add a nmcli specific documentation for the relevant properties and
remove the nmcli's editor descriptions as they are no longer needed.