With multi-connect enabled, this can cause infinite retries to autoconnect,
see [1].
That has bad consequences for example in initrd, where
nm-wait-online-initrd.service would wait up to one hour before failing
and blocking boot.
This reverts commit 1656d82045.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2039734#c5
Fixes: 1656d82045 ('policy: track the autoconnect retries in devices for multi-connect')
Like for our other immutable/sealable types, make ref/unref thread safe.
That is important, as the boxed types only increase the ref-count on
copy. If ref/unref is not thread-safe, it means you cannot copy a boxed
type, and operate on the copy on another thread.
Fixes: 041e38b151 ('libnm: add NMRange')
The types NMBridgeVlan, NMIPRoutingRule, NMRange, NMWireGuardPeer
are immutable (or immutable, after the seal() function is called).
Immutable types are great, as it means a reference to them can be shared
without doing a full clone. Hence the G_DEFINE_BOXED_TYPE() for these
types prefers to take a reference instead of cloning the objects. Except
for sealable types, where it will prefer to clone unsealed values.
Likewise, nm_simple_connection_new_clone() probably will just take
another reference to the value, instead of doing a deep clone.
libnm is not a thread-safe library in the sense that you could pass a
NMConnection or NMClient instance to multiple threads and access them
without your own synchronization. However, it should be possible that
multiple threads access (seemingly) distinct objects.
As the copy function of these boxed types (and nm_simple_connection_new_clone()
and similar) prefers to share the references to immutable types, it is important
that the ref function is thread-safe too. Otherwise you cannot just clone a
NMConnection on thread1, hand the clone to thread2 and operate on the
clone and the original independently. If you do before this patch, you would
hit a subtle race condition.
Avoid that. While atomic operations have a runtime overhead, being safe
is more important. Also, we already save a full malloc()/free() by
having immutable, ref-counted types. We just need to make it safe to use
in order to fully benefit from it.
Previously, we only set the "default-duid" line in the lease file. That
means, if the lease already contained a matching entry with a
"dhcp6.client-id" option, it was not honored. That is wrong.
If the profile has "ipv6.dhcp-duid" set, then we must use it and get
rid of those options from the lease.
It's easy to reproduce:
PROFILE=eth1
nmcli connection down "$PROFILE"
rm -f /var/lib/NetworkManager/*lease
nmcli connection modify "$PROFILE" ipv6.dhcp-duid "aa:bb:cc:dd:00:00:11"
nmcli connection up "$PROFILE"
# Verify the expected duid in /var/lib/NetworkManager/*lease and "/run/NetworkManager/devices/$IFINDEX"
nmcli connection modify "$PROFILE" ipv6.dhcp-duid "aa:bb:cc:dd:00:00:22"
nmcli connection up "$PROFILE"
# Check the DUID again.
Splitting by any of "\r\n" and then joining the lines with "\n"
leads to double-newlines. That's certainly wrong.
Maybe we shouldn't care about "\r", I don't know why this was done. But
handle it differently.
dhclient writes binary data as colon-separated hex strings
like nm_utils_bin2hexstr_full() does. But it only writes single
digits for values smaller than 0x10. Add an option to support
that mode.
However, there are many callers of nm_utils_bin2hexstr_full() already,
and they all don't care about the new option. Maybe this should this
not be a boolean argument, instead the function should accept a
flags argument. That is not done for now. Just add another "fuller"
variant. It's still easy to understand, because the "full" variant
is just a more limited functionality of "fuller".
Of course, the old "priv->effective_client_id" and the new
"client_id" instances are truly separate, that is, they don't
share data, and destroying "priv->effective_client_id" before
taking a reference on "client_id" causes no problem.
It's still a code smell. It makes the function unnecessarily unsafe
under (very unusual) circumstances.
The point of using this trivial helper function is to have one function
that is related to the construction of the options dictionary, that we
can search for.
It answers the question, where do we create a option hash (at `git grep
nm_dhcp_option_create_options_dict`).
The "lease" mode is unusual, because it means to prefer the DUID
configuration from the DHCP plugin over the explicit configuration in
NetworkManager. It is only for the DHCPv6 DUID and not for the IPv4
client-id. It also is only special for the "dhclient" plugin, because
with the internal plugin, this always corresponds to a generated, stable
DUID.
Commit 58287cbcc0 ('core: rework IP configuration in NetworkManager
using layer 3 configuration') broke this. The commit refactored the code
to track the effective-client-id separately. Previously, the client-id which
was read from the dhclient lease, was overwriting NMDhcpClient.client_id. But
with the refactor, it broke because nm_dhcp_client_get_effective_client_id()
was never called.
Fix that.
Fixes: 58287cbcc0 ('core: rework IP configuration in NetworkManager using layer 3 configuration')
Note that there are no callers of nm_dhcp_client_get_effective_client_id(),
hence calling the setter had no effect. This is a bug, that we will fix
later.
But before fixing the bug, change how this works. Drop the get_duid() hook.
It's only confusing and backward.
We will keep the nm_dhcp_client_[gs]et_effective_client_id() functions.
They will be used later.
The "effective-client-id" is handled wrongly. Step 1 to clean this up.
Note that NMDhcpClientPrivate.effective_client_id is only ever get/set
via the nm_dhcp_client_[gs]et_effective_client_id() functions.
Note that only a NMDhcpDhclient instance ever calls
nm_dhcp_client_set_effective_client_id().
Hence, for NMDhcpSystemd the effective-client-id is really just the DUID
from the config. Clean this up by not calling nm_dhcp_client_get_effective_client_id()
but use the config directly. There is no change in behavior here.
The current implementation only checks that a device with name equal
to veth.peer exists and it has a parent device; it doesn't check that
its parent is actually the device we want to create. So for example,
if the profile specifies interface-name A and peer B, while in
platform we have a veth pair {B,C}, we'll skip the interface creation
and the device will remain without a ifindex, leading to a crash
later. Fix this by adding the missing check.
While at it, don't implement the check by inspecting NMDevices but
look directly at the platform cache; that seems more robust because
devices are often updated from platform events via idle handlers and
so the information there could be outdated.
Fixes: 07e0ab48d1 ('veth: drop iface peer check during create_and_realize()')
https://bugzilla.redhat.com/show_bug.cgi?id=2129829
eceefe959250 doc: update README.md for typography
df7e0ac7a792 build: release v1.3.0
293d76aded19 test-basic: use `non_constant_expr`
12f8380286f3 generic: handle compile time expression in _c_boolean_expr_(),_c_likely_()/_c_unlikely_()
92b25e384e3b test/basic: add tests for _c_boolean_expr_
4c1765bc0b4d test/api: move _c_always_inline_ test to generic group
fe95c7a78fe9 test/api: add missing test for _c_boolean_expr_
git-subtree-dir: src/c-stdaux
git-subtree-split: eceefe9592501bce485db62966853b361e90ec2f
G_TYPE_CHECK_INSTANCE_CAST() can trigger a "-Wcast-align":
src/core/devices/nm-device-macvlan.c: In function 'parent_changed_notify':
/usr/include/glib-2.0/gobject/gtype.h:2421:42: error: cast increases required alignment of target type [-Werror=cast-align]
2421 | # define _G_TYPE_CIC(ip, gt, ct) ((ct*) ip)
| ^
/usr/include/glib-2.0/gobject/gtype.h:501:66: note: in expansion of macro '_G_TYPE_CIC'
501 | #define G_TYPE_CHECK_INSTANCE_CAST(instance, g_type, c_type) (_G_TYPE_CIC ((instance), (g_type), c_type))
| ^~~~~~~~~~~
src/core/devices/nm-device-macvlan.h:13:6: note: in expansion of macro 'G_TYPE_CHECK_INSTANCE_CAST'
13 | (G_TYPE_CHECK_INSTANCE_CAST((obj), NM_TYPE_DEVICE_MACVLAN, NMDeviceMacvlan))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
Avoid that by using _NM_G_TYPE_CHECK_INSTANCE_CAST().
This can only be done for our internal usages. The public headers
of libnm are not changed.
For MACsec interfaces, kernel announces the parent ifindex in the
generic IFLA_LINK netlink attribute, which we save in
NMPlatformLink.parent. There is no need to have a dedicate member in
NMPlatformLnkMacsec.
The dedicate member was never set and during a restart of
NetworkManager the parent of the MACsec device could be unset leading
to a failed assertion:
act_stage2_config: assertion 'parent' failed
Fixes: 85103656e9 ('platform: add support for macsec links')
https://bugzilla.redhat.com/show_bug.cgi?id=2122564https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1481
When ovs plugin not installed, we got this error on dbus
org.freedesktop.NetworkManager.Failed
It should be `org.freedesktop.NetworkManager.MissingPlugin`.
Signed-off-by: Gris Ge <fge@redhat.com>
The label "try_again" is only reached by one goto. So it was correct
and sufficient to reset the state only there.
It is still error prone. The slighlty clearer approach is to clear
the state at each begin of the "try_again" step.
There should be no change in behavior.
I didn't confirm, but an optimizing compiler should (could) be able
to see that the cleanup is only necessary on retry, and generate the
same code as before. In any case, we should write code that is easier
to read, not optimize for something that a compiler should be able to
optimize itself.