Glib is not out of memory safe, meaning it always aborts the program
when an allocation fails. It is not possible to meaningfully handle
out of memory when using glib.
Replace all allocation functions for netlink message with their glib
counter part and remove the NULL checks.
With libnl3, each socket has it's own callback structure.
One would often take that callback structure, clone it, modify it
and invoke a receive operation with it.
We don't need this complexity. We got rid of all default handlers,
hence, by default all callbacks are unset.
The only callbacks that are set, are those that we specify immediately
before invoking the receive operation. Just pass the callback structure
at that point.
Also, no more ref-counting, and cloning of the callback structure. It is
so simple, just stack allocate one if you need it.
The only difference between nl_recvmsgs_report() and nl_recvmsgs() is
the return value on success. libnl3 couldn't change that for backward
compatibility reasons. We can merge them.
Originally, these were error numbers from libnl3. These error numbers
are separate from errno, which is unfortunate, because sometimes we
care about the native errno returned from kernel.
Now, refactor them so that the error numbers are in the shared realm
of errno, but failures from kernel or underlying API are still returned
via their native errno.
- NLE_INVAL doesn't exist anymore. Passing invalid arguments to a function
is commonly a bug. g_return_*(NLE_BUG) is the right answer to that.
- NLE_NOMEM and NLE_AGAIN is replaced by their errno counterparts.
- drop several error numbers. If nobody cares about these numbers,
there is no reason to have a specific error number for them.
NLE_UNSPEC is sufficient.
From libnl3, we only used the helper function to parse/generate netlink
messages and the socket functions to send/receive messages. We don't
need an external dependency to do that, it is simple enough.
Drop the libnl3 dependency, and replace all missing code by directly
copying it from libnl3 sources. At this point, I mostly tried to
import the required bits to make it working with few modifications.
Note that this increases the binary size of NetworkManager by 4736 bytes
for contrib/rpm build on x86_64. In the future, we can simplify the code
further.
A few modifications from libnl3 are:
- netlink errors NLE_* are now in the domain or regular errno.
The distinction of having to bother with two kinds of error
number domains was annoying.
- parts of the callback handling is copied partially and unused parts
are dropped. Especially, the verbose/debug handlers are not used.
In following commits, the callback handling will be significantly
simplified.
- the complex handling of seleting ports was simplified. We now always
let kernel choose the right port automatically.
When NM quits it destroys all singletons including NMOvsdb, which
invokes callbacks for every pending method call. In the shutdown,
extra care must be taken to not access objects that are already in a
inconsistent state; for example here, the callback changes the device
state, and this causes an access to data that has already been
cleared:
#0 _g_log_abort (breakpoint=breakpoint@entry=1) at gmessages.c:554
#1 g_logv (log_domain=0x5635653b6817 "NetworkManager", log_level=G_LOG_LEVEL_CRITICAL, format=<optimized out>, args=args@entry=0x7fffb4b2c1e0) at gmessages.c:1362
#2 g_log (log_domain=log_domain@entry=0x5635653b6817 "NetworkManager", log_level=log_level@entry=G_LOG_LEVEL_CRITICAL, format=format@entry=0x7fbb3f58fa4a "%s: assertion '%s' failed") at gmessages.c:1403
#3 g_return_if_fail_warning (log_domain=log_domain@entry=0x5635653b6817 "NetworkManager", pretty_function=pretty_function@entry=0x5635653b6b00 <__func__.34463> "nm_device_factory_manager_find_factory_for_connection", expression=expression@entry=0x5635653b6719 "factories_by_setting") at gmessages.c:2702
#4 nm_device_factory_manager_find_factory_for_connection (connection=connection@entry=0x56356627e0e0) at src/devices/nm-device-factory.c:243
#5 nm_manager_get_connection_iface (self=0x563566241080 [NMManager], connection=connection@entry=0x56356627e0e0, out_parent=out_parent@entry=0x0, error=error@entry=0x0) at src/nm-manager.c:1458
#6 check_connection_compatible (self=<optimized out>, connection=0x56356627e0e0) at src/devices/nm-device.c:4679
#7 check_connection_compatible (device=0x56356647b1b0 [NMDeviceOvsInterface], connection=0x56356627e0e0) at src/devices/ovs/nm-device-ovs-interface.c:95
#8 _nm_device_check_connection_available (self=0x56356647b1b0 [NMDeviceOvsInterface], connection=0x56356627e0e0, flags=NM_DEVICE_CHECK_CON_AVAILABLE_NONE, specific_object=0x0) at src/devices/nm-device.c:12102
#9 nm_device_check_connection_available (self=self@entry=0x56356647b1b0 [NMDeviceOvsInterface], connection=0x56356627e0e0, flags=flags@entry=NM_DEVICE_CHECK_CON_AVAILABLE_NONE, specific_object=specific_object@entry=0x0) at src/devices/nm-device.c:12131
#10 nm_device_recheck_available_connections (self=self@entry=0x56356647b1b0 [NMDeviceOvsInterface]) at src/devices/nm-device.c:12238
#11 _set_state_full (self=self@entry=0x56356647b1b0 [NMDeviceOvsInterface], state=state@entry=NM_DEVICE_STATE_FAILED, reason=reason@entry=NM_DEVICE_STATE_REASON_OVSDB_FAILED, quitting=quitting@entry=0) at src/devices/nm-device.c:13065
#12 nm_device_state_changed (self=self@entry=0x56356647b1b0 [NMDeviceOvsInterface], state=state@entry=NM_DEVICE_STATE_FAILED, reason=reason@entry=NM_DEVICE_STATE_REASON_OVSDB_FAILED) at src/devices/nm-device.c:13328
#13 del_iface_cb (error=<optimized out>, user_data=0x56356647b1b0) at src/devices/ovs/nm-device-ovs-port.c:160
#14 _transact_cb (self=self@entry=0x5635662b9ba0 [NMOvsdb], result=result@entry=0x0, error=0x563566259a10, user_data=user_data@entry=0x5635662ff320) at src/devices/ovs/nm-ovsdb.c:1449
#15 ovsdb_disconnect (self=self@entry=0x5635662b9ba0 [NMOvsdb]) at src/devices/ovs/nm-ovsdb.c:1331
#16 dispose (object=0x5635662b9ba0 [NMOvsdb]) at src/devices/ovs/nm-ovsdb.c:1558
#17 g_object_unref (_object=0x5635662b9ba0) at gobject.c:3293
#18 _nm_singleton_instance_destroy () at src/nm-core-utils.c:138
#19 _dl_fini () at dl-fini.c:253
#20 __run_exit_handlers (status=status@entry=0, listp=0x7fbb3e1ad6c8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:77
#21 __GI_exit (status=status@entry=0) at exit.c:99
#22 main (argc=1, argv=0x7fffb4b2cc38) at src/main.c:468
Add a new error code to indicate to callbacks that we are quitting and
no further action must be taken. This is preferable to having
additional references because it allows us to free the resources owned
by callbacks immediately, while references can easily create loops.
https://bugzilla.redhat.com/show_bug.cgi?id=1543871
Example: when dhcpv4 lease renewal fails, if ipv4.may-fail was "yes",
check also if we have a successful ipv6 conf: if not fail.
Previously we just ignored the other ip family status.
The secrets are transient -- when they are loaded into the connections and
subsequently cleared the connection itself doesn't change. The Update
signal is to be emmited only on explicit Update()/Update2() or
ClearSecrets() which is already the case.
Apart from Update being wrong, it has the ill effect of causing libnm to
drop secrets from the cached connection.
clang 5.0.1 complains:
src/systemd/src/libsystemd-network/dhcp6-option.c:40:28: error:
field 'option' with variable sized type 'struct DHCP6Option' not at
the end of a struct or class is a GNU extension
[-Werror,-Wgnu-variable-sized-type-not-at-end]
struct DHCP6Option option;
^
systemd disables this warning too.
Make it possible to add compiler options to a different variable than
CFLAGS. This is useful to conditionally disable a compiler warning for a
subpart of a tree.
I find very annoying to have to remember the numeric value of secret
flags or have to look them up in the manual every time. Accept the
textual version as well and add support for auto-completion.
$ nmcli con modify c 802-11-wireless-security.psk-flags not-required
$ nmcli con modify c 802-11-wireless-security.psk-flags <TAB>
agent-owned none not-required not-saved
Convert the string representation of ipv4.dhcp-client-id property already in
NMDevice to a GBytes. Next, we will support more client ID modes, and we
will need the NMDevice context to generate the client id.
GByteArray is a mutable array of bytes. For every practical purpose, the hwaddr
property of NMDhcpClient is an immutable sequence of bytes. Thus, make it a
GBytes.
The two boolean properties do not need to be ever reset. It's nice
to initialize such properties in the constructor and don't mutate
them afterwards.
Instead of adding two boolean GObject properties, add a new flags property
that can encode these two values. In the end, properties are too
cumbersome, let's combine them.
Optimally, NMDhcpClient would be stateless and all paramters would
be passed on as argument. Clearly that is not feasable, because there
are so many paramters, and in many cases they need to be cached for the
lifetime of the client instance.
Instead of passing info_only paramter to ip6_start() and cache it
both in NMDhcpClient and NMDhcpSystemd, keep it in NMDhcpClient at
one place.
In the next commit, we will initialize info-only only once during the
constructor, so it is immutable and somewhat stateless.
The parent's stop() implementation does nothing interesting
for NMDhcpSystem. Still, call it, it's just unexpected to
not chain up the parent implementation, if all other subclasses
do it.
In general, if the parent's implementation is not suitable to be called
by the derived class, that should be handled differently then just not
chaining up. Otherwise it's inconsistent and confusing.
Don't use message data after calling curl_multi_remove_handle(). Fixes
the following asan error:
=================================================================
==13238==ERROR: AddressSanitizer: heap-use-after-free on address 0x608000091ad0 at pc 0x55750f8d9a10 bp 0x7ffeb7f5f210 sp 0x7ffeb7f5f200
READ of size 8 at 0x608000091ad0 thread T0
#0 0x55750f8d9a0f in curl_check_connectivity (/usr/sbin/NetworkManager+0x190a0f)
#1 0x55750f8da7dd in curl_socketevent_cb (/usr/sbin/NetworkManager+0x1917dd)
#2 0x7f73cb64e8f8 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4a8f8)
#3 0x7f73cb64ec57 (/lib64/libglib-2.0.so.0+0x4ac57)
#4 0x7f73cb64ef29 in g_main_loop_run (/lib64/libglib-2.0.so.0+0x4af29)
#5 0x55750f85c3f4 (/usr/sbin/NetworkManager+0x1133f4)
#6 0x7f73c9f19384 in __libc_start_main (/lib64/libc.so.6+0x22384)
#7 0x55750f85d7f7 (/usr/sbin/NetworkManager+0x1147f7)
0x608000091ad0 is located 48 bytes inside of 88-byte region [0x608000091aa0,0x608000091af8)
freed by thread T0 here:
#0 0x7f73cd61f508 in __interceptor_free (/lib64/libasan.so.4+0xde508)
#1 0x7f73ca710eaa in curl_multi_remove_handle (/lib64/libcurl.so.4+0x32eaa)
previously allocated by thread T0 here:
#0 0x7f73cd61fa88 in __interceptor_calloc (/lib64/libasan.so.4+0xdea88)
#1 0x7f73ca710b3d in curl_multi_add_handle (/lib64/libcurl.so.4+0x32b3d)
SUMMARY: AddressSanitizer: heap-use-after-free (/usr/sbin/NetworkManager+0x190a0f)
Disable undefined sanitizer on RHEL since it's not supported. Also,
enable address sanitizer only for executables, as having it enabled in
libraries causes problems when applications built without asan load
them.
The address sanitizer is not compatible [1] with libraries dynamically
opened using RTLD_DEEPBIND: disable the flag when building with asan.
[1] https://github.com/google/sanitizers/issues/611
Shared libraries built with sanitizers are a bit inconvenient to use
because they require that any application linking to them is run with
libasan preloaded using LD_PRELOAD. This limitation makes the
sanitizer support less useful because applications will refuse to
start unless there is a special environment variable set.
Let's turn the --enable-address-sanitizer configure flag into
--with-address-sanitizer=yes|no|exec so that is possible to enable
asan only for executables.
After writing the connection to disk and rereading it, in addition to
restoring agent-owned secrets in the cache we must also restore
agent-owned secrets from the original connections since they are lost
during the write.
Reported-by: Märt Bakhoff <anon@sigil.red>
https://bugzilla.gnome.org/show_bug.cgi?id=793324
Let the matching continue when we are autocompleting arguments and we
have already found 'id', 'uuid' or 'path'.
Before:
# nmcli connection modify path<TAB>
path
After:
# nmcli connection modify path<TAB>
path
pathfinder-wifi
'help' is completed without considering other alternatives:
# nmcli connection modify h<TAB>
help
After the patch:
# nmcli connection modify h<TAB>
help
home-wifi
Fixes: 29bb6ae4fe
It looks bad and makes everyone super-sad:
$ nmcli --ask c modify 'Oracle HQ' 802-11-wireless-security.psk solaris666
System policy prevents modification of network settings for all users
(action_id: org.freedesktop.NetworkManager.settings.modify.system)
Password (lkundrak): *********
$
With --ask it might call back to nmcli's agent, causing a deadlock
while the client is waiting for the response. Let's give the client
a chance to service the agent requests while waiting:
$ nmcli --ask --show-secrets c show 'Oracle HQ'
<hang>
This is probably still rather suboptimal and inefficient, since we
still serialize the calls and block on response. However, if we submit
multiple calls to GetSecrets, the daemon would start authorizing the
first one and fail the other ones immediately before the authorization
succeeds.
This could perhaps be addressed in the daemon, but let's settle for a
fix that's compatible with the current daemon for now.
The bridge test (and no other either) no longer sets sysfs properties,
so this whole madness is no longer needed. That is good, because Linux
got somewhat stricter (at least in 4.15) about mounting sysfs and the
whole thing wouldn't work with containers where /sys is red-only from
the start.