Always ensure that the entire buffer is initialized with padding NULs.
For example, valgrind checks whether we access uninitalized memory,
so leaving this uninitalized can be unexpected and cause valgrind
failures. In general, one might be tempted to copy the ifname buffer (of
well known size IFNAMSIZ) with memcpy(). In that case, we should not
have trailing garbage there.
We could use strncpy() for that (which guarantees NUL padding), but
then we still would have to ensure NUL termination. But strncpy() is
frowned upon, so let's not use it here.
Note that g_strlcpy() does not guarantee NUL padding, so it's
unsuitable.
We could also implement this with a combination of memcpy() and
memset(). But in this case, it just seems simpler to iterate over the
16 bytes and do it manually.
==6207== Syscall param ioctl(SIOCETHTOOL) points to uninitialised byte(s)
==6207== at 0x514603B: ioctl (syscall-template.S:78)
==6207== by 0x19FC2F: _ioctl_call (nm-platform-utils.c:183)
==6207== by 0x1A026B: _ethtool_call_handle (nm-platform-utils.c:319)
==6207== by 0x1A031F: ethtool_get_stringset (nm-platform-utils.c:378)
==6207== by 0x1A03BC: ethtool_get_stringset_index (nm-platform-utils.c:414)
==6207== by 0x1A181E: nmp_utils_ethtool_supports_vlans (nm-platform-utils.c:912)
==6207== by 0x1756D7: link_supports_vlans (nm-linux-platform.c:6508)
==6207== by 0x1A81D8: nm_platform_link_supports_vlans (nm-platform.c:1536)
==6207== by 0x14B96B: test_internal (test-link.c:602)
==6207== by 0x4F5C18D: test_case_run (gtestutils.c:2597)
==6207== by 0x4F5C18D: g_test_run_suite_internal (gtestutils.c:2685)
==6207== by 0x4F5BF33: g_test_run_suite_internal (gtestutils.c:2697)
==6207== by 0x4F5C679: g_test_run_suite (gtestutils.c:2772)
==6207== by 0x4F5C694: g_test_run (gtestutils.c:2007)
==6207== by 0x166B4D: main (test-common.c:2092)
==6207== Address 0x1ffeffeecf is on thread 1's stack
==6207== in frame #1, created by _ioctl_call (nm-platform-utils.c:110)
==6207==
"ifname" is the stack-allocated array "known_ifnames" of suitable
IFNAMSIZ bytes. But it may not be fully initialized, so using memcpy()
to copy the string leads to unintialized warning.
We really should only copy the valid bytes, either with strcpy() or our
nm_utils_ifname_cpy() wrapper.
Fixes: 856322562e ('platform/ethtool,mii: retry ioctl when interface name was renamed for ehttool/mii')
"struct ifreq" contains a union field, and initalizing the struct is not
guaranteed to fill all bytes with zero (it only sets the first union
member to zero).
Since we later return the entire struct, ensure that it's initialized to
all zero by using memset().
libnm/tests/test-general statically links against libnm/libnm-utils.la
and dynamically against libnm/libnm.so. Hence, _nm_utils_init() is invoked
twice, failing the assertion.
That is a bug that must be fixed. For now, just don't assert.
Since we can easily lookup the vtable for a NMSetting8021xSchemeType,
it is convenient to also easily get the scheme_type for a given
NMSetting8021xSchemeVtable.
On my x86_64, this change is basically for free as it does not increase
the size of NMSetting8021xSchemeVtable, because the scheme_type fits in a
previously unused part of the NMSetting8021xSchemeVtable struct.
Ooherwise, the file has wrong permissions:
# ls -la /var/lib/NetworkManager/secret_key
----r-xr-x. 1 root root 50 May 14 13:52 /var/lib/NetworkManager/secret_key
Luckily, /var/lib/NetworkManager should be already
# ls -lad /var/lib/NetworkManager
drwx------. 2 root root 8192 May 14 13:57 /var/lib/NetworkManager
which mitigates this a bit.
Fixes: dbcb1d6d97 ('core: let nm_utils_secret_key_read() handle failures internally')
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/issues/175
We want the the hash-seed array is alined so it can be used both as
guint, guint32, and guint64 directly. Don't use _nm_alignas() but
instead just add the fields to the union so we get proper alignment.
While at at, also let the seed argument to c_siphash_init() be aligned
to 64 integers. c_siphash_init() does not require that, but it tries to
read the seed as (unaligned) LE 64 bit integers. So, it doesn't hurt.
We want to log pointer values to indicate the related parties of a
log message. But we should not, because plain pointer values can be
used to defeat ASLR.
Instead, we have nm_hash_obfuscate_ptr() to managle a pointer and give
a distinct (albeit not 100% unique) 64 bit integer for logging.
But for the logging messages to be meaning-full, all related parties
must use the same static-seed.
Add a macro NM_HASH_OBFUSCATE_PTR() that uses a particular seed.
- use GDBusConnection instead of GDBusProxy.
- namespace global variables with a "gl" struct.
- don't log __func__. If a log line should have a certain
topic/tag, the tag should be set manually, not based on the
function name. It just looks odd. Also, it's unnecessary.
- use GDBusConnection instead of GDBusProxy.
- rename "call-id" to "conf-id". It's really not a "call" but
configuration that gets added and NMPacrunnerManager ensures that
the configuration is send to pacrunner.
- let "conf-id" keep a reference to NMPacrunnerManager. For one,
when we remove configurations we need to call DestroyProxyConfiguration
to remove it again. We cannot just abort the requests but must linger
around until our configuration is properly cleaned up. Hence, we
anyway cannot destroy the NMPacrunnerManager earlier.
With respect to fixing shutdown not to leak anything, this merely
means that we must wait (and iterate the main loop) as long as
NMPacrunnerManager singleton still exits (that is anyway the plan
how to fix shutdown).
With these considerations it's also clear that our D-Bus calls must
have a stricter timeout: NM_SHUTDOWN_TIMEOUT_MS.
This is also nice because nm_pacrunner_manager_remove() no longer
needs a manager parameter, it can just rely on having a reference
to the manager.
- for logging the configuration IDs, don't log pointer values.
Logging pointer values should be avoided as it defeats ASLR.
Instead, give them a "log_id" number.
- pacrunner is a D-Bus activatable service. D-Bus activatable services
needs special care. We don't want to start it over and over again.
Instead, we only try to "StartServiceByName" if
- we have any configuration to add
- if pacrunner is currently confirmed not to be running (by watching
name owner changes)
- we didn't try to start it already. That means, only start it
at the beginning and afterwards set a flag to block it. When
we see pacrunner appear on D-Bus we always clear that flag,
that means if pacrunner drops of, we will try to restart it
(once).
PolicyKit is a D-Bus activatable service. I don't think it exits on idle (but maybe
it does, it certainly should).
Anyway, NetworkManager was watching the NameOwner of polkit and if the name was lost(!)
it would emit a NM_AUTH_MANAGER_SIGNAL_CHANGED, which causes the internal code to re-authenticate
right away. That means, if you stop policy kit, NetworkManager will ask it right away and
D-Bus activate it. This is not right.
In fact, we don't have to care about the name owner at all. Whenever we make a request,
we just make it and D-Bus activate the service as needed. If polkit starts, it emits a
Changed signal that we watch on D-Bus. That is the only moment when we should actually
emit NM_AUTH_MANAGER_SIGNAL_CHANGED, not when polkit disconnects.
Aside avoiding the unnecessary overhead of GDBusProxy, this simplifies
NMAuthManager because the instance is ready from the start to use D-Bus.
Previously, in the early phase requests needed to be queued until
GDBusProxy could be created asynchronously. Now, there is nothing
asynchronous involved during construction of the NMAuthManager (and
of course there are no blocking calls).
I also like this because it's non-obvious that subscription IDs from
GDBusConnection are "guint" (contrary to signal handler IDs which are
"gulong"). So, by using this API you get a compiler error when using the
wrong type.
In the past, when switching to nm_clear_g_signal_handler() this uncovered
multiple bugs where the wrong type was used to hold the ID.
... and nm_dbus_connection_call_get_name_owner().
We are going to use GDBusConnection more instead of GDBusProxy. Hence,
these two functions are the standard repertoire and used over and over.
Their arguments are complicated enough to warrant a small helper.
Aquiring the bus early tells systemd that NetworkManager is started.
Do that even before setting up/creating the singletons for NMPlatform
and NMNetns.
This is a trick so that NetworkManager is considered earlier to be started.
But it's right, because we can and should create the D-Bus socket as early as
possible to let other services (that order After=network.target) can already
start too.
Of course, NetworkManager is not yet fully running and it will take a
while longer until it actually replies on D-Bus. But the requests are
not lost and services that talk to NetworkManager that early can in the
meantime to other startup actions.
First of all, NMDBusManager takes the system D-Bus connection synchronously, so we
should avoid API that is asynchronous and first needs to get glib's G_BUS_TYPE_SYSTEM
instance.
Also, the only reason why NMDBusManager might not have a D-Bus connection is in "initrd"
configure-and-quit mode. In that mode we also don't need polkit.
We will use the D-Bus connection of our NMDBusManager singleton more.
Use a macro.
- it's shorter to type and it's one distinct word.
- the name indicates what this is: the main D-Bus connection singleton.
By searching for this name we can find all users that care about using
this singleton.
Note that various components (NMFirewallManager, NMAuthManager,
NMConnectivity, etc.pp) all request their own GDBusConnection from
glib's G_BUS_TYPE_SYSTEM singleton.
In the future, let them instead use the D-Bus connection that
NMDBusManager already has.
- NMDBusManager also uses g_bus_get(G_BUS_TYPE_SYSTEM), so in practice this
is just the same GDBusConnection instance.
- if it would not be the same GDBusConnection instance, it would
be more correct/logical to use the one that NMDBusManager uses.
- NMDBusManager already aquired the GDBusConnection synchronously
and it's ready for use. On the other hand, g_bus_get()/g_bus_get_sync()
has the notion that getting the singleton cannot be done without
waiting/blocking. So at least it involves locking or even dispatching
the async reply on D-Bus.
- in "configure-and-quit=initrd" we really don't have D-Bus available.
NMDBusManager should control whether the other components use D-Bus
or not. For example, NMFirewallManager should not ask glib for a
G_BUS_TYPE_SYSTEM singleton only to later find out that it doesn't work.
So if these components would reuse NMDBusManager's GDBusConnection,
then it must have the connection also in regular "configure-and-quit=true"
mode. In this case, we are in late boot and want do connectivity
checking and talk to firewalld.
All callers pass a C string literal as the user-data tag of NMAuthChain.
It makes little sense otherwise because you usually know which user data
you need in advance.
So don't bother with copying the string. Just reference the defacto
static string.
Rename nm_auth_chain_set_data() to nm_auth_chain_set_data_unsafe() to indicate
that the lifetime of the tag string is now the caller's responsibility.
The nm_auth_chain_set_data() macro now ensures that the tag argument is
a C string literal. In fact, all callers that we had did that already.
There are two similar cases where the caller attaches the requested
permission to the auth-chain. There the tag "perm" is used.
Rename for consistency.
Out of the 33 callers of nm_auth_chain_add_call(), the permission
argument is:
- 29 times a C string literal like NM_AUTH_PERMISSION_NETWORK_CONTROL.
- 3 times assign a string that is in fact a static string (it's just
not a string literal)
- only NMManager's device_auth_request_cb() passes a permission of
(possibly) non static origin. But it already duplicates the string
for it's own purposes and attaches it as user-data to the
NMAuthChain.
There really is no need to duplicate the string.
Replace nm_auth_chain_add_call() by a macro that ensures that the
permission string is a C literal.
Rename nm_auth_chain_add_call() to nm_auth_chain_add_call_unsafe() to
indicate that the lifetime of the permission argument is now the
responsibility of the caller.
We track the user-data in a linked list. Hence, when setting
a user data we would need to search the list whether the
tag already exists.
This has an overhead and makes set-data() O(n). But we really don't need
this. The NMAuthChain allows a simple way to attach user-data to the
request. We don't need a full-blown g_object_set_data() (which btw is
also just implemented as a linked list).
NMAuthChain tracks arbitrary user-data pointers for convenience of the
caller.
Previously, it would also track the permission request result as such
user-data. That is not optimal.
- for one, we should keep the namespaces for user data (tags) and
permissions separate. When the user requests a permission result with
nm_auth_chain_get_result() then clearly no user-data is requested.
- we already track permissions in a separate linked list. Previously, when the
auth-call completes, we would destroy the entry in the linked list for
requests and create an entry in the linked list for user-data.
Possibly this was done in the past, because the user-data list was
indexed by a hash table. So, in that case, we only would need to
lookup the permission results in the indexed user-data hash, but never
do any search of a linked list. That is no longer the case, because
in practice the number of tracked user-data/permissions is fixed and
small (at which point it's just more efficient to search a short
linked list).
- There is already distrinct API for getting user-data and permissions.
By keeping the lists separate we don't need to search the list for
entries of the wrong type (it's faster).
- also assert that nm_auth_chain_get_result() indeed asks for a result
that was previously requested. It makes no sense otherwise.
NMAuthChain usually requests several permissions at once. Hence, an error
argument in the overall callback does not make sense, because you
wouldn't know which request failed.
If at all, it could only mean that the overall request failed (like an
D-Bus failure communicating to D-Bus *for all permisssions*),
but we don't need to handle that specially. In fact, we don't really care
why permission was not granted, whether it's due to an error or legitimate
reasons.
The error in the callback was always set to %NULL. Remove it.
The number of user-datas attached to the NMAuthChain is generally fixed
and small.
For example, in current code impl_manager_get_permissions() will be the
instance that ends up with the largest number of data items. It
performs zero calls of nm_auth_chain_set_data() but 16 times
nm_auth_chain_add_call(). So currently the maximum number is 16.
With such a fixed, small number of elements it is expected to be more
efficient to just track the elements in a CList instead of a GHashTable.
- consistently name the ChainData variable "chain_data"
- return the ChainData element from _get_data(). This way it
also can be used by nm_auth_chain_steal_data(), which needs
the ChainData element.
NMAuthChain is not ref-counted.
You may call nm_auth_chain_destroy() once before the callback
gets invoked. This destroys the auth-chain instance right away.
You may call nm_auth_chain_destroy() once from inside the callback.
This basically has no effect but is allowed for convenince.
All this does is remembering that destroy was called and asserts that
destroy gets called at most once.
After the callback returns, the auth-chain will always be destroyed.
That means, generally there is no need to call nm_auth_chain_destroy()
from inside the callback.
Remove that code, and refactor some code to return early (where it makes
sense).
NMAuthChain is not really ref-counted, there is no need for that additional
complexity.
But it is graceful towards calling nm_auth_chain_destroy() from inside the
callback. The caller may do so.
But we don't need a "ref_count" to track that. Two flags suffice: one to
say whether destroy was called and one to indicate that we are in the
process of finishing (to delay deallocating the NMAuthChain struct).
We already had the "done" flag that we used to indicate that the chain
is finished. So, we just need one more flag instead.
This function was left as a reminder now for 9 years. Get rid of it.
Related: 8310593ce4 ('core: ignore authorization for sleep/wake requests (but restrict to root) (rh #638640)')
The boolean value is intended to indicate success. It would indicated
failure due to a bug.
Fixes: 297d4985ab ('core/dbus: rework D-Bus implementation to use lower layer GDBusConnection API'):