Iterating hash tables gives an undefined order. Often we want to have
a stable order, for example when printing the content of a hash or
when converting it to a "a{sv}" variant.
How to achieve that best? I think we should only iterate the hash once,
and not require additional lookups. nm_utils_named_values_from_strdict()
achieves that by returning the key and the value together. Also, often
we only need the list for a short time, so we can avoid heap allocating
the list, if it is short enough. This works by allowing the caller to
provide a pre-allocated buffer (usually on the stack) and only as fallback
allocate a new list.
If the value pointer is const, it is commonly inconvenient and requires
a cast. Requiring casts on a common base does not increase type safety,
but is annoying.
When parsing user input if is often convenient to allow stripping whitespace.
Especially with escaped strings, the user could still escape the whitespace,
if the space should be taken literally.
Add support for that to nm_utils_buf_utf8safe_unescape().
Note that this is not the same as calling g_strstrip() before/after
unescape. That is, because nm_utils_buf_utf8safe_unescape() correctly
preserves escaped whitespace. If you call g_strstrip() before/after
the unescape, you don't know whether the whitespace is escaped.
We want to use the function to unescape (compress) secrets. As such, we want
to be sure that no secrets are leaked in memory due to growing the buffer with
realloc. In fact, reallocation should never happen. Assert for that.
As reallocation cannot happen, we could directly fill a buffer with
API like nm_utils_strbuf_*(). But NMStrBuf has low overhead even in this
case.
GPtrArray does not support NULL terminating the pointer array. That
makes it cumbersome to use it for tracking a strv array. Add a few
helper functions nm_strvarray_*() that help using a GArray instead.
Add nm_utils_invoke_on_timeout() beside nm_utils_invoke_on_idle().
They are fundamentally similar, except one schedules an idle handler
and the other a timeout.
Also, use the current g_main_context_get_thread_default() as context
instead of the singleton instance. That is a change in behavior, but
the only caller of nm_utils_invoke_on_idle() is the daemon, which
doesn't use different main contexts. Anyway, to avoid anybody being
tripped up by this also change the order of arguments. It anyway
seems nicer to first pass the cancellable, and the callback and user
data as last arguments. It's more in line with glib's asynchronous
methods.
Also, in the unlikely case that the cancellable is already cancelled
from the start, always schedule an idle action to complete fast.
When we have a buffer that we want to grow exponentially with
nm_utils_get_next_realloc_size(), then there are certain buffer
sizes that are better suited.
For example, if you have an empty NMStrBuf (len == 0), and you
want to allocate roughly one kilobyte, then 1024 is a bad choice,
because nm_utils_get_next_realloc_size() will give you 2024 bytes.
NM_UTILS_GET_NEXT_REALLOC_SIZE_1000 might be better in this case.
If you have a LIST with 7 elements, and you lookup a value that
is not in the (sorted) list and would lie before the first element,
the binary search will dig down to imin=0, imid=0, imax=0 and
strcmp will give positive cmp value (indicating that the searched
value is sorted before).
Then, we would do "imax = imid - 1;", which wrapped to G_MAXUINT,
and the following "if (G_UNLIKELY (imin > imax))" would not hit,
resulting in an out of bound access next.
The easy fix is to not used unsigned integers.
The binary search was adapted from nm_utils_array_find_binary_search()
and nm_utils_ptrarray_find_binary_search(), which already used signed
integers to avoid this problem.
Fixes: 17d9b852c8 ('shared: explicitly implement binary search in NM_UTILS_STRING_TABLE_LOOKUP_DEFINE*()')
Add flags to explicitly escape leading or trailing spaces. Note
that we were already escaping trailing spaces.
This will be used later when supporting backslash escapes for
option parameters for nmcli (vpn.data).
When growing a buffer by appending a previously unknown number
of elements, the often preferable strategy is growing it exponentially,
so that the amortized runtime and re-allocation costs scale linearly.
GString just always increases the buffer length to the next power of
two. That works.
I think there is value in trying to find an optimal next size. Because
while it doesn't matter in terms of asymptotic behavior, in practice
a better choice should make a difference. This is inspired by what QT
does ([1]), to take more care when growing the buffers:
- QString allocates 4 characters at a time until it reaches size 20.
- From 20 to 4084, it advances by doubling the size each time. More
precisely, it advances to the next power of two, minus 12. (Some memory
allocators perform worst when requested exact powers of two, because
they use a few bytes per block for book-keeping.)
- From 4084 on, it advances by blocks of 2048 characters (4096 bytes).
This makes sense because modern operating systems don't copy the entire
data when reallocating a buffer; the physical memory pages are simply
reordered, and only the data on the first and last pages actually needs
to be copied.
Note that a QT is talking about 12 characters, so we use 24 bytes
head room.
[1] https://doc.qt.io/qt-5/containers.html#growth-strategies
Move the assertion for valid LIST first. It only checks static data,
and regardless of the entry_cmd, it should be done first.
Fixes: f4d12f7b59 ('shared: add NM_UTILS_STRING_TABLE_LOOKUP_STRUCT_DEFINE() macro for lookup of structs')
Depending on the type, OVS interfaces also have a corresponding netdev
in kernel (e.g. type "internal" does, type "patch" does not).
Such a case is neither NMU_IFACE_OVS nor NMU_IFACE_KERNEL (alone). There should
be a special type to represent those cases.
Add NMU_IFACE_OVS_OR_KERNEL for that.
nm_utils_ifname_valid() is to validate "connection.interface-name"
property. But the exact validation depends on the connection type.
Add "NMU_IFACE_ANY" to validate the name to check whether it would be
valid for any connection type.
This is for completeness and for places where the caller might not know
the connection type.
Several macros are used to define function. They had a "_STATIC" variant,
to define the function as static.
I think those macros should not try to abstract entirely what they do.
They should not accept the function scope as argument (or have two
variants per scope). This also because it might make sense to add
additional __attribute__(()) to the function. That only works, if
the macro does not pretend to *not* define a plain function.
Instead, embrace what the function does and let the users place the
function scope as they see fit.
This also follows what is already done with
static NM_CACHED_QUARK_FCN ("autoconnect-root", autoconnect_root_quark)
When looking at nm_utils_array_find_binary_search(), we see that binary
search really isn't complicated. In nm_utils_array_find_binary_search()
it looks complicated, because that is our general purpose function which
accepts arbitrary lists, uses an opaque compare function, accepts a user
data argument, and returns the insertion position.
This is unnecessary for the narrow purpose in NM_UTILS_STRING_TABLE_LOOKUP_DEFINE*().
When we inline the binary search, it can be simplified, and the remaining
parts is simple enough to prefer this duplication of binary search over
using our general purpose function.
Also, this gives the compiler more chance for optimization. For
example, we can use "unsigned" as index type instead of gssize, because
we know (at compile time), that this type will always be large enough
for our LIST. Also, we can directly call strcmp().
The result is that the macro's implementation is also fast in the best
case (where the needle is found with only one strcmp()) and in the cases
where there is a small number of items to search.
It thus alleviates concerns that using the macro might be slower than
an optimized implementation.
The binary size of the defined function increases slightly (from 112
bytes to 192 bytes, on x86_64, GCC, -O2). But according to my tests it
is slightly and measurably faster.
Most callers would pass FALSE to nm_utils_error_is_cancelled(). That's
not very useful. Split the two functions and have nm_utils_error_is_cancelled()
and nm_utils_error_is_cancelled_is_disposing().
This should give the compiler more possibilities to warn about wrong
use of the API.
In practice, my current compiler wouldn't flag any issues. However,
some compilers (or compile options) might.