Make normal filtering of plain ASCII lines faster

This patch adds a lines_not_ascii field to the MenuState structure. Its nth entry is 0 unless the nth member of MenuState.lines contains a non-ASCII codepoint. All comparison functions (of type menu_match_cb) take an additional not_ascii argument that tells them whether the string they are matching contains non-ASCII codepoints. They use it to decide whether to collate and case-fold the input (non-ASCII strings) or to call strstr/strcasestr directly (ASCII strings). The change is not yet implemented for flex matching, due to my laziness, but it should be simple enough to add. For my large input of 400,000 lines, this reduces typical filtering time from about 2 seconds to about ten microseconds.
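
A rough sketch of the dispatch described above, for illustration only. It is not the code from this patch: match_line, match_folded and ascii_casestr are hypothetical names, match_folded is a placeholder for the existing collate and case-fold path (which needs GLib), and ascii_casestr stands in for strcasestr so the sketch does not depend on a GNU/BSD extension.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Portable stand-in for strcasestr(); only correct for ASCII input. */
static const char *ascii_casestr ( const char *haystack, const char *needle )
{
    size_t n = strlen ( needle );
    if ( n == 0 ) {
        return haystack;
    }
    for (; *haystack != '\0'; haystack++ ) {
        size_t i = 0;
        while ( i < n && haystack[i] != '\0' &&
                tolower ( (unsigned char) haystack[i] ) == tolower ( (unsigned char) needle[i] ) ) {
            i++;
        }
        if ( i == n ) {
            return haystack;
        }
    }
    return NULL;
}

/* Placeholder for the slow path (collate + case-fold). It only does a
 * byte-wise strstr() here so that the sketch compiles without GLib. */
static int match_folded ( const char *token, const char *line, int case_sensitive )
{
    (void) case_sensitive;
    return strstr ( line, token ) != NULL;
}

/* Hypothetical helper: pick the fast or slow path from the not_ascii flag. */
int match_line ( const char *token, const char *line, int not_ascii, int case_sensitive )
{
    if ( not_ascii ) {
        /* Line has non-ASCII codepoints: keep the expensive path. */
        return match_folded ( token, line, case_sensitive );
    }
    if ( case_sensitive ) {
        return strstr ( line, token ) != NULL;
    }
    return ascii_casestr ( line, token ) != NULL;
}
```

The flag is computed once per line when the input is read, so each keystroke only pays for a plain substring search on ASCII lines.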
2015-10-01 12:16:41 +01:00
parent 574bf2da82
commit af6a4b83eb
7 changed files with 57 additions and 21 deletions
--- a/include/helper.h
+++ b/include/helper.h
@@ -102,7 +102,7 @@ int find_arg ( const char * const key );
 *
 * @returns 1 when matches, 0 otherwise
 */
-int token_match ( char **tokens, const char *input, int case_sensitive,
+int token_match ( char **tokens, const char *input, int not_ascii, int case_sensitive,
                  __attribute__( ( unused ) ) unsigned int index,
                  __attribute__( ( unused ) ) Switcher * data );

@@ -152,4 +152,11 @@ char helper_parse_char ( const char *arg );
 * Set the application arguments.
 */
 void cmd_set_arguments ( int argc, char **argv );
+
+/**
+ * @param str a UTF8 string
+ * @return 1 if the string contains any non-ascii codepoints
+ */
+int is_not_ascii ( const char *str );
+
 #endif // ROFI_HELPER_H
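
The header above only declares is_not_ascii; its body is in a file not shown here. A minimal sketch of one way it could be written, assuming a NUL-terminated UTF-8 input (the NULL check is an addition of this sketch, not something the header promises). It relies on the fact that UTF-8 encodes every non-ASCII codepoint using only bytes with the high bit set.

```c
#include <stddef.h>

/* Sketch only, not the implementation from this commit.
 * Returns 1 if any byte of str is >= 0x80, 0 otherwise. */
int is_not_ascii ( const char *str )
{
    if ( str == NULL ) {
        return 0;
    }
    for ( const unsigned char *p = (const unsigned char *) str; *p != '\0'; p++ ) {
        if ( *p & 0x80 ) {
            /* High bit set: this byte is part of a multi-byte sequence. */
            return 1;
        }
    }
    return 0;
}
```

Because the scan is byte-wise and stops at the first high-bit byte, it never needs to decode codepoints.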