Thread (15 messages) 15 messages, 2 authors, 2025-02-15
STALE494d

[RFC PATCH 2/2] merge-recursive: optimize time complexity for get_unmerged

From: Meet Soni <hidden>
Date: 2025-02-13 09:00:53
Subsystem: the rest · Maintainer: Linus Torvalds

Previously, `get_unmerged()` used `string_list_insert()`, which has an
O(n^2) complexity due to shifting elements on each insertion. It also
called `string_list_lookup()` before insertion, which performs a binary
search in O(log n). This combination made insertion costly, especially
for large index states, as each new entry required both a search and
potentially shifting many elements.

Replace `string_list_insert()` with `string_list_append()` to achieve
O(n) insertion. After all entries are added, sort the list in O(n log n)
and remove duplicates in O(n), reducing the overall complexity to
O(n log n). This improves performance significantly for large datasets
while maintaining correctness.

Signed-off-by: Meet Soni <redacted>
---
 merge-recursive.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/merge-recursive.c b/merge-recursive.c
index 884ccf99a5..6165993429 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -547,15 +547,15 @@ static struct string_list *get_unmerged(struct index_state *istate)
 		if (!ce_stage(ce))
 			continue;
 
-		item = string_list_lookup(unmerged, ce->name);
-		if (!item) {
-			item = string_list_insert(unmerged, ce->name);
-			item->util = xcalloc(1, sizeof(struct stage_data));
-		}
+		item = string_list_append(unmerged, ce->name);
+		item->util = xcalloc(1, sizeof(struct stage_data));
+
 		e = item->util;
 		e->stages[ce_stage(ce)].mode = ce->ce_mode;
 		oidcpy(&e->stages[ce_stage(ce)].oid, &ce->oid);
 	}
+	string_list_sort(unmerged);
+	string_list_remove_duplicates(unmerged, 1);
 
 	return unmerged;
 }
-- 
2.34.1
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help