From a03b739d23b59890b59d2d2288ebaa56e3be47ce Mon Sep 17 00:00:00 2001 From: Sangmin Jeon Date: Mon, 24 Jul 2023 18:29:05 +0900 Subject: [PATCH] Fix `radix_tree.py` insertion fail in ["*X", "*XX"] cases (#8870) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Fix insertion fail in ["*X", "*XX"] cases Consider a word, and a copy of that word, but with the last letter repeating twice. (e.g., ["ABC", "ABCC"]) When adding the second word's last letter, it only compares the previous word's prefix—the last letter of the word already in the Radix Tree: 'C'—and the letter to be added—the last letter of the word we're currently adding: 'C'. So it wrongly passes the "Case 1" check, marks the current node as a leaf node when it already was, then returns when there's still one more letter to add. The issue arises because `prefix` includes the letter of the node itself. (e.g., `nodes: {'C' : RadixNode()}, is_leaf: True, prefix: 'C'`) It can be easily fixed by simply adding the `is_leaf` check, asking if there are more letters to be added. - Test Case: `"A AA AAA AAAA"` - Fixed correct output: ``` Words: ['A', 'AA', 'AAA', 'AAAA'] Tree: - A (leaf) -- A (leaf) --- A (leaf) ---- A (leaf) ``` - Current incorrect output: ``` Words: ['A', 'AA', 'AAA', 'AAAA'] Tree: - A (leaf) -- AA (leaf) --- A (leaf) ``` *N.B.* This passed test cases for [Croatian Open Competition in Informatics 2012/2013 Contest #3 Task 5 HERKABE](https://hsin.hr/coci/archive/2012_2013/) * Add a doctest for previous fix * improve doctest readability --- data_structures/trie/radix_tree.py | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/data_structures/trie/radix_tree.py b/data_structures/trie/radix_tree.py index cf2f25c29..fadc50cb4 100644 --- a/data_structures/trie/radix_tree.py +++ b/data_structures/trie/radix_tree.py @@ -54,10 +54,17 @@ class RadixNode: word (str): word to insert >>> RadixNode("myprefix").insert("mystring") + + >>> root = RadixNode() + >>> root.insert_many(['myprefix', 'myprefixA', 'myprefixAA']) + >>> root.print_tree() + - myprefix (leaf) + -- A (leaf) + --- A (leaf) """ # Case 1: If the word is the prefix of the node # Solution: We set the current node as leaf - if self.prefix == word: + if self.prefix == word and not self.is_leaf: self.is_leaf = True # Case 2: The node has no edges that have a prefix to the word