i have 2 list, can call a , b. need check items in list a , see if items in b starts item a , stop check.
example of content in a:
https://some/path http://another/path http://another.some/path example of content in b:
http://another/path http://this/wont/match/anything currently i'm doing this:
def check_comps(self, comps): in self.a: b in comps: if b.startswith(a): return is there better way this?
your solution has worst-case o(nm) time complexity, o(n^2) if n ~ m. can reduce o(n log(n)) , o(log(n)). here how.
consider list of words (your comps attrubute) , target (your b)
words = ['abdc', 'abd', 'acb', 'abcabc', 'abc'] target = "abcd" observe, sorting list of words in lexicographical order, list of prefixes
prefixes = ['abc', 'abcabc', 'abd', 'abdc', 'acb'] it degenerate, because prefixes[0] prefix of prefixes[1], hence starts prefixes[1] starts prefixes[0] well. bit problematic. let's see why. let's use fast (binary) search find proper place of target in prefix list.
import bisect bisect.bisect(prefixes, target) # -> 2 this because target , prefixes[1] share prefix, target[3] > prefixes[1][3], hence lexicographically should go after. hence, if there prefix of target in prefixes, should left of index 2. obviously, target doesn't start prefixes[1] hence in worst case have search way left find whether there prefix. observe, if transform these prefixes nondegenerate list, possible prefix of target left of position returned bisect.bisect. let's reduce list of prefixes , write helper function check whether there prefix of target.
from functools import reduce def minimize_prefixes(prefixes): """ note! `prefixes` must sorted lexicographically ! """ def accum_prefs(prefixes, prefix): if not prefix.startswith(prefixes[-1]): return prefixes.append(prefix) or prefixes return prefixes prefs_iter = iter(prefixes) return reduce(accum_prefs, prefs_iter, [next(prefs_iter)]) if prefixes else [] def hasprefix(minimized_prefixes, target): position = bisect.bisect(minimized_prefixes, target) return target.startswith(minimized_prefixes[position-1]) if position else false now let's see
min_prefixes = minimize_prefixes(prefixes) print(min_prefixes) # -> ['abc', 'abd', 'acb'] hasprefix(min_prefixes, target) # -> true let's make test must fail:
min_prefs_fail = ["abcde"] hasprefix(min_prefs_fail, target) # -> false this way o(n log(n)) search, asymptotically faster o(n^2) solution. note! can (and should) store minimize_prefixes(sorted(comps)) prefix set attribute in object, making prefix search o(log (n)), more faster have now.
Comments
Post a Comment