
Find All The Keys Cluster In A List

I have a 'combination' problem: finding clusters of different keys, for which I am trying to find an optimized solution. I have this list of lists 'l': l = [[1, 5], [5, 7], [4, 9], [7, 9

Solution 1:

You can see it as the problem of finding the connected components in a graph:

l = [[1, 5], [5, 7], [4, 9], [7, 9], [50, 90], [100, 200], [90, 100],
     [2, 90], [7, 50], [9, 21], [5, 10], [8, 17], [11, 15], [3, 11]]
# Make graph-like dict
graph = {}
for i1, i2 in l:
    graph.setdefault(i1, set()).add(i2)
    graph.setdefault(i2, set()).add(i1)
# Find clusters
clusters = []
for start, ends in graph.items():
    # If vertex is already in a cluster skip
    if any(start in cluster for cluster in clusters):
        continue
    # Cluster set
    cluster = {start}
    # Process neighbors transitively
    queue = list(ends)
    while queue:
        v = queue.pop()
        # If vertex is new
        if v not in cluster:
            # Add it to cluster and put neighbors in queue
            cluster.add(v)
            queue.extend(graph[v])
    # Save cluster
    clusters.append(cluster)
print(*clusters)
# {1, 2, 100, 5, 4, 7, 200, 9, 10, 50, 21, 90} {8, 17} {3, 11, 15}
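
The same traversal can be wrapped in a function that tracks visited vertices in a single set, so the "already in a cluster" check becomes one set lookup instead of a scan over all clusters found so far. This is only a minimal sketch: the name `connected_components` and the `seen` set are illustrative choices, not part of the answer above.

```python
def connected_components(pairs):
    """Group the values in `pairs` into connected clusters (hypothetical helper)."""
    # Build an undirected adjacency map, as in the code above
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    seen = set()      # every vertex already assigned to a cluster
    clusters = []
    for start in graph:
        if start in seen:
            continue
        # Depth-first traversal from an unvisited vertex
        cluster = set()
        stack = [start]
        while stack:
            v = stack.pop()
            if v not in cluster:
                cluster.add(v)
                # Only push neighbors that are not in the cluster yet
                stack.extend(graph[v] - cluster)
        seen |= cluster
        clusters.append(cluster)
    return clusters


pairs = [[1, 5], [5, 7], [8, 17], [7, 9], [17, 8]]
print(connected_components(pairs))
```

For this small input the function returns two clusters, one holding 1, 5, 7 and 9 and one holding 8 and 17.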

Solution 2:

This is a typical use case for the union-find algorithm / disjoint set data structure. There's no implementation in the Python standard library AFAIK, but I always tend to have one nearby, as it's so useful...

l = [[1, 5], [5, 7], [4, 9], [7, 9], [50, 90], [100, 200], [90, 100],
 [2, 90], [7, 50], [9, 21], [5, 10], [8, 17], [11, 15], [3, 11]]

from collections import defaultdict
leaders = defaultdict(lambda: None)

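def find(x):
    # Follow leader links up to the root of x's group
    leader = leaders[x]
    if leader is not None:
        # Path compression: point x directly at the root
        leaders[x] = find(leader)
        return leaders[x]
    return x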

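# union all elements that transitively belong together
for a, b in l:
    root_a, root_b = find(a), find(b)
    # skip pairs already in the same group, otherwise a root would point
    # at itself and find() would recurse forever
    if root_a != root_b:
        leaders[root_a] = root_b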

# get groups of elements with the same leader
groups = defaultdict(set)
for x in leaders:
    groups[find(x)].add(x)
print(*groups.values())
# {1, 2, 4, 5, 100, 7, 200, 9, 10, 50, 21, 90} {8, 17} {3, 11, 15}

The runtime complexity of this should be about O(n log n) for n nodes: each find takes about log n steps (amortized, thanks to path compression) to get to (and update) the leader.
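
To illustrate where that bound comes from: the code above compresses paths, but it always attaches one root under the other without comparing group sizes. A common refinement, union by size, keeps the trees shallow and brings the amortized cost per operation down to nearly constant (inverse Ackermann). The sketch below is one way to write it; the `DisjointSet` class and its method names are made up for this example and are not a standard-library API.

```python
class DisjointSet:
    """Union-find with path compression and union by size (illustrative sketch)."""

    def __init__(self):
        self.parent = {}  # element -> parent; a root is its own parent
        self.size = {}    # root -> number of elements in its group

    def find(self, x):
        # Register unseen elements as singleton groups
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
            return x
        # Walk up to the root
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path straight at the root
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return  # already in the same group
        # Attach the smaller tree under the larger one
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size.pop(rb)

    def groups(self):
        # Collect elements by their root
        result = {}
        for x in list(self.parent):
            result.setdefault(self.find(x), set()).add(x)
        return list(result.values())


ds = DisjointSet()
# The last pair is redundant (9 and 1 are already connected) and is ignored
for a, b in [[1, 5], [5, 7], [8, 17], [7, 9], [9, 1]]:
    ds.union(a, b)
print(ds.groups())
```

Here the iterative `find` also avoids hitting Python's recursion limit on long chains, which the recursive version could do on very large inputs.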
