
Splitting A List By Matching A Regex To An Element

I have a list that has some specific elements in it. I would like to split that list into 'sublists' or different lists based on those elements. For example: test_list = ['a and

Solution 1:

If you want a one-liner,

from functools import reduce  # required on Python 3; reduce is a builtin on Python 2

new_list = reduce(lambda a, b: a[:-1] + [a[-1] + [b]] if not element_regex.match(b) or not a[0] else a + [[b]], test_list, [[]])

will do. The more Pythonic way, however, is to use a more verbose variant.

I did some speed measurements on a 4-core i7 @ 2.1 GHz. The timeit module ran the one-liner 1,000,000 times and needed 11.38s for that. Using groupby from the itertools module (Kasra's variant from the other answer) requires 9.92s. The fastest variant is the verbose version I suggested, taking only 5.66s:

new_list = [[]]
for i in test_list:
    if element_regex.match(i):
        new_list.append([])
    new_list[-1].append(i)
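
For reference, here is a self-contained sketch of the loop above wrapped in a function. The question's `element_regex` is not shown on this page, so the pattern below is a hypothetical stand-in that matches the "X and Y, ..." header lines, and the sample list is reconstructed from the output shown in Solution 2. The function name `split_on_match` is also just an illustration. One difference from the bare loop: because that loop starts from `[[]]`, it leaves an empty first sublist when the very first element matches, so this sketch only opens a new sublist on a match or when none exists yet.

```python
import re

# Hypothetical stand-in for the question's element_regex (not shown in the source).
element_regex = re.compile(r'.+ and .+, .+')


def split_on_match(items, pattern):
    """Start a new sublist at every item that matches `pattern`."""
    result = []
    for item in items:
        # Open a new sublist on a match, or when no sublist exists yet.
        if pattern.match(item) or not result:
            result.append([])
        result[-1].append(item)
    return result


# Sample data reconstructed from the output shown in Solution 2.
test_list = [
    'a and b, 123', '1', '2', 'x', 'y',
    'Foo and Bar, gibberish', '123', '321', 'June', 'July', 'August',
    'Bonnie and Clyde, foobar', 'today', 'tomorrow', 'yesterday',
]

new_list = split_on_match(test_list, element_regex)
for sublist in new_list:
    print(sublist)
```

Items that appear before the first matching element end up in a leading sublist of their own, which may or may not be what you want.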

Solution 2:

You don't need a regex for that; just use itertools.groupby:

>>> from itertools import groupby
>>> from operator import add
>>> g_list = [list(g) for k, g in groupby(test_list, lambda i: 'and' in i)]
>>> [add(*g_list[i:i+2]) for i in range(0, len(g_list), 2)]
[['a and b, 123', '1', '2', 'x', 'y'], ['Foo and Bar, gibberish', '123', '321', 'June', 'July', 'August'], ['Bonnie and Clyde, foobar', 'today', 'tomorrow', 'yesterday']]

First we group the list with the lambda function lambda i: 'and' in i, which flags the elements that contain "and". That gives us:

>>> g_list
[['a and b, 123'], ['1', '2', 'x', 'y'], ['Foo and Bar, gibberish'], ['123', '321', 'June', 'July', 'August'], ['Bonnie and Clyde, foobar'], ['today', 'tomorrow', 'yesterday']]

Then we have to concatenate each pair of adjacent lists, for which we use the add operator and a list comprehension.
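
The pairing step above relies on the groups strictly alternating between one header and its items: if the list ends with a header, the last slice holds a single list and `add` raises a TypeError, and two adjacent headers are merged into one group. As a rough sketch of a more tolerant variant of the same groupby idea (the function name `split_with_groupby` and the `is_header` parameter are hypothetical, not part of the original answer), the groups can be walked directly instead of being paired by index:

```python
from itertools import groupby


def split_with_groupby(items, is_header):
    """Group consecutive items, starting a new sublist at each header."""
    result = []
    for header_group, group in groupby(items, key=is_header):
        group = list(group)
        if header_group:
            # Each header starts its own sublist, even if several are adjacent.
            result.extend([header] for header in group)
        elif result:
            # Non-header items belong to the most recent header.
            result[-1].extend(group)
        else:
            # Items before the first header form a leading sublist.
            result.append(group)
    return result


# Sample data reconstructed from the output shown earlier in this answer.
test_list = [
    'a and b, 123', '1', '2', 'x', 'y',
    'Foo and Bar, gibberish', '123', '321', 'June', 'July', 'August',
    'Bonnie and Clyde, foobar', 'today', 'tomorrow', 'yesterday',
]

grouped = split_with_groupby(test_list, lambda i: 'and' in i)
for sublist in grouped:
    print(sublist)
```

Keep in mind that 'and' in i is a plain substring test, so it would also flag an element such as 'sandwich'; a compiled regex's match can be passed as the predicate if a stricter test is needed.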
