
Pandas Vectorized Operation To Get The Length Of String

I have a pandas dataframe:

df = pd.DataFrame(['Donald Dump','Make America Great Again!','Donald Shrimp'], columns=['text'])

What I like to have is another colum

Solution 1:

Use str.len:

print (df.text.str.len())
0    11
1    25
2    13
Name: text, dtype: int64

Sample:

import pandas as pd

df = pd.DataFrame(['Donald Dump','Make America Great Again!','Donald Shrimp'],
                   columns=['text'])
print (df)
                        text
0                Donald Dump
1  Make America Great Again!
2              Donald Shrimp

df['text_length'] = (df.text.str.len())                   
print (df)
                        text  text_length
0                Donald Dump           11
1  Make America Great Again!           25
2              Donald Shrimp           13

Solution 2:

I think the easiest way is to use the apply method of the column (a pandas Series). With this method you can run any Python function on each value.

You could do something like:

df['text_length'] = df['text'].apply(len)

to create a new column with the data you want.
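The two approaches agree on the sample data, but they can behave differently once the column contains a missing value. The sketch below illustrates this, assuming a hypothetical series with a None entry that is not part of the original question:

```python
import pandas as pd

df = pd.DataFrame(['Donald Dump', 'Make America Great Again!', 'Donald Shrimp'],
                  columns=['text'])

# On clean string data both approaches give the same lengths.
via_str = df['text'].str.len()
via_apply = df['text'].apply(len)
same_on_clean = via_str.tolist() == via_apply.tolist()

# Hypothetical series with a missing value, added for illustration only.
with_missing = pd.Series(['Donald Dump', None], name='text')

# str.len propagates the missing value instead of raising.
lengths_with_missing = with_missing.str.len()

# apply(len) calls the built-in len on every element,
# so the missing entry raises a TypeError.
try:
    with_missing.apply(len)
    apply_raised = False
except TypeError:
    apply_raised = True

print(same_on_clean, apply_raised)
print(lengths_with_missing)
```

If the column may contain missing values, str.len is probably the safer choice; apply(len) would need the gaps filled or dropped first.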

Edit: After seeing @jezrael's answer I was curious and decided to time both approaches with timeit. I created a DataFrame full of lorem ipsum sentences (101000 rows), and the difference is quite small. I got:

In [59]: %timeit df['text_length'] = (df.text.str.len())
10 loops, best of 3: 20.6 ms per loop

In [60]: %timeit df['text_length'] = df['text'].apply(len)
100 loops, best of 3: 17.6 ms per loop
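To repeat a comparison like this outside IPython, something along these lines should work with the standard timeit module. It is only a sketch: the test data (one sentence repeated 101000 times) and the function names are assumptions, not the original benchmark, and the timings will vary with the machine and the pandas version.

```python
import timeit

import pandas as pd

# Assumed test data: a single sentence repeated; the original used
# lorem ipsum sentences whose exact content is not given.
sentence = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.'
df = pd.DataFrame({'text': [sentence] * 101_000})


def with_str_len():
    # Length via the .str accessor.
    return df['text'].str.len()


def with_apply_len():
    # Length via the built-in len applied to each element.
    return df['text'].apply(len)


# Best of 3 repeats, 10 calls per repeat, reported in milliseconds per call.
for func in (with_str_len, with_apply_len):
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    print(f'{func.__name__}: {best * 1000:.1f} ms per call')
```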
