
Re.sub In Python 3.3

I am trying to change the text string from the form of file1 to file01. I am really new to Python and can't figure out what should go in the 'repl' location when trying to use a patte

Solution 1:

You could try this:

>>> import re
>>> text = 'file1 file2 file3'
>>> x = re.sub(r'file([1-9])', r'file0\1', text)
>>> x
'file01 file02 file03'

The parentheses around [1-9] create a capture group, and since it is the first pair of parentheses it is group 1. The replacement string refers to it as \1, meaning whatever the first group captured.
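To see what the group actually captures, it can help to look at the match object directly. This is only a small sketch; the variable names (m, whole, digit, result) are made up for illustration:

```python
import re

text = 'file1 file2 file3'

# re.search returns the first match; group(0) is the whole match,
# group(1) is whatever the first pair of parentheses captured.
m = re.search(r'file([1-9])', text)
whole, digit = m.group(0), m.group(1)
print(whole, digit)  # file1 1

# In the replacement string, \1 is substituted with that captured digit.
result = re.sub(r'file([1-9])', r'file0\1', text)
print(result)  # file01 file02 file03
```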

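Also, if you don't want to add the zero for files with two or more digits, you can require whitespace or the end of the string after the digit by adding (\s|$) to the regexp. It goes inside the capture group so that \1 puts the whitespace back: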

x = re.sub(r'file([1-9](\s|$))',r'file0\1',text)
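If the names might be followed by something other than whitespace, a negative lookahead is another option. The following is a sketch under that assumption, not part of the original answer, and the sample text is invented:

```python
import re

text = 'file1 file2 file10 file3'

# (?!\d) is a negative lookahead: it succeeds only if the next character
# is not a digit, and it does not consume that character.
result = re.sub(r'file([1-9])(?!\d)', r'file0\1', text)
print(result)  # file01 file02 file10 file03
```

Because the lookahead consumes nothing, the character after the digit does not need to be captured and re-inserted.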

Revisiting this answer, here is a more generic solution using str.format() and a lambda expression:

import re

fmt = '{:03d}'  # Let's say we want 3 digits with leading zeroes
s = 'file1 file2 file3 text40'
result = re.sub(r"([A-Za-z_]+)([0-9]+)",
                lambda x: x.group(1) + fmt.format(int(x.group(2))),
                s)
print(result)
# file001 file002 file003 text040

Some details about the lambda expression:

lambda x: x.group(1) + fmt.format(int(x.group(2)))
#         ^--------^   ^-^        ^-------------^
#          filename   format     file number ([0-9]+) converted to int
#        ([A-Za-z_]+)            so format() can work with our format

I am using the expression [A-Za-z_]+ on the assumption that the filename contains only letters and underscores before the trailing digits. Pick a more appropriate expression if required.
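One way to make the width and the name pattern adjustable is to wrap the substitution in a small function. This is a sketch only: pad_numbers, width and name_pattern are hypothetical names, and it uses str.zfill in place of str.format, which pads the digit string directly and so keeps any leading zeros that are already there.

```python
import re

def pad_numbers(s, width=3, name_pattern=r'[A-Za-z_]+'):
    """Zero-pad the digits that follow a name to at least `width` characters."""
    pattern = re.compile(r'(' + name_pattern + r')([0-9]+)')
    # str.zfill pads with leading zeros and leaves longer numbers untouched.
    return pattern.sub(lambda m: m.group(1) + m.group(2).zfill(width), s)

padded = pad_numbers('file1 file2 file3 text40')
print(padded)       # file001 file002 file003 text040

# Hypothetical variation: names that end in a hyphen, padded to 2 digits.
padded_dash = pad_numbers('img-7 img-12', width=2, name_pattern=r'[a-z]+-')
print(padded_dash)  # img-07 img-12
```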

Solution 2:

To match files with a single digit at the end, use a word boundary, \b:

>>> text = ' '.join('file{}'.format(i) for i in range(12))
>>> text
'file0 file1 file2 file3 file4 file5 file6 file7 file8 file9 file10 file11'
>>> import re
>>> re.sub(r'file(\d)\b', r'file0\1', text)
'file00 file01 file02 file03 file04 file05 file06 file07 file08 file09 file10 file11'
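One thing to keep in mind with \b is that the underscore counts as a word character, so a digit followed by '_' has no boundary after it. The sample names below are invented to illustrate this, and the lookahead variant is just one possible workaround:

```python
import re

text = 'file1 file2_old file3.txt file10'

# \b needs a word/non-word transition. '_' counts as a word character,
# so there is no boundary between '2' and '_' and file2_old is skipped.
with_boundary = re.sub(r'file(\d)\b', r'file0\1', text)
print(with_boundary)   # file01 file2_old file03.txt file10

# A negative lookahead for a digit also pads file2_old.
with_lookahead = re.sub(r'file(\d)(?!\d)', r'file0\1', text)
print(with_lookahead)  # file01 file02_old file03.txt file10
```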

Solution 3:

It is also possible to use \D|$ to check what follows the first digit: the match succeeds only when the digit is followed by a non-digit character or the end of the string, so names that already have two digits are left alone.

The following code achieves this:

import re

text = 'file1 file2 file3 file4 file11 file22 file33 file1'

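# (\D|$) sits inside group 1, so the character after the digit is kept by \1
x = re.sub(r'file([0-9](\D|$))', r'file0\1', text)

print(x)
# file01 file02 file03 file04 file11 file22 file33 file01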

Solution 4:

You could use groups to capture the parts that you wish to keep, then use those groups in the replacement text.

x = re.sub(r'file([1-9])', r'file0\1', text)

The capture group is created by wrapping part of the pattern in ( ). You can then refer to it in the replacement as \1 (or, equivalently, \g<1>), since we want the first group inserted.
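For reference, here is a short sketch of the other ways the re module lets a replacement refer to a group. The group name num is made up for this example:

```python
import re

text = 'file1 file2 file3'

# \g<1> is the explicit spelling of \1; it avoids ambiguity when the
# replacement has a literal digit right after the group reference
# (r'\10' would be read as group 10, not group 1 followed by '0').
numbered = re.sub(r'(file)([1-9])', r'\g<1>0\2', text)

# (?P<name>...) gives a group a name, referenced as \g<name>.
named = re.sub(r'file(?P<num>[1-9])', r'file0\g<num>', text)

print(numbered)  # file01 file02 file03
print(named)     # file01 file02 file03
```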

Solution 5:

I believe the following will help you. Its advantage is that it only inserts a '0' where there is a single digit after 'file' (thanks to the word-boundary special sequence \b):

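import re

text = 'file1 file2 file3'

# Names that end in exactly one digit
findallfile = re.findall(r'file\d\b', text)

textwithzeros = text
for instance in set(findallfile):
    # \b keeps e.g. file10 from being matched as file1
    padded = instance.replace('file', 'file0')
    textwithzeros = re.sub(re.escape(instance) + r'\b', padded, textwithzeros)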

'textwithzeros' should now be a new version of the 'text' string with a '0' inserted before each single-digit number. Try it out!
