Issues With Replacing Words In A String Using A Dictionary And The Replace() Function

Say I have a dictionary, a string and a list of the words in that string. Like this: the_dictionary={'mine': 'yours', 'I': 'you', 'yours': 'mine', 'you': 'I'} the_string='I though

Solution 1:

The issue is that you call replace on the_string each time, and when given the optional count argument, replace only replaces the first count occurrences of the source string, wherever they appear in the_string.

So, the first time you encounter mine in list_string, the_string gets changed to That is yours that is yours. So far, this is what is expected.

But later, you encounter yours in list_string, and you say the_string = the_string.replace('yours', 'mine', 1). So, the first occurrence of yours in the_string gets replaced with mine, which brings us back to the original string.
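The round trip can be reproduced in a few lines. The loop below is only a guess at what the question's code looks like, based on the description above (the function name replace_in_place is made up for illustration):

```python
the_dictionary = {'mine': 'yours', 'I': 'you', 'yours': 'mine', 'you': 'I'}

def replace_in_place(the_string):
    # Assumed shape of the failing loop: walk the words, but call
    # replace() on the whole string with count=1 each time.
    for word in the_string.split():
        if word in the_dictionary:
            # Replaces the FIRST occurrence anywhere in the_string,
            # not necessarily the word currently being visited.
            the_string = the_string.replace(word, the_dictionary[word], 1)
    return the_string

print(replace_in_place('That is mine that is yours'))  # swapped, then swapped back
print(replace_in_place('I thought that was yours'))    # happens to work
```

The second string works only because no replacement in it produces a word that is replaced again later.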

Here's one way to fix it:

In [78]: the_string="That is mine that is yours"

In [79]: the_dictionary={'mine': 'yours', 'I': 'you', 'yours': 'mine', 'you': 'I'}

In [80]: list_string = the_string.split()

In [81]: for i, word in enumerate(list_string):
   ....:     if word in the_dictionary:
   ....:         list_string[i] = the_dictionary[word]
   ....:

In [82]: print(' '.join(list_string))
That is yours that is mine
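The same idea can be written more compactly with dict.get, which returns the word itself when it is not a key. This is just a sketch; swap_words is a hypothetical helper name, not something from the question:

```python
the_dictionary = {'mine': 'yours', 'I': 'you', 'yours': 'mine', 'you': 'I'}

def swap_words(text, mapping):
    """Replace each whitespace-separated word that is a key of mapping."""
    # Each word is looked up exactly once, so a replaced word
    # can never be replaced a second time.
    return ' '.join(mapping.get(word, word) for word in text.split())

print(swap_words('That is mine that is yours', the_dictionary))
print(swap_words('I thought that was yours', the_dictionary))
```

Note that split() followed by ' '.join() collapses runs of whitespace into single spaces, which may or may not matter for your input.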

Solution 2:

Here's what's happening in your second example. Originally, you have:

the_string = "That is mine, that is yours"

Your script changes the first "mine" into "yours" which gives :

the_string = "That is yours, that is yours"

Then, when scanning the string again, it changes the first "yours" (which was just changed!) back to "mine", giving you the original phrase again:

the_string = "That is mine, that is yours"

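Well, then: why didn't it do the same for the first string? Because the result depends on the order in which the words get replaced. If that order comes from iterating over the dictionary, you can't rely on it in older Python versions (dictionaries only guarantee insertion order since Python 3.7). Sometimes you will get lucky and it will work, sometimes not.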

First, you want to make sure that once a word is changed, it doesn't get changed back again. So, keeping the structure of your original script, it's better to change the list than the string. You enumerate each item in the list, and if the item is one of the dictionary's keys (yup: you should always test against the keys, not the values), you replace it with the corresponding value. Then you turn the list back into a string:

the_dictionary = {'I': 'you', 'mine': 'yours','yours': 'mine', 'you': 'I'}

the_string1 = 'I thought that was yours'
the_string2 = 'That is mine that is yours'


# Word lists matching the two strings above (same as the_string.split())
list_string1 = ['I','thought','that','was','yours']
list_string2 = ['That','is','mine','that','is','yours']


# Replace each word that is a dictionary key, in place in the list
for i, word in enumerate(list_string1):
    if word in the_dictionary.keys():
        list_string1[i] = the_dictionary[word]
# Rebuild the string: one "%s " per word (leaves a trailing space)
the_string1 = "%s "*len(list_string1) % tuple(list_string1)

for i, word in enumerate(list_string2):
    if word in the_dictionary.keys():
        list_string2[i] = the_dictionary[word]
the_string2 = "%s "*len(list_string2) % tuple(list_string2)

print(the_string1)
print(the_string2)

I used enumerate(), which makes it easier to access both the index and the item of a list. Then I used a little trick to change the list back into a string. I'm not sure it's the best way: it leaves a trailing space, and ' '.join(list_string1) would be simpler. Of course, it would be better to wrap all of that up into a function. You can even split the string into a list of words with the regular expression module:

import re
the_string_list = re.findall(r'\w+',the_string)
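Going one step further, the regular expression module can also do the replacement itself in a single pass, which keeps punctuation such as the comma in "That is mine, that is yours" intact. This is only a sketch: swap_words_re is a made-up name, and it assumes that a "word" is whatever \w+ matches.

```python
import re

the_dictionary = {'I': 'you', 'mine': 'yours', 'yours': 'mine', 'you': 'I'}

def swap_words_re(text, mapping):
    """Swap every \\w+ word found in mapping, leaving everything else as is."""
    def lookup(match):
        word = match.group(0)
        # Fall back to the word itself when it is not a key
        return mapping.get(word, word)
    # re.sub scans the string once, left to right, so text that has
    # already been replaced is never examined again.
    return re.sub(r'\w+', lookup, text)

print(swap_words_re('That is mine, that is yours', the_dictionary))
print(swap_words_re('I thought that was yours', the_dictionary))
```

Because re.sub calls the function once per match and never rescans its own output, the "changed back" problem cannot occur here.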

Hope it helps !
