
How To Find Frequency Of The Keys In A Dictionary Across Multiple Text Files?

I am supposed to count the frequency of all the key values of dictionary 'd' across all the files in the document 'individual-articles'. Here, the document 'individual-articles' has

Solution 1:

Well, I'm not exactly sure what you mean by all the files in the document "X", but I assume it's analogous to pages in a book. With this interpretation, I would store the data in the simplest form possible. Putting data in an easily manipulated structure pays off later, because you can always add a function that produces whatever type of output you want.

Since it seems the main key you're looking at is the keyword, I would create a nested Python dictionary with this structure:

word_count_dict = {keyword: {file: count}}

Once it's in this form, you can do any type of manipulation on the data really easily.
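As a small illustration of that shape, here is a hand-written instance. The file names and counts are made up, and the extra "total" entry is an assumption that mirrors the running total kept alongside the per-file counts.

```python
# Hypothetical data, only to show the keyword -> {file: count} layout.
example = {
    "Britain": {"total": 5, "74.txt": 3, "75.txt": 2},
    "France": {"total": 1, "74.txt": 1},
}

# One lookup per level: first the keyword, then the file.
britain_in_74 = example["Britain"]["74.txt"]
print(britain_in_74)
```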

To create this dict,

import os

# yields the words of a file one at a time
def words_generator(fileobj):
    for line in fileobj:
        for word in line.split():
            yield word

word_count_dict = {}
for dirpath, dnames, fnames in os.walk("./"):
    for file in fnames:
        # join with dirpath so files in subdirectories open correctly
        with open(os.path.join(dirpath, file), "r") as f:
            for word in words_generator(f):
                if word not in word_count_dict:
                    word_count_dict[word] = {"total": 0}
                if file not in word_count_dict[word]:
                    word_count_dict[word][file] = 0
                word_count_dict[word][file] += 1
                word_count_dict[word]["total"] += 1

This will create an easily parsable dictionary.
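The question only asks about the keys of d, so a narrower variant may be enough. The sketch below assumes that each key of d is a single whitespace-delimited word and that 'individual-articles' is a directory of plain-text files. The function name count_keys and its parameters are hypothetical, not something from the question.

```python
import os
from collections import Counter

def count_keys(d, directory):
    """Return {key: {"total": n, filename: n, ...}} for each key of d."""
    result = {key: {"total": 0} for key in d}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        with open(path, "r") as f:
            # Count every whitespace-delimited word in this file once.
            counts = Counter(word for line in f for word in line.split())
        for key in d:
            if counts[key]:  # a Counter returns 0 for missing words
                result[key][name] = counts[key]
                result[key]["total"] += counts[key]
    return result
```

Calling count_keys(d, "individual-articles") would then give the same keyword-to-file layout as above, limited to the words you care about. Matching here is exact, so "Britain," with a trailing comma would not count as "Britain" unless the text is cleaned first.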

Want the total number of times the word Britain appears?

word_count_dict["Britain"]["total"]

Want the number of times Britain is in files 74.txt and 75.txt?

sum(word_count_dict["Britain"].get(file, 0) for file in ["74.txt", "75.txt"])

Want to see all files that the word Britain shows up in?

[file for file in word_count_dict["Britain"] if file != "total"]

You can of course write functions that perform these operations with a simple call.
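A minimal sketch of such helpers follows. The function names are hypothetical, and each one assumes the keyword-to-file layout with a "total" entry described above. Unknown words return 0 or an empty list instead of raising KeyError.

```python
def total_count(word_count_dict, word):
    """Total occurrences of word across all files."""
    return word_count_dict.get(word, {}).get("total", 0)

def count_in_files(word_count_dict, word, files):
    """Occurrences of word summed over the given file names."""
    per_file = word_count_dict.get(word, {})
    return sum(per_file.get(name, 0) for name in files)

def files_containing(word_count_dict, word):
    """File names in which word appears, excluding the 'total' entry."""
    return [name for name in word_count_dict.get(word, {}) if name != "total"]

# Made-up counts, for illustration only.
sample = {"Britain": {"total": 5, "74.txt": 3, "75.txt": 2}}
print(total_count(sample, "Britain"))
print(count_in_files(sample, "Britain", ["74.txt", "76.txt"]))
print(files_containing(sample, "Britain"))
```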
