ECX 30 Days of Code and Design
Day 21
Frequency Analyst
Task
Write a function that takes a string as input and:
- Returns a dictionary whose keys are the characters found in the text, and whose values are the number of occurrences of that character in the text. E.g.: f("It is good!") => {"I": 2, "t": 1, "s": 1, "g": 1, "o": 2, "d": 1, "!": 1}
- Write another function that takes an input string and returns a dictionary whose keys are the words in the text, and whose values are the respective frequencies of these words. E.g.: f("It is not good, is it?") => {"It": 2, "is": 2, "not": 1, "good": 1}
Note: In both cases, disregard case sensitivity.
My Approach
This is the full code. It is explained piece by piece in the rest of this article.
import re  # For findall() to create a list of letters inputted

# Dictionaries to store the counts of occurring letters and words
letter_dict = {}
word_dict = {}

# Function to count occurring letters
def letter_freq(word):
    """Counts the occurrences of letters in a string"""
    letter_list = re.findall(r'\w', word)
    for letter in letter_list:
        if letter in letter_dict:
            letter_dict[letter] += 1
        else:
            letter_dict[letter] = 1
    print(letter_dict)

# Function to count occurring words
def word_freq(word):
    """Counts the occurrences of words in a string"""
    word_list = re.findall(r'\w+(?:\'\w+)?', word)
    for word in word_list:
        if word in word_dict:
            word_dict[word] += 1
        else:
            word_dict[word] = 1
    print(word_dict)

print(' Words and Letters Counters '.center(40, '*'))
user_input = input('Enter word(s): ').lower()

# Function calls
letter_freq(user_input)
word_freq(user_input)
First, we import the re module and create two empty dictionaries. Their keys will be the letters and words found in the input, and their values will be the number of times each letter or word occurs.
import re
letter_dict = {}
word_dict = {}
Next, we define the function letter_freq(), which counts the number of times each letter occurs. Using the findall() function from the re module, we search the user's input for single characters matching \w and collect them in a list. Note that \w matches letters, digits, and the underscore, so spaces and punctuation are skipped. Using a for loop with if and else statements, we go through the list and check whether each letter is already in letter_dict; if it is, we increase its value by one; if not, we assign it a value of one.
def letter_freq(word):
    """Counts the occurrences of letters in a string"""
    letter_list = re.findall(r'\w', word)
    for letter in letter_list:
        if letter in letter_dict:
            letter_dict[letter] += 1
        else:
            letter_dict[letter] = 1
    print(letter_dict)
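As a side note, the task asks for a function that returns the dictionary and counts every character, including punctuation such as "!". A minimal sketch of that variant could look like the following; the name char_freq and the choice to skip whitespace are assumptions made for illustration, not part of the solution above.

```python
def char_freq(text):
    """Return a dict mapping each non-whitespace character to its count."""
    counts = {}
    for char in text.lower():      # lowercase first so 'I' and 'i' are counted together
        if char.isspace():         # assumption: spaces, tabs and newlines are not counted
            continue
        counts[char] = counts.get(char, 0) + 1   # get() returns 0 for unseen characters
    return counts


print(char_freq("It is good!"))
# {'i': 2, 't': 1, 's': 1, 'g': 1, 'o': 2, 'd': 1, '!': 1}
```

Because the dictionary is built and returned inside the function, the caller decides whether to print it, store it, or pass it on.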
Next, we define the function word_freq(), which counts the occurrences of each word. It is similar to letter_freq(), but here we build a list of words and go through it, checking whether each word is already in word_dict: if it is, we increment its value; if not, we assign it a value of one. The pattern \w+(?:\'\w+)? matches plain words as well as words with contractions, such as don't.
def word_freq(word):
    """Counts the occurrences of words in a string"""
    word_list = re.findall(r'\w+(?:\'\w+)?', word)
    for word in word_list:
        if word in word_dict:
            word_dict[word] += 1
        else:
            word_dict[word] = 1
    print(word_dict)
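To see what the pattern picks up, here is a small sketch that runs it on a sentence with contractions. It also uses collections.Counter from the standard library as a swapped-in alternative to the manual if/else counting; the names WORD_PATTERN and count_words are hypothetical.

```python
import re
from collections import Counter

# A word, optionally followed by an apostrophe and more word characters
WORD_PATTERN = r"\w+(?:'\w+)?"


def count_words(text):
    """Return a dict mapping each lowercase word to its count."""
    words = re.findall(WORD_PATTERN, text.lower())
    return dict(Counter(words))   # Counter does the tallying; dict() gives a plain dictionary


print(re.findall(WORD_PATTERN, "don't stop, it's fine"))
# ["don't", 'stop', "it's", 'fine']
print(count_words("It is not good, is it?"))
# {'it': 2, 'is': 2, 'not': 1, 'good': 1}
```

The group (?:...) is non-capturing, so findall() returns each whole match rather than only the part after the apostrophe.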
Finally, we ask for the user's input, convert it to lowercase using the lower() method, and pass it as an argument to the letter_freq() and word_freq() functions.
print(' Words and Letters Counters '.center(40, '*'))
user_input = input('Enter word(s): ').lower()
letter_freq(user_input)
word_freq(user_input)
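One thing to keep in mind with this design: because letter_dict and word_dict are created once at module level, the counts carry over if a function is called a second time in the same run. The sketch below contrasts that with building the dictionary inside the function; both helper names and the shared_counts variable are hypothetical.

```python
shared_counts = {}   # module-level dictionary, similar in spirit to letter_dict


def count_shared(text):
    """Add the characters of text to the module-level dictionary."""
    for char in text:
        shared_counts[char] = shared_counts.get(char, 0) + 1
    return shared_counts


def count_fresh(text):
    """Count the characters of text in a new dictionary on every call."""
    counts = {}
    for char in text:
        counts[char] = counts.get(char, 0) + 1
    return counts


count_shared("ab")
second_shared = dict(count_shared("ab"))   # copy of the state after the second call
count_fresh("ab")
second_fresh = count_fresh("ab")

print(second_shared)   # {'a': 2, 'b': 2}  (counts from both calls)
print(second_fresh)    # {'a': 1, 'b': 1}  (only the latest call)
```

For a script that reads input once, as here, the two behave the same; the difference only shows up when the functions are reused.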
Output
****** Words and Letters Counters ******
Enter word(s): Row, row, row your boat gently down the stream. Merrily, merrily, merrily, merrily; life is but a dream
{'r': 14, 'o': 6, 'w': 4, 'y': 6, 'u': 3, 'b': 2, 'a': 4, 't': 5, 'g': 1, 'e': 9, 'n': 2, 'l': 6, 'd': 2, 'h': 1, 's': 2, 'm': 6, 'i': 6, 'f': 1}
{'row': 3, 'your': 1, 'boat': 1, 'gently': 1, 'down': 1, 'the': 1, 'stream': 1, 'merrily': 4, 'life': 1, 'is': 1, 'but': 1, 'a': 1, 'dream': 1}
Run code in Replit.