Q:

Program to find the most repeated word in a text file


Explanation

In this program, we need to find the most repeated word in a given text file. Open the file in read mode and read it line by line. Split each line into words and store them in an array. Then iterate through the array, find the frequency of each word, and compare it with maxCount. If the frequency is greater than maxCount, store the frequency in maxCount and the corresponding word in the variable word. The content of the data.txt file used in the program is shown below.

data.txt  

A computer program is a collection of instructions that performs specific task when executed by a computer.

Computer requires programs to function.

Computer program is usually written by a computer programmer in programming language.

A collection of computer programs, libraries, and related data are referred to as software.

Computer programs may be categorized along functional lines, such as application software and system software.

Algorithm

  1. The variable maxCount stores the count of the most repeated word.
  2. Open the file in read mode using a file pointer.
  3. Read a line from the file. Convert the line to lowercase and remove the punctuation marks (the programs below remove commas and periods).
  4. Split the line into words and store them in an array.
  5. Use two loops to iterate through the array. The outer loop selects the word to be counted. The inner loop compares the selected word with the rest of the array and increments count by 1 for each match.
  6. If count is greater than maxCount, store count in maxCount and the corresponding word in the variable word.
  7. At the end, maxCount holds the maximum count and the variable word holds the most repeated word.
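
Steps 3 and 4 are easy to get subtly wrong, so here is a minimal sketch of them in isolation. The helper name `tokenize` is hypothetical (it is not part of the programs on this page), and it removes only commas and periods. It illustrates why splitting on any whitespace is safer than splitting on a single space: the newline at the end of a line would otherwise stay attached to the last word.

```python
def tokenize(line):
    """Lowercase a line, drop commas and periods, and split it into words.

    `tokenize` is a hypothetical helper used only for illustration.
    """
    cleaned = line.lower().replace(",", "").replace(".", "")
    # split() with no argument splits on any run of whitespace and
    # discards empty strings, so "\n" never ends up inside a word.
    return cleaned.split()


if __name__ == "__main__":
    line = "Computer requires programs to function.\n"
    # Whitespace split: the last word is clean.
    print(tokenize(line))
    # Single-space split: the last word keeps its trailing newline.
    print(line.lower().replace(".", "").split(" "))
```

With the single-space split, the last word comes out as `'function\n'`, which would never match a plain `'function'` elsewhere in the file.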

Input:

file = open("data.txt", "r")  

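data.txt file content (a second sample; quotation marks must be removed as well as commas and periods to obtain the output shown):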

The term "computer" is derived from Latin word "computare" which means to calculate. Computer is a programmable electronic device. Computer accepts raw data as input and processes it with set of instructions to produce result as output. The history of computer begins with the birth of abacus which is believed to be the first computer.

Output:

Most repeated word: computer
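
The result for this sample depends on how punctuation is handled, which the following sketch tries to make visible. It assumes the two-loop counting method from the algorithm; the names `most_repeated`, `words_of` and `TEXT` are hypothetical and chosen for illustration. When only commas and periods are removed, the quoted `"computer"` is a different token from `computer`, so `the` and `computer` tie at four occurrences and the word seen first is kept. When double quotes are removed too, `computer` is counted five times.

```python
def most_repeated(words):
    """Return the first word with the highest count, using two nested loops."""
    best_word, max_count = "", 0
    for i in range(len(words)):
        count = 1
        for j in range(i + 1, len(words)):
            if words[i] == words[j]:
                count += 1
        # Strictly greater: on a tie, the word seen first is kept.
        if count > max_count:
            max_count, best_word = count, words[i]
    return best_word


def words_of(text, strip_chars):
    """Lowercase text, delete every character in strip_chars, split on whitespace."""
    text = text.lower()
    for ch in strip_chars:
        text = text.replace(ch, "")
    return text.split()


# The sample paragraph shown in the Input section.
TEXT = (
    'The term "computer" is derived from Latin word "computare" which means '
    'to calculate. Computer is a programmable electronic device. Computer '
    'accepts raw data as input and processes it with set of instructions to '
    'produce result as output. The history of computer begins with the birth '
    'of abacus which is believed to be the first computer.'
)

if __name__ == "__main__":
    print(most_repeated(words_of(TEXT, ",.")))   # quotes kept: prints "the"
    print(most_repeated(words_of(TEXT, ',."')))  # quotes removed: prints "computer"
```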

All Answers

Python

word = ""
maxCount = 0
words = []

#Opens a file in read mode; the with block closes it automatically
with open("data.txt", "r") as file:
    #Gets each line till end of file is reached
    for line in file:
        #Lowercases the line, removes commas and periods, and splits it on any
        #whitespace so that the trailing newline never stays attached to a word
        string = line.lower().replace(',','').replace('.','').split()
        #Adding all words generated in previous step into words
        for s in string:
            words.append(s)

#Determine the most repeated word in a file
for i in range(0, len(words)):
    count = 1
    #Count each word in the file and store it in variable count
    for j in range(i+1, len(words)):
        if(words[i] == words[j]):
            count = count + 1

    #If maxCount is less than count then store value of count in maxCount
    #and corresponding word to variable word
    if(count > maxCount):
        maxCount = count
        word = words[i]

print("Most repeated word: " + word)

 

Output:

 Most repeated word: computer
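
The two nested loops compare every pair of words, so the work grows quadratically with the number of words. As an alternative (this is not the method used in the program above), a single pass with `collections.Counter` should give the same answer. The following is a minimal sketch under that assumption; the names `most_repeated_word` and `SAMPLE` are hypothetical.

```python
from collections import Counter


def most_repeated_word(lines):
    """Count words from an iterable of lines in a single pass.

    Hypothetical helper: it swaps the nested loops for collections.Counter.
    """
    counts = Counter()
    for line in lines:
        # Same normalization as the program above: lowercase, drop commas
        # and periods, split on whitespace.
        counts.update(line.lower().replace(",", "").replace(".", "").split())
    if not counts:
        return ""
    # most_common(1) returns a one-element list of (word, count).
    return counts.most_common(1)[0][0]


# First three lines of the data.txt content shown earlier.
SAMPLE = [
    "A computer program is a collection of instructions that performs "
    "specific task when executed by a computer.\n",
    "Computer requires programs to function.\n",
    "Computer program is usually written by a computer programmer in "
    "programming language.\n",
]

if __name__ == "__main__":
    print("Most repeated word: " + most_repeated_word(SAMPLE))
```

Because a file object is itself an iterable of lines, the same function can be called as `most_repeated_word(file)` inside a `with open("data.txt", "r") as file:` block.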

 

C

//Makes getline() available on POSIX systems
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include <sys/types.h>

#define MAX_WORDS 1000
#define MAX_LEN 100

//Static storage keeps this large array off the stack
static char words[MAX_WORDS][MAX_LEN];

int main()
{
    FILE *file;
    //getline() allocates the buffer itself when line is NULL and len is 0
    char *line = NULL;
    size_t len = 0;
    ssize_t read;
    char word[MAX_LEN] = "";
    int i = 0, j = 0, k, maxCount = 0, count;

    //Opens file in read mode
    file = fopen("data.txt","r");

    //If file doesn't exist
    if (file == NULL){
        printf("File not found");
        exit(EXIT_FAILURE);
    }

    //Since C has no built-in split function,
    //the following code splits the content of the file into words
    while ((read = getline(&line, &len, file)) != -1) {

        for(k = 0; line[k] != '\0' && i < MAX_WORDS; k++){
            //Here, i represents row and j represents column of two-dimensional array words
            if(!isspace((unsigned char)line[k]) && line[k] != ',' && line[k] != '.'){
                //Characters beyond MAX_LEN - 1 are dropped
                if(j < MAX_LEN - 1)
                    words[i][j++] = tolower((unsigned char)line[k]);
            }
            else if(j > 0){
                //End the current word only if it is not empty
                words[i][j] = '\0';
                //Increment row count to store new word
                i++;
                //Set column count to 0
                j = 0;
            }
        }
        //A line may end without a delimiter, e.g. the last line of the file
        if(j > 0){
            words[i][j] = '\0';
            i++;
            j = 0;
        }
    }
    free(line);

    int length = i;

    //Determine the most repeated word in a file
    for(i = 0; i < length; i++){
        count = 1;
        //Count each word in the file and store it in variable count
        for(j = i+1; j < length; j++){
            if(strcmp(words[i], words[j]) == 0){
                count++;
            }
        }
        //If maxCount is less than count then store value of count in maxCount
        //and corresponding word to variable word
        if(count > maxCount){
            maxCount = count;
            strcpy(word, words[i]);
        }
    }

    printf("Most repeated word: %s", word);
    fclose(file);

    return 0;
}

 

Output:

Most repeated word: computer

 

JAVA

import java.io.BufferedReader;  
import java.io.FileReader;  
import java.util.ArrayList;  
   
public class MostRepeatedWord {  
      
    public static void main(String[] args) throws Exception {  
        String line, word = "";  
        int count = 0, maxCount = 0;  
        ArrayList<String> words = new ArrayList<String>();  
          
        //Opens file in read mode  
        FileReader file = new FileReader("data.txt");  
        BufferedReader br = new BufferedReader(file);  
          
        //Reads each line  
        while((line = br.readLine()) != null) {  
            String string[] = line.toLowerCase().split("([,.\\s]+)");  
            //Adding all words generated in previous step into words  
            for(String s : string){
                //split() can return an empty first element, so skip empty strings
                if(!s.isEmpty()){
                    words.add(s);
                }
            }
        }  
          
        //Determine the most repeated word in a file  
        for(int i = 0; i < words.size(); i++){  
            count = 1;  
            //Count each word in the file and store it in variable count  
            for(int j = i+1; j < words.size(); j++){  
                if(words.get(i).equals(words.get(j))){  
                    count++;  
                }   
            }  
            //If maxCount is less than count then store value of count in maxCount   
            //and corresponding word to variable word  
            if(count > maxCount){  
                maxCount = count;  
                word = words.get(i);  
            }  
        }  
          
        System.out.println("Most repeated word: " + word);  
        br.close();  
    }  
}

  

 

Output:

Most repeated word: computer

 

C#

using System;  
using System.Collections;  
   
public class MostRepeatedWord  
{      
    public static void Main()  
    {  
        String line, word = "";  
        int count = 0, maxCount = 0;  
        ArrayList words = new ArrayList();  
          
        //Opens file in read mode  
        System.IO.StreamReader file = new System.IO.StreamReader(@"data.txt");   
          
        //Reads each line  
        while((line = file.ReadLine()) != null){  
            String[] string1 = line.ToLower().Split(new Char [] {',' , '.',' '},StringSplitOptions.RemoveEmptyEntries);  
            //Adding all words generated in previous step into words  
            foreach(String s in string1){  
                words.Add(s);  
            }  
        }  
          
        //Determine the most repeated word in a file  
        for(int i = 0; i < words.Count; i++){  
            count = 1;  
            //Count each word in the file and store it in variable count  
            for(int j = i+1; j < words.Count; j++){  
                if(words[i].Equals(words[j])){  
                    count++;  
                }   
            }  
            //If maxCount is less than count then store value of count in maxCount   
            //and corresponding word to variable word  
            if(count > maxCount){  
                maxCount = count;  
                word = (String) words[i];  
            }  
        }  
          
        Console.WriteLine("Most repeated word: " + word);  
        file.Close();  
    }  
}  

 

Output:

Most repeated word: computer

 

PHP

<!DOCTYPE html>  
<html>  
<body>  
<?php  
$word = "";  
$count = $maxCount = 0;  
$words = array();  
   
//Opens file in read mode  
$file = fopen("data.txt", "r");  
   
//Reads each line  
while (($line = fgets($file)) !== false) {  
    $line = strtolower($line);  
    $line = str_replace(',' ,'', $line);  
    $line = str_replace('.', '', $line);  
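    //Splits on any whitespace so the trailing newline is not kept in the last word
    $string = preg_split('/\s+/', $line, -1, PREG_SPLIT_NO_EMPTY);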
      
    //Adding all words generated in previous step into words  
    for($i = 0; $i < count($string); $i++){  
        array_push($words, $string[$i]);  
    }  
}  
   
//Determine the most repeated word in a file  
for($i = 0; $i < count($words); $i++){  
    $count = 1;  
    //Count each word in the file and store it in variable count  
    for($j = $i+1; $j < count($words); $j++){  
        if($words[$i] == $words[$j]){  
            $count++;  
        }   
    }  
    //If maxCount is less than count then store value of count in maxCount   
    //and corresponding word to variable word  
    if($count > $maxCount){  
        $maxCount = $count;  
        $word = $words[$i];  
    }  
}  
   
print("Most repeated word: " . $word);  
fclose($file);  
?>  
</body>  
</html>

  

 

Output:

Most repeated word: computer
