Friday, July 27, 2012

Anagram Checking Algorithm in ANSI C

An anagram is a word or phrase formed by reordering the letters of another word or phrase [Free Online Dictionary].

The easiest method to check if a string is an anagram of another string is to build the letter frequency statistics for both strings and compare them. If the statistics vectors are equal, then the strings are anagrams.

An implementation of such function is:
/*
 * Description:
 *  Checks if @anagram string is an anagram of the @original string
 * Parameters:
 *  original - the string to which @anagram will be compared to
 *  anagram  - the string who is going to be verified if its an anagram
 * Returns:
 *  true - @anagram is an anagram of @original
 *  false - @anagram is not a anagram of @original
 * Preconditions:
 *  @anagram and @original should only contain letters
 */
bool IsAnagram(const char* original, const char* anagram)
{
   /*Allocates on the stack two character vectors who shall
    be populated with the character statistics from the
    two sent strings*/
    int originalStats[NUMBER_OF_LETTERS];
    int anagramStats[NUMBER_OF_LETTERS];
    /*Computes the length of the sent strings*/
    int originalSize = strlen(original);
    int anagramSize = strlen(anagram);
    int i;
    char temp;
    bool isAnagram = true;

    /*Initializes the statistics vectors*/
    for(i=0; i<NUMBER_OF_LETTERS;i++)
    {
       originalStats[i] = 0;
       anagramStats[i] = 0;
    }
    /*Builds the stats for @anagram and original. If a non-letter
    is encountered, then the function will return false*/
    for(i=0; i<originalSize;i++)
    {
       temp = tolower(original[i]);
       if(islower(temp))
       {
          originalStats[temp-'a']++;
       }
    }
    for(i=0; i<anagramSize;i++)
    {
       temp = tolower(anagram[i]);
       if(islower(temp))
       {
          anagramStats[temp-'a']++;
       }
    }
    /*Compares the stats of the two strings. If the vectors are not
     identical, then @anagram is not an anagram of @original*/
    for(i=0; (i<NUMBER_OF_LETTERS && isAnagram==true); i++)
    {
       if(originalStats[i]!=anagramStats[i])
       {
          isAnagram = false;
       }
    }
    return isAnagram;
}
The function first builds the letter frequency vectors for both strings, counting only and only the letters (punctuation and whitespace are not relevant). All letters are transformed to lowercase. After that the function compares the statistics vector element by element. If two elements having the same index (corresponding to a letter) are not equal, then the strings are not anagrams.

Example:
#include<stdio.h>
#include<ctype.h>
#include<string.h>
#include<stdbool.h>

#define STRING_SIZE 50
#define NUMBER_OF_LETTERS 26

bool IsAnagram(const char* original, const char* anagram);


int main(void)
{
   /*The original string*/
   char original[]="Clint Eastwood";
   /*An anagram of the original string*/
   char anagram1[]="Old West Action";
   /*A string who's not an anagram of the original*/
   char anagram2[]="Westwood Action";
   if(IsAnagram(original,anagram1))
   {
      printf("%s is an anagram of %s\n",anagram1, original);
   }
   else
   {
      printf("%s is not an anagram of %s\n",anagram1, original);
   }
   if(IsAnagram(original,anagram2))
   {
      printf("%s is an anagram of %s\n",anagram2, original);
   }
   else
   {
      printf("%s is not an anagram of %s\n",anagram2, original);
   }
   return 0;
}
/*Output:
 Old West Action is an anagram of Clint Eastwood
 Westwood Action is not an anagram of Clint Eastwood
 */

No comments:

Post a Comment

Got a question regarding something in the article? Leave me a comment and I will get back at you as soon as I can!

Related Posts Plugin for WordPress, Blogger...
Recommended Post Slide Out For Blogger