WordStuff #3

[Home]   [Puzzles & Projects]    [Delphi Techniques]   [Math Topics]   [Library]   [Utilities]

 

Search

 

Search DelphiForFun.org only

Support DFF

 If you shop at Amazon anyway,  consider using this link. We receive a few cents from each purchase.   Thanks.

In Association with Amazon.com

 

Support DFF

 If you benefit from the website,  in terms of knowledge, entertainment value, or something otherwise useful, consider making a donation via PayPal  to help defray the costs.  (No PayPal account necessary to donate via credit card.)  Transaction is secure.

 

 

Contact

Feedback:  Send an e-mail with your comments about this program (or anything else).

 

Search DelphiForFun.org only

 

 

 

 

Problem Description

This is the second (now third) installment of the  word manipulation series of programs. The first Wordstuff1 included a basic set of dictionaries, a dictionary maintenance program and a word completion program called CrosswordHelper.


 

Seven  programs are now  included:

Unscramble an anagram solver that make one, two, or three-word phrases from a set of scrambled letters.

Decrypt a cryptogram solver. Cryptograms are messages encoded by replacing each letter value in the message with a simple substitution code. In this type of encrypting, a each letter is replaced with a fixed letter value. I.e. all "a" with "f", all "b", with "k", etc.

Word Completion - (Crossword Helper).  Give it a partial word with unknown letters indicated by space or underscore characters, and it will return a list of all words in it's dictionary which fit the "mask". 

WordLadder - Change "dirt" to "gold"  in four steps.   This program was originally posted as a standalone program and is now also available as part of  Wordstuff.

Scrambled Pie:  Supply the the missing letter to the four letter groups in this "pie" and unscramble the resulting words.

Spellbound:  A word search helper for finding all words embedded in a scrambled set of letters.

WordStuff: A "wrapper" program that lets the user start any of the other six.

Background & Techniques

Unscramble: The trick to solving anagrams on the computer, once we have a dictionary available, is to find letter combinations that match dictionary entries. Rather that generate all possible letter combinations and see which are valid words, its generally more efficient to check words in the dictionary and see which contain the letters were matching. The complication here is that, in addition to single words, we want to generate 2 and 3 word phrases that use all of the given letters. The user provides minimum and maximum word lengths to search. Within this range we generate a firstwords list of matching words and a corresponding list of remainders, the unused letters. Then, for each of the words in this list, well repeat the search procedure with the remainder list. If were looking for three word phrases, we repeat this one more time, making a remainders2 list to check for 3rd words.

A TWords object handles the actual checking of letters against the dictionary. The length range is determined by the max and min lengths specified by the user (except for the final search which uses a length equal to all remaining letters). A "FirstLetters" string is built containing the unique letters in the input letter string   used as starting points for word retrieval from the dictionary.

Decrypt: Cryptogram solving is another example of trial and error problem solving. There are, however, 26! (26 factorial = 26x25x24x3x2x1 or about 4X1026) possible substitution codes. Thus it is highly unlikely that we could decrypt the message by trying all codes. But we can use the dictionary to prune the search space significantly. I sort the encrypted words from longest to shortest on the (untested) theory that there are fewer long words than short words in the dictionary, thus we should prune invalid combinations quicker.  The trystring string is a list of the unique letters in order of appearance  this sorted list. Rather than generate a complete substitution code, we can start by creating one of the 26 single letter substitution codes. The order of the letters to translate to comes from the tryorder string. This list is currently defined as the frequency of occurrence of letters in the English language.

Now if we substitute the first letter in tryorder for the first letter in trystring everywhere it occurs and consider all other letters unknown, we basically have the word completion problem. If there is at least one possible dictionary word for each encrypted word, we can choose a second letter from tryorder, try it, then a 3rd letter, etc. If it fails, we just mark that letter as tried and go on to the next. By doing this recursively in the FindNextLetters function, well eventually resolve all of the words or run out of letters to try.

WordStuff: This is an experimental example of creating a "wrapper" program than manages several other programs. It takes care of loading the dictionaries and calling one of three programs: CrosswordHelper, Unscramble, or Decrypt, based on used selections. The code required to allow the same unit to run as a main form or as a secondary form is fairly simple. Each of the 3 programs has a test in its FormCreate method to se if it is the main form (If applicaton.mainform=self). If it is the mainform, the program knows that it is responsible for managing the dictionary itself. Otherwise it can use the pubdic (and possibly privdic) dictionaries defined in the Udict unit and preloaded by the calling program.

Crossword Helper: This is a word complwtion program implemented and described previously as part of WordStuff1 

Addendum Feb 8, 2003:  I posted a new version of Wordstuff today which incorporates a 4th program (WordLadder) under the Wordstuff wrapper as well as some minor cleanup.  

Addendum April 5, 2003:  A revised version of Decrypt was installed in Wordstuff  today.  This version is a little smarter about deciphering encoded messages.  It builds a list of possible decipherments for long words in the message and tests those letters first.    Remaining letters are tested in the order of frequency of occurrence in the message matched against order of occurrence in the English language.    Message words not in the dictionary are a problem that still needs work.  Manual intervention has been the most effective solution so far, and I did add some "Hints and Tips"  to the Introduction  page.    A few additions were made to the "Full.dic" dictionary based on decryption testing.  The dictionary contained "cryptogram" but "cryptograms" for example.   The dictionary structure probably needs to be modified to better handle the whole problem of word suffixes.  

Addendum July 7, 2004:  In response to a user request, I've modified CrossWord Helper today to accept a list of letters to be excluded from search results.  This was to to help him in deciphering phrases in a computer game called Flip Words.   Once some letters are known, the player may guess the phrase at any time.    Excluding letters known to be in the phrase (but not in the word of interest) can reduce the number of potential answers.  

Addendum August 25, 2005:    A 5th program, Scrambled Pie, previously posted, was added to Wordstuff today.   

Addendum May 11,2006:  An update to Dicmaint was posted today with two changes:

bullet If the 'SaveAs' option was used to save the dictionary as a text file, capitalization information was lost.  Capitalized words are now saved as capitalized and the characters ",C" are appended to match the ",A" and ",F" used to flag abbreviations and foreign words.
bulletA user recent wrote about problems he was having trying to create a dictionary from a file with 350,000 French words.  Dictmaint was taking a very long time to process the file.  It turned out that the file was in Unix format (no Carriage returns) which made it look like a single record 3.7 million bytes in length to any windows standard program.  Dicmaint used a destructive Getword routine - each word was deleted from the record in memory as it was retrieved, a very slow process for a record of that length.  Getword  now extracts words without modifying the record.  It turns out that using the FileFix utility to correct the input file was even more significant it getting run time down to a few seconds.   But I left the new Getword routine in as an improvement anyway.   DFFLibV14_24Mar2014

Addendum November 17, 2006:  A few enhancements to Decrypt, the decryption function of Wordstuff were posted today.    Users may now selectively use or omit words in the decrypted message during the search for a solution.   Also specified plaintext words may be ignored and not counted as valid words in the deciphered message.    

Addendum December 7, 2006:  A small fix to Dicmaint was made today : When saving a text version of a dictionary, it should automatically have been saved as a text file.  The test for file extension was for '.ext' instead of '.txt'  so a .'.txt' dictionary would have been saved in compressed (.dic) format.

The change  uncovered the fact that the dictionary unit, UDict, which has been part of our DFF Library file some some time, was also included in the source code zip files for Dicmaint and Wordstuff.  As a result, naturally,   the UDict included with these programs was not the latest version.   As of today, UDict has been removed from the program source files and users wishing to recompile these programs will need  the DFF library zip file, version DFFLibV08 or later.     

Addendum July 8, 2007:  A few small minor program changes were posted today.  Namely:

bulletCrossword Helper was enhanced to solved a variation of the word completion problem.  My Mensa Daily Puzzle calendar has a puzzle type with a scrambled word that is missing one letter, kind of a combination of the "Crossword Helper" and "Unscramble" programs.  CrosswordHelper2 now uses some of the code from the Scrambled Pie solver to find the word when there is a single letter missing but no solution is found initially.  For example, the previous version would have completed "b?by" as "baby", but would have found no solution for "bby?".   The new version will ask if you want to search for anagrams and return "baby" if you click the Yes button.   It will also find the solution for the "ARUI?YNGR" calendar puzzle that led to the enhancement in the first place.
bulletSearch logic used in Scrambled Pie was moved to a new unit "USearchAnagrams" which is now used by both ScrambledPie and CrosswordHelper2 to implement the enhancement described above.
bulletI add words to full.dic whenever it failed to contain the answer to some word puzzle I'm working on.  The Dictionaries.zip download file was updated to contain the latest version of our full.dic dictionary file.  

Addendum August 10, 2007: Another small update to Crosswordhelper this month.  I'm not sure when it got broken, but a viewer recently pointed that the "Excluded letters" option did not work.  Crossword Helper displays a list of valid words from the current dictionary when  some of the letters and their positions are known.   The excluded letters list specifies letters that are not to be used in completing the word,  however the code was checking upper case word letters against lower case excluded letters.   Not surprisingly, it never found a match and included all letters in the resulting display.   Now fixed 

Addendum September 19, 2009: A recent laptop upgrade revealed some screen layout problems with some of the DFF programs, include the WordStuff modules described here.   See the DPI Scaling page for more details.  No significant changes otherwise.

Addendum April 25, 2010: Another minor change today to fix a problem that has bugged me for a long time.  When entering a partial word to be analyzed in "Word Completion" or letters to be analyzed in "Unscramble" it had been necessary to switch from keying to a mouse click on the Search button to find solutions.  Now, pressing the "Enter" key also triggers the search.  The simple code in the edit control's OnKeyPress exit looks like this (red text = comments):
    begin
        if ord(key)=13 then  
{Enter key was pressed}
        begin
            key:=char(0);
{null out the key}
            SolveBtnClick(sender);
{call the search button click routine}

        end;
    end;

October 14,  2010: A new program, Spellbound, was added to the Wordstuff whapper today.  Like the others, the source  is also available as a standalone program.  Spellbound helps in a game of the same name posted in the games section of the AARP website (http://games.aarp.org/games/spellbound.aspx).  The game's objective is to form as many words as possible from a given set of letters with a time limit on the search.   My program is a "helper" which searches a chosen dictionary for embedded words and displays the results.   I may add user play at a future date, but figuring out an efficient way to test all permutations of all subsets of a given set of letters was enough fun for awhile.  

June 17, 2011:  I recently posted a Delphi Techniques program about using Delphi's OnIdle exit to perform background processing without interfering with user actions.  As a more practical example, I just posted Version 2 of our Spellbound program to allow user play  (enter words which can be made from a given set of letters) while the program does its own search in the background.  If you approach it as a competition, I can pretty much guarantee that the program will win - at least that's the case for me.  It was a good programming exercise though to let the program search in lots of small chunks while you are thinking and still respond instantly when you type or click.  Wordstuff3 was recompiled to include the new version also.   

October 14, 2011:   Another update today to a program included under the Wordstuff umbrella, Unscramble this time.  I added an option to handle cases where, rather than being scrambled, words are "interlaced".  Letters for 2 or 3 words appear in the correct order but interspersed with letters of the other words.  So, for example, "CAT" and "'FROG" could appear as "CFRAOGT".  A new checkbox on the Unscramble form indicates that the letters represent interlaced words.   The sample set of 17 letters included on the form represent 3 synonymous words and are from a recent Mensa Puzzle Calendar page.  I did not successfully identify the words using human brainpower, but the program finds them in a minute or two.         

Running/Exploring the Program 

The Dictionaries and DicMaint downloads below duplicate the download on the  WordStuff 1 page.  The dictionaries were last updated on 7/3/07.  If you have downloaded dictionaries  once since then, there is no need to download again.  If you have not previously downloaded the dictionaries,  you will need to do so, before running WordStuff 2.

bulletDictionaries & DicMaint
bulletDownload Dictionaries
bullet Download DicMaint source
bullet Download  DicMaint executable
bulletWordStuff 
bulletDownload source (for all 6 programs + WordStuff wrapper)
bulletDownload  WordStuff executable 
 
bulletDFF Library
bulletDownload current DFF Source code library (DFFLibV14_24Mar2014) To recompile the source code for  any of the above programs, you will need the UDict unit contained in DFF library file, DFFVLib08 or later.

  

bulletLazarus Code
bulletDownload Lazarus Dicmaint Source
bullet Download Lazarus Source (for all 6 programs + WordStuff wrapper)
bulletDownload Lazarus Source Code Library

Suggestions for Further Explorations

Decrypt has lots of room for study. Should  the "trystring" function be built based on frequency of letters in the message, or order of occurrence when words are sorted from shortest to longest? Perhaps doubled letters should be tried first, or look for digraphs (letter pairs that occur frequently), word endings, etc. The first step would be to implement timing code so that the efficiency of various techniques could be explored.
The current version of Decrypt is "all or  nothing",  a single encrypted word not in the dictionary will result in a "No solution found" message.  That's not a good thing.  
Personally, Im going on to develop other word games using the dictionary. Next will be WordFinder find as many valid words as possible from a random array of letters by moving horizontally, vertically, or diagonally from letter to letter. More path searching for the auto-solve part - oh boy!  

 

Created:  February 4,  2001

Modified: April 06, 2013

 

  [Feedback]   [Newsletters (subscribe/view)] [About me]
Copyright 2000-2014, Gary Darby    All rights reserved.