[Home] [Puzzles & Projects] [Delphi Techniques] [Math Topics] [Library] [Utilities]
This is the second (now third) installment of the word manipulation series of programs. The first Wordstuff1 included a basic set of dictionaries, a dictionary maintenance program and a word completion program called CrosswordHelper.
Seven programs are now included:
Unscramble – an anagram solver that make one, two, or three-word phrases from a set of scrambled letters.
Decrypt – a cryptogram solver. Cryptograms are messages encoded by replacing each letter value in the message with a simple substitution code. In this type of encrypting, a each letter is replaced with a fixed letter value. I.e. all "a" with "f", all "b", with "k", etc.
Word Completion - (Crossword Helper). Give it a partial word with unknown letters indicated by space or underscore characters, and it will return a list of all words in it's dictionary which fit the "mask".
WordLadder - Change "dirt" to "gold" in four steps. This program was originally posted as a standalone program and is now also available as part of Wordstuff.
Scrambled Pie: Supply the the missing letter to the four letter groups in this "pie" and unscramble the resulting words.
Spellbound: A word search helper for finding all words embedded in a scrambled set of letters.
WordStuff: A "wrapper" program that lets the user start any of the other six.
Background & Techniques
Unscramble: The trick to solving anagrams on the computer, once we have a dictionary available, is to find letter combinations that match dictionary entries. Rather that generate all possible letter combinations and see which are valid words, it’s generally more efficient to check words in the dictionary and see which contain the letters we’re matching. The complication here is that, in addition to single words, we want to generate 2 and 3 word phrases that use all of the given letters. The user provides minimum and maximum word lengths to search. Within this range we generate a firstwords list of matching words and a corresponding list of remainders, the unused letters. Then, for each of the words in this list, we’ll repeat the search procedure with the remainder list. If we’re looking for three word phrases, we repeat this one more time, making a remainders2 list to check for 3rd words.
A TWords object handles the actual checking of letters against the dictionary. The length range is determined by the max and min lengths specified by the user (except for the final search which uses a length equal to all remaining letters). A "FirstLetters" string is built containing the unique letters in the input letter string – used as starting points for word retrieval from the dictionary.
Decrypt: Cryptogram solving is another example of trial and error problem solving. There are, however, 26! (26 factorial = 26x25x24…x3x2x1 or about 4X1026) possible substitution codes. Thus it is highly unlikely that we could decrypt the message by trying all codes. But we can use the dictionary to prune the search space significantly. I sort the encrypted words from longest to shortest on the (untested) theory that there are fewer long words than short words in the dictionary, thus we should prune invalid combinations quicker. The trystring string is a list of the unique letters in order of appearance this sorted list. Rather than generate a complete substitution code, we can start by creating one of the 26 single letter substitution codes. The order of the letters to translate to comes from the tryorder string. This list is currently defined as the frequency of occurrence of letters in the English language.
Now if we substitute the first letter in tryorder for the first letter in trystring everywhere it occurs and consider all other letters unknown, we basically have the word completion problem. If there is at least one possible dictionary word for each encrypted word, we can choose a second letter from tryorder, try it, then a 3rd letter, etc. If it fails, we just mark that letter as tried and go on to the next. By doing this recursively in the FindNextLetters function, we’ll eventually resolve all of the words or run out of letters to try.
WordStuff: This is an experimental example of creating a "wrapper" program than manages several other programs. It takes care of loading the dictionaries and calling one of three programs: CrosswordHelper, Unscramble, or Decrypt, based on used selections. The code required to allow the same unit to run as a main form or as a secondary form is fairly simple. Each of the 3 programs has a test in its FormCreate method to se if it is the main form (If applicaton.mainform=self). If it is the mainform, the program knows that it is responsible for managing the dictionary itself. Otherwise it can use the pubdic (and possibly privdic) dictionaries defined in the Udict unit and preloaded by the calling program.
Crossword Helper: This is a word complwtion program implemented and described previously as part of WordStuff1
Addendum Feb 8, 2003: I posted a new version of Wordstuff today which incorporates a 4th program (WordLadder) under the Wordstuff wrapper as well as some minor cleanup.
Addendum April 5, 2003: A revised version of Decrypt was installed in Wordstuff today. This version is a little smarter about deciphering encoded messages. It builds a list of possible decipherments for long words in the message and tests those letters first. Remaining letters are tested in the order of frequency of occurrence in the message matched against order of occurrence in the English language. Message words not in the dictionary are a problem that still needs work. Manual intervention has been the most effective solution so far, and I did add some "Hints and Tips" to the Introduction page. A few additions were made to the "Full.dic" dictionary based on decryption testing. The dictionary contained "cryptogram" but "cryptograms" for example. The dictionary structure probably needs to be modified to better handle the whole problem of word suffixes.
Addendum July 7, 2004: In response to a user request, I've modified CrossWord Helper today to accept a list of letters to be excluded from search results. This was to to help him in deciphering phrases in a computer game called Flip Words. Once some letters are known, the player may guess the phrase at any time. Excluding letters known to be in the phrase (but not in the word of interest) can reduce the number of potential answers.
Addendum August 25, 2005: A 5th program, Scrambled Pie, previously posted, was added to Wordstuff today.
Addendum May 11,2006: An update to Dicmaint was posted today with two changes:
Addendum November 17, 2006: A few enhancements to Decrypt, the decryption function of Wordstuff were posted today. Users may now selectively use or omit words in the decrypted message during the search for a solution. Also specified plaintext words may be ignored and not counted as valid words in the deciphered message.
Addendum December 7, 2006: A small fix to Dicmaint was made today : When saving a text version of a dictionary, it should automatically have been saved as a text file. The test for file extension was for '.ext' instead of '.txt' so a .'.txt' dictionary would have been saved in compressed (.dic) format.
The change uncovered the fact that the dictionary unit, UDict, which has been part of our DFF Library file some some time, was also included in the source code zip files for Dicmaint and Wordstuff. As a result, naturally, the UDict included with these programs was not the latest version. As of today, UDict has been removed from the program source files and users wishing to recompile these programs will need the DFF library zip file, version DFFLibV08 or later.
Addendum July 8, 2007: A few small minor program changes were posted today. Namely:
Addendum August 10, 2007: Another small update to Crosswordhelper this month. I'm not sure when it got broken, but a viewer recently pointed that the "Excluded letters" option did not work. Crossword Helper displays a list of valid words from the current dictionary when some of the letters and their positions are known. The excluded letters list specifies letters that are not to be used in completing the word, however the code was checking upper case word letters against lower case excluded letters. Not surprisingly, it never found a match and included all letters in the resulting display. Now fixed
Addendum September 19, 2009: A recent laptop upgrade revealed some screen layout problems with some of the DFF programs, include the WordStuff modules described here. See the DPI Scaling page for more details. No significant changes otherwise.
Addendum April 25, 2010: Another minor
change today to fix a problem that has bugged me for a long time. When
entering a partial word to be analyzed in "Word Completion" or letters to be
analyzed in "Unscramble" it had been necessary to switch from keying to a mouse
click on the Search button to find solutions. Now, pressing the "Enter"
key also triggers the search. The simple code in the edit control's OnKeyPress exit looks like this (red text = comments):
October 14, 2010: A new program, Spellbound, was added to the Wordstuff whapper today. Like the others, the source is also available as a standalone program. Spellbound helps in a game of the same name posted in the games section of the AARP website (http://games.aarp.org/games/spellbound.aspx). The game's objective is to form as many words as possible from a given set of letters with a time limit on the search. My program is a "helper" which searches a chosen dictionary for embedded words and displays the results. I may add user play at a future date, but figuring out an efficient way to test all permutations of all subsets of a given set of letters was enough fun for awhile.
June 17, 2011: I recently posted a Delphi Techniques program about using Delphi's OnIdle exit to perform background processing without interfering with user actions. As a more practical example, I just posted Version 2 of our Spellbound program to allow user play (enter words which can be made from a given set of letters) while the program does its own search in the background. If you approach it as a competition, I can pretty much guarantee that the program will win - at least that's the case for me. It was a good programming exercise though to let the program search in lots of small chunks while you are thinking and still respond instantly when you type or click. Wordstuff3 was recompiled to include the new version also.
October 14, 2011: Another update today to a program included under the Wordstuff umbrella, Unscramble this time. I added an option to handle cases where, rather than being scrambled, words are "interlaced". Letters for 2 or 3 words appear in the correct order but interspersed with letters of the other words. So, for example, "CAT" and "'FROG" could appear as "CFRAOGT". A new checkbox on the Unscramble form indicates that the letters represent interlaced words. The sample set of 17 letters included on the form represent 3 synonymous words and are from a recent Mensa Puzzle Calendar page. I did not successfully identify the words using human brainpower, but the program finds them in a minute or two.
November 22, 2014: Two programs changed with today's posting"
Running/Exploring the Program
The Dictionaries and DicMaint downloads below duplicate the download on the WordStuff 1 page. The dictionaries were last updated on 7/3/07. If you have downloaded dictionaries once since then, there is no need to download again. If you have not previously downloaded the dictionaries, you will need to do so, before running WordStuff 2. The current dictionary download also contains local additions since July 2012.
Suggestions for Further Explorations
Copyright © 2000-2014, Gary Darby
All rights reserved.