It's an interesting article, but I would have thought that the best\* letter to ...

sdevlin · on March 29, 2012

Your intuition is telling you that you should shoot for 50% so that you can divide the search space in half regardless of the outcome. A right or wrong answer will eliminate half of the possible remaining words.

But right and wrong guesses are not equally valuable in hangman. A wrong guess will eliminate all words of a given length containing the given letter. But a correct guess gives you more: you get the location (or locations) of the letter in the word as a bonus. This will reduce your remaining search space drastically.

Thus, you should optimize for the success of your next guess.

karamazov · on March 29, 2012

Additionally, you're not penalized for correct guesses.

Jabbles · on March 29, 2012

It depends whether you care about minimising the number of errors, or just wish to keep it below 11 - thereby minimising your chance of losing. These are subtly different and depend on your "reward function".

beza1e1 · on March 29, 2012

If you want to minimize the number of errors, you should try the letter with the most information. In other words, minimize the search space. That is what hythloday suggested.

Someone · on March 29, 2012

I do not think that is correct. Let's say that, at some time, you know the word has an E, with 99% probability at position 1 and with 1% at position 2. You also know there is 50% chance that the word has an O. I would guess the E, because it comes free, even though it is not the one that gives the most information.

You must somehow weigh information gain against risk.

beza1e1 · on March 30, 2012

Assuming there is always a letter with a 50% chance and we choose wrong 11 times in a row. One more and game over. At this point we have 11 Bits of information, which is enough to distuingish between 2048 words.

Actually, we should be correct 50% of the time. Which means 22 Bits of information or 4194304 words.

Additionally, we know the length of the word.

The english dictionaries seem to have between 400k and 1000k words [0] of all word sizes. With 22 Bits we get 4000k words. We do not have to worry about getting hanged using the information-reduction algorithm. ;)

[0] http://hypertextbook.com/facts/2001/JohnnyLing.shtml

beza1e1 · on March 30, 2012

I have to correct myself. You do not want a 50% chance, because the answer does not provide 1Bit of information. If we get it right, we also know the position(s) of the letter.

More precisely: Given a dictionary of all possible words and a letter to guess, we can split the dictionary into k parts. One sub-dictionary contains all words for which the answer is negative. Additionally, we have k-1 sub-dictionaries for each equivalence class of letter positions.

For each letter, we can compute the partition. Now we need to choose the strategically best partition. I believe it should be the one with "the lowest average sub-dictionary size".

kd5bjo · on March 29, 2012

The simplest way I know of to do that is by a weighted linear combination. Both information gain and risk are quantifiable; one in bits and the other in probability (or log-odds which is also bits). Then the problem reduces to finding appropriate weights, which can be done with your favorite n-dimensional search algorithm.

It may also be useful to consider how many remaining incorrect guesses there are in the game.

klipt · on March 29, 2012

If you guess a letter contained in 99% of remaining possible words, then either

- you are correct (happens 99% of the time). No penalty, and you find out at which positions your guessed letter occurs, a decent information gain.

- you are incorrect. You suffer a penalty but eliminate 99% of the remaining words, a massive information gain.

How would 50% be preferable in either case?

squeakynick · on March 30, 2012

No, we should still go for the letter with the highest frequency.

If we get a hit, it's not just the presence, or not, of the letter, we learn both the number of hits, and the positions of these hits. These are incredibly valuable pieces of information (more valuable that dividing the remaining words into two equal piles).

If we miss, we've carved away and removed the biggest non-possible set of words with one guess.