Is this problem useful for anything? I've seen multiple variations of these prisoner problems now, and I can never remember any being useful in other contexts.
Well, first of all it's a puzzle so it's useful as something fun to think about. Math is kinda like science though, you never know exactly how or if a particular discovery will be useful. In this case maybe you could find some graph traversal scenario where this would be applicable. In the Wikipedia article under variants they mention
> If the number of team members and the fraction of boxes which are opened is fixed, the winning probability stays strictly larger than zero when more empty boxes are added
Replace boxes with nodes and now you have some math to determine graph traversals. Maybe a bit more math and you can expand it for multiple paths. I don't know though, I'm not a mathematician and I don't normally have to do more than DFS and BFS in coding interviews, but sure I can see some way that this might be useful.
Genetics for one. If a DNA primer-pair is looking for it's complement [0], then this strategy could be useful. Here the prisoners are the complementary primers and the DNA-sequence-to-be-matched-to is the cabinet. Shepherding proteins/histones could restrict in a nucleic bottleneck of some sort, such that only one primer gets to go at once without repeating [1]. Only when the primers are all set-up is the DNA 'free' to be transcribed or some such thing.
[0] For a sequence like ATGC, the primer is the opposite nucleotide, therefore the 'matching number' would be not ATGC as well, but TACG.
[1] Look, bio is weird, like, jumping genes are actually a thing. This set-up, though strange, is not as unreasonable as a lot of stuff that goes on.
> Because the prisoner starts on the box of their own number they are, by definition, on the chain that contains their ticket (there is only one ticket that points to that box).
Wikipedia provides exactly the same explanation that your link does:
> The prison director's assignment of prisoner numbers to drawers can mathematically be described as a permutation of the numbers 1 to 100.
> Every permutation can be decomposed into disjoint cycles, that is, cycles which have no common elements.
> In the initial problem, the 100 prisoners are successful if the longest cycle of the permutation has a length of at most 50. Their survival probability is therefore equal to the probability that a random permutation of the numbers 1 to 100 contains no cycle of length greater than 50.
What did you think was missing from the Wikipedia article?
> The cycle notation is not unique since a cycle of length l can be written in l different ways depending on the starting number of the cycle. During the opening the drawers in the above strategy, each prisoner follows a single cycle which always ends with his own number.
Think of it this way: you are always in a chain. It can be between 1 and 100 boxes long. But it can’t be a dead end or not include the 1-note. Worse case it visits all boxes and gets back to 1.
Try forming a scenario that doesn’t. E.g a “loop”:
Prisoner #1 opens box 1. Finds the number 2. Opens box 2 finds the number 3. Opens box 3...
Could he now find the “2” that would lead to him being stuck in the 2-3 loop? No. He already found the 2. It can’t appear again. In the third box he’ll find a number he hasn’t seen. The loop cannot form in any other way than to come to where he started - which it will do when he finds the note with “1” on it.
So by starting with box 1, he is on a loop of length 1..100 including the pointer TO box 1 (which is what he is looking for).
Ok, so I‘m walking a linked list that will lead to my number. But I still don’t get why the odds are better than just opening the first 50 boxes and hope for the best? What changes because I‘m on the chain?
The chance of you finding your number with 50 random opens is 50%. But you don't live jusr because you find your number. You only live if everyone finds their number. And that chance is slim: it's .5^100 which is a very small number.
The chance of finding your number using the chain algo is the same as the chance that your number is found before the 50th step of your chain. That probability is still 50% because you are opening at most 50 boxes of a 100, the order doesn't matter. This is why you feel nothing will change, that the order shouldn't matter. That is correct: each prisoner opens 50 boxes without any prior information.
But the key is what changes for the probability of success for the group.
All prisoners use the same shuffling (it's only randomized once). Everyone will succeed when walking their chains, if the longest chain is shorter than 50. That is, if there are two chains of 60,40 then the group will fail. Because some of the prisoners who are unlucky and have numbers in the 60 loop will to reach their number in 50 steps.
But if there are two chains of 50,50 or 3 chains 30,30,40 etc., then everyone will succeed.
So the difference is that with random choice, YOUR chance of finding YOUR number is 50%. The chance that all 100 does so you live, is .5^100.
With the chain walk, YOUR chance is still 50%, as is everyone elses. Yet the risk that someone fails is now just 70% instead 99.9999999...% (Now is no multiplication - the probabilities are not independent - the probability of success for the group is now a property of the loops rather than the product of 100 independent events)
I think this explanation makes the most sense, since it explains the non-independence of the player's actions. Another way I thought of it was this: imagine (despite the prisoners' knowledge) each prisoner actually had the same number. The "choose at random" strategy would still have a vanishingly small probably of success, while (any) deterministic strategy would yield a 50% rate of group success.
Imagine Strategy A, where everyone started at box 1 and checked 1-50. Half of the prisoners would succeed, guaranteed, but half would fail, guaranteed, and prisoners would never win.
A slightly better Strategy B is to start at their own box and all subsequnt ones, wrapping around at 100. Now you have at least a small chance, but each prisoner winning independently is low.
What cycle method Strategy C does is group up failures. When prisoners fail because there's a cycle of size 51+, 51 fail. When there's a cycle of 99 boxes, all prisoners fail. But in exchange, there's more possibilities where there aren't cycles that long.
So it's not that the odds are better for an individual as much as the odds are better that the individual's successes coincide. In Strategy A, the success of an individual prisoner decreases the estimated success probability for subsequent prisoners. In Strategy B, that decreased probability is nullified by incrementing the starting location by one step. In strategy C, a prisoner's success or failure amplifies the estimated success probability for others. Not by changing any state, but revealing the state itself.
The success criteria goes from the prisoners randomly picking correctly to the warden not randomly creating a loop larger than 50. If the warden creates no loop greater than 50 then the prisoners win. If he creates one of 51 or more then the prisoners lose because the 51st number wont be found.
Think of it this way - if all the boxes are sorted then everyone finds their own number in their box.
But let's say that everything is sorted except your's and some other prisoner's number. Then it makes sense to follow the chain, because there is only one chain of two nodes.
Basically any number that is in its place is sorted and thus not part of a chain. Any number that is not in its place is not sorted and part of a chain.
There can be many different chains, but only one chain that leads to your number. By starting from your own box, you guarantee that you follow your own chain thus reducing the search space.
> If just one prisoner does not find his number, all prisoners die.
You're trying to find a strategy that is successful for all prisoners. Following the chain reduces the problem to "is there a cycle of length >50", which can be calculated easier.
Opening boxes randomly (or arbitrarily, as in "50 boxes starting with my number), by all prisoners, is 50%100 which as mentioned is vanishingly small. The trick is that chain-following actually gives information to each prisoner as they go, which helps leverage into a better win chance.
Opening 50 boxes at random gives you a 50% chance to hit the right box. But everybody else also has a 50% chance that’s independent from yours, so you just multiply the probabilities together to figure out the overall probability. This makes it vanishingly unlikely to get it right with a lot of people.
This strategy instead has you all collectively betting that the structure of the permutation is favourable, which doesn’t have that multiplicative problem.
This strategy weeds out chains that don't contain your number.
If there are 3 chains, one contains your number, and two are loops that don't contain your number, this strategy necessarily starts you in the chain that contains your number.
Open box 7, find number 2
Open box 2, find number 2700
Open box 2700, find number -16
Open box -16, find number 0.9
Open box 0.9, find number ℵ
Open box ℵ, find number 赢
Open box 赢, find number two
...
As you can see, it's not necessary to use numeric digits to identify the boxes; any identifier at all will do.
But this is a cycle; if we had started by opening box 0.9, it would look like this:
Open box 0.9, find number ℵ
Open box ℵ, find number 赢
Open box 赢, find number two
...
Open box 7, find number 2
Open box 2, find number 2700
Open box 2700, find number -16
Open box -16, find number 0.9
Exercise: given that this cycle starts with "open box 7", how does it end? (Or in other words, what is the second half of the line before "Open box 7"?) If this cycle instead started with "Open box ᚠ", how would it end in that case?
If you consider that each box only has one box that "points" to it, and that last box must necessarily be part of the box's cycle (by definition, there's no way to avoid it as if you hypothetically swapped the last box to point somewhere else then all you're doing is linking the cycles and you'll end up in the same place but later)
For me it helped to think of it as a collection of linked lists. At most, there can be 100 lists of length 1, where each number points to itself. On the other extreme, there are many ways to create a list of length 100.
The strategy takes advantage of the fact that randomly permuting the numbers is less likely to create one or two long lists, and more likely to create a bunch of shorter lists.
Since each prisoner knows which list he belongs to, he can use that information to reduce his search space from all of the boxes to just the boxes in his list.
> What did you think was missing from the Wikipedia article?
Oddly enough as stated the problem fails. It essentially assumes an evenly distributed random placement of numbers. However if the warden simply adds the same random number to every single box it ends up as a very long cycle and they all fail.
Worse, they have to communicate which order to count from. Is box 1 the top left or bottom left box?
PS: The problem shows up in many such puzzles because 'the problem' was based on the math rather than the math being chosen to solve the problem.
Depends on context, I randomly tossed it does not imply all locations it might end up are equally likely. As it rolls after impact it's more likely to end up in a dip than on top of a hill etc. At best the direction is random but not necessarily the angle with respect to the ground.
Basically, even distributions are very rare in the real world.
Even if the warden's assignment of numbers to boxes is not random (perhaps he knows the suggested winning strategy and wants to foil it), the players can make it random by randomly permuting the numbers on the outside of the boxes. This guarantees their 31% success rate even against an adversarial warden.
Players can't communicate. So, this comes down to what that means, but your suggestion is reasonable and might depending on how the problem was carried out apply.
aka, they can't all individually come up with the same random numbers without communicating. But, they could all independently come up with the same strategy if the all saw the same numbers on the outside of the boxes.
They can communicate beforehand to come up with this scheme - it's clear they just can't communicate after looking at the boxes.
The generic version of the scheme is, after looking in box N, look in box f(N) where f is any bijective function from [1, 100], and all the prisoners agree on the same choice of f. The version described on the wikipedia page corresponds to f(N) := N, but any of the other 100! such functions would do so long as they agree on the same one.
You are treating this as a math problem not a puzzle. Is there anything written in the description that says they can all agree to an ordering that they can then apply their function to. Aka if they say box 1 maps to box 37 they need to agree what box 1 is and what box 2 is ...
As I have pointed out in other comments discussing strategy without seeing the room makes finding a strict ordering difficult.
Picture them entering the room and the boxes are on a round filing cabinet. There is no obvious #1 to pick so some of the 100 would pick different #1's. Again, as a math problem it's reasonable to assume they can agree on an order, but it's not stated.
They can communicate, before they start. That's how they share the strategy with each other.
> Before the first prisoner enters the room, the prisoners may discuss strategy—but may not communicate once the first prisoner enters to look in the drawers
So they just write down a random permutation and give a copy to each prisoner while discussing strategy.
They need to communicate before seeing the boxes. You assume the can decide on a strict ordering before looking at the boxes. If they say box one is the highest box closest to the door their might be two boxes that satisfy that criteria. As a math problem it's assumed that such problems don't occur, as a riddle things are not so clear.
On the other hand, if they could communicate after getting a reasonable definition of the room and know they would not be moved then sure strict ordering is easy. But, nothing says they are given that.
Further, they need to pick random numbers without the warden deducing the scheme, but let's assume the can use a public key crypto to get around the need for privacy.
There are many strict orderings that can be used, but we just need one.
Since each number is in a physical draw in a physical room, and no two draws can occupy the exact same space, there is a strict ordering of the drawers given by each drawers distance from a fixed point in the room (say a specific corner agreed on beforehand) in each of the three orthogonal dimensions sideways (x), backwards (y), and up (z).
We say a drawer A is before a drawer B if A.x < B.x; or, if A.x = B.x then if A.y < B.y; or, if A.x = B.x and A.y = B.y then if A.z < B.z
If A.x = B.x and A.y = B.y and A.z = B.z then A = B.
This is a strict ordering that will number the drawers from 1-100, and can be determined beforehand.
Now the prisoners take that order, and shuffle it randomly, without telling the warden what they are doing.
As math that works fine, in practice it's a world of problems. A.x = B.x is a question of finite precision close enough and all prisoners will not agree if A.x = B.X or A > B or A < B. Adding precision tools does not help as you get into things like thermal expansion and the weight of prisoners deforming the floor etc.
So, while that works as a math problem, it may fail as an actual solution.
Before the first prisoner enters the room, the prisoners may discuss strategy—but may not communicate once the first prisoner enters to look in the drawers.
So they certainly could all use a pre-agreed random permutation to compensate for any insufficient or evil [non-]randomness introduced by the director.
I was not clear in my wording. They need to enter the room and know box one is this and thus I should treat that as box 22. But again without seeing the boxes they need a system to unambiguously order the boxes. That's not strictly a math problem as they are not say given exact x,y,z locations or precise measuring tools etc.
Or what if it is dark inside the room and they can not read the numbers in the drawers? Or what if there is no gravity in the room and they can not determine which drawers are at the top? Or what if the director used a script or number system none of the prisoners can read or understand? Or what if the numbers are only 100 nm tall inscribed with electron-beam lithography on small pieces of metal? Or what if there is no oxygen in the room? Or what if the room is 10,000 km long, wide, and high?
One operation to shuffle a deck of cards is to cut it creating a random offset.
So, I have seen someone just move the top few cards when asked to shuffle the deck. This makes a small offset more likely than you might assume. Remember the odds should be evenly spread across 52! which means in all of human history each deck should be unique, but in practice people have played with a default deck. (http://www.philly.com/philly/news/new_jersey/20120904_A_C__s...)
Funny thing: if that added number isn't relatively prime to 100 - and 60% of random numbers would share a factor - then they've just made a lot of smaller cycles.
And it's easy to break if you can swap two. Say the warden adds one. If you swap 50 and 100, you have two cycles of 50.
Again assuming the uniform random distribution. Select from (1, 3, 6, 7, 9, 11) is just as random as selecting from 1-100.
I would argue that not specifying the boxes have a specific order is a much larger problem, but saying 'random' does not nessiarly mean what you might think. As to moving the cards, they are specifically excluded from communication.
There is not, actually! You can show this strategy does as well as the case where prisoners have full information of each other's moves.[0] So any other strategy where each prisoner is blind to others' actions can always be executed in the full-information case above; so this bound is tight and this strategy is optimal: if there was a strategy that did better in the blind case, you could execute it in the full-information case to get a better outcome, but this is impossible.
For a nice exposition of this, see Curtin and Warshauer's article "The Locker Puzzle."
---
[0] More specifically, a game where all lockers are left open, so every strategy has the same probability of winning.
To be precise, the strategy is shown to be optimal as well in a modified game where all lockers are left open and you cannot open any more lockers if you find your own number.
Right, so picking a box at random and following the chain would be bad because you could get back to where you started having not found your number. Starting with your own numbered box is your defense against this because to go back to where you started you would have to have found your number.
I was thinking, the solution presented seemed not only invalid but ill-defined, as you could hit a cycle and still have picks left. With this, it totally makes sense why it works: if there are disjoint cycles, this strategy guarantees that each prisoner is on its own cycle. And I guess the math works out that in unidirected graphs of 100 nodes, 30% of them contain only disjoint cycles of 50 or fewer nodes.
Because of the way that the problem is set up (each of the 100 numbers is in one of the 100 lockers), it can always be written as a list of disjoint cycles (mathematically: every permutation can be written as a product of disjoint cycles).
So, by starting in your own locker, you will eventually end up finding your own number. The only constraint is that everyone has to do it in 50 tries or less, i.e. there's no cycle of length >50.
Not that it's fair to ask you this, but how would we model the probability of picking chains at random? I would assume it's much better than the (1/2)^(100) approach of picking random each time.
I would assume it's much worse than the (1/2)^(100) approach of picking random each time.
If you pick randomly with each of your 50 picks then you've got a 50% chance of winning. If you pick a random cycle and follow it, then you win iff the cycle you pick is your cycle and your number occurs within the first 50 of that cycle. So if the cycle that includes you is small, the first thing is less than 50%, and if the cycle that includes you is large, then the last thing is .. hmm well worth numbering:
Say there's a 99 cycle and a 1 cycle. So 0.99 you're in the latter. And assuming that, 50/99 you pick a box that leads you to yourself in 50 picks. And 1/100 you're in the first, and assuming that 1/100 you pick it. So, 1/100 x 1/100 + 0.99 x 50/99 = .0001 + .5 = 0.5001. Hmm, I expected it to be under .5. So .. help?
I think my math for the above case is wrong, but I've got seven empty beer cans in front of me so. But even if it's right, you can somewhat rationalize. If there are two cycles each of len 50 then it's 50% chance. And if there are two cycles where one is 99 and the other is 1, then you've got very marginally better than 50% chance. And if it's one cycle of 100 you've got 50%. But if it's more than two cycles, say three of 33 or 34, then you can quickly see that you've only got a 33% chance of finding yourself. And plenty of outcomes like that exist. So it seems like 1 cycle (or apparently 2 with 99:1) is an upper bound. And the remainder only serve to lower the odds. And given they're plenty likely (I guess), your odds are even worse than randomly guessing with each pick.
Yeah I think I forgot to account for not ending up on your cycle in the 99 case (i.e. you're in the 99 but you chose the 1), so it should be .4901 or .4951 or something even in the 99:1 case.
There's no 'start' of a cycle, they are circular. I.e. the cycle 1->2->3->1 is the same as 3->1->2->3). I.e. picking a locker at random is equivalent to picking a chain of length N with probability N/100.
In fact, the prisoners aren't given any information about the cycles and can't communicate, so they can't choose a cycle "at random".
Not sure if your question can be formulated as follows (which requires giving more information to the prisoners): Suppose that we color every group of lockers that form a cycle in a different color, and every prisoner then picks a random color and then picks a random locker among the lockers of that color. What is the chance of winning?
"Before the first prisoner enters the room, the prisoners may discuss strategy—but may not communicate once the first prisoner enters to look in the drawers."
And, in the example given, "That prisoners 5 to 8 will also find their numbers can also be derived from the information gained by the first three prisoners."
There's never any explanation of what happens when a prisoner opens a drawer and finds it empty. In the first example given, what happens when prisoner 5 goes in and there's nothing in drawer 5? The algorithm doesn't seem to account for this.
It seems like there's some sort of assumption that the later prisoners are gathering information from the earlier prisoners, but the problem set-up seems to preclude this? They're going into a room so they can't watch each other, the drawers are closed afterwards so they can't derive information from which drawers are open/closed, and they're not allowed to communicate.
1) You don't take your number when you find it. I guess my brain assumed "find" meant "find and take with you to prove you found it." Perhaps a better term might be "encounter."
2) Here's a much better explanation (thanks micaeked): http://datagenetics.com/blog/december42014/index.html. Essentially, the algorithm works because following numbers to drawers will eventually create a loop (e.g., drawer 2 points to 4 points to 6 points to 2). By starting with the drawer with your number, you're guaranteed to be in a loop with your number. The only question is whether your loop is less than 50 drawers long...
Furthermore, this confirms that there IS no transfer of information from previous prisoners in any way.
The only question is whether your loop is less than 50 drawers long...
So then the question is: what proportion of the permutations of 100 numbers contain a cycle greater than 50 vertices long? Is it 30%? The claim made in the Wikipedia article is that the prisoners have around 30% chance of surviving.
Edit: looks like that is the case. You can even take the limit of the number of boxes (hence prisoners) to infinity and their probability of survival never drops below 30%! This is an amazing result.
* Thus you will never be in some other loop. Loops are disjoint. Disjoint circles, not train tracks that interconnect. Just loops. Think about that and realize it.
* This strategy guarantees you'll be in your loop. ("Unfortunately" you'll be at the end of that loop: the box that points to your number. But still, in the loop, which is the point).
* Thus if there are multiple loops, all of length 50 or less, you'll get your number before your number is up.
* And math shows that if there are 100 nodes in a randomly unidirected graph, 30% of them contain no loops greater than length 50.
So if everyone follows this strategy, 30% chance they'll win.
The entire set-up (the numbers in the boxes) is a permutation. In other words, it’s a function s that maps bijectively from [0, n-1] to [0, n-1], or to put it more simply: for each i in [0, n-1] there is exactly one value j such that s(j) = i.
If you take i_0, then i_1 = s(i_0), then i_2 = s(s(i_0))... and so forth, at some point you will encounter a value you have already seen (because you can only visit n different values at most), and from that point onward you will loop. The trick is that the first value you will see twice MUST be i_0 (your starting point). If not, that is if s(i_k) = i_m where m > 0 (and i_k is the last value before you loop back) then s(i_k) = s(i_{m-1}). This means that i_k = i_{m-1} which is in contradiction with the fact that i_k is the last value before looping (i_k was already visited at step m-1)
Edit to answer TFQ: if your start with your own number as i_0, the proof above shows that at some point you will loop back to it. Then it’s a question of whether you loop back in 50 moves or less.
Every box points to a box and is pointed to by a box.
If at first you try box 3 and it doesn't contain your number 3, then your number 3 is out there and its box points to the one you tried. Thus its box is in your chain and has your number.
The statement "That prisoners 5 to 8 will also find their numbers can also be derived from the information gained by the first three prisoners" is poorly worded. I think the author intended it to mean the information gained to you, an outside observer trying to prove this statement. There is no transfer of information from one prisoner to the next.
Also, when the prisoners find their number they don't remove it. No drawers are ever empty. They just have to open a drawer that contains their number.
I think you're right that the statement doesn't make sense- or else we're both not understanding it.
From what I can see, no prisoner needs any information beyond knowing their own number. Once they know that, they're just searching the cycle that their number is in to see if it's less than 50-long.
My favorite version of this is the one that shows one prisoner to make a single swap of two drawers' contents in order to improve their chances. Surprisingly, this allows for a 100% success rate.
Yeah this is the cool part. If you have a secret agent, he only has to make one change to guarantee success. (Success may already be guaranteed, but by the time the secret agent leaves the room, all prisoners are guaranteed to be safe).
Doe the "hacker" get to see all the contents? Or only 50 (half?)
If the hacker needs to break the longest permutation in half (consider the simple permutation x->(x+1), they could ... swap box 0 with box 50, so box 0 points to 51 (->52 .. -> 99 -> 0) and box 50 points to box 1. That makes 2 cycles of size 50, but requires inspecting 50+1 boxes
It's a different riddle, not a different solution to the riddle.
It's asked differently: given the same setup, except you have one extra person who's the leader, and he's allowed to go in and look at all the boxes, then make one change.
What does the prisoner do if she gets to, for example, box 40 and there is a 40 in the box?
[edit] I reread the explanation and get it now. Only 1 prisoner (40) will actually get to open box 40. I was trying to think of this from the point of view of implementing it in code. I guess the cycle stops when a prisoner finds their own number or reaches 50 attempts.
I still doubt the Monty Hall problem (mentioned in the wiki-article). I understand the reasoning, summing the probability tree, but there are two ways to build the probability tree, with the step of reveal or without. Since the player doesn't know that step, why would it be part of the tree? Because we know the setup a priori, in a way.
I'm assuming the bigger tree is still incomplete. More levels can be inserted, reflecting the reasoning of the a priori knowledge. I assume the supposed advantage would thus prove to be imaginary. Or in simpler terms:
Normally the tree would have the sequence in the following order: box distribution, choose a box, reveal one empty box, switch choice or don't.
However, if you are predetermined to switch, then the order would be changed to switch before the reveal. Thereby, the reveal is irrelevant to the result of switching and the probabilities are equal again.
You make your initial guess. What's the probability you're right on your initial guess? 1/3. What's the probability you're wrong? 2/3.
So now, you're presented with your second decision. Do you stick or switch? Well, if you stick, that's going to lead to success if and only if you were right on your first guess. And if you switch, that's going to lead to success if and only if you were wrong on your first guess.
We all agree that you were more likely to be wrong on your first guess. So you should bet on the fact that you were initially wrong. I.e. switch.
>However, if you are predetermined to switch, then the order would be changed to switch before the reveal. Thereby, the reveal is irrelevant to the result of switching and the probabilities are equal again.
Before the reveal, you don't know which box to switch to. Monty knows which boxes are empty, and he gives you partial information about that by revealing one of them.
I feel similarly about Monty Hall. I literally just made a spreadsheet of all the possible outcomes, and sure enough, switching leads to winning 2/3 of the time. Even though I can see the indisputable data in front of my eyes, my brain just won't fully accept it. Looking at it, I realized an interesting little wrinkle that I think might have to do with the problem's psychological quirkiness in some way I can't fully articulate: When you stay you win 1/3 of the time. When you switch, you win 2/3 of the time. BUT if you randomly pick between staying and switching, you win 50% of the time because (1/2)(1/3) + (1/2)(2/3) = 1/6 + 2/6 = 3/6. There is a correct-ish-ness to the intuitive answer of 50%...it's just not...actually correct.
PS: of course you can't switch before the game maker reveals a box. But isn't that immaterial? In one of three cases, your first choice is correct. Therefore, in two of three cases, the first choice is incorrect ... so switching should be correct in those two.
The best way to become convinced of Monty Hall may be to grab a buddy and some playing cards, and run a simulation. 30 rounds or so should be convincing.
At some intuitive level this problem reminded me of other very different puzzles like [1] and prompted me to wonder if as a general rule the best tactic when one is faced by a riddle that appears impossible, is to try to think of solutions that involve self reference. It also makes me wonder if the brain at a basal unconscious level uses something analogous to the linked list like solution described here to quickly explore permutation space.
If I understand this correctly, this basically uses the drawer's perceived number as an out-of-band signalling mechanism for the algorithm they agreed to follow. Each actor must adhere to the same scheme to address drawers. This means that the drawers must either be uniquely named ("labelled" ahead of time), or have some sort of natural ordering that's algorithmically deterministic (e.g. lexicographic, array index, 2D-position using a previously agreed algorithm like right-then-down).
What are some practical uses for this? Distributed hash table traversal using one's own address [1]?
This "look at it as a permutation problem and follow the disjoint cycles" idea is loosely what rainbow tables are based on (although I think in that case because of the possibility of hash collisions they aren't guaranteed to be disjoint cycles, which requires extra work to solve).
Many mathematical-minded people look at this problem and conclude that the maximum probability of freedom is close to (1/2)^50.
But that's absurdly wrong. They're out by a factor of 10^14 !!! So, are there any important problems in computer science / software engineering contexts that have this form? If so, they may have speed-ups of several orders of magnitude that mathematically and algorithmically-minded people might completely miss?
One key thing about the result is that it doesn't mean that the probability of survival is 30%. The prison director can choose an arrangement such that the strategy is guaranteed to fail. If the prison director is intelligent and after your life, you are infinitely more likely to survive by opening boxes randomly.
The third variant of the problem (where one prisoner is allowed to make one change, which can guarantee that every prisoner survives) is the one I heard first; I like that problem because it seems even more counter-intuitive, but also makes it easier to understand why it works.
That's just a game of chance. The point of a skill game, even one that involves randomness, is that information you get can influence your actions. If you cannot look inside the drawers, you do not gain any information about the particular setup and thus cannot execute anything that would resemble a strategy.
They can still do slightly better than chance by ensuring their guesses have the right amount of overlaps. For example, if there were only two prisoners, the right strategy would be for one to pick box 1 and the other one to pick box 2, improving their odds of survival from 25% to 50%.
I suspect the chances are still negligible for 100 prisoners though.
> I suspect the chances are still negligible for 100 prisoners though.
I think from a probabilistic standpoint, they're equivalent as n increases. Think of it as the inverse; what's the probability each box is checked by its owner?
Pb = Nb / N, or number of people that checked that box divided by number of people total.
If there is perfect, premeditated overlap, then Nb / N is exactly 50/100, so we get the "(1/2) ^ 100" that was already cited in the article.
If prisoners choose randomly though, the expected value of Nb is still 50 though. You've maximized Nb in practice, but not really in theory.
> If prisoners choose randomly though, the expected value of Nb is still 50 though.
The expected value of Nb would still be 50 if all the prisoners checked the same subset of boxes, but the chance of the prisoners surviving would clearly be 0.
I think it's correct that if the prisoners don't get the see the contents of the boxes, then any arrangement where each box is opened by 50 prisoners is equally good and optimal - but I can't quite see a proof.
> The expected value of Nb would still be 50 if all the prisoners checked the same subset of boxes, but the chance of the prisoners surviving would clearly be 0.
Yes I should have expanded. The probability of escape is
∏ (i = 1..N) Ni / N
So if some Ni is 0, the whole thing is 0. My point was that if you're choosing randomly, then as N goes to infinity, "Ni / N" already converges to 0.5 . So your strategy is ensuring optimality, but you'd get there if you took it to the limit anyway.
As a quick test I did 1000 trials. The minimum overlap in those 1000 trials was 26. 95% of the time it was above 35. For N = 1000, 95% of the time it was above 440.
If they did, you essentially go back to random choice because of all the broken chains created. "Find" is meant to imply literally "discovering the location of data that correlates to their prisoner number".
Real world context implied that prisoners would also take the tickets, but that is not specifically stated.
Say you have 10 players in a row. You number each player by its position in the
row: 1, 2, 3, .., 10. You then write the players' names on 10 pieces of paper,
and you randomly place the papers in 10 boxes. You place exactly one paper in
each of the boxes. The boxes are also numbered from 1 to 10.
You allow each player to look into half of the boxes. Each player can choose
which boxes to look into. The player "wins" if they find a paper with their
number written on it. But they can't tell anyone if they won or lost, what they
saw in the boxes, or which boxes they picked.
The problem then, is the following: Which boxes should each player look into,
such that the chance of all of them "winning" at the same time is as high as
possible.
For example, one way to put the 10 numbers in the 10 boxes is:
Box 1: 10
Box 2: 9
Box 3: 8
Box 4: 4
Box 5: 6
Box 6: 5
Box 7: 7
Box 8: 3
Box 9: 2
Box 10: 1
The ordering of the numbers in the boxes is a permutation of the 10 numbers.
We know that there are 10! possible ways to place the numbers in the boxes.
Now, consider the following strategy that player X can use to open his 5 boxes.
He first opens the box numbered X. If he sees his number in it, then he wins.
Otherwise, he saw a different number Y. He then opens the box numbered Y. If he
finds his number, he wins. Otherwise, he continues until he has opened the 5
boxes he is allowed to open.
If the player was allowed to open as many boxes as he likes, they he would
always win. That is because he would either find his number in a box, or
continue opening boxes he hasn't opened before. Why is this true ? Well, let's
say it isn't, and that he opens boxes 3 -> 2 -> 5 -> 7 -> 5. But that cannot
happen: The first time he opened box 5 was because he saw the number 5 in box 2.
And if he had to see box 5 again, this means that box 7 also had the number 5 in
it. In fact, the only way player X could go back to a box he has seen, is when
he finds his number (X) that will lead him back to the first box he opened.
Now, we know that a player "wins" if he finds his number before opening more
than 5 boxes. Or, put it differently, if the loop he follows has length at most
5. In the example above, player 2 would open box 2, which would lead him to box
9, which contains number 2, forming a loop of length 2.
So, when do all players "win" at the same time ? When there does not exist a
loop that has length 6 or more. Or more specifically, there is no loop of length
6, or 7, or 8, or 9, or 10.
So: P_win = 1 - P_loop(6) - P_loop(7) - P_loop(8) - P_loop(9) - P_loop(10).
We can show [1] that P_loop(k) = 1/k, when k > 5. We need the loop to be larger
than 5 because in this case, there can only be one such loop on the permutation.
But then P_win would be 1 - 1/6 - 1/7 - 1/8 - 1/9 - 1/10 or ~35.4%. In general,
if we have N players and N boxes then P_win would be:
P_win = 1 - Sum[k=n/2+1 .. n]P_loop(k)
or
P_win = 1 - ( Sum[k=1 .. N]P_loop(k) - Sum[k=1 .. N/2]P_loop(k) )
or
P_win ~= 1 - ( ln(N) - ln(N/2) ) > ~30%
(each sum is a harmonic series sum.)
[1] Why is P_loop(k) = 1/k when k > N/2 though ?
Let's say you have numbers 1, 2, .., k. All permutations of them are k!.
How many of those k! permutations form a loop ? Well, at the first position,
you can place any number other than '1', so you have (k-1) choices. Say you
placed number Z at position 1. Then at position Z, you can't place '1' and you
can't place 'Z', so you are left with (k-2) choices. In the last position, you
have to place number '1' to close the loop, so you have no choice. So, you have
(k-1)(k-2)(k-3)...1 ways to form a loop, or (k-1)!.
Therefore, with a little bit of squinting, we can see that the probability that
a random permutation has a loop on it of length exactly k is (k-1)!/k!, or 1/k.
We can formalize this argument as follows: There are (N choose k) ways to pick the
k places on the permutation where there exists a cycle. There are (k-1)! valid
ways to form a cycle on these k places. There are (N-k)! ways to arrange the
remaining elements on the permutation. There are N! permutations, Therefore,
P_loop(k, N) =
(N choose k)(N-k)!(k-1)! / N! = N! / k! / (N-k)! * (N-k)! * (k-1)! / N! = 1/k