Drawing a random number with chance of redrawing a second time. Best strategy that will never lose long term?
Q: We're going to play a game. You draw a random number uniformly between 0 and 1. If you like it, you can keep it. If you don't, you can have a do-over and re-draw, but then you have to keep that final result.
I do the same. You do not know whether I've re-drawn and I do not know whether you've re-drawn. Decisions are made independently. We compare our numbers and the highest one wins $1.
What strategy do you use?
I know the answer is 0.618..., tested this multiple times (0.5 will give the highest average value, but it won't guarantee you don't lose)
How do I come to this answer?
I was thinking I call the chosen threshold for A : "p", threshold for B: "q".
There is a (1-p)*(1-q) that they both take their first # : chance A wins = p-q+0.5
There is a (1-p)*q chance A takes his first #, B redraws a random # : chance A wins = 0.5p + 0.5
There is a p*(1-q) chance B takes his first #, A redraws a random # : chance A wins = 0.5-0.5q
There is a p*q chance they both redraw a #: chance A wins: 0.5
Total chance A wins:
(1-p)(1-q)(p-q+0.5) + (1-p)*q*(0.5p+0.5) + p*(1-q)*(0.5-0.5q)+0.5pq
I rewrote this and maximized this and after maximizing this, set p=q (equilibrium) and come out a different result. I ran tested above formula with some python programming and the formula is wrong. What is wrong with my reasoning here and how would this be done correctly? I've been trying to figure this out for a day. Thank you!
*individual probabilities for different possibilities seem to be correct based on testing in Python with 10 million runs each
Answer
- The questioner was satisfied with and accepted the answer, or
- The answer was evaluated as being 100% correct by the judge.
Low bounty!
I'm asking what's wrong with my reasoning. Someone might see what I did wrong and could write it down in 1 sentence.
To see what you did wrong, one needs to first understand what you wrote here. The offered bounty is even low for just reading your long question and details.
I second that. Very low bounty!
"0.618" is not an answer to the question "What strategy do you use?" ... Just saying ...
e.g. in the 1st case, (1-p)(1-q), what do you mean by "chance that A wins: p-q+0.5 ? This is not a "chance", it can be larger than 1 and also negative. You have to take into account the possible values that A has drawn (in that case, randomly distributed in [p, 1]) and the same for B (randomly distributed in [q, 1]). Similarly for the other cases.
FWIW, probabilities that A wins in the 1st case [that happens with prob.(1-p)(1-q)], depending on p, q:
p\q 0.2 0.4 0.6 0.8
0.2 [1/2 3/8 1/4 1/8]
0.4 [5/8 1/2 1/3 1/6]
0.6 [3/4 2/3 1/2 1/4]
0.8 [7/8 5/6 3/4 1/2]
(replace
with line break if this gets scrambled)