What are the odds of at least k same outcomes in n independent trials, each with x equally likely outcomes?

So for example Yahtzee odds of getting at least 3 of a kind in a single roll of 5 dice: in this simple case, k = 3, n = 5, and x = 6, but I need a general solution as my values for k, n, and x are higher (typical: k = 50, n =1000, x = 256) and will vary a bit.

My attempt at using the binomial distribution failed. I seem to be overcounting but it is not clear how to structure this correctly. I thought it would be:

$x \times \sum_{i = k}^{n} \binom{n}{i} \times \left ( \frac{1}{x} \right )^{i} \times \left ( \frac{x-1}{x} \right )^{n-i}$

But in many cases this exceeds 100% so clearly it is incorrect.

Thank you.

  • I'm sorry, I should have asked this before reserving: would a simulation be an acceptable answer? After doing a little research, I actually don't think a closed form solution exists for this problem. The reason your binomial formula doesn't work is because it's not a binomial problem. It's a multinomial problem.

  • It may be possible to program a function in R based in the dmultinom function that can give you a computed solution that doesn't rely on simulation

  • Simulation won't work for my purposes. I need to be able to articulate with some mathematical formality what the operating characteristics are of a system I've developed. I can see why you would say it is multinomial, but I believe in this case we should be able to model it as a binomial problem by considering each specific outcome value as a separate case and then summing them up. But then I haven't figured out how to do that correctly so I'm probably wrong.

  • Is there some way for you to cancel this question without penalty for either of us? I'm almost positive a solution like one you're imagining simply doesn't exist. It's quite messy and involves a whole lot of inclusion-exclusion.

  • I'll take a look.

  • I'm new to this site but I don't see any way for me to 'cancel the question'. If there's no other way for you to rescind your acceptance to answer the question, then it appears you may need to submit a response, at which point I can 'dispute' it, and some judge will adjudicate and cancel the money. Maybe a better shortcut is for you to e-mail [email protected] because why waste a judge's time. Thanks for trying.

  • Alternatively, I will accept an answer that explains why the solution is intractable and requires simulation (I can do the simulation myself, don't worry about that).

  • That's unfortunate. I think "there is no answer" will need to be an acceptable answer, unless you want to dispute it. In my answer below, I will do my best to give reasons as to why there is no answer, as well as relevant links to related ideas that could point you in the right direction if you wanted to do more digging.

Answer

Answers can be viewed only if
  1. The questioner was satisfied and accepted the answer, or
  2. The answer was disputed, but the judge evaluated it as 100% correct.
View the answer
The answer is accepted.