Some initial thoughts...

Your question is related to--but not exactly the same as--a flavor of combinatorics questions called "stars and bars." The reason it's not the same is that you want the probability of a specific set of outcomes, whereas stars and bars is simply interested in counting the number of such outcomes.

Let's pretend like you cared about counting instead of probability for a second. Essentially, you have n balls and x bins, and your question is "how many ways can I assign the n balls to x bins such that there are at least k balls in at least one bin."

Ignoring the restriction, it's a straightforward calculation, where each unique outcome can be depicted by a unique sequence of stars and bars where stars represent balls and bars separate the bins. You need x-1 bars to separate x bins. For example, let's say n = 3 and x = 4, the outcome:

*||**|

represents the outcome where there is 1 ball in bin 1, 0 balls in bin 2, 2 balls in bin 3 and 0 balls in bin 4. The total number of outcomes would be the total number of ways to choose n spaces to be stars among n + x - 1 spaces.$${n+x-1 \choose n}$$

With the restriction that at least 1 bin needs at least k balls, well, things get really hard really fast. A sensible approach may be to try to count the number of ways in which the condition is *not* met and subtract that from the total number of ways where we disregard the condition. The number of ways where the condition is not met is equal to the number of ways to put n balls in x bins such that no bin has more than k-1 balls.

The top answer in this stack exchange post gives a nice answer as to how to calculate this number:

https://math.stackexchange.com/questions/553960/extended-stars-and-bars-problemwhere-the-upper-limit-of-the-variable-is-bounded. Note that they use n where you use x, they use r where you use k, and they use N where you use n.

The issue at hand now is that simply counting the number of ways something can happen is not enough to answer your question. This is because each 'outcome' that we're counting is not equally likely. We would also need to calculate the probability of each case. Which, for each individual case is not hard. Lets use your Yahtzee example:

For "at least one three-of-a-kind," one example of a satisfactory outcome would be three 1's and two 5's. The probability of getting exactly three 1's and two 5's would be $${5 \choose 3}{2 \choose 2}\frac{1}{6^{5}}$$

Then, I was thinking about this problem in bed and realized that what we're looking for is the distribution of the *maximum* of a multinomial distribution. Let's say X is a multinomial random variable with 6 equally likely outcomes and 5 independent trials. X is essentially a vector that counts how many of each outcome was observed. Then let Y = max(X). In your Yahtzee example, you'd be looking for: $$P(Y \geq 3)$$I did some googling to see if some research has been done in the distribution of the maximum of a multinomial distribution.

https://www.jstor.org/stable/2347220 gives an algorithmic answer, but honestly, I wouldn't know how to translate the 1970's pseudocode into a modern programming language. But I'm sure there are people out there who could. Interestingly, though, this paper also cites several approximations from previous work, which may be interesting to look into.

https://www.stat.purdue.edu/%7Edasgupta/mult.pdf provides a useful approximation if you want to say something like "95% of the time, the max is at most x." Which, again, is related to what you want but not exactly what you want.

https://royalsocietypublishing.org/doi/10.1098/rsos.190198 provides a nice history of various approximate solutions over time, as well as an approach to an exact answer. Section 4.1 gives a (quite complicated) procedure to arrive at the exact answer you're looking for.

https://repository.tudelft.nl/islandora/object/uuid:aa9c02f9-71e9-4ba4-840d-596aa0e2341a/datastream/OBJ/download is a very thorough deep-dive into methods and applications, several of which we’ve already seen.

Hopefully I’ve provided enough resources and search-terms for you to get started. Clearly this is an absolutely massive topic that, quite literally, masters theses have been written on. Many clever people have been able to come up with clever parametric approximations, as well as some algorithmic exact solutions. But no one has been able to come up with a neat closed-form solution that you can simply plug your numbers into.

I'm sorry, I should have asked this before reserving: would a simulation be an acceptable answer? After doing a little research, I actually don't think a closed form solution exists for this problem. The reason your binomial formula doesn't work is because it's not a binomial problem. It's a multinomial problem.

It may be possible to program a function in R based in the dmultinom function that can give you a computed solution that doesn't rely on simulation

Simulation won't work for my purposes. I need to be able to articulate with some mathematical formality what the operating characteristics are of a system I've developed. I can see why you would say it is multinomial, but I believe in this case we should be able to model it as a binomial problem by considering each specific outcome value as a separate case and then summing them up. But then I haven't figured out how to do that correctly so I'm probably wrong.

Is there some way for you to cancel this question without penalty for either of us? I'm almost positive a solution like one you're imagining simply doesn't exist. It's quite messy and involves a whole lot of inclusion-exclusion.

I'll take a look.

I'm new to this site but I don't see any way for me to 'cancel the question'. If there's no other way for you to rescind your acceptance to answer the question, then it appears you may need to submit a response, at which point I can 'dispute' it, and some judge will adjudicate and cancel the money. Maybe a better shortcut is for you to e-mail [email protected] because why waste a judge's time. Thanks for trying.

Alternatively, I will accept an answer that explains why the solution is intractable and requires simulation (I can do the simulation myself, don't worry about that).

That's unfortunate. I think "there is no answer" will need to be an acceptable answer, unless you want to dispute it. In my answer below, I will do my best to give reasons as to why there is no answer, as well as relevant links to related ideas that could point you in the right direction if you wanted to do more digging.