This paper considers generalized birthday problems, in which there are d classes of possible outcomes. A fraction f i of the N possible outcomes has probability ai/N, where si d=1 fi = si d=1 fiai=1. Sampling k times (with replacements), the objective is to determine (or approximate) the probability that all outcomes are different, the so-called uniqueness probability (or: no-coincidence probability). Although it is trivial to explicitly characterize this probability for the case d=1, the situation with multiple classes is substantially harder to analyze. Parameterizing k= aN, it turns out that the uniqueness probability decays essentially exponentially in N, where the associated decay rate ¿ follows from a variational problem. Only for small d this can be solved in closed form. Assuming a i is of the form 1+f ie, the decay rate ¿ can be written as a power series in É we demonstrate how to compute the corresponding coefficients explicitly. Also, a logarithmically efficient simulation procedure is proposed. The paper concludes with a series of numerical experiments, showing that (i) the proposed simulation approach is fast and accurate, (ii) assuming all outcomes equally likely would lead to estimates for the uniqueness probability that can be orders of magnitude off, and (iii) the power-series based approximations work remarkably well.
|Journal||Probability in the Engineering and Informational Sciences|
|Publication status||Published - 2014|