Does Python Have A Function Which Computes Multinomial Coefficients?

January 31, 2023 Post a Comment

I was looking for a Python library function which computes multinomial coefficients. I could not find any such function in any of the standard libraries. For binomial coefficients

Solution 1:

To partially answer my own question, here is my simple and fairly efficient implementation of the multinomial function:

def multinomial(lst):
    res, i = 1, 1
    for a in lst:
        for j in range(1,a+1):
            res *= i
            res //= j
            i += 1
    return res

It seems from the comments so far that no efficient implementation of the function exists in any of the standard libraries.

Update (January 2020). As Don Hatch has pointed out in the comments, this can be further improved by looking for the largest argument (especially for the case that it dominates all others):

def multinomial(lst):
    res, i = 1, sum(lst)
    i0 = lst.index(max(lst))
    for a in lst[:i0] + lst[i0+1:]:
        for j in range(1,a+1):
            res *= i
            res //= j
            i -= 1
    return res

Solution 2:

No, there is not a built-in multinomial library or function in Python.

Anyway this time math could help you. In fact a simple method for calculating the multinomial

keeping an eye on the performance is to rewrite it by using the characterization of the multinomial coefficient as a product of binomial coefficients:

where of course

Thanks to scipy.special.binom and the magic of recursion you can solve the problem like this:

from scipy.special import binom

def multinomial(params):
    if len(params) == 1:
        return 1
    return binom(sum(params), params[-1]) * multinomial(params[:-1])

where params = [n1, n2, ..., nk].

Note: Splitting the multinomial as a product of binomial is also good to prevent overflow in general.

Solution 3:

You wrote "sympy.ntheory.multinomial.multinomial_coefficients returns a dictionary related to multinomial coefficients", but it is not clear from that comment if you know how to extract the specific coefficients from that dictionary. Using the notation from the wikipedia link, the SymPy function gives you all the multinomial coefficients for the given m and n. If you only want a specific coefficient, just pull it out of the dictionary:

In [39]: from sympy import ntheory

In [40]: def sympy_multinomial(params):
    ...:     m = len(params)
    ...:     n = sum(params)
    ...:     return ntheory.multinomial_coefficients(m, n)[tuple(params)]
    ...: 

In [41]: sympy_multinomial([1, 2, 3])
Out[41]: 60

In [42]: sympy_multinomial([10, 20, 30])
Out[42]: 3553261127084984957001360

Busy Beaver gave an answer written in terms of scipy.special.binom. A potential problem with that implementation is that binom(n, k) returns a floating point value. If the coefficient is large enough, it will not be exact, so it would probably not help you with a Project Euler problem. Instead of binom, you can use scipy.special.comb, with the argument exact=True. This is Busy Beaver's function, modified to use comb:

In [46]: from scipy.special import comb

In [47]: def scipy_multinomial(params):
    ...:     if len(params) == 1:
    ...:         return 1
    ...:     coeff = (comb(sum(params), params[-1], exact=True) *
    ...:              scipy_multinomial(params[:-1]))
    ...:     return coeff
    ...: 

In [48]: scipy_multinomial([1, 2, 3])
Out[48]: 60

In [49]: scipy_multinomial([10, 20, 30])
Out[49]: 3553261127084984957001360

Solution 4:

I believe you can define a function to return multinomial coefficients in a single line using vectorised code (instead of for-loops) as follows:

from scipy.special import factorial

def multinomial_coeff(c): return factorial(c.sum()) / factorial(c).prod()

(Where c is an np.ndarray containing the number of counts for each different object). Usage example:

>>> coeffs = np.array([2, 3, 4])
>>> multinomial_coeff(coeffs)
1260.0

In some cases this might be slower because you will be computing certain factorial expressions multiple times, in other cases this might be faster because I believe that numpy naturally parallelises vectorised code. Also this reduces the required number of lines in your program and is arguably more readable. If someone has the time to run speed tests on these different options then I'd be interested to see the results.

Solution 5:

Your own answer (the accepted one) is quite good, and is especially simple. However, it does have one significant inefficiency: your outer loop for a in lst is executed one more time than is necessary. In the first pass through that loop, the values of i and j are always identical, so the multiplications and divisions do nothing. In your example multinomial([123, 134, 145]), there are 123 unneeded multiplications and divisions, adding time to the code.

I suggest finding the maximum value in the parameters and removing it, so those unneeded operations are not done. That adds complexity to the code but reduces the execution time, especially for short lists of large numbers. My code below executes multcoeff(123, 134, 145) in 111 microseconds, while your code takes 141 microseconds. That is not a large increase, but that could matter. So here is my code. This also takes individual values as parameters rather than a list, so that is another difference from your code.

def multcoeff(*args):
    """Return the multinomial coefficient
    (n1 + n2 + ...)! / n1! / n2! / ..."""
    if not args:  # no parameters
        return 1
    # Find and store the index of the largest parameter so we can skip
    #   it (for efficiency)
    skipndx = args.index(max(args))
    newargs = args[:skipndx] + args[skipndx + 1:]

    result = 1
    num = args[skipndx] + 1  # a factor in the numerator
    for n in newargs:
        for den in range(1, n + 1):  # a factor in the denominator
            result = result * num // den
            num += 1
    return result

Python Playground