Given two discrete random variables X and Y, with probability distributions p and q, respectively, denote by C(p, q) the set of all couplings of p and q, that is, the set of all bivariate probability distributions that have p and q as marginals. In this paper, we study the problem of finding the joint probability distribution in C(p, q) of minimum entropy (equivalently, the joint probability distribution that maximizes the mutual information between X and Y), and we discuss several situations where the need for this kind of optimization naturally arises. Since the optimization problem is known to be NP-hard, we give an efficient algorithm to find a joint probability distribution in C(p, q) with entropy exceeding the minimum possible by at most 1 bit, thus providing an approximation algorithm with an additive approximation factor of 1. We also discuss some related applications of our findings.
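
To make the objective concrete, the following is a minimal Python sketch, not the algorithm of the paper. It builds a coupling of two marginals p and q with a simple greedy heuristic (repeatedly pair the largest remaining mass of p with the largest remaining mass of q) and reports its entropy in bits. The function names `entropy` and `greedy_coupling`, the tolerance `tol`, and the example distributions are assumptions made for illustration, and no approximation guarantee is claimed for this heuristic. The equivalence stated in the abstract follows from I(X;Y) = H(X) + H(Y) - H(X,Y): the marginal entropies are fixed by p and q, so minimizing the joint entropy maximizes the mutual information.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-mass entries contribute nothing."""
    return -sum(x * math.log2(x) for x in probs if x > 0)


def greedy_coupling(p, q, tol=1e-12):
    """Illustrative greedy heuristic (an assumption, not the paper's method).

    Repeatedly pairs the largest remaining mass of p with the largest
    remaining mass of q. Returns a dict {(i, j): mass} whose row sums
    are p and whose column sums are q.
    """
    rem_p, rem_q = list(p), list(q)
    joint = {}
    while True:
        i = max(range(len(rem_p)), key=rem_p.__getitem__)
        j = max(range(len(rem_q)), key=rem_q.__getitem__)
        mass = min(rem_p[i], rem_q[j])
        if mass <= tol:  # nothing left to assign
            break
        joint[(i, j)] = joint.get((i, j), 0.0) + mass
        rem_p[i] -= mass  # at least one of these two entries becomes 0
        rem_q[j] -= mass
    return joint


# Hypothetical example marginals.
p = [0.5, 0.5]
q = [0.5, 0.25, 0.25]

joint = greedy_coupling(p, q)
h_joint = entropy(joint.values())
h_independent = entropy(a * b for a in p for b in q)  # product coupling
mutual_information = entropy(p) + entropy(q) - h_joint

print(h_joint)             # 1.5
print(h_independent)       # 2.5
print(mutual_information)  # 1.0
```

In this small example the greedy coupling reaches the lower bound max(H(p), H(q)) = 1.5 bits, so it happens to be optimal; the product coupling, which treats X and Y as independent, has entropy 2.5 bits and zero mutual information. On other inputs the heuristic may be strictly worse than the optimum.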