# Notes on Regularization

In some cases, e.g., if the data is sparse, the iterative algorithms underlying the parameter inference functions might not converge. A pragmatic solution to this problem is to add a little bit of regularization.

Inference functions in choix provide a generic regularization argument: alpha. When $$\alpha = 0$$, regularization is turned off; setting $$\alpha > 0$$ turns it on. In practice, if regularization is needed, we recommend starting with small values (e.g., $$10^{-4}$$) and increasing the value if necessary.
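The "start small and increase" advice can be sketched as a simple retry loop. The helper below is hypothetical (it is not part of choix), and it assumes that a failure to converge is signalled by a `RuntimeError`; adjust the exception type if the inference function you use behaves differently.

```python
def infer_with_regularization(infer, alphas=(0.0, 1e-4, 1e-3, 1e-2, 1e-1)):
    """Call `infer(alpha)` with increasing alpha until one call succeeds.

    `infer` is any callable that takes the regularization strength and
    returns the estimated parameters. Returns a pair (params, alpha).
    """
    last_error = None
    for alpha in alphas:
        try:
            return infer(alpha), alpha
        except RuntimeError as err:  # assumption: non-convergence raises this
            last_error = err
    raise RuntimeError("no value of alpha led to convergence") from last_error


# Possible usage with choix (requires choix to be installed):
#   import choix
#   params, alpha = infer_with_regularization(
#       lambda a: choix.ilsr_pairwise(n_items, data, alpha=a))
```

Trying $$\alpha = 0$$ first keeps the estimate unregularized whenever the data allows it.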

Below, we briefly describe how the regularization parameter is used inside the various parameter inference functions.

## Markov-chain based algorithms

For Markov-chain based algorithms such as Luce Spectral Ranking and Rank Centrality, $$\alpha$$ is used to initialize the transition rates of the Markov chain.

In the special case of pairwise-comparison data, this can be loosely understood as placing an independent Beta prior for each pair of items on the respective comparison outcome probability.
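To see why a nonzero initial rate helps with sparse data, consider the following pure-Python sketch. It is a simplified illustration, not the choix implementation: the helper names are hypothetical, and each comparison adds a unit rate from the loser to the winner, whereas the actual algorithms weight this increment. With $$\alpha = 0$$ the chain built from sparse data can fail to be irreducible, so it has no unique stationary distribution; any $$\alpha > 0$$ connects every pair of states.

```python
def build_rates(n_items, data, alpha=0.0):
    """Transition rates of a comparison Markov chain (illustrative sketch)."""
    # Every off-diagonal rate starts at alpha instead of 0.
    rates = [[alpha if i != j else 0.0 for j in range(n_items)]
             for i in range(n_items)]
    for winner, loser in data:
        # Simplification: each comparison adds a unit rate loser -> winner.
        rates[loser][winner] += 1.0
    return rates


def is_irreducible(rates):
    """True if every state can reach every other state via positive rates."""
    n = len(rates)

    def reachable(start, forward):
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                rate = rates[i][j] if forward else rates[j][i]
                if rate > 0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
        return seen

    # Strongly connected: state 0 reaches all states and all states reach it.
    return len(reachable(0, True)) == n and len(reachable(0, False)) == n


data = [(0, 1), (1, 2)]  # item 0 beat item 1, item 1 beat item 2
print(is_irreducible(build_rates(3, data, alpha=0.0)))   # False
print(is_irreducible(build_rates(3, data, alpha=1e-4)))  # True
```

In this example item 0 never loses, so without regularization no probability mass can ever leave state 0.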

## Minorization-maximization algorithms

In the case of minorization-maximization algorithms, the exponentiated model parameters $$e^{\theta_1}, \ldots, e^{\theta_n}$$ are each endowed with an independent Gamma prior distribution, with scale $$\alpha + 1$$. See Caron & Doucet (2012) for details.
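To give a sense of how such a prior enters the iteration, here is a minimal pure-Python sketch of one MM sweep for pairwise data. It uses the MAP update of Caron & Doucet (2012) for a Gamma prior written with parameters $$a$$ and $$b$$, namely $$\gamma_i \leftarrow (a - 1 + w_i) / (b + \sum_{j \neq i} n_{ij} / (\gamma_i + \gamma_j))$$, where $$\gamma_i = e^{\theta_i}$$, $$w_i$$ is the number of wins of item $$i$$ and $$n_{ij}$$ the number of comparisons between $$i$$ and $$j$$. Setting $$a = \alpha + 1$$ and $$b = \alpha$$ is an assumption made for this sketch, and the function name `mm_step` and the `wins` matrix layout are hypothetical rather than part of choix.

```python
def mm_step(gammas, wins, alpha=0.0):
    """One synchronous MM sweep; wins[i][j] = number of times i beat j."""
    n = len(gammas)
    new_gammas = []
    for i in range(n):
        w_i = sum(wins[i])  # total number of wins of item i
        denom = sum(
            (wins[i][j] + wins[j][i]) / (gammas[i] + gammas[j])
            for j in range(n) if j != i
        )
        # Assumed prior parameters: a - 1 = alpha on top, b = alpha below.
        new_gammas.append((w_i + alpha) / (denom + alpha))
    return new_gammas


wins = [[0, 3], [0, 0]]  # item 0 beat item 1 three times and never lost

# Without regularization, the losing item's parameter collapses to zero,
# i.e. theta_1 = log(0) is not finite.
print(mm_step([1.0, 1.0], wins, alpha=0.0))  # [2.0, 0.0]

# With alpha > 0, every numerator is positive, so all parameters stay positive.
gammas = [1.0, 1.0]
for _ in range(50):
    gammas = mm_step(gammas, wins, alpha=0.1)
print(gammas)
```

The regularized update acts like adding $$\alpha$$ pseudo-observations to each item, which keeps the iterates away from the boundary.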

## Other algorithms

The scipy-based optimization functions use an $$\ell_2$$-regularizer on the parameters $$\theta_1, \ldots, \theta_n$$. In other words, the parameters are each endowed with an independent Gaussian prior with variance $$1 / \alpha$$.
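The correspondence between the regularizer and the Gaussian prior can be checked numerically. The sketch below assumes the penalty has the form $$\frac{\alpha}{2} \sum_i \theta_i^2$$ (the factor $$1/2$$ is what makes the variance come out as exactly $$1 / \alpha$$) and a zero-mean prior; both function names are hypothetical. The penalty and the negative log-density of the prior then differ only by a constant that does not depend on $$\theta$$, so they lead to the same optimum.

```python
import math


def l2_penalty(theta, alpha):
    """Assumed form of the regularizer: (alpha / 2) * sum of squares."""
    return 0.5 * alpha * sum(t * t for t in theta)


def neg_log_gaussian_prior(theta, alpha):
    """Negative log-density of independent N(0, 1 / alpha) priors."""
    var = 1.0 / alpha
    return sum(
        0.5 * math.log(2 * math.pi * var) + t * t / (2 * var)
        for t in theta
    )


alpha = 1e-2
for theta in ([0.0, 0.0, 0.0], [1.5, -0.3, 2.0], [-4.0, 4.0, 0.1]):
    gap = neg_log_gaussian_prior(theta, alpha) - l2_penalty(theta, alpha)
    print(gap)  # same value (up to rounding) for every theta
```

Larger values of $$\alpha$$ correspond to a tighter prior around zero, that is, stronger shrinkage of the parameters.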