Equivalence of Empirical Risk Minimization to Regularization on the Family of $f$-Divergences
Article 2024 en
Authors
Francisco Daunas
Iñaki Esnaola
Samir M. Perlaza
Abstract
The solution to empirical risk minimization with <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$f$</tex>-divergence regularization (ERM-<tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$f$</tex>DR) is presented under mild conditions on <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$f$</tex>. Under such conditions, the optimal measure is shown to be unique. Examples of the solution for particular choices of the function <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$f$</tex> are presented. Previously known solutions to common regularization choices are obtained by leveraging the flexibility of the family of <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$f$</tex>-divergences. These include the unique solutions to empirical risk minimization with relative entropy regularization (Type-I and Type-II).
The analysis of the solution unveils the following properties of <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$f$</tex>-divergences when used in the ERM-<tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$f$</tex>DR problem: i) <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$f$</tex>-divergence regularization forces the support of the solution to coincide with the support of the reference measure, which introduces a strong inductive bias that dominates the evidence provided by the training data; and ii) any <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$f$</tex>-divergence regularization is equivalent to a different <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$f$</tex>-divergence regularization with an appropriate transformation of the empirical risk function.
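Property i) can be illustrated numerically in the simplest setting. The following is a minimal sketch, not the paper's general construction: it assumes a finite set of models and the relative entropy case (Type-I, with f(x) = x log x), for which the regularized solution is the well-known Gibbs measure, proportional to Q(θ) exp(-L(θ)/λ). The function names, the risk values, the reference measure `q`, and the regularization factor `lam` are all hypothetical choices made for illustration.

```python
import numpy as np

def kl_divergence(p, q):
    """Relative entropy D(P||Q) on a finite set.

    Returns infinity when P puts mass where Q has none.
    """
    if np.any((q == 0) & (p > 0)):
        return np.inf
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def objective(p, risk, q, lam):
    """Expected empirical risk plus lam times the relative entropy to Q."""
    return float(np.dot(p, risk)) + lam * kl_divergence(p, q)

def gibbs_solution(risk, q, lam):
    """Gibbs measure proportional to q * exp(-risk / lam), computed in log space."""
    support = q > 0
    log_w = np.full(q.shape, -np.inf)
    log_w[support] = np.log(q[support]) - risk[support] / lam
    log_w -= log_w[support].max()  # shift for numerical stability
    w = np.exp(log_w)              # exp(-inf) = 0 outside the support of Q
    return w / w.sum()

# Hypothetical example: four models. The last one has zero empirical risk
# but lies outside the support of the reference measure Q.
risk = np.array([0.9, 0.1, 0.5, 0.0])
q = np.array([0.5, 0.3, 0.2, 0.0])
lam = 0.5

p_star = gibbs_solution(risk, q, lam)
print("solution:", p_star)
print("objective at solution:", objective(p_star, risk, q, lam))
```

In this sketch the model with the lowest empirical risk receives zero probability because the reference measure excludes it, whatever the training data suggest. Any measure placing mass on that model has infinite regularization cost.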