Eigenvector perturbation analysis plays a vital role in various data science applications. A large body of prior works, however, focused on establishing <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\ell _{2}$ </tex-math></inline-formula> eigenvector perturbation bounds, which are often highly inadequate in addressing tasks that rely on fine-grained behavior of an eigenvector. This paper makes progress on this by studying the perturbation of linear functions of an unknown eigenvector. Focusing on two fundamental problems — matrix denoising and principal component analysis — in the presence of Gaussian noise, we develop a suite of statistical theory that characterizes the perturbation of arbitrary linear functions of an unknown eigenvector. In order to mitigate a non-negligible bias issue inherent to the natural “plug-in” estimator, we develop de-biased estimators that <xref ref-type="disp-formula" rid="deqn1" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">(1)</xref> achieve minimax lower bounds for a family of scenarios (modulo some logarithmic factor), and <xref ref-type="disp-formula" rid="deqn2" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">(2)</xref> can be computed in a data-driven manner without sample splitting. Noteworthily, the proposed estimators are nearly minimax optimal even when the associated eigen-gap is <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">substantially smaller</i> than what is required in prior statistical theory.
Discussion(0)
No comments yet. Be the first to comment.