Fisher information is described as the amount of information about an unknown parameter. How should this be understood, and what is its statistical meaning?
Suppose the likelihood is $L(X; \theta)$ and the log-likelihood is $l(X; \theta) = \log L(X; \theta)$. Then:
(1) Fisher information is the second moment (and variance) of the score, i.e. the derivative of the log-likelihood:
$$I(\theta)= \mathbb{E}\left[\Big(\frac{\mathrm{d}l}{\mathrm{d}\theta}\Big)^2 \,\Big|\, \theta\right] = \mathrm{Var}\left(\frac{\mathrm{d}l}{\mathrm{d}\theta} \,\Big|\, \theta\right),$$
where the second equality holds because $\mathbb{E}\left(\frac{\mathrm{d}l}{\mathrm{d}\theta} \,\Big|\, \theta\right)=0$ (a one-line proof is sketched below).
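For completeness, here is the standard argument that the score has mean zero, assuming the usual regularity conditions (the density $f(x;\theta)$ is smooth enough to differentiate under the integral sign):
$$\mathbb{E}\left[\frac{\mathrm{d}l}{\mathrm{d}\theta} \,\Big|\, \theta\right] = \int \frac{\partial \log f(x;\theta)}{\partial \theta}\, f(x;\theta)\,\mathrm{d}x = \int \frac{\partial f(x;\theta)}{\partial \theta}\,\mathrm{d}x = \frac{\partial}{\partial \theta}\int f(x;\theta)\,\mathrm{d}x = \frac{\partial}{\partial \theta} 1 = 0.$$
Since the score has mean zero, its second moment equals its variance, which gives the equality above.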
(2) Fisher information determines the asymptotic distribution of the MLE $\hat{\theta}_{MLE}$.
By the CLT and Slutsky's theorem, we can conclude that $$\sqrt{n}(\hat{\theta}_{MLE}-\theta) \overset{d}{\to} N\big(0,\, I(\theta)^{-1}\big),$$ where $I(\theta)$ here is the Fisher information of a single observation.
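As a quick sanity check (a minimal sketch of my own, not from the original answer; the exponential model, sample sizes, and seed are arbitrary choices), the code below simulates the MLE $\hat{\lambda}_{MLE} = 1/\bar{X}$ of the rate of an exponential distribution, whose per-observation Fisher information is $I(\lambda) = 1/\lambda^2$, so the empirical variance of $\sqrt{n}(\hat{\lambda}-\lambda)$ should be close to $I(\lambda)^{-1} = \lambda^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n, reps = 2.0, 500, 20_000   # true rate, sample size, Monte Carlo replications

# Each row is one sample of size n from Exponential(rate=lam);
# the MLE of the rate is 1 / sample mean.
samples = rng.exponential(scale=1.0 / lam, size=(reps, n))
mle = 1.0 / samples.mean(axis=1)

# sqrt(n) * (mle - lam) should be approximately N(0, I(lam)^{-1}) with I(lam) = 1/lam^2.
z = np.sqrt(n) * (mle - lam)
print("empirical variance of sqrt(n)*(mle - lam):", z.var())
print("I(lambda)^{-1} = lambda^2:", lam**2)
```

The two printed numbers should agree to a couple of decimal places, illustrating that the inverse Fisher information is the asymptotic variance of the (rescaled) MLE.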
Application: the Cramér-Rao bound
Under regularity conditions, the variance of any unbiased estimator $\hat{\theta}$ of $\theta$ is bounded below by the inverse of the Fisher information $I(\theta)$ of the whole sample:
$$\mathrm{Var}(\hat{\theta}) \geq \frac{1}{I(\theta)}.$$
Note that this CR lower bound is only a theoretical lower bound: it may be inapplicable (the regularity conditions may fail, e.g. when the support of the distribution depends on $\theta$) or unattainable (no unbiased estimator achieves it).
For example, the CR bound is applicable but not attainable for estimating $\sigma^2$ when $X_1,\dots,X_n \overset{i.i.d.}{\sim} N(\mu, \sigma^2)$ with $\mu$ unknown: the per-observation information for $\sigma^2$ is $1/(2\sigma^4)$, so the CR bound is $\frac{2\sigma^4}{n}$, yet even the best unbiased estimator, the sample variance $s^2$, has $$\mathrm{Var}(s^2)= \frac{2\sigma^4}{n-1} > \frac{2\sigma^4}{n} = \text{CR bound}.$$
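A minimal Monte Carlo check of this gap (again my own illustration; the values of $\mu$, $\sigma$, $n$, and the seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 0.0, 1.5, 10, 200_000   # arbitrary illustration values

# Unbiased sample variance s^2 (ddof=1) for each of `reps` normal samples of size n.
x = rng.normal(loc=mu, scale=sigma, size=(reps, n))
s2 = x.var(axis=1, ddof=1)

print("empirical Var(s^2)   :", s2.var())
print("2*sigma^4/(n-1)      :", 2 * sigma**4 / (n - 1))  # exact variance of s^2
print("CR bound 2*sigma^4/n :", 2 * sigma**4 / n)        # strictly smaller
```

The empirical variance of $s^2$ matches $\frac{2\sigma^4}{n-1}$ and sits strictly above the CR bound, as claimed.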
Reference:
Sinho Chewi, Theoretical Statistics notes.