Consider \(H_n(x)\) and \(D_n(x\parallel y)\) for \(P_{Ber}\).
Let \(m\) be the number of 1s in \(x^n\), the outcome of \(n\) trials. The maximum likelihood estimate \(\hat{\theta}\) of the parameter \(\theta\) of \(P_{Ber}\) is then \(\frac{m}{n}\).
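As a brief sketch of why \(\hat{\theta}=\frac{m}{n}\), the following assumes the trials are i.i.d. and that \(0<m<n\), so the maximum lies in the interior of \((0,1)\). The natural logarithm is used only for convenience; the maximizer does not depend on the base.

$$\begin{array}{rcl}
L(\theta|x^n)&=&\theta^{m}\left(1-\theta\right)^{n-m}\\
\displaystyle \frac{\partial}{\partial\theta}\ln{L(\theta|x^n)}
&=&\displaystyle \frac{m}{\theta}-\frac{n-m}{1-\theta}\\
\displaystyle \frac{m}{\hat{\theta}}-\frac{n-m}{1-\hat{\theta}}
&=&0\\
m\left(1-\hat{\theta}\right)&=&\left(n-m\right)\hat{\theta}\\
\displaystyle \hat{\theta}&=&\displaystyle \frac{m}{n}\\
\end{array}$$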
$$\begin{array}{rcl}
\displaystyle \hat{\theta}
&=&\displaystyle \frac{m}{n}\\
H_n(P)&\overset{\mathrm{def}}{=}& E^{n}_{P}\left[-\log_2{P(X^n)}\right]
\\
\displaystyle H(\theta)
&\overset{\mathrm{def}}{=}&
\displaystyle -\theta\log_{2}{\left(\theta\right)}-\left(1-\theta\right)\log_{2}{\left(1-\theta\right)}
\quad\dotso \text{for } P=P_{Ber}\text{, written with } \theta \text{ as the argument inside the parentheses}
\\
\displaystyle H(\hat{\theta})
&=&\displaystyle -\hat{\theta}\log_{2}{\left(\hat{\theta}\right)}-\left(1-\hat{\theta}\right)\log_{2}{\left(1-\hat{\theta}\right)}\\
&=&\displaystyle -\frac{m}{n}\log_{2}{\left(\frac{m}{n}\right)}-\left(1-\frac{m}{n}\right)\log_{2}{\left(1-\frac{m}{n}\right)}\\
&=&\displaystyle \frac{1}{n}\left\{-m\log_{2}{\left(\frac{m}{n}\right)}-\left(n-m\right)\log_{2}{\left(\frac{n-m}{n}\right)}\right\}\\
&=&\displaystyle \frac{1}{n}\left\{
\displaystyle -m\log_{2}{ \left(m\right) }
\displaystyle +m\log_{2}{ \left(n\right) }
\displaystyle -\left(n-m\right)\log_{2}{ \left(n-m\right) }
\displaystyle +\left(n-m\right)\log_{2}{ \left(n\right) }
\displaystyle \right\}\\
\displaystyle D(\hat{\theta}\parallel \theta)
&=&\displaystyle \hat{\theta}\log_{2}{\left(\frac{\hat{\theta}}{\theta}\right)}+\left(1-\hat{\theta}\right)\log_{2}{\left(\frac{1-\hat{\theta}}{1-\theta}\right)}\\
&=&\displaystyle \frac{m}{n}\log_{2}{\left(\frac{\frac{m}{n}}{\theta}\right)}+\left(1-\frac{m}{n}\right)\log_{2}{\left(\frac{1-\frac{m}{n}}{1-\theta}\right)}\\
&=&\displaystyle \frac{1}{n}\left\{
\displaystyle m\log_{2}{\left(\frac{\frac{m}{n}}{\theta}\right)}
\displaystyle +\left(n-m\right)\log_{2}{\left(\frac{1-\frac{m}{n}}{1-\theta}\right)}
\displaystyle \right\}\\
&=&\displaystyle \frac{1}{n}\left\{
\displaystyle m\log_{2}{\left(\frac{m}{n}\right)}
\displaystyle -m\log_{2}{\left(\theta\right)}
\displaystyle +\left(n-m\right)\log_{2}{\left(1-\frac{m}{n}\right)}
\displaystyle -\left(n-m\right)\log_{2}{\left(1-\theta\right)}
\displaystyle \right\}\\
&=&\displaystyle \frac{1}{n}\left\{
\displaystyle m\log_{2}{\left(m\right)}
\displaystyle -m\log_{2}{\left(n\right)}
\displaystyle -m\log_{2}{\left(\theta\right)}
\displaystyle +\left(n-m\right)\log_{2}{\left(n-m\right)}
\displaystyle -\left(n-m\right)\log_{2}{\left(n\right)}
\displaystyle -\left(n-m\right)\log_{2}{\left(1-\theta\right)}
\displaystyle \right\}\\
\displaystyle H(\hat{\theta})+D(\hat{\theta}\parallel \theta)
&=&\displaystyle \frac{1}{n}\left\{
\displaystyle -m\log_{2}{\left(\theta\right)}-\left(n-m\right)\log_{2}{\left(1-\theta\right)}
\displaystyle \right\}\\
\displaystyle n\left\{H(\hat{\theta})+D(\hat{\theta}\parallel \theta)\right\}
&=&\displaystyle -m\log_{2}{\left(\theta\right)}-\left(n-m\right)\log_2{\left(1-\theta\right)}\\
\end{array}$$
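The identity \(n\{H(\hat{\theta})+D(\hat{\theta}\parallel\theta)\}=-m\log_2(\theta)-(n-m)\log_2(1-\theta)\) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names `binary_entropy`, `binary_kl`, and `check_identity` are hypothetical, and the values \(n=10\), \(m=3\), \(\theta=0.5\) are chosen only for illustration. It assumes \(0<\theta<1\) and uses the convention \(0\log_2 0=0\).

```python
import math


def binary_entropy(theta):
    """H(theta) in bits, using the convention 0 * log2(0) = 0."""
    if theta in (0.0, 1.0):
        return 0.0
    return -theta * math.log2(theta) - (1 - theta) * math.log2(1 - theta)


def binary_kl(p, q):
    """D(p || q) in bits between Bernoulli(p) and Bernoulli(q), assuming 0 < q < 1."""
    total = 0.0
    if p > 0:
        total += p * math.log2(p / q)
    if p < 1:
        total += (1 - p) * math.log2((1 - p) / (1 - q))
    return total


def check_identity(n, m, theta):
    """Return both sides of n{H(m/n) + D(m/n || theta)} = -m log2(theta) - (n-m) log2(1-theta)."""
    theta_hat = m / n
    lhs = n * (binary_entropy(theta_hat) + binary_kl(theta_hat, theta))
    rhs = -m * math.log2(theta) - (n - m) * math.log2(1 - theta)
    return lhs, rhs


# Illustrative values only: n = 10 trials, m = 3 ones, theta = 0.5.
lhs, rhs = check_identity(n=10, m=3, theta=0.5)
print(lhs, rhs)
```

With \(\theta=0.5\) the right-hand side reduces to \(m+(n-m)=n\), so both values should equal 10 up to floating-point rounding.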
$$\begin{array}{rcl}
-\log_2{\left(
L(\theta|x^n)
\right)}
&=&-m\log_2{\left(
\theta
\right)}-(n-m)\log_2{\left(1-\theta\right)}\\
n\left\{H\left(
\hat{\theta}
\right)+D(\hat{\theta}\parallel \theta)\right\}
&=&-m\log_2{\left(
\theta
\right)}-\left(
n-m
\right)\log_2{\left(
1-\theta
\right)}\\
-\log_2{\left(
L\left(\theta|x^n\right)
\right)}
&=&n\left\{H\left(
\hat{\theta}
\right)+D\left(
\hat{\theta}\parallel \theta
\right)\right\}\\
&=&nH\left(
\hat{\theta}
\right)+nD\left(
\hat{\theta}\parallel \theta
\right)\\
-\log_2{\left(
L\left(\hat{\theta}|x^n\right)
\right)}
&=&nH\left(
\hat{\theta}
\right)+nD\left(
\hat{\theta}\parallel \hat{\theta}
\right)\quad\dotso\theta=\hat{\theta}\\
&=&nH\left(
\hat{\theta}
\right)+n\cdot 0\quad\dotso D\left(
\hat{\theta}\parallel \hat{\theta}
\right)=0\\
&=&nH\left(
\frac{m}{n}
\right)\quad\dotso \hat{\theta}=\frac{m}{n}\\
\end{array}$$
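The final result, \(-\log_2 L(\hat{\theta}|x^n)=nH\left(\frac{m}{n}\right)\), can also be illustrated on a concrete sequence. The sketch below is an assumption-laden example rather than anything from the original post: the sample `x` and the helper names `neg_log2_likelihood` and `binary_entropy` are hypothetical. It evaluates the negative log-likelihood directly from the bits and compares it with \(nH(\hat{\theta})\). It also evaluates a few other values of \(\theta\), each of which should cost \(nD(\hat{\theta}\parallel\theta)>0\) additional bits.

```python
import math


def neg_log2_likelihood(bits, theta):
    """-log2 L(theta | x^n) for an i.i.d. Bernoulli(theta) sequence of 0/1 values.

    Assumes 0 < theta < 1 so that both logarithms are finite.
    """
    m = sum(bits)          # number of ones
    n = len(bits)          # number of trials
    return -m * math.log2(theta) - (n - m) * math.log2(1 - theta)


def binary_entropy(theta):
    """H(theta) in bits, using the convention 0 * log2(0) = 0."""
    if theta in (0.0, 1.0):
        return 0.0
    return -theta * math.log2(theta) - (1 - theta) * math.log2(1 - theta)


# Hypothetical sample: n = 8 trials with m = 3 ones.
x = [1, 0, 0, 1, 0, 0, 0, 1]
n, m = len(x), sum(x)
theta_hat = m / n  # maximum likelihood estimate

at_mle = neg_log2_likelihood(x, theta_hat)
n_entropy = n * binary_entropy(theta_hat)

# Any theta other than theta_hat should give a strictly larger value.
others = {t: neg_log2_likelihood(x, t) for t in (0.1, 0.25, 0.5, 0.75, 0.9)}

print(at_mle, n_entropy)
print(others)
```

Computing the likelihood from the raw sequence rather than from \(m\) and \(n\) alone makes it clear that only the count of ones matters. Any reordering of `x` gives the same value.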