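This post may contain nothing but mistakes; I would be grateful if you pointed them out in the comments. (Please note that I frequently revise whatever I notice, without announcement.)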

Distribution of the least-squares estimators \(\hat{\alpha},\hat{\beta}\) of the simple regression model

Simple regression model

$$ \begin{eqnarray} y_i&=&\alpha+\beta x_i +\epsilon_i\;(i=1,\cdots,n)\;\cdots\;\epsilon_i \overset{iid}{\sim} \mathrm{N}\left(0,\sigma^2\right) \\\mathrm{E}\left[y_i\right]&=&\mathrm{E}\left[\alpha+\beta x_i +\epsilon_i\right] \\&=&\alpha+\beta x_i +\mathrm{E}\left[\epsilon_i\right] \;\cdots\;\href{https://shikitenkai.blogspot.com/2019/06/continuous-random-variable-expected.html}{\mathrm{E}\left[X+t\right]=\mathrm{E}\left[X\right]+t} \\&=&\alpha+\beta x_i+0 \;\cdots\;\epsilon_i \overset{iid}{\sim} \mathrm{N}\left(0,\sigma^2\right) \\&=&\alpha+\beta x_i \\\mathrm{V}\left[y_i\right]&=&\mathrm{V}\left[\alpha+\beta x_i +\epsilon_i\right] \\&=&\mathrm{V}\left[\epsilon_i\right] \;\cdots\;\href{https://shikitenkai.blogspot.com/2019/06/continuous-random-variable-variance.html}{\mathrm{V}\left[X+t\right]=\mathrm{V}\left[X\right]} \\&=&\sigma^2 \;\cdots\;\epsilon_i \overset{iid}{\sim} \mathrm{N}\left(0,\sigma^2\right) \\y_i&\sim&\mathrm{N}(\alpha+\beta x_i,\sigma^2) \end{eqnarray} $$ Here the \(x_i\) are treated as fixed constants. Each \(y_i\) is therefore a random variable distributed as \(\mathrm{N}\left(\alpha+\beta x_i,\sigma^2\right)\), and the \(y_i\) are mutually independent because the \(\epsilon_i\) are.
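The statement \(y_i\sim\mathrm{N}(\alpha+\beta x_i,\sigma^2)\) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the parameter values, the helper name `simulate_y`, and the seed are all assumptions chosen only for illustration.

```python
import random
import statistics


def simulate_y(alpha, beta, x, sigma, n_draws, seed=0):
    """Draw n_draws values of y = alpha + beta*x + eps with eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    return [alpha + beta * x + rng.gauss(0.0, sigma) for _ in range(n_draws)]


# Hypothetical parameter values, chosen only for illustration.
alpha, beta, sigma, x = 1.0, 2.0, 0.5, 3.0
draws = simulate_y(alpha, beta, x, sigma, n_draws=50_000)

sample_mean = statistics.fmean(draws)     # should be near alpha + beta*x = 7.0
sample_var = statistics.pvariance(draws)  # should be near sigma^2 = 0.25
print(sample_mean, sample_var)
```

With a fixed \(x\), only \(\epsilon\) varies between draws, so the sample mean and variance of the draws approximate \(\alpha+\beta x\) and \(\sigma^2\).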

Expressing \(\hat{\beta}\) in the form \(\sum_{i=1}^n c_iy_i\)

An estimator that can be written in the form \(\sum_{i=1}^n c_ix_i\) (\(x_i\): sample values, \(c_i\): constants) is called a linear estimator.
(A well-known example of a linear estimator is the sample mean \(\bar{x}\), which can be written as \(\bar{x}=\sum_{i=1}^n \frac{1}{n} x_i\).) In the regression setting the sample is \(y_1,\cdots,y_n\), and the constants \(c_i\) may depend on the fixed \(x_i\). $$ \begin{eqnarray} \hat{\beta}&=&\frac{S_{xy}}{S_{xx}} \;\cdots\;\href{https://shikitenkai.blogspot.com/2020/03/blog-post.html}{\hat{\beta}=\frac{S_{xy}}{S_{xx}}} ,\;S_{xx}=\sum_{i=1}^n\left(x_i-\bar{x}\right)^2,\;\bar{x}=\frac{1}{n}\sum_{i=1}^nx_i \\&=&\frac{1}{S_{xx}} \sum_{i=1}^n \left(x_i-\bar{x}\right)\left(y_i-\bar{y}\right) \\&=&\frac{1}{S_{xx}} \sum_{i=1}^n \left(x_i-\bar{x}\right)y_i \;\cdots\;\href{https://shikitenkai.blogspot.com/2020/10/sxy.html}{\sum_{i=1}^n\left(x_i-\bar{x}\right)\left(y_i-\bar{y}\right)=S_{xy}= \sum_{i=1}^n \left(x_i-\bar{x}\right)y_i} \\&=& \sum_{i=1}^n \frac{x_i-\bar{x}}{S_{xx}}y_i \\&=& \sum_{i=1}^n c_iy_i \;\cdots\;c_i=\frac{x_i-\bar{x}}{S_{xx}} \end{eqnarray} $$
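As a quick check that the ratio form \(S_{xy}/S_{xx}\) and the linear form \(\sum c_iy_i\) give the same number, here is a small sketch. The data values and the function name `beta_hat_two_ways` are made up for illustration.

```python
def beta_hat_two_ways(x, y):
    """Return beta_hat computed as S_xy / S_xx and as sum(c_i * y_i)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ratio_form = s_xy / s_xx
    # Weights c_i = (x_i - x_bar) / S_xx depend only on the x values.
    c = [(xi - x_bar) / s_xx for xi in x]
    linear_form = sum(ci * yi for ci, yi in zip(c, y))
    return ratio_form, linear_form


# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
ratio_form, linear_form = beta_hat_two_ways(x, y)
print(ratio_form, linear_form)  # the two values agree up to rounding error
```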

Deriving the expected value of \(\hat{\beta}\) from \(\sum_{i=1}^n c_iy_i\)

$$ \begin{eqnarray} \mathrm{E}\left[\sum_{i=1}^n c_iy_i\right] &=&\mathrm{E}\left[\sum_{i=1}^n \frac{x_i-\bar{x}}{S_{xx}}y_i\right] \\&=&\mathrm{E}\left[\frac{S_{xy}}{S_{xx}}\right] \;\cdots\;\text{from above},\;\frac{S_{xy}}{S_{xx}}=\sum_{i=1}^n \frac{x_i-\bar{x}}{S_{xx}}y_i \\&=&\mathrm{E}\left[\hat{\beta}\right] \;\cdots\;\href{https://shikitenkai.blogspot.com/2020/03/blog-post.html}{\hat{\beta}=\frac{S_{xy}}{S_{xx}}} \\&=&\beta \;\cdots\;\href{https://shikitenkai.blogspot.com/2020/08/2.html}{\mathrm{E}\left[\hat{\beta}\right]=\beta} \end{eqnarray} $$
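The same result can also be read directly off the weights: by linearity, \(\mathrm{E}\left[\sum c_iy_i\right]=\alpha\sum c_i+\beta\sum c_ix_i\), so unbiasedness follows if \(\sum c_i=0\) and \(\sum c_ix_i=1\). The sketch below checks these two identities numerically. The \(x\) values, the parameter values, and the name `beta_weights` are assumptions for illustration.

```python
def beta_weights(x):
    """Weights c_i = (x_i - x_bar) / S_xx of the linear estimator beta_hat."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return [(xi - x_bar) / s_xx for xi in x]


# Hypothetical design points and parameters, chosen only for illustration.
x = [0.5, 1.0, 2.5, 4.0, 7.0]
alpha, beta = 1.5, -0.7

c = beta_weights(x)
sum_c = sum(c)                                  # identity: equals 0
sum_cx = sum(ci * xi for ci, xi in zip(c, x))   # identity: equals 1
# E[sum c_i y_i] = sum c_i * E[y_i] with E[y_i] = alpha + beta * x_i
expected_beta_hat = sum(ci * (alpha + beta * xi) for ci, xi in zip(c, x))
print(sum_c, sum_cx, expected_beta_hat)
```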

Deriving the variance of \(\hat{\beta}\) from \(\sum_{i=1}^n c_iy_i\)

$$ \begin{eqnarray} \mathrm{V}\left[\sum_{i=1}^n c_iy_i\right] &=&\mathrm{V}\left[\sum_{i=1}^n \frac{x_i-\bar{x}}{S_{xx}}y_i\right] \\&=&\sum_{i=1}^n \mathrm{V}\left[\frac{x_i-\bar{x}}{S_{xx}}y_i\right] \;\cdots\;\href{https://shikitenkai.blogspot.com/2019/06/continuous-random-variable-variance.html}{\text{the }y_i\text{ are mutually independent, so }\mathrm{Cov}\left[y_i, y_j\right]=0\;(i\neq j)\text{ and }\mathrm{V}\left[X+Y\right]=\mathrm{V}\left[X\right]+\mathrm{V}\left[Y\right]} \\&=&\sum_{i=1}^n \left(\frac{x_i-\bar{x}}{S_{xx}}\right)^2\mathrm{V}\left[y_i\right] \;\cdots\;\href{https://shikitenkai.blogspot.com/2019/06/continuous-random-variable-variance.html}{\mathrm{V}\left[cX\right]=c^2\mathrm{V}\left[X\right]} \\&=&\sum_{i=1}^n \left(\frac{x_i-\bar{x}}{S_{xx}}\right)^2\sigma^2 \\&=&\frac{\sigma^2}{S_{xx}^2}\sum_{i=1}^n \left(x_i-\bar{x}\right)^2 \\&=&\frac{\sigma^2}{S_{xx}^2}S_{xx} \;\cdots\;S_{xx}=\sum_{i=1}^n \left(x_i-\bar{x}\right)^2 \\&=&\frac{\sigma^2}{S_{xx}} \;\cdots\;\text{same result as }\href{https://shikitenkai.blogspot.com/2020/08/2variancecovariance.html}{\mathrm{V}\left[\frac{S_{xy}}{S_{xx}}\right]=\frac{\sigma^2}{S_{xx}}} \end{eqnarray} $$
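The key algebraic step is \(\sum c_i^2=1/S_{xx}\). A minimal numerical check of that step, with hypothetical \(x\) values and \(\sigma\) (the function name `beta_hat_variance` is also an assumption):

```python
def beta_hat_variance(x, sigma):
    """Return V[beta_hat] as sigma^2 * sum(c_i^2) and as sigma^2 / S_xx."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    c = [(xi - x_bar) / s_xx for xi in x]
    from_weights = sigma ** 2 * sum(ci ** 2 for ci in c)
    closed_form = sigma ** 2 / s_xx
    return from_weights, closed_form


# Hypothetical inputs: S_xx = 10 for these x values, so sigma^2 / S_xx = 0.4.
from_weights, closed_form = beta_hat_variance([1.0, 2.0, 3.0, 4.0, 5.0], sigma=2.0)
print(from_weights, closed_form)
```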

Distribution of \(\hat{\beta}\)

As shown above, \(\hat{\beta}\) is a linear estimator: it can be written as a sum of constant multiples of the mutually independent, normally distributed \(y_i\). Hence \(\hat{\beta}\) is itself normally distributed, with the expected value and variance derived above \(\;\cdots\;\href{https://shikitenkai.blogspot.com/2020/09/zc1xc2y-2.html}{Z=c_1X+c_2Y(X\sim\mathrm{N}(\mu_1,\sigma_1^2),Y\sim\mathrm{N}(\mu_2,\sigma_2^2),Z\sim\mathrm{N}(c_1\mu_1+c_2\mu_2,c_1^2\sigma_1^2+c_2^2\sigma_2^2))}\). $$ \begin{eqnarray} \hat{\beta}&\sim& \mathrm{N}\left(\beta,\frac{\sigma^2}{S_{xx}}\right) \end{eqnarray} $$
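A Monte Carlo sketch can make the sampling distribution of \(\hat{\beta}\) concrete. It is only an illustration under assumed values: the design points, true parameters, replication count, seed, and the name `simulate_beta_hat` are not from the original post.

```python
import random
import statistics


def simulate_beta_hat(x, alpha, beta, sigma, n_rep, seed=0):
    """Simulate n_rep datasets with fixed x and return the beta_hat of each."""
    rng = random.Random(seed)
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    c = [(xi - x_bar) / s_xx for xi in x]
    estimates = []
    for _ in range(n_rep):
        y = [alpha + beta * xi + rng.gauss(0.0, sigma) for xi in x]
        estimates.append(sum(ci * yi for ci, yi in zip(c, y)))
    return estimates, s_xx


# Hypothetical settings, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
alpha, beta, sigma, n_rep = 1.0, 2.0, 1.0, 20_000
estimates, s_xx = simulate_beta_hat(x, alpha, beta, sigma, n_rep)

mc_mean = statistics.fmean(estimates)     # should be near beta
mc_var = statistics.pvariance(estimates)  # should be near sigma^2 / S_xx
theory_sd = (sigma ** 2 / s_xx) ** 0.5
# Under normality, about 95% of estimates fall within 1.96 standard deviations.
coverage = sum(abs(b - beta) <= 1.96 * theory_sd for b in estimates) / n_rep
print(mc_mean, mc_var, coverage)
```

The empirical mean, variance, and 1.96-standard-deviation coverage should land close to \(\beta\), \(\sigma^2/S_{xx}\), and 0.95 respectively, up to simulation noise.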

Expressing \(\hat{\alpha}\) in the form \(\sum_{i=1}^n c_iy_i\)

$$ \begin{eqnarray} \hat{\alpha}&=&\bar{y}-\hat{\beta}\bar{x} \;\cdots\;\href{https://shikitenkai.blogspot.com/2020/03/blog-post.html}{\hat{\alpha}=\bar{y}-\hat{\beta}\bar{x}} \\&=&\sum_{i=1}^n\frac{1}{n}y_i-\frac{S_{xy}}{S_{xx}}\bar{x} \\&=&\sum_{i=1}^n\frac{1}{n}y_i-\left(\sum_{i=1}^n\frac{x_i-\bar{x}}{S_{xx}}y_i\right)\bar{x} \;\cdots\;\text{from above},\;\frac{S_{xy}}{S_{xx}}=\sum_{i=1}^n \frac{x_i-\bar{x}}{S_{xx}}y_i \\&=&\sum_{i=1}^n\frac{1}{n}y_i-\bar{x}\sum_{i=1}^n\frac{x_i-\bar{x}}{S_{xx}}y_i \\&=&\sum_{i=1}^n\frac{1}{n}y_i-\sum_{i=1}^n\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}y_i \\&=&\sum_{i=1}^n\left(\frac{1}{n}-\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}\right)y_i \\&=& \sum_{i=1}^n c_iy_i \;\cdots\;c_i=\frac{1}{n}-\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}} \end{eqnarray} $$
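To see that \(\bar{y}-\hat{\beta}\bar{x}\) and \(\sum c_iy_i\) really coincide, the two forms can be evaluated on the same data. The data values and the function name `alpha_hat_two_ways` below are hypothetical.

```python
def alpha_hat_two_ways(x, y):
    """Return alpha_hat as y_bar - beta_hat * x_bar and as sum(c_i * y_i)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    direct_form = y_bar - (s_xy / s_xx) * x_bar
    # Weights c_i = 1/n - x_bar * (x_i - x_bar) / S_xx depend only on the x values.
    c = [1.0 / n - x_bar * (xi - x_bar) / s_xx for xi in x]
    linear_form = sum(ci * yi for ci, yi in zip(c, y))
    return direct_form, linear_form


# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
direct_form, linear_form = alpha_hat_two_ways(x, y)
print(direct_form, linear_form)  # the two values agree up to rounding error
```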

Deriving the expected value of \(\hat{\alpha}\) from \(\sum_{i=1}^n c_iy_i\)

$$ \begin{eqnarray} \mathrm{E}\left[\sum_{i=1}^n c_iy_i\right] &=&\mathrm{E}\left[\sum_{i=1}^n\left(\frac{1}{n}-\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}\right)y_i\right] \\&=&\mathrm{E}\left[\sum_{i=1}^n\left(\frac{1}{n}y_i-\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}y_i\right)\right] \\&=&\mathrm{E}\left[\sum_{i=1}^n\frac{1}{n}y_i-\sum_{i=1}^n\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}y_i\right] \\&=&\mathrm{E}\left[\sum_{i=1}^n\frac{1}{n}y_i\right]-\mathrm{E}\left[\sum_{i=1}^n\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}y_i\right] \;\cdots\;\href{https://shikitenkai.blogspot.com/2019/06/continuous-random-variable-expected.html}{\mathrm{E}\left[X+Y\right]=\mathrm{E}\left[X\right]+\mathrm{E}\left[Y\right]} \\&=&\mathrm{E}\left[\frac{1}{n}\sum_{i=1}^ny_i\right]-\mathrm{E}\left[\bar{x}\sum_{i=1}^n\frac{\left(x_i-\bar{x}\right)}{S_{xx}}y_i\right] \\&=&\frac{1}{n}\mathrm{E}\left[\sum_{i=1}^ny_i\right]-\bar{x}\mathrm{E}\left[\sum_{i=1}^n\frac{\left(x_i-\bar{x}\right)}{S_{xx}}y_i\right] \;\cdots\;\href{https://shikitenkai.blogspot.com/2019/06/continuous-random-variable-expected.html}{\mathrm{E}\left[cX\right]=c\mathrm{E}\left[X\right]} \\&=&\frac{1}{n}\sum_{i=1}^n\mathrm{E}\left[y_i\right]-\bar{x}\mathrm{E}\left[\frac{S_{xy}}{S_{xx}}\right] \;\cdots\;\text{from above},\;\frac{S_{xy}}{S_{xx}}=\sum_{i=1}^n \frac{x_i-\bar{x}}{S_{xx}}y_i \\&=&\frac{1}{n}\sum_{i=1}^n\left(\alpha+\beta x_i\right)-\bar{x}\mathrm{E}\left[\frac{S_{xy}}{S_{xx}}\right] \;\cdots\;\mathrm{E}\left[y_i\right]=\alpha+\beta x_i \\&=&\frac{1}{n}\left(\alpha\sum_{i=1}^n1+\beta\sum_{i=1}^n x_i\right)-\bar{x}\mathrm{E}\left[\hat{\beta}\right] \;\cdots\;\href{https://shikitenkai.blogspot.com/2020/03/blog-post.html}{\hat{\beta}=\frac{S_{xy}}{S_{xx}}} \\&=&\frac{1}{n}\left(n\alpha+\beta\;n\bar{x}\right)-\bar{x}\beta \;\cdots\;\href{https://shikitenkai.blogspot.com/2020/08/2.html}{\mathrm{E}\left[\hat{\beta}\right]=\beta} \\&=&\frac{1}{n}n\left(\alpha+\beta\bar{x}\right)-\bar{x}\beta \\&=&\left(\alpha+\beta\bar{x}\right)-\bar{x}\beta \\&=&\alpha \;\cdots\;\href{https://shikitenkai.blogspot.com/2020/08/2.html}{\mathrm{E}\left[\hat{\alpha}\right]=\alpha} \end{eqnarray} $$
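In terms of the weights, unbiasedness of \(\hat{\alpha}\) amounts to \(\sum c_i=1\) and \(\sum c_ix_i=0\), since \(\mathrm{E}\left[\sum c_iy_i\right]=\alpha\sum c_i+\beta\sum c_ix_i\). The sketch below checks both identities numerically. The \(x\) values, the parameter values, and the name `alpha_weights` are assumptions for illustration.

```python
def alpha_weights(x):
    """Weights c_i = 1/n - x_bar * (x_i - x_bar) / S_xx of alpha_hat."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return [1.0 / n - x_bar * (xi - x_bar) / s_xx for xi in x]


# Hypothetical design points and parameters, chosen only for illustration.
x = [0.5, 1.0, 2.5, 4.0, 7.0]
alpha, beta = 1.5, -0.7

c = alpha_weights(x)
sum_c = sum(c)                                  # identity: equals 1
sum_cx = sum(ci * xi for ci, xi in zip(c, x))   # identity: equals 0
# E[sum c_i y_i] = sum c_i * E[y_i] with E[y_i] = alpha + beta * x_i
expected_alpha_hat = sum(ci * (alpha + beta * xi) for ci, xi in zip(c, x))
print(sum_c, sum_cx, expected_alpha_hat)
```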

Deriving the variance of \(\hat{\alpha}\) from \(\sum_{i=1}^n c_iy_i\)

$$ \begin{eqnarray} \mathrm{V}\left[\sum_{i=1}^n c_iy_i\right] &=&\mathrm{V}\left[\sum_{i=1}^n\left(\frac{1}{n}-\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}\right)y_i\right] \\&=&\sum_{i=1}^n\mathrm{V}\left[\left(\frac{1}{n}-\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}\right)y_i\right] \;\cdots\;\href{https://shikitenkai.blogspot.com/2019/06/continuous-random-variable-variance.html}{\text{the }y_i\text{ are mutually independent, so }\mathrm{Cov}\left[y_i, y_j\right]=0\;(i\neq j)\text{ and }\mathrm{V}\left[X+Y\right]=\mathrm{V}\left[X\right]+\mathrm{V}\left[Y\right]} \\&=&\sum_{i=1}^n\left(\frac{1}{n}-\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}\right)^2\mathrm{V}\left[y_i\right] \;\cdots\;\href{https://shikitenkai.blogspot.com/2019/06/continuous-random-variable-variance.html}{\mathrm{V}\left[cX\right]=c^2\mathrm{V}\left[X\right]} \\&=&\sum_{i=1}^n\left(\frac{1}{n}-\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}\right)^2\sigma^2 \\&=&\sigma^2\sum_{i=1}^n\left(\frac{1}{n}-\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}\right)^2 \\&=&\sigma^2\sum_{i=1}^n\left( \frac{1}{n^2} -2\frac{1}{n}\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}} +\left(\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}\right)^2 \right) \\&=&\sigma^2\left( \sum_{i=1}^n\frac{1}{n^2} -\sum_{i=1}^n2\frac{1}{n}\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}} +\sum_{i=1}^n\left(\frac{\bar{x}\left(x_i-\bar{x}\right)}{S_{xx}}\right)^2 \right) \\&=&\sigma^2\left( \frac{1}{n^2}\sum_{i=1}^n1 -\frac{2\bar{x}}{nS_{xx}}\sum_{i=1}^n\left(x_i-\bar{x}\right) +\frac{\bar{x}^2}{S_{xx}^2}\sum_{i=1}^n\left(x_i-\bar{x}\right)^2 \right) \\&=&\sigma^2\left( \frac{1}{n^2}\cdot n -\frac{2\bar{x}}{nS_{xx}}\cdot 0 +\frac{\bar{x}^2}{S_{xx}^2}\cdot S_{xx} \right) \;\cdots\;\sum_{i=1}^n\left(x_i-\bar{x}\right)=0,\;\sum_{i=1}^n\left(x_i-\bar{x}\right)^2=S_{xx} \\&=&\sigma^2\left( \frac{1}{n} +\frac{\bar{x}^2}{S_{xx}} \right) \;\cdots\;\text{same result as }\href{https://shikitenkai.blogspot.com/2020/08/2variancecovariance.html}{\mathrm{V}\left[\bar{y}-\hat{\beta}\bar{x}\right]=\left(\frac{1}{n}+\frac{\bar{x}^2}{S_{xx}}\right)\sigma^2} \end{eqnarray} $$
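The expansion above boils down to \(\sum c_i^2=\frac{1}{n}+\frac{\bar{x}^2}{S_{xx}}\) (the cross term vanishes because \(\sum(x_i-\bar{x})=0\)). A minimal numerical check with hypothetical \(x\) values and \(\sigma\) (the function name `alpha_hat_variance` is an assumption):

```python
def alpha_hat_variance(x, sigma):
    """Return V[alpha_hat] from the weights and from the closed form."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    c = [1.0 / n - x_bar * (xi - x_bar) / s_xx for xi in x]
    from_weights = sigma ** 2 * sum(ci ** 2 for ci in c)
    closed_form = sigma ** 2 * (1.0 / n + x_bar ** 2 / s_xx)
    return from_weights, closed_form


# Hypothetical inputs: n = 5, x_bar = 3, S_xx = 10, so the closed form is 4 * 1.1.
from_weights, closed_form = alpha_hat_variance([1.0, 2.0, 3.0, 4.0, 5.0], sigma=2.0)
print(from_weights, closed_form)
```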

Distribution of \(\hat{\alpha}\)

As shown above, \(\hat{\alpha}\) is a linear estimator: it can be written as a sum of constant multiples of the mutually independent, normally distributed \(y_i\). Hence \(\hat{\alpha}\) is itself normally distributed, with the expected value and variance derived above \(\;\cdots\;\href{https://shikitenkai.blogspot.com/2020/09/zc1xc2y-2.html}{Z=c_1X+c_2Y(X\sim\mathrm{N}(\mu_1,\sigma_1^2),Y\sim\mathrm{N}(\mu_2,\sigma_2^2),Z\sim\mathrm{N}(c_1\mu_1+c_2\mu_2,c_1^2\sigma_1^2+c_2^2\sigma_2^2))}\). $$ \begin{eqnarray} \hat{\alpha}&\sim& \mathrm{N}\left(\alpha,\sigma^2\left( \frac{1}{n} +\frac{\bar{x}^2}{S_{xx}} \right)\right) \end{eqnarray} $$
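The sampling distribution of \(\hat{\alpha}\) can likewise be illustrated by simulation. This is a sketch under assumed values only: the design points, true parameters, replication count, seed, and the name `simulate_alpha_hat` are not from the original post.

```python
import random
import statistics


def simulate_alpha_hat(x, alpha, beta, sigma, n_rep, seed=0):
    """Simulate n_rep datasets with fixed x and return the alpha_hat of each."""
    rng = random.Random(seed)
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    c = [1.0 / n - x_bar * (xi - x_bar) / s_xx for xi in x]
    estimates = []
    for _ in range(n_rep):
        y = [alpha + beta * xi + rng.gauss(0.0, sigma) for xi in x]
        estimates.append(sum(ci * yi for ci, yi in zip(c, y)))
    theory_var = sigma ** 2 * (1.0 / n + x_bar ** 2 / s_xx)
    return estimates, theory_var


# Hypothetical settings, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
alpha, beta, sigma, n_rep = 1.0, 2.0, 1.0, 20_000
estimates, theory_var = simulate_alpha_hat(x, alpha, beta, sigma, n_rep)

mc_mean = statistics.fmean(estimates)     # should be near alpha
mc_var = statistics.pvariance(estimates)  # should be near theory_var
print(mc_mean, mc_var, theory_var)
```

Note that the variance of \(\hat{\alpha}\) grows with \(\bar{x}^2\): the farther the design points sit from the origin, the less precisely the intercept is estimated.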
