This article studies the sample mean and sample variance of a normal population and derives several simple but important results.

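Throughout, let \(x_1,x_2,\cdots,x_n\) be a sample (independent and identically distributed) from the normal population \(\mathcal{N}(\mu,\sigma^2)\), with sample mean \[ \bar{x}=\dfrac{1}{n}\sum_{i=1}^{n}x_i \] and sample variance \[ s^2=\dfrac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2. \]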

Preliminaries. Let \(\boldsymbol{x}=(x_1,x_2,\cdots,x_n)^T\). The density of \(\boldsymbol{x}\), that is, the joint density of \(x_1,x_2,\cdots,x_n\), is \[ \begin{aligned} p(x_1,x_2,\cdots,x_n)&=\dfrac{1}{(2\pi\sigma^2)^{\frac{n}{2}}}\cdot\exp\left\{-\sum_{i=1}^{n}\dfrac{(x_i-\mu)^2}{2\sigma^2}\right\}\\ &=\dfrac{1}{(2\pi\sigma^2)^{\frac{n}{2}}}\cdot\exp\left\{-\dfrac{1}{2\sigma^2}\cdot\left(\sum_{i=1}^{n}x_i^2-2n\bar{x}\mu+n\mu^2\right)\right\}. \end{aligned} \] Now construct an orthogonal matrix whose first-row entries are all equal, \[ \boldsymbol{A}=\begin{bmatrix} \dfrac{1}{\sqrt{n}} & \dfrac{1}{\sqrt{n}} & \dfrac{1}{\sqrt{n}} & \cdots & \dfrac{1}{\sqrt{n}} \\ \dfrac{1}{\sqrt{2\cdot 1}} & -\dfrac{1}{\sqrt{2\cdot 1}} & 0 & \cdots & 0 \\ \dfrac{1}{\sqrt{3\cdot 2}} & \dfrac{1}{\sqrt{3\cdot 2}} & -\dfrac{2}{\sqrt{3\cdot 2}} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \dfrac{1}{\sqrt{n\cdot(n-1)}} & \dfrac{1}{\sqrt{n\cdot(n-1)}} & \dfrac{1}{\sqrt{n\cdot(n-1)}} & \cdots & -\dfrac{n-1}{\sqrt{n\cdot(n-1)}} \end{bmatrix}, \] and set \(\boldsymbol{y}=(y_1,y_2,\cdots,y_n)^T=\boldsymbol{A}\boldsymbol{x}\). Then \(\bar{x}=\dfrac{y_1}{\sqrt{n}}\), and \[ \sum_{i=1}^{n}y_i^2=\boldsymbol{y}^T\boldsymbol{y}=\boldsymbol{x}^T\boldsymbol{A}^T\boldsymbol{A}\boldsymbol{x}=\boldsymbol{x}^T\boldsymbol{x}=\sum_{i=1}^{n}x_i^2. \] Since \(\boldsymbol{A}\) is orthogonal, the Jacobian of the transformation has absolute value \(1\), so the density of \(\boldsymbol{y}\), that is, the joint density of \(y_1,y_2,\cdots,y_n\), is \[ \begin{aligned} p(y_1,y_2,\cdots,y_n)&=\dfrac{1}{(2\pi\sigma^2)^{\frac{n}{2}}}\cdot\exp\left\{-\dfrac{1}{2\sigma^2}\cdot\left(\sum_{i=1}^{n}y_i^2-2\sqrt{n}y_1\mu+n\mu^2\right)\right\}\\ &=\dfrac{1}{(2\pi\sigma^2)^{\frac{n}{2}}}\cdot\exp\left\{-\dfrac{1}{2\sigma^2}\cdot\left(\sum_{i=2}^{n}y_i^2+(y_1-\sqrt{n}\mu)^2\right)\right\}\\ &=\dfrac{1}{(2\pi\sigma^2)^{\frac{n}{2}}}\cdot\exp\left\{-\dfrac{(y_1-\sqrt{n}\mu)^2}{2\sigma^2}\right\}\cdot\prod_{i=2}^{n}\exp\left\{-\dfrac{y_i^2}{2\sigma^2}\right\}. \end{aligned} \] The joint density factors into a product of one-dimensional normal densities, which shows that \(y_1,y_2,\cdots,y_n\) are mutually independent, with \(y_1\sim\mathcal{N}(\sqrt{n}\mu,\sigma^2)\) and \(y_2,\cdots,y_n\sim\mathcal{N}(0,\sigma^2)\).
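As a numerical sanity check of the construction above, the following minimal sketch builds the matrix \(\boldsymbol{A}\) row by row and verifies that it is orthogonal, that \(y_1=\sqrt{n}\,\bar{x}\), and that \(\sum y_i^2=\sum x_i^2\). The helper names (`helmert_matrix`, `mat_vec`, `max_orthogonality_error`) and the test vector are illustrative assumptions, not part of the original text; only the Python standard library is used.

```python
import math

def helmert_matrix(n):
    """Return the n x n matrix A from the text as a list of rows."""
    rows = [[1.0 / math.sqrt(n)] * n]  # first row: every entry is 1/sqrt(n)
    for k in range(2, n + 1):
        scale = math.sqrt(k * (k - 1))
        # row k: (k-1) entries 1/scale, then -(k-1)/scale, then zeros
        rows.append([1.0 / scale] * (k - 1) + [-(k - 1) / scale] + [0.0] * (n - k))
    return rows

def mat_vec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def max_orthogonality_error(a):
    """Largest absolute entry of A A^T - I."""
    n = len(a)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a[i][k] * a[j][k] for k in range(n))
            worst = max(worst, abs(dot - (1.0 if i == j else 0.0)))
    return worst

x = [2.0, -1.0, 0.5, 3.0, 4.5]  # arbitrary illustrative data
n = len(x)
A = helmert_matrix(n)
y = mat_vec(A, x)
x_bar = sum(x) / n

print("orthogonality error:", max_orthogonality_error(A))
print("|y_1 - sqrt(n) * x_bar|:", abs(y[0] - math.sqrt(n) * x_bar))
print("|sum y_i^2 - sum x_i^2|:", abs(sum(v * v for v in y) - sum(v * v for v in x)))
```

All three printed quantities should be zero up to floating-point rounding.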

Property 1 (Distribution of the sample mean). The sample mean satisfies \(\bar{x}\sim\mathcal{N}\left(\mu,\dfrac{\sigma^2}{n}\right)\).

Proof 1. Since \(x_1,x_2,\cdots,x_n\) are independent with \(x_i\sim\mathcal{N}(\mu,\sigma^2)\), their sum is again normal: \[ x_1+x_2+\cdots+x_n\sim\mathcal{N}(n\mu,n\sigma^2), \] and hence \(\bar{x}=\dfrac{x_1+x_2+\cdots+x_n}{n}\sim\mathcal{N}\left(\mu,\dfrac{\sigma^2}{n}\right)\).

Proof 2. By the preliminaries, \(y_1=\sqrt{n}\cdot\bar{x}\sim\mathcal{N}(\sqrt{n}\mu,\sigma^2)\), so \(\bar{x}=\dfrac{y_1}{\sqrt{n}}\sim\mathcal{N}\left(\mu,\dfrac{\sigma^2}{n}\right)\).

Property 2 (Distribution of the sample variance). The sample variance \(s^2\) satisfies \(\dfrac{(n-1)s^2}{\sigma^2}\sim\chi^2(n-1)\).

Proof. Using the preliminaries, we compute \[ \begin{aligned} \dfrac{(n-1)s^2}{\sigma^2}&=\dfrac{1}{\sigma^2}\cdot\sum_{i=1}^{n}(x_i-\bar{x})^2\\ &=\dfrac{1}{\sigma^2}\cdot\left(\sum_{i=1}^{n}x_i^2-n\cdot\bar{x}^2\right)\\ &=\dfrac{1}{\sigma^2}\cdot\left(\sum_{i=1}^{n}y_i^2-y_1^2\right)\\ &=\sum_{i=2}^{n}\left(\dfrac{y_i}{\sigma}\right)^2, \end{aligned} \] where \(\dfrac{y_i}{\sigma}\sim\mathcal{N}(0,1)\) for \(i=2,\cdots,n\) and these \(n-1\) variables are mutually independent. A sum of \(n-1\) independent squared standard normal variables has the \(\chi^2(n-1)\) distribution, so \(\dfrac{(n-1)s^2}{\sigma^2}\sim\chi^2(n-1)\).
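The following Monte Carlo sketch illustrates Property 2 by comparing the empirical mean and variance of \(\dfrac{(n-1)s^2}{\sigma^2}\) with the known moments of \(\chi^2(n-1)\), namely \(n-1\) and \(2(n-1)\). The parameter values, the seed, and the helper names are assumptions chosen for illustration; matching two moments is a plausibility check, not a proof of the distribution.

```python
import random

def sample_variance(xs):
    """Sample variance s^2 with divisor n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((v - mean) ** 2 for v in xs) / (n - 1)

def scaled_variance_draws(n, mu, sigma, trials, seed=0):
    """Simulate (n-1) s^2 / sigma^2 for `trials` normal samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    draws = []
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        draws.append((n - 1) * sample_variance(xs) / sigma ** 2)
    return draws

n, mu, sigma = 5, 1.0, 2.0  # illustrative values
draws = scaled_variance_draws(n, mu, sigma, trials=20000)
emp_mean = sum(draws) / len(draws)
emp_var = sample_variance(draws)

print(f"empirical mean     {emp_mean:.3f}  (chi-square mean: {n - 1})")
print(f"empirical variance {emp_var:.3f}  (chi-square variance: {2 * (n - 1)})")
```

The result should not depend on `mu` or `sigma`, which is consistent with the statement of Property 2.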

Property 3 (The \(t\) distribution from the sample mean and sample variance). \(t=\dfrac{\sqrt{n}\cdot(\bar{x}-\mu)}{s}\sim t(n-1)\).

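Proof. We compute \[ t=\dfrac{\sqrt{n}\cdot(\bar{x}-\mu)}{s}=\dfrac{\dfrac{\sqrt{n}\cdot(\bar{x}-\mu)}{\sigma}}{\sqrt{\dfrac{1}{n-1}\cdot{\dfrac{(n-1)s^2}{\sigma^2}}}}, \] where \(\dfrac{\sqrt{n}\cdot(\bar{x}-\mu)}{\sigma}\sim\mathcal{N}(0,1)\) by Property 1 and \(\dfrac{(n-1)s^2}{\sigma^2}\sim\chi^2(n-1)\) by Property 2. Moreover, the numerator depends only on \(y_1\), while by the proof of Property 2 the denominator depends only on \(y_2,\cdots,y_n\), so the two are independent (this is Property 4). By the definition of the \(t\) distribution, \(t\sim t(n-1)\).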
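To make Property 3 concrete, the sketch below computes the \(t\) statistic for a small sample and then estimates by simulation how often \(|t|\) exceeds a cutoff. It assumes the standard tabulated value \(2.262\) for the \(0.975\) quantile of \(t(9)\), so with \(n=10\) the exceedance frequency should be close to \(0.05\); using the normal cutoff \(1.960\) instead should give a noticeably larger frequency, reflecting the heavier tails of \(t(9)\). Function names, parameter values, and the seed are illustrative assumptions.

```python
import math
import random

def t_statistic(xs, mu):
    """t = sqrt(n) * (x_bar - mu) / s, with s^2 using divisor n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((v - mean) ** 2 for v in xs) / (n - 1)
    return math.sqrt(n) * (mean - mu) / math.sqrt(s2)

def exceed_fraction(n, mu, sigma, cutoff, trials, seed=0):
    """Fraction of simulated normal samples of size n with |t| > cutoff."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(t_statistic(xs, mu)) > cutoff:
            hits += 1
    return hits / trials

# Hand-checkable case: x_bar = 3, s^2 = 2.5, so t = sqrt(5) / sqrt(2.5) = sqrt(2).
t_example = t_statistic([1.0, 2.0, 3.0, 4.0, 5.0], mu=2.0)
print("t for the small example:", t_example)

frac_t = exceed_fraction(n=10, mu=3.0, sigma=1.5, cutoff=2.262, trials=20000)
frac_z = exceed_fraction(n=10, mu=3.0, sigma=1.5, cutoff=1.960, trials=20000)
print("P(|t| > 2.262) estimate:", frac_t)
print("P(|t| > 1.960) estimate:", frac_z)
```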

Property 4 (Independence of the sample mean and sample variance). The sample mean \(\bar{x}\) and the sample variance \(s^2\) are independent.

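Proof. By the preliminaries, \(\bar{x}=\dfrac{y_1}{\sqrt{n}}\), and by the proof of Property 2, \[ s^2=\dfrac{1}{n-1}\cdot\sum_{i=2}^{n}y_i^2. \] Thus \(\bar{x}\) is a function of \(y_1\) alone and \(s^2\) is a function of \(y_2,\cdots,y_n\) alone. Since \(y_1\) is independent of \(y_2,\cdots,y_n\), it follows that \(\bar{x}\) and \(s^2\) are independent.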
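A rough empirical illustration of Property 4: independence implies zero correlation, so the simulated correlation between \(\bar{x}\) and \(s^2\) should be near zero for normal samples. For contrast, the sketch also draws samples from an exponential distribution, where the two statistics are clearly correlated, which suggests the property is special to the normal case. Note that a near-zero correlation is only a necessary consequence of independence and does not establish it. The helper names, sample size, and seed are illustrative assumptions.

```python
import math
import random

def mean_and_variance(xs):
    """Return (x_bar, s^2) with divisor n - 1 for the variance."""
    n = len(xs)
    mean = sum(xs) / n
    return mean, sum((v - mean) ** 2 for v in xs) / (n - 1)

def pearson(us, vs):
    """Pearson correlation coefficient of two equal-length lists."""
    m = len(us)
    mean_u, mean_v = sum(us) / m, sum(vs) / m
    cov = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    var_u = sum((u - mean_u) ** 2 for u in us)
    var_v = sum((v - mean_v) ** 2 for v in vs)
    return cov / math.sqrt(var_u * var_v)

def mean_variance_correlation(draw, n, trials):
    """Correlation between x_bar and s^2 over `trials` samples of size n."""
    means, variances = [], []
    for _ in range(trials):
        m, s2 = mean_and_variance([draw() for _ in range(n)])
        means.append(m)
        variances.append(s2)
    return pearson(means, variances)

rng = random.Random(0)  # fixed seed so the run is reproducible
corr_normal = mean_variance_correlation(lambda: rng.gauss(0.0, 1.0), n=5, trials=20000)
corr_expo = mean_variance_correlation(lambda: rng.expovariate(1.0), n=5, trials=20000)

print("normal samples:      corr(x_bar, s^2) =", round(corr_normal, 3))
print("exponential samples: corr(x_bar, s^2) =", round(corr_expo, 3))
```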