
STATISTICAL INFERENCE FOR SIMPLE LINEAR REGRESSION


Assume that the data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ are observations either of random points $(X, Y)$ or points $(x, Y)$ with random ordinates and that $Y_1, Y_2, \ldots, Y_n$ are independent, normally distributed random variables with mean

\begin{displaymath}E(Y_i\vert x_i) = \alpha + \beta x_i= \beta_0 + \beta_1 x_i \end{displaymath}

for each given $x_i$ and constant standard deviation $\sigma$. Alternatively, $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, where the $\epsilon_i$'s are independent normal random variables (errors) with mean 0 and standard deviation $\sigma$.

In brief, the $Y_i$'s are independent and $N(\alpha + \beta x_i, \sigma)$.
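
To make the model concrete, here is a minimal sketch that simulates responses from it using only the Python standard library. The function name `simulate_regression_data`, the parameter values, and the $x$ values are assumptions chosen for illustration; they are not part of the original notes.

```python
import random

def simulate_regression_data(xs, beta0, beta1, sigma, seed=0):
    """Draw Y_i = beta0 + beta1 * x_i + eps_i, where the eps_i are
    independent N(0, sigma) errors (sigma is a standard deviation)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical design points and parameter values, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = simulate_regression_data(xs, beta0=2.0, beta1=0.5, sigma=1.0)

for x, y in zip(xs, ys):
    print(f"x = {x:.1f}  mean = {2.0 + 0.5 * x:.2f}  observed y = {y:.3f}")
```

Each observed $y_i$ scatters around its own mean $\beta_0 + \beta_1 x_i$ with the same spread $\sigma$, which is exactly the constant-standard-deviation assumption.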



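General pattern for inference: To test $H_0: \theta = \theta_0$, use the statistic $\displaystyle T = \frac{\hat{\theta} - \theta_0}{SE_{\hat{\theta}}}$, which is referred to the $t$ distribution with df $= n-2$.

The endpoints of a confidence interval for $\theta$ are given by $\displaystyle \hat{\theta} \pm t^* SE_{\hat{\theta}}$, where $t^*$ is the critical value of the $t$ distribution with df $= n-2$ for the chosen confidence level.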

Here $SE_{\hat{\theta}}$ denotes the standard error of the statistic $\hat{\theta}$.
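
The general pattern can be sketched in a few lines of Python. The helper names `t_statistic` and `confidence_interval` and the numbers below are hypothetical, chosen only to illustrate the arithmetic; the critical value $t^*$ is supplied by hand from a $t$ table rather than computed.

```python
def t_statistic(estimate, null_value, std_error):
    """T = (theta_hat - theta_0) / SE(theta_hat)."""
    return (estimate - null_value) / std_error

def confidence_interval(estimate, std_error, t_star):
    """Endpoints theta_hat -/+ t* SE(theta_hat)."""
    margin = t_star * std_error
    return (estimate - margin, estimate + margin)

# Hypothetical values: a slope estimate and its standard error from n = 5 points.
b, se_b = 0.6, 0.2828
t_star = 3.182  # table value for 95% confidence with df = n - 2 = 3

t_stat = t_statistic(b, 0.0, se_b)            # test H_0: beta = 0
low, high = confidence_interval(b, se_b, t_star)

print(f"T = {t_stat:.3f}")
print(f"95% CI for beta: ({low:.3f}, {high:.3f})")
```

With these illustrative numbers the interval contains 0, which matches the fact that $|T|$ is smaller than $t^*$: a two-sided test at the 5% level would not reject $H_0: \beta = 0$.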



\begin{tabular}{lll}
Parameter & Estimate/statistic & Standard error \\
\hline
$\rho$ & $\displaystyle r = \frac{1}{n-1}\sum \left(\frac{x_i - \overline{x}}{s_x}\right) \left(\frac{y_i - \overline{y}}{s_y}\right)$ & \\
$\beta = \beta_1$ & $\displaystyle b = b_1 = r \frac{s_y}{s_x}$ & $\displaystyle SE_b = \frac{s}{\sqrt{\sum (x_i - \overline{x})^2}} = \frac{s}{s_x \sqrt{n-1}}$ \\
$\alpha = \beta_0$ & $a = b_0 = \overline{y} - b \overline{x}$ & $\displaystyle SE_a = s \sqrt{\frac{1}{n} + \frac{\overline{x}^2}{\sum (x_i - \overline{x})^2}}$ \\
$\mu_y = E(Y\vert x)$ & $\hat{\mu} = \hat{\mu}_y = a + bx = b_0 + b_1 x$ & $\displaystyle SE_{\hat{\mu}} = s \sqrt{\frac{1}{n} + \frac{(x-\overline{x})^2}{\sum (x_i - \overline{x})^2}}$ \\
$y$ & $\hat{y} = a + bx = b_0 + b_1 x$ & $\displaystyle SE_{\hat{y}} = s \sqrt{1 + \frac{1}{n} + \frac{(x-\overline{x})^2}{\sum (x_i - \overline{x})^2}}$ \\
(not actually a parameter) & (predicted value) & (standard error for predicted value) \\
$\sigma$ & $\displaystyle s = \sqrt{ \frac{\sum (y_i - \hat{y}_i)^2}{n-2}}$ & \\
\end{tabular}
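
The estimates and standard errors in the table can be computed directly from data. The following is a minimal sketch using only the Python standard library; the function name `simple_linear_regression` and the five data points are assumptions made for illustration. Note that $SE_b$ is computed as $s / \sqrt{\sum (x_i - \overline{x})^2}$, the form that agrees with $s / (s_x \sqrt{n-1})$.

```python
import math

def simple_linear_regression(xs, ys):
    """Return the estimates and standard errors for simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)   # sum of (x_i - x_bar)^2
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    s_x = math.sqrt(sxx / (n - 1))            # sample standard deviations
    s_y = math.sqrt(syy / (n - 1))
    r = sxy / ((n - 1) * s_x * s_y)           # sample correlation
    b = r * s_y / s_x                         # slope estimate
    a = y_bar - b * x_bar                     # intercept estimate

    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))              # estimate of sigma

    def se_mean(x):
        """Standard error of the estimated mean response at x."""
        return s * math.sqrt(1 / n + (x - x_bar) ** 2 / sxx)

    def se_pred(x):
        """Standard error for predicting a single new response at x."""
        return s * math.sqrt(1 + 1 / n + (x - x_bar) ** 2 / sxx)

    return {
        "r": r, "b": b, "a": a, "s": s,
        "se_b": s / math.sqrt(sxx),
        "se_a": s * math.sqrt(1 / n + x_bar ** 2 / sxx),
        "se_mean": se_mean, "se_pred": se_pred,
    }

# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
fit = simple_linear_regression(xs, ys)

for name in ("r", "b", "a", "s", "se_b", "se_a"):
    print(f"{name} = {fit[name]:.4f}")
print(f"SE of mean response at x = 3: {fit['se_mean'](3.0):.4f}")
print(f"SE of prediction at x = 3:    {fit['se_pred'](3.0):.4f}")
```

The prediction standard error is always larger than the standard error of the mean response at the same $x$, because it includes the extra term $1$ under the square root for the variability of a single new observation.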
     




John Holte 2004-05-11