First-Difference Estimator

The First-Difference (FD) estimator is obtained by running a pooled OLS regression of $$\Delta y_{it}$$ on $$\Delta x_{it}$$.

The FD estimator removes time-invariant omitted variables $$c_{i}$$ by exploiting the repeated observations over time:

$$y_{it}=x_{it}\beta + c_{i}+ u_{it}, \quad t=1,\dots,T$$ $$y_{it-1}=x_{it-1}\beta + c_{i}+u_{it-1}, \quad t=2,\dots,T$$ Subtracting the second equation from the first, we get: $$\Delta y_{it}=y_{it}-y_{it-1}=\Delta x_{it}\beta + \Delta u_{it}, \quad t=2,\dots,T$$ which removes the unobserved $$c_{i}$$, since it appears identically in both periods.

The FD estimator $$\hat{\beta}_{FD}$$ is then obtained by regressing changes on changes using OLS: $$\hat{\beta}_{FD} = (\Delta X'\Delta X)^{-1}\Delta X' \Delta y$$ Note that the rank condition $$rank[\Delta X'\Delta X]=K$$ must hold for $$\Delta X'\Delta X$$ to be invertible, where $$K$$ is the number of regressors. In particular, regressors that do not vary over time are differenced away and cannot be included.

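Under homoscedasticity and no serial correlation in $$\Delta u_{it}$$, the asymptotic variance can be estimated by $$Av\hat{a}r(\hat{\beta}_{FD})=\hat{\sigma}^{2}_{u}(\Delta X'\Delta X)^{-1}$$ where $$\hat{\sigma}^{2}_{u} = [n(T-1)-K]^{-1}\hat{u}'\hat{u}$$ and $$\hat{u}$$ is the vector of residuals from the first-differenced regression, so that $$\hat{\sigma}^{2}_{u}$$ estimates the variance of $$\Delta u_{it}$$.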
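
The estimator and variance formulas above can be illustrated with a minimal NumPy sketch. The function name `first_difference_ols`, the array layout (`y` of shape `(n, T)`, `X` of shape `(n, T, K)`), and the simulated balanced panel are assumptions made for illustration only; they are not part of any particular library.

```python
import numpy as np

def first_difference_ols(y, X):
    """Pooled OLS of first-differenced y on first-differenced X.

    y: array of shape (n, T); X: array of shape (n, T, K).
    Hypothetical helper for illustration; assumes a balanced panel.
    """
    n, T, K = X.shape
    # Difference along the time axis, then stack the n*(T-1) observations.
    dy = np.diff(y, axis=1).reshape(-1)
    dX = np.diff(X, axis=1).reshape(-1, K)
    XtX = dX.T @ dX
    beta_hat = np.linalg.solve(XtX, dX.T @ dy)
    # Residuals from the differenced regression.
    resid = dy - dX @ beta_hat
    sigma2_hat = resid @ resid / (n * (T - 1) - K)
    # Variance formula that assumes homoscedastic, serially uncorrelated
    # differenced errors (not robust to violations of either).
    avar_hat = sigma2_hat * np.linalg.inv(XtX)
    return beta_hat, avar_hat

# Simulated data (assumed values): regressors are correlated with c_i,
# so pooled OLS in levels would be biased, while FD removes c_i.
rng = np.random.default_rng(0)
n, T, K = 500, 4, 2
beta = np.array([1.5, -0.7])
c = rng.normal(size=(n, 1))
X = rng.normal(size=(n, T, K)) + c[:, :, None]
u = rng.normal(scale=0.5, size=(n, T))
y = X @ beta + c + u

beta_hat, avar_hat = first_difference_ols(y, X)
std_err = np.sqrt(np.diag(avar_hat))
```

In this simulation the level errors are serially uncorrelated, so the differenced errors are not, and the reported standard errors are shown only to mirror the formula; in applied work, standard errors clustered by unit would typically be preferred.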

Properties
Under the assumption $$E[u_{it}-u_{it-1}|x_{it}-x_{it-1}]=0$$, the FD estimator is unbiased, i.e. $$E[\hat{\beta}_{FD}]=\beta$$. Note that this assumption is less restrictive than the strict exogeneity assumption required for unbiasedness of the fixed effects (FE) estimator.

Relation to fixed effects estimator
For $$T=2$$, the FD and FE estimators are numerically equivalent.
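
The equivalence for $$T=2$$ can be checked numerically. The following is a small sketch on simulated data (all values and variable names are assumptions), comparing the FD estimate with the within (FE) estimate computed by demeaning each unit's observations; neither regression includes an intercept or time effects.

```python
import numpy as np

# Simulated two-period panel (assumed values for illustration).
rng = np.random.default_rng(1)
n, K = 200, 2
c = rng.normal(size=(n, 1))
X = rng.normal(size=(n, 2, K)) + c[:, :, None]
y = X @ np.array([2.0, -1.0]) + c + rng.normal(size=(n, 2))

# First-difference estimator: regress (y_i2 - y_i1) on (x_i2 - x_i1).
dX = X[:, 1, :] - X[:, 0, :]
dy = y[:, 1] - y[:, 0]
beta_fd = np.linalg.lstsq(dX, dy, rcond=None)[0]

# Fixed effects (within) estimator: subtract each unit's time mean,
# then run pooled OLS on the demeaned data.
X_w = (X - X.mean(axis=1, keepdims=True)).reshape(-1, K)
y_w = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
beta_fe = np.linalg.lstsq(X_w, y_w, rcond=None)[0]

# With T = 2 the demeaned values are +/- half the first difference,
# so both estimators solve the same normal equations.
same = np.allclose(beta_fd, beta_fe)
```

The reason is that, with two periods, $$x_{i2}-\bar{x}_{i}=\Delta x_{i2}/2$$ and $$x_{i1}-\bar{x}_{i}=-\Delta x_{i2}/2$$, so the within-transformed cross products are a common multiple of the differenced ones and the scaling cancels.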

Under the assumption of homoscedasticity and no serial correlation in $$u_{it}$$, the FE estimator is more efficient than the FD estimator, because first-differencing serially uncorrelated errors induces serial correlation in $$\Delta u_{it}$$. If $$u_{it}$$ follows a random walk, however, the FD estimator is more efficient, as the $$\Delta u_{it}$$ are then serially uncorrelated while the $$u_{it}$$ themselves are serially correlated.
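
The two cases in the efficiency comparison can be made explicit with a short derivation. The symbols $$\sigma^{2}$$ (the variance of the level errors) and $$e_{it}$$ (the random-walk innovation) are introduced here for illustration. If the $$u_{it}$$ are serially uncorrelated with $$Var(u_{it})=\sigma^{2}$$, then $$Var(\Delta u_{it})=Var(u_{it})+Var(u_{it-1})=2\sigma^{2}$$ and $$Cov(\Delta u_{it},\Delta u_{it-1})=Cov(u_{it}-u_{it-1},\,u_{it-1}-u_{it-2})=-Var(u_{it-1})=-\sigma^{2}$$ so adjacent differenced errors have correlation $$-1/2$$, and pooled OLS on the differenced data is not efficient. If instead $$u_{it}=u_{it-1}+e_{it}$$ with serially uncorrelated $$e_{it}$$, then $$\Delta u_{it}=e_{it}$$ so the differenced errors are serially uncorrelated and pooled OLS on the differenced equation is the efficient choice.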

In practice, the FD estimator is easier to implement without special software, since the only transformation required is first-differencing the data.