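FD is FE’s cousin. In Stata, the usual way to estimate it is manual first-differencing with reg d.y d.x after xtset; official xtreg has no fd option (xtivreg does). How Stata handles time gaps matters here: the d. operator returns missing whenever the previous period is absent, so observations next to a gap drop out of the FD sample but stay in the FE sample. For T=2, FD and FE give identical estimates. For T>2, FD is less efficient than FE if the idiosyncratic errors are serially uncorrelated. But if the errors follow a random walk, FD beats FE. Most Stata users never check.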
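A minimal sketch of the FD versus FE comparison, assuming the nlswork example dataset that Stata serves through webuse. The variables idcode, year, ln_wage, tenure, and age come from that dataset, not from this text, and the stored-estimate names fd_est and fe_est are arbitrary. It illustrates the mechanics on one dataset and is not a general result.

```stata
* Load an example panel and declare its structure
webuse nlswork, clear
xtset idcode year

* First differences: D. is missing wherever the previous year is absent
regress D.ln_wage D.tenure D.age, vce(cluster idcode)
estimates store fd_est

* Fixed effects (within estimator) on the same variables
xtreg ln_wage tenure age, fe vce(cluster idcode)
estimates store fe_est

* Coefficients, standard errors, and sample sizes side by side
estimates table fd_est fe_est, se stats(N)
```

Because this panel has gaps between survey years, the two estimation samples differ in size, which is worth checking before comparing the estimates. The constant in the differenced regression plays the role of a linear time trend in levels.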
Before running any analysis, you must tell Stata which variable identifies the units and which identifies the time.
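As a short illustration, assuming the nlswork example dataset from webuse (its panel identifier idcode and time variable year are specific to that dataset), the declaration could look like this:

```stata
* Load an example panel
webuse nlswork, clear

* Syntax: xtset panelvar timevar
xtset idcode year

* Show how many periods each unit is observed and where the gaps are
xtdescribe
```

Once xtset has been run, time-series operators such as L. and D. and the xt commands use this declaration automatically.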
Panel data often exhibit serial correlation: a unit's error in one period is correlated with its error in the previous period.
* For Fixed Effects models
xtserial y x1 x2
(Note: xtserial is a user-written command. Install via ssc install xtserial; if SSC does not carry it, search xtserial lists the package that provides it.)
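If significant serial correlation exists, use standard errors that are robust to it or a model that accounts for it. With xtreg, vce(robust) is equivalent to clustering on the panel variable, which allows arbitrary correlation within each unit over time.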
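A minimal sketch of the clustered-standard-error route, again assuming the nlswork example dataset and its variable names (idcode, year, ln_wage, tenure, age), which are not from this text:

```stata
webuse nlswork, clear
xtset idcode year

* Cluster on the panel unit so the standard errors allow
* arbitrary serial correlation within each unit
xtreg ln_wage tenure age, fe vce(cluster idcode)
```

The coefficients are the same as in the default fixed-effects fit; only the standard errors change.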
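The Hausman test helps determine whether FE or RE is appropriate. A rejection indicates that the unit effects are correlated with the regressors, so RE is inconsistent and FE is preferred.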
Stata Workflow:
* 1. Run Fixed Effects
quietly xtreg y x, fe
* 2. Store the estimates
estimates store fixed
* 3. Run Random Effects
quietly xtreg y x, re
* 4. Store the estimates
estimates store random
* 5. Run the Hausman test
hausman fixed random
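For a concrete run of this workflow, here is a sketch on the nlswork example dataset (the dataset and its variables are assumptions for illustration). It adds the sigmamore option, which bases both covariance matrices on the disturbance variance from the efficient estimator and can help when the default test statistic comes out negative.

```stata
webuse nlswork, clear
xtset idcode year

* Fit both models with the default (non-robust) VCE,
* which the hausman command requires
quietly xtreg ln_wage tenure age, fe
estimates store fixed
quietly xtreg ln_wage tenure age, re
estimates store random

* Consistent estimator first, efficient estimator second
hausman fixed random, sigmamore
```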
* Summary statistics split into overall, between, and within variation
xtsum
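A brief sketch of xtsum on named variables, assuming the nlswork example dataset (ln_wage and tenure are variables from that dataset):

```stata
webuse nlswork, clear
xtset idcode year

* overall = all observations pooled
* between = variation across unit means
* within  = variation around each unit's own mean
xtsum ln_wage tenure
```

A regressor with little or no within variation cannot be estimated well by FE, since the within transformation removes everything that is constant for a unit.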
Before typing a single command, you must grasp how Stata "thinks" about panel data.