Lecture 1: Intro (Chapter 1: Introduction to Econometrics)
Outline
- The Nature and Purpose of Econometrics
- Data
- Returns in Financial Modeling
- Steps involved in the formulation of econometric models
1.1 The Concept of Econometrics and Econometric Modeling
Econometrics = use of statistical methods to analyze economic data
1.1.1 Steps of Econometric Modeling
1.2 Applications of Econometrics in Finance
Econometrics
- Econometrics and financial econometrics
- Microeconometrics and Macroeconometrics
- Theoretical econometrics and applied econometrics
1.3 Data and Data Acquisition
The Nature of Econometrics and Economic Data
Typical goals of econometric analysis:
- Testing economic theories and hypotheses, estimating relationships between economic variables
- Evaluating and implementing government and business policy
- Forecasting economic variables
Different kinds of economic data sets:
- Cross-sectional data
- Time series data
- Pooled cross sections
- Panel/Longitudinal data
1.4 Overview of Related Software
Lecture 2: A Brief Review of the Classical Linear Regression Model (Chapter 2: The Linear Regression Model)
Outline:
- Linear Regression
- Ordinary Least Squares
- The Assumptions Underlying CLRM
- Properties of the OLS Estimator
- Hypothesis Test
2.0 Finding a Line of Best Fit
$$y = \alpha + \beta x \qquad (1)$$
Note: $\alpha$ and $\beta$ are "coefficients", where $\alpha$ is the "constant" or "intercept" term and $\beta$ is the "slope coefficient":
$$\beta = \frac{\Delta y}{\Delta x} = \frac{y_{t+1}-y_{t}}{x_{t+1}-x_t}$$
2.0.1 Adding a Random Disturbance Term $\epsilon_t$
$$y_t = \alpha + \beta x_t + \epsilon_t$$
where $t = 1, \dots, T$.
The disturbance term can capture a number of features:
- Other potentially important explanatory variables may be missing (e.g., Z and W)
- Measurement error
- Incorrect functional form
- Purely random and totally unpredictable occurrences
2.0.2 Single Equation Linear Model
Components:
- deterministic: $\alpha + \beta x_t$
- stochastic/random: $\epsilon_t$

The deterministic part is so called because it is the conditional expectation of $y$ given $x$, i.e., $E(y_t \mid x_t) = \alpha + \beta x_t$.
2.1 Specifying and Estimating the Multiple Regression Model: Different Ways of Expressing the Multiple Linear Regression Model
2.1.1 The OLS Estimator for the Multiple Regression Model
2.1.2 Calculating the Standard Errors for the Multiple Regression Model
2.2 Goodness of Fit of the Regression Model
2.3 Classical Assumptions of OLS and Properties of the Estimator
2.3.1 Ordinary Least Squares
population regression -> sample/estimated regression
The empirical counterpart to Equation (1) is
$$y_t = \hat{\alpha} + \hat{\beta}x_t + \hat{\epsilon}_t, \qquad \hat{y}_t = \hat{\alpha} + \hat{\beta}x_t$$
The objective function to minimize is
$$\min L = \min\sum_t \hat{\epsilon}_t^2 = \min\sum_t \big(y_t - \hat{y}_t\big)^2 = \min\sum_t \big(y_t - \hat{\alpha} - \hat{\beta}x_t\big)^2$$
Differentiating with respect to $\hat{\alpha}$ and $\hat{\beta}$ and setting both derivatives to zero yields
$$\hat{\beta} = \frac{\sum x_t y_t - T\overline{x}\,\overline{y}}{\sum x_t^2 - T\overline{x}^2}$$
$$\hat{\alpha} = \overline{y} - \hat{\beta}\,\overline{x}$$
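To make this concrete, here is a minimal sketch (not from the lecture) applying the two closed-form formulas to simulated data; the true values $\alpha = 1$, $\beta = 2$ and all variable names are illustrative assumptions:

```python
import numpy as np

# Simulate data from y_t = 1.0 + 2.0 * x_t + eps_t (assumed true values).
rng = np.random.default_rng(0)
T = 200
x = rng.normal(size=T)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=T)

# Closed-form OLS estimates from the formulas above.
beta_hat = (np.sum(x * y) - T * x.mean() * y.mean()) / (np.sum(x**2) - T * x.mean()**2)
alpha_hat = y.mean() - beta_hat * x.mean()
print(alpha_hat, beta_hat)  # close to 1.0 and 2.0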
2.3.2 The CAPM in Econometrics
The CAPM states
$$E(r_i) = r_f + \beta\,(E(r_m) - r_f)$$
Its econometric counterpart regresses realized excess returns on excess market returns:
$$r_{i,t} - r_{f,t} = \hat{\alpha} + \hat{\beta}\,(r_{m,t} - r_{f,t}) + \hat{\epsilon}_t$$
where $\hat{\beta} = \frac{\mathrm{cov}(r_i, r_m)}{\mathrm{var}(r_m)}$.
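A sketch of how this regression might be run in practice, assuming Python with numpy and statsmodels; the return series are simulated, with an assumed true beta of 1.2:

```python
import numpy as np
import statsmodels.api as sm

# Simulated returns: market, risk-free, and an asset with beta = 1.2
# and a small positive alpha (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n = 500
r_m = rng.normal(0.005, 0.04, size=n)    # market returns
r_f = np.full(n, 0.0002)                 # risk-free rate
r_i = r_f + 0.001 + 1.2 * (r_m - r_f) + rng.normal(0, 0.02, size=n)

# Regress excess asset returns on excess market returns.
X = sm.add_constant(r_m - r_f)
res = sm.OLS(r_i - r_f, X).fit()
print(res.params)  # [alpha-hat, beta-hat]; beta-hat should be near 1.2
```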
2.3.3 The Relationship Between Regression and Correlation
$$\hat{\beta} = \frac{\mathrm{cov}(x,y)}{\mathrm{cov}(x,x)} = \frac{\mathrm{cov}(x,y)}{\sqrt{\mathrm{cov}(x,x)}\sqrt{\mathrm{cov}(y,y)}} \times \frac{\sqrt{\mathrm{cov}(y,y)}}{\sqrt{\mathrm{cov}(x,x)}} = r_{x,y}\,\frac{S_y}{S_x}$$
- PRF: Population Regression Function
$$Y_t = \alpha + \beta x_t + u_t$$
- SRF: Sample Regression Function
$$\hat{Y}_t = \hat{\alpha} + \hat{\beta} x_t$$

A sample is a subset drawn from the population, so we use the SRF to make inferences about the PRF.
2.3.4 Estimators & Estimates
- Estimators: random/stochastic quantities whose values vary from sample to sample
- Estimates: the actual numerical values obtained from a particular sample
2.3.5 The Assumptions Underlying the CLRM (Classical Linear Regression Model)
The seven classical assumptions are:
- The regression model is linear, is correctly specified, and has an additive error term
- The error term has a zero population mean
- All explanatory variables are uncorrelated with the error term
- Observations of the error term are uncorrelated with each other (no serial correlation)
- The error term has a constant variance (no heteroskedasticity)
- No explanatory variable is a perfect linear function of any other explanatory variable(s) (no perfect multicollinearity)
- The error term is normally distributed(this assumption is optional but usually is invoked)
In symbols:
(2) $E(\epsilon_t) = 0$
(3) $E(\epsilon_t \mid x_t) = 0$, equivalently $\mathrm{cov}(\epsilon_t, x_t) = 0$
(4) $\mathrm{cov}(\epsilon_t, \epsilon_{t+k}) = 0$ for $k \geq 1$
(5) $\mathrm{var}(\epsilon_t) = \sigma^2$ (the heteroskedastic case $\mathrm{var}(\epsilon_t) = \sigma_t^2$ violates this)
(6) $x_i \neq r_1 x_{k_1} + r_2 x_{k_2} + r_3 x_{k_3} + \dots$ (no regressor is an exact linear combination of the others)
(7) $\epsilon_t \sim N(0, \sigma^2)$
2.3.6 Properties of the OLS Estimator
If Assumptions 1 through 6 hold, the OLS estimators $\hat{\alpha}$ and $\hat{\beta}$ are BLUE.
BLUE: Best Linear Unbiased Estimator
- estimator: $\hat{\beta}$ is an estimator of the true value $\beta$
- linear: $\hat{\beta}$ is a linear function of the data
- unbiased: on average, $\hat{\alpha}$ and $\hat{\beta}$ equal the true values $\alpha$ and $\beta$
- best: the OLS estimator has minimum variance among the class of linear unbiased estimators
The Gauss-Markov theorem proves that the OLS estimator is best.
$$r_{i,t} - r_{f,t} = \alpha + \beta(r_{m,t} - r_{f,t}) + \epsilon_t$$
where $\alpha$ captures mispricing relative to the market and $\epsilon_t \sim iid(0, \sigma^2)$.
$$Y_t = \hat{\alpha} + \hat{\beta} X_t + \hat{\epsilon}_t$$
If $\epsilon_t \sim iid(0, \sigma^2)$ and $\overline{X} = 0$, then $\hat{\alpha} = \overline{Y}$, the sample mean of $Y_t$ (since $\hat{\alpha} = \overline{Y} - \hat{\beta}\overline{X}$).
2.3.7 Addendum: Matrix Derivation
$$\hat{\beta} = (X'X)^{-1}X'Y = (X'X)^{-1}X'(X\beta + \epsilon) = \beta + (X'X)^{-1}X'\epsilon$$
$$E(\hat{\beta}) = \beta + E\big((X'X)^{-1}X'\epsilon\big)$$
Since $E(\epsilon) = 0$, we get $E(\hat{\beta}) = \beta$, i.e., OLS is unbiased.
For the variance, using $\hat{\beta} - \beta = (X'X)^{-1}X'\epsilon$:
$$Var(\hat{\beta}) = E\big[(\hat{\beta}-\beta)(\hat{\beta}-\beta)'\big] = (X'X)^{-1}X'E(\epsilon\epsilon')X(X'X)^{-1}$$
$$\epsilon = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}, \qquad E(\epsilon) = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
$$Var(\epsilon) = E(\epsilon\epsilon') = E\begin{bmatrix} \epsilon_1^2 & \epsilon_1\epsilon_2 & \cdots & \epsilon_1\epsilon_n \\ \vdots & \epsilon_2^2 & \cdots & \vdots \\ \epsilon_n\epsilon_1 & \cdots & \cdots & \epsilon_n^2 \end{bmatrix} = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ 0 & \cdots & \cdots & \sigma^2 \end{bmatrix} = \sigma^2 I$$
We therefore get
$$Var(\hat{\beta}) = (X'X)^{-1}X'(\sigma^2 I)X(X'X)^{-1} = \sigma^2(X'X)^{-1}$$
Because the population model
$$y = X\beta + \epsilon, \qquad \epsilon \sim iid(0, \sigma^2)$$
involves the unknown $\sigma^2$, in the fitted model
$$y = X\hat{\beta} + \hat{\epsilon}$$
we replace $\sigma^2$ with its residual-based estimate $\hat{\sigma}^2$.
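A small numerical check of this derivation, as a sketch with simulated data (the design matrix and true coefficients are illustrative assumptions):

```python
import numpy as np

# Simulate y = X beta + eps with assumed beta = (1.0, 0.5, -0.3).
rng = np.random.default_rng(2)
n, k = 500, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, 0.5, -0.3])
y = X @ beta + rng.normal(size=n)

# beta-hat = (X'X)^{-1} X'y and Var(beta-hat) = sigma2-hat * (X'X)^{-1}.
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k)           # degrees-of-freedom corrected
print(beta_hat)                                # near the assumed true beta
print(np.sqrt(np.diag(sigma2_hat * XtX_inv)))  # standard errors
```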
2.3.8 Precision and Standard Errors
2.4 Hypothesis Testing
2.4.1 One-Sided Hypothesis Tests
2.4.2 The Probability Distribution of the Least Squares Estimators
2.4.3 Testing Hypotheses: The Test of Significance Approach
2.4.4 The Confidence Interval Approach to Hypothesis Testing
2.4.5 Testing Hypotheses
2.4.6 The Errors That We Can Make Using Hypothesis Tests
2.5 Hands-On Practice
Lecture 3: Further Development and Analysis of the Classical Linear Regression Model (Chapter 3: Model Diagnostic Tests)
Classical linear regression model assumptions and diagnostics
- The regression model is linear, is correctly specified, and has an additive error term
- The error term has a zero population mean
- All explanatory variables are uncorrelated with the error term
- Observations of the error term are uncorrelated with each other (no serial correlation)
- The error term has a constant variance (no heteroskedasticity)
- No explanatory variable is a perfect linear function of any other explanatory variable(s) (no perfect multicollinearity)
- The error term is normally distributed(this assumption is optional but usually is invoked)
3.1 Violations of the Classical Assumptions
$$Y = X\beta + \epsilon$$
$$\hat{\beta} = (X'X)^{-1}X'Y = \beta + (X'X)^{-1}X'\epsilon$$
$$Var(\hat{\beta}) = E\big[(X'X)^{-1}X'\epsilon\epsilon'X(X'X)^{-1}\big]$$
$$= (X'X)^{-1}X'E(\epsilon\epsilon')X(X'X)^{-1}$$
$$= (X'X)^{-1}X'\Omega X(X'X)^{-1}$$
When $\Omega \neq \sigma^2 I$, this no longer collapses to $\sigma^2(X'X)^{-1}$, so the usual OLS standard errors are wrong.
3.2 Multicollinearity
3.2.1 Types of multicollinearity
- Perfect multicollinearity, e.g. $x_3 = 2x_2$
- Imperfect multicollinearity, e.g. $x_{1i} = x_{2i} + w_i$, where $w_i$ is a random term
3.2.2 Five Consequences of Multicollinearity
- Estimates will remain unbiased
- Variances and Standard errors will increase
- t-scores will fall
- Estimates will be very sensitive to changes in specification
- The overall fit of the equation will be largely unaffected
3.2.3 Detection
- High simple correlation coefficients
- High Variance Inflation Factors (VIFs); see the sketch below
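A sketch of the VIF check in Python with statsmodels; the data are simulated and x2 is built to be nearly collinear with x1, so its VIF should be large (a common rule of thumb flags VIF above 5 or 10):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# x2 is x1 plus small noise, i.e. imperfect multicollinearity.
rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)
x3 = rng.normal(size=n)
X = sm.add_constant(np.column_stack([x1, x2, x3]))

for i in range(1, X.shape[1]):  # skip the constant term
    print(f"VIF(x{i}) = {variance_inflation_factor(X, i):.1f}")
```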
3.2.4 Solutions to the Problems of Multicollinearity
- Ridge regression / PCA
- Collect more data
- Transform the data
- Drop one of the collinear variables
3.3 Serial Correlation (Autocorrelation)
We assumed of the CLRM's errors that $Cov(u_i, u_j) = 0$ for $i \neq j$; this is essentially the same as saying there is no pattern in the errors.
Obviously we never observe the actual $u$'s, so we use their sample counterparts, the residuals ($\hat{u}_t$). If there are patterns in the residuals from a model, we say that they are autocorrelated.
3.3.1 Types of Serial Correlation
1. Pure serial correlation:
$$\epsilon_t = \rho \epsilon_{t-1} + v_t$$
where $\rho$ is the first-order autocorrelation coefficient, with
$$-1 < \rho < 1$$
2. Impure serial correlation, caused by specification error such as:
- an omitted variable
- an incorrect functional form
3.3.2 Detecting Autocorrelation
3.3.2.1 The Durbin-Watson Test
The Durbin-Watson (DW) test is a test for first-order autocorrelation, i.e., it assumes the relationship is between an error and the previous one:
$$u_t = \rho u_{t-1} + v_t$$
where $v_t \sim N(0, \sigma_v^2)$.
The DW test statistic actually tests
$$H_0: \rho = 0 \quad \text{vs.} \quad H_1: \rho \neq 0$$
The test statistic is calculated by
$$DW = \frac{\sum_{t=2}^{T}(\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=2}^{T}\hat{u}_t^2}$$
We can also write
$$DW \approx 2(1-\hat{\rho})$$
so DW near 2 indicates no first-order autocorrelation.
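A sketch of the DW statistic on simulated AR(1) errors with an assumed $\rho = 0.7$, so the statistic should come out near $2(1 - 0.7) = 0.6$ (using statsmodels' durbin_watson on OLS residuals):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Simulate a regression whose errors follow u_t = 0.7 u_{t-1} + v_t.
rng = np.random.default_rng(4)
T = 500
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
print(durbin_watson(res.resid))  # well below 2 -> positive autocorrelation
```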
3.3.2.2 The Breusch-Godfrey Test
This is a more general test for $r$th-order autocorrelation:
$$u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \dots + \rho_r u_{t-r} + v_t$$
where $v_t \sim N(0, \sigma_v^2)$.
The null and alternative hypotheses are:
- $H_0: \rho_1 = 0 \text{ and } \rho_2 = 0 \text{ and } \dots \text{ and } \rho_r = 0$
- $H_1: \rho_1 \neq 0 \text{ or } \rho_2 \neq 0 \text{ or } \dots \text{ or } \rho_r \neq 0$
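A sketch of the Breusch-Godfrey test in statsmodels on the same kind of simulated AR(1)-error regression; the lag order r = 4 is an arbitrary illustrative choice:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

# Regression with AR(1) disturbances (rho = 0.7 is an assumption).
rng = np.random.default_rng(5)
T = 500
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
lm_stat, lm_pval, _, _ = acorr_breusch_godfrey(res, nlags=4)
print(f"LM = {lm_stat:.2f}, p-value = {lm_pval:.4f}")  # small p -> reject H0
```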
3.3.2.3 LM-Test
3.3.3 Remedies for Autocorrelation
3.3.3.1 Quasi-differencing
3.3.3.2 GLS
$$y_t = \beta x_t + \epsilon_t \quad (1)$$
$$\epsilon_t = \rho \epsilon_{t-1} + v_t \quad (2)$$
Lag (1) one period and multiply by $\rho$:
$$\rho y_{t-1} = \rho \beta x_{t-1} + \rho \epsilon_{t-1} \quad (3)$$
Subtracting (3) from (1) and using (2), we obtain
$$y_t - \rho y_{t-1} = \beta(x_t - \rho x_{t-1}) + v_t \quad (4)$$
The slope $\beta$ is the same in (1) and (4); if an intercept $\alpha$ is included, however, it becomes $\alpha(1-\rho)$ in the transformed equation.
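A sketch of one Cochrane-Orcutt-style iteration of this transformation, on simulated data; $\hat{\rho}$ is estimated from the OLS residuals and then used to quasi-difference $y$ and $x$:

```python
import numpy as np
import statsmodels.api as sm

# Simulate y_t = 1.0 + 2.0 x_t + eps_t with AR(1) errors (rho = 0.7).
rng = np.random.default_rng(6)
T = 500
x = rng.normal(size=T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = 0.7 * eps[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + eps

# Step 1: OLS, then estimate rho from the residuals.
u_hat = sm.OLS(y, sm.add_constant(x)).fit().resid
rho_hat = np.sum(u_hat[1:] * u_hat[:-1]) / np.sum(u_hat[:-1] ** 2)

# Step 2: quasi-difference as in (4) and re-run OLS. The fitted
# constant now estimates alpha * (1 - rho).
y_star = y[1:] - rho_hat * y[:-1]
x_star = x[1:] - rho_hat * x[:-1]
res_gls = sm.OLS(y_star, sm.add_constant(x_star)).fit()
print(rho_hat, res_gls.params)  # rho-hat near 0.7, slope near 2.0
```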
3.3.3.3 Adding lagged dependent variables
3.3.3.4 The Newey-West Heteroskedasticity and Autocorrelation Consistent (HAC) Covariance Estimator
3.3.4 Dynamic Specifications
3.3.4.1 Dynamic Models
Static models:
$$y_t = \beta_1 + \beta_2x_{2t} + … + \beta_kx_{kt} + u_t$$
Dynamic models:
$$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + \gamma_1 y_{t-1} + \dots + \gamma_k x_{k,t-1} + u_t$$
which nests quasi-differenced combinations such as $(y_t - \rho y_{t-1})$ or $(y_t - \gamma_1 y_{t-1})$.
3.3.4.2 Models in First-Difference Form
Denote the first differences $\Delta y_t = y_t - y_{t-1}$ and $\Delta x_{2t} = x_{2t} - x_{2,t-1}$.
The model would now be:
$$\Delta y_t = \beta_1 + \beta_2 \Delta x_{2t} + \dots + \beta_k \Delta x_{kt} + u_t$$
Sometimes the change in $y$ is purported to depend on previous values of $y$ or $x_t$ as well as changes in $x$:
$$\Delta y_t = \beta_1 + \beta_2 \Delta x_{2t} + \beta_3 x_{2,t-1} + \beta_4 y_{t-1} + u_t$$
3.4 Heteroskedasticity
3.4.0 The Difference Between Heteroskedasticity and Homoskedasticity
- Homoskedasticity
$$\Omega = E(\epsilon\epsilon') = E\left(\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}[\epsilon_1, \dots, \epsilon_n]\right) = \begin{bmatrix} \sigma^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma^2 \end{bmatrix} = \sigma^2\begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{bmatrix} = \sigma^2 I$$
And
$$Var(\hat{\beta}) = (X'X)^{-1}X'\sigma^2 I X(X'X)^{-1} = \sigma^2(X'X)^{-1}$$
- Heteroskedasticity
$Cov(\epsilon_i, \epsilon_j) = 0$ for $i \neq j$, and $Var(\epsilon_t) = \sigma_t^2 < \infty$, so
$$\Omega = E(\epsilon\epsilon') = \begin{bmatrix} \sigma_1^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n^2 \end{bmatrix}$$
3.4.1 Pure and Impure Heteroskedasticity
Impure heteroskedasticity is caused by specification error, such as an omitted explanatory variable.
3.4.2 Assumptions 1-4 and 5
If the model satisfies Assumptions 1-4, the OLS estimator is BLUE; if it also satisfies Assumption 5, we can use the usual test statistics.
3.4.3 Detecting Heteroskedasticity
- Graphical inspection of the residuals
- Goldfeld-Quandt (GQ) test: split the sample into two parts and compare the variances of the disturbances in the two parts. The null hypothesis is that the variances are equal:
$$H_0: \sigma_1^2 = \sigma_2^2$$
- Park test:
$$e_i = Y_i - \hat{Y}_i = Y_i - (\hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i})$$
$$\ln(e_i^2) = \alpha_0 + \alpha_1 \ln Z_i + u_i$$
- White test: regress the squared residuals on the regressors, their squares, and their cross-products:
$$e_i^2 = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \alpha_3 X_3 + \alpha_4 X_1 X_2 + \dots + u_i$$

The Park and White tests (methods 3 and 4) can pick up an omitted variable hiding in the error term. For example, in
$$Y = \alpha + \beta X + u$$
the error $u$ may contain a variable $z_i$ that we did not include among the explanatory variables. (A sketch of the GQ and White tests follows.)
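A sketch of the GQ and White tests in statsmodels, on simulated data whose error standard deviation grows with $x$ (so both tests should reject the null of homoskedasticity):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt, het_white

# Simulate y = 1 + 2x + eps where sd(eps) is proportional to x.
rng = np.random.default_rng(7)
n = 400
x = np.sort(rng.uniform(1, 10, size=n))
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 * x)
X = sm.add_constant(x)

# Goldfeld-Quandt: compare residual variances of the two sample halves,
# dropping the middle 20% of observations.
f_stat, gq_pval, _ = het_goldfeldquandt(y, X, drop=0.2)
print(f"GQ:    F = {f_stat:.2f}, p-value = {gq_pval:.4f}")

# White: auxiliary regression of e^2 on regressors, squares, cross-products.
resid = sm.OLS(y, X).fit().resid
lm_stat, white_pval, _, _ = het_white(resid, X)
print(f"White: LM = {lm_stat:.2f}, p-value = {white_pval:.4f}")
```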
3.4.4 Remedies for Heteroskedasticity
There are two main remedies for pure heteroskedasticity:
1. Heteroskedasticity-corrected standard errors
2. Redefining the variables, e.g. dividing through by a scale variable $z_i$:
$$\frac{y_i}{z_i} = \frac{\alpha}{z_i} + \beta_1 \frac{x_{1i}}{z_i} + \dots$$
or using logs to compress the scale of the variables.
White (heteroskedasticity-robust) standard errors: in theory,
$$Var(\hat{\beta}) = \sigma^2(X'X)^{-1}$$
but in practice, because of heteroskedasticity, we compute the variance in a "data-driven" way from the actual data, estimating $Var(\hat{\beta} - \beta)$ directly:
$$Var(\hat{\beta}) = E\big[(X'X)^{-1}X'\epsilon\epsilon'X(X'X)^{-1}\big]$$
$$= (X'X)^{-1}X'E(\epsilon\epsilon')X(X'X)^{-1}$$
$$\approx (X'X)^{-1}\Big(\sum_i \hat{\epsilon}_i^2 X_i X_i'\Big)(X'X)^{-1}$$
where $\sum_i \hat{\epsilon}_i^2 X_i X_i'$ converges in probability to $X'E(\epsilon\epsilon')X$.
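In statsmodels this data-driven "sandwich" estimate is available as the HC0 covariance type; a sketch on the same kind of heteroskedastic simulated data as above (only the standard errors change, not the coefficient estimates):

```python
import numpy as np
import statsmodels.api as sm

# Heteroskedastic data: sd(eps) proportional to x (illustrative setup).
rng = np.random.default_rng(8)
n = 400
x = np.sort(rng.uniform(1, 10, size=n))
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 * x)
X = sm.add_constant(x)

print(sm.OLS(y, X).fit().bse)                # conventional std. errors
print(sm.OLS(y, X).fit(cov_type="HC0").bse)  # White-robust std. errors
```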
Reformulating the equation:
$$\frac{Y_i}{Pop_i} = \alpha_0 + \beta_1\frac{Reg_i}{Pop_i} + \beta_2\,Price_i + \epsilon_i$$
Weighted least squares:
$$\frac{Y_i}{|\epsilon_i|} = \alpha_0 + \beta_1 Reg_i + \beta_2\,Price_i + \frac{\epsilon_i}{|\epsilon_i|}$$
A double-log functional form:
$$Y_i = \alpha_0 + \beta_1 \ln Reg_i + \beta_2 \ln Price_i + \epsilon_i$$
Or the White method described above.
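Finally, a sketch of weighted least squares in statsmodels: if $Var(\epsilon_i)$ is proportional to $x_i^2$ (as in the simulated setup above, where the error standard deviation is $0.5x_i$), weighting each observation by $1/x_i^2$ restores homoskedasticity:

```python
import numpy as np
import statsmodels.api as sm

# Same illustrative heteroskedastic setup: sd(eps_i) = 0.5 * x_i.
rng = np.random.default_rng(9)
n = 400
x = np.sort(rng.uniform(1, 10, size=n))
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 * x)
X = sm.add_constant(x)

# WLS with weights proportional to 1 / Var(eps_i) = 1 / x_i^2.
res_wls = sm.WLS(y, X, weights=1.0 / x**2).fit()
print(res_wls.params, res_wls.bse)
```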