CHAPTER 1 TEACHING NOTES

You have substantial latitude about what to emphasize in Chapter 1. I find it useful to talk about the economics of crime example (Example 1.1) and the wage example (Example 1.2) so that students see, at the outset, that econometrics is linked to economic reasoning, if not economic theory.

I like to familiarize students with the important data structures that empirical economists use, focusing primarily on cross-sectional and time series data sets, as these are what I cover in a first-semester course. It is probably a good idea to mention the growing importance of data sets that have both a cross-sectional and time dimension.

I spend almost an entire lecture talking about the problems inherent in drawing causal inferences in the social sciences. I do this mostly through the agricultural yield, return to education, and crime examples. These examples also contrast experimental and nonexperimental data. Students studying business and finance tend to find the term structure of interest rates example more relevant, although the issue there is testing the implication of a simple theory, as opposed to inferring causality. I have found that spending time talking about these examples, in place of a formal review of probability and statistics, is more successful (and more enjoyable for the students and me).

CHAPTER 2 TEACHING NOTES

This is the chapter where I expect students to follow most, if not all, of the algebraic derivations. In class I like to derive at
least the unbiasedness of the OLS slope coefficient, and usually I derive the variance. At a minimum, I talk about the factors affecting the variance. To simplify the notation, after I emphasize the assumptions in the population model, and assume random sampling, I just condition on the values of the explanatory variables in the sample. Technically, this is justified by random sampling because, for example, \(E(u_i \mid x_1, x_2, \ldots, x_n) = E(u_i \mid x_i)\) by independent sampling. I find that students are able to focus on the key assumption SLR.3 and subsequently take my word about how conditioning on the independent variables in the sample is harmless. (If you prefer, the appendix to Chapter 3 does the conditioning argument carefully.)

Because statistical inference is no more difficult in multiple regression than in simple regression, I postpone inference until Chapter 4. (This reduces redundancy
and allows you to focus on the interpretive differences between simple and multiple regression.)

You might notice how, compared with most other texts, I use relatively few assumptions to derive the unbiasedness of the OLS slope estimator, followed by the formula for its variance. This is because I do not introduce redundant or unnecessary assumptions. For example, once SLR.3 is assumed, nothing further about the relationship between u and x is needed to obtain the unbiasedness of OLS under random sampling.

SOLUTIONS TO PROBLEMS

2.1 (i) Income, age, and family background (such as number of siblings) are just a few possibilities. It seems that each of these could be correlated with years of education. (Income and education are probably positively correlated; age and education may be negatively correlated because women in more recent cohorts have, on average, more education; and number of
siblings and education are probably negatively correlated.)

(ii) Not if the factors we listed in part (i) are correlated with educ. Because we would like to hold these factors fixed, they are part of the error term. But if u is correlated with educ then \(E(u \mid educ) \neq 0\), and so SLR.3 fails.

2.2 In the equation \(y = \beta_0 + \beta_1 x + u\), add and subtract \(\alpha_0\) (where \(\alpha_0 = E(u)\), as in the problem statement) from the right hand side to get \(y = (\alpha_0 + \beta_0) + \beta_1 x + (u - \alpha_0)\). Call the new error \(e = u - \alpha_0\), so that \(E(e) = 0\). The new intercept is \(\alpha_0 + \beta_0\), but the slope is still \(\beta_1\).

2.3 (i) Let \(y_i = GPA_i\), \(x_i = ACT_i\), and \(n = 8\). Then \(\bar{x} = 25.875\), \(\bar{y} = 3.2125\), \(\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = 5.8125\), and \(\sum_{i=1}^{n} (x_i - \bar{x})^2 = 56.875\). From equation (2.9), we obtain the slope as \(\hat{\beta}_1 = 5.8125/56.875 \approx .1022\), rounded to four places after the decimal. From (2.17), \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \approx 3.2125 - (.1022)25.875 \approx .5681\). So we can write

\[ \widehat{GPA} = .5681 + .1022\,ACT, \qquad n = 8. \]

The intercept does not have a useful interpretation because ACT is not close to zero for the population of interest. If ACT is 5 points higher, \(\widehat{GPA}\) increases by .1022(5) = .511.

(ii) The fitted values and residuals, rounded to four decimal places, are given along with the observation number i and GPA in the following table:

i    GPA    \(\widehat{GPA}\)    \(\hat{u}\)
1    2.8    2.7143     .0857
2    3.4    3.0209     .3791
3    3.0    3.2253    -.2253
4    3.5    3.3275     .1725
5    3.6    3.5319     .0681
6    3.0    3.1231    -.1231
7    2.7    3.1231    -.4231
8    3.7    3.6341     .0659

You can verify that the residuals, as reported in the table, sum to \(-.0002\), which is pretty close to zero given the inherent rounding error.

(iii) When ACT =
20, \(\widehat{GPA} = .5681 + .1022(20) \approx 2.61\).

(iv) The sum of squared residuals, \(\sum_{i=1}^{n} \hat{u}_i^2\), is about .4347 (rounded to four decimal places), and the total sum of squares, \(\sum_{i=1}^{n} (y_i - \bar{y})^2\), is about 1.0288. So the R-squared from the regression is

\[ R^2 = 1 - SSR/SST \approx 1 - (.4347/1.0288) \approx .577. \]

Therefore, about 57.7% of the variation in GPA is explained by ACT in this small sample of students.

2.4 (i) When cigs = 0, predicted birth weight is 119.77 ounces. When cigs = 20, \(\widehat{bwght} = 109.49\). This is about an 8.6% drop.

(ii) Not necessarily. There are many other factors that can affect birth weight, particularly overall health of the mother and quality of prenatal care. These could be correlated with cigarette smoking during pregnancy. Also, something such as caffeine consumption can affect birth weight, and might also be correlated with cigarette smoking.

(iii) If we want a predicted bwght of 125, then cigs = \((125 - 119.77)/(-.524) \approx -10.18\), or about \(-10\) cigarettes! This is nonsense, of course, and it shows what happens when we are trying to predict something as complicated as birth weight with only a single explanatory variable. The largest predicted birth weight is necessarily 119.77. Yet almost 700 of the births in the sample had a birth weight higher than 119.77.

(iv) 1,176 out of 1,388 women did not smoke while pregnant, or about 84.7%.

2.5 (i) The intercept implies that when inc = 0, cons is predicted to be negative $124.84. This, of course, cannot be true, and reflects the fact that this consumption function might be a poor predictor of consumption at very low income levels. On the other hand, on an annual basis, $124.84 is not so far from zero.

(ii) Just plug 30,000 into the equation: \(\widehat{cons} = -124.84 + .853(30{,}000) = 25{,}465.16\) dollars.

(iii) The MPC and the APC are shown in the following graph. Even though the intercept is negative, the smallest APC in the sample is positive. The graph starts at an annual income level of $1,000 (in 1970 dollars).

[Graph: APC and MPC plotted against inc, for inc from 1,000 to 30,000. The MPC is flat at .853; the APC starts at .728.]

2.6 (i) Yes. If living closer to an incinerator depresses housing prices, then being farther away increases housing prices.

(ii) If the city chose to locate the incinerator in an area away from more expensive neighborhoods, then log(dist) is positively correlated with housing quality. This would violate SLR.3, and OLS estimation is biased.

(iii) Size of the house, number of bathrooms, size of the lot, age of the home, and quality of the neighborhood (including school quality), are just a handful of factors. As mentioned in part (ii), these could certainly be correlated with dist and log(dist).

2.7 (i) When we condition on inc in computing an expectation, \(\sqrt{inc}\) becomes a constant. So \(E(u \mid inc) = E(\sqrt{inc} \cdot e \mid inc) = \sqrt{inc} \cdot E(e \mid inc) = \sqrt{inc} \cdot 0\) because \(E(e \mid inc) = E(e) = 0\).

(ii) Again, when we condition on inc in computing a variance, \(\sqrt{inc}\) becomes a constant. So \(Var(u \mid inc) = Var(\sqrt{inc} \cdot e \mid inc) = (\sqrt{inc})^2 \, Var(e \mid inc) = \sigma_e^2 \, inc\) because \(Var(e \mid inc) = \sigma_e^2\).

(iii) Families with low incomes do not
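have much discretion about spending; typically, a low-income family must spend on food, clothing, housing, and other necessities. Higher income people have more discretion, and some might choose more consumption while others more saving. This discretion suggests wider variability in saving among higher income families.

2.8 (i) From equation (2.66),

\[ \tilde{\beta}_1 = \left( \sum_{i=1}^{n} x_i y_i \right) \Big/ \left( \sum_{i=1}^{n} x_i^2 \right). \]

Plugging in \(y_i = \beta_0 + \beta_1 x_i + u_i\) gives

\[ \tilde{\beta}_1 = \left( \sum_{i=1}^{n} x_i (\beta_0 + \beta_1 x_i + u_i) \right) \Big/ \left( \sum_{i=1}^{n} x_i^2 \right). \]

After standard algebra, the numerator can be written as

\[ \beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} x_i u_i. \]

Putting this over the denominator shows we can write \(\tilde{\beta}_1\) as

\[ \tilde{\beta}_1 = \beta_0 \left( \sum_{i=1}^{n} x_i \right) \Big/ \left( \sum_{i=1}^{n} x_i^2 \right) + \beta_1 + \left( \sum_{i=1}^{n} x_i u_i \right) \Big/ \left( \sum_{i=1}^{n} x_i^2 \right). \]

Conditional on the \(x_i\), we have

\[ E(\tilde{\beta}_1) = \beta_0 \left( \sum_{i=1}^{n} x_i \right) \Big/ \left( \sum_{i=1}^{n} x_i^2 \right) + \beta_1 \]

because \(E(u_i) = 0\) for all i. Therefore, the bias in \(\tilde{\beta}_1\) is given by the first term in this equation. This bias is obviously zero when \(\beta_0 = 0\). It is also zero when \(\sum_{i=1}^{n} x_i = 0\), which is the same as \(\bar{x} = 0\). In the latter case, regression through the origin is identical to regression with an intercept.

(ii) From the last expression for \(\tilde{\beta}_1\) in part (i) we have, conditional on the \(x_i\),

\[ Var(\tilde{\beta}_1) = \left( \sum_{i=1}^{n} x_i^2 \right)^{-2} Var\left( \sum_{i=1}^{n} x_i u_i \right) = \left( \sum_{i=1}^{n} x_i^2 \right)^{-2} \left( \sum_{i=1}^{n} x_i^2 \, Var(u_i) \right) = \left( \sum_{i=1}^{n} x_i^2 \right)^{-2} \left( \sigma^2 \sum_{i=1}^{n} x_i^2 \right) = \sigma^2 \Big/ \left( \sum_{i=1}^{n} x_i^2 \right). \]

(iii) From (2.57), \(Var(\hat{\beta}_1) = \sigma^2 \big/ \sum_{i=1}^{n} (x_i - \bar{x})^2\). From the hint, \(\sum_{i=1}^{n} x_i^2 \geq \sum_{i=1}^{n} (x_i - \bar{x})^2\), and so \(Var(\tilde{\beta}_1) \leq Var(\hat{\beta}_1)\). A more direct way to see this is to write \(\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2\), which is less than \(\sum_{i=1}^{n} x_i^2\) unless \(\bar{x} = 0\).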
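
The bias and variance results in Problem 2.8 can also be checked numerically. What follows is only a minimal Monte Carlo sketch, not part of the original solution: the function name `simulate_origin_vs_ols`, the parameter values, the uniform design for the \(x_i\), and the normal errors are all illustrative assumptions. It holds the \(x_i\) fixed across replications (matching the "conditional on the \(x_i\)" argument) and compares the simulated bias and variance of the through-the-origin slope and the usual OLS slope with the formulas derived above.

```python
import numpy as np


def simulate_origin_vs_ols(beta0=1.0, beta1=0.5, sigma=2.0, n=50,
                           reps=20000, seed=0):
    """Monte Carlo comparison of the through-the-origin and OLS slopes.

    All parameter values are illustrative assumptions, not from the text.
    """
    rng = np.random.default_rng(seed)

    # Draw the regressors once and hold them fixed (we condition on the x_i).
    x = rng.uniform(1.0, 5.0, size=n)      # chosen so that xbar is not zero
    sum_x2 = np.sum(x ** 2)                # sum of x_i^2
    xd = x - x.mean()
    sst_x = np.sum(xd ** 2)                # sum of (x_i - xbar)^2

    # One row per replication: y = beta0 + beta1*x + u with Var(u) = sigma^2.
    u = rng.normal(0.0, sigma, size=(reps, n))
    y = beta0 + beta1 * x + u

    b_tilde = (y @ x) / sum_x2             # regression through the origin
    b_hat = (y @ xd) / sst_x               # OLS slope with an intercept

    return {
        "bias_tilde_sim": b_tilde.mean() - beta1,
        "bias_tilde_theory": beta0 * x.sum() / sum_x2,
        "bias_hat_sim": b_hat.mean() - beta1,
        "var_tilde_sim": b_tilde.var(ddof=1),
        "var_tilde_theory": sigma ** 2 / sum_x2,
        "var_hat_sim": b_hat.var(ddof=1),
        "var_hat_theory": sigma ** 2 / sst_x,
    }


if __name__ == "__main__":
    for name, value in simulate_origin_vs_ols().items():
        print(f"{name:>18s}: {value: .5f}")
```

With a nonzero intercept and \(\bar{x} \neq 0\), the simulated bias of the through-the-origin slope should sit close to \(\beta_0 \sum x_i / \sum x_i^2\), while its variance should be no larger than that of the OLS slope. This is a useful way to show students the bias-variance tradeoff between the two estimators.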