Implementing Stochastic Frontier Estimation in Stata

Stochastic Frontiers

In this section we take the maximum likelihood approach and apply it to a fairly useful and powerful tool: stochastic frontier estimation.

What's the basic idea? How to estimate economic relationships that ought to be modeled as upper or lower frontiers rather than averages.

For example, consider a demand curve. From the theory of demand, the demand curve is a frontier which tells the firm the most it can charge for the marginal unit. Econometrics via traditional OLS would gather price and quantity data and estimate an average demand curve. At any $Q_0$ the model predicts $P_0$, but the firm could actually charge $P_A$. The average OLS approach might over- or underpredict price. A model that underpredicts costs the firm money. A model that overpredicts might be catastrophic. Frontier estimation tries to fix this problem. However, not all data are conducive to SFA.

Uses: production functions, cost functions, demand models, tests of union effectiveness, agency costs, reservation wages, school outcomes, profitability, survivorship, merger and acquisition analysis, and the effect of shadow inputs such as corruption.

Consider the traditional production function. This is the single-input case
where $q = f(X)$. The slope of the ray from the origin is a measure of productivity, $q/X$. Note that $q/X$ at Point A is less than at Point B and Point C. We can imagine technological change over time, which would be a shift in the production frontier.

Using multiple inputs, the picture changes a little. Point P is inefficient relative to Point M. Point A is both technically and allocatively efficient. We can measure technical efficiency as $TE = OM/OP$ and technical inefficiency as $1 - TE = 1 - OM/OP$. Allocative efficiency is measured as $ON/OM \le 1$. Overall efficiency is measured as $(OM/OP)(ON/OM) = ON/OP$.

History

Farrell (1957, Journal of the Royal Statistical Society, Series A) derives a production function approach and identifies two sources of firm inefficiency/efficiency:

1. Technical efficiency: produce the most output with a given level of inputs.
2. Allocative efficiency: produce a given output as cheaply as possible.

Most of the time we focus on
technical efficiency in the explanation.

To determine if a firm is efficient, we have to know the production function of the fully efficient firm. However, we never know the fully efficient production function. Farrell suggested estimating a fully efficient production function. There are two ways to do this:

1. Nonparametric techniques: data envelopment analysis. This technique assumes that all deviations from the efficient frontier are realizations of inefficiency.
2. Parametric techniques: stochastic frontier analysis. This technique assumes that deviations from the efficient frontier can be either a realization of inefficiency or a random shock.

Aigner, Lovell and Schmidt (1977) and Meeusen and van den Broeck (1977) both introduced a way to deal with SFA and production functions.

Basic Setup

Consider a production function

$$q_i = f(x_i; \beta)$$

where $x_i$ is a vector of inputs, $q_i$ is output, and $\beta$ is a $k \times 1$ vector of parameters to be estimated.

We can think of efficiency being measured as $\xi_i$ multiplied by the theoretical norm, where $\xi_i \in (0, 1]$, such that

$$q_i = f(x_i; \beta)\,\xi_i$$

If $\xi_i = 1$ then the firm is fully efficient and produces the most it can. If $\xi_i < 1$ then the firm is not fully efficient.

We can let $q_F = f(x_i; \beta)$ be the level of output that should happen. Let $q_o$ be the observed output, where $q_o \le q_F$ because of inefficiency and other factors. As $q_o \le q_F = f(x_i; \beta)$, Aigner and Chu (1968) suggested attaching a non-negative random variable $u_i$ to $f(x_i; \beta)$ which would capture the technical inefficiency of firm $i$:

$$q_o = f(x_i; \beta) - u_i$$

To estimate this type of model, one could use a fixed effects model where $u_i$ is treated as the firm fixed effect.

Let's assume:

$$f(x_i; \beta) = \beta_0 X_1^{\beta_1} X_2^{\beta_2} \cdots X_k^{\beta_k}$$

$$\ln f(x_i; \beta) = \ln\beta_0 + \beta_1 \ln X_1 + \beta_2 \ln X_2 + \cdots + \beta_k \ln X_k$$

$$\ln q_i = \ln\beta_0 + \beta_1 \ln X_1 + \beta_2 \ln X_2 + \cdots + \beta_k \ln X_k - u_i$$

$$\ln q_i = X\beta - u_i$$

Aigner and Chu (1968) suggested a measure of technical efficiency of

$$\frac{\text{Observed output}}{\text{Frontier output}} = \frac{q_i}{\exp(x_i\beta)} = \frac{\exp(x_i\beta - u_i)}{\exp(x_i\beta)} = \exp(-u_i)$$

where $0 \le \exp(-u_i) \le 1$.

While this is a decent shot at the problem, it does leave a lot to be desired. Mainly, $u_i$ is supposed to measure inefficiency, but it might also be capturing other random shocks that are beyond the control of the firm's management. For example, would we want to hold
the firm's management responsible for the impacts of Katrina, an earthquake, or some other weather event?

Aigner, Lovell, and Schmidt (ALS) in 1977 suggested adding a two-sided error term to the one-sided error term of Aigner and Chu (1968). It doesn't seem like that big of a deal, but it did take some work to derive the likelihood function. Now,

$$q_i = f(x_i; \beta)\,\xi_i \exp(v_i)$$

which yields

$$\ln q_i = \ln f(x_i; \beta) + \ln(\xi_i) + v_i$$

Defining $u_i = -\ln(\xi_i)$ yields

$$\ln q_i = \ln f(x_i; \beta) + v_i - u_i$$

In a Cobb-Douglas type world,

$$\ln q_i = \beta_0 + \sum_{j=1}^{k} \beta_j \ln(x_{ji}) + v_i - u_i$$

However, we cannot use OLS to get at the composite error term.

Some basic assumptions: $v_i \sim \text{iid } N(0, \sigma_v^2)$, $u_i \ge 0$, and $\text{cov}(v_i, u_i) = 0$, where $v$ = measurement error, weather, random factors, and $u$ = technical inefficiency (one-sided). Hence, $u_i$ requires us to make an assumption about the distribution of $u$.

Popular distributions include the Half-Normal, Exponential, Truncated Normal, and Gamma (rare). STATA can handle the first three of these with the command frontier.

Let's take another look at what is going on.

For Firm i: A is the deterministic output level. A' might be frontier output, $q = \exp(x_i\beta + v_i)$ where $v_i > 0$. A'' might be observed output, $q = \exp(x_i\beta + v_i - u_i)$.

For Firm j: B is the deterministic output level. B' might be frontier output, $q = \exp(x_j\beta + v_j)$ where $v_j < 0$. B'' might be observed output, $q = \exp(x_j\beta + v_j - u_j)$.

Note: The composite error term $v_i - u_i$ doesn't cause a problem with OLS as long as $v_i$ and $u_i$ are independent of the inputs $x$. OLS is unbiased, consistent, and efficient amongst linear estimators, except that the intercept is not consistent. Nevertheless, it is impossible to extricate $\sigma_u^2$ and $\sigma_v^2$.

Note: MLE yields a more efficient $\hat\beta$, a consistent intercept, and a consistent $\text{var}(v_i - u_i)$.

Note: These types of models assume a frontier from above. At times we might want to think about frontiers from below, e.g. cost functions. Kumbhakar and Lovell (2000) show that using the dual to production it is possible to derive frontier cost functions:

$$\ln(C_i) = \beta_0 + \beta_q \ln(q_i) + \sum_{j=1}^{k} \beta_j \ln(P_{ji}) + v_i + u_i$$

where $P_{ji}$ is the price of input $j$ for firm $i$.

Note: $u_i$ is added to the cost frontier as inefficiency is expected to RAISE costs.

ALS (1977): Assume $v \sim \text{iid } N(0, \sigma_v^2)$ and $u \sim \text{iid } N^+(0, \sigma_u^2)$ (half-normal), and define the variance parameters as

$$\sigma^2 = \sigma_v^2 + \sigma_u^2, \qquad \lambda = \sigma_u / \sigma_v \ge 0$$

If $\lambda = 0$, then there is no $u$ and hence no technical inefficiency.

Battese and Corra (1977) took a different approach and set their variance parameters as

$$\sigma^2 = \sigma_v^2 + \sigma_u^2, \qquad \gamma = \sigma_u^2 / \sigma^2$$

If $\gamma = 0$ then all deviations from the frontier are noise. If $\gamma = 1$ then all deviations from the frontier are inefficiencies.

Estimation: Assume $v$ is normally distributed and let $u$ take some form of a one-sided error term.

Battese and Corra:

$$\ln L = -\frac{N}{2}\ln\left(\frac{\pi}{2}\right) - \frac{N}{2}\ln(\sigma^2) + \sum_{i=1}^{N} \ln\left[1 - \Phi(z_i)\right] - \frac{1}{2\sigma^2}\sum_{i=1}^{N} (\ln y_i - x_i\beta)^2$$

where

$$z_i = \frac{\ln y_i - x_i\beta}{\sigma}\sqrt{\frac{\gamma}{1-\gamma}}$$

the $x_i$ are in log form, $\sigma^2 = \sigma_u^2 + \sigma_v^2$, and $\Phi$ is the cumulative standard normal distribution. We maximize $\ln L$ over $\beta$, $\sigma^2$, $\gamma$ for $k+2$ parameters, where $k$ includes the intercept term.

How to estimate?

1. Use OLS for starting values of $\beta$ and $\sigma^2$.
2. Evaluate $\ln L$ for values of $\gamma \in [0, 1]$.
3. Combine and hope for convergence.

If $u$ is half-normal, then

$$E[\exp(-u_i)] = 2\left[1 - \Phi(\sigma\sqrt{\gamma})\right]\exp(\gamma\sigma^2/2)$$

This yields the average technical efficiency (TE) across the entire sample of firms. What about the efficiency of individual firms? Jondrow, Materov, Lovell, and Schmidt (1982) provide one measure:

$$TE_i = \frac{1 - \Phi(\sigma_A + \gamma\varepsilon_i/\sigma_A)}{1 - \Phi(\gamma\varepsilon_i/\sigma_A)}\exp(\gamma\varepsilon_i + \sigma_A^2/2)$$

where $\varepsilon_i = \ln y_i - x_i\beta$, the $x_i$ are in log form, and $\sigma_A = (\gamma(1-\gamma)\sigma^2)^{1/2}$. Plug in the MLE estimates and residuals to obtain $TE_i$.

Having estimated the stochastic frontier and calculated a form of TE, all sorts of things can be done. We can use other variables to help explain efficiency scores, or use efficiency scores to answer other questions.

Testing for the existence of technical inefficiency is important.

LR test: $H_0: \gamma = 0$ where $\gamma = \sigma_u^2/\sigma^2$, or equivalently $H_0: \lambda = \sigma_u/\sigma_v = 0$, against $H_A: \lambda \ne 0$. The test statistic is

$$-2\{\ln L_R - \ln L_{UR}\} \sim \chi^2(1)$$

where $\ln L_R$ is the value of the log-likelihood when $\gamma = 0$.

STATA allows for three flavors of one-sided error terms in the command frontier:

1. The Normal-Half Normal model (v is Normal, u is Half-Normal):

$$\ln L = \sum_{i=1}^{N}\left\{\frac{1}{2}\ln\left(\frac{2}{\pi}\right) - \ln\sigma + \ln\Phi\left(-\frac{s\,\varepsilon_i\lambda}{\sigma}\right) - \frac{\varepsilon_i^2}{2\sigma^2}\right\}$$

where $s = +1$ for a production frontier and $s = -1$ for a cost frontier.

In STATA, for a production frontier:

    frontier y x1 x2 x3

and for a cost frontier:

    frontier y x1 x2 x3, cost

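2. The Normal-Exponential model (v is Normal, u is Exponential):

$$\ln L = \sum_{i=1}^{N}\left\{-\ln\sigma_u + \frac{\sigma_v^2}{2\sigma_u^2} + \ln\Phi\left(\frac{-s\,\varepsilon_i - \sigma_v^2/\sigma_u}{\sigma_v}\right) + \frac{s\,\varepsilon_i}{\sigma_u}\right\}$$

In STATA:

    frontier y x1 x2 x3, d(e)

3. The Normal-Truncated Normal model (v is Normal, u is Truncated Normal):

$$\ln L = \sum_{i=1}^{N}\left\{-\frac{1}{2}\ln(2\pi) - \ln\sigma - \ln\Phi\left(\frac{\mu}{\sigma\gamma^{1/2}}\right) + \ln\Phi\left(\frac{(1-\gamma)\mu - s\gamma\varepsilon_i}{\{\sigma^2\gamma(1-\gamma)\}^{1/2}}\right) - \frac{1}{2}\left(\frac{\varepsilon_i + s\mu}{\sigma}\right)^2\right\}$$

In STATA:

    frontier y x1 x2 x3, d(t)

where $\sigma = (\sigma_u^2 + \sigma_v^2)^{1/2}$, $\lambda = \sigma_u/\sigma_v$, $\gamma = \sigma_u^2/\sigma^2 = \sigma_u^2/(\sigma_u^2 + \sigma_v^2)$, $\varepsilon_i = y_i - x_i\beta$, $\mu$ is the mean of the normal distribution before truncation, $\Phi()$ is the CDF of the standard normal, and $s = +1$ for a production function (frontier from above) and $s = -1$ for a cost function (frontier from below).

Obviously the log likelihood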
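functions are quite ugly to derive.

Once we have our MLE estimates in hand, we want to obtain an estimate of $u$, which we can obtain with the predict command in STATA:

    predict u1, u

where $u = -\ln(TE_i)$ using the mean $E[u_i \mid \varepsilon_i]$, or

    predict u1, m

where $m = -\ln(TE_i)$ using the mode $M[u_i \mid \varepsilon_i]$.

It is possible to use

    predict te1, te

to obtain estimates of technical efficiency via $E[\exp(-s u_i) \mid \varepsilon_i]$, where $s = 1$ for a production frontier and $s = -1$ for a cost frontier. Note that $TE_i \in [0, 1]$.

To back out an estimate of the $u$'s we can use either the mean or the mode of $f(u \mid \varepsilon)$:

$$E[u_i \mid \varepsilon_i] = \mu_{*i} + \sigma_*\left\{\frac{\phi(-\mu_{*i}/\sigma_*)}{\Phi(\mu_{*i}/\sigma_*)}\right\}$$

$$M[u_i \mid \varepsilon_i] = \begin{cases} \mu_{*i} & \text{if } \mu_{*i} \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

and

$$TE_i = E[\exp(-s u_i) \mid \varepsilon_i] = \frac{1 - \Phi(s\sigma_* - \mu_{*i}/\sigma_*)}{1 - \Phi(-\mu_{*i}/\sigma_*)}\exp\left(-s\mu_{*i} + \frac{1}{2}\sigma_*^2\right)$$

where $\mu_{*i}$ and $\sigma_*$ vary by the distributional assumptions made, with $\sigma^2 = \sigma_u^2 + \sigma_v^2$ and $\sigma = (\sigma_u^2 + \sigma_v^2)^{1/2}$:

1. Normal-Half Normal: $\mu_{*i} = -s\,\varepsilon_i\sigma_u^2/\sigma^2$ and $\sigma_* = \sigma_u\sigma_v/\sigma$.
2. Normal-Exponential: $\mu_{*i} = -s\,\varepsilon_i - \sigma_v^2/\sigma_u$ and $\sigma_* = \sigma_v$.
3. Normal-Truncated Normal: $\mu_{*i} = (-s\,\varepsilon_i\sigma_u^2 + \mu\sigma_v^2)/\sigma^2$ and $\sigma_* = \sigma_u\sigma_v/\sigma$.

By using different distributions one will obtain different TE rankings. This is natural; we just hope there aren't dramatic differences across the different specifications. See the example of baseball player salaries and the interpretation of results.

Example: Baseball Salaries

Using data from 1998 describing non-pitcher MLB baseball players,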
their salary and production statistics, we estimate a wage frontier. We might anticipate that some players are better at negotiating their wage to a "wage frontier" defined as the Marginal Revenue Product.

The original sample consists of 337 observations, but not all players have positive statistics in all inputs; those that have missing inputs will be dropped.

The first specification is

$$\ln SAL_i = \beta_0 + \beta_1 \ln RBI_i + \beta_2 \ln HR_i + \beta_3 \ln K_i - u_i + v_i$$

and is estimated using the default of Normal-Half Normal:

    . frontier lnsal lnrbi lnhr lnk

    Stoc. frontier normal/half-normal model    Number of obs = 292
                                               Wald chi2(3)  = 157.62
    Lo

    .1581507    2.87208    3.492019
    ------------------------------------------------------------------------------
    Likelihood-ratio test of sigma_u=0: chibar2(01) = 15.11   Prob>=chibar2 = 0.000

We find that only lnRBI is significant; the other variables are not.

/lnsig2v is $\ln(\sigma_v^2)$. Here it is $-1.739$, which implies $\sigma_v^2 = 0.175$ or $\sigma_v = 0.419$.

/lnsig2u is $\ln(\sigma_u^2)$. Here it is $0.575$, which implies $\sigma_u^2 = 1.777$ or $\sigma_u = 1.33$.

Combining the results, we find that

$$\sigma^2 = \sigma_v^2 + \sigma_u^2 = 0.175 + 1.777 = 1.952$$

$$\lambda = \sigma_u/\sigma_v = 1.33/0.419 = 3.17$$

The LR test that $\sigma_u = 0$ is rejected.

The second specification includes runs scored as an additional input.

    . frontier lnsal lnrbi lnhr lnk lnruns

    Stoc. frontier normal/half-normal model    Number of obs = 292
                                               Wald chi2(4)  = 227.24
    Log likelihood = -356.78414                Prob > chi2   = 0.0000

    ------------------------------------------------------------------------------
           lnsal |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           lnrbi |   .3069333   .1315723     2.33   0.020     .0490564    .5648102
            lnhr |   .0971831   .0622667     1.56   0.119    -.0248574    .2192236
             lnk |  -.1909785   .1170475    -1.63   0.103    -.4203874    .0384304
          lnruns |

    87308    1.19505    1.538201
          sigma2 |   1.938299   .2184375                      1.510169    2.366429
          lambda |   4.285905   .1279021                      4.035221    4.536589
    ------------------------------------------------------------------------------
    Likelihood-ratio test of sigma_u=0: chibar2(01) = 26.34   Prob>=chibar2 = 0.000

Now $\lambda = 4.29$ and $\sigma_u = 0$ is still rejected.

We grab the technical efficiency score and the technical inefficiency measure $u$:

    (45 missing values generated)
    . predict u1, u
    (45 missing values generated)
    . sort u1
    . gen rank1 = _n

Third specification: Here we think that there might be heteroscedasticity in $v_i$ based on the log of home runs, so we use the command frontier y x1 x2 x3, vhet(varlist):

    . frontier lnsal lnrbi lnhr lnk lnruns, vhet(lnhr)

    Stoc. frontier normal/half-normal model    Number of obs = 292
                                               Wald chi2(4)  = 213.59
    Log likelihood = -352.99795                Prob > chi2   = 0.0000
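The back-transformation of /lnsig2v and /lnsig2u used for the first specification can be checked with a few lines of Python. This is only a sketch of the arithmetic, not anything Stata runs: the function name `variance_parameters` is a hypothetical helper introduced here, and the two inputs are the log-variances quoted in the text.

```python
import math

def variance_parameters(lnsig2v, lnsig2u):
    """Turn the log-variances reported as /lnsig2v and /lnsig2u
    into the variance parameters used in the notes (hypothetical helper)."""
    sigma2_v = math.exp(lnsig2v)   # sigma_v^2
    sigma2_u = math.exp(lnsig2u)   # sigma_u^2
    sigma2 = sigma2_v + sigma2_u   # ALS: sigma^2 = sigma_v^2 + sigma_u^2
    return {
        "sigma2_v": sigma2_v,
        "sigma_v": math.sqrt(sigma2_v),
        "sigma2_u": sigma2_u,
        "sigma_u": math.sqrt(sigma2_u),
        "sigma2": sigma2,
        "lambda": math.sqrt(sigma2_u / sigma2_v),  # ALS: sigma_u / sigma_v
        "gamma": sigma2_u / sigma2,                # Battese-Corra: sigma_u^2 / sigma^2
    }

# Log-variances quoted for the first specification
params = variance_parameters(-1.739, 0.575)
for name, value in params.items():
    print(f"{name:9s} = {value:.3f}")
```

Carrying the unrounded exponentials through gives a total variance of roughly 1.95 and a lambda of roughly 3.18, so small differences from the hand calculation come only from intermediate rounding.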

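The normal/half-normal log-likelihood and the firm-level efficiency measure described in the estimation section can be written out directly. The sketch below, in plain Python, evaluates the log-likelihood in both the $(\sigma^2, \lambda)$ form and the Battese and Corra $(\sigma^2, \gamma)$ form, and computes $TE_i$ from the conditional-expectation formula. The residuals and parameter values are made up for illustration, and all function names are hypothetical; this is not how Stata's frontier command is implemented.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def loglik_lambda(eps, sigma2, lam, s=1):
    """Normal/half-normal log-likelihood in the (sigma^2, lambda) form.
    s = +1 for a production frontier, s = -1 for a cost frontier."""
    sigma = math.sqrt(sigma2)
    return sum(
        0.5 * math.log(2.0 / math.pi) - math.log(sigma)
        + math.log(norm_cdf(-s * e * lam / sigma))
        - e * e / (2.0 * sigma2)
        for e in eps
    )

def loglik_gamma(eps, sigma2, gamma):
    """Battese-Corra form of the same log-likelihood (production frontier)."""
    n = len(eps)
    sigma = math.sqrt(sigma2)
    scale = math.sqrt(gamma / (1.0 - gamma))
    tail = sum(math.log(1.0 - norm_cdf(e / sigma * scale)) for e in eps)
    ssr = sum(e * e for e in eps)
    return (-0.5 * n * math.log(math.pi / 2.0) - 0.5 * n * math.log(sigma2)
            + tail - ssr / (2.0 * sigma2))

def technical_efficiency(e, sigma2, gamma):
    """Firm-level TE_i = E[exp(-u_i) | eps_i] for a production frontier."""
    sigma_a = math.sqrt(gamma * (1.0 - gamma) * sigma2)
    num = 1.0 - norm_cdf(sigma_a + gamma * e / sigma_a)
    den = 1.0 - norm_cdf(gamma * e / sigma_a)
    return num / den * math.exp(gamma * e + 0.5 * sigma_a ** 2)

# Made-up residuals eps_i = ln y_i - x_i*beta and made-up parameter values
residuals = [-1.2, -0.8, -0.3, 0.1, 0.4]
sigma2 = 1.0
lam = 1.5
gamma = lam ** 2 / (1.0 + lam ** 2)   # gamma = sigma_u^2 / sigma^2

ll_lambda = loglik_lambda(residuals, sigma2, lam)
ll_gamma = loglik_gamma(residuals, sigma2, gamma)
te_scores = [technical_efficiency(e, sigma2, gamma) for e in residuals]

print("log-likelihood (lambda form):", ll_lambda)
print("log-likelihood (gamma form): ", ll_gamma)
print("TE scores:", [round(t, 3) for t in te_scores])
```

The two parameterizations describe the same likelihood, since $\lambda^2 = \gamma/(1-\gamma)$, so the two printed values should agree up to floating-point error. In an actual estimation the function would be handed to a numerical optimizer over $\beta$, $\sigma^2$ and $\gamma$, with OLS starting values as described in the estimation steps.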