Econometric Analysis of Panel Data
William Greene
Department of Economics, Stern School of Business

22. Individual Heterogeneity and Random Parameter Variation

Heterogeneity
- Observational: observable differences across individuals (e.g., choice makers)
- Choice strategy: how consumers make decisions (the underlying behavior)
- Structural: differences in model frameworks
- Preferences: differences in model parameters

Parameter Heterogeneity

Distinguish Bayes and Classical
- Both depart from the heterogeneous model, $f(y_{it} \mid x_{it}) = g(y_{it}, x_{it}, \beta_i)$
- What do we mean by randomness?
  - With respect to the information of the analyst (Bayesian)
  - With respect to some stochastic process governing nature (Classical)
- Bayesian: no difference between fixed and random
- Classical: full specification of joint distributions for observed random variables; piecemeal definitions of random parameters. Usually a form of random effects

Hierarchical Bayesian Estimation

Allenby and Rossi: Structure

Priors

Bayesian Posterior Analysis
- Estimation of posterior distributions for upper-level parameters: … and $V$
- Estimation of posterior distributions for lower (individual) level parameters, $\beta_i \mid \text{data}_i$. Detailed examination of individual parameters
- (Comparison of results to counterparts using classical methods)

Classical Random Parameters

Fixed Management and Technical Efficiency in a Random Coefficients Model
- Antonio Alvarez, University of Oviedo
- Carlos Arias, University of Leon
- William Greene, Stern School of Business,
  New York University

The Production Function Model
- Definition: maximal output, given the inputs
- Inputs: variable factors, quasi-fixed (land)
- Form: log-quadratic (translog)
- Latent management as an unobservable input

Application to Spanish Dairy Farms
- N = 247 farms, T = 6 years (1993-1998)

Translog Production Model

Random Coefficients Model
- Chamberlain/Mundlak
- Same random effect appears in each random parameter
- Only the first-order terms are random

Discrete vs. Continuous Variation
- Classical context: description of how parameters are distributed across individuals
- Variation
  - Discrete: finite number of different parameter vectors distributed across individuals
    - Mixture is unknown as well as the parameters: implies randomness from the point of view of the analyst (Bayesian?)
    - Might also be viewed as a discrete approximation to a continuous distribution
  - Continuous: there exists a stochastic process governing the distribution of parameters, drawn from a continuous pool of candidates
- Background common assumption: an overarching stochastic process that assigns parameters to individuals

Discrete Parameter Variation

Latent Classes and Random Parameters

The Latent Class Model

Estimating an LC Model

Estimating Which Class

Estimating $\beta_i$

How Many Classes?

The EM Algorithm

Implementing EM

A Random Utility Model
Random utility model for discrete choice among J alternatives at time t by person i:

$U_{itj} = \alpha_j + \beta' x_{itj} + \varepsilon_{itj}$

- $\alpha_j$ = choice-specific constant
- $x_{itj}$ = attributes of choice presented to person (information processing strategy: not all attributes will be evaluated, e.g., lexicographic utility functions over certain attributes)
- $\beta$ = taste weights, part worths, marginal utilities
- $\varepsilon_{itj}$ = unobserved random component of utility; mean $E[\varepsilon_{itj}] = 0$, variance $\operatorname{Var}[\varepsilon_{itj}] = \sigma^2$

The Multinomial Logit Model
- Independent type 1 extreme value (Gumbel): $F(\varepsilon_{itj}) = 1 - \exp(-\exp(\varepsilon_{itj}))$
- Independence across utility functions
- Identical variances, $\sigma^2 = \pi^2/6$
- Same taste parameters for all individuals

Characteristic of MNL

Application: Shoe Brand Choice
- Simulated data: stated choice, 400 respondents, 8 choice situations
- 3 choice/attributes + NONE
  - Fashion: High = 1 / Low = 0
  - Quality: High = 1 / Low = 0
  - Price: 25/50/75, 100, 125, coded 1, 2, 3, 4, 5, then divided by 25
- Heterogeneity: sex, age (…)

Variable   Coefficient    Standard Error   b/St.Er.   P[|Z|>z]
BF          1.47890473       .06776814      21.823      .0000
BQ          1.01372755       .06444532      15.730      .0000
BP        -11.8023376        .80406103     -14.678      .0000
BN           .03679254       .07176387        .513      .6082

What do the coefficients mean? (They do seem to have the right signs.)

Elasticities from MNL
Elasticity, averaged over observations (* marks the choice whose price changes)

Attribute is PRICE in choice B1
  * Choice=B1       -.889
    Choice=B2        .291
    Choice=B3        .291
    Choice=NONE      .291
Attribute is PRICE in choice B2
    Choice=B1        .313
  * Choice=B2      -1.222
    Choice=B3        .313
    Choice=NONE      .313
Attribute is PRICE in choice B3
    Choice=B1        .366
    Choice=B2        .366
  * Choice=B3       -.755
    Choice=NONE      .366

Estimated Latent Class Model
Latent Class Logit Model
Log likelihood function   -3649.132

Variable     Coefficient    Standard Error   b/St.Er.   P[|Z|>z]
Utility parameters in latent class 1
BF|1          3.02569837      .14335927       21.106      .0000
BQ|1          -.08781664      .12271563        -.716      .4742
BP|1         -9.69638056     1.40807055       -6.886      .0000
BN|1          1.28998874      .14533927        8.876      .0000
Utility parameters in latent class 2
BF|2          1.19721944      .10652336       11.239      .0000
BQ|2          1.11574955      .09712630       11.488      .0000
BP|2        -13.9345351      1.22424326      -11.382      .0000
BN|2          -.43137842      .10789864       -3.998      .0001
Utility parameters in latent class 3
BF|3          -.17167791      .10507720       -1.634      .1023
BQ|3          2.71880759      .11598720       23.441      .0000
BP|3         -8.96483046     1.31314897       -6.827      .0000
BN|3           .18639318      .12553591        1.485      .1376
THETA(1) in class probability model
Constant      -.90344530      .34993290       -2.582      .0098
_MALE|1        .64182630      .34107555        1.882      .0599
_AGE25|1      2.13320852      .31898707        6.687      .0000
_AGE39|1       .72630019      .42693187        1.701      .0889
THETA(2) in class probability model
Constant       .37636493      .33156623        1.135      .2563
_MALE|2      -2.76536019      .68144724       -4.058      .0000
_AGE25|2      -.11945858      .54363073        -.220      .8261
_AGE39|2      1.97656718      .70318717        2.811      .0049
THETA(3) in class probability model
Constant       .000000        (Fixed Parameter)
_MALE|3        .000000        (Fixed Parameter)
_AGE25|3       .000000        (Fixed Parameter)
_AGE39|3       .000000        (Fixed Parameter)

Latent Class Elasticities
Elasticity, averaged over observations.
Effects on probabilities of all choices in the model:

                                                 MNL       LCM
Attribute is PRICE in choice B1
  * Choice=B1      .000    .000    .000        -.889     -.801
    Choice=B2      .000    .000    .000         .291      .273
    Choice=B3      .000    .000    .000         .291      .248
    Choice=NONE    .000    .000    .000         .291      .219
Attribute is PRICE in choice B2
    Choice=B1      .000    .000    .000         .313      .311
  * Choice=B2      .000    .000    .000       -1.222    -1.248
    Choice=B3      .000    .000    .000         .313      .284
    Choice=NONE    .000    .000    .000         .313      .268
Attribute is PRICE in choice B3
    Choice=B1      .000    .000    .000         .366      .314
    Choice=B2      .000    .000    .000         .366      .344
  * Choice=B3      .000    .000    .000        -.755     -.674
    Choice=NONE    .000    .000    .000         .366      .302

Individual Specific Means

Random Parameters (Mixed) Models

Mixed Model Estimation
- WinBUGS: MCMC. User specifies the
  model; constructs the Gibbs sampler / Metropolis-Hastings
- SAS: Proc Mixed. Classical
  - Uses primarily a kind of GLS/GMM (method of moments algorithm for loglinear models)
- Stata: Classical
  - Mixing done by quadrature (very slow for 2 or more dimensions)
  - Several loglinear models: GLLAMM
- LIMDEP/NLOGIT: Classical
  - Mixing done by Monte Carlo integration: maximum simulated likelihood
  - Numerous linear, nonlinear, loglinear models
- Ken Train's Gauss code
  - Monte Carlo integration
  - Used by many researchers
  - Mixed logit (mixed multinomial logit) model only (but free!)
- Programs differ on the models fitted, the algorithms, the paradigm, and the extensions provided to the simplest RPM, $\beta_i = \beta + w_i$.

Modeling Parameter Heterogeneity

Maximum Simulated Likelihood

A Mixed Probit Model

Monte Carlo Integration

Example: Monte Carlo Integral

Generating a Random Draw

Drawing Uniform Random Numbers

L'Ecuyer's RNG
- Define: norm = 2.328306549295728e-10, m1 = 4294967087.0, m2 = 4294944443.0, a12 = 1403580.0, a13n = 810728.0, a21 = 527612.0, a23n = 1370589.0
- Initialize: s10 = the seed, s11 = 4231773.0, s12 = 1975.0, s20 = 137228743.0, s21 = 98426597.0, s22 = 142859843.0
- Preliminaries for each draw (resets at least some of 5 seeds):
  - p1 = a12*s11 - a13n*s10, k = int(p1/m1), p1 = p1 - k*m1; if p1 < 0, p1 = p1 + m1
  - s10 = s11, s11 = s12, s12 = p1
  - p2 = a21*s22 - a23n*s20, k = int(p2/m2), p2 = p2 - k*m2; if p2 < 0, p2 = p2 + m2
  - s20 = s21, s21 = s22, s22 = p2
- u = norm*(p1 - p2) if p1 > p2, u = norm*(p1 - p2 + m1) otherwise
- Passes all known randomness tests. Period = $2^{191}$
- Pierre L'Ecuyer, Canada Research Chair in Stochastic Simulation and Optimization, Département d'informatique et de recherche opérationnelle, University of Montreal

Quasi-Monte Carlo Integration Based on Halton Sequences
- For example, using base p = 5, the integer r = 37 has $b_0 = 2$, $b_1 = 2$, and $b_2 = 1$ ($37 = 1 \times 5^2 + 2 \times 5^1 + 2 \times 5^0$). Then $H(37 \mid 5) = 2 \times 5^{-1} + 2 \times 5^{-2} + 1 \times 5^{-3} = 0.488$.

Halton Sequences vs. Random Draws
- Requires far fewer draws for one dimension, about 1/10
- Accelerates estimation by a factor of 5 to 10

Simulated Log Likelihood for a Mixed Probit Model

Application: Doctor Visits
German Health Care Usage Data, 7,293 individuals, varying numbers of periods

Data downloaded from the Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set: there are altogether 27,326 observations. The number of observations per individual ranges from 1 to 7 (frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987). Note that the variable NUMOBS tells how many observations there are for each person; it is repeated in each row of the data for the person.

Variables in the file are:
DOCTOR  = 1(number of doctor visits > 0)
HSAT    = health satisfaction, coded 0 (low) to 10 (high)
DOCVIS  = number of doctor visits in last three months
HOSPVIS = number of hospital visits in last calendar year
PUBLIC  = 1 if insured in public health insurance; 0 otherwise
ADDON   = 1 if insured by add-on insurance; 0 otherwise
HHNINC  = household nominal monthly net income in German marks / 10000 (4
          observations with income = 0 were dropped)
HHKIDS  = 1 if children under age 16 in the household; 0 otherwise
EDUC    = years of schooling
AGE     = age in years
MARRIED = marital status

Estimates of a Mixed Probit Model

Random Coefficients Probit Model
Dependent variable            DOCTOR
Log likelihood function       -16483.96
Restricted log likelihood     -17700.96
Unbalanced panel has 7293 individuals.

Variable    Coefficient    Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
Means for random parameters
Constant    -.09594899      .04049528       -2.369      .0178
AGE          .02102471      .00053836       39.053      .0000     43.5256898
HHNINC      -.03119127      .03383027        -.922      .3565       .35208362
EDUC        -.02996487      .00265133      -11.302      .0000     11.3206310
MARRIED     -.03664476      .01399541       -2.618      .0088       .75861817
---------------------------------------------------------------------------
Constant     .02642358      .05397131         .490      .6244
AGE          .01538640      .00071823       21.423      .0000     43.5256898
HHNINC      -.09775927      .04626475       -2.113      .0346       .35208362
EDUC        -.02811308      .00350079       -8.031      .0000     11.3206310
MARRIED     -.00930667      .01887548        -.493      .6220       .75861817

Random Parameters Probit

Variable    Coefficient    Standard Error   b/St.Er.   P[|Z|>z]
Diagonal elements of Cholesky matrix
Constant     .55259608      .05381892       10.268      .0000
AGE          .279052D-04    .00041019         .068      .9458
HHNINC       .03545309      .04094725         .866      .3866
EDUC         .00994387      .00093271       10.661      .0000
MARRIED      .01013553      .00643526        1.575      .1153
Below diagonal elements of Cholesky matrix
lAGE_ONE     .00668600      .00071466        9.355      .0000
lHHN_ONE    -.23713634      .04341767       -5.462      .0000
lHHN_AGE     .09364751      .03357731        2.789      .0053
lEDU_ONE     .01461359      .00355382        4.112      .0000
lEDU_AGE    -.00189900      .00167248       -1.135      .2562
lEDU_HHN     .00991594      .00154877        6.402      .0000
lMAR_ONE    -.04871097      .01854192       -2.627      .0086
lMAR_AGE    -.02059540      .01362752       -1.511      .1307
lMAR_HHN    -.12276339      .01546791       -7.937      .0000
lMAR_EDU     .09557751      .01233448        7.749      .0000
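
The Cholesky elements reported above are easier to read once they are turned back into a covariance matrix. The following is a minimal sketch, not the estimation program's own code. It assumes the random parameters are generated as $\beta_i = \beta + L w_i$ with standard normal $w_i$, that an entry labelled `lROW_COL` is the element of $L$ in row ROW and column COL (with ONE standing for the constant), and that the variable order is Constant, AGE, HHNINC, EDUC, MARRIED. Under those assumptions it rebuilds $\Sigma = LL'$, reports the implied standard deviations, and produces one quasi-random draw of $\beta_i - \beta$ from Halton points of the kind described in the Halton sequence slide. The function names and the choice of primes 2, 3, 5, 7, 11 are illustrative only.

```python
from statistics import NormalDist

NAMES = ["Constant", "AGE", "HHNINC", "EDUC", "MARRIED"]

# Lower-triangular Cholesky factor L, values copied from the reported table.
# ASSUMPTION: an entry labelled lROW_COL is L[ROW][COL], with ONE = Constant.
L = [
    [0.55259608, 0.0, 0.0, 0.0, 0.0],
    [0.00668600, 0.279052e-04, 0.0, 0.0, 0.0],
    [-0.23713634, 0.09364751, 0.03545309, 0.0, 0.0],
    [0.01461359, -0.00189900, 0.00991594, 0.00994387, 0.0],
    [-0.04871097, -0.02059540, -0.12276339, 0.09557751, 0.01013553],
]


def covariance(chol):
    """Return Sigma = L L' for a lower-triangular factor given as nested lists."""
    n = len(chol)
    return [[sum(chol[i][k] * chol[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def halton(r, base):
    """Radical inverse of the positive integer r in the given prime base."""
    h, weight = 0.0, 1.0 / base
    while r > 0:
        r, digit = divmod(r, base)   # peel off the lowest base-p digit
        h += digit * weight          # digit b_k contributes b_k * p^-(k+1)
        weight /= base
    return h


def deviation(r, chol=L, primes=(2, 3, 5, 7, 11)):
    """One quasi-random draw of L w, one Halton base per dimension of w."""
    std_normal = NormalDist()
    w = [std_normal.inv_cdf(halton(r, p)) for p in primes]
    return [sum(chol[i][k] * w[k] for k in range(len(w)))
            for i in range(len(chol))]


SIGMA = covariance(L)
SD = [SIGMA[i][i] ** 0.5 for i in range(len(L))]

if __name__ == "__main__":
    for name, sd in zip(NAMES, SD):
        print(f"{name:8s} implied std. dev. = {sd:.6f}")
    print("draw of beta_i - beta for r = 37:", deviation(37))
```

Because every row of $L$ except the first mixes several elements, the implied standard deviation of a coefficient can be much larger than its diagonal entry alone suggests, which is one reason to inspect $\Sigma$ rather than the raw Cholesky output.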