CHAPMAN & HALL/CRC

Empirical Likelihood

Art B. Owen

Boca Raton London New York Washington, D.C.

© 2001 CRC Press LLC

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher.

The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such copying.

Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation, without intent to infringe.

© 2001 by Chapman & Hall/CRC
No claim to original U.S. Government works
International Standard Book Number 1-58488-071-6
Library of Congress Card Number 2001028680
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper

Library of Congress Cataloging-in-Publication Data

Owen, Art. (Art B.)
Empirical likelihood / Art Owen.
p. cm. (Monographs on statistics and applied probability ; 92)
Includes bibliographical references and index.
ISBN 1-58488-071-6 (alk. paper)
1. Estimation theory. 2. Probabilities. 3. Mathematical statistics. I. Title. II. Series.
QA276.8 .094 2001
519.5'44-dc21 2001028680

TO PATRIZIA, GREGORY, AND ELLIOT

Contents

Preface

1 Introduction
1.1 Earthworm segments, skewness and kurtosis
1.2 Empirical likelihood, parametric likelihood, and the bootstrap
1.3 Bibliographic notes

2 Empirical likelihood
2.1 Nonparametric maximum likelihood
2.2 Nonparametric likelihood ratios
2.3 Ties in the data
2.4 Multinomial on the sample
2.5 EL for a univariate mean
2.6 Coverage accuracy
2.7 One-sided coverage levels
2.8 Power and efficiency
2.9 Computing EL for a univariate mean
2.10 Empirical discovery of parametric families
2.11 Bibliographic notes
2.12 Exercises

3 EL for random vectors
3.1 NPMLE for IID vectors
3.2 EL for a multivariate mean
3.3 Fisher, Bartlett, and bootstrap calibration
3.4 Smooth functions of means
3.5 Estimating equations
3.6 EL for quantiles
3.7 Ties and quantiles
3.8 Likelihood-based estimating equations
3.9 Transformation invariance of EL
3.10 Side information
3.11 Sandwich estimator
3.12 Robust estimators
3.13 Robust likelihood
3.14 Computation and convex duality
3.15 Euclidean likelihood
3.16 Other nonparametric likelihoods
3.17 Bibliographic notes
3.18 Exercises

4 Regression and modeling
4.1 Random predictors
4.2 Nonrandom predictors
4.3 Triangular array ELT
4.4 Analysis of variance
4.5 Variance modeling
4.6 Nonlinear least squares
4.7 Generalized linear models
4.8 Poisson regression
4.9 Calibration, prediction, and tolerance regions
4.10 Euclidean likelihood for regression and ANOVA
4.11 Bibliographic notes
4.12 Exercises

5 Empirical likelihood and smoothing
5.1 Kernel estimates
5.2 Bias and variance
5.3 EL for kernel smooths
5.4 Blood pressure trajectories
5.5 Conditional quantiles
5.6 Simultaneous inference
5.7 An additive model
5.8 Bibliographic notes
5.9 Exercises

6 Biased and incomplete samples
6.1 Biased sampling
6.2 Multiple biased samples
6.3 Truncation and censoring
6.4 NPMLEs for censored and truncated data
6.5 Product-limit estimators
6.6 EL for right censoring
6.7 Proportional hazards
6.8 Further empirical likelihood ratio results
6.9 Bibliographic notes
6.10 Exercises

7 Bands for distributions
7.1 The ECDF
7.2 Exact calibration of ECDF bands
7.3 Asymptotics of bands
7.4 Bibliographic notes

8 Dependent data
8.1 Time series
8.2 Reducing to independence
8.3 Blockwise empirical likelihood
8.4 Spectral method
8.5 Finite populations
8.6 MELEs using side information
8.7 Sampling designs
8.8 Empirical likelihood ratios for finite populations
8.9 Other dependent data
8.10 Bibliographic notes
8.11 Exercises

9 Hybrids and connections
9.1 Product of parametric and empirical likelihoods
9.2 Parametric conditional likelihood
9.3 Parametric models for data ranges
9.4 Empirical likelihood and Bayes
9.5 Bayesian bootstrap
9.6 Least favorable families and nonparametric tilting
9.7 Bootstrap likelihood
9.8 Bootstrapping from an NPMLE
9.9 Jackknives
9.10 Sieves
9.11 Bibliographic notes
9.12 Exercises

10 Challenges for EL
10.1 Symmetry
10.2 Independence
10.3 Comparison to permutation tests
10.4 Convex hull condition
10.5 Inequality and qualitative constraints
10.6 Nonsmooth estimating equations
10.7 Adverse estimating equations and black boxes
10.8 Bibliographic notes
10.9 Exercises

11 Some proofs
11.1 Lemmas
11.2 Univariate and Vector ELT
11.3 Triangular array ELT
11.4 Multi-sample ELT
11.5 Bibliographic notes

12 Algorithms
12.1 Statistical tasks
12.2 Smooth optimization
12.3 Estimating equation methods
12.4 Partial derivatives
12.5 Primal problem
12.6 Sequential linearization
12.7 Bibliographic notes

13 Higher order asymptotics
13.1 Bartlett correction
13.2 Bartlett correction and smooth functions of means
13.3 Pseudo-likelihood theory
13.4 Signed root correction
13.5 Large deviations
13.6 Bibliographic notes
13.7 Exercises

Appendix
A.1 Order and stochastic order notation
A.2 Parametric models
A.3 Likelihood
A.4 The bootstrap idea
A.5 Bootstrap confidence intervals
A.6 Better bootstrap confidence intervals
A.7 Bibliographic notes

References

Preface

Empirical likelihood is a nonparametric method of inference based on a data-driven likelihood ratio function. Like the bootstrap and jackknife, empirical likelihood inference does not require us to specify a family of distributions for the data. Like parametric likelihood methods, empirical likelihood makes an automatic determination of the shape of confidence regions; it straightforwardly
incorporates side information expressed through constraints or prior distributions; it extends to biased sampling and censored data, and it has very favorable asymptotic power properties. Empirical likelihood can be thought of as a bootstrap that does not resample, and as a likelihood without parametric assumptions.

This book describes and illustrates empirical likelihood inference, a subject that is ready for a book although it is still undergoing active development. Most of the published literature has emphasized mathematical study of asymptotics and simulations of coverage properties. This book emphasizes analyzing data in ways that illustrate the power and flexibility of empirical likelihood inference. The presentation is aimed primarily at students and at practitioners looking for new ways to handle their data. It is also aimed at researchers looking for new challenges.

The first four chapters form the core of the book. Chapters 5 through 8 extend the ideas to problems such as smoothing, biased sampling, censored and truncated data, confidence bands, time series, and finite populations. Chapter 9 relates empirical likelihood to other methods, and presents some hybrids. Chapter 10 describes some challenges and results near the research frontier. Chapters 11 through 13 contain proofs, computational details, and more advanced theory, respectively. An appendix collects some background material. A course in empirical likelihood could be designed around Chapters 1 through 4, supplemented with those other topics of most interest to the instructor and students.

Much more could have been written about some aspects of empirical likelihood. Applications and theory relevant to survival analysis and to econometrics come to mind, as do recent developments in kernel smoothing and finite population sampling.

The mathematical level of the text is geared towards students and practitioners, and where possible, the presentation stays close to the data. In particular, measure theoretic subtleties are ignored. Some very short and simple proofs are embedded in the text.
Longer or more technical theoretical discussions are confined to their own chapters. Finally, the most difficult results are only outlined, and the reader is referred to the literature for the details. A parallel triage has been applied to computational issues.

The worked examples in the text all use real data, instead of simulated, synthetic, or hypothetical data. I believe that all of the statistical problems illustrated are important ones, although in some examples the method illustrated is not one for which the data were gathered. The reader is asked to indulge some statistical license here, and to imagine his or her own data in the place of the illustrating data.

There is an empirical likelihood home page. At the time of writing the URL for that page is:

http://www.stanford.edu/~owen/empirical

The web site is for software, images, and other information related to empirical likelihood.

As an undergraduate, I studied statistics and computer science at the University of Waterloo. The statistics professors there instilled in me a habit of turning first to the likelihood function, whenever an inference problem appeared. I arrived at Stanford University for graduate study at a time when there was a lot of excitement about nonparametric methods. Empirical likelihood is a way of remaining in both traditions.

The idea for empirical likelihood arose when Rupert Miller assigned a problem in survival analysis from Kalbfleisch & Prentice (1980). The problem was to work out the nonparametric likelihood ratio inferences for the survival function as described in Thomas & Grunkemeier (1975). Around that time, there was a debate among some statistics students as to whether nonparametric confidence intervals for a univariate mean should point in the direction that the data seemed to be skewed, or in the opposite direction. There were intuitive arguments and existing methods to support either choice.
I looked into empirical likelihood to see if it might point the way, and was surprised to find a nonparametric analog of Wilks's theorem, with the same distribution as in parametric settings. I now call these ELTs (empirical likelihood theorems) after a referee remarked that they are not Wilks's theorem.

Empirical likelihood has been developed by many researchers, as is evident from the bibliographic notes in this text. It is hard to identify only a few contributions from the many, and leave some others out. But it would be harder still not to list the following: Peter Hall, Tom DiCiccio, and Joe Romano obtained some very difficult and significant results on higher order asymptotics. These include Bartlett correctability, signed root corrections, pseudo-likelihood theory, bootstrap calibration, and connections to least favorable families. Jing Qin is responsible for many very creative problem formulations mixing empirical and parametric likelihood, combining multiple biased samples, and with Jerry Lawless, establishing results on using empirical likelihood with side information. Per Mykland has shown how to handle dependent data in a martingale setting. Empirical likelihood for censored and truncated data has been investigated by Gang Li, Susan Murphy, and Aad van der Vaart. Yuichi Kitamura developed connections between empirical likelihood and modern econometrics, and has studied the large deviations properties of empirical likelihood.

I would like to thank Kirsty Stroud, Naomi Lynch, Hawk Denno, and Evelyn Meany of Chapman & Hall/CRC Press for watching over this book through the production process. It is a pleasure to acknowledge the National Science Foundation for supporting, in part, the writing of this book. I also thank the referees and editors who handled the early empirical likelihood papers. They were constructive and generous with their comments.
I thank Dan Bloch, Richard Gill, Ker-Ai Lee, Gang Li, Hal Stern, and Thomas Yee, who sent me some data; Judi Davis for entering some data; Ingram Olkin for some tips on indexing; Tomas Rokicki, Eric Sampson, Rob Tibshirani, and C. L. Tondo for some pointers on LaTeX and related topics; Philip Gill, Michael Saunders, and Walter Murray for discussions over the years on nonlinear optimization; George Judge for conversations about econometrics; and Balasubramanian Narasimhan, who kept the computers humming and always seemed to know what software tool I should learn next. I owe a debt to Jiahua Chen, David Cox, Nancy Glenn, Fred Hickernell, David Hinkley, Kristopher Jennings, Li-Zhi Liao, Terry Therneau, and Thomas Yee for help in proofreading. Of course, I am responsible for any flaws that remain. Finally, and most of all, I thank my wife, Patrizia, and my sons, Gregory and Elliot, for their patience, understanding, and encouragement while this book was being written.

Art Owen
Stanford, CA, U.S.A.
May 2001

CHAPTER 1

Introduction

Empirical likelihood is a nonparametric method of statistical inference. It allows the data analyst to use likelihood methods, without having to assume that the data come from a known family of distributions.

Likelihood methods are very effective. They can be used to find efficient estimators, and to construct tests with good power properties. Those tests can in turn be used to construct short confidence intervals or small confidence regions.

Likelihood is also very flexible, as will be seen in examples in this text. When the data are incompletely observed, or distorted, or sampled with a bias, likelihood methods can be used to offset or even correct for these problems. Likelihood can be used to pool information from different data sources. Knowledge arising from outside of the data can be incorporated, too.
This knowledge may take the form of constraints that restrict the domain of the likelihood function, or it may be in the form of a prior distribution to be multiplied by the likelihood function.

In parametric likelihood methods, we suppose that the joint distribution of all available data has a known form, apart from one or more unknown quantities. In a very simple example, there might only be one observed data value $X$ from a Poisson distribution. Then $\Pr(X = x) = \exp(-\theta)\theta^x / x!$ for integers $x \ge 0$, where $\theta \ge 0$ is unknown. The unknown $\theta$, called the parameter, is commonly a vector of values, and of course there is usually more than a single number in the data set.

A problem with parametric likelihood inferences is that we might not know which parametric family to use. Indeed there is no reason to suppose that a newly encountered set of data belongs to any of the well studied parametric families. Such misspecification can cause likelihood-based estimates to be inefficient. What may be worse is that the corresponding confidence intervals and tests can fail completely.

Many statisticians have turned to nonparametric inferences to avoid having to specify a parametric family for the data. In addition to empirical likelihood, these methods include the jackknife, the infinitesimal jackknife, and especially, several versions of the bootstrap. These nonparametric methods give confidence intervals and tests with validity not depending on strong distributional assumptions.

Each method has its advantages, as outlined in Chapter 1.2. For now, we note that the advantages of empirical likelihood arise because it combines the reliability of the nonparametric methods with the flexibility and effectiveness of the likelihood approach.

This first chapter begins by looking at some data, and then describes the advantages offered by empirical likelihood. Subsequent chapters develop the method, and explore nu
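
The Poisson example used above to introduce parametric likelihood can be made concrete with a short sketch. The code below is illustrative only and is not taken from the book: the function names and the counts are hypothetical, and it assumes several independent Poisson observations rather than the single value $X$ of the text. It evaluates $\Pr(X = x) = \exp(-\theta)\theta^x / x!$ and the resulting log likelihood in $\theta$, whose maximizer under these assumptions is the sample mean.

```python
import math


def poisson_pmf(x, theta):
    """Pr(X = x) = exp(-theta) * theta**x / x! for integer x >= 0."""
    if x < 0:
        return 0.0
    return math.exp(-theta) * theta ** x / math.factorial(x)


def poisson_log_likelihood(theta, data):
    """Log likelihood of theta > 0, assuming independent Poisson counts."""
    # log Pr(X = x) = -theta + x * log(theta) - log(x!)
    return sum(-theta + x * math.log(theta) - math.lgamma(x + 1) for x in data)


def poisson_mle(data):
    """Under the Poisson model the likelihood is maximized at the sample mean."""
    return sum(data) / len(data)


# Hypothetical counts, used only to exercise the functions above.
counts = [3, 0, 2, 4, 1]
theta_hat = poisson_mle(counts)
print(theta_hat, poisson_log_likelihood(theta_hat, counts))
```

If the data do not in fact come from a Poisson family, the estimate and any interval built from this likelihood inherit the misspecification problems described above, which is the motivation the text gives for nonparametric alternatives.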