Copyright 2002 by the Genetics Society of America

Modeling Epistasis of Quantitative Trait Loci Using Cockerham's Model

Chen-Hung Kao* and Zhao-Bang Zeng

*Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan, Republic of China and Bioinformatics Research Center, Departments of Statistics and Genetics, North Carolina State University, Raleigh, North Carolina 27695-7566

Manuscript received May 10, 2001
Accepted for publication January 9, 2002

Corresponding author: Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan, Republic of China. E-mail: chkao@stat.sinica.edu.tw

Genetics 160: 1243–1261 (March 2002)

ABSTRACT

We use the orthogonal contrast scales proposed by Cockerham to construct a genetic model, called Cockerham's model, for studying epistasis between genes. The properties of Cockerham's model in modeling and mapping epistatic genes under linkage equilibrium and disequilibrium are investigated and discussed. Because of its orthogonal property, Cockerham's model has several advantages in partitioning genetic variance into components, interpreting and estimating gene effects, and application to quantitative trait loci (QTL) mapping when compared to other models, and thus it can facilitate the study of epistasis between genes and be readily used in QTL mapping. The issues of QTL mapping with epistasis are also addressed. Real and simulated examples are used to illustrate Cockerham's model, compare different models, and map for epistatic QTL. Finally, we extend Cockerham's model to multiple loci and discuss its applications to QTL mapping.

GENES interact when they express their effects; i.e., the effects of genotypes at one locus depend on what genotypes are present at other loci. Interaction (epistasis) between genes affecting qualitative trait variation has been recognized since the work of Gregor Mendel in 1865. Although evidence of epistasis between genes controlling quantitative traits [quantitative trait loci (QTL)] has been reported by traditional techniques, such as variance component analyses (Brim and Cockerham 1961; Lee et al. 1968; Stuber and Moll 1971), epistasis between individual QTL generally has been difficult to discern with such techniques. The recent advances in molecular biology have allowed fine-scale genetic marker maps of various organisms to be constructed for the study of individual QTL. Using such maps, statistical methods for estimating the positions and effects of individual QTL (QTL mapping) have been proposed (Lander and Botstein 1989; Jansen 1993; Zeng 1994; Kao et al. 1999; Sen and Churchill 2001). The problem of epistasis has been considered in some QTL mapping studies (e.g., Stuber et al. 1992; Cheverud and Routman 1995; Doebley et al. 1995; Cockerham and Zeng 1996; Kao et al. 1999; Goodnight 2000; Zeng et al. 2000), but not sufficiently, and many theoretical and statistical issues involved with epistasis have not been discussed. Here, we discuss a genetic model, called Cockerham's model, in relation to QTL mapping with epistasis. We also investigate the model properties under linkage disequilibrium.

Fisher (1918) first partitioned genetic variance into components corresponding to additive, dominance, and epistatic variances using the least-squares principle. Cockerham (1954) further partitioned the epistatic variance into components using orthogonal contrasts. Kempthorne (1957) and Hayman and Mather (1955) adopted the same epistasis model. Hayman and Mather (1955) and Mather (1967) proposed other epistasis models for modeling epistasis. Van Der Veen (1959) reviewed the genetic models of digenic epistasis published by then and summarized them into three categories:

a. The pure-line-metric or $F_\infty$-metric model ($F_\infty$ denotes the population derived from selfing $F_2$ individuals for t generations as $t \to \infty$): The parameters in the $F_\infty$-metric model show orthogonality with respect to genotypic frequencies of an $F_\infty$ population under linkage equilibrium.
b. The $F_2$-metric model (corresponding to Cockerham's model): The parameters in the $F_2$-metric model are mutually orthogonal with respect to genotypic frequencies of an $F_2$ population under linkage equilibrium.
c. The mixed-metric model (corresponding to Hayman and Mather's model): The mixed-metric model is a mixture of Cockerham's model and the $F_\infty$-metric model, and it can be transformed to the $F_2$-metric model by subtraction of the mean.

Later, Crow and Kimura (1970), Mather and Jinks (1982), Haley and Knott (1992), and Kearsey and Pooni (1996) applied the $F_\infty$-metric model to the study of epistasis between genes, and Goodnight (2000) adopted an alternative model modified from Cockerham (1954) to study gene interaction. Although these three models can be translated to each other by addition
or subtraction of a constant (see Table 1 of Van Der Veen's 1959 article), they have different meanings in interpreting gene effects, show different structures of variance components, and possess different properties in statistical estimation, which may affect the study of QTL, as shown in this article.

In this article, we start from the traditional partition of genetic variance into variance components using Cockerham's (1954) orthogonal contrasts, then lead up to a definition of the genetic parameters for genetic effects, and present Cockerham's epistasis model. The properties of Cockerham's model in modeling and mapping epistatic genes are investigated when genes are in linkage equilibrium and disequilibrium. The differences between Cockerham's model and the other models are compared, and the advantages of Cockerham's model are discussed. We show that Cockerham's model is more appropriate than the other models for the study of epistasis between genes and for QTL mapping in populations such as $F_2$ and backcross. Real and simulated examples are used to illustrate Cockerham's model, compare different genetic models in the analysis of epistasis between genes, and map for epistatic QTL. Finally, we generalize Cockerham's model to multiple loci and discuss its applications to QTL mapping.

COCKERHAM'S GENETIC MODEL

Cockerham (1954) used eight orthogonal contrast scales to partition the genetic variance contributed by two genes into eight components and to define the genotypic value of a genotype to find the correlation between relatives in a population. His definition of genotypic value using the orthogonal scales leads the way to constructing a genetic model, called Cockerham's model, for modeling epistasis and defining gene effects in a population. In this section, the orthogonal contrast scales are introduced to present Cockerham's model, and the genetic parameters of Cockerham's model are defined. The similarities and differences between Cockerham's model and alternative models are compared, and their variance component structures are presented.

Orthogonal contrasts: Assuming that allele frequencies at one locus are uncorrelated with frequencies at another locus (two loci are in linkage equilibrium), Cockerham (1954) partitioned the genetic variance caused by two loci, A and B, each with two alleles (A, a, and B, b), of a diploid organism using the orthogonal contrast scales in Table 2 of his article. The scales $W_t$'s, which are functions of genotypic frequencies $p_{ij}$'s, have to satisfy two requirements

$$\sum_{i,j} p_{ij} W_{tij} = 0 \quad \text{and} \quad \sum_{i,j} p_{ij} W_{tij} W_{t'ij} = 0 \quad (t \neq t'),$$

where i (j) indexed by 2, 1, or 0 refers to the genotype AA (BB), Aa (Bb), or aa (bb) at locus A (B), and $W_{tij}$ is the scale component of genotype ij for the tth contrast. The first requirement ensures that deviations around the mean are compared (the scales $W_{tij}$'s are contrasts). The second requirement ensures that the contrasts are orthogonal. $W_1$ and $W_2$ ($W_3$ and $W_4$) are the linear and quadratic orthogonal contrasts for locus A (locus B). $W_5$ is the linear × linear contrast. $W_6$ is the linear × quadratic contrast. $W_7$ is the quadratic × linear contrast. $W_8$ is the quadratic × quadratic contrast. Cockerham's orthogonal contrast scales serve the same purpose as the orthogonal contrasts for partitioning the sum of squares due to treatment into independent single-degree-of-freedom components in experimental design (Steel and Torrie 1981). The statistical linear and quadratic terms correspond to the genetical additive and dominance terms, respectively. Cockerham used these orthogonal scales to partition the genetic variance and obtained the component of variance $\sigma_t^2$ due to orthogonal scale $W_t$ as

$$\sigma_t^2 = \frac{\left(\sum_{i,j} p_{ij} G_{ij} W_{tij}\right)^2}{\sum_{i,j} p_{ij} W_{tij}^2},$$

where $G_{ij}$ denotes the genotypic value of the genotype ij. He also defined $G_{ij}$ in terms of the scales as

$$G_{ij} = E_0 + \sum_{t=1}^{8} E_t W_{tij}, \tag{1}$$

where $E_t$'s are the corresponding coefficients, obtained by solving the equations themselves, and used it to find the correlation between relatives in a population. His idea of defining the genotypic value by the orthogonal contrast scales leads up to Cockerham's genetic model for modeling epistasis between genes.

Cockerham's genetic model: We now apply Cockerham's orthogonal contrast scales to the $F_2$ population to derive Cockerham's model for the $F_2$ population. For an $F_2$ population, the genotypic frequencies of the nine genotypes AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb, and aabb are 1/16, 1/8, 1/16, 1/8, 1/4, 1/8, 1/16, 1/8, and 1/16, respectively, and Cockerham's orthogonal contrasts can be modified as shown in Table 1 (see also Cockerham and Zeng 1996). By solving Equation 1 with the scales in Table 1, the unique solutions of the coefficients in terms of the genotypic values are

$$E_0 = \frac{G_{22}}{16} + \frac{G_{21}}{8} + \frac{G_{20}}{16} + \frac{G_{12}}{8} + \frac{G_{11}}{4} + \frac{G_{10}}{8} + \frac{G_{02}}{16} + \frac{G_{01}}{8} + \frac{G_{00}}{16}, \tag{2}$$

$$E_1 = \frac{G_{22}}{8} + \frac{G_{21}}{4} + \frac{G_{20}}{8} - \frac{G_{02}}{8} - \frac{G_{01}}{4} - \frac{G_{00}}{8}, \tag{3}$$

$$E_2 = \frac{G_{12}}{4} + \frac{G_{11}}{2} + \frac{G_{10}}{4} - \frac{G_{22}}{8} - \frac{G_{21}}{4} - \frac{G_{20}}{8} - \frac{G_{02}}{8} - \frac{G_{01}}{4} - \frac{G_{00}}{8}, \tag{4}$$

TABLE 1
The eight orthogonal contrast scales (W's) for the F2 population

Genotype   AABB   AABb   AAbb   AaBB   AaBb   Aabb   aaBB   aaBb   aabb
G          G22    G21    G20    G12    G11    G10    G02    G01    G00
P          1/16   1/8    1/16   1/8    1/4    1/8    1/16   1/8    1/16
W1          1      1      1      0      0      0     −1     −1     −1
W2         −1/2   −1/2   −1/2    1/2    1/2    1/2   −1/2   −1/2   −1/2
W3          1      0     −1      1      0     −1      1      0     −1
W4         −1/2    1/2   −1/2   −1/2    1/2   −1/2   −1/2    1/2   −1/2
W5          1      0     −1      0      0      0     −1      0      1
W6         −1/2    1/2   −1/2    0      0      0      1/2   −1/2    1/2
W7         −1/2    0      1/2    1/2    0     −1/2   −1/2    0      1/2
W8          1/4   −1/4    1/4   −1/4    1/4   −1/4    1/4   −1/4    1/4
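The two requirements on the scales, and the way Equation 1 is solved, can be checked numerically for the $F_2$ scales of Table 1. The following is a minimal sketch, not the authors' procedure: it assumes the Table 1 values as reconstructed here, uses made-up genotypic values `G`, and relies on the standard fact that, for mutually orthogonal scales, each coefficient is the frequency-weighted projection of G on its scale. The helper name `wsum` is hypothetical.

```python
from fractions import Fraction as F

# Genotype order: AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb, aabb
P = [F(1, 16), F(1, 8), F(1, 16), F(1, 8), F(1, 4), F(1, 8), F(1, 16), F(1, 8), F(1, 16)]
h, q = F(1, 2), F(1, 4)
W = [
    [1, 1, 1, 0, 0, 0, -1, -1, -1],      # W1: linear, locus A
    [-h, -h, -h, h, h, h, -h, -h, -h],   # W2: quadratic, locus A
    [1, 0, -1, 1, 0, -1, 1, 0, -1],      # W3: linear, locus B
    [-h, h, -h, -h, h, -h, -h, h, -h],   # W4: quadratic, locus B
    [1, 0, -1, 0, 0, 0, -1, 0, 1],       # W5: linear x linear
    [-h, h, -h, 0, 0, 0, h, -h, h],      # W6: linear x quadratic
    [-h, 0, h, h, 0, -h, -h, 0, h],      # W7: quadratic x linear
    [q, -q, q, -q, q, -q, q, -q, q],     # W8: quadratic x quadratic
]

def wsum(*cols):
    """Frequency-weighted sum over genotypes of the product of the given columns."""
    total = F(0)
    for k, p in enumerate(P):
        term = p
        for col in cols:
            term *= col[k]
        total += term
    return total

# Requirement 1: every scale is a contrast (weighted mean zero).
zero_mean = all(wsum(w) == 0 for w in W)
# Requirement 2: every pair of distinct scales is orthogonal.
orthogonal = all(wsum(W[s], W[t]) == 0 for s in range(8) for t in range(s + 1, 8))

# Hypothetical genotypic values (illustration only, not data from the article).
G = [F(v) for v in (12, 9, 7, 10, 11, 6, 4, 5, 3)]

E0 = wsum(G)                                          # weighted mean
E = [wsum(G, w) / wsum(w, w) for w in W]              # coefficients E1..E8 by projection
sigma2 = [wsum(G, w) ** 2 / wsum(w, w) for w in W]    # variance due to each scale
total_var = wsum(G, G) - E0 ** 2                      # total genetic variance

# Equation 1 rebuilt from the coefficients, genotype by genotype.
rebuilt = [E0 + sum(E[t] * W[t][k] for t in range(8)) for k in range(9)]

print(zero_mean, orthogonal, rebuilt == G, sum(sigma2) == total_var)
```

Exact fractions are used so that the orthogonality checks and the equality between the eight components and the total variance are not blurred by floating-point rounding.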
35、1100212H1100212012W814H110021414H110021414H110021414H110021414Gs and Ps denote the genotypic values and expected genotypic frequencies for the nine genotypes of twounlinked genes, A and B.AA and aa and thus is defined as the genetic parameterof dominance effect of gene A, d1. The same argumentE3H110
36、05G228H11001G124H11001G028H11002G208H11002G104H11002G008,(5)leads us to define coefficients E3and E4as the geneticparameters of additive and dominance effects of geneE4H11005G218H11001G114H11001G018H11002G2216H11002G128B, a2and d2. If the substitution effects at one locusdepend on genotypes at the o
37、ther locus, there is aninteraction between the two genes in the usual sense.H11002G0216H11002G2016H11002G108H11002G0016,(6)Coefficient E5quantifies the difference between addi-tive effects of gene A (gene B), (G2*H11002 G0*)/2 (G*2H11002E5H11005(G22H11002 G02) H11002 (G20H11002 G00)4G*0)/2, in the b
38、ackground of two different homozy-gotes of gene B (gene A), BB and bb (AA and aa), andthis difference is defined as the genetic parameter ofH11005(G22H11002 G20) H11002 (G02H11002 G00)4,(7)additive H11003 additive epistatic effect, iaa. The larger thedifference is, the stronger the interaction is. T
39、he sameargument leads to the definitions of E6, E7, and E8asE6H11005(2G21H11002 G22H11002 G20) H11002 (2G01H11002 G02H11002 G00)4,(8)the genetic parameters of additive H11003 dominance, iad;dominance H11003 additive, ida; and dominance H11003 domi-E7H11005(2G12H11002 G22H11002 G02) H11002 (2G10H1100
40、2 G20H11002 G00)4,(9)nance, idd; epistatic effects between genes A and B. Thedefinitions of these nine genetic parameters are summa-rized in Table 2. After defining the genetic parametersE8H110052(2G11H11002 G21H11002 G01) H11002 (2G12H11002 G22H11002 G02)4of genetic effects, Equation 1 can be expre
41、ssed moresuccinctly asH11002(2G10H11002 G20H11002 G00)4GijH11005H9262H11001a1x1H11001 d1z1H11001 a2x2H11001 d2z2H11001 iaawaaH11001 iadwadH11001 idawdaH11001 iddwdd, (11)H110052(2G11H11002 G12H11002 G10) H11002 (2G21H11002 G22H11002 G20)4by defining the coded variables asH11002(2G01H11002 G02H11002
42、G00)4.(10)x1H11005H209021 ifAisAA0 ifAisAaH110021 ifAisaa,x2H11005H209021 ifBisBB0 ifBisBbH110021 ifBisbb,If the two genes are in linkage equilibrium, E0is themean of the genotypic values, G, and therefore can bedenoted as H9262. Coefficient E1is equivalent to (G2.H11002z1H11005H2090212ifAisAaH11002
43、12otherwise,z2H11005H2090212ifBisBbH1100212otherwise,G0.)/2, which is one-half of the difference in genotypicvalue between the two homozygote means of AA and aawaaH11005 x1H11003 x2, wadH11005 x1H11003 z2, wdaH11005 z1H11003 x2,and thus is defined as the genetic parameter of additiveeffects of gene
44、A, a1. Coefficient E2is equivalent towddH11005 z1H11003 z2.(2G1.H11002 G2.H11002 G0.)/2, which represents the departureThe coded variables of this model are mutually indepen-in genotypic value of the heterozygote mean of Aa fromthe midpoint between the two homozygote means of dent to each other due
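The coding of Equation 11 can be exercised directly. The sketch below is an illustration under stated assumptions rather than the authors' code: the effect values are made up, the function names `coded`, `cross`, and `mean_A` are hypothetical, and $F_2$ frequencies under linkage equilibrium are assumed. It builds the nine genotypic values from chosen effects, checks that the coded variables are mutually orthogonal under those frequencies, and recovers the effects from the marginal-mean forms of $E_1$ and $E_2$ and from the closed forms in Equations 7 to 10.

```python
from fractions import Fraction as F
from itertools import product

def coded(i, j):
    """Coded variables of Equation 11 for genotype ij (2, 1, 0 = AA/BB, Aa/Bb, aa/bb)."""
    x1, x2 = i - 1, j - 1
    z1 = F(1, 2) if i == 1 else F(-1, 2)
    z2 = F(1, 2) if j == 1 else F(-1, 2)
    return [x1, z1, x2, z2, x1 * x2, x1 * z2, z1 * x2, z1 * z2]

# Hypothetical parameter values: mu, then a1, d1, a2, d2, iaa, iad, ida, idd.
mu = F(10)
effects = [F(2), F(1), F(3, 2), F(-1, 2), F(1), F(1, 2), F(-1, 4), F(2)]

genotypes = list(product((2, 1, 0), repeat=2))
G = {g: mu + sum(e * v for e, v in zip(effects, coded(*g))) for g in genotypes}

# F2 single-locus frequencies; two-locus frequencies assume linkage equilibrium.
freq = {2: F(1, 4), 1: F(1, 2), 0: F(1, 4)}
p = {(i, j): freq[i] * freq[j] for i, j in genotypes}

def cross(s, t):
    """Frequency-weighted cross product of coded variables s and t."""
    return sum(p[g] * coded(*g)[s] * coded(*g)[t] for g in genotypes)

orthogonal = all(cross(s, t) == 0 for s in range(8) for t in range(s + 1, 8))

def mean_A(i):
    """Mean genotypic value of locus-A genotype i, averaged over locus B."""
    return sum(freq[j] * G[i, j] for j in (2, 1, 0))

a1 = (mean_A(2) - mean_A(0)) / 2
d1 = (2 * mean_A(1) - mean_A(2) - mean_A(0)) / 2
iaa = ((G[2, 2] - G[0, 2]) - (G[2, 0] - G[0, 0])) / 4                      # Eq. 7
iad = ((2 * G[2, 1] - G[2, 2] - G[2, 0])
       - (2 * G[0, 1] - G[0, 2] - G[0, 0])) / 4                            # Eq. 8
ida = ((2 * G[1, 2] - G[2, 2] - G[0, 2])
       - (2 * G[1, 0] - G[2, 0] - G[0, 0])) / 4                            # Eq. 9
idd = (2 * (2 * G[1, 1] - G[2, 1] - G[0, 1])
       - (2 * G[1, 2] - G[2, 2] - G[0, 2])
       - (2 * G[1, 0] - G[2, 0] - G[0, 0])) / 4                            # Eq. 10

recovered = [a1, d1, iaa, iad, ida, idd]
print(orthogonal, recovered == [effects[0], effects[1]] + effects[4:])
```

Indexing genotypes by 2, 1, 0 follows the convention in the text, so `x1 = i - 1` reproduces the 1, 0, −1 coding without a lookup table.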
The model can also be represented in a different form

TABLE 2
Definition of genetic parameters

Solution   Parameter definition                            Notation
E0         Mean                                            μ
E1         Additive effect of locus A                      a1
E2         Dominance effect of locus A                     d1
E3         Additive effect of locus B                      a2
E4         Dominance effect of locus B                     d2
E5         Additive × additive effect of loci A and B      iaa
E6         Additive × dominance effect of loci A and B     iad
E7         Dominance × additive effect of loci A and B     ida
E8         Dominance × dominance effect of loci A and B    idd

Ei's are the solutions of Equation 1 with the orthogonal contrast scales in Table 1. The exact expressions of Ei's are shown in Equations 2–10.

tion, the structure of variance components for the total genetic variance, $V_G$, contributed by the two genes, each with two alleles, is shown in Appendix C. From Appendix C, we can see that the total genetic variance is composed of the genetic variances of individual effects and the covariances between different effects, and that it changes with gene frequencies (p's) and linkage disequilibrium (D). The relative strengths of the genetic effects therefore vary with gene frequency and linkage disequilibrium. For an $F_2$ population ($p_A = p_B = 0.5$), the total genetic variance reduces to Equation 34 and contains covariances between different genetic effects through linkage. If genes are unlinked in the $F_2$ population ($p_A = p_B = 0.5$ and $D = 0$), the total genetic variance can be partitioned into eight independent components without covariance as

$$V_G = \frac{1}{2}a_1^2 + \frac{1}{4}d_1^2 + \frac{1}{2}a_2^2 + \frac{1}{4}d_2^2 + \frac{1}{4}i_{aa}^2 + \frac{1}{8}i_{ad}^2 + \frac{1}{8}i_{da}^2 + \frac{1}{16}i_{dd}^2. \tag{12}$$

Each variance component is contributed by its own ge-