Theoretical Expectations for the Potential Use of Genetic Markers in Marker-assisted Selection

J.E. Staub and Thomas Horejsi

USDA-ARS, Vegetable Crops Research Unit, Horticulture Department, University of Wisconsin, Madison, WI 53706

Additional index words. computer simulation, gain from selection, linkage, linkage phase, selection response, marker interval

Abstract. From theoretical estimates of gamete frequencies, equations were constructed and used to calculate the relative expected efficiency of codominant and dominant markers in marker-assisted selection (MAS). The frequency distributions of populations were affected by selection pressure intensity (i.e., the number and timing of selection events). These frequencies were dramatically affected by linkage phase and distance, and thus a knowledge of linkage characteristics in populations is critical to the potential success of MAS. The efficacy of random amplified polymorphic DNA (RAPD) markers was evaluated through simulation. The most effective RAPD markers to use in MAS are flanking markers linked in repulsion phase to a target trait. Two repulsion phase RAPD markers that flank downy mildew resistance (dm) at distances of 9.9 and 16.5 cM were used for evaluation. Theoretical estimates indicate that these markers would be expected to have a selection efficiency of 96% when applied strategically in a MAS breeding program.


Phenotypic selection for plant improvement depends upon the ability to capitalize on genetic effects that can be distinguished from environmental effects. Phenotypic selection based on traits that are conditioned by additive allelic effects can produce dramatic, economically important changes in breeding populations. Genetic markers have potential in marker-assisted selection (MAS) when they are linked to economically important traits, especially those conditioned by additive allelic effects. MAS can increase selection efficiency by allowing for earlier selection and reducing the plant population size used during selection (Beckman and Soller, 1983; Darvasi and Soller, 1994). The efficiency of marker loci during phenotypic selection is, however, dependent upon many factors, and predictions of response to selection (R) or genetic gain (G) are often difficult (Lark et al., 1995).

In order to use MAS in a plant improvement program, the position of target loci and information regarding their association with selectable markers must be known. The presence of a tight linkage (<10 cM) between qualitative trait(s) and a genetic marker(s) may be useful in MAS to increase G (Kennard et al., 1994; Paran et al., 1991; Timmerman et al., 1994). Likewise, selection for multiple loci or quantitative trait loci (QTL) using genetic markers can be effective if a significant association is found between a quantitative trait and markers (Edwards et al., 1987; Edwards and Page, 1994).

The development of genetic maps in the Cucurbitaceae has not been as rapid as in other crop species. Published information regarding genetic linkages among economically important traits in melon is sparse (McCreight, 1983; Pitrat, 1984, 1991). Pitrat (1994) has summarized linkages among economically important traits in melon (Cucumis melo L.). The 13 linkage groups consist of one group with five markers, one group with four markers, two groups with three markers, and four groups with two markers. The remaining five groups consist of one marker each. Recently, Baudracco-Arnas and Pitrat (1996) have constructed a 103-point map in melon using isozymes, RFLPs, RAPD markers, and disease resistance markers. This map is the first comprehensive report of the molecular structure of the melon genome.

Cucurbitaceae '98

Genetic maps have been constructed in cucumber (C. sativus L.) using an array of genetic markers. The cucumber genome (750 to 1000 cM in length) is predicted to have seven linkage groups (Staub and Meglic, 1993). Knerr and Staub (1992) assigned 12 of the 14 isozyme loci in cucumber to four linkage groups spanning 215 cM. More recently, 21 polymorphic and 17 monomorphic cucumber isozyme loci were identified in 15 enzyme systems (Meglic and Staub, 1996). Nine morphological markers were found to be linked to isozyme loci and were integrated to form a map containing four linkage groups spanning 584 cM with a mean linkage distance of 19 cM. Kennard et al. (1994) used RFLP, RAPD, isozyme, morphological, and disease resistance markers to identify 10 linkage groups in cucumber. They constructed a 58-point and a 70-point map using a narrow (C. sativus var. sativus x var. sativus) and a wide (C. sativus var. sativus x var. hardwickii) cross, respectively. Although 10 linkage groups were identified in each map, markers in the narrow cross spanned a genomic length of 766 cM (mean marker interval = 13 cM), whereas markers in the wide cross spanned 480 cM (mean marker interval = 7 cM). An 83-point RAPD and morphological map has recently been constructed in cucumber which spans 630 cM (mean marker interval = 7.6 cM) in a relatively wide C. sativus var. sativus cross (line G421 x line H-19) (Serquen et al., 1997). This map included genes for gynoecious sex expression (F), determinate plant habit (de), and little leaf (l) traits as well as yield components (QTL).

The significance of these maps (either in melon or cucumber) for use in MAS has not been documented. In cucumber, map saturation (>200 markers at 10 cM intervals) has not been achieved, largely because of the types of markers that have been available (isozyme, RFLP, and RAPD) and the lack of genetic variability in cucumber (3% to 10%). Nevertheless, isozyme loci have been found linked to economically important Mendelian-inherited traits (Meglic and Staub, 1996). Likewise, QTL have been identified in studies using RFLPs (Kennard and Havey, 1995; Dijkhuizen, 1994) and RAPD markers (Serquen et al., 1997) that explain significant portions of the genetic variation for yield and quality components in processing cucumber with relatively high confidence (LOD >4.0).

The trait/marker associations defined in the current cucumber and melon maps could be useful in MAS. Those relationships have, however, not been rigorously tested in applied breeding programs. One way of evaluating the potential effectiveness of markers in MAS is through computer simulation where genetic information (i.e., marker/trait linkage relationships) is used as input or process data. This report presents theoretical expectations for selection based on a consideration of linkage phase and distance between codominant and dominant (e.g., RAPD) markers and Mendelian inherited target traits. An example in cucumber is used to demonstrate the potential use of loosely linked, dominant marker loci. Such simulation can provide predictions on marker efficacy for plant improvement.

Material and methods

Change in phenotypic frequency during selection of codominant loci

Computer model. Effects of linkage distance and phase on progeny distributions with or without selection were simulated using previously developed computer-based algorithms (Staub, 1994). Simulation of linkage effects was applied assuming a two-locus codominance model. As such, there can be either no linkage (r = 0.5), or linkage can occur in the coupling or repulsion phase. The coupling (C) and repulsion (R) phases of linkage can be distinguished from each other by the gametes produced by the double-heterozygous individual such that

Coupling phase: AaBb (arranged as AB/ab) produces gametes AB and ab (nonrecombinant) each in frequency $(1 - r)/2$, and Ab and aB (recombinant) each in frequency $r/2$.

Repulsion phase: AaBb (arranged as Ab/aB) produces gametes Ab and aB (nonrecombinant) each in frequency $(1 - r)/2$, and AB and ab (recombinant) each in frequency $r/2$.


Simulation experiments using codominant loci. Simulations were conducted using hypothetical diploid populations (n = 1000) which either were or were not subjected to selection. Selection was applied to the Ab gamete after an initial cross of AaBb x AaBb individuals, either recurrently in each of 10 generations, or nonrecurrently in generation 1 only or in generations 1 and 3 of 10 generations. Test simulations using codominant loci were performed to determine the effect of linkage phase and linkage strength (r = 0.1 to 0.5) on the frequency of predicted phenotypes (A_B_, A_bb, aaB_, aabb). Depending on the selection imposed and the linkage relationships, gametic frequency can be predicted to change in each generation given the constraints of a specific selection model. These predicted gametic frequency changes formed the basis of phenotype changes in each generation.
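The relationship between linkage phase, recombination fraction, and first-generation phenotype classes can be illustrated with a minimal Python sketch. It is not the program of Staub (1994); the function names are hypothetical, and the sketch assumes random union of gametes from two identical AaBb parents with no selection.

```python
from itertools import product


def gamete_frequencies(r, phase):
    """Gamete frequencies from a double heterozygote AaBb.

    r     : recombination fraction between loci A and B (0 <= r <= 0.5)
    phase : "coupling" (AB/ab) or "repulsion" (Ab/aB)
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    parental, recombinant = (1.0 - r) / 2.0, r / 2.0
    if phase == "coupling":
        return {"AB": parental, "ab": parental, "Ab": recombinant, "aB": recombinant}
    if phase == "repulsion":
        return {"Ab": parental, "aB": parental, "AB": recombinant, "ab": recombinant}
    raise ValueError("phase must be 'coupling' or 'repulsion'")


def phenotype_frequencies(gametes):
    """Frequencies of the classes A_B_, A_bb, aaB_, aabb after random union of gametes."""
    classes = {"A_B_": 0.0, "A_bb": 0.0, "aaB_": 0.0, "aabb": 0.0}
    for (g1, f1), (g2, f2) in product(gametes.items(), repeat=2):
        has_A = "A" in g1 or "A" in g2  # at least one A allele in the zygote
        has_B = "B" in g1 or "B" in g2  # at least one B allele in the zygote
        key = ("A_" if has_A else "aa") + ("B_" if has_B else "bb")
        classes[key] += f1 * f2
    return classes


if __name__ == "__main__":
    for phase in ("coupling", "repulsion"):
        print(phase, phenotype_frequencies(gamete_frequencies(0.1, phase)))
```

With r = 0.5 both phases give the same class frequencies, which is the no-linkage case described above.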

Predicted efficiency of dominant marker loci in MAS

Effects of linkage distance and phase in the case of dominant markers. Expectation equations for MAS efficiency were derived from theoretical gamete frequencies for markers (M) in repulsion or coupling phase, associated with the target trait (t) in single (Table 1) or flanking (Table 2) arrangements. MAS efficiency (u) is defined as the proportion of selections having the desired genotype among the total selections (Table 3).

The u of MAS can vary dramatically. It depends on the number of markers linked to the target trait (t) (single or flanking), their distance from it (recombination fraction), and their linkage phase (coupling or repulsion). Using the frequencies listed in Tables 1 and 2, the u of genetic markers during MAS can be expressed as expectation equations (Table 3). In addition to predicting efficiency, the derived equations allow comparisons between different dominant markers as single marker/trait associations (Table 1) and as flanking marker/trait associations (Table 2).
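As a worked illustration of how the single-marker expectation equations follow from the gamete frequencies, the derivation below is a sketch under assumptions that the text does not state explicitly: selection is practiced in an F2 from a selfed F1, the desired genotype is the recessive homozygote tt, band-present plants (M1_) are selected when the marker is in coupling with t, and band-absent plants (m1m1) are selected when it is in repulsion. Here a is the recombination fraction between M1 and t.

```latex
% Coupling: the F1 gametes carrying t are M1 t with frequency (1-a)/2
% and m1 t with frequency a/2.
\begin{align*}
P(tt) &= \tfrac{1}{4}, \qquad
P(tt,\; m_1 m_1) = \left(\tfrac{a}{2}\right)^{2}, \\
P(tt,\; M_1\_) &= \tfrac{1}{4} - \tfrac{a^{2}}{4} = \frac{1 - a^{2}}{4}, \qquad
P(M_1\_) = \tfrac{3}{4}, \\
u_c &= \frac{P(tt,\; M_1\_)}{P(M_1\_)} = \frac{1 - a^{2}}{3}.
\end{align*}

% Repulsion: the F1 gametes carrying t are m1 t with frequency (1-a)/2
% and M1 t with frequency a/2.
\begin{align*}
P(tt,\; m_1 m_1) &= \left(\frac{1 - a}{2}\right)^{2}, \qquad
P(m_1 m_1) = \tfrac{1}{4}, \\
u_r &= \frac{P(tt,\; m_1 m_1)}{P(m_1 m_1)} = (1 - a)^{2} = a^{2} - 2a + 1.
\end{align*}
```

Under these assumptions a coupling-phase dominant marker cannot exceed an efficiency of 1/3 even with complete linkage, whereas a repulsion-phase marker approaches 1 as a approaches 0.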

Application in cucumber. These expectation equations can be applied to linkage of genetic markers to the downy mildew gene (dm) in cucumber (Horejsi, 1997). The dm gene is linked to five RAPD markers: G14800, X151100, AS5800, BC5191100, and BC5261000. This information resulted from mapping in two crosses, WI 1983G (USDA line) x Straight 8 and Zudm-1 (Novartis line) x Straight 8. The RAPD markers were linked in WI 1983G x Straight 8 as BC5261000 - 10.2 cM - AS5800 - 27.3 cM - dm - 16.1 cM - G14800 - 5.0 cM - X151100 and in Zudm-1 x Straight 8 as BC5261000 - 10.2 cM - AS5800 - 25.0 cM - dm - 17.0 cM - G14800 - 0.0 cM - X151100. A knowledge of the marker map position (map information combined for simulation), linkage phase, and interval distance between these RAPD markers and the target trait allowed for the theoretical calculation of selection efficiencies for MAS.

Results and discussion

Various strategies for plant improvement have been tested by Lande and Thompson (1990) and Gimelfarb and Lande (1994) using computer simulations to characterize MAS and to provide expectations for phenotypic selection. Potential increases in breeding efficiency through MAS and the population size needed to attain such increases depend upon the genetic parameters (i.e., heritability, the proportion of the additive genetic variance explained by the marker loci) and the selection method used.


Table 1. Frequency of gametes produced by an F1 individual when RAPD marker M1 is in coupling or repulsion phase linkage to the target gene (t); T denotes the alternate allele at the target locus.

Gamete    Coupling        Repulsion
M1 T      1/2a (z)        1/2(1-a) (y)
M1 t      1/2(1-a) (y)    1/2a (z)
m1 T      1/2(1-a) (y)    1/2a (z)
m1 t      1/2a (z)        1/2(1-a) (y)

(z) Recombinant type.

(y) Parental type.


Table 2. Frequency of gametes produced by an F1 individual when RAPD markers M1 and M2 are linked in coupling, repulsion, or coupling and repulsion to the target trait (t); T denotes the alternate allele at the target locus.

Gamete     Coupling             Repulsion            Coupling/repulsion
M1 T M2    1/2ab (x)            1/2(1-a)(1-b) (y)    1/2a(1-b) (z)
M1 T m2    1/2a(1-b) (z)        1/2b(1-a) (z)        1/2ab (x)
M1 t m2    1/2b(1-a) (z)        1/2a(1-b) (z)        1/2(1-a)(1-b) (y)
M1 t M2    1/2(1-a)(1-b) (y)    1/2ab (x)            1/2b(1-a) (z)
m1 T M2    1/2b(1-a) (z)        1/2a(1-b) (z)        1/2(1-a)(1-b) (y)
m1 T m2    1/2(1-a)(1-b) (y)    1/2ab (x)            1/2b(1-a) (z)
m1 t M2    1/2a(1-b) (z)        1/2b(1-a) (z)        1/2ab (x)
m1 t m2    1/2ab (x)            1/2(1-a)(1-b) (y)    1/2a(1-b) (z)

(z) Single recombinant type.

(y) Parental type.

(x) Double recombinant type.

Change in phenotypic frequency during selection of codominant loci

Population structure is affected by many factors during selection. Genotypic distributions in nature are affected by mutation, selection dynamics, migration, and random genetic drift. The combined effects of linkage (both phase and distance) and selection on population structure are difficult to characterize because complete information about a population's genetic dynamics (parameters) is often unavailable. Some factors (e.g., genetic drift and migration) can be controlled during artificial selection, and thus predictions can be made more readily than for selection response in nature. The effects of linkage phase and distance are considered below in order to gain an understanding of the consequences of gametic selection (i.e., selection for Ab). It must be understood that the expectations provided are the result of rigorous constraints (i.e., no mutation, genetic drift, or migration), and are given without regard to other factors that may influence phenotypic distributions (e.g., restrictions to pollination). They do, however, provide a descriptive assessment of linkage during selection.

Effects of linkage distance and phase on progeny distributions with no selection over 10 generations. The distribution and frequency of phenotypes (A_B_, A_bb, aaB_, aabb) are presented after 10 cycles with no gametic selection (note the differences in scale between phenotypes on the y axes; Figure 1). The allele frequencies follow Hardy-Weinberg expectations (i.e., high frequencies of A_B_ phenotypes and lower frequencies of aabb). The frequencies of A_B_ and aabb phenotypes under coupling phase linkage of locus A to B are higher than those under repulsion linkage regardless of linkage distance. In contrast, the frequencies of A_bb and aaB_ phenotypes constrained by repulsion phase linkage are higher than those of their coupling phase counterparts.
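The no-selection case can be sketched with the standard two-locus recursion for random mating, in which linkage disequilibrium decays by a factor of (1 - r) each generation. The Python below is an illustrative assumption, not the simulation software used here: it treats the population as infinite (no drift), uses hypothetical function names, and counts the offspring of the initial AaBb x AaBb cross as generation 1.

```python
def f1_gametes(r, phase):
    """Gamete frequencies from AaBb in coupling ("C") or repulsion ("R") phase."""
    parental, recombinant = (1.0 - r) / 2.0, r / 2.0
    if phase == "C":
        return {"AB": parental, "ab": parental, "Ab": recombinant, "aB": recombinant}
    if phase == "R":
        return {"Ab": parental, "aB": parental, "AB": recombinant, "ab": recombinant}
    raise ValueError("phase must be 'C' or 'R'")


def next_generation(x, r):
    """Gamete frequencies after one generation of random mating, no selection."""
    d = x["AB"] * x["ab"] - x["Ab"] * x["aB"]  # linkage disequilibrium
    return {
        "AB": x["AB"] - r * d,
        "ab": x["ab"] - r * d,
        "Ab": x["Ab"] + r * d,
        "aB": x["aB"] + r * d,
    }


def phenotypes(x):
    """Phenotype class frequencies from random union of gametes x."""
    aabb = x["ab"] ** 2
    a_bb = (x["Ab"] + x["ab"]) ** 2 - aabb
    aab_ = (x["aB"] + x["ab"]) ** 2 - aabb
    return {"A_B_": 1.0 - aabb - a_bb - aab_, "A_bb": a_bb, "aaB_": aab_, "aabb": aabb}


def simulate(r, phase, generations=10):
    """Phenotype frequencies in the given generation (generation 1 = F1 offspring)."""
    x = f1_gametes(r, phase)
    for _ in range(generations - 1):
        x = next_generation(x, r)
    return phenotypes(x)


if __name__ == "__main__":
    for r in (0.1, 0.2, 0.3, 0.4, 0.5):
        print(r, simulate(r, "C"), simulate(r, "R"))
```

Because the sign of the initial disequilibrium is preserved, this sketch reproduces the qualitative pattern described above for r < 0.5: A_B_ and aabb are more frequent under coupling, and A_bb and aaB_ are more frequent under repulsion.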


Table 3. Expectation equations for assessing the efficiency of MAS (u) for a monogenic trait using dominant molecular markers in different linkage phases.(z)

Marker class: Equation

Single coupling phase: $u_c = \dfrac{-a^2 + 1}{3}$

Single repulsion phase: $u_r = a^2 - 2a + 1$

Flanking coupling phase: $u_{cc} = \dfrac{a^2b^2 - a^2 - b^2 + 1}{4a^2b^2 - 4a^2b - 4ab^2 + 6ab + a^2 + b^2 - 2a - 2b + 3}$

Flanking repulsion phase: $u_{rr} = \dfrac{a^2b^2 - 2a^2b - 2ab^2 + 4ab + a^2 + b^2 - 2a - 2b + 1}{4a^2b^2 - 4a^2b - 4ab^2 + 6ab + a^2 + b^2 - 2a - 2b + 1}$

Flanking coupling + repulsion phase: $u_{cr} = \dfrac{-a^2b^2 + 2a^2b - a^2 + b^2 - 2b + 1}{-4a^2b^2 + 4a^2b + 4ab^2 - 2ab - a^2 - b^2 + 1}$

(z) The MAS efficiency (u) equation, where a and b are recombination frequencies between the markers and the gene of interest.


Effects of linkage distance and phase on progeny distributions with selection over 10 generations. The distribution and frequency of phenotypes in a population under Ab gametic selection pressure differ significantly from those in a population where no selection is imposed (Figure 1). While the relative distributions of A_B_ and aabb remain unchanged after selection and are similar to the distributions predicted without selection, the frequency of A_bb phenotypes is greatly reduced (e.g., the A_bb phenotype is absent after 10 generations of recurrent selection) when compared to a population under no selection pressure. Conversely, the frequency of aaB_ phenotypes is predictably increased under Ab gametic selection when compared to a population under no selection pressure.

Linkage phase has a dramatic effect on the type and frequency of the recombinational phenotypes as a result of selection of Ab. Not only is the frequency of A_bb phenotypes lower than that observed without selection, but, in contrast to a population under no selection pressure, the frequency of A_bb phenotypes under coupling phase linkage is higher than that of individuals constrained by repulsion phase linkage (Figure 1). In contrast, aaB_ progeny in repulsion phase were in higher frequency than their coupling phase counterparts regardless of the time of selection.

The frequency distributions of populations (having specific linkage characteristics) were affected by selection pressure intensity (i.e., number and time of selection). The simulations presented herein demonstrate the importance of knowing the linkage characteristics of marker/trait associations for use in MAS (i.e., estimating R and G).

Predicted efficiency of dominant marker loci in MAS

Information from downy mildew resistance mapping studies (Horejsi, 1997) was used to evaluate the theoretical efficiency of MAS based on prediction equations (Table 3). The RAPD markers were reported to be loosely linked to dm. However, even with loose linkages these genetic marker loci might be useful for MAS (Table 4).

Flanking markers, especially markers in repulsion linkage phase, are predicted to be effective for selection. The flanking markers G14800 and BC5191100 reported herein are linked in repulsion and have potential for MAS. Using the MAS efficiency equations (Table 3), the selection efficiency ($u_{rr}$) for these markers is predicted to be 0.96 (Table 4). This efficiency value means that, theoretically, 96% of the plants selected on the basis of marker phenotype in specific segregating generations (i.e., F2) in these mapping populations would have the dm/dm genotype and 4% would be either Dm/dm or Dm/Dm. Thus, when these markers segregate in repulsion phase they have potential for increasing selection efficiency in a cucumber breeding program for downy mildew resistance.
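The predicted efficiency can be checked numerically with a short Python sketch of the Table 3 equations. The function name and the class codes are hypothetical, the equations are written in factored forms that are algebraically identical to the expanded polynomials, and the example assumes that the map distances of 9.9 and 16.5 cM can be used directly as recombination fractions (a = 0.099, b = 0.165), which is only an approximation.

```python
def mas_efficiency(marker_class, a, b=None):
    """Expected MAS efficiency u for a monogenic trait and dominant markers.

    marker_class : "c"  single marker, coupling
                   "r"  single marker, repulsion
                   "cc" flanking markers, both coupling
                   "rr" flanking markers, both repulsion
                   "cr" flanking markers, M1 coupling (a) and M2 repulsion (b)
    a, b         : recombination fractions between each marker and the gene
    """
    if marker_class == "c":
        return (1 - a ** 2) / 3
    if marker_class == "r":
        return (1 - a) ** 2
    if b is None:
        raise ValueError("flanking marker classes need both a and b")
    if marker_class == "cc":
        return (1 - a ** 2) * (1 - b ** 2) / (2 + (1 - a - b + 2 * a * b) ** 2)
    if marker_class == "rr":
        return ((1 - a) * (1 - b)) ** 2 / (1 - a - b + 2 * a * b) ** 2
    if marker_class == "cr":
        return (1 - a ** 2) * (1 - b) ** 2 / (1 - (a + b - 2 * a * b) ** 2)
    raise ValueError("unknown marker class: " + marker_class)


if __name__ == "__main__":
    # Flanking repulsion-phase markers, distances taken as recombination fractions.
    print(round(mas_efficiency("rr", 0.099, 0.165), 2))
    # Either marker used alone in repulsion phase, for comparison.
    print(round(mas_efficiency("r", 0.099), 2), round(mas_efficiency("r", 0.165), 2))
```

Under these assumptions the flanking repulsion-phase value rounds to 0.96, in agreement with the efficiency reported above, and it exceeds the value obtained from either marker used singly.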

Figure 1. Theoretical expectations of phenotypic frequencies in the tenth generation given the initial mating of AaBb x AaBb and imposed selection for ab gametes with consideration of recombination (r = 0.0 to 0.5) and linkage phase; C = coupling, R = repulsion.

Table 4. Theoretical marker-assisted selection efficiencies for downy mildew resistance using single or flanking RAPD markers in coupling, repulsion, or both linkage phases. The RAPD markers are genetically linked to dm, the downy mildew resistance gene of cucumber.

Marker(s)    Linkage phase    Fraction a    Fraction b    Selection efficiency


Literature cited

Baudracco-Arnas, S. and M. Pitrat. 1996. A genetic map of melon (Cucumis melo L.) with RFLP, RAPD, isozyme, disease resistance and morphological markers. Theor. Appl. Genet. 93:57-64.

Beckman, J.S. and M. Soller. 1983. Restriction fragment length polymorphisms in genetic improvement: methodologies, mapping and costs. Theor. Appl. Genet. 67:35-43.

Darvasi, A. and M. Soller. 1994. Optimum spacing of genetic markers for determining linkage between marker loci and quantitative trait loci. Theor. Appl. Genet. 89:351-357.

Dijkhuizen, A. 1994. Application of restriction fragment length polymorphism for the assessment of genetic variability and study of quantitatively inherited traits in cucumber (Cucumis sativus L.). PhD diss., Univ. Wisconsin, Madison.

Edwards, M.D., C.W. Stuber, and J.F. Wendel. 1987. Molecular facilitated investigations of quantitative trait loci in maize. I. Number, genomic distribution and types of gene action. Genetics 116:113-125.

Edwards, M.D. and N.J. Page. 1994. Evaluation of marker-assisted selection through computer simulation. Theor. Appl. Genet. 88:376-382.

Gimelfarb, A. and R. Lande. 1994. Simulation of marker assisted selection for non-additive traits. Genetical Research 64:127-136.

Horejsi, T. 1997. Random amplified polymorphic DNA and sequence characterized amplified regions for studies of genetic diversity and downy mildew resistance in cucumber. PhD diss., Univ. Wisconsin, Madison.

Kennard, W.C., K. Poetter, A. Dijkhuizen, V. Meglic, J.E. Staub, and M.J. Havey. 1994. Linkages among RFLP, RAPD, isozyme, disease-resistance, and morphological markers in narrow and wide crosses of cucumber. Theor. Appl. Genet. 89:92-98.

Kennard, W.C. and M.J. Havey. 1995. Quantitative trait analysis of fruit quality in cucumber; QTL detection, confirmation, and comparison with mating-design variation. Theor. Appl. Genet. 91:53-61.

Knerr, L.D. and J.E. Staub. 1992. Inheritance and linkage relationships of isozyme loci in cucumber (Cucumis sativus L.). Theor. Appl. Genet. 84:217-224.

Lande, R. and R. Thompson. 1990. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics 124:743-756.

Lark, K.G., K. Chase, F. Adler, L.M. Mansur, and J.H. Orf. 1995. Interactions between quantitative trait loci in soybean in which trait variation at one locus is conditional upon a specific allele. Proc. Natl. Acad. Sci. 92:4656-4660.

McCreight, J.D. 1983. Linkage of red stem and male-sterile-1 in muskmelon. Cucurbit Genet. Coop. Rpt. 6:48.

Meglic, V. and J.E. Staub. 1996. Inheritance and linkage relationships of allozyme and morphological loci in cucumber (Cucumis sativus L.). Theor. Appl. Genet. 92:865-872.

Paran, I., R. Kesseli, and R. Michelmore. 1991. Identification of RFLP and RAPD markers linked to downy mildew resistance genes in lettuce using near-isogenic lines. Genome 34:1021-1027.

Pitrat, M. 1984. Linkage studies in muskmelon. Cucurbit Genet. Coop. Rpt. 7:51-53.

Pitrat, M. 1991. Linkage groups in Cucumis melo L. J. Heredity 82:406-411.

Pitrat, M. 1994. Linkage groups in Cucumis melo L. Cucurbit Genet. Coop. Rpt. 17:148-149.

Serquen, F.C., J. Bacher, and J.E. Staub. 1997. Mapping and QTL analysis of horticultural traits in a narrow cross in cucumber (Cucumis sativus L.) using random-amplified polymorphic DNA markers. Mol. Breed. 3:257-268.

Staub, J.E. 1994. Crossover, concepts and applications in genetics, evolution, and breeding: An interactive computer-based laboratory manual. Univ. Wisconsin Press, Madison.

Staub, J.E. and V. Meglic. 1993. Molecular genetic markers and their legal relevance for cultivar discrimination: A case study in cucumber. HortTechnology 3:291-300.

Timmerman, G.M., T.J. Frew, N.F. Weeden, A.L. Miller, and D.S. Goulden. 1994. Linkage analysis of er-1, a recessive Pisum sativum gene for resistance to powdery mildew fungus (Erysiphe pisi D.C.). Theor. Appl. Genet. 88:1050-1055.
