Part I.
Current approaches:
Gene flow on ecological time scales
(Summary of Workshop Discussions)


The study of gene flow (i.e., movement of genes among populations) has been a vital topic in evolutionary biology. Most theoretical models of gene flow stem from concepts developed by Sewall Wright that are based either on continuous populations, using an isolation by distance approach (Wright 1943, 1946), or on populations as islands that become differentiated through mutation and genetic drift. The island model assumes equilibrium conditions, gene flow among all populations, and populations of equal size. Recently, ecologists, conservation biologists, forest managers, and ecosystem managers have become interested in gene flow on an ecological time scale (Sork abstract, Part III of these proceedings). Using biochemical or molecular genetic markers, many of these scientists have borrowed the genetic structure approach to estimate gene flow (Nem). Yet, both the time scale and the spatial scale of these studies violate the assumptions of gene flow models based on F-statistics or other genetic structure approaches.

An alternative way to estimate gene flow is to use parentage analysis, which can identify parents (usually fathers) and then quantify the pattern of gene movement. Meagher (1986) presented an example of paternity analysis in a plant population in order to quantify variance in reproductive success as a function of distance. Subsequent modifications of this basic approach allow this technique to be used for the study of gene movement into populations (Devlin and Ellstrand 1990; Roeder et al. 1989; Smouse and Meagher 1994). The parentage analysis approach provides a direct estimate of gene movement, which is a critical element of gene flow, but it does not yield an estimate of Nem, because it is usually based on one or two reproductive episodes, rather than gene flow over a whole generation.

Many studies of current gene flow, especially those in conservation biology, are aimed at understanding gene movement on a regional or landscape scale. As continuous populations become fragmented, they may assume metapopulation dynamics, through extinction and recolonization events of the different fragments. It is not clear whether recent modeling approaches in metapopulation biology and landscape ecology offer viable insight on gene movement nor whether current measurement of gene movement contributes the migration estimates needed for landscape models.

In this workshop report, we summarize our discussions of gene flow on an ecological time scale. A major emphasis of the workshop was the application of gene flow models to tree populations, although some participants work with other types of organisms. Most applications of gene flow models have been in small populations (natural or managed) or within stands of larger populations. Most work has emphasized pollen dispersal dynamics within stands and the proportion of outside pollen entering stands. However, little work to date has examined gene flow dynamics among stands. The specific objectives were: (1) to review indirect and direct methods of estimating gene flow; (2) to review available statistical models for estimating gene flow; and (3) to evaluate the extent to which landscape approaches and spatially explicit models can be incorporated into gene flow studies. The workshop included: subgroup discussions of gene flow models (Part I of these proceedings); discussion of new approaches (Part II of these proceedings); presentations (abstracts in Part III of these proceedings); and review of available software programs (see Appendix A. Gene flow related Software).

Indirect methods using F-statistics

Historically, the estimation of gene flow has relied on indirect methods, that is, those based on Wright’s parameter of population differentiation, FST (see Box A). In many respects, FST is an ideal parameter that summarizes the evolutionary history of the populations under study, yielding insights about the relative importance of gene flow and genetic drift. Moreover, the relative ease of collecting the requisite data and the facility of analysis make indirect methods an obvious choice for many evolutionary and conservation biology studies. Neigel (1997) summarizes the advantages of the indirect approach, in which FST is used as a parameter to estimate Nem, and also describes recent advances in the analysis of genealogical relationships of genes (coalescent approach) as an alternative method of estimating gene flow (see Neigel abstract, Part III of these proceedings).
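
As a rough illustration of the indirect approach, the sketch below converts an FST value into Nem using Wright's island-model approximation, FST = 1/(4Nem + 1). This relation is not stated in the text above (it may be what Box A presents); it is assumed here as the standard equilibrium result for diploid nuclear markers. The function names are hypothetical and chosen only for this example.

```python
def nem_from_fst(fst: float) -> float:
    """Indirect estimate of Nem from FST.

    Assumes Wright's island model at equilibrium: FST = 1 / (4*Nem + 1).
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


def fst_from_nem(nem: float) -> float:
    """Expected equilibrium FST for a given Nem under the same assumed model."""
    if nem < 0.0:
        raise ValueError("Nem cannot be negative")
    return 1.0 / (4.0 * nem + 1.0)


if __name__ == "__main__":
    for fst in (0.01, 0.05, 0.20):
        print(f"FST = {fst:.2f} -> Nem = {nem_from_fst(fst):.2f}")
```

Because the relation is strongly nonlinear, small differences among low FST values translate into large differences in Nem, and the result describes an equilibrium expectation rather than current gene movement.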

Gene flow has often been modeled differently in subdivided and continuous populations. For subdivided populations, the indirect approach of F-statistics, as described above, is usually employed. In contrast, for continuous populations, it might be more common to estimate neighborhood size, based on Wright’s isolation by distance approach (see Box B). This latter approach is not an indirect method.
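
For continuous populations, a minimal sketch of the neighborhood size calculation is given below. It assumes Wright's two-dimensional formula, Nb = 4 * pi * sigma^2 * d, where sigma^2 is the axial (one-dimensional) variance of parent-offspring dispersal distance and d is the density of breeding individuals. The formula is not written out in the text above (it may appear in Box B), and the function names are hypothetical.

```python
import math


def neighborhood_area(sigma2: float) -> float:
    """Area of Wright's neighborhood, assuming a circle of radius 2*sigma."""
    if sigma2 < 0.0:
        raise ValueError("dispersal variance cannot be negative")
    return 4.0 * math.pi * sigma2


def neighborhood_size(sigma2: float, density: float) -> float:
    """Number of breeding individuals in a neighborhood: Nb = 4*pi*sigma^2*d.

    sigma2  : axial variance of parent-offspring dispersal (e.g., in m^2)
    density : breeding individuals per unit area (e.g., per m^2)
    """
    if density < 0.0:
        raise ValueError("density cannot be negative")
    return neighborhood_area(sigma2) * density


if __name__ == "__main__":
    # Hypothetical values: sigma^2 = 400 m^2, 0.01 adults per m^2.
    print(round(neighborhood_size(400.0, 0.01), 1))
```

The units of the dispersal variance and the density must match (for example, square meters and individuals per square meter).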

Shortcomings of Indirect Methods

The indirect approach of using F-statistics or F-statistic-like methods to estimate gene flow, evolutionary lineages, and population relationships has made valuable contributions to evolutionary biology (Neigel 1997). However, this approach can be misapplied to studies on an ecological time scale (e.g., Steinberg abstract, Part III of these proceedings; Steinberg and Jordan 1997). The result is that the literature in conservation biology includes many studies that report alleged levels of gene flow, based on FST estimates, that reflect long-term history, not ongoing processes. For the purposes of answering gene flow questions on an ecological time scale, FST methods are not advisable, and should be regarded as mere descriptors of historical genetic structure, along with other measures of genetic diversity. The computational robustness of FST is one of its statistical advantages, but its insensitivity to rare alleles results in an estimate that ignores ongoing dynamics that are directly relevant to the interests of ecologists, conservation biologists, and ecosystem managers. We do not discount the utility of genetic structure statistics for conservation or management objectives. In fact, if one wishes to measure them, recent work on optimal sample size can provide some useful guidelines on how to maximize sampling effort (see Fernandez and Petit abstracts, Part III of these proceedings). Nonetheless, we conclude that, for the study of ongoing gene flow, indirect approaches are not appropriate.

Direct methods using parentage analysis

For the study of gene movement on an ecological time scale, parentage analysis in the sense of Roeder et al. (1989), Adams and Birkes (1991), Devlin and Ellstrand (1990), and Smouse and Meagher (1994) is currently the most effective approach (see Adams and Nason abstracts, Part III of these proceedings). This form of gene movement is part of the dynamics of gene flow, but we caution that the results cannot be interpreted as interpopulation gene flow, characterized by Nm or Nem, the effective number of migrants per generation, on an evolutionary time scale. Moreover, estimates of gene movement based on parentage analysis measure immigration into a circumscribed area, which may or may not be a "population". However, one can use parentage analysis to estimate the distribution of dispersal distances, sometimes yielding a dispersion curve analogous to that of Wright’s Isolation by Distance model (see Box B). One can also use parentage analysis to examine pollen- or seed-mediated gene movement. Here we focus on four related models that provide estimates of pollen-mediated gene movement. The general model of parentage analysis uses progeny from known maternal parents to assign paternity to a set of potential pollen donors, while the strength of the other models is in estimating the rate of pollen immigration from outside the experimental population.

Individual paternity. If the objective is to quantify within-population patterns of pollen movement and individual male reproductive success (including selfing), then the methods of Roeder et al. (1989; see also Smouse and Meagher 1994) provide the greatest detail. Basically, this approach assumes that the focal population is isolated from outside pollen sources and that genotypes of all potential males are known. Potential problems with this method are that it can require extensive sampling of progeny per female and, due to constraints on assayable genetic information, often requires the number of potential pollen donors to be relatively small. Moreover, these methods do not adjust estimates (and variances) of male reproductive success for cryptic gene flow. This adjustment is important, because cryptic gene movement biases estimates of male fertility unevenly for males with low and high reproductive success. Nason (in prep.) is working on a modification of this method to make the adjustment (Nason abstract, Part III of these proceedings). However, even with an adjustment for cryptic gene flow, this approach may underestimate fertility differences among males (Adams 1992a; Adams, Birkes, and Ericson 1992). This paternity approach is useful for generating a pollen dispersion curve and for estimating gene movement from outside a circumscribed area, although, as noted below, there are more powerful methods for estimating gene "immigration". (This task can be done using PollenGF by Nason (Appendix A) from this gene flow project's NCEAS website or using software available from Devlin.)
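
The core step shared by these paternity methods, simple genetic exclusion, can be sketched as follows. This is an illustration only and is not the likelihood-based method of Roeder et al. (1989) or the PollenGF program. It assumes codominant diploid markers, a known mother, no genotyping error or mutation, and a hypothetical data layout in which each genotype is a dictionary mapping a locus name to a pair of alleles.

```python
def possible_paternal_alleles(mother_locus, offspring_locus):
    """Alleles the father could have contributed at one locus.

    One offspring allele must come from the mother; the other is paternal.
    """
    a, b = offspring_locus
    paternal = set()
    if a in mother_locus:
        paternal.add(b)
    if b in mother_locus:
        paternal.add(a)
    return paternal


def compatible_fathers(mother, offspring, males):
    """Names of candidate males that are not excluded at any locus."""
    compatible = []
    for name, genotype in males.items():
        # A male is retained only if, at every locus, he carries at least
        # one allele that could be the paternal contribution.
        if all(
            possible_paternal_alleles(mother[locus], offspring[locus])
            & set(genotype[locus])
            for locus in offspring
        ):
            compatible.append(name)
    return compatible


if __name__ == "__main__":
    mother = {"L1": ("A", "B"), "L2": ("C", "C")}
    offspring = {"L1": ("A", "D"), "L2": ("C", "E")}
    males = {
        "m1": {"L1": ("D", "D"), "L2": ("E", "C")},
        "m2": {"L1": ("A", "B"), "L2": ("E", "E")},
        "m3": {"L1": ("D", "A"), "L2": ("C", "C")},
    }
    print(compatible_fathers(mother, offspring, males))
```

An offspring for which no sampled male remains compatible is an apparent immigrant. An offspring with several compatible males cannot be assigned by exclusion alone, which is where likelihood-based fractional assignment becomes necessary.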

Neighborhood model. The neighborhood model of Adams and Birkes (1991) groups fathers by distance and fits a dispersal function to the data, instead of estimating individual male reproductive success. This approach provides estimates of selfing, the probability of within-population dispersal as a function of inter-mate distance, and pollen movement into an experimentally defined population. The neighborhood model is similar to the pollen gene movement model, but it differs by not estimating fertilities of individual males within a circumscribed area or neighborhood. Instead, it estimates parameters relating mating success to factors such as distance, relative pollen fertility, or tree size (e.g., Adams abstract, Part III of these proceedings). The individual paternity model can also be used to estimate the relationship of mating success to these same factors by using individual male fertilities. The individual paternity model makes no assumptions about fertilities, but it estimates them poorly. The neighborhood model requires the choice of reasonable models from which estimates of model parameters can be derived. This approach works best for species whose populations have evenly distributed individuals, but this spatial pattern is not a requirement. The program, as written, is limited to situations where pollen (or egg, for seed dispersal) haplotypes can be determined (possible with embryo-megagametophyte systems in conifers or when DNA markers from male-inherited organelles are used). (See program by Adams and Birkes (Appendix A) and NCEAS website.)

Pollen gene movement model. This method extends the paternity exclusion approach developed by Devlin and Ellstrand (1990) to estimate both the apparent and cryptic components of total immigration. So far, this approach has been applied to patchily distributed populations (e.g., Ellstrand and Marshall 1985; Hamrick and Schnabel 1986), but see the application of this model to northern red oak in a continuous stand (Dyer abstract, Part III of these proceedings). Nason (in prep.) is modifying this model so that it jointly estimates individual fertilities within a circumscribed area and immigration from outside that area. Both this model and the neighborhood model described above can be applied to artificially circumscribed populations within larger continuous populations (e.g., Dyer abstract, Part III of these proceedings) or within isolated population patches (Nason and Hamrick 1997). (See PollenGF by Nason (Appendix A) and on NCEAS website for these proceedings.)
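
The distinction between apparent and cryptic immigration can be illustrated with a simplified correction. The sketch below is not the estimator of Devlin and Ellstrand (1990) as published; it assumes only that a known fraction of immigrant pollen gametes is "cryptic", meaning that they happen to be genetically compatible with at least one local male and so go undetected. Under that assumption, the total immigration rate is the apparent rate divided by the detection probability. All names are hypothetical.

```python
def total_immigration(n_apparent: int, n_offspring: int, p_cryptic: float):
    """Return (apparent, cryptic, total) pollen immigration rates.

    n_apparent  : offspring incompatible with every local male
    n_offspring : offspring genotyped
    p_cryptic   : assumed probability that an immigrant gamete matches
                  at least one local male by chance (0 <= p_cryptic < 1)
    """
    if n_offspring <= 0 or not 0 <= n_apparent <= n_offspring:
        raise ValueError("counts must satisfy 0 <= n_apparent <= n_offspring")
    if not 0.0 <= p_cryptic < 1.0:
        raise ValueError("p_cryptic must lie in [0, 1)")
    apparent = n_apparent / n_offspring
    # Only a fraction (1 - p_cryptic) of immigrants is detectable.
    total = min(1.0, apparent / (1.0 - p_cryptic))
    return apparent, total - apparent, total


if __name__ == "__main__":
    apparent, cryptic, total = total_immigration(30, 100, 0.25)
    print(f"apparent={apparent:.2f} cryptic={cryptic:.2f} total={total:.2f}")
```

The correction grows rapidly as the detection probability falls, which is one reason high exclusion power matters when immigration is the quantity of interest.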

Multiple population gene movement model. Another modification of the parentage approach was developed by Kaufman, Smouse, and Alvarez-Buylla (Kaufman et al. 1998; see also Smouse et al. abstract, Part III of these proceedings). Unlike the neighborhood and pollen gene movement models described above, in which pollen migration into the study population is assumed to have a single source, this model allows for multiple source populations. The current version is restricted to plant populations where all source populations can be identified and sampled.

Any of the four models could be modified to include seed-mediated gene movement, although such estimates can be more difficult to obtain. Estimating seed movement with molecular markers is hindered by the low rate of mutation in cpDNA, which produces very little intrapopulation variation. Yet, cytoplasmic markers should not be dismissed, because they can provide valuable information about pollen- and seed-mediated gene movement (see Petit abstract, Part III of these proceedings). Indeed, it has been found that some species (soybean, rice, and some wild species) contain hypervariable SSR sequences that are very promising for seed flow studies (e.g., McCauley 1994; McCauley 1995b). For conservation-motivated research, the extension of these models to seed-mediated movement may be essential for the estimation of colonization probabilities based on genetic markers, and that task lies ahead of us.

The choice of any of the four methods above is determined by the question. If we want variance in male fertilities within an area, as well as gene movement, then we need to use either the individual paternity or the pollen gene movement models. Both will require high exclusion probabilities and a large number of progeny per mother. Alternatively, if we are interested in gene movement into an area, then lower exclusion probabilities and sample sizes may be adequate. The last three approaches can accomplish this estimation, although the neighborhood model can only be used for gymnosperms. By reducing the exclusion probability and sample sizes per mother, one could sample more sites. (See Box C for optimal sampling strategy and Table 1 for optimal sample sizes to estimate gene flow events.)
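
The role of exclusion probability in this choice can be made concrete with a short calculation. The sketch below assumes independent (unlinked) loci, so that the multilocus exclusion probability is one minus the product of the per-locus non-exclusion probabilities. It further assumes, as an approximation, that exclusion is independent among candidate males, so that the chance that every one of n non-fathers is excluded is the multilocus exclusion probability raised to the power n. The numerical values and function names are hypothetical.

```python
from math import prod


def combined_exclusion(per_locus):
    """Multilocus exclusion probability, assuming independent loci."""
    if any(not 0.0 <= p <= 1.0 for p in per_locus):
        raise ValueError("per-locus exclusion probabilities must be in [0, 1]")
    return 1.0 - prod(1.0 - p for p in per_locus)


def prob_all_excluded(exclusion: float, n_males: int) -> float:
    """Approximate chance that all n_males unrelated candidates are excluded.

    Treats exclusion of each candidate as an independent event.
    """
    if n_males < 0:
        raise ValueError("n_males cannot be negative")
    return exclusion ** n_males


if __name__ == "__main__":
    per_locus = [0.4, 0.5, 0.3, 0.6]  # hypothetical per-locus values
    pe = combined_exclusion(per_locus)
    for n in (10, 50, 200):
        print(n, round(prob_all_excluded(pe, n), 3))
```

Even a multilocus exclusion probability that looks high on its own terms loses power quickly as the number of candidate males grows, which is the practical constraint behind the trade-off between marker resolution and the number of sites sampled.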

The use of parentage models to evaluate pollen-mediated gene flow is often quite effective at demonstrating the consequences of pollination. However, this approach can be complemented effectively with directly measured ecological data such as pollinator behavior or seedling establishment. In some cases, pollinator behavior may be easier to study and equally informative about the nature of pollen-mediated gene flow (Campbell abstract, Part III of these proceedings).

In conclusion, we recommend the use of genealogically-based direct estimates for small-scale measurement of local gene movement. While this approach has limitations (see next section), numerous studies have already used it to study gene flow in fragmented stands (Ellstrand 1992; Ellstrand and Marshall 1985; Hamrick et al. 1995; Nason and Hamrick 1997) and, to a lesser extent, continuous populations (Adams and Birkes 1991; Friedman and Adams 1985; Dyer and Sork, in prep.). The choice of any of the four methods above should be determined by the question. If one wants variance in male fertilities as well as gene movement within an area, then one needs to use the individual paternity or pollen gene movement approach. But in both cases, one will need high exclusion probabilities and a large number of progeny per mother. If one is more interested in gene movement into an area, then the last three models will all be appropriate. In this case, lower exclusion probabilities and sample sizes may be adequate. These changes in sampling strategy would permit sampling of more sites.

Shortcomings of Direct Methods

The study of fine scale gene flow and relative male fertility is best accomplished by the use of parentage type analyses. Genetic markers currently have enough resolution and power to model fine-scale gene movement with some precision. However, a major weakness in parentage analyses is that they tell us relatively little about the nature of unassigned paternity (i.e. the source of pollen outside a circumscribed area). This unassigned paternity could come from 10 m outside the area or 1000 m. If the study of gene flow is to expand to involve longer distance movement of genes between populations or to address patterns of gene flow across increasingly larger spatial scales, it is essential to identify the particular limitations inherent in parentage analysis experiments and to suggest modifications that will allow a successful scaling up of the questions.

First, the emphasis of paternity analyses is steadily shifting away from a strict assignment of paternity and toward answering questions concerning the factors that might be contributing to the levels of apparent gene flow. For many plant populations, rates of gene flow are much higher than had been predicted, and confusion immediately arises when attempting to determine the patterns of long-distance gene flow. For example, should the scale of the paternity analysis simply be extended to include more putative fathers? If so, then increasing the scale of the paternity analysis will bring about a concomitant increase in the labor involved and a loss of genetic resolution. Moreover, the effort in identifying, sampling, genotyping, and mapping the positions of all putative fathers in the study plot may be prohibitive for most research projects.

Second, a distinction must be made between the study of gene flow, via paternity analysis, in fragmented populations and in continuous populations. Logistically, fragmented populations are easier to handle, because of the smaller number of potential fathers in the immediate vicinity. Even if gene flow occurs over great geographical distances, a fragmented landscape will include fewer potential fathers than a continuous landscape. However, given the many differences among species in pollination syndrome, fragmentation structure, background environmental matrix, and a multitude of potentially confounding environmental variables, it is natural to ask whether studies confined to fragmented habitats are applicable to species with continuous distributions.

Finally, parentage analysis of gene movement is restricted in both temporal and spatial scales. In most cases, paternity analyses are conducted on a limited number of maternal trees, for one or two years, and at a single geographic site. Estimates of gene flow based on these studies have little replication with which to evaluate their variance. Year-to-year variation in pollen production or reception and specific geographic or maternal idiosyncrasies preclude the formation of widely general patterns from a single paternity analysis (see Hamrick abstract, Part III of these proceedings). Thus, eventually it may be necessary to shift away from paternity analyses for questions that involve larger spatial and temporal scales.

Gene flow and adaptation

Workshop discussions focused largely on gene flow alone, with little regard to the importance of locally adapted genotypes. However, it is clear that gene flow among some populations could result in reductions in progeny fitness (Savolainen abstract, Part III of these proceedings). Genetic surveys that are designed to estimate gene flow could also be used to examine the consequences of gene flow for conservation and management purposes (for discussion of optimal sampling for surveys, see Petit abstract, Part III of these proceedings). Indeed, such surveys are meant to identify diploid immigrants (seed flow), haploid immigrants (pollen flow), within-population outcrossed progenies, and selfed progenies. An evaluation of the relative fitness of these different classes of progeny would increase our understanding of the consequences for the viability and adaptability of recipient populations. Numerous studies have demonstrated reductions in the relative fitness of selfed versus outcrossed progeny, particularly in predominantly outcrossing species. Habitat modification associated with human activities has, in some cases, been correlated with increased rates of selfing, though effects on progeny fitness have not been examined in this context.

Gene flow is considered an important force for the maintenance of genetic diversity. In addition, high amounts of gene flow will reduce inbreeding. However, gene flow also has the potential to introduce poorly adapted genes (outbreeding depression) that can reduce viability of the population. While it is not clear how likely it is that increased gene flow will result in outbreeding depression, the possibility illustrates the connection between gene flow and local adaptation. Populations that now occupy altered landscapes are likely to experience different patterns of future gene flow than those experienced over a longer period in the past (Savolainen abstract, Part III of these proceedings). If ecological conditions are changing (e.g., global change), gene flow could introduce genes adapted to the new conditions (e.g., for Scots pine in Finland, genes from the southern part of the country may be well suited to climatic warming in the north).

Finally, if the regional population system functions as a metapopulation, with frequent local extinction and recolonization, the system as a whole will only persist if colonization of new patches by seeds occurs with sufficient probability. We conclude that an awareness of the fitness consequences of gene flow should be a prominent feature of future gene flow studies.

Metapopulation and landscape approaches to gene flow

The infinite island model of gene flow, metapopulation models, and landscape ecology appear to be quite compatible and complementary (see Fig. 1). All three perspectives are interested in movement between populations. However, the assumptions of infinite island models that estimate Nem are quite different from those of the classical metapopulation model (Levins 1970), which is based on extinction and colonization dynamics. Although we are now seeing some integration of population genetic and metapopulation models (e.g., Hedrick and Gilpin 1997; Giles and Goudet 1997), landscape modeling approaches have been applied to genetic questions in only a few cases. For example, Antonovics et al. (1977) discuss a spatially explicit version of the metapopulation approach. In some cases, incorporation of metapopulation models can provide new insight about the frequency of specific traits such as self-incompatibility alleles in plant populations (e.g., Gilpin abstract, Part III of these proceedings). The advantage of the metapopulation and landscape approaches is that they can operate on the landscape scale (see review in McCauley 1995a). Unfortunately, the gap between genetic migration studies and metapopulation migration studies is large (Antonovics 1997). Yet, a synthesis of genetic and demographic approaches should be mutually beneficial, because both population genetics and population ecology require estimates of migration (see Hanski and Simberloff 1997; Hanski and Gilpin 1991). Here, we focus on existing models that might be relevant to genetic studies.
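
To make the contrast with the island model concrete, the classical metapopulation model can be sketched in a few lines. The version below assumes the standard form attributed to Levins, dp/dt = c*p*(1 - p) - e*p, where p is the fraction of occupied patches, c is the colonization rate, and e is the extinction rate. The equation is not written out in the text above, and the parameter values and function names are hypothetical.

```python
def levins_equilibrium(c: float, e: float) -> float:
    """Equilibrium patch occupancy p* = 1 - e/c, or 0 if extinction wins."""
    if c <= 0.0 or e < 0.0:
        raise ValueError("require c > 0 and e >= 0")
    return max(0.0, 1.0 - e / c)


def simulate_levins(p0: float, c: float, e: float,
                    dt: float = 0.01, steps: int = 10_000) -> float:
    """Integrate dp/dt = c*p*(1-p) - e*p with a simple Euler scheme."""
    p = p0
    for _ in range(steps):
        p += dt * (c * p * (1.0 - p) - e * p)
    return p


if __name__ == "__main__":
    c, e = 0.5, 0.1  # hypothetical colonization and extinction rates
    print(levins_equilibrium(c, e), round(simulate_levins(0.1, c, e), 4))
```

The model tracks only patch occupancy, so the "migration" it requires is a colonization rate, not the Nem of the island model. This is one way to see why the two kinds of migration estimate are not interchangeable.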

Few models are available that explicitly analyze gene flow within metapopulation or landscape perspectives, and there are virtually no general spatial models for gene flow. However, there are different types of spatially explicit models that have potential applicability to gene flow studies (Davis abstract, Part III of these proceedings). One example of such a spatially explicit model is Steinberg and Jordan’s (1997) individual-based modeling approach (see Steinberg abstract, Part III of these proceedings). Their approach to connecting demography and genetics (‘virtual pocket gophers’) could easily be adapted to include spatial or temporal heterogeneity. Alternatively, object-oriented models would be amenable to layering landscape, demographic, and genetic processes (Davis abstract, Part III of these proceedings). The first category consists of biological transport models, individual-based / cellular automata models (e.g., EcoBeaker, by E. Meir), and metapopulation models (e.g., RAMAS-GIS; ALEX, Lindenmayer et al. 1995). A second category consists of physical transport models (e.g., FETCHR). The utility of any of these models for describing gene flow processes has not received much attention (but see Antonovics 1997; Gilpin 1991; McCauley 1995a).

An unresolved question is whether spatially explicit modeling offers any benefits to population geneticists. We suggest that this approach could have useful applications in some situations. For example, understanding pollen flow patterns via wind transport vectors (e.g., wind channels) would provide a means of testing hypotheses about the influence of landscape changes. The use of spatially explicit mapping offers a means of mapping different selection regimes (e.g., soil types, elevation). Finally, the measurement of gene flow within a landscape mosaic allows one to measure ‘ecological distance’ between populations, as well as direct physical distance, with the two perhaps having divergent implications for gene flow. In this case, spatially explicit genetic data, combined with environmental data available for the same landscape, would allow one to test several hypotheses about the impact of ‘ecological distance’ on gene flow or the influence of environmental variables on gene flow.
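
One possible way to express "ecological distance" is as a least-cost path across a resistance surface. The sketch below is an illustration under assumptions, not a method proposed at the workshop: the landscape is a small raster whose cell values are hypothetical resistances to movement, movement is allowed between the four adjacent cells, the cost of a step is the resistance of the cell entered, and the path is found with Dijkstra's algorithm.

```python
import heapq


def least_cost_distance(resistance, start, goal):
    """Least accumulated resistance from start to goal on a grid.

    resistance : list of rows of positive numbers (cost of entering a cell)
    start, goal: (row, col) tuples
    """
    rows, cols = len(resistance), len(resistance[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + resistance[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return float("inf")


if __name__ == "__main__":
    # Hypothetical landscape: a high-resistance cell between two stands.
    landscape = [
        [1, 1, 1],
        [1, 9, 1],
        [1, 1, 1],
    ]
    print(least_cost_distance(landscape, (1, 0), (1, 2)))
```

In this toy landscape the two stands are two cells apart in a straight line, but the cheapest route goes around the barrier. Comparing such distances with physical distances as predictors of measured gene movement is one form the hypothesis tests described above could take.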

From a landscape modeling perspective, migration is important when considering the contribution of genetics to conservation and management. Integration of genetic and demographic data, or interpretation of either genetic or demographic processes with respect to the other, requires the ability to translate the movement of genes (gene flow) into the migration of individuals (or of pollen/seeds) and vice versa. To make this translation (e.g., via simulations), it would be useful to have information on distributions of dispersal or gene flow distances, rather than average estimates (e.g., Nem). So far, the types of migration parameters that are needed to connect genetic and demographic models are not being measured.
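
As a minimal sketch of what a distribution of dispersal distances might look like as data, the code below takes mapped mother and assigned-father coordinates (for example, from a paternity analysis) and returns the proportion of matings in each distance class. The coordinates, the bin width, and the function names are hypothetical, and unassigned (immigrant) paternity is simply absent from such a table, so the curve describes within-plot movement only.

```python
import math
from collections import Counter


def dispersal_distances(pairs):
    """Euclidean distance between each mother and her assigned pollen donor.

    pairs : iterable of ((x_mother, y_mother), (x_father, y_father))
    """
    return [math.dist(mother, father) for mother, father in pairs]


def distance_histogram(distances, bin_width):
    """Proportion of matings per distance class, keyed by (lower, upper)."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter(int(d // bin_width) for d in distances)
    n = len(distances)
    return {
        (k * bin_width, (k + 1) * bin_width): counts[k] / n
        for k in sorted(counts)
    }


if __name__ == "__main__":
    # Hypothetical mapped positions in meters.
    pairs = [((0, 0), (3, 4)), ((0, 0), (0, 12)), ((0, 0), (6, 8))]
    distances = dispersal_distances(pairs)
    for interval, proportion in distance_histogram(distances, 10).items():
        print(interval, round(proportion, 2))
```

A table of this kind, rather than a single average, is the sort of input a spatially explicit demographic simulation could use directly.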

From the perspective of metapopulation or landscape models of plant populations, seed dispersal data may be as important as pollen dispersal data. While seed and pollen movement can be quite different and influence genetic structure differentially, for population demographic processes (i.e. colonization), seed dispersal, or dispersal of vegetative propagules for many species, is the key. Use of maternally-inherited markers (e.g. Demesure et al. 1996; Dumolin et al. 1995; McCauley 1994; McCauley et al. 1995), and paternally inherited markers, in conjunction with nuclear markers, would allow examination of both seed and pollen dispersal.

A key challenge for many ecological, conservation, and management studies is the adoption of a proper landscape scale. It would be useful to have genetic models that integrate both spatial variability (i.e., heterogeneous landscapes) and temporal variability (i.e., metapopulation dynamics), both to examine how these types of variation influence the genetic structure of populations and to consider how they influence our interpretations of genetic structure. The application of landscape models necessitates larger scales of study. Obviously, this will often be logistically difficult. Large-scale studies will be most tractable in small isolated populations, such as Kaufman et al.’s Cecropia study (Kaufman et al. 1998; see also Smouse et al. abstract, Part III of these proceedings), for tropical trees in fragments (e.g., Nason 1997; Stacy et al. 1996), or for populations following a river course (linear population arrays). The scaling up of genetic studies might require careful selection of study systems in order to measure parameters that can then be modeled. Another approach to asking landscape-scale questions (e.g., regarding long-distance gene flow) would be to focus on the edges of species ranges, where populations are smaller and more fragmented, permitting examination of associations between distance/size of fragments and gene flow patterns, but this approach may give a biased picture relative to more centrally located populations.

For many threatened species, extinction-recolonization dynamics have only recently been imposed through habitat loss and fragmentation. Thus, we want to emphasize that most currently fragmented populations of interest to the conservation biologist were probably not fragmented over extended evolutionary time. Landscape alteration has created metapopulations out of formerly continuous populations. Temporal scale is thus an important consideration in genetic applications of metapopulation models. Because (a) we do not know what kind of metapopulation has been created (e.g., disequilibrated, patchy, or classical), and (b) we do not know where the metapopulation is headed, methods must be sensitive to recent shifts in gene flow patterns. We conclude that standard indirect methods may not be sufficiently sensitive to estimate recent changes in gene flow.

Literature Cited
