Author response: Phylogenetic divergence of cell biological features

Michael E Lynch

doi:10.7554/elife.34820.011

Abstract

Most cellular features have a range of states, but understanding the mechanisms responsible for interspecific divergence is a challenge for evolutionary cell biology. Models are developed for the distribution of mean phenotypes likely to evolve under the joint forces of mutation and genetic drift in the face of constant selection pressures. Mean phenotypes will deviate from optimal states to a degree depending on the effective population size, potentially leading to substantial divergence in the absence of diversifying selection. The steady-state distribution for the mean can even be bimodal, with one domain being largely driven by selection and the other by mutation pressure, leading to the illusion of phenotypic shifts being induced by movement among alternative adaptive domains. These results raise questions as to whether lineage-specific selective pressures are necessary to account for interspecific divergence, providing a possible platform for the establishment of null models for the evolution of cell-biological traits.

https://doi.org/10.7554/eLife.34820.001

eLife digest

When most people think about evolution, they commonly think of natural selection: the evolutionary force that helps populations to develop toward an optimum state for their environment. The observable traits and features of a cell or organism are known as its phenotype. Under natural selection, genes that produce phenotypes that help a cell or organism to thrive and reproduce are more likely to be passed on to future generations. This means that over several generations the population becomes, on average, better adapted to its environment.

Other 'non-adaptive' evolutionary forces also influence phenotype. For example, damage to DNA can introduce mutations into the genes that a cell or organism passes on to their offspring.
Some mutations are more likely to produce working variants of a gene than others; this is known as a mutation bias. In addition, even in the absence of natural selection, the proportion of particular gene variants in a population changes over the generations because genes are randomly transmitted and not all individuals reproduce. This is known as genetic drift. Together, mutation bias and genetic drift could prevent a population's average phenotype from reaching an optimal state.

Lynch has now developed mathematical models that describe how certain biological features of cells, such as the structure of the proteins they produce, are likely to evolve due to mutation bias and genetic drift. These models show that these evolutionary processes can cause the features of the cells in a population to diversify, which often leads to a suboptimal average phenotype. Lynch calculated that two alternative phenotypes could even emerge in isolated populations in cases where there is only one optimum phenotype. For example, a mutation bias could drive some cells in one population to evolve one phenotype, while natural selection drives another population towards the other phenotype.

Overall, the model emphasizes that natural selection is not the only force that drives diversity in cells. Future research into cell biology needs to take a broad view of the joint roles played by natural selection, mutation bias and genetic drift.

https://doi.org/10.7554/eLife.34820.002

Introduction

As with nearly all biological traits, most cellular features vary among individuals within populations in a nearly continuous fashion, owing to genetic differences among individuals and the myriad of stochastic factors experienced by all organisms (ranging from intrinsic cellular noise to external environmental forces; Lynch and Walsh, 1998). This is true, for example, for catalytic rates, rates of gene expression and intracellular transport, numbers and sizes of organelles, etc.
Ultimately, some fraction of within-species genetic variation is transformed into among-species divergence as alternative alleles arise by mutation and in some cases proceed to fixation (Wright, 1969; Walsh and Lynch, 2018). The magnitude of such divergence is dictated by three major evolutionary factors: the pattern of selection (the phenotypic fitness function), which imposes a directional and/or stabilizing force on the mean phenotype; the rate of origin and distribution of mutational effects, which define the raw materials upon which natural selection operates; and the power of random genetic drift, which imposes noise on the selective process.

Although considerable effort has been devoted to understanding the divergence of mean phenotypes among lineages (Walsh and Lynch, 2018), most of this work is focused on the evolution of morphological phenotypes in response to external pressures, which can vary greatly depending on the ecological setting. In contrast, owing to homeostatic effects, the internal environment of cells remains largely constant over long time scales and broad geographic locations, raising the possibility of establishing general evolutionary principles that transcend the imposition of transient ecological changes. (The same might be true for the internal organs of multicellular species.)

The goal here is to derive general expressions for the divergence of mean phenotypes among species under scenarios that are likely to hold for a wide variety of cellular traits. The specific focus will be on the magnitude of divergence expected among lineages in the face of identical evolutionary forces, as this helps clarify the degree to which phenotypic diversification can proceed in the absence of lineage-specific selection pressures. Such a perspective is essential to establishing the degree to which adaptive explanations need to be sought to explain patterns of variation among populations.
The general approach will draw from well-established constructs employed in the field of quantitative genetics (the study of continuously distributed traits with a multifactorial genetic basis; Lynch and Walsh, 1998; Walsh and Lynch, 2018). The traditional focus of this field has been on complex traits in multicellular species, but these same methods can be profitably applied to intracellular morphological and molecular features, such as those involved in the cytoskeleton, gene expression, binding energy, and metabolic rates (Nourmohammad et al., 2013; Farhadifar et al., 2015; Phillips and Bowerman, 2015). Indeed, although most work in phenotypic evolution proceeds as though cellular details are irrelevant, the models employed may be equally if not more relevant to cell-biological traits, owing to their potentially less temporally variable fitness effects.

Theory

The distribution of mean phenotypes

All genetically encoded traits are subject to the recurrent forces of mutation and random genetic drift, and potentially to selection. Selection favors some genotypes over others, while mutation modifies existing genotypes independent of the selective process, and random genetic drift causes stochastic variation in gene transmission across generations. Owing to this latter factor, even if the forces of selection and mutation remain constant, the population mean phenotype of a trait will wander within a certain range over evolutionary time, with the frequency of occurrence of alternative mean phenotypes depending on patterns and strengths of selective and mutational effects (Figure 1).

Figure 1. An idealized overview of the model for the evolution of the distribution of mean phenotypes, given here for a trait under stabilizing selection. The upper panel denotes a hypothetical phenotype distribution at a single point in time.
The population consists of multiple genotypes, each having an expected genotypic value (red) but a range of phenotypes (black distributions) resulting from variance in residual deviations (environmental effects and nonadditive genetic factors). The phenotype distribution for the entire population (red) is the sum of these genotype-specific curves, and has a mean denoted by the blue line. The exact location of this overall distribution can wander over time, owing to the joint forces of selection, mutation, and random genetic drift. The lower panel gives the overall distribution of population means over a long evolutionary time span, with 11 locations at specific points of time being denoted by the short vertical lines. Persistent mutational bias towards smaller phenotypes prevents the overall distribution of means from coinciding with the fitness-function optimum, and random genetic drift causes a dispersion of means around the overall average value.

https://doi.org/10.7554/eLife.34820.003

The focus of this study, the stationary distribution of mean phenotypes, can be viewed as a summary distribution of: (1) phenotypic means across a large number of replicate populations exposed to identical conditions for a very long period; or (2) a historical survey of mean phenotypes in a single population over a long time period, again under constant environmental and population-genetic conditions. Among many other applications, such an approach has long been exploited in attempts to understand the steady-state distribution of allele frequencies expected under a constant regime of selection, mutation, and random genetic drift (e.g. Wright, 1969). From an empirical perspective, this steady-state view of evolution implicitly assumes that enough time has elapsed between observed taxa that the dynamics of the evolutionary process are of negligible significance (which would not be the case for closely related species).
The approach taken here relies on the Kolmogorov forward equation for a diffusion process (Appendix 1; Walsh and Lynch, 2018), the assumption being that the trait of interest is continuously distributed, with $z$ denoting the phenotypic value of an individual. The population mean, $\bar{z}$, moves in arbitrarily small increments each generation via the deterministic forces of selection and mutation and the stochastic process of drift. Under most reasonable biological conditions, independent of the starting conditions, a stationary distribution of mean phenotypes (among hypothetical replicate populations) is eventually converged upon, at which point there is an exact balance between opposing forces. The probability that a population's mean phenotype will reside at any particular point is defined by this distribution, which has the general form

$$\Phi(\bar{z}) = C \cdot \exp\left(2\int^{\bar{z}} \frac{M(x)}{V(x)}\,dx\right), \tag{1a}$$

where $M(x)$ defines the rate of directional change (resulting from selection and/or mutation) for a population with mean phenotype $x$, and $V(x)$ is the variance in change (resulting from drift). $C$ is the normalization constant (containing only terms that are independent of $\bar{z}$) that ensures that the entire probability density sums to 1.0.

For a quantitative trait, the directional term can be subdivided into independent selection and mutation components, $M_s(x)$ and $M_m(x)$, both of which will be discussed in detail below. Under the assumption of negligible genotype × environment interaction and epistasis, the variance of the change in means, which results from the sampling of heritable genotypic values of individuals, is equal to the underlying additive genetic variance for the trait, $\sigma_A^2$, divided by the effective population size, $N_e$, in the case of haploidy (assumed here; and $2N_e$ in the case of diploidy).
The latter is typically far below the number of reproductive individuals in the population, and is defined by various demographic features and by interference imposed by chromosomal linkage, with values ranging from $\sim 10^5$ for multicellular eukaryotes to $\sim 10^9$ for bacteria (Charlesworth, 2009; Lynch et al., 2016; Walsh and Lynch, 2018). Individual phenotypes are the sum of a heritable additive genetic component ($A$) and a nonheritable residual deviation ($e$, which includes environmental and nonadditive genetic effects), such that $z = A + e$, with the within-population phenotypic variance being partitioned as $\sigma_z^2 = \sigma_A^2 + \sigma_e^2$. For cellular features, a large fraction of $\sigma_e^2$ may be a consequence of stochastic gene expression, imprecise placement of cell-division septa, etc.

Assuming that both $\sigma_A^2$ and $N_e$ remain constant, which is the model adhered to here, Equation (1a) can be rewritten as

$$\Phi(\bar{z}) = C \cdot \exp\left(\frac{2N_e}{\sigma_A^2}\int^{\bar{z}} \left[M_s(x) + M_m(x)\right]dx\right), \tag{1b}$$

showing that the stationary distribution of mean phenotypes (conditional on a particular level of genetic variance, a point that will be returned to below) is proportional to the product of the distributions expected under selection alone and under mutation alone. With extremely weak selection, $M_s(x)$ would be essentially a flat function, with the overall distribution reflecting the biases due to mutation alone. Conversely, with a flat mutation function, an unlikely scenario, the distribution will follow that expected under selection alone.

The process of selection

The influence of selection on the mean phenotype (the response to selection) is embodied in the breeder's equation,

$$M_s(\bar{z}) = \bar{z}(t+1) - \bar{z}(t) = h^2\left[\bar{z}_s(t) - \bar{z}(t)\right], \tag{2}$$

a general statement about the connection between directional selection within generations and the transmission of such change across generations (Walsh and Lynch, 2018). Here, $\bar{z}(t)$ and $\bar{z}_s(t)$ denote the mean phenotypes before and after selection in generation $t$, the difference being the selection differential.
The heritability of the trait, $h^2 = \sigma_A^2/\sigma_z^2$, which equals the proportion of the total phenotypic variance, $\sigma_z^2$, associated with additive genetic variation, $\sigma_A^2$, constitutes the fraction of the within-generation change in the mean transmitted to the next generation.

Critical to everything that follows, the selection differential can be described in terms of the within-population phenotype distribution, $p(z,t)$, and the function relating individual fitness to phenotype, $W(z)$. The mean fitness in generation $t$ is

$$\bar{W} = \int p(z,t) \cdot W(z) \cdot dz. \tag{3}$$

The mean phenotype after selection (but before inheritance) is then obtained by weighting the pre-selection phenotypes by their relative fitnesses,

$$\bar{z}_s(t) = \frac{1}{\bar{W}}\int z \cdot p(z,t) \cdot W(z) \cdot dz. \tag{4}$$

We will make use of the fact that most quantitative traits have an approximately normal phenotype distribution on some scale of measurement, which follows from the central limit theorem (Lynch and Walsh, 1998). The distribution of individual measures is therefore described completely by the phenotypic mean and variance,

$$p(z,t) = \frac{1}{\sqrt{2\pi\sigma_z^2}} \cdot \exp\left(-\frac{\left[z - \bar{z}(t)\right]^2}{2\sigma_z^2}\right). \tag{5}$$

Substituting Equation (5) into (3) and differentiating, the change in mean fitness with respect to mean phenotype is

$$\frac{\partial \bar{W}}{\partial \bar{z}(t)} = \int \frac{\partial p(z,t)}{\partial \bar{z}(t)} \cdot W(z) \cdot dz = \frac{1}{\sigma_z^2}\int \left[z - \bar{z}(t)\right] \cdot p(z,t) \cdot W(z) \cdot dz \tag{6}$$

(Lande, 1976). Expanding the bracketed term gives two integrals: from Equation (4), the first is equal to $\bar{z}_s(t) \cdot \bar{W}$, and the second is $\bar{z}(t) \cdot \bar{W}$, so that $\partial \bar{W}/\partial \bar{z}(t) = \bar{W} \cdot [\bar{z}_s(t) - \bar{z}(t)]/\sigma_z^2$. This provides a direct link to Equation (2), which upon rearrangement becomes

$$M_s(\bar{z}) = \sigma_A^2 \cdot \frac{\partial \bar{W}}{\bar{W} \cdot \partial \bar{z}(t)}. \tag{7}$$

This expression states that, provided the phenotype distribution is normal, the change in mean phenotype caused by selection is equal to the product of the genetic variance for the trait and the gradient in the logarithm of mean fitness with respect to mean phenotype.
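
The equivalence between the breeder's equation (2) and the gradient form (7) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the logistic `fitness` function, the parameter values, and all function names are assumptions chosen only for illustration, and the integrals in Equations (3) and (4) are approximated with a simple trapezoid rule.

```python
import math

def normal_pdf(z, mean, var):
    """Equation (5): normal phenotype density with the given mean and variance."""
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def trapezoid(f, lo, hi, steps=4000):
    """Composite trapezoid rule for the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * width)
    return total * width

def fitness(z):
    """Hypothetical fitness function W(z) (an assumption, not from the source)."""
    return 1.0 / (1.0 + math.exp(-z))

def mean_fitness(zbar, var_z):
    """Equation (3), integrated over +/- 10 phenotypic standard deviations."""
    half = 10.0 * math.sqrt(var_z)
    return trapezoid(lambda z: normal_pdf(z, zbar, var_z) * fitness(z),
                     zbar - half, zbar + half)

def mean_after_selection(zbar, var_z):
    """Equation (4): fitness-weighted mean phenotype."""
    half = 10.0 * math.sqrt(var_z)
    numerator = trapezoid(lambda z: z * normal_pdf(z, zbar, var_z) * fitness(z),
                          zbar - half, zbar + half)
    return numerator / mean_fitness(zbar, var_z)

def response_breeder(zbar, var_a, var_z):
    """Equation (2): heritability times the selection differential."""
    return (var_a / var_z) * (mean_after_selection(zbar, var_z) - zbar)

def response_gradient(zbar, var_a, var_z, dz=1e-4):
    """Equation (7): additive variance times d ln(mean fitness) / d zbar,
    with the derivative taken by central finite difference."""
    slope = (math.log(mean_fitness(zbar + dz, var_z))
             - math.log(mean_fitness(zbar - dz, var_z))) / (2.0 * dz)
    return var_a * slope

if __name__ == "__main__":
    print(response_breeder(0.5, 0.3, 1.0), response_gradient(0.5, 0.3, 1.0))
```

Under these assumptions the two functions should agree up to numerical error for any smooth fitness function, because the identity depends only on the normality of the phenotype distribution.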
Evolution by natural selection comes to a standstill when there is no genetic variance for the trait or the phenotypic mean resides at a point where the slope of the function of mean fitness with respect to mean phenotype is zero. To endow this expression with practical utility, specific expressions for the fitness function, $W(z)$, will be considered below.

The process of mutation

Most attempts to consider the long-term evolutionary features of quantitative traits have assumed one of two mutation models: (1) a distribution of mutational effects always having a mean equal to zero and a constant variance, independent of the starting genotype (Kimura, 1965; Lande, 1975; Lynch and Hill, 1986); or (2) a rate of appearance of each type of mutant allele being independent of the ancestral type (Cockerham, 1984; Turelli, 1984). Under the first scenario, mutation has no directional effect on the mean phenotype, and there are no bounds on the possible mutational effects or the physical limits to which the trait can evolve. Under the second scenario, there is a physical limit to phenotypic divergence, and because the directional effect of mutations depends on the current location, more extreme alleles generate mutations with effects biased back toward the center of the distribution.

Neither of these mutational schemes captures the features of a wide variety of cell biological traits, which often have finite numbers of possible states and state-dependent spectra of mutational effects. A few examples will suffice to make this point. Protein-protein interactions (e.g. the interfaces between dimeric molecules) typically depend on no more than a few dozen amino-acid sites. The same is true for intramolecular interactions such as the constellation of backbone residues that assemble during protein folding. In both cases, the underlying residues operate in an approximately binary manner, for example, hydrophobic vs. hydrophilic, or hydrogen-bonding vs. non-hydrogen-bonding.
Likewise, the catalytic sites of enzymes often consist of a small-to-moderate number of residues that either facilitate or inhibit catalytic rates, and the sizes of intracellular organelles and cytoskeletal components are constrained by cell size. Many other examples could be cited, including those involved in RNA-RNA and DNA-protein interactions.

The approximate structure of a mutation function with a bounded range can be arrived at by considering a trait determined by $n$ binary factors (or sites), each with state $b$ having effect 0, and state $B$ having effect $m$. For a trait with an additive genetic basis, the mean phenotype in a haploid population can then be represented as

$$\bar{z} = z_0 + nm\bar{q}, \tag{8}$$

where $z_0$ is an arbitrary baseline value for the trait, and $\bar{q}$ is the mean frequency of $B$-type alleles averaged over all $n$ factors in the population (Lynch and Walsh, 1998). Letting $u$ be the mutation rate from $B$ to $b$ alleles, and $v$ be the reciprocal rate, the per-generation change in the mean phenotype resulting from mutation is

$$M_m(\bar{z}) = nm\left[v(1 - \bar{q}) - u\bar{q}\right]. \tag{9}$$

With $\hat{q} = v/(u+v)$ being the equilibrium frequency of $B$ alleles under mutation pressure alone, and $\theta_m = z_0 + nm\hat{q}$ being the expected mean phenotype under neutrality, Equation (9) further reduces to

$$M_m(\bar{z}) = -(u+v)(\bar{z} - \theta_m). \tag{10}$$

This expression is quite general in that $(\bar{z} - \theta_m)$ is simply the distance of the mean phenotype from that expected under mutation equilibrium, and $(u+v)$ is a measure of the mutational restoring force per locus. The essential feature of Equation (10) is that mutation acts to reduce the distance between the mean phenotype and $\theta_m$ to a degree that depends on the magnitude of this deviation. Charlesworth (2013) implemented a similar mutation model in an investigation of genomic features.
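
The reduction of Equation (9) to Equation (10) can be illustrated with a short numerical sketch. The function names and the parameter values used here (for instance, mutation rates far larger than realistic ones, chosen so that the recursion converges quickly) are assumptions for illustration only.

```python
def mean_phenotype(q_bar, z0, n, m):
    """Equation (8): mean phenotype from the mean frequency of B-type alleles."""
    return z0 + n * m * q_bar

def mutation_change_eq9(q_bar, n, m, u, v):
    """Equation (9): per-generation change in the mean caused by mutation."""
    return n * m * (v * (1.0 - q_bar) - u * q_bar)

def mutation_change_eq10(z_bar, z0, n, m, u, v):
    """Equation (10): the same change written as a restoring force toward theta_m."""
    theta_m = z0 + n * m * v / (u + v)  # expected mean under mutation alone
    return -(u + v) * (z_bar - theta_m)

def iterate_mutation(q_bar, u, v, generations):
    """Apply mutation pressure alone (no selection, no drift) for many generations."""
    for _ in range(generations):
        q_bar += v * (1.0 - q_bar) - u * q_bar
    return q_bar

if __name__ == "__main__":
    # Hypothetical parameter values, not taken from the source.
    n, m, z0, u, v = 20, 0.5, 1.0, 2e-3, 1e-3
    for q in (0.0, 0.25, 0.9):
        z = mean_phenotype(q, z0, n, m)
        print(q, mutation_change_eq9(q, n, m, u, v),
              mutation_change_eq10(z, z0, n, m, u, v))
    print(iterate_mutation(0.9, u, v, 10000), v / (u + v))
```

The two forms should coincide for every allele frequency, and repeated application of mutation alone should drive the allele frequency toward $v/(u+v)$ regardless of the starting point.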
The stationary distribution of mean phenotypes

Application of Equations (7) and (10) to (1b) yields a useful simplification of the stationary distribution that will be adhered to below,

$$\Phi(\bar{z}) = C \cdot \left[\bar{W}(\bar{z})\right]^{2N_e} \cdot \exp\left(-\frac{(\bar{z} - \theta_m)^2}{2\sigma_N^2}\right), \tag{11}$$

with $\sigma_N^2 = \sigma_A^2/[2N_e(u+v)]$. As will be discussed below, under neutrality, the genetic variance $\sigma_A^2$ often scales directly with $N_e$, and population size would have no influence on the distribution in this limiting case, as $\sigma_N^2$ would be independent of $N_e$. More generally, $\sigma_A^2$ is also a function of the intensity of selection, but the bulk of the steady-state distribution will be represented by mean phenotypes that are in the range of effective neutrality with respect to each other, so the scaling relationship of $\sigma_A^2$ under neutrality is expected to be a reasonable first-order approximation.

Equation (11) shows that, provided the genetic variance remains roughly constant, the stationary distribution is equal to the product of the expectation under neutrality (where mutation and drift are the only operable evolutionary forces) and the mean fitness function raised to the power $2N_e$; that is, the stationary distribution is equivalent to a transformation of the neutral expectation by a function of the fitness landscape. Thus, to obtain the overall distribution in the following applications, we require an expression for mean population fitness in terms of the trait mean.
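
As a rough illustration of how Equation (11) can be evaluated, the sketch below computes the stationary density on a grid of mean phenotypes. It assumes a Gaussian-shaped mean-fitness function with optimum `theta_s` and squared width `width2`; that functional form, the parameter values, and all names are assumptions made for this example rather than results quoted from the text. Working with logarithms avoids numerical underflow when the mean fitness is raised to the power $2N_e$.

```python
import math

def log_mean_fitness(z_bar, theta_s, width2):
    """ASSUMED Gaussian-shaped mean fitness (log scale), peaking at theta_s."""
    return -(z_bar - theta_s) ** 2 / (2.0 * width2)

def stationary_density(grid, n_e, theta_s, width2, theta_m, sigma_n2):
    """Equation (11) on an evenly spaced grid, normalized to integrate to 1."""
    log_phi = [2.0 * n_e * log_mean_fitness(z, theta_s, width2)
               - (z - theta_m) ** 2 / (2.0 * sigma_n2) for z in grid]
    peak = max(log_phi)  # subtract the maximum before exponentiating
    weights = [math.exp(value - peak) for value in log_phi]
    step = grid[1] - grid[0]
    norm = sum(weights) * step
    return [w / norm for w in weights]

def moments(grid, density):
    """Mean and variance of a density tabulated on an evenly spaced grid."""
    step = grid[1] - grid[0]
    mean = sum(z * d for z, d in zip(grid, density)) * step
    var = sum((z - mean) ** 2 * d for z, d in zip(grid, density)) * step
    return mean, var

if __name__ == "__main__":
    # Hypothetical values: optimum at 0, mutational mean at -2.
    grid = [-4.0 + 0.001 * i for i in range(6001)]
    density = stationary_density(grid, 1000, 0.0, 50.0, -2.0, 0.05)
    print(moments(grid, density))
```

With a Gaussian mean fitness, both factors in Equation (11) are Gaussian in $\bar{z}$, so the product is also Gaussian, and its mean is the standard precision-weighted average of the selective optimum and the mutational mean $\theta_m$. Other fitness functions can be substituted by replacing `log_mean_fitness`.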
In follows, into the approximate magnitude of σN2 will be This can be by that will have values of the of magnitude of where is the mutation rate per This is equivalent to the of at neutral sites in natural populations under equilibrium, and from to with the lower and of the range being in and Thus, because of traits are typically on the of to (Lynch and Walsh, σN2 is expected to be in the range of to the average within-population phenotypic variance for the Selection for an optimum A commonly assumed form of selection, relevant to many cellular features, is the fitness function with an optimum phenotype, and a the of selection around the optimum, Application of this expression to Equations (3) and (4) leads to the expression for mean population which when applied to Equation (7) yields the expression for necessary for the stationary distribution 1). The latter expression shows that the change in the mean phenotype resulting from selection is directly proportional to the deviation of the current mean phenotype from the optimum and proportional to the sum of the of the fitness function and the total phenotypic variance (Lande, 1976). As will be below, phenotypic variance consequence of external environmental and internal cellular reduces the of selection by the between genotype and phenotype. the mean phenotype to evolve to the optimum, which is unlikely with biased mutation pressure, selection would be stabilizing in only to reduce the variation around the mean. 
1 for mean population and the rate of change of the mean phenotype resulting from selection, obtained from Equations (4) and With both the selection and mutation terms in Equation (11) being the product is also (Lande, in this case leading to a stationary distribution of mean phenotypes with overall mean and variance where with and σN2 defined as being the of the associated with selection and Equation states that the mean is equal to a average of the under mutation and selection alone component being by the of the variance of the Equation states that the variance of means is equal to the mean of the associated with selection and mutation alone. As which a fitness function and an approach toward neutrality, the mean and variance on the for a driven process, θm and As which a influence of mutation on the overall distribution, the mean and variance on the for a process, and As can be from Equations a of the form of the stationary distribution of means is the which the following is the of the fitness function can be expected to be than the phenotypic deviation the selective on the trait would be and this is observed (Walsh and Lynch, 2018). the range of heritability this that the is unlikely to be than under selection, and can one to two of magnitude smaller than under weak selection. 
mutation rates at the single level are typically in the range of to with the being in and the latter in large multicellular species (Lynch et al., Thus, in that individual of mutation may more than single is likely to be in the range of to Together, these results a likely range for of to which Equations to With these values in Figure shows that the form of the stationary distribution with the value of extremely and extremely flat at of the for this The degree to which θm from for cellular features is but there is no to to be they can deviate from the optimum to a degree that depends on the weighting (Figure Figure Download asset Open asset distributions of mean phenotypes, with optimum phenotype and are given for three values of θm for the in which and values of for the case in which the mutational mean with the optimum (black The values in these a scale on which the phenotypic deviation is so a mean phenotype of is equivalent to a of phenotypic deviations from the fitness function Many cellular features are likely to be under continuous selection for an extreme optimum, but with of selection as the optimum is For example, many enzymes are likely to be for as a catalytic rate as protein for as rates and as binding interfaces with as as etc. of this type of selection the function, where the and define the and of the fitness response to is equal to when and one as for the mean population fitness and the change in the mean resulting from selection, obtained by the are provided in 1, and of the into Equation (11) yields the stationary distribution of mean of the of this fitness function, the distribution is no but yields an expression for the single of the distribution, with the in fitness with the distribution of mean phenotypes is from by the of mutation and the of drift. selection is always in the the expected always the neutral expectation to a degree that with the effective population size. 
Equation is but provided in the limit of large Ne, Although the fitness function a distribution of means to the the bulk of the distribution is approximately normal, and an to the variance can be obtained from the of the stationary distribution around the the of the of the second of the stationary As in the case of the fitness function, Equation the two terms in the are the of the expected under the limits of selection and An of the influence of population size on the stationary distribution is given in Figure where there is a mutational bias from the The distributions to the right with an in Ne, with the mean phenotype over a three range of Ne. As can be from Equation equal changes in either Ne or the neutral variance σN2 have identical effects on the mean, although effects on the variance are in Figure Download asset Open asset distributions of mean phenotypes with a
