|
|
老实的电梯 · 使用 REST API 或 Azure ...· 1 年前 · |
|
|
果断的小虾米 · QT "QDomDocument 的 ...· 2 年前 · |
|
|
焦虑的柳树 · git项目代码一次push,同时上传到多个g ...· 2 年前 · |
|
|
淡定的橙子 · 多路SOCKET、心跳包、注册包的作用_服务器· 2 年前 · |
* E-mail: marsland@bu.edu
Affiliation: Department of Physics, Boston University, Boston, Massachusetts, United States of America http://orcid.org/0000-0002-5007-6877
Affiliations: Department of Physics, Boston University, Boston, Massachusetts, United States of America, Department of Physics, Boston College, Chestnut Hill, Massachusetts, United States of America
Affiliation: Bioinformatics Program, Boston University, Boston, Massachusetts, United States of America
Affiliation: Department of Physics, Boston University, Boston, Massachusetts, United States of America
Natural microbial communities contain hundreds to thousands of interacting species. For this reason, computational simulations are playing an increasingly important role in microbial ecology. In this manuscript, we present a new open-source, freely available Python package called Community Simulator for simulating microbial population dynamics in a reproducible, transparent and scalable way. The Community Simulator includes five major elements: tools for preparing the initial states and environmental conditions for a set of samples, automatic generation of dynamical equations based on a dictionary of modeling assumptions, random parameter sampling with tunable levels of metabolic and taxonomic structure, parallel integration of the dynamical equations, and support for metacommunity dynamics with migration between samples. To significantly speed up simulations using Community Simulator, our Python package implements a new Expectation-Maximization (EM) algorithm for finding equilibrium states of community dynamics that exploits a recently discovered duality between ecological dynamics and convex optimization. We present data showing that this EM algorithm improves performance by between one and two orders compared to direct numerical integration of the corresponding ordinary differential equations. We conclude by listing several recent applications of the Community Simulator to problems in microbial ecology, and discussing possible extensions of the package for directly analyzing microbiome compositional data.
Citation: Marsland R, Cui W, Goldford J, Mehta P (2020) The Community Simulator: A Python package for microbial ecology. PLoS ONE 15(3): e0230430. doi:10.1371/journal.pone.0230430
Editor: Isaac Klapper, Temple University, UNITED STATES
The last decade has seen a renewed interest in the study of microbial communities. Different environments can harbor diverse communities containing from hundreds to thousands of distinct microbes [ 1 , 2 ]. A central goal of community ecology is to understand the ecological processes that shape these diverse ecosystems. The diversity and function of ecosystems are affected by a wide variety of factors including energy and resource availability [ 3 , 4 ], ecological processes such as competition between species [ 5 – 8 ] and stochastic colonization [ 9 – 12 ].
Microbial ecosystems also present several new challenges specific to microbes that are not usually addressed in the theoretical ecology literature. Classical models of community ecology (especially niche-based theories) have traditionally considered ecosystems with a few species and resources [ 13 , 14 ]. However, microbial ecosystems often have thousands of species and hundreds of small molecules that can be consumed. It is unclear how the intuitions and results from these low-dimensional settings scale to microbiomes. It is known that diverse ecosystems can exhibit distinct emergent features and phase transitions not found in low-dimensional systems [ 15 – 18 ]. Furthermore, classical ecological models usually assume a strict trophic layer separation, ignoring cross-feeding and syntrophy—the consumption of metabolic byproducts of one species by another species. It is now becoming clear that cross-feeding is a central component of microbial ecosystems [ 19 – 22 ] and any ecological model must account for this phenomenon.
For these reasons, there is a need for new ways of understanding microbial ecosystems. One powerful approach for understanding complex systems is through simulations. However, simulating diverse microbial ecosystems presents some unique challenges. First, most ecosystems are mathematically represented by complicated coupled, non-linear ordinary differential equations. Simulating these systems in ecosystems with hundreds to thousands of species and metabolites becomes computationally difficult and time-consuming. Second, these dynamical models have thousands of parameters. One needs a principled and biologically realistic way of choosing such parameters. Third, explaining real data requires incorporating ecological processes such as stochastic colonization that play an important role in shaping community structure and dynamics. Finally, we need to be able to incorporate spatial and population level structures in an experimentally realistic way.
Recently, we presented a powerful minimal model of microbial ecosystems that addresses these concerns [ 20 , 22 ]. Furthermore, we have found a mathematical mapping between ecological dynamics and constrained optimization that can be used to accelerate simulations of many large ecosystems [ 23 , 24 ]. In this paper, we present a new open-source Python package for microbial ecology called Community Simulator that implements these theoretical advances, making it easy to simulate complex microbial communities in a variety of experimentally relevant settings.
The architecture of Community Simulator is inspired by the parallel experiments commonly performed with 96-well plates, as illustrated in Fig 1 . The central object of the package is a Community class, whose instances are initialized by specifying the initial population sizes and resource concentrations for each parallel “well,” along with the functions and parameters that define the population dynamics. The initial state, dynamical equations and parameters can all be generated automatically from a dictionary of modeling assumptions, or custom-built by the user. Each instance of this class represents an n -well plate, containing n well-mixed, non-interacting communities. Once initialized, the state of the plate can be updated in one of two ways. Propagate(T) propagates the system for a time T by integrating the supplied dynamical equations, and Passage(f) builds a replacement plate by adding a fraction f μν of the contents of each old well ν to each new well μ . In the final section below, we will discuss a third method SteadyState() , which can find the fixed point of the dynamics in some models without numerical integration.
The core object of the Community Simulator is a virtual n -well plate, holding n independent well-mixed microbial communities. This plate has three properties: its current state, a dynamical law for the population dynamics, and a set of parameters. Once a plate is initialized, two actions can be performed on it: propagation in time using the given dynamical law, and passaging of given fractions of the contents of each well to fresh wells on a replacement plate. For some models, the equilibrium state of the population dynamics can also be found directly using a new algorithm summarized in Fig 5 below.
doi:10.1371/journal.pone.0230430.g001
The package also includes some functions for analysis of the simulation results, including a variety of measures of alpha diversity, as well as extraction of energy flux networks, effective interaction coefficients, and sensitivities to parameter perturbations.
In the following sections, we describe the functionality of each element of the package in turn. We particularly focus on the tools for generating the dynamical equations and the parameter sets, explaining how increasing levels of biological realism can be progressively incorporated. Each section has a corresponding segment in the Jupyter notebook Tutorial.ipynb included with the Community Simulator package. This notebook contains all the code and parameters for generating the figures found in the paper.
The state of a Community instance is contained in a pair of Pandas data frames ( https://pandas.pydata.org/ , [ 25 ]), one of size S tot × n for the microbial population sizes N i ( i = 1, 2, … S tot ) and one of size M × n for the resource abundances R α ( α = 1, 2, … M ). Each row of the data frame corresponds to a different species or resource type, while each column corresponds to a different well.
The function MakeInitialState automatically creates data frames N0,R0 of initial population sizes and resource abundances corresponding to some common experimental scenarios, specified in a dictionary of assumptions. The initial species abundances are supplied by a stochastic process that is agnostic to species identity. This roughly captures the various dispersal mechanisms including mechanical disturbances and turbulent flow that convey microbial cells to new environments. Specifically, random subsets of S species from the regional pool of size S tot are supplied to the n wells of the plate. The population sizes of these species are set to 1 by default, and can be rescaled afterwards if desired.
Initial resource abundances are generated by MakeInitialState based on a Biolog plate scenario, where each well is supplied with a single carbon source. The assumptions dictionary specifies the identity and quantity of the carbon source for each well. Arbitrarily chosen resource abundances can of course be directly supplied to the Community instance instead, to simulate more general conditions.
To capture coarse-grained metabolic structure, the M resources can be assigned to T classes (e.g. sugars, amino acids, etc.), each with M A resources where A = 1, … T and ∑ A M A = M . These class labels become functionally relevant by biasing the sampling of consumption preferences and byproduct stoichiometry, as will be described below. Likewise the S tot species can be assigned to F families, with F ≤ T , and each family preferentially consuming resources from a different resource class. A generalist family can also be included, with S gen species and no preferred resource class, so that S gen + ∑ A S A = S tot .
Instances of the Community class can be initialized with any set of differential equations, which are specified as functions of the system state that return the time derivatives dN i / dt and dR α / dt . The package includes tools for constructing these functions automatically based on a dictionary of assumptions. These built-in dynamics are based on the recently introduced Microbial Consumer Resource Model (MicroCRM) [ 20 , 22 ] illustrated in Fig 2 , which generalizes the classic consumer resource model of MacArthur and Levins [ 6 ] to the microbial context by allowing organisms to release metabolic byproducts. Table 1 lists all the parameters of this family of models, along with the corresponding units.
doi:10.1371/journal.pone.0230430.t001
In order to provide a general-purpose set of models that produce physically reasonable results, the MicroCRM assumes that all resource types are substitutable, and can all be converted to a common energy currency. This allows us to enforce energy conservation, preventing communities from bootstrapping themselves to large population sizes using metabolic secretions with no external resource supply. It also eliminates the need to specify in detail how each resource type interacts with all the others within the consumer metabolism. If such interactions are important for capturing a given experimental phenomenon, the built-in dynamics cannot be used, and custom functions must be written for dN i / dt and dR α / dt . The tutorial notebook included with the package contains an example of this kind, using Liebig’s Law of the Minimum to model phytoplankton dynamics.
We begin by defining an energy flux into a cell
J
in
, an energy flux that is used for growth
J
growth
, and an outgoing energy flux due to byproduct secretion
J
out
. Energy conservation requires
for any reasonable metabolic model. It is useful to denote the input and output energy fluxes that are consumed/secreted in metabolite
β
by
and
respectively. We can define corresponding mass fluxes by
where the conversion factor
w
β
measures the energy density of metabolite
β
. In general, all these fluxes depend on the consumer species under consideration, and will carry an extra Roman index
i
indicating the species.
We assume that a fixed quantity m i of power per cell is required for maintenance of species i , and that the per-capita growth rate is proportional to the remaining energy flux ( J growth − m i ), with proportionality constant g i . Under these assumptions, the time-evolution of the population size N i of species i can be modeled using the equation
We can model the resource dynamics by functions of the form where the function h α describes the resource dynamics in the absence of consumers. The Community Simulator has two kinds of default resource dynamics: externally supplied and self-renewing. For externally supplied resources, we take a linearized form of the dynamics: while for self-renewing we take a logistic form
We now specify the form of the input fluxes
, and of the relationships among input, output and growth that define the metabolism. We start by assuming that all resource utilization pathways are independent, resulting in input fluxes of the form
where
σ
is a single-valued function encoding the relationship between resource availability and uptake rates. The community simulator implements three kinds of response functions: Type-I, linear response functions where
a Type-II saturating Monod function,
and a Type-III Hill or sigmoid-like function
where
n
> 1.
To obtain the output fluxes, we define a leakage fraction l such that
We allow different resources to have different leakage fractions l α . A direct consequence of energy conservation ( Eq (1) ) is that
Finally, we denote by D βα the fraction of the output energy that is contained in metabolite β when a cell consumes α . Note that by definition ∑ β D βα = 1. The total energy output in metabolite β is thus
We are now in position to write down the full dynamics in terms of these quantities:
Notice that when σ is Type-I (linear) and l α = 0 for all α (no leakage or byproducts), this reduces to MacArthur’s original model [ 6 ].
The package can also generate dynamics for active metabolic regulation, which allocates a higher fraction of import capacity to nutrients with higher available energy flux. This regulation is implemented through a series of weight functions for resource
α
that reflect how much of the utilizable energy in the environment is in resource
α
(17)
with
n
reg
a Hill coefficient that tunes steepness. Another option is to regulate based on the fraction of biomass contained in resource
α
,
For the metabolically regulated model, we define the input fluxes by
Then, we can follow the exact same procedure as above. This yields the equations
The MicroCRM contains a large number of parameters: the
S
tot
-dimensional vectors
g
i
and
m
i
, the
M
-dimensional vectors
and
τ
α
or
r
α
, the
S
tot
×
M
consumer preference matrix
c
iα
and the
M
×
M
metabolic matrix
D
αβ
. Some modeling choices require a small number of additional parameters: the maximal uptake rate
σ
max
for Type-II and Type-III growth, and the exponents
n
for Type-III growth and
n
reg
for metabolic regulation. A dictionary containing all these parameters must be supplied to the
Community
instance upon initialization. A list of dictionaries may be supplied instead, to allow different wells to have different parameters.
The package contains a function MakeMatrices for generating the two matrices, which contain most of the ecological structure, based on a dictionary of modeling assumptions summarized in Table 2 . The output of this function is illustrated in Fig 3 and described in detail below.
We choose consumer preferences
c
iα
as follows. As stated earlier, we assume that each specialist family has a preference for one resource class
A
(where
A
= 1…
F
) with 0 ≤
F
≤
T
, and we denote the consumer coefficients for this family by
. We also consider generalists that have no preferences, with consumer coefficients
. The
can be drawn from one of three probability distributions: (i) a Normal/Gaussian distribution, (ii) a Gamma distribution (which ensure positivity of the coefficients), and (iii) a Bernoulli distribution with binary preference levels.
Fig 3
shows examples of all three models.
The Gaussian model is parameterized in terms of the mean
μ
c
= 〈∑
α
c
iα
〉 and variance
of the total consumption capacity, and a parameter
q
that controls how specialized each family is for its preferred resource class. In the generalist family, the mean and variance of
c
iα
are the same for all resources, and are given by
(21)
(22)
where
is the deviation from the mean value. The specialist families sample from a distribution with a larger mean for resources in their preferred class:
where
M
A
is the number of resources in class
A
and
A
is the set of resource indices in class
A
. The variances are likewise larger for the preferred class:
This makes it possible to construct pure specialist families with no off-target consumption by setting q = 1. Note that this is different from the original version of this model in [ 22 ], where all the variances were chosen to be identical.
We also consider the case where consumer preferences are drawn from Gamma distributions, which guarantee that all coefficients are positive. Since the Gamma distribution only has two parameters, it is fully determined once the mean and variance are specified. We parameterize the mean and variance for this model in the same way as for the Gaussian model.
In the binary model, there are only two possible values for each
c
iα
: a low level
and a high level
. The elements of
are given by
where
X
iα
is a binary random variable that equals 1 with probability
for the specialist families, and
for the generalists. Note that the variance in each family is
for large
M
, which depends on
q
in the same way as the variances in the Gaussian case.
We choose the metabolic matrix
D
αβ
according to a three-tiered secretion model illustrated in
Fig 3
. The first tier is a preferred class of ‘waste’ products, such as carboyxlic acids for fermentative and respiro-fermentative bacteria, with
M
w
members. The second tier contains byproducts of the same class as the input resource. For example, this could be attributed to the partial oxidation of sugars into sugar alcohols, or the antiporter behavior of various amino acid transporters. The third tier includes everything else. We encode this structure in
D
αβ
by sampling each column
β
of the matrix from a Dirichlet distribution with concentration parameters
d
αβ
that depend on the byproduct tier, so that on average a fraction
f
w
of the secreted flux goes to the first tier, while a fraction
f
s
goes to the second tier, and the rest goes to the third. The Dirichlet distribution has the property that each sampled vector sums to 1, making it a natural way of randomly allocating a fixed total quantity (such as the total secretion flux from a given input). To write the expressions for these parameters explicitly, we let
A
(
α
) represent the class containing resource
α
, and let
w
represent the ‘waste’ class. We also introduce a parameter
s
that controls the sparsity of the reaction network, ranging from a dense network with all-to-all connection when
s
→ 0, to maximal sparsity with each input resource having just one randomly chosen output resource as
s
→ 1. With this notation, we have
(28)
(29)
The final two lines handle the case when the ‘waste’ type is being consumed. For these columns, the first and second tiers are identical. This led to an ambiguity in the expression presented in the Supporting Information of [ 22 ], which we have now clarified by treating this case separately.
Once an instance of the Community class is initialized, its state can be propagated forward in time using the Propagate method, as illustrated in Fig 1 . Since the dynamical equations and parameters were supplied at initialization, the only required argument for this method is the time T . When the method is invoked, the state and parameters for each well are sent to different CPU’s (as many as are available) using the Pool.map function from the multiprocessing module in the Python standard library. Then the dynamical equations are integrated using the odeint function from SciPy, which calls the LSODA solver from the FORTRAN library ODEPACK [ 26 , 27 ].
If the initial population size of a species equals zero, but its per-capita growth rate is positive under current environmental conditions, the limited precision of any numerical solver will enable that species to spontaneously invade the community. To prevent this from happening, the Propagate method comes with an option compress_species (set to True by default), which reduces the dimensionality of the state and parameters before invoking the solver by removing all references to extinct species. Compression requires deciding whether each dimension of each parameter array corresponds to species or resources. This information is built in to the package for the MicroCRM, so c , D , w , g , m , l , R0 , tau and r are all automatically handled correctly according to their definitions in that context. If custom dynamics are used, a dictionary of parameter dimensions must be supplied to the optional argument dimensions of Community when the instance is initialized, listing which parameters are of length S, length M, or of shape SxM, SxS, or MxM, respectively.
The second method that can be invoked on a Community instance is Passage , which simulates pipetting of cultures to fresh wells in a typical 96-well plate experiment. The only required argument for Passage is a two-dimensional array f , whose elements f μν specify the fraction of the contents of well ν from the old plate that should be transfered to well μ on the new plate. The method also contains an option refresh_resource , set to True by default, that supplies the new plate with the same initial resource concentrations as the original one. This is the most direct way of simulating actual 96-well plate experiments, where the resources are resupplied in discrete intervals at each passaging step.
This method facilitates simulation of various kinds of mixing or coalescence experiments, as well as metacommunity dynamics of weakly coupled local communities. Fig 4 illustrates how this feature can capture coarse-grained spatial structure, following an experimental protocol developed for mimicking range expansions in 96-well plates [ 28 ].
In addition, passaging helps to stabilize long simulations, and simulate demographic noise, by setting all cell counts to integer values. The species abundances N i are converted to absolute cell counts through a conversion factor scale , which is by default set to 10 6 . Then the new integer cell counts are generated by multinomial sampling based on the average number of cells ∑ ν f μν ( N i ) ν of species i transferred to well μ . One source of numerical instability in ecological models is the exponentially small values reached by population sizes headed for extinction. The multinomial sampling ensures that these population sizes are fixed at zero when they become significantly smaller than 1 in absolute units. Thus a simple way of avoiding instability in a continuously resupplied chemostat model is to periodically call Passage with f set to the identity matrix and refresh_resource set to False .
Since the most common experiments involve many iterations of identical Passage and Propagate steps, the package also includes a method RunExperiment(f,T,np) , which applies a given transfer matrix f and propagation time T for np iterations, saving a snapshot of the plate after each propagation step. If passaging is being used simply to stabilize the integration as discussed above, and not to simulate a batch culture setting, then the value of T does not affect the results (as long as it is short enough to successfully eliminate instabilities). In this case, this variable mainly serves to control the time-resolution of the resulting timeseries.
In many experimental contexts, one is interested in the stable community structure reached after a long period of constant environmental conditions. Numerical integration of the dynamical equations is an inefficient way to identify these equilibrium points. When the species diversity and number of resource types are high, the equations typically contain complex transient dynamics spanning a large range of time scales. This transient behavior is often irrelevant to the identification of the final equilibrium point, and wastes significant computation time. For a typical implementation of the MicroCRM with Type-I response and a 1, 280 × 1, 280 binary consumer matrix, integrating to the steady state takes about 37 hours on standard hardware. The computation time appears to scale asymptotically as M 4 when the number of species S and the number of resources M are changed simultaneously, as shown in Fig 6 .
To address this problem, we have developed an algorithm for identifying equilibrium points directly, without integration through the transient. This algorithm is implemented in Community Simulator as the method SteadyState , and is illustrated in Fig 5 . Under the same test conditions, this algorithm converges between one and two orders of magnitude faster than numerical integration, as shown in Fig 6 , facilitating rapid hypothesis evaluation and iteration in large ecosystems. Note that while all other features of the Community Simulator depend only on packages included with a standard Anaconda installation ( https://www.anaconda.com/ ), SteadyState additionally requires prior installation of a convex optimization package called CVXPY ( https://www.cvxpy.org/ ).
(a)
Noninvadable states by definition can only exist in the region Ω of resource space where the growth rate
dN
i
/
dt
of each species
i
is zero or negative. Here, the blue and orange lines represent the combinations of resource abundances leading to zero growth rate for two different consumer species, so the noninvadable region is the space beneath both of the lines. Within this region, a recently discovered duality implies that the stationary state
R
* locally minimizes the dissimilarity
d
(
R
0
,
R
) with respect to the fixed point
R
0
of the intrinsic environmental dynamics [
23
,
24
].
(b)
Metabolic byproducts move the relevant unperturbed state from
R
0
(gray ‘x’) to
(black ‘x’), which is itself a function of the current environmental conditions. Dotted contour lines represent
, and arrows are two trajectories of the population dynamics starting from the unperturbed environmental state with two different sets of initial consumer population sizes. See main text and Appendix for model details and parameters.
(c)
Pseudocode for self-consistently computing
R
* and
, which is identical to standard expectation-maximization algorithms employed for problems with latent variables in machine learning.
doi:10.1371/journal.pone.0230430.g005
The algorithm exploits a recently discovered duality between consumer resource models and constrained optimization over resource space [ 23 , 24 ], which generalizes a minimization principle originally identified by MacArthur in the context of his original Consumer Resource Model [ 6 ]. This duality applies to a wide class of consumer-resource type models, requiring only that the environmentally mediated interactions between pairs of consumer species are symmetric [ 24 ].
For models in this class, it was shown that the vector of resource abundances
R
* in every stable equilibrium state locally minimizes a measure
d
(
R
0
,
R
) of the dissimilarity between the current resource abundances
R
and the supply point
R
0
(defined in general by
), subject to the constraint that all consumer growth rates are zero or negative (
dN
i
/
dt
≤ 0 for all
i
). Instances of the MicroCRM with Type I resource consumption, no metabolic regulation and
l
α
= 0 fall into this class. For externally supplied resources,
d
turns out to be a weighted Kullback-Leibler (KL) divergence:
while for self-renewing resources, it is a weighted Euclidean distance [
24
]:
The equilibrium consumer populations
are the Lagrange multipliers that enforce the constraints. For models with Type-I response, the non-invadable region is convex, allowing for efficient solution of the optimization problem using the Python package CVXPY [
29
,
30
].
The duality does not strictly apply to other variants of the MicroCRM. In particular, byproduct secretion breaks the symmetry of the effective interactions between consumers whenever
l
α
> 0 for some resource
α
. The duality can be recovered, however, if the equilibrium point
R
0
of the intrinsic resource dynamics is changed to a new value
, which accounts for the extra resources produced by the consumer species when the system is at its equilibrium state
R
* [
24
]. This accounting can be done in a variety of ways that all successfully recover the duality, with varying degrees of computational efficiency. The currently implemented form is
where
and
are the elements of the matrix inverse of
Q
αβ
, satisfying
. With these definitions, one can show that the MicroCRM with Type I consumption and no regulation minimizes the objective function
where
To find R *, one must now self-consistently solve the following equation:
The structure of this problem is mathematically equivalent to a standard task in machine learning, where one attempts to infer model parameters from partial data [
31
]. These parameters
θ
specify a multivariate probability distribution
p
(
y
|
θ
) for a set of measurements
y
. A standard way of estimating the parameters is to compute the values
that maximize the likelihood of the data:
. But if one actually has access to only a subset
x
of the measurement results, then the values
z
of the remaining quantities must also be estimated in order to perform this optimization. Ideally, one would use the statistical model with the optimal parameters
for this task. In the simplest case, where the value of
z
can be inferred with certainty given
θ
and
x
, this results in the following self-consistency equation:
This is identical in form to
Eq 36
, where the parameters
θ
become the resource concentrations
R
and the estimated latent variables
z
become the effective unperturbed state
. The observed data
x
become the model parameters, which are implicitly used in the calculation of
d
and
. We can think of the environmental perturbation
d
as a statistical potential or “free energy” −ln
p
, which is minimized when
p
is maximized.
This algorithm fails to converge at low resource supply levels, because both arguments must be positive when
d
is a weighted KL divergence, but
can temporarily become negative under these conditions. To solve this issue, we replaced update step for
by
where
α
is a constant rate, equivalent to the “learning rate” in machine learning [
31
].
Default values of the tolerance δ and learning rate α are set to 10 −7 and 0.5, respectively, which give robust convergence for typical simulation scenarios. They can be adjusted as optional arguments of the SteadyState method.
This algorithm can be applied to any consumer-resource type model, including models beyond the MicroCRM framework, with non-substitutable resources [ 24 ]. But the enhanced efficiency of the new approach requires that the optimization problem be convex. In the Community Simulator package, the algorithm is only implemented for Type-I response with no metabolic regulation, where convexity is guaranteed. For more complex models, the differential equations must be numerically integrated using the Propagate method discussed above.
In the tutorial notebook included with the package, we show that the MicroCRM can be bistable if the externally supplied resources are insufficient to directly support growth of any consumer species. In this scenario the state with all consumers extinct is a stable equilibrium of the dynamics, and another stable equilibrium with persisting consumers is also possible that relies on the metabolic byproducts. In this scenario
SteadyState
method can find either of the two equilibria, depending on the initial estimate of
, which can be set by an optional argument. If this initial condition is sufficiently close to the actual equilibrium state
R
0
of the intrinsic resource dynamics, the method ends in the state with the consumers extinct, where
. But if the initial condition is not deliberately tuned to be close to
R
0
, we find that the method typically finds the other state where some consumers survive.
We hope that the Community Simulator will become a valuable resource for the microbial ecology community. It has already played an important role in our own work. The package initially facilitated the systematic evaluation of the robustness of results to different modeling assumptions in a study of the effects of total energy influx on community structure, diversity and function [ 24 ]. More recently, the convex optimization approach has made it possible to perform more than 100,000 independent simulations in a reinterpretation and extension of Robert May’s classic work on diversity and stability [ 32 , 33 ]. We have also employed the package to reproduce large-scale patterns in microbial biodiversity from the Human Microbiome Project, Earth Microbiome Project, and similar surveys [ 34 ]. Finally, the random matrix approach implemented in this package is amenable to analytic calculation in the limit of large numbers of species and resources, using cavity methods from the physics of disordered systems [ 35 , 36 ]. It is our belief that the Community Simulator will facilitate the further development of these mathematical techniques through efficient testing of new conjectures.
One interesting future direction to explore is integrating the Community Simulator with methods for directly analyzing Microbiome sequencing data. For example, there has been a renewed interest in statistical techniques such as Approximate Bayesian Computation (ABC) for understanding ecology and evolution [ 37 ]. In ABC, the need to exactly calculate complicated likelihood functions—often a prerequisite for many statistical techniques—is replaced with the calculation of summary statistics and numerical simulations. For this reason, the Community Simulator Python package is ideally suited to form the backbone of new inference techniques for trying to related ecological processes to observed abundance patterns in microbial ecosystems.
We are grateful to Kirill Korolev, Alvaro Sanchez, and Daniel Segrè for many useful conversations, and to Matti Gralka for testing cross-platform compatibility of the package. The performance evaluation reported in Fig 6 was performed on the Shared Computing Cluster which is administered by Boston University Research Computing Services.
Discover a faster, simpler path to publishing in a high-quality journal. PLOS ONE promises fair, rigorous peer review, broad scope, and wide readership – a perfect fit for your research every time.
|
|
老实的电梯 · 使用 REST API 或 Azure 门户进行 Azure 通知中心和 Google Firebase Cloud Messaging (FCM) 迁移 | Microsoft Learn 1 年前 |
|
|
淡定的橙子 · 多路SOCKET、心跳包、注册包的作用_服务器 2 年前 |