Similar literature
 20 similar articles found (search time: 0 ms)
1.

Background  

The aim of protein design is to predict amino-acid sequences compatible with a given target structure. Traditionally envisioned as a purely thermodynamic question, this problem can also be understood in a wider context, where additional constraints are captured by learning the sequence patterns displayed by natural proteins of known conformation. In this latter perspective, however, we still need a theoretical formalization of the question, leading to general and efficient learning methods, and allowing for the selection of fast and accurate objective functions quantifying sequence/structure compatibility.

2.
The kin-cohort design is a promising alternative to traditional cohort or case-control designs for estimating penetrance of an identified rare autosomal mutation. In this design, a suitably selected sample of participants provides genotype and detailed family history information on the disease of interest. To estimate penetrance of the mutation, we consider a marginal likelihood approach that is computationally simple to implement, more flexible than the original analytic approach proposed by Wacholder et al. (1998, American Journal of Epidemiology 148, 623-629), and more robust than the likelihood approach considered by Gail et al. (1999, Genetic Epidemiology 16, 15-39) to presence of residual familial correlation. We study the trade-off between robustness and efficiency using simulation experiments. The method is illustrated by analysis of the data from the Washington Ashkenazi Study.

3.
This study compares the continental phylogeographic patterns of two wild European species linked by a host-parasite relationship: the field mouse Apodemus sylvaticus and one of its specific parasites, the nematode Heligmosomoides polygyrus. A total of 740 base pairs (bp) of the mitochondrial cytochrome b (cyt b) gene were sequenced in 122 specimens of H. polygyrus and compared with 94 cyt b gene sequences (974 bp) previously acquired for A. sylvaticus. The results reveal partial spatial and temporal congruences in the differentiation of both species' lineages: the parasite and its host present three similar genetic and geographical lineages, i.e. Western European, Italian and Sicilian, and both species recolonized northwestern Europe from the Iberian refuge at the end of the Pleistocene. However, H. polygyrus presents three particular differentiation events. The relative rate of molecular evolution of the cyt b gene was estimated to be 1.5-fold higher in the parasite than in its host. Therefore, the use of H. polygyrus as a biological magnifying glass is discussed as this parasite may highlight previously undetected historical events of its host. The results show how incorporating phylogeographic information of an obligate associate can help to better understand the phylogeographic pattern of its host.

4.
5.
Approximate likelihood ratios for general estimating functions   Total citations: 1, self-citations: 0, citations by others: 1
The method of estimating functions (Godambe, 1991) is commonly used when one desires to conduct inference about some parameters of interest but the full distribution of the observations is unknown. However, this approach may have limited utility, due to multiple roots for the estimating function, a poorly behaved Wald test, or lack of a goodness-of-fit test. This paper presents approximate likelihood ratios that can be used along with estimating functions when any of these three problems occurs. We show that the approximate likelihood ratio provides correct large sample inference under very general circumstances, including clustered data and misspecified weights in the estimating function. Two methods of constructing the approximate likelihood ratio, one based on the quasi-likelihood approach and the other based on the linear projection approach, are compared and shown to be closely related. In particular we show that quasi-likelihood is the limit of the projection approach. We illustrate the technique with two applications.

6.
We present moments and likelihood methods that estimate a DNA substitution rate from a group of closely related sister species pairs separated at an assumed time, and we test these methods with simulations. The methods also estimate ancestral population size and can test whether there is a significant difference among the ancestral population sizes of the sister species pairs. Estimates presented in the literature often ignore the ancestral coalescent prior to speciation and therefore should be biased upward. The simulations show that both methods yield accurate estimates given sample sizes of five or more species pairs and that better likelihood estimates are obtained if there is no significant difference among ancestral population sizes. The model presented here indicates that the larger than expected variation found in multitaxa datasets can be explained by variation in the ancestral coalescence and the Poisson mutation process. In this context, observed variation can often be accounted for by variation in ancestral population sizes rather than invoking variation in other parameters, such as divergence time or mutation rate. The methods are applied to data from two groups of species pairs (sea urchins and Alpheus snapping shrimp) that are thought to have separated by the rise of Panama three million years ago.

7.
We examine the evolution of mesic forest ecosystems in the Pacific Northwest of North America using a statistical phylogeography approach in four animal and two plant lineages. Three a priori hypotheses, which explain the disjunction in the mesic forest ecosystem with either recent dispersal or ancient vicariance, are tested with phylogenetic and coalescent methods. We find strong support in three amphibian lineages (Ascaphus spp., Dicamptodon spp., and Plethodon vandykei and P. idahoensis) for deep divergence between coastal and inland populations, as predicted by the ancient vicariance hypothesis. Unlike the amphibians, the disjunction in other Pacific Northwest lineages is likely due to recent dispersal along a northern route. Topological and population divergence tests support the northern dispersal hypothesis in the water vole (Microtus richardsoni) and northern dispersal has some support in both the dusky willow (Salix melanopsis) and whitebark pine (Pinus albicaulis). These analyses demonstrate that genetic data sampled from across an ecosystem can provide insight into the evolution of ecological communities and suggest that the advantages of a statistical phylogeographic approach are most pronounced in comparisons across multiple taxa in a particular ecosystem. Genetic patterns in organisms as diverse as willows and salamanders can be used to test general regional hypotheses, providing a consistent metric for comparison among members of an ecosystem with disparate life-history traits.

8.
A penalized maximum likelihood method for estimating epistatic effects of QTL   Total citations: 16, self-citations: 0, citations by others: 16
Zhang YM, Xu S. Heredity, 2005, 95(1): 96-104
Although epistasis is an important phenomenon in the genetics and evolution of complex traits, epistatic effects are hard to estimate. The main problem is due to the overparameterized epistatic genetic models. An epistatic genetic model should include potential pair-wise interaction effects of all loci. However, the model is saturated quickly as the number of loci increases. Therefore, a variable selection technique is usually considered to exclude those interactions with negligible effects. With such techniques, we may run a high risk of missing some important interaction effects by not fully exploring the extremely large parameter space of models. We develop a penalized maximum likelihood method. The method developed here adopts a penalty that depends on the values of the parameters. The penalized likelihood method allows spurious QTL effects to be shrunk towards zero, while QTL with large effects are estimated with virtually no shrinkage. A simulation study shows that the new method can handle a model with a number of effects 15 times larger than the sample size. Simulation studies also show that results of the penalized likelihood method are comparable to the Bayesian shrinkage analysis, but the computational speed of the penalized method is orders of magnitude faster.

9.
10.
Imperfect sensitivity, or imperfect detection, is a feature of all survey methods that needs to be accounted for when interpreting survey results. Detection of environmental DNA (eDNA) is increasingly being used to infer species distributions, yet the sensitivity of the technique has not been fully evaluated. Sensitivity, or the probability of detecting target DNA given it is present at a site, will depend on both the survey method and the concentration and dispersion of target DNA molecules at a site. We present a model to estimate target DNA concentration and dispersion at survey sites and to estimate the sensitivity of an eDNA survey method. We fitted this model to data from a species‐specific eDNA survey for Oriental weatherloach, Misgurnus anguillicaudatus, at three sites sampled in both autumn and spring. The concentration of target DNA molecules was similar at all three sites in autumn but much higher at two sites in spring. Our analysis showed the survey method had ≥95% sensitivity at sites where target DNA concentrations were ≥11 molecules per litre. We show how these data can be used to compare sampling schemes that differ in the number of field samples collected per site and number of PCR replicates per sample to achieve ≥95% sensitivity at a given target DNA concentration. These models allow researchers to quantify the sensitivity of eDNA survey methods to optimize the probability of detecting target species, and to compare DNA concentrations spatially and temporally.
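The trade-off between field samples per site and PCR replicates per sample can be illustrated with a minimal sketch. It is not the model fitted in the paper: it assumes target molecules are Poisson-distributed in the water (so it ignores the dispersion the authors estimate), and the sample volume, per-replicate amplification probability, and function names are all hypothetical.

```python
import math


def site_sensitivity(conc_per_litre, sample_volume_l, n_samples, n_pcr, p_pcr):
    """Probability of at least one positive PCR at a site where target DNA is present.

    Simplifying assumptions (not taken from the paper): molecule counts per
    sample are Poisson, and each PCR replicate amplifies independently with
    probability p_pcr once a sample holds at least one target molecule.
    """
    # Chance that a single field sample captures one or more target molecules.
    p_capture = 1.0 - math.exp(-conc_per_litre * sample_volume_l)
    # Chance that at least one of the n_pcr replicates amplifies, given capture.
    p_amplify = 1.0 - (1.0 - p_pcr) ** n_pcr
    p_sample_positive = p_capture * p_amplify
    # The site is detected if any of the n_samples field samples is positive.
    return 1.0 - (1.0 - p_sample_positive) ** n_samples


def min_samples_for(target, conc_per_litre, sample_volume_l, n_pcr, p_pcr,
                    max_samples=1000):
    """Smallest number of field samples reaching the target sensitivity, or None."""
    for n in range(1, max_samples + 1):
        if site_sensitivity(conc_per_litre, sample_volume_l, n, n_pcr, p_pcr) >= target:
            return n
    return None


# Hypothetical scheme: 0.1 L samples, 3 PCR replicates, 50% amplification chance.
needed = min_samples_for(0.95, 11.0, 0.1, 3, 0.5)
```

Under these assumptions, adding field samples and adding PCR replicates raise sensitivity in different ways, which is why schemes with the same total number of PCR reactions can differ in sensitivity.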

11.
Targeted maximum likelihood estimation of a parameter of a data generating distribution, known to be an element of a semi-parametric model, involves constructing a parametric model through an initial density estimator with parameter ε representing an amount of fluctuation of the initial density estimator, where the score of this fluctuation model at ε = 0 equals the efficient influence curve/canonical gradient. The latter constraint can be satisfied by many parametric fluctuation models since it represents only a local constraint of its behavior at zero fluctuation. However, it is very important that the fluctuations stay within the semi-parametric model for the observed data distribution, even if the parameter can be defined on fluctuations that fall outside the assumed observed data model. In particular, in the context of sparse data, by which we mean situations where the Fisher information is low, a violation of this property can heavily affect the performance of the estimator. This paper presents a fluctuation approach that guarantees the fluctuated density estimator remains inside the bounds of the data model. We demonstrate this in the context of estimation of a causal effect of a binary treatment on a continuous outcome that is bounded. It results in a targeted maximum likelihood estimator that inherently respects known bounds, and consequently is more robust in sparse data situations than the targeted MLE using a naive fluctuation model. When an estimation procedure incorporates weights, observations having large weights relative to the rest heavily influence the point estimate and inflate the variance. Truncating these weights is a common approach to reducing the variance, but it can also introduce bias into the estimate. We present an alternative targeted maximum likelihood estimation (TMLE) approach that dampens the effect of these heavily weighted observations. As a substitution estimator, TMLE respects the global constraints of the observed data model. For example, when outcomes are binary, a fluctuation of an initial density estimate on the logit scale constrains predicted probabilities to be between 0 and 1. This inherent enforcement of bounds has been extended to continuous outcomes. Simulation study results indicate that this approach is on a par with, and many times superior to, fluctuating on the linear scale, and in particular is more robust when there is sparsity in the data.
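The contrast between fluctuating on the logit scale and on the linear scale can be sketched as follows. This is only an illustration of the bounding idea, not the estimator from the paper: the fluctuation size `eps`, the covariate value `h`, the bounds, and the function names are hypothetical, and the step that estimates `eps` from data is omitted.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    # Numerically stable inverse logit.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def fluctuate_logit(q_init, eps, h, lower, upper):
    """Fluctuate an initial prediction on the logit scale.

    The outcome is rescaled from (lower, upper) to (0, 1), shifted by eps * h
    on the logit scale, and mapped back, so the result cannot leave the bounds.
    q_init must lie strictly between lower and upper.
    """
    q_star = (q_init - lower) / (upper - lower)
    q_star_eps = expit(logit(q_star) + eps * h)
    return lower + (upper - lower) * q_star_eps


def fluctuate_linear(q_init, eps, h):
    """Fluctuate on the linear scale; nothing keeps the result within bounds."""
    return q_init + eps * h


# Hypothetical case: outcome bounded in [0, 10], initial prediction 9.0, and a
# large covariate value h such as arises when treatment probabilities are small.
bounded = fluctuate_logit(9.0, 0.5, 8.0, 0.0, 10.0)
unbounded = fluctuate_linear(9.0, 0.5, 8.0)
```

With these hypothetical inputs the linear update moves the prediction above the upper bound, while the logit-scale update stays inside it. This is the behavior the abstract describes for sparse data.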

12.
Hypotheses to explain phylogeographic structure traditionally invoke geographic features, but often fail to provide a general explanation for spatial patterns of genetic variation. Organisms' intrinsic characteristics might play more important roles than landscape features in determining phylogeographic structure. We developed a novel comparative approach to explore the role of ecological and life‐history variables in determining spatial genetic variation and tested it on frog communities in Panama. We quantified spatial genetic variation within 31 anuran species based on mitochondrial DNA sequences, for which hierarchical approximate Bayesian computation analyses rejected simultaneous divergence over a common landscape. Regressing ecological variables on genetic divergence allowed us to test the importance of individual variables, revealing that body size, current landscape resistance, geographic range, biogeographic origin and reproductive mode were significant predictors of spatial genetic variation. Our results support the idea that phylogeographic structure represents the outcome of an interaction between organisms and their environment, and suggest a conceptual integration we refer to as trait‐based phylogeography.

13.
At a time when historical biogeography appears to be again expanding its scope after a period of focusing primarily on discerning area relationships using cladograms, new inference methods are needed to bring more kinds of data to bear on questions about the geographic history of lineages. Here we describe a likelihood framework for inferring the evolution of geographic range on phylogenies that models lineage dispersal and local extinction in a set of discrete areas as stochastic events in continuous time. Unlike existing methods for estimating ancestral areas, such as dispersal‐vicariance analysis, this approach incorporates information on the timing of both lineage divergences and the availability of connections between areas (dispersal routes). Monte Carlo methods are used to estimate branch‐specific transition probabilities for geographic ranges, enabling the likelihood of the data (observed species distributions) to be evaluated for a given phylogeny and parameterized paleogeographic model. We demonstrate how the method can be used to address two biogeographic questions: What were the ancestral geographic ranges on a phylogenetic tree? How were those ancestral ranges affected by speciation and inherited by the daughter lineages at cladogenesis events? For illustration we use hypothetical examples and an analysis of a Northern Hemisphere plant clade (Cercis), comparing and contrasting inferences to those obtained from dispersal‐vicariance analysis. Although the particular model we implement is somewhat simplistic, the framework itself is flexible and could readily be modified to incorporate additional sources of information and also be extended to address other aspects of historical biogeography.

14.
We suggest a likelihood-based approach to estimate an overall rate of horizontal gene transfer (HGT) in a simplified setting. To this end, we assume that the number of occurring HGT events within a given time interval follows a Poisson process. To obtain estimates for the rate of HGT, we simulate the distribution of tree topologies for different numbers of HGT events on a clocklike species tree. Using these simulated distributions, we estimate an HGT rate for a collection of gene trees representing a set of taxa. As an illustrative example, we use the "Clusters of Orthologous Groups of proteins" (COGs). We also perform a correction of the estimated rate taking into account the inaccuracies due to gene tree reconstructions. The results suggest a corrected HGT rate of about 0.36 per gene and unit time, in other words 11 HGT events have occurred on average among the 44 taxa of the COG species tree. A software package to estimate an HGT rate is available online (http://www.cibiv.at/software/hgt/).

15.
We develop diagnostic measures for assessing the influence of individual observations when using empirical likelihood with general estimating equations, and we use these measures to construct goodness-of-fit statistics for testing possible misspecification in the estimating equations. Our diagnostics include case-deletion measures, local influence measures and pseudo-residuals. Our goodness-of-fit statistics include the sum of local influence measures and the processes of pseudo-residuals. Simulation studies are conducted to evaluate our methods, and real datasets are analyzed to illustrate the use of our diagnostic measures and goodness-of-fit statistics.

16.
In this paper we describe a new heuristic strategy designed to find optimal (parsimonious) trees for data sets with large numbers of taxa and characters. This new strategy uses an iterative searching process of branch swapping with equally weighted characters, followed by swapping with reweighted characters. This process increases the efficiency of the search because, after each round of swapping with reweighted characters, the subsequent swapping with equal weights will start from a different group (island) of trees that are only slightly, if at all, less optimal. In contrast, conventional heuristic searching with constant equal weighting can become trapped on islands of suboptimal trees. We test the new strategy against a conventional strategy and a modified conventional strategy and show that, within a given time, the new strategy finds trees that are markedly more parsimonious. We also compare our new strategy with a recent, independently developed strategy known as the Parsimony Ratchet.

17.
Studies are carried out on the uniqueness of the stationary point on the likelihood function for estimating molecular phylogenetic trees, yielding proof that there exists at most one stationary point, i.e., the maximum point, in the parameter range for the one parameter model of nucleotide substitution. The proof is simple yet applicable to any type of tree topology with an arbitrary number of operational taxonomic units (OTUs). The proof ensures that any valid approximation algorithm be able to reach the unique maximum point under the conditions mentioned above. An algorithm developed incorporating Newton's approximation method is then compared with the conventional one by means of computer simulation. The results show that the newly developed algorithm always requires less CPU time than the conventional one, whereas both algorithms lead to identical molecular phylogenetic trees in accordance with the proof. Contribution No. 1780 from the National Institute of Genetics, Mishima 411, Japan
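A minimal sketch of Newton's method on a likelihood with a single stationary point is given below for the simplest possible case, two sequences. It is not the paper's algorithm, which handles full trees. It assumes the one-parameter model is of the Jukes-Cantor type, where the probability that a site differs at distance d is 3/4 (1 - exp(-4d/3)); the function names and the site counts are hypothetical.

```python
import math


def log_lik_derivatives(d, n_sites, n_diff):
    """First and second derivatives of the two-sequence log-likelihood in d."""
    decay = math.exp(-4.0 * d / 3.0)
    p = 0.75 * (1.0 - decay)       # probability that a site differs
    dp = decay                     # dp/dd
    d2p = -4.0 / 3.0 * decay       # d2p/dd2
    n_same = n_sites - n_diff
    score_p = n_diff / p - n_same / (1.0 - p)
    curv_p = -n_diff / p ** 2 - n_same / (1.0 - p) ** 2
    first = score_p * dp
    second = curv_p * dp ** 2 + score_p * d2p
    return first, second


def newton_distance(n_sites, n_diff, tol=1e-12, max_iter=50):
    """Maximize the log-likelihood over d by Newton iteration.

    Requires 0 < n_diff < 0.75 * n_sites so that a finite maximum exists.
    """
    d = n_diff / n_sites           # start from the raw proportion of differences
    for _ in range(max_iter):
        first, second = log_lik_derivatives(d, n_sites, n_diff)
        new_d = d - first / second
        if new_d <= 0.0:           # keep the iterate inside the parameter range
            new_d = d / 2.0
        if abs(new_d - d) < tol:
            return new_d
        d = new_d
    return d


def closed_form_distance(n_sites, n_diff):
    """Known closed-form maximum for the two-sequence case, used as a check."""
    return -0.75 * math.log(1.0 - 4.0 * n_diff / (3.0 * n_sites))


# Hypothetical alignment: 100 sites, 10 of which differ.
estimate = newton_distance(100, 10)
reference = closed_form_distance(100, 10)
```

Because the two-sequence case has a closed-form maximum, the Newton estimate can be checked against it directly. On larger trees no such formula exists, which is where an iterative method and a uniqueness guarantee matter.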

18.
Wetlands harbor rich biodiversity and biomass and provide a variety of ecosystem services. Therefore, environmental assessment of wetlands is critical for those seeking to manage these pivotal ecosystems. While the landscape development intensity (LDI) index is commonly used for wetlands assessment, it is, unfortunately, limited in scope and, in some regions, efficacy. The objectives of this study were to improve and modify methods for wetlands assessment using a multi-metric approach that incorporated both the LDI index and landscape metrics. We calculated the LDI index values for 10 test wetlands across both the area of each wetland and within a 0–600 m wide area (in 100-m intervals). The results showed that the LDI index values varied significantly as the buffer distance increased, and specifically, the wetlands plus a 300-m wide swath was found to encompass the most appropriate area of inclusion for assessment. Furthermore, based on the metric selection criteria, only LDI index and area metrics were incorporated into the assessment schematic. Because the two types of wetlands differed significantly in the scatter plots of the LDI index and area, the assessment system was built to accommodate appropriate cut-off points. Four levels were then designated; the coastal wetland had an overall accuracy of 70.0% (kappa coefficient of 0.61), while that of the inland wetlands, which included natural and artificial wetlands, was only 41.7% (kappa coefficient of 0.22). This study confirmed that the extent of assessment had an effect on the LDI value, and there were significant differences in the assessment schematics across wetland types. In addition, LDI and/or area indices could be incorporated into landscape assessments. The assessment methods of this study can be applied in regions with high population density and consequently altered terrain, though the metric, scoring ranges, and levels should be adjusted to local conditions.
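Overall accuracy and the kappa coefficient reported above are both derived from a confusion matrix of assessed versus reference levels. The sketch below shows the standard Cohen's kappa calculation. The matrix is invented for illustration and is not the study's data, and the function name is hypothetical.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts cases with reference level i and assessed level j.
    """
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the row and column totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Invented two-level example with five cases, three of them classified correctly.
accuracy, kappa = accuracy_and_kappa([[2, 1], [1, 1]])
```

Kappa discounts the agreement expected by chance, so it is lower than overall accuracy whenever chance agreement is non-zero. The two statistics reported for each wetland type show the same pattern.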

19.
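Landscape ecology research on urban rivers: a conceptual framework   Total citations: 43, self-citations: 0, citations by others: 43
Yue Jun, Wang Yanglin, Peng Jian. Acta Ecologica Sinica (生态学报), 2005, 25(6): 1422-1429
As an important ecological corridor in the urban landscape, an urban river bears on the sustainable development of the whole city through whether its functions are properly realized. An analysis of current research on urban rivers shows that applying landscape ecology principles to carry out multi-scale, multidisciplinary, integrated studies of urban rivers is an inevitable trend toward the sustainable development of the "nature-human-water" system. From the perspective of landscape ecology, and taking the characteristics of urban rivers into account, a more integrated, landscape-level conceptual framework for urban river research is proposed. Focusing on the core issues of landscape ecology research, key aspects of urban rivers such as research scale, pattern analysis, and degree of disturbance are discussed in detail, with the aim of building a landscape-level plan for the sustainable development of urban rivers.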

20.
We propose an approximate maximum likelihood method for estimating animal density and abundance from binary passive acoustic transects, when both the probability of detection and the range of detection are unknown. The transect survey is purposely designed so that successive data points are dependent, and this dependence is exploited to simultaneously estimate density, range of detection, and probability of detection. The data are assumed to follow a homogeneous Poisson process in space, and a second-order Markov approximation to the likelihood is used. Simulations show that this method has small bias under the assumptions used to derive the likelihood, although it performs better when the probability of detection is close to 1. The effects of violations of these assumptions are also investigated, and the approach is found to be sensitive to spatial trends in density and clustering. The method is illustrated using real acoustic data from a survey of sperm and humpback whales.
