Similar Articles
1.
Demographic processes directly affect patterns of genetic variation within contemporary populations as well as future generations, allowing for demographic inference from patterns of both present-day and past genetic variation. Advances in laboratory procedures, sequencing and genotyping technologies in the past decades have resulted in massive increases in high-quality genome-wide genetic data from present-day populations and allowed retrieval of genetic data from archaeological material, also known as ancient DNA. This has resulted in an explosion of work exploring past changes in population size, structure, continuity and movement. However, as genetic processes are highly stochastic, patterns of genetic variation only indirectly reflect demographic histories. As a result, past demographic processes need to be reconstructed using an inferential approach. This usually involves comparing observed patterns of variation with model expectations from theoretical population genetics. A large number of approaches have been developed based on different population genetic models that each come with assumptions about the data and underlying demography. In this article I review some of the key models and assumptions underlying the most commonly used approaches for past demographic inference and their consequences for our ability to link the inferred demographic processes to the archaeological and climate records. This article is part of the theme issue ‘Cross-disciplinary approaches to prehistoric demography’.

2.
Bayesian hierarchical error model for analysis of gene expression data
MOTIVATION: Analysis of genome-wide microarray data requires the estimation of a large number of genetic parameters for individual genes and their interaction expression patterns under multiple biological conditions. The sources of microarray error variability comprise various biological and experimental factors, such as biological and individual replication, sample preparation, hybridization and image processing. Moreover, the same gene often shows quite heterogeneous error variability under different biological and experimental conditions, which must be estimated separately for evaluating the statistical significance of differential expression patterns. Widely used linear modeling approaches are limited because they do not allow simultaneous modeling and inference on the large number of these genetic parameters and heterogeneous error components on different genes, different biological and experimental conditions, and varying intensity ranges in microarray data. RESULTS: We propose a Bayesian hierarchical error model (HEM) to overcome the above restrictions. HEM accounts for heterogeneous error variability in an oligonucleotide microarray experiment. The error variability is decomposed into two components (experimental and biological errors) when both biological and experimental replicates are available. Our HEM inference is based on Markov chain Monte Carlo to estimate a large number of parameters from a single-likelihood function for all genes. An F-like summary statistic is proposed to identify differentially expressed genes under multiple conditions based on the HEM estimation. The performance of HEM and its F-like statistic was examined with simulated data and two published microarray datasets: primate brain data and mouse B-cell development data. HEM was also compared with ANOVA using simulated data. AVAILABILITY: The software for the HEM is available from the authors upon request.
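The paper's HEM is fully Bayesian and fitted by MCMC. As a rough illustration of the core idea only (splitting measurement variability into biological and experimental components when both kinds of replicates exist), here is a method-of-moments nested-ANOVA sketch for a single gene; all numbers are invented and nothing below is from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data for one gene: b biological replicates, each measured
# with t technical (experimental) replicates. All values are made up.
b, t = 6, 4
bio_effect = rng.normal(0.0, 0.8, b)                    # biological error, sd 0.8
y = 10.0 + bio_effect[:, None] + rng.normal(0.0, 0.3, (b, t))  # + experimental error

# Method-of-moments (nested ANOVA) decomposition of the two error components.
ms_within = y.var(axis=1, ddof=1).mean()      # estimates experimental variance
ms_between = t * y.mean(axis=1).var(ddof=1)   # estimates t*biological + experimental
var_exp = ms_within
var_bio = max((ms_between - ms_within) / t, 0.0)
print(f"experimental var ~ {var_exp:.2f}, biological var ~ {var_bio:.2f}")
```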

3.
Richman A. Molecular Ecology, 2000, 9(12):1953-1963
Extreme genetic polymorphism maintained by balancing selection (so called because many alleles are maintained in a balance by a mechanism of rare allele advantage) is intimately associated with the important task of self/non-self-discrimination. Widely disparate self-recognition systems of plants, animals and fungi share several general features, including the maintenance of large numbers of alleles at relatively even frequency, and persistence of this variation over very long time periods. Because the evolutionary dynamics of balanced polymorphism are very different from those of neutral genetic variation, data on balanced polymorphism have been used as a novel source for inference of the history of populations. This review highlights the unique evolutionary properties of balanced genetic polymorphism, and the use of theoretical understanding in analysis and application of empirical data for inference of population history. However, a second goal of this review is to point out where current theory is incomplete. Recent observations suggest that entirely novel selective forces may act in concert with balancing selection, and these novel forces may be extremely potent in shaping genetic variation at self-recognition loci.

4.
Coalescent theory is commonly used to perform population genetic inference at the nucleotide level. Here, we examine the procedure that fixes the number of segregating sites (henceforth the FS procedure). In this approach a fixed number of segregating sites (S) are placed on a coalescent tree (independently of the total and internode lengths of the tree). Thus, although widely used, the FS procedure does not strictly follow the assumptions of coalescent theory and must be considered an approximation of (i) the standard procedure that uses a fixed population mutation parameter theta, and (ii) procedures that condition on the number of segregating sites. We study the differences in the false positive rate for nine statistics by comparing the FS procedure with the procedures (i) and (ii), using several evolutionary models with single-locus and multilocus data. Our results indicate that for single-locus data the FS procedure is accurate for the equilibrium neutral model, but problems arise under the alternative models studied; furthermore, for multilocus data, the FS procedure becomes inaccurate even for the standard neutral model. Therefore, we recommend a procedure that fixes the theta value (or alternatively, procedures that condition on S and take into account the uncertainty of theta) for analysing evolutionary models with multilocus data. With single-locus data, the FS procedure should not be employed for models other than the standard neutral model.
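To make the contrast concrete, here is a small simulation sketch (my own construction, not from the paper) of the two procedures under the standard neutral coalescent: fixing theta makes the number of segregating sites S covary with the random tree length, while the FS procedure severs that dependence.

```python
import numpy as np

rng = np.random.default_rng(1)

def total_tree_length(n):
    # Standard coalescent: with k lineages, the waiting time to the next
    # coalescence is Exp(k(k-1)/2) in units of 2N generations; each epoch
    # contributes k * (waiting time) to the total branch length.
    return sum(k * rng.exponential(2.0 / (k * (k - 1)))
               for k in range(n, 1, -1))

n, theta, reps = 10, 5.0, 20_000
L = np.array([total_tree_length(n) for _ in range(reps)])

# Standard procedure: mutations are Poisson with mean theta*L/2, so the
# number of segregating sites S covaries with the (random) tree length.
S_theta = rng.poisson(theta * L / 2.0)
print("fixed-theta: var(S) =", S_theta.var().round(1),
      " corr(S, L) =", np.corrcoef(S_theta, L)[0, 1].round(2))
# FS procedure: S is fixed in advance, regardless of how long the tree is.
print("fixed-S    : var(S) = 0.0  corr(S, L) = undefined (S is constant)")
```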

5.
Comparison of the performance and accuracy of different inference methods, such as maximum likelihood (ML) and Bayesian inference, is difficult because the inference methods are implemented in different programs, often written by different authors. Both methods were implemented in the program MIGRATE, which estimates population genetic parameters, such as population sizes and migration rates, using coalescence theory. Both inference methods use the same Markov chain Monte Carlo algorithm and differ from each other in only two aspects: parameter proposal distribution and maximization of the likelihood function. Using simulated datasets, the Bayesian method generally fares better than the ML approach in accuracy and coverage, although for some values the two approaches are equal in performance. MOTIVATION: The Markov chain Monte Carlo-based ML framework can fail on sparse data and can deliver non-conservative support intervals. A Bayesian framework with appropriate prior distribution is able to remedy some of these problems. RESULTS: The program MIGRATE was extended to allow not only for ML (maximum likelihood) estimation of population genetics parameters but also for using a Bayesian framework. Comparisons between the Bayesian approach and the ML approach are facilitated because both modes estimate the same parameters under the same population model and assumptions.

6.
In 1962, Donald Caspar and Aaron Klug published their classic theory of virus structure. They developed their theory with an explicit analogy between spherical viruses and Buckminster Fuller's geodesic domes. In this paper, I use the spherical virus-geodesic dome case to develop an account of analogy and deductive analogical inference based on the notion of an isomorphism. I also consider under what conditions there is a good reason to claim an experimentally untested analogy is plausible.

7.
In addition to the well-studied evolutionary parameters of (1) phenotype-fitness covariance and (2) the genetic basis of phenotypic variation, adaptive evolution by natural selection requires that (3) fitness variation is effected by heritable genetic differences among individuals and (4) phenotype-fitness covariances must be, at least in part, underlain by genetic covariances. These latter two requirements for adaptive evolutionary change are relatively unstudied in natural populations. Absence of the latter requirements could explain stasis of apparently directionally selected heritable traits. We provide complementary analyses of selection and variation at phenotypic and genetic levels for juvenile growth rate in brook charr Salvelinus fontinalis in Freshwater River, Newfoundland, Canada. Contrary to the vast majority of reports in fish, we found very little viability selection of juvenile body size. Large body size appears nonetheless to be selectively advantageous via a relationship with early maturity. Genetic patterns in evolutionary parameters largely reflected phenotypic patterns. We have provided inference of selection based on longitudinal data, which are uncommon in high fecundity organisms. Furthermore, we have provided a practicable framework for further studies of the genetic basis of natural selection.

8.
Computer-aided process planning is becoming a widely prevalent technology in modern manufacturing systems. The research presented here describes a new methodology for generating process plans based on the analogy deductive paradigm. The method uses rules that represent relations between two shapes and allow inference of the type: shape A is to shape B as C is to D, where usually D is the unknown shape. The system uses backward chaining and therefore gradually converts the part from its finished (designed) form into its initial form. This method can generate multiple process plans for each given part; the paper also presents a method of selecting the best combination of process plans to maximize the production rate of that part. Once the dominant combination of plans is selected, the paper presents a method to calculate a proper production quantity for each process plan. This method is based on “coalition theory” and uses Shapley values to evaluate each member of such a coalition. The system has been implemented on a SUN workstation using Quintus Prolog and C++. The current implementation considers prismatic parts only.
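The abstract does not show the Shapley computation itself; for readers unfamiliar with it, the following is a generic exact Shapley-value sketch over a toy two-plan coalition, with a hypothetical characteristic function standing in for the production rate of a plan mix (the names and numbers are mine, not the paper's).

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values for a characteristic function v over frozensets."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(n):
            for S in combinations(others, r):
                S = frozenset(S)
                # Shapley weight |S|! (n-|S|-1)! / n! times i's marginal contribution.
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (v(S | {i}) - v(S))
        phi[i] = total
    return phi

# Hypothetical example: coalition value = production rate of a mix of plans.
rate = {frozenset(): 0, frozenset("A"): 4, frozenset("B"): 3, frozenset("AB"): 9}
v = lambda S: rate[frozenset(S)]
print(shapley_values(["A", "B"], v))   # {'A': 5.0, 'B': 4.0}; sums to v({A,B}) = 9
```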

9.
Gianola D, van Kaam JB. Genetics, 2008, 178(4):2289-2303
Reproducing kernel Hilbert spaces regression procedures for prediction of total genetic value for quantitative traits, which make use of phenotypic and genomic data simultaneously, are discussed from a theoretical perspective. It is argued that a nonparametric treatment may be needed for capturing the multiple and complex interactions potentially arising in whole-genome models, i.e., those based on thousands of single-nucleotide polymorphism (SNP) markers. After a review of reproducing kernel Hilbert spaces regression, it is shown that the statistical specification admits a standard mixed-effects linear model representation, with smoothing parameters treated as variance components. Models for capturing different forms of interaction, e.g., chromosome-specific, are presented. Implementations can be carried out using software for likelihood-based or Bayesian inference.
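As an informal illustration of RKHS regression on SNP data (a sketch under my own toy setup, not the authors' specification), the code below runs kernel ridge regression with a Gaussian kernel; in the mixed-model view the abstract describes, the ridge parameter plays the role of the ratio of residual variance to the kernel variance component.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 300
X = rng.integers(0, 3, (n, p)).astype(float)     # SNP genotypes coded 0/1/2
y = X @ rng.normal(0, 0.1, p) + rng.normal(0, 1, n)   # toy phenotype

# Gaussian kernel on genotypes; the mean squared distance sets the bandwidth.
sq = (X**2).sum(1)[:, None] + (X**2).sum(1)[None, :] - 2 * X @ X.T
K = np.exp(-sq / sq.mean())

# Kernel ridge = RKHS regression; lam corresponds to the variance ratio
# sigma_e^2 / sigma_k^2 in the equivalent mixed-effects representation.
lam = 1.0
alpha = np.linalg.solve(K + lam * np.eye(n), y)
g_hat = K @ alpha                                # predicted total genetic values
print("fit corr:", np.corrcoef(g_hat, y)[0, 1].round(3))
```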

10.
As the field of phylogeography has continued to move in the model-based direction, researchers continue struggling to construct useful models for inference. These models must be simple enough to be tractable yet contain enough of the complexity of the natural world to support meaningful inference. Beyond constructing such models for inference, researchers explore model space and test competing models with the data on hand, with the goal of improving the understanding of the natural world and the processes underlying natural biological communities. Approximate Bayesian computation (ABC) has increased in recent popularity as a tool for evaluating alternative historical demographic models given population genetic samples. As a thorough demonstration, Pelletier & Carstens (2014) use ABC to test 143 phylogeographic submodels given geographically widespread genetic samples from the salamander species Plethodon idahoensis (Carstens et al. 2004) and, in so doing, demonstrate how the results of the ABC model choice procedure are dependent on the model set one chooses to evaluate.
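Pelletier & Carstens evaluate 143 phylogeographic submodels; as a minimal, generic illustration of the ABC rejection idea only (not their models or data), the sketch below estimates a population mutation parameter theta from an observed number of segregating sites, using a Poisson approximation for the simulator and a uniform prior. All numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(7)
n_samples = 20
a_n = sum(1.0 / i for i in range(1, n_samples))   # harmonic number a_{n-1}

S_obs = 40                                        # observed summary statistic

# ABC rejection: draw theta from the prior, simulate the summary statistic
# (here a Poisson approximation with mean theta * a_n, the neutral-coalescent
# expectation of S), and keep draws whose simulated S lands near S_obs.
theta = rng.uniform(0, 20, 200_000)
S_sim = rng.poisson(theta * a_n)
posterior = theta[np.abs(S_sim - S_obs) <= 2]
print("accepted:", posterior.size,
      " posterior mean theta ~", posterior.mean().round(2))
```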

11.
The problems discussed relate to the development of concepts of rational taxonomy and rational classifications (taxonomic systems) in biology. Rational taxonomy is based on the assumption that the key characteristic of rationality is deductive inference of certain partial judgments about the reality under study from other judgments taken as more general and a priori true. Accordingly, two forms of rationality are distinguished: ontological and epistemological. The former implies inference of classification properties from general (essential) properties of the reality being investigated. The latter implies inference of the partial rules of judgments about classifications from more general (formal) rules. The following principal concepts of ontologically rational biological taxonomy are considered: the "crystallographic" approach; inference of the orderliness of organismal diversity from general laws of Nature; inference of the above orderliness from the orderliness of ontogenetic development programs; concepts based on the notion of natural kind and Cassirer's series theory; concepts based on the systemic approach; and concepts based on the idea of periodic systems. Various concepts of ontologically rational taxonomy can be generalized by an idea of causal taxonomy, according to which any biologically sound classification is founded on a contentwise model of biological diversity that includes explicit indication of the general causes responsible for that diversity. It is asserted that each category of general causation and the respective background model may serve as a basis for a particular ontologically rational taxonomy as a distinctive research program. Concepts of epistemologically rational taxonomy and classifications (taxonomic systems) can be interpreted in terms of the application of certain epistemological criteria for substantiating the scientific status of taxonomy in general and of taxonomic systems in particular. These concepts include: consideration of the consistency of taxonomy from the standpoint of inductive and hypothetico-deductive argumentation schemes and such fundamental criteria of classification naturalness as prognostic capability; and foundation of a theory of "general taxonomy" as a "general logic", including elements of the axiomatic method. The latter concept constitutes the core of the program of general classiology; it is inconsistent due to the absence of anything like a "general logic". It is asserted that elaboration of a theory of taxonomy as a biological discipline based on the formal principles of epistemological rationality is not feasible; instead, it should be elaborated as an ontologically rational one, based on biologically sound metatheories about the causes of biological diversity.

12.
13.
The characteristics of deleterious genes have been of great interest in both theory and practice in genetics. Because of the complex genetic mechanism of these deleterious genes, most current studies try to estimate the overall magnitude of mortality effects on a population, which is characterized classically by the number of lethal equivalents. This number is a combination of several parameters, each of which has a distinct biological effect on genetic mortality. In conservation and breeding programs, it is important to be able to distinguish among different combinations of these parameters that lead to the same number of lethal equivalents, such as a large number of mildly deleterious genes or a few lethal genes. The ability to distinguish such parameter combinations requires more than one generation of mating. We propose a model for survival data from a two-generation mating experiment on the plant species Brassica rapa, and we enable inference with Markov chain Monte Carlo. This computational strategy is effective because a vast amount of missing genotype information must be accounted for. In addition to the lethal equivalents, the two-generation data provide separate information on the average intensity of mortality and the average number of deleterious genes carried by an individual. In our Markov chain Monte Carlo algorithm, we use a vector proposal distribution to overcome inefficiency of a single-site Gibbs sampler. Information about environmental effects is obtained from an outcrossing experiment conducted in parallel with the two-generation mating experiments.
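The classical summary the abstract refers to, the number of lethal equivalents, is traditionally estimated by regressing -ln(survival) on the inbreeding coefficient F (the Morton-Crow-Muller approach). The sketch below shows that textbook regression on invented numbers; it is not the paper's two-generation MCMC model.

```python
import numpy as np

# Classical lethal-equivalents regression (Morton-Crow-Muller):
# -ln S(F) = A + B*F, where the slope B estimates lethal equivalents
# per gamete. F values and survival rates below are hypothetical.
F = np.array([0.0, 0.0625, 0.125, 0.25])     # inbreeding coefficients
surv = np.array([0.90, 0.84, 0.78, 0.66])    # observed survival fractions
B, A = np.polyfit(F, -np.log(surv), 1)       # slope, intercept
print(f"B (lethal equivalents per gamete) ~ {B:.2f}")
```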

14.
While most outcomes may in part be genetically mediated, quantifying genetic heritability is a different matter. Exploring data on twins and decomposing the variation is a classical method for determining whether variation in outcomes, e.g. IQ or schooling, originates from genetic endowments or environmental factors. Despite some criticism, the model is still widely used. The critique generally relates to how estimates of heritability may encompass environmental mediation. This aspect is sometimes left implicit by authors even though its relevance for the interpretation is potentially profound. This short note is an appeal for clarity from authors when interpreting the magnitude of heritability estimates. It is demonstrated how disregarding existing theoretical contributions can easily lead to unnecessary misinterpretations and/or controversies. The key arguments are relevant also for estimates based on data of adopted children or from modern molecular genetics research.
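For concreteness, the classical twin decomposition the note discusses is often computed with Falconer's ACE identities, directly from MZ and DZ twin correlations (a textbook formula, not this note's contribution); the correlations below are hypothetical.

```python
# Falconer's ACE decomposition from twin correlations (hypothetical values).
r_mz, r_dz = 0.75, 0.45
h2 = 2 * (r_mz - r_dz)    # additive genetic share ("heritability")
c2 = 2 * r_dz - r_mz      # shared-environment share
e2 = 1 - r_mz             # unique-environment share
print(h2, c2, e2)         # 0.6 0.15 0.25; the note's point is that h2 can
                          # still embed environmental mediation.
```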

15.
Journal of Mathematical Biology - Phylogenetic inference aims to reconstruct the evolutionary relationships of different species based on genetic (or other) data. Discrete characters are a...

16.
Estimating evolutionary parameters when viability selection is operating
Some individuals die before a trait is measured or expressed (the invisible fraction), and some relevant traits are not measured in any individual (missing traits). This paper discusses how these concepts can be cast in terms of missing data problems from statistics. Using missing data theory, I show formally the conditions under which a valid evolutionary inference is possible when the invisible fraction and/or missing traits are ignored. These conditions are restrictive and unlikely to be met in even the most comprehensive long-term studies. When these conditions are not met, many selection and quantitative genetic parameters cannot be estimated accurately unless the missing data process is explicitly modelled. Surprisingly, this does not seem to have been attempted in evolutionary biology. In the case of the invisible fraction, viability selection and the missing data process are often intimately linked. In such cases, models used in survival analysis can be extended to provide a flexible and justified model of the missing data mechanism. Although missing traits pose a more difficult problem, important biological parameters can still be estimated without bias when appropriate techniques are used. This is in contrast to current methods, which have large biases and poor precision. Generally, the quantitative genetic approach is shown to be superior to phenotypic studies of selection when invisible fractions or missing traits exist because part of the missing information can be recovered from relatives.

17.
Diao G, Lin DY. Biometrics, 2005, 61(3):789-798
Statistical methods for the detection of genes influencing quantitative traits with the aid of genetic markers are well developed for normally distributed, fully observed phenotypes. Many experiments are concerned with failure-time phenotypes, which have skewed distributions and which are usually subject to censoring because of random loss to follow-up, failures from competing causes, or limited duration of the experiment. In this article, we develop semiparametric statistical methods for mapping quantitative trait loci (QTLs) based on censored failure-time phenotypes. We formulate the effects of the QTL genotype on the failure time through the Cox (1972, Journal of the Royal Statistical Society, Series B 34, 187-220) proportional hazards model and derive efficient likelihood-based inference procedures. In addition, we show how to assess statistical significance when searching several regions or the entire genome for QTLs. Extensive simulation studies demonstrate that the proposed methods perform well in practical situations. Applications to two animal studies are provided.
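Written out (notation mine; the genotype coding z(g) is my assumption, e.g. additive and dominance scores), the Cox proportional hazards specification the authors adopt for the QTL effect is:

```latex
\lambda(t \mid g) \,=\, \lambda_0(t)\,\exp\!\left\{ \beta^{\top} z(g) \right\}
```

Here \lambda_0(t) is an unspecified baseline hazard and \beta is the vector of QTL effects, estimated semiparametrically, typically via the partial likelihood, which is what lets censored failure times be handled without a parametric error distribution.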

18.
Inferring qualitative relations in genetic networks and metabolic pathways
MOTIVATION: Inferring genetic network architecture from time series data of gene expression patterns is an important topic in bioinformatics. Although inference algorithms based on the Boolean network have been proposed, the Boolean network is not sufficient as a model of a genetic network. RESULTS: First, a Boolean network model with noise is proposed, together with an inference algorithm for it. Next, a qualitative network model is proposed, in which regulation rules are represented as qualitative rules and embedded in the network structure. Algorithms are also presented for inferring qualitative relations from time series data. Then, an algorithm for inferring S-systems (synergistic and saturable systems) from time series data is presented, where S-systems are based on a particular kind of nonlinear differential equation and have been applied to the analysis of various biological systems. Theoretical results are shown for Boolean networks with noise and simple qualitative networks. Computational results are shown for Boolean networks with noise and S-systems; real data are not used because the proposed models are still conceptual and the quantity and quality of currently available data are not sufficient for the application of the proposed methods.
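As a self-contained toy version of the noisy Boolean-network idea (my construction; the paper's algorithms are more general), the sketch below recovers a two-input Boolean rule for one gene from a binary time series by exhaustive search, scoring candidate rules by their mismatch count so that a noisy series still yields the best-fitting rule.

```python
import numpy as np
from itertools import combinations, product

rng = np.random.default_rng(3)

# Hypothetical binary time series: 5 genes, 60 steps. Gene 0 follows
# x0(t+1) = x1(t) AND NOT x2(t), observed with 10% measurement noise.
T, G = 60, 5
X = rng.integers(0, 2, (T, G))
X[1:, 0] = ((X[:-1, 1] == 1) & (X[:-1, 2] == 0)).astype(int)
X[1:, 0] ^= (rng.random(T - 1) < 0.10).astype(int)   # flip 10% of observations

def fit_rule(target):
    """Exhaustively search regulator pairs and all 2-input Boolean rules,
    keeping the rule with the fewest mismatches against the time series."""
    best = (T, None, None)
    for i, j in combinations(range(G), 2):
        idx = X[:-1, i] * 2 + X[:-1, j]              # truth-table row 0..3
        for table in product([0, 1], repeat=4):
            err = int((np.array(table)[idx] != X[1:, target]).sum())
            if err < best[0]:
                best = (err, (i, j), table)
    return best

err, regs, table = fit_rule(0)
print(f"regulators {regs}, truth table {table}, mismatches {err}")
```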

19.
Recovering gene regulatory networks from expression data is a challenging problem in systems biology that provides valuable information on the regulatory mechanisms of cells. A number of algorithms based on computational models are currently used to recover network topology. However, most of these algorithms have limitations. For example, many models tend to be complicated because of the “large p, small n” problem. In this paper, we propose a novel regulatory network inference method called the maximum-relevance and maximum-significance network (MRMSn) method, which converts the problem of recovering networks into a problem of selecting the regulator genes for each gene. To solve the latter problem, we present an algorithm that is based on information theory and selects the regulator genes for a specific gene by maximizing relevance and significance. A first-order incremental search algorithm is used to search for regulator genes. Eventually, a strict constraint is adopted to adjust all of the regulatory relationships according to the obtained regulator genes and thus obtain the complete network structure. We applied our method to five different datasets and compared it to five state-of-the-art information-theoretic methods for network inference. The results confirm the effectiveness of our method.
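MRMSn's criterion combines relevance and significance; the sketch below implements only the simpler max-relevance half as a stand-in, ranking candidate regulators of a target gene by a histogram estimate of mutual information on invented expression data.

```python
import numpy as np

rng = np.random.default_rng(5)

def mutual_info(x, y, bins=8):
    """Histogram estimate of mutual information between two expression vectors."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

# Hypothetical expression matrix: 300 samples, 6 genes; gene 0 is driven
# by genes 1 and 2 plus noise.
n = 300
E = rng.normal(size=(n, 6))
E[:, 0] = E[:, 1] + 0.5 * E[:, 2] + 0.3 * rng.normal(size=n)

# Greedy max-relevance ranking of candidate regulators for gene 0.
target = E[:, 0]
scores = {g: mutual_info(E[:, g], target) for g in range(1, 6)}
print(sorted(scores.items(), key=lambda kv: -kv[1])[:2])   # genes 1 and 2 rank first
```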

20.
The ascertainment problem arises when families are sampled by a nonrandom process and some assumption about this sampling process must be made in order to estimate genetic parameters. Under classical ascertainment assumptions, estimation of genetic parameters cannot be separated from estimation of the parameters of the ascertainment process, so that any misspecification of the ascertainment process causes biases in estimation of the genetic parameters. Ewens and Shute proposed a resolution to this problem, involving conditioning the likelihood of the sample on the part of the data which is "relevant to ascertainment." The usefulness of this approach can only be assessed by examining the properties (in particular, bias and standard error) of the estimates which arise by using it for a wide range of parameter values and family size distributions and then comparing these biases and standard errors with those arising under classical ascertainment procedures. These comparisons are carried out in the present paper, and we also compare the proposed method with procedures which condition on, or ignore, parts of the data.

