Similar articles
1.
In a simulation study, different designs were compared for efficiency of fine-mapping of QTL. The variance component method for fine-mapping of QTL was used to estimate QTL position and variance components. A design with many small families gave a higher mapping resolution than a design with few large families. However, the difference was small in half-sib designs. The proportion of replicates with the QTL positioned within 3 cM of the true position was 0.71 in the best design and 0.68 in the worst design, applied to 128 animals with a phenotypic record and a QTL explaining 25% of the phenotypic variance. The design of two half-sib families, each of size 64, was further investigated for a hypothetical population with an effective size of 1000 simulated for 6000 generations, with a marker density of 0.25 cM and a marker mutation rate of 4 × 10⁻⁴ per generation. In mapping using bi-allelic markers, 42 to 55% of replicated simulations could position the QTL within 0.75 cM of the true position, whereas this proportion was higher for multi-allelic markers (48 to 76%). The accuracy was lowest (48%) when mutation age was 100 generations and increased to 68% and 76% for mutation ages of 200 and 500 generations, respectively, after which it was about 70% for mutation ages of 1000 generations and older. When effective size decreased linearly over the last 50 generations, the accuracy decreased (56 to 70%). We show that half-sib designs that have often been used for linkage mapping can carry sufficient information for fine-mapping of QTL. It is suggested that the same design with the same animals used for linkage mapping should be used for fine-mapping, so that gene mapping can be cost-effective in livestock populations.

2.
Examining rates and patterns of nucleotide substitution in plants   Cited by: 19 (self-citations: 0; by others: 19)
Driven by rapid improvements in affordable computing power and by the even faster accumulation of genomic data, the statistical analysis of molecular sequence data has become an active area of interdisciplinary research. Maximum likelihood methods have become mainstream because of their desirable properties and, more importantly, their potential for providing statistically sound solutions in complex data analysis settings. In this chapter, a review of recent literature focusing on rates and patterns of nucleotide substitution in the nuclear, chloroplast, and mitochondrial genomes of plants demonstrates the power and flexibility of these new methods. The emerging picture of the nucleotide substitution process in plants is a complex one. Evolutionary rates are seen to be quite variable, both among genes and among plant lineages. However, there are hints, particularly in the chloroplast, that individual factors can have important effects on many genes simultaneously.

3.
Dopamine-beta-hydroxylase (DBH) activity in serum was measured by spectrophotometric methods in 95 persons of a large family (HGAR 2), along with 27 polymorphic markers from blood, urine and saliva. The distribution of DBH activity, after appropriate transformation and age adjustment, showed a significantly better fit to a mixture of two normal distributions than to a single normal distribution. Pedigree segregation analyses showed evidence of a possible major gene governing low levels of DBH activity, segregating in this family in a recessive fashion. Linkage analyses between that major locus and the 27 polymorphic markers showed no significant lod scores favoring linkage. The highest lod score obtained was 0.81 with Lp at zero recombination fraction. In addition, published data on DBH activity measured by radiochemical assays on 22 families with 161 members were reanalyzed as a quantitative trait, with appropriate correction for ascertainment bias. The results were similar to those for HGAR 2, corroborating the existence of a major locus for DBH activity.
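For readers unfamiliar with lod scores, the following is a minimal sketch of the textbook two-point lod score, assuming phase-known, fully informative meioses. It is not the pedigree likelihood used in the study above; the function name `lod_score` and the counts in the usage note are hypothetical and chosen only for illustration.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point lod score: log10 of L(theta) / L(theta = 0.5).

    Assumes phase-known, fully informative meioses (an illustrative
    simplification, not the pedigree analysis described in the abstract).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie in [0, meioses]")

    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # Any observed recombinant is impossible under complete linkage.
        if recombinants > 0:
            return float("-inf")
        return meioses * math.log10(2.0)

    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null
```

With hypothetical counts, `lod_score(0, 10, 0.0)` equals `10 * log10(2)`, and `lod_score(5, 10, 0.5)` is zero because the linked and unlinked hypotheses coincide at a recombination fraction of 0.5.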

4.
Reconstructing a tree of life by inferring evolutionary history is an important focus of evolutionary biology. Phylogenetic reconstructions also provide useful information for a range of scientific disciplines such as botany, zoology, phylogeography, archaeology and biological anthropology. Until the development of protein and DNA sequencing techniques in the 1960s and 1970s, phylogenetic reconstructions were based on fossil records and comparative morphological/physiological analyses. Since then, progress in molecular phylogenetics has compensated for some of the shortcomings of phenotype-based comparisons. Comparisons at the molecular level increase the accuracy of phylogenetic inference because there is no environmental influence on DNA/peptide sequences and evaluation of sequence similarity is not subjective. While the number of morphological/physiological characters that are sufficiently conserved for phylogenetic inference is limited, molecular data provide a large number of data points and enable comparisons from diverse taxa. Over the last 20 years, developments in molecular phylogenetics have greatly contributed to our understanding of plant evolutionary relationships. Regions in the plant nuclear and organellar genomes that are optimal for phylogenetic inference have been determined and recent advances in DNA sequencing techniques have enabled comparisons at the whole genome level. Sequences from the nuclear and organellar genomes of thousands of plant species are readily available in public databases, enabling researchers without access to molecular biology tools to investigate phylogenetic relationships by sequence comparisons using the appropriate nucleotide substitution models and tree building algorithms. In the present review, the statistical models and algorithms used to reconstruct phylogenetic trees are introduced and advances in the exploration and utilization of plant genomes for molecular phylogenetic analyses are discussed.

5.
Summary. L-splines are a large family of smoothing splines defined in terms of a linear differential operator. This article develops L-splines within the context of linear mixed models and uses the resulting mixed model L-spline to analyze longitudinal data from a grassland experiment. In the spirit of time-series analysis, a periodic mixed model L-spline is developed, which partitions data into a smooth periodic component plus a smooth long-term trend.

6.
Small subunit rRNA sequence data were generated for 27 strains of cyanobacteria and incorporated into a phylogenetic analysis of 1,377 aligned sequence positions from a diverse sampling of 53 cyanobacteria and 10 photosynthetic plastids. Tree inference was carried out using a maximum likelihood method with correction for site-to-site variation in evolutionary rate. Confidence in the inferred phylogenetic relationships was determined by construction of a majority-rule consensus tree based on alternative topologies not considered to be statistically significantly different from the optimal tree. The results are in agreement with earlier studies in the assignment of individual taxa to specific sequence groups. Several relationships not previously noted among sequence groups are indicated, whereas other relationships previously supported are contradicted. All plastids cluster as a strongly supported monophyletic group arising near the root of the cyanobacterial line of descent.

7.
Statistical analysis of diversification with species traits   Cited by: 1 (self-citations: 0; by others: 1)
Testing whether some species traits have a significant effect on diversification rates is central in the assessment of macroevolutionary theories. However, we still lack a powerful method to tackle this objective. I present a new method for the statistical analysis of diversification with species traits. The required data are observations of the traits on recent species, the phylogenetic tree of these species, and reconstructions of ancestral values of the traits. Several traits, either continuous or discrete, and in some cases their interactions, can be analyzed simultaneously. The parameters are estimated by the method of maximum likelihood. The statistical significance of the effects in a model can be tested with likelihood ratio tests. A simulation study showed that past random extinction events do not affect the Type I error rate of the tests, whereas statistical power is decreased, though some power is retained if the effect of the simulated trait on speciation is strong. The use of the method is illustrated by the analysis of published data on primates. The analysis of these data showed that the apparent overall positive relationship between body mass and species diversity is actually an artifact due to a clade-specific effect. Within each clade the effect of body mass on speciation rate was in fact negative. The present method allows both effects (clade and body mass) to be taken into account simultaneously.
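As a reminder of how a likelihood ratio test between nested models works in general, here is a minimal sketch for the one-parameter case. It assumes the usual chi-square approximation with one degree of freedom. The function name and the log-likelihood values in the usage note are hypothetical and are not taken from the study above.

```python
import math


def lrt_pvalue_1df(loglik_null: float, loglik_alt: float) -> float:
    """P-value of a likelihood ratio test between nested models that
    differ by exactly one free parameter.

    Uses the chi-square(1) approximation: for one degree of freedom the
    survival function is erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * (loglik_alt - loglik_null)
    # A nested alternative cannot fit worse; clamp numerical noise at zero.
    statistic = max(statistic, 0.0)
    return math.erfc(math.sqrt(statistic / 2.0))
```

For example, with hypothetical maximized log-likelihoods of -100.0 (no trait effect) and -98.079 (one trait effect), the statistic is about 3.84 and the p-value is close to the conventional 0.05 threshold.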

8.
Over the past decade or so it has become increasingly popular to use reconstructed evolutionary trees to investigate questions about the rates of speciation and extinction. Although the methodology of this field has grown substantially in its sophistication in recent years, here I will take a step back to present a very simple model that is designed to investigate the relatively straightforward question of whether the tempo of diversification (speciation and extinction) differs between two or more phylogenetic trees, without attempting to attribute a causal basis to this difference. It is a likelihood method, and I demonstrate that it generally shows type I error that is close to the nominal level. I also demonstrate that parameter estimates obtained with this approach are largely unbiased. As this method can be used to compare trees of unknown relationship, it will be particularly well-suited to problems in which a difference in diversification rate between clades is suspected, but in which these clades are not particularly closely related. As diversification methods can easily take into account an incomplete sampling fraction, but missing lineages are assumed to be missing at random, this method is also appropriate for cases in which we have hypothesized a difference in the process of diversification between two or more focal clades, but in which many unsampled groups separate the few of interest. The method of this study is by no means an attempt to replace more sophisticated models in which, for instance, diversification depends on the state of an observed or unobserved discrete or continuous trait. Rather, my intention is to provide a complementary approach for circumstances in which a simpler hypothesis is warranted and of biological interest.

9.
We present a method for using slopes and intercepts from a linear regression of a quantitative trait as outcomes in segregation and linkage analyses. We apply the method to the analysis of longitudinal systolic blood pressure (SBP) data from the Framingham Heart Study. A first-stage linear model was fit to each subject's SBP measurements to estimate both their slope over time and an intercept, the latter scaled to represent the mean SBP at the average observed age (53.7 years). The subject-specific intercepts and slopes were then analyzed using segregation and linkage analysis. We describe a method for using the standard errors of the first-stage intercepts and slopes as weights in the genetic analyses. For the intercepts, we found significant evidence of a Mendelian gene in segregation analysis and suggestive linkage results (with LOD scores ≥ 1.5) for specific markers on chromosomes 1, 3, 5, 9, 10, and 17. For the slopes, however, the data did not support a Mendelian model, and thus no formal linkage analyses were conducted.

10.
Dispersal can impact population dynamics and geographic variation, and thus, genetic approaches that can establish which landscape factors influence population connectivity have ecological and evolutionary importance. Mixed models that account for the error structure of pairwise datasets are increasingly used to compare models relating genetic differentiation to pairwise measures of landscape resistance. A model selection framework based on information criteria metrics or explained variance may help disentangle the ecological and landscape factors influencing genetic structure, yet there is currently no consensus on the best protocols. Here, we develop landscape-directed simulations and test a series of replicates that emulate independent empirical datasets of two species with different life history characteristics (greater sage-grouse; eastern foxsnake). We determined that in our simulated scenarios, AIC and BIC were the best model selection indices and that marginal R² values were biased toward more complex models. The model coefficients for landscape variables generally reflected the underlying dispersal model with confidence intervals that did not overlap with zero across the entire model set. When we controlled for geographic distance, variables not in the underlying dispersal models (i.e., nontrue) typically overlapped zero. Our study helps establish methods for using linear mixed models to identify the features underlying patterns of dispersal across a variety of landscapes.
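The two information criteria named above follow standard definitions, sketched below for a fitted model with maximized log-likelihood `loglik`, `k` estimated parameters, and `n` observations. The helper names and the candidate models in the usage note are hypothetical. How `n` should be counted for pairwise landscape-genetic data is a modelling decision that this sketch does not address.

```python
import math


def aic(loglik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2.0 * loglik


def best_model(candidates: dict, n: int, criterion: str = "aic") -> str:
    """Return the name of the candidate with the lowest criterion value.

    `candidates` maps a model name to a (loglik, k) tuple.
    """
    def score(item):
        loglik, k = item[1]
        return aic(loglik, k) if criterion == "aic" else bic(loglik, k, n)

    return min(candidates.items(), key=score)[0]
```

With hypothetical candidates `{"distance": (-100.0, 3), "distance+roads": (-99.5, 5)}` and `n = 100`, both criteria prefer the simpler model, because the small gain in log-likelihood does not offset the penalty for two extra parameters.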

11.
Recent technological advances continue to provide noninvasive and more accurate biomarkers for evaluating disease status. One standard tool for assessing the accuracy of diagnostic tests is the receiver operating characteristic (ROC) curve. Few statistical methods exist to accommodate multiple continuous-scale biomarkers in the framework of ROC analysis. In this paper, we propose a method to integrate continuous-scale biomarkers to optimize classification accuracy. Specifically, we develop semiparametric transformation models for multiple biomarkers. We assume that unknown and marker-specific transformations of biomarkers follow a multivariate normal distribution. Our models accommodate biomarkers subject to limits of detection and account for the dependence among biomarkers by including a subject-specific random effect. We also propose a diagnostic measure using an optimal linear combination of the transformed biomarkers. Our diagnostic rule does not depend on any monotone transformation of biomarkers and is not sensitive to extreme biomarker values. Nonparametric maximum likelihood estimation (NPMLE) is used for inference. We show that the parameter estimators are asymptotically normal and efficient. We illustrate our semiparametric approach using data from the Endometriosis, Natural History, Diagnosis, and Outcomes (ENDO) study.

12.
Probabilistic models of sequence evolution are in widespread use in phylogenetics and molecular sequence evolution. These models have become increasingly sophisticated and, combined with statistical model comparison techniques, have helped to shed light on how genes and proteins evolve. Models of codon evolution have been particularly useful, because, in addition to providing a significant improvement in model realism for protein-coding sequences, codon models can also be designed to test hypotheses about the selective pressures that shape the evolution of the sequences. Such models typically assume a phylogeny and can be used to identify sites or lineages that have evolved adaptively. Recently some of the key assumptions that underlie phylogenetic tests of selection have been questioned, such as the assumption that the rate of synonymous changes is constant across sites or that a single phylogenetic tree can be assumed at all sites for recombining sequences. While some of these issues have been addressed through the development of novel methods, others remain as caveats that need to be considered on a case-by-case basis. Here, we outline the theory of codon models and their application to the detection of positive selection. We review some of the more recent developments that have improved their power and utility, laying a foundation for further advances in the modeling of coding sequence evolution.

13.
14.
An anther-derived doubled haploid (DH) population and an F2 mapping population were developed from an intraspecific hybrid between the eggplant breeding lines 305E40 and 67/3. The former incorporates an introgressed segment from Solanum aethiopicum Gilo Group carrying the gene Rfo-sa1, which confers resistance to Fusarium oxysporum; the latter is a selection from an intraspecific cross involving two conventional eggplant varieties and lacks Rfo-sa1. Initially, 28 AFLP primer combinations (PCs) were applied to a sample of 93 F2 individuals and 93 DH individuals, from which 170 polymorphic AFLP fragments were identified. In the DH population, the segregation of 117 of these AFLPs as well as markers closely linked to Rfo-sa1 was substantially distorted, while in the F2 population, segregation distortion was restricted to just 10 markers, and thus the latter was chosen for map development. A set of 141 F2 individuals was genotyped with 73 AFLP PCs (generating 406 informative markers), 32 SSRs, 4 tomato RFLPs, and 3 CAPS markers linked to Rfo-sa1. This resulted in the assignment of 348 markers to 12 major linkage groups. The framework map covered 718.7 cM, comprising 238 markers (212 AFLPs, 22 SSRs, 1 RFLP, and the Rfo-sa1 CAPS). Marker order and inter-marker distances in this eggplant map were largely consistent with those reported in a recently published SSR-based map. From an eggplant breeding perspective, DH populations produced by anther culture appear to be subject to massive segregation distortion and thus may not be very efficient in capturing the full range of genetic variation present in the parental lines.

15.
A method is described for segregation analysis that incorporates linkage markers. The model allows for segregation (penetrance), linkage (recombination fraction), and association (linkage disequilibrium) parameters. A single-locus-multiple-allele model underlying the trait phenotype is assumed. When families have been ascertained in a systematic fashion, a joint (markers, phenotypes) likelihood with ascertainment is advocated. When ascertainment correction is not feasible, a conditional (markers given phenotypes) approach is recommended, which is also valid in the presence of reduced fertility and assortative mating. This approach, oriented toward determining mode of inheritance, differs from conventional linkage analysis, which is oriented toward detection of linkage. Therefore, it is more appropriately considered an extension of the affected sib-pair method to arbitrary pedigrees, including association information and allowing for multiple alleles. Incorporation of coupling parameters allows for discrimination between pleiotropy and linkage disequilibrium. The method is demonstrated through a reanalysis of four recently published family studies on type 1 diabetes and HLA. Recessive inheritance is rejected in all four data sets. For three of them, dominant inheritance is not rejected, while in the fourth, all two-allele models are rejected in favor of three alleles. Although association with the DR3 and DR4 alleles is quite strong, pleiotropy with regard to these alleles is unlikely. The results also suggest an additional familial factor(s) (e.g., locus).

16.
Human serum angiotensin I-converting enzyme (ACE) levels vary substantially between individuals and are highly heritable. Segregation analysis in European families has shown that more than half of the total variability in ACE levels is influenced by quantitative-trait loci (QTL). One of these QTLs is located within or close to the ACE locus itself. Combined segregation/linkage analysis in a series of African Caribbean families from Jamaica shows that the ACE insertion-deletion polymorphism is in moderate linkage disequilibrium with an ACE-linked QTL. Linkage analysis with a highly informative polymorphism at the neighboring growth-hormone gene (GH) shows surprisingly little support for linkage (LOD score [Z] = 0.12). An extended analysis with a two-QTL model, where an ACE-linked QTL interacts additively with an unlinked QTL, significantly improves both the fit of the model (P = .002) and the support for linkage between the ACE-linked QTL and the GH polymorphism (Z = 5.0). We conclude that two QTLs jointly influence serum ACE levels in this population. One QTL is located within or close to the ACE locus and explains 27% of the total variability; the second QTL is unlinked to the ACE locus and explains 52% of the variability. The identification of the molecular mechanisms underlying both QTLs is necessary in order to interpret the role of ACE in cardiovascular disease.

17.
A number of statistics have recently been proposed to assess the fit of the multiple logistic regression model in both prospective and retrospective studies involving two independent samples, as well as in cross-sectional studies. These statistics are not appropriate for assessing fit in matched case-control studies. This paper presents methods for assessing fit for matched case-control studies. Both parametric and nonparametric approaches are suggested, even though none are directly analogous to the statistics proposed in the unmatched situation. Several examples are included to illustrate the methods.

18.
In the past, two kinds of Markov models have been considered to describe protein sequence evolution. Codon-level models have been mechanistic with a small number of parameters designed to take into account features such as transition-transversion bias, codon frequency bias, and synonymous-nonsynonymous amino acid substitution bias. Amino acid models have been empirical, attempting to summarize the replacement patterns observed in large quantities of data and not explicitly considering the distinct factors that shape protein evolution. We have estimated the first empirical codon model (ECM). Previous codon models assume that protein evolution proceeds only by successive single nucleotide substitutions, but our results indicate that model accuracy is significantly improved by incorporating instantaneous doublet and triplet changes. We also find that the affiliations between codons, the amino acid each encodes, and the physicochemical properties of the amino acids are main factors driving the process of codon evolution. Neither multiple nucleotide changes nor the strong influence of the genetic code nor amino acids' physicochemical properties form a part of standard mechanistic models and their views of how codon evolution proceeds. We have implemented the ECM for likelihood-based phylogenetic analysis, and an assessment of its ability to describe protein evolution shows that it consistently outperforms comparable mechanistic codon models. We point out the biological interpretation of our ECM and possible consequences for studies of selection.

19.
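Using maximum likelihood interval mapping, the QTL mapping efficiency of a threshold model was compared with that of a general linear model, and the main factors affecting the efficiency of QTL detection for discrete traits (QTL effect, trait heritability, and phenotypic incidence) were studied by simulation. The experimental design was a daughter design with multiple families, and the resource population comprised 500 animals. The results showed that the threshold model had a clear advantage in QTL parameter estimation and statistical power: its efficiency in mapping QTL for discrete traits was markedly higher than that of the linear model (LM) method, and its mapping accuracy was also higher. In addition, trait heritability, the size of the QTL effect, and the phenotypic incidence of the trait directly affected the accuracy of QTL mapping; mapping efficiency improved further as trait heritability, phenotypic incidence, and QTL effect increased.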

20.
In the present paper the linear logistic extension of latent class analysis is described. Thereby it is assumed that the item latent probabilities as well as the class sizes can be attributed to some explanatory variables. The basic equations of the model state the decomposition of the log-odds of the item latent probabilities and of the class sizes into weighted sums of basic parameters representing the effects of the predictor variables. Further, the maximum likelihood equations for these effect parameters and statistical tests for goodness-of-fit are given. Finally, an example illustrates the practical application of the model and the interpretation of the model parameters.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号