Similar literature
1.
Markov models of codon substitution are powerful inferential tools for studying biological processes such as natural selection and preferences in amino acid substitution. The equilibrium character distributions of these models are almost always estimated using nucleotide frequencies observed in a sequence alignment, primarily as a matter of historical convention. In this note, we demonstrate that a popular class of such estimators is biased, and that this bias has an adverse effect on goodness of fit and estimates of substitution rates. We propose a “corrected” empirical estimator that begins with observed nucleotide counts, but accounts for the nucleotide composition of stop codons. We show via simulation that the corrected estimates outperform the de facto standard estimates not just by providing better estimates of the frequencies themselves, but also by leading to improved estimation of other parameters in the evolutionary models. On a curated collection of sequence alignments, our estimators show a significant improvement in goodness of fit compared to the approach. Maximum likelihood estimation of the frequency parameters appears to be warranted in many cases, albeit at a greater computational cost. Our results demonstrate that there is little justification, either statistical or computational, for continued use of the -style estimators.
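The source of the bias described in this abstract can be illustrated with a small sketch. The abstract's name for the estimator is missing from this page, so the code below assumes a simple product-of-nucleotide-frequencies estimator over the 61 sense codons of the standard genetic code; the function names are hypothetical and this is not the authors' corrected estimator.

```python
from itertools import product

NUCLEOTIDES = "TCAG"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def product_codon_frequencies(nuc_freqs):
    """Codon frequencies as products of nucleotide frequencies,
    renormalised over the 61 sense codons (stop codons excluded)."""
    raw = {}
    for codon in map("".join, product(NUCLEOTIDES, repeat=3)):
        if codon in STOP_CODONS:
            continue
        p = 1.0
        for base in codon:
            p *= nuc_freqs[base]
        raw[codon] = p
    total = sum(raw.values())
    return {codon: p / total for codon, p in raw.items()}


def implied_nucleotide_frequencies(codon_freqs):
    """Nucleotide composition implied by a set of codon frequencies."""
    implied = dict.fromkeys(NUCLEOTIDES, 0.0)
    for codon, p in codon_freqs.items():
        for base in codon:
            implied[base] += p / 3.0
    return implied


uniform = {base: 0.25 for base in NUCLEOTIDES}
codon_freqs = product_codon_frequencies(uniform)
implied = implied_nucleotide_frequencies(codon_freqs)
```

With uniform input frequencies, the implied composition is no longer uniform: stop codons contain no C, so C is over-represented among sense codons (48/183 rather than 0.25). This mismatch between the nucleotide frequencies fed in and those implied by the model is one plausible reading of why stop-codon composition matters.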

2.
3.
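Using the polymerase chain reaction (PCR) in combination with restriction endonucleases, four primers containing restriction sites and the corresponding mutations were designed. With the coat protein (cp) gene of Potato virus X (PVX) as the template, the corresponding fragments were amplified, digested with the appropriate enzymes, and assembled into the cloning vector pBlueKS( / - ) by three-fragment ligation. Sequencing of randomly picked recombinants showed that three-fragment assembly successfully introduced mutations at different positions of the PVX coat protein gene. The results indicate that three-fragment ligation can greatly improve the efficiency of screening for mutants, thereby saving labor, materials and time.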

4.
Continuous-time Markov processes are often used to model the complex natural phenomenon of sequence evolution. To make the process of sequence evolution tractable, simplifying assumptions are often made about the sequence properties and the underlying process. The validity of one such assumption, time-homogeneity, has never been explored. Violations of this assumption can be found by identifying non-embeddability. A process is non-embeddable if it cannot be embedded in a continuous time-homogeneous Markov process. In this study, non-embeddability was demonstrated to exist when modelling sequence evolution with Markov models. Evidence of non-embeddability was found primarily at the third codon position, possibly resulting from changes in mutation rate over time. Outgroup edges and those with a deeper time depth were found to have an increased probability of the underlying process being non-embeddable. Overall, low levels of non-embeddability were detected when examining individual edges of triads across a diverse set of alignments. Subsequent phylogenetic reconstruction analyses demonstrated that non-embeddability could affect the correct prediction of phylogenies, but at extremely low levels. Despite the existence of non-embeddability, there is minimal evidence of violations of the local time-homogeneity assumption, and consequently the impact is likely to be minor.

5.
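Codon preference analysis of the SARS coronavirus   Total citations: 14, self-citations: 0, citations by others: 14
The aim was to analyze the codon preference of the SARS (Severe Acute Respiratory Syndrome) coronavirus and to provide a reference for choosing a host system for expressing SARS coronavirus genes. The CHIPS (Codon Heterozygosity in a Protein-coding Sequence) and CUSP (Create a codon usage table) programs of EMBOSS (The European Molecular Biology Open Software Suite) were used to analyze six protein-coding genes of the SARS coronavirus, and the six coding sequences were also concatenated for a genome-wide analysis of codon preference. The results were compared with the codon preferences of Escherichia coli, yeast and human. The CHIPS analysis gave an Nc (effective number of codons) value of 53.338 for the SARS coronavirus, and Nc values of 45.733, 61.000, 59.040, 46.618, 46.924 and 51.902 for the S, E, M and N proteins, the 3CL protease and the RNA polymerase, respectively. The usage frequencies of the different codons encoding amino acids such as A, P, R, S, T and L differed considerably. E. coli had 25 codons, yeast 12 and human 20 whose usage preference differed markedly from that of the SARS coronavirus. It was therefore concluded that the codons encoding SARS coronavirus amino acids occur at fairly uniform frequencies. The codon preference of the SARS coronavirus is closer to that of eukaryotes and further from that of prokaryotes, so eukaryotic systems such as yeast may be more suitable for expressing its genes.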
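As a rough illustration of what a codon usage table contains, the sketch below counts codons in a coding sequence and reports, for each codon, its share among the codons for the same amino acid. It assumes the standard genetic code and an in-frame sequence; the function names are hypothetical, and this is not the EMBOSS implementation of CUSP or CHIPS.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order ('*' marks stop codons).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_counts(cds):
    """Count in-frame codons; codons with ambiguous bases are skipped."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def synonymous_fractions(counts):
    """Fraction of each codon among all codons for the same amino acid."""
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {
        codon: n / totals[CODON_TABLE[codon]]
        for codon, n in counts.items()
    }


counts = codon_counts("ATGGCTGCTGCCTAA")
fractions = synonymous_fractions(counts)
```

For the toy sequence above, alanine is encoded twice by GCT and once by GCC, so GCT receives a fraction of 2/3. Comparing such fractions between a virus and a candidate host is the kind of comparison the abstract describes.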

6.
A major challenge in computational biology is constraining free parameters in mathematical models. Adjusting a parameter to make a given model output more realistic sometimes has unexpected and undesirable effects on other model behaviors. Here, we extend a regression-based method for parameter sensitivity analysis and show that a straightforward procedure can uniquely define most ionic conductances in a well-known model of the human ventricular myocyte. The model's parameter sensitivity was analyzed by randomizing ionic conductances, running repeated simulations to measure physiological outputs, then collecting the randomized parameters and simulation results as “input” and “output” matrices, respectively. Multivariable regression derived a matrix whose elements indicate how changes in conductances influence model outputs. We show here that if the number of linearly independent outputs equals the number of inputs, the regression matrix can be inverted. This is significant, because it implies that the inverted matrix can specify the ionic conductances that are required to generate a particular combination of model outputs. Applying this idea to the myocyte model tested, we found that most ionic conductances could be specified with precision (R² > 0.77 for 12 out of 16 parameters). We also applied this method to a test case of changes in electrophysiology caused by heart failure and found that changes in most parameters could be well predicted. We complemented our findings using a Bayesian approach to demonstrate that model parameters cannot be specified using limited outputs, but they can be successfully constrained if multiple outputs are considered. Our results place on a solid mathematical footing the intuition-based procedure of simultaneously matching a model's output to several data sets. More generally, this method shows promise as a tool to define model parameters, in electrophysiology and in other biological fields.
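The regress-then-invert idea in this abstract can be sketched with NumPy. Everything below is a toy under stated assumptions: the `simulate` function is a hypothetical linear stand-in for the myocyte model, and the parameter count, noise level and matrix `true_B` are invented for illustration rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_params, n_trials = 4, 500

# Hypothetical "true" sensitivity matrix, kept well conditioned on purpose.
true_B = np.eye(n_params) + 0.1 * rng.normal(size=(n_params, n_params))


def simulate(X):
    """Toy stand-in for the simulator: linear response plus small noise."""
    noise = 0.001 * rng.normal(size=(X.shape[0], n_params))
    return X @ true_B + noise


# "Input" matrix: randomised parameter perturbations, one row per trial.
X = rng.normal(scale=0.1, size=(n_trials, n_params))
# "Output" matrix: simulated outputs for each trial.
Y = simulate(X)

# Multivariable regression: find B such that Y is approximately X @ B.
B, *_ = np.linalg.lstsq(X, Y, rcond=None)

# With as many independent outputs as inputs, B is square and invertible,
# so a desired output combination maps back to the parameters producing it.
B_inv = np.linalg.inv(B)
wanted_params = np.array([0.05, -0.02, 0.01, 0.03])
target_outputs = wanted_params @ true_B
recovered_params = target_outputs @ B_inv
```

In this toy setting `recovered_params` lands close to `wanted_params`. If fewer independent outputs than parameters were available, `B` would not be square and the inversion step would not be possible, which matches the abstract's point about limited outputs.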

7.
8.
Evolutionary models that make use of site-specific parameters have recently been criticized on the grounds that parameter estimates obtained under such models can be unreliable and lack theoretical guarantees of convergence. We present a simulation study providing empirical evidence that a simple version of the models in question does exhibit sensible convergence behavior and that additional taxa, despite not being independent of each other, lead to improved parameter estimates. Although it would be desirable to have theoretical guarantees of this, we argue that such guarantees would not be sufficient to justify the use of these models in practice. Instead, we emphasize the importance of taking the variance of parameter estimates into account rather than blindly trusting point estimates; this is standardly done by using the models to construct statistical hypothesis tests, which are then validated empirically via simulation studies.

9.
10.
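Analysis of codon usage frequency in the vaccinia virus genome   Total citations: 7, self-citations: 2, citations by others: 7
Differences in codon usage are a widespread phenomenon: each codon is preferred by some organisms and rarely used by others. Previous studies of this topic have mostly focused on autotrophic organisms, whereas the codon usage of purely parasitic viruses, and its relationship to the codon usage frequencies of host-cell genes, has rarely been studied. Analysis of the codon usage frequencies of 189 genes of the vaccinia virus Copenhagen strain showed that: overall, vaccinia virus prefers codons ending in A/U; heterogeneity among genes is weak, with no major trend influencing codon usage; codon usage differs slightly between transcription directions and between temporal classes of expression; codon usage differs considerably among genes of different functions; and late genes differ more from host codon usage frequencies than early genes do. These results suggest that codon usage is an important mechanism that influences virus-cell interactions and ensures the survival of the virus.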

11.
Abstract

The codon usage in the Vibrio cholerae genome is analyzed in this paper. Although there are many more genes on chromosome 1 than on chromosome 2, the codon usage patterns of genes on the two chromosomes are quite similar, indicating that the two chromosomes may have coexisted in the same cell for a very long time. Unlike the base frequency pattern observed in other genomes, the G+C content at the third codon position of the V. cholerae genome varies within a rather small interval. The most notable feature of codon usage in the V. cholerae genome is that a fraction of genes show significant bias in base choice at the second codon position. The 2006 known genes can be classified into two clusters according to the base frequencies at this position. The smaller cluster contains 227 genes, most of which code for proteins involved in transport and binding functions. The products encoded by these genes have a significant bias in amino acid composition compared with other genes. The codon usage patterns of the 1836 ORFs of unknown function are also analyzed, which is useful for studying their functions.

12.
SK Behura  DW Severson 《PloS one》2012,7(8):e43111

Background

Codon bias is the phenomenon of non-uniform usage of codons, whereas codon context generally refers to a sequential pair of codons in a gene. Although genome sequencing of multiple species of dipteran and hymenopteran insects has been completed, only a few of these species have been analyzed for codon usage bias.

Methods and Principal Findings

Here, we use bioinformatics approaches to analyze codon usage bias and codon context patterns in a genome-wide manner among 15 dipteran and 7 hymenopteran insect species. Results show that GAA is the most frequent codon in the dipteran species whereas GAG is the most frequent codon in the hymenopteran species. The data reveal that codons ending with C or G are frequently used in the dipteran genomes whereas codons ending with A or T are frequently used in the hymenopteran genomes. Synonymous codon usage orders (SCUO) vary within genomes in a pattern that seems to be distinct for each species. Based on a comparison of 30 one-to-one orthologous genes among 17 species, the fruit fly Drosophila willistoni shows the least codon usage bias whereas the honey bee (Apis mellifera) shows the highest bias. Analysis of codon context patterns of these insects shows that specific codons are frequently used as the 3′- and 5′-context of start and stop codons, respectively.

Conclusions

Codon bias patterns are distinct between dipteran and hymenopteran insects. While codon bias is favored by the high GC content of dipteran genomes, the high AT content of genes favors biased usage of synonymous codons in the hymenopteran insects. Also, codon context patterns vary among these species largely according to their phylogeny.

13.
A simple model of acute myeloblastic leukaemia (AML) development is introduced, explicitly including cell growth, cell differentiation and cell-cell interaction. Each of these processes is described by a single model parameter. It is hypothesized that the leukaemic cell is characterized by an alteration of only one of these processes. The kinetic behaviour of the AML system is examined separately for possible alterations of each of the three parameters describing the three processes involved. It is shown that, on the basis of the existing data on AML kinetics, the alteration of the growth and cell-cell interaction parameters can be eliminated as a possible source of AML. Thus kinetic data support the modification of the differentiation process as the origin of the AML state. Further, the growth characteristics of normal and leukaemic cells in the presence of each other are analysed. It is shown how the initial growth of leukaemic cells depends on the difference in the differentiation of normal and leukaemic cells and how the same difference determines the decay of normal cells in the presence of the predominantly leukaemic population. Correlations between the kinetic parameters of the normal and leukaemic populations are suggested to characterize the leukaemic state.

14.
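Analysis of factors affecting codon usage in the genome of Streptococcus pneumoniae   Total citations: 5, self-citations: 2, citations by others: 5
Hou Zhuocheng  Yang Ning 《Acta Genetica Sinica》2002,29(8):747-752
The complete genome sequence of Streptococcus pneumoniae has been determined and was recently published. The genome sequence was analyzed in detail to study the pattern of codon usage and the factors that influence it. Highly expressed genes use cytosine (C) at the third codon position significantly more often than lowly expressed genes do, whereas lowly expressed genes tend to use purines at the third codon position. Gene expression level is significantly correlated with the first axis of the correspondence analysis (R = 0.86). A comparison of codon usage patterns between the groups of highly and lowly expressed genes shows that expression level has a significant effect on codon usage. The G+C composition of a gene is significantly correlated with its expression level (R = 0.44) and with the first axis of the correspondence analysis (R = 0.5), and has a significant effect on expression level and codon usage. Analyses of GC skew, protein hydrophobicity and gene length showed that genes of different lengths differ in expression level, GC content and GC3s. The results indicate that natural selection at the level of expression and the base composition of genes are the main factors shaping codon usage in S. pneumoniae, and that gene length also has some influence on codon usage.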
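Two of the composition measures named in this abstract, GC3s and GC skew, are simple to compute. The sketch below uses common textbook definitions as assumptions: GC3s as the G+C fraction at third positions of codons that have synonymous alternatives (so ATG, TGG and stop codons are skipped, under the standard genetic code), and GC skew as (G − C)/(G + C). The function names are hypothetical and the original study's software is not reproduced here.

```python
NON_SYNONYMOUS_OR_STOP = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds):
    """G+C fraction at third codon positions, skipping Met, Trp and stops."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    thirds = [
        cds[i + 2]
        for i in range(0, usable, 3)
        if cds[i:i + 3] not in NON_SYNONYMOUS_OR_STOP
    ]
    if not thirds:
        return 0.0
    return sum(base in "GC" for base in thirds) / len(thirds)


def gc_skew(seq):
    """(G - C) / (G + C); returns 0.0 when the sequence has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


example_gc3s = gc3s("ATGGCCAAAGGGTAA")
example_skew = gc_skew("GGGC")
```

In the toy sequence, only GCC, AAA and GGG contribute to GC3s, giving 2/3. Computing these values per gene and correlating them with an expression measure is the kind of analysis the abstract summarizes.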

15.
The aim of this paper is to study the distribution of the likelihood ratio for testing whether one is sampling from a mixture of two distributions or from a single distribution. We study the case where some information is available on the variation range of the parameters of the populations. First we study the simplest case, in which the difference between the means of the two populations is known. We show certain distortions between theoretical and simulation results. Secondly, we show how this distortion spreads to the situation where this difference belongs to an interval. Finally, we give an example concerning the detection of major genes in animal populations.

16.
The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces act on the group as a whole. In this study, we have compared the codon usage bias (CUB) between eight taxonomically distinct microsporidian genomes: Encephalitozoon intestinalis, Encephalitozoon cuniculi, Spraguea lophii, Trachipleistophora hominis, Enterocytozoon bieneusi, Nematocida parisii, Nosema bombycis and Nosema ceranae. While the CUB was found to be weak in all eight Microsporidia, nearly all (98%) of the optimal codons in S. lophii, T. hominis, E. bieneusi, N. parisii, N. bombycis and N. ceranae favor A/U in the third position, whereas most (64.6%) optimal codons in the Encephalitozoon species E. intestinalis and E. cuniculi are biased towards G/C. Although nucleotide composition biases are likely the main factor driving the CUB in Microsporidia according to correlation analyses, directed mutational pressure also likely affects the CUB, as suggested by ENc-plots, correspondence and neutrality analyses. Overall, the Encephalitozoon genomes were found to be markedly different from the other microsporidians and, despite being the first sequenced representatives of this lineage, are uncharacteristic of the group as a whole. The disparities observed cannot be attributed solely to differences in host specificity, and we hypothesize that other forces are at play in the lineage leading to Encephalitozoon species.

17.
Reciprocal interactions between the central 5-HT system and the hypothalamo-pituitary-adrenal (HPA) axis are of particular relevance with regard to depression, in which alterations of both systems have been evidenced. In order to further explore these interactions, two models of mutant mice have been used. They consisted of knock-out mice lacking the 5-HT transporter (5-HTT-/-) and of transgenic mice with impaired glucocorticoid receptor (GR-i) expression. Under control conditions, the functional properties of 5-HT(1A) autoreceptors in GR-i mice were the same as in paired wild-type mice. However, both chronic stress and long-term treatment with fluoxetine induced abnormal adaptive changes in 5-HT(1A) autoreceptor functioning in GR-i mice. On the other hand, a marked desensitization of 5-HT(1A) autoreceptors was found in 5-HTT-/- mice as compared with paired wild-type animals, and this phenomenon was further enhanced by exposure to stressful conditions. These data show that alterations of the HPA axis at the gene level have consequences for 5-HT neurotransmission and, reciprocally, that 5-HTT knock-out affects HPA-dependent responses to stress.

18.
19.
Abstract

In this paper we describe the use of molecular mechanics models to examine detailed intermolecular interactions within the liquid state of a common nonionic surfactant system, nonyl phenol ethoxylate (NPE). Using constant-energy molecular dynamics simulations, we have studied the relative strengths of dispersive interactions versus polar interactions and have estimated three-dimensional solubility parameters for NPE systems as a function of temperature and ethylene oxide content. The predictions at 300 K are in good agreement with three-dimensional solubility parameters predicted using group contribution tables. Models of the amorphous liquid state were represented by single molecular structures of NPE in a periodic cell. The solubility parameters predicted with these models were in good agreement with the values derived from models having eight NPE molecules packed into a cell, with the exception of the electrostatic interactions, which are the most sensitive to system size effects.

20.
In a recent work (Haber, 1984) the author described a system of two genetic loci in terms of loglinear models. The present note provides a method for estimating the variances of the estimated parameters in these models.
