Similar literature
 18 similar articles found (search time: 203 ms)
1.
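The phylogenetic profile algorithm is an effective method for large-scale functional annotation of genomes and has been applied successfully to prokaryotic genomes. This work analyzes a key step of the method, the clustering of similar profiles, and proposes a statistical-modeling approach for clustering similar phylogenetic profiles. Experiments show that the approach maintains high coverage while improving the overall speed of the algorithm, and that its accuracy increases with the number of phylogenetic profiles used in modeling.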

2.
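The phylogenetic profile method is one of the more widely studied non-homology-based approaches to functional annotation of biological macromolecules. To address shortcomings of existing algorithms, the method is improved in two respects: first, weighted phylogenetic profiles are constructed; second, an improved clustering algorithm is used to analyze profile similarity. One hundred Escherichia coli K12 protein sequences downloaded from NCBI served as experimental data, and similar profiles were analyzed with the improved algorithm, classical hierarchical clustering, and K-means clustering. The results show that the improved algorithm clusters similar profiles markedly more accurately than the other two clustering algorithms.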

3.
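To understand how plant communities of sub-alpine meadows in the Gannan Tibetan Autonomous Prefecture are assembled along a slope-aspect gradient, this study selected plots on five slope aspects, constructed a phylogenetic tree of the plant community, measured soil environmental factors and leaf functional traits on each aspect, and tested the leaf functional traits for phylogenetic signal. The results show that slope aspect significantly affected soil water content and soil nutrient content. Leaf traits of most plants differed significantly among aspects: leaf dry matter content was higher on the south and southwest slopes, whereas specific leaf area and leaf nitrogen and phosphorus contents were higher on the north and northwest slopes. Leaf phosphorus content showed a weak phylogenetic signal, whereas leaf dry matter content, specific leaf area, and leaf nitrogen content showed no significant phylogenetic signal. From the south slope to the north slope, community phylogenetic structure shifted from overdispersed to clustered. Habitat filtering was the driver of community assembly on the south and southwest slopes, and interspecific competition was the main driver on the north and northwest slopes. The phylogenetic indices for the west slope were contradictory; its assembly mechanism is more complex and may result from several mechanisms acting together.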

5.
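Phylogenetic position of peritrichous ciliates studied with three molecular markers   Total citations: 4, self-citations: 1, citations by others: 3
To explore the phylogenetic position of peritrichous ciliates, electrophoretic banding patterns for 3 random primers were obtained by the RAPD method for 9 peritrichous ciliate species, 1 Tetrahymena species, and 1 Stentor species. The internal transcribed spacer 1 (ITS1) of the rRNA genes and the small subunit ribosomal RNA (SSrRNA) gene were sequenced for 7 peritrichous ciliate species, and the corresponding phylogenetic trees were constructed. After comparing the ranges of applicability of RAPD, ITS1, and SSrRNA gene sequences in phylogenetic studies of peritrichous ciliates, the SSrRNA gene sequence was used as the molecular marker to study their phylogenetic position. The results show that (1) the subclass Peritrichia is monophyletic, and its taxonomic status as a subclass within the class Oligohymenophorea is reasonable; and (2) peritrichous ciliates may be a relatively advanced group within Oligohymenophorea.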

6.
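Research progress on soil fauna size spectra   Total citations: 1, self-citations: 0, citations by others: 1
徐国瑞, 马克明. 《生态学报》 (Acta Ecologica Sinica), 2017, 37(8): 2506-2519
How community structure responds to environmental change has long been a core question in ecology. A size spectrum is constructed from individual body size and abundance; it is related to transfer rates between trophic levels, reflects the dynamics of ecosystem processes, and characterizes ecosystem stability. It can therefore be treated as an integrated indicator of functional diversity for predicting and describing how community composition and ecosystem function respond to environmental stress. Size spectrum research began in aquatic ecosystems and has in recent years been introduced into soil fauna community ecology. This review briefly traces the origin and theoretical basis of the size spectrum concept, compares four easily confused types of size spectrum in current research, introduces two commonly used methods for constructing soil fauna size spectra and their ecological significance, summarizes research progress on the response of soil fauna size spectra to environmental gradients and on their integration with ecological stoichiometry, and points out the difficulties and limitations of applying size spectra to soil fauna communities. In future basic research, work on soil fauna size spectra should focus on the relationships among body size, trophic position, and energy use; in applications, size spectra can be combined with traditional taxonomic methods and used widely to indicate environmental pollution, ecological restoration, conservation biology, and land-use change.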

7.
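Objective: To predict the structure and function of the human mitochondrial transcription termination factor 3 (hMTERF3) protein using bioinformatics. Methods: The hMTERF3 protein was studied systematically using the GenBank, Uniprot, ExPASy, and SWISS-PROT database resources and various bioinformatics software, covering its physicochemical properties, transmembrane regions and signal peptide, secondary-structure functional domains, subcellular localization, functional classification prediction, multiple sequence alignment of homologous proteins, phylogenetic tree construction, and homology modeling of the tertiary structure. Results: The software predicted a relative molecular mass of 47.97×10³ and an isoelectric point of 8.60 for hMTERF3, with no signal peptide or transmembrane region. Secondary-structure analysis showed mainly helices and random coils, including 6 MTERF motifs, and the tertiary-structure prediction agreed with the secondary-structure prediction. Subcellular localization analysis placed the protein in human mitochondria. Functional classification predicted it to be a transport and binding protein involved in the regulation of gene transcription. Multiple sequence alignment and evolutionary analysis of homologous proteins showed that hMTERF3 is highly homologous to the MTERF3 proteins of mammals such as rat and mouse, with which it clusters in the phylogenetic tree. Conclusion: The bioinformatic analysis of hMTERF3 provides a theoretical basis for further experimental studies of the structure and function of this protein.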

8.
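侯嫚嫚, 李晓宇, 王均伟, 刘帅, 赵秀海. 《生态学报》 (Acta Ecologica Sinica), 2017, 37(22): 7503-7513
Community assembly has long been a focus of ecological research. Quantifying the roles of habitat filtering, competitive exclusion, and stochastic processes in community assembly on the basis of phylogeny and functional traits allows a deeper understanding of assembly mechanisms. This study used three 5.2 hm² plots at different successional stages of the coniferous and broad-leaved mixed forest on Changbai Mountain (secondary poplar-birch forest, secondary coniferous and broad-leaved mixed forest, and primary Tilia-Korean pine forest). Using a phylogenetic tree built on the Angiosperm Classification System III (APG III) and 7 key functional traits (leaf area, specific leaf area, leaf thickness, leaf nitrogen content, leaf phosphorus content, nitrogen-to-phosphorus ratio, and maximum tree height), combined with environmental data, it analyzed the phylogenetic and functional trait structure of communities at the different successional stages. The results show that: (1) all 7 plant functional traits showed significant phylogenetic signal at every successional stage, indicating that plant functional traits are influenced by phylogenetic history; (2) phylogenetic and functional trait structure was non-random at all successional stages and in all diameter classes. As succession proceeded, community phylogenetic and functional trait structure shifted from clustered to overdispersed; as diameter class increased, the degree of clustering decreased, indicating that competitive exclusion becomes increasingly evident with succession and with increasing diameter class; (3) turnover in phylogeny and functional traits was non-random at every successional stage, and different factors differed in their explanatory power for the two. In early succession, spatial distance explained less than environmental distance, pointing to the importance of habitat filtering in community assembly, whereas in late succession spatial distance explained more than environmental distance, confirming the importance of dispersal limitation in community assembly.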

9.
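Phylogenetic diversity patterns in forest plots on Taibai Mountain and their influencing factors
Phylogenetic diversity indices are often used to distinguish the relative roles of ecological and evolutionary processes in plant community assembly. The methods used to infer phylogenetic diversity patterns (such as how the phylogenetic tree is constructed and which phylogenetic diversity index is used), evolutionary history (such as life form), and environmental gradients may all affect estimates of phylogenetic diversity patterns, and in turn our understanding of plant community assembly. It is therefore necessary to distinguish how these factors act on the estimates, but their relative importance and interactions remain unclear. This study used field survey data from 20 forest plots distributed along elevation on the north slope of Taibai Mountain (total elevational range of about 2800 m), comprising 274 woody and 581 herbaceous plant species. For all plants in these plots, we constructed the currently widely used synthesis tree and a molecular tree to compare tree construction methods, in particular the polytomies at the tips of the synthesis tree, and their possible effects on estimates of phylogenetic diversity patterns. We also calculated three phylogenetic diversity indices for each plot, namely Faith's PD, mean pairwise distance (MPD), and mean nearest taxon distance (MNTD), separately for woody and herbaceous plants. Multi-model comparison was used to identify the most parsimonious relationship between the estimates of phylogenetic diversity patterns and tree reconstruction method, diversity index, life form, elevation, and their interactions. The results show no significant difference between the phylogenetic diversity patterns obtained from the synthesis tree and the molecular tree. Elevation, diversity index, and life form interacted strongly in explaining phylogenetic diversity patterns and together explained more than 44% of the variation. Estimates of phylogenetic diversity patterns generally decreased with increasing elevation, but changed more gradually for herbaceous than for woody plants. For woody plants, the three indices showed a consistent elevational pattern (phylogenetic clustering), whereas the MPD index for herbaceous plants showed a random elevational pattern. Analyses of phylogenetic diversity patterns along environmental gradients therefore need to take into account the effects of inference methods and evolutionary history, to help us better understand plant community assembly.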
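The two distance-based indices named in this abstract, MPD and MNTD, can be illustrated with a minimal sketch. It assumes the standard unweighted (presence-only) definitions and a precomputed matrix of pairwise phylogenetic distances; the species names and distance values are hypothetical and are not taken from the study. Faith's PD is omitted because it needs the tree's branch lengths rather than a distance matrix.

```python
from itertools import combinations


def mpd(dist, taxa):
    """Mean pairwise distance: average distance over all pairs of distinct taxa."""
    pairs = list(combinations(taxa, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def mntd(dist, taxa):
    """Mean nearest taxon distance: average distance from each taxon to its closest other taxon."""
    nearest = [min(dist[a][b] for b in taxa if b != a) for a in taxa]
    return sum(nearest) / len(nearest)


# Hypothetical pairwise phylogenetic distances among four species (illustrative values only).
dist = {
    "A": {"A": 0.0, "B": 2.0, "C": 6.0, "D": 6.0},
    "B": {"A": 2.0, "B": 0.0, "C": 6.0, "D": 6.0},
    "C": {"A": 6.0, "B": 6.0, "C": 0.0, "D": 4.0},
    "D": {"A": 6.0, "B": 6.0, "C": 4.0, "D": 0.0},
}

plot_species = ["A", "B", "C"]  # species recorded in one hypothetical plot
print("MPD: ", mpd(dist, plot_species))
print("MNTD:", mntd(dist, plot_species))
```

MPD averages over every pair and so is sensitive to deep splits in the tree, whereas MNTD looks only at nearest relatives and is more sensitive to structure near the tips, which is one reason the two indices can give different patterns for the same plots.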

10.
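The Phylogenetic Profile Generator (PPG) software is written in two languages, Microsoft Visual Basic and Perl, and integrates the entire process of constructing phylogenetic profiles. The user need only supply the original protein or nucleic acid sequences, and the software generates the required phylogenetic profiles, with output in both text and XML formats. The software is available in Windows and Linux versions and can be downloaded free of charge from http://life.cnu.edu.cn/kexueyjshow.php?id=56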

11.
Ortholog identification is used in gene functional annotation, species phylogeny estimation, phylogenetic profile construction and many other analyses. Bioinformatics methods for ortholog identification are commonly based on pairwise protein sequence comparisons between whole genomes. Phylogenetic methods of ortholog identification have also been developed; these methods can be applied to protein data sets sharing a common domain architecture or which share a single functional domain but differ outside this region of homology. While promiscuous domains represent a challenge to all orthology prediction methods, overall structural similarity is highly correlated with proximity in a phylogenetic tree, conferring a degree of robustness to phylogenetic methods. In this article, we review the issues involved in orthology prediction when data sets include sequences with structurally heterogeneous domain architectures, with particular attention to automated methods designed for high-throughput application, and present a case study to illustrate the challenges in this area.

12.
The phylogenetic profile method has been widely applied in the prediction of protein-protein interactions (PPIs). Studies often use all of the available complete genomes for this method. With more than 400 genomes complete and new ones on the horizon, it remains unclear how reference organisms should be selected for profile construction and how that selection influences PPI prediction. Here, we performed a systematic assessment of reference organism selection from 225 complete genomes with their evolutionary tree. Our results suggest that reference organisms should be selected from moderately and highly genetically distant organisms, from all three domains (Bacteria, Archaea, and Eukarya), and by their even distribution at the fifth hierarchical level in the evolutionary tree. Our study provides important guidance on the construction of phylogenetic profiles for PPI prediction and functional genomics, which has become challenging due to the large and increasing number of available candidate organisms.

13.
Zhou Y, Wang R, Li L, Xia X, Sun Z. Journal of Molecular Biology, 2006, 359(4): 1150-1159
Identifying potential protein interactions is of great importance in understanding the topologies of cellular networks, which is much needed and valued in current systematic biological studies. The development of computational methods to predict protein-protein interactions has been spurred on by the massive sequencing efforts of the genomic revolution. Among these methods is phylogenetic profiling, which assumes that proteins under similar evolutionary pressures with similar phylogenetic profiles might be functionally related. Here, we introduce a method for inferring functional linkages between proteins from their evolutionary scenarios. The term evolutionary scenario refers to a series of events that occurred in speciation over time, which can be reconstructed given a phylogenetic profile and a species tree. Common evolutionary pressures on two proteins can then be inferred by comparing their evolutionary scenarios, which is a direct indication of their functional linkage. This scenario method performed better than the classical phylogenetic profile method when applied to the same test set. In addition, the predictions of the two methods are found to be fairly different, suggesting the possibility of merging them in order to achieve better performance. We analyzed the influence of the topology of the phylogenetic tree on the performance of this method, and found it to be robust to perturbations in the topology of the tree. However, if a completely random tree is incorporated, performance declines significantly. The evolutionary scenario method was used for inferring functional linkages in 67 species, and 40,006 linkages were predicted. We examine our predictions for budding yeast and find that almost all predicted linkages are supported by further evidence.

14.
“Phylogenetic profiling” is based on the hypothesis that during evolution functionally or physically interacting genes are likely to be inherited or eliminated in a codependent manner. Creating presence–absence profiles of orthologous genes is now a common and powerful way of identifying functionally associated genes. In this approach, correctly determining orthology, as a means of identifying functional equivalence between two genes, is a critical and nontrivial step and largely explains why previous work in this area has mainly focused on using presence–absence profiles in prokaryotic species. Here, we demonstrate that eukaryotic genomes have a high proportion of multigene families whose phylogenetic profile distributions are poor in presence–absence information content. This feature makes them prone to orthology mis-assignment and unsuited to standard profile-based prediction methods. Using CATH structural domain assignments from the Gene3D database for 13 complete eukaryotic genomes, we have developed a novel modification of the phylogenetic profiling method that uses genome copy number of each domain superfamily to predict functional relationships. In our approach, superfamilies are subclustered at ten levels of sequence identity, from 30% to 100%, and phylogenetic profiles built at each level. All the profiles are compared using normalised Euclidean distances to identify those with correlated changes in their domain copy number. We demonstrate that two protein families will “auto-tune” with strong co-evolutionary signals when their profiles are compared at the similarity levels that capture their functional relationship. Our method finds functional relationships that are not detectable by the conventional presence–absence profile comparisons, and it does not require a priori any fixed criteria to define orthologous genes.
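The comparison of copy-number profiles by normalised Euclidean distance can be sketched as follows. The abstract does not say how the profiles are normalised, so this sketch assumes z-score normalisation of each profile across genomes; that choice, the function names, and the copy-number values are all illustrative assumptions rather than the authors' procedure.

```python
import math


def zscore(profile):
    """Scale a copy-number profile to zero mean and unit variance across genomes."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in profile) / n)
    if sd == 0:
        # A constant profile carries no co-variation signal.
        return [0.0] * n
    return [(x - mean) / sd for x in profile]


def normalised_euclidean(p, q):
    """Euclidean distance between two z-scored copy-number profiles."""
    zp, zq = zscore(p), zscore(q)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(zp, zq)))


# Hypothetical domain copy numbers for three families across five genomes.
family_a = [1, 2, 4, 8, 3]
family_b = [2, 4, 8, 16, 6]  # same pattern as family_a at twice the copy number
family_c = [8, 4, 2, 1, 5]   # roughly the opposite pattern

print(normalised_euclidean(family_a, family_b))
print(normalised_euclidean(family_a, family_c))
```

Under this assumption, two families whose copy numbers rise and fall together across genomes are close even when their absolute copy numbers differ, which is the kind of correlated change the abstract describes.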

15.
Subcellular location is an important functional annotation of proteins. An automatic, reliable and efficient prediction system for protein subcellular localization is necessary for large-scale genome analysis. This paper describes a protein subcellular localization method which extracts features from protein profiles rather than from amino acid sequences. The protein profile represents a protein family, discards part of the sequence information that is not conserved throughout the family and therefore is more sensitive than the amino acid sequence. The amino acid compositions of the whole profile and of the N-terminus of the profile are extracted, respectively, to train and test the probabilistic neural network classifiers. On two benchmark datasets, the overall accuracies of the proposed method reach 89.1% and 68.9%, respectively. The prediction results show that the proposed method performs better than methods based on amino acid sequences. The prediction results of the proposed method are also compared with Subloc on two redundancy-reduced datasets.

16.
MOTIVATION: The phylogenetic profile of a protein is a string that encodes the presence or absence of the protein in every fully sequenced genome. Because proteins that participate in a common structural complex or metabolic pathway are likely to evolve in a correlated fashion, the phylogenetic profiles of such proteins are often 'similar' or at least 'related' to each other. The question we address in this paper is the following: how to measure the 'similarity' between two profiles, in an evolutionarily relevant way, in order to develop efficient function prediction methods? RESULTS: We show how the profiles can be mapped to a high-dimensional vector space which incorporates evolutionarily relevant information, and we provide an algorithm to compute efficiently the inner product in that space, which we call the tree kernel. The tree kernel can be used by any kernel-based analysis method for classification or data mining of phylogenetic profiles. As an application a Support Vector Machine (SVM) trained to predict the functional class of a gene from its phylogenetic profile is shown to perform better with the tree kernel than with a naive kernel that does not include any information about the phylogenetic relationships among species. Moreover a kernel principal component analysis (KPCA) of the phylogenetic profiles illustrates the sensitivity of the tree kernel to evolutionarily relevant variations.
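The profile representation described in this abstract, a string of presence/absence flags with one position per genome, can be sketched together with a tree-agnostic comparison. The code below is not the tree kernel: it shows only a plain inner product and a Hamming distance, as one plausible reading of a "naive" comparison that ignores phylogenetic relationships. The function names and profile strings are hypothetical.

```python
def parse_profile(bits):
    """Turn a string such as '1101001' into a list of 0/1 flags, one per genome."""
    return [int(c) for c in bits]


def linear_kernel(p, q):
    """Inner product of two binary profiles: the number of genomes containing both proteins."""
    return sum(a * b for a, b in zip(p, q))


def hamming_distance(p, q):
    """Number of genomes in which exactly one of the two proteins is present."""
    return sum(a != b for a, b in zip(p, q))


# Hypothetical profiles over seven genomes.
protein_1 = parse_profile("1101001")
protein_2 = parse_profile("1100001")  # nearly the same distribution as protein_1
protein_3 = parse_profile("0010110")  # complementary distribution

print(linear_kernel(protein_1, protein_2), hamming_distance(protein_1, protein_2))
print(linear_kernel(protein_1, protein_3), hamming_distance(protein_1, protein_3))
```

Both measures treat every genome as independent, so a mismatch between two closely related genomes counts the same as one between distant genomes; incorporating the species tree is what the abstract's tree kernel is said to add.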

17.
The currently available protein sequence data far exceed the experimental capacity to annotate their function, so in silico annotation, i.e. annotation using computational methods, is becoming increasingly important. This annotation is inevitably a prediction, but it can be an important starting point for further experimental studies. Here we present a method for prediction of protein functional sites, SDPsite, based on the identification of protein specificity determinants. Taking as input a protein sequence alignment and a phylogenetic tree, the algorithm predicts conserved positions and specificity determinants, maps them onto the protein's 3D structure, and searches for clusters of the predicted positions. Comparison of the obtained predictions with experimental data and with data on the performance of several other methods for prediction of functional sites reveals that SDPsite agrees well with experiment and outperforms most of the previously available methods. SDPsite is publicly available under http://bioinf.fbb.msu.ru/SDPsite.

18.
Wang B, Chen P, Huang DS, Li JJ, Lok TM, Lyu MR. FEBS Letters, 2006, 580(2): 380-384
This paper proposes a novel method that can predict protein interaction sites in heterocomplexes using residue spatial sequence profile and evolution rate approaches. The former represents the information of multiple sequence alignments while the latter corresponds to a residue's evolutionary conservation score based on a phylogenetic tree. Three predictors using a support vector machine algorithm are constructed to predict whether a surface residue is part of a protein-protein interface. The efficiency and effectiveness of the proposed approach are verified by its better prediction performance compared with other models. The study is based on a non-redundant data set of heterodimers consisting of 69 protein chains.
