Similar literature
Found 20 similar documents (search time: 0 ms)
1.
A new method (MZEF) for predicting internal coding exons in genomic DNA sequences has been developed. This method is based on a prediction algorithm that uses the quadratic discriminant function for multivariate statistical pattern recognition. With improved feature measures, an Arabidopsis thaliana-specific implementation of MZEF is completed and made available to the plant genome community.
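
As a rough illustration of the quadratic discriminant function mentioned above, the following minimal sketch scores a single feature under two Gaussian classes with unequal variances. The feature values, the class labels, and the one-dimensional simplification are assumptions made for illustration; they are not taken from MZEF, which the abstract describes as multivariate.

```python
import math
from statistics import mean, pvariance

def fit_class(values, prior):
    """Estimate mean, variance, and prior for one class (1-D Gaussian)."""
    return {"mu": mean(values), "var": pvariance(values), "prior": prior}

def qda_score(x, params):
    """Quadratic discriminant score: log prior plus Gaussian log-density,
    dropping the constant term shared by all classes."""
    return (math.log(params["prior"])
            - 0.5 * math.log(params["var"])
            - (x - params["mu"]) ** 2 / (2.0 * params["var"]))

def classify(x, classes):
    """Return the label whose quadratic discriminant score is largest."""
    return max(classes, key=lambda label: qda_score(x, classes[label]))

# Hypothetical training values of one feature (made-up numbers).
classes = {
    "exon": fit_class([0.8, 0.9, 1.1, 1.2], prior=0.5),
    "non_exon": fit_class([-1.5, -0.5, 0.5, 1.5], prior=0.5),
}
```

Because the two classes have different variances, the decision region for the narrow class is an interval: in this toy setup values well below and well above it both fall to the broad class, which a linear discriminant with a single threshold could not express.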

2.

Background  

Accurate classification into genotypes is critical in understanding the evolution of divergent viruses. Here we report a new approach, MuLDAS, which classifies a query sequence based on statistical genotype models learned from the known sequences. Thus, MuLDAS utilizes the full spectrum of well-characterized sequences as references, typically on the order of hundreds, in order to estimate the significance of each genotype assignment.

3.
4.
5.
A number of operator-binding proteins contain sequence features similar to those of the Cro and cI repressors of bacteriophage and the CAP protein of Escherichia coli, such as conserved amino acids at constant positions. However, these sequence patterns also occur in proteins that are not operator-binding. We use sequence analogy information in conjunction with a pattern recognition algorithm. The functional and structural properties of the protein, e.g., distributions of hydrophobicity, hydrophilicity, charged amino acids, electrostatic free energy, and helical structures, are also considered. Within the framework of discriminant analysis, we calculate the above variables and search for a better combination of variables. To assess the discriminatory power of these variables, we allocate additional sequences and predict DNA-binding regions of regulatory proteins not included in the training set.

6.
7.
Prediction of protein structural class by discriminant analysis (cited 7 times in total: 0 self-citations, 7 citations by others)
Protein structural class--alpha, beta, mixed (alpha/beta or alpha + beta), irregular--can be predicted from the amino acid sequence by discriminant analysis. Discrimination is based on distributions, in the classes, of vectors of attributes characterizing the sequences. In this paper, two sets of attributes and two methods of estimating their distributions are compared using more than 100 proteins from the Protein Data Bank. The best results were obtained when canonical variates of the frequencies of occurrence of 20 amino acids and non-parametric estimates of their distributions were used. Three variates are sufficient to allocate proteins to one of four classes with 83% reliability (estimated by cross-validation) and four variates allowed allocation to one of five classes with 78% reliability.

8.
We report an improved fluorescence-detected circular dichroism (FDCD)-based analytical method that is useful for probing protein three-dimensional structures. The method uses a novel FDCD device with an ellipsoidal mirror that functions on a standard circular dichroism (CD) spectrometer and eliminates all artifacts. Our experiments demonstrated three important findings. First, the method is applicable to any protein, either by using intrinsic fluorescence derived from tryptophan residues or by introducing a fluorescent label onto nonfluorescent proteins. Second, by using intrinsic fluorescence, FDCD spectroscopy can detect a change in the tertiary structure of metmyoglobin caused by stepwise denaturation as the pH changes. Such changes could not be detected by conventional CD spectroscopy. Third, based on the typical advantages of fluorescence-based analyses, FDCD measurements enable observation of only the target proteins in a solution even in the presence of other peptides. Using our ellipsoidal mirror FDCD device, we could observe structural changes of fluorescently labeled calmodulin on binding with Ca2+ and/or interacting with binding peptides. Because FDCD appears to reflect the protein’s local structure around the fluorophore, it may provide a useful means for “pinpoint analysis” of protein structures.

9.

Background  

Protein structure classification plays a central role in understanding the function of a protein molecule with respect to all known proteins in a structure database. With the rapid increase in the number of new protein structures, the need for automated and accurate methods for protein classification is increasingly important.

10.
Does a protein's secondary structure determine its three-dimensional fold? This question is tested directly by analyzing proteins of known structure and constructing a taxonomy based solely on secondary structure. The taxonomy is generated automatically, and it takes the form of a tree in which proteins with similar secondary structure occupy neighboring leaves. Our tree is largely in agreement with results from the structural classification of proteins (SCOP), a multidimensional classification based on homologous sequences, full three-dimensional structure, information about chemistry and evolution, and human judgment. Our findings suggest a simple mechanism of protein evolution.

11.
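[Objective] The Colorado potato beetle is a devastating pest in potato production. Temperature is an important factor affecting its occurrence; clarifying how temperatures during the overwintering and occurrence periods affect its occurrence can provide theoretical support for predicting and controlling future outbreaks of this pest. [Methods] Stepwise discriminant analysis was used to classify the occurrence grade and emergence time of the Colorado potato beetle in Qapqal County, Xinjiang, over its overwintering and occurrence periods (December of the previous year to September of the current year) for 1994-2021, and an occurrence prediction model was established. [Results] In the training group, the discrimination accuracies for occurrence grade and emergence time were 100.00% and 80.00%, respectively; in the prediction group, the overall discrimination accuracies for occurrence grade and emergence time were 69.23% and 76.92%, respectively, and the discrimination results were considered fairly reliable. [Conclusion] Screening of the factors affecting the discrimination of occurrence degree and emergence time showed that both the emergence and the occurrence discrimination of the Colorado potato beetle in Qapqal County were influenced by April temperature.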

12.
13.
Prediction of protein secondary structure content (cited 5 times in total: 0 self-citations, 5 citations by others)
Liu W, Chou KC. Protein Engineering, 1999, 12(12): 1041-1050
All existing algorithms for predicting the content of protein secondary structure elements have been based on the conventional amino-acid composition, where no sequence coupling effects are taken into account. In this article, an algorithm was developed for predicting the content of protein secondary structure elements that was based on a new amino-acid composition, in which the sequence coupling effects are explicitly included through a series of conditional probability elements. The prediction was examined by a self-consistency test and an independent dataset test. Both indicated a remarkable improvement obtained when using the current algorithm to predict the contents of alpha-helix, beta-sheet, beta-bridge, 3(10)-helix, pi-helix, H-bonded turn, bend and random coil. Examples of the improved accuracy obtained by introducing the new amino-acid composition, as well as its impact on the study of protein structural class and biological function, are discussed.
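
To make the contrast between the two kinds of composition concrete, the sketch below computes the conventional 20-residue composition and, separately, first-order conditional probabilities between adjacent residues. The abstract does not give the exact formulation used by Liu and Chou, so the nearest-neighbor coupling, the function names, and the toy sequence are all assumptions for illustration only.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Conventional amino-acid composition: frequency of each of the 20 residues.
    Assumes seq contains at least one standard residue."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS)
    return {a: counts[a] / total for a in AMINO_ACIDS}

def conditional_probabilities(seq):
    """Estimate P(next residue = b | current residue = a) from adjacent pairs.
    Only pairs that actually occur in seq appear in the result."""
    pair_counts = Counter(zip(seq, seq[1:]))
    first_counts = Counter(seq[:-1])
    return {
        (a, b): n / first_counts[a]
        for (a, b), n in pair_counts.items()
    }

seq = "AAGAAG"  # hypothetical toy sequence
comp = composition(seq)
cond = conditional_probabilities(seq)
```

Two sequences with identical composition can differ in their conditional probabilities, which is the kind of order information the conventional composition discards.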

14.
For over 2 decades, continuous efforts to organize the jungle of available protein structures have been underway. Although a number of discrepancies between different classification approaches for soluble proteins have been reported, the classification of membrane proteins has so far not been comparatively studied because of the limited amount of available structural data. Here, we present an analysis of α‐helical membrane protein classification in the SCOP and CATH databases. In the current set of 63 α‐helical membrane protein chains having between 1 and 13 transmembrane helices, we observed a number of differently classified proteins both regarding their domain and fold assignment. The majority of all discrepancies affect single transmembrane helix, two helix hairpin, and four helix bundle domains, while domains with more than five helices are mostly classified consistently between SCOP and CATH. It thus appears that the structural constraints imposed by the lipid bilayer complicate the classification of membrane proteins with only a few membrane‐spanning regions. This problem seems to be specific to membrane proteins, as soluble four helix bundles, not restrained by the membrane, are more consistently classified by SCOP and CATH. Our findings indicate that the structural space of small membrane helix bundles is highly continuous such that even minor differences in individual classification procedures may lead to a significantly different classification. Membrane proteins with few helices and limited structural diversity only seem to be reasonably classifiable if the definition of a fold is adapted to include more fine‐grained structural features such as helix–helix interactions and reentrant regions. Proteins 2010. © 2010 Wiley‐Liss, Inc.

15.
We present a protein fold recognition method, MANIFOLD, which uses the similarity between target and template proteins in predicted secondary structure, sequence and enzyme code to predict the fold of the target protein. We developed a non-linear ranking scheme in order to combine the scores of the three different similarity measures used. For a difficult test set of proteins with very little sequence similarity, the program predicts the fold class correctly in 34% of cases. This is a more than twofold increase in accuracy compared with sequence-based methods such as PSI-BLAST or GenTHREADER, which score 13-14% correct first hits for the same test set. The functional similarity term increases the prediction accuracy by up to 3% compared with using the combination of secondary structure similarity and PSI-BLAST alone. We argue that using functional and secondary structure information can extend fold recognition beyond what sequence similarity alone achieves.

16.
Abstract. 1. Regional scarabaeid dung beetle assemblages in southern Africa may contain over 100 species, ranging in live weight from 10 mg to 10 g. These show a wide variety of dung-use and reproductive strategies.
2. To facilitate analysis of these diverse assemblages, a system of classification analogous to guilds is proposed. Scarabaeid dung beetle species are allocated to one of seven functional groups (FGs) according to the way they use and disrupt dung. Each group therefore contains a set of species which are functional analogues of each other. This classification provides a conceptual framework within which to analyse the structure of dung beetle assemblages and the interactions between dung beetles and other dung-breeding species such as coprophagous flies.
3. There is a clear hierarchy of functional groups in their ability to compete for dung. Competitively dominant groups such as the large ball rollers (FG I) and fast-burying tunnellers (FG III) are mostly large, aggressive beetles which rapidly remove dung from the pad. The smaller ball rollers (FG II) are also effective competitors for dung. Subordinate groups are those which bury dung slowly over many days (FG IV and V) and those which breed inside the pad (FG VII, endocoprids). Kleptocoprids (FG VI) breed in dung buried by other beetles and so are not part of the hierarchy.
4. The use of this classification is illustrated by reference to three contrasting assemblages of dung beetles in a summer rainfall region of southern Africa. The potential of these beetles for biological control of dung-breeding flies is discussed.

17.
18.
Information on protein quaternary structure can help in understanding the biological functions of proteins. Because wet-lab experiments are both time-consuming and costly, we adopt a novel computational approach to assign proteins to 10 kinds of quaternary structure. Each protein was coded using its biochemical and physicochemical properties, and feature selection was carried out using the Incremental Feature Selection (IFS) method. The resulting optimal feature set consisted of 97 features, with which the prediction model was built. The overall prediction success rate is 74.90% as evaluated by the jackknife test, much higher than the 10% (1/10) expected from a random guess. Further feature analysis indicates that protein secondary structure is the feature that contributes most to the prediction of protein quaternary structure.
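
The jackknife test mentioned above can be sketched as a leave-one-out loop: each sample is held out in turn and predicted from all the others. The abstract does not state which classifier was used, so the nearest-neighbor rule below is a stand-in, and the two-feature samples and class labels are hypothetical.

```python
def nearest_neighbor_label(query, samples):
    """Label of the sample closest to query (squared Euclidean distance).
    Each sample is a (features, label) pair."""
    def dist(sample):
        return sum((q - s) ** 2 for q, s in zip(query, sample[0]))
    return min(samples, key=dist)[1]

def jackknife_success_rate(samples):
    """Leave each sample out in turn, predict it from the rest,
    and return the fraction predicted correctly."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_neighbor_label(features, rest) == label:
            hits += 1
    return hits / len(samples)

# Hypothetical two-feature samples with made-up class labels.
samples = [
    ((0.0, 0.1), "monomer"), ((0.1, 0.0), "monomer"), ((0.2, 0.2), "monomer"),
    ((1.0, 1.1), "dimer"), ((1.1, 0.9), "dimer"), ((0.9, 1.0), "dimer"),
]
rate = jackknife_success_rate(samples)
```

With 10 equally likely classes a random guess would be right about 1/10 of the time, which is the baseline the abstract compares against.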

19.
A procedure for monitoring plant community change was described using data from 189 quadrats (each 0.09 m2 in area) from or near 11 Carex exserta meadow sites in the high Sierra Nevada, California, USA. Initially the quadrats were agglomerated into five clusters by the flexible clustering strategy (beta = -0.25) with the standard absolute distance resemblance function. Data for each quadrat were cover percentages for C. exserta, other plants, litter, soil, gravel, and rock. The five clusters appeared to define a cover gradient, from quadrats with mostly gravel and rock to those with mostly C. exserta, and were accordingly designated pioneer, low seral, mid-seral, high seral, and climax. Classification functions (from discriminant analysis) are used with values of the variables to classify individual quadrats on sites used to monitor change. A site is characterized at repeated observations by the proportions of quadrats in each class. Within-class (low seral vs. low seral) rather than between-class (pioneer vs. low seral) tests are made for presence of change. Confidence intervals for differences in proportions of quadrats or individual quadrat probabilities of class membership are computed. If the confidence intervals do not cover zero, values for time one versus time two differ significantly.
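
The final decision rule, that a change is significant when the confidence interval for a difference in proportions excludes zero, can be sketched as follows. The abstract does not say which interval the authors computed, so the Wald (normal-approximation) interval, the z value of 1.96, and the quadrat counts are assumptions for illustration.

```python
import math

def diff_proportion_ci(hits1, n1, hits2, n2, z=1.96):
    """Wald confidence interval for p2 - p1 (z=1.96 gives roughly 95%)."""
    p1, p2 = hits1 / n1, hits2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    return diff - z * se, diff + z * se

def differs(ci):
    """True when the interval excludes zero."""
    low, high = ci
    return low > 0 or high < 0

# Hypothetical counts of quadrats in one class at time one and time two.
large_change = diff_proportion_ci(10, 40, 25, 40)
small_change = diff_proportion_ci(10, 40, 12, 40)
```

In this made-up example, moving from 10 to 25 of 40 quadrats gives an interval entirely above zero, while moving from 10 to 12 of 40 gives an interval that spans zero.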

20.
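Protein fold classification is an important part of protein classification research. Taking the PH domain-like barrel fold in the SCOP database as the research object, 61 samples with sequence similarity below 25% were selected as the test set. Through structural feature analysis, the template of this fold and its corresponding characteristic parameters were determined. Using the spatial structure alignment information between the template and the query protein, a new fold scoring function, Fscore, was proposed, and an Fscore-based protein fold classification method was established and applied to the classification of this fold. When the method was tested on 16711 samples from Astral1.75 with sequence similarity below 95%, the specificity of the classification results was 99.97%. The results show that the characteristic parameters capture the essence of the fold, and that the scoring function Fscore and the classification method built on it can be used for automatic classification of the PH domain-like barrel protein fold.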
