Similar articles
 20 similar articles found (search time: 15 ms)
1.
We have determined consensus protein-fold classifications on the basis of three classification methods, SCOP, CATH, and Dali. These classifications make use of different methods of defining and categorizing protein folds that lead to different views of protein-fold space. Pairwise comparisons of domains on the basis of their fold classifications show that much of the disagreement between the classification systems is due to differing domain definitions rather than assigning the same domain to different folds. However, there are significant differences in the fold assignments between the three systems. These remaining differences can be explained primarily in terms of the breadth of the fold classifications. Many structures may be defined as having one fold in one system, whereas far fewer are defined as having the analogous fold in another system. By comparing these folds for a nonredundant set of proteins, the consensus method breaks up broad fold classifications and combines restrictive fold classifications into metafolds, creating, in effect, an averaged view of fold space. This averaged view requires that the structural similarities between proteins having the same metafold be recognized by multiple classification systems. Thus, the consensus map is useful for researchers looking for fold similarities that are relatively independent of the method used to compare proteins. The 30 most populated metafolds, representing the folds of about half of a nonredundant subset of the PDB, are presented here. The full list of metafolds is presented on the Web.

2.
3.
Classifying species into functional groups is a way to understand the functioning of species-rich ecosystems, or to model the dynamics of such ecosystems. Many statistical techniques have been defined to classify species into groups, and a question is whether different techniques yield consistent classifications. In a tropical rain forest in French Guiana, five species classifications have been defined by different authors for the purpose of forest growth modelling but using different data sets and different statistical techniques. The correspondence between the five classifications was measured using four indices that are generalizations of existing indices to compare two classifications. A multiple correspondence analysis was used to identify associations between groups of different classifications. In a second step, two-table multivariate analyses were used to characterize the relationships between species classifications and eight species traits (consisting of seven populational traits and one functional trait). We found a consensus on the potential size of trees: species were similarly clustered by the five classifications along this trait, which is correlated with turnover rate. More surprisingly, no consensus was found for growth rate or wood density, traits that are correlated with light requirement.

4.
A class of new consensus methods for n-trees (hierarchical clusterings) is proposed. These methods apply systematically to an arbitrary collection of given classifications of a fixed set of taxa, and produce a single consensus classification. They are motivated by the desire that the consensus classification retain as much information as possible from the given classifications, even in the case of only approximate agreement among them. A focus of the paper is the concept of faithfulness of consensus methods; this concept explicates the informal notion of adequate retention of information referred to above, and is proposed as a desirable requirement for consensus methods in general. The new methods are all faithful; they have the additional property that they take hierarchical level into account. Other general properties of consensus methods are investigated, especially with reference to their relation with faithfulness. The most important of these properties is neutrality; loosely speaking a consensus method is neutral if all nontrivial clusters are treated equally in the conditions on the given classifications required to guarantee the appearance of a cluster in the consensus. A central result of the paper is an analogue of the classical impossibility theorem of K. Arrow: with trivial exceptions it is impossible to have a consensus method that is simultaneously faithful and neutral. Thus two intuitively very appealing general properties of consensus methods are seen to be incompatible.

5.
A consensus index method comprises a consensus method and a consensus index that are defined on a common set of objects (e.g. classifications). For each profile of objects, the consensus method returns a consensus object representing information or structure shared among profile objects, while the consensus index returns a quantitative measure of agreement among profile objects. Since the relationship between consensus method and consensus index is poorly understood, we propose simple axioms prescribing it in the most general terms. Many taxonomic consensus index methods violate these axioms because their consensus indices measure consensus object invariants rather than profile agreement. We propose paradigms to obtain consensus index methods that measure agreement and satisfy the axioms. These paradigms salvage concepts underlying consensus index methods violating the axioms. This work was supported in part by the Faculty of Science at Memorial University of Newfoundland, and by the Natural Sciences and Engineering Research Council of Canada under Grant A-4142.

6.
Methods for measuring the degree of agreement between dendrograms (which correspond uniquely to hierarchic nested classifications) are discussed. While some methods take the relative levels of the subsets into account, most presently available methods do not. A previously proposed consensus index is shown to have a maximum possible value that is (for a given number of objects) a function of the balance (the degree to which the subsets of each set are equal in size) of the consensus of the two classifications being compared. A new consensus index is proposed that does not have this defect. In addition, a new index is proposed that is the proportion of all possible classifications that contain the sets that are in the consensus of the two dendrograms being compared. A table giving the numbers of possible bifurcating dendrograms (up to t=100 objects) is furnished to assist in the computation of this latter index.

7.
We propose a novel technique for automatically generating the SCOP classification of a protein structure with high accuracy. We achieve accurate classification by combining the decisions of multiple methods using the consensus of a committee (or an ensemble) classifier. Our technique, based on decision trees, is rooted in machine learning which shows that by judiciously employing component classifiers, an ensemble classifier can be constructed to outperform its components. We use two sequence- and three structure-comparison tools as component classifiers. Given a protein structure and using the joint hypothesis, we first determine if the protein belongs to an existing category (family, superfamily, fold) in the SCOP hierarchy. For the proteins that are predicted as members of the existing categories, we compute their family-, superfamily-, and fold-level classifications using the consensus classifier. We show that we can significantly improve the classification accuracy compared to the individual component classifiers. In particular, we achieve error rates that are 3-12 times less than the individual classifiers' error rates at the family level, 1.5-4.5 times less at the superfamily level, and 1.1-2.4 times less at the fold level.
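
The idea of combining component classifiers into a committee can be illustrated with a minimal sketch. Note that this uses simple majority voting, which is a swapped-in stand-in and not the decision-tree ensemble the abstract describes; the function name `consensus_label`, the `min_agreement` threshold, and the example labels are all hypothetical.

```python
from collections import Counter


def consensus_label(predictions, min_agreement=0.5):
    """Return the label backed by more than `min_agreement` of the voters.

    Returns None when no label reaches that share, which a pipeline could
    treat as "no confident assignment". This is plain majority voting,
    an assumption made for illustration only.
    """
    if not predictions:
        return None
    label, votes = Counter(predictions).most_common(1)[0]
    return label if votes / len(predictions) > min_agreement else None


# Hypothetical fold-level calls from five component tools for one protein.
calls = ["b.1", "b.1", "c.2", "b.1", "a.4"]
print(consensus_label(calls))
```

A stricter `min_agreement` trades coverage for reliability: fewer proteins receive a label, but each label is supported by more of the component tools.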

8.
Bao L, Cui Y. FEBS Letters, 2006, 580(5): 1231-1234
In this work, we studied the correlations between selective constraint, structural environments and functional impacts of non-synonymous single nucleotide polymorphisms (nsSNPs). We found that the relation between solvent accessibility and functional impacts of nsSNPs is not as simple as generally thought. Finer structural classifications need to be taken into account to reveal the complex relations between the characteristics of a structure environment and its influence on the functional impacts of nsSNPs. We introduced two parameters for each structural environment, consensus residue percentage and residue distribution distance, to characterize the selective constraint imposed by the environment. Both parameters significantly correlate with the functional bias of nsSNPs across the structural environments. This result shows that selective constraint underlies the bias of a structural environment towards a certain type of nsSNPs (disease-associated or benign).

9.
The modern classifications of Cronquist, Dahlgren, Takhtajan, and Thorne have been compared with one another and also with those published at the beginning of the 20th century, which comprise the ones by Bessey, Engler, Gobi, and Hallier. Mantel and consensus tests have been used to compare the different matrices taken from the above classifications. Results indicate that all four modern classifications do not differ from one another statistically. Ordinal delimitation has not changed significantly for a century at least: Orders of the modern classifications are similar to those of the past classifications. However, the topology or structure of Cronquist’s and Takhtajan’s classifications differs from that of Bessey’s. Also, Engler’s dicotyledon classification is statistically different from those of the modern systems. Among past classifications, that of Hallier resembles the modern ones most. The resemblance among the modern classifications and, in general, with the past ones can be explained by the similarity in taxonomic principles and in the practice used. Two other factors help in explaining similarities among classifications: cognitive constraint and historical inertia. For instance, the Linnean scheme—upon which all botanical classifications are based—imposes on the latter a structure which allows only with difficulty and approximation the representation of taxon evolution. Moreover, not only have modern authors mutually influenced one another (particularly Cronquist/Takhtajan, Dahlgren/Thorne), but also they have been influenced by past authors. Indeed, modern classifications are a reshuffling of past ones. Also, Engler’s influence is great, especially at the ordinal level. 
For changes and modifications to become effective in future classifications of flowering plants, one will have to minimize, if not avoid, the implicit influence of the modern systems as standard systems, and to count on, among others, molecular data in redefining taxonomic concepts founded on classical morphology, and consequently to remove the prudence that makes us look at classification as a useful convention for which one of the basic criteria remains the stability of taxa recognized long ago.

10.
We analyze 17 studies of the use of urban sustainable development indicators (SDI) in developed western countries. The analysis reveals a lack of consensus not only on the conceptual framework and the approach favored, but also on the selection and optimal number of indicators. First, by performing different classifications and categorizations of SDI we identify problems inherent in territorial practices that use SDI. Second, we argue that the lack of consensus in several steps of the creation of SDI stems notably from the ambiguity in the definitions of sustainable development, objectives for the use of such indicators, the selection method and the accessibility of qualitative and quantitative data. Third, based on the reviewed studies, we propose a selection strategy for SDI through which we demonstrate the need to adopt a parsimonious list of SDI covering the sustainable development components and their constituent categories as broadly as possible while minimizing the number of indicators retained.

11.
The majority of biodiversity assessments use species as the base unit. Recently, a series of studies have suggested replacing numbers of species with higher ranked taxa (genera, families, etc.); a method known as taxonomic surrogacy that has an important potential to save time and resources in assessments of biological diversity. We examine the relationships between taxa and ranks, and suggest that species/higher taxon exchanges are founded on misconceptions about the properties of Linnaean classification. Rank allocations in current classifications constitute a heterogeneous mixture of various historical and contemporary views. Even if all taxa were monophyletic, those referred to the same rank would simply denote separate clades without further equivalence. We conclude that they are no more comparable than any other, non‐nested taxa, such as, for example, the genus Rattus and the phylum Arthropoda, and that taxonomic surrogacy lacks justification. These problems are also illustrated with data of polychaetous annelid worms from a broad‐scale study of benthic biodiversity and species distributions in the Irish Sea. A recent consensus phylogeny for polychaetes is used to provide three different family‐level classifications of polychaetes. We use families as a surrogate for species, and present Shannon‐Wiener diversity indices for the different sites and the three different classifications, showing how the diversity measures rely on subjective rank allocations.
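
The dependence of a diversity index on rank allocation can be shown with a small worked sketch. It uses the standard Shannon-Wiener formula H' = -Σ p_i ln p_i; the species names, counts, and the two family assignments (`families_a`, `families_b`) are invented for illustration and are not data from the study.

```python
import math
from collections import Counter


def shannon_index(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) over non-empty groups."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one individual is required")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pool_by_family(species_counts, family_of):
    """Sum species counts within each family of a given classification."""
    pooled = Counter()
    for species, n in species_counts.items():
        pooled[family_of[species]] += n
    return pooled


# Hypothetical site: four equally abundant species.
species_counts = {"sp_a": 10, "sp_b": 10, "sp_c": 10, "sp_d": 10}
# Two hypothetical family-level classifications of the same species.
families_a = {"sp_a": "F1", "sp_b": "F1", "sp_c": "F2", "sp_d": "F2"}
families_b = {"sp_a": "F1", "sp_b": "F1", "sp_c": "F1", "sp_d": "F2"}

print(shannon_index(species_counts.values()))
print(shannon_index(pool_by_family(species_counts, families_a).values()))
print(shannon_index(pool_by_family(species_counts, families_b).values()))
```

The same site yields a different family-level index under each classification, even though the underlying species data are identical.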

12.
Classification of high-throughput genomic data is a powerful method to assign samples to subgroups with specific molecular profiles. Consensus partitioning is the most widely applied approach to reveal subgroups by summarizing a consensus classification from a list of individual classifications generated by repeatedly executing clustering on random subsets of the data. It is able to evaluate the stability of the classification. We implemented a new R/Bioconductor package, cola, that provides a general framework for consensus partitioning. With cola, various parameters and methods can be user-defined and easily integrated into different steps of an analysis, e.g., feature selection, sample classification or defining signatures. cola provides a new method named ATC (ability to correlate to other rows) to extract features and recommends spherical k-means clustering (skmeans) for subgroup classification. We show that ATC and skmeans have better performance than other commonly used methods by a comprehensive benchmark on public datasets. We also benchmark key parameters in the consensus partitioning procedure, which helps users to select optimal parameter values. Moreover, cola provides rich functionalities to apply multiple partitioning methods in parallel and directly compare their results, as well as rich visualizations. cola can automate the complete analysis and generate a comprehensive HTML report.
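
The summarizing step of consensus partitioning can be sketched as a consensus matrix: for each pair of samples, the fraction of clustering runs that placed them in the same group. This is a generic illustration in Python, not cola's R API; the function name `consensus_matrix`, the use of `None` for samples left out of a random subset, and the example runs are assumptions.

```python
def consensus_matrix(runs, n_samples):
    """Fraction of runs in which each pair of samples was co-clustered.

    `runs` is a list of label lists, one per clustering run; a label of
    None means the sample was not drawn in that run's random subset.
    Pairs are counted only over runs in which both samples were drawn.
    """
    together = [[0] * n_samples for _ in range(n_samples)]
    sampled = [[0] * n_samples for _ in range(n_samples)]
    for labels in runs:
        for i in range(n_samples):
            for j in range(n_samples):
                if labels[i] is None or labels[j] is None:
                    continue
                sampled[i][j] += 1
                if labels[i] == labels[j]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
         for j in range(n_samples)]
        for i in range(n_samples)
    ]


# Three hypothetical runs over four samples; label values are arbitrary
# within a run and need not match across runs.
runs = [
    [0, 0, 1, None],
    [0, 0, None, 1],
    [None, 1, 0, 0],
]
for row in consensus_matrix(runs, 4):
    print(row)
```

Entries close to 0 or 1 indicate pairs that are assigned consistently across runs, which is one way to read the stability of a classification.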

13.
Finfoots (Heliornithidae) were chosen to test the possibility that there has been a dramatic reversal in a suite of morphological characters, intimated by an earlier phylogenetic reconstruction of Gruiformes based on DNA hybridization. There are three nodes where unstudied finfoots could stem from the existing reconstruction. The resulting alternate trees have largely exclusive implications for morphological character suite polarity, biogeography and fossil identifications. A new DNA hybridization study that includes all relevant taxa was intended to form the basis for independent evaluation of the trees, but it produced results that conflict with the earlier DNA study. So, instead, DNA trees were evaluated by their reproducibility and consensus with most-parsimonious trees, biogeography, paleontology and traditional classifications. I concur with traditional classifications that finfoots are monophyletic, and that Limpkin (Gruiformes: Aramidae) is the sister of cranes (Gruiformes: Gruidae). Limpkin is not supported as the sister of the Sungrebe (Heliornis fulica) or as a member of the Heliornithidae, as reported in the earlier DNA study. It is alarming that the gross lack of consensus with traditional characters and concomitant implications for character suite polarity in this case went unquestioned.

14.
Vegetative anatomy and systematics of subtribe Dendrobiinae (Orchidaceae)   Cited by: 5 (self-citations: 0, citations by others: 5)
Anatomy of leaf, stem, and root of more than 100 species in subtribe Dendrobiinae (Orchidaceae) was studied with the light microscope to provide a comparative anatomical treatment of these organs, to serve as an independent source of evidence that might be taxonomically important, and to recommend such reinterpretations of existing classifications as are suggested by a phylogenetic assessment of data. We based our classification on that of Rudolf Schlechter as the most complete and widely accepted today. We found that the anatomy of plants in subtribe Dendrobiinae reflects a high degree of morphological diversity, and many of the anatomical characters appear to be homoplasious. When these anatomical data are used to interpret the systematic relationships among the genera, they indicate that Dendrobium is not monophyletic and that Cadetia and Pseuderia are apparently nested within the structure of Dendrobium when section Grastidium is chosen as a functional outgroup. Lack of resolution in the strict consensus tree illustrates the difficulty of determining the phylogenetic relationships of many of Schlechter's sections using anatomical characters. Nevertheless, we recommend that his sectional classification, with appropriate modifications based on available data, be retained for the present, pending a more detailed understanding of the phylogeny of Dendrobiinae based on morphology, micromorphology, anatomy, and DNA studies.

15.
János Podani. Plant Ecology, 1989, 83(1-2): 111-128
The methodology of comparing the results of multivariate community studies (resemblance matrices, ordinations, hierarchical and nonhierarchical classifications) is reviewed from two viewpoints: basic strategy and measure employed. The basic strategy is determined by 7 choices concerning the type of results, consensus methods or resemblance measures, hypothesis testing or exploratory analysis, lack or presence of reference basis, data set congruence or algorithmic effects, number of factors responsible for differences among results, and the number of properties considered in the comparison. Included is a brief summary of methods applicable to vegetation studies. Examples from a grassland survey demonstrate the utility of comparisons in evaluating the effects of plot size, data type, standardization, taxonomic level and number of species on classifications and ordinations. Abbreviations: OUC = Operational Unit of Comparison; PCA = Principal Components Analysis; PCoA = Principal Coordinates Analysis; SSA = Incremental Sum of Squares Agglomeration.
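
As one concrete instance of a resemblance measure between two nonhierarchical classifications, the sketch below computes the Rand index: the share of object pairs on which the two classifications agree (both group the pair together, or both separate it). The Rand index is chosen here as a well-known example and is an assumption; the abstract does not say which measures the review covers. The plot labels are invented.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of object pairs treated the same way by both classifications."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both classifications must cover the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("at least two objects are required")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical group labels for four plots under two analyses.
analysis_1 = [0, 0, 1, 1]
analysis_2 = [0, 0, 1, 2]
print(rand_index(analysis_1, analysis_2))
```

Because only pair membership is compared, the measure does not depend on how the groups happen to be numbered in each classification.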

16.
A synopsis of the biology of the Ascomycotina. A wide variety of classifications of the Ascomycotina has been proposed but a consensus is being reached on the main orders that it is appropriate to recognize. Most of these orders are well characterized with respect to their ecology and nutritional requirements although defined primarily on morphology. The 43 orders are displayed diagrammatically to illustrate their host and substratum requirements. This display is intended to stimulate argument and research by broadening the consideration of evolutionary pathways to include ecological and nutritional factors. It will also be of value as a teaching aid; overlays can be constructed to show additional features not treated here.

17.
Soils lie at the interface between the atmosphere and the subsurface and are a key component that controls ecosystem services, food production, and many other processes at the Earth’s surface. There is a long-established convention for identifying and mapping soils by texture. These readily available, georeferenced soil maps and databases are used widely in environmental sciences. Here, we show that these traditional soil classifications can be inappropriate, contributing to bias and uncertainty in applications from slope stability to water resource management. We suggest a new approach to soil classification, with a detailed example from the science of hydrology. Hydrologic simulations based on common meteorological conditions were performed using HYDRUS-1D, spanning textures identified by the United States Department of Agriculture soil texture triangle. We consider these common conditions to be: drainage from saturation, infiltration onto a drained soil, and combined infiltration and drainage events. Using a k-means clustering algorithm, we created soil classifications based on the modeled hydrologic responses of these soils. The hydrologic-process-based classifications were compared to those based on soil texture and a single hydraulic property, Ks. Differences in classifications based on hydrologic response versus soil texture demonstrate that traditional soil texture classification is a poor predictor of hydrologic response. We then developed a QGIS plugin to construct soil maps combining a classification with georeferenced soil data from the Natural Resource Conservation Service. The spatial patterns of hydrologic response were more immediately informative, much simpler, and less ambiguous, for use in applications ranging from trafficability to irrigation management to flood control.
The ease with which hydrologic-process-based classifications can be made, along with the improved quantitative predictions of soil responses and visualization of landscape function, suggest that hydrologic-process-based classifications should be incorporated into environmental process models and can be used to define application-specific maps of hydrologic function.

18.
The effect of six resemblance coefficients (taxonomic distance, Manhattan distance, correlations, cosines, and two new general dissimilarity coefficients) on the character stability of classifications based on six data sets was evaluated. The six data sets represent a variety of organisms, and of ratios of number of characters to number of OTUs, and were randomly bipartitioned 100 times. The results of matrix correlations, cophenetic correlations and two consensus measures indicate that no one resemblance coefficient is uniformly better than all others when evaluated in terms of the stability of a classification, although taxonomic distance and Manhattan distance produce relatively more stable classifications than the other resemblance coefficients. An index of dimensionality, the stemminess and cophenetic correlations of classifications were calculated for the six data sets and also for 20 data sets analyzed in an earlier study. Regression analysis of stability on the ratio of number of characters to the number of OTUs, dimensionality, stemminess, and cophenetic correlations explained more than 70% of the variance in stability. Of the four factors, the ratio was by far the most important. Stemminess and dimensionality contributed little when considered singly, and did not add appreciably to the variance explained by ratio and cophenetic correlations. Dedicated to the memory of Prof. J. S. L. Gilmour. His insightful writings on naturalness in classifications paved the way for the development of numerical phenetics.

19.
Ecological Indicators, 2007, 7(2): 329-338
The classification of fish species tolerance to environmental disturbance is often used as a means to assess ecosystem conditions. Its use, however, may be problematic because the approach to tolerance classification is based on subjective judgment. We analyzed fish and physicochemical data from 773 stream sites collected as part of the U.S. Geological Survey's National Water-Quality Assessment Program to calculate tolerance indicator values for 10 physicochemical variables using weighted averaging. Tolerance indicator values (TIVs) for ammonia, chloride, dissolved oxygen, nitrite plus nitrate, pH, phosphorus, specific conductance, sulfate, suspended sediment, and water temperature were calculated for 105 common fish species of the United States. Tolerance indicator values for specific conductance and sulfate were correlated (rho = 0.87), and thus, fish species may be co-tolerant to these water-quality variables. We integrated TIVs for each species into an overall tolerance classification for comparisons with judgment-based tolerance classifications. Principal components analysis indicated that the distinction between tolerant and intolerant classifications was determined largely by tolerance to suspended sediment, specific conductance, chloride, and total phosphorus. Factors such as water temperature, dissolved oxygen, and pH may not be as important in distinguishing between tolerant and intolerant classifications, but may help to segregate species classified as moderate. Empirically derived tolerance classifications were 58.8% in agreement with judgment-derived tolerance classifications. Canonical discriminant analysis revealed that few TIVs, primarily chloride, could discriminate among judgment-derived tolerance classifications of tolerant, moderate, and intolerant. To our knowledge, this is the first empirically based understanding of fish species tolerance for stream fishes in the United States.
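
Weighted averaging, in its basic textbook form, estimates a species' indicator value for a variable as the abundance-weighted mean of that variable over the sites where the species occurs. The sketch below shows only that basic form; the study's exact procedure (any scaling, transformation, or ranking step) is not given in the abstract, and the function name and the abundance and chloride figures are invented for illustration.

```python
def tolerance_indicator_value(abundances, values):
    """Abundance-weighted mean of an environmental variable across sites.

    `abundances[i]` is the species' abundance at site i and `values[i]`
    is the measured variable (for example chloride) at the same site.
    """
    if len(abundances) != len(values):
        raise ValueError("abundances and values must describe the same sites")
    total = sum(abundances)
    if total == 0:
        raise ValueError("species is absent from every site")
    return sum(a * v for a, v in zip(abundances, values)) / total


# Hypothetical species counts and chloride concentrations at three sites.
abundance = [0, 5, 15]
chloride = [10.0, 20.0, 40.0]
print(tolerance_indicator_value(abundance, chloride))
```

Sites where the species is absent carry zero weight, so the value is pulled toward conditions at the sites where the species is most abundant.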

20.