Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Cover-abundance estimates are commonly employed in phytosociological investigations to record the performance of species. Because the coded values are on an ordinal scale of measurement, various authors have suggested that some transformation is necessary before such values can be used for classification and ordination. However, it is not clear that transformation is a sufficient treatment, and it would seem preferable to use ordinal data directly. In this paper we examine such direct use of partial rankings and show that several dissimilarity measures can be defined for this case without invoking any transformations. They include dissimilarity measures associated with various rank correlation measures and with distances between strings; all the measures are variant forms of Hausdorff's interset distance. Certain other kinds of data, such as those employing dominant and subdominant species and the dry-weight-rank estimation of biomass, are also on an ordinal scale and could be analysed using similar techniques. To illustrate the approach, a string dissimilarity measure is used to analyse a set of data from Slovakian grasslands which appear to reflect a simple gradient. The original data were recorded with 10 classes of performance and are analysed using hierarchical and nondeterministic, overlapping, classifications.

2.
Abstract. This article investigates whether the Braun‐Blanquet abundance/dominance (AD) scores that commonly appear in phytosociological tables can properly be analysed by conventional multivariate analysis methods such as Principal Components Analysis and Correspondence Analysis. The answer is a definite NO. The source of problems is that the AD values express species performance on a scale, namely the ordinal scale, on which differences are not interpretable. There are several arguments suggesting that no matter which methods have been preferred in contemporary numerical syntaxonomy and why, ordinal data should be treated in an ordinal way. In addition to the inadmissibility of arithmetic operations with the AD scores, these arguments include interpretability of dissimilarities derived from ordinal data, consistency of all steps throughout the analysis and universality of the method which enables simultaneous treatment of various measurement scales. All the ordination methods that are commonly used, for example, Principal Components Analysis and all variants of Correspondence Analysis as well as standard cluster analyses such as Ward's method and group average clustering, are inappropriate when using AD data. Therefore, the application of ordinal clustering and scaling methods to traditional phytosociological data is advocated. Dissimilarities between relevés should be calculated using ordinal measures of resemblance, and ordination and clustering algorithms should also be ordinal in nature. A good ordination example is Non‐metric Multidimensional Scaling (NMDS), as long as it is calculated from an ordinal dissimilarity measure such as the Goodman & Kruskal γ coefficient; for clustering, the new OrdClAn‐H and OrdClAn‐N methods are recommended.
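To make the idea of an ordinal resemblance measure concrete, the following minimal sketch computes the Goodman & Kruskal γ coefficient between two relevés. It assumes the AD scores have already been coded as integers that preserve their order; only the sign of each comparison is used, never the size of a difference. The conversion to a dissimilarity, (1 − γ)/2, and the value returned when every pair is tied are illustrative assumptions, not choices taken from the article.

```python
from itertools import combinations


def goodman_kruskal_gamma(x, y):
    """Gamma for two relevés given as equal-length lists of ordinal codes.

    x[i] and y[i] are the ordinal scores of species i in the two relevés.
    """
    if len(x) != len(y):
        raise ValueError("relevés must cover the same species list")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # Only the sign of the product matters, so the scale stays ordinal.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return 0.0  # assumption: treat "all pairs tied" as no association
    return (concordant - discordant) / (concordant + discordant)


def gamma_dissimilarity(x, y):
    """Assumed rescaling of gamma from [-1, 1] to a dissimilarity in [0, 1]."""
    return (1.0 - goodman_kruskal_gamma(x, y)) / 2.0
```

A dissimilarity matrix built this way could then be passed to any NMDS routine that accepts precomputed dissimilarities.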

3.
Mark Dale 《Plant Ecology》1985,63(2):79-88
This paper describes some methods that can be used to compare the phytosociological structure of plant communities using some graph theoretic properties of the directed graphs that represent them. In such a graph, the species are represented by points and the association of species A with species B is represented by a directed line segment going from B to A. Two communities can be compared using simple indices to measure the similarity of their species-lists (point similarity) and of the species associations in them (line similarity). A more sophisticated and informative measure of line similarity is the probability that, given the number of points shared by two graphs, they have at least as many lines in common as they are observed to have. A formula for calculating that probability is given here. The graphs of community structure can also be compared with respect to the homogeneity of the distribution of the lines among the points, a property related to the number of species that are important in determining the composition of the community. These techniques are illustrated using the graphs of the phytosociological structure of intertidal seaweed communities on the southeast coast of Nova Scotia. Nomenclature follows: South and Cardinal (1970)
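The abstract does not say which indices are used for point and line similarity, so the sketch below substitutes the Jaccard index purely for illustration. Each community is assumed to be given as a set of species names and a set of directed edges written as (from, to) tuples, so the association of A with B appears as ("B", "A"). The function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two collections; two empty collections count as identical."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def point_similarity(species_1, species_2):
    """Similarity of the species lists (graph points) of two communities."""
    return jaccard(species_1, species_2)


def line_similarity(edges_1, edges_2):
    """Similarity of the directed associations (graph lines) of two communities.

    Direction matters: ("B", "A") and ("A", "B") are different lines.
    """
    return jaccard(edges_1, edges_2)
```

For example, communities with species {A, B, C} and {B, C, D} share two of four species, giving a point similarity of 0.5.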

4.
Zhang S  Chang Z  Li Z  DuanMu H  Li Z  Li K  Liu Y  Qiu F  Xu Y 《Gene》2012,497(1):58-65
Phenotypic similarity is correlated with a number of measures of gene function, such as relatedness at the level of direct protein-protein interaction. The phenotypic effect of a deleted or mutated gene, which is one part of gene annotation, has attracted broad attention. However, there have been few measures to study phenotypic similarity with the data from the Human Phenotype Ontology (HPO) database; therefore, more analogous measures should be developed and investigated. We used five semantic similarity-based measures (Jiang and Conrath, Lin, Schlicker, Yu and Wu) to calculate the human phenotypic similarity between genes (PSG) with data from the HPO database, and evaluated their accuracy with information on protein-protein interaction, protein complex, protein family, gene function or DNA sequence. Compared with randomly selected gene pairs, the results of these methods were statistically significant (all P<0.001). Furthermore, we assessed the performance of these five measures by receiver operating characteristic (ROC) curve analysis, and found that most of them performed better than the previous methods. This work showed that these semantic similarity-based measures for calculating PSG are effective for hierarchically structured data. Our study contributes to the development and optimization of novel algorithms for PSG calculation and provides more alternative methods to researchers as well as tools and directions for PSG study.

5.
Missing data are commonly encountered using multilocus, fragment‐based (dominant) fingerprinting methods, such as random amplified polymorphic DNA (RAPD) or amplified fragment length polymorphism (AFLP). Data sets containing missing data have been analysed by eliminating those bands or samples with missing data, assigning values to missing data or ignoring the problem. Here, we present a method that uses random assignments of band presence–absence to the missing data, implemented by the computer program famd (available from http://homepage.univie.ac.at/philipp.maria.schlueter/famd.html ), for analyses based on pairwise similarity and Shannon's index. When missing values group in a data set, sample or band elimination is likely to be the most appropriate action. However, when missing values are scattered across the data set, minimum, maximum and average similarity coefficients are a simple means of visualizing the effects of missing data on tree structure. Our approach indicates the range of values that a data set containing missing data points might generate, and forces the investigator to consider the effects of missing values on data interpretation.
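The random-assignment idea can be sketched as follows. This is not the famd program or its interface; it is a small illustration that assumes band profiles are lists of 1 (present), 0 (absent) and None (missing), uses the Jaccard coefficient as a stand-in similarity measure, and fills each missing band with presence at an assumed probability of 0.5.

```python
import random


def jaccard_bands(a, b):
    """Jaccard similarity of two complete presence/absence band profiles."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 1.0


def fill_missing(profile, rng, p_present=0.5):
    """Replace each None with a random 0/1; p_present is an assumed parameter."""
    return [
        (1 if rng.random() < p_present else 0) if band is None else band
        for band in profile
    ]


def similarity_range(a, b, replicates=1000, seed=0):
    """Minimum, mean and maximum similarity over random fills of missing bands."""
    rng = random.Random(seed)
    values = [
        jaccard_bands(fill_missing(a, rng), fill_missing(b, rng))
        for _ in range(replicates)
    ]
    return min(values), sum(values) / len(values), max(values)
```

A wide gap between the minimum and maximum for a pair of samples signals that the missing bands could change that pair's position in a tree.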

6.
The mapping and monitoring of Swiss mires has so far relied on a classification system based on expert judgement, which was not supported by a quantitative vegetation analysis and which did not include all wetland vegetation types described in the country. Based on a spatially representative sample of 17,608 relevés from 112 Swiss mires, we address the following questions: (1) How abundant are wetland vegetation types (phytosociological alliances) in Swiss mires? (2) How are they distributed across the country, and is there a regional pattern? (3) How clearly are they separated from each other? (4) How clear and reliable is their ecological interpretation? Using published wetland vegetation relevés and lists of diagnostic species for phytosociological units (associations and alliances) established by experts, we developed a numerical method for assigning relevés to units through the calculation of similarity indices. We applied this method to our sample of 17,608 relevés and estimated the total area covered by each vegetation type in Switzerland. We found that vegetation types not included in previous mapping were either rare in Switzerland (partly due to mire drainage) or poorly distinguished from other vegetation units. In an ordination, the Swiss mire vegetation formed a triangular gradient system with the Sphagnion medii, the Caricion davallianae and the Phragmition australis as extreme types. Phytosociological alliances were clearly separated in a subset of 2,265 relevés, which had a strong similarity to one particular association, but poorly separated across all relevés, of which many could not be unequivocally assigned to one association. However, ecological gradients were reflected equally well by the vegetation types in either case. Overall, the phytosociological alliances distinguished until now proved to be suitable schemes for describing and interpreting vegetation gradients. Nevertheless, we see an urgent need to establish a database of Swiss wetland relevés for a more reliable definition of some vegetation units.

7.
From a strictly statistical perspective, most of the commonly used statistical tests cannot be performed on vegetation data obtained using a non-random sampling design. Despite this, non-randomly sampled plots such as phytosociological relevés still make sense: because they may focus on objectives not appropriately addressed by random sampling, such as the study of rare plant communities or species; and because random sampling is often more time-demanding and expensive. Considering the huge body of phytosociological data available, an interesting question arises: if we compare randomly and non-randomly sampled data sets, to what extent do the results of our analyses differ with respect to various species and vegetation parameters? We present an attempt to tackle this question by comparing two data sets collected in a 25 km² area close to the city of Bremen, northwestern Germany: the first data set consisted of 30 subjectively (non-randomly) placed, homogeneous plots across different plant communities, each of which was laid out in a nested design including 9 sizes from 0.5 m² to 1,000 m². The second data set consisted of 30 (again nested) plots randomly selected and located with a GPS device; plots were rejected only if they were for some reason inaccessible. The data collection was the same for both data sets: presence-absence of all vascular plants was recorded for the different plot sizes, and soil samples were collected for the determination of the values of some important environmental variables. For the comparison of the two data sets, we used either the complete data sets or sub-sets of those plots located in woodlands. The main results included the following: (1) Species abundance patterns: Random sampling resulted in a larger number of common and a smaller number of rare species than non-random sampling. (2) Species richness at different spatial scales: For the small plot sizes, the number of species in the non-randomly placed plots was higher than in the randomly placed plots, while the differences were less pronounced at larger spatial scales. As a consequence, the parameters of species-area curves also differed between the data sets, especially in the sub-set including woodland plots. (3) Vegetation differentiation: In random sampling, there was considerable redundancy, i.e., there were several plots with high floristic similarity. (4) Vegetation-environment relationships: The ordination scores of the non-randomly placed plots showed a larger number of significant correlations to soil parameters than the scores of randomly placed plots. The results suggest that conclusions drawn from the analysis of non-randomly placed plots such as phytosociological relevés may be biased, especially regarding estimates of species abundance and species richness patterns.

8.
Abstract. The program JUICE was designed as a Microsoft® WINDOWS® application for editing, classification and analysis of large phytosociological tables and databases. This software, with a current maximum capacity of 30 000 relevés in one table, includes many functions for easy manipulation of table and header data. Various options include classification using COCKTAIL and TWINSPAN methods, calculation of interspecific associations, fidelity measures, average Ellenberg indicator values, preparation of synoptic tables, automatic sorting of relevé tables, and export of table data into other applications (word processors, spreadsheet programs or mapping packages). JUICE is optimized for use in association with TURBOVEG which is the most widespread database program for storing phytosociological data in Europe.

9.
Summary. Some characteristics of the more commonly used similarity coefficients have been discussed. Coefficients with undesirable properties include: the product moment correlation coefficient, some information statistics, the relative homogeneity function, the weighted similarity coefficient, the Euclidean (absolute) distance, Gleason's, Ellenberg's and Spatz's similarity indices, and the absolute value function. Of the six coefficients that were tested with two sets of phytosociological data, the Canberra measure and the absolute Euclidean distance were the least successful, with at least one data set, in providing classifications that were similar to the classifications obtained by the Braun-Blanquet sorting technique. The standard Euclidean distance and the similarity ratio had intermediate success. The Czekanowski coefficient, especially in its relativized form (= relative absolute value function), was the most successful. This latter coefficient is cover-weighted and therefore the Canberra measure, although it has some undesirable characteristics, may be valuable for investigating relevé similarities that are based on species with low cover. The transformation of Coetzee & Werger (1973) appears to be appropriate as a conversion of the cover-abundance values. The mean cover percent values corresponding to the cover-abundance values gave poor results. The relativized Czekanowski coefficient should be suitable at the lower syntaxonomical levels. At higher levels, qualitative coefficients may be sufficient for determining similarities between syntaxa. Qualitative coefficients will not always be successful at the lower levels; this was illustrated with the test data. In this study, the clustering procedure of group average sorting was used to construct the dendrogram. It gives an average similarity value within the dendrogram groups. These values can be used to give quantitative definitions to syntaxonomic rank. This research was supported by the University of Minnesota. A grant from the UM Computer Centre is acknowledged. I am grateful to Prof. E. Cushing, Dr. E. van der Maarel, and Prof. L. Orlóci for criticism which has improved the paper.
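As a rough illustration of the coefficient singled out above, the sketch below computes the Czekanowski coefficient in its usual quantitative form, 2·Σmin(x, y) / (Σx + Σy), and a relativized variant. It assumes that "relativized" means dividing each relevé's cover values by that relevé's total before comparison; the abstract does not spell this out, so treat it as an assumption. Inputs are taken to be non-negative cover values on a scale where arithmetic is meaningful (for example, transformed cover-abundance values).

```python
def czekanowski(x, y):
    """Quantitative Czekanowski similarity of two relevés (lists of cover values)."""
    total = sum(x) + sum(y)
    if total == 0:
        return 1.0  # assumption: two empty relevés are treated as identical
    return 2.0 * sum(min(a, b) for a, b in zip(x, y)) / total


def relativize(x):
    """Scale a relevé so that its cover values sum to 1 (assumed relativization)."""
    total = sum(x)
    return [v / total for v in x] if total else list(x)


def relativized_czekanowski(x, y):
    """Czekanowski similarity after relativizing each relevé by its own total."""
    return czekanowski(relativize(x), relativize(y))
```

Two relevés with the same species proportions but different total cover, such as [10, 20, 0] and [1, 2, 0], score low on the raw coefficient and 1 on the relativized one, which shows how relativization removes the effect of total cover.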

10.
Abstract. The relationship between mean Ellenberg indicator values (IV) per vegetation relevé and environmental parameters measured in the field usually shows a large variation. We tested the hypothesis that this variation is caused by bias dependent on the phytosociological class. For this purpose we collected data containing vegetation relevés and measured soil pH (3631 records) or mean spring groundwater level (MSL, 1600 records). The relevés were assigned to vegetation types by an automated procedure. Regression of the mean indicator values for acidity on soil pH and the mean indicator values for moisture on MSL gave percentages of explained variance similar to values reported earlier in the literature. When the phytosociological class was added as an explanatory factor the explained variance increased considerably. Regression lines per vegetation type were estimated, many of which were significantly different from each other. In most cases the intercepts were different, but in some cases their slopes differed as well. The results show that Ellenberg indicator values for acidity and moisture appear to be biased towards the values that experts expect for the various phytosociological classes. On the basis of the results, we advise using Ellenberg IVs only for comparison within the same vegetation type.

11.
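Advances in methods for measuring the complexity of community structure   Cited by: 2 (self-citations: 0, citations by others: 2)
Jin Sen (金森) 《植物生态学报》2006,30(6):1030-1039
This paper reviews progress in methods for measuring the complexity of community structure. According to the basis on which they are built, existing methods are divided into three classes: complexity measures based on diversity, measures based on computational complexity, and complexity measures based on geometric features. Each class is introduced and its strengths and weaknesses are assessed, and issues that deserve attention in future research are identified. The review shows that existing measures of community structural complexity generally suffer from poor discriminating power. For diversity-based measures of structural complexity, an objective method for determining the weights of the measured attributes is still lacking. Some existing structural measures based on computational complexity are too closely related to diversity indices and remain immature, and their ecological meaning is unclear, while other computational complexity indices have yet to be tested in practice. Future work needs to examine in depth and systematically how to build computational complexity measures of community structure that are both discriminating and distinct from diversity in concept and in value, how to determine attribute weights in complexity measures in a scientifically sound way, and how to link measures of structural complexity to functional processes. Because the methods are similar, measures of community structural complexity can also be applied to studies of structural complexity at other scales.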

12.
Coral reef monitoring programmes exist in all regions of the world, recording reef attributes such as coral cover, fish biomass and macroalgal cover. Given the cost of such monitoring programmes, and the degraded state of many of the world’s reefs, understanding how reef monitoring data can be used to shape management decisions for coral reefs is a high priority. However, there is no general guide to understanding the ecological implications of the data in a format that can trigger a management response. We attempt to provide such a guide for interpreting the temporal trends in 41 coral reef monitoring attributes, recorded by seven of the largest reef monitoring programmes. We show that only a small subset of these attributes is required to identify the stressors that have impacted a reef (i.e. provide a diagnosis), as well as to estimate the likely recovery potential (prognosis). Two of the most useful indicators, turf algal canopy height and coral colony growth rate, are not commonly measured, and we strongly recommend their inclusion in reef monitoring. The diagnosis and prognosis system that we have developed may help guide management actions and provides a foundation for further development as biological and ecological insights continue to grow.

13.
DNA microarrays (gene chips), frequently used in biological and medical studies, measure the expressions of thousands of genes per sample. Using microarray data to build accurate classifiers for diseases is an important task. This paper introduces an algorithm, called Committee of Decision Trees by Attribute Behavior Diversity (CABD), to build highly accurate ensembles of decision trees for such data. Since a committee's accuracy is greatly influenced by the diversity among its member classifiers, CABD uses two new ideas to "optimize" that diversity, namely (1) the concept of attribute behavior-based similarity between attributes, and (2) the concept of attribute usage diversity among trees. The ideas are effective for microarray data, since such data have many features and behavior similarity between genes can be high. Experiments on microarray data for six cancers show that CABD outperforms previous ensemble methods significantly and outperforms SVM, and show that the diversified features used by CABD's decision tree committee can be used to improve performance of other classifiers such as SVM. CABD has potential for other high-dimensional data, and its ideas may apply to ensembles of other classifier types.

14.
Abstract. In the framework of the European Vegetation Survey common data standards are proposed for recording phytosociological relevés for syntaxonomical classification. The authors wish to establish the notion that common data standards for recording phytosociological data can only be advantageous for advancing the credibility and application of vegetation science, and may stimulate other projects.

15.
Analysis of mitochondrial DNA sequence variation has been used extensively to study the evolutionary relationships of individuals and populations, both within and across species. So ubiquitous and easily acquired are mtDNA data that it has been suggested that such data could serve as a taxonomic 'barcode' for an objective species classification scheme. However, there are technical pitfalls associated with the acquisition of mtDNA data. One problem is the presence of translocated pieces of mtDNA in the nuclear genome of many taxa that may be mistaken for authentic organellar mtDNA. We assessed the extent to which such 'numt' sequences may pose an overlooked problem in analyses of mtDNA from humans and apes. Using long-range polymerase chain reaction (PCR), we generated necessarily authentic mtDNA sequences for comparison with sequences obtained using typical methods for a segment of the mtDNA control region in humans, chimpanzees, bonobos, gorillas and orangutans. Results revealed that gorillas are notable for having such a variety of numt sequences bearing high similarity to authentic mtDNA that any analysis of mtDNA using standard approaches is rendered impossible. Studies on humans, chimpanzees, bonobos or orangutans are apparently less problematic. One implication is that explicit measures need to be taken to authenticate mtDNA sequences in newly studied taxa or when any irregularities arise. Furthermore, some taxa may not be amenable to analysis of mtDNA variation at all.

16.
Abstract. Large phytosociological data sets of three types of grassland and three types of forest vegetation from the Czech Republic were analysed with a focus on plot size used in phytosociological sampling and on the species‐area relationship. The data sets included 12975 relevés, sampled by different authors in different parts of the country between 1922 and 1999. It was shown that in the grassland data sets, the relevés sampled before the 1960s tended to have a larger plot size than the relevés made later on. No temporal variation in plot sizes used was detected in forest relevés. Species‐area curves fitted to the data showed unnatural shapes, with levelling‐off or even a decrease at plot sizes larger than average. This distortion is explained by the subjective, preferential method of field sampling used in phytosociology. When making relevés in species‐poor vegetation, researchers probably tend to use larger plots in order to include more species. The reason for this may be that a higher number of species gives a higher probability of including presumed diagnostic species, so that the relevé can be more easily classified in the Braun‐Blanquet classification system. This attitude of phytosociologists has at least two consequences: (1) in phytosociological databases species‐poor vegetation types are underrepresented or relevés are artificially biased towards higher species richness; (2) the suitability of phytosociological data for species richness estimation is severely limited.

17.
Many authors apply statistical tests to sets of relevés obtained using non-random methods to investigate phytosociological and ecological relationships. Frequently applied tests include the t-test, ANOVA, Mann-Whitney test, Kruskal-Wallis test, chi-square test (of independence, goodness-of-fit, and homogeneity), Kolmogorov-Smirnov test, concentration analysis, tests of linear correlation and Spearman rank correlation coefficient, computer intensive methods (such as randomization and re-sampling) and others. I examined the extent of reliability of the results of such tests applied to non-random data by examining the tests' requirements according to statistical theory. I conclude that when used for such data, the statistical tests do not provide reliable support for the inferences made, because non-randomness of samples violates the demand for observations to be independent, and different parts of the investigated communities do not have an equal chance to be represented in the sample. Additional requirements, e.g. of normality and homoscedasticity, were also neglected in several cases. The importance of data satisfying the basic requirements set by statistical tests is stressed.

18.
Editing and other manipulations of phytosociological data are considered parts of a more comprehensive data management program. The topic is reviewed against the background of an increasing number of presentations in Vegetatio. It is observed that the newer editing programs do not add anything new to the already existing battery of programs. Yet, there is a need for further discussions of some risks in data manipulations, notably a priori deletion of relevés and species transformation of field scores, and incorporation of suitable devices in the packages available. Also, the evaluation of classification results and the allocation of new material to existing classification systems need more attention. So do the problems concerning the production of structured phytosociological tables, as outcomes of multivariate data treatments.

19.
When a protein sequence does not share any significant sequence similarity with a protein of known structure, homology modeling cannot be applied. However, many novel and interesting methods, such as secondary structure prediction, fold recognition, and prediction of long-range interactions, are being developed and have been shown to be reasonably successful in predicting protein structures from sequence data and evolutionary information. The a priori evaluation of the correctness of a prediction obtained by one of these methods is however often problematic. Consequently, it is important to use all available information provided by as many different methods as possible and all the available experimental data about the protein of interest, since the consistency of the results is indicative of the reliability of the prediction. Hence the need has arisen for suitable tools able to compare results provided by different methods and evaluate their consistency. We have therefore constructed GLASS, a general platform to read, visualize, compare, and evaluate prediction results from many different sources and to project these prediction results into three dimensions. In addition, GLASS allows the comparison of selected parameters calculated for a model with the distribution observed in real protein structures, thus providing an easy way to test new methods for evaluating the likelihood of different structural models. GLASS can be considered as a “workbench” for structural predictions useful to both experimentalists and theoreticians. Proteins 30:339–351, 1998. © 1998 Wiley-Liss, Inc.

20.
MOTIVATION: Significance analysis of differential expression in DNA microarray data is an important task. Much of the current research is focused on developing improved tests and software tools. The task is difficult not only owing to the high dimensionality of the data (number of genes), but also because of the often non-negligible presence of missing values. There is thus a great need to reliably impute these missing values prior to the statistical analyses. Many imputation methods have been developed for DNA microarray data, but their impact on statistical analyses has not been well studied. In this work we examine how missing values and their imputation affect significance analysis of differential expression. RESULTS: We develop a new imputation method (LinCmb) that is superior to the widely used methods in terms of normalized root mean squared error. Its estimates are the convex combinations of the estimates of existing methods. We find that LinCmb adapts to the structure of the data: If the data are heterogeneous or if there are few missing values, LinCmb puts more weight on local imputation methods; if the data are homogeneous or if there are many missing values, LinCmb puts more weight on global imputation methods. Thus, LinCmb is a useful tool to understand the merits of different imputation methods. We also demonstrate that missing values affect significance analysis. Two datasets, different amounts of missing values, different imputation methods, the standard t-test and the regularized t-test and ANOVA are employed in the simulations. We conclude that good imputation alleviates the impact of missing values and should be an integral part of microarray data analysis. The most competitive methods are LinCmb, GMC and BPCA. Popular imputation schemes such as SVD, row mean, and KNN all exhibit high variance and poor performance. The regularized t-test is less affected by missing values than the standard t-test. 
AVAILABILITY: Matlab code is available on request from the authors.
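The two ingredients named in the abstract, a convex combination of estimates and the normalized root mean squared error, can be sketched as below. This is not the authors' LinCmb code: the weights are supplied by hand rather than learned, and the normalization (RMSE divided by the root mean square of the true values) is one common choice adopted here as an assumption. All names are hypothetical.

```python
import math


def nrmse(estimates, truth):
    """RMSE of imputed values divided by the root mean square of the true values."""
    mse = sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)
    scale = sum(t ** 2 for t in truth) / len(truth)
    return math.sqrt(mse / scale)


def convex_combination(estimate_sets, weights):
    """Weighted average of several methods' estimates for the same missing entries.

    Weights must be non-negative and sum to 1, which keeps each combined value
    between the smallest and largest individual estimate.
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(estimate_sets[0])
    return [
        sum(w * est[i] for w, est in zip(weights, estimate_sets))
        for i in range(n)
    ]
```

In a simulation study, known entries are hidden, imputed by each method, combined, and scored with nrmse against the hidden truth; a lower score means better imputation.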
