Similar literature
1.
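DNA methylation is an important epigenetic modification whose level has been found to be closely associated with the onset and progression of disease; cluster analysis of methylation data may reveal new disease subtypes and support effective methods for disease prediction and prognosis. Fuzzy C-means (FCM), one of the traditional clustering methods, suits settings where the feature space is spherically or ellipsoidally distributed and therefore lacks generality. The Illumina Golden Gate platform describes the degree of methylation of a gene by computing the methylation percentage at each of its methylation sites; these values lie in (0,1) and follow a mixture of beta distributions, so FCM cannot be applied to them directly. This paper therefore proposes KL-FCM, a clustering algorithm based on the K-L feature measure, which uses the K-L distance between samples as the criterion for partitioning them. Finally, the KL-FCM algorithm is applied to cluster the IRIS test data set and gene DNA methylation level data. The experimental results show that the method achieves better classification than k-means and traditional FCM at a lower computational load.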
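A minimal sketch of how a K-L-based distance could stand in for the squared Euclidean distance in the FCM membership update. The abstract does not say how the K-L distance is computed, so everything here is an assumption for illustration: each methylation fraction is treated as a Bernoulli parameter, the symmetrised K-L divergence is summed over sites, and the names `sym_kl` and `memberships` are hypothetical.

```python
import numpy as np

def sym_kl(p, q, eps=1e-9):
    """Symmetrised K-L divergence between two vectors of Bernoulli parameters.

    Assumption (not from the paper): each methylation fraction in (0, 1)
    is treated as the parameter of a Bernoulli distribution.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    q = np.clip(np.asarray(q, dtype=float), eps, 1 - eps)
    kl_pq = p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))
    kl_qp = q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))
    return float(np.sum(kl_pq + kl_qp))

def memberships(samples, centres, m=2.0, eps=1e-12):
    """FCM membership update with sym_kl used in place of squared distance."""
    d = np.array([[sym_kl(x, c) for c in centres] for x in samples]) + eps
    inv = d ** (-1.0 / (m - 1.0))          # closer centres get larger weight
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1
```

Each row of the returned matrix holds one sample's memberships across the clusters; a sample that coincides with a centre receives a membership close to 1 for that cluster.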

2.
Mapping historical forest types in Baraga County, Michigan, USA as fuzzy sets   Cited by: 4 (self-citations: 0, citations by others: 4)
Brown, Daniel G. Plant Ecology, 1998, 134(1): 97-111
Data on tree location and species in a portion of Northern Michigan were gathered from General Land Office (GLO) survey notes (ca. 1850), digitized, and generalized to represent forest types. Fuzzy membership values describing the degree of membership of each species in each forest type were derived from (a) semantic information in the forestry literature and (b) a fuzzy clustering routine applied to data from randomly placed circular plots. The fuzzy membership values assigned to each tree point for each forest type were interpolated to form continuous surfaces using kriging and co-kriging. Advantages of this method over traditional discrete mapping methods include: (a) multiple options are available for the display and analysis; (b) classification uncertainty and the continuity of natural vegetation can be represented; and (c) the classification scheme is applied systematically across the entire map area and can be altered to produce alternative maps. The subset of available display and analytical products presented include: discrete forest type maps; a surface representing the confusion between forest types; fuzzy logical overlays of forest types; and discrete class maps with color value altered within each class to indicate degree of confusion at each location.

3.
This paper describes a fuzzy and neuro-fuzzy approach to modelling feeding intensity of Greylag Geese on reed. As a consequence of the presence of some non-measurable or random factors and the heterogeneity of reed and goose behaviour, the relationships between the model variables are often not well known and the data collected have a high degree of uncertainty. A fuzzy approach was selected which can be applied with vague knowledge and data of high uncertainty. Fuzzy logic can be used to handle inexact reasoning in knowledge-based models with fuzzy rules and fuzzy sets to handle uncertainty in data. The neural network technique was applied to develop the fuzzy data-based models. For training, several dataset combinations of three lakes in North Germany were used. The generalisation capability of these models was checked for other lakes. The performance of these models was compared with the results of the fuzzy knowledge-based model developed in the next step. The knowledge base of this model contains the Mamdani-type rules formulated by a domain expert. All models were implemented using the Fuzzy Logic Toolbox of MATLAB®.

4.
Hou et al. (2016) recently developed a water quality index (WQI) for assessing the water quality of five typical reservoirs. Despite the merits of this practical WQI, it does not account for uncertainty, which motivated the present discussion on mitigating uncertainty in water quality assessment. In this regard, the superiority of a fuzzy WQI (FWQI) over a crisp WQI is emphasized. Because FWQI is robust to the uncertainties surrounding data acquisition, applying the fuzzy concept can improve water quality assessment and monitoring and generate results that are more consistent with real-world conditions.

5.
Clustering is an important research area that has practical applications in many fields. Fuzzy clustering has shown advantages over crisp and probabilistic clustering, especially when there are significant overlaps between clusters. Most analytic fuzzy clustering approaches are derived from Bezdek's fuzzy c-means algorithm. One major factor that influences the determination of appropriate clusters in these approaches is an exponent parameter, called the fuzzifier. To our knowledge, no theoretical reason leading to an optimal setting of this parameter is available. This paper presents the development of a heuristic scheme for determining the fuzzifier. This scheme creates close interactions between the fuzzifier and the data set to be clustered. Experimental results in clustering IRIS data and in code book design required for image compression reveal a good performance of our proposal.
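For orientation, the role of the fuzzifier can be read off the standard fuzzy c-means formulation, written out below with m denoting the fuzzifier. The notation (n data points x_i, c cluster centres v_k, memberships u_ik) is chosen here for illustration; the heuristic scheme proposed in the paper is not reproduced.

```latex
% Standard fuzzy c-means objective, with fuzzifier m > 1
J_m = \sum_{i=1}^{n} \sum_{k=1}^{c} u_{ik}^{\,m} \, \lVert x_i - v_k \rVert^2,
\qquad \text{subject to } \sum_{k=1}^{c} u_{ik} = 1 \ \text{for every } i.

% Alternating update rules
u_{ik} = \left[ \sum_{j=1}^{c}
  \left( \frac{\lVert x_i - v_k \rVert}{\lVert x_i - v_j \rVert} \right)^{2/(m-1)}
\right]^{-1},
\qquad
v_k = \frac{\sum_{i=1}^{n} u_{ik}^{\,m} \, x_i}{\sum_{i=1}^{n} u_{ik}^{\,m}}.
```

As m approaches 1 from above the memberships approach a crisp 0/1 assignment, and as m grows large every membership tends to 1/c, which is why the choice of m shapes the clusters so strongly.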

6.
Hall, Kimberly R.; Maruca, Susan L. Plant Ecology, 2001, 156(1): 105-120
Many areas of ecological inquiry require the ability to detect and characterize change in ecological variables across both space and time. The purpose of this study was to investigate ways in which geographic boundary analysis techniques could be used to characterize the pattern of change over space in plant distributions in a forested wetland mosaic. With vegetation maps created using spatially constrained clustering and difference boundary delineation, we examined similarities between the identified boundaries in plant distributions and the occurrence of six species of songbirds. We found that vegetation boundaries were significantly cohesive, suggesting one or more crisp vegetation transition zones exist in the study site. Smaller, less cohesive boundary areas also provided important information about patterns of treefall gaps and dense patches of understory within the study area. Boundaries for songbird abundance were not cohesive, and bird and vegetation difference boundaries did not show significant overlap. However, bird boundaries did overlap significantly with vegetation cluster boundaries. Vegetation clusters delineated using constrained clustering techniques have the potential to be very useful for stratifying bird abundance data collected in different sections of the study site, which could be used to improve the efficiency of monitoring efforts for rare bird species.

7.
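A comparative study of fuzzy C-means clustering and TWINSPAN classification   Cited by: 4 (self-citations: 0, citations by others: 4)
Using the vegetation of the Aber valley in the Snowdonia mountains of north Wales, UK, as an example, fuzzy c-means clustering and TWINSPAN classification were applied and compared. The two methods gave consistent results. Fuzzy c-means clustering also gives the degree of membership between plots and vegetation types and is, to that extent, superior to TWINSPAN.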

8.
Ordination on the basis of fuzzy set theory   Cited by: 4 (self-citations: 0, citations by others: 4)
Fuzzy set theory is an extension of classical set theory where elements of a set have grades of membership ranging from zero for non-membership to one for full membership. Exactly as for classical sets, there exist operators, relations, and mappings appropriate for these fuzzy sets. This paper presents the concepts of fuzzy sets, operations, relations, and mappings in an ecological context. Fuzzy set theory is then established as a theoretical basis for ordination, and is employed in a sequence of examples in an analysis of forest vegetation of western Montana, U.S.A. The example ordinations show how site characteristics can be analyzed for their effect on vegetation composition, and how different site factors can be synthesized into complex environmental factors using the calculus of fuzzy set theory. In contrast to current ordination methods, ordinations based on fuzzy set theory require the investigator to hypothesize an ecological relationship between vegetation and environment, or between different vegetation compositions, before constructing the ordination. The plotted ordination is then viewed as evidence to corroborate or discredit the hypothesis. I am grateful to Dr R. D. Pfister (formerly USDA Forest Service) for kind permission to publish data from a Forest Service study. I would like to gratefully acknowledge the helpful comments and criticisms of Drs. G. Cottam, J. D. Aber, T. F. H. Allen, E. W. Beals, I. C. Prentice, C. G. Lorimer, and two anonymous reviewers. Taxonomic nomenclature follows Hitchcock & Cronquist (1973). I would like to thank the Dean of the College of Letters and Sciences, University of Wisconsin-Madison, for a fellowship which supported this research, and the Department of Botany for computer funds to perform the analyses.
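The membership grades and set operations described in this abstract can be sketched with the standard Zadeh operators (minimum for intersection, maximum for union, one-minus for complement). This is an illustration only: the paper may define its operators differently, and the site variables and grades below are invented.

```python
import numpy as np

def fuzzy_and(a, b):
    """Intersection: element-wise minimum of membership grades."""
    return np.minimum(a, b)

def fuzzy_or(a, b):
    """Union: element-wise maximum of membership grades."""
    return np.maximum(a, b)

def fuzzy_not(a):
    """Complement: one minus the membership grade."""
    return 1.0 - np.asarray(a, dtype=float)

# Hypothetical membership grades of four sites in two fuzzy sets.
high_elevation = np.array([0.0, 0.3, 0.7, 1.0])
moist = np.array([0.9, 0.6, 0.4, 0.1])

# Grade of each site in the combined set "high elevation AND moist".
high_and_moist = fuzzy_and(high_elevation, moist)
```

A combined grade of this kind is one way several site factors could be merged into a single complex environmental factor along which sites are then ordered.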

9.
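A perturbation fuzzy clustering method for herb combination in compound prescriptions   Cited by: 1 (self-citations: 0, citations by others: 1)
The perturbation fuzzy clustering method was used to classify the herb groups of the traditional Chinese medicine compound prescription Guizhi Tang (桂枝汤). The results show that the method outperforms traditional fuzzy clustering, avoids the distortion introduced when a fuzzy equivalence matrix is obtained through the transitive closure for classification, and agrees with the traditional principles of formulating Chinese medicine prescriptions.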
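The distortion mentioned in this abstract can be seen in a small sketch of the traditional route, which turns a fuzzy similarity matrix into a fuzzy equivalence matrix by max-min transitive closure. This shows the baseline only, not the perturbation method; the function names and the 3x3 matrix are hypothetical.

```python
import numpy as np

def maxmin_compose(r, s):
    """Max-min composition: result[i, j] = max over k of min(r[i, k], s[k, j])."""
    return np.max(np.minimum(r[:, :, None], s[None, :, :]), axis=1)

def transitive_closure(r):
    """Square a reflexive, symmetric fuzzy similarity matrix until it stops changing."""
    r = np.asarray(r, dtype=float)
    while True:
        r2 = maxmin_compose(r, r)
        if np.array_equal(r2, r):
            return r
        r = r2

# Hypothetical similarity matrix for three items.
similarity = np.array([[1.0, 0.8, 0.2],
                       [0.8, 1.0, 0.6],
                       [0.2, 0.6, 1.0]])
closure = transitive_closure(similarity)
```

In this example the similarity between the first and third items rises from 0.2 to 0.6 in the closure, so the equivalence matrix no longer matches the original data.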

10.
Questions: Does fuzzy clustering provide an appropriate numerical framework to manage vegetation classifications? What is the best fuzzy clustering method to achieve this? Material: We used 531 relevés from Catalonia (Spain), belonging to two syntaxonomic alliances of mesophytic and xerophytic montane pastures, and originally classified by experts into nine and 13 associations, respectively. Methods: We compared the performance of fuzzy C-means (FCM), noise clustering (NC) and possibilistic C-means (PCM) on four different management tasks: (1) assigning new relevé data to existing types; (2) updating types incorporating new data; (3) defining new types with unclassified relevés; and (4) reviewing traditional vegetation classifications. Results: As fuzzy classifiers, FCM fails to indicate when a given relevé does not belong to any of the existing types; NC might leave too many relevés unclassified; and PCM membership values cannot be compared. As unsupervised clustering methods, FCM is more sensitive than NC to transitional relevés and therefore produces fuzzier classifications. PCM looks for dense regions in the space of species composition, but these are scarce when vegetation data contain many transitional relevés. Conclusions: All three models have advantages and disadvantages, although the NC model may be a good compromise between the restricted FCM model and the robust but impractical PCM model. In our opinion, fuzzy clustering might provide a suitable framework to manage vegetation classifications using a consistent operational definition of vegetation type. Regardless of the framework chosen, national/regional vegetation classification panels should promote methodological standards for classification practices with numerical tools.

11.
12.
13.
An improved algorithm for clustering gene expression data   Cited by: 1 (self-citations: 0, citations by others: 1)
MOTIVATION: Recent advancements in microarray technology allow simultaneous monitoring of the expression levels of a large number of genes over different time points. Clustering is an important tool for analyzing such microarray data, typical properties of which are its inherent uncertainty, noise and imprecision. In this article, a two-stage clustering algorithm, which employs a recently proposed variable string length genetic scheme and a multiobjective genetic clustering algorithm, is proposed. It is based on the novel concept of points having significant membership to multiple classes. An iterated version of the well-known Fuzzy C-Means is also utilized for clustering. RESULTS: The significant superiority of the proposed two-stage clustering algorithm as compared to the average linkage method, Self Organizing Map (SOM) and a recently developed weighted Chinese restaurant-based clustering method (CRC), widely used methods for clustering gene expression data, is established on a variety of artificial and publicly available real life data sets. The biological relevance of the clustering solutions is also analyzed.

14.
MOTIVATION: It is well understood that the successful clustering of expression profiles gives useful clues to the functions of uncharacterized genes. To achieve such a clustering, we investigate a clustering method based on adaptive resonance theory (ART) in this report. RESULTS: We apply Fuzzy ART as a clustering method for analyzing the time series expression data during sporulation of Saccharomyces cerevisiae. The clustering result by Fuzzy ART was compared with those by other clustering methods such as hierarchical clustering, k-means algorithm and self-organizing maps (SOMs). In terms of the mathematical validations, Fuzzy ART achieved the most reasonable clustering. We also verified the robustness of Fuzzy ART using noisy data. Furthermore, we defined the correctness ratio of clustering, which is based on genes whose temporal expressions are characterized biologically. Using this definition, it was shown that the clustering ability of Fuzzy ART was superior to that of other clustering methods such as hierarchical clustering, k-means algorithm and SOMs. Finally, we validate the clustering results by Fuzzy ART in terms of biological functions and evidence. AVAILABILITY: The software is available at http//www.nubio.nagoya-u.ac.jp/proc/index.html

15.
Fuzzy C-means method for clustering microarray data   Cited by: 9 (self-citations: 0, citations by others: 9)
MOTIVATION: Clustering analysis of data from DNA microarray hybridization studies is essential for identifying biologically relevant groups of genes. Partitional clustering methods such as K-means or self-organizing maps assign each gene to a single cluster. However, these methods do not provide information about the influence of a given gene on the overall shape of clusters. Here we apply a fuzzy partitioning method, Fuzzy C-means (FCM), to attribute cluster membership values to genes. RESULTS: A major problem in applying the FCM method for clustering microarray data is the choice of the fuzziness parameter m. We show that the commonly used value m = 2 is not appropriate for some data sets, and that optimal values for m vary widely from one data set to another. We propose an empirical method, based on the distribution of distances between genes in a given data set, to determine an adequate value for m. By setting threshold levels for the membership values, genes which are tightly associated with a given cluster can be selected. Using a yeast cell cycle data set as an example, we show that this selection increases the overall biological significance of the genes within the cluster. AVAILABILITY: Supplementary text and Matlab functions are available at http://www-igbmc.u-strasbg.fr/fcm/
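The membership-threshold selection described above can be sketched as follows. The function name `core_members`, the default threshold and the example matrix are hypothetical; the authors' own Matlab functions are not reproduced here.

```python
import numpy as np

def core_members(u, threshold=0.5):
    """Per cluster, list the row indices whose membership exceeds the threshold.

    u has shape (n_genes, n_clusters); each row holds one gene's FCM memberships.
    """
    u = np.asarray(u, dtype=float)
    return [np.flatnonzero(u[:, k] > threshold).tolist() for k in range(u.shape[1])]

# Hypothetical memberships of three genes in two clusters.
u = np.array([[0.90, 0.10],
              [0.55, 0.45],
              [0.20, 0.80]])
selected = core_members(u, threshold=0.7)
```

With a threshold of 0.7 the second gene, which is split almost evenly between the two clusters, is left out of both, so only tightly associated genes remain.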

16.
A fuzzy coding approach for the analysis of long-term ecological data   Cited by: 15 (self-citations: 1, citations by others: 14)
  • 1 We present an unconventional procedure (fuzzy coding) to structure biological and environmental information, which uses positive scores to describe the affinity of a species for different modalities (i.e. categories) of a given variable. Fuzzy coding is essential for the synthesis of long-term ecological data because it enables analysis of diverse kinds of biological information derived from a variety of sources (e.g. samples, literature).
  • 2 A fuzzy coded table can be processed by correspondence analysis. An example using aquatic beetles illustrates the properties of such a fuzzy correspondence analysis. Fuzzy coded tables were used in all articles of this issue to examine relationships between spatial-temporal habitat variability and species traits, which were obtained from a long-term study of the Upper Rhône River, France.
  • 3 Fuzzy correspondence analysis can be programmed with the equations given in this paper or can be performed using ADE (Environmental Data Analysis) software that has been adapted to analyse such long-term ecological data. On Macintosh Apple™ computers, ADE performs simple linear ordination, more recently developed methods (e.g. principal component analysis with respect to instrumental variables, canonical correspondence analysis, co-inertia analysis, local and spatial analyses), and provides a graphical display of results of these and other types of analysis (e.g. biplot, mapping, modelling curves).
  • 4 ADE consists of a program library that exploits the potential of the HyperCard™ interface. ADE is an open system, which offers the user a variety of facilities to create a specific sequence of programs. The mathematical background of ADE is supported by the algebraic model known as ‘duality diagram’.

17.
Background, aim, and scope  Analysis of uncertainties plays a vital role in the interpretation of life cycle assessment findings. Some of these uncertainties arise from parametric data variability in life cycle inventory analysis. For instance, the efficiencies of manufacturing processes may vary among different industrial sites or geographic regions; or, in the case of new and unproven technologies, it is possible that prospective performance levels can only be estimated. Although such data variability is usually treated using a probabilistic framework, some recent work on the use of fuzzy sets or possibility theory has appeared in the literature. The latter school of thought is based on the notion that not all data variability can be properly described in terms of frequency of occurrence. In many cases, it is necessary to model the uncertainty associated with the subjective degree of plausibility of parameter values. Fuzzy set theory is appropriate for such uncertainties. However, the computations required for handling fuzzy quantities have not been fully integrated with the formal matrix-based life cycle inventory analysis (LCI) described by Heijungs and Suh (2002). Materials and methods  This paper integrates computations with fuzzy numbers into the matrix-based LCI computational model described in the literature. The approach uses fuzzy numbers to propagate the data variability in LCI calculations, and results in fuzzy distributions of the inventory results. The approach is developed based on similarities with the fuzzy economic input–output (EIO) model proposed by Buckley (Eur J Oper Res 39:54–60, 1989). Results  The matrix-based fuzzy LCI model is illustrated using three simple case studies. The first case shows how fuzzy inventory results arise in simple systems with variability in industrial efficiency and emissions data. The second case study illustrates how the model applies for life cycle systems with co-products, and thus requires the inclusion of displaced processes. The third case study demonstrates the use of the method in the context of comparing different carbon sequestration technologies. Discussion  These simple case studies illustrate the important features of the model, including possible computational issues that can arise with larger and more complex life cycle systems. Conclusions  A fuzzy matrix-based LCI model has been proposed. The model extends the conventional matrix-based LCI model to allow for computations with parametric data variability represented as fuzzy numbers. This approach is an alternative or complementary approach to interval analysis, probabilistic or Monte Carlo techniques. Recommendations and perspectives  Potential further work in this area includes extension of the fuzzy model to EIO-LCA models and to life cycle impact assessment (LCIA); development of hybrid fuzzy-probabilistic approaches; and integration with life cycle-based optimization or decision analysis. Additional theoretical work is needed for modeling correlations of the variability of parameters using interacting or correlated fuzzy numbers, which remains an unresolved computational issue. Furthermore, integration of the fuzzy model into LCA software can also be investigated.

18.
Taxonomy-independent analysis plays an essential role in microbial community analysis. Hierarchical clustering is one of the most widely employed approaches to finding operational taxonomic units, the basis for many downstream analyses. Most existing algorithms have quadratic space and computational complexities, and thus can be used only for small or medium-scale problems. We propose a new online learning-based algorithm that simultaneously addresses the space and computational issues of prior work. The basic idea is to partition a sequence space into a set of subspaces using a partition tree constructed using a pseudometric, then recursively refine a clustering structure in these subspaces. The technique relies on new methods for fast closest-pair searching and efficient dynamic insertion and deletion of tree nodes. To avoid exhaustive computation of pairwise distances between clusters, we represent each cluster of sequences as a probabilistic sequence, and define a set of operations to align these probabilistic sequences and compute genetic distances between them. We present analyses of space and computational complexity, and demonstrate the effectiveness of our new algorithm using a human gut microbiota data set with over one million sequences. The new algorithm exhibits a quasilinear time and space complexity comparable to greedy heuristic clustering algorithms, while achieving a similar accuracy to the standard hierarchical clustering algorithm.

19.
Background, Aims and Scope Noise impacts are rarely assessed in Life Cycle Assessment (LCA), probably because of a lack of data, the difficulty of setting up an appropriate assessment method that includes the relevant uncertainties and vagueness, and their site-dependent nature. As with odour, cultural and aesthetic impacts, the evaluation seems to be closely tied to human judgement and perception. Although fuzzy sets have been developed for this purpose since the late '60s and their usefulness has been proven by successful applications, noise impact assessment approaches have been essentially crisp so far. The aim of this paper is to present a method for noise impact assessment based on fuzzy sets with an application to a simple example. Methods The fuzzy noise impact assessment involves: 1) the quality assessment of the site concerned by the noise impact before the occurrence of noise emissions; quality is expressed by a crisp (i.e. non-fuzzy) function depending on variables (the so-called 'primitives'), which are relevant for the evaluation (e.g. the population density, the type of land use,...); 2) the fuzzy representation of the primitives, e.g. their evaluation by means of linguistic variables (such as 'the population density is high') and by fuzzy numbers; 3) the fuzzy representation of the quality, by fuzzifying the crisp function defined in 1) and 4) the fuzzy representation of the noise impact. In the example, the noise impacts of three processes of coal mining and combustion are assessed. Results and Discussion The application example proved the operability of the method. Primitives and noise impact assessment results are represented by fuzzy numbers and intervals that are more informative than crisp numbers for the interpretation of results. The quality and impact assessment results obtained seem to be coherent with the nature of the processes involved and of the variables characterizing them. Conclusion and Outlook Fuzzy intervals and numbers could be more informative and closer to human judgements and perceptions than crisp numbers are, thus improving the pertinence and the interpretation of the results. Despite the increase in sophistication and the fact that the representation of the variables involved in calculations should be developed further (e.g. on the basis of consensus gained in an expert panel), the fuzzy approach seems to be promising for the assessment of noise impacts in LCA.

20.
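高琼 (Gao Qiong). 植物生态学报, 1990, 14(3): 220-225
Clustering methods commonly used in vegetation ecology focus on situations in which vegetation and environmental factors in the study area are discontinuously distributed or change along steep gradients, and they group the individuals in the raw data according to their attributes. Direct fuzzy clustering instead defines a fuzzy relation matrix from the degree of similarity between the attributes of individuals and then takes cut sets of the matrix at different levels, yielding a hierarchical classification. Once the fuzzy relation is fixed, the choice of cut level becomes the decisive factor for the clustering result. To date, the cut level in direct fuzzy clustering has usually been set subjectively by the analyst or determined by stepwise trial and revision, so the clustering result inevitably carries considerable subjectivity and arbitrariness. The author argues that the cut level should be chosen where the fuzzy relation changes sharply, so that the clustering result reflects the structural characteristics of the raw data as far as possible. This principle has been implemented in a general-purpose software package, and example analyses show that cut levels chosen in this way do reflect the characteristics of the raw data fairly objectively and thus yield more reasonable clustering results.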
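One possible reading of the cut-level principle in this abstract is sketched below. It is an interpretation, not the author's software: "where the fuzzy relation changes sharply" is taken to mean the widest gap between consecutive distinct similarity values, and clusters are formed as connected components of the relation at that level. Function names and the example matrix are hypothetical.

```python
import numpy as np

def pick_cut_level(r):
    """Place the cut level in the middle of the widest gap between similarity values.

    Assumption: the largest jump between consecutive distinct off-diagonal
    values marks where the fuzzy relation changes most.
    """
    r = np.asarray(r, dtype=float)
    vals = np.unique(r[np.triu_indices_from(r, k=1)])  # sorted ascending
    if vals.size < 2:
        return float(vals[0]) if vals.size else 1.0
    i = int(np.argmax(np.diff(vals)))
    return float((vals[i] + vals[i + 1]) / 2)

def lambda_cut_clusters(r, lam):
    """Label items so that any chain of similarities >= lam shares one label."""
    r = np.asarray(r, dtype=float)
    n = r.shape[0]
    parent = list(range(n))

    def find(i):
        # Follow parent links up to the representative of i's group.
        while parent[i] != i:
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if r[i, j] >= lam:
                parent[find(j)] = find(i)  # merge the two groups
    return [find(i) for i in range(n)]

# Hypothetical fuzzy similarity matrix for four plots.
relation = np.array([[1.00, 0.90, 0.20, 0.30],
                     [0.90, 1.00, 0.25, 0.20],
                     [0.20, 0.25, 1.00, 0.85],
                     [0.30, 0.20, 0.85, 1.00]])
lam = pick_cut_level(relation)
labels = lambda_cut_clusters(relation, lam)
```

Here the similarity values jump from 0.30 to 0.85, so the cut falls between them and the four plots split into two pairs.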
