Similar literature
20 similar documents found.
1.
2.
Knowledge of the conservation status of species is important information for conservation biology. Therefore, threatened species lists are a powerful tool for conservation planning and prioritization. Our objective is to compare the global, national and state red lists of amphibians in Brazil. Threatened species were categorized according to their listing in one or several of these lists. We analyzed true inconsistencies across lists in order to evaluate the practical consequences of such incongruences for amphibian conservation in Brazil. We recorded a total of 61 threatened amphibian species in Brazil (across all red lists). Only one species, Phrynomedusa fimbriata, was listed as Extinct (in the IUCN, Brazil and São Paulo lists). A total of eleven endemic species are listed as threatened by the global red list but do not appear in Brazil’s national red list, which represents an inconsistency between these lists. In addition, the threat categories of Thoropa lutzi and Thoropa petropolitana, two endemic species, differ between the two lists, which is a further mismatch. These mismatches may be due to several reasons, such as different interpretations of the criteria; different methodologies; different data availability on species; differences in the dates of the assessment processes; the assessors’ attitudes to uncertainty; and outdated red lists. Harmonization among red lists gives a better picture of threatened amphibian diversity across scales and makes it possible to develop global, national and state plans that complement conservation actions in order to maximize the chance of success of these initiatives.

3.
Peak lists are commonly used in NMR as input data for various software tools such as automatic assignment and structure calculation programs. Inconsistencies of chemical shift referencing among different peak lists or between peak and chemical shift lists can cause severe problems during peak assignment. Here we present a simple and robust tool to achieve self-consistency of the chemical shift referencing among a set of peak lists. The Peakmatch algorithm matches a set of peak lists to a specified reference peak list, neither of which has to be assigned. The chemical shift referencing offset between two peak lists is determined by optimizing an assignment-free match score function using either a complete grid search or downhill simplex optimization. It is shown that peak lists from many different types of spectra can be matched reliably as long as they contain at least two corresponding dimensions. Using a simulated peak list, the Peakmatch algorithm can also be used to obtain the optimal agreement between a chemical shift list and experimental peak lists. Combining these features makes Peakmatch a useful tool that can be applied routinely before automatic assignment or structure calculation in order to obtain an optimized input data set.
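The grid-search idea in this abstract can be illustrated with a minimal Python sketch. It is not the Peakmatch implementation: the Gaussian score, the single tolerance `tol` shared by all dimensions, and the function names `match_score` and `grid_search_offset` are assumptions made for illustration only.

```python
import itertools
import math

def match_score(peaks, reference, offset, tol=0.05):
    """Assignment-free score (assumed form): each shifted peak adds a
    Gaussian weight based on its distance to the nearest reference peak."""
    score = 0.0
    for p in peaks:
        shifted = [x + o for x, o in zip(p, offset)]
        nearest = min(math.dist(shifted, r) for r in reference)
        score += math.exp(-0.5 * (nearest / tol) ** 2)
    return score

def grid_search_offset(peaks, reference, span=0.5, step=0.05, tol=0.05):
    """Try every offset on a regular grid in each dimension and return
    the one with the highest match score."""
    n_steps = round(span / step)
    axis = [i * step for i in range(-n_steps, n_steps + 1)]
    ndim = len(peaks[0])
    return max(
        itertools.product(axis, repeat=ndim),
        key=lambda off: match_score(peaks, reference, off, tol),
    )

# Hypothetical 2D peak positions (ppm); the second list is the first
# one with a constant referencing error applied.
reference = [(8.1, 120.5), (7.4, 115.2), (9.0, 128.9)]
peaks = [(h - 0.2, n + 0.1) for h, n in reference]
best = grid_search_offset(peaks, reference)
print(best)
```

A real tool would likely scale each dimension separately (for example 1H versus 15N shifts) and refine the grid result with a local optimizer such as downhill simplex, as the abstract mentions.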

4.
Question: How different are lists of diagnostic species of vegetation units, derived using various fidelity measures, in different contexts and with presence/absence versus cover data? Methods: Six different fidelity measures were calculated for vegetation units of two classified data sets covering contrasting types of Central European vegetation (beech forest and dwarf shrub vegetation). Both statistical and non‐statistical fidelity measures were used, and either species presence/absence or cover was considered. Each measure was calculated on four hierarchical levels and within two different contexts, either within the whole data set or within the next higher level of hierarchical classification. Average similarities of the diagnostic species lists derived from various combinations of fidelity measures and contexts were calculated and visualized using principal coordinate analysis (PCoA). Results: The correlations between fidelity values derived from non‐statistical and statistical measures were rather weak. Nevertheless, diagnostic species lists calculated for the same syntaxon by different measures usually had several species in common. Average similarity between pairs of fidelity measures or contexts (based on the Sørensen similarity index) ranged from 0.21 to 0.92. PCoA clustered individual combinations of fidelity measures and contexts mainly according to the context and the use of presence/absence versus cover data, rather than according to the fidelity measures. Conclusions: The strongest impact on the lists of diagnostic species was not the fidelity measure itself but the context of its application and the use of presence/absence or cover data. Despite the weak correlation between individual fidelity values, traditional (non‐statistical) and statistical measures produce quite similar lists of diagnostic species, provided that the context of the analysis is the same. 
Both approaches have their advantages and disadvantages, and the choice of the appropriate algorithm should depend on the focus of the study.
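The Sørensen similarity index used above to compare diagnostic species lists is 2|A ∩ B| / (|A| + |B|). A minimal Python sketch follows; the species names are illustrative only and are not taken from the study.

```python
def sorensen(list_a, list_b):
    """Sørensen similarity between two species lists, treated as sets."""
    a, b = set(list_a), set(list_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty lists are identical
    return 2 * len(a & b) / (len(a) + len(b))

# Illustrative diagnostic species lists from two hypothetical fidelity measures
by_measure_1 = ["Fagus sylvatica", "Galium odoratum", "Oxalis acetosella"]
by_measure_2 = ["Fagus sylvatica", "Oxalis acetosella",
                "Vaccinium myrtillus", "Calluna vulgaris"]
print(sorensen(by_measure_1, by_measure_2))
```

The index is 0 for lists with no species in common and 1 for identical lists, which is the scale on which the reported range of 0.21 to 0.92 should be read.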

5.
Often, the most informative genes have to be selected from different gene sets and several computer gene ranking algorithms have been developed to cope with the problem. To help researchers decide which algorithm to use, we developed the analysis of gene ranking algorithms (AGRA) system that offers a novel technique for comparing ranked lists of genes. The most important feature of AGRA is that no previous knowledge of gene ranking algorithms is needed for their comparison. Using the text mining system finding-associated concepts with text analysis, AGRA defines what we call biomedical concept space (BCS) for each gene list and offers a comparison of the gene lists in six different BCS categories. The uploaded gene lists can be compared using two different methods. In the first method, the overlap between the BCSs of each pair of gene lists is calculated. The second method offers a text field where a specific biomedical concept can be entered. AGRA searches for this concept in each gene list's BCS, highlights the rank of the concept and offers a visual representation of concepts ranked above and below it. AVAILABILITY AND IMPLEMENTATION: Available at http://agra.fzv.uni-mb.si/, implemented in Java and running on the Glassfish server. CONTACT: simon.kocbek@uni-mb.si.

6.
MOTIVATION: Mass spectrometry experiments in the field of proteomics produce lists containing tens to thousands of identified proteins. With the protein information and property explorer (PIPE), the biologist can acquire functional annotations for these proteins and explore the enrichment of the list, or fraction thereof, with respect to functional classes. These protein lists may be saved for access at a later time or different location. The PIPE is interoperable with the Firegoose and the Gaggle, permitting wide-ranging data exploration and analysis. The PIPE is a rich-client web application which uses AJAX capabilities provided by the Google Web Toolkit, and server-side data storage using Hibernate. AVAILABILITY: http://pipe.systemsbiology.net.

7.
Restriction enzyme lists are presented for the practical working geneticist to update any DNA computer program. These lists combine formerly scattered information and contain all presently known restriction enzymes with a unique recognition sequence, a cut site, or methylation (in)sensitivity. The lists are in the shortest possible form to also be functional with small DNA computer programs, and will produce clear restriction maps without any redundancy or loss of information. The lists discern between commercial and noncommercial enzymes, and prototype enzymes and different isoschizomers are cross-referenced. Differences in general methylation sensitivities and (in)sensitivities against Dam and Dcm methylases of Escherichia coli are indicated. Commercial methylases and intron-encoded endonucleases are included. An address list is presented to contact commercial suppliers. The lists are constantly updated and available in electronic form as pure US ASCII files, and in formats for the DNA computer programs DNA-Strider for Apple Macintosh, and DNAsis for IBM personal computers or compatibles, via e-mail from the internet address netserv@embl-heidelberg.de by sending only the message "help relibrary".

8.

Background

Venn diagrams are commonly used to display list comparison. In biology, they are widely used to show the differences between gene lists originating from different differential analyses, for instance. They thus allow the comparison between different experimental conditions or between different methods. However, when the number of input lists exceeds four, the diagram becomes difficult to read. Alternative layouts and dynamic display features can improve its use and its readability.

Results

jvenn is a new JavaScript library. It processes lists and produces Venn diagrams. It handles up to six input lists and presents results using classical or Edwards-Venn layouts. User interactions can be controlled and customized. Finally, jvenn can easily be embedded in a web page, which makes dynamic Venn diagrams possible.

Conclusions

jvenn is an open source component for web environments helping scientists to analyze their data. The library package, which comes with full documentation and an example, is freely available at http://bioinfo.genotoul.fr/jvenn.
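The list comparison that underlies a Venn diagram can be sketched in a few lines. The example below is written in Python and does not use or reproduce the jvenn JavaScript API; the function name `venn_regions` and the gene identifiers are hypothetical. It assigns every item to the one region defined by the exact combination of lists that contain it.

```python
from collections import defaultdict

def venn_regions(named_lists):
    """Map each exact combination of list names to the items that occur
    in those lists and in no others."""
    membership = defaultdict(set)
    for name, items in named_lists.items():
        for item in set(items):
            membership[item].add(name)
    regions = defaultdict(set)
    for item, names in membership.items():
        regions[frozenset(names)].add(item)
    return dict(regions)

# Hypothetical gene lists from three differential analyses
lists = {
    "A": ["g1", "g2", "g3"],
    "B": ["g2", "g3", "g4"],
    "C": ["g3", "g5"],
}
for names, genes in venn_regions(lists).items():
    print(sorted(names), sorted(genes))
```

With k input lists there are up to 2^k - 1 non-empty regions, which is one way to see why diagrams with more than four lists become hard to read.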

9.
10.
MOTIVATION: The analysis of genome-scale data from different high throughput techniques can be used to obtain lists of genes ordered according to their different behaviours under distinct experimental conditions corresponding to different phenotypes (e.g. differential gene expression between diseased samples and controls, different response to a drug, etc.). The order in which the genes appear in the list is a consequence of the biological roles that the genes play within the cell, which account, at molecular scale, for the macroscopic differences observed between the phenotypes studied. Typically, two steps are followed for understanding the biological processes that differentiate phenotypes at molecular level: first, genes with significant differential expression are selected on the basis of their experimental values and subsequently, the functional properties of these genes are analysed. Instead, we present a simple procedure which combines experimental measurements with available biological information in a way that genes are simultaneously tested in groups related by common functional properties. The method proposed constitutes a very sensitive tool for selecting genes with significant differential behaviour in the experimental conditions tested. RESULTS: We propose the use of a method to scan ordered lists of genes. The method allows the understanding of the biological processes operating at molecular level behind the macroscopic experiment from which the list was generated. This procedure can be useful in situations where it is not possible to obtain statistically significant differences based on the experimental measurements (e.g. low prevalence diseases, etc.). Two examples demonstrate its application in two microarray experiments and the type of information that can be extracted.

11.
MOTIVATION: Two important questions for the analysis of gene expression measurements from different sample classes are (1) how to classify samples and (2) how to identify meaningful gene signatures (ranked gene lists) exhibiting the differences between classes and sample subsets. Solutions to both questions have immediate biological and biomedical applications. To achieve optimal classification performance, a suitable combination of classifier and gene selection method needs to be specifically selected for a given dataset. The selected gene signatures can be unstable and the resulting classification accuracy unreliable, particularly when considering different subsets of samples. Both unstable gene signatures and overestimated classification accuracy can impair biological conclusions. METHODS: We address these two issues by repeatedly evaluating the classification performance of all models, i.e. pairwise combinations of various gene selection and classification methods, for random subsets of arrays (sampling). A model score is used to select the most appropriate model for the given dataset. Consensus gene signatures are constructed by extracting those genes frequently selected over many samplings. Sampling additionally permits measurement of the stability of the classification performance for each model, which serves as a measure of model reliability. RESULTS: We analyzed a large gene expression dataset with 78 measurements of four different cartilage sample classes. Classifiers trained on subsets of measurements frequently produce models with highly variable performance. Our approach provides reliable classification performance estimates via sampling. In addition to reliable classification performance, we determined stable consensus signatures (i.e. gene lists) for sample classes. Manual literature screening showed that these genes are highly relevant to our gene expression experiment with osteoarthritic cartilage. 
We compared our approach to others based on a publicly available dataset on breast cancer. AVAILABILITY: R package at http://www.bio.ifi.lmu.de/~davis/edaprakt

12.
MOTIVATION: Many applications of microarray technology in clinical cancer studies aim at detecting molecular features for refined diagnosis. In this paper, we follow an opposite rationale: we try to identify common molecular features shared by phenotypically distinct types of cancer using a meta-analysis of several microarray studies. We present a novel algorithm to uncover that two lists of differentially expressed genes are similar, even if these similarities are not apparent to the eye. The method is based on the ordering in the lists. RESULTS: In a meta-analysis of five clinical microarray studies we were able to detect significant similarities in five of the ten possible comparisons of ordered gene lists. We included studies in which not a single gene can be significantly associated with outcome. The detection of significant similarities of gene lists from different microarray studies is a novel and promising approach. It has the potential to improve upon specialized cancer studies by exploring the power of several studies in one single analysis. Our method is complementary to previous methods in that it does not rely on strong effects of differential gene expression in a single study but on consistent ones across multiple studies.

13.
MOTIVATION: Many entity taggers and information extraction systems make use of lists of terms of entities such as people, places, genes or chemicals. These lists have traditionally been constructed manually. We show that distributional clustering methods which group words based on the contexts that they appear in, including neighboring words and syntactic relations extracted using a shallow parser, can be used to aid in the construction of term lists. RESULTS: Experiments on learning lists of terms and using them as part of a gene tagger on a corpus of abstracts from the scientific literature show that our automatically generated term lists significantly boost the precision of a state-of-the-art CRF-based gene tagger to a degree that is competitive with using hand curated lists and boosts recall to a degree that surpasses that of the hand-curated lists. Our results also show that these distributional clustering methods do not generate lists as helpful as those generated by supervised techniques, but that they can be used to complement supervised techniques so as to obtain better performance. AVAILABILITY: The code used in this paper is available from http://www.cis.upenn.edu/datamining/software_dist/autoterm/

14.
Although occurrence-based listing methods could provide reliable lists of species composition for a site, the effective reliability of this method to provide more detailed information about species frequency (and abundance) has rarely been tested. In this paper, we compared the species frequencies obtained for the same set of species-rich sites (wetlands of central Italy) from two different methods: McKinnon lists and line transects. In all sites we observed: (i) rapid cumulating curves of line transect abundance frequencies toward the asymptote represented by the maximum value in McKinnon occurrence frequency; (ii) a large number of species with a low frequency in line transects showing a wide range of variation in the frequency obtained from McKinnon lists; (iii) species with a subdominant (>0.02 to <0.05) or dominant (>0.05) frequency in line transects all showing the highest values of McKinnon frequency. McKinnon lists provide only a coarse-grained proxy of the frequency of individuals per species, distinguishing only between common species (having the highest values of McKinnon frequency) and rare species (all the other species). Although McKinnon lists have some points of strength, this method does not discriminate the frequencies inside the subset of common species (sub-dominant and dominant species). Therefore, we suggest a cautionary approach when McKinnon frequencies are used to obtain complex univariate metrics of diversity.

15.
In the phytosociological literature, there are numerous different approaches to the designation of diagnostic species. Frequently, this results in discrepancies between the lists of diagnostic species published for one and the same community. We examined different approaches to determining diagnostic species using as an example Picea abies forests within the broader context of all Central European forests. Diagnostic species of spruce forests were determined from a data set of 20,164 phytosociological relevés of forests from the Eastern Alps, Western Carpathians, and the Bohemian Massif, which included 3,569 relevés of spruce forests. Phi coefficient of association was used to measure species fidelity, and species with the highest fidelities were considered as diagnostic. Diagnostic species were determined in four ways, including (A) comparison of spruce forests among the three mountain ranges, (B) comparison between spruce forests and other forests, performed separately in each of the mountain ranges, (C) simultaneous comparison of spruce forests of each of the mountain ranges with spruce forests of the other two ranges and with the other forests of all ranges, (D) comparison of spruce forests with the other forests, using pooled data sets from the three mountain ranges. The sets of diagnostic species of spruce forests yielded in comparisons A and B were sharply different; the set resulting from comparison C was intermediate between the first two and comparison D resulted in similar diagnostic species as comparison B. In comparison A, spruce forests of the Eastern Alps had a number of diagnostic species, while the spruce forests of the other two mountain ranges had only few diagnostic species. In comparison B, by contrast, the number and quality of diagnostic species decreased from the Bohemian Massif to the Eastern Alps. This exercise points out that lists of diagnostic species published in phytosociological literature are dependent on the context, i.e. the underlying data sets and comparisons: some of these lists are useful for identification of vegetation units at a local scale, some others for distinguishing units within a narrowly delimited community type over a large area. The thoughtless application of published lists of diagnostic species outside of the context for which they were intended should therefore be avoided.
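The phi coefficient of association mentioned above can be computed from four counts of a 2×2 presence/absence table. The sketch below uses the standard phi formula for such a table; the function name and the species counts in the usage example are hypothetical, while the totals 20,164 and 3,569 are the relevé numbers quoted in the abstract.

```python
import math

def phi_coefficient(n_total, n_target, n_species, n_species_in_target):
    """Phi coefficient between species occurrence and membership in a
    target vegetation unit.

    n_total:             all relevés in the data set (N)
    n_target:            relevés belonging to the target unit (Np)
    n_species:           relevés containing the species (n)
    n_species_in_target: relevés of the target unit containing it (np)
    """
    numerator = n_total * n_species_in_target - n_species * n_target
    denominator = math.sqrt(
        n_species * n_target
        * (n_total - n_species) * (n_total - n_target)
    )
    return numerator / denominator

# Hypothetical species found in 2,500 relevés, 1,800 of them spruce forests
print(phi_coefficient(20164, 3569, 2500, 1800))
```

Because `n_total` and `n_target` change with the data set and the set of units being compared, the same species can receive very different fidelity values in comparisons A to D, which is the context dependence the abstract describes.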

16.
The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling biological phenotype in terms of a classification or regression model. Due to resampling protocols or to a meta-analysis comparison, it is often the case that sets of alternative feature lists (possibly of different lengths) are obtained, instead of just one list. Here we introduce a method, based on permutations, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated by finding and comparing gene profiles on a large prostate cancer dataset, consisting of two cohorts of patients from different countries, for a total of 455 samples.

17.
MOTIVATION: A common problem in the emerging field of metabolomics is the consolidation of signal lists derived from metabolic profiling of different cell/tissue/fluid states where a number of replicate experiments was collected on each state. RESULTS: We describe an approach for the consolidation of peak lists based on hierarchical clustering, first within each set of replicate experiments and then between the sets of replicate experiments. The problems of finding the dendrogram tree cutoff which gives the optimal number of peak clusters and the effect of different clustering methods were addressed. When applied to gas chromatography-mass spectrometry metabolic profiling data acquired on Leishmania mexicana, this approach resulted in robust data matrices which completely separated the wild-type and two mutant parasite lines based on their metabolic profile.

18.
In epidemiology, capture–recapture models are commonly used to estimate the size of an unknown population based on several incomplete lists of individuals. The method operates under two main assumptions: independence between the lists (local independence) and homogeneity of capture probabilities of individuals. In practice, these assumptions are rarely satisfied. We introduce a multinomial latent class model that can account for both list dependence and heterogeneity. Parameter estimation is performed by maximizing the conditional likelihood function with the use of the EM algorithm. In addition, a new approach for evaluating the standard errors of the parameter estimates is discussed, which considerably reduces the computational burden associated with the evaluation of the variance of the population size estimate.
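For orientation, the simplest capture–recapture case can be sketched in Python. This is the classical two-list Lincoln–Petersen estimator and Chapman's bias-corrected variant, which rely on exactly the independence and homogeneity assumptions the abstract calls rarely satisfied; it is not the latent class model proposed in the paper. The patient identifiers are hypothetical.

```python
def lincoln_petersen(list_a, list_b):
    """Two-list estimate N = n1 * n2 / m, where m is the number of
    individuals found on both lists."""
    a, b = set(list_a), set(list_b)
    m = len(a & b)
    if m == 0:
        raise ValueError("no overlap between lists; estimate is undefined")
    return len(a) * len(b) / m

def chapman(list_a, list_b):
    """Chapman's variant, which stays finite when the overlap is zero."""
    a, b = set(list_a), set(list_b)
    m = len(a & b)
    return (len(a) + 1) * (len(b) + 1) / (m + 1) - 1

# Hypothetical registries: 60 and 50 patient IDs, 20 of them shared
registry_a = range(0, 60)
registry_b = range(40, 90)
print(lincoln_petersen(registry_a, registry_b))
print(chapman(registry_a, registry_b))
```

If the lists are positively dependent (being on one list makes it more likely to be on the other), the overlap is inflated and both estimators tend to underestimate the population size, which motivates models that relax these assumptions.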

19.
Outcome signature genes in breast cancer: is there a unique set?   Cited by: 9 (self-citations: 0; citations by others: 9)
MOTIVATION: Predicting the metastatic potential of primary malignant tissues has direct bearing on the choice of therapy. Several microarray studies yielded gene sets whose expression profiles successfully predicted survival. Nevertheless, the overlap between these gene sets is almost zero. Such small overlaps were observed also in other complex diseases, and the variables that could account for the differences had evoked a wide interest. One of the main open questions in this context is whether the disparity can be attributed only to trivial reasons such as different technologies, different patients and different types of analyses. RESULTS: To answer this question, we concentrated on a single breast cancer dataset, and analyzed it by a single method, the one which was used by van't Veer et al. to produce a set of outcome-predictive genes. We showed that, in fact, the resulting set of genes is not unique; it is strongly influenced by the subset of patients used for gene selection. Many equally predictive lists could have been produced from the same analysis. Three main properties of the data explain this sensitivity: (1) many genes are correlated with survival; (2) the differences between these correlations are small; (3) the correlations fluctuate strongly when measured over different subsets of patients. A possible biological explanation for these properties is discussed. CONTACT: eytan.domany@weizmann.ac.il SUPPLEMENTARY INFORMATION: http://www.weizmann.ac.il/physics/complex/compphys/downloads/liate/

20.
FiRe is a user-friendly Excel macro designed to survey microarray data rapidly. This software interactively assembles data from different experiments and produces lists of candidate genes according to patterns of gene expression. Furthermore, macros bundled with FiRe can compare lists of genes, merge information from different spreadsheets, link candidates to information available from web-based databases, and produce heat-maps for easy visualization of microarray data. FiRe is freely available at http://www.unifr.ch/plantbio/FiRe/main.html.
