Similar articles
 Found 20 similar articles; search time 31 ms
1.
MALDI mass spectrometry can generate profiles that contain hundreds of biomolecular ions directly from tissue. Spatially-correlated analysis, MALDI imaging MS, can simultaneously reveal how each of these biomolecular ions varies in clinical tissue samples. The use of statistical data analysis tools to identify regions containing correlated mass spectrometry profiles is referred to as imaging MS-based molecular histology because of its ability to annotate tissues solely on the basis of the imaging MS data. Several reports have indicated that imaging MS-based molecular histology may be able to complement established histological and histochemical techniques by distinguishing between pathologies with overlapping/identical morphologies and revealing biomolecular intratumor heterogeneity. A data analysis pipeline that identifies regions of imaging MS datasets with correlated mass spectrometry profiles could lead to the development of novel methods for improved diagnosis (differentiating subgroups within distinct histological groups) and annotating the spatio-chemical makeup of tumors. Here it is demonstrated that highlighting the regions within imaging MS datasets whose mass spectrometry profiles were found to be correlated by five independent multivariate methods provides a consistently accurate summary of the spatio-chemical heterogeneity. The corroboration provided by using multiple multivariate methods, efficiently applied in an automated routine, provides assurance that the identified regions are indeed characterized by distinct mass spectrometry profiles, a crucial requirement for its development as a complementary histological tool. 
When the approach was applied simultaneously to imaging MS datasets from multiple patient samples of intermediate-grade myxofibrosarcoma, a heterogeneous soft tissue sarcoma, nodules with mass spectrometry profiles found to be distinct by five different multivariate methods were detected within morphologically identical regions of all patient tissue samples. To aid the further development of imaging MS-based molecular histology as a complementary histological tool, the Matlab code of the agreement analysis, instructions and a reduced dataset are included as supporting information.
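The idea of keeping only regions on which several multivariate methods agree can be illustrated with a minimal Python sketch. This is not the authors' Matlab agreement analysis; it assumes, purely for illustration, that each method has already been reduced to a boolean pixel mask of the same shape, and the function name `consensus_mask` is hypothetical.

```python
def consensus_mask(masks):
    """Return a mask that is True only where every input mask is True.

    `masks` is a list of 2D lists (rows x columns) of truthy/falsy values,
    one per multivariate method. Reducing each method to a boolean mask
    is an assumption of this sketch, not something the abstract specifies.
    """
    if not masks:
        raise ValueError("need at least one mask")
    rows, cols = len(masks[0]), len(masks[0][0])
    for m in masks:
        if len(m) != rows or any(len(row) != cols for row in m):
            raise ValueError("all masks must have the same shape")
    # A pixel is kept only if all methods flag it.
    return [[all(m[r][c] for m in masks) for c in range(cols)]
            for r in range(rows)]


# Three toy "methods" on a 2 x 3 image.
toy_masks = [
    [[1, 1, 0], [0, 1, 1]],
    [[1, 0, 0], [0, 1, 1]],
    [[1, 1, 0], [1, 1, 0]],
]
agreed = consensus_mask(toy_masks)
```

Requiring unanimous agreement is the strictest choice; a majority vote would be a looser alternative under the same assumptions.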

2.
Xu S  Rao N  Chen X  Zhou B 《Biotechnology letters》2011,33(5):889-896
The accuracy of prediction methods based on power spectrum analysis depends on the threshold that is used to discriminate between protein coding and non-coding sequences in the genomes of eukaryotes. Because the structure of genes varies among different eukaryotes, it is difficult to determine the best prediction threshold for a eukaryote relying only on prior biological knowledge. To improve the accuracy of prediction methods based on power spectrum analysis, we developed a novel method based on a bootstrap algorithm to infer organism-specific optimal thresholds for eukaryotes. As prior information, our method requires the input of only a few annotated protein coding regions from the organism being studied. Our results show that using the calculated optimal thresholds for our test datasets, the average prediction accuracy of our method is 81%, an increase of 19% over that obtained using the same empirical threshold P = 4 for all datasets. The proposed method is simple and convenient and easily applied to infer optimal thresholds that can be used to predict coding regions in the genomes of most organisms.
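A rough Python sketch of the kind of quantity such a threshold is applied to: the period-3 peak of a DNA power spectrum divided by the average spectral power. The abstract does not give the authors' formulas or their bootstrap procedure, so everything here is an assumption: the normalization is one common convention, and `bootstrap_threshold` is a hypothetical stand-in that bootstraps a low quantile of the ratios observed in annotated coding regions.

```python
import cmath
import random


def period3_snr(seq):
    """Ratio of the power at frequency 1/3 to the mean power (assumed convention)."""
    seq = seq.upper()
    n = len(seq)
    if n < 3:
        return 0.0
    peak = 0.0
    counts = []
    for base in "ACGT":
        positions = [i for i, b in enumerate(seq) if b == base]
        counts.append(len(positions))
        # Fourier sum of the binary indicator sequence at frequency 1/3.
        s = sum(cmath.exp(-2j * cmath.pi * i / 3) for i in positions)
        peak += abs(s) ** 2
    # Mean total power over the nonzero DFT frequencies, via Parseval's theorem.
    noise = (n * sum(counts) - sum(c * c for c in counts)) / (n - 1)
    return peak / noise if noise > 0 else 0.0


def bootstrap_threshold(coding_seqs, n_boot=200, quantile=0.05, seed=0):
    """Hypothetical procedure: bootstrap a low quantile of coding-region ratios."""
    rng = random.Random(seed)
    ratios = [period3_snr(s) for s in coding_seqs]
    estimates = []
    for _ in range(n_boot):
        # Resample the observed ratios with replacement.
        sample = sorted(rng.choice(ratios) for _ in ratios)
        estimates.append(sample[int(quantile * (len(sample) - 1))])
    return sum(estimates) / len(estimates)


toy_coding = ["ATG" * 30, "ATGGCC" * 15, "GCTGCAGAT" * 10]
toy_threshold = bootstrap_threshold(toy_coding)
```

The toy sequences are artificial repeats chosen only to make the period-3 signal obvious; real coding regions give much weaker peaks, which is why the choice of threshold matters.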

3.
4.
Current evidence of phenological responses to recent climate change is substantially biased towards northern hemisphere temperate regions. Given regional differences in climate change, shifts in phenology will not be uniform across the globe, and conclusions drawn from temperate systems in the northern hemisphere might not be applicable to other regions on the planet. We conduct the largest meta-analysis to date of phenological drivers and trends among southern hemisphere species, assessing 1208 long-term datasets from 89 studies on 347 species. Data were mostly from Australasia (Australia and New Zealand), South America and the Antarctic/subantarctic, and focused primarily on plants and birds. This meta-analysis shows an advance in the timing of spring events (with a strong Australian data bias), although substantial differences in trends were apparent among taxonomic groups and regions. When only statistically significant trends were considered, 82% of terrestrial datasets and 42% of marine datasets demonstrated an advance in phenology. Temperature was most frequently identified as the primary driver of phenological changes; however, in many studies it was the only climate variable considered. When precipitation was examined, it often played a key role but, in contrast with temperature, the direction of phenological shifts in response to precipitation variation was difficult to predict a priori. We discuss how phenological information can inform the adaptive capacity of species, their resilience, and constraints on autonomous adaptation. We also highlight serious weaknesses in past and current data collection and analyses at large regional scales (with very few studies in the tropics or from Africa) and dramatic taxonomic biases. 
If accurate predictions regarding the general effects of climate change on the biology of organisms are to be made, data collection policies targeting data-deficient regions and taxa need to be financially and logistically supported.

5.
Three-way PCA has been applied to proteomic pattern images to identify the classes of samples present in the dataset. The developed method has been applied to two different datasets: a rat sera dataset, consisting of five samples of healthy Wistar rat sera and five samples of nicotine-treated Wistar rat sera; and a human lymph-node dataset, consisting of four healthy lymph-nodes and four lymph-nodes affected by a non-Hodgkin's lymphoma. The method successfully identified the classes of samples present in both groups of 2D-PAGE images, and it allowed us to identify the regions of the two-dimensional maps responsible for the differences between the classes in both the rat sera and the human lymph-node datasets.

6.
Planning for resilience is the focus of many marine conservation programs and initiatives. These efforts aim to inform conservation strategies for marine regions to ensure they have inbuilt capacity to retain biological diversity and ecological function in the face of global environmental change, particularly changes in climate and resource exploitation. In the absence of direct biological and ecological information for many marine species, scientists are increasingly using spatially-explicit, predictive-modeling approaches. Through improved access to multibeam sonar and underwater video technology, these models provide spatial predictions of the most suitable regions for an organism at resolutions previously not possible. However, sensible-looking, well-performing models can provide very different predictions of distribution depending on which occurrence dataset is used. To examine this, we construct species distribution models for nine temperate marine sedentary fishes for a 25.7 km² study region off the coast of southeastern Australia. We use generalized linear model (GLM), generalized additive model (GAM) and maximum entropy (MAXENT) approaches to build models based on co-located occurrence datasets derived from two underwater video methods (i.e. baited and towed video) and fine-scale multibeam sonar-based seafloor habitat variables. Overall, this study found that the choice of modeling approach did not considerably influence the prediction of distributions based on the same occurrence dataset. However, greater dissimilarity between model predictions was observed across the nine fish taxa when the two occurrence datasets were compared (relative to models based on the same dataset). Based on these results, it is difficult to identify any general trends regarding which video method provides more reliable occurrence datasets. Nonetheless, we suggest that predictions reflect the species' apparent distribution (i.e. a combination of species distribution and the probability of detecting it). Consequently, we also encourage researchers and marine managers to carefully interpret model predictions.

7.
8.
9.
10.
11.
Xu Y  Duanmu H  Chang Z  Zhang S  Li Z  Li Z  Liu Y  Li K  Qiu F  Li X 《Molecular biology reports》2012,39(2):1627-1637
Copy number variations (CNVs) are one type of human genetic variation and are pervasive in the human genome. It has been confirmed that they can play a causal role in complex diseases. Previous studies of CNVs focused more on identifying disease-specific CNV regions or candidate genes in these CNV regions, and less on the synergistic actions between genes in CNV regions and other genes. Our research combined CNVs with related gene co-expression to reconstruct a gene co-expression network using single nucleotide polymorphism microarray datasets and gene microarray datasets of breast cancer, then extracted the densely connected modules and analyzed their functions. Interestingly, all of these modules' functions were related to breast cancer according to our enrichment analysis, and most of the genes in these modules have been reported to be involved in breast cancer. Our findings suggest that integrating CNVs and gene co-expression relations is a viable way to analyze the roles of CNV genes and their synergistic genes in breast cancer, and they provide novel insight into the pathological mechanism of breast cancer.

12.
Understanding how epigenetic variation in non-coding regions is involved in distal gene-expression regulation is an important problem. Regulatory regions can be associated with genes using large-scale datasets of epigenetic and expression data. However, for regions of complex epigenomic signals and enhancers that regulate many genes, it is difficult to understand these associations. We present StitchIt, an approach to dissect epigenetic variation in a gene-specific manner for the detection of regulatory elements (REMs) without relying on peak calls in individual samples. StitchIt segments epigenetic signal tracks over many samples to generate the location and the target genes of a REM simultaneously. We show that this approach leads to a more accurate and refined REM detection compared to standard methods even on heterogeneous datasets, which are challenging to model. Also, StitchIt REMs are highly enriched in experimentally determined chromatin interactions and expression quantitative trait loci. We validated several newly predicted REMs using CRISPR-Cas9 experiments, thereby demonstrating the reliability of StitchIt. StitchIt is able to dissect regulation in superenhancers and predicts thousands of putative REMs that go unnoticed using peak-based approaches, suggesting that a large part of the regulome might be uncharted waters.

13.
Detecting genetic variation is one of the main applications of high-throughput sequencing, but is still challenging wherever aligning short reads poses ambiguities. Current state-of-the-art variant calling approaches avoid such regions, arguing that it is necessary to sacrifice detection sensitivity to limit false discovery. We developed a method that links candidate variant positions within repetitive genomic regions into clusters. The technique relies on a resource, a thesaurus of genetic variation, that enumerates genomic regions with similar sequence. The resource is computationally intensive to generate, but once compiled can be applied efficiently to annotate and prioritize variants in repetitive regions. We show that thesaurus annotation can reduce the rate of false variant calls due to mappability by up to three orders of magnitude. We apply the technique to whole genome datasets and establish that called variants in low mappability regions annotated using the thesaurus can be experimentally validated. We then extend the analysis to a large panel of exomes to show that the annotation technique opens possibilities to study variation in hitherto hidden and under-studied parts of the genome.

14.
The determination of factors that influence protein conformational changes is very important for the identification of potentially amyloidogenic and disordered regions in polypeptide chains. In our work we introduce a new parameter, mean packing density, to detect both amyloidogenic and disordered regions in a protein sequence. It has been shown that regions with strong expected packing density are responsible for amyloid formation. Our predictions are consistent with known disease-related amyloidogenic regions for eight of 12 amyloid-forming proteins and peptides in which the positions of amyloidogenic regions have been revealed experimentally. Our findings support the concept that the mechanism of amyloid fibril formation is similar for different peptides and proteins. Moreover, we have demonstrated that regions with weak expected packing density are responsible for the appearance of disordered regions. Our method has been tested on datasets of globular proteins and long disordered protein segments, and it shows improved performance over other widely used methods. Thus, we demonstrate that the expected packing density is a useful value with which one can predict both intrinsically disordered and amyloidogenic regions of a protein based on sequence alone. Our results are important for understanding the structural characteristics of protein folding and misfolding.

15.
Marine protected areas (MPAs) can be an effective tool for marine biodiversity conservation, yet decision-makers usually have limited and biased datasets with which to make decisions about where to locate MPAs. Using commonly available abiotic and biotic datasets, I asked how many datasets are necessary to achieve robust patterns of conservation importance. I applied a decision support tool for marine protected area design in two regions of British Columbia, Canada, and sequentially excluded the datasets with the most limited geographic distribution. I found that the reserve selection method was robust to some missing datasets. The removal of up to 15 of the most geographically limited datasets did not significantly change the geographic patterns of the importance of areas for conservation. Indeed, including abiotic datasets plus at least 12 biotic datasets resulted in a spatial pattern similar to including all available biotic datasets. It was best to combine abiotic and biotic datasets in order to ensure habitats and species were represented. Patterns of clustering differed according to whether I used one set alone or both combined. Biotic datasets served as better surrogates for abiotic datasets than vice versa, and both represented more biodiversity features than randomly selected reserves. These results should provide encouragement to decision-makers engaged in MPA planning with limited spatial data.

16.
RRE allows the extraction of non-coding regions surrounding a coding sequence [i.e. gene upstream region, 5'-untranslated region (5'-UTR), introns, 3'-UTR, downstream region] from annotated genomic datasets available at NCBI. AVAILABILITY: RRE parser and web-based interface are accessible at http://www.bioinformatica.unito.it/bioinformatics/rre/rre.html

17.
MALDI mass spectrometry can simultaneously measure hundreds of biomolecules directly from tissue. Using essentially the same technique but different sample preparation strategies, metabolites, lipids, peptides and proteins can be analyzed. Spatially correlated analysis, imaging MS, enables the distributions of these biomolecular ions to be simultaneously measured in tissues. A key advantage of imaging MS is that it can annotate tissues based on their MS profiles and thereby distinguish biomolecularly distinct regions even if they were unexpected or are not distinct using established histological and histochemical methods. One example is neuropeptide and metabolite changes following transient electrophysiological events such as cortical spreading depression (CSD): spreading events of massive neuronal and glial depolarisations that occur in one hemisphere of the brain and do not pass to the other hemisphere, enabling the contralateral hemisphere to act as an internal control. A proof-of-principle imaging MS study, including 2D and 3D datasets, revealed substantial metabolite and neuropeptide changes immediately following CSD events which were absent in the protein imaging datasets. The large, high-dimensionality 3D datasets make even rudimentary contralateral comparisons difficult to visualize. Instead, non-negative matrix factorization (NNMF), a multivariate factorization tool that is adept at highlighting latent features, such as MS signatures associated with CSD events, was applied to the 3D datasets. NNMF confirmed that the protein dataset did not contain substantial contralateral differences, while these were present in the neuropeptide dataset.
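For readers unfamiliar with NNMF, the following is a minimal pure-Python sketch of one standard variant, the multiplicative-update scheme of Lee and Seung for the squared-error objective. The abstract does not say which NNMF algorithm or software the study used, so this is an illustration of the general technique only; the function names, the toy matrix (rows as pixels, columns as m/z channels) and the `eps` safeguard are assumptions of the sketch.

```python
import random


def matmul(a, b):
    """Plain-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy data: four "pixels" by three "m/z channels", built from two spectral patterns.
toy_v = [[1, 0, 2], [2, 0, 4], [0, 3, 1], [0, 6, 2]]
toy_w, toy_h = nmf(toy_v, rank=2)
```

In this reading, each row of `toy_h` plays the role of a latent spectral signature and each column of `toy_w` gives its spatial weight per pixel, which is what makes the factors interpretable as images.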

18.
The EpiGRAPH web service enables biologists to uncover hidden associations in vertebrate genome and epigenome datasets. Users can upload sets of genomic regions and EpiGRAPH will test multiple attributes (including DNA sequence, chromatin structure, epigenetic modifications and evolutionary conservation) for enrichment or depletion among these regions. Furthermore, EpiGRAPH learns to predictively identify similar genomic regions. This paper demonstrates EpiGRAPH's practical utility in a case study on monoallelic gene expression and describes its novel approach to reproducible bioinformatic analysis.

19.
Metabarcoding data generated using next-generation sequencing (NGS) technologies are overwhelmed with rare taxa and skewed in Operational Taxonomic Unit (OTU) frequencies comprised of few dominant taxa. Low frequency OTUs comprise a rare biosphere of singleton and doubleton OTUs, which may include many artifacts. We present an in-depth analysis of global singletons across sixteen NGS libraries representing different ribosomal RNA gene regions, NGS technologies and chemistries. Our data indicate that many singletons (average of 38 % across gene regions) are likely artifacts or potential artifacts, but a large fraction can be assigned to lower taxonomic levels with very high bootstrap support (∼32 % of sequences to genus with ≥90 % bootstrap cutoff). Further, many singletons clustered into rare OTUs from other datasets highlighting their overlap across datasets or the poor performance of clustering algorithms. These data emphasize a need for caution when discarding rare sequence data en masse: such practices may result in throwing the baby out with the bathwater, and underestimating the biodiversity. Yet, the rare sequences are unlikely to greatly affect ecological metrics. As a result, it may be prudent to err on the side of caution and omit rare OTUs prior to downstream analyses.
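The terms singleton and doubleton can be made concrete with a short Python sketch. It assumes, for illustration only, that a library is summarized as a mapping from OTU identifier to total read count; the function name `rare_otu_summary` and the toy counts are hypothetical and are not taken from the study.

```python
def rare_otu_summary(otu_counts):
    """Summarize singleton (1 read) and doubleton (2 reads) OTUs in one library."""
    singletons = sorted(otu for otu, c in otu_counts.items() if c == 1)
    doubletons = sorted(otu for otu, c in otu_counts.items() if c == 2)
    total_reads = sum(otu_counts.values())
    rare_reads = len(singletons) + 2 * len(doubletons)
    return {
        "singletons": singletons,
        "doubletons": doubletons,
        # Share of OTUs that are rare versus share of reads they account for.
        "rare_otu_fraction": (len(singletons) + len(doubletons)) / len(otu_counts),
        "rare_read_fraction": rare_reads / total_reads,
    }


# Toy library: two dominant OTUs and three rare ones.
toy_counts = {"otu1": 500, "otu2": 300, "otu3": 2, "otu4": 1, "otu5": 1}
summary = rare_otu_summary(toy_counts)
```

In the toy library the rare OTUs are most of the OTUs but a tiny share of the reads, which is the kind of skew the abstract describes.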

20.
Intrinsic disorder in transcription factors   Cited by: 8 (self-citations: 0, citations by others: 8)
