Similar literature
1.

Background  

Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-seq) or ChIP followed by genome tiling array analysis (ChIP-chip) have become standard technologies for genome-wide identification of DNA-binding protein target sites. A number of algorithms have been developed in parallel that allow identification of binding sites from ChIP-seq or ChIP-chip datasets and subsequent visualization in the University of California Santa Cruz (UCSC) Genome Browser as custom annotation tracks. However, summarizing these tracks can be a daunting task, particularly if there are a large number of binding sites or the binding sites are distributed widely across the genome.

2.
3.
Analyses of pairwise relatedness represent a key component to addressing many topics in biology. However, such analyses have been limited for two reasons. First, most available programs provide a means to estimate relatedness based on only a single estimator, making comparison across estimators difficult. Second, all programs to date have been platform specific, working only on a specific operating system. This has the undesirable outcome of making choice of relatedness estimator limited by operating system preference, rather than being based on scientific rationale. Here, we present a new R package, called related, that can calculate relatedness based on seven estimators, can account for genotyping errors, missing data and inbreeding, and can estimate 95% confidence intervals. Moreover, simulation functions are provided that allow for easy comparison of the performance of different estimators and for analyses of how much resolution to expect from a given data set. Because this package works in R, it is platform independent. Combined, this functionality should allow for more appropriate analyses and interpretation of pairwise relatedness and will also allow for the integration of relatedness data into larger R workflows.

4.
5.
SUMMARY: OTUbase is an R package designed to facilitate the analysis of operational taxonomic unit (OTU) data and sequence classification (taxonomic) data. Currently there are programs that will cluster sequence data into OTUs and/or classify sequence data into known taxonomies. However, there is a need for software that can take the summarized output of these programs and organize it into easily accessed and manipulated formats. OTUbase provides this structure and organization within R, to allow researchers to easily manipulate the data with the rich library of R packages currently available for additional analysis. AVAILABILITY: OTUbase is an R package available through Bioconductor. It can be found at http://www.bioconductor.org/packages/release/bioc/html/OTUbase.html.

6.
7.
8.
The vast number of occurrence records currently available offers increasing opportunities for biodiversity data analyses. This amount of data poses new challenges for the reliability and correct interpretation of the results. Indeed, to safely deal with occurrence records, their uncertainty and associated biases should be taken into account. We developed an R package to explicitly include spatial and temporal uncertainties during the mapping and listing of plant occurrence records for a given study area. Our workflow returns two objects: (a) a ‘Map of Relative Floristic Ignorance’ (MRFI), which represents the spatial distribution of the lack of floristic knowledge; (b) a ‘Virtual Floristic List’ (VFL), i.e. a list of taxa potentially occurring in the area with an associated probability of occurrence. The method implemented in the package can manage a large amount of occurrence data and represents relative floristic ignorance across a study area with a sustainable computational effort. Several parameters can be set by the user, conferring high flexibility to the method. Uncertainty is not avoided, but incorporated into biodiversity analyses through appropriate methodological approaches and innovative spatial representations. Our study introduces a workflow that advances the analytical capacity to deal with uncertainty in biological occurrence records, allowing more accurate outputs to be produced.

9.

Background

Pathway enrichment techniques are useful for understanding experimental metabolomics data. Their purpose is to give context to the affected metabolites in terms of the prior knowledge contained in metabolic pathways. However, the interpretation of a prioritized pathway list is still challenging, as pathways show overlap and cross talk effects.

Results

We introduce FELLA, an R package to perform a network-based enrichment of a list of affected metabolites. FELLA builds a hierarchical representation of an organism's biochemistry from the Kyoto Encyclopedia of Genes and Genomes (KEGG), containing pathways, modules, enzymes, reactions and metabolites. In addition to providing a list of pathways, FELLA reports intermediate entities (modules, enzymes, reactions) that link the input metabolites to them. This sheds light on pathway cross talk and potential enzymes or metabolites as targets for the condition under study. FELLA has been applied to six public datasets (three from Homo sapiens, two from Danio rerio and one from Mus musculus) and has reproduced findings from the original studies and from independent literature.

Conclusions

The R package FELLA offers an innovative enrichment concept starting from a list of metabolites, based on a knowledge graph representation of the KEGG database that focuses on interpretability. Besides reporting a list of pathways, FELLA suggests intermediate entities that are of interest per se. Its usefulness has been shown at several molecular levels on six public datasets, including human and animal models. The user can run the enrichment analysis through a simple interactive graphical interface or programmatically. FELLA is publicly available in Bioconductor under the GPL-3 license.

10.
AStream, an R-statistical software package for the curation and identification of feature peaks extracted from liquid chromatography mass spectrometry (LC/MS) metabolomics data, is described. AStream detects isotopic, fragment and adduct patterns by identifying feature pairs that fulfill expected relational patterns. Data reduction by AStream allows compounds to be identified reliably and subsequently linked to metabolite databases. AStream provides researchers with a fast, reliable tool for summarizing metabolomic data, notably reducing curation time and increasing consistency of results. AVAILABILITY: The AStream R package and a study example can be freely accessed at http://www.urr.cat/AStream/AStream.html.

11.
A key benefit of long-read nanopore sequencing technology is the ability to detect modified DNA bases, such as 5-methylcytosine. The lack of R/Bioconductor tools for the effective visualization of nanopore methylation profiles between samples from different experimental groups led us to develop the NanoMethViz R package. Our software can handle methylation output generated from a range of different methylation callers and manages large datasets using a compressed data format. To fully explore the methylation patterns in a dataset, NanoMethViz allows plotting of data at various resolutions. At the sample-level, we use dimensionality reduction to look at the relationships between methylation profiles in an unsupervised way. We visualize methylation profiles of classes of features such as genes or CpG islands by scaling them to relative positions and aggregating their profiles. At the finest resolution, we visualize methylation patterns across individual reads along the genome using the spaghetti plot and heatmaps, allowing users to explore particular genes or genomic regions of interest. In summary, our software makes the handling of methylation signal more convenient, expands upon the visualization options for nanopore data and works seamlessly with existing methylation analysis tools available in the Bioconductor project. Our software is available at https://bioconductor.org/packages/NanoMethViz.
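
The idea of scaling features to relative positions and aggregating their profiles can be illustrated with a minimal Python sketch. This is not NanoMethViz code: the function name `aggregate_relative_profile`, the tuple-based inputs, and the equal-width binning are all assumptions made for illustration, and strand orientation and flanking regions are ignored.

```python
def aggregate_relative_profile(features, calls, n_bins=10):
    """Average methylation probability per relative-position bin.

    features: list of (start, end) genomic intervals (hypothetical input format).
    calls:    list of (position, methylation_probability) pairs.
    Returns a list of n_bins means; a bin with no calls yields None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for start, end in features:
        length = end - start
        if length <= 0:
            continue  # skip empty or malformed intervals
        for pos, prob in calls:
            if start <= pos < end:
                # Rescale the position to [0, 1) within its feature,
                # so features of different lengths become comparable.
                rel = (pos - start) / length
                bin_index = min(int(rel * n_bins), n_bins - 1)
                sums[bin_index] += prob
                counts[bin_index] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


if __name__ == "__main__":
    genes = [(0, 100), (200, 400)]
    calls = [(10, 1.0), (90, 0.5), (220, 0.0)]
    print(aggregate_relative_profile(genes, calls, n_bins=2))
```

Because positions are rescaled before binning, a short and a long feature contribute to the same bins, which is what makes an aggregate profile across a class of features meaningful.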

12.
13.
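Processing and analysis of ChIP-seq data in next-generation sequencing   (Total citations: 1; self-citations: 0; citations by others: 1)
Gao S, Zhang N, Li B, Xu S, Ye YB, Ruan JS. 《遗传》, 2012, 34(6): 773-783
Chromatin immunoprecipitation sequencing (ChIP-seq), which combines chromatin immunoprecipitation (ChIP) with next-generation high-throughput sequencing, has become a key technology in functional genomics, particularly in research on the regulation of gene expression. The massive volume of data produced by ChIP-seq experiments poses new challenges for bioinformatics researchers. Because data-processing techniques in this field lag far behind advances in experimental technology, a systematic introduction to and review of all aspects of ChIP-seq data processing is needed so that more researchers can enter the field and design or improve the relevant algorithms. Using worked examples, this article describes the complete ChIP-seq data workflow in detail and focuses on its main problems and key steps, giving researchers in this area a quick yet in-depth understanding.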

14.
15.
16.
The use of high-density SNP arrays for investigating copy number alterations in clinical tumor samples, with intra-tumor heterogeneity and varying degrees of normal cell contamination, imposes several problems for commonly used segmentation algorithms. This calls for flexibility when setting thresholds for calling gains and losses. In addition, sample normalization can induce artifacts in the copy-number ratios for the non-changed genomic elements in the tumor samples. RESULTS: We present an open source R package, Rseg, which allows the user to define sample-specific thresholds to call gains and losses. It also allows the user to correct for normalization artifacts. AVAILABILITY: The R package, Rseg, is available at: http://www.cs.au.dk/~plamy/Rseg/ and runs on Linux and MS-Windows.
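
A minimal Python sketch of what calling gains and losses with sample-specific thresholds might look like. It does not reproduce Rseg's interface or its segmentation step: the function `call_segments`, the use of per-segment mean log2 ratios as input, and the example threshold values are all hypothetical.

```python
def call_segments(segment_log_ratios, gain_threshold, loss_threshold):
    """Label each segment's mean log2 ratio as 'gain', 'loss' or 'neutral'.

    The thresholds are supplied per sample by the user (assumed values,
    not defaults taken from any package).
    """
    if loss_threshold > gain_threshold:
        raise ValueError("loss_threshold must not exceed gain_threshold")
    calls = []
    for value in segment_log_ratios:
        if value >= gain_threshold:
            calls.append("gain")
        elif value <= loss_threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls


if __name__ == "__main__":
    # A sample with heavy normal-cell contamination has compressed ratios,
    # so the user might choose narrower thresholds for it.
    thresholds = {"pure_sample": (0.3, -0.3), "contaminated_sample": (0.1, -0.1)}
    segments = [0.15, -0.2, 0.02]
    for sample, (gain, loss) in thresholds.items():
        print(sample, call_segments(segments, gain, loss))
```

The same segment values give different calls under the two threshold pairs, which is the flexibility the abstract argues for when tumor purity varies between samples.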

17.
ABSTRACT: We introduce ggbio, a new methodology to visualize and explore genomics annotations and high-throughput data. The plots provide detailed views of genomic regions, summary views of sequence alignments and splicing patterns, and genome-wide overviews with karyogram, circular and grand linear layouts. The methods leverage the statistical functionality available in R, the grammar of graphics and the data handling capabilities of the Bioconductor project. The plots are specified within a modular framework that enables users to construct plots in a systematic way, and are generated directly from Bioconductor data structures. The ggbio R package is available at http://tengfei.github.com/ggbio/.

18.

Background

The immunoglobulins (IG) and the T cell receptors (TR) play a key role in antigen recognition during the adaptive immune response. Recent progress in next-generation sequencing technologies has provided an opportunity for deep profiling of the T cell receptor repertoire. However, specialised software is required for the rational analysis of the massive data generated by next-generation sequencing.

Results

Here we introduce tcR, a new R package that provides a platform for the advanced analysis of T cell receptor repertoires, including diversity measures, identification of shared T cell receptor sequences, computation of gene usage statistics and other widely used methods. The tool has proven its utility in recent research studies.

Conclusions

tcR is an R package for the advanced analysis of T cell receptor repertoires after extraction of primary TR sequences from raw sequencing reads. The stable version can be directly installed from The Comprehensive R Archive Network (http://cran.r-project.org/mirrors.html). The source code and development version are available at tcR GitHub (http://imminfo.github.io/tcr/) along with the full documentation and typical usage examples.

19.
  1. Camera trapping plays an important role in wildlife surveys, and provides valuable information for estimation of population density. While mark-recapture techniques can estimate population density for species that can be individually recognized or marked, there are no robust methods to estimate density of species that cannot be individually identified.
  2. We developed a new approach to estimate population density based on the simulation of individual movement within the camera grid. Simulated animals followed a correlated random walk with the movement parameters of segment length, angular deflection, movement distance and home-range size derived from empirical movement paths. Movement was simulated under a series of population densities. We used the Random Forest algorithm to determine the population density with the highest likelihood of matching the camera trap data. We developed an R package, cameratrapR, to conduct simulations and estimate population density.
  3. Compared with line transect surveys and the random encounter model, cameratrapR provides more reliable estimates of wildlife density with narrower confidence intervals. Functions are provided to visualize movement paths, derive movement parameters, and plot camera trapping results.
  4. The package allows researchers to estimate population sizes/densities of animals that cannot be individually identified, provided that cameras are deployed in a grid pattern.
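
The correlated random walk at the core of this approach can be sketched in a few lines of Python. This is an illustration only, not cameratrapR code: the exponential segment lengths, the Gaussian angular deflection, the circular detection zone, and the names `correlated_random_walk` and `count_detections` are assumptions, and the home-range constraint and the Random Forest step are omitted.

```python
import math
import random


def correlated_random_walk(n_steps, mean_step, turn_sd, start=(0.0, 0.0), seed=None):
    """Simulate a 2-D correlated random walk and return the visited points."""
    rng = random.Random(seed)
    x, y = start
    heading = rng.uniform(0.0, 2.0 * math.pi)  # initial direction is uniform
    path = [(x, y)]
    for _ in range(n_steps):
        # The new heading is the previous one plus a random deflection,
        # which is what makes successive steps correlated.
        heading += rng.gauss(0.0, turn_sd)
        # Segment length drawn from an exponential with mean `mean_step`.
        step = rng.expovariate(1.0 / mean_step)
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        path.append((x, y))
    return path


def count_detections(path, cameras, radius):
    """Count path points lying within `radius` of at least one camera."""
    return sum(
        1
        for (px, py) in path
        if any(math.hypot(px - cx, py - cy) <= radius for (cx, cy) in cameras)
    )


if __name__ == "__main__":
    # A 3 x 3 camera grid with 100 m spacing (hypothetical layout).
    grid = [(100.0 * i, 100.0 * j) for i in range(3) for j in range(3)]
    walk = correlated_random_walk(500, mean_step=20.0, turn_sd=0.4, seed=42)
    print(count_detections(walk, grid, radius=10.0))
```

Repeating such simulations for many animals at each candidate density would give simulated detection counts that can be compared with the observed camera data, which is the role the abstract assigns to the Random Forest step.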

20.
Species occurrence records from a variety of sources are increasingly aggregated into heterogeneous databases and made available to ecologists for immediate analytical use. However, these data are typically biased, i.e. they are not a probability sample of the target population of interest, meaning that the information they provide may not be an accurate reflection of reality. It is therefore crucial that species occurrence data are properly scrutinised before they are used for research. In this article, we introduce occAssess, an R package that enables straightforward screening of species occurrence data for potential biases. The package contains a number of discrete functions, each of which returns a measure of the potential for bias in one or more of the taxonomic, temporal, spatial, and environmental dimensions. Users can opt to provide a set of time periods into which the data will be split; in this case separate outputs will be provided for each period, making the package particularly useful for assessing the suitability of a dataset for estimating temporal trends in species' distributions. The outputs are provided visually (as ggplot2 objects) and do not include a formal recommendation as to whether data are of sufficient quality for any given inferential use. Instead, they should be used as ancillary information and viewed in the context of the question that is being asked, and the methods that are being used to answer it. We demonstrate the utility of occAssess by applying it to data on two key pollinator taxa in South America: leaf-nosed bats (Phyllostomidae) and hoverflies (Syrphidae). In this worked example, we briefly assess the degree to which various aspects of data coverage appear to have changed over time. We then discuss additional applications of the package, highlight its limitations, and point to future development opportunities.

