Similar literature
 20 similar documents found.
1.
In vitro selection has been an essential tool in the development of recombinant antibodies against various antigen targets. Deep sequencing has recently been gaining ground as an alternative and valuable method to analyze such antibody selections. The analysis provides a novel and extremely detailed view of selected antibody populations, and allows the identification of specific antibodies using only sequencing data, potentially eliminating the need for expensive and laborious low-throughput screening methods such as enzyme-linked immunosorbent assay. The high cost and the need for bioinformatics experts and powerful computer clusters, however, have limited the general use of deep sequencing in antibody selections. Here, we describe the AbMining ToolBox, an open source software package for the straightforward analysis of antibody libraries sequenced by the three main next generation sequencing platforms (454, Ion Torrent, MiSeq). The ToolBox is able to identify heavy chain CDR3s as effectively as more computationally intense software, and can be easily adapted to analyze other portions of antibody variable genes, as well as the selection outputs of libraries based on different scaffolds. The software runs on all common operating systems (Microsoft Windows, Mac OS X, Linux), on standard personal computers, and sequence analysis of 1–2 million reads can be accomplished in 10–15 min, a fraction of the time of competing software. Use of the ToolBox will allow the average researcher to incorporate deep sequence analysis into routine selections from antibody display libraries.

2.
A software package, IndexToolkit, aimed at overcoming the disadvantage of FASTA-format databases for frequent searching, has been developed to use an indexing strategy that substantially accelerates sequence queries. IndexToolkit includes user-friendly tools and an Application Programming Interface (API) to facilitate indexing, storage and retrieval of protein sequence databases. As open source software, it provides a sequence-retrieval development framework, which is easily extensible for high-speed-request proteomic applications, such as database searching or modification discovery. We applied IndexToolkit to the database search engine pFind to demonstrate its effect. Experimental studies show that IndexToolkit is able to support significantly faster searches of protein databases. AVAILABILITY: The IndexToolkit is free to use under the open source GNU GPL license. The source code and the compiled binary can be freely accessed through the website http://pfind.jdl.ac.cn/IndexToolkit. More detailed information, including screenshots and documentation for users and developers, is also available on this website.

3.
The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

4.
A key benefit of long-read nanopore sequencing technology is the ability to detect modified DNA bases, such as 5-methylcytosine. The lack of R/Bioconductor tools for the effective visualization of nanopore methylation profiles between samples from different experimental groups led us to develop the NanoMethViz R package. Our software can handle methylation output generated from a range of different methylation callers and manages large datasets using a compressed data format. To fully explore the methylation patterns in a dataset, NanoMethViz allows plotting of data at various resolutions. At the sample-level, we use dimensionality reduction to look at the relationships between methylation profiles in an unsupervised way. We visualize methylation profiles of classes of features such as genes or CpG islands by scaling them to relative positions and aggregating their profiles. At the finest resolution, we visualize methylation patterns across individual reads along the genome using the spaghetti plot and heatmaps, allowing users to explore particular genes or genomic regions of interest. In summary, our software makes the handling of methylation signal more convenient, expands upon the visualization options for nanopore data and works seamlessly with existing methylation analysis tools available in the Bioconductor project. Our software is available at https://bioconductor.org/packages/NanoMethViz.

5.
6.
Biologists are increasingly confronted with the challenge of quickly understanding genome-wide biological data, which usually involve a large number of genomic coordinates (e.g. genes) but a much smaller number of samples. To meet the need for data of this shape, we present an open-source package called ‘supraHex’ for training, analysing and visualising omics data. This package devises a supra-hexagonal map to self-organise the input data, offers scalable functionalities for post-analysing the map, and more importantly, allows for overlaying additional data for multilayer omics data comparisons. By applying it to DNA replication timing data of mouse embryogenesis, we demonstrate that supraHex is capable of simultaneously carrying out gene clustering and sample correlation, providing intuitive visualisation at each step of the analysis. By overlaying CpG and expression data onto the trained replication-timing map, we also show that supraHex is able to intuitively capture an inherent relationship between late replication, low CpG density promoters and low expression levels. As part of the Bioconductor project, supraHex makes accessible to a wide community, in a simple way, what would otherwise be a complex framework for the ultrafast understanding of any tabular omics data, both scientifically and artistically. This package can run on Windows, Mac and Linux, and is freely available together with many tutorials featuring real examples at http://supfam.org/supraHex.

7.
Schiffers K  Teal LR  Travis JM  Solan M 《PloS one》2011,6(12):e28028
Bioturbation is one of the most widespread forms of ecological engineering and has significant implications for the structure and functioning of ecosystems, yet our understanding of the processes involved in biotic mixing remains incomplete. One reason is that, despite their value and utility, most mathematical models currently applied to bioturbation data tend to neglect aspects of the natural complexity of bioturbation in favour of mathematical simplicity. At the same time, the abstract nature of these approaches restricts the application of such models to a limited range of users. Here, we contend that a movement towards process-based modelling can improve both the representation of the mechanistic basis of bioturbation and the intuitiveness of modelling approaches. In support of this initiative, we present an open source modelling framework that explicitly simulates particle displacement and a worked example to facilitate application and further development. The framework combines the advantages of rule-based lattice models with the application of parameterisable probability density functions to generate mixing on the lattice. Model parameters can be fitted by experimental data and describe particle displacement at the spatial and temporal scales at which bioturbation data are routinely collected. By using the same model structure across species, but generating species-specific parameters, a generic understanding of species-specific bioturbation behaviour can be achieved. An application to a case study and comparison with a commonly used model attest to the predictive power of the approach.

8.
The internal transcribed spacer (ITS) region of the nuclear ribosomal repeat unit holds a central position in the pursuit of the taxonomic affiliation of fungi recovered through environmental sampling. Newly generated fungal ITS sequences are typically compared against the International Nucleotide Sequence Databases for a species or genus name using the sequence similarity software suite blast. Such searches are not without complications however, and one of them is the presence of chimeric entries among the query or reference sequences. Chimeras are artificial sequences, generated unintentionally during the polymerase chain reaction step, that feature sequence data from two (or possibly more) distinct species. Available software solutions for chimera control do not readily target the fungal ITS region, but the present study introduces a blast-based open source software package (available at http://www.emerencia.org/chimerachecker.html) to examine newly generated fungal ITS sequences for the presence of potentially chimeric elements in batch mode. We used the software package on a random set of 12 300 environmental fungal ITS sequences in the public sequence databases and found 1.5% of the entries to be chimeric at the ordinal level after manual verification of the results. The proportion of chimeras in the sequence databases can be hypothesized to increase as emerging sequencing technologies drawing from pooled DNA samples are becoming important tools in molecular ecology research.

9.

Background  

Virtual screening methods are becoming well established as effective approaches to identify hits, candidates and leads for drug discovery research. Among those, structure based virtual screening (SBVS) approaches aim at docking collections of small compounds in the target structure to identify potent compounds. For SBVS, the identification of candidate pockets in protein structures is a key feature, and recent years have seen increasing interest in developing methods for pocket and cavity detection on protein surfaces.

10.
MedlineR: an open source library in R for Medline literature data mining   Total citations: 3, self-citations: 0, citations by others: 3
SUMMARY: We describe an open source library written in the R programming language for Medline literature data mining. This MedlineR library includes programs to query Medline through the NCBI PubMed database; to construct the co-occurrence matrix; and to visualize the network topology of query terms. The open source nature of this library allows users to extend it freely in the statistical programming language of R. To demonstrate its utility, we have built an application to analyze term-association by using only 10 lines of code. We provide MedlineR as a library foundation for bioinformaticians and statisticians to build more sophisticated literature data mining applications. AVAILABILITY: The library is available from http://dbsr.duke.edu/pub/MedlineR.
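As a rough illustration of what a term co-occurrence matrix looks like, the sketch below counts, for each pair of query terms, how many abstracts mention both. It is written in Python rather than R and is not MedlineR code: the function name `cooccurrence_matrix`, the case-insensitive substring matching, and the toy abstracts are all assumptions made for the example.

```python
from itertools import combinations


def cooccurrence_matrix(terms, documents):
    """Count, for each pair of terms, the documents that mention both.

    Matching is a plain case-insensitive substring test (an assumption
    made for this sketch, not MedlineR's matching rule).
    """
    index = {term: i for i, term in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in terms]
    for text in documents:
        lowered = text.lower()
        present = [t for t in terms if t.lower() in lowered]
        for t in present:
            # Diagonal: number of documents mentioning the term at all.
            matrix[index[t]][index[t]] += 1
        for a, b in combinations(present, 2):
            # Off-diagonal: symmetric count of joint mentions.
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return matrix


# Hypothetical toy abstracts, not real Medline records.
docs = [
    "BRCA1 and TP53 mutations in breast cancer",
    "TP53 regulates apoptosis",
    "BRCA1 interacts with BRCA2",
]
terms = ["BRCA1", "TP53", "BRCA2"]
matrix = cooccurrence_matrix(terms, docs)
for term, row in zip(terms, matrix):
    print(term, row)
```

The off-diagonal entries can be read as edge weights of a term network, which is the kind of topology the abstract says the library visualizes.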

11.

Background

So far, many algorithms have been proposed for the detection of significant genes in microarray analysis problems. Several of those approaches are freely available as R packages, though using them for gene expression analysis is often frustrating for non-bioinformaticians. Besides, only some of those packages offer a complete suite of tools, starting from initial data import and ending with an analysis report. Here we present an R/Bioconductor package that implements a hybrid gene selection method along with a set of functions to facilitate a thorough and convenient gene expression profiling analysis.

Results

mAPKL is an open-source R/Bioconductor package that implements the mAP-KL hybrid gene selection method. The advantage of this method is that it selects a small number of gene exemplars while achieving comparable classification results to other well established algorithms on a variety of datasets and dataset sizes. The mAPKL package is accompanied by extra functionalities including (i) solid data import; (ii) data sampling following a user-defined proportion; (iii) preprocessing through several normalization and transformation alternatives; (iv) classification with the aid of SVM and performance evaluation; (v) network analysis of the significant genes (exemplars), including degree of centrality, closeness, betweenness, clustering coefficient as well as the construction of an edge list table; (vi) gene annotation analysis; (vii) pathway analysis and (viii) auto-generated analysis reporting.

Conclusions

Users are able to run a thorough gene expression analysis in a timely manner, starting from raw data and concluding with the network characteristics of the selected gene exemplars. Detailed instructions and example data are provided in the R package, which is freely available at Bioconductor under the GPL-2 or later license http://www.bioconductor.org/packages/3.1/bioc/html/mAPKL.html.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0719-5) contains supplementary material, which is available to authorized users.

12.
The R453Plus1Toolbox is an R/Bioconductor package for the analysis of 454 Sequencing data. Projects generated with Roche's data analysis software can be imported into R allowing advanced and customized analyses within the R/Bioconductor environment for sequencing data. Several methods were implemented extending the current functionality of Roche's software. These extensions include methods for quality assurance and annotation of detected variants. Further, a pipeline for the detection of structural variants, e.g. balanced chromosomal translocations, is provided. AVAILABILITY: The R453Plus1Toolbox is implemented in R and available at http://www.bioconductor.org/. A vignette outlining typical workflows is included in the package. CONTACT: h.klein@uni-muenster.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

13.
14.
SWATH-MS is an acquisition and analysis technique of targeted proteomics that enables measuring several thousand proteins with high reproducibility and accuracy across many samples. OpenSWATH is popular open-source software for peptide identification and quantification from SWATH-MS data. For downstream statistical and quantitative analysis there exist different tools such as MSstats, mapDIA and aLFQ. However, the transfer of data from OpenSWATH to the downstream statistical tools is currently technically challenging. Here we introduce the R/Bioconductor package SWATH2stats, which allows convenient processing of the data into a format directly readable by the downstream analysis tools. In addition, SWATH2stats allows annotation, analyzing the variation and the reproducibility of the measurements, FDR estimation, and advanced filtering before submitting the processed data to downstream tools. These functionalities are important to quickly analyze the quality of the SWATH-MS data. Hence, SWATH2stats is a new open-source tool that summarizes several practical functionalities for analyzing, processing, and converting SWATH-MS data and thus facilitates the efficient analysis of large-scale SWATH/DIA datasets.

15.
SUMMARY: The survcomp package provides functions to assess and statistically compare the performance of survival/risk prediction models. It implements state-of-the-art statistics to (i) measure the performance of risk prediction models; (ii) combine these statistical estimates from multiple datasets using a meta-analytical framework; and (iii) statistically compare the performance of competitive models.
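The abstract does not say which performance statistics are meant. One widely used measure for risk prediction models is the concordance index, the fraction of comparable subject pairs in which the subject with the earlier observed event was also assigned the higher risk. The Python sketch below illustrates that general idea only; the function name `concordance_index` and the toy data are assumptions, and this is not survcomp's implementation.

```python
def concordance_index(times, events, risks):
    """Concordance index for right-censored survival data.

    times  : observed follow-up time per subject
    events : 1 if the event was observed, 0 if the subject was censored
    risks  : predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four subjects, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is what makes the statistic convenient for comparing competing models on the same data.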

16.
This article describes specific procedures for conducting quality assessment of Affymetrix GeneChip(R) soybean genome data and for performing analyses to determine differential gene expression using the open-source R programming environment in conjunction with the open-source Bioconductor software. We describe procedures for extracting those Affymetrix probe set IDs related specifically to the soybean genome on the Affymetrix soybean chip and demonstrate the use of exploratory plots including images of raw probe-level data, boxplots, density plots and M versus A plots. RNA degradation and recommended procedures from Affymetrix for quality control are discussed. An appropriate probe-level model provides an excellent quality assessment tool. To demonstrate this, we discuss and display chip pseudo-images of weights, residuals and signed residuals and additional probe-level modeling plots that may be used to identify aberrant chips. The Robust Multichip Averaging (RMA) procedure was used for background correction, normalization and summarization of the AffyBatch probe-level data to obtain expression level data and to discover differentially expressed genes. Examples of boxplots and MA plots are presented for the expression level data. Volcano plots and heatmaps are used to demonstrate the use of (log) fold changes in conjunction with ordinary and moderated t-statistics for determining interesting genes. We show, with real data, how implementation of functions in R and Bioconductor successfully identified differentially expressed genes that may play a role in soybean resistance to a fungal pathogen, Phakopsora pachyrhizi. Complete source code for performing all quality assessment and statistical procedures may be downloaded from our web source: http://css.ncifcrf.gov/services/download/MicroarraySoybean.zip.
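For readers unfamiliar with M versus A plots, the quantities on the two axes are simple to compute: M is the log2 ratio of two intensities (the log fold change) and A is their average log2 intensity. The Python sketch below shows that arithmetic on made-up values; it is not taken from the article's R source code, and the function name `ma_values` is an assumption.

```python
import math


def ma_values(intensity_a, intensity_b):
    """Return (M, A) pairs for two equally long lists of positive intensities.

    M = log2(a) - log2(b)          (log fold change)
    A = (log2(a) + log2(b)) / 2    (average log intensity)
    """
    pairs = []
    for a, b in zip(intensity_a, intensity_b):
        log_a, log_b = math.log2(a), math.log2(b)
        pairs.append((log_a - log_b, 0.5 * (log_a + log_b)))
    return pairs


# Hypothetical intensities for two probes on two chips.
chip_1 = [8.0, 4.0]
chip_2 = [2.0, 4.0]
ma = ma_values(chip_1, chip_2)
for m, a in ma:
    print(f"M={m:.1f}  A={a:.1f}")
```

Plotting M against A makes intensity-dependent bias visible: for well-normalized chips most points scatter around M = 0 across the whole range of A.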

17.
18.
《Cryobiology》2016,73(3):251-257
The aim of this study is to provide the reader with a simple setup that can detect antifreeze proteins (AFP) by inhibition of ice recrystallisation in very small sample sizes. This includes an open source cryostage, a method for preparing and loading samples as well as a software analysis method. The entire setup was tested using hyperactive AFP from the cerambycid beetle, Rhagium mordax. Samples containing AFP were compared to buffer samples, and the results are visualised as crystal radius evolution over time and in absolute change over 30 min. Statistical analysis showed that samples containing AFP could reliably be told apart from controls after only two minutes of recrystallisation. The goal of providing a fast, cheap and easy method for detecting antifreeze proteins in solution was met, and further development of the system can be followed at https://github.com/pechano/cryostage.

19.
Functional & Integrative Genomics - Currently, standard network analysis workflows rely on many different packages, often requiring users to have a solid statistics and programming background....

20.
Epithelial cells are defined by apical-basal and planar cell polarity (PCP) signaling, the latter of which establishes an orthogonal plane of polarity in the epithelial sheet. PCP signaling is required for normal cell migration, differentiation, stem cell generation and tissue repair, and defects in PCP have been associated with developmental abnormalities, neuropathologies and cancers. While the molecular mechanism of PCP is incompletely understood, the deepest insights have come from Drosophila, where PCP is manifest in hairs and bristles across the adult cuticle and organization of the ommatidia in the eye. Fly wing cells are marked by actin-rich trichome structures produced at the distal edge of each cell in the developing wing epithelium, and in a mature wing the trichomes orient collectively in the distal direction. Genetic screens have identified key PCP signaling pathway components that disrupt trichome orientation, which has been measured manually in a tedious and error-prone process. Here we describe a set of image processing and pattern-recognition macros that can quantify trichome arrangements in micrographs and mark these directly by color, arrow or colored arrow to indicate trichome location, length and orientation. Nearest neighbor calculations are made to exploit local differences in orientation to better and more reliably detect and highlight local defects in trichome polarity. We demonstrate the use of these tools on trichomes in adult wing preps and on actin-rich developing trichomes in pupal wing epithelia stained with phalloidin. FijiWingsPolarity is freely available and will be of interest to a broad community of fly geneticists studying the effect of gene function on PCP.
