Similar literature
20 similar documents found
1.

Background

With next-generation sequencing technologies, experiments that were considered prohibitive only a few years ago are now possible. However, while these technologies have the ability to produce enormous volumes of data, the sequence reads are prone to error. This poses fundamental hurdles when genetic diversity is investigated.

Results

We developed ShoRAH, a computational method for quantifying genetic diversity in a mixed sample and for identifying the individual clones in the population, while accounting for sequencing errors. The software was run on simulated data and on real data obtained in wet lab experiments to assess its reliability.

Conclusions

ShoRAH is implemented in C++, Python, and Perl and has been tested under Linux and Mac OS X. Source code is available under the GNU General Public License at http://www.cbg.ethz.ch/software/shorah.

2.
MOTIVATION: We announce the availability of Darwin v. 2.0, the second release of an interpreted computer language especially tailored to researchers in the biosciences. The system is a general tool applicable to a wide range of problems. RESULTS: This second release improves on Darwin version 1.6 in several ways: it now contains (1) a larger set of libraries touching most of the classical problems from computational biology (pairwise alignment, all versus all alignments, tree construction, multiple sequence alignment), (2) an expanded set of general purpose algorithms (search algorithms for discrete problems, matrix decomposition routines, complex/long integer arithmetic operations), (3) an improved language with a cleaner syntax, (4) better on-line help, and (5) a number of fixes to user-reported bugs. AVAILABILITY: Darwin is made available for most operating systems free of charge from the Computational Biochemistry Research Group (CBRG), reachable at http://chrg.inf.ethz.ch. CONTACT: darwin@inf.ethz.ch

3.
GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox   Total citations: 26 (self-citations: 0, citations by others: 26)
High-throughput gene expression analysis has become a frequent and powerful research tool in biology. At present, however, few software applications have been developed for biologists to query large microarray gene expression databases using a Web-browser interface. We present GENEVESTIGATOR, a database and Web-browser data mining interface for Affymetrix GeneChip data. Users can query the database to retrieve the expression patterns of individual genes throughout chosen environmental conditions, growth stages, or organs. Conversely, mining tools allow users to identify genes specifically expressed during selected stresses, growth stages, or in particular organs. Using GENEVESTIGATOR, the gene expression profiles of more than 22,000 Arabidopsis genes can be obtained, including those of 10,600 currently uncharacterized genes. The objective of this software application is to direct gene functional discovery and design of new experiments by providing plant biologists with contextual information on the expression of genes. The database and analysis toolbox is available as a community resource at https://www.genevestigator.ethz.ch.

4.
Nested effects models have been used successfully for learning subcellular networks from high-dimensional perturbation effects that result from RNA interference (RNAi) experiments. Here, we further develop the basic nested effects model using high-content single-cell imaging data from RNAi screens of cultured cells infected with human rhinovirus. RNAi screens with single-cell readouts are becoming increasingly common, and they often reveal high cell-to-cell variation. As a consequence of this cellular heterogeneity, knock-downs result in variable effects among cells and lead to weak average phenotypes on the cell population level. To address this confounding factor in network inference, we explicitly model the stimulation status of a signaling pathway in individual cells. We extend the framework of nested effects models to probabilistic combinatorial knock-downs and propose NEMix, a nested effects mixture model that accounts for unobserved pathway activation. We analyzed the identifiability of NEMix and developed a parameter inference scheme based on the Expectation Maximization algorithm. In an extensive simulation study, we show that NEMix improves learning of pathway structures over classical NEMs significantly in the presence of hidden pathway stimulation. We applied our model to single-cell imaging data from RNAi screens monitoring human rhinovirus infection, where limited infection efficiency of the assay results in uncertain pathway stimulation. Using a subset of genes with known interactions, we show that the inferred NEMix network has high accuracy and outperforms the classical nested effects model without hidden pathway activity. NEMix is implemented as part of the R/Bioconductor package ‘nem’ and available at www.cbg.ethz.ch/software/NEMix.

5.
We present AUDENS, a new platform-independent open source tool for automated de novo sequencing of peptides from MS/MS data. We implemented a dynamic programming algorithm and combined it with a flexible preprocessing module which is designed to distinguish between signal and other peaks. By applying a user-defined set of heuristics, AUDENS screens through the spectrum and assigns high relevance values to putative signal peaks. The algorithm constructs a sequence path through the MS/MS spectrum using the peak relevances to score each suggested sequence path, i.e., the corresponding amino acid sequence. At present, we consider AUDENS a prototype that unfolds its biggest potential if used in parallel with other de novo sequencing tools. AUDENS is available open source and can be downloaded with further documentation at http://www.ti.inf.ethz.ch/pw/software/audens/.

6.
Boosting for tumor classification with gene expression data   Total citations: 7 (self-citations: 0, citations by others: 7)
MOTIVATION: Microarray experiments generate large datasets with expression values for thousands of genes but no more than a few dozen samples. Accurate supervised classification of tissue samples in such high-dimensional problems is difficult but often crucial for successful diagnosis and treatment. A promising way to meet this challenge is by using boosting in conjunction with decision trees. RESULTS: We demonstrate that the generic boosting algorithm needs some modification to become an accurate classifier in the context of gene expression data. In particular, we present a feature preselection method, a more robust boosting procedure and a new approach for multi-categorical problems. This allows for a slight to drastic increase in performance and yields competitive results on several publicly available datasets. AVAILABILITY: Software for the modified boosting algorithms as well as for decision trees is available for free in R at http://stat.ethz.ch/~dettling/boosting.html.

7.
The 67th Discussion Forum on Life Cycle Assessment (LCA), organised by partners of the European project RELIEF (RELIability of product Environmental Footprints), focused on methods for better understanding the impacts of land use linked to agricultural value chains. The first session of the forum was dedicated to methods that help in retrospective tracking of land use within complex supply chains. Novel approaches were presented for the integration of increasingly available spatially located land use data into LCA. The second session focused on forward-looking projections of land use change and included emerging, predictive methods for the modelling of land change. The third session considered impact assessment methods related to the use of land and their application together with land change modelling approaches. Discussions throughout the day centred on opportunities and challenges arising from integrating spatially located land use information into Life Cycle Assessment. Increasing amounts of spatially located land use data are becoming available and this could potentially increase the robustness and specificity of Life Cycle Assessment. However, the use of such data can be computationally expensive and requires the development of skills (i.e. use of geographical information systems (GIS) and model coding) within the LCA community. Land change modelling and ecosystem service modelling are associated with considerable uncertainty which must be communicated appropriately to stakeholders and decision-makers when interpreting results from an LCA. The new approaches were found to challenge aspects of the traditional LCA approach—particularly the division between the life cycle inventory and impact assessment and the assumption of linearity between scale and impacts when deriving characterisation factors. 
The presentations from the DF-67 are available for download (www.lcaforum.ch), and video recordings can be accessed online (http://www.video.ethz.ch/events/lca/2017/autumn/67th.html).

8.
SUMMARY: Besides classical clustering methods such as hierarchical clustering, in recent years biclustering has become a popular approach to analyze biological data sets, e.g. gene expression data. The Biclustering Analysis Toolbox (BicAT) is a software platform for clustering-based data analysis that integrates various biclustering and clustering techniques within a common graphical user interface. Furthermore, BicAT provides different facilities for data preparation, inspection and postprocessing such as discretization, filtering of biclusters according to specific criteria or gene pair analysis for constructing gene interconnection graphs. The possibility to use different biclustering algorithms inside a single graphical tool allows the user to compare clustering results and choose the algorithm that best fits a specific biological scenario. The toolbox is described in the context of gene expression analysis, but is also applicable to other types of data, e.g. data from proteomics or synthetic lethal experiments. AVAILABILITY: The BicAT toolbox is freely available at http://www.tik.ee.ethz.ch/sop/bicat and runs on all operating systems. The Java source code of the program and a developer's guide are provided on the website as well. Therefore, users may modify the program and add further algorithms or extensions.

9.
Cell surface proteins are major targets of biomedical research due to their utility as cellular markers and their extracellular accessibility for pharmacological intervention. However, information about the cell surface protein repertoire (the surfaceome) of individual cells is only sparsely available. Here, we applied the Cell Surface Capture (CSC) technology to 41 human and 31 mouse cell types to generate a mass spectrometry-derived Cell Surface Protein Atlas (CSPA) providing cellular surfaceome snapshots at high resolution. The CSPA is presented in the form of an easy-to-navigate interactive database, a downloadable data matrix and tools for targeted surfaceome rediscovery (http://wlab.ethz.ch/cspa). The cellular surfaceome snapshots of different cell types, including cancer cells, resulted in a combined dataset of 1492 human and 1296 mouse cell surface glycoproteins, providing experimental evidence for their cell surface expression on different cell types, including 136 G-protein coupled receptors and 75 membrane receptor tyrosine-protein kinases. Integrated analysis of the CSPA reveals that the concerted biological function of individual cell types is mainly guided by quantitative rather than qualitative surfaceome differences. The CSPA will be useful for the evaluation of drug targets, for the improved classification of cell types and for a better understanding of the surfaceome and its concerted biological functions in complex signaling microenvironments.

10.
Designed peptides that bind to major histocompatibility protein I (MHC-I) allomorphs bear the promise of representing epitopes that stimulate a desired immune response. A rigorous bioinformatic exploration of sequence patterns hidden in peptides that bind to the mouse MHC-I allomorph H-2Kb is presented. We exemplify and validate these motif findings by systematically dissecting the epitope SIINFEKL and analyzing the resulting fragments for their binding potential to H-2Kb in a thermal denaturation assay. The results demonstrate that only fragments exclusively retaining the carboxy- or amino-terminus of the reference peptide exhibit significant binding potential, with the N-terminal pentapeptide SIINF as the shortest ligand. This study demonstrates that sophisticated machine-learning algorithms excel at extracting fine-grained patterns from peptide sequence data and predicting MHC-I binding peptides, thereby considerably extending existing linear prediction models and providing a fresh view on the computer-based molecular design of future synthetic vaccines. The server for prediction is available at http://modlab-cadd.ethz.ch (SLiDER tool, MHC-I version 2012).

11.
MOTIVATION: Microarray experiments are expected to contribute significantly to the progress in cancer treatment by enabling a precise and early diagnosis. They create a need for class prediction tools, which can deal with a large number of highly correlated input variables, perform feature selection and provide class probability estimates that serve as a quantification of the predictive uncertainty. A very promising solution is to combine the two ensemble schemes bagging and boosting into a novel algorithm called BagBoosting. RESULTS: When bagging is used as a module in boosting, the resulting classifier consistently improves the predictive performance and the probability estimates of both bagging and boosting on real and simulated gene expression data. This quasi-guaranteed improvement can be obtained by simply making a bigger computing effort. The advantageous predictive potential is also confirmed by comparing BagBoosting to several established class prediction tools for microarray data. AVAILABILITY: Software for the modified boosting algorithms, for benchmark studies and for the simulation of microarray data is available as an R package under GNU public license at http://stat.ethz.ch/~dettling/bagboost.html.

12.
Different plant plastid types contain a distinct protein complement for specialized functions and metabolic activities. plprot was established as a plastid proteome database to provide information about the proteomes of chloroplasts, etioplasts and undifferentiated plastids. The current version of plprot features 2,043 protein entries and consists of two modules. Module one contains a BLAST search option and provides comparative information on the proteomes of different plastid types. The second module contains four searchable databases, three for each individual plastid type and one comprehensive composite database that provides the results of plastid proteome analyses from different laboratories. plprot is accessible at http://www.plprot.ethz.ch.

13.
Neurite outgrowth and branching patterns are instrumental in dictating the wiring diagram of developing neuronal networks. We study the self-organization of single cultured neurons into complex networks focusing on factors governing the branching of a neurite into its daughter branches. Neurite branching angles of insect ganglion neurons in vitro were comparatively measured in two neuronal categories: neurons in dense cultures that bifurcated under the presence of extrinsic (cellular environment) cues versus neurons in practical isolation that developed their neurites following predominantly intrinsic cues. Our experimental results were complemented by theoretical modeling and computer simulations. A preferred regime of branching angles was found in isolated neurons. A model based on biophysical constraints predicted a preferred bifurcation angle that was consistent with this range shown by our real neurons. In order to examine the origin of the preferred regime of angles we constructed simulations of neurite outgrowth in a developing network and compared the simulated developing neurons with our experimental results. We tested cost functions for neuronal growth that would be optimized at a specific regime of angles. Our results suggest two phases in the process of neuronal development. In the first, reflected by our isolated neurons, neurons are tuned to make first contact with a target cell as soon as possible, to minimize the time of growth. After contact is made, that is, after neuronal interconnections are formed, a second branching strategy is adopted, favoring higher efficiency in neurite length and volume. The two-phase development theory is discussed in relation to previous results.  相似文献   

14.
Neurite outgrowth and branching patterns are instrumental in dictating the wiring diagram of developing neuronal networks. We study the self‐organization of single cultured neurons into complex networks focusing on factors governing the branching of a neurite into its daughter branches. Neurite branching angles of insect ganglion neurons in vitro were comparatively measured in two neuronal categories: neurons in dense cultures that bifurcated under the presence of extrinsic (cellular environment) cues versus neurons in practical isolation that developed their neurites following predominantly intrinsic cues. Our experimental results were complemented by theoretical modeling and computer simulations. A preferred regime of branching angles was found in isolated neurons. A model based on biophysical constraints predicted a preferred bifurcation angle that was consistent with this range shown by our real neurons. In order to examine the origin of the preferred regime of angles we constructed simulations of neurite outgrowth in a developing network and compared the simulated developing neurons with our experimental results. We tested cost functions for neuronal growth that would be optimized at a specific regime of angles. Our results suggest two phases in the process of neuronal development. In the first, reflected by our isolated neurons, neurons are tuned to make first contact with a target cell as soon as possible, to minimize the time of growth. After contact is made, that is, after neuronal interconnections are formed, a second branching strategy is adopted, favoring higher efficiency in neurite length and volume. The two‐phase development theory is discussed in relation to previous results. © 2004 Wiley Periodicals, Inc. J Neurobiol, 2005  相似文献   

15.
This paper presents a language for describing arrangements of motifs in biological sequences, and a program that uses the language to find the arrangements in motif match databases. The program does not by itself search for the constituent motifs, and is thus independent of how they are detected, which allows it to use motif match data of various origins. AVAILABILITY: The program can be tested online at http://hits.isb-sib.ch and the distribution is available from ftp://ftp.isrec.isb-sib.ch/pub/software/unix/mmsearch-1.0.tar.gz CONTACT: Thomas.Junier@isrec.unil.ch SUPPLEMENTARY INFORMATION: The full documentation about mmsearch is available from http://hits.isb-sib.ch/~tjunier/mmsearch/doc.

16.
Summary: An interactive dotmatrix program for the MacOS was designed that allows comparison of DNA to protein sequences using nested 3-frame translations. Availability: Shareware, available at http://copan.bioz.unibas.ch/software/ Contact: burglin@ubaclu.unibas.ch
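The idea of comparing DNA to protein through 3-frame translation can be sketched in a few lines of Python. This is only an illustration, not the program described in the abstract: the function names `translate` and `dot_matrix`, the `window` parameter, and the use of exact matching with the standard genetic code are all assumptions, and the abstract does not say how the original program nests or displays the three frames.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here;
# the abstract does not state which code the program uses).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame):
    """Translate dna in forward reading frame 0, 1 or 2; unknown codons give 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(frame, len(dna) - 2, 3))


def dot_matrix(dna, protein, window=1):
    """List (frame, dna_pos, protein_pos) for every exact window-long match
    between a translated frame of dna and the protein sequence."""
    dots = []
    for frame in range(3):
        peptide = translate(dna, frame)
        for i in range(len(peptide) - window + 1):
            for j in range(len(protein) - window + 1):
                if peptide[i:i + window] == protein[j:j + window]:
                    # Map the peptide index back to a nucleotide coordinate.
                    dots.append((frame, frame + 3 * i, j))
    return dots
```

Each returned tuple is one dot; plotting `dna_pos` against `protein_pos`, with one colour per frame, gives a combined view of the three frames. A real tool would likely use a scoring matrix rather than exact matches.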

17.
The Journal of Cell Biology, 1986, 103(6): 2659-2672
We have compared neurite outgrowth on extracellular matrix (ECM) constituents to outgrowth on glial and muscle cell surfaces. Embryonic chick ciliary ganglion (CG) neurons regenerate neurites rapidly on surfaces coated with laminin (LN), fibronectin (FN), conditioned media (CM) from several non-neuronal cell types that secrete LN, and on intact extracellular matrices. Neurite outgrowth on all of these substrates is blocked by two monoclonal antibodies, CSAT and JG22, that prevent the adhesion of many cells, including neurons, to the ECM constituents LN, FN, and collagen. Neurite outgrowth is inhibited even on mixed LN/poly-D-lysine substrates where neuronal attachment is independent of LN. Therefore, neuronal process outgrowth on extracellular matrices requires the function of neuronal cell surface molecules recognized by these antibodies. The surfaces of cultured astrocytes, Schwann cells, and skeletal myotubes also promote rapid process outgrowth from CG neurons. Neurite outgrowth on these surfaces, though, is not prevented by CSAT or JG22 antibodies. In addition, antibodies to a LN/proteoglycan complex that block neurite outgrowth on several LN-containing CM factors and on an ECM extract failed to inhibit cell surface-stimulated neurite outgrowth. After extraction with a nonionic detergent, Schwann cells and myotubes continue to support rapid neurite outgrowth. However, the activity associated with the detergent-insoluble residue is blocked by CSAT and JG22 antibodies. Detergent extraction of astrocytes, in contrast, removes all neurite-promoting activity. These results provide evidence for at least two types of neuronal interactions with cells that promote neurite outgrowth. One involves adhesive proteins present in the ECM and ECM receptors on neurons. The second is mediated through detergent-extractable macromolecules present on non-neuronal cell surfaces and different, uncharacterized receptor(s) on neurons. Schwann cells and skeletal myotubes appear to promote neurite outgrowth by both mechanisms.

18.
In computational evolutionary biology, verification and benchmarking are challenging tasks because the evolutionary history of studied biological entities is usually not known. Computer programs for simulating sequence evolution in silico have been shown to be viable test beds for the verification of newly developed methods and for comparing different algorithms. However, current simulation packages tend to focus either on gene-level aspects of genome evolution such as character substitutions and insertions and deletions (indels) or on genome-level aspects such as genome rearrangement and speciation events. Here, we introduce Artificial Life Framework (ALF), which aims at simulating the entire range of evolutionary forces that act on genomes: nucleotide, codon, or amino acid substitution (under simple or mixture models), indels, GC-content amelioration, gene duplication, gene loss, gene fusion, gene fission, genome rearrangement, lateral gene transfer (LGT), or speciation. The other distinctive feature of ALF is its user-friendly yet powerful web interface. We illustrate the utility of ALF with two possible applications: 1) we reanalyze data from a study of selection after globin gene duplication and test the statistical significance of the original conclusions and 2) we demonstrate that LGT can dramatically decrease the accuracy of two well-established orthology inference methods. ALF is available as a stand-alone application or via a web interface at http://www.cbrg.ethz.ch/alf.

19.
SUMMARY: The purpose of this work is to provide the modern molecular geneticist with tools to perform more efficient and more accurate analysis of the genotype data they produce. By using Microsoft Excel macros written in Visual Basic, we can translate genotype data into a form readable by the versatile software 'Arlequin', read the Arlequin output, calculate statistics of linkage disequilibrium, and put the results in a format for viewing with the software 'GOLD'. AVAILABILITY: The software is available by FTP at: ftp://xcsg.iarc.fr/cox/Genotype_Transposer/. SUPPLEMENTARY INFORMATION: Detailed instructions and examples are available at: ftp://xcsg.iarc.fr/cox/Genotype&_Transposer/. Arlequin is available at: http://lgb.unige.ch/arlequin/. GOLD is available at: http://www.well.ox.ac.uk/asthma/GOLD/.
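As an illustration of what "statistics of linkage disequilibrium" can look like, the sketch below computes three commonly used pairwise measures (D, D' and r²) for two biallelic loci. The abstract does not say which statistics the Excel macros calculate, so the choice of measures and the function name `ld_stats` are assumptions made for this example only.

```python
def ld_stats(p_ab, p_a, p_b):
    """Pairwise linkage disequilibrium for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of alleles A and B
    Returns (D, D_prime, r_squared); D_prime keeps the sign of D.
    """
    d = p_ab - p_a * p_b
    # The largest |D| attainable given the allele frequencies depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared
```

For example, `ld_stats(0.5, 0.5, 0.5)` describes two loci whose alleles always occur together, and `ld_stats(0.25, 0.5, 0.5)` describes two independent loci. The haplotype frequencies themselves would come from a tool such as Arlequin; estimating them from unphased genotypes is outside this sketch.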

20.
Summary: Explants and dissociated cells from normal adult spinal cord and regenerating cord of the teleost Apteronotus albifrons were grown in vitro for periods of 8 to 12 wk. During this time the neurons showed extensive neurite outgrowth. Neurite outgrowth from tissue explants and dissociated cells of regenerated spinal cord starts sooner and is more profuse than that from normal (unregenerated) cord. Neurite outgrowth is maximized by using adhesive substrata and a high density of explants or dissociated cells. Inasmuch as Apteronotus does regenerate its spinal cord naturally after injury, whereas mammals do not, this culture system will be useful to study factors that control (permit) regeneration of spinal neurons in this adult vertebrate.
