Similar articles
 Found 20 similar articles; search took 15 ms
1.
PartiGene--constructing partial genomes   Total citations: 4, self-citations: 0, cited by others: 4
Expressed sequence tags (ESTs) offer a low-cost approach to gene discovery and are being used by an increasing number of laboratories to obtain sequence information for a wide variety of organisms. The challenge lies in processing and organizing these data within a genomic context to facilitate large-scale analyses. Here we present PartiGene, an integrated sequence analysis suite that uses freely available public domain software to (1) process raw trace chromatograms into sequence objects suitable for submission to dbEST; (2) place these sequences within a genomic context; (3) perform customizable first-pass annotation of the data; and (4) present the data as HTML tables and an SQL database resource. PartiGene has been used to create a number of non-model organism database resources including NEMBASE (http://www.nematodes.org) and LumbriBase (http://www.earthworms.org/). The packages are readily portable, freely available and can be run on simple Linux-based workstations. AVAILABILITY: PartiGene is available from http://www.nematodes.org/PartiGene and also forms part of the EST analysis software associated with the Natural Environment Research Council (UK) Bio-Linux project (http://envgen.nox.ac.uk/biolinux.html).

2.
MOTIVATION: Accurate gene structure annotation is a challenging computational problem in genomics. The best results are achieved with spliced alignment of full-length cDNAs or multiple expressed sequence tags (ESTs) with sufficient overlap to cover the entire gene. For most species, cDNA and EST collections are far from comprehensive. We sought to overcome this bottleneck by exploring the possibility of using combined EST resources from fairly diverged species that still share a common gene space. Previous spliced alignment tools were found inadequate for this task because they rely on very high sequence similarity between the ESTs and the genomic DNA. RESULTS: We have developed a computer program, GeneSeqer, which is capable of aligning thousands of ESTs with a long genomic sequence in a reasonable amount of time. The algorithm is uniquely designed to tolerate a high percentage of mismatches and insertions or deletions in the EST relative to the genomic template. This feature allows use of non-cognate ESTs for gene structure prediction, including ESTs derived from duplicated genes and homologous genes from related species. The increased gene prediction sensitivity results in part from novel splice site prediction models that are also available as a stand-alone splice site prediction tool. We assessed GeneSeqer performance relative to a standard Arabidopsis thaliana gene set and demonstrate its utility for plant genome annotation. In particular, we propose that this method provides a timely tool for the annotation of the rice genome, using abundant ESTs from other cereals and plants. AVAILABILITY: The source code is available for download at http://bioinformatics.iastate.edu/bioinformatics2go/gs/download.html. Web servers for Arabidopsis and other plant species are accessible at http://www.plantgdb.org/cgi-bin/AtGeneSeqer.cgi and http://www.plantgdb.org/cgi-bin/GeneSeqer.cgi, respectively. For non-plant species, use http://bioinformatics.iastate.edu/cgi-bin/gs.cgi. 
The splice site prediction tool (SplicePredictor) is distributed with the GeneSeqer code. A SplicePredictor web server is available at http://bioinformatics.iastate.edu/cgi-bin/sp.cgi

3.
ToxoDB: accessing the Toxoplasma gondii genome   Total citations: 1, self-citations: 0, cited by others: 1
ToxoDB (http://ToxoDB.org) provides a genome resource for the protozoan parasite Toxoplasma gondii. Several sequencing projects devoted to T. gondii have been completed or are in progress: an EST project (http://genome.wustl.edu/est/index.php?toxoplasma=1), a BAC clone end-sequencing project (http://www.sanger.ac.uk/Projects/T_gondii/) and an 8X random shotgun genomic sequencing project (http://www.tigr.org/tdb/e2k1/tga1/). ToxoDB was designed to provide a central point of access for all available T. gondii data, and a variety of data mining tools useful for the analysis of unfinished, un-annotated draft sequence during the early phases of the genome project. In later stages, as more and different types of data become available (microarray, proteomic, SNP, QTL, etc.) the database will provide an integrated data analysis platform facilitating user-defined queries across the different data types.

4.
5.
6.
7.
8.
SUMMARY: We have developed Look-Align, an interactive web-based viewer to display pre-computed multiple sequence alignments. Although initially developed to support the visualization needs of the maize diversity website Panzea (http://www.panzea.org), the viewer is a generic stand-alone tool that can be easily integrated into other websites. AVAILABILITY: Look-Align is written in Perl using open-source components and is available under an open-source license. Live installation and download information can be found at the Panzea website (http://www.panzea.org/software/alignment_viewer.html). CONTACT: ware@cshl.edu SUPPLEMENTARY INFORMATION: The Supplementary information includes sample lists of multiple sequence alignment software and sample screenshots of the viewer.

9.
The Biological General Repository for Interaction Datasets (BioGRID) representational state transfer (REST) service allows full URL-based access to curated protein and genetic interaction data at the BioGRID database. Appending URL parameters allows filtering of data by various attributes including gene names and identifiers, PubMed ID and evidence type. We also describe two visualization tools that interface with the REST service, the BiogridPlugin2 for Cytoscape and the BioGRID WebGraph. Availability and implementation: BioGRID data and applications are completely free for commercial and non-commercial use. http://webservice.thebiogrid.org/resources/interactions (REST Service), http://wiki.thebiogrid.org/doku.php/biogridrest (REST Service parameter list and help), http://webservice.thebiogrid.org/resources/application.wadl (REST Service WADL), http://thebiogrid.org/download.php (BiogridPlugin2, v2.1 download), http://wiki.thebiogrid.org/doku.php/biogridplugin2 (BiogridPlugin2 help) and http://tyerslab.bio.ed.ac.uk/tools/BioGRID_webgraph.php (BioGRID WebGraph).
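As a rough illustration of the URL-parameter filtering described in this abstract, the sketch below assembles a query URL without sending any request. Only the base URL comes from the abstract. The parameter names (geneList, pubmedList, searchNames, format, accesskey) and the pipe-separated list convention are assumptions made for illustration and should be checked against the REST Service parameter list before use.

```python
from urllib.parse import urlencode

# Base URL as given in the abstract.
BASE_URL = "http://webservice.thebiogrid.org/resources/interactions"


def build_interaction_query(genes, pubmed_ids=None, access_key="YOUR_KEY"):
    """Build a filtered query URL (no network access).

    All parameter names below are assumed for illustration only;
    consult the service's own parameter list for the real ones.
    """
    params = {
        "geneList": "|".join(genes),   # assumed: pipe-separated gene names
        "searchNames": "true",         # assumed: match on official gene names
        "format": "json",              # assumed: output format selector
        "accesskey": access_key,       # assumed: per-user access key
    }
    if pubmed_ids:
        # assumed: optional filter by PubMed ID
        params["pubmedList"] = "|".join(str(p) for p in pubmed_ids)
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_interaction_query(["MDM2", "TP53"], pubmed_ids=[12345]))
```

Keeping the URL construction in one function makes it easy to swap in the documented parameter names once they have been verified.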

10.
11.
12.
Introduction: Epigenetic dysregulation drives or supports numerous human cancers. The chromatin landscape in cancer cells is often marked by abnormal histone post-translational modification (PTM) patterns and by aberrant assembly and recruitment of protein complexes to specific genomic loci. Mass spectrometry-based proteomic analyses can support the discovery and characterization of both phenomena.

Areas covered: We broadly divide this literature into two parts: ‘modification-centric’ analyses that link histone PTMs to cancer biology; and ‘complex-centric’ analyses that examine protein–protein interactions that occur de novo as a result of oncogenic mutations. We also discuss proteomic studies of oncohistones. We highlight relevant examples, discuss limitations, and speculate about forthcoming innovations regarding each application.

Expert commentary: ‘Modification-centric’ analyses have been used to further understanding of cancer’s histone code and to identify associated therapeutic vulnerabilities. ‘Complex-centric’ analyses have likewise revealed insights into mechanisms of oncogenesis and suggested potential therapeutic targets, particularly in MLL-associated leukemia. Proteomic experiments have also supported some of the pioneering studies of oncohistone-mediated tumorigenesis. Additional applications of proteomics that may benefit cancer epigenetics research include middle-down and top-down histone PTM analysis, chromatin reader profiling, and genomic locus-specific protein identification. In the coming years, proteomic approaches will remain powerful ways to interrogate the biology of cancer.


13.
WebACT--an online companion for the Artemis Comparison Tool   Total citations: 4, self-citations: 0, cited by others: 4
SUMMARY: WebACT is an online resource which enables the rapid provision of simultaneous BLAST comparisons between up to five genomic sequences in a format amenable for visualization with the well-known Artemis Comparison Tool (ACT). Comparisons can be generated on-the-fly using sequences directly retrieved via EMBL database queries, or by entering or uploading user sequences. Furthermore, pre-computed comparisons are available between all publicly available, completed prokaryotic genomes and plasmids currently contained within the Genome Reviews database (372 sequences, representing 175 different species). The system is designed to minimize the volume of downloaded data and maximize performance. Genome sequences, annotation and pre-computed comparisons are stored in a relational database allowing flexible querying based on user-defined sequence regions, from whole genome to a defined region flanking a specified gene. Comparison and sequence files, whether computed online or retrieved from the database of pre-computed genome comparisons, can be viewed online using ACT and are available for download. AVAILABILITY: Freely accessible at http://www.webact.org. SUPPLEMENTARY INFORMATION: User guide and worked examples are available at http://www.webact.org/WebACT/docs.

14.
15.
Introduction: Biomarkers are commonly used to stratify cancer patients and guide targeted therapies, but most biomarkers are of a genomic nature. Discrepancies between the genome and proteome and the high rates of drug resistance indicate that proteomic analyses may provide additional critically important information. Here we present immuno-Matrix-Assisted Laser Desorption/Ionization (iMALDI), the combination of immuno-affinity enrichment of peptides followed by direct MALDI-mass spectrometry analysis. iMALDI is a highly sensitive, targeted protein-quantitation technique with the potential to measure clinically relevant signaling-pathway proteins using minimal sample amounts, thus improving upon existing methodologies.

Areas covered: We provide a brief overview of the current state of biomarker analysis technologies for modern cancer treatment. We also show the advantages of iMALDI for translating potential new biomarkers into the clinic, factors to consider for iMALDI assay development, and the utility of iMALDI for the quantitation of cell-signaling proteins.

Expert commentary: We see targeted mass spectrometry approaches such as iMALDI as an important part of improving patient responses to targeted therapies by providing highly sensitive, accurate, precise, and specific measurements of signaling-pathway proteins, both in tumor cells and in cells from the tumor microenvironment. iMALDI results can be integrated with other -omics data to aid in tumor-targeting therapies and immuno-oncology.


16.
17.
Coal is an important energy source, but it has a significant negative impact on the environment. This paper analyses the impact, measurement, and input of parameters representing potential environmental pollutants in the information system (IS).

The methodology of recording and systematization includes the following parameters: coal deposits; climate parameters; roads; rivers; land and surrounding objects; air pollutants; water pollutants; and soil pollutants. Methods for calculating land deformation, air pollutant emissions, and noise impact are also presented.

Based on the number and specificity of the analyzed data, the paper provides a concept of the IS and an overview of the environmental impact of underground coal mine technological units. The concept was used to present the results of research conducted at the underground coal mine “Soko” in Serbia.

The results of this research can help many potential users realize their goals. Those goals are preventive by nature: because negative environmental impact can be predicted, environmental protection experts can take appropriate measures in advance.


18.
Introduction: Mass spectrometry (MS)-based proteomics has become an indispensable tool for the characterization of the proteome and its post-translational modifications (PTMs). In addition to standard protein sequence databases, proteogenomics strategies search the spectral data against the theoretical spectra obtained from customized protein sequence databases. To date, there are no published proteogenomics studies on acute myeloid leukemia (AML) samples.

Areas covered: Proteogenomics involves the understanding of genomic and proteomic data. The intersection of both datatypes requires advanced bioinformatics skills. A standard proteogenomics workflow that could be used for the study of AML samples is described. The generation of customized protein sequence databases as well as bioinformatics tools and pipelines commonly used in proteogenomics are discussed in detail.

Expert commentary: Drawing on evidence from recent cancer proteogenomics studies and taking into account the public availability of AML genomic data, the interpretation of present and future MS-based AML proteomic data using AML-specific protein sequence databases could discover new biological mechanisms and targets in AML. However, proteogenomics workflows including bioinformatics guidelines can be challenging for the wide AML research community. It is expected that further automation and simplification of the bioinformatics procedures might attract AML investigators to adopt the proteogenomics strategy.
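To make the idea of a customized protein sequence database more concrete, here is a minimal sketch that appends variant protein entries to a set of reference entries in FASTA form. It is not the workflow described in the article. The function names, the header naming scheme, and the restriction to single amino-acid substitutions are all assumptions chosen for illustration.

```python
def apply_substitution(protein, position, ref, alt):
    """Return the protein with one residue replaced (position is 1-based).

    Hypothetical helper: raises if the reference residue does not match.
    """
    if protein[position - 1] != ref:
        raise ValueError(f"expected {ref} at position {position}")
    return protein[:position - 1] + alt + protein[position:]


def custom_fasta(reference, variants):
    """Build FASTA text holding reference and variant sequences.

    reference: dict mapping protein id -> sequence
    variants:  list of (protein id, position, ref residue, alt residue)
    The '>id_refPOSalt' header style is an assumption, not a standard.
    """
    lines = []
    for pid, seq in reference.items():
        lines.append(f">{pid}")
        lines.append(seq)
    for pid, pos, ref, alt in variants:
        lines.append(f">{pid}_{ref}{pos}{alt}")
        lines.append(apply_substitution(reference[pid], pos, ref, alt))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy sequence and variant, invented for the example.
    print(custom_fasta({"P1": "MKTAYIAK"}, [("P1", 3, "T", "S")]), end="")
```

A search engine given this file could then match spectra to the variant peptide as well as the reference one, which is the basic motivation for sample-specific databases.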


19.
The Medicago Genome Initiative (MGI) is a database of EST sequences of the model legume Medicago truncatula. The database is available to the public and has resulted from a collaborative research effort between the Samuel Roberts Noble Foundation and the National Center for Genome Resources to investigate the genome of M. truncatula. MGI is part of the greater integrated Medicago functional genomics program at the Noble Foundation (http://www.noble.org), which is taking a global approach in studying the genetic and biochemical events associated with the growth, development and environmental interactions of this model legume. Our approach will include: large-scale EST sequencing, gene expression profiling, the generation of M. truncatula activation-tagged and promoter trap insertion mutants, high-throughput metabolic profiling, and proteome studies. These multidisciplinary information pools will be interfaced with one another to provide scientists with an integrated, holistic set of tools to address fundamental questions pertaining to legume biology. The public interface to the MGI database can be accessed at http://www.ncgr.org/research/mgi.

20.
SUMMARY: ESTminer is a collection of programs that use expressed sequence tag (EST) data from inbred genomes to identify unique genes within gene families. The algorithm utilizes Cap3 to perform an initial clustering of related EST sequences to produce a consensus sequence of a gene family. These consensus sequences are then used to collect all ESTs in the original EST library that are related using BLAST. A redundancy based criterion is applied to each EST to identify reliable unique gene-sequences. Using a highly inbred genome as a source of ESTs eliminates the necessity of computing covariance on each polymorphism to identify alleles of the same gene, thus making this algorithm more streamlined than other alternatives which must computationally attempt to distinguish genes from alleles. AVAILABILITY: The programs were written in PERL and are freely available at http://www.soybase.org/publication_data/Nelson/ESTminer/ESTminer.html CONTACT: nelsonrt@iastate.edu SUPPLEMENTARY INFORMATION: Figures and dataset can be obtained from: http://www.soybase.org/publication_data/Nelson/ESTminer/ESTminer.html.
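The abstract does not spell out the redundancy-based criterion, so the following is only a toy sketch of one way such a filter could work, not ESTminer's actual method. It assumes the ESTs of one gene family have already been aligned to the family consensus as equal-length strings, and it treats a combination of bases at the variable columns as reliable when at least min_support ESTs share it. The function name, the input format, and the threshold are all hypothetical.

```python
from collections import Counter


def reliable_variants(aligned_ests, min_support=2):
    """Return the variable-column signatures seen in >= min_support ESTs.

    aligned_ests: equal-length strings, each one EST aligned to the
    gene-family consensus (an assumed, simplified input format).
    """
    if not aligned_ests:
        return []
    length = len(aligned_ests[0])
    if any(len(s) != length for s in aligned_ests):
        raise ValueError("all aligned ESTs must have the same length")

    # Columns where the ESTs disagree with one another.
    variable = [i for i in range(length)
                if len({s[i] for s in aligned_ests}) > 1]

    # Each EST is summarized by its bases at the variable columns.
    signatures = Counter("".join(s[i] for i in variable)
                         for s in aligned_ests)

    # Keep signatures with enough redundant support; singletons are
    # treated as possible sequencing errors in this toy model.
    return sorted(sig for sig, n in signatures.items() if n >= min_support)


if __name__ == "__main__":
    ests = ["ACGTA", "ACGTA", "ATGTC", "ATGTC", "ACGTC"]
    print(reliable_variants(ests))
```

In an inbred genome, each supported signature would correspond to a distinct gene rather than an allele, which is why a simple count can stand in for the covariance analysis the abstract mentions.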
