1.

HaMStR: Profile hidden markov model based search for orthologs in ESTs

Ingo Ebersberger Sascha Strauss Arndt von Haeseler 《BMC evolutionary biology》2009,9(1):157-9

Background

EST sequencing is a versatile approach for rapidly gathering protein coding sequences. ESTs provide direct access to an organism's gene repertoire, bypassing the still error-prone procedure of gene prediction from genomic data. Therefore, ESTs are often the only source for biological sequence data from taxa outside mainstream interest. The widespread use of ESTs in evolutionary studies and particularly in molecular systematics studies is still hindered by the lack of efficient and reliable approaches for automated ortholog predictions in ESTs. Existing methods either depend on a known species tree or cannot cope with redundancy in EST data.

2.

PRESTA: associating promoter sequences with information on gene expression

Mach V 《Genome biology》2002,3(9):research0050.1-research00507

Background

Large sets of well-characterized promoter sequences are required to facilitate the understanding of promoter architecture. The major sequence databases are a prospective source of upstream regulatory regions, but suffer from inaccurate annotation. The software tool PRESTA (PRomoter EST Association) presented in this study is designed for efficient recovery of characterized and partially verified promoters from GenBank and EMBL libraries.

3.

PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results

Ji He Xinbin Dai Xuechun Zhao 《BMC bioinformatics》2007,8(1):53

Background

BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming.

4.

IsoSVM – Distinguishing isoforms and paralogs on the protein level

Michael Spitzer Stefan Lorkowski Paul Cullen Alexander Sczyrba Georg Fuellen 《BMC bioinformatics》2006,7(1):110-14

Background

Recent progress in cDNA and EST sequencing is yielding a deluge of sequence data. Like database search results and proteome databases, this data gives rise to inferred protein sequences without ready access to the underlying genomic data. Analysis of this information (e.g. for EST clustering or phylogenetic reconstruction from proteome data) is hampered because it is not known if two protein sequences are isoforms (splice variants) or not (i.e. paralogs/orthologs). However, even without knowing the intron/exon structure, visual analysis of the pattern of similarity across the alignment of the two protein sequences is usually helpful since paralogs and orthologs feature substitutions with respect to each other, as opposed to isoforms, which do not.

5.

Large-scale identification of polymorphic microsatellites using an in silico approach

Jifeng Tang Samantha J Baldwin Jeanne ME Jacobs C Gerard van der Linden Roeland E Voorrips Jack AM Leunissen Herman van Eck Ben Vosman 《BMC bioinformatics》2008,9(1):374

Background

Simple Sequence Repeat (SSR) or microsatellite markers are valuable for genetic research. Experimental methods to develop SSR markers are laborious, time consuming and expensive. In silico approaches have become a practicable and relatively inexpensive alternative during the last decade, although testing putative SSR markers still is time consuming and expensive. In many species only a relatively small percentage of SSR markers turn out to be polymorphic. This is particularly true for markers derived from expressed sequence tags (ESTs). In EST databases a large redundancy of sequences is present, which may contain information on length-polymorphisms in the SSR they contain, and whether they have been derived from heterozygotes or from different genotypes. Up to now, although a number of programs have been developed to identify SSRs in EST sequences, no software can detect putatively polymorphic SSRs.
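The idea in this abstract, that redundant ESTs covering the same SSR can reveal length polymorphism, can be illustrated with a minimal sketch. This is not the authors' software: the function names, the restriction to dinucleotide motifs, the `min_repeats` threshold, and the assumption that the input sequences already belong to one EST cluster are all hypothetical simplifications, and flanking sequences are not compared.

```python
import re
from collections import defaultdict

def find_ssrs(seq, min_repeats=5):
    """Return (motif, repeat_count, start) for each dinucleotide SSR in seq."""
    # A 2-base motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if motif[0] == motif[1]:
            continue  # skip homopolymer runs such as AAAA...
        hits.append((motif, len(m.group(0)) // 2, m.start()))
    return hits

def putatively_polymorphic(cluster_seqs, min_repeats=5):
    """Flag motifs whose repeat count differs among ESTs of one cluster.

    Crude assumption: the same motif in different ESTs of a cluster
    refers to the same SSR locus.
    """
    counts = defaultdict(set)
    for seq in cluster_seqs:
        for motif, n, _ in find_ssrs(seq, min_repeats):
            counts[motif].add(n)
    return {motif: sorted(ns) for motif, ns in counts.items() if len(ns) > 1}
```

A real pipeline would also need to separate sequencing errors and paralogs from true allelic variation, which this sketch does not attempt.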

6.

EST2uni: an open, parallel tool for automated EST analysis and database creation, with a data mining web interface and microarray expression data integration

Javier Forment Francisco Gilabert Antonio Robles Vicente Conejero Fernando Nuez Jose M Blanca 《BMC bioinformatics》2008,9(1):5

Background

Expressed sequence tag (EST) collections are composed of a high number of single-pass, redundant, partial sequences, which need to be processed, clustered, and annotated to remove low-quality and vector regions, eliminate redundancy and sequencing errors, and provide biologically relevant information. In order to provide a suitable way of performing the different steps in the analysis of the ESTs, flexible computation pipelines adapted to the local needs of specific EST projects have to be developed. Furthermore, EST collections must be stored in highly structured relational databases available to researchers through user-friendly interfaces which allow efficient and complex data mining, thus offering maximum capabilities for their full exploitation.

7.

ClustalXeed: a GUI-based grid computation version for high performance and terabyte size multiple sequence alignment

Taeho Kim Hyun Joo 《BMC bioinformatics》2010,11(1):467

Background

There is an increasing demand to assemble and align large-scale biological sequence data sets. The commonly used multiple sequence alignment programs are still limited in their ability to handle very large amounts of sequences because the system lacks a scalable high-performance computing (HPC) environment with a greatly extended data storage capacity.

8.

prot4EST: Translating Expressed Sequence Tags from neglected genomes

James D Wasmuth Mark L Blaxter 《BMC bioinformatics》2004,5(1):187

9.

An expressed sequence tag (EST) library from developing fruits of an Hawaiian endemic mint (Stenogyne rugosa, Lamiaceae): characterization and microsatellite markers

Charlotte Lindqvist Anne-Cathrine Scheen Mi-Jeong Yoo Paris Grey David G Oppenheimer James H Leebens-Mack Douglas E Soltis Pamela S Soltis Victor A Albert 《BMC plant biology》2006,6(1):16-15

Background

The endemic Hawaiian mints represent a major island radiation that likely originated from hybridization between two North American polyploid lineages. In contrast with the extensive morphological and ecological diversity among taxa, ribosomal DNA sequence variation has been found to be remarkably low. In the past few years, expressed sequence tag (EST) projects on plant species have generated a vast amount of publicly available sequence data that can be mined for simple sequence repeats (SSRs). However, these EST projects have largely focused on crop or otherwise economically important plants, and so far only few studies have been published on the use of intragenic SSRs in natural plant populations. We constructed an EST library from developing fleshy nutlets of Stenogyne rugosa principally to identify genetic markers for the Hawaiian endemic mints.

10.

Inferring angiosperm phylogeny from EST data with widespread gene duplication

Sanderson MJ McMahon MM 《BMC evolutionary biology》2007,7(Z1):S3

Background

Most studies inferring species phylogenies use sequences from single copy genes or sets of orthologs culled from gene families. For taxa such as plants, with very high levels of gene duplication in their nuclear genomes, this has limited the exploitation of nuclear sequences for phylogenetic studies, such as those available in large EST libraries. One rarely used method of inference, gene tree parsimony, can infer species trees from gene families undergoing duplication and loss, but its performance has not been evaluated at a phylogenomic scale for EST data in plants.

Results

A gene tree parsimony analysis based on EST data was undertaken for six angiosperm model species and Pinus, an outgroup. Although a large fraction of the tentative consensus sequences obtained from the TIGR database of ESTs was assembled into homologous clusters too small to be phylogenetically informative, some 557 clusters contained promising levels of information. Based on maximum likelihood estimates of the gene trees obtained from these clusters, gene tree parsimony correctly inferred the accepted species tree with strong statistical support. A slight variant of this species tree was obtained when maximum parsimony was used to infer the individual gene trees instead.

Conclusion

Despite the complexity of the EST data and the relatively small fraction eventually used in inferring a species tree, the gene tree parsimony method performed well in the face of very high apparent rates of duplication.

11.

Development and production of an oligonucleotide MuscleChip: use for validation of ambiguous ESTs

Rehannah HA Borup Stefano Toppo Yi-Wen Chen Tanya M Teslovich Gerolamo Lanfranchi Giorgio Valle Eric P Hoffman 《BMC bioinformatics》2002,3(1):33

12.

SIMPROT: Using an empirically determined indel distribution in simulations of protein evolution

Andy Pang Andrew D Smith Paulo AS Nuin Elisabeth RM Tillier 《BMC bioinformatics》2005,6(1):236

Background

General protein evolution models help determine the baseline expectations for the evolution of sequences, and they have been extensively useful in sequence analysis and for the computer simulation of artificial sequence data sets.

13.

SimHap GUI: An intuitive graphical user interface for genetic association analysis (total citations: 1; self-citations: 0; citations by others: 1)

Kim W Carter Pamela A McCaskie Lyle J Palmer 《BMC bioinformatics》2008,9(1):1-6

Background

The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways.

Results

annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational postgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools.

Conclusion

annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non-model species EST-sequencing projects.
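The parse-and-look-up step described in the Results of this entry (BLAST results are parsed and annotations retrieved from a reference database) can be sketched roughly as follows. This is not annot8r's code: it assumes 12-column tab-separated BLAST output (query, subject, ..., E-value, bit score), and the function names, the E-value cutoff, and the in-memory `reference` dictionary standing in for the reference database are all hypothetical.

```python
import csv
import io

def best_hits(blast_tsv, max_evalue=1e-5):
    """For each query, keep the highest-scoring hit that passes the E-value cutoff."""
    best = {}
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best

def annotate(best, reference):
    """Map each query to the annotation terms stored for its best hit.

    `reference` is a hypothetical dict from subject ID to a list of terms
    (for example GO identifiers); hits without an entry get an empty list.
    """
    return {query: reference.get(subject, []) for query, (subject, _) in best.items()}
```

Keeping only the single best hit per query is a design simplification for the sketch; a tool could equally retain several hits per query and merge their annotations.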

14.

Transcriptome analysis of the desert locust central nervous system: production and annotation of a Schistocerca gregaria EST database

Badisco L Huybrechts J Simonet G Verlinden H Marchal E Huybrechts R Schoofs L De Loof A Vanden Broeck J 《PloS one》2011,6(3):e17274

15.

CoaSim: A flexible environment for simulating genetic data under coalescent models

Thomas Mailund Mikkel H Schierup Christian NS Pedersen Peter JM Mechlenborg Jesper N Madsen Leif Schauser 《BMC bioinformatics》2005,6(1):252

Background

Coalescent simulations are playing a large role in interpreting large scale intra-specific sequence or polymorphism surveys and for planning and evaluating association studies. Coalescent simulations of data sets under different models can be compared to the actual data to test the importance of different evolutionary factors and thus get insight into these.

16.

QualitySNP: a pipeline for detecting single nucleotide polymorphisms and insertions/deletions in EST data from diploid and polyploid species (total citations: 2; self-citations: 0; citations by others: 2)

Jifeng Tang Ben Vosman Roeland E Voorrips C Gerard van der Linden Jack AM Leunissen 《BMC bioinformatics》2006,7(1):438

Background

Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are considered to be faster and more cost-effective than experimental procedures. A major challenge in computational SNP discovery is distinguishing allelic variation from sequence variation between paralogous sequences, in addition to recognizing sequencing errors. For the majority of the public EST sequences, trace or quality files are lacking which makes detection of reliable SNPs even more difficult because it has to rely on sequence comparisons only.

17.

Deep sequencing of ESTs from nacreous and prismatic layer producing tissues and a screen for novel shell formation-related genes in the pearl oyster

Kinoshita S Wang N Inoue H Maeyama K Okamoto K Nagai K Kondo H Hirono I Asakawa S Watabe S 《PloS one》2011,6(6):e21238

18.

Fourmidable: a database for ant genomics (total citations: 1; self-citations: 0; citations by others: 1)

Yannick Wurm Paolo Uva Frédéric Ricci John Wang Stephanie Jemielity Christian Iseli Laurent Falquet Laurent Keller 《BMC genomics》2009,10(1):1-5

19.

Phylogenomics with incomplete taxon coverage: the limits to inference

Michael J Sanderson Michelle M McMahon Mike Steel 《BMC evolutionary biology》2010,10(1):155

Background

Phylogenomic studies based on multi-locus sequence data sets are usually characterized by partial taxon coverage, in which sequences for some loci are missing for some taxa. The impact of missing data has been widely studied in phylogenetics, but it has proven difficult to distinguish effects due to error in tree reconstruction from effects due to missing data per se. We approach this problem using an explicitly phylogenomic criterion of success, decisiveness, which refers to whether the pattern of taxon coverage allows for uniquely defining a single tree for all taxa.

20.

SolEST database: a "one-stop shop" approach to the study of Solanaceae transcriptomes

Nunzio D'Agostino Alessandra Traini Luigi Frusciante Maria Luisa Chiusano 《BMC plant biology》2009,9(1):142-16