1.

GMAP: a genomic mapping and alignment program for mRNA and EST sequences (total citations: 13; self-citations: 0; citations by others: 13)

Wu TD Watanabe CK 《Bioinformatics (Oxford, England)》2005,21(9):1859-1875

MOTIVATION: We introduce GMAP, a standalone program for mapping and aligning cDNA sequences to a genome. The program maps and aligns a single sequence with minimal startup time and memory requirements, and provides fast batch processing of large sequence sets. The program generates accurate gene structures, even in the presence of substantial polymorphisms and sequence errors, without using probabilistic splice site models. Methodology underlying the program includes a minimal sampling strategy for genomic mapping, oligomer chaining for approximate alignment, sandwich DP for splice site detection, and microexon identification with statistical significance testing. RESULTS: On a set of human messenger RNAs with random mutations at rates of 1% and 3%, GMAP identified all splice sites accurately in over 99.3% of the sequences, which was one-tenth the error rate of existing programs. On a large set of human expressed sequence tags, GMAP provided higher-quality alignments more often than blat did. On a set of Arabidopsis cDNAs, GMAP performed comparably with GeneSeqer. In these experiments, GMAP demonstrated a several-fold increase in speed over existing programs. AVAILABILITY: Source code for gmap and associated programs is available at http://www.gene.com/share/gmap. SUPPLEMENTARY INFORMATION: http://www.gene.com/share/gmap.

2.

Arabidopsis genomic information for interpreting wheat EST sequences

Clarke B Lambrecht M Rhee SY 《Functional & integrative genomics》2003,3(1-2):33-38

The resources available from Arabidopsis thaliana for interpreting functional attributes of wheat ESTs are reviewed. A focus for the review is a comparison between wheat EST sequences, generated from developing endosperm tissue, and the complete genomic sequence from Arabidopsis. The available information indicates that not only can tentative annotations be assigned to many wheat genes but also putative or unknown Arabidopsis gene annotations can be improved by comparative genomics. Electronic Publication

3.

MAP2: multiple alignment of syntenic genomic sequences

Ye L Huang X 《Nucleic acids research》2005,33(1):162-170

We describe a multiple alignment program named MAP2 based on a generalized pairwise global alignment algorithm for handling long, different intergenic and intragenic regions in genomic sequences. The MAP2 program produces an ordered list of local multiple alignments of similar regions among sequences, where different regions between local alignments are indicated by reporting only similar regions. We propose two similarity measures for the evaluation of the performance of MAP2 and existing multiple alignment programs. Experimental results produced by MAP2 on four real sets of orthologous genomic sequences show that MAP2 rarely missed a block of transitively similar regions and that MAP2 never produced a block of regions that are not transitively similar. Experimental results by MAP2 on six simulated data sets show that MAP2 found the boundaries between similar and different regions precisely. This feature is useful for finding conserved functional elements in genomic sequences. The MAP2 program is freely available in source code form at http://bioinformatics.iastate.edu/aat/sas.html for academic use.

4.

Fast and sensitive multiple alignment of large genomic sequences

Michael Brudno Michael Chapman Berthold Göttgens Serafim Batzoglou Burkhard Morgenstern 《BMC bioinformatics》2003,4(1):66

Background

Genomic sequence alignment is a powerful method for genome analysis and annotation, as alignments are routinely used to identify functional sites such as genes or regulatory elements. With a growing number of partially or completely sequenced genomes, multiple alignment is playing an increasingly important role in these studies. In recent years, various tools for pair-wise and multiple genomic alignment have been proposed. Some of them are extremely fast, but often efficiency is achieved at the expense of sensitivity. One way of combining speed and sensitivity is to use an anchored-alignment approach. In a first step, a fast search program identifies a chain of strong local sequence similarities. In a second step, regions between these anchor points are aligned using a slower but more accurate method.

Results

Herein, we present CHAOS, a novel algorithm for rapid identification of chains of local pair-wise sequence similarities. Local alignments calculated by CHAOS are used as anchor points to improve the running time of DIALIGN, a slow but sensitive multiple-alignment tool. We show that this way, the running time of DIALIGN can be reduced by more than 95% for BAC-sized and longer sequences, without affecting the quality of the resulting alignments. We apply our approach to a set of five genomic sequences around the stem-cell-leukemia (SCL) gene and demonstrate that exons and small regulatory elements can be identified by our multiple-alignment procedure.

Conclusion

We conclude that the novel CHAOS local alignment tool is an effective way to significantly speed up global alignment tools such as DIALIGN without reducing the alignment quality. We likewise demonstrate that the DIALIGN/CHAOS combination is able to accurately align short regulatory sequences in distant orthologues.
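The two-step anchored-alignment idea described in this abstract can be illustrated with a small Python sketch. It is not the CHAOS or DIALIGN algorithm: exact k-mer matches stand in for the "strong local similarities", a simple quadratic dynamic program stands in for the chaining step, and the function names (`find_seeds`, `chain_seeds`, `gaps_between`) are hypothetical.

```python
def find_seeds(a, b, k):
    """Return all (i, j) with a[i:i+k] == b[j:j+k], using a k-mer index of a."""
    index = {}
    for i in range(len(a) - k + 1):
        index.setdefault(a[i:i + k], []).append(i)
    return [(i, j)
            for j in range(len(b) - k + 1)
            for i in index.get(b[j:j + k], [])]


def chain_seeds(seeds, k):
    """Longest chain of non-overlapping seeds that is collinear in both sequences."""
    seeds = sorted(seeds)
    if not seeds:
        return []
    best = [1] * len(seeds)   # best[x]: length of the best chain ending at seed x
    prev = [-1] * len(seeds)  # back-pointers for recovering the chain
    for x, (ix, jx) in enumerate(seeds):
        for y in range(x):
            iy, jy = seeds[y]
            if iy + k <= ix and jy + k <= jx and best[y] + 1 > best[x]:
                best[x], prev[x] = best[y] + 1, y
    end = max(range(len(seeds)), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(seeds[end])
        end = prev[end]
    return chain[::-1]


def gaps_between(chain, k):
    """Half-open regions between consecutive anchors, left for a slower aligner."""
    return [((i1 + k, i2), (j1 + k, j2))
            for (i1, j1), (i2, j2) in zip(chain, chain[1:])]


a = "GATTACATTTTCGGCTA"
b = "GATTACAGGGGGCGGCTA"
anchors = chain_seeds(find_seeds(a, b, 5), 5)
regions = gaps_between(anchors, 5)
```

Only the regions returned by `gaps_between` would need the slower, more sensitive method, which is where the running-time saving in an anchored approach comes from.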

5.

Large-scale predictions of secretory proteins from mammalian genomic and EST sequences

Ladunga I 《Current opinion in biotechnology》2000,11(1):13-18

Machine learning techniques have improved predictions of secretory proteins from protein, genomic and expressed sequence tag (EST) sequences. Artificial neural networks, physical sequence analysis using high-performance optimization, and hidden Markov models identify extremely variable signal peptides (the vehicles of protein transport across the endoplasmic reticulum membrane), transmembrane segments, and specific extracellular and intracellular domains as indicators of possible roles in the intercellular and intracellular chemical signaling pathways. The major role of peptide hormones, blood coagulation factors, carcinogenesis agents, and other secretory proteins in orchestrating multicellular life indicates pharmacological potential in the cure of major diseases and numerous biotechnological applications.

6.

Direct mapping and alignment of protein sequences onto genomic sequence

Gotoh O 《Bioinformatics (Oxford, England)》2008,24(21):2438-2444

7.

Integrated web service for improving alignment quality based on segments comparison

Dariusz Plewczynski Leszek Rychlewski Yuzhen Ye Lukasz Jaroszewski Adam Godzik 《BMC bioinformatics》2004,5(1):98

8.

SITEBLAST--rapid and sensitive local alignment of genomic sequences employing motif anchors

Michael M Dieterich C Vingron M 《Bioinformatics (Oxford, England)》2005,21(9):2093-2094

MOTIVATION: Comparative sequence analysis is the essence of many approaches to genome annotation. Heuristic alignment algorithms utilize similar seed pairs to anchor an alignment. Some applications of local alignment algorithms (e.g. phylogenetic footprinting) would benefit from including prior knowledge (e.g. binding site motifs) in the alignment building process. RESULTS: We introduce predefined sequence patterns as anchor points into a heuristic local alignment strategy. We extended the BLASTZ program for this purpose. A set of seed patterns is either given as consensus sequences in IUPAC code or position-weight-matrices. Phylogenetic footprinting of promoter regions is one of many potential applications for the SITEBLAST software. AVAILABILITY: The source code is freely available to the academic community from http://corg.molgen.mpg.de/software
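As a rough illustration of using an IUPAC consensus sequence as a seed pattern, the Python sketch below expands the standard IUPAC nucleotide codes into a regular expression and pairs up motif occurrences in two sequences as candidate anchors. It is not the SITEBLAST or BLASTZ implementation; the function names and the E-box-like example motif `CANNTG` are assumptions made for the example.

```python
import re

# Standard IUPAC nucleotide ambiguity codes, written as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_sites(consensus, seq):
    """Start positions of all (possibly overlapping) matches of an IUPAC consensus."""
    body = "".join(IUPAC[c] for c in consensus.upper())
    # A lookahead makes finditer report overlapping occurrences as well.
    return [m.start() for m in re.finditer("(?=" + body + ")", seq.upper())]


def motif_anchor_pairs(consensus, seq_a, seq_b):
    """Every pairing of a motif site in seq_a with a motif site in seq_b."""
    sites_b = motif_sites(consensus, seq_b)
    return [(i, j) for i in motif_sites(consensus, seq_a) for j in sites_b]


pairs = motif_anchor_pairs("CANNTG", "GGCACGTGAA", "TTCAGCTGCCATATGA")
```

Each pair is only a candidate anchor; a real tool would still extend and score the alignment around it.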

9.

OCPAT: an online codon-preserved alignment tool for evolutionary genomic analysis of protein coding sequences

Guozhen Liu Monica Uddin Munirul Islam Morris Goodman Lawrence I Grossman Roberto Romero Derek E Wildman 《Source code for biology and medicine》2007,2(1):5

10.

A multiple alignment program for protein sequences (total citations: 1; self-citations: 0; citations by others: 1)

Santibanez M.; Rohde K. 《Bioinformatics (Oxford, England)》1987,3(2):111-114

A program for the multiple alignment of protein sequences is presented. The program is an extension of the fast alignment program by Wilbur et al. (1984) into higher dimensions. The use of hash procedures on fragments of the protein sequences increases the speed of calculation. Thereby we also take into account fragments which are present in some, but not in all, sequences considered. The results of some multiple alignments are given. Received on September 11, 1986; accepted on March 18, 1987
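The general idea of hashing fixed-length fragments so that shared fragments can be found quickly, including fragments present in only some of the sequences, can be sketched in Python as follows. This is an illustrative assumption about how such a lookup might work, not the published program; the fragment length `k`, the threshold `min_seqs`, and the function name are hypothetical.

```python
def shared_fragments(seqs, k=3, min_seqs=2):
    """Map each length-k fragment to {sequence index: [positions]},
    keeping only fragments found in at least min_seqs sequences."""
    index = {}
    for s, seq in enumerate(seqs):
        for pos in range(len(seq) - k + 1):
            # Hash-table lookup by fragment, then by sequence index.
            index.setdefault(seq[pos:pos + k], {}).setdefault(s, []).append(pos)
    return {frag: hits for frag, hits in index.items() if len(hits) >= min_seqs}


proteins = ["MKVLA", "AKVLG", "MKTTT"]
shared = shared_fragments(proteins, k=3, min_seqs=2)
```

Setting `min_seqs` below the number of sequences is what lets fragments shared by a subset of the sequences contribute candidate alignment positions.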

11.

A graph based algorithm for generating EST consensus sequences

Malde K Coward E Jonassen I 《Bioinformatics (Oxford, England)》2005,21(8):1371-1375

12.

MICAS: a fully automated web server for microsatellite extraction and analysis from prokaryote and viral genomic sequences

Sreenu VB Ranjitkumar G Swaminathan S Priya S Bose B Pavan MN Thanu G Nagaraju J Nagarajaram HA 《Applied bioinformatics》2003,2(3):165-168

MICAS is a web server for extracting microsatellite information from completely sequenced prokaryote and viral genomes, or user-submitted sequences. This server provides an integrated platform for MICdb (database of prokaryote and viral microsatellites), W-SSRF (simple sequence repeat finding program) and Autoprimer (primer design software). MICAS, through dynamic HTML page generation, helps in the systematic extraction of microsatellite information from selected genomes hosted on MICdb or from user-submitted sequences. Further, it assists in the design of primers with the help of Autoprimer, for sequences containing selected microsatellite tracts.

13.

Use of bovine EST data and human genomic sequences to map 100 gene-specific bovine markers (total citations: 4; self-citations: 0; citations by others: 4)

Roger T. Stone W. Michael Grosse Eduardo Casas Timothy P.L. Smith John W. Keele Gary L. Bennett 《Mammalian genome》2002,13(4):211-215

A system to use bovine EST data in conjunction with human genomic sequence to improve the bovine linkage map over the entire genome or on specific chromosomes was evaluated. Bovine EST sequence was used to provide primer sequences corresponding to bovine genes, while human genomic sequence directed primer design to flank introns and produce amplicons of appropriate size for efficient direct sequencing. The sequence tagged sites (STS) produced in this way from the four sires of the MARC reference families were examined for single nucleotide polymorphisms (SNPs) that could be used to map the corresponding genes. With this approach, along with a primer/extension mass spectrometry SNP genotyping assay, 100 ESTs were placed on the bovine genetic linkage map. The first 70 were chosen at random from bovine EST–human genomic comparisons. An additional 30 ESTs were successfully mapped to bovine Chromosome 19 (BTA19), and comparison of the resulting BTA19 map to the position of the corresponding human orthologs on the HSA17 draft sequences revealed differences in the spacing and order of genes. Over 80% of successful amplicons contained SNPs, indicating that this is an efficient approach to generating EST-associated genetic markers. We have demonstrated the feasibility of constructing a linkage map based on SNPs associated with ESTs and the plausibility of utilizing EST, comparative mapping information, and human sequence data to target regions of the bovine genome for SNP marker development.

14.

A web service for biomedical term look-up

Harkema H Roberts I Gaizauskas R Hepple M 《Comparative and Functional Genomics》2005,6(1-2):86-93

Recent years have seen a huge increase in the amount of biomedical information that is available in electronic format. Consequently, for biomedical researchers wishing to relate their experimental results to relevant data lurking somewhere within this expanding universe of on-line information, the ability to access and navigate biomedical information sources in an efficient manner has become increasingly important. Natural language and text processing techniques can facilitate this task by making the information contained in textual resources such as MEDLINE more readily accessible and amenable to computational processing. Names of biological entities such as genes and proteins provide critical links between different biomedical information sources and researchers' experimental data. Therefore, automatic identification and classification of these terms in text is an essential capability of any natural language processing system aimed at managing the wealth of biomedical information that is available electronically. To support term recognition in the biomedical domain, we have developed Termino, a large-scale terminological resource for text processing applications, which has two main components: first, a database into which very large numbers of terms can be loaded from resources such as UMLS, and stored together with various kinds of relevant information; second, a finite state recognizer, for fast and efficient identification and mark-up of terms within text. Since many biomedical applications require this functionality, we have made Termino available to the community as a web service, which allows for its integration into larger applications as a remotely located component, accessed through a standardized interface over the web.

15.

Microsatellite markers for Vaccinium from EST and genomic libraries

P. S. BOCHES N. V. BASSIL L. J. ROWLAND 《Molecular ecology resources》2005,5(3):657-660

We present 30 microsatellite loci isolated from expressed sequence tag (EST) and genomic libraries in Vaccinium corymbosum L. Allele number per locus in 11 tetraploid and one diploid V. corymbosum accessions ranged from two to 15 (mean = 8.16) in 24 single‐locus simple sequence repeats (SSRs). Cross‐species amplification in a panel of 12 species representing nine sections ranged from 30 to 100% (mean = 83%).

16.

ReAlignerV: Web-based genomic alignment tool with high specificity and robustness estimated by species-specific insertion sequences

Hisakazu Iwama Yukio Hori Kensuke Matsumoto Koji Murao Toshihiko Ishida 《BMC bioinformatics》2008,9(1):112

17.

EST and random genomic nuclear microsatellite markers for Streptocarpus

M. Hughes M. Möller D. U. Bellstedt T. J. Edwards M. Woodhead 《Molecular ecology resources》2004,4(1):36-38

Microsatellite markers have been developed from standard enriched genomic libraries and a cDNA library for the genus Streptocarpus. Out of 15 loci derived from ESTs (expressed sequence tags), four gave working primer pairs, with expected heterozygosities (H_E) ranging from 0.42 to 0.86. Out of 89 genomic library derived loci, 6 gave working primer pairs, with H_E ranging from 0.63 to 0.93.

18.

GoSh: a web-based database for goat and sheep EST sequences

Caprera A Lazzari B Stella A Merelli I Caetano AR Mariani P 《Bioinformatics (Oxford, England)》2007,23(8):1043-1045

The GoSh database is a collection of 58 990 Capra hircus and Ovis aries expressed sequence tags. A Perl pipeline was prepared to process sequences, and data were collected in a MySQL database. A PHP-based web interface allows browsing and querying the database. Putative single nucleotide polymorphism (SNP) detection and a search for repeats were performed, and links to external related resources were provided. Sequences were annotated against three different databases and an algorithm was implemented to create statistics of the distribution of retrieved homologous ontologies in the Gene Ontology categories. The GoSh database is a repository of data and links related to goat and sheep expressed genes. AVAILABILITY: The GoSh database is available at http://www.itb.cnr.it/gosh/

19.

Development of intron-flanking EST markers for the Lolium/Festuca complex using rice genomic information

Ken-ichi Tamura Jun-ichi Yonemaru Hiroshi Hisano Hiroyuki Kanamori Julie King Ian P. King Kazuhiro Tase Yasuharu Sanada Toshinori Komatsu Toshihiko Yamada 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2009,118(8):1549-1560

DNA markers able to distinguish species or genera with high specificity are valuable in the identification of introgressed regions in interspecific or intergeneric hybrids. Intergeneric hybridization between the genera of Lolium and Festuca, leading to the reciprocal introgression of chromosomal segments, can produce novel forage grasses with unique combinations of characteristics. To characterize Lolium/Festuca introgressions, novel PCR-based expressed sequence tag (EST) markers were developed. These markers were designed around intronic regions which show higher polymorphism than exonic regions. Intronic regions of the grass genes were predicted from the sequenced rice genome. Two hundred and nine primer sets were designed from Lolium/Festuca ESTs that showed high similarity to unique rice genes dispersed uniformly throughout the rice genome. We selected 61 of these primer sets as insertion-deletion (indel)-type markers and 82 primer sets as cleaved amplified polymorphic sequence (CAPS) markers to distinguish between Lolium perenne and Festuca pratensis. Specificity of these markers to each species was evaluated by the genotyping of four cultivars and accessions (32 individuals) of L. perenne and F. pratensis, respectively. Evaluation using specificity indices proposed in this study suggested that many indel-type markers had high species specificity to L. perenne and F. pratensis, including 15 markers completely specific to both species. Forty-nine of the CAPS markers completely distinguish between the two species at bulk level. Chromosome mapping of these markers using a Lolium/Festuca substitution line revealed syntenic relationships between Lolium/Festuca and rice largely consistent with previous reports. This intron-based marker system that shows a high level of polymorphisms between species in combination with high species specificity will consequently be a valuable tool in Festulolium breeding.

20.

An optimized protocol for analysis of EST sequences (total citations: 16; self-citations: 1; citations by others: 16)

Liang F Holt I Pertea G Karamycheva S Salzberg SL Quackenbush J 《Nucleic acids research》2000,28(18):3657-3665
