1.
2.
Background
EST sequencing is a versatile approach for rapidly gathering protein-coding sequences. ESTs provide direct access to an organism's gene repertoire, bypassing the still error-prone procedure of gene prediction from genomic data. Therefore, ESTs are often the only source of biological sequence data for taxa outside mainstream interest. The widespread use of ESTs in evolutionary studies, and particularly in molecular systematics, is still hindered by the lack of efficient and reliable approaches for automated ortholog prediction in ESTs. Existing methods either depend on a known species tree or cannot cope with redundancy in EST data.
3.
Jose EB de la Torre Mary G Egan Manpreet S Katari Eric D Brenner Dennis W Stevenson Gloria M Coruzzi Rob DeSalle 《BMC evolutionary biology》2006,6(1):48-15
Background
While Expressed Sequence Tags (ESTs) have proven a viable and efficient way to sample genomes, particularly those for which whole-genome sequencing is impractical, phylogenetic analysis using ESTs remains difficult. Sequencing errors and orthology determination are the major problems when using ESTs as a source of characters for systematics. Here we develop methods to incorporate EST sequence information in a simultaneous analysis framework to address controversial phylogenetic questions regarding the relationships among the major groups of seed plants. We use an automated, phylogenetically derived approach to orthology determination called OrthologID, generate a phylogeny based on 43 process partitions, many of which are derived from ESTs, and examine several measures of support to assess the utility of EST data for phylogenies.
4.
Background
ESTs are a tremendous resource for determining the exon-intron structures of genes, but even extensive EST sequencing tends to leave many exons and genes untouched. Gene prediction systems based exclusively on EST alignments miss these exons and genes, leading to poor sensitivity. De novo gene prediction systems, which ignore ESTs in favor of genomic sequence, can predict such "untouched" exons, but they are less accurate when predicting exons to which ESTs align. TWINSCAN is the most accurate de novo gene finder available for nematodes and N-SCAN is the most accurate for mammals, as measured by exact CDS gene prediction and exact exon prediction.
5.
Javier Forment Francisco Gilabert Antonio Robles Vicente Conejero Fernando Nuez Jose M Blanca 《BMC bioinformatics》2008,9(1):5
Background
Expressed sequence tag (EST) collections are composed of a high number of single-pass, redundant, partial sequences, which need to be processed, clustered, and annotated to remove low-quality and vector regions, eliminate redundancy and sequencing errors, and provide biologically relevant information. In order to provide a suitable way of performing the different steps in the analysis of the ESTs, flexible computation pipelines adapted to the local needs of specific EST projects have to be developed. Furthermore, EST collections must be stored in highly structured relational databases available to researchers through user-friendly interfaces which allow efficient and complex data mining, thus offering maximum capabilities for their full exploitation.
6.
Alternative splicing and protein function (total citations: 1; self-citations: 0; citations by others: 1)
Background
Alternative splicing is a major mechanism for generating protein diversity in higher eukaryotes. Although at least half, and probably more, of mammalian genes are alternatively spliced, it was not clear whether the frequency of alternative splicing is the same in different functional categories. The problem is obscured by uneven coverage of genes by ESTs and a large number of artifacts in the EST data.
7.
Utility of EST-derived SSR in cultivated peanut (Arachis hypogaea L.) and Arachis wild species (total citations: 1; self-citations: 0; citations by others: 1)
Xuanqiang Liang Xiaoping Chen Yanbin Hong Haiyan Liu Guiyuan Zhou Shaoxiong Li Baozhu Guo 《BMC plant biology》2009,9(1):35-9
Background
A lack of sufficient molecular markers hinders current genetic research in peanuts (Arachis hypogaea L.). It is necessary to develop more molecular markers for potential use in peanut genetic research. With the development of peanut EST projects, a vast amount of EST sequence data has been generated. These data offer an opportunity to identify SSRs in ESTs by data mining.
8.
Shaolin Wang Eric Peatman Jason Abernathy Geoff Waldbieser Erika Lindquist Paul Richardson Susan Lucas Mei Wang Ping Li Jyothi Thimmapuram Lei Liu Deepika Vullaganti Huseyin Kucuktas Christopher Murdock Brian C Small Melanie Wilson Hong Liu Yanliang Jiang Yoona Lee Fei Chen Jianguo Lu Wenqi Wang Peng Xu Benjaporn Somridhivej Puttharat Baoprasertkul Jonas Quilang Zhenxia Sha Baolong Bao Yaping Wang Qun Wang Tomokazu Takano Samiran Nandi Shikai Liu Lilian Wong Ludmilla Kaltenboeck Sylvie Quiniou Eva Bengten Norman Miller John Trant Daniel Rokhsar Zhanjiang Liu 《Genome biology》2010,11(1):1-14
9.
Background
Expressed sequence tag (EST) analyses are a fundamental tool for gene identification in organisms. Given a preliminary EST sample from a certain library, several statistical prediction problems arise. In particular, it is of interest to estimate how many new genes can be detected in a future EST sample of a given size and also to determine the gene discovery rate: these estimates represent the basis for deciding whether to proceed with sequencing the library and, in case of a positive decision, a guideline for selecting the size of the new sample. Such information is also useful for establishing sequencing efficiency in experimental design and for measuring the degree of redundancy of an EST library.
10.
11.
12.
Jifeng Tang Samantha J Baldwin Jeanne ME Jacobs C Gerard van der Linden Roeland E Voorrips Jack AM Leunissen Herman van Eck Ben Vosman 《BMC bioinformatics》2008,9(1):374
Background
Simple Sequence Repeat (SSR) or microsatellite markers are valuable for genetic research. Experimental methods to develop SSR markers are laborious, time consuming and expensive. In silico approaches have become a practicable and relatively inexpensive alternative during the last decade, although testing putative SSR markers still is time consuming and expensive. In many species only a relatively small percentage of SSR markers turn out to be polymorphic. This is particularly true for markers derived from expressed sequence tags (ESTs). In EST databases a large redundancy of sequences is present, which may contain information on length-polymorphisms in the SSR they contain, and whether they have been derived from heterozygotes or from different genotypes. Up to now, although a number of programs have been developed to identify SSRs in EST sequences, no software can detect putatively polymorphic SSRs.
13.
Jie Zheng Jan T Svensson Kavitha Madishetty Timothy J Close Tao Jiang Stefano Lonardi 《BMC bioinformatics》2006,7(1):7
Background
Expressed sequence tag (EST) datasets represent perhaps the largest collection of genetic information. ESTs can be exploited in a variety of biological experiments and analyses. Here we are interested in the design of overlapping oligonucleotide (overgo) probes from large unigene (EST-contig) datasets.
14.
Hyun-Jin Kim Kwang-Hyun Baek Seung-Won Lee JungEun Kim Bong-Woo Lee Hye-Sun Cho Woo Taek Kim Doil Choi Cheol-Goo Hur 《BMC plant biology》2008,8(1):101
Background
There is no dedicated database available for Expressed Sequence Tags (EST) of the chili pepper (Capsicum annuum), although the interest in a chili pepper EST database is increasing internationally due to the nutritional, economic, and pharmaceutical value of the plant. Recent advances in high-throughput sequencing of the ESTs of chili pepper cv. Bukang have produced hundreds of thousands of complementary DNA (cDNA) sequences. Therefore, a chili pepper EST database was designed and constructed to enable comprehensive analysis of chili pepper gene expression in response to biotic and abiotic stresses.
15.
16.
Background
Immotile cilia syndrome (ICS) or primary ciliary dyskinesia (PCD) is an autosomal recessive disorder in humans in which the beating of cilia and sperm flagella is impaired. Ciliated epithelial cell linings are present in many tissues. To understand ciliary assembly and motility, it is important to isolate those genes involved in the process.
Results
Total RNA was isolated from cultured ciliated nasal epithelial cells after in vitro ciliogenesis and expressed sequence tags (ESTs) were generated. The functions and locations of 63 of these ESTs were derived by BLAST from two public databases. These ESTs are grouped into various classes. One group has high homology not only with the mitochondrial genome but also with one or more chromosomal DNAs, suggesting that very similar genes, or genes with very similar domains, are expressed from both mitochondrial and nuclear DNA. A second class comprises genes with complete homology with part of a known gene, suggesting that they are the same genes. A third group has partial homology with domains of known genes. A fourth group, constituting 33% of the ESTs characterized, has no significant homology with any gene or EST in the database.
Conclusions
We have shown that sufficient information about the location of ESTs could be derived electronically from the recently completed human genome sequences. This strategy of EST localization should be significantly useful for mapping and identification of new genes in the forthcoming human genome sequences with the vast number of ESTs in the dbEST database.
17.
QualitySNP: a pipeline for detecting single nucleotide polymorphisms and insertions/deletions in EST data from diploid and polyploid species (total citations: 2; self-citations: 0; citations by others: 2)
Jifeng Tang Ben Vosman Roeland E Voorrips C Gerard van der Linden Jack AM Leunissen 《BMC bioinformatics》2006,7(1):438
Background
Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are considered to be faster and more cost-effective than experimental procedures. A major challenge in computational SNP discovery is distinguishing allelic variation from sequence variation between paralogous sequences, in addition to recognizing sequencing errors. For the majority of the public EST sequences, trace or quality files are lacking, which makes detection of reliable SNPs even more difficult because it has to rely on sequence comparisons only.
18.
R Henrik Nilsson Balaji Rajashekar Karl-Henrik Larsson Björn M Ursing 《BMC bioinformatics》2004,5(1):87
Background
Research involving expressed sequence tags (ESTs) is intricately coupled to the existence of large, well-annotated sequence repositories. Comparatively complete and satisfactory annotated public sequence libraries are, however, available only for a limited range of organisms, rendering the absence of sequences and gene structure information a tangible problem for those working with taxa lacking an EST or genome sequencing project. Paralogous genes belonging to the same gene family but distinguished by derived characteristics are particularly prone to misidentification and erroneous annotation; high but incomplete levels of sequence similarity are typically difficult to interpret and have formed the basis of many unsubstantiated assumptions of orthology.
19.