Similar articles
 Found 20 similar articles (search time: 312 ms)
1.
In recent years, identification of alternative splicing (AS) variants has been gaining momentum. We developed AVATAR, a database for documenting AS using 5,469,433 human EST sequences and 26,159 human mRNA sequences. AVATAR contains 12,000 alternative splicing sites identified by mapping ESTs and mRNAs to the whole human genome sequence. AVATAR also contains AS information for 6 eukaryotes. We mapped EST alignment information into a graph model in which exons and introns are represented by vertices and edges, respectively. AVATAR can be queried using search parameters such as (1) gene names, (2) the number of identified AS events in a gene, and (3) the minimal number of ESTs supporting a splicing site. The system provides visualized AS information for queried genes.
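The graph model mentioned in this abstract can be illustrated with a minimal sketch. It is not AVATAR's implementation: the function names, the coordinate representation, and the toy alignments are all assumptions made for illustration. Exons are vertices, and each intron joining two consecutive exons in an aligned EST or mRNA is an edge whose weight counts the supporting sequences.

```python
from collections import defaultdict


def build_splice_graph(alignments):
    """Count supporting sequences for each intron edge.

    alignments: one list of exon (start, end) tuples per aligned EST/mRNA,
    ordered along the genome (hypothetical input format).
    """
    support = defaultdict(int)
    for exons in alignments:
        # Each pair of consecutive exons implies one intron (edge).
        for upstream, downstream in zip(exons, exons[1:]):
            support[(upstream, downstream)] += 1
    return dict(support)


def alternative_sites(support, min_support=1):
    """Return exons with more than one sufficiently supported outgoing edge."""
    outgoing = defaultdict(set)
    for (upstream, downstream), count in support.items():
        if count >= min_support:
            outgoing[upstream].add(downstream)
    return {exon: sorted(targets)
            for exon, targets in outgoing.items() if len(targets) > 1}


# Toy data: the third sequence skips the middle exon.
ests = [
    [(100, 200), (300, 400), (500, 600)],
    [(100, 200), (300, 400), (500, 600)],
    [(100, 200), (500, 600)],
]
graph = build_splice_graph(ests)
print(alternative_sites(graph, min_support=1))
```

Raising `min_support` mirrors the "minimal number of ESTs supporting a splicing site" query parameter: with `min_support=2` the single-EST exon-skipping edge is ignored and no alternative site is reported for the toy data.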



2.
Computational analysis of alternative splicing using EST tissue information   Total citations: 2, self-citations: 0, citations by others: 2
Expressed sequence tags (ESTs) from normal and tumor tissues have been deposited in public databases. These ESTs and all mRNA sequences were aligned with the human genome sequence using LEADS, Compugen's alternative splicing modeling platform. We developed a novel computational approach to analyze tissue information of aligned ESTs in order to identify cancer-specific alternative splicing and gene segments highly expressed in particular cancers. Several genes, including one encoding a possible pre-mRNA splicing factor, displayed cancer-specific alternative splicing. In addition, multiple candidate gene segments highly expressed in colon cancers were identified.

3.
AsMamDB: an alternative splice database of mammals   Total citations: 11, self-citations: 1, citations by others: 10
Ji H, Zhou Q, Wen F, Xia H, Lu X, Li Y. Nucleic Acids Research, 2001, 29(1): 260-263

4.
5.
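Current status and prospects of research on alternative splicing of eukaryotic genes   Total citations: 2, self-citations: 0, citations by others: 2
Alternative splicing of precursor mRNA (pre-mRNA) is an important mechanism for controlling gene expression and generating protein diversity, and is one of the research priorities of the functional genomics era. Bioinformatics plays an important role in identifying alternatively spliced genes and their structures and in analyzing the function and regulation of alternative splicing. Apart from time-consuming experimental studies, alternatively spliced genes and their structures are identified mainly by aligning transcript data such as ESTs and mRNAs with the genomic sequence to obtain the different structural forms of the same gene. Analyzing the protein products allows the function of alternative splicing to be predicted, while statistical analysis of potential regulatory elements provides the data needed to study the mechanisms that regulate alternative splicing. Spatiotemporal information in transcript data, together with comparative genomics, will provide important material for understanding the precise regulation of alternative splicing. In-depth study of alternative splicing and its regulatory mechanisms will provide an important bridge between the genome and the proteome.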

6.
Over 28,000 expressed sequence tags (ESTs) were produced from cDNA libraries representing a variety of growth conditions and cell types. Several Magnaporthe grisea strains were used to produce the libraries, including a nonpathogenic strain bearing a mutation in the PMK1 mitogen-activated protein kinase. Approximately 23,000 of the ESTs could be clustered into 3,050 contigs, leaving 5,127 singleton sequences. The estimate of 8,177 unique sequences indicates that over half of the genes of the fungus are represented in the ESTs. Analysis of EST frequency reveals growth and cell type-specific patterns of gene expression. This analysis establishes criteria for identification of fungal genes involved in pathogenesis. A large fraction of the genes represented by ESTs have no known function or described homologs. Manual annotation of the most abundant cDNAs with no known homologs allowed us to identify a family of metallothionein proteins present in M. grisea, Neurospora crassa, and Fusarium graminearum. In addition, multiply represented ESTs permitted the identification of alternatively spliced mRNA species. Alternative splicing was rare, and in most cases, the alternate mRNA forms were unspliced, although alternative 5' splice sites were also observed.

7.
8.
9.
ASAP: the Alternative Splicing Annotation Project   Total citations: 2, self-citations: 0, citations by others: 2
Recently, genomics analyses have demonstrated that alternative splicing is widespread in mammalian genomes (30-60% of genes reported to have multiple isoforms), and may be one of their most important mechanisms of functional regulation. However, by comparison with other genomics data such as genome annotation, SNPs, or gene expression, there exists relatively little database infrastructure for the study of alternative splicing. We have constructed an online database ASAP (the Alternative Splicing Annotation Project) for biologists to access and mine the enormous wealth of alternative splicing information coming from genomics and proteomics. ASAP is based on genome-wide analyses of alternative splicing in human (30 793 alternative splice relationships found) from detailed alignment of expressed sequences onto the genomic sequence. ASAP provides precise gene exon-intron structure, alternative splicing, tissue specificity of alternative splice forms, and protein isoform sequences resulting from alternative splicing. Moreover, it can help biologists design probe sequences for distinguishing specific mRNA isoforms. ASAP is intended to be a community resource for collaborative annotation of alternative splice forms, their regulation, and biological functions. The URL for ASAP is http://www.bioinformatics.ucla.edu/ASAP.

10.
The generation of large numbers of partial cDNA sequences, or expressed sequence tags (ESTs), has provided a method with which to sample a large number of genes from an organism. More than 25,000 Arabidopsis thaliana ESTs have been deposited in public databases, producing the largest collection of ESTs for any plant species. We describe here the application of a method of reducing redundancy and increasing information content in this collection by grouping overlapping ESTs representing the same gene into a "contig" or assembly. The increased information content of these assemblies allows more putative identifications to be assigned based on the results of similarity searches with nucleotide and protein databases. The results of this analysis indicate that sequence information is available for approximately 12,600 nonoverlapping ESTs from Arabidopsis. Comparison of the assemblies with 953 Arabidopsis coding sequences indicates that up to 57% of all Arabidopsis genes are represented by an EST. Clustering analysis of these sequences suggests that between 300 and 700 gene families are represented by between 700 and 2000 sequences in the EST database. A database of the assembled sequences, their putative identifications, and cellular roles is available through the World Wide Web.

11.
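Identification of nine novel alternatively spliced mRNA variants of the RHD gene   Total citations: 1, self-citations: 0, citations by others: 1
Xu Xianguo, Wu Junjie, Hong Xiaozhen, Zhu Faming, Yan Lixing. 《遗传》 (Yi Chuan), 2006, 28(10): 1213-1218
To study the gene structure of the various alternatively spliced mRNA variants of the RHD gene, RHD mRNA from normal human umbilical cord blood samples was detected by reverse transcription polymerase chain reaction (RT-PCR); the RHD cDNA was TA-cloned and sequenced, the splice sites of each variant were analyzed by DNA sequencing, and the RHD mRNA was subjected to expressed sequence tag (EST) analysis. Among 28 positive clones, in addition to full-length RHD cDNA, 12 RHD splice variants (9 of them novel) were detected. Three forms of splicing were found, namely exon skipping and alternative 5′ and 3′ splice sites, involving exons 2 to 9; six of the novel variants also showed homologous hybridization between the RHD and RHCE genes. EST analysis additionally retrieved intron-retention variants. The study shows that RHD mRNA undergoes complex alternative splicing: besides previously reported variants, nine novel RHD splice variants were detected, and alternative splicing was found to coexist with homologous hybridization.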

12.
The genome of Arabidopsis has been searched for sequences of genes involved in acyl lipid metabolism. Over 600 encoded proteins have been identified, cataloged, and classified according to predicted function, subcellular location, and alternative splicing. At least one-third of these proteins were previously annotated as "unknown function" or with functions unrelated to acyl lipid metabolism; therefore, this study has improved the annotation of over 200 genes. In particular, annotation of the lipolytic enzyme group (at least 110 members total) has been improved by the critical examination of the biochemical literature and the sequences of the numerous proteins annotated as "lipases." In addition, expressed sequence tag (EST) data have been surveyed, and more than 3,700 ESTs associated with the genes were cataloged. Statistical analysis of the number of ESTs associated with specific cDNA libraries has allowed calculation of probabilities of differential expression between different organs. More than 130 genes have been identified with a statistical probability > 0.95 of preferential expression in seed, leaf, root, or flower. All the data are available as a Web-based database, the Arabidopsis Lipid Gene database (http://www.plantbiology.msu.edu/lipids/genesurvey/index.htm). The combination of the data of the Lipid Gene Catalog and the EST analysis can be used to gain insights into differential expression of gene family members and sets of pathway-specific genes, which in turn will guide studies to understand specific functions of individual genes.

13.
The domestic pig (Sus scrofa domestica) is one of the most important mammals to humans. Alternative splicing is a cellular mechanism in eukaryotes that greatly increases the diversity of gene products. Expressed sequence tags (ESTs) have been widely used for gene discovery, expression profile analysis, and alternative splicing detection. In this study, a total of 712,905 ESTs extracted from 101 different nonnormalized EST libraries of the domestic pig were analyzed. These EST libraries cover the nervous system, digestive system, immune system, and meat production related tissues from embryo, newborn, and adult pigs, contributing to the analysis of alternative splicing variants as well as expression profiles of tissues at various stages. A modified approach was designed to cluster and assemble large EST datasets, aiming to detect alternative splicing together with the EST abundance of each splicing variant. Considerable effort was made to classify alternative splicing into different types and to apply different filters to each type to obtain more reliable results. Finally, a total of 1,223 genes with an average of 2.8 splicing variants were detected among 16,540 unique genes. The overview of expression profiles changes when alternative splicing is taken into account.

14.
The Intronerator (http://www.cse.ucsc.edu/~kent/intronerator/) is a set of web-based tools for exploring RNA splicing and gene structure in Caenorhabditis elegans. It includes a display of cDNA alignments with the genomic sequence, a catalog of alternatively spliced genes and a database of introns. The cDNA alignments include >100 000 ESTs and almost 1000 full-length cDNAs. ESTs from embryos and mixed stage animals as well as full-length cDNAs can be compared in the alignment display with each other and with predicted genes. The alt-splicing catalog includes 844 open reading frames for which there is evidence of alternative splicing of pre-mRNA. The intron database includes 28 478 introns, and can be searched for patterns near the splice junctions.

15.
We consider the problem of predicting alternative splicing patterns from a set of expressed sequences (cDNAs and ESTs). Some of these expressed sequences may be erroneous, thus forming incorrect exons/introns. These incorrect exons/introns may cause many false positives. For example, we examined a popular alternative splicing database, ECgene, which predicts alternative splicing patterns from expressed sequences. The result shows that about 81.3%-81.6% (sensitivity) of known patterns are found, but the specificity can be as low as 5.9%. Based on the idea that erroneous sequences are usually not consistent with other sequences, in this paper we provide an alternative approach for finding alternative splicing patterns which ensures that individual exons/introns of the reported patterns have enough support from the expressed sequences. On the same dataset, our approach can achieve a much higher specificity and a slight increase in sensitivity (38.9% and 84.9%, respectively). Our approach also gives better results compared with popular alternative splicing databases (ASD, ECgene, SpliceNest) and the software ClusterMerge.
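The support idea in this abstract can be sketched as follows. This is a simplified illustration and not the authors' algorithm: the function name, the intron-tuple representation, and the threshold value are assumptions. A splicing pattern is reported only if every one of its introns is observed in at least `min_support` expressed sequences.

```python
from collections import Counter


def filter_patterns(patterns, min_support=2):
    """Keep patterns whose introns are all sufficiently supported.

    patterns: one list of intron (start, end) tuples per expressed sequence
    (hypothetical input format).
    """
    # Count each intron once per sequence in which it appears.
    counts = Counter(intron for p in patterns for intron in set(p))
    kept = {tuple(p) for p in patterns
            if all(counts[intron] >= min_support for intron in p)}
    return sorted(kept)


# Toy data: the last two sequences each contain an intron seen only once.
observed = [
    [(200, 300), (400, 500)],
    [(200, 300), (400, 500)],
    [(200, 500)],
    [(200, 300), (410, 500)],
]
print(filter_patterns(observed, min_support=2))
```

With this toy input only the pattern seen in the first two sequences survives; the patterns that depend on a singly observed intron are treated as possibly erroneous and dropped.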

16.
Expressed sequence tags (ESTs) from the marine red alga Gracilaria gracilis   Total citations: 2, self-citations: 0, citations by others: 2
Expressed sequence tags (ESTs) are partial sequences of cDNAs, and can be used to characterize gene expression in organisms or tissues. We have constructed a 200-sequence EST database from vegetative thalli of Gracilaria gracilis, the first ESTs reported from any alga. This database contains recognizable ESTs corresponding to genes of carbohydrate metabolism (seven), amino acid metabolism (three), photosynthesis (five), nucleic acid synthesis, repair and processing (three), protein synthesis (14), protein degradation (six), cellular maintenance and stress response (three), other identifiable protein-coding genes (13) and 146 sequences for which significant matches were not found in existing sequence databases. We have already used this EST database to recover genes of carbohydrate biosynthesis from G. gracilis. This revised version was published online in August 2006 with corrections to the Cover Date.

17.
Expressed sequence tags (ESTs) currently encompass more entries in the public databases than any other form of sequence data. Thus, EST data sets provide a vast resource for gene identification and expression profiling. We have mapped the complete set of 176,915 publicly available Arabidopsis EST sequences onto the Arabidopsis genome using GeneSeqer, a spliced alignment program incorporating sequence similarity and splice site scoring. About 96% of the available ESTs could be properly aligned with a genomic locus, with the remaining ESTs deriving from organelle genomes and non-Arabidopsis sources or displaying insufficient sequence quality for alignment. The mapping provides verified sets of EST clusters for evaluation of EST clustering programs. Analysis of the spliced alignments suggests corrections to current gene structure annotation and provides examples of alternative and non-canonical pre-mRNA splicing. All results of this study were parsed into a database and are accessible via a flexible Web interface at http://www.plantgdb.org/AtGDB/.

18.
19.
In an effort to determine genes that are expressed in mycelial cultures of Neurospora crassa over the course of the circadian day, we have sequenced 13,000 cDNA clones from two time-of-day-specific libraries (morning and evening library) generating approximately 20,000 sequences. Contig analysis allowed the identification of 445 unique expressed sequence tags (ESTs) and 986 ESTs present in multiple cDNA clones. For approximately 50% of the sequences (710 of 1431), significant matches to sequences in the National Center for Biotechnology Information database (of known or unknown function) were detected. About 50% of the ESTs (721 of 1431) showed no similarity to previously identified genes. We hybridized Northern blots with probes derived from 26 clones chosen from contigs identified by multiple cDNA clones and EST sequences. Using these sequences, the representation of genes among the morning and evening sequences, respectively, in most cases does not reflect their expression patterns over the course of the day. Nevertheless, we were able to identify four new clock-controlled genes. On the basis of these data we predict that a significant proportion of the expressed Neurospora genes may be regulated by the circadian clock. The mRNA levels of all four genes peak in the subjective morning as is the case with previously identified ccgs.

20.