Similar articles
 20 similar articles found
1.
SUMMARY: The Kinase Sequence Database (KSD), located at http://kinase.ucsf.edu/ksd, contains information on 290 protein kinase families derived by profile-based clustering of the non-redundant list of sequences obtained from a GenBank-wide search. The database includes a total of 5,041 protein kinases from over 100 organisms. Clustering into families is based on the extent of homology within the kinase catalytic domain (250-300 residues in length). Alignments of the families are viewed in interactive Excel-based sequence spreadsheets. In addition, KSD features evolutionary trees derived for each family and detailed information on each sequence, as well as links to the corresponding GenBank entries. Sequence manipulation tools, such as evolutionary tree generation, novel sequence assignment, and statistical analysis, are also provided. AVAILABILITY: The Kinase Sequence Database is a web-based service accessible at http://kinase.ucsf.edu/ksd CONTACT: buzko@cmp.ucsf.edu; shokat@cmp.ucsf.edu

2.
3.
MOTIVATION: Using bioinformatic approaches we aimed to characterize poorly understood abnormalities in splicing known as exon scrambling, exon repetition and trans-splicing. RESULTS: We developed a software package that allows large-scale comparison of all human expressed sequence tag (EST) sequences to the entire set of human gene sequences. Among 5,992,495 EST sequences, 401 cases of exon repetition and 416 cases of exon scrambling were found. The vast majority of identified ESTs contain fragments rather than full-length repeated or scrambled exons. Their structures suggest that the scrambled or repeated exon fragments may have arisen in the process of cDNA cloning and not from splicing abnormalities. Nevertheless, we found 11 cases of full-length exon repetition, showing that this phenomenon is real yet very rare. In searching for examples of trans-splicing, we looked only at reproducible events where at least two independent ESTs represent the same putative trans-splicing event. We found 15 ESTs representing five types of putative trans-splicing. However, all 15 cases were derived from human malignant tissues and could have resulted from genomic rearrangements. Our results provide support for a very rare but physiological occurrence of exon repetition, but suggest that apparent exon scrambling and trans-splicing result, respectively, from in vitro artifact and gene-level abnormalities. AVAILABILITY: Exon-Intron Database (EID) is available at http://www.meduohio.edu/bioinfo/eid. Programs are available at http://www.meduohio.edu/bioinfo/software.html. The Laboratory website is available at http://www.meduohio.edu/medicine/fedorov SUPPLEMENTARY INFORMATION: Supplementary file is available at http://www.meduohio.edu/bioinfo/software.html.
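The distinction the abstract above draws between canonical exon order, exon repetition and exon scrambling can be illustrated with a small Python sketch. It is not the authors' software package. It assumes, hypothetically, that an EST has already been aligned to a gene and reduced to the list of exon numbers it covers, in 5' to 3' order along the EST; the function name `classify_exon_order` and the labels it returns are invented for illustration.

```python
def classify_exon_order(exon_indices):
    """Classify the exon order seen in one EST-to-gene alignment.

    exon_indices: exon numbers (genomic order) in the order they
    appear along the EST, e.g. [1, 2, 3]. Hypothetical input format.
    """
    pairs = list(zip(exon_indices, exon_indices[1:]))
    # Scrambling: an exon is followed by one that lies upstream of it.
    if any(b < a for a, b in pairs):
        return "scrambling"
    # Repetition: the same exon appears twice in a row.
    if any(a == b for a, b in pairs):
        return "repetition"
    return "canonical"


if __name__ == "__main__":
    for order in ([1, 2, 3], [1, 2, 2, 3], [1, 3, 2, 4]):
        print(order, classify_exon_order(order))
```

A real analysis would also have to separate full-length exons from fragments, which the abstract identifies as the main source of cloning artifacts; this sketch ignores that step.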

4.
PEDB: the Prostate Expression Database. (Total citations: 6; self-citations: 1; citations by others: 5)
The Prostate Expression Database (PEDB) is a curated relational database and suite of analysis tools designed for the study of prostate gene expression in normal and disease states. Expressed Sequence Tags (ESTs) and full-length cDNA sequences derived from more than 40 human prostate cDNA libraries are maintained and represent a wide spectrum of normal and pathological conditions. Detailed library information including tissue source, library construction methods, sequence diversity and abundance are available in a library archive. Prostate ESTs are assembled into distinct species groups using the multiple alignment program CAP2 and are annotated with information from the GenBank, dbEST and Unigene public sequence databases. Annotated sequences in PEDB are searched using the BLAST algorithm. The differential expression of each EST species can be viewed across all libraries using a Virtual Expression Analysis Tool (VEAT), a graphical user interface written in Java for intra- and inter-library species comparisons. PEDB may be accessed via the World Wide Web at http://www.mbt.washington.edu/PEDB/

5.
The Ribosomal Database Project (RDP). (Total citations: 24; self-citations: 2; citations by others: 22)
The Ribosomal Database Project (RDP) is a curated database that offers ribosome-related data, analysis services and associated computer programs. The offerings include phylogenetically ordered alignments of ribosomal RNA (rRNA) sequences, derived phylogenetic trees, rRNA secondary structure diagrams and various software for handling, analyzing and displaying alignments and trees. The data are available via anonymous ftp (rdp.life.uiuc.edu), electronic mail (server@rdp.life.uiuc.edu), gopher (rdpgopher.life.uiuc.edu) and World Wide Web (WWW) (http://rdpwww.life.uiuc.edu/). The electronic mail and WWW servers provide ribosomal probe checking, screening for possible chimeric rRNA sequences, automated alignment and approximate phylogenetic placement of user-submitted sequences on an existing phylogenetic tree.

6.
We describe a high-throughput cDNA sequencing pipeline (http://www.hgsc.bcm.tmc.edu/projects/cdna) built in response to the emerging need for rapid sequencing of large cDNA collections. Using this strategy cDNA inserts are purified and joined through concatenation into large molecules. These 'pseudo-BACs' are subjected to random shotgun sequencing whereby the majority of cDNA inserts in the pool are sequenced. Using this concatenation cDNA sequencing platform, we have contributed more than 13000 full-length cDNA sequences from human and mouse to the Mammalian Gene Collection (MGC).

7.
The RDP (Ribosomal Database Project). (Total citations: 53; self-citations: 1; citations by others: 53)
The Ribosomal Database Project (RDP) is a curated database that offers ribosome-related data, analysis services and associated computer programs. The offerings include phylogenetically ordered alignments of ribosomal RNA (rRNA) sequences, derived phylogenetic trees, rRNA secondary structure diagrams, and various software for handling, analyzing and displaying alignments and trees. The data are available via anonymous FTP (rdp.life.uiuc.edu), electronic mail (server@rdp.life.uiuc.edu), gopher (rdpgopher.life.uiuc.edu) and WWW (http://rdpwww.life.uiuc.edu/). The electronic mail and WWW servers provide ribosomal probe checking, approximate phylogenetic placement of user-submitted sequences, screening for possible chimeric rRNA sequences, automated alignment, and a suggested placement of an unknown sequence on an existing phylogenetic tree.

8.
MOTIVATION: While the mechanism for regulating alternative splicing is poorly understood, secondary structure has been shown to be integral to this process. Due to their propensity for forming complementary hairpin loops and their elevated mutation rates, tandem repeated sequences have the potential to influence splicing regulation. RESULTS: An analysis of human intronic sequences reveals a strong correlation between alternative splicing and the prevalence of mono- through hexanucleotide tandem repeats that may engage in complementary pairing in introns that flank alternatively spliced exons. While only 44% of the 18 173 genes in the Human Alternative Splicing Database are known to be alternatively spliced, they contain 84% of the 694 237 intronic complementary repeat pairs. Significantly, the normalized frequency and distribution of repeat sequences, independent of their potential for pairing, are indistinguishable between alternatively spliced and non-alternatively spliced genes. Thus, the increased prevalence of repeats with pairing potential in alternatively spliced genes is not merely a consequence of more repeats or repeat composition bias. These results suggest that complementary repeats may play a role in the regulation of alternative splicing. CONTACT: harold.garner@utsouthwestern.edu.
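To make "mono- through hexanucleotide tandem repeats" with "pairing potential" concrete, here is a minimal Python sketch. It is an illustration only, not the method used in the study: the minimum copy number (`min_copies=3`), the function names, and the rule that a repeat pairs with the reverse complement of its unit are all assumptions made for this example.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter unit (e.g. not 'TT')."""
    n = len(unit)
    return not any(n % d == 0 and unit == unit[:d] * (n // d) for d in range(1, n))


def find_tandem_repeats(seq, min_copies=3, max_unit=6):
    """Return sorted (start, unit, copies) tuples for tandem repeats in seq.

    Units of 1..max_unit bases are considered; min_copies is an assumed
    threshold, not a value taken from the abstract.
    """
    hits = []
    for k in range(1, max_unit + 1):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    intron = "GGCACACACATTTTTG"
    for start, unit, copies in find_tandem_repeats(intron):
        print(start, unit, copies, "pairs with repeats of", reverse_complement(unit))
```

Under these assumptions, a (CA)n repeat in one flanking intron and a (TG)n repeat in the other would count as a complementary pair.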

9.
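Actin is the main component of cytoskeletal microfilaments and plays important roles in muscle contraction, cytoskeleton formation and cell movement. Total RNA was extracted from whole third-instar larvae of two noctuid moths (Lepidoptera: Noctuidae), the cabbage moth Mamestra brassicae L. and the spotted cutworm Agrotis c-nigrum. Actin cDNA sequences from both species were amplified by RT-PCR and rapid amplification of cDNA ends (RACE). The M. brassicae actin cDNA is 1441 bases long and the A. c-nigrum actin cDNA is 1411 bases long. Both cDNAs contain an open reading frame of 1131 bases encoding a protein of 376 amino acids. The molecular weight of M. brassicae actin is approximately 41.8 kDa and that of A. c-nigrum actin is approximately 41.9 kDa. Prosite analysis showed that both amino acid sequences contain three actin signature motifs. GenBank searches and sequence alignments indicated that M. brassicae actin is a muscle-type actin, whereas A. c-nigrum actin is a cytoplasmic-type actin. Both cDNA sequences have been deposited in GenBank under accession numbers EU035314 (M. brassicae) and EU035315 (A. c-nigrum). RT-PCR detected actin mRNA expression at four developmental stages of A. c-nigrum (fourth-, fifth- and sixth-instar larvae and pupae) and in three tissues of sixth-instar larvae (gut, body wall and fat body).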

10.
SUMMARY: DNAFSMiner (DNA Functional Sites Miner) is a web-based software toolbox to recognize functional sites in nucleic acid sequences. The toolbox currently provides two tools: TIS Miner and Poly(A) Signal Miner. TIS Miner can be used to predict translation initiation sites in vertebrate DNA/mRNA/cDNA sequences, and Poly(A) Signal Miner can be used to predict polyadenylation [poly(A)] signals in human DNA sequences. The prediction results are better than those of methods reported in the literature on two benchmark applications. This good performance is mainly attributable to our unique learning method. DNAFSMiner is available free of charge for academic and non-profit organizations. AVAILABILITY: http://research.i2r.a-star.edu.sg/DNAFSMiner/ CONTACT: huiqing@i2r.a-star.edu.sg.

11.
Motivation: A large number of new DNA sequences with virtually unknown functions are generated as the Human Genome Project progresses. Therefore, it is essential to develop computer algorithms that can predict the functionality of DNA segments according to their primary sequences, including algorithms that can predict promoters. Although several promoter-predicting algorithms are available, they have high false-positive detections and the rate of promoter detection needs to be improved further. Results: In this research, PromFD, a computer program to recognize vertebrate RNA polymerase II promoters, has been developed. Both vertebrate promoters and non-promoter sequences are used in the analysis. The promoters are obtained from the Eukaryotic Promoter Database. Promoters are divided into a training set and a test set. Non-promoter sequences are obtained from the GenBank sequence databank, and are also divided into a training set and a test set. The first step is to search out, among all possible permutations, patterns of strings 5–10 bp long that are significantly over-represented in the promoter set. The program also searches IMD (Information Matrix Database) matrices that have a significantly higher presence in the promoter set. The results of the searches are stored in the PromFD database, and the program PromFD scores input DNA sequences according to their content of the database entries. PromFD predicts promoters: their locations and the location of potential TATA boxes, if found. The program can detect 71% of promoters in the training set with a false-positive rate of under 1 in every 13 000 bp, and 47% of promoters in the test set with a false-positive rate of under 1 in every 9800 bp. PromFD uses a new approach and its false-positive identification rate is better compared with other available promoter recognition algorithms. The source code for PromFD is in the C++ language. Availability: PromFD is available for Unix platforms by anonymous ftp to: beagle.colorado.edu, cd pub, get promFD.tar. A Java version of the program is also available for Netscape 2.0, by http://beagle.colorado.edu/chenq. Contact: E-mail: chenq@beagle.colorado.edu
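The first step described in the PromFD abstract, finding short strings that are over-represented in the promoter set, can be sketched in Python as follows. This is not PromFD's algorithm or code (the abstract does not give its statistic, and the program itself is written in C++). The sketch assumes a simple ratio of smoothed presence frequencies; the function names, the add-one smoothing and the `min_ratio` threshold are hypothetical choices for illustration.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """Count, for each k-mer, the number of sequences containing it at least once."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def overrepresented_kmers(promoters, background, k, min_ratio=2.0):
    """Return {kmer: ratio} for k-mers enriched in promoters vs background.

    The ratio compares smoothed presence frequencies; both the smoothing
    and the threshold are assumptions, not PromFD's published criteria.
    """
    in_prom = kmer_presence(promoters, k)
    in_back = kmer_presence(background, k)
    enriched = {}
    for kmer, n in in_prom.items():
        f_prom = (n + 1) / (len(promoters) + 2)
        f_back = (in_back[kmer] + 1) / (len(background) + 2)
        ratio = f_prom / f_back
        if ratio >= min_ratio:
            enriched[kmer] = ratio
    return enriched


if __name__ == "__main__":
    promoters = ["TATAAAG", "GTATAAA", "CTATAAA"]    # toy training set
    background = ["GCGCGCG", "CCGGCCG", "GGGCCCG"]   # toy non-promoter set
    for kmer, ratio in sorted(overrepresented_kmers(promoters, background, 5).items()):
        print(kmer, round(ratio, 2))
```

In a fuller version the same loop would run for every length from 5 to 10 bp, and the enriched patterns would be stored and used to score new sequences, as the abstract outlines.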

12.
The RDP (Ribosomal Database Project) continues (Total citations: 56; self-citations: 0; citations by others: 56)
The Ribosomal Database Project (RDP-II), previously described by Maidak et al., continued during the past year to add new rRNA sequences to the aligned data and to improve the analysis commands. Release 7.1 (September 17, 1999) included more than 10 700 small subunit rRNA sequences. More than 850 type strain sequences were identified and added to the prokaryotic alignment, bringing the total number of type sequences to 3324 representing 2460 different species. Availability of an RDP-II mirror site in Japan is also near completion. RDP-II provides aligned and annotated rRNA sequences, derived phylogenetic trees and taxonomic hierarchies, and analysis services through its WWW server (http://rdp.cme.msu.edu/). Analysis services include rRNA probe checking, approximate phylogenetic placement of user sequences, screening user sequences for possible chimeric rRNA sequences, automated alignment, production of similarity matrices and services to plan and analyze terminal restriction fragment length polymorphism (T-RFLP) experiments.

13.
The RESID Database is a comprehensive collection of annotations and structures for protein post-translational modifications including N-terminal, C-terminal and peptide chain cross-link modifications. The RESID Database includes systematic and frequently observed alternate names, Chemical Abstracts Service registry numbers, atomic formulas and weights, enzyme activities, taxonomic range, keywords, literature citations with database cross-references, structural diagrams and molecular models. The NRL-3D Sequence-Structure Database is derived from the three-dimensional structure of proteins deposited with the Research Collaboratory for Structural Bioinformatics Protein Data Bank. The NRL-3D Database includes standardized and frequently observed alternate names, sources, keywords, literature citations, experimental conditions and searchable sequences from model coordinates. These databases are freely accessible through the National Cancer Institute-Frederick Advanced Biomedical Computing Center at these web sites: http://www.ncifcrf.gov/RESID, http://www.ncifcrf.gov/NRL-3D; or at these National Biomedical Research Foundation Protein Information Resource web sites: http://pir.georgetown.edu/pirwww/dbinfo/resid.html, http://pir.georgetown.edu/pirwww/dbinfo/nrl3d.html

14.
SUMMARY: DNA polymorphism detector (DPD) is a new web application developed to help automate the process of cDNA clone validation. DPD identifies and highlights discrepancies between any cDNA clone sequence and its expected reference sequence. To determine if these differences correspond to natural genetic polymorphisms (versus artifacts introduced during clone production or evaluation), DPD uses the discrepancies, along with flanking sequences, to search GenBank for identical matching strings. If matching DNA sequences are found, DPD verifies that they are from the same gene. The application then reports the discrepancy as a polymorphism along with the corresponding GenBank reference information. AVAILABILITY: DPD is currently hosted by the Harvard Institute of Proteomics at http://www.hip.harvard.edu
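The workflow in the DPD abstract (find discrepancies, take each one with its flanking sequence, look for an identical string elsewhere) can be sketched in Python under strong simplifying assumptions. This is not DPD's implementation: the sketch assumes the clone and reference are already aligned with equal length and differ only by substitutions, and it replaces the GenBank search with an exact substring lookup in a small in-memory list. All function names and the `flank` parameter are hypothetical.

```python
def find_discrepancies(clone, reference, flank=5):
    """Return (position, ref_base, clone_base, context) for each mismatch.

    context is the clone sequence around the mismatch, up to `flank`
    bases on each side. Assumes pre-aligned, equal-length inputs.
    """
    if len(clone) != len(reference):
        raise ValueError("sketch handles equal-length, pre-aligned sequences only")
    found = []
    for i, (c, r) in enumerate(zip(clone, reference)):
        if c != r:
            context = clone[max(0, i - flank):i + flank + 1]
            found.append((i, r, c, context))
    return found


def is_known_variant(context, database_seqs):
    """Stand-in for a database search: exact substring match in any sequence."""
    return any(context in s for s in database_seqs)


if __name__ == "__main__":
    reference = "ACGTACGTACGT"
    clone = "ACGTACCTACGT"          # single substitution at index 6
    other_entries = ["GGTACCTACGG"]  # toy stand-in for database records
    for pos, ref_base, clone_base, ctx in find_discrepancies(clone, reference, flank=3):
        status = "polymorphism?" if is_known_variant(ctx, other_entries) else "possible artifact"
        print(pos, ref_base, clone_base, ctx, status)
```

The abstract's final check, that any matching sequence comes from the same gene, is not modelled here.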

15.
The RDP-II (Ribosomal Database Project) (Total citations: 23; self-citations: 0; citations by others: 23)
The Ribosomal Database Project (RDP-II), previously described by Maidak et al. [Nucleic Acids Res. (2000), 28, 173-174], continued during the past year to add new rRNA sequences to the aligned data and to improve the analysis commands. Release 8.0 (June 1, 2000) consisted of 16 277 aligned prokaryotic small subunit (SSU) rRNA sequences while the number of eukaryotic and mitochondrial SSU rRNA sequences in aligned form remained at 2055 and 1503, respectively. The number of prokaryotic SSU rRNA sequences more than doubled from the previous release 14 months earlier, and approximately 75% are longer than 899 bp. An RDP-II mirror site in Japan is now available (http://wdcm.nig.ac.jp/RDP/html/index.html). RDP-II provides aligned and annotated rRNA sequences, derived phylogenetic trees and taxonomic hierarchies, and analysis services through its WWW server (http://rdp.cme.msu.edu/). Analysis services include rRNA probe checking, approximate phylogenetic placement of user sequences, screening user sequences for possible chimeric rRNA sequences, automated alignment, production of similarity matrices and services to plan and analyze terminal restriction fragment polymorphism experiments. The RDP-II email address for questions and comments has been changed from curator@cme.msu.edu to rdpstaff@msu.edu.

16.
The Ribosomal Database Project. (Total citations: 79; self-citations: 0; citations by others: 79)
The Ribosomal Database Project (RDP) is a curated database that offers ribosome-related data, analysis services, and associated computer programs. The offerings include phylogenetically ordered alignments of ribosomal RNA (rRNA) sequences, derived phylogenetic trees, rRNA secondary structure diagrams, and various software for handling, analyzing and displaying alignments and trees. The data are available via anonymous ftp (rdp.life.uiuc.edu), electronic mail (server@rdp.life.uiuc.edu) and gopher (rdpgopher.life.uiuc.edu). The electronic mail server also provides ribosomal probe checking, approximate phylogenetic placement of user-submitted sequences, screening for chimeric nature of newly sequenced rRNAs, and automated alignment.

17.
18.
The Kabat Database was initially started in 1970 to determine the combining site of antibodies based on the available amino acid sequences. The precise delineation of complementarity determining regions (CDR) of both light and heavy chains provides the first example of how properly aligned sequences can be used to derive structural and functional information of biological macromolecules. This knowledge has subsequently been applied to the construction of artificial antibodies with prescribed specificities, and to many other studies. The Kabat database now includes nucleotide sequences, sequences of T cell receptors for antigens (TCR), major histocompatibility complex (MHC) class I and II molecules, and other proteins of immunological interest. While new sequences are continually added into this database, we have undertaken the task of developing more analytical methods to study the information content of this collection of aligned sequences. New examples of analysis will be illustrated on a yearly basis. The Kabat Database and its applications are freely available at http://immuno.bme.nwu.edu.

19.
RECODE 2003     
The RECODE database is a compilation of translational recoding events (programmed ribosomal frameshifting, codon redefinition and translational bypass). The database provides information about the genes utilizing these events for their expression, recoding sites, stimulatory sequences and other relevant information. The Database is freely available at http://recode.genetics.utah.edu/.

20.
The availability of genomic sequences of many organisms has opened new challenges, particularly in genome analysis. Sequence extraction is a vital step and many tools have been developed to address it. These tools are publicly available but have limitations with respect to sequence extraction, the length of the sequence that can be extracted and organism specificity, and they lack a user-friendly interface. We have developed a Java-based software package with three modules that can be used independently or sequentially. The tool extracts sequences from large datasets in a few simple steps. It can efficiently extract multiple sequences of any desired length from the genome of any organism. The results were cross-checked against published data.

Availability

URL 1: http://ww3.comsats.edu.pk/bio/ResearchProjects.aspx
URL 2: http://ww3.comsats.edu.pk/bio/SequenceManeuverer.aspx
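As a rough illustration of what extracting multiple sequences of any desired length from a genome involves, here is a short Python sketch. It is not the Java package described above and does not reproduce its three modules. It assumes the genome is already loaded as a single uppercase or lowercase DNA string and that regions are given as 1-based, inclusive coordinates with a strand; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract(genome, start, end, strand="+"):
    """Return genome[start..end] (1-based, inclusive).

    For strand "-", the reverse complement of that interval is returned.
    """
    if not (1 <= start <= end <= len(genome)):
        raise ValueError("coordinates out of range")
    seq = genome[start - 1:end]
    if strand == "-":
        return seq.translate(_COMPLEMENT)[::-1]
    return seq


def extract_many(genome, regions):
    """Extract several (start, end, strand) regions in one call."""
    return [extract(genome, start, end, strand) for start, end, strand in regions]


if __name__ == "__main__":
    genome = "ATGCCGTA"
    print(extract_many(genome, [(2, 5, "+"), (2, 5, "-"), (1, 8, "+")]))
```

For genomes too large to hold in memory, the same coordinate logic would need indexed file access, which this sketch leaves out.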
