Similar literature
20 similar documents found.
1.
In this study, we have identified a novel mechanism of mutation involving translocation between the HPRT1 locus and other loci on the X chromosome. In cDNA of HRT‐25, obtained from a patient with Lesch‐Nyhan syndrome, the region upstream of exon 3 was amplified, but the full‐length region was not. 3′ rapid amplification of cDNA ends polymerase chain reaction (3′RACE‐PCR) on HRT‐25 revealed part of intron 3 and an unknown sequence, not identified as part of the HPRT1 gene, starting at the 3′ end of exon 3. We analyzed HPRT1 genomic DNA to confirm that the mutation with the unknown sequence is present in the genomic DNA. BLAST analysis of the unknown sequence against the human genome (NCBI; http://www.ncbi.nlm.nih.gov/BLAST/) showed that it is located at least 0.5 to 0.6 Mb telomeric to HPRT1 on chromosome Xq, near LOC340581. This study provides the molecular basis for the involvement of genomic instability in germ cells.

2.
BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences   Cited by: 49 (self-citations: 0; citations by others: 49)
'BLAST 2 Sequences', a new BLAST-based tool for aligning two protein or nucleotide sequences, is described. While the standard BLAST program is widely used to search for homologous sequences in nucleotide and protein databases, one often needs to compare only two sequences that are already known to be homologous, coming from related species or, e.g., different isolates of the same virus. In such cases searching the entire database would be unnecessarily time-consuming. 'BLAST 2 Sequences' utilizes the BLAST algorithm for pairwise DNA-DNA or protein-protein sequence comparison. A World Wide Web version of the program can be used interactively at the NCBI WWW site (http://www.ncbi.nlm.nih.gov/gorf/bl2.html). The resulting alignments are presented in both graphical and text form. The variants of the program for PC (Windows), Mac and several UNIX-based platforms can be downloaded from the NCBI FTP site (ftp://ncbi.nlm.nih.gov).
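To make the idea of comparing just two sequences concrete, here is a minimal sketch of pairwise local alignment scoring. It is not the BLAST heuristic that 'BLAST 2 Sequences' uses; it is the exhaustive Smith-Waterman dynamic program, swapped in because it is short and exact. The function name and the scoring values (match, mismatch, gap) are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Assumed scoring: linear gap penalty and a simple match/mismatch
    scheme (real tools use substitution matrices and affine gaps).
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, prev[j - 1] + pair, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Calling `smith_waterman_score("TTACGTTT", "ACGT")` scores the shared `ACGT` stretch and ignores the flanking bases, which is the behaviour one wants when two sequences are already known to be related.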

3.
Motivation: The key to MS-based proteomics is peptide sequencing. The major challenge in peptide sequencing, whether library search or de novo, is to better infer statistical significance and better attain noise reduction. Since the noise in a spectrum depends on experimental conditions, the instrument used and many other factors, it cannot be predicted even if the peptide sequence is known. The characteristics of the noise can only be uncovered once a spectrum is given. We wish to overcome such issues. Results: We designed RAId to identify peptides from their associated tandem mass spectrometry data. RAId performs a novel de novo sequencing followed by a search in a peptide library that we created. Through de novo sequencing, we establish the spectrum-specific background score statistics for the library search. When the database search fails to return significant hits, the top-ranking de novo sequences become potential candidates for new peptides that are not yet in the database. The use of spectrum-specific background statistics seems to enable RAId to perform well even when the spectral quality is marginal. Other important features of RAId include its potential in de novo sequencing alone and the ease of incorporating post-translational modifications. Availability: Programs implementing the methods described are available from the authors on request. Contact: yyu@ncbi.nlm.nih.gov Supplementary information: ftp://ftp.ncbi.nih.gov/pub/yyu/Proteomics/MSMS/RAId/MSMS_bioinfo_supp.pdf

4.
A structure-based method for protein sequence alignment   Cited by: 1 (self-citations: 0; citations by others: 1)
MOTIVATION: With the continuing rapid growth of protein sequence data, protein sequence comparison methods have become the most widely used tools of bioinformatics. Among these methods are those that use position-specific scoring matrices (PSSMs) to describe protein families. PSSMs can capture information about conserved patterns within families, which can be used to increase the sensitivity of searches for related sequences. Certain types of structural information, however, are not generally captured by PSSM search methods. Here we introduce a program, Structure-based ALignment TOol (SALTO), that aligns protein query sequences to PSSMs using rules for placing and scoring gaps that are consistent with the conserved regions of domain alignments from NCBI's Conserved Domain Database. RESULTS: In most cases, the alignment scores obtained using the local alignment version follow an extreme value distribution. SALTO's performance in finding related sequences and producing accurate alignments is similar to or better than that of IMPALA; one advantage of SALTO is that it imposes an explicit gapping model on each protein family. AVAILABILITY: A stand-alone version of the program that can generate global or local alignments is available by ftp distribution (ftp://ftp.ncbi.nih.gov/pub/SALTO/), and has been incorporated into the Cn3D structure/alignment viewer. CONTACT: bryant@ncbi.nlm.nih.gov.

5.

Background

BLAST is a commonly used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its PSSMs from scratch.

Results

We describe a new method, called domain enhanced lookup time accelerated BLAST (DELTA-BLAST), which searches a database of pre-constructed PSSMs before searching a protein-sequence database, to yield better homology detection. For its PSSMs, DELTA-BLAST employs a subset of NCBI's Conserved Domain Database (CDD). On a test set derived from ASTRAL, with one round of searching, DELTA-BLAST achieves a ROC5000 of 0.270 vs. 0.116 for CS-BLAST. The performance advantage diminishes in iterated searches, but DELTA-BLAST continues to achieve better ROC scores than CS-BLAST.

Conclusions

DELTA-BLAST is a useful program for the detection of remote protein homologs. It is available under the "Protein BLAST" link at http://blast.ncbi.nlm.nih.gov.

Reviewers

This article was reviewed by Arcady Mushegian, Nick V. Grishin, and Frank Eisenhaber.
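The abstract above relies on position-specific score matrices without showing one. The following is a minimal sketch of how a PSSM can be derived from a gapless set of aligned sequences and used to score a query. It assumes a uniform background frequency and a flat pseudocount, which is a deliberate simplification: PSI-BLAST and DELTA-BLAST use sequence weighting and data-dependent pseudocounts, and the function names here are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(aligned, alphabet=AMINO_ACIDS, pseudocount=1.0):
    """Build a log-odds PSSM (one dict per column) from equal-length sequences.

    Assumptions: uniform background, flat pseudocount, no sequence weighting.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(alphabet)
    pssm = []
    for col in range(length):
        # Count residues in this column, ignoring gaps or unknown symbols.
        counts = Counter(seq[col] for seq in aligned if seq[col] in alphabet)
        total = sum(counts.values()) + pseudocount * len(alphabet)
        pssm.append({
            res: math.log2(((counts[res] + pseudocount) / total) / background)
            for res in alphabet
        })
    return pssm

def score(pssm, query):
    """Sum the column scores for a query of the same length as the PSSM."""
    return sum(column[res] for column, res in zip(pssm, query))
```

With `pssm = build_pssm(["ACD", "ACE", "ACD"])`, a query such as `"ACD"` scores higher than `"WWW"`, because residues seen in a column receive positive log-odds and unseen ones receive slightly negative values.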

6.
7.
The Conserved Domain Database (CDD) is now indexed as a separate database within the Entrez system and linked to other Entrez databases such as MEDLINE(R). This allows users to search for domain types by name, for example, or to view the domain architecture of any protein in Entrez's sequence database. CDD can be accessed on the World Wide Web at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd. Users may also employ the CD-Search service to identify conserved domains in new sequences, at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. CD-Search results, and pre-computed links from Entrez's protein database, are calculated using the RPS-BLAST algorithm and Position Specific Score Matrices (PSSMs) derived from CDD alignments. CD-Searches are also run by default for protein-protein queries submitted to BLAST(R) at http://www.ncbi.nlm.nih.gov/BLAST. CDD mirrors the publicly available domain alignment collections SMART and PFAM, and now also contains alignment models curated at NCBI. Structure information is used to identify the core substructure likely to be present in all family members, and to produce sequence alignments consistent with structure conservation. This alignment model allows NCBI curators to annotate 'columns' corresponding to functional sites conserved among family members.

8.
Histone Sequence Database: new histone fold family members.   Cited by: 2 (self-citations: 0; citations by others: 2)
Searches of the major public protein databases with core and linker chicken and human histone sequences have resulted in the compilation of an annotated set of histone protein sequences. In addition, new database searches with two distinct motif search algorithms have identified several members of the histone fold family, including human DRAP1 and yeast CSE4. Database resources include information on conflicts between similar sequence entries in different source databases, multiple sequence alignments, links to the Entrez integrated information retrieval system, structures for histone and histone fold proteins, and the ability to visualize structural data through Cn3D. The database currently contains >1000 protein sequences, which are searchable by protein type, accession number, organism name, or any other free text appearing in the definition line of the entry. All sequences and alignments in this database are available through the World Wide Web at http://www.nhgri.nih.gov/DIR/GTB/HISTONES or http://www.ncbi.nlm.nih.gov/Baxevani/HISTONES

9.
dbSNP: a database of single nucleotide polymorphisms   Cited by: 12 (self-citations: 0; citations by others: 12)
In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Center for Biotechnology Information (NCBI) has established the dbSNP database. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at website: http://www.ncbi.nlm.nih.gov/SNP. Submitted SNPs can also be downloaded via anonymous FTP at ftp://ncbi.nlm.nih.gov/snp/

10.
By searching the current protein sequence databases using sequences from human and chicken histones H1/H5, H2A, H2B, H3 and H4, a database of aligned histone protein sequences with statistically significant sequence similarity to the search sequence was constructed. In addition, a nucleotide sequence database of the corresponding coding regions for these proteins has been assembled. The region of each of the core histones containing the histone fold motif is identified in the protein alignments. The database contains >1300 protein and nucleotide sequences. All sequences and alignments in this database are available through the World Wide Web at http://www.ncbi.nlm.nih.gov/Baxevani/HISTONES.

11.
The Histone Sequence Database is an annotated and searchable collection of all available histone and histone fold sequences and structures. Particular emphasis has been placed on documenting conflicts between similar sequence entries from a number of source databases, conflicts that are not necessarily documented in the source databases themselves. New additions to the database include compilations of post-translational modifications for each of the core and linker histones, as well as genomic information in the form of map loci for the human histone gene complement, with the genetic loci linked to Online Mendelian Inheritance in Man (OMIM). The database is freely accessible through the World Wide Web at either http://genome.nhgri.nih.gov/histones/ or http://www.ncbi.nlm.nih.gov/Baxevani/HISTONES

12.
Ng KW, Lawson J, Garner HR. BioTechniques, 2004, 37(2): 218, 220-218, 222
PathoGene is a web-based resource that streamlines the process of predicting genes in microorganisms and designs PCR primers for amplification to facilitate sequence analysis and experimentation. PathoGene currently supports primer design for every complete microbial, viral, and fungal genome as cataloged in GenBank by the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/). The resulting primers can then be subjected to a stand-alone Basic Local Alignment Search Tool (BLAST) system called PathoBLAST in which the predicted PCR product and/or primers can be compared against the genome of interest or a similar genome to find related genes or estimate primer quality.

13.
Three-dimensional structures are now known within most protein families and it is likely, when searching a sequence database, that one will identify a homolog of known structure. The goal of Entrez's 3D-structure database is to make structure information and the functional annotation it can provide easily accessible to molecular biologists. To this end, Entrez's search engine provides several powerful features: (i) links between databases, for example between a protein's sequence and structure; (ii) pre-computed sequence and structure neighbors; and (iii) structure and sequence/structure alignment visualization. Here, we focus on a new feature of Entrez's Molecular Modeling Database (MMDB): graphical summaries of the biological annotation available for each 3D structure, based on the results of automated comparative analysis. MMDB is available at: http://www.ncbi.nlm.nih.gov/Entrez/structure.html.

14.
ABSTRACT: BACKGROUND: Local alignment programs often calculate the probability that a match occurred by chance. The calculation of this probability may require a "finite-size" correction to the lengths of the sequences, as an alignment that starts near the end of either sequence may run out of sequence before achieving a significant score. FINDINGS: We present an improved finite-size correction that considers the distribution of sequence lengths rather than simply the corresponding means. This approach improves sensitivity and avoids substituting an ad hoc length for short sequences that can underestimate the significance of a match. We use a test set derived from ASTRAL to show improved ROC scores, especially for shorter sequences. CONCLUSIONS: The new finite-size correction improves the calculation of probabilities for a local alignment. It is now used in the BLAST+ package and at the NCBI BLAST web site (http://blast.ncbi.nlm.nih.gov).
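For orientation, the kind of calculation being corrected can be written with the standard Karlin-Altschul statistics. This is a sketch of the older, mean-based length adjustment, not the improved correction the abstract describes. The symbols are assumptions introduced here: m and n are the query and database lengths, S is the raw alignment score, K and λ are parameters of the scoring system, H is its relative entropy, and ℓ is the expected length adjustment.

```latex
E = K \, m' \, n' \, e^{-\lambda S},
\qquad m' = m - \ell,
\quad n' = n - \ell,
\qquad \ell \approx \frac{\ln(K m n)}{H}
```

Shortening both lengths by ℓ lowers E for a given score, which reflects that an alignment starting within ℓ residues of a sequence end cannot reach that score.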

15.
Studies into the genetic origins of tumor cell chemoactivity pose significant challenges to bioinformatic mining efforts. Connections between measures of gene expression and chemoactivity have the potential to identify clinical biomarkers of compound response, cellular pathways important to efficacy and potential toxicities; all vital to anticancer drug development. An investigation has been conducted that jointly explores tumor-cell constitutive NCI60 gene expression profiles and small-molecule NCI60 growth inhibition chemoactivity profiles, viewed from novel applications of self-organizing maps (SOMs) and pathway-centric analyses of gene expressions, to identify subsets of over- and under-expressed pathway genes that discriminate chemo-sensitive and chemo-insensitive tumor cell types. Linear Discriminant Analysis (LDA) is used to quantify the accuracy of discriminating genes to predict tumor cell chemoactivity. LDA results find 15% higher prediction accuracies, using ~30% fewer genes, for pathway-derived discriminating genes when compared to genes derived using conventional gene expression-chemoactivity correlations. The proposed pathway-centric data mining procedure was used to derive discriminating genes for ten well-known compounds. Discriminating genes were further evaluated using gene set enrichment analysis (GSEA) to reveal a cellular genetic landscape, comprised of small numbers of key over- and under-expressed on- and off-target pathway genes, as important for a compound’s tumor cell chemoactivity. Literature-based validations are provided as support for chemo-important pathways derived from this procedure. Qualitatively similar results are found when using gene expression measurements derived from different microarray platforms. The data used in this analysis are available at http://pubchem.ncbi.nlm.nih.gov/ and http://www.ncbi.nlm.nih.gov/projects/geo (GPL96, GSE32474).

16.
dbSNP: the NCBI database of genetic variation   Cited by: 1 (self-citations: 0; citations by others: 1)
In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Center for Biotechnology Information (NCBI) has established the dbSNP database [S.T.Sherry, M.Ward and K. Sirotkin (1999) Genome Res., 9, 677-679]. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at website: http://www.ncbi.nlm.nih.gov/SNP. The complete contents of dbSNP can also be downloaded in multiple formats via anonymous FTP at ftp://ncbi.nlm.nih.gov/snp/.

17.
The majority of human telomere length studies have focused on the overall length of telomeres within a cell. In fact, very few studies have examined telomere length for individual chromosome arms. The objective of this study was to examine the relationship between chromosome arm size and the relative length of the associated telomere. Quantitative Fluorescence In Situ Hybridization (Q-FISH) was used to measure the relative telomere length of each chromosome arm in metaphases from cultured lymphocytes of 17 individuals. A statistically significant positive correlation (r = 0.6) was found between telomere length and the size of the associated chromosome arm, which was estimated based on megabase pair measurements from http://www.ncbi.nlm.nih.gov/projects/mapview/.

18.
In this study, we have identified a novel mechanism of mutation involving translocation between the HPRT1 locus and other loci on the X chromosome. In cDNA of HRT-25, obtained from a patient with Lesch-Nyhan syndrome, the region upstream of exon 3 was amplified, but the full-length region was not. 3' rapid amplification of cDNA ends polymerase chain reaction (3'RACE-PCR) on HRT-25 revealed part of intron 3 and an unknown sequence, not identified as part of the HPRT1 gene, starting at the 3' end of exon 3. We analyzed HPRT1 genomic DNA to confirm that the mutation with the unknown sequence is present in the genomic DNA. BLAST analysis of the unknown sequence against the human genome (NCBI; http://www.ncbi.nlm.nih.gov/BLAST/) showed that it is located at least 0.5 to 0.6 Mb telomeric to HPRT1 on chromosome Xq, near LOC340581. This study provides the molecular basis for the involvement of genomic instability in germ cells.

19.
NCBI's LocusLink and RefSeq   Cited by: 1 (self-citations: 0; citations by others: 1)

20.
Accurate anchoring alignment of divergent sequences   Cited by: 1 (self-citations: 0; citations by others: 1)
