Similar articles
Found 20 similar articles (search time: 0 ms)
1.
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.

2.
MOTIVATION: We consider the problem of finding similarities in protein structure databases. Current techniques sequentially compare the given query protein to all of the proteins in the database to find similarities. Therefore, the cost of similarity queries increases linearly as the volume of the protein databases increases. As the sizes of experimentally determined and theoretically estimated protein structure databases grow, there is a need for scalable searching techniques. RESULTS: Our techniques extract feature vectors on triplets of SSEs (Secondary Structure Elements). Later, these feature vectors are indexed using a multidimensional index structure. For a given query protein, this index structure is used to quickly prune away unpromising proteins in the database. The remaining proteins are then aligned using a popular alignment tool such as VAST. We also develop a novel statistical model to estimate the goodness of a match using the SSEs. Experimental results show that our techniques improve the pruning time of VAST 3 to 3.5 times while maintaining similar sensitivity.

3.
A computer program that allows interactive sequence comparison is described. It graphically displays a search matrix using residue physicochemical characteristics and multilength segmental comparisons. The user selects through a mousing device and screen pointer the sequence spans to be matched. The results of this method are compared with those of ALIGN and BESTFIT. Received on August 23, 1988; accepted on December 6, 1988

4.
G Valle 《Nucleic acids research》1993,21(22):5152-5156
DISCOVER1 (DIStribution COunter VERsion 1) is a new program that can identify DNA motifs occurring with a high deviation from the expected frequency. The program generates families of patterns, each family having a common set of defined bases. Undefined bases are inserted amongst the defined bases in different ways, thus generating the diverse patterns of each family. The occurrences of the different patterns are then compared and analysed within each family, assuming that all patterns should have the same probability of occurrence. An extensive use of computer memory, combined with the immediate sorting of counts by address calculation, allows a complete counting of all DNA motifs in a single pass over the DNA sequence. This approach offers a very fast way to search for unusually distributed patterns and can identify inexact patterns as well as exact patterns.
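The single-pass counting by address calculation described in this abstract can be illustrated with a minimal sketch. It assumes that "address calculation" means encoding each motif as an integer index into a preallocated count array; that reading, and the names `count_kmers` and `kmer_address`, are assumptions made for illustration and not DISCOVER1's actual code. The sketch handles only exact motifs of fixed length k, not the pattern families with undefined bases that the abstract describes.

```python
# Hypothetical sketch: count all exact k-mers in one pass, using the base-4
# encoding of each k-mer as its address in a flat count array.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_address(kmer):
    """Return the base-4 integer address of an exact k-mer."""
    address = 0
    for base in kmer.upper():
        address = address * 4 + BASE_CODE[base]
    return address


def count_kmers(sequence, k):
    """Return a list of 4**k counts indexed by k-mer address."""
    counts = [0] * (4 ** k)
    mask = 4 ** k - 1          # keeps only the last k bases (2 bits each)
    address = 0
    valid = 0                  # consecutive unambiguous bases seen so far
    for base in sequence.upper():
        code = BASE_CODE.get(base)
        if code is None:       # ambiguous base (e.g. N): restart the window
            address = 0
            valid = 0
            continue
        address = ((address << 2) | code) & mask
        valid += 1
        if valid >= k:
            counts[address] += 1
    return counts


if __name__ == "__main__":
    counts = count_kmers("ACGTACGT", 2)
    print(counts[kmer_address("AC")])
```

Because every motif maps directly to an array slot, no sorting or hashing step is needed after the pass; the cost is memory that grows as 4^k, which matches the abstract's remark about extensive use of computer memory.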

5.
MATCH-UP/MATRIX is a program designed to aid the investigator interested in determining primary protein structure. It is written in Applesoft BASIC for the Apple IIe microcomputer. MATCH-UP will survey any set of proteinaceous materials for amino acid sequence homology; however, it is primarily intended to compare the structures of newly sequenced peptides with the established structure of a protein with suspected homology. Any peptide-to-protein alignment which shows a homology greater than or equal to the percentage specified by the user will result in output. MATRIX will compare the sequences of two proteins (peptides) in whatever alignment specified by the user and is intended to spot insertions and/or deletions between structures. Received on December 2, 1985; accepted on March 10, 1986
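The peptide-to-protein survey described in this abstract can be sketched as an ungapped scan with a percentage threshold. This is a minimal illustration in Python, not the original Applesoft BASIC program: it assumes that "homology" is measured as percent identical residues over the peptide length, and the function name `match_up` and its parameters are hypothetical.

```python
# Hypothetical sketch: slide a peptide along a protein without gaps and report
# every placement whose percent identity meets a user-specified threshold.
def match_up(peptide, protein, min_percent):
    """Return a list of (offset, percent_identity) for qualifying placements."""
    n = len(peptide)
    if n == 0:
        return []
    hits = []
    for offset in range(len(protein) - n + 1):
        window = protein[offset:offset + n]
        identical = sum(1 for a, b in zip(peptide, window) if a == b)
        percent = 100.0 * identical / n
        if percent >= min_percent:
            hits.append((offset, percent))
    return hits


if __name__ == "__main__":
    for offset, percent in match_up("ACD", "XXACDXXAED", 60):
        print(offset, round(percent, 1))
```

An ungapped scan of this kind cannot detect insertions or deletions, which is consistent with the abstract assigning that task to the separate MATRIX component.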

6.
Uteroferrin, a purple-colored, iron-containing acid phosphatase, with many of the properties of a lysosomal hydrolase, transports iron from the mother to the conceptus in pregnant pigs. Uteroferrin, however, is but one member of what may be a broad class of iron-containing phosphatases with unusual spectral properties which result from a novel type of di-iron active site. The biological function of uteroferrin is unknown. We argue here that the in vivo function of uteroferrin, despite its undoubted ability to act as a potent acid phosphatase, is that of a transplacental iron transporter.

7.
We describe a computer program (Metal Search) that helps design tetrahedrally coordinated metal binding sites in proteins of known structure. The program takes as input the backbone coordinates of a protein and outputs lists of four residues that might form tetrahedral sites if wild-type amino acids were replaced by cysteine or histidine. The program also outputs the side chain dihedral angles of the amino acids and the coordinates of the predicted metal ion. The only function evaluated by Metal Search is the ability of side chains to meet simple geometric criteria for formation of a tetrahedral site, but these criteria are sufficient to produce a manageably small list that can then be evaluated by other means. The program has been used in the introduction of zinc binding sites in the designed four-helix bundle protein α4 and in the B1 domain of streptococcal protein G, and in both cases the tetrahedral coordination of a bound metal ion has been confirmed (Klemba, M., Gardner, K. H., Marino, S., Clarke, N. D., and Regan, L., Nature: Structural Biology 2:368–373, 1995). © 1995 Wiley-Liss, Inc.

8.
A computer program to search for tRNA genes.
This paper describes a computer program that can find tRNA genes within long DNA sequences. The program obviates the need to map the tRNA genes.

9.
5-Aminoimidazole-4-carboxamide-1-beta-D-ribofuranoside (AICA riboside) has been extensively used in vitro and in vivo to activate the AMP-activated protein kinase (AMPK), a metabolic sensor involved in both cellular and whole body energy homeostasis. However, it has been recently highlighted that AICA riboside also exerts AMPK-independent effects, mainly on AMP-regulated enzymes and mitochondrial oxidative phosphorylation (OXPHOS), leading to the conclusion that new compounds with reduced off-target effects are needed to specifically activate AMPK. Here, we review recent findings on newly discovered AMPK activators, notably on A-769662, a nonnucleoside compound from the thienopyridone family. We also report that A-769662 is able to activate AMPK and stimulate glucose uptake in both L6 cells and primary myotubes derived from human satellite cells. In addition, A-769662 increases AMPK activity and phosphorylation of its main downstream targets in primary cultured rat hepatocytes but, in contrast to AICA riboside, neither affects mitochondrial OXPHOS nor changes the cellular AMP:ATP ratio. We conclude that A-769662 could be one of the new promising chemical agents to activate AMPK with limited AMPK-independent side effects.

10.
The available quantity of archaeobotanical data derived from the identification of macroremains has expanded considerably over the last few decades. In order to obtain a supraregional or even regional overview for a particular period of time, or of the distribution of a single species, a database is needed. At the Archaeobotanical Department of the Institute of the "Kommission für Archäologische Landesforschung in Hessen e.V." (KAL) such a database has been developed in the last few years. It is suitable for the handling of large quantities of archaeobotanical results, including a whole range of background information comprising archaeological, ecological and other related data, and offers various possibilities for the evaluation of these data. Received January 8, 2001 / Accepted April 9, 2002

11.
MS/MS and database searching has emerged as a valuable technology for rapidly analyzing protein expression, localization, and post-translational modifications. The probability-based search engine Mascot has found widespread use as a tool to correlate tandem mass spectra with peptides in a sequence database. Although the Mascot scoring algorithm provides a probability-based model for peptide identification, the independent peptide scores do not correlate with the significance of the proteins to which they match. Herein, we describe a heuristic method for organizing proteins identified at a specified false-discovery rate using Mascot-matched peptides. We call this method PROVALT, and it uses peptide matches from a random database to calculate false-discovery rates for protein identifications and reduces a complex list of peptide matches to a nonredundant list of homologous protein groups. This method was evaluated using Mascot-identified peptides from a Trypanosoma cruzi epimastigote whole-cell lysate, which was separated by multidimensional LC and analyzed by MS/MS. PROVALT was then compared with the two traditional methods of protein identification when using Mascot, the single peptide score and cumulative protein score methods, and was shown to be superior to both with regard to the number of proteins identified and the inclusion of lower scoring nonrandom peptide matches.
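The idea of using matches against a random database to calculate a false-discovery rate can be sketched as follows. This is a common decoy-style estimator (random-database hits divided by real-database hits at the same score threshold), shown under the assumption that it conveys the general principle; it is not claimed to be PROVALT's exact formula, and the function names and score lists are hypothetical.

```python
# Hypothetical sketch: estimate a false-discovery rate from scores obtained by
# searching a real (target) database and a random database.
def estimate_fdr(target_scores, random_scores, threshold):
    """Estimate FDR as random hits / target hits at or above the threshold."""
    target_hits = sum(1 for s in target_scores if s >= threshold)
    random_hits = sum(1 for s in random_scores if s >= threshold)
    if target_hits == 0:
        return 0.0
    return min(1.0, random_hits / target_hits)


def lowest_threshold_for_fdr(target_scores, random_scores, max_fdr):
    """Return the lowest observed target score whose estimated FDR <= max_fdr."""
    for threshold in sorted(set(target_scores)):
        if estimate_fdr(target_scores, random_scores, threshold) <= max_fdr:
            return threshold
    return None  # no threshold satisfies the requested rate


if __name__ == "__main__":
    target = [10, 20, 30, 40, 50]
    random_db = [5, 12, 22]
    print(lowest_threshold_for_fdr(target, random_db, 0.25))
```

Choosing the threshold from the requested rate, rather than fixing a score cutoff in advance, is what allows lower-scoring but nonrandom matches to be retained when the random database produces few hits in that score range.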

12.

Background  

BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming.

13.
The process of knowledge discovery from big and high dimensional datasets has become a popular research topic. The classification problem is a key task in bioinformatics, business intelligence, decision science, astronomy, physics, etc. Building associative classifiers has been a notable research interest in recent years because of their superior accuracy. In associative classifiers, using under-sampling or over-sampling methods for imbalanced big datasets reduces accuracy or increases running time, respectively. Hence, there is a significant need to create efficient associative classifiers for imbalanced big data problems. These classifiers should be able to handle challenges such as memory usage, running time and efficient exploration of the search space. To this end, efficient calculation of measures is a primary objective for associative classifiers. In this paper, we propose a new efficient associative classifier for big imbalanced datasets. The proposed method is based on Rare-PEARs (a multi-objective evolutionary algorithm that efficiently discovers rare and reliable association rules) and is able to evaluate rules in a distributed manner by using a new data storage format. This format simplifies the calculation of measures and is fully compatible with the MapReduce programming model. We have applied the proposed method (RPII) to a well-known big dataset (ECBDL’14) and have compared our results with seven other learning methods. The experimental results show that RPII outperforms the other methods in the sensitivity and final score measures (the values of the sensitivity and final score measures were approximately 0.74 and 0.54, respectively). The results demonstrate that the proposed method is a good candidate for large-scale classification problems; furthermore, it achieves reasonable execution time when the target platform is a typical computer cluster.

14.
This paper presents a language for describing arrangements of motifs in biological sequences, and a program that uses the language to find the arrangements in motif match databases. The program does not by itself search for the constituent motifs, and is thus independent of how they are detected, which allows it to use motif match data of various origins. AVAILABILITY: The program can be tested online at http://hits.isb-sib.ch and the distribution is available from ftp://ftp.isrec.isb-sib.ch/pub/software/unix/mmsearch-1.0.tar.gz CONTACT: Thomas.Junier@isrec.unil.ch SUPPLEMENTARY INFORMATION: The full documentation about mmsearch is available from http://hits.isb-sib.ch/~tjunier/mmsearch/doc.

15.
The recent accumulation of large amounts of 3D structural data warrants a sensitive and automatic method to compare and classify these structures. We developed a web server for comparing protein 3D structures using the program Matras (http://biunit.aist-nara.ac.jp/matras). An advantage of Matras is its structure similarity score, which is defined as the log-odds of the probabilities, similar to Dayhoff's substitution model of amino acids. This score is designed to detect evolutionarily related (homologous) structural similarities. Our web server has three main services. The first one is a pairwise 3D alignment, which simply aligns two structures. A user can assign structures either by inputting PDB codes or by uploading PDB format files from the local machine. The second service is a multiple 3D alignment, which compares several protein structures. This program employs the progressive alignment algorithm, in which pairwise 3D alignments are assembled in the proper order. The third service is a 3D library search, which compares one query structure against a large number of library structures. We hope this server provides useful tools for insights into protein 3D structures.

16.
17.
A new method was developed for identifying amyloidogenic regions in protein chains. The formation of amyloid fibrils was attributed to protein regions enriched in residues with a high expected packing density. Predictions consistent with experimental findings were obtained for 8 out of 11 amyloid-forming proteins examined.

18.
We suggest a new method to detect amyloidogenic regions in a protein sequence. In the present work it is shown that regions enriched with amino acid residues which have a high expected packing density are responsible for the amyloid formation. Our predictions are consistent with known disease-related amyloidogenic regions for 8 of 11 amyloid-forming proteins and peptides in which positions of amyloidogenic regions have been revealed experimentally.
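One way to locate regions "enriched with" high-packing-density residues is a sliding-window average over a per-residue scale, sketched below. The abstract does not give the scale values, the window length, or the cutoff, so all three are caller-supplied parameters here, and the function name `high_density_regions` is hypothetical; this is an illustrative reading rather than the authors' published procedure.

```python
# Hypothetical sketch: flag sequence regions whose window-averaged expected
# packing density meets a threshold. The scale, window and threshold are
# assumptions supplied by the caller, not values from the abstract.
def high_density_regions(sequence, density_scale, window, threshold):
    """Return merged (start, end) half-open ranges covered by qualifying windows."""
    n = len(sequence)
    if window <= 0 or n < window:
        return []
    values = [density_scale[residue] for residue in sequence]
    flagged = [False] * n
    window_sum = sum(values[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window by one residue: add the new value, drop the old.
            window_sum += values[start + window - 1] - values[start - 1]
        if window_sum / window >= threshold:
            for i in range(start, start + window):
                flagged[i] = True
    # Merge consecutive flagged positions into regions.
    regions = []
    region_start = None
    for i, is_flagged in enumerate(flagged):
        if is_flagged and region_start is None:
            region_start = i
        elif not is_flagged and region_start is not None:
            regions.append((region_start, i))
            region_start = None
    if region_start is not None:
        regions.append((region_start, n))
    return regions


if __name__ == "__main__":
    toy_scale = {"A": 1.0, "B": 3.0}  # toy two-letter alphabet, not real data
    print(high_density_regions("AAABBBAAA", toy_scale, 3, 3.0))
```

Averaging over a window rather than thresholding single residues reflects the abstract's emphasis on enriched regions: an isolated high-density residue does not by itself mark a segment.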

19.
Signature sequences are contiguous patterns of amino acids 10-50 residues long that are associated with a particular structure or function in proteins. These may be of three types (by our nomenclature): superfamily signatures, remnant homologies, and motifs. We have performed a systematic search through a database of protein sequences to automatically and preferentially find remnant homologies and motifs. This was accomplished in three steps: 1. We generated a nonredundant sequence database. 2. We used BLAST3 (Altschul and Lipman, Proc. Natl. Acad. Sci. U.S.A. 87:5509-5513, 1990) to generate local pairwise and triplet sequence alignments for every protein in the database vs. every other. 3. We selected "interesting" alignments and grouped them into clusters. We find that most of the clusters contain segments from proteins which share a common structure or function. Many of them correspond to signatures previously noted in the literature. We discuss three previously recognized motifs in detail (FAD/NAD-binding, ATP/GTP-binding, and cytochrome b5-like domains) to demonstrate how the alignments generated by our procedure are consistent with previous work and make structural and functional sense. We also discuss two signatures (for N-acetyltransferases and glycerol-phosphate binding) which to our knowledge have not been previously recognized.

20.
P L Chart  E Franssen 《CMAJ》1997,157(9):1235-1242
OBJECTIVE: To examine the characteristics of malignant tumours that develop in women undergoing surveillance for increased risk for breast cancer and to identify presentation patterns in order to determine the respective roles of mammography, clinical breast examination (CBE) and breast self-examination (BSE). SETTING: Breast Diagnostic Clinic and Familial Breast Cancer Clinic at Toronto-Sunnybrook Regional Cancer Centre. PARTICIPANTS: A total of 1044 women evaluated for breast cancer risk from Oct. 1, 1990, to Dec. 31, 1996, of whom 381 were categorized as being at high risk, 204 as being at moderate risk, 401 as being at slightly increased risk and 58 as being at no appreciably increased risk. PROGRAM COMPONENTS: Comprehensive review and discussion of risk factors, clinical assessment, surveillance recommendations that include mammography, CBE and BSE, genetics consultation (Familial Breast Cancer Clinic) and psychosocial support. Data are captured prospectively, updated at each visit and audited every 3 to 6 months. PROGRAM OUTCOMES: During the study period breast cancer was diagnosed in 24 patients, 12 in the high-risk group, 4 in the moderate-risk group and 8 in the group at slightly increased risk. The mean age at diagnosis was 47 (range 32 to 82) years. Ten cases of cancer were diagnosed during surveillance (incident cancer), 5 in women under age 50. The mean length of time from initial assessment to diagnosis was 28.6 (range 12 to 51) months. Of the 24 women, 17 reported a family history of breast cancer. The mean age at diagnosis in this cohort was 45.5 years, and the diagnosis was made under age 50 in 10 patients (59%). The mean earliest age at which breast cancer was diagnosed in a family member was 42.5 years. CONCLUSIONS: These preliminary results suggest that surveillance of women at increased risk for breast cancer may be useful in detecting disease at an early stage. The regular performance of mammography, CBE and BSE appears necessary to achieve these results.
