Similar articles
20 similar articles found (search time: 0 ms)
1.
Improved sensitivity of biological sequence database searches   Total citations: 26 (self-citations: 0, citations by others: 26)
We have increased the sensitivity of DNA and protein sequence database searches by allowing similar but non-identical amino acids or nucleotides to match. In addition, one can match k-tuples or words instead of matching individual residues in order to speed the search. A matching matrix specifies which k-tuples match each other. The matching matrix can be calculated from a similarity matrix of amino acids and a threshold of similarity required for matching. This permits amino acid similarity matrices or replacement matrices (PAM matrices) to be used in the first step of a sequence comparison rather than in a secondary scoring phase. The concept of matching non-identical k-tuples also increases the power of DNA database searches. For example, a matrix that specifies that any 3-tuple in a DNA sequence can match any other 3-tuple encoding the same amino acid permits a DNA database search using a DNA query sequence for regions that would encode a similar amino acid sequence. Received on October 10, 1989; accepted on May 1, 1990
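The 3-tuple example in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the standard genetic code is assumed, and excluding stop codons from matching is an assumption made here for simplicity.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def tuples_match(a, b):
    """True if two DNA 3-tuples encode the same amino acid.

    Assumption: stop codons never match anything.
    """
    aa_a, aa_b = CODON_TO_AA.get(a), CODON_TO_AA.get(b)
    return aa_a is not None and aa_a != "*" and aa_a == aa_b


def matching_positions(query, target):
    """Return (i, j) pairs where the 3-tuple at query[i] matches the one at target[j]."""
    # Index the target by encoded amino acid so each query 3-tuple is one lookup.
    by_aa = {}
    for j in range(len(target) - 2):
        aa = CODON_TO_AA.get(target[j:j + 3])
        if aa and aa != "*":
            by_aa.setdefault(aa, []).append(j)
    hits = []
    for i in range(len(query) - 2):
        aa = CODON_TO_AA.get(query[i:i + 3])
        if aa and aa != "*":
            hits.extend((i, j) for j in by_aa.get(aa, []))
    return hits
```

Grouping 3-tuples by the amino acid they encode plays the role of the matching matrix here: two 3-tuples match exactly when they fall in the same group.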

2.
3.
The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W, which is freely available.
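The idea of locally reduced gap penalties in hydrophilic regions might be sketched as follows. This is only an illustration of the principle, not CLUSTAL W's code: the residue set, the minimum run length, the base penalty and the reduction factor are all assumed values chosen for the example.

```python
# Assumed set of residues treated as hydrophilic (illustrative, not taken from the abstract).
HYDROPHILIC = set("DEGKNQPRS")


def gap_open_penalties(seq, base=10.0, min_run=5, factor=2 / 3):
    """Return one gap-opening penalty per position of seq.

    Positions inside a run of at least `min_run` consecutive hydrophilic
    residues get `base * factor`; all other positions keep `base`.
    All numeric defaults are assumptions for this sketch.
    """
    penalties = [base] * len(seq)
    run_start = None
    # A sentinel character closes any run that reaches the end of the sequence.
    for i, residue in enumerate(seq + "#"):
        if residue in HYDROPHILIC:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                for p in range(run_start, i):
                    penalties[p] = base * factor
            run_start = None
    return penalties
```

An aligner would consult such a per-position vector when deciding where to open a gap, so gaps become cheaper in stretches that are likely to be loops.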

4.

Background  

The inference of homology between proteins is a key problem in molecular biology. The current best approaches only identify ~50% of homologies (with a false positive rate set at 1/1000).

5.
Dot-matrix sequence similarity searches can be greatly speeded up through use of a table listing all locations of short oligomers in one of the sequences to find potential similarities with a second sequence. The algorithm described finds similarities between two sequences of lengths M and N, comparing L residues at a time, with an efficiency of L × M × N / S^k, where S is the alphabet size and k is the length of the oligomer. For nucleic acids, in which S = 4, use of a tetranucleotide table results in an efficiency of L × M × N / 256. The simplicity of the approach allows for a straightforward calculation of the level of similarities expected to be found for given search parameters. Furthermore, the storage required is minimal, allowing for even large sequences to be compared on small microcomputers. Theoretical considerations regarding the use of this search are discussed.
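The oligomer-table lookup described in this abstract can be sketched roughly as below. The function names are hypothetical, the window length L is not modelled, and the expected-hit estimate assumes uniformly random, independent residues, which is an assumption of this sketch rather than a statement from the abstract.

```python
from collections import defaultdict


def kmer_table(seq, k):
    """Map every length-k oligomer in seq to the list of positions where it starts."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table


def dot_matrix_hits(seq_a, seq_b, k):
    """Return (i, j) pairs where seq_a[i:i+k] equals seq_b[j:j+k].

    Only positions sharing an oligomer are visited, instead of all M x N cells.
    """
    table = kmer_table(seq_a, k)
    hits = []
    for j in range(len(seq_b) - k + 1):
        for i in table.get(seq_b[j:j + k], ()):
            hits.append((i, j))
    return hits


def expected_random_hits(m, n, k, alphabet_size):
    """Expected number of chance k-mer matches for random sequences of lengths m and n."""
    return (m - k + 1) * (n - k + 1) / alphabet_size ** k
```

With S = 4 and k = 4, the divisor in `expected_random_hits` is 4^4 = 256, the same factor that appears in the abstract's tetranucleotide example.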

6.
In this work, a method for improved protein identification of low-abundance proteins using unstained gels, in combination with robotics and matrix-assisted laser desorption/ionization tandem mass spectrometry, has been developed and evaluated. Omitting the silver-staining process resulted in increased protein identification scores, an increase in the number of peptides observed in the MALDI mass spectrum, and improved quality of the tandem mass spectrometry data.

7.
Optimized homology searches of the gene and protein sequence data banks   Total citations: 3 (self-citations: 0, citations by others: 3)
A strategy is presented for searching the gene and protein sequence data banks which combines the use of two previously described algorithms. The implementation of this strategy is thoroughly evaluated with respect to sensitivity, specificity and speed. The establishment of standard benchmarks for comparing programs that search the sequence data banks for homology is proposed.

8.
9.
10.
Chloroperoxidase (CPO) from Caldariomyces fumago is a potentially very useful enzyme due to its ability to catalyze a large variety of stereoselective oxidation reactions, but poor operational stability is a main limitation for commercial use. In the present study, the possibility of increasing the operational stability by use of antioxidants was investigated using the oxidation of indole as a model reaction. Caffeic acid was the antioxidant showing the strongest positive effects, reaching a total turnover number (TTN) of 135,000 at pH 4 and 4 mM hydrogen peroxide, compared to 28,700 in the absence of antioxidant. Portion-wise addition of hydrogen peroxide in the presence of caffeic acid caused a further increase in TTN to 171,000. An alternative way to reach high TTN was to use tert-butyl hydroperoxide as oxidant instead of hydrogen peroxide: a TTN of 600,000 was achieved although the reaction was quite slow. In this case, antioxidants did not have any positive effect. Possible mechanisms for the observed inactivation of CPO are discussed.

11.
12.
Ramu C. Nucleic Acids Research 2003, 31(13): 3771-3774
SIRW (http://sirw.embl.de/) is a World Wide Web interface to the Simple Indexing and Retrieval System (SIR) that is capable of parsing and indexing various flat file databases. In addition it provides a framework for doing sequence analysis (e.g. motif pattern searches) for selected biological sequences through keyword search. SIRW is an ideal tool for the bioinformatics community for searching as well as analyzing biological sequences of interest.

13.
A series of alphaVbeta3 receptor antagonists lacking the amide bond of previously reported 'chain-shortened' compounds is described. Replacement of the lone amide bond with two methylene groups in this series yields more lipophilic compounds that have longer half-lives, lower clearance, and greater oral bioavailability when administered to dogs.

14.
A suite of tests to evaluate the statistical significance of protein sequence similarities is developed for use in data bank searches. The tests are based on the Wilbur-Lipman word-search algorithm, and take into account the sequence lengths and compositions, and optionally the weighting of amino acid matches. The method is extended to allow for the existence of a sequence insertion/deletion within the region of similarity. The accuracy of statistical distributions underlying the tests is validated using randomly generated sequences and real sequences selected at random from the data banks. A computer program to perform the tests is briefly described.
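A general way to judge whether a word-based similarity is more than chance is to compare it with scores obtained from shuffled sequences, which preserve length and composition. The sketch below illustrates only that general idea; it is neither the tests described in this abstract nor the Wilbur-Lipman algorithm, and the shared-word statistic, the function names and the default parameters are assumptions chosen for the example.

```python
import random
from statistics import mean, pstdev


def shared_words(a, b, k=2):
    """Count positions in b whose length-k word also occurs somewhere in a."""
    words_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    return sum(1 for j in range(len(b) - k + 1) if b[j:j + k] in words_a)


def shuffle_z_score(a, b, k=2, trials=200, seed=0):
    """Return (observed score, z-score) against shuffled versions of b.

    Shuffling b keeps its length and residue composition but destroys word order.
    If the shuffled scores have zero spread, the z-score is reported as 0.0.
    """
    rng = random.Random(seed)
    observed = shared_words(a, b, k)
    residues = list(b)
    null_scores = []
    for _ in range(trials):
        rng.shuffle(residues)
        null_scores.append(shared_words(a, "".join(residues), k))
    spread = pstdev(null_scores)
    z = (observed - mean(null_scores)) / spread if spread > 0 else 0.0
    return observed, z
```

A large positive z-score suggests that the shared words reflect word order rather than composition alone; turning it into a probability would require a distributional assumption that this sketch does not make.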

15.
Effective management of coastal and marine resources requires knowledge of how community sensitivity varies spatially. With this in mind, we developed a benthic sensitivity index (SI), based on the distribution and abundance of five ecological groups, that can be used to assess community tolerance to organic enrichment and other disturbances. The index, projected as a high-resolution map, ranks communities from those dominated by sensitive and ecologically important species (i.e., low SI values) to those composed mainly of tolerant and/or opportunistic species (i.e., high SI values). Applying our model to a multiple-use case study in southeast Brazil, we were able to show considerable variability in the sensitivity of communities across the study area that was relatively stable over time. This allowed us to evaluate the possible direct (i.e., spatially overlapping) and indirect effects (i.e., cumulative changes to the physical environment) of a range of activities on sensitive and ecologically diverse benthic communities. Our approach and the resulting high-resolution maps hold promise for a range of spatial planning applications, including the development of coastal infrastructure, assessments of the representativeness of marine protected areas and other activities such as the selection of appropriate locations for dredge spoil dumping. Overall, we present a novel and transparent way of extrapolating limited survey data to provide spatial and temporal information on the sensitivity of benthic communities in multiple-use coastal and marine areas.

16.
17.
The metabolic pathways of the central carbon metabolism in Saccharomyces cerevisiae are well studied and consequently S. cerevisiae has been widely evaluated as a cell factory for many industrial biological products. In this study, we investigated the effect of engineering the supply of precursor, acetyl‐CoA, and cofactor, NADPH, on the biosynthesis of the bacterial biopolymer polyhydroxybutyrate (PHB), in S. cerevisiae. Supply of acetyl‐CoA was engineered by over‐expression of genes from the ethanol degradation pathway or by heterologous expression of the phosphoketolase pathway from Aspergillus nidulans. Both strategies improved the production of PHB. Integration of gapN encoding NADP+‐dependent glyceraldehyde‐3‐phosphate dehydrogenase from Streptococcus mutans into the genome enabled an increased supply of NADPH resulting in a decrease in glycerol production and increased production of PHB. The strategy that resulted in the highest PHB production after 100 h was with a strain harboring the phosphoketolase pathway to supply acetyl‐CoA without the need of increased NADPH production by gapN integration. The results from this study imply that during the exponential growth on glucose, the biosynthesis of PHB in S. cerevisiae is likely to be limited by the supply of NADPH whereas supply of acetyl‐CoA as precursor plays a more important role in the improvement of PHB production during growth on ethanol. Biotechnol. Bioeng. 2013; 110: 2216–2224. © 2013 Wiley Periodicals, Inc.

18.
19.
20.
Mass spectrometry-driven BLAST (MS BLAST) is a database search protocol for identifying unknown proteins by sequence similarity to homologous proteins available in a database. MS BLAST utilizes redundant, degenerate, and partially inaccurate peptide sequence data obtained by de novo interpretation of tandem mass spectra and has become a powerful tool in functional proteomic research. Using computational modeling, we evaluated the potential of MS BLAST for proteome-wide identification of unknown proteins. We determined how the success rate of protein identification depends on the full-length sequence identity between the queried protein and its closest homologue in a database. We also estimated phylogenetic distances between organisms under study and related reference organisms with completely sequenced genomes that allow substantial coverage of unknown proteomes.
