Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
When aligning RNAs, it is important to consider both the secondary structure similarity and primary sequence similarity to find an accurate alignment. However, algorithms that can handle RNA secondary structures typically have high computational complexity that limits their utility. For this reason, there have been a number of attempts to find useful alignment constraints that can reduce the computations without sacrificing the alignment accuracy. In this paper, we propose a new method for finding effective alignment constraints for fast and accurate structural alignment of RNAs, including pseudoknots. In the proposed method, we use a profile-HMM to identify the “seed” regions that can be aligned with high confidence. We also estimate the position range of the aligned bases that are located outside the seed regions. The location of the seed regions and the estimated range of the alignment positions are then used to establish the sequence alignment constraints. We incorporated the proposed constraints into the profile context-sensitive HMM (profile-csHMM) based RNA structural alignment algorithm. Experiments indicate that the proposed method can make the alignment up to 11 times faster without degrading the accuracy of the RNA alignment.

2.
3.
Mol. Biol. Evol. 2007 24:1464-1479. The first affiliation should have appeared as EMBL-European Bioinformatics Institute, Hinxton, United Kingdom. On page

4.

Background

Mapping protein primary sequences to their three dimensional folds referred to as the 'second genetic code' remains an unsolved scientific problem. A crucial part of the problem concerns the geometrical specificity in side chain association leading to densely packed protein cores, a hallmark of correctly folded native structures. Thus, any model of packing within proteins should constitute an indispensable component of protein folding and design.

Results

In this study an attempt has been made to find, characterize and classify recurring patterns in the packing of side chain atoms within a protein which sustains its native fold. The interaction of side chain atoms within the protein core has been represented as a contact network based on the surface complementarity and overlap between associating side chain surfaces. Some network topologies definitely appear to be preferred and they have been termed 'packing motifs', analogous to super secondary structures in proteins. Study of the distribution of these motifs reveals the ubiquitous presence of typical smaller graphs, which appear to get linked or coalesce to give larger graphs, reminiscent of the nucleation-condensation model in protein folding. One such frequently occurring motif, also envisaged as the unit of clustering, the three residue clique was invariably found in regions of dense packing. Finally, topological measures based on surface contact networks appeared to be effective in discriminating sequences native to a specific fold amongst a set of decoys.

Conclusions

Out of innumerable topological possibilities, only a finite number of specific packing motifs are actually realized in proteins. This small number of motifs could serve as a basis set in the construction of larger networks. Of these, the triplet clique exhibits distinct preference both in terms of composition and geometry.

5.
C Savin  A Leclercq  E Carniel 《PloS one》2012,7(7):e41176
Enteropathogenic Yersinia are among the most frequent agents of human diarrhea in temperate and cold countries. However, the incidence of yersiniosis is largely underestimated because of the peculiar growth characteristics of pathogenic Yersinia, which make their isolation from poly-contaminated samples difficult. The use of specific procedures for Yersinia isolation is required, but is expensive and time consuming, and therefore is not systematically performed in clinical pathology laboratories. A means to circumvent this problem would be to use a single procedure for the isolation of all bacterial enteropathogens. Since the Statens Serum Institut enteric medium (SSI) has been reported to allow the growth at 37°C of most Gram-negative bacteria, including Yersinia, our study aimed at evaluating its performance for Yersinia isolation, as compared to the commonly used Yersinia-specific semi-selective Cefsulodin-Irgasan-Novobiocin medium (CIN) incubated at 28°C. Our results show that Yersinia pseudotuberculosis growth was strongly inhibited on SSI at 37°C, and therefore that this medium is not suitable for the isolation of this species. All Yersinia enterocolitica strains tested grew on SSI, while some non-pathogenic Yersinia species were inhibited. The morphology of Y. enterocolitica colonies on SSI allowed their differentiation from various other Gram-negative bacteria commonly isolated from stool samples. However, in artificially contaminated human stools, the recovery of Y. enterocolitica colonies on SSI at 37°C was difficult and was 3 logs less sensitive than on CIN at 28°C. Therefore, despite its limitations, the use of a specific procedure (CIN incubated at 28°C) is still required for an efficient isolation of enteropathogenic Yersinia from stools.

6.
A Model for DNA Sequence Evolution within Transposable Element Families   Cited by: 5 (self-citations: 2, citations by others: 3)
J. F. Y. Brookfield 《Genetics》1986,112(2):393-407
A quantitative model is proposed for the expected degree of relationship between copies of a family of transposable elements in a finite population of hosts. Special cases of the model (in which the process of homogenization of element copies either is or is not limited by transposition rate) are presented and illustrated, using data on mobile sequences from different species. It is shown that transposition will be expected, in large populations, to result in only a rather distant relationship between transposable elements at different genomic sites. Possible inadequacies of the model are suggested and quantified.

7.
Sequence divergence derives from either point substitution or indel (insertion or deletion) processes. We investigated the rates of these two processes in both protein-coding and non-protein-coding DNA. We aligned sequence pairs using two pair-hidden Markov models (PHMMs) conjoined by one silent state. Each PHMM had its own set of parameters to model rates in its respective region. The aim was to test the hypothesis that the indel mutation rate mimics the point mutation rate. That is, indels are found less often in conserved regions (slow point substitution rate) and more often in non-conserved regions (fast point substitution rate). Both polypeptides and rRNA molecules in our data exhibited a clear distinction between slow and fast rates of the two processes. These two rates served as surrogates for conserved and non-conserved secondary structure components, respectively. With polypeptides we found that both the fast indel rate and the fast replacement rate were co-located with hydrophilic residues. We also found that the average concordance of our alignments with corresponding curated alignments improves markedly when the model allows either of the two fast rates to co-locate with hydrophilic residues. With rRNA molecules, our model did not detect co-location between the fast indel rate and the fast substitution rate. Nevertheless, coupling the indel rates with the point substitution rates across the two regions markedly increased model fit. This result suggests that rRNA pairwise alignments should be modeled after allowing for the two processes to vary simultaneously and independently in the two regions.

8.
Germline mutation rates have been found to be higher in males than in females in many organisms, a likely consequence of cell division being more frequent in spermatogenesis than in oogenesis. If the majority of mutations are due to DNA replication error, the male-to-female mutation rate ratio (αm) is expected to be similar to the ratio of the number of germ line cell divisions in males and females (c), an assumption that can be tested with proper estimates of αm and c. αm is usually estimated by comparing substitution rates in putatively neutral sequences on the sex chromosomes. However, substantial regional variation in substitution rates across chromosomes may bias estimates of αm based on the substitution rates of short sequences. To investigate regional substitution rate variation, we estimated sequence divergence in 16 gametologous introns located on the Z and W chromosomes of five bird species of the order Galliformes. Intron ends and potentially conserved blocks were excluded to reduce the effect of using sequences subject to negative selection. We found significant substitution rate variation within Z chromosome introns (G15 = 37.6, p = 0.0010) as well as within W chromosome introns (G15 = 44.0, p = 0.0001). This heterogeneity also affected the estimates of αm, which varied significantly, from 1.53 to 3.51, among the introns (ANOVA: F13,14 = 2.68, p = 0.04). Our results suggest the importance of using extensive data sets from several genomic regions to avoid the effects of regional mutation rate variation and to ensure accurate estimates of αm. Electronic supplementary material is available for this article and is accessible to authorised users. [Reviewing Editor: Mr. Martin Kreitman] Nick G.C. Smith: deceased.
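As a rough illustration of how αm can be derived from Z- and W-linked substitution rates in birds (males ZZ, females ZW): the abstract does not state the estimator it used, so the relation below is an assumption based on the standard sex-chromosome argument. The symbols μ_m, μ_f, μ_Z, μ_W and R are labels introduced here for illustration only.

```latex
% Assumption: a Z-linked sequence spends 2/3 of its time in males and 1/3 in females,
% while a W-linked sequence is transmitted only through females.
\mu_Z = \tfrac{2}{3}\,\mu_m + \tfrac{1}{3}\,\mu_f, \qquad \mu_W = \mu_f
% Ratio of Z to W substitution rates, with \alpha_m = \mu_m / \mu_f:
R = \frac{\mu_Z}{\mu_W} = \frac{2\,\alpha_m + 1}{3}
\quad\Longrightarrow\quad
\alpha_m = \frac{3R - 1}{2}
```

Under this relation, the reported range of αm (1.53 to 3.51) would correspond to Z/W rate ratios of roughly 1.35 to 2.67, which shows how a moderate change in the measured ratio for a single intron shifts the αm estimate.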

9.
Considerable variations in the content of free amino acids, ethanol-soluble carbohydrates, starch, protein, chlorophyll, phytic acid, RNA and DNA exist in different regions of the long filaments of Cuscuta reflexa. The distinction is especially pronounced when comparison is made between the haustoria-bearing curls of the parasite and the apical portions of the overhanging filament.

10.
H. Tachida 《Genetics》1996,143(2):1033-1042
A transient population genetic model of SINE (short interspersed repetitive element) evolution assuming the master copy model is theoretically investigated. Means and variances of consensus frequency of nucleotides, nucleotide homozygosity, and the number of shared differences that are considered to have been caused by mutations occurring in the master copy lineages are computed. All quantities investigated are shown to be monotone functions of the duration of the expansion period. Thus, they can be used to estimate the expansion period although their sampling variances are generally large. Using the theoretical results, the Sb subfamily of human Alu sequences is analyzed. First, the expansion period is estimated from the observed mean and variance of homozygosity. The expansion period is shown to be short compared to the time since the end of the expansion of the subfamily. However, the observed number of shared differences is more than twice that expected under the master copy model with the estimated expansion period. Alternative models to explain this observation, including one with multiple master copy loci, are discussed.

11.
Multiple Sequence Alignment (MSA) methods are typically benchmarked on sets of reference alignments. The quality of the alignment can then be represented by the sum-of-pairs (SP) or column (CS) scores, which measure the agreement between a reference and corresponding query alignment. Both the SP and CS scores treat mismatches between a query and reference alignment as equally bad, and do not take the separation into account between two amino acids in the query alignment, that should have been matched according to the reference alignment. This is significant since the magnitude of alignment shifts is often of relevance in biological analyses, including homology modeling and MSA refinement/manual alignment editing. In this study we develop a new alignment benchmark scoring scheme, SPdist, that takes the degree of discordance of mismatches into account by measuring the sequence distance between mismatched residue pairs in the query alignment. Using this new score along with the standard SP score, we investigate the discriminatory behavior of the new score by assessing how well six different MSA methods perform with respect to BAliBASE reference alignments. The SP score and the SPdist score yield very similar outcomes when the reference and query alignments are close. However, for more divergent reference alignments the SPdist score is able to distinguish between methods that keep alignments approximately close to the reference and those exhibiting larger shifts. We observed that by using SPdist together with SP scoring we were able to better delineate the alignment quality difference between alternative MSA methods. With a case study we exemplify why it is important, from a biological perspective, to consider the separation of mismatches. The SPdist scoring scheme has been implemented in the VerAlign web server (http://www.ibi.vu.nl/programs/veralignwww/). The code for calculating SPdist score is also available upon request.
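The contrast drawn in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the VerAlign implementation: `sp_score` follows the usual sum-of-pairs definition (fraction of reference residue pairs recovered by the query), while `mean_separation` is a hypothetical, column-based stand-in for the idea behind SPdist, whose exact published formula is not given in the abstract. All function names are assumptions introduced here, and both alignments are assumed to contain the same sequences in the same order.

```python
from itertools import combinations

GAP = "-"

def residue_columns(alignment):
    """Map (sequence index, residue index) -> column index in a gapped alignment."""
    columns = {}
    for seq_idx, row in enumerate(alignment):
        res_idx = 0
        for col, char in enumerate(row):
            if char != GAP:
                columns[(seq_idx, res_idx)] = col
                res_idx += 1
    return columns

def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column."""
    by_column = {}
    for residue, col in residue_columns(alignment).items():
        by_column.setdefault(col, []).append(residue)
    pairs = set()
    for residues in by_column.values():
        pairs.update(combinations(sorted(residues), 2))
    return pairs

def sp_score(reference, query):
    """Fraction of reference residue pairs that are also aligned in the query."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(query)) / len(ref_pairs)

def mean_separation(reference, query):
    """Hypothetical distance-aware measure (NOT the published SPdist formula):
    average column distance, in the query, between residues paired in the reference."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    query_cols = residue_columns(query)
    total = sum(abs(query_cols[a] - query_cols[b]) for a, b in ref_pairs)
    return total / len(ref_pairs)

reference = ["ACGT", "ACGT"]
query_near = ["ACGT-", "-ACGT"]      # shifted by one column
query_far = ["ACGT---", "---ACGT"]   # shifted by three columns
print(sp_score(reference, query_near), mean_separation(reference, query_near))
print(sp_score(reference, query_far), mean_separation(reference, query_far))
```

Both shifted queries recover none of the reference pairs, so the SP score cannot tell them apart, whereas the separation measure grows with the size of the shift. That is the distinction the abstract argues a distance-aware score should capture.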

12.

Motivation

To obtain large-scale sequence alignments in a fast and flexible way is an important step in the analyses of next generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis.

Results

With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.
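To make concrete what "alignment details" means here, the following is a minimal single-threaded sketch of Smith-Waterman local alignment with a traceback that reports score, matches, mismatches and gaps. It is not PaSWAS code and does not run on a GPU. The function name, the linear gap penalty and the scoring values are assumptions chosen for illustration, and only the single best hit is reported.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment of strings a and b with a linear gap penalty (illustrative values)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cell values never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero is reached, counting events.
    i, j = best_pos
    matches = mismatches = gaps = 0
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            if a[i - 1] == b[j - 1]:
                matches += 1
            else:
                mismatches += 1
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            gaps += 1
            i -= 1
        else:
            gaps += 1
            j -= 1

    return {
        "score": best,
        "matches": matches,
        "mismatches": mismatches,
        "gaps": gaps,
        "start": (i, j),      # 0-based start offsets in a and b
        "end": best_pos,      # exclusive end offsets in a and b
    }

print(smith_waterman("AAGAA", "AATAA"))
```

A GPU implementation would typically compute the matrix along anti-diagonals, since cells on the same anti-diagonal do not depend on each other; the traceback step is what supplies the gap and mismatch counts.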

13.
Immunogold localization revealed that OmcS, a cytochrome that is required for Fe(III) oxide reduction by Geobacter sulfurreducens, was localized along the pili. The apparent spacing between OmcS molecules suggests that OmcS facilitates electron transfer from pili to Fe(III) oxides rather than promoting electron conduction along the length of the pili.

There are multiple competing/complementary models for extracellular electron transfer in Fe(III)- and electrode-reducing microorganisms (8, 18, 20, 44). Which mechanisms prevail in different microorganisms or environmental conditions may greatly influence which microorganisms compete most successfully in sedimentary environments or on the surfaces of electrodes and can impact practical decisions on the best strategies to promote Fe(III) reduction for bioremediation applications (18, 19) or to enhance the power output of microbial fuel cells (18, 21).

The three most commonly considered mechanisms for electron transfer to extracellular electron acceptors are (i) direct contact between redox-active proteins on the outer surfaces of the cells and the electron acceptor, (ii) electron transfer via soluble electron shuttling molecules, and (iii) the conduction of electrons along pili or other filamentous structures. Evidence for the first mechanism includes the necessity for direct cell-Fe(III) oxide contact in Geobacter species (34) and the finding that intensively studied Fe(III)- and electrode-reducing microorganisms, such as Geobacter sulfurreducens and Shewanella oneidensis MR-1, display redox-active proteins on their outer cell surfaces that could have access to extracellular electron acceptors (1, 2, 12, 15, 27, 28, 31-33). Deletion of the genes for these proteins often inhibits Fe(III) reduction (1, 4, 7, 15, 17, 28, 40) and electron transfer to electrodes (5, 7, 11, 33). In some instances, these proteins have been purified and shown to have the capacity to reduce Fe(III) and other potential electron acceptors in vitro (10, 13, 29, 38, 42, 43, 48, 49).

Evidence for the second mechanism includes the ability of some microorganisms to reduce Fe(III) that they cannot directly contact, which can be associated with the accumulation of soluble substances that can promote electron shuttling (17, 22, 26, 35, 36, 47). In microbial fuel cell studies, an abundance of planktonic cells and/or the loss of current-producing capacity when the medium is replaced is consistent with the presence of an electron shuttle (3, 14, 26). Furthermore, a soluble electron shuttle is the most likely explanation for the electrochemical signatures of some microorganisms growing on an electrode surface (26, 46).

Evidence for the third mechanism is more circumstantial (19). Filaments that have conductive properties have been identified in Shewanella (7) and Geobacter (41) species. To date, conductance has been measured only across the diameter of the filaments, not along the length. The evidence that the conductive filaments were involved in extracellular electron transfer in Shewanella was the finding that deletion of the genes for the c-type cytochromes OmcA and MtrC, which are necessary for extracellular electron transfer, resulted in nonconductive filaments, suggesting that the cytochromes were associated with the filaments (7). However, subsequent studies specifically designed to localize these cytochromes revealed that, although the cytochromes were extracellular, they were attached to the cells or in the exopolymeric matrix and not aligned along the pili (24, 25, 30, 40, 43). Subsequent reviews of electron transfer to Fe(III) in Shewanella oneidensis (44, 45) appear to have dropped the nanowire concept and focused on the first and second mechanisms.

Geobacter sulfurreducens has a number of c-type cytochromes (15, 28) and multicopper proteins (12, 27) that have been demonstrated or proposed to be on the outer cell surface and are essential for extracellular electron transfer. Immunolocalization and proteolysis studies demonstrated that the cytochrome OmcB, which is essential for optimal Fe(III) reduction (15) and highly expressed during growth on electrodes (33), is embedded in the outer membrane (39), whereas the multicopper protein OmpB, which is also required for Fe(III) oxide reduction (27), is exposed on the outer cell surface (39).

OmcS is one of the most abundant cytochromes that can readily be sheared from the outer surfaces of G. sulfurreducens cells (28). It is essential for the reduction of Fe(III) oxide (28) and for electron transfer to electrodes under some conditions (11). Therefore, the localization of this important protein was further investigated.

14.

Background

Predicting protein function from primary sequence is an important open problem in modern biology. Not only are there many thousands of proteins of unknown function, current approaches for predicting function must be improved upon. One problem in particular is overly-specific function predictions which we address here with a new statistical model of the relationship between protein sequence similarity and protein function similarity.

Methodology

Our statistical model is based on sets of proteins with experimentally validated functions and numeric measures of function specificity and function similarity derived from the Gene Ontology. The model predicts the similarity of function between two proteins given their amino acid sequence similarity measured by statistics from the BLAST sequence alignment algorithm. A novel aspect of our model is that it predicts the degree of function similarity shared between two proteins over a continuous range of sequence similarity, facilitating prediction of function with an appropriate level of specificity.

Significance

Our model shows nearly exact function similarity for proteins with high sequence similarity (bit score >244.7, e-value >1e−62, non-redundant NCBI protein database (NRDB)) and only small likelihood of specific function match for proteins with low sequence similarity (bit score <54.6, e-value <1e−05, NRDB). For sequence similarity ranges in between our annotation model shows an increasing relationship between function similarity and sequence similarity, but with considerable variability. We applied the model to a large set of proteins of unknown function, and predicted functions for thousands of these proteins ranging from general to very specific. We also applied the model to a data set of proteins with previously assigned, specific functions that were electronically based. We show that, on average, these prior function predictions are more specific (quite possibly overly-specific) compared to predictions from our model that is based on proteins with experimentally determined function.

15.
Most bacteria live in colonies, where they often express different cell types. The ecological significance of these cell types and their evolutionary origin are often unknown. Here, we study the evolution of cell differentiation in the context of surface colonization. We particularly focus on the evolution of a ‘sticky’ cell type that is required for surface attachment, but is costly to express. The sticky cells not only facilitate their own attachment, but also that of non-sticky cells. Using individual-based simulations, we show that surface colonization rapidly evolves and in most cases leads to phenotypic heterogeneity, in which sticky and non-sticky cells occur side by side on the surface. In the presence of regulation, cell differentiation leads to a remarkable set of bacterial life cycles, in which cells alternate between living in the liquid and living on the surface. The dominant life stage is formed by the surface-attached colony that shows many complex features: colonies reproduce via fission and by producing migratory propagules; cells inside the colony divide labour; and colonies can produce filaments to facilitate expansion. Overall, our model illustrates how the evolution of an adhesive cell type goes hand in hand with the evolution of complex bacterial life cycles.

16.
17.
Sequence data of mitochondrial 16S ribosomal DNA (mt-rDNA) and nuclear 28S ribosomal DNA (nuc-rDNA) were compared in two honeybee species (Apis mellifera and Apis dorsata) and a selection of 22 wasp species (Vespidae) with different levels of sociality. The average substitution rates in mt-rDNA and nuc-rDNA were almost equal in solitary species. In species with larger nests, however, the difference between the nuclear and the mitochondrial substitution rate significantly increased. The average substitution ratio, ψ (nucleotide substitutions in mt-rDNA/nucleotide substitutions in nuc-rDNA) was 1.48 ± 0.12 (SE) among the solitary Eumeninae, 3.70 ± 0.15 among five primitive social Stenogastrinae species, 3.24 ± 0.20 among five Polistinae species, 5.76 ± 0.33 among nine highly eusocial Vespinae, and 12.7 in the two Apis species. The high egg-laying rate and the effective population size skew between the sexes may contribute to the rise of the substitution ratio in the highly eusocial species. Drift and bottleneck effects in the mitochondrial DNA pool during speciation events as well as polyandry may further enhance this phenomenon. Received: 12 January 1998 / Accepted: 28 April 1998

18.
The phylogeny of the grey mullets is considered problematic both at the intra- and interfamily level. Such a difficulty arises from the highly homogeneous morphology displayed by this group of fish and, consequently, from the paucity of the key morphological characters suitable to address their phylogeny and evolution. In the present work, we have approached the phylogenetic and evolutionary relationships of seven species of Mugilidae, six of which are from the Mediterranean Sea, on the basis of the DNA sequences of two mitochondrial genes (cytochrome b and 12S rRNA). Despite the morphological homogeneity exhibited by the taxa considered, the two species of the genus Mugil (M. cephalus and M. curema) showed a remarkable genetic divergence compared to all the other members of the family. The relative rate test revealed a significantly higher rate of evolution along the Mugil lineage.

19.
Abstract

Modelling by homology is an approach to the rational design of new drugs based on the construction of ligand protein interaction complexes. Because in most cases the 3D-structure of the target protein is not known from biophysical data, this approach yields a theoretical procedure which establishes at least parts of the protein by comparison with isofunctional proteins, assuming that much of the structural information is embedded in the amino acid sequence. This approach should be of considerable importance for proteins with divergent primary structures but with a high degree of isofunctionality, the latter demanding a similar active site folding pattern.

This study is a pattern recognition approach based on additive secondary structure prediction and surface probabilities from residue variabilities. The comparison of the additive properties yields a sequence alignment of the viral thymidine kinases with the adenylate kinases having a closely related functionality. X-ray structures of adenylate kinases can then be used as templates to derive a 3D-structure prediction of the thymidine kinase active site.

20.
Summary

The tight junctions along the medullary collecting duct in the kidneys of the rat and the rabbit were studied with freeze-fracture electron microscopy and quantitated according to the number of strands and the apico-basal depth (nm) of the junctions.

The most elaborate tight junctions were found in the inner stripe of the outer medulla; rat: 10.6±0.8 strands and 205±24 nm; rabbit: 11.6±2.4 strands and 291±55 nm.

The elaboration of the tight junctions decreased continuously towards the papillary tip. Inner zone I; rat: 9.3±2.6 strands and 186±38 nm, rabbit: 9.5±2.3 strands and 247±59 nm. Inner zone II; rat: 7.1±2.2 strands and 129±32 nm, rabbit: 8.5±1.4 strands and 199±26 nm. Inner zone III; rat: 6.0±1.6 strands and 111±19 nm, rabbit: 7.0±1.5 strands and 183±43 nm. In the inner zone III comprising the papillary tip, tight junctions with only 1–3 strands were not infrequently seen. Preliminary findings in the kidney of the golden hamster indicate a similar decline of junctional tightness along the collecting duct.

These morphological observations suggest that the permeability of the paracellular pathway of the medullary collecting duct increases towards the tip of the papilla, especially in the rat. The functional implications for the medullary recycling of urea and electrolytes, and for the urinary concentrating mechanism are discussed.

In addition, the tight junctions of the papillary epithelium are described.
