Similar articles
Found 20 similar articles (search time: 0 ms)
1.
2.
The export of proteins to the periplasmic compartment of bacterial cells is mediated by an amino-terminal signal peptide. After transport, the signal peptide is cleaved by a processing enzyme, signal peptidase I. A comparison of the cleavage sites of many exported proteins has identified a conserved feature of small, uncharged amino acids at positions -1 and -3 relative to the cleavage site. To determine experimentally the sequences required for efficient signal peptide cleavage, we simultaneously randomized the amino acid residues from positions -4 to +2 of the TEM-1 beta-lactamase enzyme to form a library of random sequences. Mutants that provide wild-type levels of ampicillin resistance were then selected from the random-sequence library. The sequences of 15 mutants indicated a bias towards small amino acids. The N-terminal amino acid sequence of the mature enzyme was determined for nine of the mutants to assign the new -1 and -3 residues. Alanine was present in the -1 position for all nine of these mutants, strongly supporting the importance of alanine at the -1 position. The amino acids at the -3 position were much less conserved but were consistent with the -3 rules derived from sequence comparisons. Compared with the wild type, two of the nine mutants have an altered cleavage position, suggesting that sequence is more important than position for processing of the signal peptide.
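The -1/-3 pattern described in this abstract can be illustrated with a small check. This is only a sketch: the set of "small, uncharged" residues (`SMALL_UNCHARGED`), the function names, the index convention, and the toy precursor sequence are all assumptions made for illustration, not taken from the paper.

```python
# Assumed residue set for "small, uncharged"; the abstract does not list one.
SMALL_UNCHARGED = frozenset("AGS")


def satisfies_rule(precursor: str, cleavage_index: int,
                   allowed: frozenset = SMALL_UNCHARGED) -> bool:
    """Check the -1 and -3 positions for a proposed cleavage site.

    cleavage_index is the 0-based index of residue +1, i.e. the first
    residue of the mature protein (a convention chosen for this sketch).
    """
    if cleavage_index < 3 or cleavage_index > len(precursor):
        raise ValueError("need at least three residues before the site")
    minus1 = precursor[cleavage_index - 1]
    minus3 = precursor[cleavage_index - 3]
    return minus1 in allowed and minus3 in allowed


def candidate_sites(precursor: str,
                    allowed: frozenset = SMALL_UNCHARGED) -> list:
    """Return every index whose -1 and -3 residues are both in `allowed`."""
    return [i for i in range(3, len(precursor))
            if satisfies_rule(precursor, i, allowed)]


# Hypothetical toy precursor, not a real signal peptide.
toy = "MKKLLASAQE"
print(candidate_sites(toy))
```

A real predictor would also weigh the hydrophobic core and the charged N-terminal region of the signal peptide; this sketch tests only the two positions the abstract names.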

3.
4.
5.
MOTIVATION: Given a large family of homologous protein sequences, many methods can divide the family into smaller groups that correspond to the different functions carried out by proteins within the family. One important problem, however, has been the absence of a general method for selecting an appropriate level of granularity, or size of the groups. RESULTS: We propose a consistent way of choosing the granularity that is independent of the sequence similarity and sequence clustering method used. We study three large, well-investigated protein families: basic leucine zippers, nuclear receptors and proteins with three consecutive C2H2 zinc fingers. Our method is tested against known functional information, the experimentally determined binding specificities, using a simple scoring method. The significance of the groups is also measured by randomizing the data. Finally, we compare our algorithm against a popular method of grouping proteins, the TRIBE-MCL method. In the end, we determine that dividing the families at the proposed level of granularity creates very significant and useful groups of proteins that correspond to the different DNA-binding motifs. We expect that such groupings will be useful in studying not only DNA binding but also other protein interactions.

6.
7.
8.
Jones M, Ghoorah A, Blaxter M. PLoS ONE 2011, 6(4): e19259

Background

DNA barcoding and other DNA sequence-based techniques for investigating and estimating biodiversity require explicit methods for associating individual sequences with taxa, as it is at the taxon level that biodiversity is assessed. For many projects, the bioinformatic analyses required pose problems for laboratories whose prime expertise is not in bioinformatics. User-friendly tools are required for both clustering sequences into molecular operational taxonomic units (MOTU) and for associating these MOTU with known organismal taxonomies.

Results

Here we present jMOTU, a Java program for the analysis of DNA barcode datasets that uses an explicit, determinate algorithm to define MOTU. We demonstrate its usefulness for both individual specimen-based Sanger sequencing surveys and bulk-environment metagenetic surveys using long-read next-generation sequencing data. jMOTU is driven through a graphical user interface, and can analyse tens of thousands of sequences in a short time on a desktop computer. A companion program, Taxonerator, that adds traditional taxonomic annotation to MOTU, is also presented. Clustering and taxonomic annotation data are stored in a relational database, and are thus amenable to subsequent data mining and web presentation.

Conclusions

jMOTU efficiently and robustly identifies the molecular taxa present in survey datasets, and Taxonerator decorates the MOTU with putative identifications. jMOTU and Taxonerator are freely available from http://www.nematodes.org/.

9.
10.
Summary: The size distribution of 411 randomly selected mammalian exons was investigated. This distribution was found to be unimodal with a frequency maximum of 120 bp. Detailed analysis of the distribution demonstrated that larger exons (>150 bp) have a high goodness of fit to the size distribution of open reading frames (ORFs) in a random sequence, i.e., $(61/64)^t$, in which $t$ is the number of triplets. Based on this observation, the general character of the total exon size distribution suggested that this could be defined by a theoretical distribution by superimposing a sigmoid function on the ORF generating function, i.e., $(61/64)^t \times f_s(t) \times E$, in which $f_s(t)$ is a sigmoid function and $E$ is a constant. We tested this distribution for fitness to the exon distribution using two sigmoid functions: $f_s(t) = (t)$ and $f_s(t) = Be^{kt}/(1 + Be^{kt})$. In both cases a very high goodness of fit was attained. It is concluded that exons have been generated from ORFs in random sequences, that ORFs larger than 150 bp have been selected, irrespective of size, as exons, and that a lower size limit exists below which the probability of an ORF being selected as an exon is very low. These results provide evidence at the molecular level to support the ideas that (1) larger exons have been selected from random ORFs without primary correlation to structural or functional properties at the protein level, (2) there exists a restriction on smaller ORFs to be selected as exons, and (3) the interrupted coding sequences found in eukaryotes represent the ancient form of gene organization that existed prior to the divergence of prokaryotes and eukaryotes.
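The model in this abstract, the random-sequence ORF term (61/64)^t multiplied by a sigmoid and a constant E, can be written out as a short sketch. The logistic form B·e^(kt)/(1 + B·e^(kt)) follows the abstract; the function names and the parameter values for B, k, and E below are hypothetical placeholders, not fitted values from the paper.

```python
import math


def orf_survival(t: int) -> float:
    """Probability that t consecutive random triplets contain no stop codon
    (61 of the 64 codons are sense codons)."""
    return (61 / 64) ** t


def logistic(t: float, B: float, k: float) -> float:
    """B*exp(k*t) / (1 + B*exp(k*t)), rearranged to avoid overflow for k*t > 0."""
    return 1.0 / (1.0 + math.exp(-k * t) / B)


def exon_model(t: int, B: float, k: float, E: float) -> float:
    """Relative frequency of exons of t triplets under the abstract's model."""
    return orf_survival(t) * logistic(t, B, k) * E


# Hypothetical parameters chosen only to show the unimodal shape.
B, k, E = 0.01, 0.2, 1.0
values = [exon_model(t, B, k, E) for t in range(1, 201)]
peak_t = 1 + values.index(max(values))
print(f"model peaks at {peak_t} triplets ({3 * peak_t} bp)")
```

The sigmoid suppresses short ORFs while the (61/64)^t term decays for long ones, which is what produces a single maximum.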

11.
The development of genetically altered murine animals has generated a need for in vitro systems in the mouse. We have now characterized a novel isolated bile duct unit (IBDU) preparation from the mouse to facilitate such studies. The mouse IBDU is isolated by portal perfusion of collagenase, blunt dissection, further enzymatic digestions, filtering through sized mesh, and culturing on Matrigel for 16-72 h. This mouse IBDU forms a central, enclosed lumen lined by polarized cytokeratin-19-positive cholangiocytes with numerous microvilli on the apical membrane. The IBDU responds to secretory stimuli, including secretin, vasoactive intestinal peptide, IBMX, and forskolin, resulting in expansion of the central lumen from secretion as quantified by videomicroscopy. The secretory response to secretin is dependent on Cl- and HCO3- in the perfusate. These findings indicate that mouse IBDUs are intact, polarized, functional bile duct secretory units that permit quantitative measurements of fluid secretion from mouse bile duct epithelium for the first time. This method should facilitate studies of cholangiocyte secretion in genetically altered murine animal models.

12.
13.
The mechanism by which protein-coding portions of eukaryotic genes came to be separated by long non-coding stretches of DNA, and the purpose of this perplexing arrangement, have remained unresolved fundamental biological problems for three decades. We report here a plausible solution to this problem based on analysis of open reading frame (ORF) length constraints in the genomes of nine diverse species. If primordial nucleic acid sequences were random in sequence, functional proteins that are innately long would not be encoded due to the frequent occurrence of stop codons. The best possible way that a long protein-coding sequence could have been derived was by evolving a split structure from the random DNA (or RNA) sequence. Results of the systematic analyses of nine complete genome sequences presented here suggest that perhaps the major underlying structural features of split genes have evolved due to the indigenous occurrence of split protein-coding genes in primordial random nucleotide sequence. The results also suggest that intron-rich genes containing short exons may have been the original form of genes intrinsically occurring in random DNA, and that intron-poor genes containing long exons were perhaps derived from the original intron-rich genes.
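The stop-codon argument in this abstract can be made concrete with a small simulation. This is a sketch under stated assumptions: bases are drawn uniformly and independently, a "stretch" is simply a run of sense codons ended by a stop codon (no start codon is required), and the function names, seed, and sample size are arbitrary choices for illustration.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}


def random_codons(n: int, rng: random.Random) -> list:
    """Draw n codons with each base chosen uniformly from A, C, G, T."""
    return ["".join(rng.choice("ACGT") for _ in range(3)) for _ in range(n)]


def stop_free_lengths(codons: list) -> list:
    """Lengths, in codons, of stop-free stretches that end at a stop codon.
    A trailing stretch with no terminating stop is ignored."""
    lengths = []
    current = 0
    for codon in codons:
        if codon in STOP_CODONS:
            lengths.append(current)
            current = 0
        else:
            current += 1
    return lengths


rng = random.Random(0)  # fixed seed so the run is repeatable
lengths = stop_free_lengths(random_codons(60000, rng))
mean_len = sum(lengths) / len(lengths)
print(f"mean stop-free stretch: {mean_len:.1f} codons; theory: {61 / 3:.1f}")
```

With three stop codons out of 64, the expected stop-free stretch is 61/3, roughly 20 codons, so uninterrupted reading frames of several hundred codons are rare in random sequence.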

14.
Gene 1996, 169(1): 133-134
The calcium-binding protein, calmodulin (CaM), was used to screen a phage library displaying random peptides 26 amino acids (aa) in length. Twenty CaM-binding peptides were identified, 17 of which contained one of three consensus sequence motifs: + W-OλR, WRAAV or WRXXAAAL, where +, -, O, λ and X are positively charged, negatively charged, hydrophobic, leucine or valine, and any residue, respectively. The Trp residue in these motifs is located within 14 aa of the N-terminus of the displayed peptide. Previous studies [Dedman et al., J. Biol. Chem. 268 (1993) 23025–23030] using a library displaying random peptides 15 aa in length identified CaM-binding peptides which contained a Trp-Pro dipeptide motif. These results suggest that the type of CaM-binding motif identified can vary between different types of combinatorial peptides

15.
16.
To classify proteins into functional families based on their primary sequences, popular algorithms such as the k-NN-, HMM-, and SVM-based algorithms are often used. For many of these algorithms to perform their tasks, protein sequences need to be properly aligned first. Since the alignment process can be error-prone, protein classification may not be performed very accurately. To improve classification accuracy, we propose an algorithm, called the Unaligned Protein SEquence Classifier (UPSEC), which can perform its tasks without sequence alignment. UPSEC makes use of a probabilistic measure to identify residues that are useful for classification in both positive and negative training samples, and can handle multi-class classification with a single classifier and a single pass through the training data. UPSEC has been tested with real protein data sets. Experimental results show that UPSEC can effectively classify unaligned protein sequences into their corresponding functional families, and the patterns it discovers during the training process can be biologically meaningful.

17.
18.
Summary: We examine in this paper one of the expected consequences of the hypothesis that modern proteins evolved from random heteropeptide sequences. Specifically, we investigate the lengthwise distributions of amino acids in a set of 1,789 protein sequences with little sequence identity using the run test statistic (r_o) of Mood (1940, Ann. Math. Stat. 11, 367–392). The probability density of r_o for a collection of random sequences has mean = 0 and variance = 1 [the N(0,1) distribution] and can be used to measure the tendency of amino acids of a given type to cluster together in a sequence relative to that of a random sequence. We implement the run test using binary representations of protein sequences in which the amino acids of interest are assigned a value of 1 and all others a value of 0. We consider individual amino acids and sets of various combinations of them based upon hydrophobicity (4 sets), charge (3 sets), volume (4 sets), and secondary structure propensity (3 sets). We find that any sequence chosen randomly has a 90% or greater chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. We regard this as strong support for the random-origin hypothesis. However, we do observe significant deviations from the random expectation, as might be expected after billions of years of evolution. Two important global trends are found: (1) Amino acids with a strong α-helix propensity show a strong tendency to cluster whereas those with β-sheet or reverse-turn propensity do not. (2) Clustered rather than evenly distributed patterns tend to be preferred by the individual amino acids and this is particularly so for methionine. Finally, we consider the problem of reconciling the random nature of protein sequences with structurally meaningful periodic “patterns” that can be detected by sliding-window, autocorrelation, and Fourier analyses. Two examples, rhodopsin and bacteriorhodopsin, show that such patterns are a natural feature of random sequences.
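The binary encoding and run test described in this abstract can be sketched as follows. The code uses the textbook normal approximation for the number of runs in a two-symbol sequence (the Wald-Wolfowitz form); whether this matches the exact statistic r_o of Mood (1940) in every detail is an assumption. The function names and the sign convention (negative values mean fewer runs than expected, i.e. clustering) are also choices made for this illustration.

```python
import math


def binary_encode(sequence: str, residues: set) -> list:
    """Assign 1 to residues of interest and 0 to all others."""
    return [1 if aa in residues else 0 for aa in sequence]


def runs_z(bits: list) -> float:
    """Standardized number of runs; approximately N(0,1) for random order.

    Negative values indicate fewer runs than expected (clustering),
    positive values indicate more runs (even spacing).
    """
    n1 = sum(bits)
    n0 = len(bits) - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("both symbols must occur at least once")
    n = n1 + n0
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    mean = 2 * n1 * n0 / n + 1
    var = 2 * n1 * n0 * (2 * n1 * n0 - n) / (n * n * (n - 1))
    if var <= 0:
        raise ValueError("sequence too short for the normal approximation")
    return (runs - mean) / math.sqrt(var)


# Toy inputs, not real protein sequences.
clustered = [1, 1, 1, 1, 0, 0, 0, 0]
alternating = [1, 0, 1, 0, 1, 0, 1, 0]
print(runs_z(clustered), runs_z(alternating))
```

For real use, the normal approximation is only reasonable when both symbols occur often enough; short sequences or rare residues would call for the exact run distribution instead.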

19.
20.
Synchrotron x-ray scattering measurements were performed on dilute solutions of the purified hemocyanin subunit (Bsin1) from scorpion (Buthus sindicus) and the N-terminal functional unit (Rta) from a marine snail (Rapana thomasiana). The model-independent approach based on spherical harmonics was applied to calculate the molecular envelopes directly from the scattering profiles. Their molecular shapes in solution could be restored at 2-nm resolution. We show that these units represent stable, globular building blocks of the two hemocyanin families and emphasize their conformational differences on a subunit level. Because no crystallographic or electron microscopy data are available for isolated functional units, this study provides for the first time structural information for isolated, monomeric functional subunits from both hemocyanin families. This has been made possible through the use of low protein concentrations (≤ 1 mg/ml). The observed structural differences may offer advantages in building very different overall molecular architectures of hemocyanin by the two phyla.
