Similar articles
20 similar articles found
1.
Copley RR, Doerks T, Letunic I, Bork P. FEBS letters, 2002, 513(1): 129-134
Domains present one of the most useful levels at which to understand protein function, and domain family-based analysis has had a profound impact on the study of individual proteins. Protein domain discovery has been progressing steadily over the past 30 years. What are the realistically achievable goals of sequence-based domain analysis, and how far off are they for the sequences encoded in eukaryotic genomes? Here we address some of the issues involved in better coverage of sequence-based domain annotation, and the integration of these results within the wider context of genomes, structures and function.

2.
The process of glycosylation has been studied extensively in prokaryotes, but many questions remain unanswered. Glycosyltransferase is the enzyme that mediates glycosylation; it has preferences both for the target glycosylation sites and for the type of glycosylation, i.e. N-linked or O-linked. In this study we carried out a bioinformatics analysis of PglB, one of the key enzymes of the pgl locus from Campylobacter jejuni, which is widely distributed in bacteria, and of AglB from archaea. Relatively little sequence similarity was observed among the archaeal AglBs compared with the bacterial PglBs. In addition, we tried to answer the question of why not all Asp-X-Ser/Thr sequons have an equal opportunity to be glycosylated by looking at the influence of the neighboring amino acids, but no significant conserved pattern in the flanking sites could be identified. A software tool was developed to predict potential glycosylation sites in autotransporter proteins, the virulence factors of Gram-negative bacteria, and our results revealed that the frequency of glycosylation sites was higher in adhesins (a subclass of autotransporters) than in the other classes of autotransporters.

3.
Small ubiquitin-related modifier (SUMO) genes regulate various functions of target proteins through post-translational modification. SUMO proteins have a three-dimensional structure similar to that of ubiquitin, and sumoylation occurs through a cascade of enzymatic reactions. In the present study we cloned a new SUMO gene, named SlS-SUMO1, from tomato (Solanum lycopersicum L.) cv. Saudi-1 by PCR using specific primers. This gene has the features of a SUMO family member, such as a C-terminal diglycine (GG) motif that serves as the processing site for ULP (ubiquitin-like SUMO protease) and the SUMO consensus sequence ΨKXE/D. Phylogenetic analysis showed that the SlS-SUMO1 gene is highly conserved and, based on sequence similarity, homologous to the potato Ca-SUMO1 and Ca-SUMO2 genes. The protein expressed from the SlS-SUMO1 gene was found to be localized in the nucleus, the cytoplasm, and the nuclear envelope or nuclear pore complex. The SUMO-conjugating enzyme SCE1a, when co-expressed with the SlS-SUMO1 protein, co-localized with it in the nucleus and formed nuclear subdomains. This study reports that the SlS-SUMO1 gene is a member of the SUMO family and that its protein is processed at the GG motif, activated, and transported to the nucleus through the sumoylation system in the plant cell.

4.
5.
Genome sequencing projects have led to an explosion in the number of gene products, many of which are hypothetical proteins of unknown function. Analyzing and annotating the functions of hypothetical proteins is important in Staphylococcus aureus, a pathogenic bacterium that causes multiple types of disease by infecting various sites in humans and animals. In this study, ten hypothetical proteins of Staphylococcus aureus were retrieved from NCBI and analyzed for their structural and functional characteristics using various bioinformatics tools and databases. The analysis revealed that some of them possess functionally important domains and families, and protein-protein interaction partners, including ABC transporter ATP-binding protein, Multiple Antibiotic Resistance (MAR) family, export proteins, helix-turn-helix domains, arsenate reductase, elongation factor, ribosomal proteins, cysteine protease precursor, type I restriction endonuclease and plasmid recombination enzyme; the hypothetical proteins might therefore have the same functions. Structure and binding-site predictions were also carried out for these proteins, which should be useful in docking studies for drug discovery.

6.
Fukushima A, Ikemura T, Kinouchi M, Oshima T, Kudo Y, Mori H, Kanaya S. Gene, 2002, 300(1-2): 203-211
We used a power spectrum method to identify periodic patterns in nucleotide sequences, and characterized the nucleotide sequences that confer periodicities on prokaryotic and eukaryotic genomes. A 10-bp periodicity was prevalent in hyperthermophilic bacteria and archaebacteria, and an 11-bp periodicity was prevalent in eubacteria. The 10-bp periodicity was also prevalent in eukaryotes such as the worm Caenorhabditis elegans. Additionally, in the worm genome, a 68-bp periodicity in chromosome I, a 59-bp periodicity in chromosome II, and a 94-bp periodicity in chromosome III were found. In human chromosomes 21 and 22, an approximately 167- or 84-bp periodicity was detected along the entire length of these chromosomes. Because 167 bp is identical to the length of DNA that forms two complete helical turns in nucleosome organization, we speculated that the respective sequences may correspond to arrays of a special compact form of nucleosomes clustered in specific regions of the human chromosomes. This periodic element contained a high frequency of TGG. TGG-rich sequences are known to form a specific subset of folded DNA structures, and therefore the sequences might have the potential to form specific higher-order structures related to the clustered occurrence of a specific form of the speculated nucleosomes.
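The abstract does not say how its power spectrum is computed, so the following is only a minimal sketch of one common approach: treat the positions of a chosen base as a 0/1 indicator signal and evaluate the squared magnitude of its Fourier sum at the frequency 1/period. The function names `periodicity_power` and `strongest_period`, the normalization by sequence length, and the synthetic test sequence are all assumptions made for illustration, not the authors' method.

```python
import cmath


def periodicity_power(seq, base, period):
    """Spectral power of the 0/1 indicator signal for `base` at frequency 1/period.

    Hypothetical helper: the normalization (division by sequence length)
    is an illustrative choice, not taken from the paper.
    """
    n = len(seq)
    if n == 0:
        return 0.0
    # Sum one unit phasor per occurrence of `base`; occurrences spaced by
    # exactly `period` add up in phase, others tend to cancel.
    total = sum(
        cmath.exp(-2j * cmath.pi * i / period)
        for i, ch in enumerate(seq)
        if ch == base
    )
    return abs(total) ** 2 / n


def strongest_period(seq, base, periods):
    """Return the candidate period with the highest spectral power."""
    return max(periods, key=lambda p: periodicity_power(seq, base, p))


if __name__ == "__main__":
    # Synthetic sequence with an "A" every 10 bp (illustrative data only).
    demo = ("A" + "C" * 9) * 30
    print(strongest_period(demo, "A", range(7, 15)))
```

A perfectly regular signal also peaks at the divisors of its true period (here 2 and 5), so the candidate range should be chosen with such harmonics in mind.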

7.
The N- and O-glycans of Arianta arbustorum, Achatina fulica, Arion lusitanicus and Planorbarius corneus were analysed for their monosaccharide pattern by reversed-phase HPLC after labelling with 2-aminobenzoic acid or 3-methyl-1-phenyl-2-pyrazolin-5-one and by gas chromatography-mass spectrometry. Glucosamine, galactosamine, mannose, galactose, glucose, fucose and xylose were identified. Furthermore, three different methylated sugars were detected: 3-O-methyl-mannose and 3-O-methyl-galactose were confirmed to be a common snail feature; 4-O-methyl-galactose was detected for the first time in snails.

8.
The Leishmania homologue of activated C kinase (LACK) is a known T cell epitope from soluble Leishmania antigens (SLA) that confers protection against Leishmania challenge. This antigen has been found to be highly conserved among Leishmania strains, and LACK has been shown to be protective against L. donovani challenge. A comprehensive analysis of several LACK sequences was completed. The analysis shows a high level of conservation, lower variability and higher antigenicity in specific portions of the LACK protein. This information provides insights for the potential consideration of LACK as a putative vaccine target for visceral leishmaniasis.

9.
10.
Based on searches for disabled homologs to known proteins, we have identified a large population of pseudogenes in four sequenced eukaryotic genomes: the worm, yeast, fly and human (chromosomes 21 and 22 only). Each of our nearly 2500 pseudogenes is characterized by one or more disablements mid-domain, such as premature stops and frameshifts. Here, we perform a comprehensive survey of the amino acid and nucleotide composition of these pseudogenes in comparison to that of functional genes and intergenic DNA. We show that pseudogenes invariably have an amino acid composition intermediate between genes and translated intergenic DNA. Although the degree of intermediacy varies among the four organisms, in all cases it is most evident for amino acid types that differ most in occurrence between genes and intergenic regions. The same intermediacy also applies to codon frequencies, especially in the worm and human. Moreover, the intermediate composition of pseudogenes applies even though the composition of the genes in the four organisms is markedly different, showing a strong correlation with the overall A/T content of the genomic sequence. Pseudogenes can be divided into ‘ancient’ and ‘modern’ subsets, based on the level of sequence identity with their closest matching homolog (within the same genome). Modern pseudogenes usually have a much closer sequence composition to genes than ancient pseudogenes. Collectively, our results indicate that the composition of pseudogenes that are under no selective constraints progressively drifts from that of coding DNA towards non-coding DNA. Therefore, we propose that the degree to which pseudogenes approach a random sequence composition may be useful in dating different sets of pseudogenes, as well as in assessing the rate at which intergenic DNA accumulates mutations. Our compositional analyses with the interactive viewer are available over the web at http://genecensus.org/pseudogene.

11.
Mannose-6-phosphate (M-6-P) glycan analysis is important for quality control of therapeutic enzymes for lysosomal storage diseases. Here, we found that the analysis of glycans containing two M-6-Ps was highly affected by the hydrophilicity of the elution solvent used in high-performance liquid chromatography (HPLC). In addition, the performances of three fluorescent tags, 2-aminobenzoic acid (2-AA), 2-aminobenzamide (2-AB), and 3-(acetyl-amino)-6-aminoacridine (AA-Ac), were compared with each other for M-6-P glycan analysis using HPLC and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. The best performance for analyzing M-6-P glycans was shown by 2-AA labeling in both analyses.

12.
Gao C, Xiao M, Ren X, Hayward A, Yin J, Wu L, Fu D, Li J. Genomics, 2012, 100(4): 222-230
The movement of transposable elements (TEs) in eukaryotic genomes can often result in the occurrence of nested TEs (the insertion of TEs into pre-existing TEs). We performed a general TE assessment using available databases to detect nested TEs and analyze their characteristics and putative functions in eukaryote genomes. Out of a total of 11,329 TEs, 802 TEs were found to be inserted into 690 host TEs. We reveal that repetitive sequences are associated with an increased occurrence of nested TEs and with sequence bias in TE insertion. A high proportion of the genes associated with nested TEs are predicted to localize to organelles and participate in nucleic acid and protein binding. Many of these function in metabolic processes, and encode important enzymes for transposition and integration. Therefore, nested TEs in eukaryotic genomes may negatively influence genome expansion, and enrich the diversity of gene expression or regulation.

13.
The milk of the Bos indicus Vechur breed of cow is known for its medicinal value, and the breed is listed under the category of critically maintained breeds by the Food and Agriculture Organization. The lactoferrin protein in milk is known for its nutritional value. Gene polymorphisms have been reported for bovine lactoferrin. Mutations in evolutionarily conserved sites tend to impair protein function and are related to the physicochemical differences between the known variants, which carry 11 SNPs relative to the wild type. Structural differences arising from these SNPs were located and may lead to functional variations. The structural variation is seen primarily in the first 48 residues at the 5' end in all the samples modelled. Of the 11 SNPs, 5 amino acid variations fall within alpha-helix and beta-sheet regions, which might be of functional significance. This result may provide evidence that the SNPs detected in the lactoferrin gene have potential effects on milk composition. Our results demonstrate one major domain that could be a binding pocket common to all the samples, and important as an active site common to all the breeds that could be utilized for effective drug design. Moreover, at some SNP positions in the Vechur breed, antimicrobial peptides were located, indicating the importance of those residues for enhanced antimicrobial activity in Vechur lactoferrin. A second binding pocket, found in the N-lobe region with the three residues required for iron binding (aspartic acid, histidine and tyrosine), was considered the major binding site.

14.
Pandit SB, Srinivasan N. Proteins, 2003, 52(4): 585-597
The members of the family of G-proteins are characterized by their ability to bind and hydrolyze guanosine triphosphate (GTP) to guanosine diphosphate (GDP). Despite the common biochemical function of GTP hydrolysis shared among the members of the family of G-proteins, they are associated with diverse biological roles. The current work describes the identification and detailed analysis of the putative G-proteins encoded in the completely sequenced prokaryotic genomes. Inferences on the biological roles of these G-proteins have been obtained by their classification into known functional subfamilies. We have identified 497 G-proteins in 42 genomes. Seven small GTP-binding protein homologues have been identified in prokaryotes with at least two of the diagnostic sequence motifs of G-proteins conserved. The translation factors have the largest representation (234 sequences) and are found to be ubiquitous, which is consistent with their critical role in protein synthesis. The GTP_OBG subfamily comprises 79 sequences in our dataset. A total of 177 sequences belong to the subfamily of GTPases of unknown function, and 154 of these could be associated with domains of known functions such as cell cycle regulation and tRNA modification. The large GTP-binding proteins and the alpha-subunit of heterotrimeric G-proteins are not detected in the genomes of the prokaryotes surveyed.

15.
We developed a highly accurate method to predict polyketide (PK) and nonribosomal peptide (NRP) structures encoded in microbial genomes. PKs/NRPs are polymers of carbonyl/peptidyl chains synthesized by polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs). We analyzed domain sequences corresponding to specific substrates and physical interactions between PKSs/NRPSs in order to predict which substrates (carbonyl/peptidyl units) are selected and assembled into highly ordered chemical structures. The predicted PKs/NRPs were represented as sequences of carbonyl/peptidyl units so that structural motifs could be extracted efficiently. We applied our method to 4529 PKSs/NRPSs and found 619 PKs/NRPs. We also collected 1449 PKs/NRPs whose chemical structures have been determined experimentally. The structural sequences were compared using the Smith-Waterman algorithm and grouped into 271 clusters. From the compound clusters, we extracted 33 structural motifs that are significantly related to their bioactivities. We used the structural motifs to infer the functions of 13 novel PK/NRP clusters produced by Pseudomonas spp. and Burkholderia spp. and found a putative virulence factor. The integrative analysis of genomic and chemical information given here will provide a strategy to predict the chemical structures, the biosynthetic pathways, and the biological activities of PKs/NRPs, which is useful for the rational design of novel PKs/NRPs.
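The comparison step named above, Smith-Waterman local alignment over sequences of building-block units, can be sketched as follows. This is a generic textbook version with a linear gap penalty; the scoring values (`match`, `mismatch`, `gap`) and the unit labels in the demo are assumptions for illustration, since the abstract gives neither the scoring scheme nor the unit alphabet the authors used.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two sequences of hashable units.

    `a` and `b` may be strings or lists of unit labels. The scoring
    parameters are illustrative defaults, not values from the paper.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for x in a:
        curr = [0]  # first column is always 0 for local alignment
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            score = max(0, diag, up, left)  # 0 restarts the local alignment
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    # Hypothetical unit labels, chosen only to show a shared three-unit run.
    pk_a = ["mal", "mal", "mmal", "gly"]
    pk_b = ["ac", "mal", "mmal", "gly", "ser"]
    print(smith_waterman_score(pk_a, pk_b))
```

Only two rows of the table are kept because the sketch returns the score alone; recovering the aligned units themselves would require storing the full table and tracing back from the best cell.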

16.
Essentially all of the sequences in the pea (Pisum sativum) genome which reassociate with single copy kinetics at standard (Tm -25°C) criterion follow repetitive kinetics at lower temperatures (about Tm -35°C). Analysis of thermal stability profiles for presumptive single copy duplexes shows that they contain substantial mismatch even when formed at standard criterion. Thus most of the sequences in the pea genome which are conventionally defined as single copy are actually fossil repeats; that is, they are members of extensively diverged (mutated) and thus presumably ancient families of repeated sequences. Coding sequences, as represented by a cDNA probe prepared from polysomal poly(A)+ mRNA, reassociate with single copy kinetics regardless of criterion and do not form mismatched duplexes. The coding regions thus appear to be composed of true single copy sequences, but they cannot represent more than a few percent of the pea genome. Ancient diverged repeats are present in, but are not a prominent feature of, the smaller mung bean (Vigna radiata) genome. An extension of a simple evolutionary model is proposed in which these and other differences in genome organization are considered to reflect different rates of sequence amplification or genome turnover during evolution. The model accounts for some of the differences between typical plant and animal genomes.

17.
18.
Dicot wood is mainly composed of cellulose, lignin and glucuronoxylan (GX). Although the biosynthetic genes for cellulose and lignin have been studied intensively, little is known about the genes involved in the biosynthesis of GX during wood formation. Here, we report the molecular characterization of two genes, PoGT8D and PoGT43B, which encode putative glycosyltransferases, in the hybrid poplar Populus alba x tremula. The predicted amino acid sequences of PoGT8D and PoGT43B exhibit 89 and 75% similarity to the Arabidopsis thaliana IRREGULAR XYLEM8 (IRX8) and IRX9, respectively, both of which have been shown to be required for GX biosynthesis. The PoGT8D and PoGT43B genes were found to be expressed in cells undergoing secondary wall thickening, including the primary xylem, secondary xylem and phloem fibers in stems, and the secondary xylem in roots. Both PoGT8D and PoGT43B are predicted to be type II membrane proteins and were shown to be targeted to the Golgi. Overexpression of PoGT43B in the irx9 mutant was able to rescue the defects in plant size and secondary wall thickness and partially restore the xylose content. Taken together, our results demonstrate that PoGT8D and PoGT43B are Golgi-localized, secondary wall-associated proteins, and that PoGT43B is a functional ortholog of IRX9 involved in GX biosynthesis during wood formation.

19.
Sugino H. FEBS letters, 2007, 581(3): 355-360
The rat and mouse amylase gene families were characterized using sequence data from the UCSC genome assembly. We found that the rat genome contains one amylase-1 and two amylase-2 genes, lying close to one another on the same chromosome. Detailed analysis revealed at least six additional amylase pseudogenes in the rat genome in the region adjacent to the amylase-2 genes. In contrast, the mouse has one amylase-1 gene and five amylase-2 genes; the latter are tandemly and systematically arranged on the same chromosome and were generated by segmental duplication. Detailed analysis revealed that the mouse has two amylase pseudogenes, located 5' to the five amylase-2 segments. Thus, the amylase genes of mouse and rat tend to be amplified; the sequences of some of them are fixed while others have become pseudogenes during evolution. This is the second report of amylase genomic organization in mammals and the first in the rodents.

20.
Simple sequence repeats (SSRs), also called microsatellites, are very useful for genetic marker development and genome applications. The growing number of fully sequenced large genomes provides sources for in silico SSR mining. However, existing SSR mining tools cannot process large genomes efficiently and generate no or poor statistics. Genome-wide Microsatellite Analyzing Tool (GMATo) is a novel tool for SSR mining and statistics at the genome level. It is faster and more accurate than the existing tools SSR Locator and MISA. If a DNA sequence is too long, it is split into segments of several Mb, after which motifs are generated and searched using Perl's pattern-matching functions. Matched loci from each chunk are then merged to produce the final SSR loci information. Only one input file is required, containing raw FASTA DNA sequences; the output files, in tabular format, list all SSR loci information and the statistical distribution at four classifications. GMATo was programmed in Java and Perl, with both graphical and command-line interfaces, either of which can be run alone in a platform-independent manner with full control over parameters. GMATo is a powerful tool for complete SSR characterization in genomes of any size.

Availability

The GMATo software is freely available at http://sourceforge.net/projects/gmato/files/?source=navbar or on request.
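The pattern-matching idea behind this kind of SSR search can be sketched with a backreference regular expression. The sketch below uses Python's `re` module rather than Perl, and it is not GMATo's code: the function name `find_ssrs`, its parameters, and the default thresholds are assumptions chosen for illustration, and the chunking and merging of very long sequences described in the abstract is left out.

```python
import re


def find_ssrs(seq, min_motif=1, max_motif=6, min_repeats=5):
    """Find tandem repeats of a short motif in a DNA sequence.

    Returns (start, end, motif, repeat_count) tuples with 0-based,
    end-exclusive coordinates. All thresholds are illustrative defaults.
    """
    # ([ACGT]{m,n}?) lazily captures the shortest workable motif, and
    # \1{k,} requires at least k further tandem copies of that motif.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        repeats = (m.end() - m.start()) // len(motif)
        hits.append((m.start(), m.end(), motif, repeats))
    return hits


if __name__ == "__main__":
    # Synthetic sequence with an (AC)6 and an (AGC)5 repeat, for illustration.
    demo = "GGT" + "AC" * 6 + "TTG" + "AGC" * 5 + "T"
    for hit in find_ssrs(demo):
        print(hit)
```

Applying one repeat threshold to every motif length is a simplification; SSR tools commonly let the minimum repeat count vary with motif length.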
