Similar articles
 Found 20 similar articles; search took 15 ms
1.
The functional significance of evolutionarily conserved motifs/patterns of short regions in proteins is well documented. Although a large number of sequences are conserved, only a small fraction of these are invariant across several organisms. Here, we have examined the structural features of the functionally important peptide sequences, which have been found invariant across diverse bacterial genera. Ramachandran angles (phi,psi) have been used to analyze the conformation, folding patterns and geometrical location (buried/exposed) of these invariant peptides in different crystal structures harboring these sequences. The analysis indicates that the peptides preferred a single conformation in different protein structures, with the exception of only a few longer peptides that exhibited some conformational variability. In addition, it is noticed that the variability of conformation occurs mainly due to flipping of peptide units about the virtual C(alpha)...C(alpha) bond. However, for a given invariant peptide, the folding patterns are found to be similar in almost all the cases. Over and above, such peptides are found to be buried in the protein core. Thus, we can safely conclude that these invariant peptides are structurally important for the proteins, since they acquire unique structures across different proteins and can act as structural determinants (SD) of the proteins. The location of these SD peptides on the protein chain indicated that most of them are clustered towards the N-terminal and middle region of the protein with the C-terminal region exhibiting low preference. Another feature that emerges out of this study is that some of these SD peptides can also play the roles of "fold boundaries" or "hinge nucleus" in the protein structure. The study indicates that these SD peptides may act as chain-reversal signatures, guiding the proteins to adopt appropriate folds. 
In some cases the invariant signature peptides may also act as folding nuclei (FN) of the proteins.

2.
Two types of Herpesvirus saimiri genomes can be isolated from purified virions: (i) the M genome is a double-stranded, linear DNA molecule with a mean contour length corresponding to 89 × 10^6 daltons. The M genome contains about 70% unique sequences (light DNA, 36% guanine plus cytosine) and 30% reiterated sequences (heavy DNA, 71% guanine plus cytosine). (ii) the H genome is composed of heavy DNA only and is more heterogeneous in size. The sequences in the H genome are up to 40-fold reiterated, indicating defectiveness of this type of genome. The repetitions in the H genome and the M genome cross-hybridize almost completely and have identical kinetic complexity (2.8 × 10^6 daltons). DNA infectivity studies using the calcium phosphate and the DEAE-dextran methods gave further evidence that H genomes are defective: no infectious virus was recovered from permissive cells treated with heavy DNA, whereas M genome-infected cells developed cytopathic changes after 11 to 56 days. Defective H genomes were present in the progeny virus two passages after transfection.

3.
Q Xu  G Xiong  P Li  F He  Y Huang  K Wang  Z Li  J Hua 《PloS one》2012,7(8):e37128

Background

Cotton (Gossypium spp.) is a model system for the analysis of polyploidization. Although the donor species of allotetraploid cotton have been intensively studied, and it is generally accepted that the parents were A- and D-genome containing species, sequence comparison of Gossypium chloroplast genomes is still of interest for understanding the mechanisms underlying the evolution of Gossypium allotetraploids. Here we performed a comparative analysis of 13 Gossypium chloroplast genomes, twelve of which are presented here for the first time.

Methodology/Principal Findings

The size of 12 chloroplast genomes under study varied from 159,959 bp to 160,433 bp. The chromosomes were highly similar having >98% sequence identity. They encoded the same set of 112 unique genes which occurred in a uniform order with only slightly different boundary junctions. Divergence due to indels as well as substitutions was examined separately for genome, coding and noncoding sequences. The genome divergence was estimated as 0.374% to 0.583% between allotetraploid species and A-genome, and 0.159% to 0.454% within allotetraploids. Forty protein-coding genes were completely identical at the protein level, and 20 intergenic sequences were completely conserved. The 9 allotetraploids shared 5 insertions and 9 deletions in whole genome, and 7-bp substitutions in protein-coding genes. The phylogenetic tree confirmed a close relationship between allotetraploids and the ancestor of A-genome, and the allotetraploids were divided into four separate groups. Progenitor allotetraploid cotton originated 0.43–0.68 million years ago (MYA).

Conclusion

Despite the high degree of conservation between the Gossypium chloroplast genomes, sequence variations among species could still be detected. Gossypium chloroplast genomes show a preference for 5-bp indels, and 1–3-bp indels are mainly attributed to SSR polymorphisms. This study supports the view that the common ancestor of diploid A-genome species in Gossypium is the maternal source of extant allotetraploid species and that allotetraploids have a monophyletic origin. G. hirsutum AD1 lineages have experienced more sequence variations than other allotetraploids in intergenic regions. The available complete nucleotide sequences of 12 Gossypium chloroplast genomes should facilitate studies to uncover the molecular mechanisms of compartmental co-evolution and speciation of Gossypium allotetraploids.

4.
MIPS: a database for protein sequences and complete genomes.   Cited by: 7 (self-citations: 0; citations by others: 7)
The MIPS group [Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)] at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis of other small genomes and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). Through its WWW server (http://www.mips.biochem.mpg.de) MIPS provides access to a variety of generic databases, including a database of protein families as well as data generated automatically by the systematic application of sequence analysis algorithms. The yeast genome sequence and its related information were also compiled on CD-ROM to provide dynamic interactive access to the 16 chromosomes of the first eukaryotic genome unraveled.

5.
Liriomyza trifolii (Burgess), Liriomyza huidobrensis (Blanchard), and Liriomyza bryoniae (Kaltenbach) are three closely related and economically important leafminer pests worldwide. This study examined the complete mitochondrial genomes of L. trifolii, L. huidobrensis and L. bryoniae, which were 16141 bp, 16236 bp and 16183 bp in length, respectively. All of them displayed 37 typical animal mitochondrial genes and an A + T-rich region. The genomes were highly compact with only 60–68 bp of non-coding intergenic spacer. However, considerable differences in the A + T-rich region were detected among the three species. Results of this study also showed that the two ribosomal RNA genes of the three species had very few variable sites and thus should not provide much information for the study of population genetics of these species. Data generated from the three leafminers' complete mitochondrial genomes should provide valuable information for studying the phylogeny of Diptera and for developing genetic markers for species identification in leafminers.

6.
7.
Shannon information in the genomes of all completely sequenced prokaryotes and eukaryotes is measured in word lengths of two to ten letters. It is found that, in a scale-dependent way, the Shannon information in complete genomes is much greater than that in matching random sequences: thousands of times greater in the case of short words. Furthermore, with the exception of the 14 chromosomes of Plasmodium falciparum, the Shannon information in all available complete genomes belongs to a universality class given by an extremely simple formula. The data are consistent with a model for genome growth composed of two main ingredients: random segmental duplications that increase the Shannon information in a scale-independent way, and random point mutations that preferentially reduce the larger-scale Shannon information. The inference drawn from the present study is that the large-scale and coarse-grained growth of genomes was selectively neutral, and this suggests an independent corroboration of Kimura's neutral theory of evolution.
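As a rough illustration of measuring information "in word lengths of two to ten letters", the sketch below computes the ordinary Shannon entropy of the overlapping k-letter word distribution of a sequence, and its shortfall from the maximum for a 4-letter alphabet. This is an assumption made for illustration: the abstract does not give the authors' exact definition of Shannon information or their "extremely simple formula", and the function names `kmer_entropy` and `information_deficit` are hypothetical.

```python
from collections import Counter
from math import log2


def kmer_entropy(seq: str, k: int) -> float:
    """Shannon entropy (in bits) of the overlapping k-letter word distribution."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_deficit(seq: str, k: int) -> float:
    """Maximum entropy for a 4-letter alphabet (2k bits) minus the observed entropy.

    Illustrative stand-in only; not necessarily the measure used in the abstract.
    """
    return 2 * k - kmer_entropy(seq, k)


if __name__ == "__main__":
    toy = "ACGTACGTTTTTTTTTACGT"  # hypothetical toy sequence
    for k in range(2, 5):
        print(k, round(kmer_entropy(toy, k), 3), round(information_deficit(toy, k), 3))
```

Comparing these values for a real genome against a shuffled copy of the same sequence would be one way to reproduce the kind of genome-versus-random contrast the abstract describes.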

8.
Raghavan S  Hariharan R  Brahmachari SK 《Gene》2000,242(1-2):275-283
The genomes of Methanococcus jannaschii, Mycoplasma genitalium, Haemophilus influenzae, Archaeoglobus fulgidus, Helicobacter pylori, Treponema pallidum, Borrelia burgdorferi, Rickettsia prowazekii, Mycobacterium tuberculosis, Methanobacterium thermoautotrophicum, Synechocystis sp. PCC6803, Bacillus subtilis, Chlamydia trachomatis, Pyrococcus horikoshii, Aquifex aeolicus, Mycoplasma pneumoniae and Escherichia coli have been analysed for the presence of polypurine.polypyrimidine tracts, in order to understand their distribution in these genomes. We observed a variation in abundance of such sequences in these bacteria, with the archaeal genomes forming a high-abundance group and the canonical eubacteria forming a low-abundance group. The genomes of M. tuberculosis and A. aeolicus are unique among the organisms analysed here in the abnormal underrepresentation and overrepresentation of polypurine.polypyrimidine, respectively. We also observe a strand bias, i.e., a preferential occurrence of polypurines in coding strands. It varies widely among the bacteria, from the very high bias in M. jannaschii to the slightly inverse bias in the parasitic genomes of T. pallidum and C. trachomatis. The extent of strand bias, however, cannot be explained on the basis of the GC-content of the genome, use of all-purine codons or an excess in the amino acids that are encoded by such codons. The probable causes and effects of this phenomenon are discussed.
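A minimal sketch of how polypurine.polypyrimidine tracts might be located on one strand: a run of only A/G is a polypurine tract (and so a polypyrimidine tract on the complementary strand), and a run of only C/T is the reverse. The minimum tract length of 10 and the name `find_tracts` are assumptions for illustration; the abstract does not state the threshold the authors used.

```python
import re


def find_tracts(seq: str, min_len: int = 10):
    """Return (start, end, kind) for each run of only purines or only pyrimidines.

    Coordinates are 0-based, end-exclusive, on the strand that was passed in.
    """
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    tracts = []
    for m in pattern.finditer(seq.upper()):
        kind = "purine" if m.group()[0] in "AG" else "pyrimidine"
        tracts.append((m.start(), m.end(), kind))
    return tracts


if __name__ == "__main__":
    # Hypothetical toy sequence with one purine run and one pyrimidine run.
    print(find_tracts("AAGGAGCTTCCTA", min_len=4))
```

Counting how many purine tracts fall on the coding strand of annotated genes versus the template strand would give a simple estimate of the strand bias discussed above.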

9.
10.
Mitochondrial DNA (mtDNA), as a model system, has been extensively used for molecular phylogenetic and evolutionary analysis[1]. With the advances in DNA sequencing technology, more and more researchers prefer to use complete mitochondrial genomes for phylogenetic analysis[2–4]. Since the complete sequencing of human mtDNA in 1981 (Anderson et al., 1981)[5], 342 vertebrate mitochondrial genomes have so far been sequenced. Up to now the complete sequences of 29 avian mitochondrial genomes h…

11.
Insertion sequences (ISs) are small DNA segments that are often capable of moving neighbouring genes. Over 1500 different ISs have been identified to date. They can have large and spectacular effects in shaping and reshuffling the bacterial genome. Recent studies have provided dramatic examples of such IS activity, including massive IS expansion during the emergence of some pathogenic bacterial species and the intimate involvement of ISs in assembling genes into complex plasmid structures. However, a global understanding of their impact on bacterial genomes requires detailed knowledge of their distribution across the eubacterial and archaeal kingdoms, understanding their partition between chromosomes and extra-chromosomal elements (e.g. plasmids and viruses) and the factors which influence this, and appreciation of the different transposition mechanisms in action, the target preferences and the host factors that influence transposition. In addition, defective (non-autonomous) elements, which can be complemented by related active elements in the same cell, are often overlooked in genome annotations but also contribute to the evolution of genome organisation.

12.
Intervening sequences in chloroplast genomes   Cited by: 13 (self-citations: 0; citations by others: 13)
B Koller  H Delius 《Cell》1984,36(3):613-622

13.
14.
A method is proposed to represent and analyze complete genome sequences (52 species of prokaryotes and eukaryotes), based on the frequencies of amino acid pairs (bigrams) separated by a given number of other residues. For each of the species analyzed, it allows us to construct over-abundant and over-deficient occurrence profiles, summarizing amino acid bigram frequencies over the entire genome. The method deals efficiently with the sparseness of statistical representations of individual sequences, and describes every gene sequence in the same way, independently of its length and of the genome size. The frequency of over-abundant and over-deficient occurrences of bigrams presents a singular periodicity around 3.5 peptide bonds, suggesting a relation with the alpha-helical secondary structure.
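The counting step described above can be sketched as follows: for a chosen gap, tally every pair of residues separated by exactly that many intervening residues, then compare each pair's observed frequency with the frequency expected if residues were independent. The observed/expected ratio is an assumed stand-in for the authors' over-abundance and over-deficiency statistic, which the abstract does not specify; `gapped_bigrams` and `enrichment` are hypothetical names.

```python
from collections import Counter


def gapped_bigrams(protein: str, gap: int) -> Counter:
    """Count residue pairs (a, b) where b follows a with `gap` residues in between."""
    step = gap + 1
    return Counter(
        (protein[i], protein[i + step]) for i in range(len(protein) - step)
    )


def enrichment(protein: str, gap: int) -> dict:
    """Observed pair frequency divided by the product of single-residue frequencies.

    Values above 1 suggest over-abundance, values below 1 over-deficiency
    (illustrative statistic only).
    """
    pairs = gapped_bigrams(protein, gap)
    n_pairs = sum(pairs.values())
    single = Counter(protein)
    n = len(protein)
    return {
        ab: (count / n_pairs) / ((single[ab[0]] / n) * (single[ab[1]] / n))
        for ab, count in pairs.items()
    }


if __name__ == "__main__":
    toy = "MKLAEELLKKAEELLKK"  # hypothetical helix-like toy peptide
    for gap in range(0, 5):
        print(gap, gapped_bigrams(toy, gap).most_common(2))
```

Scanning the gap from 0 upward and plotting how many pairs are strongly enriched at each gap is one way a periodicity near 3.5 residues could show up.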

15.
16.
Chen M  Zeng G  Tan Z  Jiang M  Zhang J  Zhang C  Lu L  Lin Y  Peng J 《FEBS letters》2011,585(7):1072-1076
Compound microsatellites consisting of two or more repeats in close proximity have been found in eukaryotic genomes. So far such compound microsatellites have not been investigated in any prokaryotic genomes. We have therefore examined compound microsatellites in 22 complete genomes of Escherichia coli, which is one of the ideal model organisms for analyzing the nature and evolution of prokaryotic compound microsatellites. Our results indicated that about 1.75-2.85% of all microsatellites could be classified as compound microsatellites with very low complexity, and most compound microsatellites were composed of very different motifs. Compound microsatellites were significantly overrepresented in all surveyed genomes. These results were dramatically different from those in eukaryotes. We discuss possible reasons for the observed divergence.
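One simple way to operationalize "two or more repeats in close proximity" is sketched below: find perfect tandem repeats with a regular expression, then group neighbours whose separation does not exceed a maximum gap. The thresholds (`min_repeats=3`, motifs of 1 to 6 bp, `max_gap=10`) and both function names are assumptions for illustration; the abstract does not give the criteria the authors applied.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 3, max_motif: int = 6):
    """Return (start, end, motif) for perfect tandem repeats, shortest motif first."""
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq)]


def find_compound(seq: str, max_gap: int = 10, **kwargs):
    """Group adjacent microsatellites separated by at most `max_gap` bases.

    Only groups with two or more members are returned as compound microsatellites.
    """
    groups, current = [], []
    for ssr in find_microsatellites(seq, **kwargs):
        if current and ssr[0] - current[-1][1] > max_gap:
            if len(current) > 1:
                groups.append(current)
            current = []
        current.append(ssr)
    if len(current) > 1:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Hypothetical toy sequence: (AC)4 immediately followed by (GTT)3.
    print(find_compound("ACACACACGTTGTTGTTGA"))
```

The fraction of compound microsatellites would then be the number of microsatellites that fall in some group divided by the total number found, which can be compared against the same statistic on shuffled sequence.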

17.
Computer analysis of genes was performed for lower fungi Aspergillus fumigatus, Candida glabrata, Cryptococcus neoformans, Debaryomyces hansenii, Encephalitozoon cuniculi, Eremothecium gossypii, Kluyveromyces lactis, Magnaporthe grisea, Neurospora crassa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Ustilago maydis, and Yarrowia lipolytica. The content of genes with an exon-intron structure in their genomes varied from 0.7 to 97.0%. The exon-intron structure substantially changes with an increasing portion of intron-containing genes. Gene size and total exon length proved to linearly depend on the intron number in the A. fumigatus, C. neoformans, M. grisea, N. crassa, S. pombe, and U. maydis genomes.

18.

Background  

Detecting new coding sequences (CDSs) in viral genomes can be difficult for several reasons. The typically compact genomes often contain a number of overlapping coding and non-coding functional elements, which can result in unusual patterns of codon usage; conservation between related sequences can be difficult to interpret – especially within overlapping genes; and viruses often employ non-canonical translational mechanisms – e.g. frameshifting, stop codon read-through, leaky-scanning and internal ribosome entry sites – which can conceal potentially coding open reading frames (ORFs).

19.

Background

With the advances in DNA sequencer-based technologies, it has become possible to automate several steps of the genotyping process leading to increased throughput. To efficiently handle the large amounts of genotypic data generated and help with quality control, there is a strong need for a software system that can help with the tracking of samples and capture and management of data at different steps of the process. Such systems, while serving to manage the workflow precisely, also encourage good laboratory practice by standardizing protocols, recording and annotating data from every step of the workflow.

Results

A laboratory information management system (LIMS) has been designed and implemented at the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) that meets the requirements of a moderately high throughput molecular genotyping facility. The application is designed as modules and is simple to learn and use. The application leads the user through each step of the process from starting an experiment to the storing of output data from the genotype detection step with auto-binning of alleles; thus ensuring that every DNA sample is handled in an identical manner and all the necessary data are captured. The application keeps track of DNA samples and generated data. Data entry into the system is through the use of forms for file uploads. The LIMS provides functions to trace back to the electrophoresis gel files or sample source for any genotypic data and for repeating experiments. The LIMS is being presently used for the capture of high throughput SSR (simple-sequence repeat) genotyping data from the legume (chickpea, groundnut and pigeonpea) and cereal (sorghum and millets) crops of importance in the semi-arid tropics.

Conclusion

A laboratory information management system is available that has been found useful in the management of microsatellite genotype data in a moderately high throughput genotyping laboratory. The application with source code is freely available for academic users and can be downloaded from http://www.icrisat.org/gt-bt/lims/lims.asp.

20.
Organelle genomics has become an increasingly important research field, with applications in molecular modeling, phylogeny, taxonomy, population genetics and biodiversity. Typically, research projects involve the determination and comparative analysis of complete mitochondrial and plastid genome sequences, either from closely related species or from a taxonomically broad range of organisms. Here, we describe two alternative organelle genome sequencing protocols. The "random genome sequencing" protocol is suited for the large majority of organelle genomes irrespective of their size. It involves DNA fragmentation by shearing (nebulization) and blunt-end cloning of the resulting fragments into pUC or BlueScript-type vectors. This protocol excels in randomness of clone libraries as well as in time and cost-effectiveness. The "long-PCR-based genome sequencing" protocol is specifically adapted for DNAs of low purity and quantity, and is particularly effective for small organelle genomes. Library construction by either protocol can be completed within 1 week.
