Similar articles
 Found 20 similar articles; search time 31 ms
1.
2.
3.
4.
There is currently convincing evidence that microRNAs have evolved independently in at least six different eukaryotic lineages: animals, land plants, chlorophyte green algae, demosponges, slime molds and brown algae. MicroRNAs from different lineages are not homologous, but some structural features are strongly conserved across the eukaryotic tree, allowing the application of stringent criteria to identify novel microRNA loci. A large set of 63 microRNA families was identified in the brown alga Ectocarpus based on mapping of RNA-seq data, and nine microRNAs were confirmed by northern blotting. The Ectocarpus microRNAs are highly diverse at the sequence level, with few multi-gene families, and do not tend to occur in clusters, but they exhibit some highly conserved structural features such as the presence of a uracil at the first residue. No homologues of Ectocarpus microRNAs were found in other stramenopile genomes, indicating that they emerged late in stramenopile evolution and are perhaps specific to the brown algae. The large number of microRNA loci in Ectocarpus is consistent with the developmental complexity of many brown algal species and supports a proposed link between the emergence and expansion of microRNA regulatory systems and the evolution of complex multicellularity.

5.
YedZ of Escherichia coli is an integral 6 transmembrane spanning (TMS) protein of unknown function. We have identified homologues of YedZ in bacteria and animals but could not find homologues in Archaea or the other eukaryotic kingdoms. YedZ homologues exhibit conserved histidyl residues in their transmembrane domains that may function in heme binding. Some of the homologues encoded in the genomes of magnetotactic bacteria and cyanobacteria have YedZ domains fused to transport and electron transfer proteins, respectively. One of the animal homologues is the 6 TMS epithelial plasma membrane antigen of the prostate (STAMP1) that is overexpressed in prostate cancer. Animal homologues have YedZ domains fused C-terminal to homologues of coenzyme F420-dependent NADP oxidoreductases. YedZ homologues are shown to have arisen by intragenic triplication of a 2 TMS-encoding element. They exhibit slight but statistically significant sequence similarity to two families of putative heme export systems and one family of cytochrome-containing electron carriers. We propose that YedZ homologues function as heme-binding proteins that can facilitate or regulate oxidoreduction, transmembrane electron flow and transport.

6.
7.
Fast algorithms for large-scale genome alignment and comparison   Total citations: 35 (self-citations: 5, citations by others: 30)
We describe a suffix-tree algorithm that can align the entire genome sequences of eukaryotic and prokaryotic organisms with minimal use of computer time and memory. The new system, MUMmer 2, runs three times faster while using one-third as much memory as the original MUMmer system. It has been used successfully to align the entire human and mouse genomes to each other, and to align numerous smaller eukaryotic and prokaryotic genomes. A new module permits the alignment of multiple DNA sequence fragments, which has proven valuable in the comparison of incomplete genome sequences. We also describe a method to align more distantly related genomes by detecting protein sequence homology. This extension to MUMmer aligns two genomes after translating the sequence in all six reading frames, extracts all matching protein sequences and then clusters the matches together. This method has been applied to both incomplete and complete genome sequences in order to detect regions of conserved synteny, in which multiple proteins from one organism are found in the same order and orientation in another. The system code is being made freely available by the authors.
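The six-frame translation step mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not MUMmer's code: the function names (`reverse_complement`, `translate`, `six_frame_translation`) are hypothetical, the standard genetic code is assumed, and incomplete trailing codons are simply dropped.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption:
# organisms with alternative codes would need a different table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(dna):
    """Map frame labels (+1..+3, -1..-3) to the translated protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, s in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames
```

Matching the translated frames of two genomes against each other, and clustering the matches, would be separate steps that this sketch does not attempt.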

8.
The heat shock protein 70 kDa sequences (HSP70) are of great importance as molecular chaperones in protein folding and transport. They are abundant under conditions of cellular stress. They are highly conserved in all domains of life: Archaea, eubacteria, eukaryotes, and organelles (mitochondria, chloroplasts). A multiple alignment of a large collection of these sequences was obtained employing our symmetric-iterative ITERALIGN program (Brocchieri and Karlin 1998). Assessments of conservation are interpreted in evolutionary terms and with respect to functional implications. Many archaeal sequences (methanogens and halophiles) tend to align best with the Gram-positive sequences. These two groups also miss a signature segment [about 25 amino acids (aa) long] present in all other HSP70 species (Gupta and Golding 1993). We observed a second signature sequence of about 4 aa absent from all eukaryotic homologues, significantly aligned in all prokaryotic sequences. Consensus sequences were developed for eight groups [Archaea, Gram-positive, proteobacterial Gram-negative, singular bacteria, mitochondria, plastids, eukaryotic endoplasmic reticulum (ER) isoforms, eukaryotic cytoplasmic isoforms]. All group consensus comparisons tend to summarize better the alignments than do the individual sequence comparisons. The global individual consensus "matches" 87% with the consensus of consensuses sequence. A functional analysis of the global consensus identifies a (new) highly significant mixed charge cluster proximal to the carboxyl terminus of the sequence highlighting the hypercharge run EEDKKRRER (one-letter aa code used). The individual Archaea and Gram-positive sequences contain a corresponding significant mixed charge cluster in the location of the charge cluster of the consensus sequence. In contrast, the four Gram-negative proteobacterial sequences of the alignment do not have a charge cluster (even at the 5% significance level). 
All eukaryotic HSP70 sequences have the analogous charge cluster. Strikingly, several of the eukaryotic isoforms show multiple mixed charge clusters. These clusters were interpreted with supporting data related to HSP70 activity in facilitating chaperone, transport, and secretion function. We observed that the consensus contains only a single tryptophan residue and a single conserved cysteine. This is interpreted with respect to the target rule for disaggregating misfolded proteins. The mitochondrial HSP70 connections to bacterial HSP70 are analyzed, suggesting a polyphyletic split of Trypanosoma and Leishmania protist mitochondrial (Mt) homologues separated from Mt-animal/fungal/plant homologues. Moreover, the HSP70 sequences from the amitochondrial Entamoeba histolytica and Trichomonas vaginalis species were analyzed. The E. histolytica HSP70 is most similar to the higher eukaryotic cytoplasmic sequences, with significantly weaker alignments to ER sequences and much diminished matching to all eubacterial, mitochondrial, and chloroplast sequences. This appears to be at variance with the hypothesis that E. histolytica rather recently lost its mitochondrial organelle. T. vaginalis contains two HSP70 sequences, one Mt-like and the second similar to eukaryotic cytoplasmic sequences, suggesting two diverse origins. Received: 29 January 1998 / Accepted: 14 May 1998
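To make the notion of a mixed charge run such as EEDKKRRER concrete, here is a minimal Python sketch that finds the longest uninterrupted stretch of charged residues (D, E, K, R) in a protein string and checks whether it mixes acidic and basic residues. The function names and the example sequence are hypothetical, and the sketch does not reproduce the statistical significance test behind the charge clusters reported in the abstract.

```python
import re

ACIDIC = set("DE")  # negatively charged residues
BASIC = set("KR")   # positively charged residues


def longest_charged_run(protein):
    """Return the longest contiguous run of D/E/K/R residues ('' if none)."""
    runs = re.findall(r"[DEKR]+", protein.upper())
    return max(runs, key=len, default="")


def is_mixed_charge(run):
    """True when the run contains both acidic and basic residues."""
    residues = set(run)
    return bool(residues & ACIDIC) and bool(residues & BASIC)
```

For a made-up fragment such as `"MAEEDKKRRERGS"`, `longest_charged_run` returns `"EEDKKRRER"`, which `is_mixed_charge` classifies as mixed.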

9.
10.
Approximately 40 ribosomal proteins from each of Halobacterium marismortui and Bacillus stearothermophilus have been sequenced, either by direct protein sequence analysis or by DNA sequence analysis of the appropriate genes. The comparison of the amino acid sequences from the archaebacterium H. marismortui with the available ribosomal proteins from the eubacterial and eukaryotic kingdoms revealed four different groups of proteins: 24 proteins are related to both eubacterial and eukaryotic proteins. Eleven proteins are exclusively related to eukaryotic counterparts. For three proteins only eubacterial relatives could be found, and for another three proteins no counterpart could be found. The similarities of the halobacterial ribosomal proteins are in general somewhat higher to their eukaryotic than to their eubacterial counterparts. The comparison of B. stearothermophilus proteins with their E. coli homologues showed that the proteins evolved at different rates. Some proteins are highly conserved with 64-76% identity, others are poorly conserved with only 25-34% identical amino acid residues.
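Identity figures like the 64-76% and 25-34% quoted above are computed from pairwise alignments. A minimal Python sketch follows, assuming the two sequences are already aligned to equal length with `-` as the gap character. The function name is hypothetical, and the choice to exclude gapped columns from the denominator is one convention among several, not necessarily the one used in the study.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

For the toy alignment `"MKV-LA"` against `"MRVGLA"`, five columns are compared and four match, giving 80.0.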

11.
12.
13.
Repeat elements are important components of eukaryotic genomes. One limitation in our understanding of repeat elements is that most analyses rely on reference genomes that are incomplete and often contain missing data in highly repetitive regions that are difficult to assemble. To overcome this problem, we develop a new method, REPdenovo, which assembles repeat sequences directly from raw shotgun sequencing data. REPdenovo can construct various types of repeats that are highly repetitive and have low sequence divergence within copies. We show that REPdenovo is substantially better than existing methods both in terms of the number and the completeness of the repeat sequences that it recovers. The key advantage of REPdenovo is that it can reconstruct long repeats from sequence reads. We apply the method to human data and discover a number of potentially new repeat sequences that have been missed by previous repeat annotations. Many of these sequences are incorporated into various parasite genomes, possibly because the filtering process for host DNA involved in the sequencing of the parasite genomes failed to exclude the host-derived repeat sequences. REPdenovo is a new powerful computational tool for annotating genomes and for addressing questions regarding the evolution of repeat families. The software tool, REPdenovo, is available for download at https://github.com/Reedwarbler/REPdenovo.
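The abstract above does not describe the internals of REPdenovo, so the following Python sketch is only an illustration of one generic idea behind reference-free repeat discovery: k-mers that occur unusually often across raw reads are candidates for belonging to repeats. The function name, the parameters `k` and `min_count`, and the example reads are all assumptions made for illustration, and reverse-complement canonicalization is omitted for brevity.

```python
from collections import Counter


def frequent_kmers(reads, k, min_count):
    """Count every k-mer across all reads and keep those seen at least min_count times."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return {kmer: n for kmer, n in counts.items() if n >= min_count}
```

With the toy reads `["ACGTACGT", "TTACGTAA"]`, `k=4` and `min_count=3`, only `ACGT` passes the threshold. A real pipeline would then need to assemble such k-mers into longer sequences, which is outside the scope of this sketch.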

14.
15.
16.
In the process of analysing the four available complete archaeal genomes, we have noted that certain regions characterised as 'non-coding' exhibit significant sequence similarity to other protein sequences from Archaea and other species. Using established technology, we have identified a number of potential protein coding regions in these putative 'non-coding' regions. We have detected 524 such cases, of which 113 regions appear to code for proteins present in archaeal or other species, while the remaining 411 regions are mostly start/stop definition conflicts. Of the 113 protein coding regions, only 21 code for proteins with homologues of known function. The number of novel coding sequences identified herein amounts to 1.5% of the total genome entries, while the conflicting cases represent an additional 5%. The observed differences between the four complete archaeal genomes seem to reflect disparate approaches to genome annotation. Genome sequence collections should be regularly checked to improve gene prediction by sequence similarity, and greater effort is required to make gene definitions consistent across related species.

17.
Several GTP-binding proteins with poorly defined functions were previously identified in Escherichia coli (i.e. Era, ThdF (TrmE)), Bacillus subtilis (i.e. Obg) and Neisseria gonorrhoeae (i.e. EngA). In these species, every individual protein is encoded by an essential gene. BLAST searches were used to detect orthologs in genomes of various organisms. Alignments of orthologous sequences allowed the construction of phylogenetic trees and the definition of protein families. The BLAST searches also resulted in the identification of two additional families, the YchF and YihA families, named after the ychF and yihA genes of E. coli. Most families are not present in archaeal genomes, but representatives of each family were also detected in eukaryotic genomes. Only representatives of the YchF family are present in every genome sequenced to date, suggesting that YchF-like proteins might be involved in a fundamental life process. The GTP1/DRG family consisting of eukaryotic and archaeal proteins is related to the YchF family of GTP-binding proteins. The relationship of the six prokaryotic families of GTP-binding proteins and the GTP1/DRG family to eukaryotic GTPase families was also investigated: with the exception of the ARF family, a clear separation of the six prokaryotic families and the GTP1/DRG family with respect to eukaryotic (RAB, RAN, RAS and RHO) GTPases was observed.

18.
19.
Annexins are Ca2+-binding, membrane-interacting proteins, widespread among eukaryotes, consisting usually of four structurally similar repeated domains. It is accepted that vertebrate annexins derive from a double genome duplication event. It has been postulated that a single domain annexin, if found, might represent a molecule related to the hypothetical ancestral annexin. The recent discovery of a single-domain annexin in a bacterium, Cytophaga hutchinsonii, apparently confirmed this hypothesis. Here, we present a more complex picture. Using remote sequence similarity detection tools, a survey of bacterial genomes was performed in search of annexin-like proteins. In total, we identified about thirty annexin homologues, including single-domain and multi-domain annexins, in seventeen bacterial species. The thorough search yielded, besides the known annexin homologue from C. hutchinsonii, homologues from the Bacteroidetes/Chlorobi phylum, from Gemmatimonadetes, from beta- and delta-Proteobacteria, and from Actinobacteria. The sequences of bacterial annexins exhibited remote but statistically significant similarity to sequence profiles built of the eukaryotic ones. Some bacterial annexins are equipped with additional, different domains, for example those characteristic for toxins. The variation in bacterial annexin sequences, much wider than that observed in eukaryotes, and different domain architectures suggest that annexins found in bacteria may actually descend from an ancestral bacterial annexin, from which eukaryotic annexins also originate. The hypothesis of an ancient origin of bacterial annexins has to be reconciled with the fact that remarkably few bacterial strains possess annexin genes compared to the thousands of known bacterial genomes and with the patchy, anomalous phylogenetic distribution of bacterial annexins. Thus, a massive annexin gene loss in several bacterial lineages or very divergent evolution would appear a likely explanation. 
Alternative evolutionary scenarios, involving horizontal gene transfer between bacteria and protozoan eukaryotes, in either direction, appear much less likely. Altogether, current evidence does not allow unequivocal judgement as to the origin of bacterial annexins.

20.
Genomic polymorphism in the T-even bacteriophages.   Total citations: 11 (self-citations: 0, citations by others: 11)
F Repoila, F Tétart, J Y Bouet, H M Krisch. The EMBO Journal, 1994, 13(17): 4181-4192
We have compared the genomes of 49 bacteriophages related to T4. PCR analysis of six chromosomal regions reveals two types of local sequence variation. In four loci, we found only two alternative configurations in all the genomes that could be analyzed. In contrast, two highly polymorphic loci exhibit variations in the number, the order and the identity of the sequences present. In phage T4, both highly polymorphic loci encode internal proteins (IPs) that are encapsidated in the phage particle and injected with the viral DNA. Among the various T4-related phages, 10 different ORFs have been identified in the IP loci; their amino acid sequences have the characteristics of internal proteins. At the beginning of each of these coding sequences is a highly conserved 11 amino acid leader motif. In addition, both 5' and 3' to most of these ORFs, there is an approximately 70 bp sequence that contains a T4 early promoter sequence with an overlapping inversely repeated sequence. The homologies within these flanking sequences may mediate the recombinational shuffling of the IP sequences within the locus. A role for the new IP-like sequences in determining the phage host range is proposed, since such a role has been previously demonstrated for the IP1 gene of T4.
