Similar literature
20 similar articles found (search time: 15 ms)
1.
The peridinin-pigmented plastids of dinoflagellates are very poorly understood, in part because of the paucity of molecular data available from these endosymbiotic organelles. To identify additional gene sequences that would carry information about the biology of the peridinin-type dinoflagellate plastid and its evolutionary history, an analysis was undertaken of arbitrarily selected sequences from cDNA libraries constructed from Lingulodinium polyedrum (1012 non-redundant sequences) and Amphidinium carterae (2143). Among the two libraries 118 unique plastid-associated sequences were identified, including 30 (most from A. carterae) that are encoded in the plastid genome of the red alga Porphyra. These sequences probably represent bona fide nuclear genes, and suggest that there has been massive transfer of genes from the plastid to the nuclear genome in dinoflagellates. These data support the hypothesis that the peridinin-type plastid has a minimal genome, and provide data that contradict the hypothesis that there is an unidentified canonical genome in the peridinin-type plastid. Sequences were also identified that were probably transferred directly from the nuclear genome of the red algal endosymbiont, as well as others that are distinctive to the Alveolata. A preliminary report of these data was presented at the Botany 2002 meeting in Madison, WI.

2.
The availability of pathogen genome sequences has provided a tremendous amount of information that can be useful in drug target and vaccine target identification. One of the recently adopted strategies is based on a subtractive genomics approach, in which the subtraction dataset between the host and pathogen genome provides information for a set of genes that are likely to be essential to the pathogen but absent in the host. This approach has been used successfully in recent times to identify essential genes in Pseudomonas aeruginosa. We have used the same methodology to analyse the whole genome sequence of the human gastric pathogen Helicobacter pylori. Our analysis revealed that out of the 1590 coding sequences of the pathogen, 40 represent essential genes that have no human homolog. We have further analysed these 40 genes using protein sequence databases to list some 10 genes whose products are possibly exposed on the pathogen surface. The preliminary work reported here identifies a small subset of the Helicobacter proteome that might be investigated further for identifying potential drug and vaccine targets in this pathogen.
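The subtraction step described in this abstract can be illustrated as a simple set difference. The following is only a minimal sketch under stated assumptions: the gene identifiers are hypothetical placeholders, and the function `subtract_host_homologs` is an illustrative name, not the authors' pipeline. In practice the list of genes with a host homolog would come from a sequence-similarity search, which is not shown here.

```python
def subtract_host_homologs(essential_genes, host_homologs):
    """Return pathogen genes that are essential and have no host homolog.

    Both arguments are iterables of gene identifiers (hypothetical here).
    """
    return sorted(set(essential_genes) - set(host_homologs))


# Hypothetical identifiers, for illustration only.
essential = ["geneA", "geneB", "geneC", "geneD"]
with_human_homolog = ["geneB", "geneD"]

candidates = subtract_host_homologs(essential, with_human_homolog)
print(candidates)
```

The result is the candidate set that would then be screened further, for example for predicted surface exposure.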

3.

Background  

The sequencing of the human genome has enabled us to access a comprehensive list of genes (both experimental and predicted) for further analysis. While a majority of the approximately 30000 known and predicted human coding genes are characterized and have been assigned at least one function, there remain a fair number of genes (about 12000) for which no annotation has been made. The recent sequencing of other genomes has provided us with a huge amount of auxiliary sequence data which could help in the characterization of the human genes. Clustering these sequences into families is one of the first steps to perform comparative studies across several genomes.

4.
Analysis of the evolution of paralogous genes in a genome is central to our understanding of genome evolution. Comparison of closely related bacterial genomes, which has provided clues as to how genome sequences evolve under natural conditions, would help in such an analysis. For the species Staphylococcus aureus, whole-genome sequences have been decoded for seven strains. We compared their DNA sequences to detect large genome polymorphisms and to deduce mechanisms of genome rearrangements that have formed each of them. We first compared strains N315 and Mu50, which make one of the most closely related strain pairs, at the single-nucleotide resolution to catalogue all the middle-sized (more than 10 bp) to large genome polymorphisms such as indels and substitutions. These polymorphisms include two paralogous gene sets, one in a tandem paralogue gene cluster for toxins in a genomic island and the other in a ribosomal RNA operon. We also focused on two other tandem paralogue gene clusters and type I restriction-modification (RM) genes on the genomic islands. Then we reconstructed rearrangement events responsible for these polymorphisms, in the paralogous genes and the others, with reference to the other five genomes. For the tandem paralogue gene clusters, we were able to infer sequences for homologous recombination generating the change in the repeat number. These sequences were conserved among the repeated paralogous units, likely because of their functional importance. The sequence specificity (S) subunit of type I RM systems showed recombination, likely at the homology of a conserved region, between the two variable regions for sequence specificity. We also noticed novel alleles in the ribosomal RNA operons and suggested a role for illegitimate recombination in their formation. These results revealed the importance of recombination involving long conserved sequences in the evolution of paralogous genes in the genome.

5.
The availability of whole genome sequences for Shewanella oneidensis and Geobacter sulfurreducens has provided numerous new biological insights into the function of these model dissimilatory metal-reducing bacteria. Many of these findings, including the identification of a high number of c-type cytochromes in both organisms, have resulted from comparative genomic analyses, and several have been experimentally confirmed. These genome sequences have also aided the identification of genes important for the reduction of metal ions and other electron acceptors utilized during anaerobic growth, by facilitating the identification of genes disrupted by random insertions. Technologies for assaying global expression patterns for genes and proteins have also been employed, but their application has been limited mainly to the analysis of the role of global regulatory genes and to identifying genes expressed or repressed in response to specific electron acceptors. It is anticipated that details of the mechanisms of metal ion respiration, and metabolism in general, will eventually be revealed by comprehensive, systems-level analyses enabled by functional genomics data.

6.
The genome sequences of Phaeodactylum tricornutum, Thalassiosira pseudonana, and Cyanidioschyzon merolae have provided significant genome-level evidence for the secondary endosymbiosis of diatoms. Yet little has been reported about their relationships in regard to gene regulation patterns, such as those involving microRNA (miRNA). Using a homology search based on genomic sequences, 13, 3, and 7 predicted miRNA genes were found in genomes from P. tricornutum, T. pseudonana, and C. merolae, respectively. Of the 23 miRNA genes, 18 had homology with animal miRNAs, implying that they are ancestral miRNAs. A phylogenetic tree based on common miRNA families shared by these three unicellular algae, higher plants, and animals showed that P. tricornutum shared most miRNAs with animals. The phylogenetic tree also showed that C. merolae shared more miRNAs with plants than did the two diatoms, and the majority of its miRNAs were shared with the two diatoms. Our results were consistent with diatoms originating from a secondary endosymbiosis.

7.
Schistosoma mansoni genome project: an update   Total citations: 4, self-citations: 0, citations by others: 4
A schistosome genome project was initiated by the World Health Organization in 1994 with the notion that the best prospects for identifying new targets for drugs, vaccines, and diagnostic development lie in schistosome gene discovery, development of chromosome maps, whole genome sequencing and genome analysis. Schistosoma mansoni has a haploid genome of 270 Mb contained on 8 pairs of chromosomes. It is estimated that the S. mansoni genome contains between 15000 and 25000 genes. There are approximately 16689 ESTs obtained from diverse libraries representing different developmental stages of S. mansoni, deposited in the NCBI EST database. More than half of the deposited sequences correspond to genes of unknown function. Approximately 40-50% of the sequences form unique clusters, suggesting that approximately 20-25% of the total schistosome genes have been discovered. Efforts to develop low resolution chromosome maps are in progress. There is a genome sequencing program underway that will provide 3X sequence coverage of the S. mansoni genome that will result in approximately 95% gene discovery. The genomics era has provided the resources to usher in the era of functional genomics that will involve microarrays to focus on specific metabolic pathways, proteomics to identify relevant proteins and protein-protein interactions to understand critical parasite pathways. Functional genomics is expected to accelerate the development of control and treatment strategies for schistosomiasis.

8.
9.
10.
Publication of the rice genome sequence has allowed an in-depth analysis of genome organization in a model monocot plant species. This has provided a powerful tool for genome analysis in large-genome unsequenced agriculturally important monocot species such as wheat, barley, rye, Lolium, etc. Previous data have indicated that the majority of genes in large-genome monocots are located toward the ends of chromosomes in gene-rich regions that undergo high frequencies of recombination. Here we demonstrate that a substantial component of the coding sequences in monocots is localized proximally in regions of very low and even negligible recombination frequencies. The implications of our findings are that during domestication of monocot plant species selection has concentrated on genes located in the terminal regions of chromosomes within areas of high recombination frequency. Thus a large proportion of the genetic variation available for selection of superior plant genotypes has not been exploited. In addition our findings raise the possibility of the evolutionary development of large supergene complexes that confer a selective advantage to the individual.

11.
12.
In Euglena gracilis, a 26 nucleotide leader sequence (spliced leader sequence = SL) is transferred by trans-splicing to the 5' end of a vast majority of cytoplasmic mRNAs (8). The SL originates from the 5' extremity of a family of closely related snRNAs (SL-RNAs) which are about 100 nucleotides long. In this paper we present the nucleotide sequences of two SL-RNA genes, confirming the sequences previously established by sequencing purified SL-RNAs. Although some SL-RNA genes are dispersed throughout the genome, we show that the majority of SL-RNA genes are located on 0.6 kb repeated units which also encode the cytoplasmic 5S rRNA. We estimate that the copy number of these repeated units is about 300 per haploid genome. The association of SL-RNA and 5S rRNA genes in tandemly repeated units is also found in nematodes but paradoxically does not exist in trypanosomes, which are phylogenetically much closer to Euglena. We also show that a high number of sequences analogous to the 26 nucleotide SL are dispersed throughout the genome and are not associated with SL-RNAs.

13.
14.
Bacteria of the genus ‘Candidatus Phytoplasma’ are uncultivated intracellular plant pathogens transmitted by phloem-feeding insects. They have small genomes lacking genes for essential metabolites, which they acquire from either plant or insect hosts. Nonetheless, some phytoplasmas, such as ‘Ca. P. solani’, have broad plant host range and are transmitted by several polyphagous insect species. To understand better how these obligate symbionts can colonize such a wide range of hosts, the genome of ‘Ca. P. solani’ strain SA-1 was sequenced from infected periwinkle via a metagenomics approach. The de novo assembly generated a draft genome with 19 contigs totalling 821,322 bp, which corresponded to more than 80% of the estimated genome size. Further completion of the genome was challenging due to the high occurrence of repetitive sequences. The majority of repeats consisted of gene arrangements characteristic of phytoplasma potential mobile units (PMUs). These regions showed variation in gene orders intermixed with genes of unknown functions and lack of similarity to other phytoplasma genes, suggesting that they were prone to rearrangements and acquisition of new sequences via recombination. The availability of this high-quality draft genome also provided a foundation for genome-scale genotypic analysis (e.g., average nucleotide identity and average amino acid identity) and molecular phylogenetic analysis. Phylogenetic analyses provided evidence of horizontal transfer for PMU-like elements from various phytoplasmas, including distantly related ones. The ‘Ca. P. solani’ SA-1 genome also contained putative secreted protein/effector genes, including a homologue of SAP11, found in many other phytoplasma species.

15.
16.

Background  

The rapid completion of genome sequences has created an infrastructure of biological information and provided essential information to link genes to gene products, proteins, the building blocks for cellular functions. In addition, genome/cDNA sequences make it possible to predict proteins for which there is no experimental evidence. Clues for function of hypothetical proteins are provided by sequence similarity with proteins of known function in model organisms.

17.
Human gene catalogs are fundamental to the study of human biology and medicine. But they are all based on open reading frames (ORFs) in a reference genome sequence (with allowance for introns). Individual genomes, however, are polymorphic: their sequences are not identical. There has been much research on how polymorphism affects previously-identified genes, but no research has been done on how it affects gene identification itself. We computationally predict protein-coding genes in a straightforward manner, by finding long ORFs in mRNA sequences aligned to the reference genome. We systematically test the effect of known polymorphisms with this procedure. Polymorphisms can not only disrupt ORFs, they can also create long ORFs that do not exist in the reference sequence. We found 5,737 putative protein-coding genes that do not exist in the reference, whose protein-coding status is supported by homology to known proteins. On average 10% of these genes are located in the genomic regions devoid of annotated genes in 12 other catalogs. Our statistical analysis showed that these ORFs are unlikely to occur by chance.
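To make concrete how a single polymorphism can create a long ORF that is absent from the reference, here is a minimal sketch. It is not the authors' procedure: it scans only the three forward reading frames of a short made-up sequence, and `longest_orf`, `apply_snp`, and the example sequence are hypothetical names and data introduced for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return the longest ATG-to-stop ORF (stop codon included) found in
    the three forward reading frames, or "" if there is none."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon of a new candidate ORF
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None  # look for the next ORF in this frame
    return best


def apply_snp(seq, pos, alt):
    """Return seq with the base at 0-based position pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


# Made-up reference sequence: an early TGA stop truncates the ORF.
reference = "ATGAAATGAGGGCCCTAA"
# A hypothetical G>C polymorphism at position 7 turns TGA into TCA.
variant = apply_snp(reference, 7, "C")

print(len(longest_orf(reference)), len(longest_orf(variant)))
```

In the variant the premature stop codon is removed, so the ORF extends to the downstream stop codon; the reverse substitution would disrupt the longer ORF in the same way.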

18.
Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark genome contains four putative Hox clusters, indicating that, unlike teleost fish genomes, it has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

19.
Pairwise comparison of whole plastid and draft nuclear genomic sequences of Arabidopsis thaliana and Oryza sativa L. ssp. indica shows that rice nuclear genomic sequences contain homologs of plastid DNA covering about 94 kb (83%) of the plastid genome and including one or more full-length intact (without mutations resulting in premature stop codons) homologues of 26 known protein-coding (KPC) plastid genes. By contrast, only about 20 kb (16%) of chloroplast DNA, including a single intact plastid-derived KPC gene, is present in the nucleus of A. thaliana. Sixteen rice plastid genes have at least one nuclear copy without any mutation or with only synonymous substitutions. Nuclear copies of the other ten plastid genes contain both synonymous and non-synonymous substitutions. Multiple ESTs for 25 out of 26 KPC genes were also found, as well as putative promoters for some of them. The study of substitution patterns shows that some of the nuclear homologues of plastid genes may be functional and/or under positive natural selection. A similar comparative analysis performed on rice chromosome 1 revealed 27 contigs containing plastid-derived sequences, totalling about 84 kb and covering two thirds of chloroplast DNA, with intact nuclear copies of 26 different KPC genes. One of these contigs, AP003280, includes almost 57 kb (45%) of the chloroplast genome with intact copies of 22 KPC genes. At the same time, we observed that the relative locations of homologues in plastid DNA and the nuclear genome are significantly different.

20.
At least 0.08% of the Apis mellifera nuclear genome contains sequences that originated from mitochondria. These nuclear copies of mitochondrial sequences (numts) are scattered all over the honeybee chromosomes and have originated by multiple independent insertions of mitochondrial DNA (mtDNA), as shown by phylogenetic analysis. Apart from original insertions, moderate duplications of numts also contributed to the present pattern and distribution of mitochondrial sequences in honeybee chromosomes. Assimilation of mitochondrial genes in the nuclear genome is mediated by extensive fragmentation of the original inserts. Replication slippage seems to be a major mechanism by which small sequences are inserted into or deleted from mtDNA destined for the nucleus. Most of the honeybee numts (84%) are located in nongenic regions. The majority (94%) of the numts that are located in predicted nuclear genes have originated from mitochondrial genes coding for cytochrome oxidase and NADH dehydrogenase subunits. On the other hand, the mitochondrial rRNA or tRNA gene sequences are predominantly (88%) located in nongenic regions of the genome. Evidence also supports the action of purifying selection on numts located in specific genes. Comparative analysis of numts of European, African, and Africanized honeybees suggests that numt evolution in A. mellifera is probably not demarcated by the speciation time frame but may be a continuous and dynamic process.
