Similar literature
1.
Shigella flexneri, which causes shigellosis in humans, evolved from Escherichia coli. The sequencing of Shigella genomes has revealed that a large number of insertion sequence (IS) elements (over 200 elements) reside in the genome. Although the presence of these elements has been noted previously and summarized, more detailed analyses are required to understand their evolutionary significance. Here, the genome of S. flexneri strain 2457T is used to investigate the spatial distribution of IS copies around the chromosome and the location of elements with respect to genes. It is found that most IS isoforms occur essentially randomly around the genome. Two exceptions are IS91 and IS911, which appear to cluster due to local hopping. The location of IS elements with respect to genes is biased, however, revealing the action of natural selection. The non-coding regions of the genome (no more than 21%) carry disproportionately more IS elements (at least 28%) than the coding regions, implying that selection acts against insertion into genes. Of the genes disrupted by ISs, those involved in signal transduction, intracellular trafficking, and cell motility are most commonly targeted, suggesting selection against genes in these categories.
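The enrichment quoted in this abstract can be made concrete with a little arithmetic. The sketch below is an illustration only and is not the authors' analysis: it takes the two bounds given above (non-coding regions at most 21% of the genome, carrying at least 28% of IS elements) and ASSUMES exactly 200 elements, since the abstract only says "over 200". The function names are hypothetical.

```python
from math import comb


def fold_enrichment(observed_fraction: float, expected_fraction: float) -> float:
    """Ratio of the observed share of insertions to the share expected by chance."""
    return observed_fraction / expected_fraction


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


NONCODING_FRACTION = 0.21  # upper bound quoted in the abstract
IS_IN_NONCODING = 0.28     # lower bound quoted in the abstract
N_ELEMENTS = 200           # ASSUMPTION: the abstract says "over 200 elements"

enrichment = fold_enrichment(IS_IN_NONCODING, NONCODING_FRACTION)
k_noncoding = round(IS_IN_NONCODING * N_ELEMENTS)

# Probability of seeing at least this many non-coding insertions
# if insertions landed uniformly at random along the genome.
p_value = binomial_upper_tail(k_noncoding, N_ELEMENTS, NONCODING_FRACTION)

print(f"fold enrichment: {enrichment:.2f}")
print(f"non-coding insertions (assumed n={N_ELEMENTS}): {k_noncoding}")
print(f"one-sided binomial p-value: {p_value:.4f}")
```

Because the inputs are bounds rather than exact counts, the result should be read as a rough indication of the bias, not as a reproduction of the published statistics.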

2.
3.
Rheumatoid arthritis (RA) is an autoimmune disease, the pathogenesis of which is affected by multiple genetic and environmental factors. To understand the genetic and molecular basis of RA, a large number of quantitative trait loci (QTL) that regulate experimental autoimmune arthritis have been identified using various rat models for RA. However, identifying the particular responsible genes within these QTL remains a major challenge. Using currently available genome data and gene annotation information, we systematically examined RA-associated genes and polymorphisms within and outside QTL over the whole rat genome. By the whole genome analysis of genes and polymorphisms, we found that there are significantly more RA-associated genes in QTL regions than in non-QTL regions. Further experimental studies are necessary to determine whether these known RA-associated genes or polymorphisms are genetic components causing the QTL effect.

4.
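Liu Yuping  Lü Ting  Zhu Di  Zhou Yonghui  Liu Tao  Su Xu 《植物研究》 (Bulletin of Botanical Research) 2018,38(4):518-525
Littledalea tibetica is a perennial alpine endemic species of considerable ecological value in the tribe Bromeae (Poaceae), distributed mainly on the Qinghai-Tibet Plateau and in adjacent regions. In this study, the chloroplast genome of L. tibetica was sequenced with Illumina MiSeq technology on a second-generation high-throughput sequencing platform, establishing for the first time a standard sequencing workflow for species of Bromeae. The chloroplast genome sequence of a close relative, Lolium perenne, was used as the reference for assembly. The results show that the chloroplast genome of L. tibetica is 136 852 bp long, has a GC content of 38.5%, and has the typical quadripartite structure: the large single-copy (LSC) and small single-copy (SSC) regions are 80 970 bp and 12 876 bp, respectively, and each inverted repeat (IR) region is 21 503 bp. A total of 141 genes were annotated, comprising 95 protein-coding genes, 38 tRNA genes, and 8 rRNA genes, located mainly in the LSC and SSC regions. In addition, a phylogenetic tree built from the complete chloroplast genome sequences of L. tibetica and 30 other Poaceae species shows that L. tibetica is closely related to Triticeae plants within the subfamily Pooideae.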

5.
Mapping DNase-I hypersensitive sites on human isochores   Cited by: 3 (self-citations: 0; citations by others: 3)
Di Filippo M  Bernardi G 《Gene》2008,419(1-2):62-65
Mapping DNase-I hypersensitive sites (HS) was used in the past to identify regulatory elements of specific genes. More recently, thousands of HS were identified in the human genome by using high-throughput methods. These approaches showed a general enrichment of HS near or within known genes, within CpG islands, within human-mouse conserved regions and in GC-rich regions of the genome. Here we show that HS: (i) are characterized by a much higher GC level (approximately 56%) than the average GC level of the human genome (approximately 41%); (ii) are overwhelmingly located in the GC-richest compartment of the genome, which is predominantly associated with an open chromatin structure; and (iii) are slightly more frequent than genes in the gene-rich isochore families and slightly less frequent than genes in the gene-poor ones.
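The GC levels compared in this abstract (about 56% for HS against about 41% genome-wide) are plain base-composition fractions. A minimal Python sketch of how such a value can be computed for a sequence, and for fixed-size windows along it, follows. The function names and the choice to ignore ambiguous bases such as N are assumptions made for illustration; the abstract does not describe the authors' exact procedure.

```python
from typing import List


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # no unambiguous bases; treated as 0 by convention here
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq: str, window: int) -> List[float]:
    """GC fraction of consecutive non-overlapping windows; a short tail is dropped."""
    return [
        gc_fraction(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, window)
    ]


if __name__ == "__main__":
    toy = "GGGGCCCCAATTAATT"  # toy sequence, not real data
    print(gc_fraction(toy))
    print(windowed_gc(toy, 8))
```

Comparing the value for a set of HS sequences with the genome-wide value is then a matter of applying `gc_fraction` to each set.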

6.
The mouse genome has undergone extensive chromosome rearrangement relative to the human genome since these species last shared a common ancestor. One possible consequence of these rearrangements is the deletion of genes that are located within evolutionary breakpoint regions. In this article, we present evidence of four human genes (COL21A1, STK17A, GPR145 and ARHI) that are located in regions corresponding to evolutionary breakpoints in rodents and lack mouse and rat orthologues. We propose that "evolutionary breakpoint-associated gene deletion" is an unexpected consequence of evolutionary chromosome rearrangement, and we describe a novel mechanism through which genes can be lost during evolution.

7.
The complete arrangement of genes in the mitochondrial (mt) genome is known for 12 species of insects, and part of the gene arrangement in the mt genome is known for over 300 other species of insects. The arrangement of genes in the mt genome is highly conserved in the insects studied, since all of the protein-coding and rRNA genes and most of the tRNA genes are arranged in the same way. We sequenced the entire mt genome of the wallaby louse, Heterodoxus macropus, which is 14,670 bp long and has the 37 genes typical of animals and some noncoding regions. The largest noncoding region is 73 bp long (93% A+T), and the second largest is 47 bp long (92% A+T). Both of these noncoding regions seem to be able to form stem-loop structures. The arrangement of genes in the mt genome of this louse is unlike that of any other animal studied. All tRNA genes have moved and/or inverted relative to the ancestral gene arrangement of insects, which is present in the fruit fly Drosophila yakuba. At least nine protein-coding genes (atp6, atp8, cox2, cob, nad1-nad3, nad5, and nad6) have moved; moreover, four of these genes (atp6, atp8, nad1, and nad3) have inverted. The large number of gene rearrangements in the mt genome of H. macropus is unprecedented for an arthropod.

8.
Whether higher-order chromatin organization is related to genome stability over evolutionary time remains elusive. We find that regions of conserved gene order across the genus Drosophila are larger if they harbor genes bound by B-type lamin (Lam) and Suppressor of Under-Replication (SUUR), two proteins located at the nuclear periphery. Low recombination rates and coexpression of genes in regions of conserved gene order do not explain the lower probability of disruption in these regions by genome rearrangements. Instead, we find a significant colocalization between evolutionarily stable genomic regions associated with Lam and sequences thought to regulate local gene expression, which have the potential to impose constraints on genome rearrangement. At least in the genus Drosophila, localization of particular genomic regions at the nuclear periphery is intimately associated with their long-term integrity during evolution.

9.
Yeramian E 《Gene》2000,255(2):151-168
A gene identification procedure is formulated, based on large-scale structural analyses of genomic sequences. The structural property is the physical (thermal) stability of the DNA double-helix, as described by the classical helix-coil model. The analyses are detailed for the Plasmodium falciparum genome, which represents one of the most difficult cases for the gene identification problem (notably because of the extreme AT-richness of the genome). In this genome, the coding domains (either uninterrupted genes or exons in split genes) are accurately identified as regions of high thermal stability. The conclusion is based on the study of the available cloned genes, of which 17 examples are described in detail. These examples demonstrate that the physical criterion is valid for the detection of coding regions whose lengths extend from a few base pairs up to several thousand base pairs. Accordingly, the structural analyses can provide a powerful and convenient tool for the identification of complex genes in the P. falciparum genome. The limits of such a scheme are discussed. The gene identification procedure is applied to the completely sequenced chromosomes (2 and 3), and the results are compared with the database annotations. The structural analyses suggest more or less extensive revision to the annotations, and also allow new putative genes to be identified in the chromosome sequences. Several examples of such new genes are described in detail.

10.
The complete sequence of the mitochondrial genome of Leptorhynchoides thecatus (Acanthocephala) was determined, and a phylogenetic analysis was carried out to determine its placement within Metazoa. The genome is circular, 13,888 bp, and contains at least 36 of the 37 genes typically found in animal mitochondrial genomes. The genes for the large and small ribosomal RNA subunits are shorter than those of most metazoans, and the structures of most of the tRNA genes are atypical. There are two significant noncoding regions (377 and 294 bp), which are the best candidates for a control region; however, these regions do not appear similar to any of the control regions of other animals studied to date. The amino acid and nucleotide sequences of the protein coding genes of L. thecatus and 25 other metazoan taxa were used in both maximum likelihood and maximum parsimony phylogenetic analyses. Results indicate that among taxa with available mitochondrial genome sequences, Platyhelminthes is the closest relative to L. thecatus, which together are the sister taxon of Nematoda; however, long branches and/or base composition bias could be responsible for this result. The monophyly of Ecdysozoa, molting organisms, was not supported by any of the analyses. This study represents the first mitochondrial genome of an acanthocephalan to be sequenced and will allow further studies of systematics, population genetics, and genome evolution. Reviewing Editor: Dr. Rafael Zardoya. The entire genome sequence has been deposited with the GenBank Data Libraries under accession number AY562383.

11.
Organisms are remarkably adapted to diverse environments by specialized metabolisms, morphology, or behaviors. To address the molecular mechanisms underlying environmental adaptation, we have utilized a Drosophila melanogaster line, termed "Dark-fly", which has been maintained in constant dark conditions for 57 years (1400 generations). We found that Dark-fly exhibited higher fecundity in dark than in light conditions, indicating that Dark-fly possesses some traits advantageous in darkness. Using next-generation sequencing technology, we determined the whole genome sequence of Dark-fly and identified approximately 220,000 single nucleotide polymorphisms (SNPs) and 4,700 insertions or deletions (InDels) in the Dark-fly genome compared to the genome of the Oregon-R-S strain, a control strain. Of these SNPs, 1.8% were classified as non-synonymous SNPs (nsSNPs: i.e., they alter the amino acid sequence of gene products). Among them, we detected 28 nonsense mutations (i.e., they produce a stop codon in the protein sequence) in the Dark-fly genome. These included genes encoding an olfactory receptor and a light receptor. We also searched for runs of homozygosity (ROH) as putative regions selected during the population history, and found 21 ROH regions in the Dark-fly genome. We identified 241 genes carrying nsSNPs or InDels in the ROH regions. These include a cluster of alpha-esterase genes that are involved in detoxification processes. Furthermore, analysis of structural variants in the Dark-fly genome showed the deletion of a gene related to fatty acid metabolism. Our results revealed unique features of the Dark-fly genome and provided a list of potential candidate genes involved in environmental adaptation.
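The abstract does not say how the ROH regions were called. As a rough illustration of the idea only, the Python sketch below scans variant sites along one chromosome and reports stretches of consecutive homozygous sites that span at least a chosen number of base pairs. The input layout, the function name `find_roh`, and the threshold are all assumptions; published ROH callers typically also tolerate occasional heterozygous calls and account for marker density.

```python
from typing import List, Tuple


def find_roh(sites: List[Tuple[int, bool]], min_span: int) -> List[Tuple[int, int]]:
    """Return (start, end) positions of runs of homozygous sites.

    sites: (position, is_heterozygous) pairs sorted by position.
    min_span: minimum distance in bp between first and last site of a run.
    """
    runs: List[Tuple[int, int]] = []
    start = end = None
    for position, is_heterozygous in sites:
        if is_heterozygous:
            # A heterozygous site closes the current run, if any.
            if start is not None and end - start >= min_span:
                runs.append((start, end))
            start = end = None
        else:
            if start is None:
                start = position
            end = position
    # Close a run that extends to the last site.
    if start is not None and end - start >= min_span:
        runs.append((start, end))
    return runs


if __name__ == "__main__":
    # Toy data, not taken from the Dark-fly study.
    toy_sites = [
        (100, False), (200, False), (300, True),
        (400, False), (1500, False), (1600, False), (1700, True),
    ]
    print(find_roh(toy_sites, min_span=1000))
```

In the toy data the first homozygous stretch is too short to count, and the second (positions 400 to 1600) is reported.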

12.
13.
Jabbari K  Bernardi G 《Gene》2000,247(1-2):287-292
In the present work we show that in the Drosophila genome (which covers a 37-51% GC range at a DNA size of approx. 50 kb) a linear correlation holds between the GC (or GC3) levels of coding sequences and the GC levels of the long (approx. 50 kb) genomic sequences embedding them. This correlation allows us to position the two compositional distributions of (a) coding sequences and (b) long DNA segments relative to each other and to calculate gene concentration across the compositional range of the Drosophila genome. Using this approach, we show that gene concentration increases with increasing GC of the regions embedding the genes, reaching a 7-fold higher level in the GC-richest regions compared with the GC-poorest regions. The gene distribution of the Drosophila genome is, therefore, similar to (although less striking than) that of the human genome, whereas it is very different from that of the Arabidopsis genome, which has about the same size as the Drosophila genome.

14.
15.
Complete structure of the chloroplast genome of Arabidopsis thaliana.   Cited by: 7 (self-citations: 0; citations by others: 7)
The complete nucleotide sequence of the chloroplast genome of Arabidopsis thaliana has been determined. The genome is a circular DNA of 154,478 bp containing a pair of inverted repeats of 26,264 bp each, which are separated by small and large single-copy regions of 17,780 bp and 84,170 bp, respectively. A total of 87 potential protein-coding genes including 8 genes duplicated in the inverted repeat regions, 4 ribosomal RNA genes and 37 tRNA genes (30 gene species) representing 20 amino acid species were assigned to the genome on the basis of similarity to the chloroplast genes previously reported for other species. The translated amino acid sequences from the respective potential protein-coding genes showed 63.9% to 100% sequence similarity to those of the corresponding genes in the chloroplast genome of Nicotiana tabacum, indicating the occurrence of significant diversity in the chloroplast genes between the two dicot plants. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
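The region sizes quoted here can be checked arithmetically: in a quadripartite chloroplast genome the total length equals the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. A minimal Python sketch follows, using the figures in this abstract and, for comparison, those given for Littledalea tibetica in entry 4. The function name is illustrative and not taken from either paper.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular genome with LSC, SSC and two IR copies (bp)."""
    return lsc + ssc + 2 * ir


# Arabidopsis thaliana, sizes quoted in this abstract
arabidopsis_total = quadripartite_length(lsc=84_170, ssc=17_780, ir=26_264)

# Littledalea tibetica, sizes quoted in entry 4
littledalea_total = quadripartite_length(lsc=80_970, ssc=12_876, ir=21_503)

print(arabidopsis_total)   # 154478, matching the reported 154,478 bp
print(littledalea_total)   # 136852, matching the reported 136 852 bp
```

Both totals agree with the reported genome lengths, which confirms that the inverted repeat size in each abstract refers to a single copy.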

16.

Background

The different regions of a genome do not evolve at the same rate. For example, comparative genomic studies have suggested that the sex chromosomes and the regions harbouring the immune defence genes in the Major Histocompatibility Complex (MHC) may evolve faster than other genomic regions. The advent of next-generation sequencing technologies has made it possible to study which genomic regions are evolutionarily liable to change and which are static, and has enabled an increasing number of genome studies of non-model species. However, de novo sequencing of the whole genome of an organism remains non-trivial. In this study, we present the draft genome of the black grouse, which was developed using a reference-guided assembly strategy.

Results

We generated 133 Gbp of sequence data from one black grouse individual on the SOLiD platform and used a combination of de novo assembly and mapping to the chicken reference genome to assemble the reads into 4572 scaffolds with a total length of 1022 Mb. The draft genome covers well the main chicken chromosomes 1-28 and Z, which have a total length of 1001 Mb. The draft genome is fragmented but has good coverage of the homologous chicken genes. In particular, 33.0% of the coding regions of the homologous genes have more than 90% of their sequence covered. In addition, we identified ~1 M SNPs in the genome and 106 genomic regions with high nucleotide divergence between black grouse and chicken or between black grouse and turkey.

Conclusions

Our results support the hypothesis that the X (Z) chromosome evolves faster than the autosomes, and our data are consistent with the MHC regions being more liable to change than the genome average. Our study demonstrates how a moderate sequencing effort can be combined with existing genome references to generate a draft genome for a non-model species.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-180) contains supplementary material, which is available to authorized users.

17.
18.
A collection of 9,990 single-pass nuclear genomic sequences, corresponding to 5 Mb of tomato DNA, was obtained using a methylation filtration (MF) strategy and reduced to 7,053 unique undermethylated genomic islands (UGIs) distributed as follows: (1) 59% non-coding sequences, (2) 28% coding sequences, (3) 12% transposons, 96% of which are class I retroelements, and (4) 1% organellar sequences integrated into the nuclear genome over the past approximately 100 million years. A more detailed analysis of coding UGIs indicates that the unmethylated portion of tomato genes extends as far as 676 bp upstream and 766 bp downstream of coding regions with an average of 174 and 171 bp, respectively. Based on the analysis of the UGI copy distribution, the undermethylated portion of the tomato genome is determined to account for the majority of the unmethylated genes in the genome and is estimated to constitute 61±15 Mb of DNA (~5% of the entire genome), which is significantly less than the 220 Mb estimated for gene-rich euchromatic arms of the tomato genome. This result indicates that, while most genes reside in the euchromatin, a significant portion of euchromatin is methylated in the intergenic spacer regions. Implications of the results for sequencing the genome of tomato and other solanaceous species are discussed.

19.
The honeybee (Apis mellifera) has a genome with a wide variation in GC content showing 2 clear modal GC values, in some ways reminiscent of an isochore-like structure. To gain insight into causes and consequences of this pattern, we used a comparative approach to study the genome-wide alignment of primarily coding sequence of A. mellifera with Drosophila melanogaster and Anopheles gambiae. The latter 2 species show a higher average GC content than A. mellifera and no indications of bimodality, suggesting that the GC-poor mode is a derived condition in honeybee. In A. mellifera, synonymous sites of genes generally adopt the GC content of the region in which they reside. A large proportion of genes in GC-poor regions have not been assigned to the honeybee assembly because of the low sequence complexity of their genome neighborhood. The synonymous substitution rate between A. mellifera and the other species is very close to saturation, but analyses of nonsynonymous substitutions as well as amino acid substitutions indicate that the GC-poor regions are not evolving faster than the GC-rich regions. We describe the codon usage and amino acid usage and show that they are remarkably heterogeneous within the honeybee genome between the 2 different GC regions. Specifically, the genes located in GC-poor regions show a much larger deviation in both codon usage bias and amino acid usage from the Dipterans than the genes located in the GC-rich regions.

20.