Similar literature
20 similar documents found (search time: 15 ms)
1.
2.
Satoshi Fukuchi  Ken Nishikawa 《DNA research》2004,11(4):219-31, 311-313
Genome annotation produces a considerable number of putative proteins lacking sequence similarity to known proteins. These are referred to as "orphans." The proportion of orphan genes varies among genomes, and is independent of genome size. In the present study, we show that the proportion of orphan genes roughly correlates with the isolation index of organisms (IIO), an indicator introduced in the present study, which represents the degree of isolation of a given genome as measured by sequence similarity. However, there are outlier genomes with respect to the linear correlation, consisting of those genomes that may contain excess amounts of orphan genes. Comparisons of genome sequences among closely related strains revealed that some of the annotated genes are not conserved, suggesting that they are ORFs occurring by chance. Exclusion of these non-conserved ORFs within closely related genomes improved the correlation between the proportion of orphan genes and the IIO values. Assuming that the correlation holds in general, this relationship was used to estimate the number of "authentic" orphan genes in a genome. Using this definition of authentic orphan genes, the anomalies arising from over-assignments, e.g., the percentages of structural annotations, were corrected for 16 genomes, including those of five archaea.

3.
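Wang Leyang  Huang Haiyan  Wu Qiang 《遗传 (Hereditas, Beijing)》2017,39(4):313-325
In genomes, coding regions contain many highly similar gene clusters or gene families (multi-copy genes), and non-coding regions also contain large numbers of repetitive sequences. These repetitive sequences can regulate gene transcription by altering the three-dimensional structure of chromosomes, and they play an important role in heredity and evolution. Their high degree of homology makes genome editing with CRISPR/Cas9 more complicated. If the edited fragment lies in a diploid or polyploid genome, the editing outcome can also differ from one chromatid to another. We therefore selected two highly homologous 300 bp fragments (L1 and L2), located 11 kb apart on the same chromosome, for CRISPR-mediated DNA fragment editing. A pair of sgRNAs (one targeting the upstream site and one the downstream site shared by both fragments) was used to guide Cas9 to cut the two highly similar DNA fragments in HepG2 cells. After the edited cells were subcloned into single-cell clones, the 22 resulting L1/L2-edited CRISPR clonal cell lines were genotyped in detail. In addition to deletion of the two DNA fragments themselves, the large fragment between them was also deleted in some cases, and various inversion combinations of the three fragments were frequent. These results provide an important reference for using the CRISPR/Cas9 system to edit multi-copy genes or repetitive sequences, especially when editing the genomes of diploid or polyploid organisms.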

4.
We have determined the three-dimensional (3D) architecture of the Caulobacter crescentus genome by combining genome-wide chromatin interaction detection, live-cell imaging, and computational modeling. Using chromosome conformation capture carbon copy (5C), we derive ~13 kb resolution 3D models of the Caulobacter genome. The resulting models illustrate that the genome is ellipsoidal with periodically arranged arms. The parS sites, a pair of short contiguous sequence elements known to be involved in chromosome segregation, are positioned at one pole, where they anchor the chromosome to the cell and contribute to the formation of a compact chromatin conformation. Repositioning these elements resulted in rotations of the chromosome that changed the subcellular positions of most genes. Such rotations did not lead to large-scale changes in gene expression, indicating that genome folding does not strongly affect gene regulation. Collectively, our data suggest that genome folding is globally dictated by the parS sites and chromosome segregation.

5.
The Anopheles gambiae genome project yielded almost complete sequences for the autosomes and for a large part of the X chromosome; however, no information for the Y chromosome was obtained. Yet, by design, fragmented Y chromosome sequences should be present in the resulting assembly. Here we report the search for Anopheles Y chromosome genes using a strategy successfully applied for identification of Y genes in Drosophila. A complete set of the unmapped scaffolds was targeted in a broad TBLASTN search using both A. gambiae predicted genes and all proteins from the nr database as query sequences. After filtering of the BLAST report, we selected 181 scaffolds possibly containing fragments of Y chromosome genes to experimentally test their Y-linkage. Surprisingly, none of the tested sequences appeared to originate from the Y chromosome. Several factors could account for the failure to detect Y genes, including their different organization in A. gambiae compared to Drosophila and the suboptimal quality of the assembly and annotation of the Anopheles genome. Regardless of the cause, our results illuminate problems associated with the genome analysis of outbred organisms.

6.
While genome sequencing is becoming ever more routine, genome annotation remains a challenging process. Identification of the coding sequences within the genomic milieu presents a tremendous challenge, especially for eukaryotes with their complex gene architectures. Here, we present a method to assist the annotation process through the use of proteomic data and bioinformatics. Mass spectra of digested protein preparations of the organism of interest were acquired and searched against a protein database created by a six-frame translation of the genome. The identified peptides were mapped back to the genome, compared to the current annotation, and then categorized as supporting or extending the current genome annotation. We named the classified peptides Expressed Peptide Tags (EPTs). The well-annotated bacterium Rhodopseudomonas palustris was used as a control for the method and showed a high degree of correlation between EPT mapping and the current annotation, with 86% of the EPTs confirming existing gene calls and less than 1% of the EPTs expanding on the current annotation. The eukaryotic plant pathogens Phytophthora ramorum and Phytophthora sojae, whose genomes have been recently sequenced and are much less well-annotated, were also subjected to this method. A series of algorithmic steps were taken to increase the confidence of EPT identification for these organisms, including generation of smaller subdatabases to be searched against, and definition of EPT criteria that accommodate the more complex eukaryotic gene architecture. As expected, the analysis of the Phytophthora species showed less correlation between EPT mapping and their current annotation. While approximately 76% of Phytophthora EPTs supported the current annotation, a portion of them (7.7% and 12.9% for P. ramorum and P. sojae, respectively) suggested modification to current gene calls or identified novel genes that were missed by the current genome annotation of these organisms.
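
The six-frame translation step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes the standard genetic code (NCBI translation table 1), marks incomplete or ambiguous codons as "X", and the helper names `six_frame_translation` and `find_peptide` are hypothetical, chosen only for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Map frame labels '+1'..'+3' and '-1'..'-3' to protein strings."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


def find_peptide(frames, peptide):
    """Hypothetical helper: list (frame, amino-acid index) hits for a peptide."""
    hits = []
    for frame, protein in frames.items():
        index = protein.find(peptide)
        while index != -1:
            hits.append((frame, index))
            index = protein.find(peptide, index + 1)
    return hits
```

For example, `six_frame_translation("ATGGCCTAA")["+1"]` gives `"MA*"`. A real search database would also split each frame at stop codons and record genomic coordinates so that identified peptides can be mapped back to the genome.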

7.
The parasitic nematode, Brugia malayi, causes lymphatic filariasis in humans, which in severe cases leads to the condition known as elephantiasis. The parasite contains an endosymbiotic alpha-proteobacterium of the genus Wolbachia that is required for normal worm development and fecundity and is also implicated in the pathology associated with infections by these filarial nematodes. Bacterial artificial chromosome libraries were constructed from B. malayi DNA and provide over 11-fold coverage of the nematode genome. Wolbachia genomic fragments were simultaneously cloned into the libraries giving over 5-fold coverage of the 1.1 Mb bacterial genome. A physical framework for the Wolbachia genome was developed by construction of a plasmid library enriched for Wolbachia DNA as a source of sequences to hybridise to high-density bacterial artificial chromosome colony filters. Bacterial artificial chromosome end sequencing provided additional Wolbachia probe sequences to facilitate assembly of a contig that spanned the entire genome. The Wolbachia sequences provided a marker approximately every 10 kb. Four rare-cutting restriction endonucleases were used to restriction map the genome to a resolution of approximately 60 kb and demonstrate concordance between the bacterial artificial chromosome clones and native Wolbachia genomic DNA. Comparison of Wolbachia sequences to public databases using BLAST algorithms under stringent conditions allowed confident prediction of 69 Wolbachia peptide functions and two rRNA genes. Comparison to closely related complete genomes revealed that while most sequences had orthologs in the genome of the Wolbachia endosymbiont from Drosophila melanogaster, there was no evidence for long-range synteny. Rather, there were a few cases of short-range conservation of gene order extending over regions of less than 10 kb. The molecular scaffold produced for the genome of the Wolbachia from B. malayi forms the basis of a genomic sequencing effort for this bacterium, circumventing the difficult challenge of purifying sufficient endosymbiont DNA from a tropical parasite for a whole genome shotgun sequencing strategy.

8.
The goal of this study was to determine to what extent the Aquificales are related to the epsilon-Proteobacteria. The genome sequence of several members of this group as well as the genome sequence of Aquifex aeolicus are available. In this study we used information extracted from those whole-genome sequences to gain further insights into the relationships between these organisms, including the fraction of shared putative orthologous protein-encoding genes, dinucleotide relative abundance values and the sequences of the 16S rRNA gene and 20 housekeeping genes. The results of our analyses show that it is not straightforward to come to a consistent picture of the phylogenetic position of the order Aquificales but our data clearly show that there is no particularly close relationship between A. aeolicus and the epsilon-Proteobacteria as (i) they do not share more genes with each other than do other distantly related organisms and (ii) they do not share significant sequence similarity in many macromolecules. In addition, there is considerable evidence that confirms the placement of the Aquificales near the root of the bacterial tree.
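
The abstract does not say how its dinucleotide relative abundance values were computed. A common definition is the odds ratio rho_xy = f_xy / (f_x * f_y), where f_x and f_y are mononucleotide frequencies and f_xy is the frequency of the dinucleotide xy. The sketch below assumes that definition and, as a simplification, works on a single strand only; published analyses often pool the sequence with its reverse complement first. The function name is hypothetical.

```python
from collections import Counter
from itertools import product


def dinucleotide_relative_abundance(seq):
    """Return rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides.

    Assumption: single-strand frequencies; overlapping dinucleotides are counted.
    Dinucleotides whose component bases are absent get NaN.
    """
    seq = seq.upper()
    mono_counts = Counter(seq)
    di_counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = len(seq)
    n_di = len(seq) - 1

    rho = {}
    for x, y in product("ACGT", repeat=2):
        f_x = mono_counts[x] / n_mono
        f_y = mono_counts[y] / n_mono
        f_xy = di_counts[x + y] / n_di
        rho[x + y] = f_xy / (f_x * f_y) if f_x and f_y else float("nan")
    return rho
```

Values near 1 indicate that a dinucleotide occurs about as often as its base composition predicts; comparing the 16-value profiles of two genomes gives a rough, alignment-free measure of how similar they are.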

9.
10.
All organisms contain transposons with the potential to disrupt and rearrange genes. Despite the presence of these destabilizing sequences, some genomes show remarkable stability over evolutionary time. Do bacteria defend the genome against disruption by transposons? Phage Mu replicates by transposition and virtually all genes are potential insertion targets. To test whether bacteria limit Mu transposition to specific parts of the chromosome, DNA arrays of Salmonella enterica were used to quantitatively measure target site preference and compare the data with Escherichia coli. Essential genes were as susceptible to transposon disruption as non-essential ones in both organisms, but the correlation of transposition hot spots among homologous genes was poor. Genes in highly transcribed operons were insulated from transposon mutagenesis in both organisms. A 10 kb cold spot on the pSLT plasmid was near parS, a site to which the ParB protein binds and spreads along DNA. Deleting ParB erased the plasmid cold spot, and an ectopic parS site placed in the Salmonella chromosome created a new cold spot in the presence of ParB. Our data show that competition between cellular proteins and transposition proteins on plasmids and the chromosome is a dominant factor controlling the genetic footprint of transposons in living cells.

11.
All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
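
To make the idea of scanning for di- to penta-nucleotide repeats concrete, here is a minimal regex-based sketch. It is not the SPUTNIK algorithm, which scores imperfect repeats; it only finds perfect tandem repeats, and the function name and the `min_repeats` threshold are assumptions made for illustration.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=5, min_repeats=3):
    """Find perfect tandem repeats with unit lengths min_unit..max_unit.

    Returns a sorted list of (start, unit, repeat_count) tuples, 0-based start.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-base unit followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are repeats of a shorter motif, e.g. "ATAT" or "AAA".
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)
```

For example, `find_ssrs("GGCACACACATTAGCAGCAGCAGT")` returns `[(2, "CA", 4), (12, "AGC", 3)]`. Running the same scan separately on coding and non-coding regions would give the kind of per-region repeat counts that the abstract compares.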

12.
Chromosomes are intricately folded and packaged in the cell nucleus and interact with the nuclear envelope. This complex nuclear architecture has a profound effect on how the genome works and how the cells function. The main goal of this review is to highlight recent studies on the effect of chromosome–nuclear envelope interactions on chromatin folding and function in the nucleus. The data obtained suggest that chromosome–nuclear envelope attachments are important for the organization of nuclear architecture in various organisms. A combination of experimental cell biology methods with computational modeling offers a unique opportunity to explore the fundamental relationships between different aspects of 3D genome organization in greater detail. This powerful interdisciplinary approach could reveal how the organization and function of the genome in the nuclear space is affected by the chromosome–nuclear envelope attachments and will enable the development of novel approaches to regulate gene expression.

13.
14.
The phenotypic effects of random mutations depend on both the architecture of the genome and the gene-trait relationships. Both levels thus play a key role in the mutational variability of the phenotype, and hence in the long-term evolutionary success of the lineage. Here, by simulating the evolution of organisms with flexible genomes, we show that the need for an appropriate phenotypic variability induces a relationship between the deleteriousness of gene mutations and the quantity of non-coding sequences maintained in the genome. The more deleterious the gene mutations, the shorter the intergenic sequences. Indeed, in a shorter genome, fewer genes are affected by rearrangements (duplications, deletions, inversions, translocations) at each replication, which compensates for the higher impact of each gene mutation. This spontaneous adjustment of genome structure allows the organisms to retain the same average fitness loss per replication, despite the higher impact of single gene mutations. These results show how evolution can generate unexpected couplings between distinct organization levels.

15.
The position of a gene in the genome may have important consequences for its function. Therefore, when a new duplicate gene arises, its location may be critical in determining its fate. Our recent work in humans, mouse, and Drosophila provided a test by studying the patterns of duplication in sex chromosome evolution. We revealed a bias in the generation and recruitment of new gene copies involving the X chromosome that has been shaped largely by selection for male germline functions. The gene movement patterns we observed reflect an ongoing process as some of the new genes are very young while others were present before the divergence of humans and mouse. This suggests a continuing redistribution of male-related genes to achieve a more efficient allocation of male functions. This notion should be further tested in organisms employing other sex determination systems or in organisms differing in germline X chromosome inactivation. It is likely that the selective forces that were detected in these studies are also acting on other types of duplicate genes. As a result, future work elucidating sex chromosome differentiation by other mutational mechanisms will shed light on this important process.

16.
One of the common features of bacterial genomes is a strong compositional asymmetry between differently replicating DNA strands (leading and lagging). The main cause of the observed bias is the mutational pressure associated with replication. This suggests that genes translocated between differently replicating DNA strands are subjected to a higher mutational pressure, which may influence their composition and divergence rate. Analyses of groups of completely sequenced bacterial genomes have revealed that the highest divergence rate is observed for the DNA sequences that in closely related genomes are located on different DNA strands with respect to their role in replication. Paradoxically, for this group of sequences the absolute values of divergence rate are higher for closely related species than for more diverged ones. Since this effect concerns only the specific group of orthologs, there must be a specific mechanism introducing bias into the structure of the chromosome by enriching the set of homologs in trans position in newly diverged species in relatively highly diverged sequences. These highly diverged sequences may be of varied nature: (1) paralogs or other fast-evolving genes under weak selection; or (2) pseudogenes that will probably be eliminated from the genome during further evolution; or (3) genes whose history after divergence is longer than the history of the genomes in which they are found. The use of these highly diverged sequences for phylogenetic analyses may influence the topology and branch length of phylogenetic trees. The changing mutational pressure may also contribute to the emergence of genes with new functions.

17.
《Fly》2013,7(3):192-204
We used the Illumina reversible-short sequencing technology to obtain 17-fold average depth (s.d.~8) of ~94% of the euchromatic genome and ~1-5% of the heterochromatin sequence of the Drosophila melanogaster isogenic strain w1118; iso-2; iso-3. We show that this strain has a ~9 kb deletion that uncovers the first exon of the white (w) gene, ~4 kb of downstream promoter sequences, and most of the first intron, thus demonstrating that whole-genome sequencing can be used for mutation characterization. We chose this strain because there are thousands of transposon insertion lines and hundreds of isogenic deficiency lines available with this genetic background, such as the Exelixis, Inc., and the DrosDEL collections. We compared our sequence to Release 5 of the finished reference genome sequence which was made from the isogenic strain y1; cn1 bw1 sp1 and identified ~356,614 candidate SNPs in the ~117 Mb unique sequence genome, which represents a substitution rate of ~1/305 nucleotides (~0.30%). The distribution of SNPs is not uniform, but rather there is a ~2-fold increase in SNPs on the autosome arms compared with the X chromosome and a ~7-fold increase when compared to the small 4th chromosome. This is consistent with previous analyses that demonstrated a correlation between recombination frequency and SNP frequency. An unexpected finding was a SNP hotspot in a ~20Mb central region of the 4th chromosome, which might indicate higher than expected recombination frequency in this region of this chromosome. Interestingly, genes involved in sensory perception are enriched in SNP hotspots and genes encoding developmental genes are enriched in SNP coldspots, which suggests that recombination frequencies might be proportional to the evolutionary selection coefficient. There are currently 12 Drosophila species sequenced, and this represents one of many isogenic Drosophila melanogaster genome sequences that are in progress. Because of the dramatic increase in power in using isogenic lines rather than outbred individuals, the SNP information should be valuable as a test bed for understanding genotype-by-environment interactions in human population studies.

18.
We examined evolutionary mechanisms in the tetraploid Elymus caninus by comparing the phylogenetic relationships of 21 accessions suggested by sequence data from two single copy nuclear genes, the largest subunit of RNA polymerase II (RPB2) and phosphoenolpyruvate carboxylase (pepC), and one non-coding chloroplast region, TrnD/T. Elymus caninus is known to combine two different genomes, an St genome and an H genome. Data from two single copy nuclear genes showed that there are two versions of the St genome in the species, St1 and St2. Most accessions combined one of these versions with an H genome version but two accessions had both versions of the St sequence for RPB2. This suggests that the RPB2 gene may have been duplicated without chromosome doubling, possibly induced by a transposable element. Our data also indicate that the H genome sequences in E. caninus have multiple origins, and reveal a close phylogenetic relationship between Hordeum bogdanii and H sequences in some accessions of E. caninus. Thus, it is more likely that H. bogdanii is one of the major donors of the H copy in E. caninus. The maternal origin of E. caninus is the St genome species. There was no correlation between the geographic origin of the accessions and their sequence divergence.

19.
The availability of sequenced bacterial genomes allows a deeper understanding of their organizational features that are related to fundamental cellular processes such as coordinated gene expression, chromosome replication and cell division. Nevertheless, recent genome comparisons and experimental work highlighted the fluidity of bacterial chromosomes, including genome rearrangements that imperil the selective features of chromosome order. As a result, the clash between elements generating rearrangements and chromosome organization is a classic case of evolutionary conflict.

20.