Similar articles
 20 similar articles found (search time: 15 ms)
1.
2.
Venetis C  Theologidis I  Zouros E  Rodakis GC 《Gene》2007,406(1-2):79-90
Species of the marine mussel genus Mytilus are known to contain two mitochondrial genomes, one transmitted maternally (the F genome) and the other paternally (the M genome). The two genomes have diverged by more than 20% in DNA sequence. Here we present the complete sequence of a third genome, genome C, which we found in the sperm of a Mytilus galloprovincialis male. The coding part of the new genome resembles the F genome in sequence, differing from it by about 2% on average, but differs from the M genome by as much as the F differs from the M. Its major control region (CR) is more than three times larger than that of the F or the M genome and consists of repeated sequence domains of the CR of the M genome flanked by domains of the CR of the F genome. We present a sequence of events that reconstructs most parsimoniously the derivation of the C genome from the F and M genomes. The sequence consists of a duplication of CR elements of the M genome and subsequent insertion of these tandemly repeated elements in the F genome by recombination. The fact that the C genome was found as the only mitochondrial genome in the sperm of the male from which it was extracted suggests that it is transmitted paternally.

3.
Gluconacetobacter diazotrophicus PAl 5 is of agricultural significance due to its ability to provide fixed nitrogen to plants. Consequently, its genome sequence has been eagerly anticipated to enhance understanding of endophytic nitrogen fixation. Two groups have sequenced the PAl 5 genome from the same source (ATCC 49037), though the resulting sequences contain a surprisingly high number of differences. Therefore, an optical map of PAl 5 was constructed in order to determine which genome assembly more closely resembles the chromosomal DNA by aligning each sequence against a physical map of the genome. While one sequence aligned very well, over 98% of the second sequence contained numerous rearrangements. The many differences observed between these two genome sequences could be due to either assembly errors or rapid evolutionary divergence. The extent of the differences derived from sequence assembly errors could be assessed if the raw sequencing reads were provided by both genome centers at the time of genome sequence submission. Hence, a new genome sequence standard is proposed whereby the investigator supplies the raw reads along with the closed sequence so that the community can make more accurate judgments on whether differences observed in a single strain may be of biological origin or are simply caused by differences in genome assembly procedures.

4.
Sequence organization of the human genome   Total citations: 1 (self-citations: 0, citations by others: 1)
The organization of three sequence classes (single copy, repetitive, and inverted repeated sequences) within the human genome has been studied by renaturation techniques, hydroxylapatite binding methods, and DNA hyperchromism. Repetitive sequence classes are distributed throughout 80% or more of the genome. Slightly more than half of the genome consists of short single copy sequences, with a length of about 2 kb, interspersed with repetitive sequences. The average length of the repetitive sequences is also small and approximates the length of these sequences found in other organisms. The sequence organization of the human genome therefore resembles the sequence organization found in Xenopus and sea urchin. The inverted repeats are essentially randomly positioned with respect to both sequence class and sequence arrangement, so that all three sequence classes are found to be mutually interspersed in a portion of the genome.

5.
6.
The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC‐by‐BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high‐resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high‐resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome‐scale analysis of repetitive sequences and revealed a ~800‐kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone‐by‐clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC‐contig physical map and validate sequence assembly on a chromosome‐arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome‐by‐chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules.
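
For readers unfamiliar with the N50 statistic quoted in this abstract, the following is a minimal sketch of how it is commonly computed. The function name `n50` and the toy contig lengths are illustrative assumptions; they are not taken from the paper, whose actual map contains 371 contigs.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (hypothetical, in arbitrary units), not data from the abstract.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))  # prints 8
```

In the toy example the total length is 32, and the two longest contigs already sum to 16, so the N50 is 8.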

7.
The genome sequencing project has generated and will continue to generate enormous amounts of sequence data. Since the first complete genome sequence of the bacterium Haemophilus influenzae was published in 1995, the complete genome sequences of 2 eukaryotic and about 22 prokaryotic organisms have been determined. Given this ever-increasing amount of sequence information, new strategies are necessary to efficiently pursue the next phase of the genome project: the elucidation of gene expression patterns and gene product function on a whole genome scale. In order to assign functional information to the genome sequence, DNA chip technology was developed to efficiently identify the differential expression pattern of independent biological samples. The DNA chip provides a new tool for genome expression analysis that may revolutionize many aspects of human life, including new drug discovery and human disease diagnostics.

8.
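Recent advances in chicken genome research   Total citations: 1 (self-citations: 1, citations by others: 0)
牟彦双 (Mu Yanshuang)  李辉 (Li Hui) 《遗传》(Hereditas (Beijing)) 2006,28(5):617-622
The completion of the draft chicken genome sequence marks the arrival of the era of avian functional genomics. The chicken is not only a widely raised bird of major economic value worldwide but also a model animal of great value for life science research. The completion of the draft sequence will therefore have an important impact on genetic breeding and biological research. This article reviews recent progress in chicken genome research, covering chicken genome data, physical maps, genetic linkage maps, comparative genomics, expressed sequence tags, and bioinformatics, and discusses the prospects for applying the results of chicken genome research.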

9.
With the advent of DNA sequencing technologies, more and more reference genome sequences are available for many organisms. Analyzing sequence variation and understanding its biological importance are becoming a major research aim. However, how to store and process the huge amount of eukaryotic genome data, such as those of the human, mouse and rice, has become a challenge to biologists. Currently available bioinformatics tools used to compress genome sequence data have some limitations, such as the requirement of the reference single nucleotide polymorphisms (SNPs) map and information on deletions and insertions. Here, we present a novel compression tool for storing and analyzing Genome ReSequencing data, named GRS. GRS is able to process the genome sequence data without the use of the reference SNPs and other sequence variation information and automatically rebuild the individual genome sequence data using the reference genome sequence. When its performance was tested on the first Korean personal genome sequence data set, GRS was able to achieve ~159-fold compression, reducing the size of the data from 2986.8 to 18.8 MB. While being tested against the sequencing data from rice and Arabidopsis thaliana, GRS compressed the 361.0 MB rice genome data to 4.4 MB, and the A. thaliana genome data from 115.1 MB to 6.5 KB. This de novo compression tool is available at http://gmdd.shgmo.org/Computational-Biology/GRS.
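
The fold-compression figures in this abstract follow directly from the reported file sizes. The sketch below only redoes that arithmetic; the helper `compression_ratio` is a hypothetical name and is not part of GRS. The A. thaliana case is left out because the abstract mixes MB and KB without stating the unit base.

```python
def compression_ratio(original_size, compressed_size):
    """Return how many times smaller the compressed data is (same units)."""
    if compressed_size <= 0:
        raise ValueError("compressed size must be positive")
    return original_size / compressed_size


# Sizes in MB, as reported in the abstract.
korean = compression_ratio(2986.8, 18.8)  # Korean personal genome
rice = compression_ratio(361.0, 4.4)      # rice genome

print(f"Korean genome: ~{korean:.0f}-fold, rice: ~{rice:.0f}-fold")
```

The first ratio rounds to the ~159-fold figure quoted above; the rice data set works out to roughly 82-fold.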

10.
11.
A 190-kb mitochondrial DNA sequence interrupted by seven foreign DNA segments was identified in rice chromosome 12. This fragment is the largest mitochondrial fragment translocated into the rice nuclear genome. The sequence is composed of a 190-kb segment of mitochondrial origin corresponding to 38.79% of the mitochondrial genome, 45 kb comprising four segments of retrotransposon origin, and 13 kb comprising three segments of unknown origin. The 190-kb sequence shows more than 99.68% similarity to the current mitochondrial sequence, suggesting that its integration into the nucleus was quite recent. Several sequences in the 190-kb segment have been rearranged relative to the current mitochondrial sequence, suggesting that the past and present arrangements of the mitochondrial genome differ. The four retrotransposons show no mutual sequence similarity and are integrated into different locations, suggesting that their integration events were independent, frequent, and quite recent. A fragment of the mitochondrial genome present in the nuclear genome, such as the 248-kb sequence characterized in this study, is a good relic with which to investigate the past mitochondrial genome structure and the behavior of independent retrotransposons during evolution.

12.
Crop genome sequencing: lessons and rationales   Total citations: 1 (self-citations: 0, citations by others: 1)
2010 marks the 10th anniversary of the completion of the first plant genome sequence (Arabidopsis thaliana). Triggered by advancements in sequencing technologies, many crop genome sequences have been produced, with eight published since 2008. To date, however, only the rice (Oryza sativa) genome sequence has been finished to a quality level similar to that of the Arabidopsis sequence. This trend to produce draft genomes could affect the ability of researchers to address biological questions of speciation and recent evolution or to link sequence variation accurately to phenotypes. Here, we review the current crop genome sequencing activities, discuss how variability in sequence quality impacts utility for different studies and provide a perspective for a paradigm shift in selecting crops for sequencing in the future.

13.
Recent segmental and gene duplications in the mouse genome   Total citations: 2 (self-citations: 0, citations by others: 2)

Background

The high quality of the mouse genome draft sequence and its associated annotations are an invaluable biological resource. Identifying recent duplications in the mouse genome, especially in regions containing genes, may highlight important events in recent murine evolution. In addition, detecting recent sequence duplications can reveal potentially problematic regions of the genome assembly. We use BLAST-based computational heuristics to identify large (≥ 5 kb) and recent (≥ 90% sequence identity) segmental duplications in the mouse genome sequence. Here we present a database of recently duplicated regions of the mouse genome found in the mouse genome sequencing consortium (MGSC) February 2002 and February 2003 assemblies.

Results

We determined that 33.6 Mb of 2,695 Mb (1.2%) of sequence from the February 2003 mouse genome sequence assembly is involved in recent segmental duplications, which is less than that observed in the human genome (around 3.5-5%). From this dataset, 8.9 Mb (26%) of the duplication content consisted of 'unmapped' chromosome sequence. Moreover, we suspect that an additional 18.5 Mb of sequence is involved in duplication artifacts arising from sequence misassignment errors in this genome assembly. By searching for genes that are located within these regions, we identified 675 genes that mapped to duplicated regions of the mouse genome. Sixteen of these genes appear to have been duplicated independently in the human genome. From our dataset we further characterized a 42 kb recent segmental duplication of Mater, a maternal-effect gene essential for embryogenesis in mice.

Conclusion

Our results provide an initial analysis of the recently duplicated sequence and gene content of the mouse genome. Many of these duplicated loci, as well as regions identified to be involved in potential sequence misassignment errors, will require further mapping and sequencing to achieve accuracy. A Genome Browser database was set up to display the identified duplication content presented in this work. This data will also be relevant to the growing number of investigators who use the draft genome sequence for experimental design and analysis.
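
The thresholds and percentages reported in this abstract can be sketched as follows. This is a simplified illustration, not the authors' BLAST-based pipeline: the `Alignment` record and the toy hits are hypothetical, and only the cutoffs (≥ 5 kb, ≥ 90% identity) and the megabase totals come from the text.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical record for one pairwise alignment hit."""
    length_bp: int
    percent_identity: float


MIN_LENGTH_BP = 5_000   # "large" duplications: at least 5 kb
MIN_IDENTITY = 90.0     # "recent" duplications: at least 90% identity


def is_recent_segmental_duplication(hit):
    """Apply the two cutoffs stated in the abstract."""
    return (hit.length_bp >= MIN_LENGTH_BP
            and hit.percent_identity >= MIN_IDENTITY)


# Toy hits (made-up values, not data from the study).
hits = [Alignment(10_000, 95.0), Alignment(4_900, 99.0), Alignment(12_000, 88.0)]
kept = [h for h in hits if is_recent_segmental_duplication(h)]

# Percentages recomputed from the reported megabase totals.
duplicated_fraction = 33.6 / 2695 * 100   # share of the assembly
unmapped_share = 8.9 / 33.6 * 100         # share of the duplication content

print(len(kept), f"{duplicated_fraction:.1f}%", f"{unmapped_share:.0f}%")
```

Only the first toy hit passes both cutoffs, and the recomputed percentages agree with the 1.2% and 26% figures given in the Results.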

14.
The release of the complete genome sequence of the yeast Saccharomyces cerevisiae has ushered in a new phase of genome research in which sequence function will be assigned. The goal is to determine the biological function of each of the >6,000 open reading frames in the yeast genome. Innovative approaches have been developed that exploit the sequence data and yield information about gene expression levels, protein levels, subcellular localization and gene function for the entire genome.

15.
Nucleotide sequence comparisons were performed on a highly heterogeneous region of three human cytomegalovirus strains, Toledo, Towne, and AD169. The low-passage, virulent Toledo genome contained a DNA segment of approximately 13 kbp that was not found in the Towne genome and a segment of approximately 15 kbp that was not found in the AD169 genome. The Towne strain contained approximately 4.7 kbp of DNA that was absent from the AD169 genome, and only about half of this segment was present, arranged in an inverted orientation, in the Toledo genome. These additional sequences were located at the unique long (UL)/b' (IRL) boundary within the L component of the viral genome. A region representing nucleotides 175082 to 178221 of the AD169 genome was conserved in all three strains; however, substantial reduction in the size of the adjacent b' sequence was found. The additional DNA segment within the Toledo genome contained 19 open reading frames not present in the AD169 genome. The additional DNA segment within the Towne genome contained four new open reading frames, only one of which shared homology with the Toledo genome. This comparison was extended to five additional clinical isolates, and the additional Toledo sequence was conserved in all. These findings reveal a dramatic level of genome sequence complexity that may explain the differences that these strains exhibit in virulence and tissue tropism. Although the additional sequences have not altered the predicted size of the viral genome (230 to 235 kbp), a total of 22 new open reading frames (denoted UL133 to UL154), many of which have sequence characteristics of glycoproteins, are now defined as cytomegalovirus specific. Our work suggests that wild-type virus carries more than 220 genes, some of which are lost by large-scale deletion and rearrangement of the UL/b' region during laboratory passage.

16.
The draft genome sequence of a transgenic virus-resistant papaya marks the first genome sequence of a commercially important transgenic crop plant.

17.
Here we report the genome sequence of a plant growth-promoting rhizobacterium, Pseudomonas putida S11. The length of the draft genome sequence is approximately 5,970,799 bp, with a G+C content of 62.4%. The genome contains 6,076 protein-coding sequences.
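
As a reminder of what the G+C content figure in this abstract measures, here is a minimal sketch. The function `gc_content` and the eight-base toy sequence are illustrative assumptions and have nothing to do with the P. putida S11 sequence itself.

```python
def gc_content(sequence):
    """Return the percentage of G and C among the A/C/G/T bases of a sequence.

    Ambiguous symbols such as N are ignored rather than counted.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequence (hypothetical): 5 of its 8 bases are G or C.
print(gc_content("GGCCGATA"))  # prints 62.5
```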

18.
Sequencing-by-synthesis technologies can reduce the cost of generating de novo genome assemblies. We report a method for assembling draft genome sequences of eukaryotic organisms that integrates sequence information from different sources, and demonstrate its effectiveness by assembling an approximately 32.5 Mb draft genome sequence for the forest pathogen Grosmannia clavigera, an ascomycete fungus. We also developed a method for assessing draft assemblies using Illumina paired end read data and demonstrate how we are using it to guide future sequence finishing. Our results demonstrate that eukaryotic genome sequences can be accurately assembled by combining Illumina, 454 and Sanger sequence data.

19.
Structure of the 3' terminus of the hepatitis C virus genome.   Total citations: 10 (self-citations: 7, citations by others: 3)
Hepatitis C virus (HCV), a positive-strand RNA virus, has been considered to have a poly(U) stretch at the 3' terminus of the genome. We previously found a novel 98-nucleotide sequence downstream from the poly(U) stretch on the HCV genome by primer extension analysis of the 5' end of the antigenomic-strand RNA in infected liver (T. Tanaka, N. Kato, M.-J. Cho, and K. Shimotohno, Biochem. Biophys. Res. Commun. 215: 744-749, 1995). Here, we show that the novel sequence is a highly conserved 3' tail of the HCV genome. We repeated primer extension analyses with four HCV-infected liver samples and found the 98-nucleotide sequence in all the samples. Furthermore, experiments in which RNA oligonucleotide was ligated to the 3' end of the HCV genome existing in infectious serum revealed nearly identical 3' termini with no extra sequence downstream from the 98-nucleotide sequence, suggesting that this sequence is the tail of the HCV genome. This tail sequence was highly conserved among individuals and even between the two most genetically distant HCV types, II/1b and III/2a. Computer modeling predicted that the tail sequence can form a conserved stem-and-loop structure. These results suggest that the novel 3' tail is a common structure of the HCV genome that plays an important role in initiation of genomic replication.

20.
In higher eukaryotic cells, chromosomes are folded inside the nucleus. Recent advances in whole-genome mapping technologies have revealed the multiscale features of 3D genome organization that are intertwined with fundamental genome functions. However, DNA sequence determinants that modulate the formation of 3D genome organization remain poorly characterized. In the past few years, predicting 3D genome organization based on DNA sequence features has become an active area of research. Here, we review the recent progress in computational approaches to unraveling important sequence elements for 3D genome organization. In particular, we discuss the rapid development of machine learning-based methods that facilitate the connections between DNA sequence features and 3D genome architectures at different scales. While much progress has been made in developing predictive models for revealing important sequence features for 3D genome organization, new research is urgently needed to incorporate multi-omic data and enhance model interpretability, further advancing our understanding of gene regulation mechanisms through the lens of 3D genome organization.
