Similar articles
20 similar articles found.
1.

Background and Aims

Peanut (Arachis hypogaea) is an allotetraploid (AABB-type genome) of recent origin, with a genome of about 2·8 Gb and a high repetitive content. This study reports an analysis of the repetitive component of the peanut A genome using bacterial artificial chromosome (BAC) clones from A. duranensis, the most probable A genome donor, and the probable consequences of the activity of these elements since the divergence of the peanut A and B genomes.

Methods

The repetitive content of the A genome was analysed by using A. duranensis BAC clones as probes for fluorescence in situ hybridization (BAC-FISH), and by sequencing and characterization of 12 genomic regions. For the analysis of the evolutionary dynamics, two A genome regions were compared with their B genome homeologues.

Key Results

BAC-FISH using 27 A. duranensis BAC clones as probes gave dispersed and repetitive DNA characteristic signals, predominantly in interstitial regions of the peanut A chromosomes. The sequences of 14 BAC clones showed complete and truncated copies of ten abundant long terminal repeat (LTR) retrotransposons, characterized here. Almost all dateable transposition events occurred <3·5 million years ago, the estimated date of the divergence of A and B genomes. The most abundant retrotransposon is Feral, apparently parasitic on the retrotransposon FIDEL, followed by Pipa, also non-autonomous and probably parasitic on a retrotransposon we named Pipoka. The comparison of the A and B genome homeologous regions showed conserved segments of high sequence identity, punctuated by predominantly indel regions without significant similarity.

Conclusions

A substantial proportion of the highly repetitive component of the peanut A genome appears to be accounted for by relatively few LTR retrotransposons and their truncated copies or solo LTRs. The most abundant of the retrotransposons are non-autonomous. The activity of these retrotransposons has been a very significant driver of genome evolution since the evolutionary divergence of the A and B genomes.

2.
The Poales (includes the grasses) and Asparagales [includes onion (Allium cepa L.) and asparagus (Asparagus officinalis L.)] are the two most economically important monocot orders. The Poales are a member of the commelinoid monocots, a group of orders sister to the Asparagales. Comparative genomic analyses have revealed a high degree of synteny among the grasses; however, it is not known if this synteny extends to other major monocot groups such as the Asparagales. Although we previously reported no evidence for synteny at the recombinational level between onion and rice, microsynteny may exist across shorter genomic regions in the grasses and Asparagales. We sequenced nine asparagus BACs to reveal physically linked genic-like sequences and determined their most similar positions in the onion and rice genomes. Four of the asparagus BACs were selected using molecular markers tightly linked to the sex-determining M locus on chromosome 5 of asparagus. These BACs possessed only two putative coding regions and had long tracts of degenerated retroviral elements and transposons. Five asparagus BACs were selected after hybridization of three onion cDNAs that mapped to three different onion chromosomes. Genic-like sequences that were physically linked on the cDNA-selected BACs or genetically linked on the M-linked BACs showed significant similarities (e < −20) to expressed sequences on different rice chromosomes, revealing no evidence for microsynteny between asparagus and rice across these regions. Genic-like sequences that were linked in asparagus were used to identify highly similar (e < −20) expressed sequence tags (ESTs) of onion. These onion ESTs mapped to different onion chromosomes and no relationship was observed between physical or genetic linkages in asparagus and genetic linkages in onion. 
These results further indicate that synteny among grass genomes does not extend to a sister order in the monocots and that asparagus may not be an appropriate smaller genome model for plants in the Asparagales with enormous nuclear genomes.

3.

Background and Aims

The cultivated jute species Corchorus olitorius and Corchorus capsularis are important fibre crops. The analysis of repetitive DNA sequences, comprising a major part of plant genomes, has not been carried out in jute but is useful to investigate the long-range organization of chromosomes. The aim of this study was the identification of repetitive DNA sequences to facilitate comparative molecular and cytogenetic studies of two jute cultivars and to develop a fluorescent in situ hybridization (FISH) karyotype for chromosome identification.

Methods

A plasmid library was generated from C. olitorius and C. capsularis with genomic restriction fragments of 100–500 bp, which was complemented by targeted cloning of satellite DNA by PCR. The diversity of the repetitive DNA families was analysed comparatively. The genomic abundance and chromosomal localization of different repeat classes were investigated by Southern analysis and FISH, respectively. The cytosine methylation of satellite arrays was studied by immunolabelling.

Key Results

Major satellite repeats and retrotransposons have been identified from C. olitorius and C. capsularis. The satellite family CoSat I forms two undermethylated species-specific subfamilies, while the long terminal repeat (LTR) retrotransposons CoRetro I and CoRetro II show similarity to the Metaviridae of plant retroelements. FISH karyotypes were developed by multicolour FISH using these repetitive DNA sequences in combination with 5S and 18S–5·8S–25S rRNA genes, which enable unequivocal chromosome discrimination in both jute species.

Conclusions

The analysis of the structure and diversity of the repeated DNA is crucial for genome sequence annotation. The reference karyotypes will be useful for breeding of jute and provide the basis for karyotyping homeologous chromosomes of wild jute species to reveal the genetic and evolutionary relationship between cultivated and wild Corchorus species.

4.
BMC Genomics, 2014, 15(1)

Background

Sugarcane is the source of sugar in all tropical and subtropical countries and is becoming increasingly important for bio-based fuels. However, its large (10 Gb), polyploid, complex genome has hindered genome based breeding efforts. Here we release the largest and most diverse set of sugarcane genome sequences to date, as part of an on-going initiative to provide a sugarcane genomic information resource, with the ultimate goal of producing a gold standard genome.

Results

Three hundred and seventeen chiefly euchromatic BACs were sequenced. A reference set of one thousand four hundred manually-annotated protein-coding genes was generated. A small RNA collection and an RNA-seq library were used to explore expression patterns and the sRNA landscape. In the sucrose and starch metabolism pathway, 16 non-redundant enzyme-encoding genes were identified. One of the sucrose pathway genes, sucrose-6-phosphate phosphohydrolase, is duplicated in sugarcane and sorghum, but not in rice and maize. A diversity analysis of the s6pp duplication region revealed haplotype-structured sequence composition. Examination of hom(e)ologous loci indicates both sequence structural and sRNA landscape variation. A synteny analysis shows that the sugarcane genome has expanded relative to the sorghum genome, largely due to the presence of transposable elements and uncharacterized intergenic and intronic sequences.

Conclusion

This release of sugarcane genomic sequences will advance our understanding of sugarcane genetics and contribute to the development of molecular tools for breeding purposes and gene discovery.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-540) contains supplementary material, which is available to authorized users.

5.

Background

Retrotransposons have been extensively studied in plants and animals and have been shown to have an impact on human genome dynamics and evolution. Their ability to move within genomes gives retrotransposons the potential to affect genome stability.

Methods

We examined the polymorphically inserted AluYa5, an evolutionarily young Alu element, in the progesterone receptor gene to determine the effects of Alu insertion on the molecular environment. We used mono-allelic inserted cell lines which carry both Alu-present and Alu-absent alleles. To determine the epigenetic change and gene expression, we performed restriction enzyme digestion, pyrosequencing, and chromatin immunoprecipitation.

Results

We observed that the polymorphic insertion of an evolutionarily young Alu increases the level of DNA methylation in the surrounding genomic area and generates inactive histone tail modifications. Consequently, the Alu insertion deleteriously inactivates expression of the neighboring gene.

Conclusion

The mono-allelic Alu insertion cell line clearly showed that polymorphically inserted repetitive elements inactivate expression of the neighboring gene by bringing about aberrant epigenetic changes.

6.

Background

Pseudomonas aeruginosa is an important opportunistic pathogen responsible for many infections in hospitalized and immunocompromised patients. Previous reports estimated that approximately 10% of its 6.6 Mbp genome varies from strain to strain and is therefore referred to as “accessory genome”. Elements within the accessory genome of P. aeruginosa have been associated with differences in virulence and antibiotic resistance. As whole genome sequencing of bacterial strains becomes more widespread and cost-effective, methods to quickly and reliably identify accessory genomic elements in newly sequenced P. aeruginosa genomes will be needed.

Results

We developed a bioinformatic method for identifying the accessory genome of P. aeruginosa. First, the core genome was determined based on sequence conserved among the completed genomes of twelve reference strains using Spine, a software program developed for this purpose. The core genome was 5.84 Mbp in size and contained 5,316 coding sequences. We then developed an in silico genome subtraction program named AGEnt to filter out core genomic sequences from P. aeruginosa whole genomes to identify accessory genomic sequences of these reference strains. This analysis determined that the accessory genome of P. aeruginosa ranged from 6.9 to 18.0% of the total genome, was enriched for genes associated with mobile elements, and was comprised of a majority of genes with unknown or unclear function. Using these genomes, we showed that AGEnt performed well compared to other publicly available programs designed to detect accessory genomic elements. We then demonstrated the utility of the AGEnt program by applying it to the draft genomes of two previously unsequenced P. aeruginosa strains, PA99 and PA103.
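
The idea of in silico genome subtraction can be illustrated with a minimal sketch. This is not the AGEnt implementation; it assumes the core regions of a genome are already known as half-open coordinate intervals (for example, from an alignment step that is not shown), and the function names `subtract_core` and `accessory_fraction` are hypothetical.

```python
def subtract_core(genome_length, core_intervals):
    """Return the regions of a genome not covered by any core interval.

    Intervals are half-open (start, end) pairs in 0-based coordinates.
    Overlapping or unsorted core intervals are tolerated.
    """
    accessory = []
    cursor = 0  # first position not yet known to be core
    for start, end in sorted(core_intervals):
        if start > cursor:
            # Gap between the previous core region and this one
            accessory.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < genome_length:
        # Trailing region after the last core interval
        accessory.append((cursor, genome_length))
    return accessory


def accessory_fraction(genome_length, core_intervals):
    """Fraction of the genome that remains after removing core sequence."""
    accessory = subtract_core(genome_length, core_intervals)
    return sum(end - start for start, end in accessory) / genome_length
```

Applied to a toy genome of 100 bp with core intervals (0, 30), (20, 50) and (70, 90), the sketch reports the remaining regions (50, 70) and (90, 100), that is, 30% of the sequence, which is the same kind of percentage the abstract reports per strain.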

Conclusions

The P. aeruginosa genome is rich in accessory genetic material. The AGEnt program accurately identified the accessory genomes of newly sequenced P. aeruginosa strains, even when draft genomes were used. As P. aeruginosa genomes become available at an increasingly rapid pace, this program will be useful in cataloging the expanding accessory genome of this bacterium and in discerning correlations between phenotype and accessory genome makeup. The combination of Spine and AGEnt should be useful in defining the accessory genomes of other bacterial species as well.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-737) contains supplementary material, which is available to authorized users.

7.

Background

Mosses are the largest of the three extant clades of gametophyte-dominant land plants and remain poorly studied using comparative genomic methods. Major monophyletic moss lineages are characterised by different types of a spore dehiscence apparatus called the peristome, and the most important unsolved problem in higher-level moss systematics is the branching order of these peristomate clades. Organellar genome sequencing offers the potential to resolve this issue through the provision of both genomic structural characters and a greatly increased quantity of nucleotide substitution characters, as well as to elucidate organellar evolution in mosses. We publish and describe the chloroplast and mitochondrial genomes of Tetraphis pellucida, representative of the most phylogenetically intractable and morphologically isolated peristomate lineage.

Results

Assembly of reads from Illumina SBS and Pacific Biosciences RS sequencing reveals that the Tetraphis chloroplast genome comprises 127,489 bp and the mitochondrial genome 107,730 bp. Although genomic structures are similar to those of the small number of other known moss organellar genomes, the chloroplast lacks the petN gene (in common with Tortula ruralis) and the mitochondrion has only a non-functional pseudogenised remnant of nad7 (uniquely amongst known moss chondromes).

Conclusions

Structural genomic features exist with the potential to be informative for phylogenetic relationships amongst the peristomate moss lineages, and thus organellar genome sequences are urgently required for exemplars from other clades. The unique genomic and morphological features of Tetraphis confirm its importance for resolving one of the major questions in land plant phylogeny and for understanding the evolution of the peristome, a likely key innovation underlying the diversity of mosses. The functional loss of nad7 from the chondrome is now shown to have occurred independently in all three bryophyte clades as well as in the early-diverging tracheophyte Huperzia squarrosa.

8.
9.
10.
11.
12.
13.

Background

Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.

Results

To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.
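
To make the assembly- and alignment-free idea concrete, the following minimal sketch computes a pairwise distance from the k-mers shared between two sets of raw reads. It is an illustration under stated assumptions, not the AAF code: the formula D = ln(n_total / n_shared) / k, with n_total taken as the smaller k-mer set, is one simple form of a shared-k-mer distance, and the sketch ignores reverse complements, sequencing-error filtering and coverage correction, all of which the abstract says the real method addresses. The function names are hypothetical.

```python
import math


def kmers(reads, k):
    """Collect the set of distinct k-mers found in a list of reads."""
    found = set()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing ambiguous bases such as N
            if set(kmer) <= set("ACGT"):
                found.add(kmer)
    return found


def kmer_distance(reads_a, reads_b, k):
    """Distance based on the fraction of k-mers shared by two samples."""
    a = kmers(reads_a, k)
    b = kmers(reads_b, k)
    shared = len(a & b)
    total = min(len(a), len(b))
    if total == 0 or shared == 0:
        # No information, or nothing in common: treat as infinitely distant
        return math.inf
    return math.log(total / shared) / k
```

A matrix of such distances over all sample pairs could then be passed to any distance-based tree-building method; that step is omitted here.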

Conclusion

Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1647-5) contains supplementary material, which is available to authorized users.

14.

Background

The ~17 Gb hexaploid bread wheat genome is a high priority and a major technical challenge for genomic studies. In particular, the D sub-genome is relatively lacking in genetic diversity, making it both difficult to map genetically, and a target for introgression of agriculturally useful traits. Elucidating its sequence and structure will therefore facilitate wheat breeding and crop improvement.

Results

We generated shotgun sequences from each arm of flow-sorted Triticum aestivum chromosome 5D using 454 FLX Titanium technology, giving 1.34× and 1.61× coverage of the short (5DS) and long (5DL) arms of the chromosome respectively. By a combination of sequence similarity and assembly-based methods, ~74% of the sequence reads were classified as repetitive elements, and coding sequence models of 1314 (5DS) and 2975 (5DL) genes were generated. The order of conserved genes in syntenic regions of previously sequenced grass genomes was integrated with physical and genetic map positions of 518 wheat markers to establish a virtual gene order for chromosome 5D.

Conclusions

The virtual gene order revealed a large-scale chromosomal rearrangement in the peri-centromeric region of 5DL, and a concentration of non-syntenic genes in the telomeric region of 5DS. Although our data support the large-scale conservation of Triticeae chromosome structure, they also suggest that some regions are evolving rapidly through frequent gene duplications and translocations.

Sequence accessions

EBI European Nucleotide Archive, Study no. ERP002330

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1080) contains supplementary material, which is available to authorized users.

15.

Background and Aims

Lolium perenne (perennial ryegrass) is the most important forage grass species of temperate regions. We have previously released the chloroplast genome sequence of L. perenne ‘Cashel’. Here nine chloroplast microsatellite markers are published, which were designed based on knowledge about genetically variable regions within the L. perenne chloroplast genome. These markers were successfully used for characterizing the genetic diversity in Lolium and different grass species.

Methods

Chloroplast genomes of 14 Poaceae taxa were screened for mononucleotide microsatellite repeat regions and primers designed for their amplification from nine loci. The potential of these markers to assess genetic diversity was evaluated on a set of 16 Irish and 15 European L. perenne ecotypes, nine L. perenne cultivars, other Lolium taxa and other grass species.

Key Results

All analysed Poaceae chloroplast genomes contained more than 200 mononucleotide repeats (chloroplast simple sequence repeats, cpSSRs) of at least 7 bp in length, concentrated mainly in the large single copy region of the genome. Nucleotide composition varied considerably among subfamilies (with Pooideae biased towards poly A repeats). The nine new markers distinguish L. perenne from all non-Lolium taxa. TeaCpSSR28 was able to distinguish between all Lolium species and Lolium multiflorum due to an elongation of an A8 mononucleotide repeat in L. multiflorum. TeaCpSSR31 detected a considerable degree of microsatellite length variation and single nucleotide polymorphism. TeaCpSSR27 revealed variation within some L. perenne accessions due to a 44-bp indel and was hence readily detected by simple agarose gel electrophoresis. Smaller insertion/deletion events or single nucleotide polymorphisms detected by these new markers could be visualized by polyacrylamide gel electrophoresis or DNA sequencing, respectively.
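
The screen for mononucleotide repeats of at least 7 bp can be sketched with a regular expression. This is an illustrative assumption about how such a screen might be done, not the authors' pipeline; the function name `find_mononucleotide_repeats` is hypothetical.

```python
import re


def find_mononucleotide_repeats(sequence, min_length=7):
    """Return (start, base, length) for each run of one base >= min_length.

    Coordinates are 0-based. Runs are maximal, so a run of 9 A's is
    reported once with length 9, not as several overlapping hits.
    """
    # One base followed by at least (min_length - 1) copies of itself
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [
        (match.start(), match.group(1), match.end() - match.start())
        for match in pattern.finditer(sequence.upper())
    ]
```

Counting the hits per base over a whole chloroplast genome would give the kind of nucleotide-composition comparison described above, for example the bias towards poly A repeats.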

Conclusions

The new markers are a valuable tool for plant breeding companies, seed testing agencies and the wider scientific community due to their ability to monitor genetic diversity within breeding pools, to trace maternal inheritance and to distinguish closely related species.

16.

Background

Enterococcus mundtii is a yellow-pigmented microorganism rarely found in human infections. The draft genome sequence of E. mundtii was recently announced. Its genome encodes at least 2,589 genes and 57 RNAs, and 4 putative genomic islands have been detected. The objective of this study was to compare the genetic content of E. mundtii with respect to other enterococcal species and, more specifically, to identify genes coding for putative virulence traits present in enterococcal opportunistic pathogens.

Results

An in-depth mining of the annotated genome was performed in order to uncover the unique properties of this microorganism, which allowed us to detect a gene encoding the antimicrobial peptide mundticin among other relevant features. Moreover, in this study a comparative genomic analysis against commensal and pathogenic enterococcal species, for which genomic sequences have been released, was conducted for the first time. Furthermore, our study reveals significant similarities in gene content between this environmental isolate and the selected enterococci strains (sharing an “enterococcal gene core” of 805 CDS), which contributes to understanding the persistence of this genus in different niches and also improves our knowledge about the genetics of this diverse group of microorganisms that includes environmental, commensal and opportunistic pathogens.

Conclusion

Although E. mundtii CRL1656 is phylogenetically closer to E. faecium, which is frequently responsible for nosocomial infections, this strain does not encode the most relevant virulence factors found in the enterococcal clinical isolates, and bioinformatic predictions indicate that it possesses the lowest number of putative pathogenic genes among the most representative enterococcal species. Accordingly, infection assays using the Galleria mellonella model confirmed its low virulence.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-489) contains supplementary material, which is available to authorized users.

17.

Background

Cryptosporidium hominis is a dominant species for human cryptosporidiosis. Within the species, IbA10G2 is the most virulent subtype responsible for all C. hominis–associated outbreaks in Europe and Australia, and is a dominant outbreak subtype in the United States. In recent years, IaA28R4 has been emerging as a major new subtype in the United States. In this study, we sequenced the genomes of two field specimens from each of the two subtypes and conducted a comparative genomic analysis of the obtained sequences with those from the only fully sequenced Cryptosporidium parvum genome.

Results

Altogether, 8.59-9.05 Mb of Cryptosporidium sequences in 45–767 assembled contigs were obtained from the four specimens, representing 94.36-99.47% coverage of the expected genome. These genomes had complete synteny in gene organization and 96.86-97.0% and 99.72-99.83% nucleotide sequence similarities to the published genomes of C. parvum and C. hominis, respectively. Several major insertions and deletions were seen between C. hominis and C. parvum genomes, involving mostly members of multicopy gene families near telomeres. The four C. hominis genomes were highly similar to each other and divergent from the reference IaA25R3 genome in some highly polymorphic regions. Major sequence differences among the four specimens sequenced in this study were in the 5′ and 3′ ends of chromosome 6 and the gp60 region, largely the result of genetic recombination.

Conclusions

The sequence similarity among specimens of the two dominant outbreak subtypes and genetic recombination in chromosome 6, especially around the putative virulence determinant gp60 region, suggest that genetic recombination plays a potential role in the emergence of hyper-transmissible C. hominis subtypes. The high sequence conservation between C. parvum and C. hominis genomes and significant differences in copy numbers of MEDLE family secreted proteins and insulinase-like proteases indicate that telomeric gene duplications could potentially contribute to host expansion in C. parvum.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1517-1) contains supplementary material, which is available to authorized users.

18.

Background

Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes?

Results

A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the human, dog, and cow genomes. To maximize genome coverage, the coordinates of all BAC end sequence hits to the cow and dog genomes were also converted to the equivalent human genome coordinates. The 84,624 sheep BACs (about 5.4-fold genome coverage) with paired ends in the correct orientation (tail-to-tail) and spacing, combined with information from sheep BAC comparative genome contigs (CGCs) built separately on the dog and cow genomes, were used to construct 1,172 sheep BAC-CGCs, covering 91.2% of the human genome. Clustered non-tail-to-tail and outsize BACs located close to the ends of many BAC-CGCs linked BAC-CGCs covering about 70% of the genome to at least one other BAC-CGC on the same chromosome. Using the BAC-CGCs, the intrachromosomal and interchromosomal BAC-CGC linkage information, human/cow and vertebrate synteny, and the sheep marker map, a virtual sheep genome was constructed. To identify BACs potentially located in gaps between BAC-CGCs, an additional set of 55,668 sheep BACs were positioned on the sheep genome with lower confidence. A coordinate conversion process allowed us to transfer human genes and other genome features to the virtual sheep genome to display on a sheep genome browser.

Conclusion

We demonstrate that limited sequencing of BACs combined with positioning on a well assembled genome and integrating locations from other less well assembled genomes can yield extensive, detailed subgene-level maps of mammalian genomes, for which genomic resources are currently limited.

19.

Background

The application of phages is a promising tool to reduce the number of Campylobacter along the food chain. Besides the efficacy against a broad range of strains, phages have to be safe in terms of their genomes. Thus far, no genes with pathogenic potential (e.g., genes encoding virulence factors) have been detected in Campylobacter phages. However, preliminary studies suggested that the genomes of group II phages may be diverse and prone to genomic rearrangements.

Results

We determined and analysed the genomic sequence (182,761 bp) of group II phage CP21 that is closely related to the already characterized group II phages CP220 and CPt10. The genomes of these phages are comprised of four modules separated by very similar repeat regions, some of which harbour open reading frames (ORFs). However, the arrangement of the modules and the location of some ORFs on the genomes are different in CP21 and in CP220/CPt10. In this work, a PCR system was established to study the modular genome organization of other group II phages, demonstrating that they belong to different subgroups of the CP220-like virus genus, the prototypes of which are CP21 and CP220. The subgroups revealed different restriction patterns and, interestingly enough, also distinct host specificities, tail fiber proteins and tRNA genes. We additionally analysed the genome of group II phage vB_CcoM-IBB_35 (IBB_35) for which to date only five individual contigs could be determined. We show that the contigs represent modules linked by long repeat regions enclosing some not yet identified ORFs (e.g., for a head completion protein). The data suggest that IBB_35 is a member of the CP220 subgroup.

Conclusion

Campylobacter group II phages are diverse regarding their genome organization. Since all hitherto characterized group II phages contain numerous genes for transposases and homing endonucleases as well as similar repeat regions, it cannot be excluded that these phages are genetically unstable. To answer this question, further experiments and sequencing of more group II phages should be performed.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1837-1) contains supplementary material, which is available to authorized users.

20.

Background

There is a need to characterize genomes of the foodborne pathogen, Salmonella enterica serovar Enteritidis (SE) and identify genetic information that could be ultimately deployed for differentiating strains of the organism, a need that is yet to be addressed mainly because of the high degree of clonality of the organism. In an effort to achieve the first characterization of the genomes of SE of Canadian origin, we carried out massively parallel sequencing of the nucleotide sequence of 11 SE isolates obtained from poultry production environments (n = 9), a clam and a chicken, assembled finished genomes and investigated diversity of the SE genome.

Results

The median genome size was 4,678,683 bp. A total of 4,833 chromosomal genes defined the pan genome of our field SE isolates, consisting of 4,600 genes present in all the genomes, i.e., the core genome, and 233 genes absent in at least one genome (the accessory genome). Genome diversity was demonstrable by the presence of 1,360 loci showing single nucleotide polymorphism (SNP) in the core genome, which was used to portray the genetic distances by means of a phylogenetic tree for the SE isolates. The accessory genome consisted mostly of previously identified SE prophage sequences as well as two, apparently full-sized, novel prophages, namely a 28 kb sequence provisionally designated as SE-OLF-10058 (3) prophage and a 43 kb sequence provisionally designated as SE-OLF-10012 prophage.
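
The pan, core and accessory genome definitions used above map directly onto set operations. The sketch below is a minimal illustration, assuming each isolate has already been reduced to a set of gene identifiers; the function name `partition_pan_genome` and the toy gene names in the usage note are hypothetical.

```python
def partition_pan_genome(gene_sets):
    """Split genes into pan, core and accessory sets.

    gene_sets maps an isolate name to the collection of gene identifiers
    found in that isolate. The pan genome is the union of all genes, the
    core genome is the set present in every isolate, and the accessory
    genome is everything absent from at least one isolate.
    """
    genomes = [set(genes) for genes in gene_sets.values()]
    if not genomes:
        raise ValueError("at least one genome is required")
    pan = set().union(*genomes)
    core = set.intersection(*genomes)
    accessory = pan - core
    return pan, core, accessory
```

For three toy isolates carrying {g1, g2, g3}, {g1, g2} and {g1, g2, g4}, the core is {g1, g2} and the accessory set is {g3, g4}. By construction the core and accessory counts always sum to the pan genome size, which matches the figures in the abstract (4,600 + 233 = 4,833).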

Conclusions

The number of SNPs identified in the relatively large core genome of SE is a reflection of substantial diversity that could be exploited for strain differentiation, as shown by the development of an informative phylogenetic tree. Prophage sequences can also be exploited for SE strain differentiation and lineage tracking. This work has laid the groundwork for further studies to develop a readily adoptable laboratory test for the subtyping of SE.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-713) contains supplementary material, which is available to authorized users.
