Similar articles
1.
The halophilic archaeon Haloferax volcanii has a multireplicon genome, consisting of a main chromosome, three secondary chromosomes, and a plasmid. Genes for the initiator protein Cdc6/Orc1, which are commonly located adjacent to archaeal origins of DNA replication, are found on all replicons except plasmid pHV2. However, prediction of DNA replication origins in H. volcanii is complicated by the fact that this species has no less than 14 cdc6/orc1 genes. We have used a combination of genetic, biochemical, and bioinformatic approaches to map DNA replication origins in H. volcanii. Five autonomously replicating sequences were found adjacent to cdc6/orc1 genes and replication initiation point mapping was used to confirm that these sequences function as bidirectional DNA replication origins in vivo. Pulsed field gel analyses revealed that cdc6/orc1-associated replication origins are distributed not only on the main chromosome (2.9 Mb) but also on pHV1 (86 kb), pHV3 (442 kb), and pHV4 (690 kb) replicons. Gene inactivation studies indicate that linkage of the initiator gene to the origin is not required for replication initiation, and genetic tests with autonomously replicating plasmids suggest that the origin located on pHV1 and pHV4 may be dominant to the principal chromosomal origin. The replication origins we have identified appear to show a functional hierarchy or differential usage, which might reflect the different replication requirements of their respective chromosomes. We propose that duplication of H. volcanii replication origins was a prerequisite for the multireplicon structure of this genome, and that this might provide a means for chromosome-specific replication control under certain growth conditions. Our observations also suggest that H. volcanii is an ideal organism for studying how replication of four replicons is regulated in the context of the archaeal cell cycle.

2.
3.

Background

Multipartite mitochondrial genomes are very rare in animals but have been found previously in two insect orders with highly rearranged genomes, the Phthiraptera (parasitic lice), and the Psocoptera (booklice/barklice).

Results

We provide the first report of a multipartite mitochondrial genome architecture in a third order with highly rearranged genomes: Thysanoptera (thrips). We sequenced the complete mitochondrial genomes of two divergent members of the Scirtothrips dorsalis cryptic species complex. The East Asia 1 species has the single circular chromosome common to animals while the South Asia 1 species has a genome consisting of two circular chromosomes. The fragmented South Asia 1 genome exhibits extreme chromosome size asymmetry with the majority of genes on the large, 14.28 kb, chromosome and only nad6 and trnC on the 0.92 kb mini-circle chromosome. This genome also features paralogous control regions with high similarity suggesting a very recent origin of the nad6 mini-circle chromosome in the South Asia 1 cryptic species.

Conclusions

Thysanoptera, along with the other minor paraneopteran insect orders, should be considered models for rapid mitochondrial genome evolution, including fragmentation. Continued use of these models will facilitate a greater understanding of recombination and other mitochondrial genome evolutionary processes across eukaryotes.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1672-4) contains supplementary material, which is available to authorized users.

4.

Background

The relatively short read lengths from next generation sequencing (NGS) technologies still pose a challenge for de novo assembly of complex mammal genomes. One important solution is to use paired-end (PE) sequence information experimentally obtained from long-range DNA fragments (>1 kb). Here, we characterize and extend a long-range PE library construction method based on direct intra-molecule ligation (or molecular linker-free circularization) for NGS.

Results

We found that the method performs stably for PE sequencing of 2- to 5-kb DNA fragments, and can be extended to 10–20 kb (and, in extreme cases, up to ∼35 kb). We also characterized the impact of low-quality input DNA on the method, and developed a whole-genome amplification (WGA) based protocol using limited input DNA (<1 µg). Using this PE dataset, we accurately assembled the YanHuang (YH) genome, the first sequenced Asian genome, into a scaffold N50 size of >2 Mb, which is over 100 times greater than the initial size produced with only small-insert PE reads (17 kb). In addition, we mapped two 7- to 8-kb insertions in the YH genome using the larger insert sizes of the long-range PE data.
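For readers unfamiliar with the scaffold N50 statistic quoted above, a minimal sketch of how it is usually computed may help. N50 is the length L such that scaffolds of length L or longer together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the article.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Illustrative helper (not from the article): N50 is the length of the
    shortest sequence in the smallest set of longest sequences that
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy scaffold lengths in kb (hypothetical values).
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_scaffolds)
```

Because a few long scaffolds dominate the sum, joining short scaffolds with long-range paired-end links can raise N50 sharply, consistent with the improvement from 17 kb to >2 Mb reported in the abstract.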

Conclusions

In conclusion, we demonstrate here the effectiveness of this long-range PE sequencing method and its use for the de novo assembly of a large, complex genome using NGS short reads.

5.
6.

Background

With thousands of fungal genomes being sequenced, each genome containing up to 70 secondary metabolite (SM) clusters 30–80 kb in size, breakthrough techniques are needed to characterize this SM wealth.

Results

Here we describe a novel system-level methodology for unbiased cloning of intact large SM clusters from a single fungal genome for one-step transformation and expression in a model host. All 56 intact SM clusters from Aspergillus terreus were individually captured in self-replicating fungal artificial chromosomes (FACs) containing both the E. coli F replicon and an Aspergillus autonomously replicating sequence (AMA1). Candidate FACs were successfully shuttled between E. coli and the heterologous expression host A. nidulans. As proof-of-concept, an A. nidulans FAC strain was characterized in a novel liquid chromatography-high resolution mass spectrometry (LC-HRMS) and data analysis pipeline, leading to the discovery of the A. terreus astechrome biosynthetic machinery.

Conclusion

The method we present can be used to capture the entire set of intact SM gene clusters and/or pathways from fungal species for heterologous expression in A. nidulans and natural product discovery.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1561-x) contains supplementary material, which is available to authorized users.

7.

Background

Lactobacillus salivarius strains are increasingly being exploited for their probiotic properties in humans and animals. Dissemination of antibiotic resistance genes among species with food or probiotic-association is undesirable and is often mediated by plasmids or integrative and conjugative elements. L. salivarius strains typically have multireplicon genomes including circular megaplasmids that encode strain-specific traits for intestinal survival and probiotic activity. Linear plasmids are less common in lactobacilli and show a very limited distribution in L. salivarius. Here we present experimental evidence that supports an unusually complex multireplicon genome structure in the porcine isolate L. salivarius JCM1046.

Results

JCM1046 harbours a 1.83 Mb chromosome, and four plasmids which constitute 20% of the genome. In addition to the known 219 kb repA-type megaplasmid pMP1046A, we identified and experimentally validated the topology of three additional replicons, the circular pMP1046B (129 kb), a linear plasmid pLMP1046 (101 kb) and pCTN1046 (33 kb) harbouring a conjugative transposon. pMP1046B harbours both plasmid-associated replication genes and paralogues of chromosomally encoded housekeeping and information-processing related genes, thus qualifying it as a putative chromid. pLMP1046 shares limited sequence homology or gene synteny with other L. salivarius plasmids, and its putative replication-associated protein is homologous to the RepA/E proteins found in the large circular megaplasmids of L. salivarius. Plasmid pCTN1046 harbours a single copy of an integrated conjugative transposon (Tn6224) which appears to be functionally intact and includes the tetracycline resistance gene tetM.

Conclusion

Experimental validation of sequence assemblies and plasmid topology resolved the complex genome architecture of L. salivarius JCM1046. A high-coverage draft genome sequence would not have elucidated the genome complexity in this strain. Given the expanding use of L. salivarius as a probiotic, it is important to determine the genotypic and phenotypic organization of L. salivarius strains. The identification of Tn6224-like elements in this species has implications for strain selection for probiotic applications.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-771) contains supplementary material, which is available to authorized users.

8.
9.

Background

Copy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing.

Results

A total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson’s correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding.

Conclusions

Our results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-962) contains supplementary material, which is available to authorized users.

10.

Background

Comparative evolutionary analysis of whole genomes requires not only accurate annotation of gene space, but also proper annotation of the repetitive fraction which is often the largest component of most if not all genomes larger than 50 kb in size.

Results

Here we present the Rice TE database (RiTE-db) - a genus-wide collection of transposable elements and repeated sequences across 11 diploid species of the genus Oryza and the closely-related out-group Leersia perrieri. The database consists of more than 170,000 entries divided into three main types: (i) a classified and curated set of publicly-available repeated sequences, (ii) a set of consensus assemblies of highly-repetitive sequences obtained from genome sequencing surveys of 12 species; and (iii) a set of full-length TEs, identified and extracted from 12 whole genome assemblies.

Conclusions

This is the first report of a repeat dataset that spans the majority of repeat variability within an entire genus, and one that includes complete elements as well as unassembled repeats. The database allows sequence browsing, downloading, and similarity searches. Because of the strategy adopted, the RiTE-db opens a new path to unprecedented direct comparative studies that span the entire nuclear repeat content of 15 million years of Oryza diversity.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1762-3) contains supplementary material, which is available to authorized users.

11.

Background

Genome evolution in the gymnosperm lineage of seed plants has given rise to many of the most complex and largest plant genomes; however, the elements involved are poorly understood.

Methodology/Principal Findings

Gymny is a previously undescribed retrotransposon family in Pinus that is related to Athila elements in Arabidopsis. Gymny elements are dispersed throughout the modern Pinus genome and occupy a physical space at least the size of the Arabidopsis thaliana genome. In contrast to previously described retroelements in Pinus, the Gymny family was amplified or introduced after the divergence of pine and spruce (Picea). If retrotransposon expansions are responsible for genome size differences within the Pinaceae, as they are in angiosperms, then they have yet to be identified. In contrast, molecular divergence of Gymny retrotransposons together with other families of retrotransposons can account for the large genome complexity of pines along with protein-coding genic DNA, as revealed by massively parallel DNA sequence analysis of Cot fractionated genomic DNA.

Conclusions/Significance

Most of the enormous genome complexity of pines can be explained by divergence of retrotransposons; however, the elements responsible for genome size variation are yet to be identified. Genomic resources for Pinus, including those reported here, should assist in further defining whether and how the roles of retrotransposons differ in the evolution of angiosperm and gymnosperm genomes.

12.

Background

Sugarcane smut can cause losses in cane yield and sugar content that range from 30% to total crop failure. Losses tend to increase with the passage of years. Sporisorium scitamineum is the fungus that causes sugarcane smut. This fungus has the potential to infect all sugarcane species unless a species is resistant to biotrophic fungal pathogens. However, it remains unclear how the fungus breaks through the cell walls of sugarcane and causes the formation of black or gray whip-like structures on the sugarcane plants.

Results

Here, we report the first high-quality genome sequence of S. scitamineum, assembled de novo with a contig N50 of 41 kb, a scaffold N50 of 884 kb and a genome size of 19.8 Mb, containing an estimated 6,636 genes. This phytopathogen can utilize a wide range of carbon and nitrogen sources. A reduced set of genes encoding plant cell wall hydrolytic enzymes leads to its biotrophic lifestyle, in which damage to the host should be minimized. In this bipolar mating fungus, the a and b loci are linked and the mating-type locus segregates as a single locus. The S. scitamineum genome has only 6 G protein-coupled receptors (GPCRs) grouped into five classes, which are responsible for transducing extracellular signals into intracellular responses; however, the genome lacks any PTH11-like GPCR. The genome contains 192 virulence-associated genes, 31 of which are expressed at all stages and mainly encode enzymes related to energy metabolism and redox of short-chain compounds. Sixty-eight candidates for secreted effector proteins (CSEPs) were found in the genome, 32 of which are expressed at different stages of sugarcane infection and are probably involved in infection and/or triggering defense responses. Two non-ribosomal peptide synthetase (NRPS) gene clusters are involved in the generation of ferrichrome and ferrichrome A, while the terpene gene cluster comprises three genes of unknown function and seven biosynthesis-related genes.

Conclusions

Because S. scitamineum is a destructive pathogen for the sugar industry, its genome sequence will facilitate future research on the genomic basis and the pathogenic mechanisms of sugarcane smut.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-996) contains supplementary material, which is available to authorized users.

13.

Background

There is a need to characterize genomes of the foodborne pathogen, Salmonella enterica serovar Enteritidis (SE) and identify genetic information that could be ultimately deployed for differentiating strains of the organism, a need that is yet to be addressed mainly because of the high degree of clonality of the organism. In an effort to achieve the first characterization of the genomes of SE of Canadian origin, we carried out massively parallel sequencing of the nucleotide sequence of 11 SE isolates obtained from poultry production environments (n = 9), a clam and a chicken, assembled finished genomes and investigated diversity of the SE genome.

Results

The median genome size was 4,678,683 bp. A total of 4,833 chromosomal genes defined the pan genome of our field SE isolates, consisting of 4,600 genes present in all the genomes (the core genome) and 233 genes absent in at least one genome (the accessory genome). Genome diversity was demonstrable by the presence of 1,360 loci showing single nucleotide polymorphism (SNP) in the core genome, which was used to portray the genetic distances by means of a phylogenetic tree for the SE isolates. The accessory genome consisted mostly of previously identified SE prophage sequences as well as two apparently full-sized novel prophages, namely a 28 kb sequence provisionally designated as SE-OLF-10058 (3) prophage and a 43 kb sequence provisionally designated as SE-OLF-10012 prophage.
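The pan, core and accessory genome definitions used above can be sketched with plain set operations. This is only an illustration under stated assumptions: the isolate labels and gene names below are hypothetical toy data, the function `partition_pan_genome` is not from the article, and real analyses first cluster genes into orthologous groups, which this sketch takes as given.

```python
def partition_pan_genome(genomes):
    """Split gene content into pan, core and accessory sets.

    `genomes` maps an isolate label to an iterable of gene identifiers.
    Illustrative helper (not from the article): the pan genome is the
    union of all gene sets, the core genome is their intersection, and
    the accessory genome is every pan-genome gene missing from at least
    one isolate.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        raise ValueError("need at least one genome")
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    accessory = pan - core
    return pan, core, accessory


# Hypothetical toy isolates and gene names, for illustration only.
toy_genomes = {
    "isolate_A": ["geneA", "geneB", "geneC", "prophage1"],
    "isolate_B": ["geneA", "geneB", "geneC"],
    "isolate_C": ["geneA", "geneB", "geneC", "prophage2"],
}
pan, core, accessory = partition_pan_genome(toy_genomes)
```

By construction the core and accessory sets are disjoint and sum to the pan genome, which matches the arithmetic in the abstract (4,600 + 233 = 4,833 genes).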

Conclusions

The number of SNPs identified in the relatively large core genome of SE is a reflection of substantial diversity that could be exploited for strain differentiation, as shown by the development of an informative phylogenetic tree. Prophage sequences can also be exploited for SE strain differentiation and lineage tracking. This work has laid the groundwork for further studies to develop a readily adoptable laboratory test for the subtyping of SE.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-713) contains supplementary material, which is available to authorized users.

14.
15.
16.

Background

Vibrio parahaemolyticus is a Gram-negative halophilic bacterium. Infections with the bacterium could become systemic and can be life-threatening to immunocompromised individuals. Genome sequences of a few clinical isolates of V. parahaemolyticus are currently available, but the genome dynamics across the species and virulence potential of environmental strains on a genome-scale have not been described before.

Results

Here we present genome sequences of four V. parahaemolyticus clinical strains from stool samples of patients and five environmental strains in Hong Kong. Phylogenomics analysis based on single nucleotide polymorphisms revealed a clear distinction between the clinical and environmental isolates. A new gene cluster belonging to the biofilm-associated proteins of V. parahaemolyticus was found in clinical strains. In addition, a novel small genomic island frequently found among clinical isolates was reported. A few environmental strains were found harboring virulence genes and prophage elements, indicating their virulence potential. A unique biphenyl degradation pathway was also reported. A database for V. parahaemolyticus (http://kwanlab.bio.cuhk.edu.hk/vp) was constructed here as a platform to access and analyze genome sequences and annotations of the bacterium.

Conclusions

We have performed a comparative genomics analysis of clinical and environmental strains of V. parahaemolyticus. Our analyses could facilitate understanding of the phylogenetic diversity and niche adaptation of this bacterium.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1135) contains supplementary material, which is available to authorized users.

17.

Purpose

A previous study has indicated suggestive association of the hepatocyte growth factor (HGF) gene with Keratoconus. We wished to assess this association in an independent Caucasian cohort as well as assess its association with corneal curvature.

Participants

Keratoconus patients were recruited from private and public clinics in Melbourne, Australia. Non-keratoconic individuals were identified from the Genes in Myopia (GEM) study from Australia. A total of 830 individuals were used for the analysis, including 157 keratoconic and 673 non-keratoconic subjects.

Methods

Tag single nucleotide polymorphisms (tSNPs) were chosen to encompass the hepatocyte growth factor gene as well as 2 kb upstream of the start codon through to 2 kb downstream of the stop codon. Logistic and linear regression including age and gender as covariates were applied in statistical analysis with subsequent Bonferroni correction.

Results

Ten tSNPs were genotyped. Following statistical analysis and multiple testing correction, a statistically significant association with keratoconus was found for the tSNP rs2286194 (p = 1.1×10⁻³; odds ratio 0.52; 95% CI 0.35–0.77). No association was found between the 10 tSNPs and corneal curvature.
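As a rough illustration of how a Bonferroni correction interacts with the figures above, the sketch below divides a family-wise significance level by the number of tests. It rests on assumptions not stated in the abstract: a family-wise alpha of 0.05, one test per tSNP (10 tests), and that the reported p-value is the uncorrected one. The function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value, alpha, n_tests):
    """True if an uncorrected p-value survives the Bonferroni correction."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Assumed values: alpha = 0.05 and 10 tests are illustrative choices,
# not parameters confirmed by the abstract.
threshold = bonferroni_threshold(0.05, 10)
survives = is_significant(1.1e-3, 0.05, 10)
```

Under these assumptions the per-test threshold is 0.005, so a p-value of 1.1×10⁻³ would remain significant after correction.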

Conclusions

These findings provide additional evidence of significant association of the HGF gene with keratoconus. This association does not appear to act through the corneal curvature route.

18.
19.
20.

Background

Cost effective next generation sequencing technologies now enable the production of genomic datasets for many novel planktonic eukaryotes, representing an understudied reservoir of genetic diversity. O. tauri is the smallest free-living photosynthetic eukaryote known to date, a coccoid green alga that was first isolated in 1995 in a lagoon by the Mediterranean sea. Its simple features, ease of culture and the sequencing of its 13 Mb haploid nuclear genome have promoted this microalga as a new model organism for cell biology. Here, we investigated the quality of genome assemblies of Illumina GAIIx 75 bp paired-end reads from Ostreococcus tauri, thereby also improving the existing assembly and showing the genome to be stably maintained in culture.

Results

The 3 assemblers used, ABySS, CLCBio and Velvet, produced 95% complete genomes in 1402 to 2080 scaffolds with a very low rate of misassembly. Reciprocally, these assemblies improved the original genome assembly by filling in 930 gaps. Combined with additional analysis of raw reads and PCR sequencing effort, 1194 gaps have been solved in total adding up to 460 kb of sequence. Mapping of RNAseq Illumina data on this updated genome led to a twofold reduction in the proportion of multi-exon protein coding genes, representing 19% of the total 7699 protein coding genes. The comparison of the DNA extracted in 2001 and 2009 revealed the fixation of 8 single nucleotide substitutions and 2 deletions during the approximately 6000 generations in the lab. The deletions either knocked out or truncated two predicted transmembrane proteins, including a glutamate-receptor like gene.

Conclusion

High coverage (>80 fold) paired-end Illumina sequencing enables a high quality 95% complete genome assembly of a compact ~13 Mb haploid eukaryote. This genome sequence has remained stable for 6000 generations of lab culture.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1103) contains supplementary material, which is available to authorized users.
