Similar literature
 20 similar articles found (search time: 93 ms)
1.
Structural variants (SVs) represent an important genetic resource for both natural and artificial selection. Here we present a chromosome-scale reference genome for domestic yak (Bos grunniens) that has longer contigs and scaffolds (N50 44.72 and 114.39 Mb, respectively) than reported for any other ruminant genome. We further obtained long-read resequencing data for 6 wild and 23 domestic yaks and constructed a genetic SV map of 372,220 SVs that covers the geographic range of the yaks. The majority of the SVs contain repetitive sequences and several are in or near genes. By comparing SVs in domestic and wild yaks, we identified genes that are predominantly related to the nervous system, behavior, immunity, and reproduction and may have been targeted by artificial selection during yak domestication. These findings provide new insights into the domestication of animals living at high altitude and highlight the importance of SVs in animal domestication.
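Note on the N50 values quoted in entry 1: N50 is conventionally the length L such that contigs (or scaffolds) of length at least L together cover at least half of the total assembly. The sketch below illustrates that definition only. The function name `n50` and the example lengths are hypothetical and are not taken from the yak assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Illustrative helper (not from any cited paper): walk the lengths from
    longest to shortest and return the length at which the running sum
    first reaches half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: running sum always reaches total")


# Hypothetical toy assembly: total = 54, so half = 27; 10 + 9 + 8 = 27.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

A larger N50 means fewer, longer pieces are needed to cover half of the assembly, which is why it is commonly reported as a contiguity measure.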

2.

Background

Characterizing large genomic variants is essential to expanding the research and clinical applications of genome sequencing. While multiple data types and methods are available to detect these structural variants (SVs), they remain less characterized than smaller variants because of SV diversity, complexity, and size. These challenges are exacerbated by the experimental and computational demands of SV analysis. Here, we characterize the SV content of a personal genome with Parliament, a publicly available consensus SV-calling infrastructure that merges multiple data types and SV detection methods.

Results

We demonstrate Parliament’s efficacy via integrated analyses of data from whole-genome array comparative genomic hybridization, short-read next-generation sequencing, long-read (Pacific BioSciences RSII), long-insert (Illumina Nextera), and whole-genome architecture (BioNano Irys) data from the personal genome of a single subject (HS1011). From this genome, Parliament identified 31,007 genomic loci between 100 bp and 1 Mbp that are inconsistent with the hg19 reference assembly. Of these loci, 9,777 are supported as putative SVs by hybrid local assembly, long-read PacBio data, or multi-source heuristics. These SVs span 59 Mbp of the reference genome (1.8%) and include 3,801 events identified only with long-read data. The HS1011 data and complete Parliament infrastructure, including a BAM-to-SV workflow, are available on the cloud-based service DNAnexus.

Conclusions

HS1011 SV analysis reveals the limits and advantages of multiple sequencing technologies, specifically the impact of long-read SV discovery. With the full Parliament infrastructure, the HS1011 data constitute a public resource for novel SV discovery, software calibration, and personal genome structural variation analysis.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1479-3) contains supplementary material, which is available to authorized users.

3.
4.
Domestication and selection for important performance traits can impact the genome, which is most often reflected by reduced heterozygosity in and surrounding genes related to traits affected by selection. In this study, analysis of the genomic impact caused by domestication and artificial selection was conducted by investigating the signatures of selection using single nucleotide polymorphisms (SNPs) in channel catfish (Ictalurus punctatus). A total of 8.4 million candidate SNPs were identified by using next generation sequencing. On average, the channel catfish genome harbors one SNP per 116 bp. Approximately 6.6 million, 5.3 million, 4.9 million, 7.1 million and 6.7 million SNPs were detected in the Marion, Thompson, USDA103, Hatchery strain, and wild population, respectively. The allele frequencies of 407,861 SNPs differed significantly between the domestic and wild populations. With these SNPs, 23 genomic regions with putative selective sweeps were identified that included 11 genes. Although the functions of the majority of these genes remain unknown in catfish, several genes with known function related to aquaculture performance traits were included in the regions with selective sweeps. These included hypoxia-inducible factor 1β (HIF1β) and the transporter gene ATP-binding cassette sub-family B member 5 (ABCB5). HIF1β is important for response to hypoxia, and tolerance to low oxygen levels is a critical aquaculture trait. The large numbers of SNPs identified from this study are valuable for the development of high-density SNP arrays for genetic and genomic studies of performance traits in catfish.

5.
6.
Dissecting the genetic mechanisms underlying dioecy (i.e., separate female and male individuals) is critical for understanding the evolution of this pervasive reproductive strategy. Nonetheless, the genetic basis of sex determination remains unclear in many cases, especially in systems where dioecy has arisen recently. Within the economically important plant genus Solanum (∼2,000 species), dioecy is thought to have evolved independently at least 4 times across roughly 20 species. Here, we generate the first genome sequence of a dioecious Solanum and use it to ascertain the genetic basis of sex determination in this species. We de novo assembled and annotated the genome of Solanum appendiculatum (assembly size: ∼750 Mb; scaffold N50: 0.92 Mb; ∼35,000 genes), identified sex-specific sequences and their locations in the genome, and inferred that males in this species are the heterogametic sex. We also analyzed gene expression patterns in floral tissues of males and females, finding approximately 100 genes that are differentially expressed between the sexes. These analyses, together with observed patterns of gene-family evolution specific to S. appendiculatum, consistently implicate a suite of genes from the regulatory network controlling pectin degradation and modification in the expression of sex. Furthermore, the genome of a species with a relatively young sex-determination system provides the foundational resources for future studies on the independent evolution of dioecy in this clade.

7.
8.
The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [∼80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (e.g., var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [∼90–99% (A + T)] and subtelomeric regions and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission.

9.
In the thousands of years of rice domestication in Asia, many useful genes have been lost from the gene pool. Wild rice is a key source of diversity for domesticated rice. Genome sequencing has suggested that the wild rice populations in northern Australia may include novel taxa, within the AA genome group of close (interfertile) wild relatives of domesticated rice that have evolved independently due to geographic separation and been isolated from the loss of diversity associated with gene flow from the large populations of domesticated rice in Asia. Australian wild rice was collected from 27 sites from Townsville to the northern tip of Cape York. Whole chloroplast genome sequences and 4,555 nuclear gene sequences (more than 8 Mbp) were used to explore genetic relationships between these populations and other wild and domesticated rices. Analysis of the chloroplast and nuclear data showed very clear evidence of distinctness from other AA genome Oryza species with significant divergence between Australian populations. Phylogenetic analysis suggested the Australian populations represent the earliest‐branching AA genome lineages and may be critical resources for global rice food security. Nuclear genome analysis demonstrated that the diverse O. meridionalis populations were sister to all other AA genome taxa while the Australian O. rufipogon‐like populations were associated with the clade that included domesticated rice. Populations of apparent hybrids between the taxa were also identified suggesting ongoing dynamic evolution of wild rice in Australia. These introgressions model events similar to those likely to have been involved in the domestication of rice.

10.
11.
Temperate japonica/geng (GJ) rice yield has significantly improved due to intensive breeding efforts, dramatically enhancing global food security. However, little is known about the underlying genomic structural variations (SVs) responsible for this improvement. We compared 58 long-read assemblies comprising cultivated and wild rice species in the present study, revealing 156,319 SVs. The phylogenomic analysis based on the SV dataset detected the putatively selected region of GJ sub-populations. A significant portion of the detected SVs overlapped with genic regions and were found to influence the expression of the genes involved in GJ assemblies. Integrating the SVs and causal genetic variants underlying agronomic traits into the analysis enables the precise identification of breeding signatures resulting from complex breeding histories aimed at stress tolerance, yield potential and quality improvement. Further, the results provided genomic and genetic evidence that the SV in the promoter of LTG1 accounts for chilling sensitivity, and that the increased copy numbers of GNP1 were associated with positive effects on grain number. In summary, the current study provides genomic resources for retracing the properties of SV-shaped agronomic traits during previous breeding procedures, which will assist future genetic, genomic and breeding research on rice.

12.
Whole genome sequencing studies are essential to obtain a comprehensive understanding of the vast pattern of human genomic variations. Here we report the results of a high-coverage whole genome sequencing study for 44 unrelated healthy Caucasian adults, each sequenced to over 50-fold coverage (averaging 65.8×). We identified approximately 11 million single nucleotide polymorphisms (SNPs), 2.8 million short insertions and deletions, and over 500,000 block substitutions. We showed that, although previous studies, including the 1000 Genomes Project Phase 1 study, have catalogued the vast majority of common SNPs, many of the low-frequency and rare variants remain undiscovered. For instance, approximately 1.4 million SNPs and 1.3 million short indels that we found were novel to both the dbSNP and the 1000 Genomes Project Phase 1 data sets, the majority of which (∼96%) have a minor allele frequency less than 5%. On average, each individual genome carried ∼3.3 million SNPs and ∼492,000 indels/block substitutions, including approximately 179 variants that were predicted to cause loss of function of the gene products. Moreover, each individual genome carried an average of 44 such loss-of-function variants in a homozygous state, which would completely “knock out” the corresponding genes. Across all the 44 genomes, a total of 182 genes were “knocked out” in at least one individual genome, among which 46 genes were “knocked out” in over 30% of our samples, suggesting that a number of genes are commonly “knocked out” in general populations. Gene ontology analysis suggested that these commonly “knocked-out” genes are enriched in biological processes related to antigen processing and immune response. Our results contribute towards a comprehensive characterization of human genomic variation, especially for less-common and rare variants, and provide an invaluable resource for future genetic studies of human variation and diseases.

13.
The functional integrity of neurons requires the bidirectional active transport of synaptic vesicles (SVs) in axons. The kinesin motor KIF1A transports SVs from somas to stable SV clusters at synapses, while dynein moves them in the opposite direction. However, it is unclear how SV transport is regulated and how SVs at clusters interact with motor proteins. We addressed these questions by isolating a rare temperature-sensitive allele of Caenorhabditis elegans unc-104 (KIF1A) that allowed us to manipulate SV levels in axons and dendrites. Growth at 20° and 14° resulted in locomotion rates that were ∼3 and 50% of wild type, respectively, with similar effects on axonal SV levels. Corresponding with the loss of SVs from axons, mutants grown at 14° and 20° showed a 10- and 24-fold dynein-dependent accumulation of SVs in their dendrites. Mutants grown at 14° and switched to 25° showed an abrupt irreversible 50% decrease in locomotion and a 50% loss of SVs from the synaptic region 12-hr post-shift, with no further decreases at later time points, suggesting that the remaining clustered SVs are stable and resistant to retrograde removal by dynein. The data further showed that the synapse-assembly proteins SYD-1, SYD-2, and SAD-1 protected SV clusters from degradation by motor proteins. In syd-1, syd-2, and sad-1 mutants, SVs accumulate in an UNC-104-dependent manner in the distal axon region that normally lacks SVs. In addition to their roles in SV cluster stability, all three proteins also regulate SV transport.

14.
China is rich in germplasm resources of common wild rice (Oryza rufipogon Griff.) and Asian cultivated rice (O. sativa L.), which consists of two subspecies, indica and japonica. Previous studies have shown that China is one of the domestication centers of O. sativa. However, the geographic origin and the domestication times of O. sativa in China are still under debate. To settle these disputes, six chloroplast loci and four mitochondrial loci were selected to examine the relationships between 50 accessions of Asian cultivated rice and 119 accessions of common wild rice from China based on DNA sequence analysis in the present study. The results indicated that Southern China is the genetic diversity center of O. rufipogon and might be the primary domestication region of O. sativa. Molecular dating suggested that the two subspecies diverged 0.1 million years ago, much earlier than the beginning of rice domestication. Genetic differentiation and phylogeographic analyses indicated that indica was domesticated from tropical O. rufipogon, while japonica was domesticated from O. rufipogon located at higher latitudes. These results provide molecular evidence for the hypotheses that (i) Southern China is the center of origin of O. sativa in China and (ii) the two subspecies of O. sativa were domesticated multiple times.

15.
Elucidation of the rice genome is expected to broaden our understanding of genes related to the agronomic characteristics and the genetic relationship among cultivars. In this study, we conducted whole-genome sequencing of 6 cultivars, including 5 temperate japonica cultivars and 1 tropical japonica cultivar (Moroberekan), by using next-generation sequencing (NGS) with the Nipponbare genome as a reference. The temperate japonica cultivars comprised 2 sake-brewing cultivars (Yamadanishiki and Gohyakumangoku), 1 landrace (Kameji), and 2 modern cultivars (Koshihikari and Norin 8). More than 83% of the Nipponbare genome sequence could be covered by the sequenced short reads of each cultivar, including Omachi, which has previously been reported to be a temperate japonica cultivar. Numerous single nucleotide polymorphisms (SNPs), insertions, and deletions were detected between the various cultivars and the Nipponbare genome. Comparison of SNPs detected in each cultivar suggested that Moroberekan had 5-fold more SNPs than the temperate japonica cultivars. Success of the 2 approaches to improve the efficacy of sequence data by using NGS revealed that sequencing depth was directly related to sequencing coverage of coding DNA sequences: in excess of 30× genome sequencing was required to cover approximately 80% of the genes in the rice genome. Further, the contigs prepared using the assembly of unmapped reads could increase the value of NGS short reads and, consequently, cover previously unavailable sequences. These approaches facilitated the identification of new genes in coding DNA sequences and the increase of mapping efficiency in different regions. The DNA polymorphism information between the 7 cultivars and Nipponbare is available at NGRC_Rices_Build1.0 (http://www.nodai-genome.org/oryza_sativa_en.html).

16.
Multiplexed single nucleotide polymorphism (SNP) markers have the potential to increase the speed and cost-effectiveness of genotyping, provided that an optimal SNP density is used for each application. To test the efficiency of multiplexed SNP genotyping for diversity, mapping and breeding applications in rice (Oryza sativa L.), we designed seven GoldenGate VeraCode oligo pool assay (OPA) sets for the Illumina BeadXpress Reader. Validated markers from existing 1536 Illumina SNPs and 44 K Affymetrix SNP chips developed at Cornell University were used to select subsets of informative SNPs for different germplasm groups with even distribution across the genome. A 96-plex OPA was developed for quality control purposes and for assigning a sample into one of the five O. sativa population subgroups. Six 384-plex OPAs were designed for genetic diversity analysis, DNA fingerprinting, and to have evenly-spaced polymorphic markers for quantitative trait locus (QTL) mapping and background selection for crosses between different germplasm pools in rice: Indica/Indica, Indica/Japonica, Japonica/Japonica, Indica/O. rufipogon, and Japonica/O. rufipogon. After testing on a diverse set of rice varieties, two of the SNP sets were re-designed by replacing poor-performing SNPs. Pilot studies were successfully performed for diversity analysis, QTL mapping, marker-assisted backcrossing, and developing specialized genetic stocks, demonstrating that 384-plex SNP genotyping on the BeadXpress platform is a robust and efficient method for marker genotyping in rice.

17.
How does asexual reproduction influence genome evolution? Although it is clear that genomic structural variation is common and important in natural populations, we know very little about how one of the most fundamental of eukaryotic traits, mode of genomic inheritance, influences genome structure. We address this question with the New Zealand freshwater snail Potamopyrgus antipodarum, which features multiple separately derived obligately asexual lineages that coexist and compete with otherwise similar sexual lineages. We used whole-genome sequencing reads from a diverse set of sexual and asexual individuals to analyze genomic abundance of a critically important gene family, rDNA (the genes encoding rRNAs), that is notable for dynamic and variable copy number. Our genomic survey of rDNA in P. antipodarum revealed two striking results. First, the core histone and 5S rRNA genes occur between tandem copies of the 18S–5.8S–28S gene cluster, a unique architecture for these crucial gene families. Second, asexual P. antipodarum harbor dramatically more rDNA–histone copies than sexuals, which we validated through molecular and cytogenetic analysis. The repeated expansion of this genomic region in asexual P. antipodarum lineages following distinct transitions to asexuality represents a dramatic genome structural change associated with asexual reproduction, with potential functional consequences related to the loss of sexual reproduction.

18.
Two sets of iso-1-cytochrome c variants have been prepared with N-terminal insertions of pure polyglutamine, i.e., PolyQ variants, or polyglutamine interrupted with lysine every sixth residue, i.e., Gln-rich variants. The polymer properties of these pure polyGln or Gln-rich sequences have been evaluated using equilibrium and kinetic His-heme loop formation methods for loop sizes ranging from 22 to 46 in 1.5, 3.0, and 6.0 M guanidine hydrochloride (GdnHCl). In 6.0 M GdnHCl, the scaling exponent, ν3, for the pure polyGln sequences is ∼1.7, significantly less than ν3 ≈ 2.15 for the Gln-rich sequences. The stability of the His-heme loops becomes progressively greater for the pure polyGln sequences relative to the Gln-rich sequences as GdnHCl concentration decreases from 6.0 to 1.5 M. Thus, the context of the sequence affects the polymer properties of Gln repeats even in denaturing concentrations of GdnHCl. Comparison of data for the Gln-rich variants with previous results for Gly-rich and Ala-rich variants shows that ν3 ∼ 2.2 for the Gln-rich, Gly-rich, and Ala-rich sequences in 6.0 M GdnHCl, whereas ν3 remains unchanged at 3.0 M GdnHCl concentration for the Gln-rich and Ala-rich sequences but decreases to ∼1.7 for the Gly-rich sequences. Thus, the polymer properties of Gln-rich and Ala-rich sequences are less sensitive to solvent quality in denaturing solutions of GdnHCl than Gly-rich sequences. Evaluation of Flory’s characteristic ratio, Cn, for the Gln-rich and Ala-rich sequences relative to the Gly-rich sequences shows that Gln-rich sequences are stiffer than Ala-rich sequences at both 3.0 and 6.0 M GdnHCl.

19.
20.
Biological age measures outperform chronological age in predicting various aging outcomes, yet little is known regarding genetic predisposition. We performed genome‐wide association scans of two age‐adjusted biological age measures (PhenoAgeAcceleration and BioAgeAcceleration), estimated from clinical biochemistry markers (Levine et al., 2018; Levine, 2013) in European‐descent participants from UK Biobank. The strongest signals were found in the APOE gene, tagged by the two major protein‐coding SNPs: for PhenoAgeAccel, rs429358 (APOE e4 determinant) (p = 1.50 × 10^−72); for BioAgeAccel, rs7412 (APOE e2 determinant) (p = 3.16 × 10^−60). Interestingly, we observed inverse APOE e2 and e4 associations and unique pathway enrichments when comparing the two biological age measures. Genes associated with BioAgeAccel were enriched in lipid related pathways, while genes associated with PhenoAgeAccel showed enrichment for immune system, cell function, and carbohydrate homeostasis pathways, suggesting the two measures capture different aging domains. Our study reaffirms that aging patterns are heterogeneous across individuals, and the manner in which a person ages may be partly attributed to genetic predisposition.
