Similar literature
 20 similar articles found
1.
In this study, we investigated the genetic variants, including SNPs and indels (short insertions or deletions, less than 50 bp in length), in the genomes and genetic structures of five pig populations (in the northern Taihu Lake region, Jiangsu Province) using the genotyping by genome reducing and sequencing (GGRS) approach. A total of 581 million good reads with an average depth of 11× and an average coverage of 2.16% were used to call variants. In general, 202 106 SNPs and 34 415 indels were obtained, of which 2690 SNPs and 224 indels were capable of inducing protein‐coding changes. The genes containing these variants were extracted for functional annotation. The results of gene enrichment analysis revealed that the SNPs under investigation may be associated with reproduction, disease resistance, meat quality and adipose tissue traits, whereas the indels were associated mainly with adipose tissue and disease. Analysis of the genetic structure showed that each population displayed comparable, large differentiations from the others, indicating their uniqueness. In conclusion, the results of our study provide the first genomic overview of the genetic variants and population structures of five Chinese indigenous pig populations.

2.
3.
Identification of genomic variants within dogs is important for understanding genetic factors contributing to breed diversity and phenotypic traits. This study aimed to identify sources of variation in the Bullmastiff using high‐density signal intensity and whole‐genome sequence data. Close to 3000 copy number variants (CNVs) were identified in Bullmastiff dogs using Canine HD BeadChip data. When CNVs were collated, 82 CNV regions (CNVRs) were detected, 50% in transcribed regions encompassing 432 genes. Fifty of the CNVRs detected have not been reported in other breeds and represent potential breed‐specific variants. A proportion of the CNVR variants with predicted modifying effects on gene pathways may contribute to breed traits. Approximately 5 million putative variants per dog, inclusive of single nucleotide polymorphisms (SNPs), multi‐nucleotide polymorphisms (MNPs) and insertions and deletions (INDELs), were identified from DNA sequence data on a small number of animals. Identification of genetic variants in the Bullmastiff highlights sources of variation in the breed and molecular markers that will assist in future trait and disease investigations in dogs.

4.
Whole‐genome sequencing studies are vital to gain a thorough understanding of genomic variation. Here, we summarize the results of a whole‐genome sequencing study comprising 88 horses and ponies from diverse breeds at 19.1× average coverage. The paired‐end reads were mapped to the current EquCab3.0 horse reference genome assembly, and we identified approximately 23.5 million single nucleotide variants and 2.3 million short indel variants. Our dataset included at least 7 million variants that were not previously reported. On average, each individual horse genome carried ~5.7 million single nucleotide variants and 0.8 million small indel variants with respect to the reference genome assembly. The variants were functionally annotated. We provide two examples for potentially deleterious recessive alleles that were identified in a heterozygous state in individual genome sequences. Appropriate management of such deleterious recessive alleles in horse breeding programs should help to improve fertility and reduce the prevalence of heritable diseases. This comprehensive dataset has been made publicly available, will represent a valuable resource for future horse genetic studies and supports the goal of accelerating the rates of genetic gain in domestic horses.

5.
Heterozygosity is an important feature of many plant genomes, and is related to heterosis. Sweet orange, a highly heterozygous species, is thought to have originated from an inter‐species hybrid between pummelo and mandarin. To investigate the heterozygosity of the sweet orange genome and examine how this heterozygosity affects gene expression, we characterized the genome of Valencia orange for single nucleotide variations (SNVs), small insertions and deletions (InDels) and structural variations (SVs), and determined their functional effects on protein‐coding genes and non‐coding sequences. Almost half of the genes containing large‐effect SNVs and InDels were expressed in a tissue‐specific manner. We identified 3542 large SVs (>50 bp), including deletions, insertions and inversions. Most of the 296 genes located in large‐deletion regions showed low expression levels. RNA‐Seq reads and DNA sequencing reads revealed that the alleles of 1062 genes were differentially expressed. In addition, we detected approximately 42 Mb of contigs that were not found in the reference genome of a haploid sweet orange by de novo assembly of unmapped reads, and annotated 134 protein‐coding genes within these contigs. We discuss how this heterozygosity affects the quality of genome assembly. This study advances our understanding of the genome architecture of sweet orange, and provides a global view of gene expression at heterozygous loci.

6.
Whole-genome sequencing is becoming commonplace, but the accuracy and completeness of variant calling by the most widely used platforms from Illumina and Complete Genomics have not been reported. Here we sequenced the genome of an individual with both technologies to a high average coverage of ~76×, and compared their performance with respect to sequence coverage and calling of single-nucleotide variants (SNVs), insertions and deletions (indels). Although 88.1% of the ~3.7 million unique SNVs were concordant between platforms, there were tens of thousands of platform-specific calls located in genes and other genomic regions. In contrast, 26.5% of indels were concordant between platforms. Target enrichment validated 92.7% of the concordant SNVs, whereas validation by genotyping array revealed a sensitivity of 99.3%. The validation experiments also suggested that >60% of the platform-specific variants were indeed present in the genome. Our results have important implications for understanding the accuracy and completeness of the genome sequencing platforms.

7.
U87MG is a commonly studied grade IV glioma cell line that has been analyzed in at least 1,700 publications over four decades. In order to comprehensively characterize the genome of this cell line and to serve as a model of broad cancer genome sequencing, we have generated greater than 30× genomic sequence coverage using a novel 50-base mate paired strategy with a 1.4 kb mean insert library. A total of 1,014,984,286 mate-end and 120,691,623 single-end two-base encoded reads were generated from five slides. All data were aligned using a custom designed tool called BFAST, allowing optimal color space read alignment and accurate identification of DNA variants. The aligned sequence reads and mate-pair information identified 35 interchromosomal translocation events, 1,315 structural variations (>100 bp), 191,743 small (<21 bp) insertions and deletions (indels), and 2,384,470 single nucleotide variations (SNVs). Among these observations, the known homozygous mutation in PTEN was robustly identified, and genes involved in cell adhesion were overrepresented in the mutated gene list. Data were compared to 219,187 heterozygous single nucleotide polymorphisms assayed by Illumina 1M Duo genotyping array to assess accuracy: 93.83% of all SNPs were reliably detected at filtering thresholds that yield greater than 99.99% sequence accuracy. Protein coding sequences were disrupted predominantly in this cancer cell line due to small indels, large deletions, and translocations. In total, 512 genes were homozygously mutated, including 154 by SNVs, 178 by small indels, 145 by large microdeletions, and 35 by interchromosomal translocations to reveal a highly mutated cell line genome. Of the small homozygously mutated variants, 8 SNVs and 99 indels were novel events not present in dbSNP. These data demonstrate that routine generation of broad cancer genome sequence is possible outside of genome centers. The sequence analysis of U87MG provides an unparalleled level of mutational resolution compared to any cell line to date.

8.
9.
Casuarina equisetifolia (C. equisetifolia), a conifer‐like angiosperm with resistance to typhoons and stress tolerance, is mainly cultivated in the coastal areas of Australasia. These traits make C. equisetifolia a valuable model to study secondary growth associated genes and stress‐tolerance traits. However, the genome sequence is unavailable and therefore wood‐associated growth rate and stress resistance at the molecular level are largely unexplored. We therefore constructed a high‐quality draft genome sequence of C. equisetifolia by a combination of Illumina second‐generation sequencing reads and Pacific Biosciences single‐molecule real‐time (SMRT) long reads to advance the investigation of this species. Here, we report the genome assembly, which contains approximately 300 megabases (Mb) with a scaffold N50 of 1.06 Mb. Additionally, gene annotation, assisted by a combination of prediction and RNA‐seq data, generated 29 827 annotated protein‐coding genes and 1983 non‐coding genes. Furthermore, we found that repetitive sequences account for one‐third of the genome assembly. Here we also construct the genome‐wide map of DNA modifications, such as the two novel forms N6‐methyladenine (6mA) and N4‐methylcytosine (4mC), at single‐nucleotide resolution using single‐molecule real‐time (SMRT) sequencing. Interestingly, we found that 17% of 6mA modification genes and 15% of 4mC modification genes also included alternative splicing events. Finally, we investigated cellulose, hemicellulose, and lignin‐related genes, which were associated with secondary growth and contained different DNA modifications. The high‐quality genome sequence and annotation of C. equisetifolia in this study provide a valuable resource to strengthen our understanding of the diverse traits of trees.

10.
Whole genome sequencing studies are essential to obtain a comprehensive understanding of the vast pattern of human genomic variations. Here we report the results of a high-coverage whole genome sequencing study for 44 unrelated healthy Caucasian adults, each sequenced to over 50-fold coverage (averaging 65.8×). We identified approximately 11 million single nucleotide polymorphisms (SNPs), 2.8 million short insertions and deletions, and over 500,000 block substitutions. We showed that, although previous studies, including the 1000 Genomes Project Phase 1 study, have catalogued the vast majority of common SNPs, many of the low-frequency and rare variants remain undiscovered. For instance, approximately 1.4 million SNPs and 1.3 million short indels that we found were novel to both the dbSNP and the 1000 Genomes Project Phase 1 data sets, and the majority of these (∼96%) have a minor allele frequency less than 5%. On average, each individual genome carried ∼3.3 million SNPs and ∼492,000 indels/block substitutions, including approximately 179 variants that were predicted to cause loss of function of the gene products. Moreover, each individual genome carried an average of 44 such loss-of-function variants in a homozygous state, which would completely “knock out” the corresponding genes. Across all the 44 genomes, a total of 182 genes were “knocked out” in at least one individual genome, among which 46 genes were “knocked out” in over 30% of our samples, suggesting that a number of genes are commonly “knocked out” in general populations. Gene ontology analysis suggested that these commonly “knocked-out” genes are enriched in biological processes related to antigen processing and immune response. Our results contribute towards a comprehensive characterization of human genomic variation, especially for less-common and rare variants, and provide an invaluable resource for future genetic studies of human variation and diseases.

11.
Onychostoma macrolepis is an emerging commercial cyprinid fish species. It is a model system for studies of sexual dimorphism and genome evolution. Here, we report the chromosome‐level assembly of the O. macrolepis genome obtained from the integration of nanopore long‐read sequencing with physical maps produced using Bionano and Hi‐C technology. A total of 87.9 Gb of nanopore sequence provided approximately 100‐fold coverage of the genome. The preliminary genome assembly was 883.2 Mb in size with a contig N50 size of 11.2 Mb. The 969 corrected contigs obtained from Bionano optical mapping were assembled into 853 scaffolds and produced an assembly of 886.5 Mb with a scaffold N50 of 16.5 Mb. Finally, using the Hi‐C data, 881.3 Mb (99.4% of genome) in 526 scaffolds were anchored and oriented in 25 chromosomes ranging in size from 25.27 to 56.49 Mb. In total, 24,770 protein‐coding genes were predicted in the genome, and ~96.85% of the genes were functionally annotated. The annotated assembly contains 93.3% complete genes from the BUSCO reference set. In addition, we identified 409 Mb (46.23% of the genome) of repetitive sequence, and 11,213 non‐coding RNAs, in the genome. Evolutionary analysis revealed that O. macrolepis diverged from common carp approximately 24.25 million years ago. The chromosomes of O. macrolepis showed an unambiguous correspondence to the chromosomes of zebrafish. The high‐quality genome assembled in this work provides a valuable genomic resource for further biological and evolutionary studies of O. macrolepis.

12.
The Chinese Taihu pig breeds are an invaluable component of the world's pig genetic resources, and they are the most prolific breeds of swine in the world. In this study, the genomes of 252 pigs of the six indigenous breeds in the Taihu Lake region were sequenced using the genotyping by genome reducing and sequencing approach. A total of 950 million good reads were obtained using an Illumina Hiseq2000 at an average depth of 13× (for SNP calling) and an average coverage of 2.3%. In total, 122 632 indels, 31 444 insertions, 44 056 deletions and 455 CNVs (copy number variants) were identified in the genomes of the pigs. Approximately 2.3% of these genetic markers were mapped to gene exon regions, and 25% were in QTL regions related to economically important traits. The KEGG pathway or GO enrichment analyses revealed that genetic variants assumed to be large‐effect mutations were significantly overrepresented in 22 SNP, 56 indel, 26 insertion, 28 deletion and three CNV gene sets. A total of 343 breed‐specific SNPs were also identified in the six Chinese indigenous pigs. The findings from this study can contribute to future investigations of the genetic diversity, population structure, positive selection signals and molecular evolutionary history of these pigs at the genome level and can serve as a valuable reference for improving the breeding and cultivation of these pigs.

13.
Sarcophaga peregrina is considered to be of great ecological, medical and forensic significance, and has unusual biological characteristics such as an ovoviviparous reproductive pattern and adaptation to feed on carrion. The availability of a high‐quality genome will help to further reveal the mechanisms underlying these characteristics. Here we present a de novo‐assembled genome at chromosome scale for S. peregrina. The final assembled genome was 560.31 Mb with contig N50 of 3.84 Mb. Hi‐C scaffolding reliably anchored six pseudochromosomes, accounting for 97.76% of the assembled genome. Moreover, 45.70% of repeat elements were identified in the genome. A total of 14,476 protein‐coding genes were functionally annotated, accounting for 92.14% of all predicted genes. Phylogenetic analysis indicated that S. peregrina and S. bullata diverged ~7.14 million years ago. Comparative genomic analysis revealed expanded and positively selected genes related to biological features that aid in clarifying its ovoviviparous reproduction and carrion‐feeding adaptations, such as lipid metabolism, olfactory receptor activity, antioxidant enzymes, proteolysis and serine‐type endopeptidase activity. Protein‐coding genes associated with ovoviviparity, such as yolk proteins, transferrin and acid sphingomyelinase, were identified. This study provides a valuable genomic resource for S. peregrina, and sheds insight into further revealing the underlying molecular mechanisms of adaptive evolution.

14.
Insertions and deletions (indels) in human genomes are associated with a wide range of phenotypes, including various clinical disorders. High-throughput, next generation sequencing (NGS) technologies enable the detection of short genetic variants, such as single nucleotide variants (SNVs) and indels. However, the variant calling accuracy for indels remains considerably lower than for SNVs. Here we present a comparative study of the performance of variant calling tools for indel calling, evaluated with a wide repertoire of NGS datasets. While there is no single optimal tool to suit all circumstances, our results demonstrate that the choice of variant calling tool greatly impacts the precision and recall of indel calling. Furthermore, to reliably detect indels, it is essential to choose NGS technologies that offer a long read length and high coverage coupled with specific variant calling tools.

15.
Grapevine (Vitis vinifera L.) is one of the world's most important crop plants, which is of large economic value for fruit and wine production. There is much interest in identifying genomic variations and their functional effects on inter‐varietal, phenotypic differences. Using an approach developed for the analysis of human and mammalian genomes, which combines high‐throughput sequencing, array comparative genomic hybridization, fluorescent in situ hybridization and quantitative PCR, we created an inter‐varietal atlas of structural variations and single nucleotide variants (SNVs) for the grapevine genome analyzing four economically and genetically relevant table grapevine varieties. We found 4.8 million SNVs and detected 8% of the grapevine genome to be affected by genomic variations. We identified more than 700 copy number variation (CNV) regions and more than 2000 genes subjected to CNV as potential candidates for phenotypic differences between varieties.

16.
The Ehlers‐Danlos syndromes (EDSs) are a heterogeneous group of inherited connective tissue disorders characterized by skin hyperextensibility, joint hypermobility and tissue fragility. Inherited disorders similar to human EDS have been reported in different mammalian species. In the present study, we investigated a female mixed‐breed dog with clinical signs of EDS. Whole‐genome sequencing of the affected dog revealed two missense variants in the TNXB gene, encoding the extracellular matrix protein tenascin XB. In humans, TNXB genetic variants cause classical‐like EDS or the milder hypermobile EDS. The affected dog was heterozygous at both identified variants. Each variant allele was transmitted from one of the case's parents, consistent with compound heterozygosity. Although one of the variant alleles, XM_003431680.3:c.2012G>A, p.(Ser671Asn), was private to the family of the affected dog and absent from whole‐genome sequencing data of 599 control dogs, the second variant allele, XM_003431680.3:c.2900G>A, p.(Gly967Asp), is present at a low frequency in the Chihuahua and Poodle population. Given that TNXB is a functional candidate gene for EDS, we suggest that compound heterozygosity for the identified TNXB variants may have caused the EDS‐like phenotype in the affected dog. Chihuahuas and Poodles should be monitored for EDS cases, which might confirm the hypothesized pathogenic effect of the segregating TNXB variant.

17.
Although pioneering sequencing projects have shed light on the boxer and poodle genomes, a number of challenges need to be met before the sequencing and annotation of the dog genome can be considered complete. Here, we present the DNA sequence of the Jindo dog genome, sequenced to 45-fold average coverage using Illumina massively parallel sequencing technology. A comparison of the sequence to the reference boxer genome led to the identification of 4 675 437 single nucleotide polymorphisms (SNPs, including 3 346 058 novel SNPs), 71 642 indels and 8131 structural variations. Of these, 339 non-synonymous SNPs and 3 indels are located within coding sequences (CDS). In particular, 3 non-synonymous SNPs and a 26-bp deletion occur in the TCOF1 locus, implying that the difference observed in cranial facial morphology between Jindo and boxer dogs might be influenced by those variations. Through the annotation of the Jindo olfactory receptor gene family, we found 2 unique olfactory receptor genes and 236 olfactory receptor genes harbouring non-synonymous homozygous SNPs that are likely to affect smelling capability. In addition, we determined the DNA sequence of the Jindo dog mitochondrial genome and identified Jindo dog-specific mtDNA genotypes. These Jindo genome data upgrade our understanding of dog genomic architecture and will be a very valuable resource for investigating not only dog genetics and genomics but also human and dog disease genetics and comparative genomics.

18.
Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a more significant effect on phenotypic variation than do other types of genetic variants. Hence, a comprehensive list of these functional variants would be of considerable interest in swine genomic studies, particularly those targeting fertility and production traits. Whole‐genome sequence was obtained from 72 of the founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the founding boars (12 Duroc and 12 Landrace) and 48 Yorkshire–Landrace composite sows. Sequence reads were mapped to the Sscrofa10.2 genome build, resulting in a mean of 6.1 fold (×) coverage per genome. A total of 22 342 915 high confidence SNPs were identified from the sequenced genomes. These included 21 million previously reported SNPs and 79% of the 62 163 SNPs on the PorcineSNP60 BeadChip assay. Variation was detected in the coding sequence or untranslated regions (UTRs) of 87.8% of the genes in the porcine genome: loss‐of‐function variants were predicted in 504 genes, 10 202 genes contained nonsynonymous variants, 10 773 had variation in UTRs and 13 010 genes contained synonymous variants. Approximately 139 000 SNPs were classified as loss‐of‐function, nonsynonymous or regulatory, which suggests that over 99% of the variation detected in our pigs could potentially be ignored, allowing us to focus on a much smaller number of functional SNPs during future analyses.

19.
Ecological and environmental heterogeneity can produce genetic differentiation in highly mobile species. Accordingly, local adaptation may be expected across comparatively short distances in the presence of marked environmental gradients. Within the European continent, wolves (Canis lupus) exhibit distinct north–south population differentiation. We investigated more than 67‐K single nucleotide polymorphism (SNP) loci for signatures of local adaptation in 59 unrelated wolves from four previously identified population clusters (northcentral Europe n = 32, Carpathian Mountains n = 7, Dinaric‐Balkan n = 9, Ukrainian Steppe n = 11). Our analyses combined identification of outlier loci with findings from genome‐wide association study of individual genomic profiles and 12 environmental variables. We identified 353 candidate SNP loci. We examined the SNP position and neighboring megabase (1 Mb, one million bases) regions in the dog (C. lupus familiaris) genome for genes potentially under selection, including homologue genes in other vertebrates. These regions included functional genes (for example, for temperature regulation) that may indicate local adaptation, and genes controlling functions universally important for wolves, including olfaction, hearing, vision, and cognitive functions. We also observed strong outliers not associated with any of the investigated variables, which could suggest selective pressures associated with other unmeasured environmental variables and/or demographic factors. These patterns are further supported by the examination of spatial distributions of the SNPs associated with universally important traits, which typically show marked differences in allele frequencies among population clusters. Accordingly, parallel selection for features important to all wolves may eclipse local environmental selection and implies long‐term separation among population clusters.

20.
The ladybird beetle Propylea japonica is an important natural enemy in agro‐ecological systems. Studies on the strong tolerance of P. japonica to high temperatures and insecticides, and its population and phenotype diversity have recently increased. However, abundant genome resources for obtaining insights into stress‐resistance mechanisms and genetic intra‐species diversity for P. japonica are lacking. Here, we constructed the P. japonica genome maps using Pacific Bioscience (PacBio) and Illumina sequencing technologies. The genome size was 850.90 Mb with a contig N50 of 813.13 kb. The Hi‐C sequence data were used to upgrade draft genome assemblies; 4,777 contigs were assembled to 10 chromosomes; and the final draft genome assembly was 803.93 Mb with a contig N50 of 813.98 kb and a scaffold N50 of 100.34 Mb. Approximately 495.38 Mb of repeated sequences was annotated. A total of 18,018 protein‐coding genes were predicted, of which 95.78% were functionally annotated, and 1,407 genes were species‐specific. The phylogenetic analysis showed that P. japonica diverged from the ancestor of Anoplophora glabripennis and Tribolium castaneum ~236.21 million years ago. We detected that some important gene families involved in detoxification of pesticides and tolerance to heat stress were expanded in P. japonica, especially cytochrome P450 and Hsp70 genes. Overall, the high‐quality draft genome sequence of P. japonica will provide an invaluable resource for understanding the molecular mechanisms of stress resistance and will facilitate the research on population genetics, evolution and phylogeny of Coccinellidae. This genome will also provide new avenues for conserving the diversity of predator insects.

