Similar articles
 20 similar articles found; search took 31 ms
1.
The domestic dog serves as an excellent model to investigate the genetic basis of disease. More than 400 heritable traits analogous to human diseases have been described in dogs. To further canine medical genetics research, we established the Dog Biomedical Variant Database Consortium (DBVDC) and present a comprehensive list of functionally annotated genome variants that were identified with whole genome sequencing of 582 dogs from 126 breeds and eight wolves. The genomes used in the study have a minimum coverage of 10× and an average coverage of ~24×. In total, we identified 23 133 692 single‐nucleotide variants (SNVs) and 10 048 038 short indels, including 93% undescribed variants. On average, each individual dog genome carried ~4.1 million single‐nucleotide and ~1.4 million short‐indel variants with respect to the reference genome assembly. About 2% of the variants were located in coding regions of annotated genes and loci. Variant effect classification showed 247 141 SNVs and 99 562 short indels having moderate or high impact on 11 267 protein‐coding genes. On average, each genome contained heterozygous loss‐of‐function variants in 30 potentially embryonic lethal genes and 97 genes associated with developmental disorders. More than 50 inherited disorders and traits have been unravelled using the DBVDC variant catalogue, enabling genetic testing for breeding and diagnostics. This resource of annotated variants and their corresponding genotype frequencies constitutes a highly useful tool for the identification of potential variants causative for rare inherited disorders in dogs.

2.
The Chinese Taihu pig breeds are an invaluable component of the world's pig genetic resources, and they are the most prolific breeds of swine in the world. In this study, the genomes of 252 pigs of the six indigenous breeds in the Taihu Lake region were sequenced using the genotyping by genome reducing and sequencing approach. A total of 950 million good reads were obtained using an Illumina HiSeq 2000 at an average depth of 13× (for SNP calling) and an average coverage of 2.3%. In total, 122 632 indels, 31 444 insertions, 44 056 deletions and 455 CNVs (copy number variants) were identified in the genomes of the pigs. Approximately 2.3% of these genetic markers were mapped to gene exon regions, and 25% were in QTL regions related to economically important traits. The KEGG pathway or GO enrichment analyses revealed that genetic variants assumed to be large‐effect mutations were significantly overrepresented in 22 SNP, 56 indel, 26 insertion, 28 deletion and three CNV gene sets. A total of 343 breed‐specific SNPs were also identified in the six Chinese indigenous pigs. The findings from this study can contribute to future investigations of the genetic diversity, population structure, positive selection signals and molecular evolutionary history of these pigs at the genome level and can serve as a valuable reference for improving the breeding and cultivation of these pigs.

3.
The diploid genome sequence of an individual human   Cited by: 4 (self-citations: 1; citations by others: 3)
Presented here is a genome sequence of an individual human. It was produced from ∼32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels) (1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.

4.
The single‐humped dromedary (Camelus dromedarius) is the most numerous and widespread of domestic camel species and is a significant source of meat, milk, wool, transportation and sport for millions of people. Dromedaries are particularly well adapted to hot, desert conditions and harbour a variety of biological and physiological characteristics with evolutionary, economic and medical importance. To understand the genetic basis of these traits, an extensive resource of genomic variation is required. In this study, we assembled at 65× coverage, a 2.06 Gb draft genome of a female dromedary whose ancestry can be traced to an isolated population from the Canary Islands. We annotated 21 167 protein‐coding genes and estimated ~33.7% of the genome to be repetitive. A comparison with the recently published draft genome of an Arabian dromedary resulted in 1.91 Gb of aligned sequence with a divergence of 0.095%. An evaluation of our genome with the reference revealed that our assembly contains more error‐free bases (91.2%) and fewer scaffolding errors. We identified ~1.4 million single‐nucleotide polymorphisms with a mean density of 0.71 × 10⁻³ per base. An analysis of demographic history indicated that changes in effective population size corresponded with recent glacial epochs. Our de novo assembly provides a useful resource of genomic variation for future studies of the camel's adaptations to arid environments and economically important traits. Furthermore, these results suggest that draft genome assemblies constructed with only two differently sized sequencing libraries can be comparable to those sequenced using additional library sizes, highlighting that additional resources might be better placed in technologies alternative to short‐read sequencing to physically anchor scaffolds to genome maps.

5.
Populus euphratica is well adapted to extreme desert environments and is an important model species for elucidating the mechanisms of abiotic stress resistance in trees. The current assembly of the P. euphratica genome is highly fragmented with many gaps and errors, thereby impeding downstream applications. Here, we report an improved chromosome‐level reference genome of P. euphratica (v2.0) using single‐molecule sequencing and chromosome conformation capture (Hi‐C) technologies. Relative to the previous reference genome, our assembly represents a nearly 60‐fold improvement in contiguity, with a scaffold N50 size of 28.59 Mb. Using this genome, we have found that extensive expansion of Gypsy elements in P. euphratica led to its rapid increase in genome size compared to any other Salicaceae species studied to date, and potentially contributed to adaptive divergence driven by insertions near genes involved in stress tolerance. We also detected a wide range of unique structural rearrangements in P. euphratica, including 2,549 translocations, 454 inversions, 121 tandem and 14 segmental duplications. Several key genes likely to be involved in tolerance to abiotic stress were identified within these regions. This high‐quality genome represents a valuable resource for poplar breeding and genetic improvement in the future, as well as comparative genomic analysis with other Salicaceae species.

6.
White spotting phenotypes in horses can range in severity from the common white markings up to completely white horses. EDNRB, KIT, MITF, PAX3 and TRPM1 represent known candidate genes for such phenotypes in horses. For the present study, we re‐investigated a large horse family segregating a variable white spotting phenotype, for which conventional Sanger sequencing of the candidate genes’ individual exons had failed to reveal the causative variant. We obtained whole genome sequence data from an affected horse and specifically searched for structural variants in the known candidate genes. This analysis revealed a heterozygous ~1.9‐kb deletion spanning exons 10–13 of the KIT gene (chr3:77,740,239_77,742,136del1898insTATAT). In continuity with previously named equine KIT variants we propose to designate the newly identified deletion variant W22. We had access to 21 horses carrying the W22 allele. Four of them were compound heterozygous W20/W22 and had a completely white phenotype. Our data suggest that W22 represents a true null allele of the KIT gene, whereas the previously identified W20 leads to a partial loss of function. These findings will enable more precise genetic testing for depigmentation phenotypes in horses.

7.
We report on a whole‐genome draft sequence of rye (Secale cereale L.). Rye is a diploid Triticeae species closely related to wheat and barley, and an important crop for food and feed in Central and Eastern Europe. Through whole‐genome shotgun sequencing of the 7.9‐Gbp genome of the winter rye inbred line Lo7 we obtained a de novo assembly represented by 1.29 million scaffolds covering a total length of 2.8 Gbp. Our reference sequence represents nearly the entire low‐copy portion of the rye genome. This genome assembly was used to predict 27 784 rye gene models based on homology to sequenced grass genomes. Through resequencing of 10 rye inbred lines and one accession of the wild relative S. vavilovii, we discovered more than 90 million single nucleotide variants and short insertions/deletions in the rye genome. From these variants, we developed the high‐density Rye600k genotyping array with 600 843 markers, which enabled anchoring the sequence contigs along a high‐density genetic map and establishing a synteny‐based virtual gene order. Genotyping data were used to characterize the diversity of rye breeding pools and genetic resources, and to obtain a genome‐wide map of selection signals differentiating the divergent gene pools. This rye whole‐genome sequence closes a gap in Triticeae genome research, and will be highly valuable for comparative genomics, functional studies and genome‐based breeding in rye.

8.
Bivalves, a highly diverse and the most evolutionarily successful class of invertebrates native to aquatic habitats, provide valuable molecular resources for understanding evolutionary adaptation and aquatic ecology. Here, we report a high‐quality chromosome‐level genome assembly of the razor clam Sinonovacula constricta using Pacific Biosciences single‐molecule real‐time sequencing, Illumina paired‐end sequencing, 10X Genomics linked‐reads and Hi‐C reads. The genome size was 1,220.85 Mb, with a scaffold N50 of 65.93 Mb and a contig N50 of 976.94 Kb. A total of 899 complete (91.92%) and seven partial (0.72%) matches of the 978 metazoa Benchmarking Universal Single‐Copy Orthologs were determined in this genome assembly. Hi‐C scaffolding of the genome resulted in 19 pseudochromosomes. A total of 28,594 protein‐coding genes were predicted in the S. constricta genome, of which 25,413 genes (88.88%) were functionally annotated. In addition, 39.79% of the assembled genome was composed of repetitive sequences, and 4,372 noncoding RNAs were identified. The enrichment analyses of the significantly expanded and contracted genes suggested an evolutionary adaptation of S. constricta to highly stressful living environments. In summary, the genomic resources generated in this work not only provide a valuable reference genome for investigating the molecular mechanisms of S. constricta biological functions and evolutionary adaptation, but also facilitate its genetic improvement and disease treatment. Meanwhile, the obtained genome greatly improves our understanding of the genetics of molluscs and their comparative evolution.

9.
Cicer arietinum L. (chickpea) is the third most important food legume crop. We have generated the draft sequence of a desi‐type chickpea genome using next‐generation sequencing platforms, bacterial artificial chromosome end sequences and a genetic map. The 520‐Mb assembly covers 70% of the predicted 740‐Mb genome length, and more than 80% of the gene space. Genome analysis predicts the presence of 27 571 genes and 210 Mb as repeat elements. The gene expression analysis performed using 274 million RNA‐Seq reads identified several tissue‐specific and stress‐responsive genes. Although segmental duplicated blocks are observed, the chickpea genome does not exhibit any indication of recent whole‐genome duplication. Nucleotide diversity analysis provides an assessment of a narrow genetic base within the chickpea cultivars. We have developed a resource for genetic markers by comparing the genome sequences of one wild and three cultivated chickpea genotypes. The draft genome sequence is expected to facilitate genetic enhancement and breeding to develop improved chickpea varieties.

10.
11.
Ficus erecta, a wild relative of the common fig (F. carica), is a donor of Ceratocystis canker resistance in fig breeding programmes. Interspecific hybridization followed by recurrent backcrossing is an effective method to transfer the resistance trait from wild to cultivated fig. However, this process is time consuming and labour intensive for trees, especially for gynodioecious plants such as fig. In this study, genome resources were developed for F. erecta to facilitate fig breeding programmes. The genome sequence of F. erecta was determined using single‐molecule real‐time sequencing technology. The resultant assembly spanned 331.6 Mb with 538 contigs and an N50 length of 1.9 Mb, from which 51 806 high‐confidence genes were predicted. Pseudomolecule sequences corresponding to the chromosomes of F. erecta were established with a genetic map based on single nucleotide polymorphisms from double‐digest restriction‐site‐associated DNA sequencing. Subsequent linkage analysis and whole‐genome resequencing identified a candidate gene for the Ceratocystis canker resistance trait. Genome‐wide genotyping analysis enabled the selection of female lines that possessed resistance and effective elimination of the donor genome from the progeny. The genome resources provided in this study will accelerate and enhance disease‐resistance breeding programmes in fig.

12.
The 1.5 Gbp/2C genome of pedunculate oak (Quercus robur) has been sequenced. A strategy was established for dealing with the challenges imposed by the sequencing of such a large, complex and highly heterozygous genome by a whole‐genome shotgun (WGS) approach, without the use of costly and time‐consuming methods, such as fosmid or BAC clone‐based hierarchical sequencing methods. The sequencing strategy combined short and long reads. Over 49 million reads provided by Roche 454 GS‐FLX technology were assembled into contigs and combined with shorter Illumina sequence reads from paired‐end and mate‐pair libraries of different insert sizes, to build scaffolds. Errors were corrected and gaps filled with Illumina paired‐end reads and contaminants detected, resulting in a total of 17 910 scaffolds (>2 kb) corresponding to 1.34 Gb. Fifty per cent of the assembly was accounted for by 1468 scaffolds (N50 of 260 kb). Initial comparison with the phylogenetically related Prunus persica gene model indicated that genes for 84.6% of the proteins present in peach (mean protein coverage of 90.5%) were present in our assembly. The second and third steps in this project are genome annotation and the assignment of scaffolds to the oak genetic linkage map. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement, the oak genome data have been released into public sequence repositories in advance of publication. In this presubmission paper, the oak genome consortium describes its principal lines of work and future directions for analyses of the nature, function and evolution of the oak genome.
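The N50 statistic quoted in the oak abstract (and in several other entries in this list) can be illustrated with a minimal sketch. The function name `n50` and the toy scaffold lengths are assumptions made for illustration; they are not taken from any of the cited papers or their pipelines. The sketch uses the common definition: sort scaffolds from longest to shortest and report the length at which the running total first reaches half of the assembly size, along with how many scaffolds that took.

```python
def n50(lengths):
    """Return (N50, L50) for a list of scaffold or contig lengths.

    N50 is the length of the scaffold at which the cumulative length,
    counted from the longest scaffold down, first reaches half of the
    total assembly length. L50 is the number of scaffolds needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be a non-empty list of positive numbers")


# Toy example (hypothetical lengths in kb, not real data).
toy_lengths = [80, 70, 50, 40, 30, 20]
n50_value, l50_value = n50(toy_lengths)
print(n50_value, l50_value)
```

Read this way, the abstract's statement that 1468 scaffolds account for fifty per cent of the assembly corresponds to the scaffold count (often called L50), and 260 kb to the N50 itself.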

13.
14.
The Tetraodontidae family are known to have relatively small and compact genomes compared to other vertebrates. The obscure puffer fish Takifugu obscurus is an anadromous species that migrates to freshwater from the sea for spawning. Thus the euryhaline characteristics of T. obscurus have been investigated to gain understanding of their survival ability, osmoregulation, and other homeostatic mechanisms in both freshwater and seawater. In this study, a high quality chromosome‐level reference genome for T. obscurus was constructed using long‐read Pacific Biosciences (PacBio) Sequel sequencing and a Hi‐C‐based chromatin contact map platform. The final genome assembly of T. obscurus is 381 Mb, with a contig N50 length of 3,296 kb and longest length of 10.7 Mb, from a total of 62 Gb of raw reads generated using single‐molecule real‐time sequencing technology from a PacBio Sequel platform. The PacBio data were further clustered into chromosome‐scale scaffolds using a Hi‐C approach, resulting in a 373 Mb genome assembly with a contig N50 length of 15.2 Mb and longest length of 28 Mb. When we directly compared the 22 longest scaffolds of T. obscurus to the 22 chromosomes of the tiger puffer Takifugu rubripes, a clear one‐to‐one orthologous relationship was observed between the two species, supporting the chromosome‐level assembly of T. obscurus. This genome assembly can serve as a valuable genetic resource for exploring fugu‐specific compact genome characteristics, and will provide essential genomic information for understanding molecular adaptations to salinity fluctuations and the evolution of osmoregulatory mechanisms.

15.
16.
17.
18.
Molecular identification of mutant alleles responsible for certain phenotypic alterations is a central goal of genetic analyses. In this study we describe a rapid procedure suitable for the identification of induced recessive and dominant mutations applied to two Zea mays mutants expressing a dwarf and a pale green phenotype, respectively, which were obtained through pollen ethyl methanesulfonate (EMS) mutagenesis. First, without prior backcrossing, induced mutations (single nucleotide polymorphisms, SNPs) segregating in a (M2) family derived from a heterozygous (M1) parent were identified using whole‐genome shotgun (WGS) sequencing of a small number of (M2) individuals with mutant and wild‐type phenotypes. Second, the state of zygosity of the mutation causing the phenotype was determined for each sequenced individual by phenotypic segregation analysis of the self‐pollinated (M3) offspring. Finally, we filtered for segregating EMS‐induced SNPs whose state of zygosity matched the determined state of zygosity of the mutant locus in each sequenced (M2) individual. Through this procedure, combining sequencing of individuals and Mendelian inheritance, three and four SNPs in linkage passed our zygosity filter for the homozygous dwarf and heterozygous pale green mutation, respectively. The dwarf mutation was found to be allelic to the an1 locus and caused by an insertion in the largest exon of the AN1 gene. The pale green mutation affected the nuclear W2 gene and was caused by a non‐synonymous amino acid exchange in encoded chloroplast DNA polymerase with a predicted deleterious effect. This coincided with lower cpDNA levels in pale green plants.
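The final filtering step described in the maize abstract, keeping only SNPs whose zygosity agrees with the zygosity of the mutant locus in every sequenced M2 individual, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `zygosity_filter`, the genotype labels (`"hom_mut"`, `"het"`, `"hom_wt"`), the dictionary layout and the example data are all hypothetical. A SNP with a missing call in any individual is treated as failing the filter, which is a design choice of this sketch.

```python
def zygosity_filter(snp_genotypes, locus_zygosity):
    """Keep SNPs whose genotype matches the mutant-locus zygosity in every individual.

    snp_genotypes:  {snp_id: {individual: genotype_label}}
    locus_zygosity: {individual: genotype_label}, as inferred from the
                    phenotypic segregation of each individual's M3 offspring.
    A SNP lacking a call for any individual does not pass.
    """
    return [
        snp_id
        for snp_id, calls in snp_genotypes.items()
        if all(calls.get(ind) == state for ind, state in locus_zygosity.items())
    ]


# Hypothetical example: three sequenced M2 plants and three candidate SNPs.
locus = {"plant_A": "hom_mut", "plant_B": "het", "plant_C": "hom_wt"}
candidates = {
    "snp1": {"plant_A": "hom_mut", "plant_B": "het", "plant_C": "hom_wt"},
    "snp2": {"plant_A": "het", "plant_B": "het", "plant_C": "hom_wt"},
    "snp3": {"plant_A": "hom_mut", "plant_B": "het"},  # no call for plant_C
}
linked = zygosity_filter(candidates, locus)
print(linked)
```

Only `snp1` agrees with the locus in all three plants, so it is the only candidate retained. SNPs tightly linked to the causal mutation are expected to co-segregate with it, which is why the abstract reports small groups of linked SNPs passing the filter.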

19.
BACKGROUND: The turkey (Meleagris gallopavo) is an important agricultural species and the second largest contributor to the world's poultry meat production. Genetic improvement is attributed largely to selective breeding programs that rely on highly heritable phenotypic traits, such as body size and breast muscle development. Commercial breeding with small effective population sizes and epistasis can result in loss of genetic diversity, which in turn can lead to reduced individual fitness and reduced response to selection. The presence of genomic diversity in domestic livestock species is therefore of great importance and a prerequisite for rapid and accurate genetic improvement of selected breeds in various environments, as well as to facilitate rapid adaptation to potential changes in breeding goals. Genomic selection requires a large number of genetic markers such as single nucleotide polymorphisms (SNPs), the most abundant source of genetic variation within the genome. RESULTS: Alignment of next generation sequencing data of 32 individual turkeys from different populations was used for the discovery of 5.49 million SNPs, which subsequently were used for the analysis of genetic diversity among the different populations. All of the commercial lines branched from a single node relative to the heritage varieties and the South Mexican turkey population. Heterozygosity of all individuals from the different turkey populations ranged from 0.17 to 2.73 SNPs/Kb, while heterozygosity of populations ranged from 0.73 to 1.64 SNPs/Kb. The average frequency of heterozygous SNPs in individual turkeys was 1.07 SNPs/Kb. Five genomic regions with very low nucleotide variation were identified in domestic turkeys that showed state of fixation towards alleles different than wild alleles. CONCLUSION: The turkey genome is much less diverse, with a relatively low frequency of heterozygous SNPs, as compared to other livestock species like chicken and pig. The whole genome SNP discovery study in turkey resulted in the detection of 5.49 million putative SNPs compared to the reference genome. All commercial lines appear to share a common origin. Presence of different alleles/haplotypes in the SM population highlights that specific haplotypes have been selected in the modern domesticated turkey.
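The heterozygosity figures in the turkey abstract are expressed as heterozygous SNPs per kilobase. A minimal sketch of that unit conversion is given below; the function name `het_snps_per_kb` and the input numbers are hypothetical, and the abstract does not state which bases the authors counted in the denominator, so "callable bases" here is an assumption.

```python
def het_snps_per_kb(n_het_snps, n_callable_bases):
    """Heterozygous SNP density in SNPs per kilobase.

    n_callable_bases is assumed to be the number of genome positions at
    which a genotype could be called for the individual.
    """
    if n_callable_bases <= 0:
        raise ValueError("n_callable_bases must be positive")
    return n_het_snps / (n_callable_bases / 1000)


# Hypothetical individual: 1,070 heterozygous SNPs over 1,000,000 callable bases.
density = het_snps_per_kb(1_070, 1_000_000)
print(f"{density:.2f} SNPs/Kb")
```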

20.
Hierarchical shotgun sequencing remains the method of choice for assembling high‐quality reference sequences of complex plant genomes. The efficient exploitation of current high‐throughput technologies and powerful computational facilities for large‐insert clone sequencing necessitates the sequencing and assembly of a large number of clones in parallel. We developed a multiplexed pipeline for shotgun sequencing and assembling individual bacterial artificial chromosomes (BACs) using the Illumina sequencing platform. We illustrate our approach by sequencing 668 barley BACs (Hordeum vulgare L.) in a single Illumina HiSeq 2000 lane. Using a newly designed parallelized computational pipeline, we obtained sequence assemblies of individual BACs that consist, on average, of eight sequence scaffolds and represent >98% of the genomic inserts. Our BAC assemblies are clearly superior to a whole‐genome shotgun assembly regarding contiguity, completeness and the representation of the gene space. Our methods may be employed to rapidly obtain high‐quality assemblies of a large number of clones to assemble map‐based reference sequences of plant and animal species with complex genomes by sequencing along a minimum tiling path.

