Similar articles
 20 similar articles found (search time: 15 ms)
1.
Inversion polymorphisms have been linked to a variety of fundamental biological and evolutionary processes. Yet few studies have used large-scale genomic sequencing to directly compare the haplotypes associated with the standard and inverted chromosome arrangements. Here we describe the targeted genomic sequencing and comparison of haplotypes representing alternative arrangements of a common inversion polymorphism linked to a suite of phenotypes in the white-throated sparrow (Zonotrichia albicollis). More than 7.4 Mb of genomic sequence was generated and assembled from both the standard (ZAL2) and inverted (ZAL2(m)) arrangements. Sequencing of a pair of inversion breakpoints led to the identification of a ZAL2-specific segmental duplication, as well as evidence of breakpoint reuse. Comparison of the haplotype-based sequence assemblies revealed low genetic differentiation outside versus inside the inversion, indicative of historical patterns of gene flow and suppressed recombination between ZAL2 and ZAL2(m). Finally, despite ZAL2(m) being maintained in a near-constant state of heterozygosity, no signatures of genetic degeneration were detected on this chromosome. Overall, these results provide important insights into the genomic attributes of an inversion polymorphism linked to mate choice and variation in social behavior.

2.
Sequencing of 6.7 Mb of the melon genome using a BAC pooling strategy   Cited by: 1 (self-citations: 0, citations by others: 1)

Background  

Cucumis melo (melon) belongs to the Cucurbitaceae family, whose economic importance among horticulture crops is second only to Solanaceae. Melon has a high intra-specific genetic variation, morphological diversity and a small genome size (454 Mb), which make it suitable for a great variety of molecular and genetic studies. A number of genetic and genomic resources have already been developed, such as several genetic maps, BAC genomic libraries, a BAC-based physical map and EST collections. Sequence information would be invaluable to complete the picture of the melon genomic landscape, furthering our understanding of this species' evolution from its relatives and providing an important genetic tool. However, to this day there is little sequence data available: only a few melon genes and genomic regions are deposited in public databases. The development of massively parallel sequencing methods makes it possible to envisage new strategies to obtain long fragments of genomic sequence at higher speed and lower cost than previous Sanger-based methods.

3.
Pigeonpea is an important legume food crop grown primarily by smallholder farmers in many semi-arid tropical regions of the world. We used the Illumina next-generation sequencing platform to generate 237.2 Gb of sequence, which, along with Sanger-based bacterial artificial chromosome end sequences and a genetic map, we assembled into scaffolds representing 72.7% (605.78 Mb) of the 833.07 Mb pigeonpea genome. Genome analysis predicted 48,680 genes for pigeonpea and also showed the potential role that certain gene families, for example, drought tolerance-related genes, have played throughout the domestication of pigeonpea and the evolution of its ancestors. Although we found a few segmental duplication events, we did not observe the recent genome-wide duplication events observed in soybean. This reference genome sequence will facilitate the identification of the genetic basis of agronomically important traits, and accelerate the development of improved pigeonpea varieties that could improve food security in many developing countries.

4.
The Asian citrus psyllid, Diaphorina citri, is the insect vector of the causal agent of huanglongbing (HLB), a devastating bacterial disease of commercial citrus. Presently, few genomic resources exist for D. citri. In this study, we utilized PacBio HiFi and chromatin conformation capture (Hi-C) sequencing to sequence, assemble, and compare three high-quality, chromosome-scale genome assemblies of D. citri collected from California, Taiwan, and Uruguay. Our assemblies had final sizes of 282.67 Mb (California), 282.89 Mb (Taiwan), and 266.67 Mb (Uruguay) assembled into 13 pseudomolecules, a reduction in assembly size of 41–45% compared with previous assemblies, which we validated using flow cytometry. We identified the X chromosome in D. citri and annotated each assembly for repetitive elements, protein-coding genes, transfer RNAs, ribosomal RNAs, piwi-interacting RNA clusters, and endogenous viral elements. Between 19,083 and 20,357 protein-coding genes were predicted. Repetitive DNA accounts for 36.87–38.26% of each assembly. Comparative analyses and mitochondrial haplotype networks suggest that Taiwan and Uruguay D. citri are more closely related, while California D. citri are closely related to Florida D. citri. These high-quality, chromosome-scale assemblies provide new genomic resources to researchers to further D. citri and HLB research.

5.
Recent segmental and gene duplications in the mouse genome   Cited by: 2 (self-citations: 0, citations by others: 2)

Background

The high quality of the mouse genome draft sequence and its associated annotations are an invaluable biological resource. Identifying recent duplications in the mouse genome, especially in regions containing genes, may highlight important events in recent murine evolution. In addition, detecting recent sequence duplications can reveal potentially problematic regions of the genome assembly. We use BLAST-based computational heuristics to identify large (≥ 5 kb) and recent (≥ 90% sequence identity) segmental duplications in the mouse genome sequence. Here we present a database of recently duplicated regions of the mouse genome found in the mouse genome sequencing consortium (MGSC) February 2002 and February 2003 assemblies.

Results

We determined that 33.6 Mb of 2,695 Mb (1.2%) of sequence from the February 2003 mouse genome sequence assembly is involved in recent segmental duplications, which is less than that observed in the human genome (around 3.5-5%). From this dataset, 8.9 Mb (26%) of the duplication content consisted of 'unmapped' chromosome sequence. Moreover, we suspect that an additional 18.5 Mb of sequence is involved in duplication artifacts arising from sequence misassignment errors in this genome assembly. By searching for genes that are located within these regions, we identified 675 genes that mapped to duplicated regions of the mouse genome. Sixteen of these genes appear to have been duplicated independently in the human genome. From our dataset we further characterized a 42 kb recent segmental duplication of Mater, a maternal-effect gene essential for embryogenesis in mice.

Conclusion

Our results provide an initial analysis of the recently duplicated sequence and gene content of the mouse genome. Many of these duplicated loci, as well as regions identified to be involved in potential sequence misassignment errors, will require further mapping and sequencing to achieve accuracy. A Genome Browser database was set up to display the identified duplication content presented in this work. This data will also be relevant to the growing number of investigators who use the draft genome sequence for experimental design and analysis.
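
The two thresholds quoted in this abstract (length ≥ 5 kb, sequence identity ≥ 90%) can be illustrated with a minimal Python sketch. It only encodes those two cut-offs; the abstract does not describe the authors' BLAST-based heuristics in detail, so the `Alignment` record, the function name and the sample values below are hypothetical and are not the authors' method.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract: "large (>= 5 kb) and recent (>= 90% identity)".
MIN_LENGTH_BP = 5_000
MIN_IDENTITY_PCT = 90.0


@dataclass
class Alignment:
    """Hypothetical record for one pairwise alignment between two genomic regions."""
    length_bp: int        # aligned length in base pairs
    identity_pct: float   # percent sequence identity over the alignment


def is_recent_segmental_duplication(aln: Alignment) -> bool:
    """Return True when an alignment meets both thresholds from the abstract."""
    return aln.length_bp >= MIN_LENGTH_BP and aln.identity_pct >= MIN_IDENTITY_PCT


# Made-up example alignments, for illustration only.
candidates = [
    Alignment(length_bp=42_000, identity_pct=95.0),  # long and similar: passes
    Alignment(length_bp=3_000, identity_pct=99.0),   # too short: fails
    Alignment(length_bp=8_000, identity_pct=85.0),   # too divergent: fails
]
kept = [aln for aln in candidates if is_recent_segmental_duplication(aln)]
print(len(kept))
```

A real pipeline would also need to merge overlapping hits and mask common repeats before applying such a filter; those steps are outside what the abstract states.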

6.
7.
8.
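Closely related species of the genus Microtus show striking differences in social behavior. For example, Microtus ochrogaster and M. pinetorum are monogamous, whereas M. montanus and M. pennsylvanicus are solitary and polygamous. Both in the wild and in captivity, adult male and female monogamous voles form a long-term pair bond once they have mated, and both parents care for the offspring. The neuropeptide vasopressin has been shown to take part in the neural regulation of monogamous behavior in voles. This article reviews past and recent findings on how vasopressin regulates pair-bond formation in voles. First, it describes species differences in the brain distribution of the vasopressin V1a receptor (V1aR) and uses them to identify the roles of specific brain regions in pair-bond formation. Second, it discusses pharmacological approaches that use V1aR antagonists to determine which brain regions take part in pair-bond formation, and describes species differences in the structure and function of the V1aR gene and the potential mechanisms by which these differences affect V1aR distribution in the brain and the regulation of behavior. Finally, it discusses the most recent findings, in which manipulating the V1aR gene in the brain of polygamous voles led them to display the behavior of monogamous voles. In summary, understanding the genetic and neural mechanisms of complex social behavior can deepen our understanding of how behavioral differences between and within species evolve.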

9.
Fang Y, Li Z, Liu J, Shu C, Wang X, Zhang X, Yu X, Zhao D, Liu G, Hu S, Zhang J, Al-Mssallem I, Yu J. 《遗传学报》 (Journal of Genetics and Genomics), 2011, 38(12): 567-576
Bacillus thuringiensis (B. thuringiensis) is a soil-dwelling Gram-positive bacterium and its plasmid-encoded toxins (Cry) are commonly used as biological alternatives to pesticides. In a pangenomic study, we sequenced seven B. thuringiensis isolates in both high coverage and base quality using the next-generation sequencing platform. The B. thuringiensis pangenome was extrapolated to have 4196 core genes and an asymptotic value of 558 unique genes when a new genome is added. Compared to the pangenomes of its closely related species of the same genus, the B. thuringiensis pangenome shows an open characteristic, similar to B. cereus but not to B. anthracis; the latter has a closed pangenome. We also found extensive divergence among the seven B. thuringiensis genome assemblies, which harbor ample repeats and single nucleotide polymorphisms (SNPs). The identities among orthologous genes are greater than 84.5% and the hotspots for the genome variations were discovered in genomic regions of 2.3-2.8 Mb and 5.0-5.6 Mb. We concluded that high-coverage sequence assemblies from multiple strains, before all the gaps are closed, are very useful for pangenomic studies.

10.
As part of a larger project to sequence the Populus genome and generate genomic resources for this emerging model tree, we constructed a physical map of the Populus genome, representing one of the few such maps of an undomesticated, highly heterozygous plant species. The physical map, consisting of 2802 contigs, was constructed from fingerprinted bacterial artificial chromosome (BAC) clones. The map represents approximately 9.4-fold coverage of the Populus genome, which has been estimated from the genome sequence assembly to be 485 ± 10 Mb in size. BAC ends were sequenced to assist long-range assembly of whole-genome shotgun sequence scaffolds and to anchor the physical map to the genome sequence. Simple sequence repeat-based markers were derived from the end sequences and used to initiate integration of the BAC and genetic maps. A total of 2411 physical map contigs, representing 97% of all clones assigned to contigs, were aligned to the sequence assembly (JGI Populus trichocarpa, version 1.0). These alignments represent a total coverage of 384 Mb (79%) of the entire poplar sequence assembly and 295 Mb (96%) of linkage group sequence assemblies. A striking result of the physical map contig alignments to the sequence assembly was the co-localization of multiple contigs across numerous regions of the 19 linkage groups. Targeted sequencing of BAC clones and genetic analysis in a small number of representative regions showed that these co-aligning contigs represent distinct haplotypes in the heterozygous individual sequenced, and revealed the nature of these haplotype sequence differences.

11.
Cicer arietinum L. (chickpea) is the third most important food legume crop. We have generated the draft sequence of a desi‐type chickpea genome using next‐generation sequencing platforms, bacterial artificial chromosome end sequences and a genetic map. The 520‐Mb assembly covers 70% of the predicted 740‐Mb genome length, and more than 80% of the gene space. Genome analysis predicts the presence of 27 571 genes and 210 Mb as repeat elements. The gene expression analysis performed using 274 million RNA‐Seq reads identified several tissue‐specific and stress‐responsive genes. Although segmental duplicated blocks are observed, the chickpea genome does not exhibit any indication of recent whole‐genome duplication. Nucleotide diversity analysis provides an assessment of a narrow genetic base within the chickpea cultivars. We have developed a resource for genetic markers by comparing the genome sequences of one wild and three cultivated chickpea genotypes. The draft genome sequence is expected to facilitate genetic enhancement and breeding to develop improved chickpea varieties.

12.
The pooid subfamily of grasses includes some of the most important crop, forage and turf species, such as wheat, barley and Lolium. Developing genomic resources, such as whole-genome physical maps, for analysing the large and complex genomes of these crops and for facilitating biological research in grasses is an important goal in plant biology. We describe a bacterial artificial chromosome (BAC)-based physical map of the wild pooid grass Brachypodium distachyon and integrate this with whole genome shotgun sequence (WGS) assemblies using BAC end sequences (BES). The resulting physical map contains 26 contigs spanning the 272 Mb genome. BES from the physical map were also used to integrate a genetic map. This provides an independent validation and confirmation of the published WGS assembly. Mapped BACs were used in Fluorescence In Situ Hybridisation (FISH) experiments to align the integrated physical map and sequence assemblies to chromosomes with high resolution. The physical, genetic and cytogenetic maps, integrated with whole genome shotgun sequence assemblies, enhance the accuracy and durability of this important genome sequence and will directly facilitate gene isolation.

13.
Despite considerable advances in sequencing of the human genome over the past few years, the organization and evolution of human pericentromeric regions have been difficult to resolve. This is due, in part, to the presence of large, complex blocks of duplicated genomic sequence at the boundary between centromeric satellite and unique euchromatic DNA. Here, we report the identification and characterization of an approximately 49-kb repeat sequence that exists in more than 40 copies within the human genome. This repeat is specific to highly duplicated pericentromeric regions with multiple copies distributed in an interspersed fashion among a subset of human chromosomes. Using this interspersed repeat (termed PIR4) as a marker of pericentromeric DNA, we recovered and sequence-tagged 3 Mb of pericentromeric DNA from a variety of human chromosomes as well as nonhuman primate genomes. A global evolutionary reconstruction of the dispersal of PIR4 sequence and analysis of flanking sequence supports a model in which pericentromeric duplications initiated before the separation of the great ape species (>12 MYA). Further, analyses of this duplication and associated flanking duplications narrow the major burst of pericentromeric duplication activity to a time just before the divergence of the African great ape and human species (5 to 7 MYA). These recent duplication exchange events substantially restructured the pericentromeric regions of hominoid chromosomes and created an architecture where large blocks of sequence are shared among nonhomologous chromosomes. This report provides the first global view of the series of historical events that have reshaped human pericentromeric regions over recent evolutionary time.

14.
Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.

15.
Interpreting the genomic and phenotypic consequences of copy-number variation (CNV) is essential to understanding the etiology of genetic disorders. Whereas deletion CNVs lead obviously to haploinsufficiency, duplications might cause disease through triplosensitivity, gene disruption, or gene fusion at breakpoints. The mutational spectrum of duplications has been studied at certain loci, and in some cases these copy-number gains are complex chromosome rearrangements involving triplications and/or inversions. However, the organization of clinically relevant duplications throughout the genome has yet to be investigated on a large scale. Here we fine-mapped 184 germline duplications (14.7 kb–25.3 Mb; median 532 kb) ascertained from individuals referred for diagnostic cytogenetics testing. We performed next-generation sequencing (NGS) and whole-genome sequencing (WGS) to sequence 130 breakpoints from 112 subjects with 119 CNVs and found that most (83%) were tandem duplications in direct orientation. The remainder were triplications embedded within duplications (8.4%), adjacent duplications (4.2%), insertional translocations (2.5%), or other complex rearrangements (1.7%). Moreover, we predicted six in-frame fusion genes at sequenced duplication breakpoints; four gene fusions were formed by tandem duplications, one by two interconnected duplications, and one by duplication inserted at another locus. These unique fusion genes could be related to clinical phenotypes and warrant further study. Although most duplications are positioned head-to-tail adjacent to the original locus, those that are inverted, triplicated, or inserted can disrupt or fuse genes in a manner that might not be predicted by conventional copy-number assays. Therefore, interpreting the genetic consequences of duplication CNVs requires breakpoint-level analysis.

16.
17.
Salmonids are of particular interest to evolutionary biologists due to their incredible diversity of life‐history strategies and the speed at which many salmonid species have diversified. In Switzerland alone, over 30 species of Alpine whitefish from the subfamily Coregoninae have evolved since the last glacial maximum, with species exhibiting a diverse range of morphological and behavioural phenotypes. This, combined with the whole genome duplication which occurred in the ancestor of all salmonids, makes the Alpine whitefish radiation a particularly interesting system in which to study the genetic basis of adaptation and speciation and the impacts of ploidy changes and subsequent rediploidization on genome evolution. Although well‐curated genome assemblies exist for many species within Salmonidae, genomic resources for the subfamily Coregoninae are lacking. To assemble a whitefish reference genome, we carried out PacBio sequencing from one wild‐caught Coregonus sp. “Balchen” from Lake Thun to ~90× coverage. PacBio reads were assembled independently using three different assemblers, falcon, canu and wtdbg2, and subsequently scaffolded with additional Hi‐C data. All three assemblies were highly contiguous, had strong synteny to a previously published Coregonus linkage map, and when mapping additional short‐read data to each of the assemblies, coverage was fairly even across most chromosome‐scale scaffolds. Here, we present the first de novo genome assembly for the Salmonid subfamily Coregoninae. The final 2.2‐Gb wtdbg2 assembly included 40 scaffolds, an N50 of 51.9 Mb and was 93.3% complete for BUSCOs. The assembly consisted of ~52% transposable elements and contained 44,525 genes.
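
For readers unfamiliar with the N50 statistic quoted above, the following minimal Python sketch shows the standard definition: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name and the toy lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the cumulative length first reaches at least
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (arbitrary units), not data from the whitefish assembly.
toy_lengths = [50, 40, 30, 20, 10]
print(n50(toy_lengths))  # total is 150, half is 75; 50 + 40 = 90 reaches it, so this prints 40
```

Because N50 depends only on the length distribution, it says nothing about correctness of the assembly, which is why it is reported alongside BUSCO completeness here.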

18.
Distal hereditary motor neuropathies predominantly affect the motor neurons of the peripheral nervous system, leading to chronic disability. Using whole genome sequencing (WGS) we have identified a novel structural variation (SV) within the distal hereditary motor neuropathy locus on chromosome 7q34–q36.2 (DHMN1). The SV involves the insertion of a 1.35 Mb DNA fragment into the DHMN1 disease locus. The source of the inserted sequence is 2.3 Mb distal to the disease locus at chromosome 7q36.3. The insertion involves the duplication of five genes (LOC389602, RNF32, LMBR1, NOM1, MNX1) and partial duplication of UBE3C. The genomic structure of genes within the DHMN1 locus is not disrupted by the insertion and no disease-causing point mutations within the locus were identified. This suggests the novel SV is the most likely DNA mutation disrupting the DHMN1 locus. Due to the size and position of the DNA insertion, the gene(s) directly affected by the genomic rearrangement remain elusive. Our finding represents a new genetic cause for hereditary motor neuropathies and highlights the growing importance of interrogating the non-coding genome for SV mutations in families which have been excluded for genome-wide coding mutations.

19.
The completion of the human genome reference (HGR) marked the beginning of the genomics era, yet despite its utility, its universal application is limited by the small number of individuals used in its development. This is highlighted by the presence of high-quality sequence reads failing to map within the HGR. Sequences failing to map generally represent 2–5% of total reads, which may harbor regions that would enhance our understanding of population variation, evolution, and disease. Alternatively, complete de novo assemblies can be created, but these effectively ignore the groundwork of the HGR. In an effort to find a middle ground, we developed a bioinformatic pipeline that maps paired-end reads to the HGR as separate single reads, exports unmappable reads, de novo assembles these reads per individual and then combines assemblies into a secondary reference assembly used for comparative analysis. Using 45 diverse 1000 Genomes Project individuals, we identified 351,361 contigs covering 195.5 Mb of sequence unincorporated in GRCh38. 30,879 contigs are represented in multiple individuals with ~40% showing high sequence complexity. Genomic coordinates were generated for 99.9%, with 52.5% exhibiting high-quality mapping scores. Comparative genomic analyses with archaic humans and primates revealed significant sequence alignments and comparisons with model organism RefSeq gene datasets identified novel human genes. If incorporated, these sequences will expand the HGR, but more importantly our data highlight that with this method low coverage (~10–20×) next-generation sequencing can still be used to identify novel unmapped sequences to explore biological functions contributing to human phenotypic variation, disease and functionality for personal genomic medicine.

20.
Social behavior of small mammals living under natural conditions often is inferred from live-trapping data, particularly from incidents in which two or more individuals are captured together in a trap. We examined whether multiple-capture data from a long-term study of prairie voles (Microtus ochrogaster) and meadow voles (Microtus pennsylvanicus) were consistent with well-known species differences in social behavior (whereas prairie voles are highly social and display monogamy, meadow voles are less social and promiscuous). When possible, we also examined multiple captures of two nontarget species, northern short-tailed shrews (Blarina brevicauda) and western harvest mice (Reithrodontomys megalotis). Percent of total captures that were multiple captures and percent of total adult captures that were male–female captures were highest for prairie voles and lowest for meadow voles; values for harvest mice and shrews were in between those of the vole species, but more similar to values for meadow voles. Repeat captures of the same male–female pair occurred most commonly in prairie voles, and multiple captures of this species typically involved individuals from the same social group. Multiple captures of adults and juveniles were more common in prairie voles than meadow voles, except for captures of at least one adult male and at least one juvenile, which did not differ between the two vole species. Multiple capture data for prairie voles and meadow voles were largely consistent with established species differences in social behavior, suggesting that such data can provide an accurate indication of social and mating systems of small mammals.
