Similar articles
20 similar articles found.
1.
Background

SNPs are the most abundant polymorphism type, and have been explored in many crop genomic studies, including rice and maize. SNP discovery in allotetraploid cotton genomes has lagged behind that of other crops due to their complexity and polyploidy. In this study, genome-wide SNPs are detected systematically using next-generation sequencing and efficient SNP genotyping methods, and used to construct a linkage map and characterize the structural variations in polyploid cotton genomes.

Results

We construct an ultra-dense inter-specific genetic map comprising 4,999,048 SNP loci distributed unevenly in 26 allotetraploid cotton linkage groups and covering 4,042 cM. The map is used to order tetraploid cotton genome scaffolds for accurate assembly of G. hirsutum acc. TM-1. Recombination rates and hotspots are identified across the cotton genome by comparing the assembled draft sequence and the genetic map. Using this map, genome rearrangements and centromeric regions are identified in tetraploid cotton by combining information from the publicly-available G. raimondii genome with fluorescent in situ hybridization analysis.

Conclusions

We report the genotype-by-sequencing method used to identify millions of SNPs between G. hirsutum and G. barbadense. We construct and use an ultra-dense SNP map to correct sequence mis-assemblies, merge scaffolds into pseudomolecules corresponding to chromosomes, detect genome rearrangements, and identify centromeric regions in allotetraploid cottons. We find that the centromeric retro-element sequence of tetraploid cotton derived from the D subgenome progenitor might have invaded the A subgenome centromeres after allotetrapolyploid formation. This study serves as a valuable genomic resource for genetic research and breeding of cotton.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-015-0678-1) contains supplementary material, which is available to authorized users.

2.
A combined approach of whole genome shotgun sequencing and ultra-high density linkage mapping using skim sequencing of a segregating population is effective for assembling allopolyploid genomes. See related Research, http://dx.doi.org/10.1186/s13059-015-0582-8

3.

Premise of the Study

Recurrent formation of polyploid taxa is a common observation in many plant groups. Haploid, cytoplasmic genomes like the plastid genome can be used to overcome the problem of homeologous genes and recombination in polyploid taxa. Fragaria (Rosaceae) contains several octo‐ and decaploid species. We use plastome sequences to infer the plastid ancestry of these taxa with special focus on the decaploid Fragaria cascadensis.

Methods

We used genome skimming of 96 polyploid Fragaria samples on a single Illumina HiSeq 3000 lane to obtain whole plastome sequences. These sequences were used for phylogenetic reconstructions and dating analyses. Ploidy of all samples was inferred with flow cytometry, and plastid inheritance was examined in a controlled cross of F. cascadensis.

Key Results

The plastid genome phylogeny shows that only the octoploid F. chiloensis is monophyletic; all other polyploid taxa were supported as para‐ or polyphyletic. The decaploid Fragaria cascadensis has biparental plastid inheritance and four different plastid donors. Diversification of the F. cascadensis clades occurred in the last 230,000 years. The southern part of its distribution range harbors considerably higher genetic diversity, suggestive of a potential refugium.

Conclusions

Fragaria cascadensis had at least four independent origins from parents with different plastomes. In contrast, para‐ and polyphyletic taxa of the octoploid Fragaria species are best explained by incomplete lineage sorting and/or hybridization. Biogeographic patterns in F. cascadensis are probably a result of range shift during the last glacial maximum.

4.
Two overlapping bacterial artificial chromosome (BAC) clones from the B genome of the tetraploid wheat Triticum turgidum were identified, each of which contains one of the two high-molecular-weight (HMW) glutenin genes, comprising the complex Glu-B1 locus. The complete sequence (285 506 bp of DNA) of this chromosomal region was determined. The two paralogous x-type (Glu-1-1) and y-type (Glu-1-2) HMW-glutenin genes of the complex Glu-B1 locus were found to be separated by ca. 168 000 bp instead of the 51 000 bp separation previously reported for the orthologous Glu-D1 locus of Aegilops tauschii, the D-genome donor of hexaploid wheat. This difference in intergene spacing is due almost entirely to the insertion of clusters of nested retrotransposons. Otherwise, the orientation and order of the HMW glutenins and adjacent genes were identical in the two genomes. A comparison of these orthologous regions indicates modes and patterns of sequence divergence, with implications for the overall Triticeae genome structure and evolution. A duplicate globulin gene, found 5' of each HMW-glutenin gene, helps to tentatively define the original duplication event leading to the paralogous x- and y-type HMW-glutenin genes. The intergenic regions of the two loci are composed of different patterns and classes of retrotransposons, indicating that insertion times of these retroelements were after the divergence of the two wheat genomes. In addition, a putative receptor kinase gene near the y-type HMW-glutenin gene at the Glu-B1 locus is likely active as it matches recently reported ESTs from germinating barley endosperm. The presence of four genes represented only in the Triticeae endosperm ESTs suggests an endosperm-specific chromosome domain.

5.
6.
Nucleotide diversity in gorillas   Total citations: 9, self-citations: 0, citations by others: 9
Yu N, Jensen-Seaman MI, Chemnick L, Ryder O, Li WH. Genetics 2004, 166(3):1375-1383
Comparison of the levels of nucleotide diversity in humans and apes may provide valuable information for inferring the demographic history of these species, the effect of social structure on genetic diversity, patterns of past migration, and signatures of past selection events. Previous DNA sequence data from both the mitochondrial and the nuclear genomes suggested a much higher level of nucleotide diversity in the African apes than in humans. Noting that the nuclear DNA data from the apes were very limited, we previously conducted a DNA polymorphism study in humans and another in chimpanzees and bonobos, using 50 DNA segments randomly chosen from the noncoding, nonrepetitive parts of the human genome. The data revealed that the nucleotide diversity (pi) in bonobos (0.077%) is actually lower than that in humans (0.087%) and that pi in chimpanzees (0.134%) is only 50% higher than that in humans. In the present study we sequenced the same 50 segments in 15 western lowland gorillas and estimated pi to be 0.158%. This is the highest value among the African apes but is only about two times higher than that in humans. Interestingly, available mtDNA sequence data also suggest a twofold higher nucleotide diversity in gorillas than in humans, but suggest a threefold higher nucleotide diversity in chimpanzees than in humans. The higher mtDNA diversity in chimpanzees might be due to the unique pattern in the evolution of chimpanzee mtDNA. From the nuclear DNA pi values, we estimated that the long-term effective population sizes of humans, bonobos, chimpanzees, and gorillas are, respectively, 10,400, 12,300, 21,300, and 25,200.
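
The diversity ratios and population sizes quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, assuming the standard neutral relation pi = 4 * Ne * mu for nuclear autosomal loci; the abstract does not state which model or mutation rates the authors used, so the function names and the model itself are assumptions made here for illustration.

```python
# Nuclear nucleotide diversity (pi, as a proportion) and long-term effective
# population size (Ne), both taken from the abstract above.
REPORTED = {
    "human": (0.00087, 10_400),
    "bonobo": (0.00077, 12_300),
    "chimpanzee": (0.00134, 21_300),
    "gorilla": (0.00158, 25_200),
}


def diversity_ratio(species, baseline="human"):
    """Ratio of pi in `species` to pi in `baseline`."""
    return REPORTED[species][0] / REPORTED[baseline][0]


def implied_mutation_rate(pi, ne):
    """Per-site, per-generation rate implied by pi = 4 * Ne * mu.

    ASSUMPTION: this neutral-model relation is used only for illustration;
    the abstract does not say how Ne was derived from pi.
    """
    return pi / (4.0 * ne)


for name, (pi, ne) in REPORTED.items():
    print(
        f"{name:10s} pi/pi_human = {diversity_ratio(name):.2f}  "
        f"implied mu = {implied_mutation_rate(pi, ne):.2e}"
    )
```

Under this assumption the implied per-generation rate is not identical for every species, which suggests that species-specific inputs not given in the abstract entered the original estimates.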

7.
Hovav R, Chaudhary B, Udall JA, Flagel L, Wendel JF. Genetics 2008, 179(3):1725-1733
A putative advantage of allopolyploidy is the possibility of differential selection of duplicated (homeologous) genes originating from two different progenitor genomes. In this note we explore this hypothesis using a high throughput, SNP-specific microarray technology applied to seed trichomes (cotton) harvested from three developmental time points in wild and modern accessions of two independently domesticated cotton species, Gossypium hirsutum and G. barbadense. We show that homeolog expression ratios are dynamic both developmentally and over the several-thousand-year period encompassed by domestication and crop improvement, and that domestication increased the modulation of homeologous gene expression. In both species, D-genome expression was preferentially enhanced under human selection pressure, but for nonoverlapping sets of genes for the two independent domestication events. Our data suggest that human selection may have operated on different components of the fiber developmental genetic program in G. hirsutum and G. barbadense, leading to convergent rather than parallel genetic alterations and resulting morphology.

8.
Although long-read sequencing can often enable chromosome-level reconstruction of genomes, it is still unclear how one can routinely obtain gapless assemblies. In the model plant Arabidopsis thaliana, other than the reference accession Col-0, all other accessions de novo assembled with long reads until now have used PacBio continuous long reads (CLR). Although these assemblies sometimes achieved chromosome-arm level contigs, they inevitably broke near the centromeres, excluding megabases of DNA from analysis in pan-genome projects. Since PacBio high-fidelity (HiFi) reads circumvent the high error rate of CLR technologies, albeit at the expense of read length, we compared a CLR assembly of accession Eyach15-2 to HiFi assemblies of the same sample. The use of five different assemblers starting from subsampled data allowed us to evaluate the impact of coverage and read length. We found that centromeres and rDNA clusters are responsible for 71% of contig breaks in the CLR scaffolds, while relatively short stretches of GA/TC repeats are at the core of >85% of the unfilled gaps in our best HiFi assemblies. Since the HiFi technology consistently enabled us to reconstruct gapless centromeres and 5S rDNA clusters, we demonstrate the value of the approach by comparing these previously inaccessible regions of the genome between the Eyach15-2 accession and the reference accession Col-0.

9.
Discovering the seeds of diversity in plant genomes   Total citations: 1, self-citations: 0, citations by others: 1
A report on the Keystone Symposium 'Comparative Genomics of Plants', Taos, USA, 4-9 March 2004.

10.
Nucleotide sequence homologies in control regions of prokaryotic genomes   Total citations: 7, self-citations: 0, citations by others: 7
Studnicka GM. Gene 1987, 58(1):45-57
Functional recognition sites for several regulatory factors, including RNA polymerase, cyclic adenosine monophosphate receptor protein and ribosomes, do not always have strong consensus nucleotide sequence homology, yet they are capable of biological activity. Using the computer, other nucleotide sequences can be found that have equal or significantly greater consensus homology, but whose biological function has not been characterized. This analysis shows that no arbitrary 'cutoff score' can successfully distinguish active recognition sites from uncharacterized homologies, due to the great natural diversity in the strength and conservation of functional sites. It also predicts that the strong 'cryptic' homologies presented here are of two types: some might already have a biological function which has so far not been detected, whereas certain single-point mutations might be able to confer activity upon the others by correcting a key structural defect.
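
To make the idea of a consensus 'cutoff score' concrete, the following sketch scans a sequence with a simple identity count against a consensus motif. It is only an illustration under stated assumptions: the abstract does not describe the scoring scheme actually used, the identity count is a stand-in chosen here, and the example sequences are made up. The motif TATAAT is the well-known E. coli -10 promoter consensus.

```python
def match_score(window, consensus):
    """Number of positions at which `window` agrees with `consensus`."""
    return sum(1 for a, b in zip(window, consensus) if a == b)


def scan(sequence, consensus):
    """Score every window of len(consensus) along `sequence`.

    Returns a list of (start_index, score) tuples.
    """
    n = len(consensus)
    return [
        (i, match_score(sequence[i:i + n], consensus))
        for i in range(len(sequence) - n + 1)
    ]


def sites_at_or_above(sequence, consensus, cutoff):
    """Windows whose score reaches the (arbitrary) cutoff."""
    return [(i, s) for i, s in scan(sequence, consensus) if s >= cutoff]


CONSENSUS = "TATAAT"            # E. coli -10 hexamer consensus
example = "GGTATAATGG"          # hypothetical sequence with a perfect match
print(sites_at_or_above(example, CONSENSUS, cutoff=6))
print(match_score("TACGAT", CONSENSUS))  # a hypothetical weaker site
```

Any fixed cutoff applied to such scores keeps some windows and rejects others purely on similarity, which is the limitation the abstract describes: a weak-scoring site may be functional while a strong-scoring one is not.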

11.
Forest tree species provide many examples of well-studied adaptive differentiation, where the search for the underlying genes might be possible. In earlier studies and in our common conditions in a greenhouse, northern populations set bud earlier than southern ones. A difference in latitude of origin of one degree corresponded to a change of 1.4 days in number of days to terminal bud set of seedlings. Earlier physiological and ecological genetics work in conifers and other plants has suggested that such variation could be governed by phytochromes. Nucleotide variation was examined at two phytochrome loci (PHYP and PHYO, homologues of the Arabidopsis thaliana PHYB and PHYA, respectively) in three populations: northern Finland, southern Finland and northern Spain. In our samples of 12-15 sequences (2980 and 1156 base pairs at the two loci) we found very low nonsynonymous variation; pi was 0.0003 and 0.0002 at PHYP and PHYO loci, respectively. There was no functional differentiation between populations at the photosensory domains of either locus. The overall silent variation was also low, only 0.0024 for the PHYP locus. The low estimates of silent variation are consistent with the estimated low synonymous substitution rates between Pinus sylvestris and Picea abies at the PHYO locus. Despite the low level of nucleotide variation, haplotypic diversity was relatively high (0.42 and 0.41 for fragments of 1156 nucleotides) at the two loci.

12.
13.
Sampling nucleotide diversity in cotton   Total citations: 1, self-citations: 0, citations by others: 1

Background  

Cultivated cotton is an annual fiber crop derived mainly from two perennial species, Gossypium hirsutum L. or upland cotton, and G. barbadense L., extra long-staple fiber Pima or Egyptian cotton. These two cultivated species are among five allotetraploid species presumably derived monophyletically between G. arboreum and G. raimondii. Genomic-based approaches have been hindered by the limited variation within species. Yet, population-based methods are being used for genome-wide introgression of novel alleles from G. mustelinum and G. tomentosum into G. hirsutum using combinations of backcrossing, selfing, and inter-mating. Recombinant inbred line populations between the genetic standards TM-1 (G. hirsutum) × 3-79 (G. barbadense) have been developed to allow high-density genetic mapping of traits.

14.
The ray-finned fishes ('fishes') vary widely in genome size, morphology and adaptations. Teleosts, which comprise approximately 23600 species, constitute >99% of living fishes. The radiation of teleosts has been attributed to a genome duplication event, which is proposed to have occurred in an ancient teleost. But more evidence is required to support the genome-duplication hypothesis and to establish a causal relationship between additional genes and teleost diversity. Fish genomes seem to be 'plastic' in comparison with other vertebrate genomes because genetic changes, such as polyploidization, gene duplications, gain of spliceosomal introns and speciation, are more frequent in fishes.

15.
16.
Nucleotide diversity on the ovine Y chromosome   Total citations: 1, self-citations: 0, citations by others: 1
To investigate the impact of male-mediated introgression during the evolution of sheep breeds, a sequencing approach was used to identify single nucleotide polymorphisms (SNPs) from the male-specific region of the ovine Y chromosome (MSY). A total of 4380 bp, which comprised nine fragments from five MSY genes was sequenced within a panel of 14 males from seven breeds. Sequence alignment identified a single segregating site, an A/G SNP located approximately 1685 bp upstream of the ovine SRY gene. The resulting estimation of nucleotide diversity (piY = 0.90 ± 0.50 × 10^-4) falls towards the lower end of estimates from other species. This was compared with the nucleotide diversity estimated from the autosomal component of the genome. Sequence analysis of 2933 bp amplified from eight autosomal genes revealed a nucleotide diversity (piA = 2.15 ± 0.27 × 10^-3) higher than previously reported for sheep. Following adjustment for the contrasting influence of effective population size and a male biased mutation rate, comparison revealed that approximately 10% of the expected nucleotide diversity is present on the ovine Y chromosome.
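
The comparison between Y-linked and autosomal diversity described in this abstract can be sketched numerically. The code below is a rough illustration under stated assumptions: it uses the two point estimates quoted above, a Y-to-autosome effective population size ratio of 0.25 (the value expected under an equal breeding sex ratio), and a hypothetical `mutation_ratio` parameter, because the abstract does not give the mutation-rate adjustment the authors applied.

```python
# Point estimates quoted in the abstract above.
PI_Y = 0.90e-4  # nucleotide diversity, male-specific region of the Y
PI_A = 2.15e-3  # nucleotide diversity, autosomal genes


def expected_y_diversity(pi_autosomal, ne_ratio=0.25, mutation_ratio=1.0):
    """Y-linked diversity expected from autosomal diversity.

    ne_ratio: Y-to-autosome ratio of effective population size
        (0.25 under an equal breeding sex ratio).
    mutation_ratio: HYPOTHETICAL Y-to-autosome mutation-rate ratio; the
        value used in the original study is not stated in the abstract.
    """
    return pi_autosomal * ne_ratio * mutation_ratio


def observed_fraction(pi_y, pi_autosomal, **kwargs):
    """Observed Y diversity as a fraction of its expected value."""
    return pi_y / expected_y_diversity(pi_autosomal, **kwargs)


raw_ratio = PI_Y / PI_A
ne_only = observed_fraction(PI_Y, PI_A)
print(f"raw piY/piA                 = {raw_ratio:.3f}")
print(f"fraction of expected (Ne)   = {ne_only:.3f}")
print(f"with mutation_ratio = 2.0   = "
      f"{observed_fraction(PI_Y, PI_A, mutation_ratio=2.0):.3f}")
```

With only the population-size adjustment the observed fraction is about 17%; any mutation-rate ratio above 1 lowers it further, in the direction of the roughly 10% reported.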

17.
18.
Senecio cambrensis (Welsh groundsel) is a new allohexaploid species, which originated in Wales, UK, in the early part of the 20th century following hybridization between the native tetraploid groundsel (Senecio vulgaris) and the introduced diploid Oxford ragwort (Senecio squalidus). A survey of the number of populations and flowering individuals per population of S. cambrensis in Wales was conducted at peak flowering time in June 2002, 2003 and 2004. The results show a dramatic decrease in both population number and population size of the species since the 1980s when the last population census was conducted. A survey of amplified fragment length polymorphism (AFLP) variation showed that this decline has occurred despite the fact that S. cambrensis contains a high level of genetic diversity with each individual screened possessing a unique multilocus phenotype. The level of variance within the species was similar to that found in one parent (S. vulgaris) and slightly greater than that among samples of the other parent (S. squalidus). Only a small proportion (5%) of AFLP diversity was partitioned among populations indicating a lack of population structure and possibly high levels of gene flow via seed dispersal in what is predominantly a selfing species. Senecio cambrensis showed closer similarity in AFLP phenotype to S. vulgaris than to S. squalidus. Possible causes of this and also the high level of AFLP diversity found in S. cambrensis are discussed. It is suggested that intergenomic recombination following occasional multivalent formation during meiosis in S. cambrensis is likely to be an important cause of both phenomena, although other causes are not ruled out.

19.
We extracted nucleotide sequences from the EMBL database that flank dinucleotide microsatellites in the long sequenced parts of the human, mouse and drosophila genomes. Comparison of the flanking sequences showed that the microsatellites were mostly connected to the bulk of genomic DNA through conserved, highly non-random and mostly (A+T)-rich sequences many dozens of nucleotides in length. In many cases, the connectors were mutated versions of the flanked microsatellites whose sequence pattern gradually vanished with the distance from the microsatellite center. Hence many microsatellites are hundreds rather than dozens of nucleotides long, and their ends are diffuse. In contrast, some microsatellites containing predominantly C and/or G did not influence their neighborhood at all. These results change our notions about the nature of microsatellites. They also indicate that the microsatellites are the dominant part of eukaryotic genomes.

20.
MOTIVATION: The availability of the whole genomic sequences of HIV-1 viruses provides an excellent resource for studying the HIV-1 phylogenies using all the genetic materials. However, such huge volumes of data create computational challenges in both memory consumption and CPU usage. RESULTS: We propose the complete composition vector representation for an HIV-1 strain, and a string scoring method to extract the nucleotide composition strings that contain the richest evolutionary information for phylogenetic analysis. In this way, a large-scale whole genome phylogenetic analysis for thousands of strains can be done both efficiently and effectively. By using 42 carefully curated strains as references, we apply our method to subtype 1156 HIV-1 strains (10.5 million nucleotides in total), which include 825 pure subtype strains and 331 recombinants. Our results show that our nucleotide composition string selection scheme is computationally efficient, and is able to define both pure subtypes and recombinant forms for HIV-1 strains using the 5000 top ranked nucleotide strings. AVAILABILITY: The Java executable and the HIV-1 datasets are accessible through http://www.cs.ualberta.ca/~ghlin/src/WebTools/hiv.php. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
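
The general idea of representing a genome by its nucleotide string composition can be illustrated with a short sketch. The code below is a deliberately simplified stand-in, not the authors' complete composition vector or their string scoring method, neither of which is specified in the abstract: it builds a plain k-mer frequency vector for one value of k and compares two sequences by cosine distance. The function names, the choice of cosine distance, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math


def composition_vector(seq, k):
    """Relative frequencies of all 4**k DNA strings of length k in `seq`.

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


# Toy sequences, hypothetical and far shorter than a real HIV-1 genome.
a = composition_vector("ACGTACGTACGT", 2)
b = composition_vector("ACGTACGAACGT", 2)
c = composition_vector("GGGGCCCCGGGG", 2)
print(f"d(a, b) = {cosine_distance(a, b):.3f}")
print(f"d(a, c) = {cosine_distance(a, c):.3f}")
```

A full vector has 4**k entries per sequence, which is why selecting a limited set of informative strings, as the abstract describes, matters for memory and CPU cost at larger k.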
