Similar literature
Found 20 similar articles; search took 15 ms
1.
Genotyping by sequencing (GBS) is the latest application of next-generation sequencing protocols for the purposes of discovering and genotyping SNPs in a variety of crop species and populations. Unlike other high-density genotyping technologies, which have mainly been applied to general-interest “reference” genomes, the low cost of GBS makes it an attractive means of saturating mapping and breeding populations with a high density of SNP markers. One barrier to the widespread use of GBS has been the difficulty of the bioinformatics analysis, as the approach is accompanied by a high number of erroneous SNP calls which are not easily diagnosed or corrected. In this study, we use a 384-plex GBS protocol to add 30,984 markers to an indica (IR64) × japonica (Azucena) mapping population consisting of 176 recombinant inbred lines of rice (Oryza sativa), and we release our imputation and error correction pipeline to address initial GBS data sparsity and error and to streamline the process of adding SNPs to RIL populations. Using the final imputed and corrected dataset of 30,984 markers, we were able to map recombination hot and cold spots and regions of segregation distortion across the genome with a high degree of accuracy, thus identifying regions of the genome containing putative sterility loci. We mapped QTL for leaf width and aluminum tolerance, and were able to identify additional QTL for both phenotypes when using the full set of 30,984 SNPs that were not identified using a subset of only 1,464 SNPs, including a previously unreported QTL for aluminum tolerance located directly within a recombination hotspot on chromosome 1. These results suggest that adding a high density of SNP markers to a mapping or breeding population through GBS has great value for numerous applications in rice breeding and genetics research.

2.
Medicago truncatula has all the characteristics required for a concerted analysis of nitrogen-fixing symbiosis with Rhizobium using the tools of molecular biology, cellular biology and genetics. M. truncatula is a diploid and autogamous plant with a relatively small genome, and preliminary molecular analysis suggests that allelic heterozygosity is minimal compared with the cross-fertilising tetraploid alfalfa (Medicago sativa). The M. truncatula cultivar Jemalong is nodulated by the Rhizobium meliloti strain 2011, which has already served to define many of the bacterial genes involved in symbiosis with alfalfa. A genotype of Jemalong has been identified which can be regenerated after transformation by Agrobacterium, thus allowing the analysis of in-vitro-modified genes in a homologous transgenic system. Finally, by virtue of the diploid, self-fertilising and genetically homogeneous character of M. truncatula, it should be relatively straightforward to screen for recessive mutations in symbiotic genes, to carry out genetic analysis, and to construct an RFLP map for this plant.

3.
Advances in next-generation sequencing offer high-throughput and cost-effective genotyping alternatives, including genotyping-by-sequencing (GBS). Results have shown that this methodology is efficient for genotyping a variety of species, including those with complex genomes. To assess the utility of GBS in cultivated hexaploid oat (Avena sativa L.), seven bi-parental mapping populations and diverse inbred lines from breeding programs around the world were studied. We examined technical factors that influence GBS SNP calls, established a workflow that combines two bioinformatics pipelines for GBS SNP calling, and provided a nomenclature for oat GBS loci. The high-throughput GBS system enabled us to place 45,117 loci on an oat consensus map, thus establishing a positional reference for further genomic studies. Using the diversity lines, we estimated that a minimum density of one marker per 2 to 2.8 cM would be required for genome-wide association studies (GWAS), and GBS markers met this density requirement in most chromosome regions. We also demonstrated the utility of GBS in additional diagnostic applications related to oat breeding. We conclude that GBS is a powerful and useful approach, which will have many additional applications in oat breeding and genomic studies.

4.
Recently developed plant genomics approaches (LD mapping and genome-wide selection) require many molecular markers distributed throughout the plant genome. As a result, the availability of an increasing number of markers is essential for maintaining highly efficient and accurate plant breeding programs. In this study, we identified SNP loci in sunflower using a genotyping by sequencing (GBS) approach in an intraspecific F2 mapping population. A total of 271,445,770 reads were generated by the Genome Analyzer II next-generation sequencing platform and 29.2 % of the reads were aligned to unique locations in the genome. A total of 46,278 SNP loci were identified and 7646 SNP loci were validated in an F2 population. In addition, a SNP-based linkage map was constructed. This is the first report of SNP discovery in sunflower by GBS. The SNP markers and SNP-based linkage map will be valuable molecular genetics tools for sunflower breeding.

5.
Verticillium wilt, caused by the soilborne fungus Verticillium alfalfae, is one of the most serious diseases of alfalfa (Medicago sativa L.) worldwide. To identify loci associated with resistance to Verticillium wilt, a bulk segregant analysis was conducted in susceptible or resistant pools constructed from 13 synthetic alfalfa populations, followed by association mapping in two F1 populations consisting of 352 individuals. Simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers were used for genotyping. Phenotyping was done by manual inoculation of the pathogen to replicated cloned plants of each individual, and disease severity was scored using a standard scale. Marker-trait association was analyzed with TASSEL. Seventeen SNP markers significantly associated with Verticillium wilt resistance were identified, located on chromosomes 1, 2, 4, 7 and 8. SNP markers identified on chromosomes 2, 4 and 7 co-locate with regions of Verticillium wilt resistance loci reported in M. truncatula. Additional markers identified on chromosomes 1 and 8 were located in regions where no Verticillium resistance locus has been reported. This study highlights the value of SNP genotyping by high resolution melting to identify disease resistance loci in tetraploid alfalfa. With further validation, the markers identified in this study could be used for improving resistance to Verticillium wilt in alfalfa breeding programs.

6.
Genotyping-by-sequencing (GBS) is a rapid and cost-effective genome-wide genotyping technique applicable whether a reference genome is available or not. Due to the cost-coverage trade-off, however, GBS typically produces large amounts of missing marker genotypes, whose imputation therefore becomes both challenging and critical for later analyses. In this work, the performance of four general imputation methods (K-nearest neighbors, Random Forest, singular value decomposition, and mean value) and two genotype-specific methods (“Beagle” and FILLIN) was measured on GBS data from alfalfa (Medicago sativa L., autotetraploid, heterozygous, without reference genome) and rice (Oryza sativa L., diploid, 100 % homozygous, with reference genome). Alfalfa SNPs were aligned on the genome of the closely related species Medicago truncatula L. Benchmarks consisted of progressive data filtering for marker call rate (up to 70 %) and increasing proportions (up to 20 %) of known genotypes masked for imputation. The relative performance was measured as the total proportion of correctly imputed genotypes, globally and within each genotype class (two homozygotes in rice, two homozygotes and one heterozygote in alfalfa). We found that imputation accuracy was robust to increasing missing rates, and consistently higher in rice than in alfalfa. Accuracy was as high as 90–100 % for the major (most frequent) homozygous genotype, but dropped to 80–90 % (rice) and below 30 % (alfalfa) in the minor homozygous genotype. Beagle was the best performing method, both accuracy- and time-wise, in rice. In alfalfa, KNNI and RFI gave the highest accuracies, but KNNI was much faster.
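The masking benchmark described in this abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' pipeline: genotypes are assumed to be coded as integers (for example 0, 1, 2) with None for missing calls, per-marker mode imputation stands in for the simple "mean value" baseline, and the function names mask_genotypes, impute_mode and accuracy_by_class are hypothetical.

```python
import random
from collections import Counter


def mask_genotypes(matrix, fraction, seed=0):
    """Hide a fraction of the known calls; return the masked copy and the hidden positions."""
    rng = random.Random(seed)
    known = [(i, j) for i, row in enumerate(matrix)
             for j, g in enumerate(row) if g is not None]
    masked = rng.sample(known, int(round(fraction * len(known))))
    copy = [list(row) for row in matrix]
    for i, j in masked:
        copy[i][j] = None
    return copy, masked


def impute_mode(matrix):
    """Fill each missing call with the most frequent call observed at that marker (column)."""
    filled = [list(row) for row in matrix]
    for j in range(len(matrix[0])):
        calls = [row[j] for row in matrix if row[j] is not None]
        mode = Counter(calls).most_common(1)[0][0] if calls else None
        for row in filled:
            if row[j] is None:
                row[j] = mode
    return filled


def accuracy_by_class(truth, imputed, masked):
    """Proportion of hidden calls recovered, overall ("all") and per true genotype class."""
    hits, totals = Counter(), Counter()
    for i, j in masked:
        g = truth[i][j]
        totals[g] += 1
        totals["all"] += 1
        if imputed[i][j] == g:
            hits[g] += 1
            hits["all"] += 1
    return {k: hits[k] / totals[k] for k in totals}
```

Reporting accuracy per genotype class, as the abstract does, matters because a mode-style baseline will tend to score well on the major homozygote while recovering few minor-class calls.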

7.
Simple sequence repeat (SSR) or microsatellite DNA markers have been shown to function well in plant and mammalian species for genetic map construction and genotype identification. The objectives of the work reported here were to search GenBank for the presence of SSR-containing sequences from the genus Medicago, to assess the presence and frequency of SSR DNA in the alfalfa (Medicago sativa (L.) L. & L.) genome, and to examine the function of selected markers in a spectrum of perennial and annual Medicago species. The screening of an alfalfa genomic DNA library and sequencing of clones putatively containing SSRs indicated approximately 19 000 (AT)n + (CT)n + (CA)n + (ATT)n SSRs in the tetraploid genome. Inheritance was consistent with Mendelian expectations at four selected SSR loci with different core motifs. Additionally, genotypes of a range of Medicago species, including 10 perennial subspecies of the M. sativa complex and other perennial and annual Medicago species, were analyzed at each of the loci to ascertain the presence, number, and size of SSR alleles at each locus in each genotype. These studies indicate that SSR markers can function in alfalfa for the construction of genetic maps and will also be useful in a range of Medicago species for purposes of assessing genetic relatedness and taxonomic relationships, and for genotype identification.

8.
Besides their use in mRNA expression profiling, oligonucleotide microarrays have also been applied to single-nucleotide polymorphism (SNP) and loss of heterozygosity (LOH) or allelic imbalance studies. In this report, we evaluate the reliability of using whole genome amplified DNA for analysis with an oligonucleotide microarray containing 11 560 SNPs to detect allelic imbalance and chromosomal copy number abnormalities. Whole genome SNP analyses were performed with DNA extracted from osteosarcoma tissues and patient-matched blood. SNP calls were then generated by Affymetrix® GeneChip® DNA Analysis Software. In two osteosarcoma cases, using unamplified DNA, we identified 793 and 1070 SNP loci with allelic imbalance, respectively. In a parallel experiment with amplified DNA, 78% and 83% of these SNP loci with allelic imbalance were detected. The average false-positive rate was 13.8%. Furthermore, using the Affymetrix® GeneChip® Chromosome Copy Number Tool to analyze the SNP array data, we were able to detect identical chromosomal regions with gain or loss in both amplified and unamplified DNA at cytoband resolution.

9.
Development of an RFLP map in diploid alfalfa   Cited by: 18 (self-citations: 3, citations by others: 15)
Summary: We have developed a restriction fragment length polymorphism (RFLP) linkage map in diploid alfalfa (Medicago sativa L.) to be used as a tool in alfalfa improvement programs. An F2 mapping population of 86 individuals was produced from a cross between a plant of the W2xiso population (M. sativa ssp. sativa) and a plant from USDA PI440501 (M. sativa ssp. coerulea). The current map contains 108 cDNA markers covering 467.5 centimorgans. The short length of the map is probably due to low recombination in this cross. Marker order may be maintained in other populations even though the distance between clones may change. About 50% of the mapped loci showed segregation distortion, mostly toward excess heterozygotes. This is circumstantial evidence supporting the maximum heterozygote theory, which states that relative vigor is dependent on maximizing the number of loci with multiple alleles. The application of the map to tetraploid populations is discussed.
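Segregation distortion at a codominant marker in an F2 population is commonly assessed with a chi-square goodness-of-fit test against the expected 1:2:1 genotype ratio. The sketch below assumes that test (the abstract does not state which test was used), and the genotype counts in the usage note are invented for illustration, not taken from the study.

```python
# Critical value of the chi-square distribution with 2 degrees of freedom at alpha = 0.05.
CRITICAL_5PCT_DF2 = 5.991


def chi_square_f2(n_aa, n_ab, n_bb):
    """Chi-square statistic for observed F2 genotype counts against a 1:2:1 expectation."""
    n = n_aa + n_ab + n_bb
    expected = (0.25 * n, 0.5 * n, 0.25 * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def is_distorted(n_aa, n_ab, n_bb):
    """True when the marker departs from 1:2:1 at the 5 % level."""
    return chi_square_f2(n_aa, n_ab, n_bb) > CRITICAL_5PCT_DF2
```

For a hypothetical marker scored on 86 plants, counts of 21, 44 and 21 fit the expectation closely, whereas counts of 10, 62 and 14 (an excess of heterozygotes) give a statistic of about 17.2 and would be flagged as distorted.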

10.
To deploy a high-throughput genotyping platform in germplasm management, we designed and tested a custom OPA (Oligo Pool All), LSGermOPA, for assessing the genetic diversity and population structure of the USDA cultivated lettuce (Lactuca sativa L.) germplasm collection using Illumina’s GoldenGate assay. This OPA contains 384 EST (expressed sequence tag)-derived SNP (single nucleotide polymorphism) markers selected from a large set of SNP markers experimentally validated and mapped by the Compositae Genome Project. DNA samples prepared from bulked leaves of five randomly selected seedlings from each of 380 lettuce accessions were used for genotyping. High-quality genotype data were obtained from 354 of the 384 SNPs. The reproducibility of automatic genotype calls was 99.8% as calculated from the four pairs of duplicated DNA samples in the assay. An unexpectedly high percentage of heterozygous genotypes at the polymorphic loci for most accessions indicated a high level of heterogeneity within accessions. Only 148 homogeneous accessions, collectively comprising all five horticultural types, were used in subsequent analyses to demonstrate the usefulness of LSGermOPA. The results of phylogenetic relationship, population structure and genetic differentiation analyses were consistent with previous reports using other marker systems. This suggests that LSGermOPA is capable of revealing sufficient levels of polymorphism among lettuce cultivars and is appropriate for rapid assessment of genetic diversity and population structure in the lettuce germplasm collection. Challenges and strategies for effective genotyping and managing lettuce germplasm are discussed.

11.
Summary: A high frequency of paternal plastid transmission occurred in progeny from crosses among normal green alfalfa plants. Plastid transmission was analyzed by hybridization of radiolabeled alfalfa plastid DNA (cpDNA) probes to Southern blots of restriction digests of the progeny DNA. Each probe revealed a specific polymorphism differentiating the parental plastid genomes. Of 212 progeny, 34 were heteroplastidic, with their cpDNAs ranging from predominantly paternal to predominantly maternal. Regrowth of shoots from heteroplasmic plants following removal of top growth revealed the persistence of mixed plastids in a given plant. However, different shoots within a green heteroplasmic plant exhibited paternal, maternal, or mixed cpDNAs. Evidence of maternal nuclear genomic influence on the frequency of paternal plastid transmission was observed in some reciprocal crosses. A few tetraploid F1 progeny were obtained from tetraploid (2n=4x=32) Medicago sativa ssp. sativa x diploid (2n=2x=16) M. sativa ssp. falcata crosses, and resulted from unreduced gametes. Here more than the maternal genome alone apparently functioned in controlling plastid transmission. Considering all crosses, only 5 of 212 progeny cpDNAs lacked evidence of a definitive paternal plastid fragment. Contribution No. 89-524-J from the Kansas Agricultural Experiment Station, Kansas State University, Manhattan

12.
Cultivated alfalfa (Medicago sativa) is an autotetraploid. However, all three existing alfalfa genetic maps resulted from crosses of diploid alfalfa. The current study was undertaken to evaluate the use of Simple Sequence Repeat (SSR) DNA markers for mapping in diploid and tetraploid alfalfa. Ten SSR markers were incorporated into an existing F2 diploid alfalfa RFLP map and also mapped in an F2 tetraploid population. The tetraploid population had two to four alleles in each of the loci examined. The segregation of these alleles in the tetraploid mapping population generally was clear and easy to interpret. Because of the complexity of tetrasomic linkage analysis and a lack of computer software to accommodate it, linkage relationships at the tetraploid level were determined using a single-dose allele (SDA) analysis, where the presence or absence of each allele was scored independently of the other alleles at the same locus. The SDA diploid map was also constructed to compare mapping using SDA to the standard co-dominant method. Linkage groups were generally conserved among the tetraploid and the two diploid linkage maps, except for segments where severe segregation distortion was present. Segregation distortion, which was present in both tetraploid and diploid populations, probably resulted from inbreeding depression. The ease of analysis together with the abundance of SSR loci in the alfalfa genome indicated that SSR markers should be a useful tool for mapping tetraploid alfalfa. Received: 10 September 1999 / Accepted: 11 November 1999

13.
Flexibility and low cost make genotyping‐by‐sequencing (GBS) an ideal tool for population genomic studies of nonmodel species. However, to utilize the potential of the method fully, many parameters affecting library quality and single nucleotide polymorphism (SNP) discovery require optimization, especially for conifer genomes with a high repetitive DNA content. In this study, we explored strategies for effective GBS analysis in pine species. We constructed GBS libraries using HpaII, PstI and EcoRI‐MseI digestions with different multiplexing levels and examined the effect of restriction enzymes on library complexity and the impact of sequencing depth and size selection of restriction fragments on sequence coverage bias. We tested and compared UNEAK, Stacks and GATK pipelines for the GBS data, and then developed a reference‐free SNP calling strategy for haploid pine genomes. Our GBS procedure proved to be effective in SNP discovery, producing 7000–11 000 and 14 751 SNPs within and among three pine species, respectively, from a PstI library. This investigation provides guidance for the design and analysis of GBS experiments, particularly for organisms for which genomic information is lacking.

14.
The diversity in the Plasmodium falciparum genome can be used to explore parasite population dynamics, with practical applications to malaria control. The ability to identify the geographic origin and trace the migratory patterns of parasites with clinically important phenotypes such as drug resistance is particularly relevant. With increasing single-nucleotide polymorphism (SNP) discovery from ongoing Plasmodium genome sequencing projects, there is a demand for genotyping platforms with high SNP and sample throughput for large-scale population genetic studies. Low parasitaemias and multiple clone infections present a number of challenges to genotyping P. falciparum. We addressed some of these issues using a custom 384-SNP Illumina GoldenGate assay on P. falciparum DNA from laboratory clones (long-term cultured adapted parasite clones), short-term cultured parasite isolates and clinical (non-cultured isolates) samples from East and West Africa, Southeast Asia and Oceania. Eighty percent of the SNPs (n = 306) produced reliable genotype calls on samples containing as little as 2 ng of total genomic DNA and on whole genome amplified DNA. Analysis of artificial mixtures of laboratory clones demonstrated high genotype calling specificity and moderate sensitivity to call minor frequency alleles. Clear resolution of geographically distinct populations was demonstrated using Principal Components Analysis (PCA), and global patterns of population genetic diversity were consistent with previous reports. These results validate the utility of the platform in performing population genetic studies of P. falciparum.

15.
Restriction site-associated DNA sequencing or genotyping-by-sequencing (GBS) approaches allow for rapid and cost-effective discovery and genotyping of thousands of single-nucleotide polymorphisms (SNPs) in multiple individuals. However, rigorous quality control practices are needed to avoid high levels of error and bias with these reduced representation methods. We developed a formal statistical framework for filtering spurious loci, using Mendelian inheritance patterns in nuclear families, that accommodates variable-quality genotype calls and missing data (both rampant issues with GBS data) and for identifying sex-linked SNPs. Simulations predict excellent performance of both the Mendelian filter and the sex-linkage assignment under a variety of conditions. We further evaluate our method by applying it to real GBS data and validating a subset of high-quality SNPs. These results demonstrate that our metric of Mendelian inheritance is a powerful quality filter for GBS loci that is complementary to standard coverage and Hardy–Weinberg filters. The described method, implemented in the software MendelChecker, will improve quality control during SNP discovery in nonmodel as well as model organisms.
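The idea behind a Mendelian inheritance filter can be illustrated with a much simpler hard-call check than the statistical framework the abstract describes. The sketch below is not MendelChecker's algorithm: it assumes diploid autosomal loci, genotypes coded as alternate-allele counts (0, 1, 2) with None for missing, and it ignores genotype quality. The function names are hypothetical.

```python
from itertools import product

# Gametes (alternate-allele counts) that each diploid parental genotype can transmit.
GAMETES = {0: (0,), 1: (0, 1), 2: (1,)}


def possible_offspring(mother, father):
    """Set of offspring genotypes compatible with the two parental genotypes."""
    return {a + b for a, b in product(GAMETES[mother], GAMETES[father])}


def mendelian_error_rate(mother, father, offspring):
    """Fraction of called offspring genotypes that the parents could not have produced."""
    allowed = possible_offspring(mother, father)
    called = [g for g in offspring if g is not None]
    if not called:
        return None  # no information at this locus in this family
    return sum(g not in allowed for g in called) / len(called)
```

A locus could then be dropped when its error rate across families exceeds some chosen threshold; a likelihood-based treatment like the one described above would instead weight each call by its quality.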

16.
The rapid development of next-generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach for whole-genome marker profiling in many species. With quickly developing sequencing technologies, adapting current GBS methodologies to new platforms will leverage these advancements for future studies. To test new semiconductor sequencing platforms for GBS, we genotyped a barley recombinant inbred line (RIL) population. Based on a previous GBS approach, we designed bar code and adapter sets for the Ion Torrent platforms. Four sets of 24-plex libraries were constructed consisting of 94 RILs and the two parents and sequenced on two Ion platforms. In parallel, a 96-plex library of the same RILs was sequenced on the Illumina HiSeq 2000. We applied two different computational pipelines to analyze sequencing data: the reference-independent TASSEL pipeline and a reference-based pipeline using SAMtools. Sequence contigs positioned on the integrated physical and genetic map were used for read mapping and variant calling. We found high agreement in genotype calls between the different platforms and high concordance between genetic and reference-based marker order. There was, however, a paucity of SNPs that were jointly discovered by the different pipelines, indicating a strong effect of alignment and filtering parameters on SNP discovery. We show the utility of the current barley genome assembly as a framework for developing very low-cost genetic maps, facilitating high resolution genetic mapping and negating the need for developing de novo genetic maps for future studies in barley.
Through demonstration of GBS on semiconductor sequencing platforms, we conclude that the GBS approach is amenable to a range of platforms and can easily be modified as new sequencing technologies, analysis tools and genomic resources develop.

17.
Whole‐genome duplications have occurred in the recent ancestors of many plants, fish, and amphibians, resulting in a pervasiveness of paralogous loci and the potential for both disomic and tetrasomic inheritance in the same genome. Paralogs can be difficult to reliably genotype and are often excluded from genotyping‐by‐sequencing (GBS) analyses; however, removal requires paralogs to be identified, which is difficult without a reference genome. We present a method for identifying paralogs in natural populations by combining two properties of duplicated loci: (i) the expected frequency of heterozygotes exceeds that for singleton loci, and (ii) within heterozygotes, observed read ratios for each allele in GBS data will deviate from the 1:1 expected for singleton (diploid) loci. These deviations are often not apparent within individuals, particularly when sequence coverage is low, but we postulated that summing allele reads for each locus over all heterozygous individuals in a population would provide sufficient power to detect deviations at those loci. We identified paralogous loci in three species: Chinook salmon (Oncorhynchus tshawytscha), which retains regions with ongoing residual tetrasomy on eight chromosome arms following a recent whole‐genome duplication; mountain barberry (Berberis alpina), which has a large proportion of paralogs that arose through an unknown mechanism; and dusky parrotfish (Scarus niger), which has largely rediploidized following an ancient whole‐genome duplication. Importantly, this approach only requires the genotype and allele‐specific read counts for each individual, information which is readily obtained from most GBS analysis pipelines.
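The two properties combined in this abstract can be computed per locus from genotype calls and allele-specific read counts. The following is a minimal sketch under stated assumptions, not the published implementation: deviation from the 1:1 read ratio is summarised as a z-score under a Binomial(n, 0.5) model of the pooled reads, genotypes are coded 0/1/2 with None for missing, and the function names are hypothetical.

```python
import math


def heterozygote_fraction(genotypes):
    """Proportion of called individuals that are heterozygous (coded 1) at one locus."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    return sum(g == 1 for g in called) / len(called)


def read_ratio_z(het_read_counts):
    """Z-score for departure from a 1:1 allele read ratio, pooled over heterozygotes.

    het_read_counts is a list of (reads_allele_a, reads_allele_b) tuples, one per
    heterozygous individual at the locus.
    """
    a = sum(x for x, _ in het_read_counts)
    b = sum(y for _, y in het_read_counts)
    n = a + b
    if n == 0:
        return None
    # Under a 1:1 ratio, a ~ Binomial(n, 0.5): mean n / 2, standard deviation sqrt(n) / 2.
    return (a - n / 2) / (math.sqrt(n) / 2)
```

Pooling reads over all heterozygotes is what gives the test power at low coverage: two individuals with 30:10 and 45:15 reads pool to 75:25, a z-score of 5, even though neither individual alone would be very convincing. Loci with both a high heterozygote fraction and a large absolute z-score would be the paralog candidates.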

18.
Information on the genetic diversity and population structure of a tetraploid alfalfa collection can be valuable for the effective use of genetic resources. A set of 336 worldwide genotypes of tetraploid alfalfa (Medicago sativa subsp. sativa L.) was genotyped using 85 genome-wide distributed SSR markers to reveal the genetic diversity and population structure of alfalfa. Genetic diversity analysis identified a total of 1056 alleles across the 85 marker loci. The average expected heterozygosity and polymorphism information content values were 0.677 and 0.638, respectively, showing high levels of genetic diversity in the cultivated tetraploid alfalfa germplasm. Comparison of genetic characteristics across chromosomes indicated that regions of chromosomes 2 and 3 had the highest genetic diversity. Higher genetic diversity was detected in alfalfa landraces than in wild materials and cultivars. Two populations were identified by the model-based population structure, principal coordinate and neighbor-joining analyses, corresponding to China and other parts of the world. However, the lack of strict correlation between clustering and geographic origins suggested extensive exchange of alfalfa germplasm across diverse geographic regions. The quantitative analysis of genetic diversity and population structure in this study could be useful for genetic and genomic analysis and for utilization of the genetic variation in alfalfa breeding.
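Expected heterozygosity and polymorphism information content (PIC) are both functions of the allele frequencies at a locus. The sketch below assumes the commonly used definitions, He = 1 - Σ p_i² and PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j²; the abstract does not say which estimators the authors applied, and the allele frequencies in the usage note are invented.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum of squared allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = He - sum over allele pairs i < j of 2 * p_i^2 * p_j^2."""
    pair_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - pair_term
```

For a hypothetical locus with four equally frequent alleles, He is 0.75 and PIC is about 0.703. PIC is always slightly below He, which matches the ordering of the reported averages (0.677 and 0.638).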

19.
Affymetrix SNP arrays have been widely used for single-nucleotide polymorphism (SNP) genotype calling and DNA copy number variation inference. Although numerous methods have achieved high accuracy in these fields, most studies have paid little attention to the modeling of hybridization of probes to off-target allele sequences, which can affect the accuracy greatly. In this study, we address this issue and demonstrate that hybridization with mismatch nucleotides (HWMMN) occurs in all SNP probe-sets and has a critical effect on the estimation of allelic concentrations (ACs). We study sequence binding through binding free energy and then binding affinity, and develop a probe intensity composite representation (PICR) model. The PICR model allows the estimation of ACs at a given SNP through statistical regression. Furthermore, we demonstrate with cell-line data of known true copy numbers that the PICR model can achieve reasonable accuracy in copy number estimation at a single SNP locus, by using the ratio of the estimated AC of each sample to that of the reference sample, and can reveal subtle genotype structure of SNPs at abnormal loci. We also demonstrate with HapMap data that the PICR model yields accurate SNP genotype calls consistently across samples, laboratories and even across array platforms.

20.

Key message

New software to make tetraploid genotype calls from SNP array data was developed, which uses hierarchical clustering and multiple F1 populations to calibrate the relationship between signal intensity and allele dosage.

Abstract

SNP arrays are transforming breeding and genetics research for autotetraploids. To fully utilize these arrays, the relationship between signal intensity and allele dosage must be calibrated for each marker. We developed an improved computational method to automate this process, which is provided as the R package ClusterCall. In the training phase of the algorithm, hierarchical clustering within an F1 population is used to group samples with similar intensity values, and allele dosages are assigned to clusters based on expected segregation ratios. In the prediction phase, multiple F1 populations and the prediction set are clustered together, and the genotype for each cluster is the mode of the training set samples. A concordance metric, defined as the proportion of training set samples equal to the mode, can be used to eliminate unreliable markers and compare different algorithms. Across three potato families genotyped with an 8K SNP array, ClusterCall scored 5729 markers with at least 0.95 concordance (94.6% of its total), compared to 5325 with the software fitTetra (82.5% of its total). The three families were used to predict genotypes for 5218 SNPs in the SolCAP diversity panel, compared with 3521 SNPs in a previous study in which genotypes were called manually. One of the additional markers produced a significant association for vine maturity near a well-known causal locus on chromosome 5. In conclusion, when multiple F1 populations are available, ClusterCall is an efficient method for accurate, autotetraploid genotype calling that enables the use of SNP data for research and plant breeding.
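The concordance metric defined in this abstract can be sketched for a single marker as follows. This is a Python illustration of the metric only, not the ClusterCall R package or its API: it assumes each sample already has a cluster label from the joint clustering, that training-set samples carry a dosage (0 to 4) from the training phase while prediction-set samples carry None, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_modes(cluster_ids, training_dosages):
    """Assign each cluster the most frequent training-set dosage among its members."""
    by_cluster = defaultdict(list)
    for cluster, dosage in zip(cluster_ids, training_dosages):
        if dosage is not None:  # prediction-set samples have no training dosage
            by_cluster[cluster].append(dosage)
    return {c: Counter(d).most_common(1)[0][0] for c, d in by_cluster.items()}


def concordance(cluster_ids, training_dosages):
    """Proportion of training-set samples whose dosage equals their cluster's mode."""
    modes = cluster_modes(cluster_ids, training_dosages)
    pairs = [(c, d) for c, d in zip(cluster_ids, training_dosages) if d is not None]
    if not pairs:
        return None
    return sum(d == modes[c] for c, d in pairs) / len(pairs)
```

A marker would then be kept only if its concordance reaches a chosen threshold, such as the 0.95 used in the comparison above.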
