Similar literature
1.
Studies of population genetics increasingly use next‐generation DNA sequencing to identify microsatellite loci in nonmodel organisms. There are, however, relatively few studies that validate the feasibility of transitioning from marker development to experimental application across populations and species. North American coralsnakes of the Micrurus fulvius species complex occur in the United States and Mexico, and little is known about their population structure and phylogenetic relationships. This absence of information and population genetics markers is particularly concerning because they are highly venomous and have important implications for human health. To alleviate this problem in coralsnakes, we investigated the feasibility of using 454 shotgun sequences for microsatellite marker development. First, a genomic shotgun library from a single individual was sequenced (approximately 7.74 megabases; 26 831 reads) to identify potentially amplifiable microsatellite loci (PALs). We then hierarchically sampled 76 individuals from throughout the geographic distribution of the species complex and examined whether PALs were amplifiable and polymorphic. Approximately half of the loci tested were readily amplifiable from all individuals, and 80% of the loci tested for variation were variable and thus informative as population genetic markers. To evaluate the repetitive landscape characteristics across multiple snakes, we also compared microsatellite content between the coralsnake and two other previously sampled snakes, the venomous copperhead (Agkistrodon contortrix) and Burmese python (Python molurus).

2.
3.
The application of next-generation sequencing (NGS) technologies for the development of simple sequence repeat (SSR) or microsatellite loci for genetic research in the botanical sciences is described. Microsatellite markers are one of the most informative and versatile DNA-based markers used in plant genetic research, but their development has traditionally been a difficult and costly process. NGS technologies allow the efficient identification of large numbers of microsatellites at a fraction of the cost and effort of traditional approaches. The major advantage of NGS methods is their ability to produce large amounts of sequence data from which to isolate and develop numerous genome-wide and gene-based microsatellite loci. The two major NGS technologies with emergent application in SSR isolation are 454 and Illumina. A review is provided of several recent studies demonstrating the efficient use of 454 and Illumina technologies for the discovery of microsatellites in plants. Additionally, important aspects during NGS isolation and development of microsatellites are discussed, including the use of computational tools and high-throughput genotyping methods. A data set of microsatellite loci in the plastome and mitochondriome of cranberry (Vaccinium macrocarpon Ait.) is provided to illustrate a successful application of 454 sequencing for SSR discovery. In the future, NGS technologies will massively increase the number of SSRs and other genetic markers available to conduct genetic research in understudied but economically important crops such as cranberry.

4.
Chokecherry (Prunus virginiana L.) (2n = 4x = 32) is a unique Prunus species for both genetics and disease-resistance research due to its tetraploid nature and X-disease resistance. However, no genetic and genomic information on chokecherry is available. A partial chokecherry genome was sequenced using Roche 454 sequencing technology. A total of 145,094 reads covering 4.8 Mbp of the chokecherry genome were generated and 15,113 contigs were assembled, of which 11,675 contigs were larger than 100 bp in size. A total of 481 SSR loci were identified from 234 (out of 11,675) contigs and 246 polymerase chain reaction (PCR) primer pairs were designed. Of 246 primers, 212 (86.2%) effectively produced amplification from the genomic DNA of chokecherry. All 212 amplifiable chokecherry primers were used to amplify genomic DNA from 11 other rosaceous species (sour cherry, sweet cherry, black cherry, peach, apricot, plum, apple, crabapple, pear, juneberry, and raspberry). Thus, chokecherry SSR primers can be transferable across Prunus species and other rosaceous species. An average of 63.2% and 58.7% of amplifiable chokecherry primers amplified DNA from cherry and other Prunus species, respectively, while 47.2% of amplifiable chokecherry primers amplified DNA from other rosaceous species. Using random genome sequence data generated from next-generation sequencing technology to identify microsatellite loci appears to be rapid and cost-efficient, particularly for species with no sequence information available. Sequence information and confirmed transferability of the identified chokecherry SSRs among species will be valuable for genetic research in Prunus and other rosaceous species. Key message: A total of 246 SSR primers were identified from chokecherry genome sequences. Of these, 212 were confirmed amplifiable in both chokecherry and 11 other rosaceous species.

5.
The development of microsatellite loci has become more efficient using next‐generation sequencing (NGS) approaches, and many studies imply that the number of applicable loci is large. However, few studies have sought to quantify the number of loci that are retained for use out of the thousands of sequence reads initially obtained. We analyzed the success rate of microsatellite loci development for three amphibian species using a 454 NGS approach on tetra‐nucleotide motif‐enriched species‐specific libraries. The number of sequence reads obtained differed strongly between species and ranged from 19,562 for Triturus cristatus to 55,626 for Lissotriton helveticus, with 52,075 reads obtained for Calotriton asper. PHOBOS was used to identify sequences with tetra‐nucleotide repeat motifs with a minimum repeat number of ten and high quality primer binding sites. Of 107 sequences for T. cristatus, 316 for C. asper and 319 for L. helveticus, we tested the amplification success, polymorphism, and degree of heterozygosity for 41 primer combinations each for C. asper and T. cristatus, and 22 for L. helveticus. We found 11 polymorphic loci for T. cristatus, 20 loci for C. asper, and 15 loci for L. helveticus. Extrapolated, the number of potentially amplifiable loci (PALs) resulted in estimated species‐specific success rates of 0.15% (T. cristatus), 0.30% (C. asper), and 0.39% (L. helveticus). Compared with representative Illumina NGS approaches, our applied 454‐sequencing approach on specifically enriched sublibraries proved to be quite competitive in terms of success rates and number of finally applicable loci.
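
The reported success rates can be reproduced from the counts in the abstract under one plausible reading of "extrapolated": the share of tested primer combinations that proved polymorphic is applied to all candidate sequences, and the result is divided by the number of reads. The formula and all names below (`Species`, `extrapolated_success_rate`) are assumptions made for illustration, not the authors' stated method.

```python
from dataclasses import dataclass


@dataclass
class Species:
    name: str
    reads: int        # sequence reads obtained
    candidates: int   # sequences with suitable repeats and primer binding sites
    tested: int       # primer combinations tested
    polymorphic: int  # polymorphic loci found among those tested


def extrapolated_success_rate(s: Species) -> float:
    """Percent of reads expected to yield a polymorphic locus (assumed formula)."""
    polymorphic_fraction = s.polymorphic / s.tested
    expected_loci = polymorphic_fraction * s.candidates
    return 100.0 * expected_loci / s.reads


# Counts taken from the abstract above.
SPECIES = [
    Species("T. cristatus", reads=19_562, candidates=107, tested=41, polymorphic=11),
    Species("C. asper", reads=52_075, candidates=316, tested=41, polymorphic=20),
    Species("L. helveticus", reads=55_626, candidates=319, tested=22, polymorphic=15),
]

RATES = {s.name: round(extrapolated_success_rate(s), 2) for s in SPECIES}

if __name__ == "__main__":
    for name, rate in RATES.items():
        print(f"{name}: {rate:.2f}%")
```

Under this reading the three values round to the percentages quoted in the abstract, which suggests (but does not prove) that this is how the rates were derived.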

6.
Using high-throughput sequencing, we obtained a large number of microsatellites from Podocnemis lewyana, an endemic turtle from northwestern South America. We used the 454 Genome Sequencer FLX platform to sequence sheared genomic DNA, randomly sampling approximately 17% of the haploid genome. We identified 86,501 reads (8.1% of all reads) that met our definition of microsatellite loci. AC and TC were the most abundant motifs in the P. lewyana genome. TGC and AAAC were the most abundant tri- and tetranucleotide motifs, respectively. Of the microsatellite reads, 72.7% had flanking sequence regions suitable for primer design and PCR amplification. We validated the identified potentially amplifiable loci (PAL) and tested for polymorphism by selecting 15 loci corresponding to tetranucleotides. Twelve loci showed polymorphism in eight individuals. These findings demonstrate that microsatellite detection using next-generation sequencing is an efficient way of obtaining many loci for listed taxa and in turn will have a large impact on future genetic studies aiming to understand and implement conservation plans for this highly threatened freshwater turtle.

7.
The American cranberry (Vaccinium macrocarpon Ait.) is a major commercial fruit crop in North America, but limited genetic resources have been developed for the species. Furthermore, the paucity of codominant DNA markers has hampered the advance of genetic research in cranberry and the Ericaceae family in general. Therefore, we used Roche 454 sequencing technology to perform low-coverage whole genome shotgun sequencing of the cranberry cultivar ‘HyRed’. After de novo assembly, the obtained sequence covered 266.3 Mb of the estimated 540–590 Mb cranberry genome. A total of 107,244 SSR loci were detected with an overall density across the genome of 403 SSR/Mb. The AG repeat was the most frequent motif in cranberry, accounting for 35% of all SSRs, and together with AAG and AAAT accounted for 46% of all loci discovered. To validate the SSR loci, we designed 96 primer pairs using contig sequence data containing perfect SSR repeats, and studied the genetic diversity of 25 cranberry genotypes. We identified 48 polymorphic SSR loci with 2–15 alleles per locus for a total of 323 alleles in the 25 cranberry genotypes. Genetic clustering by principal coordinates and genetic structure analyses confirmed the heterogeneous nature of cranberries. The parentage composition of several hybrid cultivars was evident from the structure analyses. Whole genome shotgun 454 sequencing was a cost-effective and efficient way to identify numerous SSR repeats in the cranberry sequence for marker development.
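
To make "perfect SSR repeats" and "SSR/Mb" concrete, here is a minimal sketch of a regex-based scan for perfect di-, tri-, and tetranucleotide repeats, plus the density calculation. The minimum repeat counts in `MIN_REPEATS` and the function names are assumptions for illustration; the abstract does not state the detection tool or thresholds the authors used.

```python
import re

# Assumed minimum number of tandem copies per motif length (illustrative only).
MIN_REPEATS = {2: 6, 3: 5, 4: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ACAC')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # One captured unit followed by at least (min_rep - 1) exact copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                copies = len(match.group(0)) // unit_len
                hits.append((match.start(), motif, copies))
    return sorted(hits)


def ssr_density_per_mb(n_ssrs, covered_mb):
    """SSR loci per megabase of covered sequence."""
    return n_ssrs / covered_mb


if __name__ == "__main__":
    demo = "GG" + "AC" * 7 + "TTC" + "AAG" * 5 + "C"
    print(find_perfect_ssrs(demo))
    print(round(ssr_density_per_mb(107_244, 266.3)))
```

With the figures from the abstract (107,244 loci over 266.3 Mb), the density function gives about 403 SSR/Mb, matching the reported value. The scan ignores imperfect and compound repeats, which dedicated tools handle.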

8.
9.
We provide a novel method, DRISEE (duplicate read inferred sequencing error estimation), to assess sequencing quality (alternatively referred to as "noise" or "error") within and/or between sequencing samples. DRISEE provides positional error estimates that can be used to inform read trimming within a sample. It also provides global (whole sample) error estimates that can be used to identify samples with high or varying levels of sequencing error that may confound downstream analyses, particularly in the case of studies that utilize data from multiple sequencing samples. For shotgun metagenomic data, we believe that DRISEE provides estimates of sequencing error that are more accurate and less constrained by technical limitations than existing methods that rely on reference genomes or the use of scores (e.g. Phred). Here, DRISEE is applied to (non amplicon) data sets from both the 454 and Illumina platforms. The DRISEE error estimate is obtained by analyzing sets of artifactual duplicate reads (ADRs), a known by-product of both sequencing platforms. We present DRISEE as an open-source, platform-independent method to assess sequencing error in shotgun metagenomic data, and utilize it to discover previously uncharacterized error in de novo sequence data from the 454 and Illumina sequencing platforms.
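
The idea of inferring error from artifactual duplicate reads can be illustrated with a simplified sketch: reads sharing an identical prefix are treated as duplicates of one template, a per-position consensus is taken over the remaining bases, and disagreement with the consensus is counted as error. This is not the DRISEE implementation; the function name, `prefix_len`, and `min_bin_size` are hypothetical placeholders, and the real method's binning and filtering rules are not given in the abstract.

```python
from collections import Counter, defaultdict


def duplicate_read_error_profile(reads, prefix_len=20, min_bin_size=3):
    """Estimate per-position error rates from bins of prefix-identical reads.

    Returns {position: fraction of bases that disagree with the bin consensus},
    for positions after the prefix. Parameters are illustrative assumptions.
    """
    # Bin reads by identical prefix; each bin is assumed to hold duplicates.
    bins = defaultdict(list)
    for read in reads:
        if len(read) > prefix_len:
            bins[read[:prefix_len]].append(read)

    mismatches = Counter()
    observed = Counter()
    for members in bins.values():
        if len(members) < min_bin_size:
            continue  # too few duplicates for a meaningful consensus
        longest = max(len(r) for r in members)
        for pos in range(prefix_len, longest):
            column = [r[pos] for r in members if len(r) > pos]
            if len(column) < min_bin_size:
                continue
            _, consensus_count = Counter(column).most_common(1)[0]
            observed[pos] += len(column)
            mismatches[pos] += len(column) - consensus_count

    return {pos: mismatches[pos] / observed[pos] for pos in sorted(observed)}


if __name__ == "__main__":
    reads = ["ACGTACGTAA", "ACGTACGTAA", "ACGTACGTCA", "TTTTGGGGCC"]
    print(duplicate_read_error_profile(reads, prefix_len=8, min_bin_size=3))
```

A positional profile of this kind is what would inform read trimming, while the sum of mismatches over the sum of observed bases would give a whole-sample estimate.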

10.
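To characterize the distribution and variation of SSR loci in the transcriptome of emblic (Phyllanthus emblica, 余甘子), this study sequenced the leaf transcriptome on the Illumina HiSeq 4000 platform and used MISA software to search for and summarize SSR loci in the resulting unigenes. A total of 9,538 unigenes containing SSR loci were found, and 9,991 SSR loci were detected, with one SSR occurring every 5.49 kb on average. Mononucleotide and dinucleotide repeats were the main SSR types, accounting for 42.3% and 30.79% of all SSRs, respectively. A total of 1,731 SSR loci were located in coding regions, at a frequency of 0.039 SSRs/kb, and the dominant type there was trinucleotide repeats. The transcriptome SSRs comprised 169 repeat motifs, the most frequent being A/T (42.10%), followed by AG/CT (22.91%) and AAG/CTT (5.02%). Motif repeat numbers ranged from 4 to 75, with most falling between 4 and 20. SSRs with a repeat length of ≥ 20 bp accounted for 21.20%, and SSR frequency was significantly negatively correlated with repeat length (P<0.01), with a correlation coefficient of -0.561. The SSR loci obtained in this study occur at high frequency and density, include many short motifs, and show high repeat numbers and many long repeats; most loci have high polymorphism potential and are promising for genetic diversity analysis in emblic. These data provide important information for the large-scale development of transcriptome SSR markers and for population genetics studies in emblic, and in turn a reference for the conservation and rational use of its wild resources.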
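
Motif classes such as A/T, AG/CT, and AAG/CTT pair a motif with its reverse complement, since the same repeat reads differently on the two strands. A minimal sketch of one common way to collapse motifs into such classes follows; treating cyclic rotations (AG and GA) as the same class is an assumption here, and the function name is hypothetical rather than part of MISA.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Return the alphabetically smallest form among all rotations of the
    motif and of its reverse complement (assumed grouping convention)."""
    motif = motif.upper()
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    candidates = [m[i:] + m[:i] for m in (motif, revcomp) for i in range(len(m))]
    return min(candidates)


def motif_class_label(motif):
    """Label in the 'AG/CT' style: canonical motif and its reverse complement."""
    canon = canonical_motif(motif)
    return canon + "/" + canon.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for m in ("T", "TC", "GA", "CTT"):
        print(m, "->", motif_class_label(m))
```

Counting SSRs by the label from `motif_class_label` yields frequency tables in the same form as the percentages reported above.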

11.
Development and optimization of novel species-specific microsatellites, or simple sequence repeats (SSRs), remains an important step for studies in ecology, evolution, and behavior. Numerous approaches exist for identifying new SSRs that vary widely in terms of both time and cost investments. A recent approach of using paired-end Illumina sequence data in conjunction with the bioinformatics pipeline PAL_FINDER has the potential to substantially reduce the cost and labor investment while also improving efficiency. However, it does not appear that the approach has been widely adopted, perhaps due to concerns over its broad applicability across taxa. Therefore, to validate the utility of the approach, we developed SSRs for 32 species representing 30 families, 25 orders, 11 classes, and six phyla and optimized SSRs for 13 of the species. Overall, the Illumina paired-end (IPE) method worked extremely well, and we identified thousands of SSRs for all species (mean = 128,485); 17% of these loci were potentially amplifiable loci, and 25% of those met our most stringent criteria, designed to avoid SSRs associated with repetitive elements. Approximately 61% of screened primers yielded strong amplification of a single locus.

12.

Background

Illumina sequencing with its high number of reads and low per base pair cost is an attractive technology for development of molecular resources for non-model organisms. While many software packages have been developed to identify short tandem repeats (STRs) from next-generation sequencing data, these methods do not inform the investigator as to whether or not candidate loci are polymorphic in their target populations.

Results

We provide a Python program, iMSAT, that uses the polymorphism data obtained from mapping individual Illumina sequence reads onto a reference genome to identify polymorphic STRs. Using this approach, we identified 9,119 candidate polymorphic STRs for use with the parasitoid wasp Trioxys pallidus and 2,378 candidate polymorphic STRs for use with the aphid Chromaphis juglandicola. For both organisms we selected 20 candidate tri-nucleotide STRs for validation. Using fluorescent-labeled oligonucleotide primers, we genotyped 91 female T. pallidus collected in nine localities and 46 female C. juglandicola collected in four localities and found 15 of the examined markers to be polymorphic for T. pallidus and 12 of the examined markers to be polymorphic for C. juglandicola.

Conclusions

We present a novel approach that uses standard Illumina barcoding primers and a single Illumina HiSeq run to target polymorphic STR fragments to develop and test STR markers. We validate this approach using the parasitoid wasp T. pallidus and its aphid host C. juglandicola. This approach, which would also be compatible with 454 Sequencing, allowed us to quickly identify markers with known variability. Accordingly, our method constitutes a significant improvement over existing STR identification software packages.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-858) contains supplementary material, which is available to authorized users.

13.
14.
High-throughput targeted SSR marker development in peach (Prunus persica).
Simple sequence repeats (SSRs) have proven to be highly polymorphic, easily reproducible, codominant markers. However, developing an SSR map is very time consuming and expensive, and most SSRs are not specifically linked to gene loci of immediate interest. The ideal situation would be to combine a high-throughput, relatively inexpensive mapping technique with rapid identification of SSR loci in mapped regions of interest. For this reason, we coupled the high-throughput technique of AFLP mapping with subsequent direct targeting of SSRs identified in AFLP-marked regions of interest. This approach relied on the availability of peach bacterial artificial chromosome (BAC) library resources. We present examples of using this strategy to rapidly identify SSR loci tightly linked to two important, simply inherited traits in peach (Prunus persica (L.) Batsch): root-knot nematode resistance and control of the evergrowing trait. SSRs developed in this study were also tested for their transportability in other Prunus species and in apricots.

15.
16.
17.
Microsatellites (simple sequence repeats, SSRs) are important genetic markers in tree breeding and conservation. Here we utilized high-throughput 454 sequencing technology to mine microsatellites from masson pine (MP) genomic DNA. First, we analyzed the characteristics of SSRs in all nonredundant MP reads (genome survey sequences, GSSs) and compared them with loblolly pine (LP) GSSs and BACs (bacterial artificial chromosome clone sequences), and with GSSs from three other nonconiferous species. Second, a set of MP GSS–SSR primer pairs was designed. Overall GSS–SSR densities were extremely low in MP (28 SSR/Mb) compared with LP (48 SSR/Mb) and the other species. AT, AAT, AAAT, and AAAAAT were the most abundant di-, tri-, tetra-, and hexanucleotide motifs, respectively. Two hundred forty GSS–SSR primer pairs were designed in total, and 20 novel polymorphic markers were identified using three populations (two natural and one clonal seed orchard) as evaluating samples. These markers should be useful for future MP population genetics studies.

18.
Simple sequence repeats (SSRs) are indel mutational hotspots in genomes. In prokaryotes, SSR loci can cause phase variation, a microbial survival strategy that relies on stochastic, reversible on-off switching of gene activity. By analyzing multiple strains of 42 fully sequenced prokaryotic species, we measure the relative variability and density distribution of SSRs in coding regions. We demonstrate that repeat type strongly influences indel mutation rates, and that the most mutable types are most strongly avoided across genomes. We thoroughly characterize SSR density and variability as a function of N→C position along protein sequences. Using codon-shuffling algorithms that preserve amino acid sequence, we assess evolutionary pressures on SSRs. We find that coding sequences suppress repeats in the middle of proteins, and enrich repeats near termini, yielding U-shaped SSR density curves. We show that for many species this characteristic shape can be attributed to purely biophysical constraints of protein structure. In multiple cases, however, particularly in certain pathogenic bacteria, we observe over enrichment of SSRs near protein N-termini significantly beyond expectation based on structural constraints. This increases the probability that frameshifts result in non-functional proteins, revealing that these species may evolutionarily tune SSR positions in coding regions to facilitate phase variation.
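
A codon-shuffling null model that preserves the amino acid sequence can be sketched as follows: codons are permuted only among positions that encode the same amino acid, so the protein and the overall codon usage are unchanged while the nucleotide-level repeat structure is randomized. The abstract does not specify which shuffling scheme the authors used, so this particular scheme and the function names are assumptions.

```python
import random
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence, ignoring any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def shuffle_synonymous(cds, rng):
    """Permute codons among positions that encode the same amino acid."""
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]

    # Group codon positions by the amino acid they encode.
    positions = defaultdict(list)
    for idx, codon in enumerate(codons):
        positions[CODON_TABLE[codon]].append(idx)

    shuffled = list(codons)
    for idxs in positions.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return "".join(shuffled)


if __name__ == "__main__":
    cds = "ATGGGTGGCGGAGGGCTGTTATTGCTTTAA"
    print(shuffle_synonymous(cds, random.Random(0)))
```

Comparing SSR density in observed sequences against many such shuffles is one way to separate selection on repeats from constraints imposed by the protein sequence itself.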

19.
ABSTRACT: BACKGROUND: Next-Generation Sequencing has revolutionized our approach to ancient DNA (aDNA) research, by providing complete genomic sequences of ancient individuals and extinct species. However, the recovery of genetic material from long-dead organisms is still complicated by a number of issues, including post-mortem DNA damage and high levels of environmental contamination. Together with error profiles specific to the type of sequencing platforms used, these specificities could limit our ability to map sequencing reads against modern reference genomes and therefore limit our ability to identify endogenous ancient reads, reducing the efficiency of shotgun sequencing aDNA. RESULTS: In this study, we compare different computational methods for improving the accuracy and sensitivity of aDNA sequence identification, based on shotgun sequencing reads recovered from Pleistocene horse extracts using Illumina GAIIx and Helicos Heliscope platforms. We show that the performance of the Burrows Wheeler Aligner (BWA), that has been developed for mapping of undamaged sequencing reads using platforms with low rates of indel-types of sequencing errors, can be employed at acceptable run-times by modifying default parameters in a platform-specific manner. We also examine if trimming likely damaged positions at read ends can increase the recovery of genuine aDNA fragments and if accurate identification of human contamination can be achieved using a strategy previously suggested based on best hit filtering. We show that combining our different mapping and filtering approaches can increase the number of high-quality endogenous hits recovered by up to 33%. CONCLUSIONS: We have shown that Illumina and Helicos sequences recovered from aDNA extracts could not be aligned to modern reference genomes with the same efficiency unless mapping parameters are optimized for the specific types of errors generated by these platforms and by post-mortem DNA damage. 
Our findings have important implications for future aDNA research, as we define mapping guidelines that improve our ability to identify genuine aDNA sequences, which in turn could improve the genotyping accuracy of ancient specimens. Our framework provides a significant improvement to the standard procedures used for characterizing ancient genomes, which are challenged by contamination and often low amounts of DNA material.

20.
Current efforts to recover the Neandertal and mammoth genomes by 454 DNA sequencing demonstrate the sensitivity of this technology. However, routine 454 sequencing applications still require microgram quantities of initial material. This is due to a lack of effective methods for quantifying 454 sequencing libraries, necessitating expensive and labour-intensive procedures when sequencing ancient DNA and other poor DNA samples. Here we report a 454 sequencing library quantification method based on quantitative PCR that effectively eliminates these limitations. We estimated both the molecule numbers and the fragment size distributions in sequencing libraries derived from Neandertal DNA extracts, SAGE ditags and bonobo genomic DNA, obtaining optimal sequencing yields without performing any titration runs. Using this method, 454 sequencing can routinely be performed from as little as 50 pg of initial material without titration runs, thereby drastically reducing costs while increasing the scope of sample throughput and protocol development on the 454 platform. The method should also apply to Illumina/Solexa and ABI/SOLiD sequencing, and should therefore help to widen the accessibility of all three platforms.
