Found 20 similar articles; search took 31 ms
3.
Mari Järve Lev A. Zhivotovsky Siiri Rootsi Hela Help Evgeny I. Rogaev Elza K. Khusnutdinova Toomas Kivisild Juan J. Sanchez 《PloS one》2009,4(9)
Background
Polymorphic Y chromosome short tandem repeats (STRs) have been widely used in population genetic and evolutionary studies. Compared to di-, tri-, and tetranucleotide repeats, STRs with longer repeat units occur more rarely and are far less commonly used.
Principal Findings
In order to study the evolutionary dynamics of STRs according to repeat unit size, we analysed variation at 24 Y chromosome repeat loci: 1 tri-, 14 tetra-, 7 penta-, and 2 hexanucleotide loci. According to our results, penta- and hexanucleotide repeats have approximately two times lower repeat variance and diversity than tri- and tetranucleotide repeats, indicating that their mutation rate is about half of that of tri- and tetranucleotide repeats. Thus, STR markers with longer repeat units are more robust in distinguishing Y chromosome haplogroups and, in some cases, phylogenetic splits within established haplogroups.
Conclusions
Our findings suggest that Y chromosome STRs of increased repeat unit size have a lower rate of evolution, which has significant relevance in population genetic and evolutionary studies.
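The comparison in the Y chromosome STR abstract above rests on two per-locus statistics: the variance of the repeat count and the gene diversity. The following is a minimal sketch of how they could be computed, not the authors' actual analysis. The function names are hypothetical, and the reading of a variance ratio as a mutation-rate ratio assumes a stepwise mutation model in which all loci share one genealogy, as they do on the non-recombining Y chromosome.

```python
from collections import Counter
from statistics import variance


def repeat_variance(repeat_counts):
    """Sample variance of the repeat count at one locus.

    repeat_counts holds one integer (number of repeat units) per sampled
    chromosome. Under the stepwise-mutation assumption stated above, the
    expected variance grows in proportion to the mutation rate.
    """
    return variance(repeat_counts)


def gene_diversity(alleles):
    """Nei's unbiased gene diversity: n / (n - 1) * (1 - sum of p_i squared)."""
    n = len(alleles)
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def variance_ratio(locus_a, locus_b):
    """Ratio of repeat variances, a rough proxy for the mutation-rate ratio."""
    return repeat_variance(locus_a) / repeat_variance(locus_b)
```

Calling `variance_ratio` with the repeat counts of a pentanucleotide locus and a tetranucleotide locus typed in the same sample would give a value near 0.5 if the pattern reported in the abstract held for that pair of loci.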
5.
Background
The adzuki bean weevil, Callosobruchus chinensis L., is one of the most destructive pests of stored legume seeds such as mungbean, cowpea, and adzuki bean, and usually causes considerable loss in the quantity and quality of stored seeds during transportation and storage. However, because genetic information on this pest is lacking, a series of genetic questions remain largely unanswered, including population genetic structure, kinship, and biotype abundance. Co-dominant microsatellite markers offer great resolving power for addressing these questions. Here, we report rapid microsatellite isolation from C. chinensis via high-throughput sequencing.
Principal Findings
In this study, 94,560,852 quality-filtered and trimmed reads were obtained for genome assembly using Illumina paired-end sequencing technology. De novo assembly generated a genome with a total length of 497,124,785 bp, comprising 403,113 high-quality contigs. More than 6800 SSR loci were detected, a suite of 6303 primer pair sequences was designed, and 500 of them were randomly selected for validation. Of these, 196 primer pairs (39.2%) produced reproducible amplicons that were polymorphic among 8 C. chinensis genotypes collected from different geographical regions. Twenty of the 196 polymorphic SSR markers were used to analyze the genetic diversity of 18 C. chinensis populations. The results showed that the twenty SSR loci were highly polymorphic among these populations.
Conclusions
This study presents a first report of genome sequencing and de novo assembly for C. chinensis and demonstrates the feasibility of generating large-scale sequence information and isolating SSR loci by Illumina paired-end sequencing. Our results provide a valuable resource for C. chinensis research. These novel markers are valuable for future studies of genetic mapping, trait association, genetic structure, and kinship in C. chinensis.
6.
Purpose
Short Tandem Repeat (STR) genetic markers hold great potential in forensic investigations, molecular diagnostics and molecular genetics research. The AmpFlSTR® Identifiler™ PCR amplification kit is a multiplex system for co-amplification of 15 STR markers used worldwide in forensic investigations. This study attempts to assess the forensic validity of these STRs in the Pakistani population and to investigate their applicability in quick and simultaneous diagnosis of common chromosomal aneuploidies and tracing of their parental source.
Methodology
Samples from 554 healthy Pakistani individuals from 5 different ethnicities were analyzed for forensic parameters using Identifiler STRs, and samples from 74 patients with different aneuploidies were evaluated for the diagnostic strength of these markers.
Results
All STRs hold sufficient forensic applicability in the Pakistani population, with paternity index between 1.5 and 3.5, polymorphic information content from 0.63 to 0.87 and discrimination power ≥ 0.9 (except the TPOX locus). Deviation from Hardy–Weinberg equilibrium was observed at some loci, reflecting selective breeding and intermarriage trends in Pakistan. Among aneuploid samples, all trisomies were precisely detectable, while aneuploidies involving sex chromosomes or missing chromosomes were not clearly detectable using Identifiler STRs. Parental origin of aneuploidy was traceable in 92.54% of patients.
Conclusion
The studied STR markers are valuable tools for forensic application in Pakistan and can be used for quick and simultaneous identification of some common trisomic conditions. Adding more sex chromosome specific STR markers can immensely increase the diagnostic and forensic potential of this system.
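Two of the forensic parameters quoted in the Identifiler abstract above, polymorphic information content (PIC) and discrimination power, have simple closed forms. The sketch below uses the standard textbook definitions and is not the software used in the study. The function names and input layouts are assumptions made for illustration.

```python
from itertools import combinations


def polymorphic_information_content(allele_freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.

    allele_freqs is a sequence of allele frequencies at one locus
    that sums to 1.
    """
    homozygosity = sum(p * p for p in allele_freqs)
    correction = sum(2.0 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - correction


def power_of_discrimination(genotype_counts):
    """PD = 1 - sum(g_k^2), with g_k the observed genotype frequencies.

    genotype_counts maps a genotype label (for example "8/11") to the
    number of individuals carrying it.
    """
    total = sum(genotype_counts.values())
    return 1.0 - sum((n / total) ** 2 for n in genotype_counts.values())
```

A locus with many alleles at similar frequencies pushes both values toward 1, which is why a locus with few common alleles, such as the TPOX exception noted in the abstract, can fall below the 0.9 discrimination threshold.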
10.
Background
The integration of multiple complementary approaches is a powerful way to understand the processes of diversification and speciation. The parasitoid wasp Aphidius transcaspicus Telenga (Hymenoptera: Braconidae) is a parasitoid of Hyalopterus aphids across a wide geographic range. This species shows a remarkable degree of genetic structure among western, central, and eastern Mediterranean population clusters. In this paper we attempt to better characterize this genetic structure.
Methodology/Principal Findings
We use a Bayesian coalescent analysis of gene flow under the Isolation with Migration model using mitochondrial and microsatellite markers together with climate-based ecological niche models to better understand the genetic structure of A. transcaspicus in the Mediterranean. The coalescent analysis revealed low levels of migration among western and eastern Mediterranean populations (Nm<1) that were not statistically distinguishable from zero. Niche models showed that localities within population clusters each occupy areas of continuously high environmental suitability, but are separated from each other by large regions of completely unsuitable habitat that could limit dispersal. Overall, environmental characteristics were similar among the population clusters, though significant differences did emerge.
Conclusions/Significance
These results support contemporary allopatric isolation of Mediterranean populations of A. transcaspicus, which together with previous analyses indicating partial behaviorally mediated reproductive isolation, suggest that the early stages of cryptic speciation may be in progress.
11.
ISMapper: identifying transposase insertion sites in bacterial genomes from short read sequence data
Jane Hawkey Mohammad Hamidian Ryan R. Wick David J. Edwards Helen Billman-Jacobe Ruth M. Hall Kathryn E. Holt 《BMC genomics》2015,16(1)
Background
Insertion sequences (IS) are small transposable elements, commonly found in bacterial genomes. Identifying the location of IS in bacterial genomes can be useful for a variety of purposes including epidemiological tracking and predicting antibiotic resistance. However IS are commonly present in multiple copies in a single genome, which complicates genome assembly and the identification of IS insertion sites. Here we present ISMapper, a mapping-based tool for identification of the site and orientation of IS insertions in bacterial genomes, directly from paired-end short read data.
Results
ISMapper was validated using three types of short read data: (i) simulated reads from a variety of species, (ii) Illumina reads from 5 isolates for which finished genome sequences were available for comparison, and (iii) Illumina reads from 7 Acinetobacter baumannii isolates for which predicted IS locations were tested using PCR. A total of 20 genomes, including 13 species and 32 distinct IS, were used for validation. ISMapper correctly identified 97% of known IS insertions in the analysis of simulated reads, and 98% in real Illumina reads. Subsampling of real Illumina reads to lower depths indicated ISMapper was able to correctly detect insertions for average genome-wide read depths >20x, although read depths >50x were required to obtain confident calls that were highly-supported by evidence from reads. All ISAba1 insertions identified by ISMapper in the A. baumannii genomes were confirmed by PCR. In each A. baumannii genome, ISMapper successfully identified an IS insertion upstream of the ampC beta-lactamase that could explain phenotypic resistance to third-generation cephalosporins. The utility of ISMapper was further demonstrated by profiling genome-wide IS6110 insertions in 138 publicly available Mycobacterium tuberculosis genomes, revealing lineage-specific insertions and multiple insertion hotspots.
Conclusions
ISMapper provides a rapid and robust method for identifying IS insertion sites directly from short read data, with a high degree of accuracy demonstrated across a wide range of bacteria.
12.
Ting-Wen Chen Ruei-Chi Gan Yi-Feng Chang Wei-Chao Liao Timothy H. Wu Chi-Ching Lee Po-Jung Huang Cheng-Yang Lee Yi-Ywan M. Chen Cheng-Hsun Chiu Petrus Tang 《BMC genomics》2015,16(1)
Background
Whole genome sequence construction is becoming increasingly feasible because of advances in next generation sequencing (NGS), including increasing throughput and read length. By simply overlapping paired-end reads, we can obtain longer reads with higher accuracy, which can facilitate the assembly process. However, the influences of different library sizes and assembly methods on paired-end sequencing-based de novo assembly remain poorly understood.
Results
We used 250 bp Illumina MiSeq paired-end reads of different library sizes generated from genomic DNA from Escherichia coli DH1 and Streptococcus parasanguinis FW213 to compare the assembly results of different library sizes and assembly approaches. Our data indicate that overlapping paired-end reads can increase read accuracy but sometimes causes insertions or deletions. Regarding genome assembly, merged reads only outcompete original paired-end reads when coverage depth is low, and larger libraries tend to yield better assembly results. These results imply that distance information is the most critical factor during assembly. Our results also indicate that when depth is sufficiently high, assembly from subsets can sometimes produce better results.
Conclusions
In summary, this study provides systematic evaluations of de novo assembly from paired-end sequencing data. Among the assembly strategies, we find that overlapping paired-end reads is not always beneficial for bacterial genome assembly and should be avoided or used with caution, especially for genomes containing a high fraction of repetitive sequences. Because increasing numbers of projects aim at bacterial genome sequencing, our study provides valuable suggestions for the field of genomic sequence construction.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1859-8) contains supplementary material, which is available to authorized users.
13.
Patricio Jeraldo Krishna Kalari Xianfeng Chen Jaysheel Bhavsar Ashutosh Mangalam Bryan White Heidi Nelson Jean-Pierre Kocher Nicholas Chia 《PloS one》2014,9(12)
Motivation
16S rDNA hypervariable tag sequencing has become the de facto method for accessing microbial diversity. Illumina paired-end sequencing, which produces two separate reads for each DNA fragment, has become the platform of choice for this application. However, when the two reads do not overlap, existing computational pipelines analyze data from each read separately and underutilize the information contained in the paired-end reads.
Results
We created a workflow known as Illinois Mayo Taxon Organization from RNA Dataset Operations (IM-TORNADO) for processing non-overlapping reads while retaining maximal information content. Using synthetic mock datasets, we show that the use of both reads produced answers with greater correlation to those from full length 16S rDNA when looking at taxonomy, phylogeny, and beta-diversity.
Availability and Implementation
IM-TORNADO is freely available at http://sourceforge.net/projects/imtornado and produces BIOM format output for cross compatibility with other pipelines such as QIIME, mothur, and phyloseq.
14.
Christopher P. E. Lange Mihaela Campan Toshinori Hinoue Roderick F. Schmitz Andrea E. van der Meulen-de Jong Hilde Slingerland Peter J. M. J. Kok Cornelis M. van Dijk Daniel J. Weisenberger Hui Shen Robertus A. E. M. Tollenaar Peter W. Laird 《PloS one》2012,7(11)
Background
There is an increasing demand for accurate biomarkers for early non-invasive colorectal cancer detection. We employed a genome-scale marker discovery method to identify and verify candidate DNA methylation biomarkers for blood-based detection of colorectal cancer.
Methodology/Principal Findings
We used DNA methylation data from 711 colorectal tumors, 53 matched adjacent-normal colonic tissue samples, 286 healthy blood samples and 4,201 tumor samples of 15 different cancer types. DNA methylation data were generated by the Illumina Infinium HumanMethylation27 and the HumanMethylation450 platforms, which determine the methylation status of 27,578 and 482,421 CpG sites respectively. We first performed a multistep marker selection to identify candidate markers with high methylation across all colorectal tumors while harboring low methylation in healthy samples and other cancer types. We then used pre-therapeutic plasma and serum samples from 107 colorectal cancer patients and 98 controls without colorectal cancer, confirmed by colonoscopy, to verify candidate markers. We selected two markers for further evaluation: methylated THBD (THBD-M) and methylated C9orf50 (C9orf50-M). When tested on clinical plasma and serum samples, these markers outperformed carcinoembryonic antigen (CEA) serum measurement and resulted in highly sensitive and specific test performance for early colorectal cancer detection.
Conclusions/Significance
Our systematic marker discovery and verification study for blood-based DNA methylation markers resulted in two novel colorectal cancer biomarkers, THBD-M and C9orf50-M. THBD-M in particular showed promising performance in clinical samples, justifying its further optimization and clinical testing.
16.
Sébastien Rodrigue Arne C. Materna Sonia C. Timberlake Matthew C. Blackburn Rex R. Malmstrom Eric J. Alm Sallie W. Chisholm 《PloS one》2010,5(7)
Background
Different high-throughput nucleic acid sequencing platforms are currently available but a trade-off currently exists between the cost and number of reads that can be generated versus the read length that can be achieved.
Methodology/Principal Findings
We describe an experimental and computational pipeline yielding millions of reads that can exceed 200 bp with quality scores approaching that of traditional Sanger sequencing. The method combines an automatable gel-less library construction step with paired-end sequencing on a short-read instrument. With appropriately sized library inserts, mate-pair sequences can overlap, and we describe the SHERA software package that joins them to form a longer composite read.
Conclusions/Significance
This strategy is broadly applicable to sequencing applications that benefit from low-cost high-throughput sequencing, but require longer read lengths. We demonstrate that our approach enables metagenomic analyses using the Illumina Genome Analyzer, with low error rates, and at a fraction of the cost of pyrosequencing.
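The idea of joining overlapping mate reads into one composite read, described in the abstract above, can be illustrated with a deliberately simple sketch. This is not SHERA's algorithm. It assumes a plain longest-overlap search with a mismatch tolerance, and every name and default value (`merge_pair`, `min_overlap`, `max_mismatch_frac`) is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(read1, qual1, read2, qual2,
               min_overlap=10, max_mismatch_frac=0.1):
    """Merge a forward read and its reverse mate into one composite read.

    qual1 and qual2 are lists of integer quality scores, one per base.
    The mate is reverse-complemented, then the longest suffix of read1 that
    matches a prefix of the mate (within the mismatch tolerance) is taken
    as the overlap. At a disagreeing position the higher-quality base wins.
    Returns None when no acceptable overlap is found.
    """
    mate = reverse_complement(read2)
    mate_qual = qual2[::-1]
    for olen in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        tail = read1[-olen:]
        head = mate[:olen]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches > max_mismatch_frac * olen:
            continue
        offset = len(read1) - olen
        consensus = [
            tail[i] if qual1[offset + i] >= mate_qual[i] else head[i]
            for i in range(olen)
        ]
        return read1[:offset] + "".join(consensus) + mate[olen:]
    return None
```

A gap-free search like this cannot represent insertions or deletions inside the overlap, and repeats can make it accept the wrong overlap, which is consistent with the caution in the bacterial assembly abstract earlier in this list that merging reads sometimes introduces indels.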
19.
Background
Influenza viruses exist as a large group of closely related viral genomes, also called quasispecies. The composition of this influenza viral quasispecies can be determined by an accurate and sensitive sequencing technique and data analysis pipeline. We compared the suitability of two benchtop next-generation sequencers for whole genome influenza A quasispecies analysis: the Illumina MiSeq sequencing-by-synthesis and the Ion Torrent PGM semiconductor sequencing technique.
Results
We first compared the accuracy and sensitivity of both sequencers using plasmid DNA and different ratios of wild type and mutant plasmid. Illumina MiSeq sequencing reads were one and a half times more accurate than those of the Ion Torrent PGM. The majority of sequencing errors were substitutions on the Illumina MiSeq and insertions and deletions, mostly in homopolymer regions, on the Ion Torrent PGM. To evaluate the suitability of the two techniques for determining the genome diversity of influenza A virus, we generated plasmid-derived PR8 virus and grew this virus in vitro. We also optimized an RT-PCR protocol to obtain uniform coverage of all eight genomic RNA segments. The sequencing reads obtained with both sequencers could successfully be assembled de novo into the segmented influenza virus genome. After mapping of the reads to the reference genome, we found that the detection limit for reliable recognition of variants in the viral genome required a frequency of 0.5% or higher. This threshold exceeds the background error rate resulting from the RT-PCR reaction and the sequencing method. Most of the variants in the PR8 virus genome were present in hemagglutinin, and these mutations were detected by both sequencers.
Conclusions
Our approach underlines the power and limitations of two commonly used next-generation sequencers for the analysis of influenza virus gene diversity. We conclude that the Illumina MiSeq platform is better suited for detecting variant sequences whereas the Ion Torrent PGM platform has a shorter turnaround time. The data analysis pipeline that we propose here will also help to standardize variant calling in small RNA genomes based on next-generation sequencing data.
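The 0.5% detection limit reported in the influenza abstract above amounts to a frequency filter applied to per-position base counts after read mapping. The sketch below shows only that filtering step and is not the authors' pipeline. The pileup layout (a dict from zero-based position to base counts) and the function name are assumptions made for illustration.

```python
def call_variants(pileup, reference, min_freq=0.005):
    """Report non-reference bases whose frequency reaches min_freq.

    pileup maps a zero-based position to a dict of base -> read count.
    reference is the reference sequence as a string.
    Returns a list of (position, ref_base, alt_base, frequency) tuples.
    """
    calls = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage, nothing to call at this position
        ref_base = reference[pos]
        for base, n in sorted(counts.items()):
            freq = n / depth
            if base != ref_base and freq >= min_freq:
                calls.append((pos, ref_base, base, freq))
    return calls
```

Setting `min_freq` above the combined RT-PCR and sequencing error rate is the design choice the abstract describes: variants below the threshold cannot be distinguished from background errors.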