Similar literature
20 similar articles found.
1.
2.
Population Genetics of Y-Chromosome Short Tandem Repeats in Humans (cited 8 times: 0 self-citations, 8 by others)
Eight human short tandem repeat polymorphisms (STRs), also known as microsatellites, that map to the Y chromosome (DYS19, DYS388, DYS390, DYS391, DYS392, DYS393, DYS389I, and DYS389II) were analyzed in two Iberian samples (Basques and Catalans). Allele frequency distributions showed significant differences only for DYS392. Fst and gene diversity index (D) were estimated for the Y STRs. The values obtained are comparable to those of autosomal STRs if corrections for the smaller effective population size of the Y chromosome are taken into account. This suggests that Y-chromosome microsatellites might be as useful as their autosomal counterparts to both human population genetics and forensics. Our results also reinforce the hypothesis that selective sweeps in the Y chromosome in recent times are unlikely. Haplotypes combining five of the loci were constructed for 71 individuals, showing 29 different haplotypes. A haplotype tree was constructed, from which an estimate of 7,000 to 60,000 years for the age of the Y-chromosome variation in Iberia was derived, in accordance with previous estimates obtained with mtDNA sequences and nuclear markers. Received: 3 January 1997 / Accepted: 25 April 1997
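
The gene diversity index (D) mentioned in this abstract can be illustrated with a short sketch. The abstract does not say which estimator was used, so the code assumes Nei's unbiased form, D = n/(n-1) * (1 - sum of squared allele frequencies). The function name `gene_diversity` and the toy allele list are hypothetical and are not data from the study.

```python
from collections import Counter


def gene_diversity(alleles):
    """Nei's unbiased gene diversity (assumed estimator): n/(n-1) * (1 - sum(p_i^2))."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    counts = Counter(alleles)
    # Sum of squared allele (or haplotype) frequencies in the sample.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical repeat counts at one Y-STR locus for eight males (illustration only).
toy_sample = [14, 14, 15, 13, 14, 15, 16, 14]
print(gene_diversity(toy_sample))
```

The same function works on whole haplotypes if each haplotype is passed as a tuple of allele sizes, since it only counts distinct hashable values.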

3.
The majority of studies employing short tandem repeats (STRs) require investigation of several of these genetic markers. As such, we demonstrate the feasibility of the trinucleotide threading (TnT) approach for scalable analysis of STRs. The TnT method represents a parallel amplification alternative that addresses the obstacles associated with multiplex PCR. In this study, analysis of the STR fragments was performed with capillary gel electrophoresis; however, it should be possible to combine our approach with the massive 454 sequencing platform to considerably increase the number of targeted STRs.

4.
Nine rare (biallelic) mutations and six short tandem repeats (STR) mapping to the nonrecombining portion of the Y chromosome were genotyped in 734 males from different geographical regions inhabited by the contemporary Armenian population. The analysis of molecular variance (AMOVA) showed that 48.9% of total STR genetic variation was explained by the differences between the haplogroups isolated based on biallelic polymorphism, whereas only 1.3% of genetic variation could be attributed to the differences between the geographic groups.

5.
HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior to treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 coreceptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile™ (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels, suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage.

6.
Next generation sequencing technologies, like ultra-deep pyrosequencing (UDPS), allow detailed investigation of complex populations, like RNA viruses, but their utility is limited by errors introduced during sample preparation and sequencing. By tagging each individual cDNA molecule with barcodes, referred to as Primer IDs, before PCR and sequencing, these errors could theoretically be removed. Here we evaluated the Primer ID methodology on 257,846 UDPS reads generated from an HIV-1 SG3Δenv plasmid clone and plasma samples from three HIV-infected patients. The Primer ID consisted of 11 randomized nucleotides (4,194,304 combinations) in the primer for cDNA synthesis that introduced a unique sequence tag into each cDNA molecule. Consensus template sequences were constructed for reads with Primer IDs that were observed three or more times. Despite high numbers of input template molecules, the number of consensus template sequences was low. With 10,000 input molecules for the clone, as few as 97 consensus template sequences were obtained due to highly skewed frequency of resampling. Furthermore, the number of sequenced templates was overestimated due to PCR errors in the Primer IDs. Finally, some consensus template sequences were erroneous due to hotspots for UDPS errors. The Primer ID methodology has the potential to provide highly accurate deep sequencing. However, it is important to be aware that there are remaining challenges with the methodology. In particular, it is important to find ways to obtain a more even frequency of resampling of template molecules as well as to identify and remove artefactual consensus template sequences that have been generated by PCR errors in the Primer IDs.
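
The Primer ID idea described in this abstract can be sketched in a few lines: group reads by their tag, keep tags seen three or more times, and take a per-position majority vote. This is a minimal illustration under stated assumptions, not the authors' pipeline. It assumes the reads sharing a tag are already trimmed to the same length and aligned; the function name `consensus_templates` and its `min_reads` parameter are hypothetical.

```python
from collections import Counter, defaultdict

PRIMER_ID_LENGTH = 11
# Four possible bases at each of 11 random positions.
print(4 ** PRIMER_ID_LENGTH)  # 4194304


def consensus_templates(reads, min_reads=3):
    """Build one consensus per Primer ID from (primer_id, sequence) pairs.

    Primer IDs observed fewer than min_reads times are skipped. Sequences
    sharing a Primer ID are assumed to be aligned and of equal length.
    """
    groups = defaultdict(list)
    for primer_id, sequence in reads:
        groups[primer_id].append(sequence)

    consensus = {}
    for primer_id, sequences in groups.items():
        if len(sequences) < min_reads:
            continue
        # Majority base per column; ties go to the base seen first.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*sequences)
        )
    return consensus
```

A sequencing error that appears in only one of three reads carrying the same tag is voted out, whereas an error inside the tag itself creates a new, spurious group, which is the overestimation problem the abstract reports.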

7.
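To screen highly polymorphic loci within the XNP gene and lay a foundation for linkage analysis and indirect diagnosis, genomic clones containing the XNP gene were obtained by nucleic acid homology analysis, and the non-exonic sequences of the gene were determined by comparing the cDNA with the genomic DNA. Short tandem repeats were screened from these sequences with the BCM Search Launcher program, and their polymorphism was analyzed by PCR amplification and polyacrylamide gel electrophoresis. Five short tandem repeats were identified within the XNP gene, and polymorphism analysis showed that two of them (XNPSTR1 and XNPSTR4) were polymorphic. In 100 unrelated females, 4 and 11 alleles were observed, with heterozygosities of 47% and 70%, respectively. XNPSTR1 lies at the 3′ end of the XNP gene and XNPSTR4 lies in intron 10. In conclusion, two polymorphic sites were identified within the XNP gene that can be used for linkage analysis and indirect genetic diagnosis of the XNP gene.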

8.
E. Arnason, D. M. Rand. Genetics, 1992, 132(1): 211-220
The mitochondrial DNA of the Atlantic cod (Gadus morhua) contains a tandem array of 40-bp repeats in the D-loop region of the molecule. Variation among molecules in the copy number of these repeats results in mtDNA length variation and heteroplasmy (the presence of more than one form of mtDNA in an individual). In a sample of fish collected from different localities around Iceland and off George's Bank, each individual was heteroplasmic for two or more mtDNAs ranging in repeat copy number from two (common) to six (rare). An earlier report on mtDNA heteroplasmy in sturgeon (Acipenser transmontanus) presented a competitive displacement model for length mutations in mtDNAs containing tandem arrays and the cod data deviate from this model. Depending on the nature of putative secondary structures and the location of D-loop strand termination, additional mechanisms of length mutation may be needed to explain the range of mtDNA length variants maintained in these populations. The balance between genetic drift and mutation in maintaining this length polymorphism is estimated through a hierarchical analysis of diversity of mtDNA length variation in the Iceland samples. Eighty percent of the diversity lies within individuals, 8% among individuals and 12% among localities. An estimate of $\theta = 2N_{eo}\mu$ greater than 1 indicates that this system is characterized by a high mutation rate and is governed primarily by deterministic dynamics. The sequences of repeat arrays from fish collected in Norway, Iceland and George's Bank show no nucleotide variation suggesting that there is very little substructuring to the North Atlantic cod population.

9.
Somatic mutations in KRAS, NRAS, and BRAF genes are related to resistance to anti-EGFR antibodies in colorectal cancer. We have established an extended RAS and BRAF mutation assay using a next-generation sequencer to analyze these mutations. Multiplexed deep sequencing was performed to detect somatic mutations within KRAS, NRAS, and BRAF, including minor mutated components. We first validated the technical performance of the multiplexed deep sequencing using 10 normal DNA and 20 formalin-fixed, paraffin-embedded (FFPE) tumor samples. To demonstrate the potential clinical utility of our assay, we profiled 100 FFPE tumor samples and 15 plasma samples obtained from colorectal cancer patients. We used a variant calling approach based on a Poisson distribution. The distribution of the mutation-positive population was hypothesized to follow a Poisson distribution, and a mutation-positive status was defined as a value greater than the significance level of the error rate (α = 2 × 10⁻⁵). The cut-off value was determined to be the average error rate plus 7 standard deviations. Mutation analysis of 100 clinical FFPE tumor specimens was performed without any invalid cases. Mutations were detected at a frequency of 59% (59/100). KRAS mutation concordance between this assay and Scorpion-ARMS was 92% (92/100). DNA obtained from 15 plasma samples was also analyzed. KRAS and BRAF mutations were identified in both the plasma and tissue samples of 6 patients. The genetic screening assay using a next-generation sequencer was validated for the detection of clinically relevant RAS and BRAF mutations using FFPE and liquid samples.
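
The cut-off rule stated in this abstract (average error rate plus 7 standard deviations) is simple enough to sketch. The abstract does not say how the error rates were measured or whether a sample or population standard deviation was used, so the code assumes per-sample background error rates from normal DNA and the sample standard deviation. The names `mutation_cutoff` and `is_mutation_positive` are hypothetical.

```python
from statistics import mean, stdev


def mutation_cutoff(background_error_rates, n_sd=7):
    """Cut-off = mean background error rate + n_sd sample standard deviations."""
    return mean(background_error_rates) + n_sd * stdev(background_error_rates)


def is_mutation_positive(variant_fraction, cutoff):
    """Call a position mutation-positive when its variant fraction exceeds the cut-off."""
    return variant_fraction > cutoff


# Hypothetical background error rates at one position in normal samples.
background = [0.001, 0.002, 0.003]
cutoff = mutation_cutoff(background)
print(cutoff, is_mutation_positive(0.02, cutoff))
```

A wide margin such as 7 standard deviations trades sensitivity for very few false positives, which matters when the same threshold is applied across many positions and samples.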

10.
Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/.

11.
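The emergence of next-generation sequencing (NGS) has advanced genomics research and has played a particularly important role in the study of disease-associated genetic variation. Although most types of genetic variation can be detected with existing NGS analysis tools, limitations remain, for example for length variation of short tandem repeats (STRs). Many genetic diseases are caused by STR length expansions, notably Huntington's disease and several other neurological disorders. However, almost no tool can currently use NGS to detect STR variants that are longer than the sequencing read length. To overcome this limitation, we developed a new method that identifies STR length variation from paired-end NGS data and estimates the expansion length. Applied to a clinical study of motor neuron disease based on whole-exome sequencing, it successfully identified a pathogenic STR length expansion. The method is the first to use read coverage depth features to address STR variant detection. It has broad application value in human genetic disease research and may inform the development of other NGS analysis methods.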

12.

Background

Risk assessment of tick-borne and zoonotic disease emergence necessitates sound knowledge of the particular microorganisms circulating within the communities of these major vectors. Assessment of pathogens carried by wild ticks must be performed without a priori assumptions, to allow for the detection of new or unexpected agents.

Methodology/Principal Findings

We evaluated the potential of Next-Generation Sequencing techniques (NGS) to produce an inventory of parasites carried by questing ticks. Sequences corresponding to parasites from two distinct genera were recovered in Ixodes ricinus ticks collected in Eastern France: Babesia spp. and Theileria spp. Four Babesia species were identified, three of which were zoonotic: B. divergens, Babesia sp. EU1 and B. microti; and one which infects cattle, B. major. This is the first time that these last two species have been identified in France. This approach also identified new sequences corresponding to as-yet unknown organisms similar to tropical Theileria species.

Conclusions/Significance

Our findings demonstrate the capability of NGS to produce an inventory of live tick-borne parasites, which could potentially be transmitted by the ticks, and to uncover unexpected parasites in Western Europe.

13.
High-throughput sequencing of the taxonomically informative 16S rRNA gene provides a powerful approach for exploring microbial diversity. Here we compare the performances of two common “benchtop” sequencing platforms, Illumina MiSeq and Ion Torrent Personal Genome Machine (PGM), for bacterial community profiling by 16S rRNA (V1-V2) amplicon sequencing. We benchmarked performance by using a 20-organism mock bacterial community and a collection of primary human specimens. We observed comparatively higher error rates with the Ion Torrent platform and report a pattern of premature sequence truncation specific to semiconductor sequencing. Read truncation was dependent on both the directionality of sequencing and the target species, resulting in organism-specific biases in community profiles. We found that these sequencing artifacts could be minimized by using bidirectional amplicon sequencing and an optimized flow order on the Ion Torrent platform. Results of bacterial community profiling performed on the mock community and a collection of 18 human-derived microbiological specimens were generally in good agreement for both platforms; however, in some cases, results differed significantly. Disparities could be attributed to the failure to generate full-length reads for particular organisms on the Ion Torrent platform, organism-dependent differences in sequence error rates affecting classification of certain species, or some combination of these factors. This study demonstrates the potential for differential bias in bacterial community profiles resulting from the choice of sequencing platform alone.

14.
15.
Interpreting the genomic and phenotypic consequences of copy-number variation (CNV) is essential to understanding the etiology of genetic disorders. Whereas deletion CNVs lead obviously to haploinsufficiency, duplications might cause disease through triplosensitivity, gene disruption, or gene fusion at breakpoints. The mutational spectrum of duplications has been studied at certain loci, and in some cases these copy-number gains are complex chromosome rearrangements involving triplications and/or inversions. However, the organization of clinically relevant duplications throughout the genome has yet to be investigated on a large scale. Here we fine-mapped 184 germline duplications (14.7 kb–25.3 Mb; median 532 kb) ascertained from individuals referred for diagnostic cytogenetics testing. We performed next-generation sequencing (NGS) and whole-genome sequencing (WGS) to sequence 130 breakpoints from 112 subjects with 119 CNVs and found that most (83%) were tandem duplications in direct orientation. The remainder were triplications embedded within duplications (8.4%), adjacent duplications (4.2%), insertional translocations (2.5%), or other complex rearrangements (1.7%). Moreover, we predicted six in-frame fusion genes at sequenced duplication breakpoints; four gene fusions were formed by tandem duplications, one by two interconnected duplications, and one by duplication inserted at another locus. These unique fusion genes could be related to clinical phenotypes and warrant further study. Although most duplications are positioned head-to-tail adjacent to the original locus, those that are inverted, triplicated, or inserted can disrupt or fuse genes in a manner that might not be predicted by conventional copy-number assays. Therefore, interpreting the genetic consequences of duplication CNVs requires breakpoint-level analysis.

16.
17.
A large, non-coding ATTCT repeat expansion causes the neurodegenerative disorder, spinocerebellar ataxia type 10 (SCA10). In a subset of SCA10 patients, interruption motifs are present at the 5’ end of the expansion and strongly correlate with epileptic seizures. Thus, interruption motifs are a predictor of the epileptic phenotype and are hypothesized to act as a phenotypic modifier in SCA10. Yet, the exact internal sequence structure of SCA10 expansions remains unknown due to limitations in current technologies for sequencing across long extended tracts of tandem nucleotide repeats. We used the third generation sequencing technology, Single Molecule Real Time (SMRT) sequencing, to obtain full-length contiguous expansion sequences, ranging from 2.5 to 4.4 kb in length, from three SCA10 patients with different clinical presentations. We obtained sequence spanning the entire length of the expansion and identified the structure of known and novel interruption motifs within the SCA10 expansion. The exact interruption patterns in expanded SCA10 alleles will allow us to further investigate the potential contributions of these interrupting sequences to the pathogenic modification leading to the epilepsy phenotype in SCA10. Our results also demonstrate that SMRT sequencing is useful for deciphering long tandem repeats that pose as “gaps” in the human genome sequence.

18.
Chemical mutagenesis efficiently generates phenotypic variation in otherwise homogeneous genetic backgrounds, enabling functional analysis of genes. Advances in mutation detection have brought the utility of induced mutant populations on par with those produced by insertional mutagenesis, but systematic cataloguing of mutations would further increase their utility. We examined the suitability of multiplexed global exome capture and sequencing coupled with custom-developed bioinformatics tools to identify mutations in well-characterized mutant populations of rice (Oryza sativa) and wheat (Triticum aestivum). In rice, we identified ∼18,000 induced mutations from 72 independent M2 individuals. Functional evaluation indicated the recovery of potentially deleterious mutations for >2600 genes. We further observed that specific sequence and cytosine methylation patterns surrounding the targeted guanine residues strongly affect their probability to be alkylated by ethyl methanesulfonate. Application of these methods to six independent M2 lines of tetraploid wheat demonstrated that our bioinformatics pipeline is applicable to polyploids. In conclusion, we provide a method for developing large-scale induced mutation resources with relatively small investments that is applicable to resource-poor organisms. Furthermore, our results demonstrate that large libraries of sequenced mutations can be readily generated, providing enhanced opportunities to study gene function and assess the effect of sequence and chromatin context on mutations.

19.

Background

There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats.

Methodology/Principal Findings

Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various read lengths. Notably, unpaired reads of only 150 nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads.

Conclusions

Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length.
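
The link between read length and unresolvable repeats can be illustrated with a crude proxy. This is not the authors' repeat-assembly model, which the abstract does not describe in detail. The sketch assumes that any exact substring of read length that occurs more than once marks a repeat a single unpaired read cannot span; it ignores reverse complements and sequencing errors. The function name `repeated_kmers` and the toy sequence are hypothetical.

```python
from collections import Counter


def repeated_kmers(genome, read_length):
    """Return the set of read-length substrings that occur more than once."""
    k = read_length
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    return {kmer for kmer, count in counts.items() if count > 1}


# Toy sequence (illustration only) containing the 7-base repeat ACGTACG twice.
toy_genome = "ACGTACGTTTGACGTACGA"
for read_length in (4, 6, 8):
    print(read_length, len(repeated_kmers(toy_genome, read_length)))
```

In the toy sequence the repeated substrings disappear once the read length exceeds the 7-base repeat, which mirrors the abstract's point that the required read length depends on the repeat content of the genome under study.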

20.
Most existing statistical methods developed for calling single nucleotide polymorphisms (SNPs) using next-generation sequencing (NGS) data are based on Bayesian frameworks, and no existing SNP caller produces p-values for calling SNPs in a frequentist framework. To fill this gap, we develop a new method, MAFsnp, a Multiple-sample based Accurate and Flexible algorithm for calling SNPs with NGS data. MAFsnp is based on an estimated likelihood ratio test (eLRT) statistic. In practical situations, the involved parameter is very close to the boundary of the parametric space, so the standard large-sample property is not suitable for evaluating the finite-sample distribution of the eLRT statistic. Observing that the distribution of the test statistic is a mixture of zero and a continuous part, we propose to model the test statistic with a novel two-parameter mixture distribution. Once the parameters in the mixture distribution are estimated, p-values can be easily calculated for detecting SNPs, and the multiple-testing corrected p-values can be used to control the false discovery rate (FDR) at any pre-specified level. With simulated data, MAFsnp is shown to have much better control of FDR than the existing SNP callers. Through the application to two real datasets, MAFsnp is also shown to outperform the existing SNP callers in terms of calling accuracy. An R package “MAFsnp” implementing the new SNP caller is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/.
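
The step from per-site p-values to FDR control can be sketched with the Benjamini-Hochberg procedure. The abstract does not name the multiple-testing correction used by MAFsnp, so Benjamini-Hochberg is an assumption here, and the function `benjamini_hochberg` is a hypothetical illustration that is not part of the MAFsnp R package.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values for four candidate sites.
site_p_values = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(site_p_values)
called = [q <= 0.05 for q in adjusted]
print(adjusted, called)
```

Sites whose adjusted value falls at or below the chosen level (0.05 in the usage lines) would be called as SNPs, which controls the expected share of false calls among all calls under the procedure's assumptions.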
