Similar literature
 20 similar documents found; search took 15 ms
1.
High-throughput sequencing technologies are widely used to analyse genomic variants or rare mutational events in different fields of genomic research. New and adapted platforms and technologies are developing rapidly, enabling amplicon-based analysis of single target genes or even whole-genome sequencing within a short period of time. Each sequencing platform is characterized by well-defined types of errors, resulting from different steps in the sequencing workflow. Here we describe a universal method to prepare amplicon libraries that can be used for sequencing on different high-throughput sequencing platforms. We have sequenced distinct exons of the CREB binding protein (CREBBP) gene and analysed the output resulting from three major deep-sequencing platforms. Platform-specific errors were adjusted according to the result of sequence analysis from the remaining platforms. Additionally, bioinformatic methods are described to determine platform-dependent errors. Summarizing the results, we present a platform-independent, cost-efficient and timesaving method that can be used as an alternative to commercially available sample-preparation kits.

2.
We address the bioinformatic issue of accurately separating amplified genes of the major histocompatibility complex (MHC) from artefacts generated during high-throughput sequencing workflows. We fit observed ultra-deep sequencing depths (hundreds to thousands of sequences per amplicon) of allelic variants to expectations from genetic models of copy number variation (CNV). We provide a simple, accurate and repeatable method for genotyping multigene families, evaluating our method via analyses of 209 bp of MHC class IIb exon 2 in guppies (Poecilia reticulata). Genotype repeatability for resequenced individuals (N = 49) was high (100%) within the same sequencing run. However, repeatability dropped to 83.7% between independent runs, either because of lower mean amplicon sequencing depth in the initial run or random PCR effects. This highlights the importance of fully independent replicates. Significant improvements in genotyping accuracy were made by greatly reducing type I genotyping error (i.e. accepting an artefact as a true allele), which may occur when using the low-depth allele validation thresholds of previous methods. Only a small amount (4.9%) of type II error (i.e. rejecting a genuine allele as an artefact) was detected through fully independent sequencing runs. We observed 1–6 alleles per individual, and evidence of sharing of alleles across loci. Variation in the total number of MHC class II loci among individuals, both among and within populations, was also observed, and some genotypes appeared to be partially hemizygous; total allelic dosage added up to an odd number of allelic copies. Collectively, these observations provide evidence of MHC CNV and its complex basis in natural populations.

3.
4.
PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence single DNA molecules in real time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is an effective tool not only in the basic biological sciences but also in the medical/clinical setting.

5.
Two length variants of 5S rDNA repeated units were detected in the genome of the East European butterfly Melitaea trivia. Both repeat variants contain a 5S rRNA coding region of the same length, 120 bp, but possess intergenic spacer regions (IGS) of different sizes, 78 and 125 bp, respectively. The level of sequence similarity between the two 5S rDNA variants amounts to 43.9-45.5% in the IGS, whereas the coding region appears to be more conserved. In the IGS, microsatellite sequence motifs were found; amplification of these motifs could be involved in the evolution of the 5S rDNA.

6.
7.
Mitochondrial disorders are by far the most genetically heterogeneous group of diseases, involving two genomes, the 16.6 kb mitochondrial genome and ~1500 genes encoded in the nuclear genome. For maternally inherited mitochondrial DNA disorders, a complete molecular diagnosis requires several different methods for the detection and quantification of mtDNA point mutations and large deletions. For mitochondrial disorders caused by autosomal recessive, dominant, and X-linked nuclear genes, the diagnosis has relied on clinical, biochemical, and molecular studies to point to a group of candidate genes, followed by stepwise Sanger sequencing of the candidate genes one by one. The development of Next Generation Sequencing (NGS) has revolutionized the diagnostic approach. Using massively parallel sequencing (MPS) analysis of the entire mitochondrial genome, mtDNA point mutations and deletions can be detected and quantified in one single step. The NGS approach also allows simultaneous analyses of a group of genes or the whole exome; thus, the mutations in causative gene(s) can be identified in one step. New approaches make genetic analyses much faster and more efficient. The huge amounts of sequencing data produced by the new technologies have brought new challenges to bioinformatics, analytical pipelines, and interpretation of numerous novel variants. This article reviews the clinical utility of next generation sequencing for the molecular diagnoses of complex dual genome mitochondrial disorders.

8.
The cattle genome contains several distinct centromeric satellites with interrelated evolutionary histories. We compared these satellites in Bovini species that diverged 0.2 to about 5 Myr ago. Quantification of hybridization signals by phosphor imaging revealed a large variation in the relative amounts of the major satellites. In the genome of water buffalo this has led to the complete deletion of satellite III. Comparative sequencing and PCR-RFLP analysis of satellites IV, 1.711a, and 1.711b from the related Bos and Bison species revealed heterogeneities in 0.5 to 2% of the positions, again with variations in the relative amounts of sequence variants. Restriction patterns generated by double digestions suggested a recombination of sequence variants. Our results are compatible with a model of the life history of satellites during which homogeneity of interacting repeat units is both cause and consequence of the rapid turnover of satellite DNA. Initially, a positive feedback loop leads to a rapid saltatory amplification of homogeneous repeat units. In the second phase, mutations inhibit the interaction of repeat units and coexisting sequence variants amplify independently. Homogenization by the spreading of one of the variants is prevented by recombination and the satellite is eventually outcompeted by another, more homogeneous tandem repeat sequence. Received: 21 July 2000 / Accepted: 30 October 2000

9.
Mutations in the ABCA1 gene are the cause of familial high density lipoprotein deficiency (FHD). Because these mutations are spread over the entire gene, their detection requires the sequencing of all 50 exons. The aim of this study was to validate denaturing high-performance liquid chromatography (DHPLC) in mutation detection as an alternative to systematic sequencing. Exons of the ABCA1 gene were amplified using primers employed for sequencing. Temperatures for DHPLC were deduced using software and empirically defined for each amplicon. To assess DHPLC reliability, we tested 30 sequence variants found in FHD patients and controls. Combined DHPLC and sequencing was applied to the genotyping of new FHD patients. Most of the amplicons required from two to five temperature conditions to obtain partially denatured DNA over the entire amplicon length. Twenty-nine of the variants found by sequencing were detected by DHPLC (97% sensitivity). The detection of the last variant (in exon 40) required different primers and amplification conditions. DHPLC and sequencing analysis of new FHD patients revealed that all amplicons showing a heteroduplex DHPLC profile contained sequence variants. No variants were detected in amplicons with a homoduplex profile. DHPLC is a sensitive and reliable method for the detection of ABCA1 gene mutations.
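Note: the 97% sensitivity quoted in this abstract follows directly from the counts it reports (29 of 30 known variants detected). A minimal sketch of that arithmetic is given below; the function name `sensitivity` is a hypothetical helper introduced here for illustration and is not taken from the paper.

```python
def sensitivity(detected: int, total: int) -> float:
    """Fraction of known variants that the assay detected."""
    if total <= 0:
        raise ValueError("total must be a positive count")
    return detected / total


# Counts as reported in the abstract: 29 of 30 variants detected by DHPLC.
dhplc_sensitivity = sensitivity(29, 30)
print(f"{dhplc_sensitivity:.1%}")  # 96.7%, reported as 97% after rounding
```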

10.
Ribosomal DNA (rDNA) repeats of the plant-parasitic nematode Meloidogyne arenaria are heterogeneous in size and appear to contain 5S rRNA gene sequences. Moreover, in a recA+ bacterial host, plasmid clones of a 9 kb rDNA repeat show deletion events within a 2 kb intergenic spacer (IGS), between 28S and 5S DNA sequences. These deletions appear to result from a reduction in the number of tandem 129 bp repeats in the IGS. The loss of such repeats might explain how rDNA length heterogeneity, observed in the Meloidogyne genome, could have arisen. Each 129 bp repeat also contains three copies of an 8 bp subrepeat, which has sequence similarity to an element found in the IGS repeats of some plant rDNAs.

11.
Ribosomal (r)DNA undergoes concerted evolution, the mechanisms of which are unequal crossing over and gene conversion. Despite the fundamental importance of these mechanisms to the evolution of rDNA, their rates have been estimated only in a few model species. We estimated recombination rate in rDNA by quantifying the relative frequency of intraindividual length variants in an expansion segment of the 18S rRNA gene of the cladoceran crustacean, Daphnia obtusa, in four apomictically propagated lines. We also used quantitative PCR to estimate rDNA copy number. The apomictic lines were sampled every 5 generations for 90 generations, and we considered each significant change in the frequency distribution of length variants between time intervals to be the result of a recombination event. Using this method, we calculated the recombination rate for this region to be 0.02-0.06 events/generation on the basis of three different estimates of rDNA copy number. In addition, we observed substantial changes in rDNA copy number within and between lines. Estimates of haploid copy number varied from 53 to 233, with a mean of 150. We also measured the relative frequency of length variants in 30 lines at generations 5, 50, and 90. Although length variant frequencies changed significantly within and between lines, the overall average frequency of each length variant did not change significantly between the three generations sampled, suggesting that there is little or no bias in the direction of change due to recombination.

12.
It has been suggested that Locusta migratoria amplifies its ribosomal RNA genes in the growing oocytes (Kunz (1967) Chromosoma 20, 332–370). Cloned ribosomal DNA of L. migratoria was used to analyze rDNA structure and number. The rDNA is localized on three chromosome pairs in six nucleolus organizers. It was found that all structural variants of the rRNA genes which have been described previously are represented in the same relative amounts in DNA from isolated oocytes as in somatic cells. Furthermore, the rRNA gene number is not increased in oocyte DNA, i.e., amplification does not occur. Therefore, the large number of multiple nucleoli seen in the growing oocytes has to be interpreted as the fully extended and fully active set of chromosomal rRNA genes. The total rRNA gene number was determined by dot blot hybridization to be about 3300 genes/haploid genome.

13.
14.
The nuclear 18S, 5.8S and 25S rRNA genes exist as thousands of rDNA repeats in the Scots pine genome. The number and location of rDNA loci (nucleolus organizers, NORs) were studied by cytological methods, and a restriction map from the coding region of the Scots pine rDNA repeat was constructed using digoxigenin-labeled flax rDNA as a probe. Based on the maximum number of nucleoli and chromosomal secondary constrictions, Scots pine has at least eight NORs in its haploid genome. The size of the Scots pine rDNA repeat unit is approximately 27 kb, two- or threefold larger than the typical angiosperm rDNA unit, but similar in size to other characterized conifer rDNA repeats. The intergenic spacer region (IGS) of the rDNA repeat unit in Scots pine is longer than 20 kb, and the transcribed spacer regions surrounding the 5.8S gene (ITS1 and ITS2) span a region of 2.9 kb. Restriction analysis revealed that although the coding regions of rDNA repeats are homogeneous, heterogeneity exists in the intergenic spacer region between individuals, as well as among the rDNA repeats within individuals.

15.
Lactoris fernandeziana, endemic to the island of Masatierra in the Juan Fernandez Archipelago, is the only living member of the primitive angiosperm family, Lactoridaceae. The species was surveyed for ribosomal DNA (rDNA) and RAPD (Random Amplified Polymorphic DNA) variation. Previous analyses of allozymes had revealed no variation within the species. Variation was found for length in the intergenic spacer and for restriction sites in the 18S–25S genes of rDNA, and for the presence of amplified bands using 16 primers. Different rDNA repeat lengths and restriction site variants were detected within individuals as well as within and among populations. The level of variation in RAPDs is low relative to other Juan Fernandez endemic species surveyed, and nearly all variants were restricted to single populations. The rDNA length variants were distributed throughout the island, whereas the rDNA restriction site variants and RAPD markers indicated minor genetic differences among the populations.

16.
The human alpha-fetoprotein gene spans 19,489 base pairs from the putative "Cap" site to the polyadenylation site. It is composed of 15 exons separated by 14 introns, which are symmetrically placed within the three domains of alpha-fetoprotein. In the 5' region, a putative TATAAA box is at position -21, and a variant sequence, CCAAC, of the common CAT box is at -65. Enhancer core sequences GTGGTTTAAAG are found in introns 3 and 4, and several copies of glucocorticoid response sequences AGATACAGTA are found on the template strand of the gene. There are six polymorphic sites within 4690 base pairs of contiguous DNA derived from two allelic alpha-fetoprotein genes. This amounts to a measured polymorphic frequency of 0.13%, or 6.4 × 10^-4/site, which is about 5-10 times lower than values estimated from studies on polymorphic restriction sites in other regions of the human genome. There are four types of repetitive sequence elements in the introns and flanking regions of the human alpha-fetoprotein gene. At least one of these is apparently a novel structure (designated Xba) and is found as a pair of direct repeats, with one copy in intron 7 and the other in intron 8. It is conceivable that within the last 2 million years the copy in intron 8 gave rise to the repeat in intron 7. Their present location on both sides of exon 8 gives these sequences a potential for disrupting the functional integrity of the gene in the event of an unequal crossover between them. There are three Alu elements, one of which is in intron 4; the others are located in the 3' flanking region. A solitary Kpn repeat is found in intron 3. The Xba and Kpn repeats were only detected by complete sequencing of the introns. Neither X, Xba, nor Kpn elements are present in the related human albumin gene, whereas Alu elements are present in different positions. From phylogenetic evidence, it appears that Alu elements were inserted into the alpha-fetoprotein gene at some time postdating the mammalian radiation 85 million years ago.
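Note: the two polymorphism figures quoted in this abstract (0.13% and 6.4 × 10^-4/site) can both be reproduced from the reported counts. The sketch below shows one reading that yields both numbers; the assumption that the per-site value divides by the sites of both allelic copies is an inference made here, not something the abstract states, and `polymorphic_frequency` is a hypothetical helper name.

```python
def polymorphic_frequency(sites: int, length_bp: int, copies: int = 1) -> float:
    """Polymorphic sites divided by the number of positions compared.

    `copies` is an assumption of this sketch: set it to 2 to count the
    positions of both allelic sequences in the denominator.
    """
    if length_bp <= 0 or copies <= 0:
        raise ValueError("length_bp and copies must be positive")
    return sites / (length_bp * copies)


# Counts as reported: six polymorphic sites within 4690 bp.
per_sequence = polymorphic_frequency(6, 4690)            # about 0.00128, i.e. 0.13%
per_site_both = polymorphic_frequency(6, 4690, copies=2)  # about 6.4e-4

print(f"{per_sequence:.2%}")
print(f"{per_site_both:.1e}")
```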

17.
Multiple copies of a gene may lead to difficulty in the interpretation of typing results because polymorphism of the copies may wrongly lead to the conclusion that different types are present in a specimen. To determine the copy number per genome of the nuclear rDNA and beta-tubulin genes analyzed for the typing of Pneumocystis carinii f. sp. hominis, we developed a strategy based on the use of the same multicompetitor molecule in two different quantitative-competitive PCRs, one for the gene under study and the other for a reference single-copy gene, allowing direct comparison of the results of both PCRs. Control experiments showed that the strategy was sensitive enough to detect duplication of a gene. The copy number of the nuclear rDNA operon was determined by amplification of the intron of the 26S rDNA gene and that of the beta-tubulin gene by amplification of the region surrounding intron 6. The method was first tested on P. c. carinii, the special form commonly infecting rats. Pneumocystis c. carinii was found to contain a single copy of the rDNA operon. The method was then applied to P. c. hominis. The results confirmed that the P. c. hominis genome contains a single copy of the nuclear rDNA and beta-tubulin genes.

18.
The 78 101 base pair long sequence of a cluster of 22-kDa alpha zein genes in the maize inbred BSSS53 was determined. Each zein gene is contained within a repeat unit that varies in length. If such a repeat, or amplicon, is aligned along the entire sequence, a 10.5-fold sequence amplification is delineated. Because of insertions and deletions in intergenic regions, many of the zein genes are spaced over different distances. Only three out of 10 zein-related sequences have an intact open reading frame, indicating an unusually large number of genes unable to contribute to the accumulation of normal-size 22-kDa zein proteins. It is proposed that the seven remaining zein-related sequences be considered gene reserves because of their potential to be restored by gene conversion. Intergenic insertions in the cluster range from 1098 to 14 896 base pairs. Although they are composed of transposable element sequences, they also contain additional open reading frames, two of them showing homology to rice cDNA sequences. The average amplicon is 4423 base pairs long, with the sequence surrounding each zein gene more than 90% conserved. Coincidentally, the size of the amplicon is equivalent to the average gene density (one gene within 4640 bp) in the Arabidopsis thaliana genome, one of the smallest in plants. Successive steps of amplification and insertion of DNA might explain to a certain degree how genome size variation has been generated in plants.

19.
20.
Complete and accurate knowledge of the genes and allelic variants of the human immunoglobulin gene loci is critical for studies of B cell repertoire development and somatic point mutation, but evidence from studies of VDJ rearrangements suggests that our knowledge of the available immunoglobulin gene repertoire is far from complete. The reported repertoire has changed little over the last 15 years. This is, in part, a consequence of the inefficiencies involved in searching for new members of large, multigenic gene families by cloning and sequencing. The advent of high-throughput sequencing provides a new avenue by which the germline repertoire can be explored. In this report, we describe pyrosequencing studies of the heavy chain IGHV1, IGHV3 and IGHV4 gene subgroups in ten Papua New Guineans. Thousands of 454 reads aligned with complete identity to 51 previously reported functional IGHV genes and allelic variants. A new gene, IGHV3-NL1*01, was identified, which differs from the nearest previously reported gene by 15 nucleotides. Sixteen new IGHV alleles were also identified, 15 of which varied from previously reported functional IGHV genes by between one and four nucleotides, while one sequence appears to be a functional variant of the pseudogene IGHV3-25. BLAST searches suggest that at least six of these new genes are carried within the relatively well-studied populations of North America, Europe or Asia. This study substantially expands the known immunoglobulin gene repertoire and demonstrates that genetic variation of immunoglobulin genes can now be efficiently explored in different human populations using high-throughput pyrosequencing.
