Similar articles
20 similar articles found (search time: 15 ms)
1.
2.

Background

Unambiguous human leukocyte antigen (HLA) typing is important in transplant matching and disease association studies. High-resolution HLA typing that is not restricted to the peptide-binding region can decrease HLA allele ambiguities. Cost and technology constraints have hampered high-throughput and efficient high resolution unambiguous HLA typing. We have developed a method for HLA genotyping that preserves the very high-resolution that can be obtained by next-generation sequencing (NGS) but also achieves substantially increased efficiency. Unambiguous HLA-A, B, C and DRB1 genotypes can be determined for 96 individuals in a single run of the Illumina MiSeq.

Results

Long-range amplification of full-length HLA genes from four loci was performed in separate polymerase chain reactions (PCR) using primers and PCR conditions that were optimized to reduce co-amplification of other HLA loci. Amplicons from the four HLA loci of each individual were then pooled and subjected to enzymatic library generation. All four loci of an individual were then tagged with one unique index combination. This multi-locus individual tagging (MIT) method combined with NGS enabled the four loci of 96 individuals to be analyzed in a single 500-cycle paired-end sequencing run of the Illumina MiSeq. The sequence reads generated by the MIT-NGS method from the four loci were then discriminated using commercially available NGS HLA typing software. Comparison of the MIT-NGS with Sanger sequence-based HLA typing methods showed that all the ambiguities and discordances between the two methods were due to the accuracy of the MIT-NGS method.
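The per-individual tagging step described above can be sketched as follows. This is only a minimal illustration, assuming each read arrives with a pair of index sequences and that a sample sheet maps every index combination to one individual. The index sequences, sample names, and the `demultiplex` helper are hypothetical; the abstract does not specify the index set or the software used.

```python
from collections import defaultdict

# Hypothetical sample sheet: one unique (index1, index2) combination per individual.
SAMPLE_SHEET = {
    ("ATCACG", "TTAGGC"): "individual_01",
    ("ATCACG", "ACAGTG"): "individual_02",
    ("CGATGT", "TTAGGC"): "individual_03",
}


def demultiplex(reads, sample_sheet):
    """Group read sequences by individual using their index combination.

    `reads` is an iterable of (index1, index2, sequence) tuples. Reads whose
    index pair is not in the sample sheet are collected under "undetermined".
    """
    bins = defaultdict(list)
    for index1, index2, sequence in reads:
        sample = sample_sheet.get((index1, index2), "undetermined")
        bins[sample].append(sequence)
    return dict(bins)


# Made-up reads, for illustration only.
reads = [
    ("ATCACG", "TTAGGC", "GATTACA"),
    ("ATCACG", "ACAGTG", "CCGGTTA"),
    ("ATCACG", "TTAGGC", "TTGACCA"),
    ("GGGGGG", "TTAGGC", "ACGTACG"),
]
binned = demultiplex(reads, SAMPLE_SHEET)
```

Because all four loci of an individual share one index combination, assigning reads to loci would still have to happen afterwards, for example by the HLA typing software mentioned in the abstract.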

Conclusions

The MIT-NGS method enabled accurate, robust and cost effective simultaneous analyses of four HLA loci per sample and produced 6 or 8-digit high-resolution unambiguous phased HLA typing data from 96 individuals in a single NGS run.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-864) contains supplementary material, which is available to authorized users.

3.
Single nucleotide polymorphisms (SNPs) and insertions-deletions (InDels) are valuable molecular markers for genomics and genetics studies and molecular breeding. The advent of next-generation sequencing techniques has enabled researchers to approach high-throughput and cost-effective SNP and InDel discovery on a genomic scale. In this report, 36 common bean genotypes grown in Canada were used to construct reduced representation libraries for next-generation sequencing. Using 76 million sequence reads generated by the Illumina HiSeq 2000 Sequencing System, we identified a total of 43,698 putative SNPs and 1,267 putative InDels. Of the SNPs, 43,504 were bi-allelic and 194 were tri-allelic, and the InDels comprised 574 insertions and 693 deletions. The putative bi-allelic SNPs were distributed across all 11 chromosomes with the highest number of SNPs observed in chromosome 2 (4,788), and the lowest in chromosome 10 (2,941). With the aid of the recent release of the first chromosome-scale version of Phaseolus vulgaris, 24,907 bi-allelic SNPs, 79 tri-allelic SNPs, 315 insertions, and 377 deletions were located in 8,758, 77, 273, and 364 genes, respectively. Among these 24,907 bi-allelic SNPs, 7,168 were nonsynonymous and were located in 4,303 genes across the 36 common bean genotypes. A total of 113 putative SNPs were randomly chosen for validation using high-resolution melt analysis. Of the 113 candidate SNPs, 105 (92.9%) contained the predicted SNPs.
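The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only re-derives the totals and the validation rate from the figures quoted above; the variable names are illustrative.

```python
# Counts as reported in the abstract above.
bi_allelic, tri_allelic, total_snps = 43_504, 194, 43_698
insertions, deletions, total_indels = 574, 693, 1_267
validated, tested = 105, 113

# The bi- and tri-allelic SNPs should add up to the reported SNP total,
# and the insertions and deletions to the reported InDel total.
snps_consistent = bi_allelic + tri_allelic == total_snps
indels_consistent = insertions + deletions == total_indels

# Validation rate of the high-resolution melt analysis, in percent.
validation_rate = round(100 * validated / tested, 1)
```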

4.
5.

Background

Copy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing.

Results

A total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson’s correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding.

Conclusions

Our results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-962) contains supplementary material, which is available to authorized users.

6.
A primary component of next-generation sequencing analysis is to align short reads to a reference genome, with each read aligned independently. However, reads that observe the same non-reference DNA sequence are highly correlated and can be used to better model the true variation in the target genome. A novel short-read micro re-aligner, SRMA, that leverages this correlation to better resolve a consensus of the underlying DNA sequence of the targeted genome is described here.

7.
Genotyping of multilocus gene families, such as the major histocompatibility complex (MHC), may be challenging because of problems with assigning alleles to loci and copy number variation among individuals. Simultaneous amplification and genotyping of multiple loci may be necessary, and in such cases, next-generation deep amplicon sequencing offers great promise as a genotyping method of choice. Here, we describe jMHC, a computer program developed for analysing and assisting in the visualization of deep amplicon sequencing data. The software operates on FASTA files; therefore, output from any sequencing technology may be used. jMHC was designed specifically for MHC studies but it may be useful for analysing amplicons derived from other multigene families or for genotyping other polymorphic systems. The program is written in Java with a user-friendly graphical user interface (GUI) and can be run on Microsoft Windows, Linux OS and Mac OS.

8.
The ongoing revolution in DNA sequencing technology now enables the reading of thousands of millions of nucleotide bases in a single instrument run. However, this data quantity is often compromised by poor confidence in the read quality. The identification of genetic polymorphisms from this data is therefore problematic and, combined with the vast quantity of data, poses a major bioinformatics challenge. However, once these difficulties have been addressed, next-generation sequencing will offer a means to identify and characterize the wealth of genetic polymorphisms underlying the vast phenotypic variation in biological systems. We describe the recent advances in next-generation sequencing technology, together with preliminary approaches that can be applied for single nucleotide polymorphism discovery in plant species.

9.
10.
Wei X  Ju X  Yi X  Zhu Q  Qu N  Liu T  Chen Y  Jiang H  Yang G  Zhen R  Lan Z  Qi M  Wang J  Yang Y  Chu Y  Li X  Guang Y  Huang J 《PloS one》2011,6(12):e29500

Background

Identification of gene variants plays an important role in research on and diagnosis of genetic diseases. A combination of enrichment of targeted genes and next-generation sequencing (targeted DNA-HiSeq) results in both high efficiency and low cost for targeted sequencing of genes of interest.

Methodology/Principal Findings

To identify mutations associated with genetic diseases, we designed an array-based gene chip to capture all of the exons of 193 genes involved in 103 genetic diseases. To evaluate this technology, we selected seven samples from seven patients with six different genetic diseases resulting from six disease-causing genes and 100 samples from normal human adults as controls. The data obtained showed that on average, 99.14% of 3,382 exons with more than 30-fold coverage were successfully detected using Targeted DNA-HiSeq technology, and we found six known variants in four disease-causing genes and two novel mutations in two other disease-causing genes (the STS gene for XLI and the FBN1 gene for MFS) as well as one exon deletion mutation in the DMD gene. These results were confirmed in their entirety using either the Sanger sequencing method or real-time PCR.

Conclusions/Significance

Targeted DNA-HiSeq combines next-generation sequencing with the capture of sequences from a relevant subset of high-interest genes. This method was tested by capturing sequences from a DNA library through hybridization to oligonucleotide probes specific for genetic disorder-related genes and was found to show high selectivity, improve the detection of mutations, enable the discovery of novel variants, and provide additional indel data. Thus, targeted DNA-HiSeq can be used to analyze the gene variant profiles of monogenic diseases with high sensitivity, fidelity, throughput and speed.

11.
Molecular Biology Reports - Rice landraces are vital genetic resources for agronomic and quality traits but the undeniable collection of Kerala landraces remains poorly delineated. To effectively...

12.
Autoinflammatory diseases form one group of primary immunodeficiency diseases that are generally thought to be caused by mutations in genes responsible for innate immunity, rather than acquired immunity. Mutations related to autoinflammatory diseases occur in 12 genes. For example, low-level somatic mosaic NLRP3 mutations underlie chronic infantile neurologic, cutaneous, articular syndrome (CINCA), also known as neonatal-onset multisystem inflammatory disease (NOMID). In current clinical practice, clinical genetic testing plays an important role in providing patients with quick, definite diagnoses. To increase the availability of such testing, low-cost high-throughput gene-analysis systems are required, ones that not only have the sensitivity to detect even low-level somatic mosaic mutations, but also can operate simply in a clinical setting. To this end, we developed a simple method that employs two-step tailed PCR and an NGS system, the MiSeq platform, to detect mutations in all coding exons of the 12 genes responsible for autoinflammatory diseases. Using this amplicon sequencing system, we amplified a total of 234 amplicons derived from the 12 genes with multiplex PCR. This was done simultaneously and in one test tube. Each sample was distinguished by an index sequence in the second PCR primers following PCR amplification. With our procedure and tips for reducing PCR amplification bias, we were able to analyze 12 genes from 25 clinical samples in one MiSeq run. Moreover, with the certified primers designed by our short program, which detects and avoids common SNPs in gene-specific PCR primers, we used this system for routine genetic testing. Our optimized procedure uses a simple protocol, which can easily be followed by virtually any office medical staff. Because of the small PCR amplification bias, we can analyze several clinical DNA samples simultaneously at low cost and can obtain sufficient read numbers to detect a low level of somatic mosaic mutations.

13.
14.

Background

Influenza viruses exist as a large group of closely related viral genomes, also called quasispecies. The composition of this influenza viral quasispecies can be determined by an accurate and sensitive sequencing technique and data analysis pipeline. We compared the suitability of two benchtop next-generation sequencers for whole genome influenza A quasispecies analysis: the Illumina MiSeq sequencing-by-synthesis and the Ion Torrent PGM semiconductor sequencing technique.

Results

We first compared the accuracy and sensitivity of both sequencers using plasmid DNA and different ratios of wild type and mutant plasmid. Illumina MiSeq sequencing reads were one and a half times more accurate than those of the Ion Torrent PGM. The majority of sequencing errors were substitutions on the Illumina MiSeq and insertions and deletions, mostly in homopolymer regions, on the Ion Torrent PGM. To evaluate the suitability of the two techniques for determining the genome diversity of influenza A virus, we generated plasmid-derived PR8 virus and grew this virus in vitro. We also optimized an RT-PCR protocol to obtain uniform coverage of all eight genomic RNA segments. The sequencing reads obtained with both sequencers could successfully be assembled de novo into the segmented influenza virus genome. After mapping of the reads to the reference genome, we found that the detection limit for reliable recognition of variants in the viral genome required a frequency of 0.5% or higher. This threshold exceeds the background error rate resulting from the RT-PCR reaction and the sequencing method. Most of the variants in the PR8 virus genome were present in hemagglutinin, and these mutations were detected by both sequencers.
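The 0.5% detection limit mentioned above amounts to a frequency cut-off on non-reference bases. The sketch below is a minimal illustration of that idea, not the authors' pipeline: it assumes per-position base counts are already available (for example from a pileup of mapped reads), and the `call_variants` helper, the positions, and the counts are all hypothetical.

```python
MIN_FREQUENCY = 0.005  # the 0.5% detection limit reported in the abstract


def call_variants(pileup, reference, min_frequency=MIN_FREQUENCY):
    """Return (position, base, frequency) for non-reference bases at or above the cut-off.

    `pileup` maps a position to a dict of base -> read count;
    `reference` maps a position to the reference base.
    """
    calls = []
    for position in sorted(pileup):
        counts = pileup[position]
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage, nothing to call
        for base, count in sorted(counts.items()):
            if base == reference[position]:
                continue
            frequency = count / depth
            if frequency >= min_frequency:
                calls.append((position, base, frequency))
    return calls


# Made-up counts, for illustration only.
pileup = {
    100: {"A": 9_940, "G": 60},  # G at 0.6%: at or above the limit
    101: {"C": 9_980, "T": 20},  # T at 0.2%: below the limit
}
reference = {100: "A", 101: "C"}
calls = call_variants(pileup, reference)
```

A fixed cut-off like this treats every position the same; it does not model the platform-specific error profiles (substitutions versus homopolymer indels) that the abstract describes.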

Conclusions

Our approach underlines the power and limitations of two commonly used next-generation sequencers for the analysis of influenza virus gene diversity. We conclude that the Illumina MiSeq platform is better suited for detecting variant sequences whereas the Ion Torrent PGM platform has a shorter turnaround time. The data analysis pipeline that we propose here will also help to standardize variant calling in small RNA genomes based on next-generation sequencing data.

15.
Khaya senegalensis (African mahogany or dry-zone mahogany) is a high-value hardwood timber species with great potential for forest plantations in northern Australia. The species is distributed across the sub-Saharan belt from Senegal to Sudan and Uganda. Because of heavy exploitation and constraints on natural regeneration and sustainable planting, it is now classified as a vulnerable species. Here, we describe the development of microsatellite markers for K. senegalensis using next-generation sequencing to assess its intra-specific diversity across its natural range, which is key to successful breeding programs and effective conservation management of the species. Next-generation sequencing yielded 93,943 sequences with an average read length of 234 bp. The assembled sequences contained 1,030 simple sequence repeats, with primers designed for 522 microsatellite loci. Twenty-one microsatellite loci were tested, with 11 showing reliable amplification and polymorphism in K. senegalensis. The 11 novel microsatellites, together with one previously published, were used to assess 73 accessions belonging to the Australian K. senegalensis domestication program, sampled from across the natural range of the species. STRUCTURE analysis shows two major clusters, one comprising mainly accessions from west Africa (Senegal to Benin) and the second based in the far eastern limits of the range in Sudan and Uganda. Higher levels of genetic diversity were found in material from western Africa. This suggests that new seed collections from this region may yield more diverse genotypes than those originating from Sudan and Uganda in eastern Africa.

16.
《MABS-AUSTIN》2013,5(3):628-636
To gain insight into the functional antibody repertoire of rabbits, the VH and VL repertoires of bone marrow (BM) and spleen (SP) of a naïve New Zealand White rabbit (NZW; Oryctolagus cuniculus) and that of lymphocytes collected from a NZW rabbit immunized (IM) with a 16-mer peptide were deep-sequenced. Two closely related genes, IGHV1S40 (VH1a3) and IGHV1S45 (VH4), were found to dominate (~90%) the VH repertoire of BM and SP, whereas, IGHV1S69 (VH1a1) contributed significantly (~40%) to IM. BM and SP antibodies recombined predominantly with IGHJ4. A significant proportion (~30%) of IM sequences recombined with IGHJ2. The VK repertoire was encoded by nine IGKV genes recombined with one IGKJ gene, IGKJ1. No significant bias in the VK repertoire of the BM, SP and IM samples was observed. The complementarity-determining region (CDR)-H3 and -L3 length distributions were similar in the three samples following a Gaussian curve with average length of 12.2 ± 2.4 and 11.1 ± 1.1 amino acids, respectively. The amino acid composition of the predominant CDR-H3 and -L3 loop lengths was similar to that of humans and mice, rich in Tyr, Gly, Ser and, in some specific positions, Asp. The average number of mutations along the IGHV/KV genes was similar in BM, SP and IM; close to 12 and 15 mutations for VH and VL, respectively. A monoclonal antibody specific for the peptide used as immunogen was obtained from the IM rabbit. The CDR-H3 sequence was found in 1,559 of 61,728 (2.5%) sequences, at position 10, in the rank order of the CDR-H3 frequencies. The CDR-L3 was found in 24 of 11,215 (0.2%) sequences, ranking 102. No match was found in the BM and SP samples, indicating positive selection for the hybridoma sequence. 
Altogether, these findings lay foundations for engineering of rabbit V regions to enhance their potential as therapeutics, i.e., design of strategies for selection of specific rabbit V regions from NGS data mining, humanization and design of libraries for affinity maturation campaigns.
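The rank-order statistics quoted in this abstract (for example, a CDR-H3 found in 1,559 of 61,728 sequences, ranking 10th) come from counting identical CDR3 sequences and sorting them by frequency. The sketch below shows one way this could be done; the `rank_cdr3` helper and the CDR-H3 strings are made up for illustration and are not taken from the study.

```python
from collections import Counter


def rank_cdr3(sequences):
    """Return {cdr3: (rank, count, percent)}, where rank 1 is the most frequent."""
    counts = Counter(sequences)
    total = len(sequences)
    ranked = {}
    for rank, (cdr3, count) in enumerate(counts.most_common(), start=1):
        ranked[cdr3] = (rank, count, 100 * count / total)
    return ranked


# Made-up CDR-H3 strings, for illustration only.
observed = ["ARDYGGSW"] * 5 + ["ARGSYDYW"] * 3 + ["ARSGYAFDI"] * 2
ranking = rank_cdr3(observed)
```

Looking up the CDR3 of a known hybridoma clone in such a table gives its rank and share of the repertoire, which is the kind of figure the abstract reports.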


18.
19.
Single nucleotide polymorphisms (SNPs) are essential to the understanding of population genetic variation and diversity. Here, we performed restriction-site-associated DNA sequencing (RAD-seq) on 72 individuals from 13 Chinese indigenous and three introduced chicken breeds. A total of 620 million reads were obtained using an Illumina HiSeq 2000 sequencer. An average of 75,587 SNPs were identified from each individual. Further strict filtering validated 28,895 SNP candidates across all populations. When compared with the NCBI dbSNP (chicken_9031), 15,404 SNPs were new discoveries. In this study, RAD-seq was performed for the first time on chickens, demonstrating its effectiveness and its potential applications in genetic analysis and in breeding techniques for whole-genome selection in chickens and other agricultural animals.

20.
Transposable elements (TEs) constitute the most active, diverse and ancient component in a broad range of genomes. Complete understanding of genome function and evolution cannot be achieved without a thorough understanding of TE impact and biology. However, in-depth analysis of TEs still represents a challenge due to the repetitive nature of these genomic entities. In this work, we present a broadly applicable and flexible tool: T-lex2. T-lex2 is the only available software that allows routine, automatic and accurate genotyping of individual TE insertions and estimation of their population frequencies using both individual strain and pooled next-generation sequencing data. Furthermore, T-lex2 also assesses the quality of the calls, allowing the identification of misannotated TEs and providing the necessary information to re-annotate them. The flexible and customizable design of T-lex2 allows it to be run on any genome and for any type of TE insertion. Here, we tested the fidelity of T-lex2 using the fly and human genomes. Overall, T-lex2 represents a significant improvement in our ability to analyze the contribution of TEs to genome function and evolution and to learn about the biology of TEs. T-lex2 is freely available online at http://sourceforge.net/projects/tlex.
