Similar literature
 20 similar articles found (search time: 15 ms)
1.
Characterization and population genetic analysis of multilocus genes, such as those found in the major histocompatibility complex (MHC) is challenging in nonmodel vertebrates. The traditional method of extensive cloning and Sanger sequencing is costly and time‐intensive and indirect methods of assessment often underestimate total variation. Here, we explored the suitability of 454 pyrosequencing for characterizing multilocus genes for use in population genetic studies. We compared two sample tagging protocols and two bioinformatic procedures for 454 sequencing through characterization of a 185‐bp fragment of MHC DRB exon 2 in wolverines (Gulo gulo) and further compared the results with those from cloning and Sanger sequencing. We found 10 putative DRB alleles in the 88 individuals screened with between two and four alleles per individual, suggesting amplification of a duplicated DRB gene. In addition to the putative alleles, all individuals possessed an easily identifiable pseudogene. In our system, sequence variants with a frequency below 6% in an individual sample were usually artefacts. However, we found that sample preparation and data processing procedures can greatly affect variant frequencies in addition to the complexity of the multilocus system. Therefore, we recommend determining a per‐amplicon‐variant frequency threshold for each unique system. The extremely deep coverage obtained in our study (approximately 5000×) coupled with the semi‐quantitative nature of pyrosequencing enabled us to assign all putative alleles to the two DRB loci, which is generally not possible using traditional methods. Our method of obtaining locus‐specific MHC genotypes will enhance population genetic analyses and studies on disease susceptibility in nonmodel wildlife species.
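Illustration (not part of the abstract): the per-amplicon variant frequency threshold described above can be sketched roughly as follows. This is a minimal sketch, assuming reads have already been assigned to one individual's amplicon; the function name `call_variants`, the toy reads and the use of 6% as the default cut-off are illustrative assumptions, and the abstract itself recommends determining the threshold separately for each system.

```python
from collections import Counter


def call_variants(reads, min_freq=0.06):
    """Keep variants whose within-amplicon frequency is at least min_freq.

    reads: list of sequence strings from a single individual's amplicon.
    min_freq: hypothetical threshold; 0.06 mirrors the 6% figure quoted
              in the abstract, but it should be tuned per system.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        seq: n / total
        for seq, n in counts.items()
        if n / total >= min_freq
    }


# Toy amplicon: two well-supported variants and one rare variant (5%).
reads = ["ACGT"] * 50 + ["ACGA"] * 45 + ["ACGC"] * 5
putative_alleles = call_variants(reads)
```

In this toy input the 5% variant falls below the cut-off and is treated as a likely artefact, while the two common variants are retained with their frequencies.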

2.
We address the bioinformatic issue of accurately separating amplified genes of the major histocompatibility complex (MHC) from artefacts generated during high‐throughput sequencing workflows. We fit observed ultra‐deep sequencing depths (hundreds to thousands of sequences per amplicon) of allelic variants to expectations from genetic models of copy number variation (CNV). We provide a simple, accurate and repeatable method for genotyping multigene families, evaluating our method via analyses of 209 bp of MHC class IIb exon 2 in guppies (Poecilia reticulata). Genotype repeatability for resequenced individuals (N = 49) was high (100%) within the same sequencing run. However, repeatability dropped to 83.7% between independent runs, either because of lower mean amplicon sequencing depth in the initial run or random PCR effects. This highlights the importance of fully independent replicates. Significant improvements in genotyping accuracy were made by greatly reducing type I genotyping error (i.e. accepting an artefact as a true allele), which may occur when using low‐depth allele validation thresholds used by previous methods. Only a small amount (4.9%) of type II error (i.e. rejecting a genuine allele as an artefact) was detected through fully independent sequencing runs. We observed 1–6 alleles per individual, and evidence of sharing of alleles across loci. Variation in the total number of MHC class II loci among individuals, both among and within populations, was also observed, and some genotypes appeared to be partially hemizygous; total allelic dosage added up to an odd number of allelic copies. Collectively, observations provide evidence of MHC CNV and its complex basis in natural populations.

3.
PCR and sequencing artefacts can seriously bias population genetic analyses, particularly of populations with low genetic variation such as endangered vertebrate populations. Here, we estimate the error rates, discuss their population genetics implications, and propose a simple detection method that helps to reduce the risk of accepting such errors. We study the major histocompatibility complex (MHC) class IIB of guppies (Poecilia reticulata) and find that PCR base misincorporations inflate the apparent sequence diversity. When analysing neutral genes, such bias can inflate estimates of effective population size. Previously suggested protocols for identifying genuine alleles are unlikely to exclude all sequencing errors, or they ignore genuine sequence diversity. We present a novel and statistically robust method that reduces the likelihood of accepting PCR artefacts as genuine alleles, and which minimises the necessity of repeated genotyping. Our method identifies sequences that are unlikely to be a PCR artefact, and which need to be independently confirmed through additional PCR of the same template DNA. The proposed methods are recommended particularly for population genetic studies that involve multi-template DNA and in studies on genes with low genetic diversity.

4.
Genes of the highly dynamic major histocompatibility complex (MHC) are directly linked to individual fitness and are of high interest in evolutionary ecology and conservation genetics. Gene duplication and positive selection usually lead to high levels of polymorphism in the MHC region, making genotyping of MHC a challenging task. Here, we compare the performance of two methods for MHC class I genotyping in a passerine with highly duplicated MHC class I genes: capillary electrophoresis-single-strand conformation polymorphism (CE-SSCP) analysis and 454 GS FLX Titanium pyrosequencing. According to our findings, the number of MHC variants (called alleles for simplicity) detected by CE-SSCP is significantly lower than that detected by 454. To resolve discrepancies between the two methods, we cloned and Sanger sequenced an MHC class I amplicon for an individual with a high number of alleles. We found perfect congruence between cloning/Sanger sequencing results and 454. Thus, in the case of multi-locus amplification, CE-SSCP considerably underestimates individual MHC diversity. However, the numbers of alleles detected by both methods are significantly correlated, although the correlation is weak (r = 0.32). Thus, in systems with highly duplicated MHC, 454 provides more reliable information on individual diversity than CE-SSCP.

5.
Genes of the major histocompatibility complex (MHC) are considered a paradigm of adaptive evolution at the molecular level and as such are frequently investigated by evolutionary biologists and ecologists. Accurate genotyping is essential for understanding of the role that MHC variation plays in natural populations, but may be extremely challenging. Here, I discuss the DNA-based methods currently used for genotyping MHC in non-model vertebrates, as well as techniques likely to find widespread use in the future. I also highlight the aspects of MHC structure that are relevant for genotyping, and detail the challenges posed by the complex genomic organization and high sequence variation of MHC loci. Special emphasis is placed on designing appropriate PCR primers, accounting for artefacts and the problem of genotyping alleles from multiple, co-amplifying loci, a strategy which is frequently necessary due to the structure of the MHC. The suitability of typing techniques is compared in various research situations, strategies for efficient genotyping are discussed and areas of likely progress in future are identified. This review addresses the well established typing methods such as the Single Strand Conformation Polymorphism (SSCP), Denaturing Gradient Gel Electrophoresis (DGGE), Reference Strand Conformational Analysis (RSCA) and cloning of PCR products. In addition, it includes the intriguing possibility of direct amplicon sequencing followed by the computational inference of alleles and also next generation sequencing (NGS) technologies; the latter technique may, in the future, find widespread use in typing complex multilocus MHC systems.

6.
The genotyping of highly polymorphic multigene families across many individuals used to be a particularly challenging task because of methodological limitations associated with traditional approaches. Next‐generation sequencing (NGS) can overcome most of these limitations, and it is increasingly being applied in population genetic studies of multigene families. Here, we critically review NGS bioinformatic approaches that have been used to genotype the major histocompatibility complex (MHC) immune genes, and we discuss how the significant advances made in this field are applicable to population genetic studies of gene families. Increasingly, approaches are introduced that apply thresholds of sequencing depth and sequence similarity to separate alleles from methodological artefacts. We explain why these approaches are particularly sensitive to methodological biases by violating fundamental genotyping assumptions. An alternative strategy that utilizes ultra‐deep sequencing (hundreds to thousands of sequences per amplicon) to reconstruct genotypes and applies statistical methods on the sequencing depth to separate alleles from artefacts appears to be more robust. Importantly, the ‘degree of change’ (DOC) method avoids using arbitrary cut‐off thresholds by looking for statistical boundaries between the sequencing depth for alleles and artefacts, and hence, it is entirely repeatable across studies. Although the advances made in generating NGS data are still far ahead of our ability to perform reliable processing, analysis and interpretation, the community is developing statistically rigorous protocols that will allow us to address novel questions in evolution, ecology and genetics of multigene families. Future developments in third‐generation single molecule sequencing may potentially help overcome problems that still persist in de novo multigene amplicon genotyping when using current second‐generation sequencing approaches.
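Illustration (not part of the abstract): the idea of locating a boundary in sequencing depth, rather than fixing a cut-off in advance, can be sketched as below. This is a deliberately simplified stand-in and not the published DOC computation: it assumes the boundary is the largest relative drop between consecutive depth-ranked variants. The function name `split_by_largest_drop` and the toy depths are hypothetical.

```python
def split_by_largest_drop(depths):
    """Split variants into putative alleles and artefacts.

    depths: dict mapping variant name -> read depth within one amplicon.
    Simplifying assumption: the allele/artefact boundary is wherever the
    ratio between consecutive depth-ranked variants is largest. This is an
    illustrative heuristic, not the DOC statistic itself.
    """
    ordered = sorted(depths.items(), key=lambda kv: kv[1], reverse=True)
    if len(ordered) < 2:
        return [name for name, _ in ordered], []
    # Ratio of each variant's depth to the next one down the ranking;
    # max(..., 1) guards against division by zero for zero-depth variants.
    ratios = [
        ordered[i][1] / max(ordered[i + 1][1], 1)
        for i in range(len(ordered) - 1)
    ]
    cut = ratios.index(max(ratios)) + 1
    alleles = [name for name, _ in ordered[:cut]]
    artefacts = [name for name, _ in ordered[cut:]]
    return alleles, artefacts


# Toy amplicon: two deep variants followed by a tail of shallow ones.
depths = {"v1": 480, "v2": 450, "v3": 30, "v4": 22, "v5": 18}
alleles, artefacts = split_by_largest_drop(depths)
```

Because the boundary comes from the data of each amplicon, the same input always yields the same split, which is the repeatability property the abstract emphasizes.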

7.
Genotyping of multilocus gene families, such as the major histocompatibility complex (MHC), may be challenging because of problems with assigning alleles to loci and copy number variation among individuals. Simultaneous amplification and genotyping of multiple loci may be necessary, and in such cases, next-generation deep amplicon sequencing offers great promise as a genotyping method of choice. Here, we describe jMHC, a computer program developed for analysing and assisting in the visualization of deep amplicon sequencing data. The software operates on FASTA files; therefore, output from any sequencing technology may be used. jMHC was designed specifically for MHC studies, but it may be useful for analysing amplicons derived from other multigene families or for genotyping other polymorphic systems. The program is written in Java with a user-friendly graphical user interface (GUI) and can be run on Microsoft Windows, Linux OS and Mac OS.

8.
The critical role of major histocompatibility complex (MHC) genes in disease resistance, along with their putative function in sexual selection, reproduction and chemical ecology, makes them an important genetic system in evolutionary ecology. Studying selective pressures acting on MHC genes in the wild nevertheless requires population-wide genotyping, which has long been challenging because of their extensive polymorphism. Here, we report on large-scale genotyping of the MHC class II loci of the grey mouse lemur (Microcebus murinus) from a wild population in western Madagascar. The second exons from MHC-DRB and -DQB of 772 and 672 individuals were sequenced, respectively, using a 454 sequencing platform, generating more than 800,000 reads. Sequence analysis, through a stepwise variant validation procedure, allowed reliable typing of more than 600 individuals. The quality of our genotyping was evaluated through three independent methods, namely genotyping the same individuals by both cloning and 454 sequencing, running duplicates, and comparing parent–offspring dyads; each displayed very high accuracy. A total of 61 (including 20 new) and 60 (including 53 new) alleles were detected at DRB and DQB genes, respectively. Both loci were non-duplicated, in tight linkage disequilibrium and in Hardy–Weinberg equilibrium, despite the fact that sequence analysis revealed clear evidence of historical selection. Our results highlight the potential of 454 sequencing technology in attempts to investigate patterns of selection shaping MHC variation in contemporary populations. The power of this approach will nevertheless be conditional upon strict quality control of the genotyping data.

9.
Variation in the major histocompatibility complex (MHC) class I of the European bison was characterized in a sample of 99 individuals using both classical cloning/Sanger sequencing and 454 pyrosequencing. Three common haplotypes (frequencies: 0.348, 0.328, and 0.283) contain 1-3 classical class I loci. A variable number of nonclassical transcribed loci, pseudogenes, and/or gene fragments, which was difficult to estimate precisely, was also found. The presence of 2 additional rare haplotypes (frequency of 0.020 each), observed only in heterozygotes, was inferred. The overall organization of MHC I appears similar to the cattle system, but genetic variation is much lower, with only 7 classical class I alleles, approximately one-tenth of the number known in cattle and a quarter of that known in the American bison. Extensive transspecific polymorphism was found. MHC I is in strong linkage disequilibrium with the previously studied MHC II DRB3 gene. The most likely explanation for the low variation is a drastic bottleneck at the beginning of the 20th century. Genotype frequencies conformed to Hardy-Weinberg expectations; no signatures of selection were found in contemporary populations, but strong signatures of historical positive selection were found in sequences of classical alleles. A quick and reliable method of MHC I genotyping was developed.

10.
Genes of the vertebrate major histocompatibility complex (MHC) are of great interest to biologists because of their important role in immunity and disease, and their extremely high levels of genetic diversity. Next generation sequencing (NGS) technologies are quickly becoming the method of choice for high-throughput genotyping of multi-locus templates like MHC in non-model organisms. Previous approaches to genotyping MHC genes using NGS technologies suffer from two problems: 1) a “gray zone” where low frequency alleles and high frequency artifacts can be difficult to disentangle and 2) a similar sequence problem, where very similar alleles can be difficult to distinguish as two distinct alleles. Here we present a new method for genotyping MHC loci – Stepwise Threshold Clustering (STC) – that addresses these problems by taking full advantage of the increase in sequence data provided by NGS technologies. Unlike previous approaches for genotyping MHC with NGS data that attempt to classify individual sequences as alleles or artifacts, STC uses a quasi-Dirichlet clustering algorithm to cluster similar sequences at increasing levels of sequence similarity. By applying frequency and similarity based criteria to clusters rather than individual sequences, STC is able to successfully identify clusters of sequences that correspond to individual or similar alleles present in the genomes of individual samples. Furthermore, STC does not require duplicate runs of all samples, increasing the number of samples that can be genotyped in a given project. We show how the STC method works using a single sample library. We then apply STC to 295 threespine stickleback (Gasterosteus aculeatus) samples from four populations and show that neighboring populations differ significantly in MHC allele pools. We show that STC is a reliable, accurate, efficient, and flexible method for genotyping MHC that will be of use to biologists interested in a variety of downstream applications.

11.
Characterization of highly duplicated genes, such as genes of the major histocompatibility complex (MHC), where multiple loci often co‐amplify, has until recently been hindered by insufficient read depths per amplicon. Here, we used ultra‐deep Illumina sequencing to resolve genotypes at exon 3 of MHC class I genes in the sedge warbler (Acrocephalus schoenobaenus). We sequenced 24 individuals in two replicates and used these data, as well as a simulated data set, to test the effect of amplicon coverage (range: 500–20 000 reads per amplicon) on the repeatability of genotyping using four different genotyping approaches. A third replicate employed unique barcoding to assess the extent of tag jumping, that is, swapping of individual tag identifiers, which may confound genotyping. The reliability of MHC genotyping increased with coverage and approached or exceeded 90% within‐method repeatability of allele calling at coverages of >5000 reads per amplicon. We found generally high agreement between genotyping methods, especially at high coverages. High reliability of the tested genotyping approaches was further supported by our analysis of the simulated data set, although the genotyping approach relying primarily on replication of variants in independent amplicons proved sensitive to repeatable errors. According to the most repeatable genotyping method, the number of co‐amplifying variants per individual ranged from 19 to 42. Tag jumping was detectable, but at such low frequencies that it did not affect the reliability of genotyping. We thus demonstrate that gene families with many co‐amplifying genes can be reliably genotyped using HTS, provided that there is sufficient per amplicon coverage.

12.
13.
The power of population genetic analyses is often limited by sample size resulting from constraints in financial resources and time to genotype large numbers of individuals. This particularly applies to nonmodel species where detailed genomic knowledge is lacking. Next‐generation sequencing technology using primers ‘tagged’ with an individual barcode of a few nucleotides offers the opportunity to genotype hundreds of individuals at several loci in parallel (Binladen et al. 2007; Meyer et al. 2008). The large number of sequence reads can also be used to identify artefacts by frequency distribution thresholds intrinsically determined for each run and data set. In Babik et al. (2009), next‐generation deep sequencing was used to genotype several major histocompatibility complex (MHC) class IIB loci of the European bank vole (Fig. 1). Their approach can be useful for many researchers working with complex multiallelic templates and large sample sizes.
Figure 1. Hypothetical example of parallel genotyping of two individuals using individually bar‐coded primers. Polymerase chain reactions (PCRs) are performed separately for each individual using a forward primer with a unique Tag‐sequence of four nucleotides. After sequencing of pooled PCR products, sequences can be sorted by their forward primer Tag (Tag‐sorting error rate was estimated < 0.1%). Rare sequences most likely represent artefacts and due to the large amount of sequences obtained (up to 10^6) the artefact threshold can be determined intrinsically for each data set and was estimated to be around 3% in the case of bank vole MHC class IIB genes (Babik et al. 2009). Photos by Gabriela Bydlon.
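Illustration (not part of the abstract): sorting pooled reads by a four-nucleotide forward-primer tag, as described in the figure caption, might look like the sketch below. It assumes the tag occupies the first four bases of each read and requires an exact match; the tag sequences, individual labels and the function name `sort_by_tag` are hypothetical.

```python
from collections import defaultdict


def sort_by_tag(reads, tag_to_individual, tag_len=4):
    """Assign pooled reads to individuals by their leading tag.

    reads: list of read strings, each assumed to start with the tag.
    tag_to_individual: dict mapping tag sequence -> individual label.
    Returns (reads per individual with the tag trimmed off, unassigned reads).
    """
    by_individual = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        individual = tag_to_individual.get(tag)
        if individual is None:
            # Unknown tag: keep the read aside instead of guessing.
            unassigned.append(read)
        else:
            by_individual[individual].append(insert)
    return dict(by_individual), unassigned


# Hypothetical tags for two individuals and a small pooled read set.
tags = {"ACAC": "individual_1", "TGTG": "individual_2"}
pooled = ["ACACGGATTC", "TGTGGGATTA", "ACACGGATTC", "CCCCGGATTC"]
sorted_reads, unassigned = sort_by_tag(pooled, tags)
```

Once reads are grouped per individual, a per-data-set frequency threshold such as the roughly 3% figure quoted in the caption could be applied to each group separately.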

14.
15.
Next‐generation sequencing (NGS) technologies are revolutionizing the fields of biology and medicine as powerful tools for amplicon sequencing (AS). Using combinations of primers and barcodes, it is possible to sequence targeted genomic regions with deep coverage for hundreds, even thousands, of individuals in a single experiment. This is extremely valuable for the genotyping of gene families in which locus‐specific primers are often difficult to design, such as the major histocompatibility complex (MHC). The utility of AS is, however, limited by the high intrinsic sequencing error rates of NGS technologies and other sources of error such as polymerase amplification or chimera formation. Correcting these errors requires extensive bioinformatic post‐processing of NGS data. Amplicon Sequence Assignment (AmpliSAS) is a tool that performs analysis of AS results in a simple and efficient way, while offering customization options for advanced users. AmpliSAS is designed as a three‐step pipeline consisting of (i) read demultiplexing, (ii) unique sequence clustering and (iii) erroneous sequence filtering. Allele sequences and frequencies are retrieved in Excel spreadsheet format, making them easy to interpret. AmpliSAS performance has been successfully benchmarked against previously published genotyped MHC data sets obtained with various NGS technologies.

16.
Major histocompatibility complex (MHC) genes encode proteins that play a central role in vertebrates' adaptive immunity to parasites. MHC loci are among the most polymorphic in vertebrates' genomes, inspiring many studies to identify evolutionary processes driving MHC polymorphism within populations and divergence between populations. Leading hypotheses include balancing selection favouring rare alleles within populations, and spatially divergent selection. These hypotheses do not always produce diagnosably distinct predictions, causing many studies of MHC to yield inconsistent or ambiguous results. We suggest a novel strategy to distinguish balancing vs. divergent selection on MHC, taking advantage of natural admixture between parapatric populations. With divergent selection, individuals with immigrant alleles will be more infected and less fit because they are susceptible to novel parasites in their new habitat. With balancing selection, individuals with locally rare immigrant alleles will be more fit (less infected). We tested these contrasting predictions using three‐spine stickleback from three replicate pairs of parapatric lake and stream habitats. We found numerous positive and negative associations between particular MHC IIβ alleles and particular parasite taxa. A few allele–parasite comparisons supported balancing selection, and others supported divergent selection between habitats. But, there was no overall tendency for fish with immigrant MHC alleles to be more or less heavily infected. Instead, locally rare MHC alleles (not necessarily immigrants) were associated with heavier infections. Our results illustrate the complex relationship between MHC IIβ allelic variation and spatially varying multispecies parasite communities: different hypotheses may be concurrently true for different allele–parasite combinations.

17.
Lenz TL, Becker S. Gene, 2008, 427(1-2): 117-123
Genetic variation in coding regions is of strong interest for biologists as it represents an important factor that drives evolution. To analyse polymorphic loci, researchers usually rely on commonly used typing techniques such as cloning, SSCP, DGGE or RSCA. However, there are potential pitfalls in screening multi-allelic templates, mainly the formation of sequence chimeras during PCR amplification and of mosaic sequences during cloning. One of the most challenging genomic regions to explore is the Major Histocompatibility Complex (MHC), which codes for peptide-binding proteins of the vertebrate adaptive immune system and is well known for its exceptional polymorphism. We compared the effect of two different PCR amplification approaches in a study of the MHC class IIB genes of the three-spined stickleback (Gasterosteus aculeatus). One approach used standard PCR conditions and the other a combination of several measures to eliminate PCR artefacts. In both approaches, the amplicons obtained were cloned and sequenced. In the first, established approach, 24% of the clones represented artefacts, while in the second approach the number of artefacts was reduced ten-fold. Furthermore, the second approach enabled easy differentiation between real alleles and artificial sequences. We also analysed the potential effects of such artefacts in genetic analysis and evolutionary interpretation, and found a slight reduction in the signature of positive selection and an increase in recombination events. Consequently, we strongly recommend applying the new PCR approach described in this study when genotyping MHC or other polymorphic genes.

18.
Library preparation protocols for most sequencing technologies involve PCR amplification of the template DNA, which opens the possibility that a given template DNA molecule is sequenced multiple times. Reads arising from this phenomenon, known as PCR duplicates, inflate the cost of sequencing and can jeopardize the reliability of affected experiments. Despite the pervasiveness of this artefact, our understanding of its causes and of its impact on downstream statistical analyses remains essentially empirical. Here, we develop a general quantitative model of amplification distortions in sequencing data sets, which we leverage to investigate the factors controlling the occurrence of PCR duplicates. We show that the PCR duplicate rate is determined primarily by the ratio between library complexity and sequencing depth, and that amplification noise (including in its dependence on the number of PCR cycles) only plays a secondary role for this artefact. We confirm our predictions using new and published RAD-seq libraries and provide a method to estimate library complexity and amplification noise in any data set containing PCR duplicates. We discuss how amplification-related artefacts impact downstream analyses, and in particular genotyping accuracy. The proposed framework unites the numerous observations made on PCR duplicates and will be useful to experimenters of all sequencing technologies where DNA availability is a concern.
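Illustration (not part of the abstract): a rough sense of why the duplicate rate depends on the ratio between library complexity and sequencing depth can be had from a textbook uniform-sampling approximation. This is not the quantitative model developed in the paper: it assumes every unique molecule is equally likely to be sequenced and ignores amplification noise entirely. The function name `expected_duplicate_rate` is hypothetical.

```python
import math


def expected_duplicate_rate(complexity, depth):
    """Expected fraction of reads that are PCR duplicates.

    complexity: number of unique template molecules in the library.
    depth: number of reads sequenced.
    Assumption: reads are drawn uniformly at random from the unique
    molecules, so the expected number of distinct molecules observed is
    approximately complexity * (1 - exp(-depth / complexity)).
    """
    if complexity <= 0 or depth <= 0:
        raise ValueError("complexity and depth must be positive")
    # -expm1(-x) equals 1 - exp(-x), computed accurately for small x.
    expected_unique = complexity * -math.expm1(-depth / complexity)
    return 1.0 - expected_unique / depth


# Same depth-to-complexity ratio, very different absolute sizes.
rate_small = expected_duplicate_rate(complexity=1_000, depth=2_000)
rate_large = expected_duplicate_rate(complexity=1_000_000, depth=2_000_000)
```

Under this approximation the two calls give the same rate because only the ratio of depth to complexity enters the expression, which is consistent with the abstract's statement that this ratio is the primary determinant.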

19.
The advance of next generation sequencing (NGS) techniques provides an unprecedented opportunity to probe the enormous diversity of the immune repertoire by deep sequencing T-cell receptors (TCRs) and B-cell receptors (BCRs). However, an efficient and accurate analytical tool is still needed to process the huge amount of data. We have developed a high-resolution analytical pipeline, Immune Monitor (“IMonitor”), to tackle this task. This method utilizes realignment to identify V(D)J genes and alleles after common local alignment. We compare IMonitor with other published tools using simulated and public rearranged sequences, and it demonstrates superior performance in most aspects. In addition, a methodology is developed to correct PCR and sequencing errors and to minimize the PCR bias among various rearranged sequences with different V and J gene families. IMonitor provides general adaptation for sequences from all receptor chains of different species and outputs useful statistics and visualizations. In the final part of this article, we demonstrate its application to minimal residual disease detection in patients with B-cell acute lymphoblastic leukemia. In summary, this package should be widely useful for immune repertoire analysis.

20.
Genotyping of classical major histocompatibility complex (MHC) genes is challenging when they are hypervariable and occur in multiple copies. In this study, we used several different approaches to genotype the moderately variable MHC class I exon 3 (MHCIe3) and the highly polymorphic MHC class II exon 2 (MHCIIβe2) in the bluethroat (Luscinia svecica). Two family groups (eight individuals) were sequenced in replicates at both markers using Ion Torrent technology with both a single‐ and a dual‐indexed primer structure. Additionally, MHCIIβe2 was sequenced on Illumina MiSeq. Allele calling was conducted by modifications of the pipeline developed by Sommer et al. (BMC Genomics, 14, 2013, 542) and the software AmpliSAS. While the different genotyping strategies gave largely consistent results for MHCIe3, with a maximum of eight alleles per individual, MHCIIβe2 was remarkably complex with a maximum of 56 MHCIIβe2 alleles called for one individual. Each genotyping strategy detected on average 50%–82% of all MHCIIβe2 alleles per individual, but dropouts were largely allele‐specific and consistent within families for each strategy. The discrepancies among approaches indicate PCR biases caused by the platform‐specific primer tails. Further, AmpliSAS called fewer alleles than the modified Sommer pipeline. Our results demonstrate that allelic dropout is a significant problem when genotyping the hypervariable MHCIIβe2. As these genotyping errors are largely nonrandom and method‐specific, we caution against comparing genotypes across different genotyping strategies. Nevertheless, we conclude that high‐throughput approaches provide a major advance in the challenging task of genotyping hypervariable MHC loci, even though they may not reveal the complete allelic repertoire.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号