Similar literature
20 similar articles found.
1.

Background

The goal of haplotype assembly is to infer haplotypes of an individual from a mixture of sequenced chromosome fragments. Limited lengths of paired-end sequencing reads and inserts render haplotype assembly computationally challenging; in fact, most of the problem formulations are known to be NP-hard. Dimensions (and, therefore, difficulty) of the haplotype assembly problems keep increasing as the sequencing technology advances and the lengths of reads and inserts grow. The computational challenges are even more pronounced in the case of polyploid haplotypes, whose assembly is considerably more difficult than in the case of diploids. Fast, accurate, and scalable methods for haplotype assembly of diploid and polyploid organisms are needed.

Results

We develop a novel framework for diploid/polyploid haplotype assembly from high-throughput sequencing data. The method formulates the haplotype assembly problem as a semi-definite program and exploits its special structure – namely, the low rank of the underlying solution – to solve it rapidly and with high accuracy. The developed framework is applicable to both diploid and polyploid species. The code for SDhaP is freely available at https://sourceforge.net/projects/sdhap.

Conclusion

Extensive benchmarking tests on both real and simulated data show that the proposed algorithms outperform several well-known haplotype assembly methods in terms of either accuracy or speed or both. Useful recommendations for coverages needed to achieve near-optimal solutions are also provided.

2.
3.

Background

The mitochondrial cytochrome c oxidase subunit I (COI) gene is being used increasingly for evaluating inter- and intra-specific genetic diversity of ciliated protists. However, very few studies focus on assessing genetic divergence of the COI gene within individuals and how its presence might affect species identification and population structure analyses.

Methodology/Principal findings

We evaluated the genetic variation of the COI gene in five Paramecium species for a total of 147 clones derived from 21 individuals and 7 populations. We identified a total of 90 haplotypes with several individuals carrying more than one haplotype. Parsimony network and phylogenetic tree analyses revealed that intra-individual diversity had no effect in species identification and only a minor effect on population structure.

Conclusions

Our results suggest that the COI gene is a suitable marker for resolving inter- and intra-specific relationships of Paramecium spp.

4.
5.
6.

Background

Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information.

Methods

A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (π) of the haplotype covariates had zero effect, i.e. a Bayesian mixture method.

Results

About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some cases, decreased the bias of prediction. With the Bayesian method, accuracy of prediction was less sensitive to parameter π when fitting haplotypes compared to fitting markers.

Conclusions

Use of haplotypes based on genealogy can slightly increase the accuracy of genomic prediction. Improved methods to cluster the haplotypes constructed from local genealogy could lead to additional gains in accuracy.

7.

Background

β2 adrenergic receptor (ADRβ2) polymorphisms including ADRβ2+46G>A have been reported to cause adverse outcomes in mild asthmatics. The extent to which ADRβ2 polymorphisms and in particular their haplotypes contribute to severe asthma is unknown.

Objective

To determine the association of ADRβ2 polymorphisms and haplotypes with asthma severity.

Methods

Caucasians (n = 2979) were genotyped for 11 ADRβ2 polymorphisms. The cohort (mean age 39.6, 60% female) included 2296 non-asthmatics, 386 mild asthmatics, 172 moderate asthmatics and 125 severe asthmatics. Haplotype frequency and haplotype pair for each subject were determined using the PHASE algorithm.

Results

The three asthmatic cohorts were comparable in age and gender but were distinguishable from each other in terms of symptoms, spirometry, medication use and health care utilisation (p <0.001). None of the polymorphisms showed a genotypic or allelic association with asthma diagnosis or severity. Nine haplotypes were identified and no association was found with asthma diagnosis or severity per se. Haplotype pair 2/4 was associated with asthma severity (Trend Test, OR 1.42, p = 0.0008) but not with asthma per se. Prevalence of haplotype pair 2/2 appeared to decrease with asthma severity (Trend Test, OR 0.78, p = 0.067). Two new haplotypes were identified, occurring exclusively in asthmatics at a frequency of ≥ 1%. In addition, a positive association between carriage of ADRβ2 +523*C and increased risk of atopy was discovered.

Conclusions

ADRβ2 haplotype pair 2/4 is associated with severe asthma and is consistent with findings of poor bronchodilator response in mild asthmatics who are also haplotype 2/4.

8.

Background

Evidence regarding the association of variation within ADRB2, the gene encoding the beta-adrenergic receptor 2 (ADRB2), with obesity and hypertension is exceedingly ambiguous. Despite negative reports, functional impacts of individual genetic variants have been reported. Functional haplotypes and haplotype combinations affecting in vivo expression levels of ADRB2 mRNA and protein, as well as receptor sensitivity, have also been reported. The aim of the present study was therefore to evaluate whether variations within ADRB2, as haplotypes or as haplotype combinations, confer an increased prevalence of obesity and hypertension among adults.

Methodology/Principal Findings

We genotyped five variants required to capture common variation in a region including the ADRB2 locus in a population-based study of 6,514 unrelated, middle-aged Danes. Phases of the genotypes were estimated in silico. The variations were then investigated for their combined association with obesity, hypertension and related quantitative traits. The present study did not find consistent evidence for an association of ADRB2 variants with either obesity or hypertension when variations were analysed in a case-control study. The same lack of impact was also seen in the quantitative trait analyses, apart from nominal differences on waist-to-hip ratio and systolic blood pressure between specific haplotype combinations.

Conclusions/Significance

In a population-based sample of 6,514 Danes we found no consistent associations between five common variants which tag the ADRB2 locus and prevalence of obesity or hypertension, whether analysed as individual haplotypes or as haplotype pairs.

9.

Background and Aims

The hypothesis of an ancient introduction, i.e. archaeophyte origin, is one of the most challenging questions in phylogeography. Arundo donax (Poaceae) is currently considered to be one of the worst invasive species globally, but it has also been widely utilized by man across Eurasia for millennia. Despite a lack of phylogenetic data, recent literature has often speculated on its introduction to the Mediterranean region.

Methods

This study tests the hypothesis of its ancient introduction from Asia to the Mediterranean by using plastid DNA sequencing and morphometric analysis on 127 herbarium specimens collected across sub-tropical Eurasia. In addition, a bioclimatic species distribution model calibrated on 1221 Mediterranean localities was used to identify similar ecological niches in Asia.

Key Results

Despite analysis of several plastid DNA hypervariable sites and the identification of 13 haplotypes, A. donax was represented by a single haplotype from the Mediterranean to the Middle East. This haplotype is shared with invasive samples worldwide, and its nearest phylogenetic relatives are located in the Middle East. Morphometric data characterized this invasive clone by a robust morphotype distinguishable from all other Asian samples. The ecological niche modelling designated the southern Caspian Sea, southern Iran and the Indus Valley as the most suitable regions of origin in Asia for the invasive clone of A. donax.

Conclusions

Using an integrative approach, an ancient dispersion of this robust, polyploid and non-fruiting clone is hypothesized from the Middle East to the west, leading to its invasion throughout the Mediterranean Basin.

10.
Zhao JY  Sun JW  Gu ZY  Wang J  Wang EL  Yang XY  Qiao B  Duan WY  Huang GY  Wang HY 《PloS one》2012,7(2):e31644

Background

Clinical research indicates that periconceptional administration of folic acid can reduce the occurrence of congenital cardiac septal defects (CCSDs). Folate plays vital roles in three ways: as the unique methyl donor for regulation of DNA expression, in the de novo biosynthesis of purine and pyrimidine for DNA construction, and in the removal of serum homocysteine. Thymidylate synthase (TYMS) is the sole enzyme catalysing the de novo synthesis of dTMP, which is the essential precursor for DNA biosynthesis and repair. To examine the role of TYMS in CCSD risk, we investigated whether genetic polymorphisms in the TYMS gene are associated with CCSDs in a Han Chinese population.

Method

Polymorphisms in the noncoding region of TYMS were identified via direct sequencing in 32 unrelated individuals composed of half CCSDs and half control subjects. Nine SNPs and two insertion/deletion polymorphisms were genotyped from two independent case-control studies involving a total of 529 CCSDs patients and 876 healthy control participants. The associations were examined by both single polymorphism and haplotype tests using logistic regression.

Result

We found that TYMS polymorphisms were not related to altered CCSD risk, or to altered risk in the VSD subgroup, whether the two study groups were tested separately or in combination. In the haplotype analysis, no haplotypes were significantly associated with CCSD risk either.

Conclusion

Our results show no association between common genetic polymorphisms of the regulatory region of the TYMS gene and CCSDs in the Han Chinese population.

11.

Background

Atypical bovine spongiform encephalopathies (BSEs) are recently recognized prion diseases of cattle. Atypical BSEs are rare; approximately 30 cases have been identified worldwide. We tested prion gene (PRNP) haplotypes for an association with atypical BSE.

Methodology/Principal Findings

Haplotype tagging polymorphisms that characterize PRNP haplotypes from the promoter region through the three prime untranslated region of exon 3 (25.2 kb) were used to determine PRNP haplotypes of six available atypical BSE cases from Canada, France and the United States. One or two copies of a distinct PRNP haplotype were identified in five of the six cases (p = 1.3×10⁻⁴, two-tailed Fisher's exact test; CI95% 0.263–0.901, difference between proportions). The haplotype spans a portion of PRNP that includes part of intron 2, the entire coding region of exon 3 and part of the three prime untranslated region of exon 3 (13 kb).
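The two-tailed Fisher's exact test named above can be sketched in a few lines. This is only an illustration of the test itself: the abstract does not give the control counts, so the control row below (10 carriers among 100 animals) is a made-up placeholder, and the function name `fisher_exact_two_tailed` is an assumption introduced here.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. cases, controls); columns are carrier / non-carrier.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Five of six cases carry the haplotype (from the abstract);
# the control row is a HYPOTHETICAL placeholder, not data from the study.
p_value = fisher_exact_two_tailed(5, 1, 10, 90)
print(f"p = {p_value:.3g}")
```

Because the control counts are invented, the printed p-value is not expected to match the published one.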

Conclusions/Significance

This result suggests that a genetic determinant in or near PRNP may influence susceptibility of cattle to atypical BSE.

12.

Background

CRISPR-Cas9 is a revolutionary genome editing technique that allows for efficient and directed alterations of the eukaryotic genome. This relatively new technology has already been used in a large number of ‘loss of function’ experiments in cultured cells. Despite its simplicity and efficiency, screening for mutated clones remains time-consuming, laborious and/or expensive.

Results

Here we report a high-throughput screening strategy that allows parallel screening of up to 96 clones, using next-generation sequencing. As a proof of principle, we used CRISPR-Cas9 to disrupt the coding sequence of the homeobox gene, Evx1 in mouse embryonic stem cells. We screened 67 CRISPR-Cas9 transfected clones simultaneously by next-generation sequencing on the Ion Torrent PGM. We were able to identify both homozygous and heterozygous Evx1 mutants, as well as mixed clones, which must be identified to maintain the integrity of subsequent experiments.

Conclusions

Our CRISPR-Cas9 screening strategy could be widely applied to screen for CRISPR-Cas9 mutants in a variety of contexts including the generation of mutant cell lines for in vitro research, the generation of transgenic organisms and for assessing the veracity of CRISPR-Cas9 homology directed repair. This technique is cost and time-effective, provides information on clonal heterogeneity and is adaptable for use on various sequencing platforms.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1002) contains supplementary material, which is available to authorized users.

13.

Background

Human leukocyte antigen (HLA) is a group of genes that are extremely polymorphic among individuals and populations and have been associated with more than 100 different diseases and adverse drug effects. HLA typing is accordingly an important tool in clinical application, medical research, and population genetics. We have previously developed a phase-defined HLA gene sequencing method using MiSeq sequencing.

Results

Here we report a simple, high-throughput, and cost-effective sequencing method that includes normalized library preparation and adjustment of DNA molar concentration. We applied long-range PCR to amplify HLA-B for 96 samples followed by transposase-based library construction and multiplex sequencing with the MiSeq sequencer. After sequencing, we observed low variation in read percentages (0.2% to 1.55%) among the 96 demultiplexed samples. On this basis, all the samples were amenable to haplotype phasing using our phase-defined sequencing method. In our study, a sequencing depth of 800x was necessary and sufficient to achieve full phasing of HLA-B alleles with reliable assignment of the allelic sequence to the 8 digit level.

Conclusions

Our HLA sequencing method optimized for 96 multiplexing samples is highly time effective and cost effective and is especially suitable for automated multi-sample library preparation and sequencing.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-645) contains supplementary material, which is available to authorized users.

14.

Objective

CD5 plays a crucial role in autoimmunity and is a well-established genetic risk factor of developing RA. Recently, evidence of positive selection has been provided for the CD5 Pro224-Val471 haplotype in East Asian populations. The aim of the present work was to further analyze the functional relevance of non-synonymous CD5 polymorphisms conforming the ancestral and the newly derived haplotypes (Pro224-Ala471 and Pro224-Val471, respectively) as well as to investigate the potential role of CD5 on the development of SLE and/or SLE nephritis.

Methods

The CD5 SNPs rs2241002 (C/T; Pro224Leu) and rs2229177 (C/T; Ala471Val) were genotyped using TaqMan allelic discrimination assays in a total of 1,324 controls and 681 SLE patients of Spanish origin. CD3-mediated T cell proliferative and cytokine response profiles of healthy volunteers homozygous for the above-mentioned CD5 haplotypes were also analyzed in vitro.

Results

T-cell proliferation and cytokine release were significantly increased, showing a bias towards a Th2 profile, after CD3 cross-linking of peripheral mononuclear cells from healthy individuals homozygous for the ancestral Pro224-Ala471 (CC) haplotype, compared to the more recently derived Pro224-Val471 (CT). The same allelic combination was statistically associated with lupus nephritis.

Conclusion

The ancestral Ala471 CD5 allele confers lymphocyte hyper-responsiveness to TCR/CD3 cross-linking and is associated with nephritis in SLE patients.

15.

Background and Aims

The olive (Olea europaea subsp. europaea) was domesticated in the Mediterranean area but its wild relatives are distributed over three continents, from the Mediterranean basin to South Africa and south-western Asia. Recent studies suggested that this crop originated in the Levant while a secondary diversification occurred in most westward areas. A possible contribution of the Saharan subspecies (subsp. laperrinei) has been highlighted, but the data available were too limited to draw definite conclusions. Here, patterns of genetic differentiation in the Mediterranean and Saharan olives are analysed to test for recent admixture between these taxa.

Methods

Nuclear microsatellite and plastid DNA (ptDNA) data were compiled from previous studies and completed for a sample of 470 cultivars, 390 wild Mediterranean trees and 270 Saharan olives. A network was reconstructed for the ptDNA haplotypes, while a Bayesian clustering method was applied to identify the main gene pools in the data set and then simulate and test for early generations of admixture between Mediterranean and Saharan olives.

Key Results

Four lineages of ptDNA haplotypes are recognized: three from the Mediterranean basin and one from the Sahara. Only one haplotype, primarily distributed in the Sahara, is shared between laperrinei and europaea. This haplotype is detected once in ‘Dhokar’, a cultivar from the Maghreb. Nuclear microsatellites show geographic patterns of genetic differentiation in the Mediterranean olive that reflect the primary origins of cultivars in the Levant, and indicate a high genetic differentiation between europaea and laperrinei. No first-generation hybrid between europaea and laperrinei is detected, but recent, reciprocal admixture between Mediterranean and Saharan subspecies is found in a few accessions, including ‘Dhokar’.

Conclusions

This study reports for the first time admixture between Mediterranean and Saharan olives. Although its contribution remains limited, Laperrine's olive has been involved in the diversification of cultivated olives.

16.
17.

Background

Current methods for haplotype inference without pedigree information assume random mating populations. In animal and plant breeding, however, mating is often not random. A particular form of nonrandom mating occurs when parental individuals of opposite sex originate from distinct populations. In animal breeding this is called crossbreeding, and in plant breeding, hybridization. In these situations, association between marker and putative gene alleles might differ between the founding populations, and origin of alleles should be accounted for in studies which estimate breeding values with marker data. The sequence of alleles from one parent constitutes one haplotype of an individual. Haplotypes thus reveal allele origin in data of crossbred individuals.

Results

We introduce a new method for haplotype inference without pedigree that allows nonrandom mating and that can use genotype data of the parental populations and of a crossbred population. The aim of the method is to estimate line origin of alleles. The method has a Bayesian set up with a Dirichlet Process as prior for the haplotypes in the two parental populations. The basic idea is that only a subset of the complete set of possible haplotypes is present in the population.

Conclusion

Line origin of approximately 95% of the alleles at heterozygous sites was assessed correctly in both simulated and real data. Comparing accuracy of haplotype frequencies inferred with the new algorithm to the accuracy of haplotype frequencies inferred with PHASE, an existing algorithm for haplotype inference, showed that the DP algorithm outperformed PHASE in situations of crossbreeding and that PHASE performed better in situations of random mating.

18.

Background

In livestock populations, missing genotypes on a large proportion of animals are a major obstacle to implementing the estimation of marker-assisted breeding values using haplotypes. The objective of this article is to develop a method to predict haplotypes of animals that are not genotyped using mixed model equations and to investigate the effect of using these predicted haplotypes on the accuracy of marker-assisted breeding value estimation.

Methods

For genotyped animals, haplotypes were determined and for each animal the number of haplotype copies (nhc) was counted, i.e. 0, 1 or 2 copies. In a mixed model framework, nhc for each haplotype were predicted for ungenotyped animals as well as for genotyped animals using the additive genetic relationship matrix. The heritability of nhc was assumed to be 0.99, allowing for minor genotyping and haplotyping errors. The predicted nhc were subsequently used in marker-assisted breeding value estimation by applying random regression on these covariables. To evaluate the method, a population was simulated with one additive QTL and an additive polygenic genetic effect. The QTL was located in the middle of a haplotype based on SNP-markers.
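The counting step described above (0, 1 or 2 copies of each haplotype per genotyped animal) might look like the following sketch. The animal IDs, the haplotype strings and the function name `haplotype_copy_counts` are illustrative assumptions; the mixed-model prediction of nhc for ungenotyped animals is not shown.

```python
from collections import Counter


def haplotype_copy_counts(phased, all_haplotypes):
    """Return {animal: {haplotype: nhc}} with nhc in {0, 1, 2}.

    `phased` maps each genotyped animal to its two phased haplotypes
    (diploid animals are assumed).
    """
    table = {}
    for animal, pair in phased.items():
        if len(pair) != 2:
            raise ValueError(f"{animal}: expected exactly two haplotypes")
        counts = Counter(pair)
        # Haplotypes the animal does not carry get an explicit 0.
        table[animal] = {h: counts.get(h, 0) for h in all_haplotypes}
    return table


# Hypothetical 4-marker haplotypes coded as allele strings.
phased = {
    "animal_1": ("0101", "0101"),  # homozygous: 2 copies of 0101
    "animal_2": ("0101", "1100"),  # heterozygous: 1 copy each
}
haplotypes = sorted({h for pair in phased.values() for h in pair})
nhc = haplotype_copy_counts(phased, haplotypes)
print(nhc)
```

Each column of such a table is the covariable on which, per the abstract, random regression is later applied.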

Results

The accuracy of predicted haplotype copies for ungenotyped animals ranged between 0.59 and 0.64 depending on haplotype length. Because powerful BLUP-software was used, the method was computationally very efficient. The accuracy of total EBV increased for genotyped animals when marker-assisted breeding value estimation was compared with conventional breeding value estimation, but for ungenotyped animals the increase was marginal unless the heritability was smaller than 0.1. Haplotypes based on four markers yielded the highest accuracies and when only the nearest left marker was used, it yielded the lowest accuracy. The accuracy increased with increasing marker density. Accuracy of the total EBV approached that of gene-assisted BLUP when 4-marker haplotypes were used with a distance of 0.1 cM between the markers.

Conclusions

The proposed method is computationally very efficient and suitable for marker-assisted breeding value estimation in large livestock populations including effects of a number of known QTL. Marker-assisted breeding value estimation using predicted haplotypes increases accuracy especially for traits with low heritability.

19.
Yu X  Xie H  Wei B  Zhang M  Wang W  Wu J  Yan S  Zheng S  Zhou L 《PloS one》2011,6(11):e25933

Background

This work seeks to evaluate the association between the C/D ratios (plasma concentration of tacrolimus divided by daily dose of tacrolimus per body weight) of tacrolimus and the haplotypes of MDR1 gene combined by C1236T (rs1128503), G2677A/T (rs2032582) and C3435T (rs1045642), and to further determine the functional significance of haplotypes in the clinical pharmacokinetics of oral tacrolimus in Han Chinese liver transplant recipients.
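As a small worked example of the C/D ratio defined above (concentration divided by daily dose per body weight), the sketch below assumes units of ng/mL for concentration, mg/day for dose and kg for weight. The abstract does not state units, and the function name `cd_ratio` and the patient values are hypothetical.

```python
def cd_ratio(concentration_ng_per_ml, daily_dose_mg, body_weight_kg):
    """Concentration/dose ratio: concentration / (daily dose / body weight).

    Units are an assumption: ng/mL, mg/day and kg give (ng/mL) per (mg/kg/day).
    """
    if daily_dose_mg <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    dose_per_kg = daily_dose_mg / body_weight_kg
    return concentration_ng_per_ml / dose_per_kg


# Hypothetical patient: 8 ng/mL on 4 mg/day at 60 kg body weight.
ratio = cd_ratio(8.0, 4.0, 60.0)
print(round(ratio, 1))
```

A lower ratio means a higher weight-adjusted dose is needed to reach the same concentration.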

Methodology/Principal Findings

The tacrolimus blood concentrations were continuously recorded for one month after initial administration, and the peripheral blood DNA from a total of 62 liver transplant recipients was extracted. Genotyping of C1236T, G2677A/T and C3435T was performed, and SNP frequency, Hardy-Weinberg equilibrium, linkage disequilibrium, haplotype analysis and multiple testing were performed with the PLINK software. C/D ratios of different SNP groups or haplotype groups were compared, with a p value < 0.05 considered statistically significant. Linkage studies revealed that C1236T, G2677A/T and C3435T are genetically associated with each other. Patients carrying the T-T haplotype combined by C1236T and G2677A/T, and an additional T/T homozygote at either position, would require a higher dose of tacrolimus. Tacrolimus C/D ratios of liver transplant recipients varied significantly among different haplotype groups of the MDR1 gene.
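For readers unfamiliar with a Hardy-Weinberg equilibrium check, the sketch below shows a textbook chi-square goodness-of-fit test for a single biallelic SNP. It is not PLINK's implementation and does not reproduce the study's analysis; the genotype counts are hypothetical, and a triallelic site such as G2677A/T would need a different treatment.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square HWE test (1 df) from genotype counts AA, AB, BB.

    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip cells with zero expectation (monomorphic marker).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2.0))


# Hypothetical counts for 62 recipients at one SNP.
chi2, p_value = hwe_chi_square(20, 30, 12)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

With samples as small as 62, an exact test is usually preferred over the chi-square approximation.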

Conclusions

Our studies suggest that the genetic polymorphism could be used as a valuable molecular marker for the prediction of tacrolimus C/D ratios of liver transplant recipients.

20.

Background

Typhoid fever, caused by Salmonella enterica serovar Typhi (S. Typhi), is a major health problem especially in developing countries. Vaccines against typhoid are commonly used by travelers but less so by residents of endemic areas.

Methodology

We used single nucleotide polymorphism (SNP) typing to investigate the population structure of 372 S. Typhi isolated during a typhoid disease burden study and Vi vaccine trial in Kolkata, India. Approximately sixty thousand people were enrolled for fever surveillance for 19 months prior to, and 24 months following, Vi vaccination of one third of the study population (May 2003–December 2006, vaccinations given December 2004).

Principal Findings

A diverse S. Typhi population was detected, including 21 haplotypes. The most common were of the H58 haplogroup (69%), which included all multidrug resistant isolates (defined as resistance to chloramphenicol, ampicillin and co-trimoxazole). Quinolone resistance was particularly high among H58-G isolates (97% nalidixic acid resistant, 30% with reduced susceptibility to ciprofloxacin). Multiple typhoid fever episodes were detected in 22 households; however, household clustering was not associated with specific S. Typhi haplotypes.

Conclusions

Typhoid fever in Kolkata is caused by a diverse population of S. Typhi; however, H58 haplotypes dominate and are associated with multidrug and quinolone resistance. Vi vaccination did not obviously impact on the haplotype population structure of the S. Typhi circulating during the study period.
