1.
Background
Different classes of haplotype block algorithms exist, and the ideal dataset for assessing their performance would come from comprehensively re-sequencing a large genomic region in a large population. Such data sets are expensive to collect. As an alternative, we performed coalescent simulations to generate haplotypes with a high marker density and compared block partitioning results from diversity based, LD based, and information theoretic algorithms under different values of SNP density and allele frequency.
2.
Philippe Lamy Claus L Andersen Lars Dyrskjot Niels Torring Carsten Wiuf 《BMC bioinformatics》2007,8(1):434
Background
Affymetrix SNP arrays can interrogate thousands of SNPs at the same time. This allows us to look at the genomic content of cancer cells and to investigate the underlying events leading to cancer. Genomic copy-numbers are today routinely derived from SNP array data, but the proposed algorithms for this task most often disregard the genotype information available from germline cells in paired germline-tumour samples. Including this information may deepen our understanding of the "true" biological situation, e.g. by enabling analysis of allele-specific copy-numbers. Here we rely on matched germline-tumour samples and have developed a Hidden Markov Model (HMM) to estimate allelic copy-number changes in tumour cells. Furthermore, with this approach we are able to estimate the proportion of normal cells in the tumour (mixture proportion).
3.
Background
DNA pooling is a technique to reduce genotyping effort while incurring only minor losses in accuracy of allele frequency estimates for single nucleotide polymorphism (SNP) markers.
4.
Jason C Ting Ying Ye George H Thomas Ingo Ruczinski Jonathan Pevsner 《BMC bioinformatics》2006,7(1):25-21
Background
A variety of diseases are caused by chromosomal abnormalities such as aneuploidies (having an abnormal number of chromosomes), microdeletions, microduplications, and uniparental disomy. High density single nucleotide polymorphism (SNP) microarrays provide information on chromosomal copy number changes, as well as genotype (heterozygosity and homozygosity). SNP array studies generate multiple types of data for each SNP site, some with more than 100,000 SNPs represented on each array. The identification of different classes of anomalies within SNP data has been challenging.
5.
Hsin-Chou Yang Hsin-Chi Lin Meijyh Kang Chun-Houh Chen Chien-Wei Lin Ling-Hui Li Jer-Yuarn Wu Yuan-Tsong Chen Wen-Harn Pan 《BMC bioinformatics》2011,12(1):100
Background
Genome-wide single-nucleotide polymorphism (SNP) arrays containing hundreds of thousands of SNPs from the human genome have proven useful for studying important human genome questions. Data quality of SNP arrays plays a key role in the accuracy and precision of downstream data analyses. However, good indices for assessing data quality of SNP arrays have not yet been developed.
6.
Tianwei Yu Hui Ye Wei Sun Ker-Chau Li Zugen Chen Sharoni Jacobs Dione K Bailey David T Wong Xiaofeng Zhou 《BMC bioinformatics》2007,8(1):145
Background
DNA copy number aberration (CNA) is one of the key characteristics of cancer cells. Recent studies demonstrated the feasibility of utilizing high density single nucleotide polymorphism (SNP) genotyping arrays to detect CNA. Compared with the two-color array-based comparative genomic hybridization (array-CGH), the SNP arrays offer much higher probe density and lower signal-to-noise ratio at the single SNP level. To accurately identify small segments of CNA from SNP array data, segmentation methods that are sensitive to CNA while resistant to noise are required.
7.
Rianne van Binsbergen Marco CAM Bink Mario PL Calus Fred A van Eeuwijk Ben J Hayes Ina Hulsegge Roel F Veerkamp 《Genetics Selection Evolution》2014,46(1):41
Background
The use of whole-genome sequence data can lead to higher accuracy in genome-wide association studies and genomic predictions. However, to benefit from whole-genome sequence data, a large dataset of sequenced individuals is needed. Imputation from SNP panels, such as the Illumina BovineSNP50 BeadChip and Illumina BovineHD BeadChip, to whole-genome sequence data is an attractive and less expensive approach to obtain whole-genome sequence genotypes for a large number of individuals than sequencing all individuals. Our objective was to investigate accuracy of imputation from lower density SNP panels to whole-genome sequence data in a typical dataset for cattle.
Methods
Whole-genome sequence data of chromosome 1 (1 737 471 SNPs) for 114 Holstein Friesian bulls were used. Beagle software was used for imputation from the BovineSNP50 (3 132 SNPs) and BovineHD (40 492 SNPs) beadchips. Accuracy was calculated as the correlation between observed and imputed genotypes and assessed by five-fold cross-validation. Three scenarios S40, S60 and S80 with respectively 40%, 60%, and 80% of the individuals as reference individuals were investigated.
Results
Mean accuracies of imputation per SNP from the BovineHD panel to sequence data and from the BovineSNP50 panel to sequence data for scenarios S40 and S80 ranged from 0.77 to 0.83 and from 0.37 to 0.46, respectively. Stepwise imputation from the BovineSNP50 to BovineHD panel and then to sequence data for scenario S40 improved accuracy per SNP to 0.65 but it varied considerably between SNPs.
Conclusions
Accuracy of imputation to whole-genome sequence data was generally high for imputation from the BovineHD beadchip, but was low from the BovineSNP50 beadchip. Stepwise imputation from the BovineSNP50 to the BovineHD beadchip and then to sequence data substantially improved accuracy of imputation. SNPs with a low minor allele frequency were more difficult to impute correctly and the reliability of imputation varied more. Linkage disequilibrium between an imputed SNP and the SNP on the lower density panel, minor allele frequency of the imputed SNP and size of the reference group affected imputation reliability.
8.
Genome-wide and local pattern of linkage disequilibrium and persistence of phase for 3 Danish pig breeds (total citations: 1; self-citations: 0; citations by others: 1)
Background
A genome-wide association study (GWAS) for litter size in Norwegian White Sheep (NWS) was conducted using the recently developed ovine 50K SNP chip from Illumina. After genotyping 378 progeny-tested artificial insemination (AI) rams, a GWAS analysis was performed on estimated breeding values (EBVs) for litter size.
Results
A QTL-region was identified on sheep chromosome 5, close to the growth differentiation factor 9 (GDF9), which is known to be a strong candidate gene for increased ovulation rate/litter size. Sequencing of the GDF9 coding region in the most extreme sires (high and low BLUP values) revealed a single nucleotide polymorphism (c.1111G>A), responsible for a Val→Met substitution at position 371 (V371M). This polymorphism has previously been identified in Belclare and Cambridge sheep, but was not found to be associated with fertility. In our NWS-population the c.1111G>A SNP showed stronger association with litter size than any other single SNP on the Illumina 50K ovine SNP chip. Based on the estimated breeding values, daughters of AI rams homozygous for c.1111A will produce a minimum of 0.46 to 0.57 additional lambs compared to daughters of wild-type rams.
Conclusion
We have identified a missense mutation in the bioactive part of the GDF9 protein that shows strong association with litter size in NWS. Based on the NWS breeding history and the marked increase in the c.1111A allele frequency in the AI ram population since 1983, we hypothesize that the c.1111A allele originates from Finnish landrace imported to Norway around 1970. Because of the widespread use of Finnish landrace and the fact that the ewes homozygous for the c.1111A allele are reported to be fertile, we expect the commercial impact of this mutation to be high.
9.
Cari A. Schmitz Carley Joseph J. Coombs David S. Douches Paul C. Bethke Jiwan P. Palta Richard G. Novy Jeffrey B. Endelman 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2017,130(4):717-726
Key message
New software to make tetraploid genotype calls from SNP array data was developed, which uses hierarchical clustering and multiple F1 populations to calibrate the relationship between signal intensity and allele dosage.
Abstract
SNP arrays are transforming breeding and genetics research for autotetraploids. To fully utilize these arrays, the relationship between signal intensity and allele dosage must be calibrated for each marker. We developed an improved computational method to automate this process, which is provided as the R package ClusterCall. In the training phase of the algorithm, hierarchical clustering within an F1 population is used to group samples with similar intensity values, and allele dosages are assigned to clusters based on expected segregation ratios. In the prediction phase, multiple F1 populations and the prediction set are clustered together, and the genotype for each cluster is the mode of the training set samples. A concordance metric, defined as the proportion of training set samples equal to the mode, can be used to eliminate unreliable markers and compare different algorithms. Across three potato families genotyped with an 8K SNP array, ClusterCall scored 5729 markers with at least 0.95 concordance (94.6% of its total), compared to 5325 with the software fitTetra (82.5% of its total). The three families were used to predict genotypes for 5218 SNPs in the SolCAP diversity panel, compared with 3521 SNPs in a previous study in which genotypes were called manually. One of the additional markers produced a significant association for vine maturity near a well-known causal locus on chromosome 5. In conclusion, when multiple F1 populations are available, ClusterCall is an efficient method for accurate, autotetraploid genotype calling that enables the use of SNP data for research and plant breeding.
10.
Background
Dyslipidemia and overweight are common issues in children. Identifying genetic markers of risk could lead to targeted interventions. A polymorphism of SNP rs7566605 near insulin-induced gene 2 (INSIG2) has been identified as a strong candidate gene for obesity, through its feedback control of lipid synthesis.
Objective
To identify polymorphisms in INSIG2 which are associated with overweight (BMI ≥ 85% for age) and dyslipidemia in children. Hypothesis: The C allele of rs7566605 would be significantly associated with BMI and LDL.
Design/Methods
We genotyped 15 SNPs in/near INSIG2 in 1,058 healthy children (53% non-Hispanic white (NHW), 37% overweight) participating in a school based study. Genotype was compared with BMI and lipid markers, adjusting for age, gender, and puberty.
Results
We found a significant association between the SNP rs12464355 and LDL in NHW children, p < 0.001. The G allele is protective (lower LDL). A different SNP was associated with overweight in NHW: rs17047757. SNP rs7566605 was not associated with overweight or lipid levels.
Conclusions
We identified novel genetic associations between INSIG2 and both overweight and LDL in NHW children. Polymorphisms in INSIG2 may be important in the development of obesity through its effects on lipid regulation.
11.
Sunduimijid Bolormaa Jennie E Pryce Kathryn E Kemper Ben J Hayes Yuandan Zhang Bruce Tier William Barendse Antonio Reverter Mike E Goddard 《Genetics Selection Evolution》2013,45(1):43
Background
The apparent effect of a single nucleotide polymorphism (SNP) on phenotype depends on the linkage disequilibrium (LD) between the SNP and a quantitative trait locus (QTL). However, the phase of LD between a SNP and a QTL may differ between Bos indicus and Bos taurus because they diverged at least one hundred thousand years ago. Here, we test the hypothesis that the apparent effect of a SNP on a quantitative trait depends on whether the SNP allele is inherited from a Bos taurus or Bos indicus ancestor.
Methods
Phenotype data on one or more traits and SNP genotype data for 10 181 cattle from Bos taurus, Bos indicus and composite breeds were used. All animals had genotypes for 729 068 SNPs (real or imputed). Chromosome segments were classified as originating from B. indicus or B. taurus on the basis of the haplotype of SNP alleles they contained. Consequently, SNP alleles were classified according to their sub-species origin. Three models were used for the association study: (1) conventional GWAS (genome-wide association study), fitting a single SNP effect regardless of subspecies origin, (2) interaction GWAS, fitting an interaction between SNP and subspecies-origin, and (3) best variable GWAS, fitting the most significant combination of SNP and sub-species origin.
Results
Fitting an interaction between SNP and subspecies origin resulted in more significant SNPs (i.e. more power) than a conventional GWAS. Thus, the effect of a SNP depends on the subspecies that the allele originates from. Also, most QTL segregated in only one subspecies, suggesting that many mutations that affect the traits studied occurred after divergence of the subspecies or the mutation became fixed or was lost in one of the subspecies.
Conclusions
The results imply that GWAS and genomic selection could gain power by distinguishing SNP alleles based on their subspecies origin, and that only few QTL segregate in both B. indicus and B. taurus cattle. Thus, the QTL that segregate in current populations likely resulted from mutations that occurred in one of the subspecies and can have both positive and negative effects on the traits. There was no evidence that selection has increased the frequency of alleles that increase body weight.
12.
Background
Estimation of allele frequency is of fundamental importance in population genetic analyses and in association mapping. In most studies using next-generation sequencing, a cost effective approach is to use medium or low-coverage data (e.g., < 15X). However, SNP calling and allele frequency estimation in such studies is associated with substantial statistical uncertainty because of varying coverage and high error rates.
Results
We evaluate a new maximum likelihood method for estimating allele frequencies in low and medium coverage next-generation sequencing data. The method is based on integrating over uncertainty in the data for each individual rather than first calling genotypes. This method can be applied to directly test for associations in case/control studies. We use simulations to compare the likelihood method to methods based on genotype calling, and show that the likelihood method outperforms the genotype calling methods in terms of: (1) accuracy of allele frequency estimation, (2) accuracy of the estimation of the distribution of allele frequencies across neutrally evolving sites, and (3) statistical power in association mapping studies. Using real re-sequencing data from 200 individuals obtained from an exon-capture experiment, we show that the patterns observed in the simulations are also found in real data.
Conclusions
Overall, our results suggest that association mapping and estimation of allele frequencies should not be based on genotype calling in low to medium coverage data. Furthermore, if genotype calling methods are used, it is usually better not to filter genotypes based on the call confidence score.
13.
Yoko Fukuda Yasuo Nakahara Hidetoshi Date Yuji Takahashi Jun Goto Akinori Miyashita Ryozo Kuwano Hiroki Adachi Eiji Nakamura Shoji Tsuji 《BMC bioinformatics》2009,10(1):121-9
Background
Over the past decade, microarray-based single nucleotide polymorphism (SNP) data have become more widely used as markers for linkage analysis in the identification of loci for disease-associated genes. Although microarray-based SNP analyses have markedly reduced genotyping time and cost compared with microsatellite-based analyses, applying these enormous data sets to linkage analysis programs is a time-consuming step, thus necessitating a high-throughput platform.
14.
Background
Single nucleotide polymorphisms (SNPs) are DNA sequence variations, occurring when a single nucleotide – adenine (A), thymine (T), cytosine (C) or guanine (G) – is altered. Arguably, SNPs account for more than 90% of human genetic variation. Our laboratory has developed a highly redundant SNP genotyping assay consisting of multiple probes with signals from multiple channels for a single SNP, based on arrayed primer extension (APEX). This mini-sequencing method is a powerful combination of a highly parallel microarray with distinctive Sanger-based dideoxy terminator sequencing chemistry. Using this microarray platform, our current genotype calling system (known as SNP Chart) is capable of calling single SNP genotypes by manual inspection of the APEX data, which is time-consuming and exposed to user subjectivity bias.
15.
Background
The putative promoter of the holocarboxylase synthetase (HLCS) gene on chromosome 21 is hypermethylated in placental tissues and could be detected as a fetal-specific DNA marker in maternal plasma. Detection of fetal trisomy 21 (T21) has been demonstrated by an epigenetic-genetic chromosome dosage approach where the amount of hypermethylated HLCS in maternal plasma is normalized using a fetal genetic marker on the Y chromosome as a chromosome dosage reference marker. We explore whether this method can be applied to both male and female fetuses with the use of a paternally-inherited fetal single nucleotide polymorphism (SNP) allele on a reference chromosome for chromosome dosage normalization.
Methodology
We quantified hypermethylated HLCS molecules using methylation-sensitive restriction endonuclease digestion followed by real-time or digital PCR analyses. For chromosome dosage analysis, we compared the amount of digestion-resistant HLCS to that of a SNP allele (rs6636, a C/G SNP) that the fetus has inherited from the father but that is absent in the pregnant mother.
Principal Findings
Using a fetal-specific SNP allele on a reference chromosome, we analyzed 20 euploid and nine T21 placental tissue samples. All samples with the fetal-specific C allele were correctly classified. One sample from each of the euploid and T21 groups was misclassified when the fetal-specific G allele was used as the reference marker. We then analyzed 33 euploid and 14 T21 maternal plasma samples. All but one sample from each of the euploid and T21 groups were correctly classified using the fetal-specific C allele, while correct classification was achieved for all samples using the fetal-specific G allele as the reference marker.
Conclusions
As a proof-of-concept study, we have demonstrated that the epigenetic-genetic chromosome dosage approach can be applied to the prenatal diagnosis of trisomy 21 for both male and female fetuses.
16.
Volodymyr Dvornyk Ji-Rong Long Dong-Hai Xiong Peng-Yuan Liu Lan-Juan Zhao Hui Shen Yuan-Yuan Zhang Yong-Jun Liu Sonia Rocha-Sanchez Peng Xiao Robert R Recker Hong-Wen Deng 《BMC genetics》2004,5(1):1-15
Background
Public SNP databases are frequently used to choose SNPs for candidate genes in the association and linkage studies of complex disorders. However, their utility for such studies of diseases with ethnic-dependent background has never been evaluated.
Results
To estimate the accuracy and completeness of SNP public databases, we analyzed the allele frequencies of 41 SNPs in 10 candidate genes for obesity and/or osteoporosis in a large American-Caucasian sample (1,873 individuals from 405 nuclear families) by PCR-invader assay. We compared our results with those from the databases and other published studies. Of the 41 SNPs, 8 were monomorphic in our sample. Twelve were reported for the first time for Caucasians and the other 29 SNPs in our sample essentially confirmed the respective allele frequencies for Caucasians in the databases and previous studies. The comparison of our data with other ethnic groups showed significant differentiation between the three major world ethnic groups at some SNPs (Caucasians and Africans differed at 3 of the 18 shared SNPs, and Caucasians and Asians differed at 13 of the 22 shared SNPs). This genetic differentiation may have an important implication for studying the well-known ethnic differences in the prevalence of obesity and osteoporosis, and complex disorders in general.
Conclusion
A comparative analysis of the SNP data of the candidate genes obtained in the present study, as well as those retrieved from the public domain, suggests that the databases may currently have serious limitations for studying complex disorders with an ethnic-dependent background due to the incomplete and uneven representation of the candidate SNPs in the databases for the major ethnic groups. This conclusion attests to the imperative necessity of large-scale and accurate characterization of these SNPs in different ethnic groups.
17.
Martin Storr Dominik Emmerdinger Julia Diegelmann Simone Pfennig Thomas Ochsenkühn Burkhard Göke Peter Lohse Stephan Brand 《PloS one》2010,5(2)
Background
Recent evidence suggests a crucial role of the endocannabinoid system, including the cannabinoid 1 receptor (CNR1), in intestinal inflammation. We therefore investigated the influence of the CNR1 1359 G/A (p.Thr453Thr; rs1049353) single nucleotide polymorphism (SNP) on disease susceptibility and phenotype in patients with ulcerative colitis (UC) and Crohn's disease (CD).
Methods
Genomic DNA from 579 phenotypically well-characterized individuals was analyzed for the CNR1 1359 G/A SNP. Amongst these were 166 patients with UC, 216 patients with CD, and 197 healthy controls.
Results
Compared to healthy controls, subjects A/A homozygous for the CNR1 1359 G/A SNP had a reduced risk to develop UC (p = 0.01, OR 0.30, 95% CI 0.12–0.78). The polymorphism did not modulate CD susceptibility, but carriers of the minor A allele had a lower body mass index than G/G wildtype carriers (p = 0.0005). In addition, homozygous carriers of the G allele were more likely to develop CD before 40 years of age (p = 5.9×10⁻⁷) than carriers of the A allele.
Conclusion
The CNR1 p.Thr453Thr polymorphism appears to modulate UC susceptibility and the CD phenotype. The endocannabinoid system may influence the manifestation of inflammatory bowel diseases, suggesting endocannabinoids as potential target for future therapies.
18.
Background
The definition of human MHC class I haplotypes through association of HLA-A, HLA-Cw and HLA-B has been used to analyze ethnicity, population migrations and disease association.
Results
Here, we present HLA-E allele haplotype association and population linkage disequilibrium (LD) analysis within the ~1.3 Mb bounded by HLA-B/Cw and HLA-A to increase the resolution of identified class I haplotypes. Through local breakdown of LD, we inferred ancestral recombination points both upstream and downstream of HLA-E contributing to alternative block structures within previously identified haplotypes. Through single nucleotide polymorphism (SNP) analysis of the MHC region, we also confirmed the essential genetic fixity, previously inferred by MHC allele analysis, of three conserved extended haplotypes (CEHs), and we demonstrated that commercially-available SNP analysis can be used in the MHC to help define CEHs and CEH fragments.
Conclusion
We conclude that to generate high-resolution maps for relating MHC haplotypes to disease susceptibility, both SNP and MHC allele analysis must be conducted as complementary techniques.
19.
María Teruel Jose-Ezequiel Martin Carlos González-Juanatey Raquel López-Mejias Jose A Miranda-Filloy Ricardo Blanco Alejandro Balsa Dora Pascual-Salcedo Luis Rodriguez-Rodriguez Benjamin Fernández-Gutierrez Ana M Ortiz Isidoro González-Alvaro Carmen Gómez-Vaquero Nunzio Bottini Javier Llorca Miguel A González-Gay Javier Martin 《Arthritis research & therapy》2011,13(4):R116-6
Introduction
Acid phosphatase locus 1 (ACP1) encodes a low molecular weight phosphotyrosine phosphatase implicated in a number of different biological functions in the cell. The aim of this study was to determine the contribution of ACP1 polymorphisms to susceptibility to rheumatoid arthritis (RA), as well as the potential contribution of these polymorphisms to the increased risk of cardiovascular disease (CV) observed in RA patients.
Methods
A set of 1,603 Spanish RA patients and 1,877 healthy controls were included in the study. Information related to the presence/absence of CV events was obtained from 1,284 of these participants. All individuals were genotyped for four ACP1 single-nucleotide polymorphisms (SNPs), rs10167992, rs11553742, rs7576247, and rs3828329, using a predesigned TaqMan SNP genotyping assay. Classical ACP1 alleles (*A, *B and *C) were imputed with SNP data.
Results
No association between ACP1 gene polymorphisms and susceptibility to RA was observed. However, when RA patients were stratified according to the presence or absence of CV events, an association between rs11553742*T and CV events was found (P = 0.012, odds ratio (OR) = 2.62 (1.24 to 5.53)). Likewise, the ACP1*C allele showed evidence of association with CV events in patients with RA (P = 0.024, OR = 2.43).
Conclusions
Our data show that the ACP1*C allele influences the risk of CV events in patients with RA.
20.