Found 20 similar articles (search time: 0 ms)
1.
E. L. Nicolazzi S. Biffani F. Biscarini P. Orozco ter Wengel A. Caprera N. Nazzicari A. Stella 《Animal genetics》2015,46(4):343-353
Since the beginning of the genomic era, the number of available single nucleotide polymorphism (SNP) arrays has grown considerably. In the bovine species alone, 11 SNP chips not completely covered by intellectual property are currently available, and the number is growing. Genomic/genotype data are not standardized, and this hampers their exchange and integration. In addition, software used for the analyses of these data usually requires non-standard (i.e. case-specific) input files which, considering the large amount of data to be handled, require at least some programming skills in their production. In this work, we describe a software toolkit for SNP array data management, imputation, genome‐wide association studies, population genetics and genomic selection. However, this toolkit does not solve the critical need for standardization of the genotypic data and software input files. It only highlights the chaotic situation each researcher has to face on a daily basis and gives some helpful advice on the currently available tools in order to navigate the SNP array data complexity.
2.
In genome‐wide association studies, quality control (QC) of genotypes is important to avoid spurious results. It is also important to maintain long‐term data integrity, particularly in settings with ongoing genotyping (e.g. estimation of genomic breeding values). Here we discuss snpqc, a fully automated pipeline to perform QC analyses of Illumina SNP array data. It applies a wide range of common quality metrics with user‐defined filtering thresholds to generate a comprehensive QC report and a filtered dataset, including a genomic relationship matrix, ready for further downstream analyses, which makes it amenable for integration in high‐throughput environments. snpqc also builds a database to store genotypic, phenotypic and quality metrics to ensure data integrity and the option of integrating more samples from subsequent runs. The program is generic across species and array designs, providing a convenient interface between the genotyping laboratory and downstream genome‐wide association study or genomic prediction.
3.
Dubé JB Johansen CT Hegele RA 《BioEssays : news and reviews in molecular, cellular and developmental biology》2011,33(6):430-437
The concentration of low-density lipoprotein (LDL) cholesterol (C) in plasma is a key determinant of cardiovascular disease risk and human genetic studies have long endeavoured to elucidate the pathways that regulate LDL metabolism. Massive genome-wide association studies (GWASs) of common genetic variation associated with LDL-C in the population have implicated SORT1 in LDL metabolism. Using experimental paradigms and standards appropriate for understanding the mechanisms by which common variants alter phenotypic expression, three recent publications have presented divergent and even contradictory findings. Interestingly, although these reports each linked SORT1 to LDL metabolism, they did not agree on a mechanism to explain the association. Here, we review recent mechanistic studies of SORT1 - the first gene identified by GWAS as a determinant of plasma LDL-C to be evaluated mechanistically.
4.
Y. L. Bernal Rubio J. L. Gualdrón Duarte R. O. Bates C. W. Ernst D. Nonneman G. A. Rohrer A. King S. D. Shackelford T. L. Wheeler R. J. C. Cantet J. P. Steibel 《Animal genetics》2016,47(1):36-48
Genome‐wide association (GWA) studies based on GBLUP models are a common practice in animal breeding. However, effect sizes of GWA tests are small, requiring larger sample sizes to enhance power of detection of rare variants. Because of difficulties in increasing sample size in animal populations, one alternative is to implement a meta‐analysis (MA), combining information and results from independent GWA studies. Although this methodology has been used widely in human genetics, implementation in animal breeding has been limited. Thus, we present methods to implement a MA of GWA, describing the proper approach to compute weights derived from multiple genomic evaluations based on animal‐centric GBLUP models. Application to real datasets shows that MA increases power of detection of associations in comparison with population‐level GWA, allowing for population structure and heterogeneity of variance components across populations to be accounted for. Another advantage of MA is that it does not require access to genotype data that is required for a joint analysis. Scripts related to the implementation of this approach, which consider the strength of association as well as the sign, are distributed and thus account for heterogeneity in association phase between QTL and SNPs. Thus, MA of GWA is an attractive alternative to summarizing results from multiple genomic studies, avoiding restrictions with genotype data sharing, definition of fixed effects and different scales of measurement of evaluated traits.
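As a rough illustration of how signed evidence from independent GWA studies can be pooled without sharing genotypes, the sketch below uses the textbook weighted Z-score (Stouffer) combination. It is a stand-in, not the GBLUP-derived weighting the authors describe; the function names, weights and z-scores are all hypothetical.

```python
import math
from statistics import NormalDist


def weighted_z_meta(z_scores, weights):
    """Combine signed per-study z-scores into one meta-analysis z-score."""
    if len(z_scores) != len(weights):
        raise ValueError("need exactly one weight per study")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P-value of a z-score under the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical z-scores for one SNP in three populations, equal weights.
z_same_sign = weighted_z_meta([2.0, 2.0, 2.0], [1.0, 1.0, 1.0])
z_mixed_sign = weighted_z_meta([2.0, -2.0, 2.0], [1.0, 1.0, 1.0])

print(z_same_sign, two_sided_p(z_same_sign))
print(z_mixed_sign, two_sided_p(z_mixed_sign))
```

In this simple form, effects with the same sign reinforce each other while opposite signs partially cancel, which is why the sign of each study's estimate has to be carried along with its strength.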
5.
Mathew Littlejohn Talia Grala Kathryn Sanders Caroline Walker Garry Waghorn Kevin Macdonald Richard Spelman Steve Davis Russell Snell 《Animal genetics》2012,43(6):781-784
Animal growth relative to food energy input is of key importance to agricultural production. Several recent studies highlighted genetic markers associated with food conversion efficiency in beef cattle, and there is now a requirement to validate these associations in additional populations and to assess their potential utility for selecting animals with enhanced food‐use efficiency. The current analysis tested a population of dairy cattle using 138 DNA markers previously associated with food intake and growth in a whole‐genome association analysis of beef animals. Although seven markers showed point‐wise significance at P < 0.05, none of the single‐nucleotide polymorphisms tested were significantly associated with food conversion efficiency after correction for multiple testing. These data do not support the involvement of this subset of previously implicated markers in the food conversion efficiency of the physiologically distinct New Zealand Holstein‐Friesian dairy breed.
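A quick arithmetic check shows why seven point-wise hits are unremarkable: with 138 tests at P < 0.05, about seven false positives are expected by chance alone. The sketch also computes a Bonferroni threshold, which is an assumption here because the abstract does not name the multiple-testing correction that was applied.

```python
# Figures taken from the abstract above.
n_markers = 138
alpha = 0.05
observed_pointwise_hits = 7

# Expected number of markers passing P < alpha if no marker has any effect.
expected_false_positives = n_markers * alpha

# Per-test threshold under a Bonferroni correction (assumed, not stated
# in the abstract).
bonferroni_threshold = alpha / n_markers

print(f"expected by chance: {expected_false_positives:.1f}")
print(f"observed:           {observed_pointwise_hits}")
print(f"Bonferroni cut-off: {bonferroni_threshold:.2e}")
```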
6.
Previously, a single nucleotide polymorphism (SNP) related to gait type was identified at position 22 999 655 of chromosome 23 in the coding region of DMRT3 (DMRT3:Ser301Ter) by showing that a cytosine (C) to adenine (A) mutation of this SNP induced pace in the Icelandic horse. We investigated the effect of DMRT3:Ser301Ter on the gait of Hokkaido Native Horses, a Japanese native breed, and examined genetic factors other than DMRT3 by exploring genome‐wide SNPs related to gait determination. All animals exhibiting pace were AA for DMRT3:Ser301Ter, confirming the association of DMRT3:Ser301Ter with gait determination; however, 14.3% of the animals exhibiting trot also had AA for DMRT3:Ser301Ter, suggesting the presence of another factor(s) cooperatively working with DMRT3:Ser301Ter for gait determination. SNPs on chromosomes 13 and 23 were detected by genome‐wide association analysis (false discovery rate <0.05), although SNPs on chromosome 23 were all located in the vicinity of DMRT3:Ser301Ter, confirming the association with DMRT3. A genome‐wide association study targeting only animals with AA for DMRT3:Ser301Ter to examine genetic factors cooperatively working with DMRT3:Ser301Ter for gait determination suggested associations of 23 SNPs on six chromosomes. In a series of analyses of the effect of a maternal factor (dam's gait) on gait determination, the effect was suggested in comparison of the frequencies of exhibiting pace in gait checks in only two animal groups having dams with different DMRT3:Ser301Ter genotypes (P < 0.05), suggesting that the gait of the dam does not have a major effect on whether progeny homozygous for the DMRT3:Ser301Ter mutation will preferentially pace or trot.
7.
Matthew W. Horton Eriko Sasaki Maarten Koornneef Magnus Nordborg 《Plant, cell & environment》2016,39(11):2570-2579
The capacity to tolerate freezing temperatures limits the geographical distribution of many plants, including several species of agricultural importance. However, the genes involved in freezing tolerance remain largely unknown. Here, we describe the variation in constitutive freezing tolerance that occurs among worldwide accessions of Arabidopsis thaliana. We found that although plants from high latitudes tend to be more freezing tolerant than plants from low latitudes, the environmental factors that shape cold adaptation differ across the species range. Consistent with this, we found that the genetic architecture of freezing tolerance also differs across its range. Conventional genome‐wide association studies helped identify a priori and other promising candidate genes. However, simultaneously modelling climate variables and freezing tolerance together pinpointed other excellent a priori candidate genes. This suggests that if the selective factor underlying phenotypic variation is known, multi‐trait mixed models may aid in identifying the genes that underlie adaptation.
8.
S. Tsairidou A. R. Allen R. Pong‐Wong S. H. McBride D. M. Wright O. Matika C. M. Pooley S. W. J. McDowell E. J. Glass R. A. Skuce S. C. Bishop J. A. Woolliams 《Animal genetics》2018,49(2):103-109
Genetic selection of cattle more resistant to bovine tuberculosis (bTB) may offer a complementary control strategy. Hypothesising underlying non‐additive genetic variation, we present an approach using genome‐wide high density markers to identify genomic loci with dominance effects on bTB resistance and to test previously published regions with heterozygote advantage in bTB. Our data comprised 1151 Holstein–Friesian cows from Northern Ireland, confirmed bTB cases and controls, genotyped with the 700K Illumina BeadChip. Genome‐wide markers were tested for associations between heterozygosity and bTB status using marker‐based relationships. Results were tested for robustness against genetic structure, and the genotypic frequencies of a significant locus were tested for departures from Hardy‐Weinberg equilibrium. Genomic regions identified in our study and in previous publications were tested for dominance effects. Genotypic effects were estimated through ASReml mixed models. A SNP (rs43032684) on chromosome 6 was significant at the chromosome‐wide level, explaining 1.7% of the phenotypic variance. In the controls, there were fewer heterozygotes for rs43032684 (P < 0.01) with the genotypic values suggesting that heterozygosity confers a heterozygote disadvantage. The region surrounding rs43032684 had a significant dominance effect (P < 0.01). SNP rs43032684 resides within a pseudogene with a parental gene involved in macrophage response to infection and within a copy‐number‐variation region previously associated with nematode resistance. No dominance effect was found for the region on chromosome 11, as indicated by a previous candidate region bTB study. These findings require further validation with large‐scale data.
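The test for departure from Hardy‐Weinberg equilibrium mentioned above can be sketched with the standard one-degree-of-freedom chi-square test. This is a minimal sketch, not the authors' procedure (the abstract does not say which test was used), and the genotype counts are hypothetical rather than taken from the study.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for departure from Hardy-Weinberg equilibrium.

    Returns (chi2, p_value) for the observed genotype counts AA, AB, BB.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p <= 0.0 or q <= 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: one sample in equilibrium, one short of heterozygotes.
chi2_equilibrium, p_equilibrium = hwe_chi_square(25, 50, 25)
chi2_deficit, p_deficit = hwe_chi_square(40, 20, 40)

print(chi2_equilibrium, p_equilibrium)
print(chi2_deficit, p_deficit)
```

A heterozygote deficit like the second example inflates the statistic because both homozygote classes exceed, and the heterozygote class falls below, the counts expected from the allele frequencies.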
9.
Genomic prediction from whole-genome sequence data is attractive, as the accuracy of genomic prediction is no longer bounded by extent of linkage disequilibrium between DNA markers and causal mutations affecting the trait, given the causal mutations are in the data set. A cost-effective strategy could be to sequence a small proportion of the population, and impute sequence data to the rest of the reference population. Here, we describe strategies for selecting individuals for sequencing, based on either pedigree relationships or haplotype diversity. Performance of these strategies (number of variants detected and accuracy of imputation) was evaluated in sequence data simulated through a real Belgian Blue cattle pedigree. A strategy (AHAP), which selected a subset of individuals for sequencing that maximized the number of unique haplotypes (from single-nucleotide polymorphism panel data) sequenced, gave good performance across a range of variant minor allele frequencies. We then investigated the optimum number of individuals to sequence by fold coverage given a maximum total sequencing effort. At 600 total fold coverage (600×), the optimum strategy was to sequence 75 individuals at eightfold coverage. Finally, we investigated the accuracy of genomic predictions that could be achieved. The advantage of using imputed sequence data compared with dense SNP array genotypes was highly dependent on the allele frequency spectrum of the causative mutations affecting the trait. When this followed a neutral distribution, the advantage of the imputed sequence data was small; however, when the causal mutations all had low minor allele frequencies, using the sequence data improved the accuracy of genomic prediction by up to 30%.
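The trade-off between the number of sequenced individuals and the depth per individual under a fixed budget can be laid out with simple arithmetic. The sketch below only enumerates designs that spend 600 total fold coverage; the list of candidate depths is hypothetical, and the sketch does not reproduce the simulation that identified eightfold coverage as the optimum.

```python
TOTAL_FOLD_COVERAGE = 600  # total sequencing effort quoted in the abstract


def individuals_at(coverage, total=TOTAL_FOLD_COVERAGE):
    """Number of individuals that can be sequenced at a given fold coverage."""
    return total // coverage


# Hypothetical candidate depths; only the eightfold design is from the abstract.
candidate_designs = {c: individuals_at(c) for c in (2, 4, 8, 12, 20)}

for coverage, n_individuals in candidate_designs.items():
    print(f"{coverage:>2}x coverage -> {n_individuals} individuals")
```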
10.
With recent advances in genotyping and sequencing technologies, many disease susceptibility loci have been identified. However, much of the genetic heritability remains unexplained and the replication rate between independent studies is still low. Meanwhile, there have been increasing efforts on functional annotations of the entire human genome, such as the Encyclopedia of DNA Elements (ENCODE) project and other similar projects. It has been shown that incorporating these functional annotations to prioritize genome-wide association signals may help identify true association signals. However, to our knowledge, the extent of the improvement when functional annotation data are considered has not been studied in the literature. In this article, we propose a statistical framework to estimate the improvement in replication rate with annotation data, and apply it to Crohn’s disease and DNase I hypersensitive sites. The results show that with cell line specific functional annotations, the expected replication rate is improved, but only at a modest level.
11.
Genetic variants associated with disease outcomes can be used to develop personalized treatment. To reach this precision medicine goal, hundreds of large‐scale genome‐wide association studies (GWAS) have been conducted in the past decade to search for promising genetic variants associated with various traits. They have successfully identified tens of thousands of disease‐related variants. However, in total these identified variants explain only part of the variation for most complex traits. There remain many genetic variants with small effect sizes to be discovered, which calls for the development of (a) GWAS with more samples and more comprehensively genotyped variants, for example, the NHLBI Trans‐Omics for Precision Medicine (TOPMed) Program is planning to conduct whole genome sequencing on over 100 000 individuals; and (b) novel and more powerful statistical analysis methods. The current dominating GWAS analysis approach is the “single trait” association test, despite the fact that many GWAS are conducted in deeply phenotyped cohorts including many correlated and well‐characterized outcomes, which can help improve the power to detect novel variants if properly analyzed, as suggested by increasing evidence that pleiotropy, where a genetic variant affects multiple traits, is the norm in genome‐phenome associations. We aim to develop pleiotropy informed powerful association test methods across multiple traits for GWAS. Since it is generally very hard to access individual‐level GWAS phenotype and genotype data for those existing GWAS, due to privacy concerns and various logistical considerations, we develop rigorous statistical methods for pleiotropy informed adaptive multitrait association test methods that need only summary association statistics publicly available from most GWAS. We first develop a pleiotropy test, which has powerful performance for truly pleiotropic variants but is sensitive to the pleiotropy assumption. 
We then develop a pleiotropy informed adaptive test that has robust and powerful performance under various genetic models. We develop accurate and efficient numerical algorithms to compute the analytical P‐value for the proposed adaptive test without the need of resampling or permutation. We illustrate the performance of proposed methods through application to joint association test of GWAS meta‐analysis summary data for several glycemic traits. Our proposed adaptive test identified several novel loci missed by individual trait based GWAS meta‐analysis. All the proposed methods are implemented in a publicly available R package.
12.
Bader Arouisse Arthur Korte Fred van Eeuwijk Willem Kruijer 《The Plant journal : for cell and molecular biology》2020,102(4):872-882
Natural variation has become a prime resource to identify genetic variants that contribute to phenotypic variation. The regional mapping (RegMap) population is one of the most important populations for studying natural variation in Arabidopsis thaliana, and has been used in a large number of association studies and in studies on climatic adaptation. However, only 413 RegMap accessions have been completely sequenced, as part of the 1001 Genomes (1001G) Project, while the remaining 894 accessions have only been genotyped with the Affymetrix 250k chip. As a consequence, most association studies involving the RegMap are either restricted to the sequenced accessions, reducing power, or rely on a limited set of SNPs. Here we impute millions of SNPs to the 894 accessions that are exclusive to the RegMap, using the 1135 accessions of the 1001G Project as the reference panel. We assess imputation accuracy using a novel cross‐validation scheme, which we show provides a more reliable measure of accuracy than existing methods. After filtering out low accuracy SNPs, we obtain high‐quality genotypic information for 2029 accessions and 3 million markers. To illustrate the benefits of these imputed data, we reconducted genome‐wide association studies on five stress‐related traits and could identify novel candidate genes.
13.
E. A. Abdalla F. Peñagaricano T. M. Byrem K. A. Weigel G. J. M. Rosa 《Animal genetics》2016,47(4):395-407
Bovine leukosis virus is an oncogenic virus that infects B cells, causing bovine leukosis disease. This disease is known to have a negative impact on dairy cattle production and, because no treatment or vaccine is available, finding a possible genetic solution is important. Our objective was to perform a comprehensive genetic analysis of leukosis incidence in dairy cattle. Data on leukosis occurrence, pedigree and molecular information were combined into multitrait GBLUP models with milk yield (MY) and somatic cell score (SCS) to estimate genetic parameters and to perform whole‐genome scans and pathway analysis. Leukosis data were available for 11 554 Holstein daughters of 3002 sires from 112 herds in 16 US states. Genotypes from a 60K SNP panel were available for 961 of those bulls as well as for 2039 additional bulls. Heritability for leukosis incidence was estimated at about 8%, and the genetic correlations of leukosis disease incidence with MY and SCS were moderate at 0.18 and 0.20 respectively. The genome‐wide scan indicated that leukosis is a complex trait, possibly modulated by many genes. The gene set analysis identified many functional terms that showed significant enrichment of genes associated with leukosis. Many of these terms, such as G‐Protein Coupled Receptor Signaling Pathway, Regulation of Nucleotide Metabolic Process and different calcium‐related processes, are known to be related to retrovirus infection. Overall, our findings contribute to a better understanding of the genetic architecture of this complex disease. The functional categories associated with leukosis may be useful in future studies on fine mapping of genes and development of dairy cattle breeding strategies.
14.
Ke Cao Yong Li Cecilia H. Deng Susan E. Gardiner Gengrui Zhu Weichao Fang Changwen Chen Xinwei Wang Lirong Wang 《Plant biotechnology journal》2019,17(10):1954-1970
Crop evolution is a long‐term process involving selection by natural evolutionary forces and anthropogenic influences; however, the genetic mechanisms underlying the domestication and improvement of fruit crops have not been well studied to date. Here, we performed a population structure analysis in peach (Prunus persica) based on the genome‐wide resequencing of 418 accessions and confirmed the presence of an obvious domestication event during evolution. We identified 132 and 106 selective sweeps associated with domestication and improvement, respectively. Analysis of their tissue‐specific expression patterns indicated that the up‐regulation of selection genes during domestication occurred mostly in fruit and seeds as opposed to other organs. However, during the improvement stage, more up‐regulated selection genes were identified in leaves and seeds than in the other organs. Genome‐wide association studies (GWAS) using 4.24 million single nucleotide polymorphisms (SNPs) revealed 171 loci associated with 26 fruit domestication traits. Among these loci, three candidate genes were highly associated with fruit weight and the sorbitol and catechin content in fruit. We demonstrated that as the allele frequency of the SNPs associated with high polyphenol composition decreased during peach evolution, alleles associated with high sugar content increased significantly. This indicates that there is genetic potential for the breeding of more nutritious fruit with enhanced bioactive polyphenols without disturbing a harmonious sugar and acid balance by crossing with wild species. This study also describes the development of the genomic resources necessary for evolutionary research in peach and provides the large‐scale characterization of key agronomic traits in this crop species.
15.
Holstein Friesian cow training sets were created according to disease incidences. The different datasets were used to investigate the impact of random forest (RF) and genomic BLUP (GBLUP) methodology on genomic prediction accuracies. In addition, for further verifications of some specific scenarios, single‐step genomic BLUP was applied. Disease traits included the overall trait categories of (i) claw disorders, (ii) clinical mastitis and (iii) infertility from 80 741 first lactation Holstein cows kept in 58 large‐scale herds. A subset of 6744 cows was genotyped (50K SNP panel). Response variables for all scenarios were de‐regressed proofs (DRPs) and pre‐corrected phenotypes (PCPs). Initially, all sick cows were allocated to the testing set, and healthy cows represented the training set. For the ongoing cow allocation schemes, the number of sick cows in the training set increased stepwise by moving 10% of the sick cows from the testing to the training set in each step. The size of training and testing sets was kept constant by replacing the same number of cows in the testing set with (randomly selected) healthy cows from the training set. For both the RF and GBLUP methods, prediction accuracies were larger for DRPs compared to PCPs. For PCPs as a response variable, the largest prediction accuracies were observed when the disease incidences in training sets reflected the disease incidence in the whole population. A further increase in prediction accuracies for some selected cow allocation schemes (i.e. larger prediction accuracies compared to corresponding scenarios with RF or GBLUP) was achieved via single‐step GBLUP applications. Correlations between genome‐wide association study SNP effects and RF importance criteria for single SNPs were in a moderate range, from 0.42 to 0.57, when considering SNPs from all chromosomes or from specific chromosome segments. RF identified significant SNPs close to potential positional candidate genes: GAS1, GPAT3 and CYP2R1 for clinical mastitis; SPINK5 and SLC26A2 for laminitis; and FGF12 for endometritis.
16.
17.
Because domesticated Saccharomyces cerevisiae strains have been used to produce fermented food and beverages for centuries without apparent health implications, S. cerevisiae has always been considered a Generally Recognized As Safe (GRAS) microorganism. However, the number of reported mucosal and systemic S. cerevisiae infections in the human population has increased and fatal infections have occurred even in relatively healthy individuals. In order to gain insight into the pathogenesis of S. cerevisiae and improve our understanding of the emergence of fungal pathogens, we performed a population-based genome-wide environmental association analysis of clinical vs. nonclinical origin in S. cerevisiae. Using tiling array-based, high-density genotypes of 44 clinical and 44 nonclinical S. cerevisiae strains from diverse geographical origins and source substrates, we identified several genetic loci associated with clinical background in S. cerevisiae. Associated polymorphisms within the coding sequences of VRP1, KIC1, SBE22 and PDR5, and the 5' upstream region of YGR146C indicate the importance of pseudohyphal formation, robust cell wall maintenance and cellular detoxification for S. cerevisiae pathogenesis, and constitute good candidates for follow-up verification of virulence and virulence-related factors underlying the pathogenicity of S. cerevisiae.
18.
B. An J. Xia T. Chang X. Wang L. Xu L. Zhang X. Gao Y. Chen J. Li H. Gao 《Animal genetics》2019,50(4):386-390
We performed a genome‐wide association study to identify candidate genes for body measurement traits in 463 Wagyu beef cattle typed with the Illumina Bovine HD 770K SNP array. At the genome‐wide level, we detected 18, five and one SNPs associated with hip height, body height and body length respectively. In total, these SNPs are within or near 11 genes, six of which (PENK, XKR4, IMPAD1, PLAG1, CCND2 and SNTG1) have been reported previously and five of which (CSMD3, LAP3, SYN3, FAM19A5 and TIMP3) are novel candidate genes that we found to be associated with body measurement traits. Further exploration of these candidate genes will facilitate genetic improvement in Chinese Wagyu beef cattle.
19.
P. G. Eusebi R. González‐Prendes R. Quintanilla J. Tibau T. F. Cardoso A. Clop M. Amills 《Animal genetics》2017,48(4):466-469
We performed a genome‐wide association study to map the genetic determinants of carcass traits in 350 Duroc pigs typed with the Porcine SNP60 BeadChip. Association analyses were carried out using the gemma software. The proportion of phenotypic variance explained by the SNPs ranged from negligible to moderate (= 0.01–0.30) depending on the trait under consideration. At the genome‐wide level, we detected one significant association between backfat thickness between the 3rd and 4th ribs and six SNPs mapping to SSC12 (37–40 Mb). We also identified several chromosome‐wide significant associations for ham weight (SSC11: 51–53 Mb, three SNPs; 67–68 Mb, two SNPs), carcass weight (SSC11: 66–68 Mb, two SNPs), backfat thickness between the 3rd and 4th ribs (SSC12: 21 Mb, one SNP; 33–40 Mb, 17 SNPs; 51–58 Mb, two SNPs), backfat thickness in the last rib (SSC12: 37 Mb, one SNP; 40–41 Mb, nine SNPs) and lean meat content (SSC13: 34 Mb, three SNPs and SSC16: 45.1 Mb, one SNP; 62–63 Mb, 10 SNPs; 71–75 Mb, nine SNPs). The ham weight trait‐associated region on SSC11 contains two genes (UCHL3 and LMO7) related to muscle development. In addition, the ACACA gene, which encodes an enzyme for the catalysis of fatty acid synthesis, maps to the SSC12 (37–41 Mb) region harbouring trait‐associated regions for backfat thickness traits. Sequencing of these candidate genes may help to uncover the causal mutations responsible for the associations found in the present study.
20.
A genome‐wide association study (GWAS) was conducted on 15 milk production traits in Chinese Holstein. The experimental population consisted of 445 cattle, each genotyped by the GGP (GeneSeek genomic profiling)‐BovineLD V3 SNP chip, which had 26 151 public SNPs in its manifest file. After data cleaning, 20 326 SNPs were retained for the GWAS. The phenotypes were estimated breeding values of traits, provided by a public dairy herd improvement program center that had been collected once a month for 3 years. Two statistical models, a fixed‐effect linear regression model and a mixed‐effect linear model, were used to estimate the association effects of SNPs on each of the phenotypes. Genome‐wide significant and suggestive thresholds were set at 2.46E‐06 and 4.95E‐05 respectively. The two statistical models concurrently identified two genome‐wide significant (P < 0.05) SNPs on milk production traits in this Chinese Holstein population. The positional candidate genes, which were the ones closest to these two identified SNPs, were EEF2K (eukaryotic elongation factor 2 kinase) and KLHL1 (kelch like family member 1). These two genes could serve as new candidate genes for milk yield and lactation persistence, yet their roles need to be verified in further function studies.
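The genome‐wide significance threshold quoted above matches a Bonferroni correction of 0.05 over the 20 326 retained SNPs. The abstract does not state how either threshold was derived, so the sketch below is a consistency check under that assumption; it makes no claim about the suggestive threshold.

```python
# Figures taken from the abstract above.
n_snps_retained = 20_326
family_wise_alpha = 0.05  # assumed overall error rate; not stated in the abstract

# Bonferroni-corrected per-SNP threshold.
genome_wide_threshold = family_wise_alpha / n_snps_retained

# Same notation as the abstract, e.g. "2.46E-06".
formatted_threshold = f"{genome_wide_threshold:.2E}"
print(formatted_threshold)
```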