Similar literature
20 similar articles found
1.
2.
Association studies use statistical links between genetic markers and the phenotype variation across many individuals to identify genes controlling variation in the target phenotype. However, this approach, particularly conducted on a genome‐wide scale (GWAS), has limited power to identify the genes responsible for variation in traits controlled by complex genetic architectures. In this study, we employ real‐world genotype datasets from four crop species with distinct minor allele frequency distributions, population structures and linkage disequilibrium patterns. We demonstrate that different GWAS statistical approaches provide favourable trade‐offs between power and accuracy for traits controlled by different types of genetic architectures. FarmCPU provides the most favourable outcomes for moderately complex traits while a Bayesian approach adopted from genomic prediction provides the most favourable outcomes for extremely complex traits. We assert that by estimating the complexity of genetic architectures for target traits and selecting an appropriate statistical approach for the degree of complexity detected, researchers can substantially improve the ability to dissect the genetic factors controlling complex traits such as flowering time, plant height and yield component.

3.
In genome-wide association studies (GWAS) it is now common to search for, and find, multiple causal variants located in close proximity. It has also become standard to ask whether different traits share the same causal variants, but one of the popular methods to answer this question, coloc, makes the simplifying assumption that only a single causal variant exists for any given trait in any genomic region. Here, we examine the potential of the recently proposed Sum of Single Effects (SuSiE) regression framework, which can be used for fine-mapping genetic signals, for use with coloc. SuSiE is a novel approach that allows evidence for association at multiple causal variants to be evaluated simultaneously, whilst separating the statistical support for each variant conditional on the causal signal being considered. We show this results in more accurate coloc inference than other proposals to adapt coloc for multiple causal variants based on conditioning. We therefore recommend that coloc be used in combination with SuSiE to optimise accuracy of colocalisation analyses when multiple causal variants exist.

4.

Background  

Algorithms and software for CNV detection have been developed, but they detect the CNV regions sample-by-sample with individual-specific breakpoints, while common CNV regions are likely to occur at the same genomic locations across different individuals in a homogenous population. Current algorithms to detect common CNV regions do not account for the varying reliability of the individual CNVs, typically reported as confidence scores by SNP-based CNV detection algorithms. General methodologies for identifying these recurrent regions, especially those directed at SNP arrays, are still needed.

5.
To refine the location of a disease gene within the bounds provided by linkage analysis, many scientists use the pattern of linkage disequilibrium between the disease allele and alleles at nearby markers. We describe a method that seeks to refine location by analysis of "disease" and "normal" haplotypes, thereby using multivariate information about linkage disequilibrium. Under the assumption that the disease mutation occurs in a specific gap between adjacent markers, the method first combines parsimony and likelihood to build an evolutionary tree of disease haplotypes, with each node (haplotype) separated, by a single mutational or recombinational step, from its parent. If required, latent nodes (unobserved haplotypes) are incorporated to complete the tree. Once the tree is built, its likelihood is computed from probabilities of mutation and recombination. When each gap between adjacent markers is evaluated in this fashion and these results are combined with prior information, they yield a posterior probability distribution to guide the search for the disease mutation. We show, by evolutionary simulations, that an implementation of these methods, called "FineMap," yields substantial refinement and excellent coverage for the true location of the disease mutation. Moreover, by analysis of hereditary hemochromatosis haplotypes, we show that FineMap can be robust to genetic heterogeneity.
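
The final step described in this abstract, combining a likelihood for each marker gap with prior information to obtain a posterior distribution over gaps, can be illustrated with a short sketch. It is not the FineMap implementation: the function name `gap_posterior` and its inputs are hypothetical, and the per-gap log-likelihoods are assumed to have been computed already (the tree-building step is not shown).

```python
import math


def gap_posterior(log_likelihoods, priors):
    """Turn per-gap log-likelihoods and prior weights into a posterior.

    Hypothetical helper: log_likelihoods[i] is the log-likelihood of the
    haplotype tree built under the assumption that the disease mutation
    lies in gap i; priors[i] is the (possibly unnormalised) prior weight
    for that gap.
    """
    if len(log_likelihoods) != len(priors):
        raise ValueError("one prior weight is needed per gap")
    # Work on the log scale so that very small likelihoods do not underflow.
    log_post = [
        ll + math.log(p) if p > 0 else -math.inf
        for ll, p in zip(log_likelihoods, priors)
    ]
    peak = max(log_post)
    if peak == -math.inf:
        raise ValueError("at least one gap needs a positive prior weight")
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]
```

With equal likelihoods the posterior simply reproduces the normalised prior, which is a quick sanity check on any implementation of this step.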

6.
Loss of coral reef resilience can lead to dramatic changes in benthic structure, often called regime shifts, which significantly alter ecosystem processes and functioning. In the face of global change and increasing direct human impacts, there is an urgent need to anticipate and prevent undesirable regime shifts and, conversely, to reverse shifts in already degraded reef systems. Such challenges require a better understanding of the human and natural drivers that support or undermine different reef regimes. The Hawaiian archipelago extends across a wide gradient of natural and anthropogenic conditions and provides us a unique opportunity to investigate the relationships between multiple reef regimes, their dynamics and potential drivers. We applied a combination of exploratory ordination methods and inferential statistics to one of the most comprehensive coral reef datasets available in order to detect, visualize and define potential multiple ecosystem regimes. This study demonstrates the existence of three distinct reef regimes dominated by hard corals, turf algae or macroalgae. Results from boosted regression trees show nonlinear patterns among predictors that help to explain the occurrence of these regimes, and highlight herbivore biomass as the key driver in addition to effluent, latitude and depth.

7.
8.
We study the number of causal variants and associated regions identified by top SNPs in rankings given by the popular 1 df chi-squared statistic, support vector machine (SVM) and the random forest (RF) on simulated and real data. If we apply the SVM and RF to the top 2r chi-square-ranked SNPs, where r is the number of SNPs with P-values within the Bonferroni correction, we find that both improve the ranks of causal variants and associated regions and achieve higher power on simulated data. These improvements, however, as well as stability of the SVM and RF rankings, progressively decrease as the cutoff increases to 5r and 10r. As applications we compare the ranks of previously replicated SNPs in real data, associated regions in type 1 diabetes, as provided by the Type 1 Diabetes Consortium, and disease risk prediction accuracies as given by top ranked SNPs by the three methods. Software and webserver are available at http://svmsnps.njit.edu.
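
The pre-selection step mentioned in this abstract (rank SNPs by a 1 df chi-squared test, count the r SNPs that pass the Bonferroni threshold, keep the top 2r) might be sketched as follows. This is an illustration only, not the authors' software: it assumes the 1 df statistic is the allelic test on a 2x2 table of allele counts, and the function names and the `counts` layout are hypothetical. The SVM and RF re-ranking that follows is not shown.

```python
import math


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-squared statistic (1 df) for a 2x2 allele-count table."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    # A table with an empty margin carries no information about association.
    return numerator / denominator if denominator else 0.0


def preselect(counts, alpha=0.05, multiple=2):
    """Return the top multiple*r SNPs by p-value, and r itself.

    counts maps a SNP name to (case_alt, case_ref, ctrl_alt, ctrl_ref);
    r is the number of SNPs whose p-value passes the Bonferroni threshold.
    """
    pvals = {snp: chi2_1df_pvalue(allelic_chi2(*c)) for snp, c in counts.items()}
    ranked = sorted(pvals, key=pvals.get)
    threshold = alpha / len(pvals)
    r = sum(1 for snp in ranked if pvals[snp] <= threshold)
    return ranked[: multiple * r], r
```

Setting `multiple` to 5 or 10 corresponds to the wider 5r and 10r cutoffs mentioned in the abstract.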

9.
10.
SUMMARY: The program package TreeLD implements a unified approach to association mapping and fine mapping of complex trait loci and a novel approach to visualizing association data, based on an inferred ancestry of the sample. Fundamentally, the TreeLD approach is based on the idea that the evidence for association at a particular position is contained in the ancestral tree relating the sampled chromosomes at that position. TreeLD provides an easy-to-use interface and can be applied to case-control, TDT trio and quantitative trait data.

11.
Quantifying species trophic interaction strengths is crucial for understanding community dynamics and has significant implications for pest management and species conservation. DNA-based methods to identify species interactions have revolutionized these efforts, but a significant limitation is the poor ability to quantify the strength of trophic interactions, that is the biomass or number of prey consumed. We present an improved pipeline, called Lazaro, to map unassembled shotgun reads to a comprehensive arthropod mitogenome database and show that the number of prey reads detected is quantitatively predicted from the prey biomass consumed, even for indirect predation. Two feeding bioassays were performed: starved coccinellid larvae consuming different numbers of aphids (Prey Quantity bioassay), and starved coccinellid larvae consuming a chrysopid larva that had consumed aphids (Direct and Indirect Predation bioassay). Prey taxonomic assignment against a mitochondrial genome database had high accuracy (99.8% positive predictive value) and the number of prey reads was directly related to the number of prey consumed and inversely related to the elapsed time since consumption with high significance (r² = .932, p = 4.92E-6). Aphids were detected up to 6 h after direct predation plus 3 h after indirect predation (9 h in total) and detection was related to the predator-specific decay rates. Lazaro enabled quantitative predictions of prey consumption across multiple trophic levels with high taxonomic resolution while eliminating all false positives, except for a few confirmed contaminants, and may be valuable for characterizing prey consumed by field-sampled predators. Moreover, Lazaro is readily applicable for species diversity determination from any degraded environmental DNA.

12.
We constructed recombinant inbred lines of a cross between naturally occurring ecotypes of Avena barbata (Pott ex Link), Poaceae, associated with contrasting moisture environments. These lines were assessed for fitness in common garden reciprocal transplant experiments in two contrasting field sites in each of two years, as well as a novel, benign greenhouse environment. An AFLP (amplified fragment length polymorphism) linkage map of 129 markers spanned 644 cM in 19 linkage groups, which is smaller, with more linkage groups, than expected. Therefore parts of the A. barbata genome remain unmapped, possibly because they lack variation between the ecotypes. Nevertheless, we identified QTL (quantitative trait loci) under selection in both native environments and in the greenhouse. Across years at the same site, the same loci remain under selection, for the same alleles. Across sites, an overlapping set of loci are under selection with either (i) the same alleles favoured at both sites or (ii) loci under selection at one site and neutral at the other. QTL under selection in the greenhouse were generally unlinked to those under selection in the field because selection acted on a different trait. We found little evidence that selection favours alternate alleles in alternate environments, which would be necessary if genotype by environment interaction were to maintain genetic variation in A. barbata. Additive effect QTL were best able to explain the genetic variation among recombinant inbred lines for the greenhouse environment where heritability was highest, and past selection had not eliminated variation.

13.
Zhou H, Wei LJ, Xu X, Xu X. Human Heredity, 2008, 65(3): 166-174
In the search to detect genetic associations between complex traits and DNA variants, a common practice is to select a subset of Single Nucleotide Polymorphisms (tag SNPs) in a gene or chromosomal region of interest. This allows study of untyped polymorphisms in this region through the phenomenon of linkage disequilibrium (LD). However, it is crucial in the analysis to utilize such multiple SNP markers efficiently. In this study, we present a robust testing approach (T(C)) that combines single marker association test statistics or p values. This combination is based on the summation of single test statistics or p values, giving greater weight to those with lower p values. We compared the power of T(C) in identifying common trait loci, using tag SNPs within the same haplotype block in which the trait loci reside, with competing published tests, in case-control settings. These competing tests included the Bonferroni procedure (T(B)), the simple permutation procedure (T(P)), the permutation procedure proposed by Hoh et al. (T(P-H)) and its revised version using 'deflated' statistics (T(P-H_def)), the traditional χ² procedure (T(CHI)), the regression procedure (Hotelling's T² test) (T(R)) and the haplotype-based test (T(H)). Results of these comparisons show that our proposed combining procedure (T(C)) is preferred in all scenarios examined. We also apply this new test to a data set from a previously reported association study on airway responsiveness to methacholine.

14.
15.
Genetica - This study aimed to investigate the effects of incidence rate, heritability, and polygenic variance on the statistical power of genome-wide association studies (GWAS) for threshold...

16.
In complex diseases, various combinations of genomic perturbations often lead to the same phenotype. On a molecular level, combinations of genomic perturbations are assumed to dys-regulate the same cellular pathways. Such a pathway-centric perspective is fundamental to understanding the mechanisms of complex diseases and the identification of potential drug targets. In order to provide an integrated perspective on complex disease mechanisms, we developed a novel computational method to simultaneously identify causal genes and dys-regulated pathways. First, we identified a representative set of genes that are differentially expressed in cancer compared to non-tumor control cases. Assuming that disease-associated gene expression changes are caused by genomic alterations, we determined potential paths from such genomic causes to target genes through a network of molecular interactions. Applying our method to sets of genomic alterations and gene expression profiles of 158 Glioblastoma multiforme (GBM) patients we uncovered candidate causal genes and causal paths that are potentially responsible for the altered expression of disease genes. We discovered a set of putative causal genes that potentially play a role in the disease. Combining an expression Quantitative Trait Loci (eQTL) analysis with pathway information, our approach allowed us not only to identify potential causal genes but also to find intermediate nodes and pathways mediating the information flow between causal and target genes. Our results indicate that different genomic perturbations indeed dys-regulate the same functional pathways, supporting a pathway-centric perspective of cancer. While copy number alterations and gene expression data of glioblastoma patients provided opportunities to test our approach, our method can be applied to any disease system where genetic variations play a fundamental causal role.

17.
Resources being amassed for genome-wide association (GWA) studies include "control databases" genotyped with a large-scale SNP array. How to use these databases effectively is an open question. We develop a method to match, by genetic ancestry, controls to affected individuals (cases). The impact of this method, especially for heterogeneous human populations, is to reduce the false-positive rate, inflate other spuriously small p values, and have little impact on the p values associated with true positive loci. Thus, it highlights true positives by downplaying false positives. We perform a GWA by matching Americans with type 1 diabetes (T1D) to controls from Germany. Despite the complex study design, these analyses identify numerous loci known to confer risk for T1D.
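
The idea of matching controls to cases by genetic ancestry can be illustrated with a deliberately simple stand-in: greedy one-to-one nearest-neighbour matching in a space of ancestry coordinates (for example, leading principal components of the genotype matrix). The abstract does not describe the authors' matching algorithm, so this is not their method. The function name, the `max_distance` cutoff and the coordinate inputs are all assumptions made for illustration.

```python
import math


def match_controls(case_coords, control_coords, max_distance=None):
    """Greedily pair each case with the nearest unused control.

    case_coords and control_coords map sample IDs to tuples of ancestry
    coordinates of equal length. Cases whose nearest remaining control is
    farther than max_distance (if given) are left unmatched.
    """
    available = dict(control_coords)
    matches = {}
    for case_id, coords in case_coords.items():
        if not available:
            break  # no controls left to assign
        nearest = min(available, key=lambda k: math.dist(coords, available[k]))
        if max_distance is not None and math.dist(coords, available[nearest]) > max_distance:
            continue  # no sufficiently similar control for this case
        matches[case_id] = nearest
        del available[nearest]  # each control is used at most once
    return matches
```

Greedy matching depends on the order in which cases are visited. An optimal assignment would need a different algorithm, but the sketch shows how poorly matched samples can be excluded before association testing.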

18.
19.
Identification of quantitative trait loci (QTL) for fiber quality traits that are stable across multiple generations and environments could facilitate marker-assisted selection for improving cotton strains. In the present study, F2, F2:3, and recombinant inbred lines (RILs, F6:8) populations derived from an upland cotton (Gossypium hirsutum L.) cross between strain 0-153, which has excellent fiber quality, and strain sGK9708, a commercial transgenic cultivar, were constructed for QTL tagging of fiber quality. We used 5,742 simple sequence repeat primer pairs to screen for polymorphisms between the two parent strains. Linkage maps of F2 and RILs were constructed, containing 155 and 190 loci and with a total map distance of 959.4 centimorgans (cM) and 700.9 cM, respectively. We screened fiber quality QTL across multiple generations and environments through composite interval mapping of fiber quality data. Specifically, we studied F2 and F2:3 family lines from Anyang (Henan Province) in 2003 and 2004 and RILs in Anyang in 2007 and Anyang, Quzhou (Hebei Province), and Linqing (Shandong Province) in 2008. We identified 50 QTL for fiber quality: 10 for fiber strength, 10 for fiber length, 10 for micronaire, eight for fiber uniformity, and 12 for fiber elongation. Nine of these fiber quality QTL were identified in F2, F2:3 and RILs simultaneously. Two QTL for fiber strength on chromosomes C7 and C25 were detected in all three generations and all four environments and explained 16.67-27.86% and 9.43-21.36% of the phenotypic variation, respectively. These stable QTL for fiber quality traits could be used for marker assisted selection.

20.
Ecosystem services, i.e., services provided to humans from ecological systems, have become a key issue of this century in resource management, conservation planning, and environmental decision analysis. Mapping and quantifying ecosystem services have become strategic national interests for integrating ecology with economics to help understand the effects of human policies and actions and their subsequent impacts on both ecosystem function and human well-being. Some aspects of biodiversity are valued by humans in varied ways, and thus are important to include in any assessment that seeks to identify and quantify the benefits of ecosystems to humans. Some biodiversity metrics clearly reflect ecosystem services (e.g., abundance and diversity of harvestable species), whereas others may reflect indirect and difficult to quantify relationships to services (e.g., relevance of species diversity to ecosystem resilience, cultural value of native species). Wildlife habitat has been modeled at broad spatial scales and can be used to map a number of biodiversity metrics. In the present study, we present an approach that (1) identifies mappable biodiversity metrics that are related to ecosystem services or other stakeholder concerns, (2) maps these metrics throughout a large multi-state region, and (3) compares the metric values obtained for selected watersheds within the regional context. The broader focus is to design a flexible approach for mapping metrics to produce a national-scale product. We map 20 biodiversity metrics reflecting ecosystem services or other aspects of biodiversity for all vertebrate species except fish. Metrics include species richness for all vertebrates, specific taxon groups, harvestable species (i.e., upland game, waterfowl, furbearers, small game, and big game), threatened and endangered species, and state-designated species of greatest conservation need, and also a metric for ecosystem (i.e., land cover) diversity.
The project is being conducted at multiple scales in a phased approach, starting with place-based studies, then multi-state regional areas, culminating in a national-level atlas. As an example of this incremental approach, we provide results for the southwestern United States (i.e., states of Arizona, New Mexico, Nevada, Utah, and Colorado) and portions of two watersheds within this region: the San Pedro River (Arizona) and Rio Grande River (New Mexico). Geographic patterns differed considerably among metrics across the southwestern study area, but metric values for the two watershed study areas were generally greater than those for the southwestern region as a whole.
