Similar literature
20 similar articles found (search time: 0 ms)
1.

Background

Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods.

Methods

In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values.
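
The evaluation criterion described above (the correlation between observed phenotypes and predicted breeding values) and the RBF kernel underlying the non-linear models can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the 0/1/2 genotype coding, and the bandwidth parameter are assumptions.

```python
import math


def rbf_kernel(genotype_a, genotype_b, bandwidth):
    """Gaussian RBF similarity between two genotype vectors.

    Assumes genotypes are coded as allele counts (0/1/2); the bandwidth
    is a hypothetical tuning parameter, not a value from the study.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(genotype_a, genotype_b))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def prediction_accuracy(observed, predicted):
    """Pearson correlation between observed phenotypes and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov / math.sqrt(var_obs * var_pred)
```

Because the accuracy is a correlation, it can be negative when predictions rank animals in the opposite order to their phenotypes, which is the situation the Conclusions refer to as negative accuracies.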

Results

When the training dataset included only data from the evaluated line, non-linear models yielded at best an accuracy similar to that of linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction.

Conclusions

Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional occurrence of large negative accuracies when the evaluated line was not included in the training dataset. Furthermore, when using a multi-line training dataset, non-linear models provided information on the genotype data that was complementary to the linear models, which indicates that the underlying data distributions of the three studied lines were indeed heterogeneous.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-014-0075-3) contains supplementary material, which is available to authorized users.

2.
3.
4.
The fact that Goldmann applanation tonometry does not accurately account for individual corneal elastic stiffness often leads to inaccuracy in the measurement of intraocular pressure (IOP). IOP measurement should account not only for the effect of central corneal thickness (CCT) but also for other corneal biomechanical factors. A computational method for accurate and reliable determination of IOP is investigated with a modified applanation tonometer in this paper. The proposed method uses a combined genetic algorithm/neural network procedure to match the clinically measured applanation force-displacement history with that obtained from a nonlinear finite element simulation of applanation. An additional advantage of the proposed method is that it also provides the ability to determine CCT and material properties of the cornea from the same applanation response data. The performance of the proposed method has been demonstrated through a parametric study and via comparison with a well-known clinical case. The proposed method is also shown to be computationally efficient, which is an important practical consideration for clinical application.

5.
Quantitative genetic parameters are nowadays more frequently estimated with restricted maximum likelihood using the 'animal model' than with traditional methods such as parent-offspring regressions. These methods have however rarely been evaluated using equivalent data sets. We compare heritabilities and genetic correlations from animal model and parent-offspring analyses, respectively, using data on eight morphological traits in the great reed warbler (Acrocephalus arundinaceus). Animal models were run using either mean trait values or individual repeated measurements to be able to separate between effects of including more extended pedigree information and effects of replicated sampling from the same individuals. We show that the inclusion of more pedigree information by the use of mean traits animal models had limited effect on the standard error and magnitude of heritabilities. In contrast, the use of repeated measures animal model generally had a positive effect on the sampling accuracy and resulted in lower heritabilities; the latter due to lower additive variance and higher phenotypic variance. For most trait combinations, both animal model methods gave genetic correlations that were lower than the parent-offspring estimates, whereas the standard errors were lower only for the mean traits animal model. We conclude that differences in heritabilities between the animal model and parent-offspring regressions were mostly due to the inclusion of individual replicates to the animal model rather than the inclusion of more extended pedigree information. Genetic correlations were, on the other hand, primarily affected by the inclusion of more pedigree information. This study is to our knowledge the most comprehensive empirical evaluation of the performance of the animal model in relation to parent-offspring regressions in a wild population. 
Our conclusions should be valuable for reconciliation of data obtained in earlier studies as well as for future meta-analyses utilizing estimates from both traditional methods and the animal model.
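
For readers unfamiliar with the traditional method this abstract compares against, a parent-offspring regression can be sketched as below. It relies on the standard quantitative-genetic result that the slope of offspring values on mid-parent values estimates heritability, while the slope on a single parent estimates half of it. The function names and data layout are illustrative assumptions and are not taken from the study.

```python
def regression_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def heritability_midparent(fathers, mothers, offspring):
    """Heritability estimated as the slope of offspring on mid-parent values."""
    midparent = [(f + m) / 2.0 for f, m in zip(fathers, mothers)]
    return regression_slope(midparent, offspring)


def heritability_single_parent(parent, offspring):
    """Heritability estimated as twice the slope of offspring on one parent."""
    return 2.0 * regression_slope(parent, offspring)
```

Unlike the animal model, this estimator uses only one generation of relatives, which is the contrast in pedigree information that the abstract evaluates.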

6.
MOTIVATION: Using simulation studies for quantitative trait loci (QTL), we evaluate the prediction quality of regression models that include as covariates single-nucleotide polymorphism (SNP) genetic markers which did not achieve genome-wide significance in the original genome-wide association study, but were among the SNPs with the smallest P-values for the selected association test. We compare the results of such regression models to the standard approach, which is to include only SNPs that achieve genome-wide significance. Using mean square prediction error as the model metric, our simulation results suggest that by using the coefficient of determination (R²) value as a guideline to increase or reduce the number of SNPs included in the regression model, we can achieve better prediction quality than the standard approach. However, important parameters such as trait heritability, the approximate number of QTLs, etc. have to be determined from previous studies or have to be estimated accurately.
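
The quantities named in this abstract can be sketched as follows: ranking SNPs by P-value so that markers below genome-wide significance can still be included, and scoring a fitted model by mean square prediction error and R². This is a hedged illustration only; the function names are hypothetical, and the regression fit itself is omitted.

```python
def top_snps_by_p_value(p_values, n_snps):
    """Indices of the n_snps markers with the smallest P-values.

    n_snps is a hypothetical tuning knob; the abstract suggests adjusting
    it using R-squared as a guideline.
    """
    ranked = sorted(range(len(p_values)), key=lambda i: p_values[i])
    return ranked[:n_snps]


def mean_squared_prediction_error(observed, predicted):
    """Average squared difference between observed and predicted trait values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def coefficient_of_determination(observed, predicted):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```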

7.
In longitudinal studies, measurements of the same individuals are taken repeatedly through time. Often, the primary goal is to characterize the change in response over time and the factors that influence change. Factors can affect not only the location but also more generally the shape of the distribution of the response over time. To make inference about the shape of a population distribution, the widely popular mixed-effects regression, for example, would be inadequate, if the distribution is not approximately Gaussian. We propose a novel linear model for quantile regression (QR) that includes random effects in order to account for the dependence between serial observations on the same subject. The notion of QR is synonymous with robust analysis of the conditional distribution of the response variable. We present a likelihood-based approach to the estimation of the regression quantiles that uses the asymmetric Laplace density. In a simulation study, the proposed method had an advantage in terms of mean squared error of the QR estimator, when compared with the approach that considers penalized fixed effects. Following our strategy, a nearly optimal degree of shrinkage of the individual effects is automatically selected by the data and their likelihood. Also, our model appears to be a robust alternative to the mean regression with random effects when the location parameter of the conditional distribution of the response is of interest. We apply our model to a real data set which consists of self-reported amount of labor pain measurements taken on women repeatedly over time, whose distribution is characterized by skewness, and the significance of the parameters is evaluated by the likelihood ratio statistic.
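
The link between quantile regression and the asymmetric Laplace density mentioned above can be sketched as below: maximizing the asymmetric Laplace likelihood in the location parameter is equivalent to minimizing the quantile check loss. The sketch covers only the fixed-location case; the random effects of the proposed model are not implemented, and all function names are assumptions.

```python
import math


def check_loss(residual, tau):
    """Quantile check (pinball) loss: u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def asymmetric_laplace_logpdf(y, location, scale, tau):
    """Log-density of the asymmetric Laplace distribution.

    f(y) = tau * (1 - tau) / scale * exp(-check_loss((y - location) / scale, tau))
    """
    return math.log(tau * (1.0 - tau) / scale) - check_loss((y - location) / scale, tau)


def negative_log_likelihood(observations, location, scale, tau):
    """Negative log-likelihood of independent observations (no random effects)."""
    return -sum(asymmetric_laplace_logpdf(y, location, scale, tau) for y in observations)
```

With tau = 0.5 the likelihood is maximized at the sample median, which is why the approach is less sensitive to skewed responses than mean regression.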

8.
The genetic map of chromosome 5B has been constructed by using microsatellite (SSR) analysis of 381 plants from the F2 population produced by crossing the Chinese Spring (CS) and Renan cultivars. Initially, 180 SSR markers for the common wheat 5B chromosome have been used for analysis of these cultivars. The 32 markers able to detect polymorphism between these cultivars have been located on the genetic map of chromosome 5B. Cytogenetic mapping has involved a set of CS 5B chromosome deletion lines. In total, 51 SSR markers have been located in ten regions (deletion bins) of this chromosome by SSR analysis of these deletion lines. Five genes (TaCBFIIIc-B10, Vrn-B1, Chi-B1, Skr, and Ph1) have been integrated into the cytogenetic map of chromosome 5B using markers either specific to or tightly linked to the genes in question. Comparison of the genetic and cytogenetic maps suggests that recombination is suppressed in the pericentromeric region of chromosome 5B, especially in the short arm segment. The 18 markers localized to deletion bins 5BL16-0.79-1.00 and 5BL18-0.66-0.79 have been used to analyze common wheat introgression lines L842, L5366-180, L73/00i, and L21-4, carrying fragments of alien genomes in the terminal region of the 5B long arm. Lines L5366-180 and L842 carry a fragment of the Triticum timopheevii 5GL chromosome, while lines L73/00i and L21-4 carry a fragment of the Aegilops speltoides 5SL chromosome. As has been shown, the translocated fragments in these four lines are of different lengths, allowing bin 5BL18-0.66-0.79 to be divided into three shorter regions. The utility of wheat introgression lines carrying alien translocations for increasing the resolution of cytogenetic mapping is discussed.

9.
Biology is now entering the era of systems biology, which is exerting a growing influence on the future development of various disciplines within the life sciences. In the early classical and molecular periods of biology, the theoretical frameworks of classical and molecular quantitative genetics, respectively, were systematically established. With the advent of systems biology, a paradigm shift is occurring in the field of quantitative genetics. Where and how will quantitative genetics develop after having passed through its classical and molecular periods? This is a difficult question to answer exactly. This perspective article discusses the possible development of quantitative genetics in the systems biology era, in which there is high potential for it to develop towards "systems quantitative genetics". In our opinion, systems quantitative genetics can be defined as a new discipline that addresses the generalized genetic laws of bioalleles controlling the heritable phenotypes of complex traits following a new dynamic network model. Other issues from a quantitative genetic perspective, relating to genetical genomics, updates of the network model, and future research prospects, are also discussed.

10.

Background

A haplotype approach to genomic prediction using high density data in dairy cattle as an alternative to single-marker methods is presented. With the assumption that haplotypes are in stronger linkage disequilibrium (LD) with quantitative trait loci (QTL) than single markers, this study focuses on the use of haplotype blocks (haploblocks) as explanatory variables for genomic prediction. Haploblocks were built based on the LD between markers, which allowed variable reduction. The haploblocks were then used to predict three economically important traits (milk protein, fertility and mastitis) in the Nordic Holstein population.

Results

The haploblock approach improved prediction accuracy compared with the commonly used individual single nucleotide polymorphism (SNP) approach. Furthermore, using an average LD threshold to define the haploblocks (LD≥0.45 between any two markers) increased the prediction accuracies for all three traits, although the improvement was most significant for milk protein (up to 3.1 % improvement in prediction accuracy, compared with the individual SNP approach). Hotelling’s t-tests were performed, confirming the improvement in prediction accuracy for milk protein. Because the phenotypic values were in the form of de-regressed proofs, the improved accuracy for milk protein may be due to higher reliability of the data for this trait compared with the reliability of the mastitis and fertility data. Comparisons between best linear unbiased prediction (BLUP) and Bayesian mixture models also indicated that the Bayesian model produced the most accurate predictions in every scenario for the milk protein trait, and in some scenarios for fertility.

Conclusions

The haploblock approach to genomic prediction is a promising method for genomic selection in animal breeding. Building haploblocks based on LD reduced the number of variables without loss of information. This method may play an important role in future genomic prediction involving whole genome sequences.
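
The idea of building haploblocks from an LD threshold can be sketched as below. This is a simplified greedy illustration, not the procedure used in the study: LD is approximated here by the squared correlation of 0/1/2 genotype dosages (an assumption), a block is extended only while the new marker meets the threshold with every marker already in the block, and the function names are hypothetical. Only the 0.45 threshold comes from the abstract.

```python
def r_squared(marker_a, marker_b):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(marker_a)
    mean_a = sum(marker_a) / n
    mean_b = sum(marker_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(marker_a, marker_b))
    var_a = sum((a - mean_a) ** 2 for a in marker_a)
    var_b = sum((b - mean_b) ** 2 for b in marker_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # monomorphic markers carry no LD information
    return cov * cov / (var_a * var_b)


def build_haploblocks(markers, threshold=0.45):
    """Group adjacent markers into blocks of pairwise LD >= threshold.

    markers: list of genotype vectors ordered by map position.
    Returns a list of blocks, each a list of marker indices.
    """
    blocks = []
    current = []
    for index, marker in enumerate(markers):
        if all(r_squared(markers[j], marker) >= threshold for j in current):
            current.append(index)
        else:
            blocks.append(current)
            current = [index]
    if current:
        blocks.append(current)
    return blocks
```

Each resulting block would then replace its member SNPs as one explanatory variable, which is how the variable reduction described in the abstract arises.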

11.
12.
Random regression (RR) analysis has been recommended to estimate the genetic parameters of longitudinal data. The objective of this study was to evaluate the growth of turkeys using RR models. Data were collected from 957 turkeys and included 15,478 individual body weight records taken during the first week of life and between weeks 2 and 32 at 2-week intervals. To take into account the repeated measurements of weight for each animal, a specific overall growth curve was modelled using a cubic smoothing spline. Animal deviation from this curve was also modelled using an RR function. All data were analysed with the ASReml package. Heritability estimates increased over the trajectory and peaked at 0.60 around 20 to 32 weeks of age. Genetic correlations showed that turkeys could be selected at earlier time points, at 12 weeks of age, in order to increase the growth rate. In general, genetic correlation estimates were higher among adjacent ages, decreasing markedly with the increase of distance between ages. Negative genetic correlations were observed between ages.

13.
Cobbs G. Genetics, 1986, 113(2): 355-365
A laboratory strain of Drosophila pseudoobscura (L116) is studied that, when crossed to sex-ratio homozygous females, produces sons that exhibit varying levels of the male sex-ratio (msr) phenotype. The msr phenotype occurs only in sex-ratio males and is due to the production of a high frequency of nullo-XY sperm. The level of the msr phenotype is variable, and new variability is generated in one father-son transmission. Pedigree studies indicate the genes for msr reside on the Y chromosome or the autosomes of the L116 stock.

14.
Association studies of quantitative traits have often relied on methods in which a normal distribution of the trait is assumed. However, quantitative phenotypes from complex human diseases are often censored, highly skewed, or contaminated with outlying values. We recently developed a rank-based association method that takes into account censoring and makes no distributional assumptions about the trait. In this study, we applied our new method to age-at-onset data on ALDX1 and ALDX2. Both traits are highly skewed (skewness > 1.9) and often censored. We performed a whole genome association study of age at onset of the ALDX1 trait using Illumina single-nucleotide polymorphisms. Only slightly more than 5% of markers were significant. However, we identified two regions on chromosomes 14 and 15, which each have at least four significant markers clustering together. These two regions may harbor genes that regulate age at onset of ALDX1 and ALDX2. Future fine mapping of these two regions with densely spaced markers is warranted.

15.

Background

With the development of sequencing technologies, more and more sequence variants are available for investigation. Different classes of variants in the human genome have been identified, including single nucleotide substitutions, insertion and deletion, and large structural variations such as duplications and deletions. Insertion and deletion (indel) variants comprise a major proportion of human genetic variation. However, little is known about their effects on humans. The absence of understanding is largely due to the lack of both biological data and computational resources.

Results

This paper presents a new indel functional prediction method HMMvar based on HMM profiles, which capture the conservation information in sequences. The results demonstrate that a scoring strategy based on HMM profiles can achieve good performance in identifying deleterious or neutral variants for different data sets, and can predict the protein functional effects of both single and multiple mutations.

Conclusions

This paper proposed a quantitative prediction method, HMMvar, to predict the effect of genetic variation using hidden Markov models. The HMM based pipeline program implementing the method HMMvar is freely available at https://bioinformatics.cs.vt.edu/zhanglab/hmm.

16.
Intraspecies genetic variability of the Baikal seal was studied for 22 proteins encoded by 24 loci. Genetic variants were detected in transferrin (TfA = 0.96; TfB = 0.04), postalbumin (PaA = 0.130; PaB = 0.870) and "slow" carboxylesterase (CrE-6A = 0.98; CrE-6B = 0.02). Low genetic variability is characteristic of this species as well as of many other representatives of the order Pinnipedia.

17.
Knowledge of kin relationships between members of wild animal populations has broad application in ecology and evolution research by allowing the investigation of dispersal dynamics, mating systems, inbreeding avoidance, kin recognition, and kin selection as well as aiding the management of endangered populations. However, the assessment of kinship among members of wild animal populations is difficult in the absence of detailed multigenerational pedigrees. Here, we first review the distinction between genetic relatedness and kinship derived from pedigrees and how this makes the identification of kin using genetic data inherently challenging. We then describe useful approaches to kinship classification, such as parentage analysis and sibship reconstruction, and explain how the combined use of marker systems with biparental and uniparental inheritance, demographic information, likelihood analyses, relatedness coefficients, and estimation of misclassification rates can yield reliable classifications of kinship in groups with complex kin structures. We outline alternative approaches for cases in which explicit knowledge of dyadic kinship is not necessary, but indirect inferences about kinship on a group‐ or population‐wide scale suffice, such as whether more highly related dyads are in closer spatial proximity. Although analysis of highly variable microsatellite loci is still the dominant approach for studies on wild populations, we describe how the long‐awaited use of large‐scale single‐nucleotide polymorphism and sequencing data derived from noninvasive low‐quality samples may eventually lead to highly accurate assessments of varying degrees of kinship in wild populations.

18.
Properdin factor B phenotypes were determined in 1,112 unrelated individuals and in 151 mother/child combinations from Northern Germany. Gene frequencies were: F = 0.1960, S = 0.7905, F1 = 0.0072, S1 = 0.0063. The data of the mother/child combinations are in full accordance with the postulated gene model.
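
As a small worked illustration of what the reported gene frequencies imply, the sketch below computes the phenotype frequencies expected if the four alleles are codominant and the population is in Hardy-Weinberg equilibrium. Both conditions are assumptions made for this example; the abstract reports only the gene frequencies, and the function name is hypothetical.

```python
from itertools import combinations_with_replacement

# Gene frequencies as reported in the abstract.
allele_frequencies = {"F": 0.1960, "S": 0.7905, "F1": 0.0072, "S1": 0.0063}


def expected_phenotype_frequencies(frequencies):
    """Expected frequency of each allele pair under Hardy-Weinberg equilibrium.

    Homozygotes have frequency p^2 and heterozygotes 2pq; with codominant
    alleles each pair corresponds to one observable phenotype.
    """
    expected = {}
    for a, b in combinations_with_replacement(sorted(frequencies), 2):
        if a == b:
            expected[(a, b)] = frequencies[a] ** 2
        else:
            expected[(a, b)] = 2.0 * frequencies[a] * frequencies[b]
    return expected


expected = expected_phenotype_frequencies(allele_frequencies)
```

Multiplying these expected frequencies by the sample size would give the counts against which observed phenotype counts could be compared.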

19.
This paper presents the impact of twins, and measures for their removal, in the population of a genetic algorithm (GA) applied to effective conformational searching. It is conclusively shown that a twin removal strategy for a GA provides considerably enhanced performance when investigating solutions to complex ab initio protein structure prediction (PSP) problems in a low-resolution model. Without twin removal, GA crossover and mutation operations can become ineffectual as generations lose their ability to produce significant differences, which can lead to the solution stalling. The paper relaxes the definition of chromosomal twins in the removal strategy to encompass not only identical but also highly correlated chromosomes within the GA population, with empirical results consistently exhibiting significant improvements in solving PSP problems.
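
A twin removal step of the kind described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: chromosomes are taken to be equal-length sequences of genes, "highly correlated" is approximated by the fraction of positions at which two chromosomes agree, and the function names and threshold parameter are hypothetical.

```python
def similarity(chromosome_a, chromosome_b):
    """Fraction of gene positions at which two equal-length chromosomes agree."""
    matches = sum(1 for a, b in zip(chromosome_a, chromosome_b) if a == b)
    return matches / len(chromosome_a)


def remove_twins(population, threshold=1.0):
    """Keep only chromosomes that are not twins of an already kept one.

    threshold=1.0 removes only identical chromosomes; a lower value
    relaxes the twin definition to include highly similar chromosomes.
    """
    survivors = []
    for chromosome in population:
        if all(similarity(chromosome, kept) < threshold for kept in survivors):
            survivors.append(chromosome)
    return survivors
```

In a full GA the removed individuals would typically be replaced, for example by freshly generated chromosomes, so that the population size stays constant; that replacement policy is left out of the sketch.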

20.

Background

In contrast to currently used single nucleotide polymorphism (SNP) panels, the use of whole-genome sequence data is expected to enable the direct estimation of the effects of causal mutations on a given trait. This could lead to higher reliabilities of genomic predictions compared to those based on SNP genotypes. Also, at each generation of selection, recombination events between a SNP and a mutation can cause decay in reliability of genomic predictions based on markers rather than on the causal variants. Our objective was to investigate the use of imputed whole-genome sequence genotypes versus high-density SNP genotypes on (the persistency of) the reliability of genomic predictions using real cattle data.

Methods

Highly accurate phenotypes based on daughter performance and Illumina BovineHD Beadchip genotypes were available for 5503 Holstein Friesian bulls. The BovineHD genotypes (631,428 SNPs) of each bull were used to impute whole-genome sequence genotypes (12,590,056 SNPs) using the Beagle software. Imputation was done using a multi-breed reference panel of 429 sequenced individuals. Genomic estimated breeding values for three traits were predicted using a Bayesian stochastic search variable selection (BSSVS) model and a genome-enabled best linear unbiased prediction model (GBLUP). Reliabilities of predictions were based on 2087 validation bulls, while the other 3416 bulls were used for training.

Results

Prediction reliabilities ranged from 0.37 to 0.52. BSSVS performed better than GBLUP in all cases. Reliabilities of genomic predictions were slightly lower with imputed sequence data than with BovineHD chip data. Also, the reliabilities tended to be lower for both sequence data and BovineHD chip data when relationships between training animals were low. No increase in persistency of prediction reliability using imputed sequence data was observed.

Conclusions

Compared to BovineHD genotype data, using imputed sequence data for genomic prediction produced no advantage. To investigate the putative advantage of genomic prediction using (imputed) sequence data, a training set with a larger number of individuals that are distantly related to each other and genomic prediction models that incorporate biological information on the SNPs or that apply stricter SNP pre-selection should be considered.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0149-x) contains supplementary material, which is available to authorized users.
