Similar literature
1.
Rosner B, Glynn RJ. Biometrics 2009, 65(1):188-197
Summary. The Wilcoxon Mann-Whitney (WMW) U test is commonly used in nonparametric two-group comparisons when the normality of the underlying distribution is questionable. There has been some previous work on estimating power based on this procedure (Lehmann, 1998, Nonparametrics). In this article, we present an approach for estimating type II error, which is applicable to any continuous distribution, and also extend the approach to handle grouped continuous data allowing for ties. We apply these results to obtaining standard errors of the area under the receiver operating characteristic curve (AUROC) for risk-prediction rules under H1 and for comparing AUROC between competing risk-prediction rules applied to the same data set. These results are based on SAS-callable functions to evaluate the bivariate normal integral and are thus easily implemented with standard software.
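The link between the WMW U statistic and the AUROC mentioned in the abstract above can be illustrated with a minimal sketch: the AUROC equals U divided by the number of case-control pairs, with ties counted as one half. The function name `auroc_from_scores` and the scores are hypothetical and are not taken from the paper, and the paper's standard-error and type II error calculations are not reproduced here.

```python
def auroc_from_scores(cases, controls):
    """Return the AUROC as U / (m * n).

    U counts case-control pairs in which the case scores higher;
    tied pairs contribute one half.
    """
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u / (len(cases) * len(controls))


# Hypothetical risk scores, for illustration only.
cases = [0.9, 0.8, 0.35, 0.6]
controls = [0.1, 0.4, 0.35, 0.2]
print(auroc_from_scores(cases, controls))  # 0.90625
```

The double loop is quadratic in the sample size; a rank-based computation would be preferable for large data sets.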

2.
Assessing the diagnostic accuracy of a sequence of tests
We consider the assessment of the overall diagnostic accuracy of a sequence of tests (e.g. repeated screening tests). The complexity of diagnostic choices when two or more continuous tests are used in sequence is illustrated, and different approaches to reducing the dimensionality are presented and evaluated. For instance, in practice, when a single test is used repeatedly in routine screening, the same screening threshold is typically used at each screening visit. One possible alternative is to adjust the threshold at successive visits according to individual-specific characteristics. Such possibilities represent a particular slice of a receiver operating characteristic surface, corresponding to all possible combinations of test thresholds. We focus in the development and examples on the setting where an overall test is defined to be positive if any of the individual tests are positive ('believe the positive'). The ideas developed are illustrated by an example of application to screening for prostate cancer using prostate-specific antigen.
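The 'believe the positive' rule described above can be sketched as follows: a subject is classified positive if any test in the sequence exceeds its threshold, and each combination of thresholds gives one point on the ROC surface. The function names, the measurements, and the thresholds of 4.0 and 5.0 are hypothetical and chosen only for illustration; they are not values from the paper.

```python
def believe_the_positive(results, thresholds):
    """Overall test is positive if any individual test exceeds its threshold."""
    return any(r > t for r, t in zip(results, thresholds))


def sensitivity_specificity(subjects, thresholds):
    """subjects: list of (results, diseased) pairs, one result per visit."""
    tp = fn = tn = fp = 0
    for results, diseased in subjects:
        positive = believe_the_positive(results, thresholds)
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical two-visit measurements, for illustration only.
subjects = [
    ((2.1, 4.6), True), ((5.0, 5.5), True), ((1.0, 1.2), True),
    ((3.9, 3.0), False), ((4.2, 2.0), False),
    ((1.5, 1.1), False), ((0.8, 0.9), False),
]
# Same threshold at both visits versus a stricter threshold at the second visit.
print(sensitivity_specificity(subjects, (4.0, 4.0)))
print(sensitivity_specificity(subjects, (4.0, 5.0)))
```

Sweeping both thresholds over a grid would trace out the surface of which a fixed-threshold policy is one slice.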

3.
Receiver operating characteristic (ROC) curves are used to describe the performance of diagnostic procedures. This paper proposes a simple method for the statistical comparison of two ROC curves derived from the same set of patients and the same set of healthy subjects. Generalization to studies involving more than two screening factors is straightforward. This method does not require the calculation of variances of the areas or difference of areas under the curves.

4.
5.
Uno H, Cai T, Tian L, Wei LJ. Biometrics 2011, 67(4):1389-1396
Quantitative procedures for evaluating added values from new markers over a conventional risk scoring system for predicting event rates at specific time points have been extensively studied. However, a single summary statistic, for example, the area under the receiver operating characteristic curve or its derivatives, may not provide a clear picture about the relationship between the conventional and the new risk scoring systems. When there are no censored event time observations in the data, two simple scatterplots with individual conventional and new scores for "cases" and "controls" provide valuable information regarding the overall and the subject-specific level incremental values from the new markers. Unfortunately, in the presence of censoring, it is not clear how to construct such plots. In this article, we propose a nonparametric estimation procedure for the distributions of the differences between two risk scores conditional on the conventional score. The resulting quantile curves of these differences over the subject-specific conventional score provide extra information about the overall added value from the new marker. They also help us to identify a subgroup of future subjects who need the new predictors, especially when there is no unified utility function available for cost-risk-benefit decision making. The procedure is illustrated with two data sets. The first is from a well-known Mayo Clinic primary biliary cirrhosis liver study. The second is from a recent breast cancer study on evaluating the added value from a gene score, which is relatively expensive to measure compared with the routinely used clinical biomarkers for predicting the patient's survival after surgery.

6.
Bondugula R, Xu D. Proteins 2007, 66(3):664-670
Predicting secondary structures from a protein sequence is an important step for characterizing the structural properties of a protein. Existing methods for protein secondary structure prediction can be broadly classified into template based or sequence profile based methods. We propose a novel framework that bridges the gap between the two fundamentally different approaches. Our framework integrates the information from the fuzzy k-nearest neighbor algorithm and position-specific scoring matrices using a neural network. It combines the strengths of the two methods and has a better potential to use the information in both the sequence and structure databases than existing methods. We implemented the framework into a software system MUPRED. MUPRED has achieved three-state prediction accuracy (Q3) ranging from 79.2 to 80.14%, depending on which benchmark dataset is used. A higher Q3 can be achieved if a query protein has a significant sequence identity (>25%) to a template in PDB. MUPRED also estimates the prediction accuracy at the individual residue level more quantitatively than existing methods. The MUPRED web server and executables are freely available at http://digbio.missouri.edu/mupred.
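The three-state accuracy (Q3) quoted above is the percentage of residues whose predicted state (helix H, strand E, or coil C) matches the observed state. A minimal sketch follows; the function name `q3_accuracy` and the two ten-residue strings are hypothetical and unrelated to MUPRED's benchmarks.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical ten-residue example: one residue (the fourth) is mispredicted.
observed = "HHHHEEEECC"
predicted = "HHHEEEEECC"
print(q3_accuracy(predicted, observed))  # 90.0
```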

7.
Quantitative genetic dissection of complex traits in a QTL-mapping pedigree
This paper summarizes and modifies quantitative genetic analyses on a pedigree used to map genetic factors (i.e., QTLs) underlying a complex trait. The total genetic variance can be exactly estimated based on the F2 family derived from two homozygous parents for alternative alleles at all QTLs of interest. The parents, F1 hybrids, and two backcrosses are combined to each parent, and the total number of QTLs and the number of dominant QTLs are estimated under the assumptions of gene association with the two parents, equal gene effect, no linkage, and no epistasis among QTLs. Further relaxations of each of these assumptions are made in detail. The biometric estimator for the QTL number and action mode averaged over the entire genome could provide some basic and complementary information to QTL mapping designed to detect the effect and location of specific genetic factors.

8.
We compare several nonparametric and parametric weighting methods for the adjustment of the effect of strata. In particular, we focus on the adjustment methods in the context of receiver‐operating characteristic (ROC) analysis. Nonparametrically, rank‐based van Elteren's test and inverse‐variance (IV) weighting using the area under the ROC curve (AUC) are examined. Parametrically, the stratified t‐test and IV AUC weighted method are applied based on a binormal monotone transformation model. Stratum‐specific, pooled, and adjusted estimates are obtained. The pooled and adjusted AUCs are estimated. We illustrate and compare these weighting methods on a multi‐center diagnostic trial and through extensive Monte‐Carlo simulations.
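Inverse-variance weighting of stratum-specific AUCs, as named in the abstract above, can be sketched in its generic textbook form: each stratum estimate is weighted by the reciprocal of its variance, and the combined estimate has variance equal to the reciprocal of the summed weights. The function name and the stratum values are hypothetical, and how the paper estimates the stratum variances is not shown here.

```python
def inverse_variance_combined(estimates, variances):
    """Combine stratum estimates with weights 1 / variance.

    Returns the weighted estimate and its variance, 1 / sum(weights).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total
    return combined, 1.0 / total


# Hypothetical stratum-specific AUCs and variances, for illustration only.
aucs = [0.80, 0.70]
variances = [0.01, 0.04]
adjusted_auc, adjusted_var = inverse_variance_combined(aucs, variances)
print(adjusted_auc, adjusted_var)  # approximately 0.78 and 0.008
```

The more precisely estimated stratum dominates: here the first stratum carries four times the weight of the second.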

9.
10.
Genetic prediction for complex traits is usually based on models including individual (infinitesimal) or marker effects. Here, we concentrate on models including both the individual and the marker effects. In particular, we develop a “Mendelian segregation” model combining infinitesimal effects for base individuals and realized Mendelian sampling in descendants described by the available DNA data. The model is illustrated with an example and the analyses of a public simulated data file. Further, the potential contribution of such models is assessed by simulation. Accuracy, measured as the correlation between true (simulated) and predicted genetic values, was similar for all models compared under different genetic backgrounds. As expected, the segregation model is worthwhile when markers capture a low fraction of total genetic variance.

11.
Economically important reproduction traits in sheep, such as number of lambs weaned and litter size, are expressed only in females and later in life after most selection decisions are made, which makes them ideal candidates for genomic selection. Accurate genomic predictions would lead to greater genetic gain for these traits by enabling accurate selection of young rams with high genetic merit. The aim of this study was to design and evaluate the accuracy of a genomic prediction method for female reproduction in sheep using daughter trait deviations (DTD) for sires and ewe phenotypes (when individual ewes were genotyped) for three reproduction traits: number of lambs born (NLB), litter size (LSIZE) and number of lambs weaned. Genomic best linear unbiased prediction (GBLUP), BayesR and pedigree BLUP analyses of the three reproduction traits measured on 5340 sheep (4503 ewes and 837 sires) with real and imputed genotypes for 510 174 SNPs were performed. The prediction of breeding values using both sire and ewe trait records was validated in Merino sheep. Prediction accuracy was evaluated by across sire family and random cross‐validations. Accuracies of genomic estimated breeding values (GEBVs) were assessed as the mean Pearson correlation adjusted by the accuracy of the input phenotypes. The addition of sire DTD into the prediction analysis resulted in higher accuracies compared with using only ewe records in genomic predictions or pedigree BLUP. Using GBLUP, the average accuracy based on the combined records (ewes and sire DTD) was 0.43 across traits, but the accuracies varied by trait and type of cross‐validations. The accuracies of GEBVs from random cross‐validations (range 0.17–0.61) were higher than were those from sire family cross‐validations (range 0.00–0.51). The GEBV accuracies of 0.41–0.54 for NLB and LSIZE based on the combined records were amongst the highest in the study. 
Although BayesR was not significantly different from GBLUP in prediction accuracy, it identified several candidate genes which are known to be associated with NLB and LSIZE. The approach provides a way to make use of all data available in genomic prediction for traits that have limited recording.

12.
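Identifying genetic associations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. Although many tools exist for estimating genetic associations between complex traits and diseases, some have poorly readable code, different tools are written in different programming languages, and the tools are difficult to chain together. This study therefore presents SCtool, an open-source, cross-platform, and user-friendly software tool based on genome-wide association study (GWAS) data. SCtool integrates three packages: ldsc, TwosampleMR, and MR-BMA. Its main function is to use GWAS summary-level data to identify genetic correlations between complex traits and diseases, between pairs of complex traits, and between pairs of diseases, and to explore potential causal associations among them. Finally, SCtool was used to reveal the genetic association between systemic iron status (ferritin, serum iron, transferrin, and transferrin saturation) and the epigenetic clock GrimAge.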

13.
We combine a new, extremely fast technique to generate a library of low energy structures of an oligopeptide (by using mutually orthogonal Latin squares to sample its conformational space) with a genetic algorithm to predict protein structures. The protein sequence is divided into oligopeptides, and a structure library is generated for each. These libraries are used in a newly defined mutation operator that, together with variation, crossover, and diversity operators, is used in a modified genetic algorithm to make the prediction. Application to five small proteins has yielded near native structures.

14.
Genomic best linear unbiased prediction (GBLUP) assumes equal variance for all marker effects, which is suitable for traits that conform to the infinitesimal model. For traits controlled by major genes, Bayesian methods with shrinkage priors or genome-wide association study (GWAS) methods can be used to identify causal variants effectively. The information from Bayesian/GWAS methods can be used to construct the weighted genomic relationship matrix (G). However, it remains unclear which methods perform best for traits varying in genetic architecture. Therefore, we developed several methods to optimize the performance of weighted GBLUP and compare them with other available methods using simulated and real data sets. First, two types of methods (marker effects with local shrinkage or normal prior) were used to obtain test statistics and estimates for each marker effect. Second, three weighted G matrices were constructed based on the marker information from the first step: (1) the genomic-feature-weighted G, (2) the estimated marker-variance-weighted G, and (3) the absolute value of the estimated marker-effect-weighted G. Following the above process, six different weighted GBLUP methods (local shrinkage/normal-prior GF/EV/AEWGBLUP) were proposed for genomic prediction. Analyses with both simulated and real data demonstrated that these options offer flexibility for optimizing the weighted GBLUP for traits with a broad spectrum of genetic architectures. The advantage of weighting methods over GBLUP in terms of accuracy was trait dependent, ranging from 14.8% to marginal for simulated traits and from 44% to marginal for real traits. Local-shrinkage prior EVWGBLUP is superior for traits mainly controlled by loci of large effect. Normal-prior AEWGBLUP performs well for traits mainly controlled by loci of moderate effect. For traits controlled by some loci with large effects (explaining 25–50% of the genetic variance) and a range of loci with small effects, GFWGBLUP has advantages. 
In conclusion, the choice of the optimal weighted GBLUP method for genomic selection should carefully take into account both the genetic architecture and the number of QTLs of the traits.
Subject terms: Quantitative trait, Genome-wide association studies, Animal breeding
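A weighted genomic relationship matrix of the kind discussed above can be sketched with one common construction, G = Z D Z' / (2 Σ p(1 − p)), where Z holds centered genotypes and D is a diagonal matrix of marker weights. This form, the rescaling of weights to an average of one, the function name, and the toy genotypes are all assumptions made for illustration; the abstract does not state the formulas behind the GF, EV, or AE weightings.

```python
def weighted_g_matrix(genotypes, weights):
    """Weighted genomic relationship matrix, G = Z D Z' / (2 * sum(p * (1 - p))).

    genotypes: one row per individual, allele counts coded 0/1/2.
    weights: one non-negative weight per marker (assumed form; rescaled to mean 1).
    Requires at least one polymorphic marker, otherwise the scale is zero.
    """
    n, m = len(genotypes), len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    mean_w = sum(weights) / m
    d = [w / mean_w for w in weights]
    # Center each marker by twice its allele frequency.
    z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    return [
        [sum(z[a][j] * d[j] * z[b][j] for j in range(m)) / scale for b in range(n)]
        for a in range(n)
    ]


# Toy data: three individuals, three markers; the first marker is up-weighted.
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
for row in weighted_g_matrix(genotypes, [3.0, 1.0, 1.0]):
    print([round(x, 3) for x in row])
```

With equal weights the function reduces to an unweighted matrix of the same form, which makes it easy to compare weighted and unweighted predictions on the same data.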

15.
Genomic prediction has been widely utilized to estimate genomic breeding values (GEBVs) in farm animals. In this study, we conducted genomic prediction for 20 economically important traits including growth, carcass and meat quality traits in Chinese Simmental beef cattle. Five approaches (GBLUP, BayesA, BayesB, BayesCπ and BayesR) were used to estimate the genomic breeding values. The predictive accuracies ranged from 0.159 (lean meat percentage estimated by BayesCπ) to 0.518 (striploin weight estimated by BayesR). Moreover, we found that the average predictive accuracies across 20 traits were 0.361, 0.361, 0.367, 0.367 and 0.378, and the averaged regression coefficients were 0.89, 0.86, 0.89, 0.94 and 0.95 for GBLUP, BayesA, BayesB, BayesCπ and BayesR respectively. The genomic prediction accuracies were mostly moderate to high for growth and carcass traits, whereas meat quality traits showed relatively low accuracies. We concluded that Bayesian regression approaches, especially BayesR and BayesCπ, were slightly superior to GBLUP for most traits. As the size of the reference population increases, these two approaches will be feasible for future application of genomic selection in Chinese beef cattle.
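The two summary measures reported above can be sketched under a common convention, which is an assumption here because the abstract does not define them: predictive accuracy as the Pearson correlation between GEBVs and phenotypes in a validation set, and the regression coefficient as the slope of phenotype on GEBV (values near 1 indicating little inflation or deflation). The function name and the four data points are hypothetical.

```python
import math


def accuracy_and_slope(gebv, phenotype):
    """Pearson correlation and regression slope of phenotype on GEBV."""
    n = len(gebv)
    mean_x = sum(gebv) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in gebv)
    syy = sum((y - mean_y) ** 2 for y in phenotype)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gebv, phenotype))
    return sxy / math.sqrt(sxx * syy), sxy / sxx


# Hypothetical validation-set values, for illustration only.
gebv = [0.0, 1.0, 2.0, 3.0]
phenotype = [0.5, 1.0, 2.5, 3.0]
r, b = accuracy_and_slope(gebv, phenotype)
print(round(r, 3), round(b, 3))  # 0.976 0.9
```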

16.
This contribution moves in the direction of answering some general questions about the most effective and useful ways of modelling bioprocesses. We investigate the characteristics of models that are good at extrapolating. We trained three fully predictive models with different representational structures (differential equations, differential equations with inheritance of rates and a network of reactions) on Saccharopolyspora erythraea shake flask fermentation data using genetic programming. The models were then tested on unseen data outside the range of the training data and the resulting performances were compared. It was found that constrained models with mathematical forms analogous to internal mass balancing and stoichiometric relations were superior to flexible unconstrained models, even though no a priori knowledge of this fermentation was used.
Paper presented at the international conference on trends in monitoring and control of life science applications, 7–8 October 2002, Lyngby, Denmark.

17.

Aim

Global-scale maps of the environment are an important source of information for researchers and decision makers. Often, these maps are created by training machine learning algorithms on field-sampled reference data using remote sensing information as predictors. Since field samples are often sparse and clustered in geographic space, model prediction requires a transfer of the trained model to regions where no reference data are available. However, recent studies question the feasibility of predictions far beyond the location of training data.

Innovation

We propose a novel workflow for spatial predictive mapping that leverages recent developments in this field and combines them in innovative ways with the aim of improved model transferability and performance assessment. We demonstrate, evaluate and discuss the workflow with data from recently published global environmental maps.

Main conclusions

Reducing predictors to those relevant for spatial prediction leads to an increase of model transferability and map accuracy without a decrease of prediction quality in areas with high sampling density. Still, reliable gap-free global predictions were not possible, highlighting that global maps and their evaluation are hampered by limited availability of reference data.

18.
We consider the power and sample size calculation of diagnostic studies with normally distributed multiple correlated test results. We derive test statistics and obtain power and sample size formulas. The methods are illustrated using an example of comparison of CT and PET scanners for detecting extra-hepatic disease for colorectal cancer.

19.
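This study aimed to establish a fluorescence immunochromatographic assay for human cartilage oligomeric matrix protein (COMP) that can be used as an aid in the diagnosis of rheumatoid arthritis (RA). Immunochromatographic test strips were prepared on the double-antibody sandwich principle, and their performance was evaluated and compared methodologically. A receiver operating characteristic (ROC) curve was obtained by testing clinical samples, and the sensitivity, specificity, and positive and negative predictive values of the strips were calculated. The linear range was 0.39–50.00 ng/mL; the intra-batch and inter-batch coefficients of variation were both below 15%; after accelerated degradation of the strips at 37 °C for 20 days, the fluorescence signal intensity varied by no more than 15%; there was no cross-reactivity with rheumatoid factor (RF) or anti-cyclic citrullinated peptide (anti-CCP) antibodies; and parallel testing of 48 clinical serum samples with an ELISA kit showed good correlation. When samples were tested with the strips prepared in this study, the COMP cut-off value for distinguishing RA patients from healthy individuals was 22.55 ng/mL (sensitivity 0.821, specificity 0.842, positive predictive value 0.741, negative predictive value ...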
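The four quantities reported at the cut-off above follow directly from a two-by-two table of test result against disease status. The sketch below uses the 22.55 ng/mL cut-off from the abstract, but the function name, the concentrations, and the assumption that values at or above the cut-off count as positive are hypothetical; the study's own data are not reproduced.

```python
def diagnostic_metrics(patient_values, healthy_values, cutoff):
    """Sensitivity, specificity, PPV and NPV, calling values >= cutoff positive."""
    tp = sum(v >= cutoff for v in patient_values)
    fn = len(patient_values) - tp
    fp = sum(v >= cutoff for v in healthy_values)
    tn = len(healthy_values) - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical COMP concentrations in ng/mL, for illustration only.
patients = [30.1, 25.0, 18.2, 40.5]
healthy = [10.0, 23.0, 15.5, 12.1, 8.7]
print(diagnostic_metrics(patients, healthy, cutoff=22.55))
```

Unlike sensitivity and specificity, the predictive values depend on the proportion of patients in the sample, so they do not transfer directly to populations with a different disease prevalence.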

20.
Summary. The most common method for genetic evaluation when parents are unknown is best linear unbiased prediction with genetic groups (BLUP-G). With this method unknown parents are assumed to be unrelated to any other animals in the population. This assumption is unrealistic in most situations. If a finite number of potential parents can be identified and the probabilities of being the true parent can be assigned to these, genetic evaluation can be obtained given the uncertainty of parentage without introducing genetic groups into the model. The correct numerator relationship matrix with uncertain parentage is derived. Rules are given to efficiently compute this matrix and its inverse. Computer simulation was used to compare BLUP-G with BLUP using this matrix. The simulated population consisted of ten sires and 200 dams per breeding season. The dams were always known; the sires were unknown for 10% or 30% of the males and 30% of the females. The number of potential sires was three (BLUP-1) or ten (BLUP-2), including the true sire in both cases. Equal probabilities were assigned to each potential sire. The increase in response with BLUP-1 and BLUP-2 relative to BLUP-G ranged from 4% to 8% in the fifth breeding season. Selection with BLUP-1 or BLUP-2 resulted in higher inbreeding, 17% and 12%, respectively, than with BLUP-G. Estimates of response to selection were unbiased with BLUP-1 and BLUP-2, but not with BLUP-G. Mean square error of estimated genetic means and mean prediction error variance were higher with BLUP-G than with BLUP-1 or BLUP-2.
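The idea of a relationship matrix under uncertain parentage can be sketched with the tabular method, replacing the sire's contribution by a probability-weighted average over the candidate sires. This is a simplified illustration that assumes sire assignments are independent across animals; it is not the paper's derivation and does not cover its rules for the inverse. The function name, the pedigree encoding, and the four-animal example are hypothetical.

```python
def average_relationship_matrix(pedigree):
    """Tabular-method relationship matrix with probability-weighted candidate sires.

    pedigree: list of (sires, dam), where sires maps a candidate sire's index to
    its probability (empty if unknown) and dam is an index or None.
    Parents must be listed before their offspring.
    """
    n = len(pedigree)
    a = [[0.0] * n for _ in range(n)]
    for i, (sires, dam) in enumerate(pedigree):
        for j in range(i):
            via_sire = sum(p * a[j][s] for s, p in sires.items())
            via_dam = a[j][dam] if dam is not None else 0.0
            a[i][j] = a[j][i] = 0.5 * (via_sire + via_dam)
        inbreeding = 0.0
        if dam is not None:
            # Expected half-relationship between the dam and the candidate sires.
            inbreeding = 0.5 * sum(p * a[s][dam] for s, p in sires.items())
        a[i][i] = 1.0 + inbreeding
    return a


# Animals 0 and 1 are candidate sires, 2 is the dam; animal 3 has two
# equally likely sires, as with the equal probabilities in the abstract.
pedigree = [({}, None), ({}, None), ({}, None), ({0: 0.5, 1: 0.5}, 2)]
for row in average_relationship_matrix(pedigree):
    print(row)
```

With a single candidate sire of probability one, the recursion reduces to the usual tabular method for a fully known pedigree.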
