Similar articles
 20 similar articles found (search time: 15 ms)
1.
Testing hypotheses about interclass correlations from familial data   Cited by: 1 (self-citations: 0, by others: 1)
S Konishi, Biometrics, 1985, 41(1):167-176
Testing problems concerning interclass correlations from familial data are considered in the case where the number of siblings varies among families. Under the assumption of multivariate normality, two test procedures are proposed for testing the hypothesis that an interclass correlation is equal to a specified value. To compare the properties of the tests, including a likelihood ratio test, Monte Carlo experiments are performed. Several test statistics are derived for testing whether two variables about a parent and child are uncorrelated. The proposed tests are compared with previous test procedures, using Monte Carlo simulation. A general procedure for finding confidence intervals for interclass correlations is also derived.

2.
3.
Influence curves of estimators for directional data   Cited by: 1 (self-citations: 0, by others: 1)

4.
We used survey data collected from a large plot (20 ha) of sub-tropical forest in the Dinghushan Nature Reserve, Guangdong Province, southern China, in 2005 to test the comparative performance of nine species-richness estimators (number of observed species, three species-individual curve models, five nonparametric estimators). As the true species richness, we used the 210 free-standing shrub and tree species of >1 cm diameter at breast height recorded during the survey. This true species richness was then used to calculate performance measures of bias, accuracy, and precision for each estimator, whereby we distinguished performance for low, medium, and high sampling intensity. Unsurprisingly, all estimators performed better than the number of observed species in terms of bias and accuracy. Surprisingly, however, two curve models (logistic and logarithm) outperformed all other estimators in terms of bias, accuracy, and precision, which is in contrast to most previous studies, in which nonparametric methods usually outperform curve models. Intriguingly, relative estimator performance changed between low, medium, and high sampling intensity, sometimes dramatically, reinforcing the assertion that the influence of sampling intensity on estimator performance is an important aspect to investigate and to consider when choosing estimators for ecological surveys. Because these results are based on only one dataset, they should be treated with caution: (1) the generality of these results needs to be confirmed with simulated datasets and (2) more work is needed to establish what “true” species richness is extrapolated by each of the tested estimators in both the statistical and the practical sense. Nevertheless, the two curve estimators, namely Logistic and Logarithm, should be considered in future studies of comparative performance of species-richness estimators because of their outstanding performance in this study.
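The abstract above does not name its five nonparametric estimators. As an illustration of what such an estimator looks like, the following minimal sketch computes the bias-corrected Chao1 estimate, a widely used nonparametric richness estimator; choosing Chao1 here is an assumption, and the function name `chao1_bias_corrected` is hypothetical.

```python
from collections import Counter


def chao1_bias_corrected(abundances):
    """Bias-corrected Chao1 estimate of species richness.

    abundances: number of individuals recorded for each species
    (zero counts are ignored). Illustrative only; the abstract does
    not say that Chao1 was among the estimators tested.
    """
    counts = [a for a in abundances if a > 0]
    s_obs = len(counts)          # number of observed species
    freq = Counter(counts)
    f1 = freq[1]                 # species seen exactly once
    f2 = freq[2]                 # species seen exactly twice
    # Observed richness plus a correction driven by the rare species.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
```

With no singletons the correction term vanishes and the estimate equals the observed richness, which is one reason estimator performance can depend on sampling intensity.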

5.
The analysis of family-study data sometimes focuses on whether a dichotomous trait tends to cluster in families. For traits with variable age-at-onset, it may be of interest to investigate whether age-at-onset itself also exhibits familial clustering. A complication in such investigations is that censoring by age-at-ascertainment can induce artifactual familial correlation in the age-at-onset of affected members. A further complication can be that sample inclusion criteria involve the affection status of family members. The purpose here is to present an approach to testing for correlation that is not confounded by censoring by age-at-ascertainment and may be applied with a broad range of inclusion criteria. The approach involves regression statistics in which subjects' covariate terms are chosen to reflect age-at-onset information from the subjects' affected family members. The results of analyses of data from a family study of panic disorder illustrate the approach.

6.
7.
The measurement of biallelic pair-wise association, called linkage disequilibrium (LD), is important for understanding genomic architecture. A plethora of measures of association in two by two tables have been proposed in the literature. Besides the problem of choosing an appropriate measure, the problem of their estimation has been neglected in the literature. It needs to be emphasized that the definition of a measure and the choice of an estimator function for it are conceptually unrelated tasks. In this paper, we compare the performance of various estimators for the three popular LD measures D', r and Y in a simulation study for small to moderate sample sizes (N ≤ 500). The usual frequency-plug-in estimators can lead to unreliable or undefined estimates. Estimators based on the computationally expensive volume measures have been proposed recently as a remedy to this well-known problem. We confirm that volume estimators have better expected mean square error than the naive plug-in estimators. But they are outperformed by estimators that plug easy-to-calculate non-informative Bayesian probability estimates into the theoretical formulae for the measures. Fully Bayesian estimators with non-informative Dirichlet priors have comparable accuracy but are computationally more expensive. We recommend the use of non-informative Bayesian plug-in estimators based on Jeffreys' prior, in particular when dealing with SNP array data where the occurrence of small table entries and table margins is likely.
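The difference between the naive plug-in and a Jeffreys-prior plug-in can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the four haplotype counts of a 2x2 table are known, uses the posterior mean under a Dirichlet(1/2, 1/2, 1/2, 1/2) prior as the cell probability estimate, and covers only D' and r. The function name `ld_measures` and its parameters are hypothetical.

```python
import math


def ld_measures(n_AB, n_Ab, n_aB, n_ab, pseudocount=0.5):
    """Return (D, D', r) from the haplotype counts of a 2x2 table.

    pseudocount=0.5 plugs in the posterior mean under a Jeffreys
    Dirichlet(1/2, ...) prior (an assumed reading of the abstract);
    pseudocount=0 gives the naive frequency plug-in.
    """
    counts = [n_AB, n_Ab, n_aB, n_ab]
    total = sum(counts) + 4 * pseudocount
    p_AB, p_Ab, p_aB, p_ab = [(c + pseudocount) / total for c in counts]
    p_A = p_AB + p_Ab            # marginal frequency of allele A
    p_B = p_AB + p_aB            # marginal frequency of allele B
    d = p_AB - p_A * p_B         # raw disequilibrium coefficient
    # D' rescales D by its largest attainable magnitude given the margins.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    denom = math.sqrt(p_A * (1 - p_A) * p_B * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else float("nan")
    r = d / denom if denom > 0 else float("nan")
    return d, d_prime, r
```

For a table with an empty margin, such as `ld_measures(5, 0, 5, 0, pseudocount=0)`, the naive plug-in returns undefined (NaN) values for D' and r, while the default pseudocount keeps both defined.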

8.
Odds ratio estimators when the data are sparse   Cited by: 6 (self-citations: 0, by others: 6)
BRESLOW NORMAN, Biometrika, 1981, 68(1):73-84

9.
10.
Summary: At least two common practices exist when a negative variance component estimate is obtained: either setting it to zero or not reporting the estimate. The consequences of these practices are investigated in the context of intraclass correlation estimation in terms of bias, variance and mean squared error (MSE). For the one-way analysis of variance random effects model and its extension to the common correlation model, we compare five estimators: analysis of variance (ANOVA), concentrated ANOVA, truncated ANOVA and two maximum likelihood-like (ML) estimators. For the balanced case, the exact bias and MSE are calculated via numerical integration of the exact sample distributions, while a Monte Carlo simulation study is conducted for the unbalanced case. The results indicate that the ANOVA estimator performs well except for designs with family size n = 2. The two ML estimators are generally poor, and the concentrated and truncated ANOVA estimators have some advantages over the ANOVA estimator in terms of MSE. However, the large biases may make the concentrated and truncated ANOVA estimators objectionable when the intraclass correlation is small. Bias should be a concern when a pooled estimate is obtained from the literature, since the intraclass correlation is below 0.05 in many genetic studies.
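For the balanced one-way random effects model, the ANOVA estimator of the intraclass correlation can be sketched as below. This is a minimal illustration under stated assumptions: all families have the same size n, and the `truncate` flag mimics the "set a negative estimate to zero" practice described above. Whether that matches the paper's "truncated ANOVA" estimator exactly is an assumption, and the function name `anova_icc` is hypothetical.

```python
def anova_icc(groups, truncate=False):
    """ANOVA estimator of the intraclass correlation (balanced design).

    groups: list of equally sized lists, one per family.
    truncate: if True, negative estimates are replaced by zero.
    """
    k = len(groups)
    n = len(groups[0]) if groups else 0
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least two groups of equal size >= 2")
    grand = sum(sum(g) for g in groups) / (k * n)
    means = [sum(g) / n for g in groups]
    # Between-family and within-family mean squares.
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    msw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    denom = msb + (n - 1) * msw
    if denom == 0:
        raise ValueError("all observations are identical")
    rho = (msb - msw) / denom
    return max(rho, 0.0) if truncate else rho
```

The untruncated estimate is negative whenever the within-family mean square exceeds the between-family mean square, which is why truncation reduces variance at the cost of bias when the true correlation is near zero.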

11.

Background  

The preprocessing of gene expression data obtained from several platforms routinely includes the aggregation of multiple raw signal intensities to one expression value. Examples are the computation of a single expression measure based on the perfect match (PM) and mismatch (MM) probes for the Affymetrix technology, the summarization of bead level values to bead summary values for the Illumina technology or the aggregation of replicated measurements in the case of other technologies including real-time quantitative polymerase chain reaction (RT-qPCR) platforms. The summarization of technical replicates is also performed in other "-omics" disciplines like proteomics or metabolomics.

12.
13.
Selected distributional properties of the maximum likelihood estimator and its z-transformation of three familial correlations (parental, parent-offspring, filial) were investigated numerically for the case of nuclear families with variable sibship size. This investigation was based on six different sets of the three correlations, and four different sample sizes, defining 24 sampling conditions, which were replicated 1,000 times each. It was found that the distributional properties of the correlation estimator are affected by the magnitude of the correlations even in large samples although approximate normality is achieved locally. Fisher's z-transformation, here used only in its interclass form, achieves reduction of skewness, stabilization of variance, and approach to normality already in small samples, except for the filial correlation (where it may be deemed inappropriate) in smaller samples. For both the correlation estimator and its z-transformation, the (estimated) relative efficiency was shown to be high (better than 90% in most sampling conditions), suggesting that the estimated minimum variance bound is a satisfactory estimator of the sampling variance. It is concluded that the maximum likelihood estimation of familial correlations under variable sibship size is feasible and, when prudently applied, especially in the form of their z-transformations, provides an appropriate method in analyses of family studies.
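Fisher's z-transformation in its interclass form is z = atanh(r). The sketch below shows the transformation and the textbook confidence interval built from it. It assumes n independent bivariate-normal pairs, for which the variance of z is approximately 1/(n - 3); that approximation is not the variance used for familial data with variable sibship size in the study above. The function names are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher's z-transformation of a correlation, z = atanh(r)."""
    return math.atanh(r)


def fisher_z_interval(r, n, z_crit=1.959963984540054):
    """Approximate confidence interval for a correlation.

    Assumes n independent pairs (n > 3) and Var(z) ~ 1 / (n - 3);
    z_crit defaults to the two-sided 95% normal quantile.
    """
    if n <= 3:
        raise ValueError("n must exceed 3")
    z = fisher_z(r)
    half_width = z_crit / math.sqrt(n - 3)
    # Build the interval on the z scale, then map back with tanh.
    return math.tanh(z - half_width), math.tanh(z + half_width)
```

Because the interval is symmetric on the z scale and mapped back through tanh, it is asymmetric around r and always stays inside (-1, 1).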

14.
Analysis of familial data: Linear-model approach   Cited by: 1 (self-citations: 0, by others: 1)
MAK T. K.; NG K. W., Biometrika, 1981, 68(2):457-461

15.
16.
Yaari G, David G, PLoS ONE, 2012, 7(1):e30112
Recently, the "hot hand" phenomenon regained interest due to the availability and accessibility of large scale data sets from the world of sports. In support of common wisdom and in contrast to the original conclusions of the seminal paper about this phenomenon by Gilovich, Vallone and Tversky in 1985, solid evidence was supplied in favor of the existence of this phenomenon in different kinds of data. This came after almost three decades of ongoing debate about whether the "hot hand" phenomenon in sport is real or just a misperception by human subjects of completely random patterns present in reality. However, although this phenomenon was shown to exist in different sports data including basketball free throws and bowling strike rates, a somewhat deeper question remained unanswered: are these non-random patterns the result of causal, short-term feedback mechanisms or simply time fluctuations of athletes' performance? In this paper, we analyze large amounts of data from the Professional Bowling Association (PBA). We studied the results of the top 100 players in terms of the number of available records (summed into more than 450,000 frames). By using a permutation approach and dividing the analysis into different aggregation levels, we were able to supply evidence for the existence of the "hot hand" phenomenon in the data, in agreement with previous studies. Moreover, by using this approach, we were able to demonstrate that there are, indeed, significant fluctuations from game to game for the same player but there is no clustering of successes (strikes) and failures (non-strikes) within each game. Thus we were led to the conclusion that bowling results show correlation to recent past results but they are not influenced by them in a causal manner.

17.
Empirical Bayes models have been shown to be powerful tools for identifying differentially expressed genes from gene expression microarray data. An example is the WAME model, where a global covariance matrix accounts for array-to-array correlations as well as differing variances between arrays. However, the existing method for estimating the covariance matrix is very computationally intensive and the estimator is biased when the data contain many regulated genes. In this paper, two new methods for estimating the covariance matrix are proposed. The first method is a direct application of the EM algorithm for fitting the multivariate t-distribution of the WAME model. In the second method, a prior distribution for the log fold-change is added to the WAME model, and a discrete approximation is used for this prior. Both methods are evaluated using simulated and real data. The first method performs as well as the existing method in terms of bias and variability, but is superior in terms of computer time. For large data sets (>15 arrays), the second method also shows superior computer run time. Moreover, for simulated data with regulated genes the second method greatly reduces the bias. With the proposed methods it is possible to apply the WAME model to large data sets with reasonable computer run times. The second method shows a small bias for simulated data, but appears to have a larger bias for real data with many regulated genes.

18.

Background  

Ruminant mycoplasmoses are important diseases worldwide and several are listed by the World Organization for Animal Health to be of major economic significance. In France, the distribution of mycoplasmal species isolated from clinical samples collected from diseased animals upon veterinary request is monitored by a network known as VIGIMYC (for VIGIlance to MYCoplasmoses of ruminants). The veterinary diagnostic laboratories collaborating with VIGIMYC are responsible for isolating the mycoplasmas while identification of the isolates is centralized by the French Food Safety Agency (AFSSA) in Lyon. The VIGIMYC framework can also be used for specific surveys and one example, on the prevalence of M. bovis in bovine respiratory diseases, is presented here.

19.
Genome-wide association studies have been instrumental in identifying genetic variants associated with complex traits such as human disease or gene expression phenotypes. It has been proposed that extending existing analysis methods by considering interactions between pairs of loci may uncover additional genetic effects. However, the large number of possible two-marker tests presents significant computational and statistical challenges. Although several strategies to detect epistasis effects have been proposed and tested for specific phenotypes, so far there has been no systematic attempt to compare their performance using real data. We made use of thousands of gene expression traits from linkage and eQTL studies, to compare the performance of different strategies. We found that using information from marginal associations between markers and phenotypes to detect epistatic effects yielded a lower false discovery rate (FDR) than a strategy solely using biological annotation in yeast, whereas results from human data were inconclusive. For future studies whose aim is to discover epistatic effects, we recommend incorporating information about marginal associations between SNPs and phenotypes instead of relying solely on biological annotation. Improved methods to discover epistatic effects will result in a more complete understanding of complex genetic effects.

20.
Estimating pairwise correlation from replicated genome-scale (a.k.a. OMICS) data is fundamental to cluster functionally relevant biomolecules to a cellular pathway. The popular Pearson correlation coefficient estimates bivariate correlation by averaging over replicates. It is not completely satisfactory since it introduces strong bias while reducing variance. We propose a new multivariate correlation estimator that models all replicates as independent and identically distributed (i.i.d.) samples from the multivariate normal distribution. We derive the estimator by maximizing the likelihood function. For small sample data, we provide a resampling-based statistical inference procedure, and for moderate to large sample data, we provide an asymptotic statistical inference procedure based on the Likelihood Ratio Test (LRT). We demonstrate advantages of the new multivariate correlation estimator over Pearson bivariate correlation estimator using simulations and real-world data analysis examples. AVAILABILITY: The estimator and statistical inference procedures have been implemented in an R package 'CORREP' that is available from CRAN [http://cran.r-project.org] and Bioconductor [http://www.bioconductor.org/]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
