Similar Documents
 20 similar documents retrieved (search time: 78 ms)
1.
OBJECTIVE: In affected sib pair studies without genotyped parents, the effect of genotyping error is generally to reduce the type I error rate and power of tests for linkage. The effect of genotyping error when parents have been genotyped is unknown. We investigated the type I error rate of the single-point Mean test for studies in which genotypes of both parents are available. METHODS: Datasets were simulated assuming no linkage and one of five models for genotyping error. In each dataset, Mendelian-inconsistent families were either excluded or regenotyped, and the Mean test was then applied. RESULTS: We found that genotyping errors lead to an inflated type I error rate when inconsistent families are excluded. Depending on the genotyping-error model assumed, regenotyping inconsistent families has one of several effects: it may produce the same type I error rate as excluding inconsistent families; it may reduce the type I error but still leave an anti-conservative test; or it may give a conservative test. Departures of the type I error rate from its nominal level increase with both the genotyping error rate and the sample size. CONCLUSION: We recommend that markers with high error rates either be excluded from the analysis or be regenotyped in all families.
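A minimal sketch of the single-point Mean test for affected sib pairs, with a Monte Carlo check of its nominal type I error under no linkage. The function name, the standard null IBD proportions (1/4, 1/2, 1/4), and all simulation settings are illustrative assumptions, not the authors' code or their five error models:

```python
import numpy as np
from scipy.stats import norm

def mean_test(n0, n1, n2):
    """Single-point Mean test for affected sib pairs.

    n0, n1, n2: numbers of pairs sharing 0, 1, 2 alleles IBD.
    Under H0 (no linkage) the expected total sharing is n with
    variance n/2, so T is asymptotically standard normal (one-sided).
    """
    n = n0 + n1 + n2
    shared = n1 + 2 * n2
    t = (shared - n) / np.sqrt(n / 2)
    return t, norm.sf(t)

# Monte Carlo check of the nominal type I error under H0:
rng = np.random.default_rng(1)
alpha, reps, n_pairs = 0.05, 20000, 200
rejections = 0
for _ in range(reps):
    n0, n1, n2 = rng.multinomial(n_pairs, [0.25, 0.5, 0.25])
    t, p = mean_test(n0, n1, n2)
    rejections += p < alpha
print(f"empirical type I error: {rejections / reps:.3f}")  # close to 0.05
```

Genotyping error would enter such a sketch by perturbing the IBD counts before testing, with or without exclusion of the resulting Mendelian-inconsistent families.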

2.
A method to quantify the error probability in the Kirchhoff-law-Johnson-noise (KLJN) secure key exchange is introduced. The types of errors due to statistical inaccuracies in noise voltage measurements are classified, and the error probability is calculated. The most interesting finding is that the error probability decays exponentially with the duration of the time window of a single bit exchange. The results indicate that the error probability of the exchanged bits can be made so small that error-correction algorithms are not required. The results are demonstrated with practical considerations.
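The exponential decay is easy to reproduce in a toy model: decide each bit by thresholding a noise-variance estimate and watch the misclassification rate fall as the bit window lengthens. The noise levels, threshold, and window sizes below are assumptions for illustration, not values from the paper:

```python
import numpy as np

# Hypothetical noise standard deviations for the two resistor (bit) states
sigma_lo, sigma_hi = 1.0, 1.5
thr = sigma_lo * sigma_hi          # threshold on the variance estimate
rng = np.random.default_rng(7)
trials = 50000

print("window  P(error | low state)")
for n in (25, 50, 100, 150):
    x = rng.normal(0.0, sigma_lo, size=(trials, n))
    var_est = (x ** 2).mean(axis=1)   # variance estimated over the bit window
    p_err = (var_est > thr).mean()    # misclassified as the high state
    print(f"{n:6d}  {p_err:.5f}")     # falls roughly exponentially in n
```

By large-deviation arguments the error probability scales like exp(-n·I) for a positive rate I, so longer windows (more trials) are needed to resolve the very small probabilities.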

3.
Spencer BD. Biometrics 2012, 68(2): 559-566
Latent class models are increasingly used to assess the accuracy of medical diagnostic tests and other classifications when no gold standard is available and the true state is unknown. When the latent class is treated as the true class, latent class models provide measures of components of accuracy, including specificity and sensitivity and their complements, the type I and type II error rates. The error rates according to the latent class model differ from the true error rates, however, and empirical comparisons with a gold standard suggest the true error rates are often larger. We investigate conditions under which the true type I and type II error rates are larger than those provided by the latent class models. Results from Uebersax (1988, Psychological Bulletin 104, 405-416) are extended to accommodate random effects and covariates affecting the responses. The results are important for interpreting the results of latent class analyses. An error decomposition is presented that incorporates an error component arising from invalidity of the latent class model.

4.
Pedigree data can be evaluated, and subsequently corrected, by analysis of the distribution of genetic markers, taking account of the possibility of mistyping. Using a model of pedigree error developed previously, we obtained the maximum likelihood estimates of error parameters in pedigree data from Tokelau. Posterior probabilities for the possible true relationships in each family, conditional on the putative relationships and the marker data, are calculated using the parameter estimates. These probabilities are used as a basis for discriminating between pedigree error and genetic marker error in families where inconsistencies have been observed. When applied to the Tokelau data and compared with the results of retyping inconsistent families, these statistical procedures are able to discriminate between pedigree and marker error with approximately 90% accuracy for families with two or more offspring. The large proportion of inconsistencies inferred to be due to marker error (61%) indicates the importance of discriminating between error sources when judging the reliability of putative relationship data. Application of our model of pedigree error has proved to be an efficient way of determining, and subsequently correcting, sources of error in extensive pedigree data collected in large surveys.

5.
Error hypersurfaces are valuable to study because of their unique status in multilayer perceptron research. For a given multilayer perceptron architecture, different pattern sets give rise to different error hypersurfaces. Using group theory and Pólya's enumeration theorem, this paper constructs classes of congruent pattern sets and classes of congruent error hypersurfaces, and proves that the number of classes of congruent pattern sets equals the number of classes of congruent error hypersurfaces. The calculations yield far fewer classes of congruent error hypersurfaces than total error hypersurfaces, and show that as the input dimension N increases, the former number grows at a much lower rate than the latter, thus simplifying the analysis of the complexity of error hypersurface classes.

6.
Question: Predictive vegetation modelling relies on environmental variables, which are usually derived from a base data set with some level of error, and this error is propagated to any subsequently derived environmental variables. The question for this study is: what is the level of error and uncertainty in environmental variables arising from error propagated from a Digital Elevation Model (DEM), and how does it vary between direct and indirect variables? Location: Kioloa region, New South Wales, Australia. Methods: The level of error in a DEM is assessed and used to develop an error model for analysing error propagation to derived environmental variables. We tested both indirect (elevation, slope, aspect, topographic position) and direct (average air temperature, net solar radiation, and topographic wetness index) variables for their robustness to propagated error from the DEM. Results: The direct environmental variable net solar radiation is less affected by error in the DEM than the indirect variables aspect and slope, although regional conditions such as slope steepness and cloudiness can influence this outcome. However, the indirect environmental variable topographic position was less affected by error in the DEM than the topographic wetness index. Interestingly, the results contradict the common assumption that indirect variables are necessarily less sensitive to propagated error because they are less derived. Conclusions: The results indicate that variables exhibit both systematic bias and instability under uncertainty. There is a clear need to consider the sensitivity of variables to error in their base data sets, in addition to the question of whether to use direct or indirect variables.

7.
Cox DG, Kraft P. Human Heredity 2006, 61(1): 10-14
Deviation from Hardy-Weinberg equilibrium has become an accepted test for genotyping error. While it is generally considered that testing departures from Hardy-Weinberg equilibrium to detect genotyping error is not sensitive, little has been done to quantify this sensitivity. We therefore examined various models of genotyping error, including error caused by neighboring SNPs that degrade the performance of genotyping assays, and calculated the power of chi-square goodness-of-fit tests for deviation from Hardy-Weinberg equilibrium to detect such error. We also examined the effects of neighboring SNPs on risk estimates in the setting of case-control association studies. We modeled the power of departure from Hardy-Weinberg equilibrium as a test to detect genotyping error and quantified the effect of genotyping error on disease risk estimates. Generally, genotyping error does not generate sufficient deviation from Hardy-Weinberg equilibrium to be detected. As expected, genotyping error due to neighboring SNPs attenuates risk estimates, often drastically. For the moment, the most widely accepted method of detecting genotyping error is to confirm genotypes by sequencing and/or genotyping via a separate method; while these methods are fairly reliable, they are also costly and time consuming.
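A minimal sketch of the kind of power calculation described here: simulate genotypes under a toy dropout-style error model and count how often a 1-df chi-square goodness-of-fit test for Hardy-Weinberg equilibrium rejects. The error model, allele frequency, and sample sizes are assumptions for illustration, not the paper's models:

```python
import numpy as np
from scipy.stats import chi2

def hwe_chisq(n_aa, n_ab, n_bb):
    """1-df chi-square goodness-of-fit test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    exp = n * np.array([p ** 2, 2 * p * (1 - p), (1 - p) ** 2])
    obs = np.array([n_aa, n_ab, n_bb])
    stat = ((obs - exp) ** 2 / exp).sum()
    return stat, chi2.sf(stat, df=1)

# Toy error model (assumed, not the paper's): each heterozygote is
# miscalled as a random homozygote with probability err.
rng = np.random.default_rng(3)
def simulate_power(n=1000, p=0.3, err=0.05, alpha=0.05, reps=2000):
    hits = 0
    for _ in range(reps):
        g = rng.choice(3, size=n, p=[p**2, 2*p*(1-p), (1-p)**2])  # 0=AA,1=Aa,2=aa
        het = g == 1
        flip = het & (rng.random(n) < err)
        g[flip] = rng.choice([0, 2], size=flip.sum())
        counts = np.bincount(g, minlength=3)
        stat, pval = hwe_chisq(*counts)
        hits += pval < alpha
    return hits / reps

print(simulate_power(err=0.05))  # typically low power, echoing the paper's point
```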

8.
Contingent kernel density estimation
Kernel density estimation is a widely used method for estimating a distribution based on a sample of points drawn from that distribution. In practice, some form of error generally contaminates the sample of observed points; such error can be the result of imprecise measurements or observation bias. Often this error is negligible and may be disregarded in analysis. In cases where the error is non-negligible, estimation methods should be adjusted to reduce the resulting bias. Several modifications of kernel density estimation have been developed to address specific forms of error. One form of error that has not yet been addressed is the case where observations are nominally placed at the centers of areas from which the points are assumed to have been drawn, and these areas are of varying sizes. In this scenario, bias arises because the size of the error can vary among points: one subset of points may be known to have smaller error than another, or the form of the error may change from point to point. This paper proposes a "contingent kernel density estimation" technique to address this form of error. The new technique adjusts the standard kernel on a point-by-point basis in an adaptive response to the changing structure and magnitude of the error. Equations for the contingent kernel technique are derived, the technique is validated using numerical simulations, and an example using the geographic locations of social networking users is worked through to demonstrate the utility of the method.
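A minimal sketch of the point-by-point idea in one dimension, assuming (purely as an illustration; the paper derives its own kernel adjustment) that each observation sits at the centre of an interval of known width and receives a correspondingly widened Gaussian kernel:

```python
import numpy as np

def contingent_kde(x_obs, widths, grid, base_bw=0.1):
    """Per-point ("contingent") Gaussian KDE sketch in 1-D.

    Each observation x_obs[i] is the centre of an interval of known
    width widths[i]; its kernel is widened accordingly, so points with
    larger placement error contribute flatter kernels.
    """
    bw = np.sqrt(base_bw ** 2 + (widths / 2) ** 2)       # per-point bandwidth
    z = (grid[:, None] - x_obs[None, :]) / bw[None, :]
    k = np.exp(-0.5 * z ** 2) / (bw * np.sqrt(2 * np.pi))
    return k.mean(axis=1)                                # average of per-point kernels

# Example: points reported at the centres of intervals of varying width
rng = np.random.default_rng(0)
centers = rng.normal(0, 1, 200)
widths = rng.choice([0.1, 0.5, 2.0], size=200)           # mixed-precision observations
grid = np.linspace(-4, 4, 401)
density = contingent_kde(centers, widths, grid)
print(density.sum() * (grid[1] - grid[0]))               # ~1: a proper density
```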

9.
Optimal design of experiments, as well as proper analysis of data, depends on knowledge of the experimental error. A detailed analysis of the error structure of kinetic data obtained with acetylcholinesterase showed conclusively that the classical assumptions of constant absolute or constant relative error are inadequate for the dependent variable (velocity). The best mathematical models for the experimental error involved the substrate and inhibitor concentrations and reflected the rate law for the initial velocity. Data obtained with other enzymes displayed similar relationships between experimental error and the independent variables. The new empirical error functions were shown to be superior to previously used models when utilized in weighted nonlinear-regression analysis of kinetic data. The results suggest that, in the spectrophotometric assays used in the present study, the observed experimental variance is primarily due to errors in determining the concentrations of substrate and inhibitor, not to error in measuring the velocity.
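A minimal sketch of weighted nonlinear regression with an error function tied to the rate law, here a Michaelis-Menten fit via scipy's curve_fit. The kinetic parameters, the assumed error function sigma(S), and the data are illustrative, not the paper's empirical error models:

```python
import numpy as np
from scipy.optimize import curve_fit

def michaelis_menten(s, vmax, km):
    return vmax * s / (km + s)

# Simulated assay in which the error std grows with the rate-law
# prediction, mimicking the paper's finding that error tracks the
# rate law (all parameter values here are assumptions).
rng = np.random.default_rng(5)
s = np.array([0.5, 1, 2, 4, 8, 16, 32], dtype=float)
true_v = michaelis_menten(s, vmax=10.0, km=4.0)
sigma = 0.02 + 0.05 * true_v            # assumed error function sigma(S)
v_obs = true_v + rng.normal(0, sigma)

# Weighted nonlinear regression: curve_fit weights residuals by 1/sigma
popt, pcov = curve_fit(michaelis_menten, s, v_obs, p0=(5.0, 1.0),
                       sigma=sigma, absolute_sigma=True)
print("vmax, Km =", popt, "+/-", np.sqrt(np.diag(pcov)))
```

In a real assay sigma would be estimated from replicates or fitted jointly; here it is taken from the simulation for brevity.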

10.
Ratio estimation with measurement error in the auxiliary variate
Gregoire TG, Salas C. Biometrics 2009, 65(2): 590-598
With auxiliary information that is well correlated with the primary variable of interest, ratio estimation of the finite population total may be much more efficient than alternative estimators that do not make use of the auxiliary variate. The well-known properties of ratio estimators are perturbed when the auxiliary variate is measured with error. In this contribution we examine the effect of measurement error in the auxiliary variate on the design-based statistical properties of three common ratio estimators, considering both systematic measurement error and measurement error that varies according to a fixed distribution. Aside from presenting expressions for the bias and variance of these estimators when they are contaminated with measurement error, we provide numerical results based on a specific population. Under systematic measurement error, the biasing effect is asymmetric around zero, and precision may be improved or degraded depending on the magnitude of the error. Under variable measurement error, the bias of the conventional ratio-of-means estimator increased slightly with increasing error dispersion, but far less than the bias of the conventional mean-of-ratios estimator. In similar fashion, the mean-of-ratios estimator incurs a greater loss of precision with increasing error dispersion than the other estimators we examine. Overall, the ratio-of-means estimator appears to be remarkably resistant to the effects of measurement error in the auxiliary variate.
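A minimal simulation contrasting the two conventional estimators under multiplicative measurement error in the auxiliary variate. The population model, error form, and sample sizes are assumptions for illustration and do not reproduce the paper's population:

```python
import numpy as np

rng = np.random.default_rng(11)

# Artificial finite population: y roughly proportional to auxiliary x
N, n = 2000, 100
x = rng.gamma(4.0, 5.0, N)
y = 2.5 * x + rng.normal(0, 4.0, N)
X_total, Y_total = x.sum(), y.sum()

def compare(err_cv, reps=4000):
    rom, mor = [], []
    for _ in range(reps):
        idx = rng.choice(N, size=n, replace=False)
        xs = x[idx] * np.exp(rng.normal(0, err_cv, n))  # noisy auxiliary
        ys = y[idx]
        rom.append(ys.mean() / xs.mean() * X_total)     # ratio-of-means
        mor.append((ys / xs).mean() * X_total)          # mean-of-ratios
    return [(np.mean(e) - Y_total, np.std(e)) for e in (rom, mor)]

for cv in (0.0, 0.1, 0.3):
    (b1, s1), (b2, s2) = compare(cv)
    print(f"cv={cv}: ratio-of-means bias={b1:8.0f} sd={s1:6.0f} | "
          f"mean-of-ratios bias={b2:8.0f} sd={s2:6.0f}")
```

As the error dispersion grows, the mean-of-ratios estimator degrades faster, consistent with the qualitative finding above.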

11.
Hubisz MJ, Lin MF, Kellis M, Siepel A. PLoS ONE 2011, 6(2): e17034
The recent release of twenty-two new genome sequences has dramatically increased the data available for mammalian comparative genomics, but twenty of these new sequences are currently limited to ~2× coverage. Here we examine the extent of sequencing error in these 2× assemblies and its potential impact on downstream analyses. By comparing 2× assemblies with high-quality sequences from the ENCODE regions, we estimate the rate of sequencing error to be 1-4 errors per kilobase. While this error rate is fairly modest, sequencing error can still have surprising effects. For example, an apparent lineage-specific insertion in a coding region is more likely to reflect sequencing error than a true biological event, and the length distribution of coding indels is strongly distorted by error. We find that most errors are contributed by a small fraction of bases with low quality scores, in particular by the ends of reads in regions of single-read coverage in the assembly. We explore several approaches for automatic sequencing error mitigation (SEM), making use of the localized nature of sequencing error, the fact that it is well predicted by quality scores, and information about errors that comes from comparisons across species. Our automatic methods for error mitigation cannot replace the need for additional sequencing, but they do allow substantial fractions of errors to be masked or eliminated at the cost of modest amounts of over-correction, and they can reduce the impact of error in downstream phylogenomic analyses. Our error-mitigated alignments are available for download.
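Since the paper notes that error is well predicted by quality scores, the simplest mitigation step is quality-based masking. A toy sketch follows; the data and threshold are made up, and the paper's SEM pipeline additionally uses cross-species information:

```python
# Bases below a phred cutoff are replaced with 'N' so that downstream
# alignment/phylogenomic steps ignore them.
def mask_low_quality(seq: str, quals: list[int], min_q: int = 20) -> str:
    return "".join(b if q >= min_q else "N" for b, q in zip(seq, quals))

seq   = "ACGTACGTAC"
quals = [38, 40, 35, 12, 8, 30, 33, 15, 40, 39]   # low scores near a read end
print(mask_low_quality(seq, quals))                # ACGNNCGNAC -> errors masked
```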

12.
Advances in technology have allowed ecologists to employ remote observations of individual organisms' spatial locations. These data are used to model species distributions and habitat associations, which inform conservation efforts and management plans, but the data are not without error. To illustrate the consequences of not considering measurement error, I introduce measurement error into a habitat selection model using three different distributions, and show how measurement error can confound inferences about a hypothetical organism's true habitat selection. By simulating different initial strengths of selection, I show that the introduction of measurement error produces the largest reduction in habitat selection strength (from truth) for very selective individuals (habitat specialists). Not surprisingly, the inclusion of error for very weakly selective individuals (habitat generalists) can result in switching from true selection to observed avoidance. Researchers need to be aware that, first, there is measurement error in remotely observed data and, second, a tradeoff occurs between measurement error and landscape fragmentation. Landscapes with a high degree of fragmentation require spatially accurate (low measurement error) data in order to make reliable estimates of habitat selection or species distribution. The results of this study are discussed in light of the conservation of species threatened by habitat fragmentation and the management suggestions arising from selection studies.
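A minimal one-dimensional sketch of the attenuation effect: draw "used" locations under exponential selection for a smooth habitat covariate, jitter them with Gaussian positional error, and refit a use-availability logistic model. The field, selection strength, and sample sizes are illustrative assumptions, not the paper's simulation design:

```python
import numpy as np
from scipy.optimize import minimize

def habitat(s):                              # smooth 1-D covariate field
    return np.sin(s / 8.0)

rng = np.random.default_rng(2)
beta_true = 2.0                              # strong selection (a "specialist")

# Used points accepted with probability proportional to exp(beta * habitat)
cand = rng.uniform(0, 100, 200000)
keep = rng.random(cand.size) < np.exp(beta_true * (habitat(cand) - 1))
used = rng.choice(cand[keep], 500, replace=False)
avail = rng.uniform(0, 100, 2000)            # availability sample

def fitted_beta(err_sd):
    obs = used + rng.normal(0, err_sd, used.size)   # positional error
    xv = np.concatenate([habitat(obs), habitat(avail)])
    yv = np.concatenate([np.ones(used.size), np.zeros(avail.size)])
    def nll(b):                              # logistic negative log-likelihood
        eta = b[0] + b[1] * xv
        return np.sum(np.logaddexp(0.0, eta)) - np.sum(yv * eta)
    return minimize(nll, x0=[0.0, 0.0]).x[1]

for sd in (0.0, 2.0, 5.0, 10.0):
    print(f"error sd {sd:4.1f} -> estimated selection {fitted_beta(sd):.2f}")
# the selection estimate attenuates toward zero as positional error grows
```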

13.
Most experiments are intended to estimate the size of effects rather than to test a hypothesis of whether or not an effect occurs. Hypothesis testing is often inapplicable, is over-used, and is likely to lead to misinterpretation of results. The two types of error possible in hypothesis testing are discussed. Whereas Type I error is usually examined as a matter of course, Type II error is almost always ignored. Investigations in which zero differences are important should recognise the possibility of Type II error in their interpretation: a nonsignificant result should not be interpreted as evidence of a lack of effect. Statistical significance is not synonymous with economic or scientific importance. The importance of choosing the most appropriate design is emphasised, and some suggestions are made as to how important sources of error can be avoided.
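To see why a nonsignificant result is weak evidence of no effect, it helps to compute the Type II error rate directly. A sketch using a normal approximation for a two-sample comparison; the effect size, SD, and sample sizes are illustrative assumptions:

```python
from scipy.stats import norm

def power_two_sample(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test: the chance of
    detecting a true difference `delta`. Type II error = 1 - power."""
    se = sd * (2 / n_per_group) ** 0.5
    z = norm.ppf(1 - alpha / 2)
    return norm.sf(z - delta / se) + norm.cdf(-z - delta / se)

# A "nonsignificant" result with n=10 per group is weak evidence of no effect:
for n in (10, 30, 100):
    print(n, round(power_two_sample(delta=5.0, sd=10.0, n_per_group=n), 2))
# at n=10 the power is only ~0.2, i.e. the Type II error rate is ~0.8
```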

14.
In this paper, human learning characteristics in tracking tasks of an iterative nature are investigated. Various linear and nonlinear systems are used as the plant, and a human operator has to generate the proper control inputs to force these systems to track the desired trajectory. The learning behaviour of the human operator in modifying his control actions is studied, and it is observed that the human operator can improve his performance quite efficiently despite the unavailability of any information about the system or the desired trajectories. It is concluded from the experiments that the human operator not only uses the information that is directly available to him (error, in this case), but also extracts some useful information (e.g. error rate) that he feels is necessary to generate a good control action. The limitation of human performance is studied in the frequency domain, and the performance of the human operator against the frequency bandwidth of the error and error-rate signals is highlighted. Analysis of the results revealed that a human operator gives more importance to the error rate in generating his control actions; accordingly, his performance limitation is more sensitive to the frequency bandwidth of the error rate than to that of the error. The human operator cannot improve his performance once the frequency components of the error or error rate shift to higher frequencies, above about 1.0 Hz.

15.
Shaw PA, Prentice RL. Biometrics 2012, 68(2): 397-407
Uncertainty concerning the measurement error properties of self-reported diet has important implications for the reliability of nutritional epidemiology reports. Biomarkers based on the urinary recovery of expended nutrients can provide an objective measure of short-term nutrient consumption for certain nutrients and, when applied to a subset of a study cohort, can be used to calibrate corresponding self-reported nutrient consumption assessments. A nonstandard measurement error model that makes provision for systematic error and subject-specific error, along with the usual independent random error, is needed for the self-report data. Three estimation procedures for hazard ratio (Cox model) parameters are extended for application to this more complex measurement error structure: risk set regression calibration, conditional score, and nonparametric corrected score. An estimator for the cumulative baseline hazard function is also provided. The performance of each method is assessed in a simulation study, and the methods are then applied to an example from the Women's Health Initiative Dietary Modification Trial.

16.
17.
Khurram Nadeem, Subhash R. Lele. Oikos 2012, 121(10): 1656-1664
Population viability analysis (PVA) entails calculation of extinction risk, as defined by various extinction metrics, for a study population. These calculations depend strongly on the form of the population growth model and the inclusion of demographic and/or environmental stochasticity. The form of the model and its parameters are determined from observed population time series data. A typical population time series, consisting of estimated population sizes, inevitably contains some observation error and is likely to have missing observations. In this paper, we present a likelihood-based PVA in the presence of observation error and missing data. We illustrate the importance of incorporating observation error in PVA by reanalyzing the population time series of the song sparrow Melospiza melodia on Mandarte Island, British Columbia, Canada from 1975 to 1998. Using the Akaike information criterion, we show that the model with observation error fits the data better than the one without. The extinction risks predicted by the models with and without observation error are quite different. Further analysis of possible causes of observation error revealed that some component of it might be due to unreported dispersal; a complete analysis of such data would therefore require explicit spatial models and data on dispersal along with observation error. Our conclusions are therefore twofold: 1) observation errors in PVA matter, and 2) integrating these errors in PVA is not always enough and can still lead to important biases in parameter estimates if other processes such as dispersal are ignored.
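One standard way to write a likelihood-based PVA with observation error and missing counts is a linear-Gaussian state-space (Gompertz) model on log abundance, whose likelihood a Kalman filter computes exactly. The sketch below is a generic illustration under that assumption, with made-up data rather than the Mandarte Island counts, and is not necessarily the authors' estimation procedure:

```python
import numpy as np
from scipy.optimize import minimize

def kalman_nll(params, y):
    """Negative log-likelihood of a Gompertz state-space model
    x_t = a + b*x_{t-1} + N(0, q);  y_t = x_t + N(0, r)
    for log abundances y, where np.nan marks missing counts."""
    a, b, log_q, log_r = params
    q, r = np.exp(log_q), np.exp(log_r)
    m, P = y[~np.isnan(y)][0], 1.0            # start near the first observation
    nll = 0.0
    for t in range(1, len(y)):
        m, P = a + b * m, b * b * P + q       # predict
        if not np.isnan(y[t]):                # update only when observed
            S = P + r
            nll += 0.5 * (np.log(2 * np.pi * S) + (y[t] - m) ** 2 / S)
            K = P / S
            m, P = m + K * (y[t] - m), (1 - K) * P
    return nll

# Toy log-abundance series with two missing years (illustrative data):
y = np.array([4.0, 4.2, 4.1, np.nan, 3.8, 3.9, 4.3, np.nan, 4.0, 3.7,
              3.9, 4.1, 4.4, 4.2, 4.0])
fit = minimize(kalman_nll, x0=[0.4, 0.9, np.log(0.05), np.log(0.05)],
               args=(y,), method="Nelder-Mead")
a, b, q, r = fit.x[0], fit.x[1], np.exp(fit.x[2]), np.exp(fit.x[3])
print(f"density dependence b={b:.2f}, process var={q:.3f}, obs var={r:.3f}")
```

Extinction metrics would then be obtained by simulating the fitted process model forward; comparing fits with r fixed at zero versus r estimated reproduces the with/without observation error contrast discussed above.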

18.

Background: High-throughput screening (HTS) is a key part of the drug discovery process, during which thousands of chemical compounds are screened and their activity levels measured in order to identify potential drug candidates (i.e., hits). Many technical, procedural, or environmental factors can cause systematic measurement error or inequalities in the conditions under which the measurements are taken. Such systematic error has the potential to critically affect the hit selection process. Several error correction methods and software packages have been developed to address this issue in the context of experimental HTS [17]. Despite their power to reduce the impact of systematic error when applied to error-perturbed datasets, those methods also have one disadvantage: they introduce a bias when applied to data not containing any systematic error [6]. Hence, we need first to assess the presence of systematic error in a given HTS assay, and then carry out a systematic error correction method if and only if the presence of systematic error has been confirmed by statistical tests.

19.
Species occurrences inherently include positional error. Such error can be problematic for species distribution models (SDMs), especially those based on fine-resolution environmental data. It has been suggested that there could be a link between the influence of positional error and the width of the species' ecological niche. Although positional errors in species occurrence data may impose serious limitations, especially for modelling species with narrow ecological niches, this has never been thoroughly explored. We used a virtual species approach to assess the effects of positional error on fine-scale SDMs for species with environmental niches of different widths. We simulated three virtual species with varying niche breadth, from specialist to generalist. The true distribution of these virtual species was then altered by introducing different levels of positional error (from 5 to 500 m). We built generalized linear models and MaxEnt models using the distributions of the three virtual species (unaltered and altered) and a combination of environmental data at 5 m resolution. The models' performance and niche overlap were compared to assess the effect of positional error with varying niche breadth in geographical and environmental space. Positional error negatively impacted performance and niche overlap metrics. The magnitude of its influence depended on the species' niche, with models for specialist species being more affected than those for generalist species. Positional error had the same effect on both modelling techniques. Finally, increasing sample size did not mitigate its negative influence. We showed that fine-scale SDMs are considerably affected by positional error, even when such error is low. Therefore, where new surveys are undertaken, we recommend paying attention to data collection techniques to minimize the positional error in occurrence data and thus avoid its negative effect on SDMs, especially when studying specialist species.

20.
In spite of more than a decade of research on noninvasive genetic sampling, the low quality and quantity of DNA in noninvasive studies continue to plague researchers. Effects of locus size on error have been documented but are still poorly understood. Further, sources of error other than allelic dropout have been described but are often not well quantified. Here we analyse the effects of locus size on allelic dropout, amplification success, and error rates in noninvasive genotyping studies of three species, and quantify error other than allelic dropout.
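A minimal sketch of how allelic dropout is typically quantified from replicate PCRs: for samples whose consensus genotype is heterozygous, the per-locus dropout rate is the fraction of replicates showing only one allele. The locus names and data are made up for illustration:

```python
# replicate observations as (consensus genotype, alleles seen in replicate)
replicates = {
    "locus_A": [("AB", "AB"), ("AB", "A"), ("AB", "AB"), ("AB", "B")],
    "locus_B": [("AB", "AB"), ("AB", "AB"), ("AB", "A")],
}

for locus, obs in replicates.items():
    trials = [rep for consensus, rep in obs if consensus == "AB"]
    dropouts = sum(len(rep) == 1 for rep in trials)     # one allele lost
    print(f"{locus}: dropout rate = {dropouts}/{len(trials)}"
          f" = {dropouts / len(trials):.2f}")
```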
