Similar Literature
20 similar articles found.
1.
Detection and Integration of Genotyping Errors in Statistical Genetics
Detection of genotyping errors and integration of such errors in statistical analysis are relatively neglected topics, given their importance in gene mapping. A few inopportunely placed errors, if ignored, can tremendously affect evidence for linkage. The present study takes a fresh look at the calculation of pedigree likelihoods in the presence of genotyping error. To accommodate genotyping error, we present extensions to the Lander-Green-Kruglyak deterministic algorithm for small pedigrees and to the Markov-chain Monte Carlo stochastic algorithm for large pedigrees. These extensions can accommodate a variety of error models and refrain from simplifying assumptions, such as allowing, at most, one error per pedigree. In principle, almost any statistical genetic analysis can be performed taking errors into account, without actually correcting or deleting suspect genotypes. Three examples illustrate the possibilities. These examples make use of the full pedigree data, multiple linked markers, and a prior error model. The first example is the estimation of genotyping error rates from pedigree data. The second, and currently most useful, example is the computation of posterior mistyping probabilities. These probabilities cover both Mendelian-consistent and Mendelian-inconsistent errors. The third example is the selection of the true pedigree structure connecting a group of people from among several competing pedigree structures. Paternity testing and twin zygosity testing are typical applications.
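As a minimal, hypothetical illustration of a posterior mistyping probability (one biallelic marker, one trio, parental genotypes assumed correct, a simple uniform-miscall error model), the sketch below applies Bayes' rule; it is not the paper's Lander-Green-Kruglyak or MCMC machinery.

```python
GENOTYPES = ["AA", "Aa", "aa"]

def transmission_prob(parent_geno):
    """Probability that a parent with this genotype transmits allele 'A'."""
    return {"AA": 1.0, "Aa": 0.5, "aa": 0.0}[parent_geno]

def mendelian_prior(father, mother):
    """P(true offspring genotype | parental genotypes) under Mendelian segregation."""
    pA_f, pA_m = transmission_prob(father), transmission_prob(mother)
    return {
        "AA": pA_f * pA_m,
        "Aa": pA_f * (1 - pA_m) + (1 - pA_f) * pA_m,
        "aa": (1 - pA_f) * (1 - pA_m),
    }

def posterior_mistyping(father, mother, observed_child, error_rate=0.01):
    """Posterior P(true child genotype != observed | data).

    Assumed error model: with probability `error_rate` the recorded genotype
    is replaced by one of the other two genotypes, chosen uniformly.
    """
    prior = mendelian_prior(father, mother)
    joint = {}
    for true_g in GENOTYPES:
        lik = (1 - error_rate) if true_g == observed_child else error_rate / 2
        joint[true_g] = prior[true_g] * lik
    norm = sum(joint.values())
    return 1.0 - joint[observed_child] / norm

# Mendelian-inconsistent call: both parents 'aa' but the child is typed 'Aa'.
print(posterior_mistyping("aa", "aa", "Aa"))   # 1.0: impossible without an error
# Mendelian-consistent but unlikely call: Aa x Aa parents, child typed 'AA'.
print(posterior_mistyping("Aa", "Aa", "AA"))   # ~0.015: a small mistyping probability
```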

2.
Statistical analysis is error prone. A best practice for researchers using statistics would therefore be to share data among co-authors, allowing double-checking of executed tasks just as co-pilots do in aviation. To document the extent to which this ‘co-piloting’ currently occurs in psychology, we surveyed the authors of 697 articles published in six top psychology journals and asked them whether they had collaborated on four aspects of analyzing data and reporting results, and whether the described data had been shared between the authors. We acquired responses for 49.6% of the articles and found that co-piloting on statistical analysis and reporting of results is quite uncommon among psychologists, while data sharing among co-authors appears fairly, though not completely, standard. We then used an automated procedure to study the prevalence of statistical reporting errors in the articles in our sample and examined the relationship between reporting errors and co-piloting. Overall, 63% of the articles contained at least one p-value that was inconsistent with the reported test statistic and the accompanying degrees of freedom, and 20% of the articles contained at least one p-value that was inconsistent to such a degree that it may have affected decisions about statistical significance. Overall, the probability that a given p-value was inconsistent was over 10%. Co-piloting was not found to be associated with reporting errors.
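For orientation, here is a minimal sketch of the kind of automated consistency check described above, assuming a two-sided t test and a simple absolute tolerance; the article's actual procedure may differ, for instance by accounting for rounding of reported values.

```python
from scipy import stats

def check_t_report(t_value, df, reported_p, alpha=0.05, tol=0.01):
    """Recompute a two-sided p value from the reported t and df and compare."""
    recomputed_p = 2 * stats.t.sf(abs(t_value), df)
    inconsistent = abs(recomputed_p - reported_p) > tol
    # A gross inconsistency: the discrepancy flips the significance decision.
    decision_error = (recomputed_p < alpha) != (reported_p < alpha)
    return round(recomputed_p, 4), inconsistent, decision_error

# e.g. "t(28) = 2.20, p = .04": recomputed p is about .036, consistent here.
print(check_t_report(2.20, 28, 0.04))
```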

3.
4.
The proper determination of experimental errors in bioprocesses can be very important because experimental errors can exert a major impact on the analysis of experimental results. Despite this, the effect of experimental errors on the analysis of bioprocess data has been largely overlooked in the literature. For this reason, we performed detailed statistical analyses of experimental errors obtained during the production of lactobionic acid and sorbitol in a system utilizing as catalyst the GFOR (glucose-fructose oxidoreductase) enzyme from permeabilized cells of the bacterium Zymomonas mobilis. The magnitudes of the experimental errors thus obtained were then correlated with the process operation conditions and with the composition of the culture media used for bacterial growth. It is shown that experimental errors can depend very significantly on the operation conditions and affect the interpretation of available experimental data. More specifically, in this study, experimental errors depended on the nutritional supplements added to the cultivation medium, the inoculation process, and the reaction time, which may be of fundamental importance for actual process development. The results obtained also indicate, for the first time, that GFOR activity can be affected by the composition of the medium in which cells are cultivated.

5.
Reduction of costs in biological signalling seems an evolutionary advantage, but recent experiments have shown signalling codes shifted to signals of high cost with an underutilization of low-cost signals. Here I derive a theory for efficient signalling that includes both errors and costs as constraints, and I show that errors in the efficient translation of biological states into signals can shift codes to higher costs, effectively performing a quality control. The statistical structure of signal usage is predicted to be of a generalized Boltzmann form that penalizes signals that are costly and sensitive to errors. This predicted distribution of signal usage against signal cost has two main features: an exponential tail required for cost efficiency and an underutilization of the low-cost signals required to protect the signalling quality from the errors. These predictions are shown to correspond quantitatively to the experiments in which gathering signal statistics is feasible, as in visual cortex neurons.
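As a sketch only, under assumed functional forms for the cost and error-sensitivity terms (neither is specified above), a generalized Boltzmann-like usage distribution reproduces both predicted features: an exponential tail at high cost and underuse of the cheapest signals.

```python
import numpy as np

costs = np.linspace(0.0, 10.0, 200)        # signal cost c
lam = 0.8                                  # cost penalty; sets the exponential tail
mu = 3.0                                   # penalty on error sensitivity (assumed weight)
error_sensitivity = np.exp(-costs)         # assumption: the cheapest signals are most confusable

weights = np.exp(-lam * costs - mu * error_sensitivity)
p_usage = weights / weights.sum()          # normalized usage distribution over cost
print(round(costs[np.argmax(p_usage)], 2)) # mode sits above zero: low-cost signals are underused
```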

6.
Abstract

A theory based on a Langevin equation along the reaction coordinate is developed to explain and calculate systematic and statistical errors in free energy perturbation simulations. The errors are calculated exactly when both the perturbation potential and the mean potential from the surrounding degrees of freedom are harmonic in the reaction coordinate. The effect of the mean potential is small as long as its force constant is small compared to the force constant of the perturbation potential. This indicates that the results obtained with zero mean force may still be valid as long as the second derivative of the mean potential is small compared to that of the perturbation potential. The theory is applied to conversion between L and D amino acids by changing the position of the minimum of the harmonic improper dihedral potential between ±35.264 degrees. For phenylalanine bound in the active site of a protein (thermolysin), we find from 20 ps simulations statistical errors and hysteresis that are both about 2.5 kJ/mol, in agreement with the theoretical predictions. The statistical errors are proportional to the square root of the coupling to the heat bath and inversely proportional to the square root of the integration time, while the (positive) hysteresis, which arises because the reaction coordinate lags behind, is linear in the same quantities. This shows that the systematic errors will dominate in short simulations while the statistical ones will dominate in long simulations. The treatment assumes that the systematic influence of the surroundings can be represented by a mean force upon the reaction coordinate. If the relaxation processes of the environment are slow, this may not be true, and additional errors have to be considered.
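A small numerical restatement of the scalings quoted above, with the proportionality constants arbitrarily set to one (an assumption of this sketch): statistical error ~ sqrt(gamma/tau) and hysteresis ~ gamma/tau, so the systematic part dominates short runs and the statistical part dominates long ones.

```python
import numpy as np

gamma = 1.0                                  # coupling to the heat bath (arbitrary units)
for tau in (0.1, 1.0, 10.0, 100.0):          # integration time
    stat_err = np.sqrt(gamma / tau)          # statistical error ~ sqrt(gamma / tau)
    hysteresis = gamma / tau                 # systematic (hysteresis) error ~ gamma / tau
    print(f"tau={tau:6.1f}  statistical={stat_err:6.3f}  hysteresis={hysteresis:6.3f}")
# tau = 0.1: hysteresis (10.0) exceeds the statistical error (3.162);
# tau = 100: the ordering is reversed (0.010 versus 0.100).
```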

7.
When any process of measuring is considered, one of the basic questions is how to assess the precision of measurement methods and/or instruments. In this paper, this question is formulated and solved as a problem of tolerance regions for absolute and relative normally distributed errors of measurements.
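A minimal sketch of one common construction of such a region for normally distributed errors, using Howe's approximation to the two-sided tolerance factor; the paper's own formulation may differ, and the data below are hypothetical.

```python
import numpy as np
from scipy import stats

def normal_tolerance_factor(n, coverage=0.95, confidence=0.95):
    """Approximate two-sided tolerance factor (Howe) for a normal sample of size n."""
    nu = n - 1
    z = stats.norm.ppf((1 + coverage) / 2)
    chi2 = stats.chi2.ppf(1 - confidence, nu)   # lower (1 - confidence) quantile
    return z * np.sqrt(nu * (1 + 1 / n) / chi2)

errors = np.array([0.12, -0.08, 0.05, -0.15, 0.02, 0.09, -0.03, 0.11, -0.06, 0.01])  # hypothetical
k = normal_tolerance_factor(len(errors))
mean, sd = errors.mean(), errors.std(ddof=1)
print(mean - k * sd, mean + k * sd)   # region expected to cover 95% of errors with 95% confidence
```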

8.

Background

The removal of outliers to acquire a significant result is a questionable research practice that appears to be commonly used in psychology. In this study, we investigated whether the removal of outliers in psychology papers is related to weaker evidence (against the null hypothesis of no effect), a higher prevalence of reporting errors, and smaller sample sizes in these papers compared to papers in the same journals that did not report the exclusion of outliers from the analyses.

Methods and Findings

We retrieved a total of 2667 statistical results of null hypothesis significance tests from 153 articles in main psychology journals, and compared results from articles in which outliers were removed (N = 92) with results from articles that reported no exclusion of outliers (N = 61). We preregistered our hypotheses and methods and analyzed the data at the level of articles. Results show no significant difference between the two types of articles in median p value, sample size, or the prevalence of all reporting errors, large reporting errors, and reporting errors that concerned statistical significance. However, we did find a discrepancy between the reported degrees of freedom of t tests and the reported sample size in 41% of articles that did not report removal of any data values. This suggests a common failure to report data exclusions (or missingness) in psychological articles.

Conclusions

We failed to find that the removal of outliers from the analysis in psychological articles was related to weaker evidence (against the null hypothesis of no effect), sample size, or the prevalence of errors. However, our control sample might be contaminated due to nondisclosure of excluded values in articles that did not report exclusion of outliers. The results therefore highlight the importance of more transparent reporting of statistical analyses.
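A minimal sketch of the degrees-of-freedom check mentioned in the Methods above, assuming a standard independent-samples t test (df = N - 2); the exact rule depends on the test, so this illustrates the idea rather than the study's tool.

```python
def undisclosed_exclusions(reported_n, reported_df, n_groups=2):
    """Flag a reported df that is smaller than the reported sample size allows."""
    expected_df = reported_n - n_groups        # df for a standard t test / one-way design
    return reported_df < expected_df, expected_df - reported_df

# e.g. an article reports N = 120 participants but "t(112) = ..." in the results
print(undisclosed_exclusions(120, 112))        # (True, 6): six cases unaccounted for
```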

9.
Conventional process-analysis-type techniques for compiling life-cycle inventories suffer from a truncation error, which is caused by the omission of resource requirements or pollutant releases of higher-order upstream stages of the production process. The magnitude of this truncation error varies with the type of product or process considered, but can be on the order of 50%. One way to avoid such significant errors is to incorporate input-output analysis into the assessment framework, resulting in a hybrid life-cycle inventory method. Using Monte-Carlo simulations, it can be shown that uncertainties of input-output-based life-cycle assessments are often lower than truncation errors in even extensive, third-order process analyses.

10.
A Structural Regression Model with Measurement Errors in Multidimensional Covariates
We propose a structural regression model with measurement errors and study, under an exchangeability condition, how measurement errors in multidimensional covariates affect estimation of the average treatment effect. Without further assumptions, the average treatment effect remains identifiable even though most model parameters are not. Because the maximum likelihood estimator of the average treatment effect is difficult to compute, we recommend using a quasi-maximum likelihood estimator in practice.

11.
Inference of haplotypes is important in genetic epidemiology studies. However, all large genotype data sets have errors due to the use of inexpensive genotyping machines that are fallible and shortcomings in genotype-scoring software, which can have an enormous impact on haplotype inference. In this article, we propose two novel strategies to reduce the impact induced by genotyping errors in haplotype inference. The first method makes use of double sampling. For each individual, the “GenoSpectrum”, which consists of all possible genotypes and their corresponding likelihoods, is computed. The second method is a genotype clustering algorithm based on multi-genotyping data, which also assigns a “GenoSpectrum” to each individual. We then describe two hybrid EM algorithms (called DS-EM and MG-EM) that perform haplotype inference based on the “GenoSpectrum” of each individual obtained by double sampling and multi-genotyping data. Both simulated data sets and a quasi-real data set demonstrate that our proposed methods perform well in different situations and outperform the conventional EM algorithm and the HMM algorithm proposed by Sun, Greenwood, and Neal (2007, Genetic Epidemiology 31, 937–948) when the genotype data sets have errors.
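To make the idea of inference over per-individual genotype likelihoods concrete, here is a minimal EM sketch for a single biallelic locus under a Hardy-Weinberg prior; it only illustrates using a "GenoSpectrum"-style input, not the DS-EM or MG-EM haplotype algorithms proposed in the article.

```python
import numpy as np

def em_allele_freq(geno_liks, n_iter=100):
    """Estimate the frequency of allele A from uncertain genotype calls.

    geno_liks: (N, 3) array; columns are P(data | 0, 1, 2 copies of A).
    """
    p = 0.5
    for _ in range(n_iter):
        # E-step: posterior genotype probabilities under a Hardy-Weinberg prior.
        prior = np.array([(1 - p) ** 2, 2 * p * (1 - p), p ** 2])
        post = geno_liks * prior
        post /= post.sum(axis=1, keepdims=True)
        # M-step: update p from the expected allele counts.
        p = (post @ np.array([0.0, 1.0, 2.0])).sum() / (2 * len(geno_liks))
    return p

# Three individuals: a confident 'aa', a confident 'Aa', and one ambiguous call.
liks = np.array([[0.98, 0.01, 0.01],
                 [0.02, 0.96, 0.02],
                 [0.10, 0.45, 0.45]])
print(em_allele_freq(liks))   # roughly 0.4, given these likelihoods
```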

12.
13.
French environmental law for nature protection requires that all facilities, works, and development projects that may affect the environment be the subject of an impact study to evaluate their consequences, including those on human health. For this analysis, the risk assessment approach is used and the population's exposure is estimated with the aid of multimedia models. The CalTOX model is frequently used for this kind of study. Unfortunately, analysis of these studies shows that the model is often badly understood and poorly used. The difficulties encountered by users, and the errors and problems of interpretation most commonly found in human exposure assessments, are listed and their consequences illustrated. CalTOX has been shown to have many advantages (adaptability, speed in carrying out calculations, transparency), but it ought not to be used as a “black box”, because such use may lead to many errors and a loss of confidence in the studies.

14.
In isothermal titration calorimetry (ITC), the two main sources of random (statistical) error are associated with the extraction of the heat q from the measured temperature changes and with the delivery of metered volumes of titrant. The former leads to uncertainty that is approximately constant and the latter to uncertainty that is proportional to q. The role of these errors in the analysis of ITC data by nonlinear least squares is examined for the case of 1:1 binding, M + X ⇌ MX. The standard errors in the key parameters, the equilibrium constant K° and the enthalpy ΔH°, are assessed from the variance-covariance matrix computed for exactly fitting data. Monte Carlo calculations confirm that these "exact" estimates will normally suffice and show further that neglect of weights in the nonlinear fitting can result in significant loss of efficiency. The effects of the titrant volume error are strongly dependent on assumptions about the nature of this error: if it is random in the integral volume instead of the differential volume, correlated least squares is required for proper analysis, and the parameter standard errors decrease with increasing number of titration steps rather than increase.
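A small sketch of the error structure described above, under an assumed explicit form (a constant heat-extraction term plus a term proportional to q from the titrant-volume error); the combined per-injection sigma is what a weighted nonlinear least-squares fit of the binding isotherm would use, for example via the sigma argument of scipy.optimize.curve_fit.

```python
import numpy as np

q = np.array([-38.0, -35.0, -28.0, -15.0, -5.0, -1.5])  # hypothetical injection heats (arbitrary units)
sigma_const = 0.8            # assumed constant error from extracting q from the thermogram
rel_volume_error = 0.02      # assumed relative error in the delivered titrant volume

sigma_q = np.sqrt(sigma_const**2 + (rel_volume_error * q) ** 2)   # combined per-injection uncertainty
weights = 1.0 / sigma_q**2                                        # least-squares weights
print(np.round(sigma_q, 3))
print(np.round(weights / weights.max(), 3))   # the small late heats carry the largest relative weight
```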

15.
We explored the impact of phylogeny shape on the results of interspecific statistical analyses incorporating phylogenetic information. In most phylogenetic comparative methods (PCMs), the phylogeny can be represented as a relationship matrix, and the hierarchical nature of interspecific phylogenies translates into a distinctive blocklike matrix that can be described by its eigenvectors (topology) and eigenvalues (branch lengths). Thus, differences in the eigenvectors and eigenvalues of different relationship matrices can be used to gauge the impact of possible phylogeny errors by comparing the actual phylogeny used in a PCM analysis with a second phylogenetic hypothesis that may be more accurate. For example, we can use the sum of inverse eigenvalues as a rough index to compare the impact of phylogenies with different branch lengths. Topological differences are better described by the eigenvectors. In general, phylogeny errors that involve deep splits in the phylogeny (e.g., moving a taxon across the base of the phylogeny) are likely to have much greater impact than will those involving small perturbations in the fine structure near the tips. Small perturbations, however, may have more of an impact if the phylogeny structure is highly dependent (with many recent splits near the tips of the tree). Unfortunately, the impact of any phylogeny difference on the results of a PCM depends on the details of the data being considered. Recommendations regarding the choice, design, and statistical power of interspecific analyses are also made.
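A minimal sketch of the matrix comparison described above, assuming Brownian-motion style relationship matrices for two hypothetical three-taxon trees whose deep split differs: the eigenvalues (branch-length structure) coincide while the eigenvectors (topology) do not, and the sum of inverse eigenvalues is the rough index mentioned.

```python
import numpy as np

# Hypothetical tree ((A,B),C): A and B share 0.6 of their root-to-tip path.
C1 = np.array([[1.0, 0.6, 0.0],
               [0.6, 1.0, 0.0],
               [0.0, 0.0, 1.0]])
# Alternative hypothesis ((A,C),B): the deep split is moved.
C2 = np.array([[1.0, 0.0, 0.6],
               [0.0, 1.0, 0.0],
               [0.6, 0.0, 1.0]])

for name, C in (("tree 1", C1), ("tree 2", C2)):
    eigvals, eigvecs = np.linalg.eigh(C)
    print(name,
          "eigenvalues:", np.round(eigvals, 3),
          "sum of inverse eigenvalues:", round(float(np.sum(1.0 / eigvals)), 3))
# Same eigenvalues (same branch-length structure), different eigenvectors (different topology).
```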

16.
Shepherd BE, Yu C. Biometrics 2011, 67(3):1083–1091
A data coordinating team performed onsite audits and discovered discrepancies between the data sent to the coordinating center and that recorded at sites. We present statistical methods for incorporating audit results into analyses. This can be thought of as a measurement error problem, where the distribution of errors is a mixture with a point mass at 0. If the error rate is nonzero, then even if the mean of the discrepancy between the reported and correct values of a predictor is 0, naive estimates of the association between two continuous variables will be biased. We consider scenarios where there are (1) errors in the predictor, (2) errors in the outcome, and (3) possibly correlated errors in the predictor and outcome. We show how to incorporate the error rate and magnitude, estimated from a random subset (the audited records), to compute unbiased estimates of association and proper confidence intervals. We then extend these results to multiple linear regression where multiple covariates may be incorrect in the database and the rate and magnitude of the errors may depend on study site. We study the finite sample properties of our estimators using simulations, discuss some practical considerations, and illustrate our methods with data from 2815 HIV-infected patients in Latin America, of whom 234 had their data audited using a sequential auditing plan.
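As a toy sketch only (a simple attenuation correction under classical measurement error, not the authors' estimators or confidence intervals), an audited subset can be used to estimate the error variance in the database predictor and de-bias the naive slope.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x_true = rng.normal(0, 1, n)
y = 1.5 * x_true + rng.normal(0, 1, n)                 # true slope = 1.5

# Database predictor: most records are correct, a fraction carry an error.
has_error = rng.random(n) < 0.2                        # mixture with a point mass at zero error
x_db = x_true + has_error * rng.normal(0, 1.5, n)

naive_slope = np.polyfit(x_db, y, 1)[0]                # attenuated toward zero

# Audit a random subset of records to estimate the error variance in x_db.
audit = rng.choice(n, 300, replace=False)
err_var = np.var(x_db[audit] - x_true[audit])
reliability = (np.var(x_db) - err_var) / np.var(x_db)
corrected_slope = naive_slope / reliability

print(round(naive_slope, 3), round(corrected_slope, 3))  # corrected value is close to 1.5
```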

17.

Background

Inborn errors of metabolism (IEM) are a rare group of genetic diseases which can lead to several serious long-term complications in newborns. In order to address these issues as early as possible, a process called tandem mass spectrometry (MS/MS) can be used, as it allows for rapid and simultaneous detection of the diseases. This analysis was performed to determine whether newborn screening by MS/MS is cost-effective in Thailand.

Method

A cost-utility analysis comprising a decision-tree and Markov model was used to estimate costs in Thai baht (THB) and health outcomes in life-years (LYs) and quality-adjusted life years (QALYs), presented as an incremental cost-effectiveness ratio (ICER). The results were also adjusted to international dollars (I$) using purchasing power parities (PPP) (1 I$ = 17.79 THB for the year 2013). The comparison was between 1) an expanded neonatal screening programme using MS/MS screening for six prioritised diseases: phenylketonuria (PKU); isovaleric acidemia (IVA); methylmalonic acidemia (MMA); propionic acidemia (PA); maple syrup urine disease (MSUD); and multiple carboxylase deficiency (MCD); and 2) the current practice, i.e., existing PKU screening. The outcome and cost of treatment before and after clinical presentation were also compared to illustrate the potential benefit of early treatment for affected children. A budget impact analysis was conducted to illustrate the cost of implementing the programme for 10 years.

Results

The ICER of neonatal screening using MS/MS amounted to 1,043,331 THB per QALY gained (58,647 I$ per QALY gained). The potential benefits of early detection compared with late detection yielded significant results for PKU, IVA, MSUD, and MCD patients. The budget impact analysis indicated that the implementation cost of the programme was expected to be approximately 2,700 million THB (152 million I$) over 10 years.

Conclusion

At the current ceiling threshold, neonatal screening using MS/MS in the Thai context is not cost-effective. However, the results for the treatment of patients detected early with PKU, IVA, MSUD, and MCD are considered favourable. The budget impact analysis suggests that implementation of the programme will incur considerable expense under limited resources. A long-term epidemiological study on the incidence of IEM in Thailand is strongly recommended to ascertain the magnitude of the problem.
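For reference, a minimal sketch of the ICER arithmetic used above, with purely hypothetical per-newborn figures (not the study's model outputs) and the quoted 2013 PPP rate.

```python
def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    return (cost_new - cost_old) / (qaly_new - qaly_old)

THB_PER_INT_DOLLAR = 17.79    # 1 I$ = 17.79 THB (2013 PPP rate quoted above)

# Hypothetical per-newborn figures, chosen only to exercise the formula.
icer_thb = icer(cost_new=1500.0, qaly_new=0.0010, cost_old=500.0, qaly_old=0.0002)
print(icer_thb)                           # 1,250,000 THB per QALY gained
print(icer_thb / THB_PER_INT_DOLLAR)      # about 70,264 I$ per QALY gained
```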

18.
Appropriate study design and proper statistical analysis are necessary ingredients for improving the quality and reliability of the information in journal articles. General surgery and plastic surgery articles were compared with respect to the principal author's academic degree, the presence of a Ph.D. as a coauthor, the study type, the presence of statistical analysis, the appropriateness of that analysis, and the types of errors in study design or statistical analysis. Ph.D. authorship was associated with an increased percentage of articles using statistical analysis. When compared with general surgery articles, plastic surgery articles performed four times fewer statistical analyses. However, when statistical analyses were performed, there were few differences between these two specialties. Although there were no differences in the types of statistical analysis errors, there were differences in the types of study design errors. The causes of these discrepancies may lie in the nature of plastic surgery; they may be reduced by adherence to Feinstein's principles of study design and result interpretation.

19.
Summary: Microdensitometric errors can originate in the instrument, in the specimen or in the human operator. Instrumental sources of systematic error mostly reduce the apparent integrated absorbance, especially of relatively small and highly absorbing objects. They can be assessed, minimized or eliminated by available techniques, but with modern apparatus are in general important only if results of high accuracy are required. Instrument errors include: (a) distributional error, due to the use of too large a measuring spot or the specimen being out of focus; (b) glare (stray light), due mainly to multiple reflections in the microscope objective; (c) monochromator error (the use of insufficiently pure light); (d) calibration errors; and (e) errors resulting from lack of photometric linearity, or the specimen absorbance exceeding the measuring range of the instrument. Specimen errors, including problems of stain specificity and stoichiometry, are now the most important obstacles to a wider use of microdensitometry in histochemistry. The following selected topics are briefly discussed: fading; rate of staining; Beer's law deviations; and the microdensitometry of opaque particles. Human errors include faulty logic, and failure to attempt an investigation because of anticipated difficulties which are in fact exaggerated or imaginary. The significance of microdensitometric results should, in general, be assessed by biological criteria rather than merely statistically; the use of appropriate internal biological controls and standards is urged wherever possible.

20.
Recent evidence suggests that skeletal adaptation is organized by a functional unit that includes cells of diverse origin working in coordination. Genetic and metabolic factors control and regulate the processes of modeling and remodeling, only rarely acting on the isolated individual functions of specific cell lines. Errors in the genetic or metabolic regulation of the functional unit affect the entire process of skeletal adaptation rather than specific elements of it. Viewed in this way, some metabolic bone diseases can be understood as relatively simple errors in factors that control the coordinated activities of the entire functional unit. This paper reviews the modeling and remodeling processes and demonstrates how abnormal morphological characteristics of bone tissue can be viewed as products of specific errors in the adaptive process.
