Similar literature
 20 similar articles found (search time: 31 ms)
1.
More and more noninvasive genetic data are being produced but a general methodology to quantify genotyping error rates from non-pilot data remains lacking. Here we propose a mathematical approach to estimate genotyping error rates by exploring the relationship between errors and PCR replicates. This method can be used to quantify the error rates for either the multi-tubes approach designed by Taberlet et al. (Nucleic Acids Res 24: 3189–3194, 1996) or the pilot method by Prugh et al. (Mol Ecol 14: 1585–1596, 2005).

2.
Misspecified relationships can have serious consequences for linkage studies, resulting in either reduced power or false-positive evidence for linkage. If some individuals in the pedigree are untyped, then Mendelian errors may not be observed. Previous approaches to detection of misspecified relationships by use of genotype data were developed for sib and half-sib pairs. We extend the likelihood calculations of Göring and Ott and Boehnke and Cox to more-general relative pairs, for which identity-by-descent (IBD) status is no longer a Markov chain, and we propose a likelihood-ratio test. We also extend the identity-by-state (IBS)-based test of Ehm and Wagner to nonsib relative pairs. The likelihood-ratio test has high power, but its drawbacks include the need to construct and apply a separate Markov chain for each possible alternative relationship and the need for simulation to assess significance. The IBS-based test is simpler but has lower power. We propose two new test statistics, conditional expected IBD (EIBD) and adjusted IBS (AIBS), designed to retain the simplicity of IBS while increasing power by taking into account chance sharing. In simulations, the power of EIBD is generally close to that of the likelihood-ratio test. The power of AIBS is higher than that of IBS in all cases considered. We suggest a strategy of initial screening by use of EIBD and AIBS, followed by application of the likelihood-ratio test to only a subset of relative pairs, identified by use of EIBD and AIBS. We apply the methods to a Genetic Analysis Workshop 11 data set from the Collaborative Study on the Genetics of Alcoholism.
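
As a small illustration of the identity-by-state (IBS) sharing that the IBS-based tests build on, the sketch below counts the alleles two individuals share at each marker (0, 1, or 2) and averages over markers. The function names and the toy genotypes are hypothetical, and this is only the raw sharing statistic, not the EIBD or AIBS statistics described in the abstract.

```python
from collections import Counter


def ibs_count(g1, g2):
    """Number of alleles (0, 1 or 2) shared identical-by-state at one marker.

    Each genotype is an unordered pair of allele labels, e.g. ("A", "B").
    """
    shared = Counter(g1) & Counter(g2)  # multiset intersection of the two genotypes
    return sum(shared.values())


def mean_ibs(ind1, ind2):
    """Average IBS sharing across markers typed in both individuals."""
    counts = [ibs_count(a, b) for a, b in zip(ind1, ind2)]
    return sum(counts) / len(counts)


# Hypothetical toy data: three markers typed in two putative relatives.
person1 = [("A", "B"), ("A", "A"), ("C", "D")]
person2 = [("B", "A"), ("A", "B"), ("E", "F")]
print(mean_ibs(person1, person2))
```

A pair whose mean sharing is far from what the stated relationship predicts would be a candidate for the more expensive likelihood-ratio test.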

3.
An alternative to frequentist approaches to multiple comparisons is Duncan's k-ratio Bayes rule approach. The purpose of this paper is to compile key results on k-ratio Bayes rules for a number of multiple comparison problems that have heretofore been available only in separate papers or doctoral dissertations. Among other problems, multiple comparisons for means in one-way, two-way, and treatments-vs.-control structures will be reviewed. In the k-ratio approach, the optimal joint rule for a multiple comparisons problem is derived under the assumptions of additive losses and prior exchangeability for the component comparisons. In the component loss function for a comparison, a balance is achieved between the decision losses due to Type I and Type II errors by assuming that their ratio is k. The component loss is also linear in the magnitude of the error. Under the assumption of additive losses, the joint Bayes rule for the component comparisons applies to each comparison the Bayes test for that comparison considered alone. That is, a comparisonwise approach is optimal. However, under prior exchangeability of the comparisons, the component test critical regions adapt to omnibus patterns in the data. For example, for a balanced one-way array of normally distributed means, the Bayes critical t value for a difference between means is inversely related to the F ratio measuring heterogeneity among the means, resembling a continuous version of Fisher's F-protected least significant difference rule. For more complicated treatment structures, the Bayes critical t value for a difference depends intuitively on multiple F ratios and marginal difference(s) (if applicable), such that the critical t value warranted for the difference can range from being as conservative as that given by a familywise rule to actually being anti-conservative relative to that given by the unadjusted 5%-level Student's t test.

4.
We introduce a new method, moment reconstruction, of correcting for measurement error in covariates in regression models. The central idea is similar to regression calibration in that the values of the covariates that are measured with error are replaced by "adjusted" values. In regression calibration the adjusted value is the expectation of the true value conditional on the measured value. In moment reconstruction the adjusted value is the variance-preserving empirical Bayes estimate of the true value conditional on the outcome variable. The adjusted values thereby have the same first two moments and the same covariance with the outcome variable as the unobserved "true" covariate values. We show that moment reconstruction is equivalent to regression calibration in the case of linear regression, but leads to different results for logistic regression. For case-control studies with logistic regression and covariates that are normally distributed within cases and controls, we show that the resulting estimates of the regression coefficients are consistent. In simulations we demonstrate that for logistic regression, moment reconstruction carries less bias than regression calibration, and for case-control studies is superior in mean-square error to the standard regression calibration approach. Finally, we give an example of the use of moment reconstruction in linear discriminant analysis and a nonstandard problem where we wish to adjust a classification tree for measurement error in the explanatory variables.
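
The idea of a variance-preserving adjustment conditional on the outcome can be sketched for a single covariate and a binary outcome. The sketch assumes classical additive error W = X + U with a known error variance, and takes the adjusted value to be E[W|Y](1 - G) + W·G with G = sqrt(Var(X|Y) / Var(W|Y)); this form, the function name, and the simulated data are assumptions made for illustration, since the abstract does not state the estimator explicitly.

```python
import random
import statistics


def moment_reconstruct(w, y, sigma_u2):
    """Return adjusted covariate values, computed separately within each outcome group.

    w        : error-prone measurements (assumed W = X + U, U independent of X and Y)
    y        : outcome labels (e.g. 0 = control, 1 = case)
    sigma_u2 : measurement-error variance, assumed known
    """
    adjusted = [0.0] * len(w)
    for label in set(y):
        idx = [i for i, yi in enumerate(y) if yi == label]
        group = [w[i] for i in idx]
        mean_w = statistics.fmean(group)
        var_w = statistics.pvariance(group)
        var_x = max(var_w - sigma_u2, 0.0)  # implied variance of the true covariate
        g = (var_x / var_w) ** 0.5          # shrinkage factor G for this group
        for i in idx:
            adjusted[i] = mean_w * (1 - g) + w[i] * g
    return adjusted


# Hypothetical simulation: 500 controls and 500 cases.
random.seed(1)
y = [i % 2 for i in range(1000)]
x = [random.gauss(1.0 * yi, 1.0) for yi in y]            # unobserved true covariate
sigma_u2 = 0.5
w = [xi + random.gauss(0.0, sigma_u2 ** 0.5) for xi in x]  # observed, with error
x_mr = moment_reconstruct(w, y, sigma_u2)

for label in (0, 1):
    w_g = [wi for wi, yi in zip(w, y) if yi == label]
    a_g = [ai for ai, yi in zip(x_mr, y) if yi == label]
    # Within each group the adjusted values keep the mean of W but have
    # the variance implied for the true covariate.
    print(label, statistics.pvariance(w_g) - sigma_u2, statistics.pvariance(a_g))
```

Within each outcome group the two printed variances agree, which is the "same first two moments" property the abstract describes; the adjusted values would then be used in place of W in the regression.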

5.
We define the memory capacity of networks of binary neurons with finite-state synapses in terms of retrieval probabilities of learned patterns under standard asynchronous dynamics with a predetermined threshold. The threshold is set to control the proportion of non-selective neurons that fire. An optimal inhibition level is chosen to stabilize network behavior. For any local learning rule we provide a computationally efficient and highly accurate approximation to the retrieval probability of a pattern as a function of its age. The method is applied to the sequential models (Fusi and Abbott, Nat Neurosci 10:485–493, 2007) and meta-plasticity models (Fusi et al., Neuron 45(4):599–611, 2005; Leibold and Kempter, Cereb Cortex 18:67–77, 2008). We show that as the number of synaptic states increases, the capacity, as defined here, either plateaus or decreases. In the few cases where multi-state models exceed the capacity of binary synapse models the improvement is small.

6.
Selection pressures on proteins are usually measured by comparing homologous nucleotide sequences (Zuckerkandl and Pauling 1965). Recently we introduced a novel method, termed volatility, to estimate selection pressures on proteins on the basis of their synonymous codon usage (Plotkin and Dushoff 2003; Plotkin et al. 2004). Here we provide a theoretical foundation for this approach. Under the Fisher-Wright model, we derive the expected frequencies of synonymous codons as a function of the strength of selection on amino acids, the mutation rate, and the effective population size. We analyze the conditions under which we can expect to draw inferences from biased codon usage, and we estimate the time scales required to establish and maintain such a signal. We find that synonymous codon usage can reliably distinguish between negative selection and neutrality only for organisms, such as some microbes, that experience large effective population sizes or periods of elevated mutation rates. The power of volatility to detect positive selection is also modest, requiring approximately 100 selected sites, but it depends less strongly on population size. We show that phenomena such as transient hyper-mutators can improve the power of volatility to detect selection, even when the neutral site heterozygosity is low. We also discuss several confounding factors, neglected by the Fisher-Wright model, that may limit the applicability of volatility in practice. Electronic supplementary material is available for this article and is accessible to authorised users. [Reviewing Editor: Dr. Lauren Meyers]
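
To make the notion of codon volatility concrete, the sketch below assumes the standard genetic code and defines the volatility of a codon as the fraction of its single-nucleotide, non-stop neighbours that encode a different amino acid. That definition is an assumption here, since the abstract does not restate it, and the function name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def volatility(codon):
    """Fraction of non-stop point-mutation neighbours with a different amino acid."""
    neighbours = [codon[:i] + b + codon[i + 1:]
                  for i in range(3) for b in BASES if b != codon[i]]
    sense = [n for n in neighbours if CODE[n] != "*"]          # drop stop codons
    changed = [n for n in sense if CODE[n] != CODE[codon]]     # nonsynonymous neighbours
    return len(changed) / len(sense)


# Two synonymous arginine codons with different volatility.
print(volatility("CGA"))  # 0.5
print(volatility("AGA"))  # 0.75
```

Both codons encode arginine, yet under this definition a gene that prefers AGA over CGA would score as more volatile, which is the kind of synonymous-usage signal the method relies on.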

7.
MOTIVATION: A major problem of pattern classification is estimation of the Bayes error when only small samples are available. One way to estimate the Bayes error is to design a classifier based on some classification rule applied to sample data, estimate the error of the designed classifier, and then use this estimate as an estimate of the Bayes error. Relative to the Bayes error, the expected error of the designed classifier is biased high, and this bias can be severe with small samples. RESULTS: This paper provides a correction for the bias by subtracting a term derived from the representation of the estimation error. It does so for Boolean classifiers, these being defined on binary features. Although the general theory applies to any Boolean classifier, a model is introduced to reduce the number of parameters. A key point is that the expected correction is conservative. Properties of the corrected estimate are studied via simulation. The correction applies to binary predictors because they are mathematically identical to Boolean classifiers. In this context the correction is adapted to the coefficient of determination, which has been used to measure nonlinear multivariate relations between genes and design genetic regulatory networks. An application using gene-expression data from a microarray experiment is provided on the website http://gspsnap.tamu.edu/smallsample/ (user: 'smallsample', password: 'smallsample').

8.
In the Gulf of Mexico (GOM), fish biomass estimates are necessary for the evaluation of habitat use and function following the mandate for ecosystem-based fisheries management in the recently reauthorized Sustainable Fisheries Act of 2007. Acoustic surveys have emerged as a potential tool to estimate fish biomass in shallow-water estuaries; however, the transformation of acoustic data into an index of fish biomass is not straightforward. In this article, we examine the consequences of equation selection for target strength (TS) to fish length relationships on potential error generation in hydroacoustic fish biomass estimates. We applied structural equation models (SEMs) to evaluate how our choice of an acoustic TS–fish length equation affected our biomass estimates, and how error occurred and propagated during this process. To demonstrate the magnitude of the error when applied to field data, we used SEMs on normally distributed simulated data to better understand the sources of error involved with converting acoustic data to fish biomass. As such, we describe where, and to what magnitude, error propagates when estimating fish biomass. Estimates of fish lengths were affected by measurement errors of TS, and by inexact relationships between fish length and TS. Differences in parameter estimates resulted in significant differences in fish biomass estimates and led to the conclusion that in the absence of known TS–fish length relationships, Love’s (J Acoust Soc Am 46:746–752, 1969) lateral-aspect equation may be an acceptable substitute for an ecosystem-specific TS–fish length relationship. Based upon SEMs applied to simulated data, perhaps the most important, yet most variable, component is the mean volume backscattering strength, which significantly inflated biomass errors in approximately 10% of the cases. Handling editor: M. Power

9.
We use a technique from engineering (Xia and Moog, in IEEE Trans. Autom. Contr. 48(2):330–336, 2003; Jeffrey and Xia, in Tan, W.Y., Wu, H. (Eds.), Deterministic and Stochastic Models of AIDS Epidemics and HIV Infections with Intervention, 2005) to investigate the algebraic identifiability of a popular three-dimensional HIV/AIDS dynamic model containing six unknown parameters. We find that not all six parameters in the model can be identified if only the viral load is measured; instead, only four parameters and the product of two parameters (N and λ) are identifiable. We introduce the concepts of an identification function and an identification equation and propose the multiple time point (MTP) method to form the identification function, which is an alternative to the previously developed higher-order derivative (HOD) method (Xia and Moog, in IEEE Trans. Autom. Contr. 48(2):330–336, 2003; Jeffrey and Xia, in Tan, W.Y., Wu, H. (Eds.), Deterministic and Stochastic Models of AIDS Epidemics and HIV Infections with Intervention, 2005). We show that the newly proposed MTP method has advantages over the HOD method in practical implementation. We also discuss the effect of the initial values of state variables on the identifiability of unknown parameters. We conclude that the initial values of output (observable) variables are part of the data that can be used to estimate the unknown parameters, but the identifiability of unknown parameters is not affected if these initial values are measured with error rather than known exactly. These noisy initial values only increase the estimation error of the unknown parameters. However, having the initial values of the latent (unobservable) state variables exactly known may help to identify more parameters. In order to validate the identifiability results, simulation studies are performed to estimate the unknown parameters and initial values from simulated noisy data. We also apply the proposed methods to a clinical data set to estimate HIV dynamic parameters. Although we have developed the identifiability methods based on an HIV dynamic model, the proposed methodologies are generally applicable to any ordinary differential equation systems.
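
The claim that only the product of N and λ is identifiable from viral load can be checked numerically. The sketch below assumes the commonly used three-compartment form dT/dt = λ - ρT - βTV, dI/dt = βTV - δI, dV/dt = NδI - cV (the abstract does not write the equations out, so this form, the parameter values, and the function name are all assumptions). Dividing λ and the unobserved initial cell counts by a factor k while multiplying N by k should leave the viral-load trajectory unchanged.

```python
def viral_load(lam, rho, beta, delta, N, c, T0, I0, V0, dt=0.01, steps=2000):
    """Integrate the assumed model with classical RK4 and return V at each step."""
    def f(T, I, V):
        return (lam - rho * T - beta * T * V,  # uninfected target cells
                beta * T * V - delta * I,      # infected cells
                N * delta * I - c * V)         # free virus (the measured output)

    state = (T0, I0, V0)
    trajectory = [V0]
    for _ in range(steps):
        k1 = f(*state)
        k2 = f(*(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = f(*(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = f(*(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt / 6.0 * (a + 2 * b + 2 * m + d)
                      for s, a, b, m, d in zip(state, k1, k2, k3, k4))
        trajectory.append(state[2])
    return trajectory


# Hypothetical parameter values, chosen only for illustration.
base = dict(lam=10.0, rho=0.1, beta=0.002, delta=0.5, N=100.0, c=3.0,
            T0=100.0, I0=1.0, V0=10.0)
k = 2.5
# Same product N * lam, with the latent initial states rescaled by 1/k.
scaled = dict(base, lam=base["lam"] / k, N=base["N"] * k,
              T0=base["T0"] / k, I0=base["I0"] / k)

v1 = viral_load(**base)
v2 = viral_load(**scaled)
max_rel_diff = max(abs(a - b) / abs(a) for a, b in zip(v1, v2))
print(max_rel_diff)  # differs only by floating-point rounding
```

The rescaling works only because the initial values of the latent cell populations are free to change, which is consistent with the remark that knowing those initial values exactly may allow more parameters to be identified.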

10.
The best reconstructions of the history of life will use both molecular time estimates and fossil data. Errors in molecular rate estimation typically are unaccounted for and no attempts have been made to quantify this uncertainty comprehensively. Here, we focus primarily on fossil calibration error because this error is least well understood and nearly universally disregarded. Our quantification of errors in the synapsid–diapsid calibration illustrates that although some error can derive from geological dating of sedimentary rocks, the absence of good stem fossils makes phylogenetic error the most critical. We therefore propose the use of calibration ages that are based on the first undisputed synapsid and diapsid. This approach yields minimum age estimates and standard errors of 306.1±8.5 MYR for the divergence leading to birds and mammals. Because this upper bound overlaps with the recent use of 310 MYR, we do not support the notion that several metazoan divergence times are significantly overestimated because of serious miscalibration (sensu Lee 1999). However, the propagation of relevant errors reduces the statistical significance of the pre-K–T boundary diversification of many bird lineages despite retaining similar point time estimates. Our results demand renewed investigation into suitable loci and fossil calibrations for constructing evolutionary timescales. [Reviewing Editor: Martin Kreitman]

11.
Petersen JE, Englund G. Oecologia 2005, 145(2):215-223
Enclosed, experimental ecosystems (“mesocosms”) are now widely used research tools in ecology. However, the small size, short duration and often simplified biological and physical complexity of mesocosm experiments raise questions about extrapolating results from these miniaturized ecosystems to nature. Dimensional analysis, a technique widely used in engineering to create scale models, employs “compensatory distortion” as a means of maintaining functional similarity in properties and relationships of interest. An earlier paper outlined a general approach to applying dimensional analysis to the construction and interpretation of mesocosm experiments (Petersen and Hastings in Am Nat 157:324, 2001). In this paper we use examples, largely drawn from the aquatic literature, to illustrate how dimensional approaches might be used to maintain key ecological properties. Such key properties include effective habitat size, environmental variability, vertical and horizontal gradients, and interactions among habitats. We distinguish both continuous and discrete approaches that can be used to achieve functional similarity through compensatory distortion. In addition to its potential as a tool for improving the realism of experimental ecosystems, the dimensional approach points towards new options for developing, testing and advancing our understanding of scaling relationships in nature. Electronic supplementary material is available for this article.

12.

Background  

For gene expression data obtained from a time-course microarray experiment, Liu et al. [1] developed a new algorithm for clustering genes with similar expression profiles over time. Performance of their proposal was compared with three other methods including the order-restricted inference based methodology of Peddada et al. [2, 3]. In this note we point out several inaccuracies in Liu et al. [1] and conclude that the order-restricted inference based methodology of Peddada et al. (programmed in the software ORIOGEN) indeed operates at the desired nominal Type 1 error level, an important feature of a statistical decision rule, while being computationally substantially faster than indicated by Liu et al. [1].

13.
During the RNA World, organisms experienced high rates of genetic errors, which implies that there was strong evolutionary pressure to reduce the errors’ phenotypical impact by suitably structuring the still-evolving genetic code. Therefore, the relative rates of the various types of genetic errors should have left characteristic imprints in the structure of the genetic code. Here, we show that it is therefore possible, to some extent, to reconstruct those error rates, as well as the nucleotide frequencies, for the time when the code was fixed. We find evidence indicating that the frequencies of G and C in the genome were not elevated. Since, for thermodynamic reasons, RNA in thermophiles tends to possess elevated G+C content, this result indicates either that the fixation of the genetic code occurred in organisms which were not thermophiles or that the code’s fixation occurred after the rise of DNA. Supplementary materials: original data and programs are available at the author’s web site.

14.
15.
Dermal connective tissue collagen is the major structural protein in skin. Fibroblasts within the dermis are largely responsible for collagen production and turnover. We have previously reported that dermal fibroblasts, in aged human skin in vivo, express elevated levels of CCN1, and that CCN1 negatively regulates collagen homeostasis by suppressing collagen synthesis and increasing collagen degradation (Quan et al. Am J Pathol 169:482–90, 2006; J Invest Dermatol 130:1697–706, 2010). In further investigations of CCN1 actions, we find that CCN1 alters collagen homeostasis by promoting expression of specific secreted proteins, which include matrix metalloproteinases and proinflammatory cytokines. We also find that CCN1-induced secretory proteins are elevated in aged human skin in vivo. We propose that CCN1 induces an “Age-Associated Secretory Phenotype” in dermal fibroblasts, which mediates collagen reduction and fragmentation in aged human skin.

16.
Somites are condensations of mesodermal cells that form along the two sides of the neural tube during early vertebrate development. They are one of the first instances of a periodic pattern, and give rise to repeated structures such as the vertebrae. A number of theories for the mechanisms underpinning somite formation have been proposed. For example, in the “clock and wavefront” model (Cooke and Zeeman in J. Theor. Biol. 58:455–476, 1976), a cellular oscillator coupled to a determination wave progressing along the anterior-posterior axis serves to group cells into a presumptive somite. More recently, a chemical signaling model has been developed and analyzed by Maini and coworkers (Collier et al. in J. Theor. Biol. 207:305–316, 2000; Schnell et al. in C. R. Biol. 325:179–189, 2002; McInerney et al. in Math. Med. Biol. 21:85–113, 2004), with equations for two chemical regulators with entrained dynamics. One of the chemicals is identified as a somitic factor, which is assumed to translate into a pattern of cellular aggregations via its effect on cell–cell adhesion. Here, the authors propose an extension to this model that includes an explicit equation for an adhesive cell population. They represent cell adhesion via an integral over the sensing region of the cell, based on a model developed previously for adhesion driven cell sorting (Armstrong et al. in J. Theor. Biol. 243:98–113, 2006). The expanded model is able to reproduce the observed pattern of cellular aggregates, but only under certain parameter restrictions. This provides a fuller understanding of the conditions required for the chemical model to be applicable. Moreover, a further extension of the model to include separate subpopulations of cells is able to reproduce the observed differentiation of the somite into separate anterior and posterior halves. N.J. Armstrong was supported by a Doctoral Training Account Studentship from EPSRC. K.J. Painter and J.A. Sherratt were supported in part by Integrative Cancer Biology Program Grant CA113004 from the US National Institutes of Health and in part by BBSRC grant BB/D019621/1 for the Centre for Systems Biology at Edinburgh.

17.
Highly multiplex DNA sequencers have greatly expanded our ability to survey human genomes for previously unknown single nucleotide polymorphisms (SNPs). However, sequencing and mapping errors, though rare, contribute substantially to the number of false discoveries in current SNP callers. We demonstrate that we can significantly reduce the number of false positive SNP calls by pooling information across samples. Although many studies prepare and sequence multiple samples with the same protocol, most existing SNP callers ignore cross-sample information. In contrast, we propose an empirical Bayes method that uses cross-sample information to learn the error properties of the data. This error information lets us call SNPs with a lower false discovery rate than existing methods.

18.
A statistical challenge in community ecology is to identify segregated and aggregated pairs of species from a binary presence–absence matrix, which often contains hundreds or thousands of such potential pairs. A similar challenge is found in genomics and proteomics, where the expression of thousands of genes in microarrays must be statistically analyzed. Here we adapt the empirical Bayes method to identify statistically significant species pairs in a binary presence–absence matrix. We evaluated the performance of a simple confidence interval, a sequential Bonferroni test, and two tests based on the mean and the confidence interval of an empirical Bayes method. Observed patterns were compared to patterns generated from null model randomizations that preserved matrix row and column totals. We evaluated these four methods with random matrices and also with random matrices that had been seeded with an additional segregated or aggregated species pair. The Bayes methods and Bonferroni corrections reduced the frequency of false-positive tests (type I error) in random matrices, but did not always correctly identify the non-random pair in a seeded matrix (type II error). All of the methods were vulnerable to identifying spurious secondary associations in the seeded matrices. When applied to a set of 272 published presence–absence matrices, even the most conservative tests indicated a fourfold increase in the frequency of perfectly segregated “checkerboard” species pairs compared to the null expectation, and a greater predominance of segregated versus aggregated species pairs. The tests did not reveal a large number of significant species pairs in the Vanuatu bird matrix, but in the much smaller Galapagos bird matrix they correctly identified a concentration of segregated species pairs in the genus Geospiza. The Bayesian methods provide for increased selectivity in identifying non-random species pairs, but the analyses will be most powerful if investigators can use a priori biological criteria to identify potential sets of interacting species.
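
To show what a segregated or "checkerboard" pair looks like in a presence–absence matrix, the sketch below scores each species pair by its number of checkerboard units, (Ri - S)(Rj - S), where Ri and Rj are the numbers of sites each species occupies and S is the number of shared sites. This index is assumed here as one common choice; the abstract does not name the pairwise statistic it uses, and the species and matrix are hypothetical. The null-model randomizations and the empirical Bayes step are not shown.

```python
from itertools import combinations


def checkerboard_units(row_i, row_j):
    """Checkerboard units for two species' presence-absence rows (1 = present)."""
    shared = sum(1 for a, b in zip(row_i, row_j) if a and b)
    return (sum(row_i) - shared) * (sum(row_j) - shared)


def is_perfect_checkerboard(row_i, row_j):
    """True if both species occur somewhere but never at the same site."""
    shared = sum(1 for a, b in zip(row_i, row_j) if a and b)
    return shared == 0 and sum(row_i) > 0 and sum(row_j) > 0


# Hypothetical matrix: rows are species, columns are sites.
matrix = {
    "sp_A": [1, 0, 1, 0, 1],
    "sp_B": [0, 1, 0, 1, 0],
    "sp_C": [1, 1, 1, 0, 0],
}

for (name_i, row_i), (name_j, row_j) in combinations(matrix.items(), 2):
    print(name_i, name_j,
          checkerboard_units(row_i, row_j),
          is_perfect_checkerboard(row_i, row_j))
```

In an analysis of the kind described, each observed pair score would be compared with the scores from randomized matrices that preserve row and column totals before any pair is called non-random.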

19.

Background  

HIV can evolve drug resistance rapidly in response to new drug treatments, often through a combination of multiple mutations [13]. It would be useful to develop automated analyses of HIV sequence polymorphism that are able to predict drug resistance mutations, and to distinguish different types of functional roles among such mutations, for example, those that directly cause drug resistance, versus those that play an accessory role. Detecting functional interactions between mutations is essential for this classification. We have adapted a well-known measure of evolutionary selection pressure (Ka/Ks) and developed a conditional Ka/Ks approach to detect important interactions.

20.
Huey and Slatkin’s (Q Rev Biol 51:363–384, 1976) cost–benefit model of lizard thermoregulation predicts variation in thermoregulatory strategies (from active thermoregulation to thermoconformity) with respect to the costs and benefits of the thermoregulatory behaviour and the thermal quality of the environment. Although this framework has been widely employed in correlative field studies, experimental tests aiming to evaluate the model are scarce. We conducted laboratory experiments to see whether the common lizard Zootoca vivipara, an active and effective thermoregulator in the field, can alter its thermoregulatory behaviour in response to differences in perceived predation risk and food supply in a constant thermal environment. Predation risk and food supply were represented by chemical cues of a sympatric snake predator and the lizards’ food in the laboratory, respectively. We also compared males and postpartum females, which have different preferred or “target” body temperatures. Both sexes thermoregulated actively in all treatments. We detected sex-specific differences in the way lizards adjusted their accuracy of thermoregulation to the treatments: males were less accurate in the predation treatment, while no such effects were detected in females. Neither sex reacted to the food treatment. With regard to the two main types of thermoregulatory behaviour (activity and microhabitat selection), the treatments had no significant effects. However, postpartum females were more active than males in all treatments. Our results further stress that increasing physiological performance by active thermoregulation has high priority in lizard behaviour, but also show that lizards can indeed shift their accuracy of thermoregulation in response to costs with possible immediate negative fitness effects (i.e. predation-caused mortality).

Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号