Similar Literature
 20 similar documents found (search time: 15 ms)
1.
This article is concerned with a method for making inferences about various measures of vaccine efficacy. These measures describe reductions in susceptibility and in the potential to transmit infection. The method uses data on household outbreaks and is based on a model that allows for transmission of infection both from within a household and from the outside. The use of household data is motivated by the fact that household outbreaks contain some information about the likely source of each infection, and so should be informative about vaccine-induced reductions in the potential to transmit infection. For illustration, the method is applied to observed data on household outbreaks of smallpox. These data have the required form, and the number of households involved is of a size that could be managed in a vaccine trial. It is found that vaccine effects, such as the mean reduction in susceptibility and the mean reduction, per infectious contact, in the potential to infect others, can be estimated with precision. However, a more specific parameter, reflecting the reduction in infectivity for individuals who respond only partially to vaccination, is not estimated well in this application. An evaluation of the method using artificial data shows that this parameter can be estimated with greater precision when outbreak data are available on a large number of small households.
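The structure of such a model can be illustrated with a deliberately simplified sketch: households of size two, a per-person probability of infection from the community, and a within-household transmission probability, with vaccine effects entering as multiplicative reductions in susceptibility and infectivity. All function names and parameter values below are hypothetical illustrations, not the article's actual model.

```python
import numpy as np

def household2_probs(p_comm, q_house, vac, theta_s=0.5, theta_i=0.7):
    """Final-size probabilities for a 2-person household.

    p_comm : baseline probability of infection from the community
    q_house: baseline within-household transmission probability
    vac    : (v0, v1) vaccination indicators for the two members
    theta_s: multiplicative reduction in susceptibility if vaccinated
    theta_i: multiplicative reduction in infectivity if vaccinated
    (the theta values here are hypothetical illustrations)
    """
    # per-person community infection probabilities
    p = [p_comm * (theta_s if v else 1.0) for v in vac]

    # within-household transmission probability from member i to member j
    def q(i, j):
        qq = q_house
        if vac[i]:
            qq *= theta_i   # infector is vaccinated
        if vac[j]:
            qq *= theta_s   # recipient is vaccinated
        return qq

    p_none = (1 - p[0]) * (1 - p[1])
    p_only0 = p[0] * (1 - p[1]) * (1 - q(0, 1))
    p_only1 = p[1] * (1 - p[0]) * (1 - q(1, 0))
    p_both = 1.0 - p_none - p_only0 - p_only1
    return p_none, p_only0, p_only1, p_both

def log_likelihood(params, households):
    """households: list of (vac, outcome), with outcome in {0,1,2,3}
    indexing (none, only member 0, only member 1, both infected)."""
    p_comm, q_house, theta_s, theta_i = params
    ll = 0.0
    for vac, outcome in households:
        probs = household2_probs(p_comm, q_house, vac, theta_s, theta_i)
        ll += np.log(probs[outcome])
    return ll
```

Maximizing this likelihood (e.g., by passing its negative to scipy.optimize.minimize) yields estimates of the susceptibility and infectivity effects; the article's model additionally handles larger households and partial response to vaccination.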

2.
Summary We have recently described a method of building phylogenetic trees and have outlined an approach for proving whether a particular tree is optimal for the data used. In this paper we describe in detail the method of establishing lower bounds on the length of a minimal tree by partitioning the data set into subsets. All characters that could be involved in duplications in the data are paired with all other such characters. A matching algorithm is then used to obtain the pairing of characters that reveals the most duplications in the data. This matching may still not account for all nucleotide substitutions on the tree. The structure of the tree is then used to help select subsets of three or more characters until the lower bound found by partitioning equals the length of the tree. The tree must then be a minimal tree, since no tree can exist with a length less than the lower bound. The method is demonstrated using a set of 23 vertebrate cytochrome c sequences with the criterion of minimizing the total number of nucleotide substitutions. There are 13,113,070,457,687,988,603,440,625 (about 1.3 × 10^25) topologically distinct trees that can be constructed from this data set, and the method identifies 144 minimal tree variants. The method is general in the sense that it can be used with other data and other criteria of length. It will not always be possible to prove a tree minimal, but the method will in any case give upper and lower bounds on the length of minimal trees.
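The pairing idea can be sketched for the special case of binary characters using the classical compatibility criterion: two binary characters that display all four state combinations force at least one extra substitution on any tree, and the pairing that reveals the most such conflicts is a maximum-weight matching. This is a minimal illustration of the bounding strategy, not the paper's exact procedure.

```python
import itertools
import networkx as nx

def incompatible(col_a, col_b):
    """Two binary characters conflict on every tree iff all four
    state combinations occur among the taxa."""
    return len(set(zip(col_a, col_b))) == 4

def parsimony_lower_bound(columns):
    """columns: list of binary character columns (tuples over taxa).
    Baseline: each character with both states present needs >= 1 change;
    each matched incompatible pair forces >= 1 extra change."""
    baseline = sum(1 for c in columns if len(set(c)) == 2)
    g = nx.Graph()
    for i, j in itertools.combinations(range(len(columns)), 2):
        if incompatible(columns[i], columns[j]):
            g.add_edge(i, j, weight=1)
    matching = nx.max_weight_matching(g, maxcardinality=True)
    return baseline + len(matching)

# toy example: 4 taxa, 3 mutually incompatible characters
cols = [(0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 0)]
print(parsimony_lower_bound(cols))  # baseline 3 + 1 matched conflict = 4
```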

3.
In functional data analysis for longitudinal data, the observation process is typically assumed to be noninformative, an assumption that is often violated in real applications. Methods that fail to account for dependence between observation times and longitudinal outcomes may therefore yield biased estimates. For longitudinal data with informative observation times, we find that under a general class of shared-random-effect models, one commonly used functional data method may lead to inconsistent model estimation, while another yields consistent and even rate-optimal estimation. Specifically, we show that the mean function can be estimated appropriately via penalized splines and the covariance function via penalized tensor-product splines, both with specific choices of parameters. Theoretical results are provided for the proposed method, and simulation studies and a real data analysis are conducted to demonstrate its performance.
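As a rough illustration of penalized-spline mean estimation (not the paper's exact estimator or its theoretically motivated parameter choices), one can regress on a B-spline basis with a difference penalty on the coefficients; the basis size, penalty order, and penalty weight below are arbitrary.

```python
import numpy as np
from scipy.interpolate import BSpline

def pspline_fit(x, y, n_basis=20, degree=3, lam=1.0):
    """Penalized B-spline (P-spline) estimate of a mean function."""
    # clamped knot vector: repeated boundary knots plus equally spaced knots
    xl, xr = x.min(), x.max()
    inner = np.linspace(xl, xr, n_basis - degree + 1)
    t = np.r_[[xl] * degree, inner, [xr] * degree]
    # design matrix: evaluate each basis function at the observation times
    B = np.column_stack([
        BSpline(t, (np.arange(n_basis) == j).astype(float), degree)(x)
        for j in range(n_basis)
    ])
    # second-order difference penalty on adjacent coefficients
    D = np.diff(np.eye(n_basis), n=2, axis=0)
    coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return coef, B @ coef

# toy usage
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)
coef, fitted = pspline_fit(x, y)
```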

4.
Sedimentation data acquired with the interference optical scanning system of the Optima XL-I analytical ultracentrifuge can exhibit time-invariant noise components, as well as small radial-invariant baseline offsets, both superimposed on the radial fringe shift data produced by the macromolecular solute distribution. A well-established approach to interpreting such ultracentrifugation data is the analysis of time-differences of the measured fringe profiles, as employed in the g(s*) method. We demonstrate how the technique of separating linear and nonlinear parameters can be used in the modeling of interference data to unravel the time-invariant and radial-invariant noise components. This allows the direct application of recently developed approximate analytical and numerical solutions of the Lamm equation to the analysis of interference optical fringe profiles. The presented method is statistically advantageous since it requires differentiation of neither the data nor the model functions. The method is demonstrated on experimental data and compared with the results of a g(s*) analysis. It is also shown that the calculation of time-invariant noise components can be useful in the analysis of absorbance optical data: they can be extracted from data acquired during the approach to equilibrium and used to increase the reliability of the results obtained from a sedimentation equilibrium analysis.
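The linear part of such a fit has a simple closed form: for any fixed guess of the sedimentation model, the least-squares time-invariant (per-radius) and radial-invariant (per-scan) offsets of the residual surface are obtained from row and column means. The sketch below shows only this linear-algebra core, not the actual software implementation.

```python
import numpy as np

def fit_systematic_noise(data, model):
    """Least-squares time-invariant and radial-invariant noise.

    data, model : 2-D arrays of fringe displacement, shape (n_radii, n_scans)
    Returns b(r) (time-invariant, one value per radius) and beta(t)
    (radial-invariant, one value per scan) minimizing
    || data - model - b 1' - 1 beta' ||^2.
    """
    R = data - model                      # residual surface
    grand = R.mean()
    b = R.mean(axis=1)                    # row means: time-invariant part
    beta = R.mean(axis=0) - grand         # column means, centered for
                                          # identifiability (mean(beta) = 0)
    return b, beta

# inside a nonlinear fit, one would re-solve b and beta at each candidate
# value of the sedimentation parameters and optimize only the nonlinear part
```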

5.
Yang R  Yi N  Xu S 《Genetica》2006,128(1-3):133-143
The maximum likelihood method of QTL mapping assumes that the phenotypic values of a quantitative trait follow a normal distribution. If this assumption is violated, some form of transformation should be applied to make it approximately true. The Box–Cox transformation is a general transformation method that can be applied to many different types of data; its flexibility is due to a variable, called the transformation factor, appearing in the Box–Cox formula. We developed a maximum likelihood method that treats the transformation factor as an unknown parameter, estimated from the data simultaneously with the QTL parameters. The method makes an objective choice of data transformation and can thus be applied to QTL analysis of many different types of data. Simulation studies show that (1) the Box–Cox transformation can substantially increase the power of QTL detection; (2) the Box–Cox transformation can replace some specialized transformation methods commonly used in QTL mapping; and (3) applying the Box–Cox transformation to data that are already normally distributed does not harm the result.
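The core estimation step, profiling the likelihood over the transformation factor, is readily available in SciPy; the snippet below shows the idea on toy data rather than within the authors' joint QTL model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y = rng.lognormal(mean=0.0, sigma=0.7, size=500)   # skewed toy phenotype

# boxcox jointly returns the transformed data and the MLE of lambda
y_transformed, lam_hat = stats.boxcox(y)
print(f"estimated transformation factor: {lam_hat:.3f}")

# the profile log-likelihood over lambda, which a joint QTL analysis
# would maximize together with the QTL parameters
lams = np.linspace(-1, 1, 101)
ll = [stats.boxcox_llf(lam, y) for lam in lams]
```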

6.
1 Introduction. The electrocardiogram (ECG) offers a great deal of important information for the diagnosis of heart diseases. Because abnormal ECG waveforms can occur in unknown situations, they can be caught by long-time continuous monitoring. The Holter monitoring system, which can record 24-hour ECG data, is one of the effective means of providing this function. Although large-scale IC memories have been developed and are available for storing long-time ECG data, it is very difficult and troublesome to process, store, or transmit such a large amount of data. In the digitized ECG data, there are…

7.
In model building and model evaluation, cross-validation is a frequently used resampling method. Unfortunately, it can be quite time consuming. In this article, we discuss an approximation method that is much faster and can be used in generalized linear models and in Cox's proportional hazards model with a ridge penalty term. Our approximation is based on a Taylor expansion around the estimate of the full model; in this way, all cross-validated estimates are approximated without refitting the model. The tuning parameter can then be chosen based on these approximations and optimized in far less time. The method is most accurate when approximating leave-one-out cross-validation results for large data sets, which is otherwise the most computationally demanding situation. To demonstrate its performance, the method is applied to several microarray data sets. An R package, penalized, which implements the method, is available on CRAN.
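For the special case of a linear model with a ridge penalty, leave-one-out errors require no refitting at all, via the classical hat-matrix identity e(-i) = e_i / (1 - h_ii); the article's Taylor-expansion approach generalizes this kind of shortcut to GLMs and the Cox model. A minimal sketch:

```python
import numpy as np

def ridge_loo_errors(X, y, lam):
    """Exact leave-one-out residuals for ridge regression without refitting."""
    n, p = X.shape
    XtX = X.T @ X + lam * np.eye(p)
    beta = np.linalg.solve(XtX, X.T @ y)
    H = X @ np.linalg.solve(XtX, X.T)       # ridge hat matrix
    resid = y - X @ beta
    return resid / (1.0 - np.diag(H))       # e_i / (1 - h_ii)

def best_lambda(X, y, grid):
    """Choose the tuning parameter by minimizing the LOO error."""
    scores = [np.mean(ridge_loo_errors(X, y, lam) ** 2) for lam in grid]
    return grid[int(np.argmin(scores))]
```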

8.
Recent advances in high-throughput technologies have made it possible to generate both gene and protein sequence data at an unprecedented rate and scale, thereby enabling entirely new "omics"-based approaches to the analysis of complex biological processes. However, the amount and complexity of data that even a single experiment can produce seriously challenge researchers with limited bioinformatics expertise, who need to handle, analyze, and interpret the data before they can be understood in a biological context. Thus, there is an unmet need for tools allowing non-bioinformatics users to interpret large data sets. We have recently developed a method, NNAlign, which is generally applicable to any biological problem where quantitative peptide data are available. This method efficiently identifies underlying sequence patterns by simultaneously aligning peptide sequences and identifying motifs associated with quantitative readouts. Here, we provide a web-based implementation of NNAlign allowing non-expert end-users to submit their data (optionally adjusting method parameters) and in return receive a trained method (including a visual representation of the identified motif) that can subsequently be used as a prediction method and applied to unknown proteins or peptides. We have successfully applied this method to several different data sets, including peptide microarray-derived sets containing more than 100,000 data points. NNAlign is available online at http://www.cbs.dtu.dk/services/NNAlign.

9.
We propose a method for improving the quality of signal from DNA microarrays by using several scans at varying scanner sensitivities. A Bayesian latent intensity model is introduced for the analysis of such data. The method improves the accuracy with which expression can be measured in all ranges and extends the dynamic range of measured gene expression at the high end. Our method is generic and can be applied to data from any organism, for imaging with any scanner that allows varying the laser power, and for extraction with any image analysis software. Results from a self-self hybridization data set illustrate an improved precision in the estimation of gene expression compared with what can be achieved by applying standard methods to a single scan.

10.
Besides the problem of searching for effective methods of data analysis, there are additional problems in handling data of high uncertainty. Uncertainty problems often arise in the analysis of ecological data, e.g. in cluster analysis. Conventional clustering methods based on Boolean logic ignore the continuous nature of ecological variables and the uncertainty of ecological data, which can result in misclassification or misinterpretation of the data structure. Clusters with fuzzy boundaries better reflect the continuous character of ecological features. The problem, however, is that the common clustering methods (such as the fuzzy c-means method) are designed only for treating crisp data; that is, they provide a fuzzy partition only for crisp inputs (e.g. exact measurement data). This paper presents the extension and implementation of the method of fuzzy clustering of fuzzy data proposed by Yang and Liu [Yang, M.-S. and Liu, H.-H., 1999. Fuzzy clustering procedures for conical fuzzy vector data. Fuzzy Sets and Systems, 106, 189-200]. Imprecise data can be defined as multidimensional fuzzy sets without sharply formed boundaries (in the form of so-called conical fuzzy vectors) and can then be used for fuzzy clustering together with crisp data. This is particularly useful when no information is available about the variances that describe the accuracy of the data and probabilistic approaches are therefore impossible. The method proposed by Yang has been extended and implemented in the fuzzy clustering system EcoFucs, developed at the University of Kiel. As an example, the paper presents a fuzzy cluster analysis of chemicals according to their ecotoxicological properties. The uncertainty and imprecision of ecotoxicological data are very high because of the use of various data sources and investigation tests and the difficulty of comparing such data. The implemented method can be very helpful in searching for an adequate partition of ecological data into clusters with similar properties.
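For orientation, the crisp-data fuzzy c-means baseline that Yang and Liu's procedure extends can be sketched as follows (a standard textbook version, not the EcoFucs implementation; the fuzzifier m and the iteration count are arbitrary choices):

```python
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, n_iter=100, seed=0):
    """Standard fuzzy c-means for crisp data X of shape (n_samples, n_dim)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.dirichlet(np.ones(c), size=n)        # random fuzzy memberships
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # squared distances of every point to every center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        d2 = np.maximum(d2, 1e-12)
        # membership update: u_ik proportional to d_ik^(-2/(m-1))
        inv = d2 ** (-1.0 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U
```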

11.
Summary Determining the volumes of peaks in 2D NMR spectra can be prohibitively difficult in cases of overlapping, broad lines. Deconvolution and parameter estimation can be attempted on either the time-domain or the frequency-domain data. We present a method of estimating spectral parameters from frequency-domain data, using a combination of Lorentzian and Gaussian lineshapes for reference lines. This approach combines a previously published method of projecting the data onto a linear space spanned by reference lines with a nonlinear least-squares fitting algorithm. Comparison of this method with other published methods of frequency-domain deconvolution shows that it is both more precise and more accurate when estimating 2D volumes.
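A one-dimensional toy version of mixed Lorentzian/Gaussian lineshape fitting by nonlinear least squares is sketched below; the paper works on 2D data and adds the linear projection onto reference lines, so everything here is an illustrative reduction.

```python
import numpy as np
from scipy.optimize import curve_fit

def pseudo_voigt(x, amp, x0, width, eta):
    """Linear mix of a Lorentzian and a Gaussian of common width."""
    lor = width**2 / ((x - x0) ** 2 + width**2)
    gau = np.exp(-np.log(2) * ((x - x0) / width) ** 2)
    return amp * (eta * lor + (1 - eta) * gau)

# toy data: one broad peak plus noise
rng = np.random.default_rng(2)
x = np.linspace(-10, 10, 400)
y = pseudo_voigt(x, 3.0, 1.0, 2.0, 0.4) + rng.normal(scale=0.05, size=x.size)

popt, pcov = curve_fit(pseudo_voigt, x, y, p0=[1.0, 0.0, 1.0, 0.5])
# peak volume (area) follows from the fitted parameters by integration
area = np.sum(pseudo_voigt(x, *popt)) * (x[1] - x[0])
```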

12.
Traditional methods for organizing crop germplasm data create a separate data table for each crop species; this approach can no longer meet the needs of integrated analysis of germplasm data. This paper proposes a germplasm data organization method based on attribute-separated storage: a data table is built for each attribute of the germplasm, with no subordination relations between attributes. The method unifies data query operations, optimizes the query process, and improves the efficiency of analysis. It is flexible and extensible, conveniently integrates data relevant to germplasm analysis, and is suitable for building distributed germplasm-resource databases and related information systems.
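A minimal sketch of the attribute-separated layout in SQLite (all table and column names are hypothetical, chosen only to illustrate one-table-per-attribute storage keyed by a shared accession identifier):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# one narrow table per attribute, all keyed by the accession id;
# no attribute table depends on another
for attr in ("plant_height", "grain_weight", "flowering_date"):
    cur.execute(
        f"CREATE TABLE {attr} (accession_id TEXT PRIMARY KEY, value TEXT)"
    )

cur.execute("INSERT INTO plant_height VALUES ('ACC001', '95 cm')")
cur.execute("INSERT INTO grain_weight VALUES ('ACC001', '42 g')")

# queries over any combination of attributes use the same join pattern
rows = cur.execute(
    """SELECT h.accession_id, h.value, w.value
       FROM plant_height AS h JOIN grain_weight AS w
       ON h.accession_id = w.accession_id"""
).fetchall()
```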

13.
This report presents a new approach to studying the metabolic and kinetic properties of anaerobic sludge from single batch experiments. The two main features of the method are that methane production is measured on-line with a relatively cheap system and that the methane production data can be plotted as rate-versus-time curves. The case studies presented here (specific methanogenic activity, biodegradability, and toxicity tests) show that very accurate kinetic data can be obtained. The method is specifically useful in experiments in which strong changes in methane production occur, and it is proposed as a powerful tool for studying methanogenic systems. Furthermore, the method is simple and could be implemented by industry in the routine analysis of sludge.
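Turning cumulative on-line methane readings into the advocated rate-versus-time curves is essentially numerical differentiation with some smoothing; a sketch follows, in which the filter window and polynomial order are arbitrary choices.

```python
import numpy as np
from scipy.signal import savgol_filter

def methane_rate_curve(t_hours, cumulative_ml):
    """Differentiate smoothed cumulative methane volume to get a rate curve."""
    smooth = savgol_filter(cumulative_ml, window_length=11, polyorder=3)
    return np.gradient(smooth, t_hours)      # mL per hour

# toy usage: logistic-shaped cumulative production
t = np.linspace(0, 48, 200)
v = 300.0 / (1.0 + np.exp(-(t - 20) / 4.0))
r = methane_rate_curve(t, v)                 # rate peaks near t = 20 h
```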

14.
In this paper we propose an efficient color segmentation method based on the Support Vector Machine classifier operating in one-class mode. The method has been developed especially for a road-sign recognition system, although it can be used in other applications. Its main advantage comes from the fact that segmentation of the characteristic colors is performed not in the original but in a higher-dimensional feature space, where a better encapsulation of the data by a hypersphere can usually be achieved. Moreover, the classifier does not try to capture the whole distribution of the input data, which is often difficult to achieve. Instead, characteristic data samples, called support vectors, are selected which allow construction of the tightest hypersphere that encloses the majority of the input data. Classification of a test sample then simply consists in measuring its distance to the centre of the found hypersphere. Experimental results show the high accuracy and speed of the proposed method.
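A sketch of the workflow with scikit-learn's OneClassSVM, a closely related one-class formulation (for the RBF kernel it is equivalent to the hypersphere description); the color space, nu, and gamma below are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# training: pixel colors sampled from regions of the characteristic
# sign color (here fake RGB samples around a reddish hue)
rng = np.random.default_rng(3)
red_pixels = rng.normal(loc=[200, 40, 40], scale=15, size=(500, 3))

clf = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
clf.fit(red_pixels / 255.0)

# segmentation: classify every pixel of an image (flattened to n x 3)
image = rng.integers(0, 256, size=(64, 64, 3)).astype(float)
labels = clf.predict(image.reshape(-1, 3) / 255.0)   # +1 = sign color
mask = (labels == 1).reshape(64, 64)
```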

15.
Tumors often harbor orders of magnitude more mutations than healthy tissues. The increased number of mutations may be due to an elevated mutation rate, to frequent cell death and correspondingly rapid cell turnover, or to a combination of the two. It is difficult to disentangle these two mechanisms from widely available bulk sequencing data, in which sequences from individual cells are intermixed and the cell lineage tree of the tumor therefore cannot be resolved. Here we present a method that can simultaneously estimate the cell turnover rate and the mutation rate from bulk sequencing data. Our method works by simulating tumor growth and finding the parameters under which the observed data are reproduced with maximum likelihood. Applying this method to a real tumor sample, we find that both the mutation rate and the frequency of cell death may be high.
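The simulate-and-match logic can be caricatured in a few lines: simulate a birth-death-mutation branching process, record a summary of the mutation burden, and scan a parameter grid for the combination that best reproduces the observed summary. This is a toy of the general strategy with entirely hypothetical parameters, not the authors' likelihood machinery.

```python
import numpy as np

def simulate_burden(mu, death_prob, n_final=2000, seed=0):
    """Generation-based birth-death process; each surviving daughter
    gains Poisson(mu) new mutations. Returns mean mutations per cell."""
    rng = np.random.default_rng(seed)
    cells = np.zeros(1)                       # mutation count per cell
    while 0 < cells.size < n_final:
        daughters = np.repeat(cells, 2)       # every cell divides
        daughters += rng.poisson(mu, daughters.size)
        survived = rng.random(daughters.size) >= death_prob
        cells = daughters[survived]
    return cells.mean() if cells.size else np.nan

def grid_fit(observed_burden, mus, deaths):
    """Pick (mu, death_prob) whose simulated burden is closest to the data."""
    best, best_err = None, np.inf
    for mu in mus:
        for d in deaths:
            err = abs(simulate_burden(mu, d) - observed_burden)
            if err < best_err:
                best, best_err = (mu, d), err
    return best
```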

16.
Body composition measurement is of cardinal significance for medical and clinical applications. Currently, the dual-energy X-ray absorptiometry (DEXA) technique is widely applied for this measurement. In this study, we present a novel measurement method using the absorption and phase information obtained simultaneously from an X-ray grating-based interferometer (XGI). Rather than requiring two projection data sets acquired with different X-ray energy spectra, the proposed method obtains the areal densities of both the bone and the surrounding soft tissue from a single projection data set. Experiments on a human body phantom constructed to validate the method show that the compositions can be calculated with improved accuracy compared with the dual-energy method, especially for the soft-tissue measurement. Since the proposed method can easily be implemented on a current XGI setup, it will greatly extend the applications of XGI and has the potential to become an alternative to DEXA for human body composition measurement.
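The reconstruction core reduces to a small linear inversion per detector pixel: the absorption and phase signals are each approximately linear in the areal densities of bone and soft tissue, so a single calibration matrix inverts both at once. The coefficients below are placeholders, not measured calibration values.

```python
import numpy as np

# calibration matrix: signal per unit areal density (placeholder numbers)
#                bone   soft tissue
C = np.array([[0.50,   0.20],    # absorption signal coefficients
              [1.80,   1.10]])   # phase signal coefficients

def areal_densities(absorption, phase):
    """Solve the 2x2 system for (bone, soft) areal densities per pixel.

    absorption, phase : 2-D arrays holding the two XGI signal images
    """
    signals = np.stack([absorption.ravel(), phase.ravel()])   # shape (2, n)
    rho = np.linalg.solve(C, signals)                         # shape (2, n)
    bone = rho[0].reshape(absorption.shape)
    soft = rho[1].reshape(absorption.shape)
    return bone, soft
```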

17.
The dynamic response of a complex biological or chemical system to a perturbation must often be described by an integral over an effectively continuous relaxation spectrum. Because of its well-known instability to experimental error, direct estimation of the spectrum is generally considered unfeasible. However, we show that good estimates can be obtained by constraining the spectrum to be the smoothest one consistent with the data. Additionally constraining the spectrum to be non-negative, when there is a priori knowledge of this, can further increase its accuracy. The method is completely automatic in that no initial estimates or assumptions about the functional form of the spectrum are necessary. Models can therefore be tested more rigorously and objectively, since the functional form they predict for the spectrum need not be assumed at the outset of the analysis, as it must be with parameter-fitting procedures. The method is illustrated on simulated data on the photodissociation of CO from heme proteins at low temperatures. The nonuniqueness of the solutions is discussed.
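The smoothest-consistent-spectrum idea is a Tikhonov-type regularization; with the non-negativity constraint, it can be solved by stacking a smoothness penalty under the kernel matrix and running non-negative least squares. A generic sketch (the discretization and the fixed penalty weight are illustrative; the paper's method also selects the regularization level from the data):

```python
import numpy as np
from scipy.optimize import nnls

def smooth_nonneg_spectrum(t, y, rates, lam=1.0):
    """Recover a relaxation spectrum f >= 0 with y(t) ~ sum_k f_k exp(-r_k t).

    Minimizes ||K f - y||^2 + lam * ||D2 f||^2 subject to f >= 0,
    where D2 is a second-difference (curvature) operator.
    """
    K = np.exp(-np.outer(t, rates))              # discretized kernel
    D2 = np.diff(np.eye(len(rates)), n=2, axis=0)
    A = np.vstack([K, np.sqrt(lam) * D2])        # stacked LS system
    b = np.concatenate([y, np.zeros(D2.shape[0])])
    f, _ = nnls(A, b)
    return f

# toy usage: a one-peak spectrum observed through exponential decay
t = np.linspace(0, 5, 200)
rates = np.logspace(-1, 1, 60)
true = np.exp(-((np.log(rates) - np.log(0.5)) ** 2) / 0.1)
y = np.exp(-np.outer(t, rates)) @ true + np.random.default_rng(4).normal(
    scale=0.01, size=t.size)
f_hat = smooth_nonneg_spectrum(t, y, rates, lam=0.1)
```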

18.
19.
Protein function prediction based on functional modules of protein networks
In the post-genomic era, now that genome sequences have been deciphered, the rapid development of systems-biology experiments has produced large amounts of protein-protein interaction data. Using these data to identify functional modules and to predict protein function is of great significance in functional genomics research. Departing from the traditional clustering paradigm based on pairwise similarity between proteins, we start directly from the notion of protein functional groups and consider first- and second-order interactions between groups, proposing a modular clustering method (MCM). Cluster analysis of experimental data is then used to predict the functions of unannotated proteins within each module. The predictive power and stability of the clustering results are analyzed with a hypergeometric-distribution P-value method and by adding, deleting, and rewiring interactions. The results show that the modular clustering method achieves high prediction accuracy and coverage, with good fault tolerance and stability. In addition, the analysis yields high-confidence predictions for some unannotated proteins, which may help guide biological experiments, and the algorithm applies generally to other networks with similar structure.
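The hypergeometric P-value used to judge whether a module is enriched for a given function is a standard computation; a sketch with SciPy (all names and numbers are illustrative):

```python
from scipy.stats import hypergeom

def enrichment_pvalue(n_total, n_annotated, module_size, module_annotated):
    """P(X >= module_annotated) for a module drawn from a network with
    n_total proteins, n_annotated of which carry the function."""
    return hypergeom.sf(module_annotated - 1, n_total, n_annotated,
                        module_size)

# e.g. 5 of a 10-protein module carry a function held by 40 of 1000 proteins
p = enrichment_pvalue(1000, 40, 10, 5)
```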

20.
Nielsen R 《Genetics》2001,159(1):401-411
This article describes a new Markov chain Monte Carlo (MCMC) method applicable to DNA sequence data, which treats mutations in the genealogy as missing data. The method facilitates inferences regarding the age and identity of specific mutations while taking the full complexity of the mutational process in DNA sequences into account. We demonstrate the utility of the method in three applications. First, we show how it can be used to make inferences regarding population genetic parameters such as theta (the effective population size times the mutation rate). Second, we show how it can be used to estimate the ages of mutations in finite-sites models and to make inferences regarding the distribution and ages of nonsynonymous and synonymous mutations. The method is applied to two previously published data sets, and we demonstrate that in one of them the average age of nonsynonymous mutations is significantly lower than the average age of synonymous mutations, suggesting the presence of slightly deleterious mutations. Third, we demonstrate how the method can in general be used to evaluate the posterior distribution of a function of a mapping of mutations on a gene genealogy. This application is useful for evaluating the uncertainty associated with methods that rely on mapping mutations onto a phylogeny or gene genealogy.
