1.
We derive and compare the operating characteristics of hierarchical and square array-based testing algorithms for case identification in the presence of testing error. The operating characteristics investigated include efficiency (i.e., expected number of tests per specimen) and error rates (i.e., sensitivity, specificity, positive and negative predictive values, per-family error rate, and per-comparison error rate). The methodology is illustrated by comparing different pooling algorithms for the detection of individuals recently infected with HIV in North Carolina and Malawi.
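The efficiency calculus behind hierarchical pooling can be illustrated with the classic two-stage (Dorfman) scheme. This is a minimal sketch under assumed pool size, prevalence, sensitivity, and specificity; it is not the paper's full family of algorithms:

```python
def dorfman_efficiency(p, k, se=1.0, sp=1.0):
    """Expected tests per specimen for two-stage (Dorfman) pooling.

    A pool of k specimens is tested once; only a positive pool result
    triggers k individual retests.  With prevalence p, test sensitivity
    se, and specificity sp, the pool tests positive with probability
        P(+) = se * (1 - (1-p)**k) + (1 - sp) * (1-p)**k
    so the expected number of tests per specimen is 1/k + P(+).
    """
    p_pool_pos = se * (1 - (1 - p) ** k) + (1 - sp) * (1 - p) ** k
    return 1.0 / k + p_pool_pos

# Low prevalence makes pooling efficient: with p = 1% and pools of 10,
# far fewer than one test per specimen is needed on average.
eff = dorfman_efficiency(0.01, 10, se=0.99, sp=0.98)
```

At p = 1% the expected cost is roughly 0.21 tests per specimen, a five-fold saving over individual testing; as p grows the advantage shrinks, which is why the pool size must be tuned to prevalence.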

2.
A graphic approach to evaluate algorithms of secondary structure prediction
Algorithms for secondary structure prediction have been under development for nearly 30 years. However, the problem of how to appropriately evaluate and compare algorithms has not yet been completely solved. A graphic method for evaluating secondary structure prediction algorithms is proposed here. Traditionally, the performance of an algorithm is evaluated by a single number, i.e., an accuracy under one of various definitions. Instead of a number, we use a graph to evaluate an algorithm completely, in which mapping points are distributed in a three-dimensional space; each point represents the predicted secondary structure of one protein. Because the distribution of mapping points in the 3D space generally contains more information than a number or a set of numbers, algorithms can be expected to be evaluated and compared more objectively by the proposed graphic method. Based on the point distribution, six evaluation parameters are proposed that describe the overall performance of the evaluated algorithm. Furthermore, the graphic method is simple and intuitive. As an example of application, two advanced algorithms, the PHD and NNpredict methods, are evaluated and compared. It is shown that there is still much room for improvement in both algorithms. In particular, the accuracy of predicting the alpha-helix or beta-strand in proteins with high alpha-helix or beta-strand content, respectively, should be greatly improved for both algorithms.
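As a sketch of how one protein's prediction could be mapped to a point in 3D space, one plausible coordinate choice (an assumption for illustration, not the paper's actual definition) is the per-state accuracy triple over helix, strand, and coil residues:

```python
def per_state_accuracies(true_ss, pred_ss):
    """Map one protein's prediction to a 3D point: the fraction of
    helix (H), strand (E), and coil (C) residues predicted correctly.
    The coordinate choice is illustrative, not the paper's definition.
    true_ss, pred_ss: equal-length strings over the alphabet 'HEC'."""
    point = []
    for state in "HEC":
        idx = [i for i, s in enumerate(true_ss) if s == state]
        if idx:
            point.append(sum(pred_ss[i] == state for i in idx) / len(idx))
        else:
            point.append(1.0)  # no residues of this state: trivially correct
    return tuple(point)

# One helix residue mispredicted as strand; strand and coil perfect.
pt = per_state_accuracies("HHHEECCC", "HHEEECCC")
```

Plotting one such point per test protein yields the kind of point cloud from which distribution-based evaluation parameters can be read off.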

3.
A phylogenetic comparative method is proposed for estimating historical effects on comparative data using the partitions that compose a cladogram, i.e., its monophyletic groups. Two basic matrices, Y and X, are defined in the context of an ordinary linear model. Y contains the comparative data measured over t taxa. X consists of an initial tree matrix that contains all the xj monophyletic groups (each coded separately as a binary indicator variable) of the phylogenetic tree available for those taxa. The method seeks to define the subset of groups, i.e., a reduced tree matrix, that best explains the patterns in Y. This definition is accomplished via regression or canonical ordination (depending on the dimensionality of Y) coupled with Monte Carlo permutations. It is argued here that unrestricted permutations (i.e., under an equiprobable model) are valid for testing this specific kind of groupwise hypothesis. Phylogeny is either partialled out or, more properly, incorporated into the analysis in the form of component variation. Direct extensions allow for testing ecomorphological data controlled by phylogeny in a variation partitioning approach. Currently available statistical techniques make this method applicable under most univariate/multivariate models and metrics; two-way phylogenetic effects can be estimated as well. The simplest case (univariate Y), tested with simulations, yielded acceptable type I error rates. Applications presented include examples from evolutionary ethology, ecology, and ecomorphology. Results showed that the new technique detected previously overlooked variation clearly associated with phylogeny and that many phylogenetic effects on comparative data may occur at particular groups rather than across the entire tree.
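The univariate core of the idea, regressing comparative data on binary clade indicators and testing the fit with unrestricted permutations, can be sketched as below. This is an illustrative reduction only (the published method also covers canonical ordination, group selection, and variation partitioning), and the taxa and trait values are invented:

```python
import numpy as np

def group_permutation_test(y, X, n_perm=999, seed=0):
    """R^2 of the linear model y ~ clade indicators (columns of X, one
    binary column per monophyletic group), with an unrestricted
    (equiprobable) Monte Carlo permutation test of its significance."""
    def r2(yy):
        Xc = np.column_stack([np.ones(len(yy)), X])   # add intercept
        beta, *_ = np.linalg.lstsq(Xc, yy, rcond=None)
        resid = yy - Xc @ beta
        return 1.0 - (resid ** 2).sum() / ((yy - yy.mean()) ** 2).sum()
    rng = np.random.default_rng(seed)
    obs = r2(y)
    # p-value: how often a permuted y fits the clade structure as well
    hits = sum(r2(rng.permutation(y)) >= obs for _ in range(n_perm))
    return obs, (hits + 1) / (n_perm + 1)

# Eight taxa; one clade {0,1,2,3} with clearly higher trait values.
y = np.array([5.0, 5.2, 5.1, 4.9, 1.0, 1.2, 1.1, 0.9])
X = np.array([[1.0], [1.0], [1.0], [1.0], [0.0], [0.0], [0.0], [0.0]])
obs_r2, p = group_permutation_test(y, X)
```

Here the clade indicator explains nearly all trait variation, and permutations that scatter the clade members almost never match that fit, so the permutation p-value is small.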

4.
Current approaches to RNA structure prediction range from physics-based methods, which rely on thousands of experimentally measured thermodynamic parameters, to machine-learning (ML) techniques. While the methods for parameter estimation are successfully shifting toward ML-based approaches, the model parameterizations so far remained fairly constant. We study the potential contribution of increasing the amount of information utilized by RNA folding prediction models to the improvement of their prediction quality. This is achieved by proposing novel models, which refine previous ones by examining more types of structural elements, and larger sequential contexts for these elements. Our proposed fine-grained models are made practical thanks to the availability of large training sets, advances in machine learning, and recent accelerations to RNA folding algorithms. We show that the application of more detailed models indeed improves prediction quality, while the corresponding running time of the folding algorithm remains fast. An additional important outcome of this experiment is a new RNA folding prediction model (coupled with a freely available implementation), which results in a significantly higher prediction quality than that of previous models. This final model has about 70,000 free parameters, several orders of magnitude more than previous models. Being trained and tested over the same comprehensive data sets, our model achieves a score of 84% according to the F-measure over correctly predicted base pairs (i.e., 16% error rate), compared to the previously best reported score of 70% (i.e., 30% error rate). That is, the new model yields an error reduction of about 50%. Trained models and source code are available at www.cs.bgu.ac.il/~negevcb/contextfold.
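The reported scores rest on the standard F-measure over base pairs, which can be computed directly from the predicted and reference pair sets (the example pairs below are invented):

```python
def base_pair_f_measure(true_pairs, pred_pairs):
    """F-measure over base pairs: the harmonic mean of precision
    (fraction of predicted pairs that are correct) and recall
    (fraction of true pairs that were predicted)."""
    true_pairs, pred_pairs = set(true_pairs), set(pred_pairs)
    tp = len(true_pairs & pred_pairs)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_pairs)
    recall = tp / len(true_pairs)
    return 2 * precision * recall / (precision + recall)

# Two of three reference pairs recovered, one spurious pair predicted.
f = base_pair_f_measure({(1, 20), (2, 19), (3, 18)},
                        {(1, 20), (2, 19), (4, 17)})
```

An F-measure of 84% versus 70% thus directly reflects the roughly halved error rate quoted in the abstract.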

5.
6.
In statistical mechanics, the equilibrium properties of a physical system of particles can be calculated as the statistical average over accessible microstates of the system. In general, these calculations are computationally intractable, since they involve summations over an exponentially large number of microstates. Clustering algorithms are one of the methods used to numerically approximate these sums. The most basic clustering algorithms first sub-divide the system into a set of smaller subsets (clusters). Then, interactions between particles within each cluster are treated exactly, while all interactions between different clusters are ignored. These smaller clusters have far fewer microstates, making the summation over these microstates tractable. These algorithms have been previously used for biomolecular computations, but remain relatively unexplored in this context. Presented here is a theoretical analysis of the error and computational complexity of the two most basic clustering algorithms that were previously applied in the context of biomolecular electrostatics. We derive a tight, computationally inexpensive error bound for the equilibrium state of a particle computed via these clustering algorithms. For some practical applications it is the root mean square error, which can be significantly lower than the error bound, that may be more important. We show that there is a strong empirical relationship between the error bound and the root mean square error, suggesting that the error bound could be used as a computationally inexpensive metric for predicting the accuracy of clustering algorithms in practical applications. An example of error analysis for such an application, the computation of the average charge of ionizable amino acids in proteins, is given, demonstrating that the clustering algorithm can be accurate enough for practical purposes.
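The basic clustering approximation is easy to demonstrate on a toy system. The sketch below uses an assumed one-dimensional Ising chain (not the paper's biomolecular setting) and computes a thermal average both exactly and with inter-cluster bonds ignored:

```python
from itertools import product
import math

def mean_energy(J, n, beta, cluster_size=None):
    """Thermal average energy of an open Ising chain of n spins with
    nearest-neighbour coupling J, by brute-force enumeration.
    If cluster_size is given, the chain is cut into independent
    clusters and the bonds between clusters are ignored, which is the
    basic clustering approximation described in the abstract."""
    def exact(m, bonds):
        num = den = 0.0
        for s in product((-1, 1), repeat=m):   # enumerate microstates
            e = -J * sum(s[i] * s[j] for i, j in bonds)
            w = math.exp(-beta * e)            # Boltzmann weight
            num += e * w
            den += w
        return num / den
    if cluster_size is None:
        return exact(n, [(i, i + 1) for i in range(n - 1)])
    total = 0.0
    for start in range(0, n, cluster_size):
        m = min(cluster_size, n - start)
        total += exact(m, [(i, i + 1) for i in range(m - 1)])
    return total

exact_e = mean_energy(1.0, 6, 0.5)                   # all 5 bonds
approx_e = mean_energy(1.0, 6, 0.5, cluster_size=3)  # drops 1 bond
```

The approximation's error here is exactly the average energy of the one ignored inter-cluster bond, which mirrors the structure of the error bounds the paper derives.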

7.
8.
Ring networks are enjoying renewed interest as Storage Area Networks (SANs), i.e., networks for interconnecting storage devices (e.g., disks, disk arrays, and tape drives) and storage data clients. This paper addresses the problem of fairness in ring networks with spatial reuse operating under dynamic traffic scenarios. To this end, in the first part of the paper the Max-Min fairness definition is extended to dynamic traffic scenarios and an algorithm for computing Max-Min fair rates in a dynamic environment is introduced. In the second part of the paper the extended Max-Min fairness definition is used as a measure to compare the performance, under dynamic conditions, of three fairness algorithms proposed for ring-based SANs. These algorithms are characterized by different fairness cycle sizes (the number of links involved in each instance of the fairness algorithm), i.e., different complexity. The results show that performance increases as the fairness cycle size decreases. In particular, the Global-cycle algorithm (implemented in the Serial Storage Architecture, SSA), whose cycle size is equal to the number N of links in the ring, exhibits the lowest performance, while the One-cycle algorithm, so called because its cycle size is equal to 1, has the best performance. The Variable-cycle algorithm, whose cycle size varies between 1 and N links, performs in between and provides the best tradeoff between performance and complexity. This revised version was published online in July 2006 with corrections to the Cover Date.
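Static Max-Min fairness, the starting point that the paper extends to dynamic traffic, can be sketched with the classic progressive-filling algorithm. The link and flow names below are invented for illustration:

```python
def max_min_fair_rates(capacities, flows):
    """Max-Min fair rates by progressive filling: all unfrozen flows
    grow at the same rate; when a link saturates, the flows crossing
    it are frozen at the current level.  This sketches the static
    definition only, not the paper's dynamic extension.
    capacities: {link: capacity}; flows: {flow: set of links used}."""
    rates = {f: 0.0 for f in flows}
    cap = dict(capacities)
    active = set(flows)
    while active:
        users = {l: [f for f in active if l in flows[f]] for l in cap}
        loaded = [l for l in cap if users[l]]
        # largest equal increment before some link saturates
        inc = min(cap[l] / len(users[l]) for l in loaded)
        for f in active:
            rates[f] += inc
        for l in loaded:
            cap[l] -= inc * len(users[l])
        for l in loaded:
            if cap[l] < 1e-12:          # saturated: freeze its flows
                active -= set(users[l])
    return rates

# Two unit-capacity links; flow "a" crosses both, "b" and "c" one each.
rates = max_min_fair_rates({"l1": 1.0, "l2": 1.0},
                           {"a": {"l1", "l2"}, "b": {"l1"}, "c": {"l2"}})
```

All three flows end up with rate 0.5: no flow's rate can be raised without lowering that of a flow with an equal or smaller rate, which is the Max-Min criterion.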

9.
With the rapidly increasing availability of three-dimensional structures, one major challenge of the post-genome era is to infer the functions of biological molecules based on their structural similarity. While quantitative studies of structural similarity between the same type of biological molecules (e.g., protein vs. protein) have been carried out intensively, the comparable study of structural similarity between different types of biological molecules (e.g., protein vs. RNA) remains unexplored. Here we have developed a new bioinformatics approach to quantitatively study the structural similarity between two different types of biopolymers, proteins and RNA, based on the spatial distribution of conserved elements. We applied it to two previously proposed tRNA-protein mimicry pairs whose functional relatedness has recently been determined experimentally. Our method detected biologically meaningful signals that are consistent with the experimental evidence.

10.
In this paper we consider the setting where a group of n judges are to independently rank a series of k objects, but the intended complete rankings are not realized and we are faced with analyzing randomly incomplete ranking vectors. We propose a new testing procedure for dealing with such data realizations. We concentrate on the problem of testing for no differences among the objects being ranked (i.e., they are indistinguishable) against general alternatives, but our approach could easily be extended to restricted (e.g., ordered or umbrella) alternatives. Using an improvement of a preliminary screening approach previously proposed by the authors, we present an algorithm for computing the relevant Friedman-type statistic in the general alternatives setting, and we present the results of an extensive simulation study comparing the new procedure with the standard approach of imputing average within-judge ranks to the unranked objects.
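The baseline that the simulation study compares against, imputing average within-judge ranks to unranked objects before computing a Friedman-type statistic, can be sketched as follows. This is an assumed minimal form for illustration, not the authors' screening procedure:

```python
def friedman_with_imputation(rankings, k):
    """Friedman statistic where each judge's unranked objects receive
    the average of that judge's unused ranks (the standard imputation
    baseline).  rankings: one dict per judge mapping object index
    (0..k-1) to its assigned rank, ranks drawn from 1..k."""
    n = len(rankings)
    col_sums = [0.0] * k
    for judge in rankings:
        unused = set(range(1, k + 1)) - set(judge.values())
        fill = sum(unused) / len(unused) if unused else 0.0
        for obj in range(k):
            col_sums[obj] += judge.get(obj, fill)
    mean_sum = n * (k + 1) / 2.0   # expected column sum under H0
    return 12.0 / (n * k * (k + 1)) * sum((c - mean_sum) ** 2
                                          for c in col_sums)

# Three judges agree on the ranking 1 < 2 < 3; judge 3 omitted object 2,
# so it is imputed the single unused rank (3).
stat = friedman_with_imputation(
    [{0: 1, 1: 2, 2: 3}, {0: 1, 1: 2, 2: 3}, {0: 1, 1: 2}], 3)
```

With complete agreement the statistic reaches its maximum n(k-1), here 6, so the imputed data still signal a clear difference among objects.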

11.
Since 2010, variant strains of porcine epidemic diarrhea virus (PEDV) have caused disasters in the pork industry. The spike (S) protein, as the major immunity-eliciting antigen, has previously been used for serological testing and has been found to correlate significantly with the results of the serum neutralization (SN) test. However, further evaluation of this method is needed as new epidemic strains of PEDV emerge. Hence, the main objective of this study was to assess sow sera and determine the correlation between enzyme-linked immunosorbent assay (ELISA) results (involving a newly isolated GDS01 virus-based ELISA and ELISAs based on seven recombinant fragments comprising overlapping S1 and partial S2 sequences) and SN titers. Furthermore, we determined the reliability of the ELISAs based on receiver operating characteristic (ROC) curve analyses. For the most promising ELISA, i.e., the SP4 ELISA, the correlation coefficient (r) and the area under the curve (AUC) were determined to be 0.6113 and 0.8538, respectively. In addition, we analyzed the homology of the SP4 sequences obtained from different strains (including vaccine strains) and found that the various strains showed a high degree of homology in this region. Thus, we conclude that SP4 is a promising serological testing protein for use in the field.
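The AUC reported for the SP4 ELISA is the standard rank-statistic form of the area under the ROC curve: the probability that a randomly chosen true-positive serum scores higher than a randomly chosen negative one. A minimal sketch with invented scores:

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve computed as a rank statistic: the
    fraction of (positive, negative) pairs in which the positive
    sample scores higher, counting ties as half."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical ELISA readings for SN-positive and SN-negative sera.
auc = roc_auc([0.9, 0.7, 0.8], [0.2, 0.75, 0.1])
```

An AUC of 0.5 means the assay is no better than chance; the 0.8538 reported for SP4 indicates good, though not perfect, discrimination.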

12.
Many previous studies have attempted to assess ecological niche modeling performance using receiver operating characteristic (ROC) approaches, even though diverse problems with this metric have been pointed out in the literature. We explored different evaluation metrics based on independent testing data, using Darwin's fox (Lycalopex fulvipes) as a detailed case in point. Six ecological niche models (ENMs: generalized linear models, boosted regression trees, Maxent, GARP, multivariable kernel density estimation, and NicheA) were explored and tested using six evaluation metrics (partial ROC, Akaike information criterion, omission rate, and cumulative binomial probability), including two novel metrics that quantify model extrapolation versus interpolation (E-space index I) and extent of extrapolation versus Jaccard similarity (E-space index II). Different ENMs showed diverse and mixed performance, depending on the evaluation metric used. Because ENMs performed differently according to the evaluation metric employed, model selection should be based on the data available, the assumptions necessary, and the particular research question. The typical ROC AUC evaluation approach should be discontinued when only presence data are available, and evaluations in environmental dimensions should be adopted as part of the toolkit of ENM researchers. Our results suggest that selecting the Maxent ENM based solely on previous reports of its performance is a questionable practice. Instead, model comparisons, including diverse algorithms and parameterizations, should be the sine qua non for every study using ecological niche modeling. ENM evaluations should be developed using metrics that assess desired model characteristics instead of a single measure of fit between model and data. The metrics proposed herein that assess model performance in environmental space (i.e., E-space indices I and II) may complement current methods for ENM evaluation.
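Of the listed metrics, the omission rate is the simplest to state: the fraction of independent test presences that a thresholded model fails to predict as suitable. A minimal sketch (the threshold and suitability values are invented):

```python
def omission_rate(test_presence_suitability, threshold):
    """Fraction of independent test presence records that the
    thresholded model misses, i.e., whose predicted suitability
    falls below the chosen suitability threshold."""
    missed = sum(1 for s in test_presence_suitability if s < threshold)
    return missed / len(test_presence_suitability)

# Five independent presence records scored by a fitted ENM.
rate = omission_rate([0.9, 0.8, 0.4, 0.95, 0.1], threshold=0.5)
```

A low omission rate on truly independent presences is a necessary (though not sufficient) property of a useful presence-only model, which is why it complements rather than replaces the environmental-space indices.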

13.
Kinematic interpolation is an important tool in biomechanics. The purpose of this work is to describe a method for interpolating three-dimensional kinematic data that minimizes error while maintaining ease of calculation. The method uses cubic quaternion and Hermite interpolation to fill gaps between kinematic data points. Data sets with a small number of samples were extracted from a larger data set and used to validate the technique. Two additional types of interpolation were applied and then compared to the cubic quaternion interpolation. Displacement errors below 2% were achieved with the cubic quaternion method using only 4% of the total samples, representing a decrease in error over the other algorithms.
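The quaternion machinery underlying such interpolation builds on spherical linear interpolation (slerp); cubic schemes blend slerps to achieve smoothness across data points. A minimal slerp sketch (illustrative, not the paper's exact algorithm):

```python
import math

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions q0, q1
    given as (w, x, y, z) tuples, for t in [0, 1].  Slerp is the
    building block that cubic (squad-style) schemes chain together."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                      # take the shorter arc
        q1, dot = tuple(-c for c in q1), -dot
    if dot > 0.9995:                   # nearly parallel: lerp + renormalize
        q = tuple(a + t * (b - a) for a, b in zip(q0, q1))
        n = math.sqrt(sum(c * c for c in q))
        return tuple(c / n for c in q)
    theta = math.acos(dot)             # angle between the quaternions
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))

# Halfway between the identity and a 90-degree rotation about z
# is a 45-degree rotation about z.
q = slerp((1, 0, 0, 0),
          (math.cos(math.pi / 4), 0, 0, math.sin(math.pi / 4)), 0.5)
```

Unlike componentwise interpolation of Euler angles, slerp stays on the unit-quaternion sphere and traverses the rotation at constant angular velocity, which is why quaternion-based gap filling keeps displacement errors low.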

14.
A group test gives a positive (negative) outcome if it contains at least u (at most l) positive items, and an arbitrary outcome if the number of positive items is between the thresholds l and u. This problem, introduced by Damaschke, is called threshold group testing; it is a generalization of classical group testing. Chen and Fu extended the problem to an error-tolerant version and first proposed efficient nonadaptive algorithms. In this article, we extend threshold group testing to the k-inhibitors model, in which a test has a positive outcome if it contains at least u positives and at most k-1 inhibitors. By using a (d + k - l, u; 2e + 1]-disjunct matrix, we provide nonadaptive algorithms for the threshold group testing model with k-inhibitors and at most e erroneous outcomes. The decoding complexity is O(n(u+k) log n) for fixed parameters (d, u, l, k, e).

15.
Ambulatory measurement of 3D knee joint angle
Three-dimensional measurement of joint motion is a promising tool for clinical evaluation and for comparing therapeutic treatments. Although many devices exist for assessing joint kinematics, there is a need for a system that could be used in routine practice. Such a system should be accurate, ambulatory, and easy to use. The combination of gyroscopes and accelerometers (i.e., an inertial measurement unit) has proven suitable for unrestrained measurement of orientation during a short period of time (i.e., a few minutes). However, due to their inability to detect a horizontal reference, inertial-based systems generally fail to measure differential orientation, a prerequisite for computing the three-dimensional knee joint angle recommended by the International Society of Biomechanics (ISB). A simple method based on a leg movement is proposed here to align two inertial measurement units fixed on the thigh and shank segments. Based on the combination of this alignment and a fusion algorithm, the three-dimensional knee joint angle is measured and compared with a magnetic motion capture system during walking. The proposed system is suitable for measuring the absolute knee flexion/extension and abduction/adduction angles, with mean (SD) offset errors of -1 degree (1 degree) and 0 degrees (0.6 degrees) and mean (SD) root mean square (RMS) errors of 1.5 degrees (0.4 degrees) and 1.7 degrees (0.5 degrees). The system is also suitable for the relative measurement of knee internal/external rotation (mean (SD) offset error of 3.4 degrees (2.7 degrees)) with a mean (SD) RMS error of 1.6 degrees (0.5 degrees). The method described in this paper can easily be adapted to measure other joint angular displacements, such as those of the elbow or ankle.

16.
Tsuang DW  Millard SP  Ely B  Chi P  Wang K  Raskind WH  Kim S  Brkanac Z  Yu CE 《PloS one》2010,5(12):e14456

Background

The detection of copy number variants (CNVs) and the results of CNV-disease association studies rely on how CNVs are defined, and because array-based technologies can only infer CNVs, CNV-calling algorithms can produce vastly different findings. Several authors have noted the large-scale variability between CNV-detection methods, as well as the substantial false positive and false negative rates associated with those methods. In this study, we use variations of four common algorithms for CNV detection (PennCNV, QuantiSNP, HMMSeg, and cnvPartition) and two definitions of overlap (any overlap and an overlap of at least 40% of the smaller CNV) to illustrate the effects of varying algorithms and definitions of overlap on CNV discovery.

Methodology and Principal Findings

We used a 56 K Illumina genotyping array enriched for CNV regions to generate hybridization intensities and allele frequencies for 48 Caucasian schizophrenia cases and 48 age-, ethnicity-, and gender-matched control subjects. No algorithm found a difference in CNV burden between the two groups. However, the total number of CNVs called ranged from 102 to 3,765 across algorithms. The mean CNV size ranged from 46 kb to 787 kb, and the average number of CNVs per subject ranged from 1 to 39. The number of novel CNVs not previously reported in normal subjects ranged from 0 to 212.

Conclusions and Significance

Motivated by the availability of multiple publicly available genome-wide SNP arrays, investigators are conducting numerous analyses to identify putative additional CNVs in complex genetic disorders. However, the number of CNVs identified in array-based studies, and whether these CNVs are novel or valid, will depend on the algorithm(s) used. Thus, given the variety of methods used, there will be many false positives and false negatives. Both guidelines for the identification of CNVs inferred from high-density arrays and the establishment of a gold standard for validation of CNVs are needed.
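The two overlap definitions from the Methodology ("any overlap" versus an overlap covering at least 40% of the smaller CNV) reduce to a simple interval test (the coordinates below are invented):

```python
def cnvs_overlap(a, b, min_frac=0.0):
    """Do CNVs a = (start, end) and b = (start, end) overlap?
    min_frac=0.0 gives the 'any overlap' rule; min_frac=0.4 requires
    the shared interval to cover at least 40% of the smaller CNV,
    the stricter definition used in the study."""
    shared = min(a[1], b[1]) - max(a[0], b[0])
    if shared <= 0:
        return False
    smaller = min(a[1] - a[0], b[1] - b[0])
    return shared >= min_frac * smaller

# 10 units of overlap between a 100-unit and a 210-unit CNV call:
hit_any = cnvs_overlap((100, 200), (190, 400))               # any overlap
hit_40 = cnvs_overlap((100, 200), (190, 400), min_frac=0.4)  # 40% rule
```

The same pair of calls matches under one definition and not the other, which is exactly how the choice of overlap rule changes concordance counts between algorithms.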

17.
Early detection of fracture risk is important for initiating treatment and improving outcomes in both physiologic and pathologic causes of bone loss. While bone mineral density (a quantity measure) has traditionally been used for this purpose, alternative structural imaging parameters (quality measures) have been proposed to better predict bone's true mechanical properties. To further elucidate this, trabecular bone specimens from cadaveric human calcanei were used to evaluate the interrelationship of mechanical and structural parameters using mechanical testing, dual-energy X-ray absorptiometry (DXA) scanning, and micro-computed tomography (microCT) imaging. Direction-specific structural properties were assessed in three dimensions (3-D) and correlated to mechanical testing and DXA. The results demonstrated that microCT-derived indices of bone quality (i.e., volume fraction and structural model index) are better than DXA-derived bone mineral density for predicting the mechanical parameters of bone (i.e., elastic modulus, yield stress, and ultimate stress). Diagnostically, this implies that future work on the early prediction of fracture risk should focus as much on bone quality as on quantity. Furthermore, the results of this study show that a loss of bone primarily affects the connectedness and overall number of trabeculae. Ultimate stress, however, is better correlated with trabecular number than with thickness. As such, primary prevention of osteoporosis may be more important than later countermeasures for bone loss.

18.
G C Wei  M A Tanner 《Biometrics》1991,47(4):1297-1309
The first part of the article reviews the Data Augmentation algorithm and presents two approximations to the Data Augmentation algorithm for the analysis of missing-data problems: the Poor Man's Data Augmentation algorithm and the Asymptotic Data Augmentation algorithm. These two algorithms are then implemented in the context of censored regression data to obtain semiparametric methodology. The performances of the censored regression algorithms are examined in a simulation study. It is found, up to the precision of the study, that the bias of both the Poor Man's and Asymptotic Data Augmentation estimators, as well as the Buckley-James estimator, does not appear to differ from zero. However, with regard to mean squared error, over a wide range of settings examined in this simulation study, the two Data Augmentation estimators have a smaller mean squared error than does the Buckley-James estimator. In addition, associated with the two Data Augmentation estimators is a natural device for estimating the standard error of the estimated regression parameters. It is shown how this device can be used to estimate the standard error of either Data Augmentation estimate of any parameter (e.g., the correlation coefficient) associated with the model. In the simulation study, the estimated standard error of the Asymptotic Data Augmentation estimate of the regression parameter is found to be congruent with the Monte Carlo standard deviation of the corresponding parameter estimate. The algorithms are illustrated using the updated Stanford heart transplant data set.

19.
A set of equations for determining chlorophyll a (Chl a) and the accessory chlorophylls b, c2, and c1 + c2, as well as for the special case of Acaryochloris marina, which uses Chl d as its primary photosynthetic pigment and also has Chl a, has been developed for 90% acetone, methanol, and ethanol solvents. These equations for different solvents give chlorophyll assays that are consistent with each other. No algorithms for Chl c compounds (c2, c1 + c2) in the presence of Chl a have previously been published for methanol or ethanol. The limit of detection (and inherent error, ±95% confidence limit) for chlorophylls in all organisms tested was generally less than 0.1 μg/ml. The Chl a and b algorithms for green algae and land plants have very small inherent errors (< 0.01 μg/ml). The Chl a and d algorithms for Acaryochloris marina are consistent with each other, giving estimates of Chl d/a ratios that are consistent with previously published estimates obtained using HPLC and a rarely used algorithm originally published for diethyl ether in 1955. The statistical error structure of chlorophyll algorithms is discussed. The relative error of chlorophyll measurements increases hyperbolically in diluted chlorophyll extracts, because the inherent errors of the chlorophyll algorithms are constants independent of the magnitude of the absorbance readings. For safety reasons, for efficient extraction of chlorophylls, and for the convenience of being able to use polystyrene cuvettes, the algorithms for ethanol are recommended for routine assays of chlorophylls. The methanol algorithms would be convenient for assays associated with HPLC work.
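Simultaneous-equation chlorophyll algorithms of this kind work by inverting a small matrix of extinction coefficients relating absorbance readings to pigment concentrations. The sketch below shows only the mechanics; the 2x2 coefficient matrix is purely hypothetical and is not any of the published equation sets:

```python
def chlorophyll_concentrations(absorbances, extinction):
    """Recover two pigment concentrations (ug/ml) from absorbance
    readings at two wavelengths by solving A = E @ c for c, using
    Cramer's rule on the 2x2 extinction-coefficient matrix E."""
    (e11, e12), (e21, e22) = extinction
    a1, a2 = absorbances
    det = e11 * e22 - e12 * e21
    c1 = (a1 * e22 - a2 * e12) / det
    c2 = (e11 * a2 - e21 * a1) / det
    return c1, c2

# Hypothetical coefficients for two pigments at two red-band
# wavelengths (NOT published values), and absorbances generated from
# known concentrations of (10, 4) ug/ml for a round-trip check.
E = [[0.080, 0.010],
     [0.015, 0.050]]
A = [0.080 * 10 + 0.010 * 4, 0.015 * 10 + 0.050 * 4]
ca, cb = chlorophyll_concentrations(A, E)
```

Because the inversion is linear, the algorithm's inherent error is a constant set by the coefficients rather than by the absorbance magnitude, which is exactly why the relative error grows hyperbolically in diluted extracts.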

20.
Population sizing from still aerial pictures is of wide applicability in ecological and social sciences. The problem is long standing because current automatic detection and counting algorithms are known to fail in most cases, and exhaustive manual counting is tedious, slow, difficult to verify, and infeasible for large populations. An alternative is to multiply population density by some reference area but, unfortunately, sampling details, the handling of edge effects, etc., are seldom described. For the first time we address the problem using principles of geometric sampling. These principles are old and solid, but largely unknown outside the areas of three-dimensional microscopy and stereology. Here we adapt them to estimate the size of any population of individuals lying on an essentially planar area, e.g., people, animals, trees on a savanna, etc. The proposed design is unbiased irrespective of population size, pattern, perspective artifacts, etc. The implementation is very simple: it is based on the random superimposition of coarse quadrat grids. In addition, an objective error assessment is often lacking; for this purpose the quadrat counts are often assumed to be independent. We demonstrate that this approach can perform very poorly, and we propose (and check via Monte Carlo resampling) a new theoretical error prediction formula. As for efficiency, counting about 50 (100) individuals in 20 quadrats can yield relative standard errors of about 8% (5%) in typical cases. This effectively breaks the barrier hitherto imposed by the current lack of automatic face detection algorithms, because semiautomatic sampling with manual counting becomes an attractive option.
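The estimator itself is simple: superimpose a quadrat grid at a uniformly random offset, count the individuals falling inside quadrats, and divide by the fraction of the plane the quadrats cover. A minimal sketch (the grid geometry and point data are invented):

```python
import random

def grid_population_estimate(points, q, period, seed=None):
    """Unbiased population estimate from a randomly superimposed grid
    of square quadrats of side q repeating with the given period.
    Each quadrat covers a fraction (q/period)**2 of the plane, so the
    count inside quadrats divided by that fraction estimates N."""
    rng = random.Random(seed)
    ox, oy = rng.uniform(0, period), rng.uniform(0, period)  # random offset
    inside = sum(1 for x, y in points
                 if (x - ox) % period < q and (y - oy) % period < q)
    return inside / (q / period) ** 2

# 1000 simulated "individuals" on a 100 x 100 picture; quadrats of
# side 5 on a period-10 grid cover 1/4 of the plane.
rng = random.Random(42)
pts = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(1000)]
est = grid_population_estimate(pts, q=5, period=10, seed=1)
```

The uniformly random offset is what makes the estimator design-unbiased for any fixed point pattern: every individual is captured with probability exactly (q/period)**2, regardless of clustering or perspective artifacts.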
