Similar Documents
Found 20 similar documents (search time: 15 ms)
1.
Many research groups are estimating trees containing anywhere from a few thousand to hundreds of thousands of species, toward the eventual goal of estimating a Tree of Life containing perhaps as many as several million leaves. These phylogenetic estimations present enormous computational challenges, and current computational methods are likely to fail to run even on data sets at the low end of this range. One approach to estimating a large species tree is to use phylogenetic estimation methods (such as maximum likelihood) on a supermatrix produced by concatenating multiple sequence alignments for a collection of markers; however, the most accurate of these phylogenetic estimation methods are extremely computationally intensive for data sets with more than a few thousand sequences. Supertree methods, which assemble phylogenetic trees from a collection of trees on subsets of the taxa, are important tools for phylogeny estimation where phylogenetic analyses based upon maximum likelihood (ML) are infeasible. In this paper, we introduce SuperFine, a meta-method that utilizes a novel two-step procedure to improve the accuracy and scalability of supertree methods. Our study, using both simulated and empirical data, shows that SuperFine-boosted supertree methods produce more accurate trees than standard supertree methods and run quickly on very large data sets with thousands of sequences. Furthermore, SuperFine-boosted matrix representation with parsimony (MRP, the best-known supertree method) approaches the accuracy of ML methods on supermatrix data sets under realistic conditions.

2.
A model for accurate drift estimation in streams
1. This paper explores the experimental difficulties involved with the use of drift nets in small streams, and outlines a method whereby the estimation of drift density (number of specimens m−3 of water) can be improved.
2. Changes in the filtering efficiency of the net caused by trapping of organic debris ('clogging') have the effect of reducing net entrance velocities, causing errors in the calculation of sampled water volume, and thus of drift density. A model of the reduction in net entrance velocity, based on empirical measurements of trapped debris, is developed.
3. Cross-sectional velocity calculations suggest that errors can also be introduced into drift density calculations by positioning sampling nets only on the bed. A method to allow for this effect is demonstrated.
4. As adjustments to the calculation of sampled volume are required when sampling in rivers that undergo marked changes in discharge during the sampling period, a method whereby these effects can be accommodated to improve drift density estimations is also outlined.
5. The results of this study imply that theoretical links between flow hydraulics and short-term drift behaviour are poorly understood.
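The drift-density calculation outlined above can be sketched in a few lines. This is a minimal illustration, not the paper's model: the function name, the net dimensions, and the single filtering-efficiency factor standing in for the empirically derived clogging correction are all assumptions.

```python
# Hedged sketch: drift density from a net sample, with a simple
# efficiency correction for clogging. The paper derives its correction
# from measured trapped debris; the factor here is illustrative.

def drift_density(n_specimens, entrance_velocity_ms, net_area_m2,
                  duration_s, filtering_efficiency=1.0):
    """Specimens per cubic metre of water filtered.

    filtering_efficiency < 1 models a clogged net whose effective
    entrance velocity is reduced (e.g. 0.8 = a 20% reduction).
    """
    effective_velocity = entrance_velocity_ms * filtering_efficiency
    volume_m3 = effective_velocity * net_area_m2 * duration_s
    return n_specimens / volume_m3

# Ignoring clogging overstates the filtered volume and therefore
# understates drift density:
raw = drift_density(120, 0.30, 0.05, 3600)             # assumes no clogging
corrected = drift_density(120, 0.30, 0.05, 3600, 0.8)  # 20% velocity loss
```

With the toy numbers above, the clogging-corrected density is higher than the uncorrected one, which is the direction of error the paper describes.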

3.
A simple method for accurate estimation of apoptotic cells
A simple, sensitive, and reliable "DNA diffusion" assay for the quantification of apoptosis is described. Human lymphocytes and human lymphoblastoid cells, MOLT-4, were exposed to 0, 12.5, 25, 50, or 100 rad of X-rays. After 24 h of incubation, cells were mixed with agarose, microgels were made, and cells were lysed in high salt and detergents. DNA was precipitated in microgels by ethanol. Staining of DNA was done with an intense fluorescent dye, YOYO-1. Apoptotic cells show a halo of granular DNA with a hazy outer boundary. Necrotic cells, resulting from hyperthermia treatment, on the other hand, show an unusually large homogeneous nucleus with a clearly defined boundary. The number of cells with apoptotic and necrotic appearance can be scored and quantified by using a fluorescent microscope. Results were compared with other methods of apoptosis measurement: morphological estimations of apoptosis and DNA ladder pattern formation in regular agarose gel electrophoresis. Validation of the technique was done using some known inducers of apoptosis and necrosis (hyperthermia, hydrogen peroxide, mitoxantrone, novobiocin, and sodium ascorbate).

4.
Rapid and accurate estimation of release conditions in the javelin throw
We have developed a system to measure initial conditions in the javelin throw rapidly enough to be used by the thrower for feedback in performance improvement. The system consists of three subsystems whose main tasks are: (A) acquisition of automatically digitized high-speed (200 Hz) video x, y position data for the first 0.1-0.2 s of the javelin flight after release; (B) estimation of five javelin release conditions from the x, y position data; and (C) graphical presentation to the thrower of these release conditions and a simulation of the subsequent flight, together with optimal conditions and flight for the same release velocity. The estimation scheme relies on a simulation model and is at least an order of magnitude more accurate than previously reported measurements of javelin release conditions. The system provides, for the first time in any throwing event, the ability to critique nearly instantly, in a precise and quantitative manner, the crucial factors in the throw that determine the range. This should be expected to lead to much greater control and consistency of throwing variables by athletes who use the system, and could even lead to the evolution of new throwing techniques.
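A drag-free sketch of subsystem (B): with gravity removed, both coordinates of the early flight are straight lines in time, so the release velocity components fall out of two least-squares line fits. The paper's actual scheme relies on a fuller simulation model; the function names and the synthetic 200 Hz data below are illustrative assumptions.

```python
import math

G = 9.81  # m/s^2

def fit_line(ts, vals):
    """Ordinary least-squares slope and intercept of vals against ts."""
    n = len(ts)
    mt = sum(ts) / n
    mv = sum(vals) / n
    slope = (sum((t - mt) * (v - mv) for t, v in zip(ts, vals))
             / sum((t - mt) ** 2 for t in ts))
    return slope, mv - slope * mt

def release_conditions(ts, xs, ys):
    """Estimate release speed and angle from early-flight positions.

    Gravity is added back to y before fitting, so both coordinates
    reduce to straight lines in t (drag neglected over 0.1-0.2 s).
    """
    vx, _ = fit_line(ts, xs)
    vy, _ = fit_line(ts, [y + 0.5 * G * t * t for t, y in zip(ts, ys)])
    return math.hypot(vx, vy), math.degrees(math.atan2(vy, vx))

# Synthetic 200 Hz data for a 25 m/s throw at 35 degrees:
ts = [i / 200 for i in range(30)]
vx0 = 25 * math.cos(math.radians(35))
vy0 = 25 * math.sin(math.radians(35))
xs = [vx0 * t for t in ts]
ys = [vy0 * t - 0.5 * G * t * t for t in ts]
speed, angle = release_conditions(ts, xs, ys)
```

On noise-free synthetic data the fit recovers the release speed and angle exactly; the value of the paper's simulation-based scheme is robustness on real, noisy digitized video.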

5.
6.
Affymetrix SNP arrays have been widely used for single-nucleotide polymorphism (SNP) genotype calling and DNA copy number variation inference. Although numerous methods have achieved high accuracy in these fields, most studies have paid little attention to the modeling of hybridization of probes to off-target allele sequences, which can affect the accuracy greatly. In this study, we address this issue and demonstrate that hybridization with mismatch nucleotides (HWMMN) occurs in all SNP probe-sets and has a critical effect on the estimation of allelic concentrations (ACs). We study sequence binding through binding free energy and then binding affinity, and develop a probe intensity composite representation (PICR) model. The PICR model allows the estimation of ACs at a given SNP through statistical regression. Furthermore, we demonstrate with cell-line data of known true copy numbers that the PICR model can achieve reasonable accuracy in copy number estimation at a single SNP locus, by using the ratio of the estimated AC of each sample to that of the reference sample, and can reveal subtle genotype structure of SNPs at abnormal loci. We also demonstrate with HapMap data that the PICR model yields accurate SNP genotype calls consistently across samples, laboratories and even across array platforms.
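The regression step can be illustrated in miniature: given per-probe binding affinities for the two allele sequences, the two allelic concentrations are the least-squares solution of a linear model for probe intensity. The affinities and numbers below are invented for illustration; PICR derives real affinities from binding free energies.

```python
def estimate_acs(intensities, aff_a, aff_b):
    """Least-squares estimate of the two allelic concentrations (cA, cB)
    from the linear model  intensity_i ~ cA*aff_a[i] + cB*aff_b[i],
    solved via the 2x2 normal equations."""
    saa = sum(a * a for a in aff_a)
    sbb = sum(b * b for b in aff_b)
    sab = sum(a * b for a, b in zip(aff_a, aff_b))
    sai = sum(a * i for a, i in zip(aff_a, intensities))
    sbi = sum(b * i for b, i in zip(aff_b, intensities))
    det = saa * sbb - sab * sab
    return (sbb * sai - sab * sbi) / det, (saa * sbi - sab * sai) / det

# Hypothetical affinities for four probes; note the off-target
# (mismatch) affinities are nonzero, which is the HWMMN effect.
aff_a = [1.0, 0.2, 0.8, 0.1]
aff_b = [0.1, 0.9, 0.3, 1.0]
# A sample whose true allelic concentrations are cA = 5, cB = 2:
intensities = [5 * a + 2 * b for a, b in zip(aff_a, aff_b)]
ca, cb = estimate_acs(intensities, aff_a, aff_b)
```

Copy number inference then uses the ratio of each sample's estimated AC to that of the reference sample, as described in the abstract.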

7.
Phi-values provide an important benchmark for the comparison of experimental protein folding studies to computer simulations and theories of the folding process. Despite the growing importance of phi measurements, however, formulas to quantify the precision with which phi is measured have received little discussion. Moreover, a commonly employed method for the determination of standard errors on phi estimates assumes that estimates of the changes in free energy of the transition and folded states are independent. Here we demonstrate that this assumption is usually incorrect and that this typically leads to the underestimation of phi precision. We derive an analytical expression for the precision of phi estimates (assuming linear chevron behavior) that explicitly takes this dependence into account. We also describe an alternative method that implicitly corrects for the effect. By simulating experimental chevron data, we show that both methods accurately estimate phi confidence intervals. We also explore the effects of the commonly employed techniques of calculating phi from kinetics estimated at non-zero denaturant concentrations and via the assumption of parallel chevron arms. We find that these approaches can produce significantly different estimates for phi (again, even for truly linear chevron behavior), indicating that they are not equivalent, interchangeable measures of transition state structure. Lastly, we describe a Web-based implementation of the above algorithms for general use by the protein folding community.
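A sketch of the point being made: first-order error propagation for phi as a ratio of two free-energy changes, with the covariance term retained. The specific numbers are hypothetical, and the paper's own analytical expression (for linear chevrons) is not reproduced here.

```python
import math

def phi_stderr(ddg_ts, ddg_eq, var_ts, var_eq, cov):
    """First-order error propagation for phi = ddg_ts / ddg_eq,
    keeping the covariance term that is dropped when the two
    free-energy changes are (wrongly) assumed independent."""
    phi = ddg_ts / ddg_eq
    rel_var = (var_ts / ddg_ts ** 2 + var_eq / ddg_eq ** 2
               - 2.0 * cov / (ddg_ts * ddg_eq))
    return phi, abs(phi) * math.sqrt(rel_var)

# Hypothetical estimates (kcal/mol and kcal^2/mol^2). A positive
# covariance shrinks the error bar relative to assuming independence:
phi, se_dep = phi_stderr(0.9, 1.8, 0.01, 0.02, 0.008)
_, se_indep = phi_stderr(0.9, 1.8, 0.01, 0.02, 0.0)
```

With these numbers the independence assumption (cov = 0) inflates the standard error, i.e. here it would overstate the uncertainty; the paper shows the dependence typically leads to misestimated phi precision.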

8.
Zhang SD. PLoS ONE. 2011;6(4):e18874.
BACKGROUND: Biomedical researchers are now often faced with situations where it is necessary to test a large number of hypotheses simultaneously, e.g., in comparative gene expression studies using high-throughput microarray technology. To properly control false positive errors, the FDR (false discovery rate) approach has become widely used in multiple testing. Accurate estimation of the FDR requires that the proportion of true null hypotheses be accurately estimated. To date, many methods for estimating this quantity have been proposed. Typically, when a new method is introduced, some simulations are carried out to show its improved accuracy, but these simulations are often limited to covering only a few points in the parameter space. RESULTS: Here I have carried out extensive in silico experiments to compare some commonly used methods for estimating the proportion of true null hypotheses. The coverage of these simulations over the parameter space is unprecedentedly thorough compared to typical simulation studies in the literature, which enables globally valid conclusions about the performance of these different methods. It was found that a very simple method gives the most accurate estimation over a dominantly large area of the parameter space. Given its simplicity and its overall superior accuracy, I recommend its use as the first choice for estimating the proportion of true null hypotheses in multiple testing.
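The abstract does not name the "very simple method" it recommends. One widely used simple estimator of the proportion of true nulls is Storey's lambda-based estimator, shown here purely as an illustration of the quantity being estimated, not as the paper's chosen method.

```python
import random

def pi0_storey(pvalues, lam=0.5):
    """Storey-style estimator of the proportion of true null
    hypotheses: under the null, p-values are uniform on [0, 1], so
    the density above lam is dominated by true nulls."""
    m = len(pvalues)
    return min(1.0, sum(p > lam for p in pvalues) / ((1.0 - lam) * m))

# 800 uniform "null" p-values mixed with 200 tiny "signal" p-values;
# the true proportion of nulls is 0.8:
random.seed(0)
pvals = [random.random() for _ in range(800)] + [1e-6] * 200
pi0 = pi0_storey(pvals)
```

An accurate pi0 estimate then feeds directly into FDR estimation, which is the dependence the BACKGROUND paragraph describes.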

9.
10.
An experimental-numerical study was performed to investigate the relationships between computed tomography (CT)-density and ash density, and between ash density and apparent density for bone tissue, to evaluate their influence on the accuracy of subject-specific FE models of human bones. Sixty cylindrical bone specimens were examined. CT-densities were computed from CT images while apparent and ash densities were measured experimentally. The CT/ash-density and ash/apparent-density relationships were calculated. Finite element models of eight human femurs were generated considering these relationships to assess their effect on strain prediction accuracy. CT and ash density were linearly correlated (R(2)=0.997) over the whole density range but not equivalent (intercept < 0, slope > 1). A constant ash/apparent-density ratio (0.598+/-0.004) was found for cortical bone. A lower ratio, with a larger dispersion, was found for trabecular bone (0.459+/-0.100), but it became less dispersed, and equal to that of cortical tissue, when testing smaller trabecular specimens (0.598+/-0.036). This suggests that an experimental error occurred in apparent-density measurements for large trabecular specimens and a constant ratio can be assumed valid for the whole density range. Introducing the obtained relationships in the FE modelling procedure improved strain prediction accuracy (R(2)=0.95, RMSE=7%). The results suggest that: (i) a correction of the densitometric calibration should be used when evaluating bone ash-density from clinical CT scans, to avoid ash-density underestimation and overestimation for low- and high-density bone tissue, respectively; (ii) the ash/apparent-density ratio can be assumed constant in human femurs and (iii) the correction improves significantly the model accuracy and should be considered in subject-specific bone modelling.
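The two relationships can be wired together in a few lines. The calibration coefficients below are invented placeholders (the abstract reports only that the intercept is negative and the slope above 1, with R(2)=0.997); the ash/apparent ratio 0.598 is the cortical value it reports.

```python
def ash_from_ct(rho_ct, intercept=-0.09, slope=1.14):
    """Linear CT-to-ash density calibration (g/cm^3). The paper reports
    a linear relationship with intercept < 0 and slope > 1; these
    particular coefficients are illustrative placeholders."""
    return intercept + slope * rho_ct

def apparent_from_ash(rho_ash, ratio=0.598):
    """Apparent density from ash density via the constant
    ash/apparent-density ratio reported for human femurs."""
    return rho_ash / ratio

# Map an illustrative CT density through both relationships, as a
# subject-specific FE modelling pipeline would:
rho_ash = ash_from_ct(1.0)
rho_app = apparent_from_ash(rho_ash)
```

In an FE pipeline, rho_app would then feed a density-modulus law to assign element material properties; that last step is outside this abstract.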

11.
This essay draws attention to the Prison Reentry Industry's potential to create unprecedented employment opportunities for formerly incarcerated individuals. Situated at the intersection of money, programming, and state-sponsored surveillance, the Prison Reentry Industry (PRI) is notable for its implication in prolonging and deepening people's entrenchment in the criminal justice/reentry matrix; however, the burgeoning of a "reentry industry" also ensures the growth of employment positions tailored perfectly to the experiences of formerly incarcerated individuals. The PRI's production of significant employment opportunities for certain members of the formerly incarcerated population turns social science research on incarceration and employment on its head. It is by now almost conventional wisdom that a criminal history stands as a barrier to employment, but the PRI's potential to create a substantial job market for formerly incarcerated people may engender an extraordinary outcome in which a criminal record represents, for some, the factor leading to entrée into the ranks of the employed.

12.
Microarray experiments generate data sets with information on the expression levels of thousands of genes in a set of biological samples. Unfortunately, such experiments often produce multiple missing expression values, normally due to various experimental problems. As many algorithms for gene expression analysis require a complete data matrix as input, the missing values have to be estimated in order to analyze the available data. Alternatively, genes and arrays can be removed until no missing values remain. However, for genes or arrays with only a small number of missing values, it is desirable to impute those values. For the subsequent analysis to be as informative as possible, it is essential that the estimates for the missing gene expression values are accurate. A small number of badly estimated missing values in the data might be enough for clustering methods, such as hierarchical clustering or K-means clustering, to produce misleading results. Thus, accurate methods for missing value estimation are needed. We present novel methods for estimation of missing values in microarray data sets that are based on the least squares principle and that utilize correlations between both genes and arrays. For this set of methods, we use the common reference name LSimpute. We compare the estimation accuracy of our methods with the widely used KNNimpute on three complete data matrices from public data sets by randomly knocking out data (labeling it as missing). From these tests, we conclude that our LSimpute methods produce estimates that are consistently more accurate than those obtained using KNNimpute. Additionally, we examine a more classic approach to missing value estimation based on expectation maximization (EM). We refer to our EM implementations as EMimpute, and the estimation errors of the EMimpute methods are compared with those produced by our novel methods. The results indicate that, on average, the estimates from our best-performing LSimpute method are at least as accurate as those from the best EMimpute algorithm.
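A minimal sketch of the baseline being compared against (KNNimpute-style nearest-neighbour imputation); LSimpute's least-squares estimators themselves are not reproduced here, and the inverse-distance weighting and toy matrix are illustrative assumptions.

```python
import math

def knn_impute(matrix, k=2):
    """Minimal KNNimpute-style sketch: a missing entry (None) is
    replaced by a distance-weighted average of that column over the k
    rows most similar to the incomplete row (Euclidean distance on the
    columns both rows observe)."""
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, v in enumerate(row):
            if v is not None:
                continue
            candidates = []
            for r, other in enumerate(matrix):
                if r == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    dist = math.sqrt(sum((a - b) ** 2 for a, b in shared))
                    candidates.append((dist, other[j]))
            candidates.sort(key=lambda t: t[0])
            nearest = candidates[:k]
            if not nearest:
                continue  # no informative neighbour: leave the gap
            weights = [1.0 / (d + 1e-9) for d, _ in nearest]
            filled[i][j] = (sum(w * val for w, (_, val) in zip(weights, nearest))
                            / sum(weights))
    return filled

# Row 1 is nearly identical to row 0, so its missing value should land
# near row 0's value in that column (2.0):
data = [
    [1.0, 2.0, 3.0],
    [1.1, None, 3.1],
    [9.0, 9.5, 10.0],
    [9.1, 9.4, 10.2],
]
imputed = knn_impute(data, k=2)
```

LSimpute improves on this baseline by exploiting correlations among both genes (rows) and arrays (columns) through least-squares regression rather than plain neighbour averaging.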

13.
New applications of DNA and RNA sequencing are expanding the field of biodiversity discovery and ecological monitoring, yet questions remain regarding precision and efficiency. Due to primer bias, the ability of metabarcoding to accurately depict biomass of different taxa from bulk communities remains unclear, while PCR-free whole mitochondrial genome (mitogenome) sequencing may provide a more reliable alternative. Here, we used a set of documented mock communities comprising 13 species of freshwater macroinvertebrates of estimated individual biomass, to compare the detection efficiency of COI metabarcoding (three different amplicons) and shotgun mitogenome sequencing. Additionally, we used individual COI barcoding and de novo mitochondrial genome sequencing, to provide reference sequences for OTU assignment and metagenome mapping (mitogenome skimming), respectively. We found that, even though both methods occasionally failed to recover very low abundance species, metabarcoding was less consistent, by failing to recover some species with higher abundances, probably due to primer bias. Shotgun sequencing results provided highly significant correlations between read number and biomass in all but one species. Conversely, the read-biomass relationships obtained from metabarcoding varied across amplicons. Specifically, we found significant relationships for eight of 13 (amplicons B1FR-450 bp, FF130R-130 bp) or four of 13 (amplicon FFFR, 658 bp) species. Combining the results of all three COI amplicons (multiamplicon approach) improved the read-biomass correlations for some of the species. Overall, mitogenomic sequencing yielded more informative predictions of biomass content from bulk macroinvertebrate communities than metabarcoding. However, for large-scale ecological studies, metabarcoding currently remains the most commonly used approach for diversity assessment.
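The read-biomass relationships reported above come down to correlating per-species read counts with estimated biomass. A self-contained Pearson correlation, with invented toy numbers, sketches that calculation; the study's actual counts and significance tests are not reproduced.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-species biomass (mg) and mitogenome read counts for
# five species of a mock community:
biomass = [2.0, 5.5, 1.2, 8.0, 3.3]
reads = [210, 560, 140, 820, 335]
r = pearson(biomass, reads)
```

A near-linear read-biomass relationship (r close to 1) is what makes PCR-free mitogenome read counts usable as a biomass proxy; primer bias in metabarcoding is what degrades this relationship per amplicon.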

14.
The tube building polychaete Lanice conchilega is a common and ecologically important species in intertidal and shallow subtidal sands. It builds a characteristic tube with ragged fringes and can retract rapidly into its tube to depths of more than 20 cm. Therefore, it is very difficult to sample L. conchilega individuals, especially with a Van Veen grab. Consequently, many studies have used tube counts as estimates of real densities. This study reports on some aspects to be considered when using tube counts as a density estimate of L. conchilega, based on intertidal and subtidal samples. Due to its accuracy and independence of sampling depth, the tube method is considered the prime method to estimate the density of L. conchilega. However, caution is needed when analyzing samples with fragile young individuals and samples from areas where temporary physical disturbance is likely to occur.

15.
Given the absence of universal marker genes in the viral kingdom, researchers typically use BLAST (with stringent E-values) for taxonomic classification of viral metagenomic sequences. Since the majority of metagenomic sequences originate from hitherto unknown viral groups, using stringent E-values results in most sequences remaining unclassified. Furthermore, using less stringent E-values results in a high number of incorrect taxonomic assignments. The SOrt-ITEMS algorithm provides an approach to address the above issues. Based on alignment parameters, SOrt-ITEMS follows an elaborate work-flow for assigning reads originating from hitherto unknown archaeal/bacterial genomes. In SOrt-ITEMS, alignment parameter thresholds were generated by observing patterns of sequence divergence within and across various taxonomic groups belonging to the bacterial and archaeal kingdoms. However, many taxonomic groups within the viral kingdom lack a typical Linnean-like taxonomic hierarchy. In this paper, we present ProViDE (Program for Viral Diversity Estimation), an algorithm that uses a customized set of alignment parameter thresholds, specifically suited for viral metagenomic sequences. These thresholds capture the pattern of sequence divergence and the non-uniform taxonomic hierarchy observed within and across various taxonomic groups of the viral kingdom. Validation results indicate that the percentage of 'correct' assignments by ProViDE is around 1.7 to 3 times higher than that by the widely used similarity-based method MEGAN. The misclassification rate of ProViDE is around 3 to 19% (as compared to 5 to 42% by MEGAN), indicating significantly better assignment accuracy. The ProViDE software and a supplementary file (containing the supplementary figures and tables referred to in this article) are available for download from http://metagenomics.atc.tcs.com/binning/ProViDE/
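A toy sketch of threshold-based rank assignment, the general idea behind both SOrt-ITEMS and ProViDE: the more divergent the best alignment, the shallower the taxonomic rank at which a read may safely be assigned. ProViDE's real thresholds are empirically derived over several alignment parameters and per taxonomic group; the single percent-identity cutoffs below are invented for illustration only.

```python
# Illustrative (invented) percent-identity cutoffs, ordered deepest first:
LEVELS = [(90.0, "species"), (80.0, "genus"), (60.0, "family")]

def assign_level(percent_identity):
    """Deepest taxonomic rank at which a read may be assigned, given
    the percent identity of its best alignment; reads too divergent
    for any cutoff stay unassigned rather than being misassigned."""
    for threshold, level in LEVELS:
        if percent_identity >= threshold:
            return level
    return "unassigned"
```

The trade-off the abstract describes is visible even in this toy: raising the cutoffs leaves more reads unassigned, while lowering them inflates incorrect assignments.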

16.
17.
18.
Highly accurate estimation of phylogenetic trees for large data sets is difficult, in part because multiple sequence alignments must be accurate for phylogeny estimation methods to be accurate. Coestimation of alignments and trees has been attempted, but currently only SATé estimates reasonably accurate trees and alignments for large data sets in practical time frames (Liu K., Raghavan S., Nelesen S., Linder C.R., Warnow T. 2009b. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 324:1561-1564). Here, we present a modification to the original SATé algorithm that improves upon SATé (which we now call SATé-I) in terms of speed and of phylogenetic and alignment accuracy. SATé-II uses a different divide-and-conquer strategy than SATé-I and so produces smaller, more closely related subsets than SATé-I; as a result, SATé-II produces more accurate alignments and trees, can analyze larger data sets, and runs more efficiently than SATé-I. Generally, SATé is a metamethod that takes an existing multiple sequence alignment method as an input parameter and boosts the quality of that alignment method. SATé-II-boosted alignment methods are significantly more accurate than their unboosted versions, and trees based upon these improved alignments are more accurate than trees based upon the original alignments. Because SATé-I used maximum likelihood (ML) methods that treat gaps as missing data to estimate trees, and because we found a correlation between the quality of tree/alignment pairs and ML scores, we explored the degree to which SATé's performance depends on using ML with gaps treated as missing data to determine the best tree/alignment pair. We present two lines of evidence that using ML with gaps treated as missing data to optimize the alignment and tree produces very poor results. First, we show that the optimization problem where a set of unaligned DNA sequences is given and the output is the tree and alignment of those sequences that maximize likelihood under the Jukes-Cantor model is uninformative in the worst possible sense: for all inputs, all trees optimize the likelihood score. Second, we show that a greedy heuristic that uses GTR+Gamma ML to optimize the alignment and the tree can produce very poor alignments and trees. Therefore, the excellent performance of SATé-II and SATé-I is not because ML is used as an optimization criterion for choosing the best tree/alignment pair, but rather due to the particular divide-and-conquer realignment techniques employed.

19.
MOTIVATION: Time-series measurements of metabolite concentration have become increasingly more common, providing data for building kinetic models of metabolic networks using ordinary differential equations (ODEs). In practice, however, such time-course data are usually incomplete and noisy, and the estimation of kinetic parameters from these data is challenging. Practical limitations due to data and computational aspects, such as solving stiff ODEs and finding global optimal solution to the estimation problem, give motivations to develop a new estimation procedure that can circumvent some of these constraints. RESULTS: In this work, an incremental and iterative parameter estimation method is proposed that combines and iterates between two estimation phases. One phase involves a decoupling method, in which a subset of model parameters that are associated with measured metabolites, are estimated using the minimization of slope errors. Another phase follows, in which the ODE model is solved one equation at a time and the remaining model parameters are obtained by minimizing concentration errors. The performance of this two-phase method was tested on a generic branched metabolic pathway and the glycolytic pathway of Lactococcus lactis. The results showed that the method is efficient in getting accurate parameter estimates, even when some information is missing.
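The decoupling phase can be sketched for the simplest possible case, a single reaction dM/dt = k*S with both S and M measured: finite-difference slopes of the product are regressed on the substrate, so k is estimated without solving any ODE. The function and synthetic data are illustrative assumptions; the paper's method iterates between this slope-error phase and a concentration-error phase on the full model.

```python
import math

def estimate_rate_constant(ts, s_vals, m_vals):
    """Slope-error sketch of the decoupling idea: approximate dM/dt by
    finite differences and fit dM/dt = k * S by least squares (through
    the origin), so no ODE needs to be solved while estimating k."""
    slopes, mids = [], []
    for i in range(len(ts) - 1):
        dt = ts[i + 1] - ts[i]
        slopes.append((m_vals[i + 1] - m_vals[i]) / dt)
        mids.append(0.5 * (s_vals[i + 1] + s_vals[i]))
    return sum(s * m for s, m in zip(slopes, mids)) / sum(m * m for m in mids)

# Synthetic time course for dM/dt = k*S with S(t) = exp(-t) and k = 2,
# which gives M(t) = 2*(1 - exp(-t)):
ts = [0.05 * i for i in range(40)]
s = [math.exp(-t) for t in ts]
m = [2.0 * (1.0 - math.exp(-t)) for t in ts]
k_hat = estimate_rate_constant(ts, s, m)
```

The small residual error in k_hat is pure discretization error from the finite differences; on noisy real data this phase only initializes the parameters, which is why the second, concentration-error phase follows.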

20.
An innovative approach is proposed for the estimation of hourly average solar radiation on a tilted surface. The proposed approach, which is based on artificial neural networks and problem decomposition, demonstrates the following characteristics: accuracy (superior estimation is attained over both conventional approaches and theoretical models); simplicity and efficiency (a small training set is employed, and the training/test patterns involve few easily obtainable parameters); and generalization capability and robustness to noise. Apart from being of interest in meteorology, the accurate and efficient estimation of hourly average solar radiation on a tilted surface is especially important in solar energy applications.
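As a sketch of the kind of model involved: the forward pass of a one-hidden-layer network with sigmoid units, the sort of small feed-forward model that maps a few easily obtainable inputs to an estimate of tilted-surface radiation. The inputs and weights below are hypothetical and untrained; the paper's actual architecture, decomposition, and input parameters are not given in the abstract.

```python
import math

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer network: sigmoid hidden
    units, linear output (suitable for a regression target such as
    hourly average radiation)."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for w, b in zip(w_hidden, b_hidden)]
    return sum(wo * h for wo, h in zip(w_out, hidden)) + b_out

# Toy normalized inputs (hypothetical: e.g. hour of day, horizontal
# radiation, tilt angle) and untrained toy weights:
x = [0.5, 0.3, 0.8]
y = mlp_forward(x,
                w_hidden=[[0.2, -0.1, 0.4], [0.7, 0.1, -0.3]],
                b_hidden=[0.0, 0.1],
                w_out=[1.5, -0.5],
                b_out=0.2)
```

Training such a network (e.g. by gradient descent on squared error against measured radiation) is what the paper's "small training set" refers to; only the inference step is sketched here.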
