Similar literature
Found 20 similar articles (search time: 31 ms)
1.
SUMMARY: AUGIST (accommodating uncertainty in genealogies while inferring species trees) is a new software package for inferring species trees while accommodating uncertainty in gene genealogies. It is written for the Mesquite software system and provides sampling procedures to incorporate uncertainty in gene tree reconstruction while providing confidence estimates for inferred species trees. AVAILABILITY: http://www.lycaenid.org/augist/

2.
This paper develops a novel methodology, the Best Tracer Method (BTM), that substantially overcomes the principal limitations (intertracer inconsistencies and poor precision of recovery) of estimating soil ingestion by specific soil‐based tracers in mass‐balance studies. The BTM incorporates a biological and statistical framework that improves precision of recovery of tracer estimates, markedly reducing input‐output misalignment error resulting from a lack of correspondence between food tracer input and fecal tracer output.

This method is then used to re‐estimate the soil ingestion distribution of previously published soil ingestion estimates from two studies of children (Calabrese et al., 1989; Davis et al., 1990) and one adult study (Calabrese et al., 1990). Revised estimates of soil ingestion are provided for each study. In addition, the results from the two children's studies are combined to form a single estimate of the soil ingestion distribution. These collective findings yield more reliable quantitative estimates of soil ingestion than trace‐element‐specific estimates, an improved understanding of currently published soil ingestion studies, and improved methods that will enhance the design and interpretation of future soil ingestion studies.

With respect to children, the data indicate that the Calabrese et al. (1989) study provides the most reliable estimates of soil ingestion based on the estimated precision of recovery. However, estimates for the combined data of the Calabrese et al. (1989) and Davis et al. (1990) studies include all available children's data from mass balance studies, and thus provide more robust estimates. The collective data suggest that the median child in these studies ingested 30–40 mg/day of soil, while the upper 95% estimate is approximately 200 mg/day. Current data are insufficient to distinguish the children's soil ingestion distribution from that of adults. The revised and improved estimates of soil ingestion for children and adults have important implications for contaminant exposure estimates used in site evaluation risk assessment procedures.


3.
SUMMARY: The program package CopyMap identifies copy number variation from oligo-hybridization and CGH data. Using a time-dependent hidden Markov model to combine evidence of copy number variants (CNVs) across multiple carriers, CopyMap is substantially more accurate than standard hidden Markov methods in identifying CNVs and calling CNV-carriers. Moreover, CopyMap provides more precise estimates of CNV-boundaries. AVAILABILITY: The C source code and detailed documentation for the program CopyMap are available on the Internet at http://www.sph.umich.edu/csg/szoellner/

4.
MOTIVATION: Oligonucleotide expression arrays exhibit systematic and reproducible variation produced by the multiple distinct probes used to represent a gene. Recently, a gene expression index has been proposed that explicitly models probe effects, and provides improved fits of hybridization intensity for arrays containing perfect match (PM) and mismatch (MM) probe pairs. RESULTS: Here we use a combination of analytical arguments and empirical data to show directly that the estimates provided by model-based expression indexes are superior to those provided by commercial software. The improvement is greatest for genes in which probe effects vary substantially, and modeling the PM and MM intensities separately is superior to using the PM-MM differences. To empirically compare expression indexes, we designed a mixing experiment involving three groups of human fibroblast cells (serum starved, serum stimulated, and a 50:50 mixture of starved/stimulated), with six replicate HuGeneFL arrays in each group. Careful spiking of control genes provides evidence that 88-98% of the genes on the array are detectably transcribed, and that the model-based estimates can accurately detect the presence versus absence of a gene. The use of extensive replication from single RNA sources enables exploration of the technical variability of the array.

5.
SUMMARY: The survcomp package provides functions to assess and statistically compare the performance of survival/risk prediction models. It implements state-of-the-art statistics to (i) measure the performance of risk prediction models; (ii) combine these statistical estimates from multiple datasets using a meta-analytical framework; and (iii) statistically compare the performance of competitive models.

6.
Estimates of range‐wide abundance, harvest, and harvest rate are fundamental for sound inferences about the role of exploitation in the dynamics of free‐ranging wildlife populations, but reliability of existing survey methods for abundance estimation is rarely assessed using alternative approaches. North American mallard populations have been surveyed each spring since 1955 using internationally coordinated aerial surveys, but population size can also be estimated with Lincoln's method using banding and harvest data. We estimated late summer population size of adult and juvenile male and female mallards in western, midcontinent, and eastern North America using Lincoln's method of dividing (i) total estimated harvest by estimated harvest rate, the latter calculated as (ii) direct band recovery rate divided by (iii) band reporting rate. Our goal was to compare estimates based on Lincoln's method with traditional estimates based on aerial surveys. Lincoln estimates of adult males and females alive in the period June–September were 4.0 (range: 2.5–5.9), 1.8 (range: 0.6–3.0), and 1.8 (range: 1.3–2.7) times larger than respective aerial survey estimates for the western, midcontinent, and eastern mallard populations, and the two population estimates were only modestly correlated with each other (western: correlation of 0.70, 1993–2011; midcontinent: 0.54, 1961–2011; eastern: 0.50, 1993–2011). Higher Lincoln estimates are predictable given that the geographic scope of inference from Lincoln estimates is the entire population range, whereas sampling frames for aerial surveys are incomplete. Although each estimation method has a number of important potential biases, our review suggests that underestimation of total population size by aerial surveys is the most likely explanation.
In addition to providing measures of total abundance, Lincoln's method provides estimates of fecundity and population sex ratio and could be used in integrated population models to provide greater insights about population dynamics and management of North American mallards and most other harvested species.
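The Lincoln calculation described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the argument names, and the input numbers are hypothetical and are not taken from the study; the arithmetic follows the verbal definition (harvest divided by harvest rate, where harvest rate is direct recovery rate divided by reporting rate).

```python
def lincoln_estimate(harvest, direct_recovery_rate, reporting_rate):
    """Population size as total harvest divided by harvest rate.

    Harvest rate is the direct band recovery rate divided by the band
    reporting rate, as in the verbal definition in the abstract.
    All names here are illustrative, not the authors' notation.
    """
    if harvest < 0:
        raise ValueError("harvest must be non-negative")
    if direct_recovery_rate <= 0:
        raise ValueError("direct recovery rate must be positive")
    if not 0 < reporting_rate <= 1:
        raise ValueError("reporting rate must lie in (0, 1]")
    harvest_rate = direct_recovery_rate / reporting_rate
    return harvest / harvest_rate


# Invented inputs: 1.2 million birds harvested, 6% of bands recovered
# directly, and half of recovered bands actually reported.
population = lincoln_estimate(1_200_000, 0.06, 0.5)
print(f"estimated population: {population:,.0f}")
```

Because the reporting rate sits in the denominator of the harvest rate, an overestimated reporting rate lowers the harvest rate and inflates the population estimate, which is one of the potential biases such a comparison has to weigh.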

7.
Assessing animal population growth curves is an essential feature of field studies in ecology and wildlife management. We used five models to assess population growth rates with a number of sets of population growth rate data. A 'generalized' logistic curve provides a better model than do four other popular models. Use of difference equations for fitting was checked by a comparison of that method and direct fitting of the analytical (integrated) solution for three of the models. Fits to field data indicate that estimates of the asymptote, K, from the 'generalized logistic' and the ordinary logistic agree well enough to support use of estimates of K from the ordinary logistic on data that cannot be satisfactorily fitted with the generalized logistic. Akaike's information criterion is widely used, often with a small-sample version, AICc. Our study of five models indicated a bias in the AICc criterion, so we recommend checking results with estimates of variance about regression for fitted models. Fitting growth curves provides a valuable supplement to, and check on, computer models of populations.
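The two model-comparison quantities named in this abstract, AICc and the variance about regression, can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a least-squares fit with Gaussian errors, counts the error variance as one extra parameter (a common convention, but an assumption here), and uses hypothetical function names.

```python
import math


def aicc_least_squares(residuals, n_curve_params):
    """Small-sample AIC for a least-squares fit with Gaussian errors.

    AIC  = n * ln(RSS / n) + 2k
    AICc = AIC + 2k(k + 1) / (n - k - 1)
    where k counts the curve parameters plus one for the error variance.
    """
    n = len(residuals)
    rss = sum(e * e for e in residuals)
    k = n_curve_params + 1
    if n - k - 1 <= 0:
        raise ValueError("too few observations for the AICc correction")
    if rss <= 0:
        raise ValueError("residual sum of squares must be positive")
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


def variance_about_regression(residuals, n_curve_params):
    """Residual mean square: RSS divided by the residual degrees of freedom."""
    dof = len(residuals) - n_curve_params
    if dof <= 0:
        raise ValueError("no residual degrees of freedom")
    return sum(e * e for e in residuals) / dof


# Invented residuals from a hypothetical two-parameter growth curve.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(aicc_least_squares(residuals, 2))
print(variance_about_regression(residuals, 2))
```

Reporting both numbers for each candidate model gives a simple cross-check of the kind the abstract recommends: if the AICc ranking and the residual variances disagree sharply, the ranking deserves a second look.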

8.
Highlights
► Design of isotopic labeling experiments determines the precision of flux estimates.
► New approaches were developed to select optimal isotopic tracers and measurements.
► Tandem mass spectrometry provides more informative data for 13C flux analysis.
► Parallel labeling experiments can improve the quality of flux estimates.
► New methods are needed for optimal design of parallel labeling experiments.

9.
MOTIVATION: Hidden Markov models (HMMs) calculate the probability that a sequence was generated by a given model. Log-odds scoring provides a context for evaluating this probability, by considering it in relation to a null hypothesis. We have found that using a reverse-sequence null model effectively removes biases owing to sequence length and composition and reduces the number of false positives in a database search. Any scoring system is an arbitrary measure of the quality of database matches. Significance estimates of scores are essential, because they eliminate model- and method-dependent scaling factors, and because they quantify the importance of each match. Accurate computation of the significance of reverse-sequence null model scores presents a problem, because the scores do not fit the extreme-value (Gumbel) distribution commonly used to estimate HMM scores' significance. RESULTS: To get a better estimate of the significance of reverse-sequence null model scores, we derive a theoretical distribution based on the assumption of a Gumbel distribution for raw HMM scores and compare estimates based on this and other distribution families. We derive estimation methods for the parameters of the distributions based on maximum likelihood and on moment matching (least-squares fit for Student's t-distribution). We evaluate the modeled distributions of scores, based on how well they fit the tail of the observed distribution for data not used in the fitting and on the effects of the improved E-values on our HMM-based fold-recognition methods. The theoretical distribution provides some improvement in fitting the tail and in providing fewer false positives in the fold-recognition test. An ad hoc distribution based on assuming a stretched exponential tail does an even better job. The use of Student's t to model the distribution fits well in the middle of the distribution, but provides too heavy a tail. 
The moment-matching methods fit the tails better than maximum-likelihood methods. AVAILABILITY: Information on obtaining the SAM program suite (free for academic use), as well as a server interface, is available at http://www.soe.ucsc.edu/research/compbio/sam.html and the open-source random sequence generator with varying compositional biases is available at http://www.soe.ucsc.edu/research/compbio/gen_sequence

10.
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This series in Advances in Physiology Education provides an opportunity to do just that: we will investigate basic concepts in statistics using the free software package R. Because this series uses R solely as a vehicle with which to explore basic concepts in statistics, I provide the requisite R commands. In this inaugural paper we explore the essential distinction between standard deviation and standard error: a standard deviation estimates the variability among sample observations whereas a standard error of the mean estimates the variability among theoretical sample means. If we fail to report the standard deviation, then we fail to fully report our data. Because it incorporates information about sample size, the standard error of the mean is a misguided estimate of variability among observations. Instead, the standard error of the mean provides an estimate of the uncertainty of the true value of the population mean.
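The distinction drawn in this abstract can be made concrete with a short sketch. The paper itself supplies R commands; the Python below is a stand-in written for this listing, with an invented sample and a hypothetical function name.

```python
import math
import statistics


def sd_and_sem(sample):
    """Return (standard deviation, standard error of the mean).

    The standard deviation describes spread among the observations;
    the standard error divides it by sqrt(n) and so describes the
    uncertainty of the sample mean, shrinking as n grows.
    """
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    sd = statistics.stdev(sample)          # sample SD, n - 1 denominator
    sem = sd / math.sqrt(len(sample))
    return sd, sem


# Invented observations, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
sd, sem = sd_and_sem(sample)
print(f"SD = {sd:.3f}, SEM = {sem:.3f}")
```

Collecting more observations from the same population leaves the standard deviation roughly unchanged while the standard error keeps falling, which is why the standard error says little about variability among the observations themselves.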

11.
Recent statistical analyses suggest that sequencing of pooled samples provides a cost-effective approach to determine genome-wide population genetic parameters. Here we introduce PoPoolation, a toolbox specifically designed for the population genetic analysis of sequence data from pooled individuals. PoPoolation calculates estimates of θ(Watterson), θ(π), and Tajima's D that account for the bias introduced by pooling and sequencing errors, as well as divergence between species. Results of genome-wide analyses can be graphically displayed in a sliding window plot. PoPoolation is written in Perl and R and it builds on commonly used data formats. Its source code can be downloaded from http://code.google.com/p/popoolation/. Furthermore, we evaluate the influence of mapping algorithms, sequencing errors, and read coverage on the accuracy of population genetic parameter estimates from pooled data.
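For orientation, the three statistics named in this abstract can be sketched in their classical form for individually sequenced haplotypes. This is explicitly not PoPoolation's method: the pooling and sequencing-error corrections the toolbox applies are omitted, the function name is hypothetical, and the toy alignment is invented.

```python
import math
from collections import Counter


def diversity_stats(sequences):
    """Classical Watterson's theta, pi, and Tajima's D (per sequence).

    Expects aligned haplotypes of equal length. Returns
    (segregating_sites, theta_w, pi, tajima_d); tajima_d is None when
    its variance term is not positive (e.g. no segregating sites).
    """
    n = len(sequences)
    if n < 2 or len({len(s) for s in sequences}) != 1:
        raise ValueError("need at least two aligned sequences of equal length")

    seg_sites = 0
    pairwise_diffs = 0.0
    for column in zip(*sequences):
        counts = Counter(column).values()
        if len(counts) > 1:
            seg_sites += 1
            # Number of sequence pairs that differ at this site.
            pairwise_diffs += (n * n - sum(c * c for c in counts)) / 2

    pi = pairwise_diffs / (n * (n - 1) / 2)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / (i * i) for i in range(1, n))
    theta_w = seg_sites / a1

    # Constants from Tajima's (1989) variance formula.
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    tajima_d = (pi - theta_w) / math.sqrt(variance) if variance > 0 else None
    return seg_sites, theta_w, pi, tajima_d


# Toy alignment of four haplotypes, invented for illustration.
print(diversity_stats(["AAAA", "AAAT", "AATT", "ATTT"]))
```

With pooled data the allele counts are not observed directly but inferred from read counts, so estimators of this form are biased unless corrected, which is the gap the toolbox is described as addressing.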

12.
Doubts about isonymy   Total citations: 1, self-citations: 0, citations by others: 1
The method of isonymy, developed by Crow and Mange for estimating inbreeding from surname frequencies, requires an assumption that has not been appreciated: It is necessary to assume that all males in some ancestral generation, the founding stock, had unique surnames. Because this assumption is seldom justified in real populations, the applicability of the isonymy method is extremely limited. Even worse, the estimates it provides refer to an unspecified founding stock, and this implies that these estimates are devoid of information.

13.
The sequences technique is frequently used for time domain assessment of the arterial-cardiac baroreceptor reflex sensitivity (BRS). The BRS is estimated by the slope between systolic blood pressure and RR interval values in baroreflex sequences (BSs) and an overall estimate is obtained by slope averaging. However, only 25% of all beats are in BSs, with 60% of those located in 3-beat length segments. Also, in cases of BS absence (usually associated with poor BRS function), the BRS cannot be quantified. Here, baroreflex events (BEs) are introduced and used with global/total slope estimators to improve BRS assessment. The performance of the novel method is evaluated using the EuroBaVar dataset. The events technique benefits from a higher number of beats: 50% of all beats are in BEs with more than 70% exceeding 3-beat length. It always provides a BRS estimate, even when BSs cannot be identified. When BSs are available, estimates from BEs and BSs are highly correlated. The estimates from BEs for the cases without BSs are lower than the estimates for the remaining cases, indicating poorer BRS function. The events technique also offers superior ability to discriminate lying from standing position in the EuroBaVar dataset (23/23 versus 18/23 for the sequences technique).
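The conventional sequences technique that this abstract starts from can be sketched roughly as below. It is a simplified illustration under stated assumptions: a sequence is taken to be a run of at least three beats in which systolic pressure and RR interval change in the same direction, with no minimum-change thresholds and no beat lag (published implementations often apply both). The baroreflex-events method proposed in the abstract is not reproduced, and all names and data are hypothetical.

```python
def find_sequences(sbp, rr, min_beats=3):
    """Return (start, end) index pairs of concordant runs, end exclusive."""
    runs = []
    start, direction = 0, 0
    for i in range(1, len(sbp)):
        d_sbp, d_rr = sbp[i] - sbp[i - 1], rr[i] - rr[i - 1]
        if d_sbp > 0 and d_rr > 0:
            step = 1
        elif d_sbp < 0 and d_rr < 0:
            step = -1
        else:
            step = 0
        if step != direction or step == 0:
            # The current run ends at beat i - 1; keep it if long enough.
            if direction != 0 and i - start >= min_beats:
                runs.append((start, i))
            start = i - 1 if step != 0 else i
            direction = step
    if direction != 0 and len(sbp) - start >= min_beats:
        runs.append((start, len(sbp)))
    return runs


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def sequence_brs(sbp, rr, min_beats=3):
    """Mean RR-versus-SBP slope over all sequences, or None if there are none."""
    runs = find_sequences(sbp, rr, min_beats)
    if not runs:
        return None
    slopes = [ols_slope(sbp[a:b], rr[a:b]) for a, b in runs]
    return sum(slopes) / len(slopes)


# Invented beat-to-beat values: SBP in mmHg, RR interval in ms.
sbp = [120, 122, 124, 126, 125, 123, 121, 121]
rr = [800, 810, 820, 830, 825, 815, 805, 805]
print(find_sequences(sbp, rr), sequence_brs(sbp, rr))
```

The `None` return mirrors the limitation the abstract points out: when no qualifying sequence exists in a recording, the sequences technique gives no estimate at all.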

14.
Bias in the estimation of false discovery rate in microarray studies   Total citations: 4, self-citations: 0, citations by others: 4
MOTIVATION: The false discovery rate (FDR) provides a key statistical assessment for microarray studies. Its value depends on the proportion pi(0) of non-differentially expressed (non-DE) genes. In most microarray studies, many genes have small effects not easily separable from non-DE genes. As a result, current methods often overestimate pi(0) and FDR, leading to unnecessary loss of power in the overall analysis. METHODS: For the common two-sample comparison we derive a natural mixture model of the test statistic and an explicit bias formula in the standard estimation of pi(0). We suggest an improved estimation of pi(0) based on the mixture model and describe a practical likelihood-based procedure for this purpose. RESULTS: The analysis shows that a large bias occurs when pi(0) is far from 1 and when the non-centrality parameters of the distribution of the test statistic are near zero. The theoretical result also explains substantial discrepancies between non-parametric and model-based estimates of pi(0). Simulation studies indicate mixture-model estimates are less biased than standard estimates. The method is applied to breast cancer and lymphoma data examples. AVAILABILITY: An R-package OCplus containing functions to compute pi(0) based on the mixture model, the resulting FDR and other operating characteristics of microarray data, is freely available at http://www.meb.ki.se/~yudpaw CONTACT: yudi.pawitan@meb.ki.se and alexander.ploner@meb.ki.se.

15.
Genetic and demographic estimates of dispersal are often thought to be inconsistent. In this study, we use the damselfly Coenagrion mercuriale (Odonata: Zygoptera) as a model to evaluate directly the relationship between estimates of dispersal rate measured during capture-mark-recapture fieldwork and those made from the spatial pattern of genetic markers in linear and two-dimensional habitats. We estimate the 'neighbourhood size' (Nb) - the product of the mean axial dispersal rate between parent and offspring and the population density - by a previously described technique, here called the regression method. Because C. mercuriale is less philopatric than species investigated previously by the regression method, we evaluate a refined estimator that may be more applicable for relatively mobile species. Results from simulations and empirical data sets reveal that the new estimator performs better under most situations, except when dispersal is very localized relative to population density. Analysis of the C. mercuriale data extends previous results which demonstrated that demographic and genetic estimates of Nb by the regression method are equivalent to within a factor of two at local scales where genetic estimates are less affected by habitat heterogeneity, stochastic processes and/or differential selective regimes. The corollary is that with a little insight into a species' ecology the pattern of spatial genetic structure provides quantitative information on dispersal rates and/or population densities that has real value for conservation management.

16.
Hot weather challenges livestock production but technology exists to offset the challenge if producers have made appropriate strategic decisions. Key issues include understanding the hazards of heat stress, being prepared to offer relief from the heat, recognizing when an animal is in danger, and taking appropriate action. This paper describes our efforts to develop biological response functions; assesses climatic probabilities and performs associated risk analyses; provides inputs for computer models used to make environmental management decisions; and evaluates threshold temperatures as estimates of critical temperature limits for swine, cattle and sheep. Received: 3 September 1998/Accepted: 5 October 1998

17.
Visually seriated radiographs of the proximal femur, proximal humerus, clavicle, and calcaneus from 130 individuals from the Hamann-Todd collection were examined as indicators of skeletal age at death. The clavicle demonstrated the most consistent relationship to age in both sexes. The same radiographs were also seriated by size-normalized optical density as a means of establishing relative radiolucency. In this context, visual seriation proved superior. The four sites studied showed strong divergence in response to age. Since each was sampling bone response from the same individual, it is concluded that bone loss is highly site specific. This demonstrates the individual character of specific skeletal sites. Visual inspection of clavicular radiographs, seriated on a populational basis, provides age estimates that are comparable to anatomical age indicators and provides independent estimates of skeletal age when included in the summary age method (1985: Am. J. Phys. Anthropol. 68:1–14).

18.
In order to understand the electricity use of Internet services, it is important to have accurate estimates for the average electricity intensity of transmitting data through the Internet (measured as kilowatt‐hours per gigabyte [kWh/GB]). This study identifies representative estimates for the average electricity intensity of fixed‐line Internet transmission networks over time and suggests criteria for making accurate estimates in the future. Differences in system boundary, assumptions used, and year to which the data apply significantly affect such estimates. Surprisingly, methodology used is not a major source of error, as has been suggested in the past. This article derives criteria to identify accurate estimates over time and provides a new estimate of 0.06 kWh/GB for 2015. By retroactively applying our criteria to existing studies, we were able to determine that the electricity intensity of data transmission (core and fixed‐line access networks) has decreased by half approximately every 2 years since 2000 (for developed countries), a rate of change comparable to that found in the efficiency of computing more generally.
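The two figures quoted in this abstract (0.06 kWh/GB in 2015, halving approximately every 2 years since 2000) imply a simple exponential trend, sketched below. This is only a back-of-the-envelope reading of the abstract: the exact-halving assumption, the function name, and the restriction to 2000–2015 are choices made for this illustration, and the values it returns are not results reported by the study.

```python
REFERENCE_YEAR = 2015
REFERENCE_INTENSITY_KWH_PER_GB = 0.06   # estimate quoted in the abstract
HALVING_PERIOD_YEARS = 2.0              # "approximately every 2 years"


def implied_intensity_kwh_per_gb(year):
    """Intensity implied by an exact halving every two years.

    Only defined for 2000-2015, the period the abstract describes;
    extrapolating beyond it is not supported by the quoted figures.
    """
    if not 2000 <= year <= REFERENCE_YEAR:
        raise ValueError("trend is only described for 2000-2015")
    halvings = (year - REFERENCE_YEAR) / HALVING_PERIOD_YEARS
    return REFERENCE_INTENSITY_KWH_PER_GB * 0.5 ** halvings


for year in (2015, 2013, 2011, 2000):
    print(year, round(implied_intensity_kwh_per_gb(year), 2), "kWh/GB")
```

Because the halving period is described as approximate, values far from 2015 should be read as order-of-magnitude indications at best; small changes in the assumed period compound quickly over fifteen years.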

19.
R_ST, an analogue of F_ST, provides a convenient approach for estimating levels of genetic differentiation from microsatellite data. This paper examines current approaches for calculating estimates of R_ST and suggests a weighting scheme based on the transformation of allele sizes at loci across data sets. Combined within an analysis of variance framework this scheme yields an estimator of R_ST analogous to the θ estimator of F_ST. Software for the IBM-PC is described which carries out such calculations and assesses the significance of R_ST or Nm estimates using bootstrap and permutation tests.

20.
We provide a novel method, DRISEE (duplicate read inferred sequencing error estimation), to assess sequencing quality (alternatively referred to as "noise" or "error") within and/or between sequencing samples. DRISEE provides positional error estimates that can be used to inform read trimming within a sample. It also provides global (whole sample) error estimates that can be used to identify samples with high or varying levels of sequencing error that may confound downstream analyses, particularly in the case of studies that utilize data from multiple sequencing samples. For shotgun metagenomic data, we believe that DRISEE provides estimates of sequencing error that are more accurate and less constrained by technical limitations than existing methods that rely on reference genomes or the use of scores (e.g. Phred). Here, DRISEE is applied to (non-amplicon) data sets from both the 454 and Illumina platforms. The DRISEE error estimate is obtained by analyzing sets of artifactual duplicate reads (ADRs), a known by-product of both sequencing platforms. We present DRISEE as an open-source, platform-independent method to assess sequencing error in shotgun metagenomic data, and utilize it to discover previously uncharacterized error in de novo sequence data from the 454 and Illumina sequencing platforms.
