Similar Articles
20 similar articles found.
1.
In statistical mechanics, the equilibrium properties of a physical system of particles can be calculated as the statistical average over accessible microstates of the system. In general, these calculations are computationally intractable, since they involve summations over an exponentially large number of microstates. Clustering algorithms are one of the methods used to numerically approximate these sums. The most basic clustering algorithms first subdivide the system into a set of smaller subsets (clusters). Then, interactions between particles within each cluster are treated exactly, while all interactions between different clusters are ignored. These smaller clusters have far fewer microstates, making the summation over those microstates tractable. These algorithms have been previously used for biomolecular computations but remain relatively unexplored in this context. Presented here is a theoretical analysis of the error and computational complexity of the two most basic clustering algorithms previously applied in the context of biomolecular electrostatics. We derive a tight, computationally inexpensive error bound for the equilibrium state of a particle computed via these clustering algorithms. For some practical applications, it is the root mean square error, which can be significantly lower than the error bound, that may be more important. We show that there is a strong empirical relationship between the error bound and the root mean square error, suggesting that the error bound could be used as a computationally inexpensive metric for predicting the accuracy of clustering algorithms in practical applications. An example of error analysis for one such application, computation of the average charge of ionizable amino acids in proteins, is given, demonstrating that the clustering algorithm can be accurate enough for practical purposes.
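The exact-within, ignore-between cluster scheme described above can be sketched on a toy two-state ("spin") system rather than the paper's electrostatic model; the couplings `J`, fields `h`, and the ±1 state space below are hypothetical stand-ins chosen only to keep the summations small enough to enumerate:

```python
import itertools
import math

def boltzmann_average(J, h, beta=1.0):
    """Exact thermal average <s_i> for +/-1 states with pair couplings J and fields h."""
    n = len(h)
    Z, avg = 0.0, [0.0] * n
    for s in itertools.product((-1, 1), repeat=n):   # all 2^n microstates
        E = -sum(h[i] * s[i] for i in range(n))
        E -= sum(J.get((i, j), 0.0) * s[i] * s[j]
                 for i in range(n) for j in range(i + 1, n))
        w = math.exp(-beta * E)
        Z += w
        for i in range(n):
            avg[i] += w * s[i]
    return [a / Z for a in avg]

def clustered_average(J, h, clusters, beta=1.0):
    """Cluster approximation: drop inter-cluster couplings, solve each cluster exactly."""
    out = [0.0] * len(h)
    for c in clusters:
        # keep only couplings internal to this cluster, re-indexed locally
        subJ = {(a, b): v
                for (i, j), v in J.items() if i in c and j in c
                for a, b in [(c.index(i), c.index(j))]}
        sub = boltzmann_average(subJ, [h[i] for i in c], beta)
        for k, i in enumerate(c):
            out[i] = sub[k]
    return out
```

When every coupling happens to lie inside a cluster, the approximation reproduces the exact average, which is a convenient sanity check on both routines.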

2.
Genetic similarities within and between human populations
The proportion of human genetic variation due to differences between populations is modest, and individuals from different populations can be genetically more similar than individuals from the same population. Yet sufficient genetic data can permit accurate classification of individuals into populations. Both findings can be obtained from the same data set, using the same number of polymorphic loci. This article explains why. Our analysis focuses on the frequency, omega, with which a pair of random individuals from two different populations is genetically more similar than a pair of individuals randomly selected from any single population. We compare omega to the error rates of several classification methods, using data sets that vary in number of loci, average allele frequency, populations sampled, and polymorphism ascertainment strategy. We demonstrate that classification methods achieve higher discriminatory power than omega because of their use of aggregate properties of populations. The number of loci analyzed is the most critical variable: with 100 polymorphisms, accurate classification is possible, but omega remains sizable, even when using populations as distinct as sub-Saharan Africans and Europeans. Phenotypes controlled by a dozen or fewer loci can therefore be expected to show substantial overlap between human populations. This provides empirical justification for caution when using population labels in biomedical settings, with broad implications for personalized medicine, pharmacogenetics, and the meaning of race.
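The quantity omega lends itself to direct Monte Carlo estimation. The sketch below assumes a deliberately simplified haploid model with independent biallelic loci and exaggerated between-population divergence (allele frequencies 0.2 vs 0.8 at every locus), so the values it produces are far smaller than for real human populations; ties are split evenly:

```python
import random

def simulate_individual(freqs, rng):
    # haploid toy genotype: allele 1 at locus l with probability freqs[l]
    return [1 if rng.random() < f else 0 for f in freqs]

def omega(freq_a, freq_b, n_pairs=2000, seed=1):
    """Monte Carlo estimate of the frequency with which a between-population
    pair is more similar than a within-population pair (ties count 1/2)."""
    rng = random.Random(seed)
    hits = ties = 0
    for _ in range(n_pairs):
        # similarity = number of loci at which the two genotypes agree
        sim = lambda x, y: sum(a == b for a, b in zip(x, y))
        between = sim(simulate_individual(freq_a, rng),
                      simulate_individual(freq_b, rng))
        pop = freq_a if rng.random() < 0.5 else freq_b
        within = sim(simulate_individual(pop, rng),
                     simulate_individual(pop, rng))
        hits += between > within
        ties += between == within
    return (hits + 0.5 * ties) / n_pairs
```

Raising the number of loci from 10 to 100 while holding per-locus frequencies fixed drives this toy omega toward zero, illustrating why the number of loci is the critical variable.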

3.
With increasing force, genetic divergence of mitochondrial DNA (mtDNA) is being argued as the primary tool for discovery of animal species. Two thresholds of single-gene divergence have been proposed: reciprocal monophyly, and 10 times greater genetic divergence between than within species (the "10x rule"). To explore quantitatively the utility of each approach, we couple neutral coalescent theory and the classical Bateson-Dobzhansky-Muller (BDM) model of speciation. The joint stochastic dynamics of these two processes demonstrate that both thresholds fail to "discover" many reproductively isolated lineages under a single incompatibility BDM model, especially when BDM loci have been subject to divergent selection. Only when populations have been isolated for > 4 million generations did these thresholds achieve error rates of < 10% under our model that incorporates variable population sizes. The high error rate evident in simulations is corroborated with six empirical data sets. These properties suggest that single-gene, high-throughput approaches to discovering new animal species will bias large-scale biodiversity surveys, particularly toward missing reproductively isolated lineages that have emerged by divergent selection or other mechanisms that accelerate reproductive isolation. Because single-gene thresholds for species discovery can result in substantial error at recent divergence times, they will misrepresent the correspondence between recently isolated populations and reproductively isolated lineages (= species).

4.
1. Better understanding of the mechanisms affecting demographic variation in ungulate populations is needed to support sustainable management of harvested populations. While studies of moose Alces alces L. populations have previously explored temporal variation in demographic processes, managers responsible for populations that span large heterogeneous landscapes would benefit from an understanding of how demography varies across biogeographical gradients in climate and other population drivers. Evidence of thresholds in population response to manageable and un-manageable drivers could aid resource managers in identifying limits to the magnitude of sustainable change. 2. Generalized additive models (GAMs) were used to evaluate the relative importance of population density, habitat abundance, summer and winter climatic conditions, primary production, and harvest intensity in explaining spatial variation in moose vital rates in Ontario, Canada. Tree regression was used to test for thresholds in the magnitudes of environmental predictor variables that significantly affected population vital rates. 3. Moose population growth rate was negatively related to moose density and positively related to the abundance of mixed deciduous habitat abundant in forage. Calf recruitment was negatively related to a later start of the growing season and calf harvest. The ratio of bulls to cows was related to male harvest and hunter access, and thresholds were evident in predictor variables for all vital rate models. 4. Findings indicate that the contributions of density-dependent and independent factors can vary depending on the scale of population process. The importance of density dependence and habitat supply to low-density ungulate populations was evident, and management strategies for ungulates may be improved by explicitly linking forest management and harvest. Findings emphasize the importance of considering summer climatic influences on ungulate populations, as recruitment in moose was more sensitive to the timing of vegetation green-up than to winter severity. The efficacy of management decisions for harvested ungulates may require regional shifts in targets where populations span bioclimatic gradients. The use of GAMs in combination with recursive partitioning was demonstrated to be an informative analytical framework that captured nonlinear relationships common in natural processes and thresholds that are relevant to population management in diverse systems.

5.
JINLIANG WANG. Molecular Ecology, 2009, 18(10): 2148-2164
Equations for the effective size (Ne) of a population were derived in terms of the frequencies with which a pair of offspring taken at random from the population are sibs sharing one or both parents. Based on these equations, a novel method (called the sibship assignment method) was proposed to infer Ne from the sibship frequencies estimated from a sibship assignment analysis, using the multilocus genotypes of a sample of offspring taken at random from a single cohort in a population. Comparative analyses of extensive simulated data and some empirical data clearly demonstrated that the sibship assignment method is much more accurate [measured by the root mean squared error, RMSE, of 1/(2Ne)] than other methods such as the heterozygote excess method, the linkage disequilibrium method, and the temporal method. The RMSE of 1/(2Ne) from the sibship assignment method is typically a small fraction of that from other methods. The new method is also more general and flexible than other methods. It can be applied to populations with nonoverlapping generations of both diploid and haplodiploid species under random or nonrandom mating, using either codominant or dominant markers. It can also be applied to the estimation of Ne for a subpopulation with immigration. With some modification, it could be applied to monoecious diploid populations with self-fertilization, and to populations with overlapping generations.
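The core idea, that the frequency of sib pairs in a random offspring sample carries information about the number of parents, can be shown with a toy moment estimator. This is not Wang's likelihood-based sibship assignment method: real data would require inferring sibship from multilocus genotypes, whereas here maternity is assumed known for the demonstration.

```python
import random
from itertools import combinations

def estimate_parent_number(offspring_mothers):
    """Toy moment estimator: under random mating, two random offspring share a
    mother with probability ~1/Nf, so Nf_hat = 1 / Q_hat."""
    pairs = list(combinations(offspring_mothers, 2))
    q = sum(a == b for a, b in pairs) / len(pairs)
    return float("inf") if q == 0 else 1.0 / q

rng = random.Random(42)
true_mothers = 50
# each of 200 sampled offspring has a mother drawn uniformly from 50 dams
moms = [rng.randrange(true_mothers) for _ in range(200)]
nf_hat = estimate_parent_number(moms)   # should land near 50
```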

6.
The emergence of novel respiratory pathogens can challenge the capacity of key health care resources, such as intensive care units, that are constrained to serve only specific geographical populations. An ability to predict the magnitude and timing of peak incidence at the scale of a single large population would help to accurately assess the value of interventions designed to reduce that peak. However, current disease-dynamic theory does not provide a clear understanding of the relationship between: epidemic trajectories at the scale of interest (e.g. city); population mobility; and higher resolution spatial effects (e.g. transmission within small neighbourhoods). Here, we used a spatially-explicit stochastic meta-population model of arbitrary spatial resolution to determine the effect of resolution on model-derived epidemic trajectories. We simulated an influenza-like pathogen spreading across theoretical and actual population densities and varied our assumptions about mobility using Latin-Hypercube sampling. Even though, by design, cumulative attack rates were the same for all resolutions and mobilities, peak incidences were different. Clear thresholds existed for all tested populations, such that models with resolutions lower than the threshold substantially overestimated population-wide peak incidence. The effect of resolution was most important in populations which were of lower density and lower mobility. With the expectation of accurate spatial incidence datasets in the near future, our objective was to provide a framework for how to use these data correctly in a spatial meta-population model. Our results suggest that there is a fundamental spatial resolution for any pathogen-population pair. If underlying interactions between pathogens and spatially heterogeneous populations are represented at this resolution or higher, accurate predictions of peak incidence for city-scale epidemics are feasible.
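For contrast with the spatially resolved model in the study, a single well-mixed patch (the coarsest possible resolution, exactly the case the authors caution against) already shows how peak timing and magnitude are extracted from a trajectory; the parameter values below are arbitrary illustrations:

```python
def sir_peak(beta=0.3, gamma=0.1, n=1_000_000, i0=10, days=365):
    """Discrete-time deterministic SIR for one well-mixed patch.
    Returns (peak_day, peak_incidence), where incidence = new infections/day."""
    s, i, r = n - i0, i0, 0
    peak_day, peak_inc = 0, 0.0
    for day in range(1, days + 1):
        new_inf = beta * s * i / n      # mass-action transmission
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        if new_inf > peak_inc:
            peak_day, peak_inc = day, new_inf
    return peak_day, peak_inc
```

A meta-population version would run one such compartmental model per patch and couple patches through a mobility matrix; the paper's point is that the patch size chosen for that coupling changes the predicted peak even when the cumulative attack rate does not.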

7.
A scaling rule of ecological theory, accepted but lacking experimental confirmation, is that the magnitude of fluctuations in population densities due to demographic stochasticity scales inversely with the square root of population numbers. This supposition is based on analyses of models exhibiting exponential growth or stable equilibria. Using two quantitative measures, we extend the scaling rule to situations in which population densities fluctuate due to nonlinear deterministic dynamics. These measures are applied to populations of the flour beetle Tribolium castaneum that display chaotic dynamics in both 20-g and 60-g habitats. Populations cultured in the larger habitat exhibit a clarification of the deterministic dynamics, which follows the inverse square root rule. Lattice effects, a deterministic phenomenon caused by the discrete nature of individuals, can cause deviations from the scaling rule when population numbers are small. The scaling rule is robust to the probability distribution used to model demographic variation among individuals.
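The inverse square root scaling is easy to verify numerically for the simplest case of pure demographic (counting) noise. The sketch assumes Poisson-distributed counts approximated by a Gaussian; this is an illustrative assumption, not the beetle model of the study:

```python
import random
import statistics

def density_sd(habitat_size, mean_density=100.0, reps=20000, seed=7):
    """SD of realized density when counts are Poisson (pure demographic noise).
    The Poisson is approximated by a Gaussian, valid for large expected counts."""
    rng = random.Random(seed)
    lam = mean_density * habitat_size          # expected count
    densities = [rng.gauss(lam, lam ** 0.5) / habitat_size for _ in range(reps)]
    return statistics.pstdev(densities)

# 20 vs 60 echoes the 20-g and 60-g habitats; expect sd ratio sqrt(60/20)
ratio = density_sd(20) / density_sd(60)
```

With mean density held fixed, tripling habitat size shrinks density fluctuations by a factor of sqrt(3), the inverse square root rule.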

8.
9.
Consider the problem of making an adjusted comparison of the medians of two populations on an interval type outcome variable. A common method of doing this is through the use of a linear model requiring the residuals to be normally distributed. We describe here two methods based on a linear model after Box-Cox transformation of the outcome variable. The methods require a reference population, which could be either of the populations under study or their aggregate. We use simulation to compare the new procedures with the comparison of normal means procedure and other procedures proposed for this problem. It is found that the procedure based on comparison of the predicted values obtained from the observed covariates of the reference population has higher power for testing and smaller mean square error of estimation than the other methods, while maintaining reasonable control of the type I error rate. We illustrate the methods by analyzing the duration of the second stage of labor for women in two large observational studies (Collaborative Perinatal Project and Consortium on Safe Labor) separated by 50 years. We recommend the method based on comparison of the predicted values of the transformed outcomes, with careful attention to how close the resulting residual distribution is to normal.
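The back-transform trick at the heart of the method (medians survive monotone transforms, and the mean equals the median on an approximately normal transformed scale) can be sketched as follows; the lognormal outcome and the fixed choice of lambda = 0 are illustrative assumptions, not the labor-duration analysis:

```python
import math
import random
import statistics

def box_cox(y, lam):
    """Box-Cox transform; lam == 0 is the log transform."""
    return [math.log(v) if lam == 0 else (v ** lam - 1) / lam for v in y]

def median_via_transform(y, lam):
    """If the transformed outcome is ~normal, its mean equals its median, and
    the monotone back-transform turns that mean into a median estimate on the
    original scale."""
    m = statistics.fmean(box_cox(y, lam))
    return math.exp(m) if lam == 0 else (lam * m + 1) ** (1 / lam)

rng = random.Random(3)
y = [math.exp(rng.gauss(1.0, 0.5)) for _ in range(500)]   # lognormal sample
est = median_via_transform(y, 0)    # should sit near the sample median
```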

10.
Motion capture systems are widely used to measure human kinematics. Nevertheless, users must account for system errors when evaluating their results. Most validation techniques for these systems are based on relative distance and displacement measurements. In contrast, our study aimed to analyse the absolute volume accuracy of optical motion capture systems by means of an engineering surveying reference measurement of the marker coordinates (uncertainty: 0.75 mm). The method is exemplified on an 18-camera OptiTrack Flex13 motion capture system. The absolute accuracy was defined by the root mean square error (RMSE) between the coordinates measured by the camera system and by engineering surveying (micro-triangulation). The original RMSE of 1.82 mm, caused mainly by scaling error, was reduced to 0.77 mm, while the correlation of errors with their distance from the origin fell from 0.855 to 0.209. A simpler but less accurate absolute accuracy compensation method, using a tape measure over large distances, was also tested; it produced scaling compensation similar to that of the surveying method and of direct wand-size compensation with a high-precision 3D scanner. The presented validation methods can be less precise in some respects than previous techniques, but they address an error type which has not been, and cannot be, studied with the previous validation methods.
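The scaling-error compensation reported above amounts to fitting a single global scale factor by least squares; the sketch below uses made-up 1-D distances in place of real 3-D marker coordinates:

```python
def rmse(a, b):
    """Root mean square error between two coordinate lists."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

def best_scale(measured, reference):
    """Least-squares global scale factor s minimizing sum((s*m - r)^2)."""
    return (sum(m * r for m, r in zip(measured, reference))
            / sum(m * m for m in measured))

# hypothetical marker distances (mm): ~0.1% scaling error plus small noise
reference = [500.0, 1000.0, 1500.0, 2000.0, 2500.0]
measured  = [500.6, 1001.1, 1501.4, 2002.1, 2502.4]

s = best_scale(measured, reference)
before = rmse(measured, reference)
after  = rmse([s * m for m in measured], reference)
```

Because the dominant error grows linearly with distance from the origin, one fitted scale factor removes most of it; the residual RMSE reflects the remaining non-scaling noise.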

11.
Performance of neuronal population coding is investigated numerically, in neurons with Gaussian tuning functions of various widths and noise ratios. The present model is applicable to both direction coding and orientation coding. It is shown that the coding error exhibits peculiar dependence on the width of the tuning function and that the dependence under the influence of noise is different from that of the noise-free case. In the absence of noise, the coding error increases monotonically with the width of the tuning function. The increment obeys the power law (the exponent estimated is 0.501) when the width is less than the critical value. In this region of the width a scaling law is obtained, which shows that the root-mean-square error is proportional to the square root of the ratio of the width of the tuning function to the population size. When the width exceeds the critical value, the coding error increases more rapidly than the power law. The reason for this anomalous increase, not reported previously, is discussed. Existence of noise changes the dependence of the coding error on the width of the tuning function. Unlike the noise-free case, the error under the influence of noise becomes minimum at an intermediate value of the width. The width that gives the minimum coding error is termed the optimum width in this article. The numerical results suggest that the optimum width is roughly proportional to the square root of the noise ratio but has only a weak dependence on the population size. It is further shown that the coding error for the optimum width increases sharply when the noise ratio exceeds about 0.5 and is inversely proportional to the square root of the population size.
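The noise-free part of the setup can be sketched with a population-vector decoder over evenly spaced preferred directions; the von Mises style tuning curve and the decoder choice are illustrative assumptions (the paper evaluates coding error more generally):

```python
import math

def decode_direction(theta, n_neurons=16, width=0.5):
    """Population-vector decoding of a direction from noise-free responses
    of neurons with circular Gaussian-like (von Mises) tuning curves."""
    prefs = [2 * math.pi * k / n_neurons for k in range(n_neurons)]
    rates = [math.exp((math.cos(theta - p) - 1) / width ** 2) for p in prefs]
    # vector sum of preferred directions weighted by firing rates
    x = sum(r * math.cos(p) for r, p in zip(rates, prefs))
    y = sum(r * math.sin(p) for r, p in zip(rates, prefs))
    return math.atan2(y, x)

err = abs(decode_direction(1.2) - 1.2)   # tiny in the noise-free case
```

With evenly spaced preferred directions and a smooth symmetric tuning curve, the population vector recovers the stimulus direction almost exactly; the discretization error the paper studies becomes visible only for much narrower tuning or much smaller populations.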

12.
We examine the degree to which fitting simple dynamic models to time series of population counts can predict extinction probabilities. This is both an active branch of ecological theory and an important practical topic for resource managers. We introduce an approach that is complementary to recently developed techniques for estimating extinction risks (e.g., diffusion approximations) and, like them, requires only count data rather than the detailed ecological information available for traditional population viability analyses. Assuming process error, we use four different models of population growth to generate snapshots of population dynamics via time series of the lengths commonly available to ecologists. We then ask to what extent we can identify which of several broad classes of population dynamics is evident in the time series snapshot. Along the way, we introduce the idea of "variation thresholds," which are the maximum amount of process error that a population may withstand and still have a specified probability of surviving for a given length of time. We then show how these thresholds may be useful to both ecologists and resource managers, particularly when dealing with large numbers of poorly understood species, a common problem faced by those designing biodiversity reserves.
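A "variation threshold" of the kind introduced above can be approximated by Monte Carlo simulation plus bisection; the stochastic exponential-growth model, the 50-year horizon, and the quasi-extinction floor below are hypothetical choices for illustration, not the paper's four models:

```python
import math
import random

def extinction_prob(sigma, n0=100.0, r=0.05, years=50, floor=1.0,
                    reps=1000, seed=11):
    """Monte Carlo extinction probability for stochastic exponential growth:
    log N(t+1) = log N(t) + r + sigma * eps; extinction when N drops below floor."""
    rng = random.Random(seed)
    ext = 0
    for _ in range(reps):
        logn = math.log(n0)
        for _ in range(years):
            logn += r + sigma * rng.gauss(0, 1)
            if logn < math.log(floor):
                ext += 1
                break
    return ext / reps

def variation_threshold(p_target=0.05, **kw):
    """Largest process error sigma keeping extinction probability <= p_target
    (bisection over sigma, relying on approximate monotonicity)."""
    lo, hi = 0.0, 2.0
    for _ in range(25):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if extinction_prob(mid, **kw) <= p_target else (lo, mid)
    return lo
```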

13.
Soil organic carbon is a key soil property related to soil fertility, aggregate stability and the exchange of CO2 with the atmosphere. Existing soil maps and inventories can rarely be used to monitor the state and evolution in soil organic carbon content due to their poor spatial resolution, lack of consistency and high updating costs. Visible and Near Infrared diffuse reflectance spectroscopy is an alternative method to provide cheap and high-density soil data. However, there are still some uncertainties on its capacity to produce reliable predictions for areas characterized by large soil diversity. Using a large-scale EU soil survey of about 20,000 samples and covering 23 countries, we assessed the performance of reflectance spectroscopy for the prediction of soil organic carbon content. The best calibrations achieved a root mean square error ranging from 4 to 15 g C kg−1 for mineral soils and a root mean square error of 50 g C kg−1 for organic soil materials. Model errors are shown to be related to the levels of soil organic carbon and variations in other soil properties such as sand and clay content. Although errors are ∼5 times larger than the reproducibility error of the laboratory method, reflectance spectroscopy provides unbiased predictions of the soil organic carbon content. Such estimates could be used for assessing the mean soil organic carbon content of large geographical entities or countries. This study is a first step towards providing uniform continental-scale spectroscopic estimations of soil organic carbon, meeting an increasing demand for information on the state of the soil that can be used in biogeochemical models and the monitoring of soil degradation.

14.
Massively parallel pyrosequencing of the small subunit (16S) ribosomal RNA gene has revealed that the extent of rare microbial populations in several environments, the 'rare biosphere', is orders of magnitude higher than previously thought. One important caveat with this method is that sequencing error could artificially inflate diversity estimates. Although the per-base error of 16S rDNA amplicon pyrosequencing has been shown to be as good as or lower than Sanger sequencing, no direct assessments of pyrosequencing errors on diversity estimates have been reported. Using only Escherichia coli MG1655 as a reference template, we find that 16S rDNA diversity is grossly overestimated unless relatively stringent read quality filtering and low clustering thresholds are applied. In particular, the common practice of removing reads with unresolved bases and anomalous read lengths is insufficient to ensure accurate estimates of microbial diversity. Furthermore, common and reproducible homopolymer length errors can result in relatively abundant spurious phylotypes further confounding data interpretation. We suggest that stringent quality-based trimming of 16S pyrotags and clustering thresholds no greater than 97% identity should be used to avoid overestimates of the rare biosphere.
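The clustering-threshold effect is easy to demonstrate with a toy greedy clusterer over equal-length reads; real pipelines align reads and handle indels (including the homopolymer errors discussed above), which this sketch deliberately ignores:

```python
def identity(a, b):
    """Fraction of matching positions (toy metric for equal-length reads)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_cluster(reads, threshold=0.97):
    """Assign each read to the first cluster whose seed it matches at
    >= threshold identity, else start a new cluster (toy OTU clustering)."""
    seeds, clusters = [], []
    for r in reads:
        for k, s in enumerate(seeds):
            if identity(r, s) >= threshold:
                clusters[k].append(r)
                break
        else:
            seeds.append(r)
            clusters.append([r])
    return clusters

base = "ACGT" * 25                    # a 100-bp "reference" read
one_err  = "T" + base[1:]             # 99% identity: a typical sequencing error
five_err = "CATGC" + base[5:]         # 95% identity: below the 97% cutoff
clusters = greedy_cluster([base, one_err, five_err])
```

At a 97% threshold the single-error read folds into the reference cluster while the five-error read founds a spurious second phylotype, the inflation mechanism the study quantifies.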

15.
刘玮, 辛美丽, 吕芳, 刘梦侠, 丁刚, 吴海一. Acta Ecologica Sinica (生态学报), 2018, 38(6): 2031-2040
Sargassum thunbergii is a key structural species of intertidal seaweed beds, but it has remained unclear which statistical model is best suited to studying its population distribution. We surveyed fifteen 25 m2 plots at Neizhe Island, Rongcheng, Shandong Province, and compared the accuracy of the arithmetic mean model, the inverse distance weighting (IDW) model, and the ordinary kriging model, analysing how population density, clumping index, and coverage affect each model's accuracy. The IDW model performed most stably and had the lowest average error (mean absolute error 39.1 individuals, root mean square error 53.3 individuals, deviation rate 13.0%), whereas the arithmetic mean model showed the largest fluctuation in accuracy and the highest average error (mean absolute error 53.8 individuals, root mean square error 65.3 individuals, deviation rate 14.6%). Population density and coverage had no significant effect on model accuracy (P > 0.05), but the clumping index significantly affected the mean absolute error and root mean square error of all three models (P < 0.05). Overall, the accuracy differences among the three models were modest, and accuracy was influenced by the clumping index on some measures. The IDW and ordinary kriging models were more stable, had smaller mean errors, and both captured the spatial distribution of S. thunbergii populations, giving them an advantage for estimating the abundance and distribution of this species.
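Of the interpolation models compared above, inverse distance weighting is the simplest to sketch; the quadrat counts and coordinates below are invented for illustration:

```python
def idw(points, target, power=2):
    """Inverse distance weighted estimate at `target` from (x, y, value) samples."""
    num = den = 0.0
    for x, y, v in points:
        d2 = (x - target[0]) ** 2 + (y - target[1]) ** 2
        if d2 == 0:
            return v            # exact interpolator: honours sampled locations
        w = d2 ** (-power / 2)  # weight = 1 / distance^power
        num += w * v
        den += w
    return num / den

# hypothetical quadrat counts (individuals per quadrat) at plot corners
samples = [(0, 0, 40), (0, 5, 60), (5, 0, 80), (5, 5, 100)]
center = idw(samples, (2.5, 2.5))   # all corners equidistant -> plain average
mean_model = sum(v for _, _, v in samples) / len(samples)
```

Unlike the arithmetic mean model, IDW gives nearby quadrats more weight, so it tracks local aggregation; at a point equidistant from all samples the two coincide.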

16.

Because trees can positively influence local environments in urban ecosystems, it is important to measure their morphological characteristics, such as height and diameter at breast height (DBH). However, measuring these data for each individual tree is a time-consuming process that requires a great deal of manpower. In this study, we investigated the feasibility of using mobile LiDAR to estimate tree height and DBH along urban streets and in urban parks. We compared measurements from a mobile LiDAR unit with field measurements of tree height and DBH in urban parks and streets. The height-above-ground and Pratt circle fit methods were applied to calculate tree height and DBH, respectively. The LiDAR-estimated tree heights were highly accurate albeit slightly underestimated, with a root mean square error of 0.359 m for the street trees and 0.462 m for the park trees. On the other hand, the estimated DBHs were moderately accurate and overestimated, with a root mean square error of 3.77 cm for the street trees and 8.95 cm for the park trees. Densely planted trees in the park and obstacles in urban areas result in “shadows” (areas with no data), reducing accuracy. Irregular trunk shapes and scanned data that did not include full data point coverage of every trunk were the reasons for the errors. Despite these errors, this study highlights the potential of tree measurements obtained with mobile terrestrial LiDAR platforms to be scaled up from point-based locations to neighborhood-scale and city-scale inventories.
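The DBH step can be illustrated with a closed-form circle through three trunk points, a minimal stand-in for the least-squares Pratt circle fit used in the study (which handles many noisy points); the 0.15 m radius below is an invented example:

```python
import math

def circle_through(p1, p2, p3):
    """Circumcircle of three (x, y) points: returns (centre, radius)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    ux = ((x1**2 + y1**2) * (y2 - y3) + (x2**2 + y2**2) * (y3 - y1)
          + (x3**2 + y3**2) * (y1 - y2)) / d
    uy = ((x1**2 + y1**2) * (x3 - x2) + (x2**2 + y2**2) * (x1 - x3)
          + (x3**2 + y3**2) * (x2 - x1)) / d
    return (ux, uy), math.hypot(x1 - ux, y1 - uy)

# hypothetical LiDAR returns on a trunk cross-section at breast height
r_true = 0.15
pts = [(r_true * math.cos(t), r_true * math.sin(t)) for t in (0.3, 1.8, 4.0)]
_, r_fit = circle_through(*pts)
dbh = 2 * r_fit               # diameter at breast height, metres
```

The overestimation and trunk-shape errors reported above arise precisely because real point clouds cover only part of the circumference and trunks are not perfect circles, conditions a least-squares fit mitigates but cannot eliminate.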


17.
Inference of population structure under a Dirichlet process model
Huelsenbeck JP, Andolfatto P. Genetics, 2007, 175(4): 1787-1802
Inferring population structure from genetic data sampled from some number of individuals is a formidable statistical problem. One widely used approach considers the number of populations to be fixed and calculates the posterior probability of assigning individuals to each population. More recently, the assignment of individuals to populations and the number of populations have both been considered random variables that follow a Dirichlet process prior. We examined the statistical behavior of assignment of individuals to populations under a Dirichlet process prior. First, we examined a best-case scenario, in which all of the assumptions of the Dirichlet process prior were satisfied, by generating data under a Dirichlet process prior. Second, we examined the performance of the method when the genetic data were generated under a population genetics model with symmetric migration between populations. We examined the accuracy of population assignment using a distance on partitions. The method can be quite accurate with a moderate number of loci. As expected, inferences on the number of populations are more accurate when θ = 4Neμ is large and when the migration rate (4Nem) is low. We also examined the sensitivity of inferences of population structure to choice of the parameter of the Dirichlet process model. Although inferences could be sensitive to the choice of the prior on the number of populations, this sensitivity occurred when the number of loci sampled was small; inferences are more robust to the prior on the number of populations when the number of sampled loci is large. Finally, we discuss several methods for summarizing the results of a Bayesian Markov chain Monte Carlo (MCMC) analysis of population structure. We develop the notion of the mean population partition, which is the partition of individuals to populations that minimizes the squared partition distance to the partitions sampled by the MCMC algorithm.
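The Dirichlet process prior over partitions can be sampled directly via its Chinese restaurant process construction, which is one way the "best-case" data of the first experiment can be generated; the values of n, alpha, and the seed below are arbitrary:

```python
import random

def crp_partition(n, alpha, seed=0):
    """Sample an assignment of n individuals to populations from the
    Chinese restaurant process (Dirichlet process prior on partitions)."""
    rng = random.Random(seed)
    counts = []          # individuals per population so far
    labels = []
    for i in range(n):
        # join existing population k w.p. counts[k]/(i+alpha),
        # open a new population w.p. alpha/(i+alpha)
        r = rng.uniform(0, i + alpha)
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if r < acc:
                counts[k] += 1
                labels.append(k)
                break
        else:
            labels.append(len(counts))
            counts.append(1)
    return labels

labels = crp_partition(100, alpha=2.0, seed=5)
```

The concentration parameter alpha plays the role of the prior on the number of populations discussed above: larger alpha yields more, smaller populations on average (roughly alpha * log(n) of them).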

18.
Accurately predicting the populations that have difficulty accessing drinking water because of drought, and taking appropriate mitigation measures, can minimize economic loss and personal injury. Taking the 2013 Guizhou extreme summer drought as an example, on the basis of collecting meteorological, basic geographic information, socioeconomic, and disaster effect data for the study area, a rapid assessment model based on a backpropagation (BP) neural network was constructed. Six factors were chosen as network inputs: average monthly precipitation, elevation (from a Digital Elevation Model, DEM), river density, population density, road density, and gross domestic product (GDP). The population affected by drought was the model's output. Using samples from 50 drought-affected counties in Guizhou Province for network training, the model's parameters were optimized. Using the trained model, the populations in need were predicted for the other 74 drought-affected counties. The accuracy of the prediction model, represented by the coefficient of determination (R2) and the normalized root mean square error (N-RMSE), yielded 0.7736 for R2 and 0.0070 for N-RMSE. The method may provide an effective reference for rapid assessment of the population in need and disaster effect verification.
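The two reported accuracy metrics are straightforward to compute; note that the normalization convention for N-RMSE (here, by the observed range) is an assumption, since several conventions exist, and the counts below are invented:

```python
import math

def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1 - ss_res / ss_tot

def n_rmse(obs, pred):
    """RMSE normalized by the observed range (one common convention)."""
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))
    return rmse / (max(obs) - min(obs))

obs  = [1200, 3400, 800, 5600, 2100]   # hypothetical affected-population counts
pred = [1000, 3600, 900, 5300, 2500]   # hypothetical network predictions
```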

19.
Remediation of contaminated sites requires information on upper concentration limits of chemicals in environmental media that are protective of ecological receptors. These upper concentration limits can be considered ecological preliminary remediation goals (EcoPRGs). The motivation for developing EcoPRGs was to provide risk managers with a simple tool for evaluating remedial actions that would be protective of the environment. Hazard quotient calculations used to support ecological screening assessments were modified to derive soil EcoPRGs for terrestrial wildlife populations. The primary modification is a population area use factor that is the fraction of a terrestrial animal population potentially affected by the contaminated site. Wildlife assessment population boundaries are based on a receptor's dispersal distance; for mammals dispersal distance is strongly related to the linear dimension (square root) of home range. Assuming that wildlife are unlikely to disperse beyond some distance from their birth or natal site, dispersal distance can be thought of as the radius of the assessment population's boundaries. This general relationship is useful as a simple way to estimate assessment population areas for terrestrial animals and helps fill data gaps for wildlife without direct measurements of dispersal.
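The area use factor logic can be sketched in a few lines; the hazard-quotient form, parameter names, and numbers below are hypothetical simplifications of the EcoPRG derivation, which involves additional exposure terms:

```python
import math

def population_area(dispersal_distance_m):
    """Assessment population area, taking dispersal distance as the radius."""
    return math.pi * dispersal_distance_m ** 2

def eco_prg(toxicity_ref, exposure_per_conc, site_area_m2, dispersal_distance_m):
    """Soil concentration at which the population-adjusted hazard quotient = 1.
    HQ = C * exposure_per_conc / toxicity_ref, scaled by the area use factor
    (fraction of the assessment population's area occupied by the site)."""
    auf = min(1.0, site_area_m2 / population_area(dispersal_distance_m))
    return toxicity_ref / (exposure_per_conc * auf)

# a site much smaller than the population area yields a less stringent PRG
prg_small_site = eco_prg(1.0, 0.01, 1e4, 500)
prg_whole_area = eco_prg(1.0, 0.01, 1e7, 500)
```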

20.
Renal dysfunction induced by cadmium: biomarkers of critical effects
Alfred Bernard. Biometals, 2004, 17(5): 519-523
Cadmium (Cd) is a cumulative poison which can damage the kidneys after prolonged exposure in industry or the environment. Renal damage induced by Cd affects primarily the cellular and functional integrity of the proximal tubules, the main site of renal accumulation of the metal. This results in a variety of urinary abnormalities, including increased excretion of calcium, amino acids, enzymes and proteins. These effects have been documented by a large number of studies conducted over more than two decades in experimental animals and in populations environmentally or occupationally exposed to Cd. There is now general agreement that the most sensitive and specific indicator of Cd-induced renal dysfunction is a decreased tubular reabsorption of low molecular weight proteins, leading to so-called tubular proteinuria. Beta2-microglobulin, retinol-binding protein and alpha1-microglobulin are the microproteins most commonly used for screening renal damage in populations at risk. Tubular dysfunction develops in a dose-dependent manner according to the internal dose of Cd, as assessed on the basis of Cd levels in kidney, urine or blood. Depending on the sensitivity of the renal biomarker and the susceptibility of the exposed populations, the thresholds of urinary Cd vary from 2 to 10 microg/g creatinine. The threshold associated with the development of microproteinuria, the critical effect predictive of a decline in renal function, is estimated at around 10 microg/g creatinine for both occupationally and environmentally exposed populations. Much lower thresholds have been reported in some European studies conducted on the general population. These low thresholds, however, have been derived from associations whose causality remains uncertain and for urinary protein increases that might be reversible. Cd-induced microproteinuria is usually considered irreversible, except at the incipient stage of intoxication, where partial or complete reversibility has been found in some studies.
