Similar Articles
20 similar articles found (search time: 36 ms)
1.
Bertail P, Tressou J. Biometrics, 2006, 62(1): 66-74
This article proposes statistical tools for the quantitative evaluation of the risk due to the presence of particular contaminants in food. We focus on estimating the probability that exposure exceeds the so-called provisional tolerable weekly intake (PTWI) when both consumption data and contamination data are independently available. A Monte Carlo approximation of the plug-in estimator, which may be seen as an incomplete generalized U-statistic, is investigated. We obtain the asymptotic properties of this estimator and propose several confidence intervals based on two estimators of the asymptotic variance: (i) a bootstrap-type estimator and (ii) an approximate jackknife estimator relying on the Hoeffding decomposition of the original U-statistic. As an illustration, we present an evaluation of exposure to Ochratoxin A in France.
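The exceedance probability described here can be illustrated with a compact simulation. Below is a minimal sketch, not the authors' implementation, of a Monte Carlo plug-in estimate of P(weekly exposure > PTWI) with a bootstrap-type confidence interval; the distributions, body weight and PTWI value are all hypothetical.

```python
# Minimal Monte Carlo sketch of a plug-in estimate of P(weekly intake > PTWI), assuming
# independent samples of weekly consumption and contaminant concentration; all data,
# the body weight and the PTWI value are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

consumption = rng.gamma(shape=2.0, scale=0.05, size=5000)      # kg food/week (hypothetical)
contamination = rng.lognormal(mean=0.0, sigma=0.8, size=3000)  # ug contaminant/kg (hypothetical)
body_weight = 60.0   # kg, fixed for simplicity
ptwi = 0.12          # ug/kg body weight/week (illustrative threshold)

def exceedance_probability(cons, cont, n_draws=50_000):
    """Plug-in estimate of P(intake/bw > PTWI) by pairing independent random draws."""
    c = rng.choice(cons, size=n_draws, replace=True)
    q = rng.choice(cont, size=n_draws, replace=True)
    weekly_intake = c * q / body_weight
    return np.mean(weekly_intake > ptwi)

point = exceedance_probability(consumption, contamination)

# Bootstrap-type confidence interval: resample both data sets and re-estimate.
boot = [exceedance_probability(rng.choice(consumption, consumption.size, replace=True),
                               rng.choice(contamination, contamination.size, replace=True))
        for _ in range(200)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"P(exposure > PTWI) ~ {point:.4f}  (bootstrap 95% CI {lo:.4f}-{hi:.4f})")
```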

2.
Over the past few decades, phycotoxins, secondary metabolites produced by toxic phytoplankton, have seen an increase in their frequency, concentrations, and geographic distribution. As shellfish accumulate phycotoxins, making them unfit for human consumption, they are considered an important food safety issue. Thus, a consumer exposure assessment for phycotoxins is necessary. Exposure assessment requires two types of information: contamination and consumption data. Shellfish contamination data on major toxins encountered by at-risk populations (Domoic Acid group, Okadaic Acid group, and Saxitoxin group) have been reviewed. Consumption data have been reviewed for both general and potential high-consumer populations. Then, we undertook acute and chronic exposure assessments, combining available French contamination data and our own consumption data. Studies including exposure assessment were then reviewed. Lastly, risk characterization was undertaken. It can be concluded that both acute and chronic exposure to phycotoxins via shellfish consumption is a matter of concern, mainly for high consumers identified in this review (specific populations and shellfish harvesters). However, the results for risk characterization must be improved. There is a need for (i) toxicological data to establish a Tolerable Daily Intake; (ii) an assessment of consumption and contamination data, undertaken at the same time, so as to assess exposure.

3.

Background

Models of Foot and Mouth Disease (FMD) transmission have assumed a homogeneous landscape across which Euclidean distance is a suitable measure of the spatial dependency of transmission. This paper investigated features of the landscape and their impact on transmission during the period of predominantly local spread which followed the implementation of the national movement ban during the 2001 UK FMD epidemic. In this study, 113 farms diagnosed with FMD that had a known source of infection within 3 km (cases) were matched to 188 control farms that were either uninfected or infected at a later time point. Cases were matched to controls by Euclidean distance to the source of infection and farm size. Intervening geographical features and connectivity between the source of infection and the case and control farms were compared.

Results

Road distance between holdings, access to holdings, presence of forest, elevation change between holdings and the presence of intervening roads had no impact on the risk of local FMD transmission (p > 0.2). However, the presence of linear features in the form of rivers and railways acted as barriers to FMD transmission (odds ratio = 0.507, 95% CI = 0.297-0.887, p = 0.018).

Conclusion

This paper demonstrated that although FMD spread can generally be modelled using Euclidean distance and numbers of animals on susceptible holdings, the presence of rivers and railways has an additional protective effect reducing the probability of transmission between holdings.

4.
We present an index for the dissimilarity/distance between geographical distributions based on reporting rates recorded on a regular lattice. Reporting rate data are common, for example, in bird atlas projects where observers fill in checklists of encountered species in a particular area. Our index is a variation of the Euclidean distance, with the contribution of each grid cell weighted by the number of checklists collected for the grid cell, and a scaling factor to ensure that the dissimilarity ranges between zero and one. Reporting rates were transformed to ordered percentile classes to make species with different mean reporting rates comparable. The index was developed for the comparison of distributions in The Atlas of Southern African Birds. We illustrate the dissimilarity index by comparing the distributions of whydahs and indigobirds (widowfinches), which are specialized brood parasites, to the distributions of their hosts: waxbills and other finches.
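One plausible way to implement such a checklist-weighted, scaled dissimilarity is sketched below. The exact weighting, percentile transformation and scaling factor of the atlas index are not given in the abstract, so everything here is an assumption chosen so that the result stays in [0, 1].

```python
# Hedged sketch of a checklist-weighted Euclidean dissimilarity between two species'
# reporting-rate maps on the same grid; one plausible formulation, not the published one.
import numpy as np
from scipy.stats import rankdata

def percentile_classes(reporting_rate, n_classes=10):
    """Map non-zero reporting rates to ordered percentile classes in (0, 1]."""
    out = np.zeros_like(reporting_rate, dtype=float)
    nz = reporting_rate > 0
    ranks = rankdata(reporting_rate[nz]) / nz.sum()      # percentiles in (0, 1]
    out[nz] = np.ceil(ranks * n_classes) / n_classes     # ordered classes
    return out

def weighted_dissimilarity(rate_a, rate_b, n_checklists):
    """Weighted Euclidean distance, scaled so the result lies in [0, 1]."""
    a, b = percentile_classes(rate_a), percentile_classes(rate_b)
    w = np.asarray(n_checklists, dtype=float)
    # Weighted mean squared difference; since |a - b| <= 1 per cell, the root is in [0, 1].
    return np.sqrt(np.sum(w * (a - b) ** 2) / np.sum(w))

# Toy example: a host and a brood parasite on a 5-cell grid (invented values).
rate_host = np.array([0.8, 0.6, 0.1, 0.0, 0.0])
rate_parasite = np.array([0.7, 0.5, 0.0, 0.0, 0.1])
checklists = np.array([120, 80, 15, 5, 10])
print(weighted_dissimilarity(rate_host, rate_parasite, checklists))
```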

5.
Models of amino acid substitution were developed and compared using maximum likelihood. Two kinds of models are considered. "Empirical" models do not explicitly consider factors that shape protein evolution, but attempt to summarize the substitution pattern from large quantities of real data. "Mechanistic" models are formulated at the codon level and separate mutational biases at the nucleotide level from selective constraints at the amino acid level. They account for features of sequence evolution, such as transition-transversion bias and base or codon frequency biases, and make use of physicochemical distances between amino acids to specify nonsynonymous substitution rates. A general approach is presented that transforms a Markov model of codon substitution into a model of amino acid replacement. Protein sequences from the entire mitochondrial genomes of 20 mammalian species were analyzed using different models. The mechanistic models were found to fit the data better than empirical models derived from large databases. Both the mutational distance between amino acids (determined by the genetic code and mutational biases such as the transition-transversion bias) and the physicochemical distance are found to have strong effects on amino acid substitution rates. A significant proportion of amino acid substitutions appeared to have involved more than one codon position, indicating that nucleotide substitutions at neighboring sites may be correlated. Rates of amino acid substitution were found to be highly variable among sites.

6.
Plant scientists usually record several indicators in their abiotic factor experiments. The common statistical management involves univariate analyses. Such analyses generally create a split picture of the effects of experimental treatments since each indicator is addressed independently. The Euclidean distance combined with the information of the control treatment could have potential as an integrating indicator. The Euclidean distance has demonstrated its usefulness in many scientific fields but, as far as we know, it has not yet been employed for plant experimental analyses. To exemplify the use of the Euclidean distance in this field, we performed an experiment focused on the effects of mannitol on sugarcane micropropagation in temporary immersion bioreactors. Five mannitol concentrations were compared: 0, 50, 100, 150 and 200 mM. As dependent variables we recorded shoot multiplication rate, fresh weight, and levels of aldehydes, chlorophylls, carotenoids and phenolics. The statistical protocol which we then carried out integrated all dependent variables to easily identify the mannitol concentration that produced the most remarkable integral effect. Results provided by the Euclidean distance demonstrated a gradually increasing distance from the control as a function of increasing mannitol concentrations. 200 mM mannitol caused the most significant alteration of sugarcane biochemistry and physiology under the experimental conditions described here. This treatment showed the longest statistically significant Euclidean distance to the control treatment (2.38). In contrast, 50 and 100 mM mannitol showed the lowest Euclidean distances (0.61 and 0.84, respectively) and thus only weak integrated effects of mannitol. The analysis shown here indicates that the use of the Euclidean distance can contribute to establishing a more integrated evaluation of the contrasting mannitol treatments.
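A minimal sketch of this integrating indicator follows: each response variable is standardized across treatments and the Euclidean distance of every treatment mean vector to the control vector is computed. The variable names match those in the abstract, but all numbers are invented.

```python
# Euclidean distance to the control as an integrating indicator across several
# standardized response variables; data below are made up for illustration.
import numpy as np

# Treatment means for (multiplication rate, fresh weight, aldehydes, chlorophylls,
# carotenoids, phenolics); rows = 0, 50, 100, 150, 200 mM mannitol (hypothetical).
means = np.array([
    [6.1, 9.2, 1.0, 2.4, 0.61, 0.90],   # control (0 mM)
    [5.8, 8.9, 1.1, 2.3, 0.60, 0.95],
    [5.5, 8.4, 1.3, 2.1, 0.55, 1.05],
    [4.9, 7.6, 1.6, 1.8, 0.48, 1.20],
    [4.1, 6.5, 2.0, 1.4, 0.40, 1.40],
])

# Standardize each variable across treatments so no single unit dominates the distance.
z = (means - means.mean(axis=0)) / means.std(axis=0, ddof=1)

# Integrated effect of each treatment = Euclidean distance to the standardized control.
distances = np.linalg.norm(z - z[0], axis=1)
for conc, d in zip([0, 50, 100, 150, 200], distances):
    print(f"{conc:>3} mM mannitol: distance to control = {d:.2f}")
```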

7.
Aims: A growing number of foodborne illnesses have been associated with the consumption of fresh produce. In this study, the probability of lettuce contamination with Escherichia coli O157:H7 from manure-amended soil and the effect of intervention strategies was determined. Methods and Results: Pathogen prevalence and densities were modelled probabilistically through the primary production chain of lettuce (manure, manure-amended soil and lettuce). The model estimated an average of 0.34 contaminated heads per hectare. A minimum manure storage time of 30 days and a minimum fertilization-to-planting interval of 60 days were most successful in reducing the risk. Some specific organic farming practices concerning manure and soil management were found to be risk reducing. Conclusions: Certain specific organic farming practices reduced the likelihood of contamination. This cannot be generalized to organic production as a whole. However, the conclusion is relevant for areas like the Netherlands where there is high use of manure in both organic and conventional vegetable production. Significance and Impact of the Study: Recent vegetable-associated disease outbreaks stress the importance of a safe vegetable production chain. The present study contributed to this by providing a first estimate of the likelihood of lettuce contamination with E. coli O157:H7 and the effectiveness of risk mitigation strategies.

8.
Most molecular analyses, including phylogenetic inference, are based on sequence alignments. We present an algorithm that estimates relatedness between biomolecules without the requirement of sequence alignment by using a protein frequency matrix that is reduced by singular value decomposition (SVD), in a latent semantic index information retrieval system. Two databases were used: one with 832 proteins from 13 mitochondrial gene families and another composed of 1000 sequences from nine types of proteins retrieved from GenBank. Firstly, 208 sequences from the first database and 200 from the second were randomly selected and compared using edit distance between each pair of sequences and respective cosines and Euclidean distances from SVD. Correlation between cosine and edit distance was -0.32 (P < 0.01) and between Euclidean distance and edit distance was +0.70 (P < 0.01). In order to check the ability of SVD in classifying sequences according to their categories, we used a sample of 202 sequences from the 13 gene families as queries (test set), and the other proteins (630) were used to generate the frequency matrix (training set). The classification algorithm applies a voting scheme based on the five most similar sequences with each query. With a 3-peptide frequency matrix, all 202 queries were correctly classified (accuracy = 100%). This algorithm is very attractive, because sequence alignments are neither generated nor required. In order to achieve results similar to those obtained with edit distance analysis, we recommend that Euclidean distance be used as a similarity measure for protein sequences in latent semantic indexing methods.
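The alignment-free pipeline described above (k-mer frequency matrix, SVD reduction, nearest-neighbour voting) can be sketched briefly. This is not the authors' code: the rank, the toy sequences and the use of Euclidean distance in the reduced space (their recommended measure) are illustrative choices.

```python
# Alignment-free classification sketch in the spirit of the latent-semantic-indexing
# approach: 3-peptide frequency matrix, truncated SVD, projection of a query, and a
# majority vote among the 5 nearest training sequences. Sequences below are toy data.
from collections import Counter
from itertools import product
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
KMERS = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
KMER_INDEX = {k: i for i, k in enumerate(KMERS)}

def kmer_vector(seq, k=3):
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    v = np.zeros(len(KMERS))
    for kmer, c in counts.items():
        if kmer in KMER_INDEX:
            v[KMER_INDEX[kmer]] = c
    return v

def classify(query_seq, train_seqs, train_labels, rank=50, n_neighbors=5):
    X = np.array([kmer_vector(s) for s in train_seqs])   # rows = training profiles
    U, S, Vt = np.linalg.svd(X.T, full_matrices=False)   # columns of X.T are k-mer profiles
    r = min(rank, len(S))
    def project(v):                                      # fold a profile into the latent space
        return (v @ U[:, :r]) / S[:r]
    train_latent = np.array([project(x) for x in X])
    q = project(kmer_vector(query_seq))
    nearest = np.argsort(np.linalg.norm(train_latent - q, axis=1))[:n_neighbors]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage (sequences are invented, not real proteins):
train = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", "MKQLEDKVEELLSKNYHLENEVARLKKLVGER",
         "GSHMTTPSHLSDRYELGEILGFGGMSEVHLARD", "GSHMSTNPKPQRKTKRNTNRRPQDVKFPGG"]
labels = np.array([0, 0, 1, 1])
print(classify(train[0][:20] + "KQR", train, labels, rank=3))
```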

9.
Plant scientists usually record many indicators in their experiments. The common statistical management involves univariate analyses. Such analyses generally create a split picture of the effects of experimental treatments since each indicator is addressed independently. The Euclidean distance combined with the expert’s criteria seems to show potential as an integrating indicator. The Euclidean distance has been widely used in many scientific fields; nevertheless, as far as we know, it has not been frequently employed in plant science experiments. To exemplify the use of the Euclidean distance in this field, we performed an experiment focused on the effects of gibberellic acid on protease excretion during pineapple micropropagation in temporary immersion bioreactors. Five gibberellic acid concentrations were compared: 0.0, 1.4, 2.8, 4.2 and 5.6 μM. Four dependent variables were recorded: increase in fresh shoot mass, protein concentration, proteolytic activity and specific activity in the culture media. The statistical protocol carried out integrated these four dependent variables to easily identify the best gibberellic acid treatment. Based on the expert’s criteria, the integral analysis provided by the Euclidean distance indicated that 4.2 μM gibberellic acid is the best treatment to produce proteases under the experimental conditions described here. This treatment showed the shortest (statistically significant) Euclidean distance to the expert’s criteria (0.61).

10.

Background  

The definition of a distance measure plays a key role in the evaluation of different clustering solutions of gene expression profiles. In this empirical study we compare different clustering solutions when using the Mutual Information (MI) measure versus the use of the well-known Euclidean distance and Pearson correlation coefficient.
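For reference, the three measures being compared can be written down in a few lines; the mutual information estimate below uses simple equal-width binning, which is only one of several possible estimators.

```python
# The three (dis)similarity measures compared above, applied to two gene-expression
# profiles; binned MI is an assumption, and the data are simulated.
import numpy as np

def euclidean(x, y):
    return np.linalg.norm(x - y)

def pearson(x, y):
    return np.corrcoef(x, y)[0, 1]

def mutual_information(x, y, bins=8):
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    nz = p_xy > 0
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / np.outer(p_x, p_y)[nz])))

rng = np.random.default_rng(1)
g1 = rng.normal(size=50)                        # expression of gene 1 across 50 conditions
g2 = 0.8 * g1 + rng.normal(scale=0.5, size=50)  # a correlated gene
print(euclidean(g1, g2), pearson(g1, g2), mutual_information(g1, g2))
```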

11.
Allozyme data are widely used to infer the phylogenies of populations and closely related species. Numerous parsimony, distance, and likelihood methods have been proposed for phylogenetic analysis of these data; the relative merits of these methods have been debated vigorously, but their accuracy has not been well explored. In this study, I compare the performance of 13 phylogenetic methods (six parsimony, six distance, and continuous maximum likelihood) by applying a congruence approach to eight allozyme data sets from the literature. Clades are identified that are supported by multiple data sets other than allozymes (e.g. morphology, DNA sequences), and the ability of different methods to recover these 'known' clades is compared. The results suggest that (1) distance and likelihood methods generally outperform parsimony methods, (2) methods that utilize frequency data tend to perform well, and (3) continuous maximum likelihood is among the most accurate methods, and appears to be robust to violations of its assumptions. These results are in agreement with those from recent simulation studies, and help provide a basis for empirical workers to choose among the many methods available for analysing allozyme characters.

12.
Ordination is a powerful method for analysing complex data sets but has been largely ignored in sequence analysis. This paper shows how to use principal coordinates analysis to find low-dimensional representations of distance matrices derived from aligned sets of sequences. The method takes a matrix of Euclidean distances between all pairs of sequences and finds a coordinate space where the distances are exactly preserved. The main problem is to find a measure of distance between aligned sequences that is Euclidean. The simplest distance function is the square root of the percentage difference (as measured by identities) between two sequences, where one ignores any positions in the alignment where there is a gap in any sequence. If one does not ignore positions with a gap, the distances cannot be guaranteed to be Euclidean but the deleterious effects are trivial. Two examples of using the method are shown. A set of 226 aligned globins was analysed and the resulting ordination very successfully represents the known patterns of relationship between the sequences. In the other example, a set of 610 aligned 5S rRNA sequences was analysed. Sequence ordinations complement phylogenetic analyses. They should not be viewed as a complete alternative.
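A compact sketch of principal coordinates analysis with the square-root percentage-difference distance is given below; the tiny alignment is a toy example, and gapped positions are skipped pairwise as the abstract describes.

```python
# Minimal principal coordinates analysis (classical MDS) sketch for aligned sequences,
# using the square root of the percentage difference as the distance measure.
import numpy as np

def sqrt_percent_difference(a, b):
    """Distance between two aligned sequences: sqrt of % mismatches, ignoring gapped positions."""
    keep = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    diff = sum(x != y for x, y in keep) / len(keep)
    return np.sqrt(100.0 * diff)

def pcoa(seqs, n_axes=2):
    n = len(seqs)
    D = np.array([[sqrt_percent_difference(a, b) for b in seqs] for a in seqs])
    # Gower double-centering of -0.5 * D^2, then eigendecomposition.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1]                 # largest eigenvalues first
    vals, vecs = vals[order], vecs[:, order]
    return vecs[:, :n_axes] * np.sqrt(np.clip(vals[:n_axes], 0, None))

alignment = ["MK-TAYIA", "MKQTAYIA", "MKQSAYLA", "MR-SGYLA"]   # toy aligned sequences
print(pcoa(alignment))
```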

13.
It is well known that ecological communities are spatially and temporally dynamic. Quantifying temporal variability in ecological communities is challenging, however, especially for time-series data sets of less than 40 measurement intervals. In this paper, we describe a method to quantify temporal variability in multispecies communities over time frames of 10–40 measurement intervals. Our approach is a community-level extension of autocorrelation analysis, but we use Euclidean distance to measure similarity of community samples at increasing time lags rather than the correlation coefficient. Regressing Euclidean distances versus increasing time lags yields a measure of the rate and nature of community change over time. We demonstrate the method with empirical data sets from shortgrass steppe, old-field succession and zooplankton dynamics in lakes, and we investigate properties of the analysis using simulation models. Results indicate that time-lag analysis provides a useful quantitative measurement of the rate and pattern of temporal dynamics in communities over time frames that are too short for more traditional autocorrelation approaches.
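The time-lag analysis can be sketched as follows: Euclidean distances between all pairs of community samples are regressed on the time lag separating them. The square-root transform of the lag and the simulated community matrix are assumptions for illustration.

```python
# Time-lag analysis sketch: Euclidean distance between community samples at every pair
# of times, regressed against the (square-root-transformed) time lag; data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_years, n_species = 25, 12
community = np.cumsum(rng.normal(scale=0.3, size=(n_years, n_species)), axis=0) + 5.0

lags, dists = [], []
for i in range(n_years):
    for j in range(i + 1, n_years):
        lags.append(j - i)
        dists.append(np.linalg.norm(community[j] - community[i]))

slope, intercept, r, p, se = stats.linregress(np.sqrt(lags), dists)
print(f"rate of community change (slope) = {slope:.3f}, r^2 = {r**2:.2f}, p = {p:.3g}")
```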

14.
15.
16.
In molecular biology, the issue of quantifying the similarity between two biological sequences is very important. Past research has shown that word-based search tools are computationally efficient and can find some new functional similarities or dissimilarities invisible to other algorithms like FASTA. Recently, under the independent model of base composition, Wu, Burke, and Davison (1997, Biometrics 53, 1431-1439) characterized a family of word-based dissimilarity measures that defined distance between two sequences by simultaneously comparing the frequencies of all subsequences of n adjacent letters (i.e., n-words) in the two sequences. Specifically, they introduced the use of Mahalanobis distance and standardized Euclidean distance into the study of DNA sequence dissimilarity. They showed that both distances had better sensitivity and selectivity than the commonly used Euclidean distance. The purpose of this article is to extend Mahalanobis and standardized Euclidean distances to Markov chain models of base composition. In addition, a new dissimilarity measure based on Kullback-Leibler discrepancy between frequencies of all n-words in the two sequences is introduced. Applications to real data demonstrate that Kullback-Leibler discrepancy gives a better performance than Euclidean distance. Moreover, under a Markov chain model of order k_Q for base composition, where k_Q is the estimated order based on the query sequence, standardized Euclidean distance performs very well. Under such a model, it performs as well as Mahalanobis distance and better than Kullback-Leibler discrepancy and Euclidean distance. Since standardized Euclidean distance is drastically faster to compute than Mahalanobis distance, in a usual workstation/PC computing environment, the use of standardized Euclidean distance under the Markov chain model of order k_Q of base composition is generally recommended. However, if the user is very concerned with computational efficiency, then the use of Kullback-Leibler discrepancy, which can be computed as fast as Euclidean distance, is recommended. This can significantly enhance the current technology in comparing large datasets of DNA sequences.
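A hedged sketch of these word-based dissimilarities follows: plain and standardized Euclidean distances between n-word counts and a symmetrized Kullback-Leibler discrepancy. The per-word variance used for standardization and the pseudocounts are simplifications, not the covariance treatment of the cited papers, and Mahalanobis distance is omitted for brevity.

```python
# Word-based sequence dissimilarities: Euclidean and (crudely) standardized Euclidean
# distances between n-word counts, plus a symmetrized KL discrepancy with pseudocounts.
from collections import Counter
from itertools import product
import numpy as np

def nword_counts(seq, n=2, alphabet="ACGT"):
    words = ["".join(w) for w in product(alphabet, repeat=n)]
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    return np.array([counts.get(w, 0) for w in words], dtype=float)

def euclidean(x, y):
    return np.linalg.norm(x - y)

def standardized_euclidean(x, y):
    var = (x + y) / 2.0 + 1e-9          # crude per-word variance proxy (assumption)
    return np.sqrt(np.sum((x - y) ** 2 / var))

def kl_discrepancy(x, y, pseudo=1.0):
    p = (x + pseudo) / (x + pseudo).sum()
    q = (y + pseudo) / (y + pseudo).sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

s1 = "ATGCGTACGTTAGCGCGTATATCGCGAT"   # toy DNA sequences
s2 = "ATGCGTACGTTAGCGCGTATATGGCGTT"
c1, c2 = nword_counts(s1), nword_counts(s2)
print(euclidean(c1, c2), standardized_euclidean(c1, c2), kl_discrepancy(c1, c2))
```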

17.
MOTIVATION: New application areas of survival analysis, for example those based on micro-array expression data, call for novel tools able to handle high-dimensional data. While classical (semi-)parametric techniques based on likelihood or partial likelihood functions are omnipresent in clinical studies, they are often inadequate for modelling when there are fewer observations than features in the data. Support vector machines (SVMs) and extensions are in general found particularly useful for such cases, both conceptually (non-parametric approach), computationally (boiling down to a convex program which can be solved efficiently), theoretically (for their intrinsic relation with learning theory) as well as empirically. This article discusses such an extension of SVMs which is tuned towards survival data. A particularly useful feature is that this method can incorporate additional structure such as additive models, positivity constraints on the parameters or regression constraints. RESULTS: Besides discussion of the proposed methods, an empirical case study is conducted on both clinical as well as micro-array gene expression data in the context of cancer studies. Results are expressed based on the logrank statistic, concordance index and the hazard ratio. The reported performances indicate that the present method yields better models for high-dimensional data, while it gives results which are comparable to what classical techniques based on a proportional hazard model give for clinical data.

18.
Landscape genetics aims to investigate functional connectivity among wild populations by evaluating the impact of landscape features on gene flow. Genetic distances among populations or individuals are generally better explained by least-cost path (LCP) distances derived from resistance surfaces than by simple Euclidean distances. Resistance surfaces reflect the cost for an organism to move through particular landscape elements. However, determining the effects of landscape types on movements is challenging. Because of a general lack of empirical data on movements, resistance surfaces mostly rely on expert knowledge. Habitat-suitability models potentially provide a more objective method to estimate resistance surfaces than expert opinions, but they have rarely been applied in landscape genetics so far. We compared LCP distances based on expert knowledge with LCP distances derived from habitat-suitability models to evaluate their performance in landscape genetics. We related all LCP distances to genetic distances in linear mixed effect models on an empirical data set of wolves (Canis lupus) from Italy. All LCP distances showed highly significant (P ≤ 0.0001) standardized β coefficients and R² values, but LCPs from habitat-suitability models generally showed higher values than those resulting from expert knowledge. Moreover, all LCP distances better explained genetic distances than Euclidean distances, irrespective of the approaches used. Considering our results, we encourage researchers in landscape genetics to use resistance surfaces based on habitat suitability, which performed better than expert-based LCPs in explaining patterns of gene flow and functional connectivity.
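To make the contrast between Euclidean and least-cost path distances concrete, here is a minimal LCP sketch over a small resistance raster using Dijkstra's algorithm; the raster values, the 4-neighbour graph and the mean-resistance edge cost are illustrative simplifications of what dedicated landscape-genetics tools do.

```python
# Least-cost path (LCP) distance over a toy resistance surface vs the Euclidean distance.
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import dijkstra

resistance = np.array([          # low values = easy to cross, high values = costly
    [1, 1, 5, 10, 10],
    [1, 2, 5, 10,  5],
    [1, 2, 2,  2,  2],
    [5, 5, 2,  1,  1],
])
rows, cols = resistance.shape
n = rows * cols
def idx(r, c):
    return r * cols + c

# Build a graph where moving between adjacent cells costs the mean of their resistances.
graph = lil_matrix((n, n))
for r in range(rows):
    for c in range(cols):
        for dr, dc in ((0, 1), (1, 0)):
            rr, cc = r + dr, c + dc
            if rr < rows and cc < cols:
                graph[idx(r, c), idx(rr, cc)] = (resistance[r, c] + resistance[rr, cc]) / 2.0

start, end = idx(0, 0), idx(3, 4)   # e.g. two sampled populations (hypothetical locations)
lcp = dijkstra(graph.tocsr(), directed=False, indices=start)[end]
euclid = np.hypot(3 - 0, 4 - 0)
print(f"LCP distance = {lcp:.1f} cost units vs Euclidean = {euclid:.2f} cells")
```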

19.
We used data from three Common Goldeneye Bucephala clangula populations in Finland to study if the variation between females in egg morphology, as measured using a method developed by Eadie (1989), can be used to identify parasitized clutches. Eadie's method is based on z-score standardized measures of length, width, and weight of eggs. Using these measures, Euclidean distance for each pair of eggs within a clutch was calculated. Euclidean distance between the two most dissimilar eggs (maximum Euclidean distance, MED) was used as the criterion to identify parasitized clutches. Test clutches of 3 eggs that included one egg from each of three different females had a higher MED (2.80) than 3-egg clutches that included eggs from one female only (2.05), proving that there is statistically significant variation in egg morphology between females. Test clutches that included three eggs from each of three different females (9 eggs in all) had a mean MED of 4.51. The mean MED of naturally parasitized clutches (4.83) was higher than that of nonparasitized clutches (2.12). Further analyses suggested that MED > 3.0 can be used as a conservative and reliable criterion to identify parasitized clutches. Our results confirm that Eadie's method is reliable enough to identify parasitized clutches in Common Goldeneyes.
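Eadie's MED criterion as summarized above translates almost directly into code: z-score standardize egg length, width and weight, take all pairwise Euclidean distances within a clutch, and keep the maximum. The measurements below are invented; the MED > 3.0 threshold is the one reported in the abstract.

```python
# Maximum Euclidean distance (MED) criterion for flagging possibly parasitized clutches;
# egg measurements are hypothetical, the >3.0 threshold comes from the abstract above.
import numpy as np
from itertools import combinations

def max_euclidean_distance(clutch, all_eggs):
    """MED of one clutch, standardizing measures against the full set of measured eggs."""
    mu, sd = all_eggs.mean(axis=0), all_eggs.std(axis=0, ddof=1)
    z = (clutch - mu) / sd
    return max(np.linalg.norm(a - b) for a, b in combinations(z, 2))

# Columns: length (mm), width (mm), weight (g); rows are eggs (invented values).
population = np.array([[59.1, 42.8, 63.0], [60.4, 43.1, 65.2], [57.8, 42.0, 61.1],
                       [62.0, 44.5, 68.3], [58.5, 41.9, 60.5], [61.2, 43.8, 66.7]])
clutch = population[:4]   # a 4-egg clutch drawn from the same data
med = max_euclidean_distance(clutch, population)
print(f"MED = {med:.2f} -> {'possibly parasitized' if med > 3.0 else 'no evidence of parasitism'}")
```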

20.
Consumption‐accounted greenhouse gas (GHG) emissions (GHGEs) vary considerably between households. Research originating from different traditions, including consumption research, urban planning, and environmental psychology, has studied different types of explanatory variables and provided different insights into this matter. This study integrates explanatory variables from different fields of research in the same empirical material, including socioeconomic variables (income, household size, sex, and age), motivational variables (proenvironmental attitudes and social norms), and physical variables (dwelling types and geographical distances). A survey was distributed to 2,500 Swedish households with a response rate of 40%. GHGEs were estimated for transport, residential energy, food, and other consumption, using data from both the survey and registers, such as odometer readings of cars and electricity consumption from utility providers. The results point toward the importance of explanatory variables that have to do with circumstances rather than motivations for proenvironmental behaviors. Net income was found to be the most important variable to explain GHGEs, followed by the physical variables (dwelling type and the geographical distance index). The results also indicate that social norms around GHG‐intensive activities, for example, transport, may have a larger impact on a subject's emission level than proenvironmental attitudes.
