Similar articles
20 similar articles found
1.
By embedding the ultrametric distances between objects in a classification structure into a Euclidean space, two linear-algebraic procedures can be used to obtain a consensus among any combination of dendrograms, partitions, coverings, or more general arrangements of overlapping groups; the consensus has the property that the sum of the squared distances from the objects in each of the separate representations to their positions in the consensus is minimized in the Euclidean space. A consensus classification structure (usually a dendrogram) is obtained by reversing the embedding procedure. Admissibility criteria for a consensus are briefly considered.

2.
The coefficients of relationship and the Euclidean distances between 17 Venezuelan counties were estimated based on the frequency distribution of surnames obtained from the 1984 Venezuelan register of electors. In general, the coefficients of relationship between counties within the same state were higher than those between counties from different states. Euclidean distances exhibited the opposite relationship. Spearman's correlation coefficients for 136 pairs of counties were estimated between geographic and Euclidean distances (r = 0.41), geographic distance and coefficient of relationship (r = -0.44) and between Euclidean distance and coefficient of relationship (r = -0.48). The effect of partial geographic isolation on the frequency distribution of surnames is shown in the State of Falcón, where an isthmus separates two counties of the peninsula from two others on the mainland, and in Mérida, where the Venezuelan Andes separate three counties from the rest of the country. Our results suggest that Euclidean distances are less influenced by common surnames than the coefficients of relationship. They also indicate that in Venezuela a high proportion of the population has remained sedentary until recently, and this gives rise to the focal distribution of some surnames.

3.
A simple spectral editing procedure is described that generates separate subspectra for the methyl 13C-1H3 multiplet components of 1H-13C HSQC spectra. The editing procedure relies on co-addition of in-phase and antiphase spectra and yields 1H-coupled constant-time HSQC subspectra for the methyl region that have the simplicity of the regular decoupled CT-HSQC spectrum. Resulting spectra permit rapid and reliable measurement of 1H-13C J and dipolar couplings. The editing procedure is illustrated for a Ca2+-calmodulin sample in isotropic and liquid crystalline phases.

4.
Assessing the variation in individual frontal sinus outlines
It is often suggested that the frontal sinus morphology of no two individuals is alike, and that the configuration of the frontal sinus is as unique to an individual as his or her fingerprints. However, no empirical, quantitative testing of the uniqueness of frontal sinus outlines has ever been performed. Such testing is necessary for frontal sinus identifications to be admissible in many courts. This study investigated frontal sinus outline variability using elliptic Fourier analysis (EFA), a geometric morphometric approach that fits a closed curve to an ordered set of data points, generating a set of coefficients that can be used to reproduce the outline. Two-dimensional representations of 808 frontal sinuses (as seen in posterior-anterior cranial radiographs) were digitized, and differences in their shapes were assessed quantitatively by comparing the Euclidean distances between EFA-generated outlines. Results show that Euclidean distances between outlines of different individuals are significantly larger than those between replicates of the same individual, and typicalities show that the probability of finding two different individuals with a Euclidean distance less than that between a particular case's replicates is very small. Thus, there is a quantifiable and significant difference between the shapes of individuals' frontal sinus outlines.

5.
Mathematical Biosciences, 1986, 79(2): 155-170
A new least squares estimation method for Bezier polynomial curves and surfaces is described and illustrated. The Cartesian coordinates of the vertices of polygons defining the curves separated well natural populations of Anodonta cygnea L. and human sagittal profiles of different sexes and age classes. Multivariate comparisons of the coordinates of Bezier polygon vertices and Euclidean distance measures showed that the polygon coordinates revealed shape differences better than distances between homologous points on the curves. Further, polygon coordinates separated groups for more than two variables as well as or better than the equivalent number of distances. In addition, polygon coordinates permit construction of mean shapes and their variances. Possible applications in trend surface analyses and for illustration in computer-aided identification programs are suggested.

6.
In molecular biology, the issue of quantifying the similarity between two biological sequences is very important. Past research has shown that word-based search tools are computationally efficient and can find some new functional similarities or dissimilarities invisible to other algorithms like FASTA. Recently, under the independent model of base composition, Wu, Burke, and Davison (1997, Biometrics 53, 1431-1439) characterized a family of word-based dissimilarity measures that defined distance between two sequences by simultaneously comparing the frequencies of all subsequences of n adjacent letters (i.e., n-words) in the two sequences. Specifically, they introduced the use of Mahalanobis distance and standardized Euclidean distance into the study of DNA sequence dissimilarity. They showed that both distances had better sensitivity and selectivity than the commonly used Euclidean distance. The purpose of this article is to extend Mahalanobis and standardized Euclidean distances to Markov chain models of base composition. In addition, a new dissimilarity measure based on Kullback-Leibler discrepancy between frequencies of all n-words in the two sequences is introduced. Applications to real data demonstrate that Kullback-Leibler discrepancy gives a better performance than Euclidean distance. Moreover, under a Markov chain model of order kQ for base composition, where kQ is the estimated order based on the query sequence, standardized Euclidean distance performs very well. Under such a model, it performs as well as Mahalanobis distance and better than Kullback-Leibler discrepancy and Euclidean distance. Since standardized Euclidean distance is drastically faster to compute than Mahalanobis distance, in a usual workstation/PC computing environment, the use of standardized Euclidean distance under the Markov chain model of order kQ of base composition is generally recommended. However, if the user is very concerned with computational efficiency, then the use of Kullback-Leibler discrepancy, which can be computed as fast as Euclidean distance, is recommended. This can significantly enhance the current technology in comparing large datasets of DNA sequences.
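
The word-based comparison described in this abstract can be illustrated with a minimal sketch. It counts overlapping n-words in two sequences and compares the resulting frequency vectors with a plain Euclidean distance and a Kullback-Leibler-style discrepancy. The function names, the pseudo-count smoothing, and the one-directional form of the discrepancy are assumptions made for illustration; the abstract does not give the authors' exact formulas, and the Markov chain corrections are not implemented here.

```python
from collections import Counter
from itertools import product
import math


def word_frequencies(seq, n, alphabet="ACGT"):
    """Relative frequencies of all overlapping n-words over the alphabet."""
    words = ["".join(p) for p in product(alphabet, repeat=n)]
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid n-words")
    return [counts[w] / total for w in words]


def euclidean(f, g):
    """Plain Euclidean distance between two frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))


def kl_discrepancy(f, g, pseudo=1e-6):
    """KL divergence of f from g; the pseudo-count (an assumption)
    avoids log(0) for words absent from one sequence."""
    p = [a + pseudo for a in f]
    q = [b + pseudo for b in g]
    sp, sq = sum(p), sum(q)
    return sum((a / sp) * math.log((a / sp) / (b / sq)) for a, b in zip(p, q))


if __name__ == "__main__":
    f = word_frequencies("ACGTACGTTGCA", 2)
    g = word_frequencies("AAAACCCCGGGG", 2)
    print(euclidean(f, g), kl_discrepancy(f, g))
```

Both measures need only one pass over each sequence plus one pass over the word table, which is consistent with the abstract's remark that the Kullback-Leibler discrepancy can be computed about as fast as Euclidean distance.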

7.
Plant scientists usually record several indicators in their abiotic factor experiments. The usual statistical treatment involves univariate analyses. Such analyses generally create a fragmented picture of the effects of experimental treatments, since each indicator is addressed independently. The Euclidean distance combined with the information of the control treatment could have potential as an integrating indicator. The Euclidean distance has demonstrated its usefulness in many scientific fields but, as far as we know, it has not yet been employed for plant experimental analyses. To exemplify the use of the Euclidean distance in this field, we performed an experiment focused on the effects of mannitol on sugarcane micropropagation in temporary immersion bioreactors. Five mannitol concentrations were compared: 0, 50, 100, 150 and 200 mM. As dependent variables we recorded shoot multiplication rate, fresh weight, and levels of aldehydes, chlorophylls, carotenoids and phenolics. The statistical protocol which we then carried out integrated all dependent variables to easily identify the mannitol concentration that produced the most remarkable integral effect. Results provided by the Euclidean distance demonstrate a gradually increasing distance from the control as a function of increasing mannitol concentration. 200 mM mannitol caused the most significant alteration of sugarcane biochemistry and physiology under the experimental conditions described here. This treatment showed the longest statistically significant Euclidean distance to the control treatment (2.38). In contrast, 50 and 100 mM mannitol showed the lowest Euclidean distances (0.61 and 0.84, respectively) and thus poor integrated effects of mannitol. The analysis shown here indicates that the use of the Euclidean distance can contribute to establishing a more integrated evaluation of the contrasting mannitol treatments.
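
The idea of a single integrated distance from the control can be sketched as follows. The abstract does not state how the variables were scaled, so this example assumes each variable is expressed relative to its control value before the distance is taken; the function name and the numbers in the usage section are hypothetical and are not data from the study.

```python
import numpy as np


def distance_from_control(means, control_index=0):
    """Euclidean distance of each treatment from the control.

    means: 2-D array, rows are treatments, columns are dependent variables.
    Each variable is divided by its control value (an assumed scaling,
    which requires non-zero control values) so that no unit dominates.
    """
    x = np.asarray(means, dtype=float)
    rel = x / x[control_index]
    return np.sqrt(((rel - rel[control_index]) ** 2).sum(axis=1))


if __name__ == "__main__":
    # Hypothetical treatment means for three variables; row 0 is the control.
    illustrative_means = [
        [10.0, 2.0, 5.0],
        [12.0, 2.2, 4.0],
        [20.0, 4.0, 10.0],
    ]
    print(distance_from_control(illustrative_means))
```

A treatment whose variables all match the control gets a distance of zero, and the distance grows as more variables depart from the control or depart further.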

8.
Mitochondrial pre-messenger RNAs in kinetoplastid protozoa are substrates of uridylate-specific RNA editing. RNA editing converts non-functional pre-mRNAs into translatable molecules and can generate protein diversity by alternative editing. Although several editing complexes have been described, their structure and relationship are unknown. Here, we report the isolation of functionally active RNA editing complexes by a multistep purification procedure. We show that the endogenous isolates contain two subpopulations of ~20S and ~35–40S and present the three-dimensional structures of both complexes by electron microscopy. The ~35–40S complexes consist of a platform density packed against a semispherical element. The ~20S complexes are composed of two subdomains connected by an interface. The two particles are structurally related, and we show that RNA binding is a main determinant for the interconversion of the two complexes. The ~20S editosomes contain an RNA-binding site, which binds gRNA, pre-mRNA and gRNA/pre-mRNA hybrid molecules with nanomolar affinity. Variability analysis indicates that subsets of complexes lack or possess additional domains, suggesting binding sites for components. Together, a picture of the RNA editing machinery is provided.

9.
A technique is described for mathematically normalizing whole-cell protein profiles after sodium dodecyl sulphate-polyacrylamide gel electrophoresis to obtain standardized absolute migration distances using two internal Mr standards. A soft laser scanning densitometer was used to measure protein band migration distances in wet, silver-stained gels. The normalized values were superior to the unnormalized migration distances and common RF values in reducing the inter- and intragel variability of the protein band positions. A procedure is described for clustering normalized bacterial protein profiles using a sample data set obtained from the type strains of four Legionella species.

10.
RNA editing in flowering plant mitochondria alters numerous C nucleotides in a given mRNA molecule to U residues. To investigate whether neighbouring editing sites can influence each other we analyzed in vitro RNA editing of two sites spaced 30 nt apart. Deletion and competition experiments show that these two sites carry independent essential specificity determinants in the respective upstream 20-30 nucleotides. However, deletion of an upstream sequence region promoting editing of the upstream site concomitantly decreases RNA editing of the second site 50-70 nucleotides downstream. This result suggests that supporting cis-/trans-interactions can be effective over larger distances and can affect more than one editing event.

11.
Using only data on sequence, a method of computing a low-resolution tertiary structure of a protein is described. The steps are: (a) Estimate the distances of individual residues from the centroid of the molecule, using data on hydrophobicity and additional geometrical constraints. (b) Using these distances, construct a two-valued matrix whose elements, the distances between residues, are greater or less than R, the radius of the molecule. (c) Optimize to obtain a three-dimensional structure. This procedure requires modest computing facilities and is applicable to proteins with 164 residues and presumably more. It produces structures with r (correlation between inter-residue distances in the computed and native structures) between 0.5 and 0.7. Furthermore, correct inference of two or three long-range contacts suffices to yield structures with r values of 0.8–0.9. Because segments forming parallel or antiparallel folding structures intersect the radius vector at similar angles, from centroidal point distances it is possible to infer some of these long-range contacts by an elaboration of the procedure used to construct the input matrix. A criterion is also described which can be used to determine the quality of a proposed input matrix even when the native structure is not known.

12.
Two distribution-free permutation techniques are described for the analysis of ecological data. These methods are completely data dependent and provide analyses for the commonly encountered completely randomized and randomized-block designs in a multivariate framework. Euclidean distance forms the basis of both techniques, providing consistency with the observed distribution of data in many ecological studies. Abbreviations: MRPP = multiresponse permutation procedure; MRBP = the randomized-block analog of MRPP.

13.
Ordination is a powerful method for analysing complex data sets but has been largely ignored in sequence analysis. This paper shows how to use principal coordinates analysis to find low-dimensional representations of distance matrices derived from aligned sets of sequences. The method takes a matrix of Euclidean distances between all pairs of sequences and finds a coordinate space where the distances are exactly preserved. The main problem is to find a measure of distance between aligned sequences that is Euclidean. The simplest distance function is the square root of the percentage difference (as measured by identities) between two sequences, where one ignores any positions in the alignment where there is a gap in any sequence. If one does not ignore positions with a gap, the distances cannot be guaranteed to be Euclidean but the deleterious effects are trivial. Two examples of using the method are shown. A set of 226 aligned globins were analysed and the resulting ordination very successfully represents the known patterns of relationship between the sequences. In the other example, a set of 610 aligned 5S rRNA sequences were analysed. Sequence ordinations complement phylogenetic analyses. They should not be viewed as a complete alternative.
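
The procedure in this abstract maps onto classical principal coordinates analysis, and a minimal NumPy sketch of it follows. It builds the square-root difference distance after dropping every alignment column that contains a gap in any sequence, then recovers coordinates from the double-centred squared distances. The function names and the toy alignment are hypothetical; the sketch uses the fraction rather than the percentage of differing positions, which only rescales the result, and it assumes equal-length sequences with at least one gap-free column.

```python
import numpy as np


def sqrt_difference_matrix(seqs, gap="-"):
    """Square root of the fraction of differing positions for each pair,
    ignoring columns with a gap in any sequence."""
    arr = np.array([list(s) for s in seqs])
    keep = ~(arr == gap).any(axis=0)
    arr = arr[:, keep]
    diff = (arr[:, None, :] != arr[None, :, :]).mean(axis=2)
    return np.sqrt(diff)


def pcoa(d, k=2):
    """Classical principal coordinates analysis of a distance matrix d."""
    n = d.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centring @ (d ** 2) @ centring
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:k]
    vals, vecs = vals[order], vecs[:, order]
    # Negative eigenvalues (non-Euclidean distances or rounding) are clipped.
    return vecs * np.sqrt(np.clip(vals, 0.0, None))


if __name__ == "__main__":
    alignment = ["ACGT-A", "ACGTTA", "TCGAAA", "TCGA-T"]  # toy example
    d = sqrt_difference_matrix(alignment)
    print(pcoa(d, k=2))
```

When the distances are Euclidean, keeping all axes with positive eigenvalues reproduces the input distances exactly; keeping only the first two or three gives the low-dimensional view the abstract describes.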

14.
Landscape genetics aims to investigate functional connectivity among wild populations by evaluating the impact of landscape features on gene flow. Genetic distances among populations or individuals are generally better explained by least-cost path (LCP) distances derived from resistance surfaces than by simple Euclidean distances. Resistance surfaces reflect the cost for an organism to move through particular landscape elements. However, determining the effects of landscape types on movements is challenging. Because of a general lack of empirical data on movements, resistance surfaces mostly rely on expert knowledge. Habitat-suitability models potentially provide a more objective method to estimate resistance surfaces than expert opinions, but they have rarely been applied in landscape genetics so far. We compared LCP distances based on expert knowledge with LCP distances derived from habitat-suitability models to evaluate their performance in landscape genetics. We related all LCP distances to genetic distances in linear mixed effect models on an empirical data set of wolves (Canis lupus) from Italy. All LCP distances showed highly significant (P ≤ 0.0001) standardized β coefficients and R² values, but LCPs from habitat-suitability models generally showed higher values than those resulting from expert knowledge. Moreover, all LCP distances better explained genetic distances than Euclidean distances, irrespective of the approaches used. Considering our results, we encourage researchers in landscape genetics to use resistance surfaces based on habitat suitability, which performed better than expert-based LCPs in explaining patterns of gene flow and functional connectivity.

15.
Habitat specialists living in metapopulations are sensitive to habitat fragmentation. In most studies, the effects of fragmentation on such species are analyzed based on Euclidean inter-patch distances. This approach, however, ignores the role of the landscape matrix. Recently, therefore, functional distances that account for the composition of the landscape surrounding the habitat patches have been used more frequently as indicators for patch occupancy. However, the performance of functional and non-functional connectivity measures in predicting patch occupancy of such species has never been compared in a multi-species approach. Here we evaluate the effect of habitat connectivity on the patch occupancy of 13 habitat specialists from three different insect orders (Auchenorrhyncha, Lepidoptera, Orthoptera) in fragmented calcareous grasslands. In order to calculate functional distances we used four different sets of resistance values and rankings. We then modelled species' occurrence using both Euclidean and functional (based on least-cost modelling) inter-patch distances as predictors. We found that functional connectivity measures provided better results than the non-functional approach. However, a functional connectivity measure that was based on very coarse land-cover data performed even better than connectivity measures that were based on much more detailed land-use data. In order to take into account possible effects of the landscape matrix on patch occupancy by habitat specialists, future metapopulation studies should use functional rather than Euclidean distances whenever possible. For practical applications, we recommend a 'simple approach' which requires only coarse land-cover data and in our study performed better than all other functional connectivity measures, even more complex ones.

16.
In macroevolutionary studies, different approaches are commonly used to measure phylogenetic signal (the tendency of related taxa to resemble one another), including the K statistic and the Mantel test. The latter was recently criticized for lacking statistical power. Using new simulations, we show that the power of the Mantel test depends on the metrics used to define trait distances and phylogenetic distances between species. Increasing power is obtained by lowering variance and increasing negative skewness in interspecific distances, as obtained using Euclidean trait distances and the complement of Abouheif proximity as a phylogenetic distance. We show realistic situations involving "measurement error" due to intraspecific variability where the Mantel test is more powerful to detect a phylogenetic signal than a permutation test based on the K statistic. We highlight limitations of the K statistic (univariate measure) and show that its application should take into account measurement errors using repeated measures per species to avoid estimation bias. Finally, we argue that phylogenetic distograms representing Euclidean trait distance as a function of the square root of patristic distance provide an insightful representation of the phylogenetic signal that can be used to assess both the impact of measurement error and the departure from a Brownian evolution model.
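
For readers unfamiliar with the Mantel test discussed in this abstract, a generic permutation version can be sketched as below. It correlates the upper triangles of two distance matrices and estimates a one-sided p-value by jointly permuting the rows and columns of one matrix. The function name, the Pearson correlation, the number of permutations, and the one-sided alternative are assumptions for illustration; the specific metrics the authors recommend (such as the complement of Abouheif proximity) are not implemented.

```python
import numpy as np


def mantel(d1, d2, n_perm=999, seed=0):
    """One-sided Mantel test between two square distance matrices.

    Returns the observed Pearson correlation of the upper triangles and
    a permutation p-value for the alternative of positive association.
    """
    d1 = np.asarray(d1, dtype=float)
    d2 = np.asarray(d2, dtype=float)
    rng = np.random.default_rng(seed)
    n = d1.shape[0]
    upper = np.triu_indices(n, k=1)
    r_obs = np.corrcoef(d1[upper], d2[upper])[0, 1]
    hits = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permute rows and columns together so d1 stays a distance matrix.
        r = np.corrcoef(d1[p][:, p][upper], d2[upper])[0, 1]
        if r >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    trait = rng.normal(size=12)  # hypothetical trait values
    trait_dist = np.abs(trait[:, None] - trait[None, :])
    other_dist = trait_dist + rng.normal(scale=0.1, size=trait_dist.shape) ** 2
    other_dist = (other_dist + other_dist.T) / 2
    print(mantel(trait_dist, other_dist))
```

Permuting whole objects rather than individual matrix entries is what distinguishes the Mantel test from an ordinary correlation test, because the entries of a distance matrix are not independent.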

17.
Directional dispersal by wind and other dispersal agents may generate spatial patterns in passively dispersing metacommunities which cannot be detected by classical eigenvector methods based on Euclidean distances. We analysed zooplankton communities (Rotifera, Cladocera, Copepoda) in a cluster of soda pans distributed over a short spatial scale of 18 km and tested explicitly for directional signals in their spatial configuration. The study area is exposed to a prevailing northwestern wind direction. By applying asymmetric eigenvector maps (AEM), we were able to identify corresponding directionality in the spatial structure of communities. Furthermore, the match between community composition and environmental conditions exhibited a spatial pattern consistent with the prevailing wind corridor, with the best match found downwind along the dominant wind direction. We also found that classical eigenvector methods based on Euclidean distances underestimated the role of spatial processes in our data. Our study furthermore shows that dispersal limitation may constrain community assembly in highly mobile organisms even at spatial scales below 5 km.

18.
Clustering analysis of SAGE data using a Poisson approach
Serial analysis of gene expression (SAGE) data have been poorly exploited by clustering analysis owing to the lack of appropriate statistical methods that consider their specific properties. We modeled SAGE data by Poisson statistics and developed two Poisson-based distances. Their application to simulated and experimental mouse retina data shows that the Poisson-based distances are more appropriate and reliable for analyzing SAGE data than other commonly used distances or similarity measures such as Pearson correlation or Euclidean distance.

19.
Higher plants encode hundreds of pentatricopeptide repeat proteins (PPRs) that are involved in several types of RNA processing reactions. Most PPR genes are predicted to be targeted to chloroplasts or mitochondria, and many are known to affect organellar gene expression. In some cases, RNA binding has been directly demonstrated, and the sequences of the cis-elements are known. In this work, we demonstrate that RNA cis-elements recognized by PPRs are constrained in chloroplast genome evolution. Cis-elements for two PPR genes and several RNA editing sites were analyzed for sequence changes by pairwise nucleotide substitution frequency, pairwise indel frequency, and maximum likelihood (ML) phylogenetic distances. All three of these analyses demonstrated that sequences within the cis-element are highly conserved compared with surrounding sequences. In addition, we have compared sequences around chloroplast editing sites and homologous sequences in species that lack an editing site due to the presence of a genomic T. Cis-elements for RNA editing sites are highly conserved in angiosperms; by contrast, comparable sequences around a genomically encoded T exhibit higher rates of nucleotide substitution, higher frequencies of indels, and greater ML distances. The loss in requirement for editing to create the ndhD start codon has resulted in the conversion of the PPR gene responsible for editing that site to a pseudogene. We show that organellar dependence on nuclear-encoded PPR proteins for gene expression has constrained the evolution of cis-elements that are required at the level of RNA processing. Thus, the expansion of the PPR gene family in plants has had a dramatic effect on the evolution of plant organelle genomes.

20.