Similar literature
Found 20 similar documents; search took 15 ms
1.
Pairwise curve synchronization for functional data   Total citations: 1 (self-citations: 0, citations by others: 1)
Tang, Rong; Muller, Hans-Georg. Biometrika, 2008, 95(4): 875-889
Data collected by scientists are increasingly in the form of trajectories or curves. Often these can be viewed as realizations of a composite process driven by both amplitude and time variation. We consider the situation in which functional variation is dominated by time variation, and develop a curve-synchronization method that uses every trajectory in the sample as a reference to obtain pairwise warping functions in the first step. These initial pairwise warping functions are then used to create improved estimators of the underlying individual warping functions in the second step. A truncated averaging process is used to obtain robust estimation of individual warping functions. The method compares well with other available time-synchronization approaches and is illustrated with Berkeley growth data and gene expression data for multiple sclerosis.

2.
Functional data analysis techniques provide an alternative way of representing movement and movement variability as a function of time. In particular, the registration of functional data provides a local normalization of time functions. This normalization transforms a set of curves, records of repeated trials, yielding a new set of curves that only vary in terms of amplitude. Therefore, main events occur at the "same time" for all transformed curves and interesting features of individual recordings remain after averaging processes. This paper presents an application of the registration process to the analysis of the vertical forces exerted on the ground by both feet during the sit-to-stand movement. This movement is particularly interesting in functional evaluations related to balance control, lower extremity dysfunction or low-back pain.

3.
Methods for modeling sets of complex curves where the curves must be aligned in time (or in another continuous predictor) fall into the general class of functional data analysis and include self-modeling regression and time-warping procedures. Self-modeling regression (SEMOR), also known as a shape invariant model (SIM), assumes the curves have a common shape, modeled nonparametrically, and curve-specific differences in amplitude and timing, traditionally modeled by linear transformations. When curves contain multiple features that need to be aligned in time, SEMOR may be inadequate since a linear time transformation generally cannot align more than one feature. Time warping procedures focus on timing variability and on finding flexible time warps to align multiple data features. We draw on these methods to develop a SIM that models the time transformations as random, flexible, monotone functions. The model is motivated by speech movement data from the University of Wisconsin X-ray microbeam speech production project and is applied to these data to test the effect of different speaking conditions on the shape and relative timing of movement profiles.
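The shape invariant model described in this abstract is often written in a form like the one below. This is only a sketch in assumed notation (the symbols y, f, a, b, c, d, h and ε are not taken from the paper), showing the traditional linear time transformation and the flexible monotone variant the abstract mentions.

```latex
% Assumed notation: y_{ij} is observation j on curve i at time t_{ij},
% f is the common shape, (a_i, b_i) shift and scale the amplitude,
% (c_i, d_i) shift and scale time, and \varepsilon_{ij} is noise.
y_{ij} = a_i + b_i \, f\!\left(c_i + d_i \, t_{ij}\right) + \varepsilon_{ij}

% Flexible variant: the linear time map is replaced by a random,
% monotone warping function h_i, so several features can be aligned.
y_{ij} = a_i + b_i \, f\!\left(h_i(t_{ij})\right) + \varepsilon_{ij},
\qquad h_i \text{ monotone increasing}
```

With a linear time map only two degrees of freedom (c_i, d_i) are available per curve, which is why a single linear transformation generally cannot line up more than one feature.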

4.
  1. When we collect the growth curves of many individuals, orderly variation in the curves is often observed rather than a completely random mixture of various curves. Small individuals may exhibit similar growth curves, but the curves differ from those of large individuals, whereby the curves gradually vary from small to large individuals. It has been recognized that after standardization with the asymptotes, if all the growth curves are the same (anamorphic growth curve set), the growth curve sets can be estimated using nonchronological data; otherwise, that is, if the growth curves are not identical after standardization with the asymptotes (polymorphic growth curve set), this estimation is not feasible. However, because a given set of growth curves determines the variation in the observed data, it may be possible to estimate polymorphic growth curve sets using nonchronological data.
  2. In this study, we developed an estimation method by deriving the likelihood function for polymorphic growth curve sets. The method involves simple maximum likelihood estimation. The weighted nonlinear regression and least‐squares method after the log‐transform of the anamorphic growth curve sets were included as special cases.
  3. The growth curve sets of the height of cypress (Chamaecyparis obtusa) and larch (Larix kaempferi) trees were estimated. With the model selection process using the AIC and likelihood ratio test, the growth curve set for cypress was found to be polymorphic, whereas that for larch was found to be anamorphic. Improved fitting using the polymorphic model for cypress is due to resolving underdispersion (less dispersion in real data than model prediction).
  4. The likelihood function for model estimation depends not only on the distribution type of asymptotes, but the definition of the growth curve set as well. Consideration of these factors may be necessary, even if environmental explanatory variables and random effects are introduced.

5.
Given growing interest in functional data analysis (FDA) as a useful method for analyzing human movement data, it is critical to understand the effects of standard FDA procedures, including registration, on biomechanical analyses. Registration is used to reduce phase variability between curves while preserving the individual curve's shape and amplitude. The application of three methods available to assess registration could benefit those in the biomechanics community using FDA techniques: comparison of mean curves, comparison of average RMS values, and assessment of time-warping functions. Therefore, the present study has two purposes. First, the necessity of registration applied to cyclical data after time normalization is assessed. Second, we illustrate the three methods for evaluating registration effects. Masticatory jaw movements of 22 healthy adults (2 males, 21 females) were tracked while subjects chewed a gum-based pellet for 20 s. Motion data were captured at 60 Hz with two gen-locked video cameras. Individual chewing cycles were time normalized and then transformed into functional observations. Registration did not affect mean curves and warping functions were linear. Although registration decreased the RMS, indicating a decrease in inter-subject variability, the difference was not statistically significant. Together these results indicate that registration may not always be necessary for cyclical chewing data. An important contribution of this paper is the illustration of three methods for evaluating registration that are easy to apply and useful for judging whether the extra data manipulation is necessary.

6.
Functional traits and functional diversity measures are increasingly being used to examine land use effects on biodiversity and community assembly rules. Morphological traits are often used directly as functional traits. However, behavioral characteristics are more difficult to measure. Establishing methods to derive behavioral traits from morphological measurements is necessary to facilitate their inclusion in functional diversity analyses. We collected morphometric data from over 1,700 individuals of 12 species of dung beetle to establish whether morphological measurements can be used as predictors of behavioral traits. We also compared morphology among individuals collected from different land uses (primary forest, logged forest, and oil palm plantation) to identify whether intraspecific differences in morphology vary among land use types. We show that leg and eye measurements can be used to predict dung beetle nesting behavior and period of activity and we used this information to confirm the previously unresolved nesting behavior for Synapsis ritsemae. We found intraspecific differences in morphological traits across different land use types. Phenotypic plasticity was found for traits associated with dispersal (wing aspect ratio and wing loading) and reproductive capacity (abdomen size). The ability to predict behavioral functional traits from morphology is useful where the behavior of individuals cannot be directly observed, especially in tropical environments where the ecology of many species is poorly understood. In addition, we provide evidence that land use change can cause phenotypic plasticity in tropical dung beetle species. Our results reinforce recent calls for intraspecific variation in traits to receive more attention within community ecology.

7.
A novel hierarchical quantitative trait locus (QTL) mapping method using a polynomial growth function and a multiple-QTL model (with no dependence in time) in a multitrait framework is presented. The method considers a population-based sample where individuals have been phenotyped (over time) with respect to some dynamic trait and genotyped at a given set of loci. A specific feature of the proposed approach is that, instead of an average functional curve, each individual has its own functional curve. Moreover, each QTL can modify the dynamic characteristics of the trait value of an individual through its influence on one or more growth curve parameters. Apparent advantages of the approach include: (1) assumption of time-independent QTL and environmental effects, (2) alleviating the necessity for an autoregressive covariance structure for residuals and (3) the flexibility to use variable selection methods. As a by-product of the method, heritabilities and genetic correlations can also be estimated for individual growth curve parameters, which are considered as latent traits. For selecting trait-associated loci in the model, we use a modified version of the well-known Bayesian adaptive shrinkage technique. We illustrate our approach by analysing a subsample of 500 individuals from the simulated QTLMAS 2009 data set, as well as simulation replicates and a real Scots pine (Pinus sylvestris) data set, using temporal measurements of height as the dynamic trait of interest.

8.
We investigate the use of follow-up samples of individuals to estimate survival curves from studies that are subject to right censoring from two sources: (i) early termination of the study, namely, administrative censoring, or (ii) censoring due to lost data prior to administrative censoring, so-called dropout. We assume that, for the full cohort of individuals, administrative censoring times are independent of the subjects' inherent characteristics, including survival time. To address the loss to censoring due to dropout, which we allow to be possibly selective, we consider an intensive second phase of the study where a representative sample of the originally lost subjects is subsequently followed and their data recorded. As with double-sampling designs in survey methodology, the objective is to provide data on a representative subset of the dropouts. Despite assumed full response from the follow-up sample, we show that, in general in our setting, administrative censoring times are not independent of survival times within the two subgroups, nondropouts and sampled dropouts. As a result, the stratified Kaplan-Meier estimator is not appropriate for the cohort survival curve. Moreover, using the concept of potential outcomes, as opposed to observed outcomes, and thereby explicitly formulating the problem as a missing data problem, reveals and addresses these complications. We present an estimation method based on the likelihood of an easily observed subset of the data and study its properties analytically for large samples. We evaluate our method in a realistic situation by simulating data that match published margins on survival and dropout from an actual hip-replacement study. Limitations and extensions of our design and analytic method are discussed.
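For readers unfamiliar with the estimator this abstract argues against, the following is a minimal sketch of the standard (unstratified) Kaplan-Meier estimator for right-censored data. It is not the authors' likelihood-based method; the function name `kaplan_meier` and the toy data are illustrative assumptions.

```python
def kaplan_meier(times, events):
    """Standard Kaplan-Meier estimate for right-censored data.

    times  : observed time for each subject (event or censoring time)
    events : True if the event was observed, False if right-censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every subject whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        at_risk -= removed
    return curve


# Hypothetical toy data: five subjects, two of them censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [True, False, True, True, False])
```

The estimator relies on censoring being independent of survival within the group it is applied to, which is the assumption the abstract reports as failing within the nondropout and sampled-dropout subgroups.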

9.
An insect’s behavior is the expression of its integrated physiology in response to external and internal stimuli, turning insect behavior into a potential determinant of insecticide exposure. Behavioral traits may therefore influence insecticide efficacy against insects, compromising the validity of standard bioassays of insecticide activity, which are fundamentally based on lethality alone. By extension, insect ‘personality’ (i.e., an individual’s integrated set of behavioral tendencies that is inferred from multiple empirical measures) may also be an important determinant of insecticide exposure and activity. This has yet to be considered because the behavioral studies involving insects and insecticides focus on populations rather than on individuals. Even among studies of animal ‘personality’, the relative contributions of individual and population variation are usually neglected. Here, we assessed behavioral traits (within the categories: activity, boldness/shyness, and exploration/avoidance) of individuals from 15 populations of the maize weevil (Sitophilus zeamais), an important stored-grain pest with serious problems of insecticide resistance, and correlated the behavioral responses with the activity of the insecticide deltamethrin. This analysis was performed at both the population and individual levels. There was significant variation in weevil ‘personality’ among individuals and populations, but variation among individuals within populations accounted for most of the observed variation (92.57%). This result emphasizes the importance of individual variation in behavioral and ‘personality’ studies. When the behavioral traits assessed were correlated with median lethal time (LT50) at the population level and with the survival time under insecticide exposure, activity traits, particularly the distance walked, significantly increased survival time. 
Therefore, behavioral traits are important components of insecticide efficacy, and individual variation should be considered in such studies. This is so because population differences provided only a crude approximation of the individual personality in a restrained experimental setting likely to restrict individual behavior, favoring the transposition of the individual variation to the population.

10.
Experimental and corresponding modeling studies indicate that there is a 2- to 5-fold variation of intrinsic and synaptic parameters across animals while functional output is maintained. Here, we review experiments, using the heartbeat central pattern generator (CPG) in medicinal leeches, which explore the consequences of animal-to-animal variation in synaptic strength for coordinated motor output. We focus on a set of segmental heart motor neurons that all receive inhibitory synaptic input from the same four premotor interneurons. These four premotor inputs fire in a phase progression and the motor neurons also fire in a phase progression because of differences in synaptic strength profiles of the four inputs among segments. Our work tested the hypothesis that functional output is maintained in the face of animal-to-animal variation in the absolute strength of connections because relative strengths of the four inputs onto particular motor neurons are maintained across animals. Our experiments showed that relative strength is not strictly maintained across animals even as functional output is maintained, and animal-to-animal variations in strength of particular inputs do not correlate strongly with output phase. Further experiments measured the precise temporal pattern of the premotor inputs, the segmental synaptic strength profiles of their connections onto motor neurons, and the temporal pattern (phase progression) of those motor neurons all in the same animal for a series of 12 animals. The analysis of input and output in this sample of 12 individuals suggests that the number (four) of inputs to each motor neuron and the variability of the temporal pattern of input from the CPG across individuals weaken the influence of the strength of individual inputs. Moreover, the temporal pattern of the output varies as much across individuals as that of the input. Essentially, each animal arrives at a unique solution for how the network produces functional output.

11.
12.
Summary. Time course microarray data consist of mRNA expression from a common set of genes collected at different time points. Such data are thought to reflect underlying biological processes developing over time. In this article, we propose a model that allows us to examine differential expression and gene network relationships using time course microarray data. We model each gene-expression profile as a random functional transformation of the scale, amplitude, and phase of a common curve. Inferences about the gene-specific amplitude parameters allow us to examine differential gene expression. Inferences about measures of functional similarity based on estimated time-transformation functions allow us to examine gene networks while accounting for features of the gene-expression profiles. We discuss applications to simulated data as well as to microarray data on prostate cancer progression.

13.
Time-depth recorders sample information about the three-dimensional behavior of diving animals over time and reduce this into just two dimensions, depth and time. Even so, interpretation of the data may still be difficult because of the volume of data and the detail that remains. Comparison of dive "shape" across individuals, geographical locations, or species presents problems because its analysis may involve subjective judgments or arbitrary distinctions. More constraints may be imposed if a telemetry system is used to transmit the data. Here we present two approaches for dive data compression and analysis. The first (applied before storage and transmission) selects the most important time-depth points in a profile where depth vs. time trajectories change most significantly. The second (used to either preprocess or postprocess dive information) creates a dimensionless, depth- and duration-independent index (TAD), which encapsulates the relevant information from dive profiles on where the diver centers its activity with respect to depth during a dive. Its use facilitates comparison across dives performed at different times or places, within or between individuals or species, irrespective of the duration and depth of their dives. Both can be used to reduce the amount of information sent or stored about dive behavior and can facilitate dive analysis.

14.
The plasma membrane is a complex structure, composed mainly of lipids and proteins, which plays a pivotal role in cell metabolism by regulating its selective permeability to ions and molecules. According to the “raft hypothesis”, lipids in the bilayer do not form a structurally passive solvent, but are rather organized in specific domains, which present different structural and functional characteristics. The mechanical properties of the lipid part of the plasma membrane have recently been characterized through Atomic Force Microscopy, by analyzing the features of force vs. distance curves collected on supported lipid bilayers (SLBs). For lipid domains ranging in size from tens to hundreds of nanometers, which closely mimic the lateral organization of real membranes, a high lateral resolution and a large number of curves are often required to properly express the complexity of the system, with a consequent exponential growth of acquisition and processing time. In this paper we propose a method, based on a recently developed high speed Force Volume technique and on home‐built data processing software, for the mechanical characterization of nanostructured SLBs. With our software we have been able to process data sets composed of tens of thousands of curves, collected with a spatial resolution ranging from 8 to 40 nm/pixel. Multiparametric maps and distribution histograms produced by our analysis allowed us to identify a specific behavior for each lipid phase in the investigated model membranes, even in the presence of nanosized features. Copyright © 2015 John Wiley & Sons, Ltd.

15.
Current methods for detecting fluctuating selection require time series data on genotype frequencies. Here, we propose an alternative approach that makes use of DNA polymorphism data from a sample of individuals collected at a single point in time. Our method uses classical diffusion approximations to model temporal fluctuations in the selection coefficients to find the expected distribution of mutation frequencies in the population. Using the Poisson random-field setting we derive the site-frequency spectrum (SFS) for three different models of fluctuating selection. We find that the general effect of fluctuating selection is to produce a more "U"-shaped site-frequency spectrum with an excess of high-frequency derived mutations at the expense of middle-frequency variants. We present likelihood-ratio tests, comparing the fluctuating selection models to the neutral model using SFS data, and use Monte Carlo simulations to assess their power. We find that we have sufficient power to reject a neutral hypothesis using samples on the order of a few hundred SNPs and a sample size of approximately 20 and power to distinguish between selection that varies in time and constant selection for a sample of size 20. We also find that fluctuating selection increases the probability of fixation of selected sites even if, on average, there is no difference in selection among a pair of alleles segregating at the locus. Fluctuating selection will, therefore, lead to an increase in the ratio of divergence to polymorphism similar to that observed under positive directional selection.
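To make the site-frequency spectrum concrete, the sketch below tabulates an unfolded SFS from per-site derived-allele counts and computes the expected class proportions under the standard neutral model, where the expected number of sites with i derived copies is proportional to 1/i. The function names and the toy counts are assumptions for illustration; the fluctuating-selection spectra derived in the paper are not reproduced here.

```python
def site_frequency_spectrum(derived_counts, n):
    """Unfolded SFS for a sample of n chromosomes.

    derived_counts : number of derived-allele copies observed at each site
    Returns a list sfs where sfs[i - 1] is the number of segregating
    sites carrying exactly i derived copies, for i = 1 .. n - 1.
    """
    sfs = [0] * (n - 1)
    for k in derived_counts:
        # Sites with 0 or n derived copies are not segregating in the sample.
        if 0 < k < n:
            sfs[k - 1] += 1
    return sfs


def neutral_expected_proportions(n):
    """Expected share of segregating sites in each frequency class
    under the standard neutral model (class i has weight 1 / i)."""
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical counts for seven sites in a sample of n = 6 chromosomes.
observed = site_frequency_spectrum([1, 1, 2, 5, 5, 0, 6], n=6)
expected = neutral_expected_proportions(6)
```

A "U"-shaped spectrum of the kind the abstract describes would show more sites in the highest classes, and fewer in the middle classes, than these neutral proportions predict.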

16.
The identification of genetic variants responsible for behavioral variation is an enduring goal in biology, with wide-scale ramifications, ranging from medical research to evolutionary theory on personality syndromes. Here, we use for the first time a large-scale genetical genomics analysis in the brains of chickens to identify genes affecting anxiety as measured by an open field test. We combine quantitative trait locus (QTL) analysis in 572 individuals and expression QTL (eQTL) analysis in 129 individuals from an advanced intercross between domestic chickens and Red Junglefowl. We identify 10 putative quantitative trait genes affecting anxiety behavior. These genes were tested for an association in the mouse Heterogeneous Stock anxiety (open field) data set and human GWAS data sets for bipolar disorder, major depressive disorder, and schizophrenia. Although comparisons between species are complex, associations were observed for four of the candidate genes in mice and three of the candidate genes in humans. Using a multimodel approach we have therefore identified a number of putative quantitative trait genes affecting anxiety behavior, principally in chickens but also with some potentially translational effects. This study demonstrates that chickens are an excellent model organism for the genetic dissection of behavior.

17.
In this article, we imagine a breeding scenario with a population of individuals that have been genotyped but not phenotyped. We derived a computationally efficient statistic that uses this genetic information to measure the reliability of genomic estimated breeding values (GEBV) for a given set of individuals (test set) based on a training set of individuals. We used this reliability measure with a genetic algorithm scheme to find an optimized training set from a larger set of candidate individuals. This subset was phenotyped to create the training set that was used in a genomic selection model to estimate GEBV in the test set. Our results show that, compared to a random sample of the same size, the use of a set of individuals selected by our method improved accuracies. We implemented the proposed training selection methodology on four sets of data on Arabidopsis, wheat, rice and maize. This dynamic model building process that takes genotypes of the individuals in the test sample into account while selecting the training individuals improves the performance of genomic selection models.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0116-6) contains supplementary material, which is available to authorized users.

18.
Large-scale, multilocus genetic association studies require powerful and appropriate statistical-analysis tools that are designed to relate genotype and haplotype information to phenotypes of interest. Many analysis approaches consider relating allelic, haplotypic, or genotypic information to a trait through use of extensions of traditional analysis techniques, such as contingency-table analysis, regression methods, and analysis-of-variance techniques. In this work, we consider a complementary approach that involves the characterization and measurement of the similarity and dissimilarity of the allelic composition of a set of individuals' diploid genomes at multiple loci in the regions of interest. We describe a regression method that can be used to relate variation in the measure of genomic dissimilarity (or "distance") among a set of individuals to variation in their trait values. Weighting factors associated with functional or evolutionary conservation information of the loci can be used in the assessment of similarity. The proposed method is very flexible and is easily extended to complex multilocus-analysis settings involving covariates. In addition, the proposed method actually encompasses both single-locus and haplotype-phylogeny analysis methods, which are two of the most widely used approaches in genetic association analysis. We showcase the method with data described in the literature. Ultimately, our method is appropriate for high-dimensional genomic data and anticipates an era when cost-effective exhaustive DNA sequence data can be obtained for a large number of individuals, over and above genotype information focused on a few well-chosen loci.

19.
In terms of its soluble precursors, the coagulation proteome varies quantitatively among apparently healthy individuals. The significance of this variability remains obscure, in part because it is the backdrop against which the hemostatic consequences of more dramatic composition differences are studied. In this study we have defined the consequences of normal range variation of components of the coagulation proteome by using a mechanism-based computational approach that translates coagulation factor concentration data into a representation of an individual's thrombin generation potential. A novel graphical method is used to integrate standard measures that characterize thrombin generation in both empirical and computational models (e.g., max rate, max level, total thrombin, time to 2 nM thrombin ("clot time")) to visualize how normal range variation in coagulation factors results in unique thrombin generation phenotypes. Unique ensembles of the 8 coagulation factors encompassing the limits of normal range variation were used as initial conditions for the computational modeling, each ensemble representing "an individual" in a theoretical healthy population. These "individuals" with unremarkable proteome composition were then compared to actual normal and "abnormal" individuals, i.e. factor ensembles measured in apparently healthy individuals, actual coagulopathic individuals or artificially constructed factor ensembles representing individuals with specific factor deficiencies. A sensitivity analysis was performed to rank either individual factors or all possible pairs of factors in terms of their contribution to the overall distribution of thrombin generation phenotypes.
Key findings of these analyses include: normal range variation of coagulation factors yields thrombin generation phenotypes indistinguishable from individuals with some, but not all, coagulopathies examined; coordinate variation of certain pairs of factors within their normal ranges disproportionately results in extreme thrombin generation phenotypes, implying that measurement of a smaller set of factors may be sufficient to identify individuals with aberrant thrombin generation potential despite normal coagulation proteome composition.

20.
Diversity in biological communities frequently is compared using species accumulation curves, plotting observed species richness versus sample size. When species accumulation curves intersect, the ranking of communities by observed species richness depends on sample size, creating inconsistency in comparisons of diversity. We show that species accumulation curves for two communities are expected to intersect when the community with lower actual species richness has higher Simpson diversity (probability that two random individuals belong to different species). This may often occur when comparing communities that differ in habitat heterogeneity or disturbance, as we illustrate using data from neotropical butterflies. In contrast to observed species richness, estimated Simpson diversity always produces a consistent expected ranking among communities across sample sizes, with the statistical accuracy to confidently rank communities using small samples. Simpson diversity should therefore be particularly useful in rapid assessments to prioritize areas for conservation.
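The definition of Simpson diversity quoted in this abstract (the probability that two random individuals belong to different species) can be estimated from a sample as sketched below. The code uses the common without-replacement estimator; whether the paper uses exactly this form is an assumption, and the function names and the two toy communities are hypothetical.

```python
from collections import Counter


def simpson_diversity(individuals):
    """Estimate the probability that two individuals drawn without
    replacement from the sample belong to different species.

    individuals : iterable of species labels, one per sampled individual
    """
    counts = Counter(individuals)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two individuals")
    # Number of ordered pairs of individuals that share a species.
    same_species_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_species_pairs / (n * (n - 1))


def observed_richness(individuals):
    """Number of distinct species seen in the sample."""
    return len(set(individuals))


# Hypothetical samples: an even 3-species community and a
# 5-species community dominated by a single species.
even = ["a"] * 5 + ["b"] * 5 + ["c"] * 5
skewed = ["a"] * 12 + ["b", "c", "d", "e"]

even_simpson = simpson_diversity(even)
skewed_simpson = simpson_diversity(skewed)
```

In this toy case the skewed sample has the higher observed richness (5 versus 3 species) but the lower Simpson diversity (about 0.45 versus about 0.71), which is the configuration the abstract associates with intersecting accumulation curves.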
