Similar literature
 Found 20 similar articles; search took 15 ms
1.
Ecogeographic rules that describe quantitative relationships between morphology and climate might help us predict how the morphometrics of animals are shaped by local temperature or humidity. Although these ecogeographic rules have been widely tested in animals of Europe and North America, they have not been fully validated for species in less-studied regions. Here, we investigate the morphometric variation of a widely distributed East Asian passerine, the vinous‐throated parrotbill (Sinosuthora webbiana), to test whether its morphological variation conforms to the predictions of Bergmann's rule, Allen's rule, and Gloger's rule. We first described the climatic niche of S. webbiana from occurrence records (n = 7838) and specimen records (n = 290). The results of analysis of covariance (ANCOVA) suggested that the plumage coloration of these parrotbills was darker in wetter/warmer environments, following Gloger's rule. However, their appendage size (culmen length, beak volume, tarsus length) was larger in colder environments, the opposite of the prediction of Allen's rule. Similarly, their body size (wing length) was larger in warmer environments, the opposite of the prediction of Bergmann's rule. Such nonconformity to both Bergmann's rule and Allen's rule suggests that the evolution of morphological variation is likely governed by multiple selective forces rather than dominated by thermoregulation. Our results suggest that these ecogeographic rules should be validated before forecasting biological responses to climate change, especially for species in less‐studied regions.

2.
3.
A key challenge in genetics is identifying the functional roles of genes in pathways. Numerous functional genomics techniques (e.g. machine learning) that predict protein function have been developed to address this question. These methods generally build from existing annotations of genes to pathways and thus are often unable to identify additional genes participating in processes that are not already well studied. Many of these processes are well studied in some organism, but not necessarily in an investigator's organism of interest. Sequence-based search methods (e.g. BLAST) have been used to transfer such annotation information between organisms. We demonstrate that functional genomics can complement traditional sequence similarity to improve the transfer of gene annotations between organisms. Our method transfers annotations only when functionally appropriate as determined by genomic data and can be used with any prediction algorithm to combine transferred gene function knowledge with organism-specific high-throughput data to enable accurate function prediction. We show that diverse state-of-the-art machine learning algorithms leveraging functional knowledge transfer (FKT) dramatically improve their accuracy in predicting gene-pathway membership, particularly for processes with little experimental knowledge in an organism. We also show that our method compares favorably to annotation transfer by sequence similarity. Next, we deploy FKT with a state-of-the-art SVM classifier to predict novel genes for 11,000 biological processes across six diverse organisms and expand the coverage of accurate function predictions to processes that are often ignored because of a dearth of annotated genes in an organism. Finally, we perform in vivo experimental investigation in Danio rerio and confirm the regulatory role of our top predicted novel gene, wnt5b, in leftward cell migration during heart development. FKT is immediately applicable to many bioinformatics techniques and will help biologists systematically integrate prior knowledge from diverse systems to direct targeted experiments in their organism of study.

4.
A fundamental challenge in Systems Biology is whether a cell‐scale metabolic model can predict patterns of genome evolution by realistically accounting for associated biochemical constraints. Here, we study the order in which genes are lost in an in silico evolutionary process, leading from the metabolic network of Escherichia coli to that of the endosymbiont Buchnera aphidicola. We examine how this order correlates with the order in which the genes were actually lost, as estimated from a phylogenetic reconstruction. By optimizing this correlation across the space of potential growth and biomass conditions, we compute an upper bound estimate on the model's prediction accuracy (R = 0.54). The model's network‐based predictive ability outperforms predictions obtained using genomic features of individual genes, reflecting the effect of selection imposed by metabolic stoichiometric constraints. Thus, while the timing of gene loss might be expected to be a completely stochastic evolutionary process, remarkably, we find that metabolic considerations, on their own, make a marked 40% contribution to determining when such losses occur.

5.
Human bone marrow mesenchymal stem cells (hBMSCs) are a widely used cell source for clinical bone regeneration. Achieving the greatest therapeutic effect depends on the osteogenic differentiation potential of the stem cells to be implanted. However, there are still no practical methods to characterize such potential non-invasively or in advance. Monitoring cellular morphology is a practical and non-invasive approach for evaluating osteogenic potential. Unfortunately, such image-based approaches have historically been qualitative and have required experienced interpretation. By combining the non-invasive attributes of microscopy with the latest technology allowing higher throughput and quantitative imaging metrics, we studied the applicability of morphometric features to quantitatively predict cellular osteogenic potential. We applied computational machine learning, combining cell morphology features with their corresponding biochemical osteogenic assay results, to develop a prediction model of osteogenic differentiation. Using a dataset of 9,990 images automatically acquired by BioStation CT during osteogenic differentiation culture of hBMSCs, 666 morphometric features were extracted as parameters. Two commonly used osteogenic markers, alkaline phosphatase (ALP) activity and calcium deposition, were measured experimentally and used as the true biological differentiation status to validate the prediction accuracy. Using time-course morphological features throughout differentiation culture, the prediction results correlated highly with the experimentally defined differentiation marker values (R > 0.89 for both marker predictions). The clinical applicability of our morphology-based prediction was further examined in two scenarios: one using only historical cell images and the other using historical images together with the patient's own cell images to predict a new patient's cellular potential. The prediction accuracy was found to be greatly enhanced by incorporating patients' own cell features in the modeling, indicating a practical strategy for clinical usage. Consequently, our results provide strong evidence for the feasibility of using a quantitative time series of phase-contrast cellular morphology for non-invasive cell quality prediction in regenerative medicine.

6.
7.
Previous research has shown that young infants perceive others' actions as structured by goals. One open question is whether the recruitment of this understanding when predicting others' actions imposes a cognitive challenge for young infants. The current study explored infants' ability to utilize their knowledge of others' goals to rapidly predict future behavior in complex social environments and distinguish goal-directed actions from other kinds of movements. Fifteen-month-olds (N = 40) viewed videos of an actor engaged in either a goal-directed (grasping) or an ambiguous (brushing the back of her hand) action on a Tobii eye-tracker. At test, critical elements of the scene were changed and infants' predictive fixations were examined to determine whether they relied on goal information to anticipate the actor's future behavior. Results revealed that infants reliably generated goal-based visual predictions for the grasping action, but not for the back-of-hand behavior. Moreover, response latencies were longer for goal-based predictions than for location-based predictions, suggesting that goal-based predictions are cognitively taxing. Analyses of areas of interest indicated that heightened attention to the overall scene, as opposed to specific patterns of attention, was the critical indicator of successful judgments regarding an actor's future goal-directed behavior. These findings shed light on the processes that support “smart” social behavior in infants, as it may be a challenge for young infants to use information about others' intentions to inform rapid predictions.

8.
The interactions between consumers and prey, and their impact on biomass distribution among trophic levels, are central issues in both empirical and theoretical ecology. In a long-term experiment, where all organisms, including the top predator, were allowed to respond to environmental conditions by reproduction, we tested predictions from 'prey-dependent' and 'ratio-dependent' models. Prey-dependent models made correct predictions only in the presence of strong interactors in simple food chains, but failed to predict patterns in more complex situations. Processes such as omnivory, consumer excretion, and unsuitable prey-size windows (invulnerable prey) increased the complexity and created patterns resembling ratio-dependent consumption. However, whereas the prey-dependent patterns were created by the mechanisms predicted by the model, ratio-dependent patterns were not, suggesting that they may be 'right for the wrong reason'. We show here that despite the enormous complexity of ecosystems, it is possible to identify and disentangle mechanisms responsible for observed patterns in community structure, as well as in biomass development of organisms ranging in size from bacteria to fish.

9.
The potential of the computer program PASS (Prediction of Activity Spectra for Substances) to predict rodent carcinogenicity for chemical compounds was studied. PASS predicts carcinogenicity of chemical compounds on the basis of their structural formula and of structure-activity relationship analysis of known carcinogens and non-carcinogens. The data on structures and experimental results of 2-year carcinogenicity assays for 412 chemicals from the NTP (National Toxicology Program) and 1190 chemicals from the CPDB (Carcinogenic Potency Database) were used in our study. The predictions take into consideration information about species and sex of animals. For evaluation of the predictive accuracy we used two procedures: leave-one-out cross-validation (LOO CV) and leave-20%-out cross-validation. In the latter case we randomly divided the studied data set 20 times into two subsets. The data from the first subset, containing 80% of the compounds, were added to the PASS training set (which includes about 46,000 compounds with about 1500 biological activity types collected during the last 20 years to predict biological activity spectra); the second subset, with 20% of the compounds, was used as an evaluation set. The mean accuracy of prediction calculated by LOO CV is about 73% for NTP compounds in the 'equivocal' category of carcinogenic activity and 80% for NTP compounds in the 'evidence' category of carcinogenicity. The mean accuracy of prediction for the CPDB database is 89.9% calculated by LOO CV and 63.4% calculated by leave-20%-out cross-validation. The influence of incorporating species and sex data on the accuracy of carcinogenicity prediction was also investigated. It was shown that the accuracy was increased only for data on male animals.
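
The two evaluation procedures named in this abstract can be sketched as follows. This is a minimal illustration and not PASS itself: the function names, the `fit`/`predict` callables, the one-feature threshold classifier, and the `(feature, label)` data layout are all assumptions introduced here. The sketch also trains only on the 80% subset, whereas the study added that subset to the existing PASS training set.

```python
import random

def fit_threshold(train):
    """Toy stand-in model (assumption): midpoint between the two class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict_threshold(threshold, x):
    """Predict class 1 when the feature exceeds the learned threshold."""
    return int(x > threshold)

def loo_accuracy(data, fit, predict):
    """Leave-one-out CV: hold out each item once and train on the rest."""
    hits = 0
    for i, (x, y) in enumerate(data):
        model = fit(data[:i] + data[i + 1:])
        hits += predict(model, x) == y
    return hits / len(data)

def repeated_holdout_accuracy(data, fit, predict, repeats=20, test_frac=0.2, seed=0):
    """Leave-20%-out: random 80/20 split, repeated; returns the mean accuracy."""
    rng = random.Random(seed)
    n_test = max(1, round(len(data) * test_frac))
    accuracies = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        model = fit(train)
        correct = sum(predict(model, x) == y for x, y in test)
        accuracies.append(correct / n_test)
    return sum(accuracies) / repeats
```

LOO CV trains on all but one item, so it tends to give more optimistic estimates than a 20% hold-out. That is consistent with the gap the abstract reports for the CPDB data (89.9% versus 63.4%), although the abstract itself does not state the cause.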

10.
It has been argued that spatially explicit population models (SEPMs) cannot provide reliable guidance for conservation biology because of the difficulty of obtaining direct estimates for their demographic and dispersal parameters and because of error propagation. We argue that appropriate model calibration procedures can access additional sources of information, compensating for the lack of direct parameter estimates. Our objective is to show how model calibration using population-level data can facilitate the construction of SEPMs that produce reliable predictions for conservation even when direct parameter estimates are inadequate. We constructed a spatially explicit and individual-based population model for the dynamics of brown bears (Ursus arctos) after a reintroduction program in Austria. To calibrate the model we developed a procedure that compared the simulated population dynamics with distinct features of the known population dynamics (=patterns). This procedure detected model parameterizations that did not reproduce the known dynamics. Global sensitivity analysis of the uncalibrated model revealed high uncertainty in most model predictions due to large parameter uncertainties (coefficients of variation CV 0.8). However, the calibrated model yielded predictions with considerably reduced uncertainty (CV 0.2). A pattern or a combination of various patterns that embed information on the entire model dynamics can reduce the uncertainty in model predictions, and the application of different patterns with high information content yields the same model predictions. In contrast, a pattern that does not embed information on the entire population dynamics (e.g., bear observations taken from sub-areas of the study area) does not reduce uncertainty in model predictions. Because population-level data for defining (multiple) patterns are often available, our approach could be applied widely.

11.

Background:

Individual researchers are struggling to keep up with the accelerating emergence of high-throughput biological data, and to extract information that relates to their specific questions. Integration of accumulated evidence should permit researchers to form fewer - and more accurate - hypotheses for further study through experimentation.

Results:

Here a method previously used to predict Gene Ontology (GO) terms for Saccharomyces cerevisiae (Tian et al.: Combining guilt-by-association and guilt-by-profiling to predict Saccharomyces cerevisiae gene function. Genome Biol 2008, 9(Suppl 1):S7) is applied to predict GO terms and phenotypes for 21,603 Mus musculus genes, using a diverse collection of integrated data sources (including expression, interaction, and sequence-based data). This combined 'guilt-by-profiling' and 'guilt-by-association' approach optimizes the combination of two inference methodologies. Predictions at all levels of confidence are evaluated by examining genes not used in training, and top predictions are examined manually using available literature and knowledge base resources.

Conclusion:

We assigned a confidence score to each gene/term combination. The results provided high prediction performance, with nearly every GO term achieving greater than 40% precision at 1% recall. Among the 36 novel predictions for GO terms and 40 for phenotypes that were studied manually, >80% and >40%, respectively, were identified as accurate. We also illustrate that a combination of 'guilt-by-profiling' and 'guilt-by-association' outperforms either approach alone in their application to M. musculus.
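
The "precision at 1% recall" figure quoted above can be computed from ranked confidence scores. The sketch below is one common way to do it, under stated assumptions: the function name is hypothetical, labels are 0/1, and ties between scores are broken arbitrarily by the sort. It is not taken from the authors' pipeline.

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision of the shortest top-ranked list whose recall reaches target_recall.

    scores: confidence per gene/term pair (higher means more confident).
    labels: 1 if the annotation is true, 0 otherwise.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            return true_pos / k
    # Only reached if target_recall exceeds 1.0.
    return true_pos / len(ranked)
```

At low recall only the most confident predictions are counted, so this metric describes how trustworthy the top of the ranked list is, which is the part an experimentalist would act on first.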

12.
Genetic screening is becoming possible on an unprecedented scale. However, its utility remains controversial. Although most variant genotypes cannot be easily interpreted, many individuals nevertheless attempt to interpret their genetic information. Initiatives such as the Personal Genome Project (PGP) and Illumina's Understand Your Genome are sequencing thousands of adults, collecting phenotypic information and developing computational pipelines to identify the most important variant genotypes harbored by each individual. These pipelines consider database and allele frequency annotations and bioinformatics classifications. We propose that the next step will be to integrate these different sources of information to estimate the probability that a given individual has specific phenotypes of clinical interest. To this end, we have designed a Bayesian probabilistic model to predict the probability of dichotomous phenotypes. When applied to a cohort from PGP, predictions of Gilbert syndrome, Graves' disease, non-Hodgkin lymphoma, and various blood groups were accurate, as individuals manifesting the phenotype in question exhibited the highest, or among the highest, predicted probabilities. Thirty-eight PGP phenotypes (26%) were predicted with area-under-the-ROC curve (AUC) > 0.7, and 23 (15.8%) of these were statistically significant, based on permutation tests. Moreover, in a Critical Assessment of Genome Interpretation (CAGI) blinded prediction experiment, the models were used to match 77 PGP genomes to phenotypic profiles, generating the most accurate prediction of 16 submissions, according to an independent assessor. Although the models are currently insufficiently accurate for diagnostic utility, we expect their performance to improve with growth of publicly available genomics data and model refinement by domain experts.
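
The abstract reports AUC values whose significance was assessed by permutation tests. A minimal sketch of that kind of analysis is shown below. The function names, the label-shuffling scheme, and the add-one p-value correction are assumptions made for illustration, since the abstract does not describe the authors' exact procedure.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a random case outranks a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both phenotype-positive and phenotype-negative individuals")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_p_value(scores, labels, n_perm=1000, seed=0):
    """One-sided p-value: how often shuffled labels reach the observed AUC."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        at_least_as_good += auc(scores, shuffled) >= observed
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_good + 1) / (n_perm + 1)
```

Shuffling the labels keeps the number of affected individuals fixed while breaking any link between predicted probability and phenotype. This matters for rare phenotypes, where a high AUC can arise by chance from a handful of cases.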

13.
Accuracy of predicting protein secondary structure and solvent accessibility from sequence information has been improved significantly by using information contained in multiple sequence alignments as input to a neural network system. For the Asilomar meeting, predictions for 13 proteins were generated automatically using the publicly available prediction method PHD. The results confirm the estimate of 72% three-state prediction accuracy. The fairly accurate predictions of secondary structure segments made the tool useful as a starting point for modeling of higher dimensional aspects of protein structure. © 1995 Wiley-Liss, Inc.

14.

Background

Massively parallel sequencing studies have led to the identification of a large number of mutations present in a minority of cancers of a given site. Hence, methods to identify the likely pathogenic mutations that are worth exploring experimentally and clinically are required. We sought to compare the performance of 15 mutation effect prediction algorithms and their agreement. As a hypothesis-generating aim, we sought to define whether combinations of prediction algorithms would improve the functional effect predictions of specific mutations.

Results

Literature and database mining of single nucleotide variants (SNVs) affecting 15 cancer genes was performed to identify mutations supported by functional evidence or hereditary disease association to be classified either as non-neutral (n = 849) or neutral (n = 140) with respect to their impact on protein function. These SNVs were employed to test the performance of 15 mutation effect prediction algorithms. The accuracy of the prediction algorithms varies considerably. Although all algorithms perform consistently well in terms of positive predictive value, their negative predictive value varies substantially. Cancer-specific mutation effect predictors display no-to-almost perfect agreement in their predictions of these SNVs, whereas the non-cancer-specific predictors showed no-to-moderate agreement. Combinations of predictors modestly improve accuracy and significantly improve negative predictive values.

Conclusions

The information provided by mutation effect predictors is not equivalent. No algorithm is able to predict sufficiently accurately SNVs that should be taken forward for experimental or clinical testing. Combining algorithms aggregates orthogonal information and may result in improvements in the negative predictive value of mutation effect predictions.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-014-0484-1) contains supplementary material, which is available to authorized users.
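
The positive and negative predictive values discussed in this abstract, and the idea of combining predictors, can be illustrated with a short sketch. The function names are hypothetical, and simple majority voting is an assumption chosen for illustration. The abstract does not say how the authors combined algorithms.

```python
def ppv_npv(predicted, actual):
    """Return (PPV, NPV). True means 'non-neutral', False means 'neutral'."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    npv = tn / (tn + fn) if tn + fn else float("nan")
    return ppv, npv

def majority_vote(calls_per_predictor):
    """Combine predictors: call a variant non-neutral when more than half do."""
    n_predictors = len(calls_per_predictor)
    return [sum(votes) * 2 > n_predictors for votes in zip(*calls_per_predictor)]
```

With a benchmark as imbalanced as the one described (849 non-neutral versus 140 neutral SNVs), a predictor that calls most variants non-neutral can keep a high PPV while its NPV is poor, which is why the two values are reported separately.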

15.
With respect to autosomal genes, a grandparent is equally related to male and female grandchildren. Because males are heterozygous for sex chromosomes, however, grandparents are asymmetrically related to male and female grandchildren via the sex chromosomes. For example, the Y chromosome from the paternal grandfather passes directly down to grandsons. This asymmetry leads to a prediction that genes on the sex chromosomes could drive differential grandparental care. Alternatively, the paternity uncertainty hypothesis for differential grandparent care brings about a different set of predictions. A grandfather, for example, has two degrees of uncertainty to his son's children but only one to his daughter's children. Thus, under high extra-pair paternity rates, paternity uncertainty predicts that a grandfather will favor his daughter's children over his son's children. A paternity uncertainty vs. a genetic relatedness hypothesis was tested using data from questionnaires asking adult grandchildren to rate the amount and quality of care of their various grandparents. We found no support for preferential care based on expected sex chromosome similarities. Instead, our data were in general accord with the predictions of the paternity uncertainty hypothesis of grandparental care. A model is presented to predict the rates of extra-pair paternity required in a population to have the effects of paternity uncertainty outweigh sex chromosome effects.
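
The "degrees of uncertainty" argument can be made concrete with a small calculation. The sketch below covers only the paternity uncertainty side and is not the model presented in the paper. It assumes a single extra-pair paternity rate applied independently at each father-child link, and that a broken link reduces relatedness to zero. The names `expected_relatedness` and `PATERNITY_LINKS` are hypothetical.

```python
# Number of father-child links between each grandparent and a grandchild.
PATERNITY_LINKS = {
    "maternal grandmother": 0,
    "maternal grandfather": 1,
    "paternal grandmother": 1,
    "paternal grandfather": 2,
}

def expected_relatedness(paternity_links, extra_pair_rate, base=0.25):
    """Expected autosomal relatedness of a grandparent to a grandchild.

    base is the nominal grandparent-grandchild relatedness (1/4).
    Each paternity link is genuine with probability (1 - extra_pair_rate).
    """
    if not 0.0 <= extra_pair_rate <= 1.0:
        raise ValueError("extra_pair_rate must lie in [0, 1]")
    return base * (1.0 - extra_pair_rate) ** paternity_links
```

With an extra-pair rate of 10%, for instance, a grandfather's expected relatedness is 0.25 × 0.9 = 0.225 to his daughter's children and 0.25 × 0.9² = 0.2025 to his son's children, which is the direction of preference the hypothesis predicts.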

16.
Review: protein secondary structure prediction continues to rise   Total citations: 15, self-citations: 0, citations by others: 15
Methods predicting protein secondary structure improved substantially in the 1990s through the use of evolutionary information taken from the divergence of proteins in the same structural family. Recently, the evolutionary information resulting from improved searches and larger databases has again boosted prediction accuracy by more than four percentage points to its current height of around 76% of all residues predicted correctly in one of the three states, helix, strand, and other. The past year also brought successful new concepts to the field. These new methods may be particularly interesting in light of the improvements achieved through simple combining of existing methods. Divergent evolutionary profiles contain enough information not only to substantially improve prediction accuracy, but also to correctly predict long stretches of identical residues observed in alternative secondary structure states depending on nonlocal conditions. An example is a method automatically identifying structural switches and thus finding a remarkable connection between predicted secondary structure and aspects of function. Secondary structure predictions are increasingly becoming the work horse for numerous methods aimed at predicting protein structure and function. Is the recent increase in accuracy significant enough to make predictions even more useful? Because the recent improvement yields a better prediction of segments, and in particular of beta strands, I believe the answer is affirmative. What is the limit of prediction accuracy? We shall see.
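
The three-state accuracy quoted above (the share of residues predicted correctly as helix, strand, or other) is straightforward to compute. In the sketch below the one-letter codes `H`, `E`, and `C` for the three states and the function name are assumptions made here. The review does not prescribe an encoding.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed one.

    Both arguments are strings over H (helix), E (strand), C (other).
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    if set(predicted + observed) - set("HEC"):
        raise ValueError("states must be one of H, E, C")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)
```

A per-residue score like this does not reward getting whole segments right, which is presumably why the review singles out the improved prediction of segments, and of beta strands in particular, as a separate gain.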

17.
18.
ABSTRACT: BACKGROUND: Biological databases contain large amounts of data concerning the functions and associations of genes and proteins. Integration of data from several such databases into a single repository can aid the discovery of previously unknown connections spanning multiple types of relationships and databases. RESULTS: Biomine is a system that integrates cross-references from several biological databases into a graph model with multiple types of edges, such as protein interactions, gene-disease associations and gene ontology annotations. Edges are weighted based on their type, reliability, and informativeness. We present Biomine and evaluate its performance in link prediction, where the goal is to predict pairs of nodes that will be connected in the future, based on current data. In particular, we formulate protein interaction prediction and disease gene prioritization tasks as instances of link prediction. The predictions are based on a proximity measure computed on the integrated graph. We consider and experiment with several such measures, and perform a parameter optimization procedure where different edge types are weighted to optimize link prediction accuracy. We also propose a novel method for disease-gene prioritization, defined as finding a subset of candidate genes that cluster together in the graph. We experimentally evaluate Biomine by predicting future annotations in the source databases and prioritizing lists of putative disease genes. CONCLUSIONS: The experimental results show that Biomine has strong potential for predicting links when a set of selected candidate links is available. The predictions obtained using the entire Biomine dataset are shown to clearly outperform ones obtained using any single source of data alone, when different types of links are suitably weighted. In the gene prioritization task, an established reference set of disease-associated genes is useful, but the results show that under favorable conditions, Biomine can also perform well when no such information is available. The Biomine system is a proof of concept. Its current version contains 1.1 million entities and 8.1 million relations between them, with focus on human genetics. Some of its functionalities are available in a public query interface at http://biomine.cs.helsinki.fi, allowing searching for and visualizing connections between given biological entities.
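
To make "a proximity measure computed on the integrated graph" concrete, here is a sketch of one simple candidate: the probability of the best path between two nodes, treating each edge weight as an independent probability in (0, 1]. This measure, the function name, and the adjacency-dictionary layout are assumptions chosen for illustration, and the measure is not necessarily one of those the authors evaluated.

```python
import heapq
import math

def best_path_probability(graph, source, target):
    """Largest product of edge weights over any path from source to target.

    graph: {node: {neighbor: weight}}, with every weight in (0, 1].
    Maximizing a product of weights equals minimizing the sum of -log(weight),
    so Dijkstra's algorithm applies. Returns 0.0 if target is unreachable.
    """
    best_cost = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            return math.exp(-cost)
        if cost > best_cost.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost - math.log(weight)
            if new_cost < best_cost.get(neighbor, math.inf):
                best_cost[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return 0.0
```

The dictionary is read as directed, so an undirected relation needs an entry in both directions. Ranking candidate gene-disease pairs by a score like this is one way to cast gene prioritization as link prediction.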

19.
Cancer subtype classification and survival prediction both relate directly to patients' specific treatment plans, making them fundamental medical issues. Although the two factors are interrelated learning problems, most studies tackle each separately. In this paper, expression levels of genes are used for both cancer subtype classification and survival prediction. We considered 350 diffuse large B-cell lymphoma (DLBCL) subjects, taken from four groups of patients (activated B-cell-like subtype dead, activated B-cell-like subtype alive, germinal center B-cell-like subtype dead, and germinal center B-cell-like subtype alive). As classification features, we used 11,271 gene expression levels of each subject. The features were first ranked by the mRMR (Maximum Relevance Minimum Redundancy) principle and further selected by the IFS (Incremental Feature Selection) procedure. Thirty-five gene signatures were selected after the IFS procedure, and the patients were divided into the above-mentioned four groups. These four groups were combined in different ways for subtype prediction and survival prediction, specifically, the activated versus the germinal center and the alive versus the dead. Subtype prediction accuracy of the 35-gene signature was 98.6%. We calculated the cumulative survival time of the high-risk and low-risk groups by the Kaplan-Meier method. The log-rank test p-value was 5.98e-08. Our methodology provides a way to study subtype classification and survival prediction simultaneously. Our results suggest that for some diseases, especially cancer, subtype classification may be used to predict survival, and, conversely, survival prediction features may shed light on subtype features.
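
The Kaplan-Meier method mentioned above estimates a survival curve from follow-up times that may be censored. The sketch below is a plain product-limit estimator written for illustration. The function name and the `(time, event)` encoding are assumptions, and it does not reproduce the study's data or its log-rank test.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up duration per patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival probability) at each observed death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Calling this once for the high-risk group and once for the low-risk group gives the two curves that a log-rank test would then compare.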

20.
The current approach to using machine learning (ML) algorithms in healthcare is to either require clinician oversight for every use case or use their predictions without any human oversight. We explore a middle ground that lets ML algorithms abstain from making a prediction to simultaneously improve their reliability and reduce the burden placed on human experts. To this end, we present a general penalized loss minimization framework for training selective prediction-set (SPS) models, which choose to either output a prediction set or abstain. The resulting models abstain when the outcome is difficult to predict accurately, such as on subjects who are too different from the training data, and achieve higher accuracy on those they do give predictions for. We then introduce a model-agnostic, statistical inference procedure for the coverage rate of an SPS model that ensembles individual models trained using K-fold cross-validation. We find that SPS ensembles attain prediction-set coverage rates closer to the nominal level and have narrower confidence intervals for their marginal coverage rate. We apply our method to train neural networks that abstain more for out-of-sample images on the MNIST digit prediction task and achieve higher predictive accuracy for ICU patients compared to existing approaches.
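
The trade-off between abstaining and accuracy can be shown with a much simpler mechanism than the one in this abstract. The sketch below substitutes post-hoc confidence thresholding on class probabilities for the authors' penalized-loss training and prediction sets, so it is only an analogy. The function names and the 0.8 threshold are assumptions.

```python
def selective_predict(probabilities, threshold=0.8):
    """Predict the most probable class, or None (abstain) if confidence is low.

    probabilities: one list of class probabilities per subject.
    """
    decisions = []
    for probs in probabilities:
        best = max(range(len(probs)), key=probs.__getitem__)
        decisions.append(best if probs[best] >= threshold else None)
    return decisions

def coverage_and_accuracy(decisions, labels):
    """Share of subjects answered, and accuracy among the answered ones."""
    answered = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    coverage = len(answered) / len(labels)
    if not answered:
        return coverage, float("nan")
    accuracy = sum(d == y for d, y in answered) / len(answered)
    return coverage, accuracy
```

Raising the threshold lowers coverage and, when confidence is informative, raises accuracy on the cases that are answered. The cases left unanswered are the ones that would be referred to a clinician.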
