Similar literature
20 similar articles found
1.
Gene-gene dependency plays a very important role in systems biology, as it is central to understanding different biological mechanisms. Time-course microarray data provide a new platform for revealing the dynamic mechanism of gene-gene dependencies. Existing interaction measures are mostly based on association measures, such as Pearson or Spearman correlations. However, it is well known that such measures can only capture linear or monotonic dependency relationships, not nonlinear combinatorial ones. Using hidden Markov models, we propose a new measure of pairwise dependency based on transition probabilities. The new dynamic interaction measure checks whether or not the joint transition kernel of the bivariate state variables is the product of the two marginal transition kernels. This measure enables us not only to evaluate the strength of gene dependencies but also to infer their details. It reveals nonlinear combinatorial dependency structure in two respects: between two genes and across adjacent time points. We conduct a bootstrap-based test for the presence or absence of dependency between every pair of genes. Simulation studies and real biological data analysis demonstrate the application of the proposed method. The software package is available upon request.
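
The central check in this abstract, whether a joint transition kernel factorizes into the product of two marginal kernels, can be illustrated with a small plug-in sketch. This is not the authors' method: it assumes the states are directly observed and already discretized (the paper works with hidden states), it omits the bootstrap test, and the function names `transition_kernel` and `kernel_discrepancy` are hypothetical.

```python
from itertools import product


def transition_kernel(states, n_states):
    """Row-normalised empirical transition matrix of a discrete state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    kernel = []
    for row in counts:
        total = sum(row)
        # Rows for states that were never left stay all-zero.
        kernel.append([c / total if total else 0.0 for c in row])
    return kernel


def kernel_discrepancy(x, y, n_states=2):
    """Largest absolute gap between the joint kernel and the product of marginals.

    A value near zero is compatible with independent transitions; a large
    value suggests the two sequences move together. (Assumed plug-in
    statistic, not the measure defined in the paper.)
    """
    px = transition_kernel(x, n_states)
    py = transition_kernel(y, n_states)
    # Encode the pair (i, j) as the single joint state i * n_states + j.
    joint_states = [a * n_states + b for a, b in zip(x, y)]
    joint = transition_kernel(joint_states, n_states * n_states)
    gap = 0.0
    for i, j, k, l in product(range(n_states), repeat=4):
        row = i * n_states + j
        if sum(joint[row]) == 0:
            continue  # joint state (i, j) never observed as a starting point
        diff = abs(joint[row][k * n_states + l] - px[i][k] * py[j][l])
        gap = max(gap, diff)
    return gap


if __name__ == "__main__":
    x = [0, 0, 1, 1, 0, 0, 1, 1, 0]
    print(kernel_discrepancy(x, x))        # y copies x: clearly dependent
    print(kernel_discrepancy(x, [0] * 9))  # y is constant: no dependence
```

In practice the discrepancy would need a reference distribution, which is the role the bootstrap test plays in the abstract.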

2.

Background

Knee osteoarthritis (OA) is the most common joint disease of adults worldwide. Since the treatments for advanced radiographic knee OA are limited, clinicians face a significant challenge of identifying patients who are at high risk of OA in a timely and appropriate way. Therefore, we developed a simple self-assessment scoring system and an improved artificial neural network (ANN) model for knee OA.

Methods

The Fifth Korea National Health and Nutrition Examination Surveys (KNHANES V-1) data were used to develop a scoring system and ANN for radiographic knee OA. A logistic regression analysis was used to determine the predictors of the scoring system. The ANN was constructed using 1777 participants and validated internally on 888 participants in the KNHANES V-1. The predictors of the scoring system were selected as the inputs of the ANN. External validation was performed using 4731 participants in the Osteoarthritis Initiative (OAI). Area under the curve (AUC) of the receiver operating characteristic was calculated to compare the prediction models.

Results

The scoring system and ANN were built using the independent predictors including sex, age, body mass index, educational status, hypertension, moderate physical activity, and knee pain. In the internal validation, both scoring system and ANN predicted radiographic knee OA (AUC 0.73 versus 0.81, p<0.001) and symptomatic knee OA (AUC 0.88 versus 0.94, p<0.001) with good discriminative ability. In the external validation, both scoring system and ANN showed lower discriminative ability in predicting radiographic knee OA (AUC 0.62 versus 0.67, p<0.001) and symptomatic knee OA (AUC 0.70 versus 0.76, p<0.001).

Conclusions

The self-assessment scoring system may be useful for identifying the adults at high risk for knee OA. The performance of the scoring system is improved significantly by the ANN. We provided an ANN calculator to simply predict the knee OA risk.
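
The models in this abstract are compared by the area under the ROC curve. As a reminder of what that number measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the toy scores are illustrative assumptions, not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels use 1 for a case (e.g. knee OA present) and 0 for a non-case.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical risk scores for four participants.
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The pairwise form is quadratic in the sample size; it is meant for clarity, and a rank-based computation would be preferable for cohorts of the size reported above.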

3.
This paper describes a method for growing a recurrent neural network of fuzzy threshold units for the classification of feature vectors. Fuzzy networks seem natural for performing classification, since classification is concerned with set membership and objects generally belong to sets to various degrees. A fuzzy unit in the architecture proposed here determines the degree to which the input vector lies in the fuzzy set associated with the fuzzy unit. This is in contrast to perceptrons, which determine the correlation between the input vector and a weighting vector. The resulting membership value, in the case of the fuzzy unit, is compared with a threshold, which is interpreted as a membership value. Training of a fuzzy unit is based on an algorithm for linear inequalities similar to Ho-Kashyap recording. These fuzzy threshold units are fully connected in a recurrent network. The network grows as it is trained. The advantages of the network and its training method are: (1) the network is allowed to grow to the required size, which is generally much smaller than the size of the network that would be obtained otherwise, implying better generalization, smaller storage requirements and fewer calculations during classification; (2) the training time is extremely short; (3) recurrent networks such as this one are generally readily implemented in hardware; (4) classification accuracy obtained on several standard data sets is better than that obtained by the majority of other standard methods; and (5) the use of fuzzy logic is very intuitive, since class membership is generally fuzzy.

4.
With the continuous deepening of Artificial Neural Network (ANN) research, ANN model structure and function are improving towards diversification and intelligence...

5.
Gene-gene interactions may play an important role in the genetics of a complex disease. Detection and characterization of gene-gene interactions is a challenging issue that has stimulated the development of various statistical methods to address it. In this study, we introduce a method to measure gene interactions using entropy-based statistics from a contingency table of trait and genotype combinations. We also developed an exploration procedure using graphs. We propose a standardized relative information gain (RIG) measure to evaluate the interactions between single nucleotide polymorphism (SNP) combinations. To identify the kth-order interactions, contingency tables of trait and genotype combinations of k SNPs are constructed, from which RIGs are calculated. The RIGs are standardized using the mean and standard deviation from the permuted datasets. SNP combinations yielding high standardized RIG are chosen for gene-gene interactions. Standardized RIG makes it possible to detect high-order interactions and to compare interaction strengths between different orders. We have applied the proposed standardized entropy-based method to two types of data sets, from a simulation study and from a real genetic association study. We have compared our method and the multifactor dimensionality reduction (MDR) method through power analysis of eight different genetic models with varying penetrance rates, numbers of SNPs, and sample sizes. Our method successfully identifies genetic associations and gene-gene interactions in both simulated and real genetic data. Simulation results suggest that the proposed entropy-based method is better able to detect high-order interactions and is superior to the MDR method in most cases. The proposed method is well suited for detecting interactions without main effects as well as for models including main effects.
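
The abstract does not spell out its formulas, so the following is only a sketch of how a permutation-standardized relative information gain could be computed. It assumes RIG is defined as (H(trait) − H(trait | genotype combination)) / H(trait), that the trait labels are permuted while genotypes stay fixed, and that the sample standard deviation is used; the function names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())


def relative_information_gain(trait, genotypes):
    """Share of trait entropy removed by knowing the genotype combination."""
    h_trait = entropy(trait)
    if h_trait == 0:
        return 0.0  # a constant trait carries no information to gain
    groups = {}
    for t, g in zip(trait, genotypes):
        groups.setdefault(g, []).append(t)
    n = len(trait)
    h_conditional = sum(len(ts) / n * entropy(ts) for ts in groups.values())
    return (h_trait - h_conditional) / h_trait


def standardized_rig(trait, genotypes, n_perm=200, seed=0):
    """Observed RIG expressed in standard deviations of a permutation null."""
    rng = random.Random(seed)
    observed = relative_information_gain(trait, genotypes)
    shuffled = list(trait)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any trait-genotype link
        null.append(relative_information_gain(shuffled, genotypes))
    mean = sum(null) / n_perm
    sd = math.sqrt(sum((v - mean) ** 2 for v in null) / (n_perm - 1))
    return (observed - mean) / sd if sd > 0 else float("nan")


if __name__ == "__main__":
    # Toy pure-interaction model: the trait is the XOR of two binary SNPs.
    pairs = [(0, 0), (0, 1), (1, 0), (1, 1)] * 10
    trait = [a ^ b for a, b in pairs]
    print(relative_information_gain(trait, [a for a, _ in pairs]))  # one SNP alone
    print(relative_information_gain(trait, pairs))                  # both SNPs jointly
    print(standardized_rig(trait, pairs))
```

The toy XOR model has no main effects, which is the situation the abstract highlights: each SNP alone gains nothing, while the pair explains the trait completely.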

6.
7.
There are a large number of tomato cultivars with a wide range of morphological, chemical, nutritional and sensorial characteristics. Many factors are known to affect the nutrient content of tomato cultivars. A complete understanding of the effect of these factors would require an exhaustive experimental design, a multidisciplinary scientific approach and a suitable statistical method. Some multivariate analytical techniques, such as Principal Component Analysis (PCA) or Factor Analysis (FA), have been widely applied to search for patterns in the behaviour of a data set and to reduce its dimensionality with a new set of uncorrelated latent variables. However, in some cases it is not useful to replace the original variables with these latent variables. In this study, the Automatic Interaction Detection (AID) algorithm and Artificial Neural Network (ANN) models were applied as alternatives to PCA, FA and other multivariate analytical techniques in order to identify the relevant phytochemical constituents for characterization and authentication of tomatoes. To test the feasibility of the AID algorithm and ANN models for this purpose, both methods were applied to a data set with 25 chemical parameters analysed in 167 tomato samples from Tenerife (Spain). Each tomato sample was defined by three factors: cultivar, agricultural practice and harvest date. The tree structure of the General Linear Model linked to AID (GLM-AID) was organized into three levels according to the number of factors. p-Coumaric acid was the compound that allowed the tomato samples to be distinguished according to the day of harvest. More than one chemical parameter was necessary to distinguish among different agricultural practices and among the tomato cultivars. Several ANN models, with 25 and 10 input variables, were developed for the prediction of cultivar, agricultural practice and harvest date. Finally, the models with 10 input variables were chosen, with goodness of fit between 44 and 100%. The lowest fits were for the cultivar classification; this low percentage suggests that other kinds of chemical parameters should be used to identify tomato cultivars.

8.
In a natural setting, speech is often accompanied by gestures. Like language, speech-accompanying iconic gestures convey semantic information to some extent. However, whether comprehension of the information contained in the auditory and visual modalities depends on the same or different brain networks is largely unknown. In this fMRI study, we aimed at identifying the cortical areas engaged in supramodal processing of semantic information. BOLD changes were recorded in 18 healthy right-handed male subjects watching video clips showing an actor who either performed speech (S, acoustic) or gestures (G, visual) in more (+) or less (−) meaningful varieties. In the experimental conditions familiar speech or isolated iconic gestures were presented; during the visual control condition the volunteers watched meaningless gestures (G−), while during the acoustic control condition a foreign language was presented (S−). The conjunction of the visual and acoustic semantic processing revealed activations extending from the left inferior frontal gyrus to the precentral gyrus, and included bilateral posterior temporal regions. We conclude that proclaiming this frontotemporal network the brain's core language system is to take too narrow a view. Our results rather indicate that these regions constitute a supramodal semantic processing network.

9.
Identifying diagnostic biomarkers based on genomic features for accurate disease classification is a problem of great importance for both basic medical research and clinical practice. In this paper, we introduce quantitative network measures as structural biomarkers and investigate their ability to classify disease states inferred from prostate cancer gene expression data. We demonstrate the utility of our approach by using eigenvalue- and entropy-based graph invariants and compare the results with a conventional biomarker analysis of the underlying gene expression data.

10.
Protein expression and post-translational modification levels are tightly regulated in neoplastic cells to maintain cellular processes known as 'cancer hallmarks'. The first Pan-Cancer initiative of The Cancer Genome Atlas (TCGA) Research Network has aggregated protein expression profiles for 3,467 patient samples from 11 tumor types using the antibody-based reverse phase protein array (RPPA) technology. The resultant proteomic data can be utilized to computationally infer protein-protein interaction (PPI) networks and to study the commonalities and differences across tumor types. In this study, we compare the performance of 13 established network inference methods in their capacity to retrieve the curated Pathway Commons interactions from RPPA data. We observe that no single method has the best performance in all tumor types, but a group of six methods, including diverse techniques such as correlation, mutual information, and regression, consistently rank highly among the tested methods. We utilize the high-performing methods to obtain a consensus network and identify four robust and densely connected modules that reveal biological processes as well as suggest antibody-related technical biases. Mapping the consensus network interactions to Reactome gene lists confirms the pan-cancer importance of signal transduction pathways, innate and adaptive immune signaling, cell cycle, metabolism, and DNA repair, and also suggests several biological processes that may be specific to a subset of tumor types. Our results illustrate the utility of the RPPA platform as a tool to study proteomic networks in cancer.

11.
Relationships between environmental variables and diversity (Shannon-Weaver index) of the fish communities in the Tagus estuary and adjacent coastal areas were analyzed. The focus was on the linearity or nonlinearity of these abiotic/biotic relationships, with the aim of obtaining an accurate short- to medium-term prediction of diversity from habitat variables alone. Multiple Linear Regressions (MLR) were used for the linear approach and Artificial Neural Networks (ANNs) for the nonlinear approach. MLR results in the external validation phase indicated a lack of model accuracy (R² = 0.0710; %SEP = 47.5868; E = −0.0217; ARV = 1.0217; N = 43). Results of the best of the Artificial Neural Networks used in this study (12-15-15-1 architecture) in the external validation phase (ANN: R² = 0.9736; %SEP = 7.8499; E = 0.9722; ARV = 0.0278; N = 43) were more accurate than those obtained with MLR. This indicates a clear nonlinear relationship between variables. In the best ANN model, nitrate concentration, depth, dissolved oxygen and temperature were the most important predictors of fish diversity in the Tagus estuary. The sensitivity analysis indicated that the remaining variables (silicate, nitrite, transparency, salinity, slope, phosphate, water particulate organic matter, and chlorophyll a) played lesser roles in the model.
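
The four validation statistics quoted above are not defined in the abstract. The sketch below uses the definitions that are common in this kind of model comparison, and they should be read as assumptions: R² as the squared Pearson correlation between observed and predicted values, %SEP as the root mean squared error expressed as a percentage of the observed mean, E as the coefficient of efficiency, and ARV as the average relative variance, so that ARV = 1 − E. That last identity does match the reported pairs (E = −0.0217 with ARV = 1.0217, and E = 0.9722 with ARV = 0.0278). The function name and sample values are illustrative.

```python
import math


def validation_metrics(observed, predicted):
    """R2, %SEP, E and ARV for paired observed and predicted values.

    Assumes the observations are not all equal, the predictions are not all
    equal, and the observed mean is non-zero.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # residual sum of squares
    sst = sum((o - mean_obs) ** 2 for o in observed)              # spread of observations
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return {
        "R2": cov ** 2 / (sst * var_pred),
        "%SEP": 100.0 / mean_obs * math.sqrt(sse / n),
        "E": 1.0 - sse / sst,
        "ARV": sse / sst,
    }


if __name__ == "__main__":
    # Hypothetical diversity values: every prediction is biased upward by 0.5.
    print(validation_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5]))
```

The toy example also shows why R² alone can mislead: a constant bias leaves the correlation perfect while E and %SEP register the error.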

12.
Rahman ME, Islam R, Islam S, Mondal SI, Amin MR. Genomics, 2012, 99(4):189-194
MicroRNA (miRNA) is a special class of short noncoding RNA that serves the pivotal function of regulating gene expression. The computational prediction of new miRNA candidates involves various methods, such as learning methods and methods using expression data. This article proposes a reliable model, miRANN, which is a supervised machine learning approach. MiRANN uses known pre-miRNAs as the positive set and a novel negative set from human CDS regions. The set of known miRNAs is now large and diverse enough to cover almost all characteristics of unknown miRNAs, which increases the quality of the result (99.9% accuracy, 99.8% sensitivity, 100% specificity) and provides a more reliable prediction. MiRANN performs better than other state-of-the-art approaches and is put forward as the most promising tool for predicting novel miRNAs. We have also tested our result using a previous negative set. MiRANN opens new ground in using ANNs to predict pre-miRNAs, with a promise of better performance.

13.
In genome-wide association data analysis, two genes in any pathway, or two SNPs in two linked gene regions or in two linked exons within one gene, are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to effects due not only to the traditional interaction under a nearly independent condition but also to the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic performs better than the single SNP-based logistic model, the PCA-based logistic model, and other gene-based methods.

14.
1 Introduction. The three-dimensional (3D) structure of a protein is perhaps the most important of all its features, since it determines completely how the protein functions and interacts with other molecules. Most biological mechanisms at the protein level are based on shape-complementarity, so that proteins present particular concavities and convexities that allow them to bind to each other and form complex structures, and tendon. For this reason, for instance, the drug design problem consists primarily in th…

15.

Background

Arginine vasopressin (AVP) plays a role in social behavior, through receptor AVPR1A. The promoter polymorphism AVPR1A RS3 has been associated with human social behaviors, and with acute response to stress. Here, the relationships between AVPR1A RS3, early-life stressors, and social interaction in adulthood were explored.

Methods

Adult individuals from a Swedish population-based cohort (n = 1871) were assessed for self-reported availability of social integration and social attachment and for experience of childhood adversities. Their DNA samples were genotyped for the microsatellite AVPR1A RS3.

Results

Among males, those homozygous for the long alleles of AVPR1A RS3 were particularly vulnerable to the effect of childhood adversity on social attachment in adulthood. A similar vulnerability to childhood adversity among long allele carriers was found for social integration in adulthood, but here both males and females were affected.

Limitation

Data were self-reported and childhood adversity data were retrospective.

Conclusions

Early-life stress influenced the relationship between AVPR1A genetic variants and social interaction. For social attachment, AVPR1A was of importance in males only. The findings add to previous reports on higher acute vulnerability to stress in persons with long AVPR1A RS3 alleles and increased AVP levels.

16.
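To explore the feasibility of artificial neural networks (ANN) for insect classification, this paper proposes improving the ANN by combining principal component analysis with mathematical modeling, and validates the approach on samples of six moth species from the family Noctuidae (Lepidoptera). First, the feature extraction software Bugshape1.0 was used to obtain data on 13 mathematical morphological features from 180 right forewing samples of the six moth species. Principal component analysis was then applied to recombine the morphological feature variables of the moth wings into new composite variables, and a BP neural network classifier was built on the results. The principal component analysis showed that the first five principal components had a cumulative contribution rate of 85.52%, essentially capturing the information carried by all the feature variables. On this basis, a three-layer BP neural network classifier was built with 5 input-layer nodes, 12 hidden-layer nodes and 1 output-layer node. A total of 120 sets of feature data (20 samples per species) were used to train and simulate the classifier, and the remaining 60 sets were used to validate it. The correlation coefficient between the simulated outputs and the target values was R = 0.997, and the classification accuracy reached 93.33%. Compared with a classifier built with a BP neural network alone, without principal component analysis, the PCA-based BP neural network classifier showed better performance and more accurate classification. The results indicate that the proposed method classifies and discriminates well, providing a feasible method for identifying moth species.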
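
The step that fixes the number of network inputs here is the cumulative contribution rate of the leading principal components. A minimal sketch of that selection rule follows, assuming the eigenvalues of the feature covariance (or correlation) matrix are already available; the function name `components_needed`, the 85% threshold and the eigenvalues in the example are hypothetical and are not taken from the study.

```python
def components_needed(eigenvalues, threshold=0.85):
    """Smallest number of leading components whose cumulative share of the
    total variance reaches the threshold, together with that share."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k, running / total
    return len(ordered), 1.0


if __name__ == "__main__":
    # Made-up eigenvalues for six features, for illustration only.
    k, share = components_needed([5.0, 3.0, 2.0, 1.0, 1.0, 1.0], threshold=0.8)
    print(k, round(share, 4))
```

The retained component scores, rather than the original features, would then serve as the classifier inputs, which is how the abstract arrives at a network with five input nodes.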

17.
The fidelity of the folding pathways being encoded in the amino acid sequence is met with challenge in instances where proteins with no sequence homology, performing different functions and no apparent evolutionary linkage, adopt a similar fold. The problem stated otherwise is that a limited fold space is available to a repertoire of diverse sequences. The key question is what factors lead to the formation of a fold from diverse sequences. Here, with the NAD(P)-binding Rossmann fold domains as a case study and using the concepts of network theory, we have unveiled the consensus structural features that drive the formation of this fold. We have proposed a graph theoretic formalism to capture the structural details in terms of the conserved atomic interactions in global milieu, and hence extract the essential topological features from diverse sequences. A unified mathematical representation of the different structures together with a judicious concoction of several network parameters enabled us to probe into the structural features driving the adoption of the NAD(P)-binding Rossmann fold. The atomic interactions at key positions seem to be better conserved in proteins, as compared to the residues participating in these interactions. We propose a “spatial motif” and several “fold specific hot spots” that form the signature structural blueprints of the NAD(P)-binding Rossmann fold domain. Excellent agreement of our data with previous experimental and theoretical studies validates the robustness and validity of the approach. Additionally, comparison of our results with statistical coupling analysis (SCA) provides further support. The methodology proposed here is general and can be applied to similar problems of interest.

18.
Biomarker analysis and evaluation in oncology is the product of a number of processes (including managerial, technical and interpretation steps) which need to be monitored and controlled to prevent and correct errors and guarantee a satisfactory level of quality. Several biomarkers have recently moved to clinical validation studies and subsequently to clinical practice without any definition of the standard procedures and/or quality control (QC) schemes necessary to guarantee the reproducibility of the laboratory information. In Italy several national scientific societies and individual researchers have activated specific external quality assessment protocols, often at a pilot level, thereby potentially jeopardizing the clinical reality even further. In view of the seriousness of the problem, in 1998 the Italian Ministry of Health sponsored a National Survey Project to coordinate and standardize the procedures and to develop QC programs for the analysis of cancer biomarkers of potential clinical relevance. Twelve QC programs focused on biomarkers and concerning morphological, immunohistochemical, biochemical, molecular, and immunoenzymatic assays were coordinated and implemented. Specifically, external QC programs for the analytical phase of immunohistochemical p53, Bcl-2, c-erb-2/neu/HER2, and microvessel density determination, of morphological evaluation of tumor differentiation grade, and of molecular p53 analysis were activated for the first time within the project. Several hundred Italian laboratories took part in these QC programs, the results of which are available on the web site of the Network (www.cqlaboncologico.it). Financial support from the Italian Government and the National Research Council (CNR) will guarantee the pursuit of activities that will be extended to new biomarkers, to preanalytical phases of the assays, and to revision of the criteria of clinical usefulness for evaluating the cost/benefit ratio.

19.
It is well accepted that the brain's computation relies on the spatiotemporal activity of neural networks. In particular, there is growing evidence of the importance of continuously and precisely timed spiking activity. Therefore, it is important to characterize memory states in terms of spike-timing patterns that give both reliable memory of firing activities and precise memory of firing timings. The relationship between memory states and spike-timing patterns has been studied empirically with large-scale recording of neuron populations in recent years. Here, by using a recurrent neural network model with dynamics at two time scales, we construct a dynamical memory network model which embeds both fast neural and synaptic variation and slow learning dynamics. A state vector is proposed to describe memory states in terms of spike-timing patterns of the neural population, and a distance measure on state vectors is defined to study several important phenomena of memory dynamics: partial memory recall, learning efficiency, and learning with correlated stimuli. We show that the distance measure can capture the timing difference of memory states. In addition, we examine the influence of network topology on learning ability, and show that local connections can increase the network's ability to embed more memory states. Together these results suggest that the proposed system based on spike-timing patterns gives a productive model for the study of detailed learning and memory dynamics.

20.
Genes, environment, and the interaction between them are each known to play an important role in the risk of developing complex diseases such as metabolic syndrome. For environmental factors, most studies have focused on measurements observed at the individual level, and can therefore only consider the gene-environment interaction at the same individual scale. However, group-level (contextual) environmental variables, such as community factors and the degree of local area development, may modify the genetic effect as well. To examine such cross-level interaction between genes and contextual factors, a flexible statistical model quantifying the variability of the genetic effects across different categories of the contextual variable is needed. With a Bayesian generalized linear mixed-effects model with an unconditional likelihood, we investigate whether the individual genetic effect is modified by the group-level residential environment factor in a matched case-control metabolic syndrome study. Such cross-level interaction is evaluated by examining the heterogeneity in allelic effects under various contextual categories, based on posterior samples from Markov chain Monte Carlo methods. The Bayesian analysis indicates that the effect of rs1801282 on metabolic syndrome development is modified by the contextual environmental factor. That is, even among individuals with the same genetic component of PPARG_Pro12Ala, living in a residential area with low availability of exercise facilities may result in higher risk. The modification of individual genetic attributes by group-level environmental factors can be essential, and this Bayesian model is able to provide a quantitative assessment of such cross-level interaction. The Bayesian inference based on the full likelihood is flexible with any phenotype and easy to implement computationally. This model has wide applicability and may help unravel the complexity in the development of complex diseases.
