Similar literature
 20 similar articles found (search time: 31 ms)
1.
A relationship has been proposed to exist between individual outcomes (live or stillbirth) of twins in the same set. Here, we analyze this association between live births and stillbirths among individuals in different twin pairs. When national birth registers are analyzed, individuals in opposite-sex twin sets can be identified and the correlation between individual outcomes estimated. However, full information about the individuals in same-sex twin sets is not, as a rule, available, and consequently, correlation coefficients cannot be estimated, but upper and lower limits of the correlation coefficients can be obtained. The methods introduced here were applied to data from Sweden (1869-1967), the Åland Islands (Finland) (1750-1949), the Kingdom of Saxony (1881-1900), and England and Wales (1940-2003). Comparisons between the correlation coefficients among opposite-sex twins and the lower bound (minimum) of correlation coefficients among same-sex twins indicate that in all populations studied a stronger association exists between twins in same-sex rather than opposite-sex twin sets or pairs. For opposite-sex twin sets no general association between the correlation coefficient and the stillbirth rate was identified.

2.
T Huang  M Jiang  X Kong  YD Cai 《PloS one》2012,7(8):e43441
Integrating high-throughput data obtained from different molecular levels is essential for understanding the mechanisms of complex diseases such as cancer. In this study, we integrated the methylation, microRNA and mRNA data from lung cancer tissues and normal lung tissues using functional gene sets. For each Gene Ontology (GO) term, three sets were defined: the methylation set, the microRNA set and the mRNA set. The discriminating ability of each gene set was represented by the Matthews correlation coefficient (MCC), as evaluated by leave-one-out cross-validation (LOOCV). Next, the MCCs in the methylation sets, the microRNA sets and the mRNA sets were ranked. By comparing the MCC ranks of methylation, microRNA and mRNA for each GO term, we classified the GO sets into six groups and identified the dysfunctional methylation, microRNA and mRNA gene sets in lung cancer. Our results provide a systematic view of the functional alterations during tumorigenesis that may help to elucidate the mechanisms of lung cancer and lead to improved treatments for patients.
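As a rough illustration of the scoring step described in this abstract, the sketch below computes the Matthews correlation coefficient from confusion-matrix counts and accumulates those counts over a leave-one-out loop. The callback `predict_left_out` is a hypothetical stand-in for whatever classifier is trained on the remaining samples; the abstract does not say which classifier the authors used.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from confusion-matrix counts; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def loocv_mcc(labels, predict_left_out) -> float:
    """Leave-one-out evaluation.

    labels: list of 0/1 class labels (e.g. 1 = tumor, 0 = normal).
    predict_left_out(i): hypothetical callback that trains on every sample
    except i and returns the predicted 0/1 label for sample i.
    """
    tp = tn = fp = fn = 0
    for i, truth in enumerate(labels):
        pred = predict_left_out(i)
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        elif pred == 1 and truth == 0:
            fp += 1
        else:
            fn += 1
    return matthews_corrcoef(tp, tn, fp, fn)
```

Returning 0.0 for a zero denominator is a common convention rather than something stated in the abstract.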

3.
Two genes are said to be coexpressed if their expression levels have a similar spatial or temporal pattern. Ever since the profiling of gene microarrays has been in progress, computational modeling of coexpression has acquired a major focus. As a result, several similarity/distance measures have evolved over time to quantify coexpression similarity/dissimilarity between gene pairs. Of these, the correlation coefficient has been established to be a suitable quantifier of pairwise coexpression. In general, the correlation coefficient captures linear dependence well, but not nonlinear dependence. In spite of this drawback, it outperforms many other existing measures in modeling the dependency in biological data. In this paper, for the first time, we point out a significant weakness of the existing similarity/distance measures, including the standard correlation coefficient, in modeling pairwise coexpression of genes. A novel measure, called BioSim, which assumes values between -1 and +1 corresponding to negative and positive dependency and 0 for independency, is introduced. The computation of BioSim is based on the aggregation of stepwise relative angular deviation of the expression vectors considered. The proposed measure is analytically suitable for modeling coexpression as it accounts for the features of expression similarity, expression deviation and also the relative dependence. It is demonstrated how the proposed measure is better able to capture the degree of coexpression between a pair of genes as compared to several other existing ones. The efficacy of the measure is statistically analyzed by integrating it with several module-finding algorithms based on coexpression values and then applying it to synthetic and biological data. The annotation results of the coexpressed genes as obtained from gene ontology establish the significance of the introduced measure.
By further extending the BioSim measure, it has been shown that one can effectively identify the variability in the expression patterns over multiple phenotypes. We have also extended BioSim to identify pairwise differential expression patterns and coexpression dynamics. The significance of these studies is shown based on the analysis over several real-life data sets. The computation of the measure by focusing on stepwise time points also makes it effective in identifying partially coexpressed genes. On the whole, we put forward a complete framework for coexpression analysis based on the BioSim measure.

4.
Mouse models are often used to study human genes because it is believed that the expression and function are similar for the majority of orthologous genes between the two species. However, recent comparisons of microarray data from thousands of orthologous human and mouse genes suggested rapid evolution of gene expression profiles under minimal or no selective constraint. These findings appear to contradict non-array-based observations from many individual genes and imply the uselessness of mouse models for studying human genes. Because absolute levels of gene expression are not comparable between species when the data are generated by species-specific microarrays, use of relative mRNA abundance among tissues (RA) is preferred to that of absolute expression signals. We thus reanalyze human and mouse genome-wide gene expression data generated by oligonucleotide microarrays. We show that the mean correlation coefficient among expression profiles detected by different probe sets of the same gene is only 0.38 for humans and 0.28 for mice, indicating that current measures of expression divergence are flawed because the large estimation error (discrepancy in expression signal detected by different probe sets of the same gene) is mistakenly included in the between-species divergence. When this error is subtracted, 84% of human-mouse orthologous gene pairs show significantly lower expression divergence than that of random gene pairs. In contrast to a previous finding, but consistent with common sense, expression profiles of orthologous tissues between species are more similar to each other than to those of nonorthologous tissues. Furthermore, the evolutionary rate of expression divergence and that of coding sequence divergence are found to be weakly, but significantly, positively correlated when RA and the Euclidean distance are used to measure expression-profile divergence.
These results highlight the importance of proper consideration of various estimation errors in comparing the microarray data between species.
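The abstract above measures expression-profile divergence with relative mRNA abundance among tissues (RA) and the Euclidean distance. A minimal sketch follows, assuming that RA is each tissue's signal divided by the gene's total signal over all tissues; that definition is an assumption made here for illustration, and the function names are hypothetical.

```python
import math


def relative_abundance(signals):
    """Assumed definition of RA: each tissue's signal as a fraction of the gene's total."""
    total = sum(signals)
    if total <= 0:
        raise ValueError("expression signals must have a positive sum")
    return [s / total for s in signals]


def expression_divergence(profile_a, profile_b):
    """Euclidean distance between the RA profiles of two orthologous genes.

    profile_a, profile_b: signals for the same tissues, in the same order.
    """
    ra_a = relative_abundance(profile_a)
    ra_b = relative_abundance(profile_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ra_a, ra_b)))
```

Because RA rescales each profile to sum to one, two genes whose absolute signals differ only by a constant factor (as can happen between species-specific arrays) get a divergence of zero.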

5.
MicroRNAs negatively regulate the accumulation of mRNAs; therefore, when they are expressed in the same cells, their expression profiles show an inverse correlation. We previously described one positively correlated miRNA/target pair, but it is not known how widespread this phenomenon is. Here, we investigated the correlation between the expression profiles of differentially expressed miRNAs and their targets during tomato fruit development using deep sequencing, Northern blot and RT-qPCR. We found an equal number of positively and negatively correlated miRNA/target pairs, indicating that positive correlation is more frequent than previously thought. We also found that the correlation between microRNA and target expression profiles can vary between mRNAs belonging to the same gene family and even for the same target mRNA at different developmental stages. Since microRNAs always negatively regulate their targets, the high number of positively correlated microRNA/target pairs suggests that mutual exclusion could be as widespread as temporal regulation. The change of correlation during development suggests that the type of regulatory circuit directed by a microRNA can change over time and can be different for individual gene family members. Our results also highlight potential problems for expression profiling-based microRNA target identification/validation.

6.
Evaluating the importance of higher-order correlations of neural spike counts has been notoriously hard. A large number of samples are typically required in order to estimate higher-order correlations and resulting information theoretic quantities. In typical electrophysiology data sets with many experimental conditions, however, the number of samples in each condition is rather small. Here we describe a method that allows evidence for higher-order correlations to be quantified in exactly these cases. We construct a family of reference distributions: maximum entropy distributions, which are constrained only by marginals and by linear correlations as quantified by the Pearson correlation coefficient. We devise a Monte Carlo goodness-of-fit test, which tests, for a given divergence measure of interest, whether the experimental data lead to the rejection of the null hypothesis that it was generated by one of the reference distributions. Applying our test to artificial data shows that the effects of higher-order correlations on these divergence measures can be detected even when the number of samples is small. Subsequently, we apply our method to spike count data which were recorded with multielectrode arrays from the primary visual cortex of anesthetized cat during an adaptation experiment. Using mutual information as a divergence measure we find that there are spike count bin sizes at which the maximum entropy hypothesis can be rejected for a substantial number of neuronal pairs. These results demonstrate that higher-order correlations can matter when estimating information theoretic quantities in V1. They also show that our test is able to detect their presence in typical in-vivo data sets, where the number of samples is too small to estimate higher-order correlations directly.
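The general shape of a Monte Carlo goodness-of-fit test like the one named in this abstract can be sketched as follows. This is a generic sketch under simplifying assumptions, not the authors' procedure: it uses a single reference distribution rather than a family, and `sample_reference` and `statistic` are hypothetical callbacks standing in for the maximum entropy sampler and the divergence measure, neither of which is specified in the abstract.

```python
import random


def monte_carlo_p_value(observed, simulated):
    """Fraction of statistics at least as extreme as the observed one.

    The +1 terms count the observed data as one draw under the null,
    so the p-value can never be exactly zero.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (1 + exceed) / (1 + len(simulated))


def goodness_of_fit(observed_statistic, sample_reference, statistic,
                    n_sim=999, seed=0):
    """Monte Carlo test against one reference distribution.

    sample_reference(rng): hypothetical sampler that draws one surrogate
        data set from the reference (null) distribution.
    statistic(data): hypothetical function returning the divergence of a data set.
    """
    rng = random.Random(seed)
    simulated = [statistic(sample_reference(rng)) for _ in range(n_sim)]
    return monte_carlo_p_value(observed_statistic, simulated)
```

A small p-value means the observed statistic is rarely matched by data drawn from the reference distribution, which is the sense in which the null hypothesis is rejected.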

7.
8.
Analysis of genetic interaction networks often involves identifying genes with similar profiles, which is typically indicative of a common function. While several profile similarity measures have been applied in this context, they have never been systematically benchmarked. We compared a diverse set of correlation measures, including measures commonly used by the genetic interaction community as well as several other candidate measures, by assessing their utility in extracting functional information from genetic interaction data. We find that the dot product, one of the simplest vector operations, outperforms most other measures over a large range of gene pairs. More generally, linear similarity measures such as the dot product, Pearson correlation or cosine similarity perform better than set overlap measures such as Jaccard coefficient. Similarity measures that involve L2-normalization of the profiles tend to perform better for the top-most similar pairs but perform less favorably when a larger set of gene pairs is considered or when the genetic interaction data is thresholded. Such measures are also less robust to the presence of noise and batch effects in the genetic interaction data. Overall, the dot product measure performs consistently among the best measures under a variety of different conditions and genetic interaction datasets.
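For readers comparing the measures named in this abstract, here is a minimal sketch of the four of them on a pair of profiles. The thresholding rule in `jaccard_coefficient` (an interaction counts when its absolute score reaches `cutoff`) is an assumption for illustration; the abstract does not state how profiles were converted to sets.

```python
import math


def dot_product(x, y):
    """Unnormalized similarity: grows with both agreement and profile magnitude."""
    return sum(a * b for a, b in zip(x, y))


def cosine_similarity(x, y):
    """Dot product of the L2-normalized profiles; 0.0 if either profile is all zeros."""
    norm = math.sqrt(dot_product(x, x) * dot_product(y, y))
    return dot_product(x, y) / norm if norm > 0 else 0.0


def pearson_correlation(x, y):
    """Cosine similarity of the mean-centered profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])


def jaccard_coefficient(x, y, cutoff):
    """Set overlap of thresholded profiles (assumed rule: |score| >= cutoff)."""
    hits_x = {i for i, v in enumerate(x) if abs(v) >= cutoff}
    hits_y = {i for i, v in enumerate(y) if abs(v) >= cutoff}
    union = hits_x | hits_y
    return len(hits_x & hits_y) / len(union) if union else 0.0
```

The dot product is the only one of the four that is not rescaled by profile length or set size, which is the distinction the abstract draws between it and the L2-normalized measures.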

9.
Comparing the gene-expression profiles of sick and healthy individuals can help in understanding disease. Such differential expression analysis is a well-established way to find gene sets whose expression is altered in the disease. Recent approaches to gene-expression analysis go a step further and seek differential co-expression patterns, wherein the level of co-expression of a set of genes differs markedly between disease and control samples. Such patterns can arise from a disease-related change in the regulatory mechanism governing that set of genes, and pinpoint dysfunctional regulatory networks. Here we present DICER, a new method for detecting differentially co-expressed gene sets using a novel probabilistic score for differential correlation. DICER goes beyond standard differential co-expression and detects pairs of modules showing differential co-expression. The expression profiles of genes within each module of the pair are correlated across all samples. The correlation between the two modules, however, differs markedly between the disease and normal samples. We show that DICER outperforms the state of the art in terms of significance and interpretability of the detected gene sets. Moreover, the gene sets discovered by DICER manifest regulation by disease-specific microRNA families. In a case study on Alzheimer's disease, DICER dissected biological processes and protein complexes into functional subunits that are differentially co-expressed, thereby revealing inner structures in disease regulatory networks.

10.
We address possible limitations of publicly available data sets of yeast gene expression. We study the predictability of known regulators via time-series analysis, and show that less than 20% of known regulatory pairs exhibit strong correlations in the Cho/Spellman data sets. By analyzing known regulatory relationships, we designed an edge detection function which identified candidate regulations with greater fidelity than standard correlation methods. We develop general methods for integrated analysis of coarse time-series data sets. These include 1) methods for automated period detection in a predominantly cycling data set and 2) phase detection between phase-shifted cyclic data sets. We show how to properly correct for the problem of comparing correlation coefficients between pairs of sequences of different lengths and small alphabets. Finally, we note that the correlation coefficient of sequences over alphabets of size two can exhibit very counterintuitive behavior when compared with the Hamming distance.
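The closing remark of this abstract, about sequences over an alphabet of size two, can be illustrated with a small worked example. The sketch below is an illustration constructed here, not taken from the paper: it computes the Hamming distance and the Pearson correlation of two 0/1 sequences, the latter from the 2×2 table of symbol pairs (for binary data this is the phi coefficient).

```python
import math


def hamming_distance(x, y):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for a, b in zip(x, y) if a != b)


def binary_correlation(x, y):
    """Pearson correlation of two 0/1 sequences via the 2x2 table (phi coefficient)."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (n11 * n00 - n10 * n01) / denom


# Two sequences that agree in 8 of 10 positions.
x = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
y = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
distance = hamming_distance(x, y)
correlation = binary_correlation(x, y)
```

Here the sequences differ in only two positions, yet the correlation is negative (−1/9), because the single 0 in each sequence never coincides with the 0 in the other.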

11.
12.
Correlation analysis of gene expression within regulatory pathways   Total citations: 1, self-citations: 1, citations by others: 0
李传星  李霞  郭政  宫滨生  屠康 《遗传》2004,26(6):929-933
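This study analyzes the relationship between gene function and gene expression from the perspective of gene expression regulatory pathways. Using seven Saccharomyces cerevisiae microarray expression-profile data sets and information from pathway databases (KEGG and CYGD), we applied the Genehub software that we developed to analyze the correlation, at the mRNA expression level, among genes within the same regulatory pathway, covering 16 pathways and 495 genes. Using two similarity measures, the Pearson correlation coefficient and the Spearman correlation coefficient, we found that genes in 94% (15) of the regulatory pathways were coexpressed in at least four of the expression-profile data sets. These results confirm, from the perspective of gene expression regulatory pathways, that gene function and gene expression are correlated to some degree.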
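The two similarity measures used in this study, the Pearson and Spearman correlation coefficients, can be sketched in plain Python as below. This is an illustrative sketch only and says nothing about how the Genehub software computes them; ties are handled with average ranks, which is the usual convention and an assumption here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed profiles."""
    return pearson(average_ranks(x), average_ranks(y))
```

For a monotonic but nonlinear relationship, such as `[1, 2, 3, 4]` against `[1, 4, 9, 16]`, the Spearman coefficient is 1 while the Pearson coefficient falls slightly below 1, which is one reason to report both.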

13.
14.
Among the several linkage disequilibrium measures known to capture different features of the non-independence between alleles at different loci, the most commonly used for diallelic loci is the $r^2$ measure. In the present study, we tackled the problem of the bias of the $r^2$ estimate, which results from the sample structure and/or the relatedness between genotyped individuals. We derived two novel linkage disequilibrium measures for diallelic loci that are both extensions of the usual $r^2$ measure. The first one, $r_S^2$, uses the population structure matrix, which consists of information about the origins of each individual and the admixture proportions of each individual genome. The second one, $r_V^2$, includes the kinship matrix in the calculation. These two corrections can be applied together in order to correct for both biases and are defined either on phased or unphased genotypes. We proved that these novel measures are linked to the power of association tests under the mixed linear model including structure and kinship corrections. We validated them on simulated data and applied them to real data sets collected on Vitis vinifera plants. Our results clearly showed the usefulness of the two corrected $r^2$ measures, which actually captured 'true' linkage disequilibrium unlike the usual $r^2$ measure.
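For reference, the usual (uncorrected) r-squared linkage disequilibrium measure that this abstract starts from can be sketched for phased diallelic data as below. The structure-corrected and kinship-corrected variants are not reproduced, since the abstract does not give their formulas; the function name and the 0/1 allele coding are assumptions for illustration.

```python
def ld_r_squared(haplotypes):
    """Usual r^2 between two diallelic loci from phased haplotypes.

    haplotypes: list of (a, b) pairs, where a and b are 0/1 allele codes
    at locus A and locus B on the same chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom
```

The estimate treats every haplotype as an independent draw from one population, which is exactly the assumption that sample structure and relatedness violate.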

15.
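A study of interspecific relationships in Davidia involucrata communities   Total citations: 16, self-citations: 0, citations by others: 16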
朱利君  苏智先  胡进耀  苏瑞军  周良   《广西植物》2006,26(1):32-37,4
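Interspecific relationships in the Davidia involucrata community at Sanjiang, Wolong Nature Reserve, were analyzed using the variance ratio (VR) method together with the Jaccard index, the Dice index, the Pearson correlation coefficient and the Spearman rank correlation coefficient, the latter four computed on the basis of 2×2 contingency tables. The results show that the main populations in the community are positively associated overall, but the significance of interspecific associations is low: only 7 species pairs are significantly associated. Davidia involucrata shows a significant negative association with Symplocos botryantha (总状山矾) and a highly significant negative association with Prunus padus (稠李). Most species pairs are only weakly associated, and few pairs show strong mutual exclusion, suggesting that the Davidia involucrata community is currently at a relatively mature stage. When studying interspecific relationships, combining association indices with correlation coefficient analysis gives better results. Finally, some suggestions for the conservation of Davidia involucrata are offered.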
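The association measures named in this abstract can be sketched from presence/absence data as follows. The formulas are the standard textbook forms (the variance ratio as the observed variance of per-quadrat species counts divided by the variance expected under independence); whether the study used exactly these forms is an assumption, and the function names are hypothetical.

```python
def contingency_counts(present_a, present_b):
    """2x2 table for two species over the same quadrats.

    Returns (a, b, c, d): both present, only A, only B, neither.
    """
    a = sum(1 for x, y in zip(present_a, present_b) if x and y)
    b = sum(1 for x, y in zip(present_a, present_b) if x and not y)
    c = sum(1 for x, y in zip(present_a, present_b) if not x and y)
    d = sum(1 for x, y in zip(present_a, present_b) if not x and not y)
    return a, b, c, d


def jaccard_index(a, b, c):
    """Shared quadrats over quadrats containing at least one of the two species."""
    return a / (a + b + c)


def dice_index(a, b, c):
    """Like Jaccard, but shared quadrats are counted twice."""
    return 2 * a / (2 * a + b + c)


def variance_ratio(presence):
    """Overall association among all species.

    presence: list of quadrats, each a list of 0/1 flags, one per species.
    VR > 1 suggests net positive association, VR < 1 net negative association.
    """
    n_quadrats = len(presence)
    n_species = len(presence[0])
    freqs = [sum(row[s] for row in presence) / n_quadrats for s in range(n_species)]
    expected = sum(p * (1 - p) for p in freqs)
    totals = [sum(row) for row in presence]
    mean_total = sum(totals) / n_quadrats
    observed = sum((t - mean_total) ** 2 for t in totals) / n_quadrats
    if expected == 0:
        raise ValueError("variance ratio is undefined when no species varies")
    return observed / expected
```

The variance ratio summarizes the whole community in one number, while the Jaccard and Dice indices describe one species pair at a time, which is why the abstract reports both kinds of result.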

16.
17.
MicroRNAs are small noncoding RNAs that regulate genes post-transcriptionally by binding and degrading target eukaryotic mRNAs. We use a quantitative model to study gene regulation by inhibitory microRNAs and compare it to gene regulation by prokaryotic small non-coding RNAs (sRNAs). Our model uses a combination of analytic techniques as well as computational simulations to calculate the mean-expression and noise profiles of genes regulated by both microRNAs and sRNAs. We find that despite very different molecular machinery and modes of action (catalytic vs stoichiometric), the mean expression levels and noise profiles of microRNA-regulated genes are almost identical to genes regulated by prokaryotic sRNAs. This behavior is extremely robust and persists across a wide range of biologically relevant parameters. We extend our model to study crosstalk between multiple mRNAs that are regulated by a single microRNA and show that noise is a sensitive measure of microRNA-mediated interaction between mRNAs. We conclude by discussing possible experimental strategies for uncovering the microRNA-mRNA interactions and testing the competing endogenous RNA (ceRNA) hypothesis.

18.
MicroRNAs have been known to regulate almost all physiological and pathological processes by suppressing their target genes. In humans, more than 1000 microRNAs have been identified, each of which targets dozens or even hundreds of genes. Facing this huge repertoire of microRNA targeting, it is important to identify which microRNAs are active, i.e., down-regulating their targets, in specific physiological or pathological conditions. Predicting active microRNAs is different from predicting microRNA targets because the authentic target genes of a microRNA are often not directly and solely regulated by that microRNA, leading to inconsistent expression changes between the microRNA and its true targets. Several computational programs have been proposed to predict the activity of a microRNA from the expression of its target genes. These programs performed well when applied to expression data obtained from distinct tissue types or from experiments that transfect a microRNA into cells (i.e., non-physiological). But the performance of microRNA activity prediction is not clear on expression data from the same tissue type in two physiological conditions, e.g., liver tissues from cancer patients and healthy people. In this work, we evaluate the performance of two microRNA activity prediction programs using seven expression data sets, all of which compare samples in two physiological conditions, and propose a new approach that predicts microRNA activity with an accuracy of over 80%. Unlike current methods, which predict active microRNAs by comparing two groups of samples, e.g., tumor versus normal, our new approach compares each diseased sample with all the samples in the control group. In other words, it can predict the microRNA activity of an individual person. We call this new application the prediction of “personalized microRNA activity”.

19.
The genetic analysis of quantitative traits in humans is changing as a result of the availability of whole-genome SNP data. Heritability analysis can make use of actual genetic sharing between pairs of individuals estimated from the genotype data, rather than the expected genetic sharing implied by their family relationship. This could provide more accurate heritability estimates and help to overcome the equal environment assumption. Quantitative trait locus (QTL) linkage mapping can make use of local genetic sharing inferred from very dense local genotype data from pedigree members or individuals not previously known to be related. This approach may be particularly suited for detecting loci that contain rare variants with major effect on the phenotype. Finally, whole-genome SNP data can be used to measure the genetic similarity between individuals to provide matched sets for association studies, in order to avoid spurious association from population stratification.

20.
Ostlund G  Sonnhammer EL 《Gene》2012,497(2):228-236
mRNA expression is widely used as a proxy for protein expression. However, their true relation is not known and two genes with the same mRNA levels might have different abundances of respective proteins. A related question is whether the coexpression of mRNA for gene pairs is reflected by the corresponding protein pairs. We examined the mRNA-protein correlation for both expression and coexpression. This analysis yielded insights into the relationship between mRNA and protein abundance, and allowed us to identify subsets of greater mRNA-protein coherence. The correlation between mRNA and protein was low for both expression and coexpression, 0.12 and 0.06 respectively. However, applying the best-performing quality measure, high-quality subsets reached a Spearman correlation of 0.31 for expression, 0.34 for coexpression and 0.49 for coexpression when restricted to functionally coupled genes. Our methodology can thus identify subsets for which the mRNA levels are expected to be most strongly correlated with protein levels.
