Similar literature
 20 similar documents found (search time: 15 ms)
1.

Background  

Position Weight Matrices (PWMs) are probabilistic representations of signals in sequences. They are widely used to model approximate patterns in DNA or protein sequences. Using a PWM requires knowing the statistical significance of a word given its score. Significance is expressed as the P-value of a score: the probability that a word drawn from the background model achieves a score larger than or equal to the observed value. This gives rise to the following problem: given a P-value, find the corresponding score threshold. Existing methods rely on dynamic programming or probability generating functions. For many examples of PWMs, they fail to give accurate results in a reasonable amount of time.
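The dynamic programming baseline mentioned in this abstract can be illustrated with a small sketch. It is not the authors' method: it assumes integer (rounded) PWM scores and an i.i.d. background, and the function names `score_distribution` and `threshold_for_pvalue` are hypothetical. It builds the exact score distribution column by column, then scans from the highest score down until the tail probability would exceed the requested P-value.

```python
from collections import defaultdict


def score_distribution(pwm, background):
    """Exact distribution of integer PWM scores under an i.i.d. background.

    pwm: list of dicts, one per position, mapping each letter to an integer score.
    background: dict mapping each letter to its background probability.
    (Both layouts are assumptions made for this sketch.)
    """
    dist = {0: 1.0}  # before any position is read, the score is 0 with probability 1
    for column in pwm:
        nxt = defaultdict(float)
        for partial_score, prob in dist.items():
            for letter, p_letter in background.items():
                nxt[partial_score + column[letter]] += prob * p_letter
        dist = dict(nxt)
    return dist


def threshold_for_pvalue(pwm, background, p_value):
    """Smallest attainable score t with P(score >= t) <= p_value, or None."""
    dist = score_distribution(pwm, background)
    tail = 0.0
    best = None
    for s in sorted(dist, reverse=True):
        tail += dist[s]
        if tail > p_value:
            break
        best = s
    return best
```

The number of distinct partial scores grows with the score range, so finer rounding of real-valued scores makes the table larger. That trade-off between accuracy and running time is the difficulty the abstract points to.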

2.
3.
4.
5.
6.

Background

In a number of diseases, certain genes are reported to be strongly methylated and thus can serve as diagnostic markers in many cases. Scientific literature in digital form is an important source of information about methylated genes implicated in particular diseases. The large volume of the electronic text makes it difficult and impractical to search for this information manually.

Methodology

We developed a novel text mining methodology based on a new concept of position weight matrices (PWMs) for text representation and feature generation. We applied PWMs in conjunction with the document-term matrix to extract, with high accuracy, associations between methylated genes and diseases from free text. The performance results are based on a large manually classified data set. Additionally, we developed a web-tool, DEMGD, which automates extraction of these associations from free text. DEMGD presents the extracted associations in summary tables and full reports, in addition to evidence tagging of text with respect to genes, diseases and methylation words. The methodology we developed in this study can be applied to similar association extraction problems from free text.

Conclusion

The new methodology developed in this study allows for efficient identification of associations between concepts. Our method applied to methylated genes in different diseases is implemented as a Web-tool, DEMGD, which is freely available at http://www.cbrc.kaust.edu.sa/demgd/. The data is available for online browsing and download.

7.
Objective: To examine the association between relative body weight and health status and the potential modifying effects of socioeconomic position and working conditions on this association. Research Methods and Procedures: The data were derived from three identical cross‐sectional surveys conducted in 2000, 2001, and 2002. Respondents to postal surveys were middle‐aged employees of the City of Helsinki (7148 women and 1799 men, response rate 67%). BMI was based on self‐reported weight and height. Health status was measured by the Short‐Form 36 subscales and component summaries. Results: Body weight was inversely associated with physical health, but in mental health, differences between BMI categories were small and inconsistent. In women, physical health deteriorated monotonically with increasing BMI, whereas in men, poor physical health was found among the obese only. Socioeconomic position did not modify the association between BMI and health. In women, the association between body weight and physical health became stronger with decreasing job control and increasing physical work load, whereas in men, a similar modifying effect was found for high job demands. Discussion: Body weight was associated with physical health only. Lower levels of relative weight in women than in men may be associated with poor physical health. High body weight combined with adverse working conditions may impose a double burden on physical health.

8.
Most tissue cells grown in sparse cultures on linearly elastic substrates typically display a small, round phenotype on soft substrates and become increasingly spread as the modulus of the substrate increases until their spread area reaches a maximum value. As cell density increases, individual cells retain the same stiffness-dependent differences unless they are very close or in molecular contact. On nonlinear strain-stiffening fibrin gels, the same cell types become maximally spread even when the low strain elastic modulus would predict a round morphology, and cells are influenced by the presence of neighbors hundreds of microns away. Time lapse microscopy reveals that fibroblasts and human mesenchymal stem cells on fibrin deform the substrate by several microns up to five cell lengths away from their plasma membrane through a force limited mechanism. Atomic force microscopy and rheology confirm that these strains locally and globally stiffen the gel, depending on cell density, and this effect leads to long distance cell-cell communication and alignment. Thus cells are acutely responsive to the nonlinear elasticity of their substrates and can manipulate this rheological property to induce patterning.

9.

Background

Atmospheric pollution is a major public health concern. It can affect placental function and restrict fetal growth. However, scientific knowledge remains too limited to make inferences regarding causal associations between maternal exposure to air pollution and adverse effects on pregnancy. This study evaluated the association between low birth weight (LBW) and maternal exposure during pregnancy to traffic related air pollutants (TRAP) in São Paulo, Brazil.

Methods and findings

Analysis included 5,772 cases of term-LBW (<2,500 g) and 5,814 controls matched by sex and month of birth selected from the birth registration system. Mothers’ addresses were geocoded to estimate exposure according to 3 indicators: distance from home to heavy traffic roads, distance-weighted traffic density (DWTD) and levels of particulate matter ≤10 µm in diameter estimated through land use regression (LUR-PM10). Final models were evaluated using multiple logistic regression adjusting for birth, maternal and pregnancy characteristics. We found decreased odds of LBW associated with DWTD and LUR-PM10 in the highest quartiles of exposure, with a significant linear trend of decreasing risk. The analysis with distance from heavy traffic roads was less consistent. It was also observed that mothers with higher education and neighborhood-level income were potentially more exposed to TRAP.

Conclusions

This study found an unexpected decreased risk of LBW associated with traffic related air pollution. Mothers with advantaged socioeconomic position (SEP), although residing in areas of higher vehicular traffic, might not in fact be more exposed to air pollution. It may also be that the protection against LBW arising from a better SEP is stronger than the effect of exposure to air pollution, and this exposure may not be sufficient to increase the risk of LBW for these mothers.

10.
Shrinkage Estimators for Covariance Matrices
Estimation of covariance matrices in small samples has been studied by many authors. Standard estimators, like the unstructured maximum likelihood estimator (ML) or restricted maximum likelihood (REML) estimator, can be very unstable with the smallest estimated eigenvalues being too small and the largest too big. A standard approach to more stably estimating the matrix in small samples is to compute the ML or REML estimator under some simple structure that involves estimation of fewer parameters, such as compound symmetry or independence. However, these estimators will not be consistent unless the hypothesized structure is correct. If interest focuses on estimation of regression coefficients with correlated (or longitudinal) data, a sandwich estimator of the covariance matrix may be used to provide standard errors for the estimated coefficients that are robust in the sense that they remain consistent under misspecification of the covariance structure. With large matrices, however, the inefficiency of the sandwich estimator becomes worrisome. We consider here two general shrinkage approaches to estimating the covariance matrix and regression coefficients. The first involves shrinking the eigenvalues of the unstructured ML or REML estimator. The second involves shrinking an unstructured estimator toward a structured estimator. For both cases, the data determine the amount of shrinkage. These estimators are consistent and give consistent and asymptotically efficient estimates for regression coefficients. Simulations show the improved operating characteristics of the shrinkage estimators of the covariance matrix and the regression coefficients in finite samples. The final estimator chosen includes a combination of both shrinkage approaches, i.e., shrinking the eigenvalues and then shrinking toward structure. 
We illustrate our approach on a sleep EEG study that requires estimation of a 24 x 24 covariance matrix and for which inferences on mean parameters critically depend on the covariance estimator chosen. We recommend making inference using a particular shrinkage estimator that provides a reasonable compromise between structured and unstructured estimators.
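The second shrinkage approach in this abstract, shrinking an unstructured estimator toward a structured one, can be sketched as a convex combination of the two. This is only an illustration under stated assumptions: the structured target is the independence (diagonal) structure, the unstructured estimator is the sample covariance with divisor n - 1, and the shrinkage weight is supplied by the caller, whereas in the abstract the data determine the amount of shrinkage. The function names are hypothetical.

```python
def sample_covariance(rows):
    """Unstructured sample covariance (divisor n - 1) of equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        for i in range(p):
            for j in range(p):
                cov[i][j] += (r[i] - means[i]) * (r[j] - means[j])
    return [[c / (n - 1) for c in row] for row in cov]


def shrink_toward_diagonal(cov, weight):
    """Return (1 - weight) * cov + weight * diag(cov).

    weight = 0 leaves cov unchanged; weight = 1 gives the diagonal
    (independence) target. The fixed weight is an assumption of this sketch.
    """
    p = len(cov)
    return [
        [cov[i][j] if i == j else (1.0 - weight) * cov[i][j] for j in range(p)]
        for i in range(p)
    ]
```

Because the diagonal is left untouched, only the off-diagonal entries are scaled by `1 - weight`; the variances stay at their unstructured estimates while the estimated correlations are pulled toward zero.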

11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
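Based on known yeast transcription factor binding site data, position weight matrices of correlated dinucleotides were constructed for transcription factor binding sites. By integrating the correlated-dinucleotide position weight matrix with the base conservation parameter M2i, a new method for predicting transcription factor binding sites (PWMSA) is proposed. The algorithm was tested by both self-consistency and cross-validation and achieved a high prediction success rate in each case. The overall prediction success rate for the binding sites of 9 transcription factors exceeded 81%, markedly higher than that of the single-nucleotide position weight matrix. In a comparison with existing software for predicting transcription factor binding sites, the correlation coefficients at the nucleotide level and at the binding-site level reached 0.42 and 0.52, respectively, outperforming existing prediction methods.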
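As a rough illustration of a dinucleotide position weight matrix of the kind this abstract describes, the sketch below counts adjacent base pairs at each position of a set of aligned sites and converts them to log-odds weights. It is not the PWMSA algorithm: the restriction to adjacent pairs, the pseudocount, the uniform background of 1/16 per dinucleotide, and the function names are all assumptions, and the conservation parameter M2i is not modeled because the abstract does not define it.

```python
import math
from itertools import product


def dinucleotide_pwm(sites, pseudocount=1.0):
    """Log-odds weights for adjacent base pairs in aligned, equal-length sites.

    Entry k of the returned list maps a dinucleotide such as "AC" to
    log2(frequency at positions k, k+1 divided by a uniform background of 1/16).
    """
    length = len(sites[0])
    pairs = ["".join(p) for p in product("ACGT", repeat=2)]
    matrix = []
    for k in range(length - 1):
        counts = {pair: pseudocount for pair in pairs}
        for site in sites:
            counts[site[k:k + 2]] += 1
        total = sum(counts.values())
        matrix.append(
            {pair: math.log2((c / total) / (1 / 16)) for pair, c in counts.items()}
        )
    return matrix


def score(matrix, word):
    """Sum of dinucleotide weights along a word as long as the training sites."""
    return sum(col[word[k:k + 2]] for k, col in enumerate(matrix))
```

A word that matches the training sites, for example `score(m, "ACGT")` with `m = dinucleotide_pwm(["ACGT", "ACGA", "ACGT"])`, receives a higher score than an unrelated word such as `"TTTT"`.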
