Similar articles
 Found 20 similar articles (search time: 93 ms)
1.
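Research and prospects of Z-curve applications in genomics   (total citations: 1; self-citations: 0; citations by others: 1)
Zhang Shuang, Huang Ping, Zang Lu, Ouyang Yumei, Ma Zhiqiang. 《生物信息学》 (Bioinformatics), 2009, 7(3): 212-214, 226
With the arrival of the post-genomic era, the need to process and analyze large-scale molecular data has become increasingly important, and new methods for processing biological data keep emerging. This paper introduces the Z curve, a model for representing DNA sequences, and describes in detail its mathematical principles and its applications in identifying genes, gene start sites, and isochore structures. It closes with the prospects for applying the Z curve to the identification of transcription factor binding sites (TFBS).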

2.
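Theoretical study of the Z curve   (total citations: 4; self-citations: 0; citations by others: 4)
DNA sequences correspond to image points (D lattice points) in a regular tetrahedron (RT). A series of D lattice points forms a lattice (the D lattice). The D lattice is shown to be a face-centered cubic lattice, which reveals how the D lattice points are distributed in space. On this basis, several properties of the Z curve are derived: for example, the bundle of Z curves has the symmetry of the S4 group, and every Z curve lies within the inscribed sphere of the largest regular tetrahedron.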

3.
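RNA editing is an important post-transcriptional modification, and many algorithms are now available to identify it. This paper studies how sequencing depth affects RNA editing identification algorithms in mouse, in order to recommend methods for RNA editing research. Mouse RNA-seq data were aligned with the STAR aligner, SNVs were called with GATK, and RNA editing sites were identified with three methods: Separate Method, GIREMI, and RNAEditor. The three methods were then compared in terms of shared sites, identification efficiency, identification stability, and the relationship between identification and sequencing depth. The numbers of editing sites identified by the three methods differed greatly, with few shared sites, and the number of identified RNA editing sites increased with sequencing depth. The results indicate that the performance of RNA editing identification algorithms in mouse is positively correlated with sequencing depth.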

4.
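Objective: To prepare a standard for the quantitative determination of SIV/SHIV RNA using the NASBA method. Methods: The fragment between positions 1476 and 1685 of the gag gene of the SIVmac251 virus was amplified directly by NASBA. The amplified RNA product (RS-NASBA) was purified and serially diluted 10-fold; quantification and standard curves were measured, and the stability and reproducibility of the standard were determined. Results: Using the Qiagen QuantiTect SYBR Green RT-PCR Kit, the standard could be quantified accurately down to 2.033×10 copies/μL. Conclusion: The external standard RS-NASBA is of high purity and good stability and can be used to quantify SIV/SHIV RNA copy numbers.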

5.
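RNA editing sites in chimpanzee were identified from transcriptome sequencing (RNA-Seq) data, and the identification mechanism and potential functional impact of RNA editing were explored. RNA-DNA mismatch sites were found from the alignment of chimpanzee RNA-Seq data to the genome sequence, and a candidate set of editing sites was built. Sites with low genomic or transcriptomic sequencing quality were filtered out; other filters covered unreliable base calls at 3′ read ends, coverage, SNP sites, and the estimated editing level. A binomial statistical model with Bonferroni multiple-testing correction was used to remove random errors from the candidate set, yielding the RNA editing sites. Editing sites falling in known genes were selected for functional analysis, and Two Sample Logo software was used to analyze the features of the sequences flanking the editing sites. In total, 8,334 RNA editing sites covering 12 base-substitution types were identified in chimpanzee; 41 of them change the encoded amino acid, and another 3 fall in seed-binding regions of potential microRNA (miRNA) target genes. Statistical analysis showed that 640 and 872 RNA editing sites differed by tissue and by sex, respectively. Analysis of flanking base frequencies showed significant preferences at the bases immediately adjacent to several types of editing sites. The results show that RNA editing is widespread in chimpanzee and potentially has important biological functions, laying a foundation for further study of RNA editing mechanisms in primates.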
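
The binomial-plus-Bonferroni filtering step in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract gives neither the error rate nor the significance level, so `error_rate`, `alpha`, the function names, and the site tuples below are all hypothetical. The idea is that a site is kept only if seeing `k` or more mismatching reads out of `n` would be very unlikely under sequencing error alone, after multiplying the p-value by the number of candidate sites.

```python
from math import comb


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def filter_sites(sites, error_rate=0.001, alpha=0.05):
    """Keep sites whose Bonferroni-corrected p-value is below alpha.

    sites: list of (name, mismatch_reads, coverage) tuples.
    error_rate and alpha are illustrative assumptions, not values from the abstract.
    """
    m = len(sites)  # number of tests, used for the Bonferroni correction
    kept = []
    for name, k, n in sites:
        p_value = binom_tail(k, n, error_rate)
        if p_value * m < alpha:
            kept.append(name)
    return kept


# Hypothetical candidates: 10 of 50 reads mismatching vs. 1 of 50.
candidates = [("site_a", 10, 50), ("site_b", 1, 50)]
print(filter_sites(candidates))
```

A single mismatching read out of 50 is compatible with sequencing error at this assumed rate, so only the first site survives.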

6.
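With the introduction of the Z curve, geometric approaches to studying DNA sequences have advanced. Drawing on the Z-curve idea, this work studies the construction of a visual analysis platform for nucleic acid sequences and proposes a bounding-box-based multi-resolution point-thinning algorithm, achieving direct conversion of a DNA sequence into points and curves in three-dimensional space.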
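
As a rough illustration of the conversion from a DNA sequence to three-dimensional points, the sketch below uses the standard cumulative Z-curve definition: x counts purines minus pyrimidines, y counts amino minus keto bases, and z counts weak minus strong hydrogen-bond bases. The abstracts do not give the formula or any implementation, so the function name and error handling are assumptions, and the point-thinning step is not shown.

```python
def z_curve(seq):
    """Return the cumulative Z-curve points (x_n, y_n, z_n) for n = 1..len(seq).

    x_n = (A_n + G_n) - (C_n + T_n)   purine vs. pyrimidine
    y_n = (A_n + C_n) - (G_n + T_n)   amino vs. keto
    z_n = (A_n + T_n) - (C_n + G_n)   weak vs. strong hydrogen bonding
    where A_n, C_n, G_n, T_n count each base in the first n positions.
    """
    # Per-base increments; the four vectors are the vertices of a regular tetrahedron.
    steps = {
        "A": (1, 1, 1),
        "C": (-1, 1, -1),
        "G": (1, -1, -1),
        "T": (-1, -1, 1),
    }
    x = y = z = 0
    points = []
    for base in seq.upper():
        if base not in steps:
            raise ValueError(f"unsupported base: {base!r}")
        dx, dy, dz = steps[base]
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


print(z_curve("ACGT"))
```

A sequence with equal counts of all four bases returns to the origin, which is a quick sanity check on the increment table.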

7.
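Computational RNomics: structure identification and function prediction of non-coding RNAs   (total citations: 2; self-citations: 0; citations by others: 2)
Eukaryotic genomes contain large numbers of non-coding RNA (ncRNA) genes, and computational RNomics applies methods from information science and other disciplines to work out the structure and function of ncRNAs. This paper reviews the main current research methods and topics in computational RNomics, covering ncRNA data storage and management, ncRNA gene identification and characterization, and ncRNA target identification and function prediction.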

8.
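To enable high-throughput identification of new drug-long non-coding RNA (lncRNA) associations, this paper proposes DLGCN (Drug-LncRNA graph convolution network), a method based on a graph convolutional network model for identifying potential drug-lncRNA associations. First, drug-drug and lncRNA-lncRNA similarity networks are built from drug structure information and lncRNA sequence information, respectively, and experimentally confirmed drug-lncRNA associations are integrated to build a heterogeneous drug-lncRNA network. Then, an attention mechanism and graph convolution operations are applied to this network to learn low-dimensional features of drugs and lncRNAs, and new drug-lncRNA associations are predicted from the integrated low-dimensional features. In performance evaluation, DLGCN reached an area under the receiver operating characteristic curve (AUROC) of 0.8431, outperforming classical machine learning methods and common deep learning methods. In addition, DLGCN predicted that curcumin regulates expression of the lncRNA MALAT1, which has been confirmed by recent studies. DLGCN can effectively predict drug-lncRNA associations and provides an important reference for identifying new therapeutic targets in tumors and for screening anticancer drugs.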

9.
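Circular RNAs are a newly discovered class of RNAs with important biological functions. Existing circular RNA identification tools rely on high-throughput sequencing data and, because of drawbacks in the data themselves and in the identification approaches, commonly suffer from insufficient accuracy, low reproducibility across methods, and high false-positive and false-negative rates. To address this, we built a model for de novo prediction of circular RNAs from intrinsic sequence features, without relying on sequencing data. We selected 100 sequence features related to RNA circularization, including the lengths of introns upstream and downstream of splice sites, A-to-I density, and Alu repeats, built machine learning models, identified circular RNAs in the human genome, and compared the classification performance of two machine learning methods, random forest (RF) and support vector machine (SVM). The results show that the selected sequence features can effectively discriminate whether an RNA can circularize, and that different sequence features contribute differently to the model's classification performance. RF classified better than SVM.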

10.
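RNA 5-methylcytosine (m5C) modification plays an important role in many biological processes, and accurate identification of m5C sites helps to better understand its biological functions, so identifying m5C methylation sites is essential. Although several machine learning methods for identifying m5C methylation sites have been developed, their predictive power still needs improvement. Based on a bidirectional long short-term memory network and an attention mechanism, this paper proposes a deep learning algorithm for predicting RNA m5C methylation sites. In experiments on RNA m5C datasets from four species (human, mouse, Saccharomyces cerevisiae, and Arabidopsis thaliana), the AUC for m5C site prediction reached 92.5%, 99.7%, 93.6%, and 86.5%, respectively. Compared with existing prediction methods, the method shows good predictive performance and better generalization, providing a new approach for predicting RNA m5C methylation sites.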

11.
We present a simple theory of the dynamics of force generation by RecA during homologous strand exchange and a continuous, deterministic mathematical model of the proposed process. Calculations show that force generation is possible in this model for certain reasonable values of the parameters. We predict the shape of the force-velocity curve for the Holliday junction, which exhibits a distinctive kink at large retarding force, and suggest experiments which should distinguish between the proposed model and other models in the literature.

12.
The receiver operating characteristic (ROC) curve is a popular tool to evaluate and compare the accuracy of diagnostic tests to distinguish the diseased group from the nondiseased group when test results are continuous or ordinal. A complicated data setting occurs when multiple tests are measured on abnormal and normal locations from the same subject and the measurements are clustered within the subject. Although least squares regression methods can be used for the estimation of the ROC curve from correlated data, how to develop the least squares methods to estimate the ROC curve from clustered data has not been studied. Also, the statistical properties of the least squares methods under the clustering setting are unknown. In this article, we develop the least squares ROC methods to allow the baseline and link functions to differ, and more importantly, to accommodate clustered data with discrete covariates. The methods can generate smooth ROC curves that satisfy the inherent continuous property of the true underlying curve. The least squares methods are shown to be more efficient than the existing nonparametric ROC methods under appropriate model assumptions in simulation studies. We apply the methods to a real example in the detection of glaucomatous deterioration. We also derive the asymptotic properties of the proposed methods.

13.
Continuous biomarkers are common for disease screening and diagnosis. To reach a dichotomous clinical decision, a threshold would be imposed to distinguish subjects with disease from nondiseased individuals. Among various performance metrics, specificity at a controlled sensitivity level (or vice versa) is often desirable because it directly targets the clinical utility of the intended clinical test. Meanwhile, covariates, such as age, race, as well as sample collection conditions, could impact the biomarker distribution and may also confound the association between biomarker and disease status. Therefore, covariate adjustment is important in such biomarker evaluation. Most existing covariate adjustment methods do not specifically target the desired sensitivity/specificity level, but rather do so for the entire biomarker distribution. As such, they might be more prone to model misspecification. In this paper, we suggest a parsimonious quantile regression model for the diseased population, only locally at the controlled sensitivity level, and assess specificity with covariate-specific control of the sensitivity. Variance estimates are obtained from a sample-based approach and bootstrap. Furthermore, our proposed local model extends readily to a global one for covariate adjustment for the receiver operating characteristic (ROC) curve over the sensitivity continuum. We demonstrate computational efficiency of this proposed method and restore the inherent monotonicity in the estimated covariate-adjusted ROC curve. The asymptotic properties of the proposed estimators are established. Simulation studies show favorable performance of the proposal. Finally, we illustrate our method in biomarker evaluation for aggressive prostate cancer.

14.
In medical research, diagnostic tests with continuous values are widely employed to attempt to distinguish between diseased and non-diseased subjects. The diagnostic accuracy of a test (or a biomarker) can be assessed by using the receiver operating characteristic (ROC) curve of the test. To summarize the ROC curve and primarily to determine an “optimal” threshold for test results to use in practice, several approaches may be considered, such as those based on the Youden index, on the so-called close-to-(0,1) point, on the concordance probability and on the symmetry point. In this paper, we focus on the symmetry point-based approach, which simultaneously controls the probabilities of the two types of correct classifications (healthy as healthy and diseased as diseased), and show how to get joint nonparametric confidence regions for the corresponding optimal cutpoint and the associated sensitivity (= specificity) value. Extensive simulation experiments are conducted to evaluate the finite sample performances of the proposed method. Real datasets are also used to illustrate its application.
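
Two of the cutpoint criteria named in this abstract can be illustrated with plain empirical estimates. The sketch below is only a point-estimate illustration and does not reproduce the paper's confidence-region method. It assumes that higher test values indicate disease and that a subject is called positive when the value is at or above the cutpoint; the function names and the sample data are hypothetical. The Youden cutpoint maximizes sensitivity + specificity - 1, and the symmetry point is the cutpoint where sensitivity and specificity are closest to equal.

```python
def sens_spec(healthy, diseased, cut):
    """Empirical sensitivity and specificity when 'value >= cut' means positive."""
    sens = sum(v >= cut for v in diseased) / len(diseased)
    spec = sum(v < cut for v in healthy) / len(healthy)
    return sens, spec


def youden_cutpoint(healthy, diseased):
    """Cutpoint maximizing J = sensitivity + specificity - 1 over observed values."""
    candidates = sorted(set(healthy) | set(diseased))

    def youden_j(cut):
        sens, spec = sens_spec(healthy, diseased, cut)
        return sens + spec - 1

    return max(candidates, key=youden_j)


def symmetry_cutpoint(healthy, diseased):
    """Cutpoint where empirical sensitivity and specificity are closest to equal."""
    candidates = sorted(set(healthy) | set(diseased))

    def gap(cut):
        sens, spec = sens_spec(healthy, diseased, cut)
        return abs(sens - spec)

    return min(candidates, key=gap)


# Hypothetical test values for the two groups.
healthy = [1, 2, 3, 4]
diseased = [3, 4, 5, 6]
cut = symmetry_cutpoint(healthy, diseased)
print(cut, sens_spec(healthy, diseased, cut))
print(youden_cutpoint(healthy, diseased))
```

With small samples several cutpoints can tie on the Youden index, in which case `max` simply returns the smallest tied value.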

15.
Apolipoprotein CIII (apoCIII), a major constituent of triglyceride-rich lipoprotein, has been proposed as a key contributor to hypertriglyceridemia on the basis of its inhibitory effects on lipoprotein lipase. Many immunochemical methods have been developed for human apoCIII quantification, including ELISA. However, a sensitive and quantitative assay for nonhuman primates is not commercially available. We developed a sensitive, quantitative, and highly specific sandwich ELISA to measure apoCIII in both nonhuman primate and human serum. Our assay generates a linear calibration curve from 0.01 μg/ml to 10 μg/ml using an apoCIII standard that was purified from cynomolgus monkey serum. It is highly reproducible (intra- and interplate CV < 5% and < 8%, respectively), sensitive enough to distinguish 10% difference of apoCIII present in serum, and has no interference from purified human apolipoprotein AI, AII, B, CI, CII, or E. The same assay can also be used to measure human apoCIII with a linear calibration curve from 0.005 μg/ml to 1 μg/ml using purified human apoCIII as the standard. This fast and highly sensitive ELISA could be a useful tool to investigate the role of apoCIII in lipoprotein transport and cardiovascular disease.

16.
A simple parametric model is proposed for data from a point-process version of a reaction time experiment. It is used to statistically check for the presence and nature of nonlinear inhibition in the eye-brain-hand system, as well as to study the nature of the reaction time delay distribution. The model tells us that, in principle, the second-order intensity estimate can be used to determine whether the experimental subject is systematically observing the first or the second of two flashes transmitted in short succession. Nonparametric estimates of second-order intensity functions are used in conjunction with this model. In particular, the model allows for the computation of good bandwidths for intensity curve estimation. A parametric bootstrap can also be implemented. Our methods are illustrated with 12 runs of data from a real reaction time experiment. It is found that nonlinear inhibition is present in the eye-brain-hand system. However, there are insufficient data to distinguish between log-normality and normality in the reaction time distribution, due partly to confounding with the particular kind of nonlinear inhibition present in the system.

17.
The Weber-Fechner fraction was measured in experiments where a subject was presented with a pair of lines, the length of one (reference) line being constant and the length of the other (test) one varying. The subject had to say whether the longer line was above or below the shorter one. The subject was trained to distinguish the positions of lines in areas limited in the number of reference lines located at different degrees of eccentricity. The fraction curve for each area proved to be of the same shape as the curve obtained for the entire range and reflecting the Weber-Fechner law. The fraction was originally maximum and only slightly decreased afterwards. On the basis of these data, a neural construction is suggested that serves for describing the visual space and explains the shape of the curve reflecting the Weber-Fechner law.

18.
We tested the power of a segregation analysis method (first proposed by Elandt-Johnson) to distinguish between single-locus and two-locus models, with and without environmentally caused reduced penetrance. We also looked at the effect of ascertainment probability on the analysis and at the proband-conditioned ascertainment correction proposed by Cannings and Thompson. We found that: (1) the segregation analysis has sufficient power to distinguish between the fully-penetrant double-recessive (RR) model and the fully-penetrant single-locus dominant and recessive models; (2) the method can also distinguish fairly well between the dominant-recessive (DR) and RR models, even when one does not take into account the population prevalence; (3) the method has much less power to distinguish between the fully-penetrant RR model and the single-locus models with reduced penetrance; (4) when environmental penetrance is taken into account in the analysis, the power of the method to distinguish between the one- and two-locus models improves substantially; (5) the estimates of ascertainment probability, pi, were robust, regardless of the model under which the data were generated; and (6) the Cannings-Thompson approach to ascertainment correction worked well only when the pi used to generate the data was less than .1.

19.
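Fitting species abundance distribution patterns of plant communities in different habitats of sub-alpine meadow   (total citations: 1; self-citations: 0; citations by others: 1)
Liu Mengxue, Liu Jiajia, Du Xiaoguang, Zheng Xiaogang. 《生态学报》 (Acta Ecologica Sinica), 2010, 30(24): 6935-6942
Species abundance distribution is a core topic in community ecology. Based on sample surveys of herbaceous plant communities in three different habitats of sub-alpine meadow on the eastern Qinghai-Tibet Plateau, combined with two curve-fitting goodness-of-fit tests applied to 16 species abundance distribution models, the following results were obtained. Several different models can fit the species abundance distribution of the same habitat. Compared with the other models that fit, the geometric series model gave the best average fit across the three habitats under both goodness-of-fit tests, with goodness-of-fit values fluctuating around the optimal value of 10. The second-best model varied with habitat and test method. Besides the geometric series model, the Sugihara fraction model, fitted by least squares, could also fit the species abundance distributions of all three habitats. The results show that goodness-of-fit tests alone are not reliable for distinguishing among the models and mechanisms that generate different species distribution patterns; further confirmatory experimental studies are needed.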

20.
In this paper, we propose a successive learning method in hetero-associative memories, such as Bidirectional Associative Memories and Multidirectional Associative Memories, using chaotic neural networks. It can distinguish unknown data from the stored known data and can learn the unknown data successively. The proposed model makes use of the difference in the response to the input data in order to distinguish unknown data from the stored known data. When input data is regarded as unknown data, it is memorized. Furthermore, the proposed model can estimate and learn correct data from noisy unknown data or incomplete unknown data by considering the temporal summation of the continuous data input. In addition, similarities to the physiological findings of Freeman in the olfactory bulb of the rabbit are observed in the behavior of the proposed model. A series of computer simulations shows the effectiveness of the proposed model.
