1.
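MicroRNAs, a class of small biomolecules, can regulate gene expression positively or negatively, and studying the relationship between microRNAs and genes is important for maintaining homeostasis and for treating disease. This work uses deep learning to predict microRNA-gene targeting relationships and proposes the TransformerMGI model. In the feature-engineering stage, to address the difficulty of accurately extracting latent information from biological sequences, TransformerMGI uses the graph-convolutional-network-based GP-GCN method and the DNA2Vec model to extract latent information from the microRNA and gene data, respectively, yielding an embedding matrix for each. On the model side, TransformerMGI introduces power normalization to improve the classic deep learning model. The two embedding matrices obtained from feature extraction are fed separately into TransformerMGI, whose internal attention mechanism aggregates and correlates the features within and between them, and the model finally outputs the probability that a microRNA regulates a gene. The area under the ROC curve (AUC) and the area under the precision-recall curve (AUPRC) were used as performance metrics to compare TransformerMGI with existing models. The experimental results show that TransformerMGI achieves AUC and AUPRC scores above 0.91, outperforming the other existing models. TransformerMGI can predict microRNA target genes from the base sequences of microRNAs and genes alone, without considering biological principles or genomic context, and thus offers a deep learning approach that later studies of microRNA target gene prediction can draw on.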
2.
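Bringing a drug from development to clinical use takes a long time, and the investment during development can exceed one billion yuan. With the integration of pharmaceutical R&D and artificial intelligence and the rapid development of bioinformatics, data on drug activity have grown sharply, and predicting drug activity by traditional experimental means can no longer meet the needs of drug development. Using algorithms to assist drug development and solve the problems that arise in it can greatly accelerate the process. Traditional machine learning methods, especially random forests, support vector machines, and artificial neural networks, can achieve high accuracy in drug activity prediction. Because deep learning uses multilayer neural networks, its models can accept high-dimensional input variables without manually specified input features and can fit fairly complex functions; applied to drug development, it can further improve the efficiency of each stage. The deep learning models most widely used in drug activity prediction are deep neural networks (DNN), recurrent neural networks (RNN), and autoencoders (AE), while generative adversarial networks (GAN), owing to their ability to generate data, are often combined with other models for data augmentation. A review of recent research and applications of deep learning in predicting the activity of drug molecules shows that deep learning models are more accurate and efficient than both traditional experimental methods and traditional machine learning methods. Deep learning models are therefore expected to become the most important computational aid in drug development over the next decade.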
3.
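Drug development is very important but also very costly in labor and resources. Computer-aided prediction of drug-protein affinity can greatly accelerate it. The key to drug-target affinity prediction is representing the drug and the protein accurately and in detail. This work proposes a drug-target affinity prediction model based on deep learning and multi-level information fusion, which seeks better predictive performance by integrating multiple levels of information about drugs and proteins. First, each drug is represented both as a molecular graph and as an extended-connectivity fingerprint, which are learned by a graph convolutional network module and fully connected layers, respectively. Second, the protein sequence and protein k-mer features are fed into a convolutional neural network module and fully connected layers, respectively, to learn latent protein features. The features learned by the four channels are then fused, and fully connected layers make the prediction. The method was validated on two benchmark drug-target affinity datasets and compared with existing models. The results show that the proposed model outperforms the baseline models, indicating that the strategy of integrating multi-level drug and protein information for drug-target affinity prediction is effective.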
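The protein k-mer features mentioned in entry 3 can be illustrated with a minimal sketch. The abstract does not state the value of k, the alphabet, or the normalization, so the choices below (k = 2, the 20 standard amino acids, relative frequencies) and the function name `kmer_frequencies` are assumptions made only for illustration.

```python
from collections import Counter
from itertools import product

# Assumption: only the 20 standard amino acids are counted.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_frequencies(sequence, k=2):
    """Return relative k-mer frequencies in a fixed lexicographic order.

    The vector has 20**k entries; k-mers containing non-standard
    residues are ignored.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    vocabulary = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    total = sum(counts[kmer] for kmer in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[kmer] / total for kmer in vocabulary]


features = kmer_frequencies("ACAC", k=2)
```

A fixed-length vector of this kind is what would typically be passed to fully connected layers, since it does not depend on the protein length.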
4.
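Objective: Long non-coding RNAs play important roles in heredity, metabolism, and the regulation of gene expression. However, determining the tertiary structure of RNA by traditional experimental methods is time-consuming, expensive, and technically demanding. Moreover, computational prediction of RNA tertiary structure has seen no breakthrough in the past decade, so new algorithms are needed to predict it accurately. This paper therefore develops a base contact map prediction method that can be used to improve the accuracy of RNA tertiary structure prediction. Methods: To exploit the physicochemical features of RNA, the paper applies a deep learning algorithm combining multilayer fully convolutional networks and recurrent neural networks to predict the contact probability between RNA bases, and uses an attention mechanism to handle the interdependence between bases in the RNA sequence. Results: By combining multilayer neural networks with the attention mechanism, the method effectively captures both local and global information in the RNA features, improving the robustness and generalization ability of the model. Test calculations show that, for the four standard cutoffs relative to sequence length L (L/10, L/5, L/2, and L), the model predicts base contact maps with accuracies of 0.84, 0.82, 0.82, and 0.75, respectively. Conclusion: The attention-based deep learning algorithm improves the accuracy of RNA base contact map prediction and thereby aids the prediction of RNA tertiary structure.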
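The L/10, L/5, L/2, and L figures in entry 4 are usually read as top-L/k precision: rank all candidate base pairs by predicted contact probability, keep the top L/k, and report the fraction that are true contacts. The abstract does not spell out its exact protocol, so the sketch below is an assumption-laden illustration; the function name, the `min_separation` parameter, and the toy matrices are hypothetical.

```python
def top_lk_precision(probs, truth, divisor, min_separation=1):
    """Precision of the top L/divisor predicted contacts.

    probs and truth are L x L nested lists; only pairs (i, j) with
    j - i >= min_separation are considered (an assumed convention).
    """
    n = len(probs)
    pairs = [
        (probs[i][j], truth[i][j])
        for i in range(n)
        for j in range(i + min_separation, n)
    ]
    if not pairs:
        return 0.0
    # Highest predicted probability first.
    pairs.sort(key=lambda p: p[0], reverse=True)
    top = max(1, n // divisor)
    selected = pairs[:top]
    return sum(1 for _, is_contact in selected if is_contact) / len(selected)


# Toy 4-base example: three confident predictions, two of them correct.
probs = [[0.0, 0.9, 0.8, 0.1],
         [0.0, 0.0, 0.1, 0.1],
         [0.0, 0.0, 0.0, 0.7],
         [0.0, 0.0, 0.0, 0.0]]
truth = [[0, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 0, 0]]
```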
5.
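In recent years, a growing number of biological experiments have shown that microRNAs (miRNAs) play an important role in the development of complex human diseases. Predicting miRNA-disease associations therefore helps in the accurate diagnosis and effective treatment of disease. Because traditional biological experiments are expensive and time-consuming, many computational models based on biological data have been proposed to predict miRNA-disease associations. This study proposes an end-to-end deep learning model, called MDAGAC, for predicting miRNA-disease associations. First, similarity graphs for miRNAs and diseases are built by integrating disease semantic similarity, miRNA functional similarity, and Gaussian interaction profile kernel similarity. Then, graph autoencoders and co-training are used to improve label propagation. The model builds one graph autoencoder on the miRNA graph and one on the disease graph and trains the two collaboratively. Each graph autoencoder reconstructs a score matrix from the initial association matrix, which is equivalent to propagating labels on the graph. The predicted probability of a miRNA-disease association is read from the score matrix. Five-fold cross-validation shows that MDAGAC is reliable and effective and outperforms several existing methods for predicting miRNA-disease associations.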
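The Gaussian interaction profile (GIP) kernel similarity named in entry 5 is commonly defined as K(i, j) = exp(-γ ||IP(i) - IP(j)||²), where IP(i) is row i of the association matrix and γ is a bandwidth divided by the mean squared norm of the profiles. The sketch below follows that common definition; whether MDAGAC uses exactly this normalization is not stated in the abstract, and `gip_kernel` and `gamma_prime` are illustrative names.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of `profiles`.

    profiles: list of equal-length 0/1 lists, one interaction profile per
    entity (e.g. one row per miRNA, one column per disease).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    # Bandwidth normalized by the average squared profile norm.
    gamma = gamma_prime / mean_sq_norm

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        [math.exp(-gamma * sq_dist(profiles[i], profiles[j])) for j in range(n)]
        for i in range(n)
    ]


# Three hypothetical miRNAs against two hypothetical diseases.
kernel = gip_kernel([[1, 0], [1, 0], [0, 1]])
```

Applying the same function to the transposed association matrix would give the disease-side similarity.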
6.
7.
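N6,2′-O-dimethyladenosine (m6Am) is a common reversible modification of RNA molecules. Some studies have described the effect of m6Am on mRNA, but its biological function remains insufficiently explored. We therefore propose m6AmTwins, a new end-to-end twin network that combines a Transformer (autoencoder) with a bidirectional gated recurrent unit (Bi-GRU) and uses only the RNA sequence to assess the detectability of the RNA. Compared with existing algorithms, the highlight of this work is that it uses contrastive learning to construct a new loss function for training m6AmTwins, which improves the model's generalization ability. With the twin network and a simple encoding scheme, the model achieved good results on the independent test sets of two imbalanced datasets with a positive-to-negative ratio of 1:10, with Matthews correlation coefficients (MCC) of 0.53 and 0.545, respectively. In addition, to strengthen the robustness of m6AmTwins, 10-fold cross-validation was also performed on the training set, giving MCC values of 0...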
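Entry 7 reports the Matthews correlation coefficient (MCC) on data with a 1:10 positive-to-negative ratio. A short sketch of the standard MCC formula shows why it suits imbalanced data better than plain accuracy; the confusion-matrix counts below are made up for illustration and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# A classifier that labels everything negative on a 1:10 dataset:
# accuracy is 100/110, roughly 0.91, yet MCC is 0.
all_negative_accuracy = 100 / 110
all_negative_mcc = matthews_corrcoef(tp=0, tn=100, fp=0, fn=10)
```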
8.
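Phytoplankton, one of the most important biological components of aquatic ecosystems, are sensitive to the water environment and have received wide attention in water quality monitoring. Aquatic environments are complex and varied, however, and identifying phytoplankton accurately and efficiently is a major challenge in monitoring work. Current identification methods fall into three classes: classical morphological classification, molecular markers, and AI-based image recognition. The first two are widely used but time-consuming and laborious, which hinders large-scale adoption by monitoring agencies. Likewise, automated image-based classification struggles to balance high accuracy with high efficiency. Advances in deep learning offer a new approach. This paper proposes a new deep convolutional neural network, RAN-11. Based on the residual attention networks Attention-56 and Attention-92, it fuses low-level and high-level features on the trunk through channel alignment, streamlines the architecture by adjusting the number of attention modules and residual blocks, and replaces ReLU with the Leaky ReLU activation function. A total of 1036 images of 11 dominant genera from Lake Taihu were used for comparative validation. Except for Asterionella, the precision of RAN-11 for each dominant genus was above 90%, and five genera reached 100% precision. RAN-11 achieved a recognition accuracy of 95.67% and an inference speed of 41.5 frames/s; it is more accurate than Attention-92 (95.19% accuracy, 23.6 frames/s) and faster than Attention-56 (94.71% accuracy, 41.2 frames/s), balancing accuracy and efficiency. The results show that (1) RAN-11 outperforms the original residual attention networks in precision, accuracy, and inference speed, and outperforms traditional image recognition methods represented by the bag-of-words model by a wider margin; and (2) fusing multi-scale features, streamlining the network structure, and optimizing the activation function are effective ways to improve convolutional neural network performance. Building on classical classification, this paper proposes a new residual attention network to improve phytoplankton identification and constructs an automated phytoplankton recognition system that is accurate and easy to deploy, which is of great significance for automated monitoring of phytoplankton in water bodies.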
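Entry 8 replaces ReLU with Leaky ReLU. The difference can be shown in a few lines: ReLU zeroes every negative input, while Leaky ReLU keeps a small slope there so the gradient does not vanish. The slope used by RAN-11 is not given in the abstract; the default of 0.01 below is an assumption.

```python
def relu(x):
    """Standard ReLU: negative inputs are clamped to zero."""
    return x if x > 0 else 0.0


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: negative inputs are scaled by a small slope (assumed 0.01)."""
    return x if x > 0 else negative_slope * x


inputs = [-2.0, -0.5, 0.0, 1.5]
relu_outputs = [relu(x) for x in inputs]
leaky_outputs = [leaky_relu(x) for x in inputs]
```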
9.
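Objective: Long non-coding RNAs (lncRNAs) take part in many important biological processes and are closely associated with various human diseases, so predicting lncRNA-disease associations helps in diagnosing and treating disease and in understanding, at the molecular level, how human diseases arise and develop. Most current prediction methods integrate lncRNA and disease information only shallowly and ignore deep embedding features in the network topology; they also build the negative training set by randomly selecting non-associated lncRNA-disease pairs, which harms robustness. Methods: This paper proposes NELDA, a network-embedding-based method for predicting potential lncRNA-disease associations. NELDA first uses lncRNA expression profiles, the Disease Ontology, and known lncRNA-disease associations to build an lncRNA similarity network, a disease similarity network, and an lncRNA-disease association network. Four deep autoencoders are then designed to learn low-dimensional network embedding features of lncRNAs and diseases from the lncRNA and disease similarity networks and from the lncRNA-disease association network. The similarity-network embeddings of lncRNAs and diseases are concatenated, as are their association-network embeddings, and each concatenation is fed to its own support vector machine classifier to predict lncRNA-disease associations. Finally, a weighted fusion strategy combines the predictions of the two classifiers into the final prediction. In addition, a negative-sample selection strategy based on known lncRNA-disease associations and disease semantic similarity builds a set of relatively reliable non-associated lncRNA-disease pairs to improve classifier robustness: a scoring function scores each lncRNA-disease pair, and the lowest-scoring pairs are taken as non-associated samples (negative samples). Results: Ten-fold cross-validation shows that NELDA predicts lncRNA-disease associations effectively, with an AUC of 0.9827, which is 0.0627 and 0.0207 higher than the existing LDASR and LDNFSGB methods, respectively. The negative-sample selection strategy and the decision-level weighted fusion strategy both improve NELDA's predictive performance. In case studies of gastric cancer and breast cancer, supporting evidence was found in recent literature and public databases for 29 of the 40 (72.5%) lncRNAs predicted to be associated with these cancers. Conclusion: These results show that NELDA is an effective method for predicting lncRNA-disease associations and is capable of uncovering potential ones.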
10.
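RNA 5-methylcytosine (m5C) modification plays an important role in many biological processes, and accurate identification of m5C sites helps to better understand its biological function, so identifying m5C methylation sites is essential. Although several machine learning methods for identifying m5C methylation sites have been developed, their predictive power still needs improvement. Based on a bidirectional long short-term memory network and an attention mechanism, this paper proposes a deep learning algorithm for predicting RNA m5C methylation sites. In experiments on RNA m5C datasets from four species (human, mouse, Saccharomyces cerevisiae, and Arabidopsis thaliana), the AUC for m5C site prediction reached 92.5%, 99.7%, 93.6%, and 86.5%, respectively. Compared with existing prediction methods, the method shows better predictive performance and better generalization, providing a new approach to RNA m5C methylation site prediction.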
11.
Air pollution is a serious threat to both the ecological environment and the physical health of individuals. Accurate air quality prediction is therefore urgently needed for pollution mitigation and for planning residents’ travel. However, few existing models predict air quality from the dynamic spatiotemporal correlation of air pollutants. In this paper, a novel deep learning model combining the dynamic graph convolutional network and the multi-channel temporal convolutional network (DGC-MTCN) is proposed for air quality prediction. To efficiently represent the time-varying spatial dependencies, a new spatiotemporal dynamic correlation calculation method based on gray relation analysis is proposed to construct dynamic adjacency matrices. The spatiotemporal features are then extracted by the graph convolutional network and the multi-channel temporal convolutional network. Two real-world air quality datasets collected from Beijing and Fushun are used to verify the performance of the proposed model. The experimental results show that, compared with other baselines, the DGC-MTCN model has excellent prediction accuracy. In particular, for multi-step prediction and prediction at different stations, the model shows better temporal stability and generalization ability.
12.
Purpose: This study aims to investigate the feasibility of using convolutional neural networks to predict an accurate, high-resolution dose distribution from an approximate, low-resolution input dose. Methods: Sixty-six patients were treated for prostate cancer with VMAT. We created the treatment plans using the Acuros XB algorithm with a 2 mm grid size, and then calculated the dose using the anisotropic analytical algorithm with a 5 mm grid and the same plan parameters. A U-net model was used to predict the 2 mm grid dose from the 5 mm grid dose. We investigated two models that differed in the training data used as input: one used only the low-resolution dose (D model) and the other combined the low-resolution dose with CT data (DC model). The Dice similarity coefficient (DSC) was calculated to ascertain how well the shape of the dose-volume is matched. We conducted gamma analysis for the following: DVH from the two models and the reference DVH for all prostate structures. Results: The DSC values in the DC model were significantly higher than those in the D model (p < 0.01). For the CTV, PTV, and bladder, the gamma passing rates in the DC model were significantly higher than those in the D model (p < 0.002–0.02). The mean doses in the CTV and PTV for the DC model were significantly better matched to those in the reference dose (p < 0.0001). Conclusions: The proposed U-net model with dose and CT image used as input predicted more accurate dose.
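The Dice similarity coefficient used in entry 12 is DSC = 2|A ∩ B| / (|A| + |B|) for two binary regions A and B. The sketch below assumes the regions are obtained by thresholding the predicted and reference dose at the same level, which is one plausible reading of "shape of the dose-volume"; the threshold, the toy dose values, and the function names are hypothetical.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient of two equal-length binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty regions match perfectly
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    return 2.0 * overlap / (size_a + size_b)


def isodose_mask(dose, level):
    """Voxels receiving at least `level` (flattened dose grid)."""
    return [d >= level for d in dose]


# Flattened toy dose grids in arbitrary units.
predicted = [10.0, 52.0, 61.0, 48.0, 70.0]
reference = [12.0, 49.0, 60.0, 51.0, 72.0]
dsc = dice_coefficient(isodose_mask(predicted, 50.0), isodose_mask(reference, 50.0))
```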
13.
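In genetics, a terminator is a DNA domain located downstream of the poly(A) site, no more than a few hundred bases long, that contains multiple palindromic sequences and serves to terminate transcription. Prokaryotic genomes have two classes of transcription terminators: Rho-dependent and Rho-independent. This study proposes a new prediction model, TermCNN, for identifying bacterial transcription terminators quickly and accurately. The model takes a representative subset of 6-mer features (2537 features) and the electron-ion interaction pseudopotential (EIIP) as its input vector and uses a convolutional neural network (CNN) to build the predictor. Five-fold cross-validation and independent testing show that the model outperforms the latest predictor, iTerm-PseKNC. Notably, the model has a clear advantage in cross-species tests: it predicts the transcription terminators of Escherichia coli (E. coli) and Bacillus subtilis (B. subtilis) with high accuracy.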
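The electron-ion interaction pseudopotential (EIIP) encoding named in entry 13 maps each nucleotide to a fixed numeric value. The sketch below uses the commonly cited per-nucleotide values; how TermCNN combines them with the 6-mer features is not described in the abstract, and the handling of unknown bases is an assumption.

```python
# Commonly cited EIIP values for DNA nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(sequence, unknown=0.0):
    """Map a DNA sequence to its per-base EIIP values.

    Bases outside A/C/G/T (e.g. N) get `unknown`, an assumed placeholder.
    """
    return [EIIP.get(base, unknown) for base in sequence.upper()]


encoded = eiip_encode("acgtn")
```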
14.
15.
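Skull sex determination has important research significance and practical value in fields such as forensic science and craniofacial reconstruction. Traditional skull sex determination requires experts and is highly subjective, and computer-aided methods require manual landmark annotation. To address these problems, this paper proposes a skull sex determination method that combines an improved convolutional neural network with the least squares method. First, multi-angle images of the three-dimensional skull model are acquired, and the improved convolutional neural network computes, for each image of each sample, the probability that it is male or female. Next, the least squares method computes the weight of each image in the sex determination based on the mean probabilities. Finally, the optimal parameters from these steps are used to construct a decision function, and the decision value determines the sex of the skull. The method dispenses with tedious manual measurement and achieves accuracies of up to 94.4% on complete skulls and up to 87.5% on incomplete skulls, giving good sex determination performance.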
16.
Summary A clustering method is presented that groups sample plots (stands or other units) together, based on their proximity in a
multidimensional test space in which the axes represent the attributes (species) of the individuals (sample plots, etc.).
The resulting dendrogram is used to make subjective judgements on the type and distinctiveness of the groupings.
The method is demonstrated on the vegetation data from 43 sample plots in mixed white spruce stands in Saskatchewan. The results
were related to the levels of 31 habitat variables. Significant differences among the groups were shown to exist for basal
area, height growth, and root penetration of white spruce and for field capacity and pH of the soil.
When possible the nomenclature of Fernald (1950) is followed for the Pteridophyta and Spermatophyta; elsewhere Rydberg's (1954)
nomenclature is followed. The nomenclature of Grout (1928–1940) is used for the Musci, with the exception of Calliergonella schreberi, which is replaced by Pleurozium schreberi (Willd) Mitt.
17.
Sparse-view computed tomography (CT) is a recent approach to reducing the radiation dose in patients and speeding up the data acquisition. Consequently, sparse-view CT has been of particular interest among researchers within the CT community. Advanced reconstruction algorithms for sparse-view CT, such as iterative algorithms with total-variation (TV), have been studied along with the problem of increasing computational burden and the blurring of artifacts in the reconstructed images. Studies on deep-learning-based approaches applying U-NET have recently achieved remarkable outcomes in various domains including low-dose CT. In this study, we propose a new method for sparse-view CT reconstruction based on a multi-level wavelet convolutional neural network (MWCNN). First, a filtered backprojection (FBP) was used to reconstruct a sparsely sampled sinogram from 60, 120, and 180 projections. Subsequently, the sparse-view data obtained from FBP were fed to a deep-learning network, i.e., the MWCNN. Our network architecture combines a wavelet transform and modified U-NET without pooling. By replacing the pooling function with the wavelet transform, the receptive field is enlarged to improve the performance. We qualitatively and quantitatively evaluated the interpolation, iterative TV method, and standard U-NET in terms of a reduction in the streaking artifacts and a preservation of the anatomical structures. When compared with other methods, the proposed method showed the highest performance based on various evaluation parameters such as the structural similarity, root mean square error, and resolution. These results indicate that the MWCNN possesses a powerful potential for achieving a sparse-view CT reconstruction.
18.
Yiwei Wang Ting Huang Xiao Sun Yudong Wang 《Journal of cellular biochemistry》2019,120(11):18845-18853
Endometrial cancer is one of the most common gynecological malignant tumors. The roles of competing endogenous RNAs (ceRNAs) in this disease, however, remain unclear. In this study, we constructed a ceRNA network to reveal the core ceRNAs in endometrial cancer. Differentially expressed genes were summarized from The Cancer Genome Atlas database, whereupon 140 genes were identified for building the network. Further correlation, survival, and enrichment analyses suggested that these genes may help towards elucidating the molecular mechanisms of endometrial cancer. After validation of the findings with the GSE17025 data set, LINC00958, microRNA-761, and DOLPP1 were highlighted as the critical genes in the ceRNA network. Our work suggests that LINC00958 may regulate DOLPP1 by “sponging” miR-761 in endometrial cancer.