Similar literature (20 articles found)
1.
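Objective: Prediction models based on position-specific scoring matrices (PSSM) have achieved good results, and various PSSM-based optimization methods continue to be developed, but their accuracy remains relatively low. To further improve prediction accuracy, this paper investigates an approach based on convolutional neural networks (CNN). Methods: Promoter sequences were converted into numerical matrices using PSSM and classified with a CNN. The Sigma38, Sigma54 and Sigma70 promoter sequences of Escherichia coli K-12 (E. coli K-12, hereafter E. coli) were used as the positive set, and sequences from coding and non-coding regions as the negative set. Results: In the binary classification of E. coli promoters, accuracy reached 99% and the success rate of promoter prediction was close to 100%. In the three-class classification of the Sigma38, Sigma54 and Sigma70 promoters, prediction accuracy was 98%, and the accuracy for each sequence type exceeded 98%. Finally, the three promoter types were combined with either coding or non-coding sequences for four-class classification, giving an accuracy of 0.98; in ten-fold cross-validation on balanced samples of the three Sigma promoters, prediction precision exceeded 0.95 in every case, with a Hamming distance of 0.016 and a Kappa coefficient of 0.97. Conclusion: Compared with other classification algorithms such as the support vector machine (SVM), the CNN classifier performs better, and because of this advantage the encoding scheme can also be simplified.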
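The encoding step in the abstract above (turning promoter sequences into numerical matrices with a PSSM) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes equal-length DNA sequences, a pseudocount of 1, a uniform background frequency of 0.25, and log2-odds scores. The names `build_pssm` and `encode` are hypothetical.

```python
import math

ALPHABET = "ACGT"

def build_pssm(seqs, pseudocount=1.0, background=0.25):
    """Build a PSSM as a list of {base: log2-odds score}, one dict per position.

    Assumptions (not from the abstract): equal-length sequences over ACGT,
    additive pseudocounts, and a uniform background frequency.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must have equal length")
    pssm = []
    for pos in range(length):
        counts = {base: pseudocount for base in ALPHABET}
        for s in seqs:
            counts[s[pos]] += 1  # raises KeyError on non-ACGT symbols
        total = sum(counts.values())
        pssm.append({b: math.log2(counts[b] / total / background) for b in ALPHABET})
    return pssm

def encode(seq, pssm):
    """Encode one sequence as a 4 x L matrix (rows A, C, G, T).

    Each column holds the PSSM score in the row of the observed base and 0 elsewhere.
    """
    return [[pssm[i][b] if seq[i] == b else 0.0 for i in range(len(seq))]
            for b in ALPHABET]
```

A matrix produced by `encode` could then be passed to a CNN as a single-channel input; the network architecture itself is not described in the abstract and is left out here.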

2.
Analytical bounding functions for diffusion problems with Michaelis-Menten kinetics were recently presented by Anderson and Arthurs, 1985 (Bull. math. Biol. 47, 145–153). Their method, successful to some extent for a small range of parameters, has the disadvantage of providing a weak upper bound. The optimal approach for the use of one-line bounding kinetics is presented. The use of two-line bounding kinetics is also shown, in order to give sufficient accuracy in those cases where the one-line approach does not provide satisfactory results. The bounding functions provide excellent upper and lower bounds on the true solution for the entire range of kinetic and transport parameters.

3.
We have recently shown that an energy penalty for the incorporation of residual tensorial constraints into molecular structure calculations can be formulated without the explicit knowledge of the Saupe orientation tensor (Moltke and Grzesiek, J. Biomol. NMR, 1999, 15, 77–82). Here we report the implementation of such an algorithm into the program X-PLOR. The new algorithm is easy to use and has good convergence properties. The algorithm is used for the structure refinement of the HIV-1 Nef protein using 252 dipolar coupling restraints. The approach is compared to the conventional penalty function with explicit knowledge of the orientation tensor's amplitude and rhombicity. No significant differences are found with respect to speed, Ramachandran core quality or coordinate precision.

4.
The tetraribonucleoside triphosphate 15 and the cyclic tetraribonucleotide 16 have been prepared by a recently reported triester approach in solution, involving H-phosphonate coupling.

5.
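Autofocus is an important step in the automated screening of nematodes. In an optical microscope system, images of the same field of view are acquired at different focal planes and scored with a sharpness evaluation function; the position giving the maximum score is taken as the best focus position. In this study, 16 commonly used autofocus algorithms, together with some recently proposed ones, were evaluated to find the algorithm best suited to images of nematode lipid droplets, with the aim of building an automated lipid droplet screening system. The algorithms were analysed in terms of focusing accuracy, computation time, robustness to noise, and the shape of the focus curve. The results show that most algorithms perform well on nematode lipid droplet images, and that the absolute Tenengrad algorithm in particular gives the best focusing accuracy; this algorithm will be adopted in the automated nematode lipid droplet screening system.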
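The focus-selection procedure described above (score each image in a focal stack, pick the maximum) can be sketched in a few lines. The abstract does not define "absolute Tenengrad"; the sketch assumes it means the sum of absolute Sobel gradient responses, which is one common reading. The function names and the list-of-lists image format are illustrative choices, not taken from the paper.

```python
def absolute_tenengrad(img):
    """Sharpness score: sum of |Gx| + |Gy| over interior pixels.

    Gx and Gy are 3x3 Sobel responses. `img` is a 2D list of grey levels.
    Assumption: this is the intended definition of "absolute Tenengrad".
    """
    h, w = len(img), len(img[0])
    score = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            score += abs(gx) + abs(gy)
    return score

def best_focus(stack):
    """Return the index of the image in the focal stack with the highest score."""
    scores = [absolute_tenengrad(img) for img in stack]
    return max(range(len(stack)), key=scores.__getitem__)
```

In practice the score would be computed on camera frames with an array library; the pure-Python loops here only keep the sketch free of dependencies.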

6.
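Objective: Long non-coding RNAs play important roles in heredity, metabolism and the regulation of gene expression. However, determining the tertiary structure of RNA by conventional experimental methods is time-consuming, costly and technically demanding, and computational prediction of RNA tertiary structure has seen no breakthrough in the past decade. New prediction algorithms are therefore needed to predict RNA tertiary structure accurately. This paper develops a base contact map prediction method that can be used to improve the accuracy of RNA tertiary structure prediction. Methods: To make use of the physicochemical features of RNA, a deep learning algorithm combining a multi-layer fully convolutional neural network with a recurrent neural network was applied to predict contact probabilities between RNA bases, and an attention mechanism was used to handle the interdependent features of bases within the RNA sequence. Results: By combining the multi-layer neural network with the attention mechanism, the method effectively captures both local and global information in the RNA features, improving the robustness and generalization ability of the model. Test calculations show that, for the four standard cut-offs based on sequence length L (L/10, L/5, L/2 and L), the proposed model reaches base contact map prediction accuracies of 0.84, 0.82, 0.82 and 0.75, respectively. Conclusion: The attention-based deep learning algorithm improves the accuracy of RNA base contact map prediction and thereby aids the prediction of RNA tertiary structure.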
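The L/10, L/5, L/2 and L figures above are usually read as the precision of the top-ranked predicted contacts, where the number of pairs kept is the sequence length L divided by 10, 5, 2 or 1. A minimal sketch of that metric follows. It assumes predictions are given as a dictionary of pair probabilities and ignores any minimum sequence-separation filter, which the abstract does not specify; the function name is hypothetical.

```python
def top_fraction_precision(probs, true_contacts, seq_len, divisor):
    """Precision of the top seq_len // divisor predicted contacts.

    probs: {(i, j): predicted contact probability}
    true_contacts: set of (i, j) pairs that are real contacts
    """
    k = max(1, seq_len // divisor)
    # Rank candidate pairs by predicted probability, highest first.
    ranked = sorted(probs, key=probs.get, reverse=True)
    top = ranked[:k]
    if not top:
        return 0.0
    hits = sum(1 for pair in top if pair in true_contacts)
    return hits / len(top)
```

Calling the function with `divisor` set to 10, 5, 2 and 1 gives the four values in the same order as they are reported in the abstract.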

7.
The insertion-deletion model developed by Thorne, Kishino and Felsenstein (1991, J. Mol. Evol., 33, 114–124; the TKF91 model) provides a statistical framework of two sequences. The statistical alignment of a set of sequences related by a star tree is a generalization of this model. The known algorithm computes the probability of a set of such sequences in $O(l^{2k})$ time, where $l$ is the geometric mean of the sequence lengths and $k$ is the number of sequences. An improved algorithm is presented whose running time is only $O(2^{2k} l^{k})$.

8.
Algorithms for secondary structure prediction have been under development for nearly 30 years. However, the problem of how to evaluate and compare algorithms appropriately has not yet been completely solved. A graphic method for evaluating secondary structure prediction algorithms is proposed here. Traditionally, the performance of an algorithm is evaluated by a number, i.e., accuracy under various definitions. Instead of a number, we use a graph to evaluate an algorithm completely, in which the mapping points are distributed in a three-dimensional space. Each point represents the predicted secondary structure of one protein. Because the distribution of mapping points in the 3D space generally contains more information than a number or a set of numbers, algorithms can be expected to be evaluated and compared more objectively by the proposed graphic method. Based on the point distribution, six evaluation parameters are proposed, which describe the overall performance of the algorithm evaluated. Furthermore, the graphic method is simple and intuitive. As an example of application, two advanced algorithms, the PHD and NNpredict methods, are evaluated and compared. It is shown that there is still much room for improvement in both algorithms. In particular, the accuracy of predicting the α-helix or β-strand in proteins with higher α-helix or β-strand content, respectively, should be greatly improved for both algorithms.

9.
The accuracy of global Smith-Waterman alignments and Pareto-optimal alignments has been examined as a function of the degree of sequence similarity (percent of coincidence, %id, and the number of removed fragments, NGap). An algorithm has been developed that constructs a set of three to six alignments, the best of which is on average more accurate than the best alignment that can be constructed using the Smith-Waterman algorithm. For weakly homologous sequences (%id 15, NGap 20), the increase in accuracy is on average about 8%, with the average accuracy of the global Smith-Waterman alignments being about 38% (the accuracy was estimated on model test sets).

10.
Biswas, Bipasa; Lai, Yinglei. BMC Genomics, 2019, 20(2): 35-47
Background

Next-generation sequencing technology allows us to obtain a large number of short DNA sequence (DNA-seq) reads at a genome-wide level. DNA-seq data have been increasingly collected in recent years. Count-type data analysis is a widely used approach for DNA-seq data. However, the related data pre-processing is based on the moving window method, in which a window size needs to be defined in order to obtain count-type data. Furthermore, useful information can be lost when the data are pre-processed into count-type data.

Results

In this study, we propose to analyze DNA-seq data based on the related distance-type measure. Distances are measured in base pairs (bps) between two adjacent alignments of short reads mapped to a reference genome. Our simulation study based on experimental data confirms the advantages of the distance-type measure approach in both detection power and detection accuracy. Furthermore, we propose artificial censoring for the distance data so that distances larger than a given value are considered potential outliers. Our purpose is to simplify the pre-processing of DNA-seq data. Statistically, we consider a mixture of right-censored geometric distributions to model the distance data. Additionally, to reduce the GC-content bias, we extend the mixture model to a mixture of generalized linear models (GLMs). The model can be estimated by the Newton-Raphson algorithm as well as the Expectation-Maximization (E-M) algorithm. We have conducted simulations to evaluate the performance of our approach. Based on the rank-based inverse normal transformation of distance data, we can obtain the related z-values for a follow-up analysis. As an illustration, an application to DNA-seq data from a pair of normal and tumor cell lines is presented, with a change-point analysis of z-values to detect DNA copy number alterations.

Conclusion

Our distance-type measure approach is novel. It does not require either a fixed or a sliding window procedure for generating count-type data. Its advantages have been demonstrated by our simulation studies and its practical usefulness has been illustrated by an experimental data application.
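The pre-processing steps named in this abstract (distances between adjacent alignments, artificial right-censoring, and a rank-based inverse normal transformation) can be illustrated with a short sketch. It does not implement the mixture model or its E-M estimation. The function names, the use of alignment start positions, and the offset `c = 0.5` in the rank transform are assumptions made for illustration only.

```python
from statistics import NormalDist

def adjacent_distances(positions):
    """Distances in base pairs between adjacent mapped read positions."""
    pos = sorted(positions)
    return [b - a for a, b in zip(pos, pos[1:])]

def censor(distances, limit):
    """Artificial right-censoring: cap each distance at `limit`.

    Returns (value, is_censored) pairs.
    """
    return [(min(d, limit), d > limit) for d in distances]

def rank_inverse_normal(values, c=0.5):
    """Rank-based inverse normal transform, using average ranks for ties.

    z = Phi^-1((rank - c) / (n - 2c + 1)); c = 0.5 is an assumed offset.
    """
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]
```

The resulting z-values are the kind of input a change-point analysis would consume; how the paper derives them from the fitted model is not detailed in the abstract.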


11.
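Objective: To address the curse of dimensionality and the overfitting that readily arise in tumor subtype identification, an improved particle swarm BP neural network ensemble algorithm is proposed. Methods: The algorithm first filters out redundant genes using Euclidean distance and mutual information, then applies the Relief algorithm for further processing to obtain a set of candidate feature genes. A BP neural network is used as the base classifier, feature gene extraction is combined with classifier training, and the improved particle swarm performs a global search to optimize the network's weights and thresholds. Results: Classification accuracy under global optimization and search by the QPSO/BP algorithm was highest with 5 hidden-layer neurons and 110 candidate feature genes. Conclusion: The algorithm not only improves the accuracy of tumor subtype identification but also reduces the complexity of learning.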

12.
A computer algorithm is presented which equiprobably generates any member of the set of all directed trees with $k$ labeled terminal nodes and unlabeled interior nodes. The algorithm requires roughly $k^2/2$ storage locations. The one-time initialization requires $O(k^2)$ time, while generating each tree requires $O(k)$ time. Contribution No. 477 in Ecology and Evolution, State University of New York at Stony Brook. This research was supported by Grant No. DEB8003508 from the National Science Foundation to Robert R. Sokal.

13.
A third-order algorithm for stochastic dynamics (SD) simulations is proposed, identical to the powerful molecular dynamics leap-frog algorithm in the limit of infinitely small friction coefficient γ. It belongs to the class of SD algorithms in which the integration time step Δt is not limited by the condition $\Delta t \leq \gamma^{-1}$, but only by the properties of the systematic force. It is shown how constraints, such as bond length or bond angle constraints, can be incorporated in the computational scheme. It is argued that the third-order Verlet-type SD algorithm proposed earlier may be simplified without losing its third-order accuracy. The leap-frog SD algorithm is proven to be equivalent to the Verlet-type SD algorithm. Both these SD algorithms are slightly more economical on computer storage than the Beeman-type SD algorithm.

14.
Summary: The peptide sequential assignment algorithm presented here was implemented as a macro within the CONnectivity TRacing ASsignment Tools (CONTRAST) computer software package. The algorithm provides a semi- or fully automated global means of sequentially assigning the NMR backbone resonances of proteins. The program's performance is demonstrated here by its analysis of realistic computer-generated data for IIIGlc, a 168-residue signal-transducing protein of Escherichia coli [Pelton et al. (1991) Biochemistry, 30, 10043–10057]. Missing experimental data (19 resonances) were generated so that a complete assignment set could be tested. The algorithm produces sequential assignments from appropriate peak lists of nD NMR data. It quantifies the ambiguity of each assignment and provides ranked alternatives. A best-first approach, in which high-scoring local assignments are made before and in preference to lower-scoring assignments, is shown to be superior (in terms of the current set of CONTRAST scoring routines) to approaches such as simulated annealing that seek to maximize the combined scores of the individual assignments. The robustness of the algorithm was tested by evaluating the effects of imposed frequency imprecision (scatter), added false signals (noise), missing peaks (incomplete data), and variation in user-defined tolerances on the performance of the algorithm.

15.
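Objective: Real-time, continuous bedside imaging of lung ventilation is an urgent need for mechanically ventilated patients and their clinicians. Pulmonary electrical impedance tomography (EIT) reflects the distribution of changes in thoracic electrical properties caused by breathing and is inherently well suited to monitoring lung ventilation. The aim of this paper is to establish a pulmonary weighted frequency-difference EIT (wfd-EIT) method based on a radial basis function neural network (RBFNN) to image lung ventilation with high spatial resolution. Methods: The pulmonary wfd-EIT method was used to map the thoracic conductivity distribution in real time, and the RBFNN was then used to visualize the target region and identify its boundary precisely. First, in numerical simulations, 2 028 simulated samples were generated at each excitation frequency using COMSOL and MATLAB and divided into training and test sets to verify the feasibility and effectiveness of the proposed imaging method. Second, to validate the simulation results, a physical lung model was built in which biological tissue with low conductivity simulated the ventilated lung region; imaging experiments were performed on it, and the image correlation coefficient (ICC) and lung region ratio (LRR) were used as quantitative measures of imaging accuracy. Results: The wfd-EIT method can reconstruct images at any moment and accurately reflects the distribution of electrical properties in the target region. The RBFNN-based algorithm improves imaging accuracy in the target region, with ICC reaching above 0.94, and brings out its boundary contour more clearly. Conclusion: The wfd-EIT method achieves rapid visualization of the target region through simultaneous multi-frequency impedance spectrum measurement and, combined with the ability of the RBFNN to approximate arbitrary nonlinear functions, identifies changes in the electrical properties of the target region precisely, laying a theoretical and technical foundation for subsequent EIT image monitoring of clinical lung ventilation.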

16.
The algorithm “Curves”, which we recently presented in this journal (J. Biomol. Str. Dynam. 6, 63–91 (1988)), is updated to take into account the conventions developed at the Cambridge meeting on DNA curvature (September 1988) and extended to the calculation of local parameters. In addition, the principles which govern the choices made in establishing the Curves algorithm are compared with the approaches adopted by other authors.

17.
Purpose: To perform a detailed evaluation of the dose calculation accuracy and clinical feasibility of Mobius3D. Of particular importance, the multileaf collimator (MLC) modeling accuracy of the Mobius3D dose calculation algorithm was investigated. Methods: Mobius3D was fully commissioned by following the vendor-suggested procedures, including dosimetric leaf gap (DLG) optimization. The DLG optimization determined an optimal DLG correction factor which minimized the average difference between calculated and measured doses for 13 patient volumetric-modulated arc therapy (VMAT) plans. Two sets of step-and-shoot plans were created to examine the MLC and off-axis open field modeling accuracy of the Mobius3D dose calculation algorithm: an MLC test set and an off-axis open field test set. The test plans were delivered to MapCHECK for the MLC tests and to an ionization chamber for the off-axis open field test, and the measured doses were compared to Mobius3D-calculated doses. Results: The mean difference between the calculated and measured doses across the 13 VMAT plans was 0.6% with an optimal DLG correction factor of 1.0. In a 3%/1 mm gamma analysis, the mean percentage of pixels passing across the MLC test set was 43.5%. For the off-axis open field tests, the Mobius3D-calculated dose for a 1.5 cm square field was 4.6% lower than the chamber-measured dose. Conclusions: Mobius3D was shown to have dose calculation uncertainties for small fields, and the MLC tongue-and-groove design is not adequately taken into consideration in Mobius3D. Careful consideration of the DLG correction factor, which affects the resulting dose distributions, is required when commissioning Mobius3D for patient-specific QA.

18.
Purpose: Whole-body bone scintigraphy is the most widely used method for detecting bone metastases in advanced cancer. However, its interpretation depends on the experience of the radiologist. Some automatic interpretation systems have been developed in order to improve diagnostic accuracy. These systems are pixel-based and do not use spatial or textural information from groups of pixels, which could be very important for classifying images with better accuracy. This paper presents a fast method of object-oriented classification that facilitates easier interpretation of bone scintigraphy images. Methods: Nine whole-body images from patients suspected of having bone metastases were analyzed in this preliminary study. First, an edge-based segmentation algorithm together with the full lambda-schedule algorithm was used to identify the objects in the bone scintigraphy, and the textural and spatial attributes of these objects were calculated. Then, a set of objects (224 objects, ~46% of the total objects) was selected as training data based on visual examination of the image and assigned to various levels of radionuclide accumulation before data classification was performed using both k-nearest-neighbor and support vector machine classifiers. The performance of the proposed method was evaluated using statistical parameters calculated from the error matrix as metrics. Results: The proposed object-oriented classification approach, using either k-nearest-neighbor or support vector machine as the classification method, performed well in detecting bone metastasis in terms of overall accuracy (86.62 ± 2.163% and 86.81 ± 2.137% respectively) and kappa coefficient (0.6395 ± 0.0143 and 0.6481 ± 0.0218 respectively). Conclusions: The described method provided encouraging results in mapping bone metastases in whole-body bone scintigraphy.
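The two metrics reported above, overall accuracy and the kappa coefficient, are both derived from an error (confusion) matrix. A minimal sketch of the standard Cohen's kappa calculation is given below; the function name and the example matrix in the usage note are illustrative and are not data from the study.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square error matrix.

    matrix[i][j] counts the objects of true class i assigned to class j.
    Raises ZeroDivisionError if the chance agreement equals 1.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of objects on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

For a hypothetical two-class matrix `[[40, 10], [5, 45]]`, the observed agreement is 0.85 and the chance agreement is 0.5, so kappa is 0.7.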

19.
Computational studies of carbohydrates that do not consider explicit solvent molecules suffer from the strong tendency of the carbohydrate pendant hydroxyl groups to form intramolecular hydrogen bonds that are unlikely to be present in protic media. In this paper a novel approach towards molecular modelling of carbohydrates is described. The average effect of intra- and intermolecular hydrogen bonding is introduced into the potential energy function by adding a new (extended) atom type representing a carbohydrate hydroxyl group to the CHARMm force field; we coin the name CHEAT (Carbohydrate Hydroxyls represented by Extended AToms) for the resulting force field. As a training set for the parametrisation of CHEAT we used ethylene glycol, 10 cyclohexanols, 5 inositols, and 12 glycopyranoses, for which a total of 64 conformational energy differences were estimated using a set of steric interaction energies between hydroxyl and/or methyl groups on six-membered ring compounds as derived by Angyal (Angew. Chem., 8, 172-182, (1969)). The root-mean-square deviation between the estimated energy differences and the corresponding values obtained by our CHEAT approach amounts to 0.37 kcal/mol (n = 64). The CHEAT approach, which is claimed to calculate energetic and conformational properties of carbohydrates that are compatible with the aqueous state, is computationally very efficient and facilitates the calculation of nanosecond-range MD trajectories as well as systematic conformational searches of oligosaccharides.

20.
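Objective: To construct an analysis pipeline for predicting distant kinship based on an identity-by-descent (IBD) segment algorithm and to evaluate its prediction accuracy. Methods: A high-density single nucleotide polymorphism (SNP) chip was used to genotype 253 pedigree samples. The IBD segment-based pipeline was used to predict the kinship between each pair of individuals, and prediction accuracy was evaluated. SNP loci were then randomly removed to assess the effect of the number of loci on prediction accuracy. Results: For 1st- to 7th-degree kinship, the IBD segment algorithm achieved an average confidence-interval accuracy of 94.72% and a prediction reliability of 99.77%; false negatives occurred when predicting kinship of the 6th degree and above. Prediction accuracy declined to some extent as the number of SNPs decreased. Conclusion: The IBD segment algorithm can be used to predict kinship up to the 7th degree and has important application value in fields such as population genetics and forensic genetics.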
