Similar literature
1.

Background  

Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. In the alignment of a family of protein sequences, global MSA algorithms perform better than local ones in many cases, while local ones perform better than global ones when some sequences have long insertions or deletions (indels) relative to others. Many recent leading MSA algorithms have incorporated pairwise alignment information obtained from a mixture of sources into their scoring systems to improve the accuracy of alignments containing long indels.

2.
A novel closed-loop control framework is proposed to inhibit epileptiform waves in a neural mass model by means of an external electric field, where the unscented Kalman filter is used to reconstruct the dynamics and estimate the unmeasurable parameters of the model. Specifically, an iterative learning control algorithm is introduced into the framework to optimize the control signal, so that the control effect improves significantly based on observations from past attempts. Accordingly, the proposed method can effectively suppress the epileptiform wave and is robust to noise and uncertainties. Simulations are carried out to illustrate the feasibility of the proposed method. This work also shows potential value for designing model-based feedback controllers for epilepsy treatment.

3.
The potential of five different groups of materials (carbons, glass and ceramics, polymers, hydrogels and collagen) as biomaterials in artificial implant applications is examined. In addition to the physical and/or structural properties of these materials, the blood and tissue responses to implants made of these biomaterials for various applications are presented. Emphasis is placed on materials related to the intended application: catheter tips and glucose biosensors to be used in conjunction with an implantable insulin delivery system as a complete artificial pancreas.

4.
MOTIVATION: We describe a novel method for detecting the domain structure of a protein from sequence information alone. The method is based on analyzing multiple sequence alignments that are derived from a database search. Multiple measures are defined to quantify the domain information content of each position along the sequence and are combined into a single predictor using a neural network. The output is further smoothed and post-processed using a probabilistic model to predict the most likely transition positions between domains. RESULTS: The method was assessed using the domain definitions in SCOP and CATH for proteins of known structure and was compared with several other existing methods. Our method performs well both in terms of accuracy and sensitivity. It improves significantly over the best methods available, even some of the semi-manual ones, while being fully automatic. Our method can also be used to suggest and verify domain partitions based on structural data. A few examples of predicted domain definitions and alternative partitions, as suggested by our method, are also discussed. AVAILABILITY: An online domain-prediction server is available at http://biozon.org/tools/domains/

5.
Localization of seizure sources prior to neurosurgery is crucial. In this paper, a new method is proposed to localize the seizure sources from multi-channel electroencephalogram (EEG) signals. Blind source separation based on second order blind identification (SOBI) is first applied to estimate the brain source signals in each window of the EEG signals. A new clustering method based on rival penalized competitive learning (RPCL) is then developed to cluster the rows of the estimated unmixing matrices in all the windows. The algorithm also includes pre- and post-processing stages. By multiplying each cluster center by the EEG signals, the brain signal sources are approximated. According to a complexity value measure, the main seizure source signal is separated from the others. This signal is projected back to the electrodes' space and is subjected to dipole source localization using a single dipole model. The simulation results verify the accuracy of the system. In addition, the localization of the seizure source is consistent with clinical tests derived from simultaneous intracranial recordings.

6.
Fast sequence clustering using a suffix array algorithm
MOTIVATION: Efficient clustering is important for handling the large number of available EST sequences. Most contemporary methods are based on some kind of all-against-all comparison, resulting in a quadratic time complexity. A different approach is needed to keep up with the rapid growth of EST data. RESULTS: A new, fast EST clustering algorithm is presented. Sub-quadratic time complexity is achieved by using an algorithm based on suffix arrays. A prototype implementation has been developed and run on a benchmark data set. The produced clusterings are validated by comparing them to clusterings produced by other methods, and the results are quite promising. AVAILABILITY: The source code for the prototype implementation is available under a GPL license from http://www.ii.uib.no/~ketil/bio/.
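
The idea of clustering through a sorted suffix structure, rather than all-against-all comparison, can be sketched as follows. This is a minimal illustration and not the authors' prototype: the function name `cluster_by_shared_substring`, the seed length `k`, and the single-linkage rule (two sequences join a cluster if they share any substring of length `k`) are assumptions, and the naive sort used here is itself not sub-quadratic.

```python
def cluster_by_shared_substring(seqs, k):
    """Group sequences that share at least one exact substring of length k.

    Hypothetical helper for illustration; a real implementation would build
    the suffix array with a dedicated construction algorithm.
    """
    # Every suffix of length >= k, identified by (sequence index, offset).
    suffixes = [(i, j) for i, s in enumerate(seqs) for j in range(len(s) - k + 1)]
    # Sorting the suffixes lexicographically stands in for suffix array construction.
    suffixes.sort(key=lambda p: seqs[p[0]][p[1]:])

    # Union-find over sequence indices.
    parent = list(range(len(seqs)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Suffixes sharing a k-prefix are contiguous in sorted order, so it is
    # enough to compare each suffix with its immediate neighbour.
    for (a, i), (b, j) in zip(suffixes, suffixes[1:]):
        if seqs[a][i:i + k] == seqs[b][j:j + k]:
            parent[find(a)] = find(b)

    clusters = {}
    for idx in range(len(seqs)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())


if __name__ == "__main__":
    reads = ["ACGTACGTAA", "TTACGTACGG", "GGGGCCCCGG"]
    print(cluster_by_shared_substring(reads, 6))
```

Only neighbouring entries of the sorted list are compared, which is what removes the explicit pairwise loop over sequences.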

7.
 The coherence function measures the amount of correlation between two signals x and y as a function of the frequency, independently of their causal relationships. Therefore, the coherence function is not useful in deciding whether an open-loop relationship between x and y is set (x acts on y, but the reverse relationship is prevented) or x and y interact in a closed loop (x affects y, and vice versa). This study proposes a method based on a bivariate autoregressive model to derive the strength of the causal coupling on both arms of a closed loop. The method exploits the definition of causal coherence. After the closed-loop identification of the model coefficients, the causal coherence is calculated by switching off separately the feedback or the feedforward path, thus opening the closed loop and fixing causality. The method was tested in simulations and applied to evaluate the degree of the causal coupling between two variables known to interact in a closed loop mainly at a low frequency (LF, around 0.1 Hz) and at a high frequency (HF, at the respiratory rate): the heart period (RR interval) and systolic arterial pressure (SAP). In dogs at control, the RR interval and the SAP are highly correlated at HF. This coupling occurs in the causal direction from the RR interval to the SAP (the mechanical path), while the coupling on the reverse causal direction (the baroreflex path) is not significant, thus pointing out the importance of the direct effects of respiration on the RR interval. Total baroreceptive denervation, by opening the closed loop at the level of the influences of SAP on RR interval, does not change these results. In elderly healthy men at rest, the RR interval and SAP are highly correlated at the LF and the HF. At the HF, a significant coupling in both causal directions is found, even though closed-loop interactions are detected in few cases. At the LF, the link on the baroreflex pathway is negligible with respect to that on the reverse mechanical one. 
In heart transplant recipients, in whom SAP variations do not cause RR interval changes as a result of the cardiac denervation, the method correctly detects a significant coupling only on the pathway from the RR interval to the SAP. Received: 28 June 2001 / Accepted in revised form: 23 October 2001
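
For reference, the ordinary (non-causal) coherence that this abstract starts from is usually written as the magnitude-squared coherence below. The symbols $S_{xx}$, $S_{yy}$ (auto-spectra) and $S_{xy}$ (cross-spectrum) are assumed notation and do not appear in the abstract; the causal coherence itself is defined in the paper and is not reconstructed here.

```latex
\gamma_{xy}^{2}(f) \;=\; \frac{\lvert S_{xy}(f)\rvert^{2}}{S_{xx}(f)\,S_{yy}(f)},
\qquad 0 \le \gamma_{xy}^{2}(f) \le 1 .
```

Because this expression is symmetric in $x$ and $y$, it cannot by itself indicate the direction of the coupling, which is the limitation the abstract describes.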

8.
ABSTRACT

Actigraphy is widely used in sleep studies but lacks a universal unsupervised algorithm for sleep/wake identification. An unsupervised algorithm is useful in large-scale population studies and in cases where polysomnography (PSG) is unavailable, as it does not require sleep outcome labels to train the model but utilizes information solely contained in actigraphy to learn sleep and wake characteristics and separate the two states. In this study, we proposed a machine learning unsupervised algorithm based on the Hidden Markov Model (HMM) for sleep/wake identification. The proposed algorithm is also an individualized approach that takes into account individual variabilities and analyzes each individual actigraphy profile separately to infer sleep and wake states. We used Actiwatch and PSG data from 43 individuals in the Multi-Ethnic Study of Atherosclerosis study to evaluate the method performance. Epoch-by-epoch comparisons and sleep variable comparisons were made between our algorithm, the unsupervised algorithm embedded in the Actiwatch software (AS), and the pre-trained supervised UCSD algorithm. Using PSG as the reference, the accuracy was 85.7% for HMM, 84.7% for AS, and 85.0% for UCSD. The sensitivity was 99.3%, 99.7%, and 98.9% for HMM, AS, and UCSD, respectively, and the specificity was 36.4%, 30.0%, and 31.7%, respectively. The Kappa statistic was 0.446 for HMM, 0.399 for AS, and 0.311 for UCSD, suggesting fair to moderate agreement between PSG and actigraphy. The Bland–Altman plots further show that the total sleep time, sleep latency, and sleep efficiency estimates by HMM were closer to PSG with narrower 95% limits of agreement than AS and UCSD. All three methods tend to overestimate sleep and underestimate wake compared to PSG. 
Our HMM approach is also able to differentiate relatively active and sedentary individuals by quantifying variabilities in activity counts: individuals with higher estimated activity variabilities tend to show more frequent sedentary behaviors. Our unsupervised data-driven HMM algorithm achieved better performance than the commonly used Actiwatch software algorithm and the pre-trained UCSD algorithm. HMM can help expand the application of actigraphy in cases where PSG is hard to acquire and supervised methods cannot be trained. In addition, the estimated HMM parameters can characterize individual activity patterns and sedentary tendencies that can be further utilized in downstream analysis.
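
The decoding step of a two-state sleep/wake HMM can be sketched with the Viterbi algorithm. This is only an illustration under stated assumptions: the transition probabilities, the Gaussian emission model on `log(1 + count)`, and all parameter values are hypothetical and hand-picked, whereas the study estimates its parameters from each individual's actigraphy without labels.

```python
import math

STATES = ("sleep", "wake")

# Hypothetical parameters, chosen by hand for illustration only.
LOG_START = {"sleep": math.log(0.5), "wake": math.log(0.5)}
LOG_TRANS = {
    "sleep": {"sleep": math.log(0.9), "wake": math.log(0.1)},
    "wake": {"sleep": math.log(0.1), "wake": math.log(0.9)},
}
EMIT = {"sleep": (0.5, 1.0), "wake": (4.5, 1.5)}  # (mean, sd) of log(1 + count)


def log_emit(state, count):
    """Gaussian log-density of log(1 + count) under the given state."""
    mean, sd = EMIT[state]
    x = math.log1p(count)
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd) - 0.5 * math.log(2 * math.pi)


def viterbi(counts):
    """Return the most likely state sequence for a list of activity counts."""
    # best[s]: log-probability of the best path that ends in state s.
    best = {s: LOG_START[s] + log_emit(s, counts[0]) for s in STATES}
    back = []
    for count in counts[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[p] + LOG_TRANS[p][s])
            scores[s] = best[prev] + LOG_TRANS[prev][s] + log_emit(s, count)
            pointers[s] = prev
        best = scores
        back.append(pointers)
    # Trace the best path backwards from the most likely final state.
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    epochs = [0, 0, 3, 0, 250, 400, 180, 0, 0, 1]
    print(viterbi(epochs))
```

With these hand-picked parameters, the three high-count epochs decode as wake and the rest as sleep; an unsupervised fit would replace the constants with values learned per individual.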

9.
Loops are the most variable regions of protein structure and are, in general, the least accurately predicted. Their prediction has been approached in two ways, ab initio and database search. In recent years, it has been thought that ab initio methods are more powerful. In light of the continued rapid expansion in the number of known protein structures, we have re-evaluated FREAD, a database search method, and demonstrate that the power of database search methods may have been underestimated. We found that sequence similarity as quantified by environment specific substitution scores can be used to significantly improve prediction. In fact, FREAD performs appreciably better for an identifiable subset of loops (two thirds of shorter loops and half of the longer loops tested) than the ab initio methods of MODELLER, PLOP, and RAPPER. Within this subset, FREAD's predictive ability is length independent, in general producing results within 2 Å RMSD, compared to an average of over 10 Å for loop length 20 for any of the other tested methods. We also benchmarked the prediction protocols on a set of 212 loops from the model structures in CASP 7 and 8. An extended version of FREAD is able to make predictions for 127 of these, and it gives the best prediction of the methods tested in 61 of these cases. In examining FREAD's ability to predict in the model environment, we found that whole structure quality did not affect the quality of loop predictions. Proteins 2010. © 2009 Wiley-Liss, Inc.

10.
Chained learning architectures in a simple closed-loop behavioural context

Objective

Living creatures can learn or improve their behaviour by temporally correlating sensor cues where near-senses (e.g., touch, taste) follow after far-senses (vision, smell). This type of learning is related to classical and/or operant conditioning. Algorithmically, all these approaches are very simple and consist of a single learning unit. The current study addresses this limitation by focusing on chained learning architectures in a simple closed-loop behavioural context.

Methods

We applied temporal sequence learning (Porr B and Wörgötter F 2006) in a closed-loop behavioural system where a driving robot learns to follow a line. Here for the first time we introduced two types of chained learning architectures named linear chain and honeycomb chain. We analyzed such architectures in an open and closed-loop context and compared them to the simple learning unit.

Conclusions

By implementing two types of simple chained learning architectures we have demonstrated that stable behaviour can also be obtained in such architectures. Results also suggest that chained architectures can be employed and better behavioural performance can be obtained compared to simple architectures in cases where we have sparse inputs in time and learning normally fails because of weak correlations.

11.

Background  

Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics – the practice of comparing genomic sequences from different species – plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To improve the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program.

12.
A generalized linear model with Gamma errors is used to estimate the coordinates of a restriction map when the site order is known. This can be conveniently programmed in a wide range of statistical packages (e.g. Genstat 5, Minitab, SAS), and gives maximum likelihood estimates with their associated optimal properties. Regression diagnostics allow the checking of assumptions and help to identify mis-specified, influential or discordant fragment lengths. A specific diagnostic for identifying fragment lengths causing reversal of restriction site order is derived. Exact ‘fragment’ lengths from DNA sequencing can be conveniently included in an approximate manner by giving them a larger weight than observed restriction fragment lengths. Two examples and the Genstat 5 codes used in their analysis are presented.

13.
The publishers wish to apologise for typesetting errors that appeared in two equations, on pages 541 and 547, of the above paper. The correct versions are presented below.

14.
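Protein energy minimization is an important part of protein folding. A new hybrid evolutionary algorithm for protein folding combines crossover and Cauchy mutation. Energy-minimization examples based on the toy model show that the new hybrid evolutionary algorithm is effective.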
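
How crossover and Cauchy mutation can be combined in one evolutionary loop is sketched below. This is not the algorithm from the paper: the arithmetic crossover, the elitist selection that keeps the best of parents and children, the parameter defaults, and the function names `cauchy` and `evolve` are assumptions, and the `energy` argument is a placeholder for the toy-model energy, which the abstract does not define.

```python
import math
import random


def cauchy(scale, rng):
    """Draw one Cauchy-distributed step via the inverse-CDF transform."""
    return scale * math.tan(math.pi * (rng.random() - 0.5))


def evolve(energy, dim, pop_size=30, generations=200, scale=0.1, seed=0):
    """Minimize `energy` over real vectors of length `dim`.

    Returns the best vector found and the best energy after each generation.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(-math.pi, math.pi) for _ in range(dim)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            a, b = rng.sample(pop, 2)
            # Arithmetic crossover: a random convex combination of two parents.
            w = rng.random()
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            # Cauchy mutation: heavy tails allow occasional long jumps.
            child = [x + cauchy(scale, rng) for x in child]
            children.append(child)
        # Elitist selection: keep the pop_size lowest-energy individuals.
        pop = sorted(pop + children, key=energy)[:pop_size]
        history.append(energy(pop[0]))
    return pop[0], history


if __name__ == "__main__":
    # Stand-in objective for demonstration only, not a protein energy.
    best, history = evolve(lambda v: sum(x * x for x in v), dim=4)
    print(history[0], history[-1])
```

The heavy tails of the Cauchy distribution produce occasional large steps, which is the usual motivation for preferring it to Gaussian mutation when the energy landscape has many local minima.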

15.
MOTIVATION: Preliminary results on the data produced using the Affymetrix large-scale genotyping platforms show that it is necessary to construct improved genotype calling algorithms. There is evidence that some of the existing algorithms lead to an increased error rate in heterozygous genotypes, and a disproportionately large rate of heterozygotes with missing genotypes. Non-random errors and missing data can lead to an increase in the number of false discoveries in genetic association studies. Therefore, the factors that need to be evaluated in assessing the performance of an algorithm are the missing data (call) and error rates, but also the heterozygous proportions in missing data and errors. RESULTS: We introduce a novel genotype calling algorithm (GEL) for the Affymetrix GeneChip arrays. The algorithm uses likelihood calculations that are based on distributions inferred from the observed data. A key ingredient in accurate genotype calling is weighting the information that comes from each probe quartet according to the quality/reliability of the data in the quartet, and prior information on the performance of the quartet. AVAILABILITY: The GEL software is implemented in R and is available by request from the corresponding author at nicolae@galton.uchicago.edu.

16.
Insertion sequence common region elements: a newly emerging gene-capturing system in bacteria
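Abstract: Insertion sequence common region (ISCR) elements are a special class of insertion sequences that resemble the IS91 family in structure and function. They lack terminal inverted repeats (IRs), do not generate direct repeats at the insertion site, and transpose by a rolling-circle (RC) mechanism. As a new gene-capturing system, ISCR elements can mobilize any adjacent DNA sequence, providing an efficient vehicle for the horizontal spread of resistance genes among bacteria of different species and genera. Nineteen ISCR elements have been found in a variety of Gram-negative pathogens around the world, and most of them carry several resistance genes at once, suggesting that ISCR elements may cause the rapid spread of multidrug resistance in bacteria. This article reviews research progress on the structural features, types, modes of mobilization, origin, and evolution of ISCR elements.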

17.
18.
Protein libraries are essential to the field of protein engineering. Increasingly, probabilistic protein design is being used to synthesize combinatorial protein libraries, which allow the protein engineer to explore a vast space of amino acid sequences, while at the same time placing restrictions on the amino acid distributions. To this end, if site-specific amino acid probabilities are input as the target, then the codon nucleotide distributions that match this target distribution can be used to generate a partially randomized gene library. However, it turns out to be a highly nontrivial computational task to find the codon nucleotide distributions that exactly match a given target distribution of amino acids. We first showed that for any given target distribution an exact solution may not exist at all. Formulating the task as a constrained optimization problem, we then developed a genetic algorithm-based approach to find codon nucleotide distributions that match the target amino acid distribution as closely as possible. As compared with the previous gradient descent method on various objective functions, the new method consistently gave more optimized distributions as measured by the relative entropy between the calculated and the target distributions. To simulate the actual lab solutions, new objective functions were designed to allow for two separate sets of codons in seeking a better match to the target amino acid distribution.
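
The quantity being optimized can be made concrete with a small sketch that maps per-position nucleotide probabilities to the implied amino acid distribution and scores it against a target. The function names, the assumption that the three codon positions are sampled independently, and the direction of the relative entropy (target relative to calculated) are illustrative choices that the abstract does not specify; the codon table is the standard genetic code.

```python
import math
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def amino_acid_distribution(position_probs):
    """Amino acid probabilities implied by three independent base distributions.

    position_probs: three dicts mapping base -> probability, one per codon position.
    """
    dist = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for base, probs in zip(codon, position_probs):
            p *= probs.get(base, 0.0)
        dist[aa] = dist.get(aa, 0.0) + p
    return dist


def relative_entropy(target, calculated):
    """Kullback-Leibler divergence of the target from the calculated distribution."""
    total = 0.0
    for aa, p in target.items():
        if p > 0:
            q = calculated.get(aa, 0.0)
            if q == 0:
                return math.inf  # target residue is unreachable with these codons
            total += p * math.log(p / q)
    return total


if __name__ == "__main__":
    # G at position 1, C or A at position 2, T at position 3: codons GCT and GAT.
    probs = [{"G": 1.0}, {"C": 0.5, "A": 0.5}, {"T": 1.0}]
    dist = amino_acid_distribution(probs)
    print({aa: p for aa, p in dist.items() if p > 0})
    print(relative_entropy({"A": 0.5, "D": 0.5}, dist))
```

An optimizer such as a genetic algorithm would search over the three base distributions to minimize this score; targets that no product of position-wise distributions can reproduce are the cases in which an exact solution does not exist.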

19.
Leong MK, Chen HB, Shih YH. PLoS ONE 2012, 7(3): e33829

Background

P-glycoprotein (P-gp) is an ATP-dependent membrane transporter that plays a pivotal role in eliminating xenobiotics by active extrusion of xenobiotics from the cell. Multidrug resistance (MDR) is highly associated with the over-expression of P-gp by cells, resulting in increased efflux of chemotherapeutical agents and reduction of intracellular drug accumulation. It is of clinical importance to develop a P-gp inhibition predictive model in the process of drug discovery and development.

Methodology/Principal Findings

An in silico model was derived to predict the inhibition of P-gp using the newly invented pharmacophore ensemble/support vector machine (PhE/SVM) scheme based on the data compiled from the literature. The predictions by the PhE/SVM model were found to be in good agreement with the observed values for those structurally diverse molecules in the training set (n = 31, r² = 0.89, q² = 0.86, RMSE = 0.40, s = 0.28), the test set (n = 88, r² = 0.87, RMSE = 0.39, s = 0.25) and the outlier set (n = 11, r² = 0.96, RMSE = 0.10, s = 0.05). The generated PhE/SVM model also showed high accuracy when subjected to those validation criteria generally adopted to gauge the predictivity of a theoretical model.

Conclusions/Significance

This accurate, fast and robust PhE/SVM model, which can take into account the promiscuous nature of P-gp, can be applied to predict the P-gp inhibition of structurally diverse compounds in a high-throughput fashion, which otherwise cannot be done by any other method, to facilitate drug discovery and development by designing drug candidates with better metabolism profiles.

20.
We have studied the adsorption of argon at 87 K in slit pores of finite length with a smooth graphitic potential, open at both ends or closed at one end. Simulations were carried out using conventional GCMC (grand canonical Monte Carlo) or kMC (kinetic Monte Carlo) in the canonical ensemble with extremely long Markov chains of at least 2 × 10^8 configurations; selected simulations with much longer Markov chains do not show any change in the results. When the pore width is in the micropore range (0.65 nm), type I isotherms are obtained for both pore models and for both simulation methods. However, wider pores (1, 2 and 3 nm in width) all exhibit hysteresis loops in the GCMC simulations, while in the canonical ensemble simulations, the isotherms pass through a sigmoid van der Waals type loop in the transition region. This loop locates the true equilibrium transition. For the pores with one closed end, this transition is close to, or coincides with, the adsorption branch of the GCMC hysteresis loop, but for the open-ended pores, it is more closely associated with the desorption branch. In a separate study of adsorption hysteresis in an infinitely long slit pore, using both simulation techniques, the van der Waals loop follows the adsorption branch of the GCMC isotherm to the transition, then reverts to a long vertical section that falls midway between the two hysteresis branches and finally moves to the desorption transition close to the evaporation pressure. An examination of molecular distributions inside the pores reveals two coexisting phases in the canonical simulations, whereas in the grand canonical simulations, the molecules are uniformly distributed along the length of the pores.
