Similar articles
1.

Background  

Multiple sequence alignment (MSA) is a useful tool in bioinformatics. Although many MSA algorithms have been developed, there is still room for improvement in accuracy and speed. In the alignment of a family of protein sequences, global MSA algorithms perform better than local ones in many cases, while local ones perform better than global ones when some sequences have long insertions or deletions (indels) relative to others. Many recent leading MSA algorithms have incorporated pairwise alignment information obtained from a mixture of sources into their scoring system to improve the accuracy of alignments containing long indels.

2.
A novel closed-loop control framework is proposed to inhibit epileptiform waves in a neural mass model using an external electric field, where the unscented Kalman filter is used to reconstruct the dynamics and estimate unmeasurable parameters of the model. Specifically, an iterative learning control algorithm is introduced into the framework to optimize the control signal. In the proposed method, the control effect can be significantly improved based on observations of past attempts. Accordingly, the proposed method can effectively suppress epileptiform waves while remaining robust to noise and uncertainties. Finally, simulations are carried out to illustrate the feasibility of the proposed method. This work also shows potential value for designing model-based feedback controllers for epilepsy treatment.

3.
The potential of five different groups of materials (carbons, glass and ceramics, polymers, hydrogels and collagen) as biomaterials in artificial implant applications is examined. In addition to the physical and/or structural properties of these materials, the blood and tissue responses to implants made of these biomaterials for various applications are presented. Emphasis is placed on materials related to the intended application: catheter tips and glucose biosensors to be used in conjunction with an implantable insulin delivery system as a complete artificial pancreas.

4.
MOTIVATION: We describe a novel method for detecting the domain structure of a protein from sequence information alone. The method is based on analyzing multiple sequence alignments that are derived from a database search. Multiple measures are defined to quantify the domain information content of each position along the sequence and are combined into a single predictor using a neural network. The output is further smoothed and post-processed using a probabilistic model to predict the most likely transition positions between domains. RESULTS: The method was assessed using the domain definitions in SCOP and CATH for proteins of known structure and was compared with several other existing methods. Our method performs well both in terms of accuracy and sensitivity. It improves significantly over the best methods available, even some of the semi-manual ones, while being fully automatic. Our method can also be used to suggest and verify domain partitions based on structural data. A few examples of predicted domain definitions and alternative partitions, as suggested by our method, are also discussed. AVAILABILITY: An online domain-prediction server is available at http://biozon.org/tools/domains/

5.
Fast sequence clustering using a suffix array algorithm
MOTIVATION: Efficient clustering is important for handling the large amount of available EST sequences. Most contemporary methods are based on some kind of all-against-all comparison, resulting in a quadratic time complexity. A different approach is needed to keep up with the rapid growth of EST data. RESULTS: A new, fast EST clustering algorithm is presented. Sub-quadratic time complexity is achieved by using an algorithm based on suffix arrays. A prototype implementation has been developed and run on a benchmark data set. The produced clusterings are validated by comparing them to clusterings produced by other methods, and the results are quite promising. AVAILABILITY: The source code for the prototype implementation is available under a GPL license from http://www.ii.uib.no/~ketil/bio/.
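
As a rough illustration of how a suffix array can replace all-against-all comparison, the sketch below sorts every suffix of the concatenated sequences once and then inspects only neighbouring suffixes. It is not the authors' prototype: the function names, the `$` separator, the toy sequences and the `min_match` threshold are all assumptions made for this example, and the naive sort-based construction is only suitable for tiny inputs.

```python
def build_suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix text.
    # Real tools use much faster construction algorithms.
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(text, i, j, sep="$"):
    # Length of the shared prefix of two suffixes, never crossing a separator.
    n = 0
    while (i + n < len(text) and j + n < len(text)
           and text[i + n] == text[j + n] and text[i + n] != sep):
        n += 1
    return n


def cluster_by_shared_substring(seqs, min_match):
    # Hypothetical criterion: two sequences join the same cluster if they
    # share an exact substring of at least min_match characters.
    text = "$".join(seqs) + "$"
    owner = []
    for idx, seq in enumerate(seqs):
        owner.extend([idx] * len(seq))
        owner.append(-1)  # separator position belongs to no sequence

    parent = list(range(len(seqs)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    sa = build_suffix_array(text)
    # Suffixes sharing a long prefix sit next to each other in sorted order,
    # so checking adjacent pairs is enough to link every matching pair.
    for a, b in zip(sa, sa[1:]):
        if owner[a] != owner[b] and common_prefix_len(text, a, b) >= min_match:
            parent[find(owner[a])] = find(owner[b])

    groups = {}
    for idx in range(len(seqs)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(sorted(g) for g in groups.values())


toy = ["ACGTACGTTT", "GGACGTACGA", "TTTTCCCCAA", "CCCCAAGG"]
print(cluster_by_shared_substring(toy, 6))
```

Only adjacent entries of the sorted array are compared, which is where the saving over pairwise comparison comes from; the clustering criterion used in the published method may well differ from the exact-substring rule assumed here.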

6.
ABSTRACT

Actigraphy is widely used in sleep studies but lacks a universal unsupervised algorithm for sleep/wake identification. An unsupervised algorithm is useful in large-scale population studies and in cases where polysomnography (PSG) is unavailable, as it does not require sleep outcome labels to train the model but utilizes information solely contained in actigraphy to learn sleep and wake characteristics and separate the two states. In this study, we proposed a machine learning unsupervised algorithm based on the Hidden Markov Model (HMM) for sleep/wake identification. The proposed algorithm is also an individualized approach that takes into account individual variabilities and analyzes each individual actigraphy profile separately to infer sleep and wake states. We used Actiwatch and PSG data from 43 individuals in the Multi-Ethnic Study of Atherosclerosis study to evaluate the method performance. Epoch-by-epoch comparisons and sleep variable comparisons were made between our algorithm, the unsupervised algorithm embedded in the Actiwatch software (AS), and the pre-trained supervised UCSD algorithm. Using PSG as the reference, the accuracy was 85.7% for HMM, 84.7% for AS, and 85.0% for UCSD. The sensitivity was 99.3%, 99.7%, and 98.9% for HMM, AS, and UCSD, respectively, and the specificity was 36.4%, 30.0%, and 31.7%, respectively. The Kappa statistic was 0.446 for HMM, 0.399 for AS, and 0.311 for UCSD, suggesting fair to moderate agreement between PSG and actigraphy. The Bland–Altman plots further show that the total sleep time, sleep latency, and sleep efficiency estimates by HMM were closer to PSG with narrower 95% limits of agreement than AS and UCSD. All three methods tend to overestimate sleep and underestimate wake compared to PSG. 
Our HMM approach is also able to differentiate relatively active and sedentary individuals by quantifying variabilities in activity counts: individuals with higher estimated activity variabilities tend to show more frequent sedentary behaviors. Our unsupervised data-driven HMM algorithm achieved better performance than the commonly used Actiwatch software algorithm and the pre-trained UCSD algorithm. HMM can help expand the application of actigraphy in cases where PSG is hard to acquire and supervised methods cannot be trained. In addition, the estimated HMM parameters can characterize individual activity patterns and sedentary tendencies that can be further utilized in downstream analysis.
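
The epoch-by-epoch figures quoted in this abstract (accuracy, sensitivity, specificity and the Kappa statistic) can be computed from two label sequences as sketched below. This is a minimal sketch, not the study's code: it assumes that sleep is treated as the positive class, which is consistent with the high sensitivity and low specificity reported, and the function name and toy labels are invented for illustration.

```python
def agreement_metrics(reference, predicted, positive="sleep"):
    # Compare two equally long sequences of epoch labels.
    # Assumes both classes occur in the reference and that chance
    # agreement is below 1, otherwise a division by zero occurs.
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")

    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref == positive and pred == positive:
            tp += 1
        elif ref == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1

    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals (Cohen's kappa).
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return {
        "accuracy": observed,
        "sensitivity": tp / (tp + fn),  # sleep epochs correctly detected
        "specificity": tn / (tn + fp),  # wake epochs correctly detected
        "kappa": (observed - expected) / (1 - expected),
    }


# Toy example: the predictor overestimates sleep.
psg = ["sleep"] * 6 + ["wake"] * 4
actigraphy = ["sleep"] * 8 + ["wake"] * 2
print(agreement_metrics(psg, actigraphy))
```

In the toy example the predictor labels two wake epochs as sleep, so sensitivity stays high while specificity drops, the same qualitative pattern the abstract describes for all three methods.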

7.
 The coherence function measures the amount of correlation between two signals x and y as a function of the frequency, independently of their causal relationships. Therefore, the coherence function is not useful in deciding whether an open-loop relationship between x and y is set (x acts on y, but the reverse relationship is prevented) or x and y interact in a closed loop (x affects y, and vice versa). This study proposes a method based on a bivariate autoregressive model to derive the strength of the causal coupling on both arms of a closed loop. The method exploits the definition of causal coherence. After the closed-loop identification of the model coefficients, the causal coherence is calculated by switching off separately the feedback or the feedforward path, thus opening the closed loop and fixing causality. The method was tested in simulations and applied to evaluate the degree of the causal coupling between two variables known to interact in a closed loop mainly at a low frequency (LF, around 0.1 Hz) and at a high frequency (HF, at the respiratory rate): the heart period (RR interval) and systolic arterial pressure (SAP). In dogs at control, the RR interval and the SAP are highly correlated at HF. This coupling occurs in the causal direction from the RR interval to the SAP (the mechanical path), while the coupling on the reverse causal direction (the baroreflex path) is not significant, thus pointing out the importance of the direct effects of respiration on the RR interval. Total baroreceptive denervation, by opening the closed loop at the level of the influences of SAP on RR interval, does not change these results. In elderly healthy men at rest, the RR interval and SAP are highly correlated at the LF and the HF. At the HF, a significant coupling in both causal directions is found, even though closed-loop interactions are detected in few cases. At the LF, the link on the baroreflex pathway is negligible with respect to that on the reverse mechanical one. 
In heart transplant recipients, in which SAP variations do not cause RR interval changes as a result of the cardiac denervation, the method correctly detects a significant coupling only on the pathway from the RR interval to the SAP. Received: 28 June 2001 / Accepted in revised form: 23 October 2001

8.
Chained learning architectures in a simple closed-loop behavioural context

Objective

Living creatures can learn or improve their behaviour by temporally correlating sensor cues where near-senses (e.g., touch, taste) follow far-senses (vision, smell). This type of learning is related to classical and/or operant conditioning. Algorithmically, all these approaches are very simple and consist of a single learning unit. The current study addresses this limitation by focusing on chained learning architectures in a simple closed-loop behavioural context.

Methods

We applied temporal sequence learning (Porr B and Wörgötter F 2006) in a closed-loop behavioural system where a driving robot learns to follow a line. Here, for the first time, we introduced two types of chained learning architectures, named linear chain and honeycomb chain. We analyzed these architectures in open- and closed-loop contexts and compared them to the simple learning unit.

Conclusions

By implementing two types of simple chained learning architectures we have demonstrated that stable behaviour can also be obtained in such architectures. Results also suggest that chained architectures can be employed and better behavioural performance can be obtained compared to simple architectures in cases where we have sparse inputs in time and learning normally fails because of weak correlations.

9.

Background  

Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics – the practice of comparing genomic sequences from different species – plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To offset the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program.

10.
A generalized linear model with Gamma errors is used to estimate the coordinates of a restriction map when the site order is known. This can be conveniently programmed in a wide range of statistical packages (e.g. Genstat 5, Minitab, SAS), and gives maximum likelihood estimates with their associated optimal properties. Regression diagnostics allow the checking of assumptions and help to identify mis-specified, influential or discordant fragment lengths. A specific diagnostic for identifying fragment lengths causing reversal of restriction site order is derived. Exact ‘fragment’ lengths from DNA sequencing can be conveniently included in an approximate manner by giving them a larger weight than observed restriction fragment lengths. Two examples and the Genstat 5 codes used in their analysis are presented.

11.
MOTIVATION: Preliminary results on the data produced using the Affymetrix large-scale genotyping platforms show that it is necessary to construct improved genotype calling algorithms. There is evidence that some of the existing algorithms lead to an increased error rate in heterozygous genotypes, and a disproportionately large rate of heterozygotes with missing genotypes. Non-random errors and missing data can lead to an increase in the number of false discoveries in genetic association studies. Therefore, the factors that need to be evaluated in assessing the performance of an algorithm are the missing data (call) and error rates, but also the heterozygous proportions in missing data and errors. RESULTS: We introduce a novel genotype calling algorithm (GEL) for the Affymetrix GeneChip arrays. The algorithm uses likelihood calculations that are based on distributions inferred from the observed data. A key ingredient in accurate genotype calling is weighting the information that comes from each probe quartet according to the quality/reliability of the data in the quartet, and prior information on the performance of the quartet. AVAILABILITY: The GEL software is implemented in R and is available by request from the corresponding author at nicolae@galton.uchicago.edu.

12.
Protein libraries are essential to the field of protein engineering. Increasingly, probabilistic protein design is being used to synthesize combinatorial protein libraries, which allow the protein engineer to explore a vast space of amino acid sequences, while at the same time placing restrictions on the amino acid distributions. To this end, if site-specific amino acid probabilities are input as the target, then the codon nucleotide distributions that match this target distribution can be used to generate a partially randomized gene library. However, it turns out to be a highly nontrivial computational task to find the codon nucleotide distributions that exactly match a given target distribution of amino acids. We first showed that for any given target distribution an exact solution may not exist at all. Formulating the task as a constrained optimization problem, we then developed a genetic algorithm-based approach to find codon nucleotide distributions that match the target amino acid distribution as closely as possible. As compared with the previous gradient descent method on various objective functions, the new method consistently gave more optimized distributions as measured by the relative entropy between the calculated and the target distributions. To simulate the actual lab solutions, new objective functions were designed to allow for two separate sets of codons in seeking a better match to the target amino acid distribution.
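
To make the matching problem concrete, the sketch below converts per-position nucleotide probabilities for one codon into the amino acid distribution they encode and scores it against a target by relative entropy. It is an illustrative sketch only, not the paper's genetic algorithm: it assumes the standard genetic code, independent nucleotide choices at the three codon positions, and the direction D(target || calculated) for the relative entropy; all names and the toy target are made up for the example.

```python
import math
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_distribution(position_probs):
    # position_probs: three dicts mapping base -> probability, one per
    # codon position, assumed independent of each other.
    dist = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for base, probs in zip(codon, position_probs):
            p *= probs.get(base, 0.0)
        if p > 0.0:
            dist[aa] = dist.get(aa, 0.0) + p
    return dist


def relative_entropy(target, calculated):
    # D(target || calculated) in nats; infinite if the library cannot
    # produce an amino acid that the target requires.
    total = 0.0
    for aa, t in target.items():
        if t == 0.0:
            continue
        c = calculated.get(aa, 0.0)
        if c == 0.0:
            return math.inf
        total += t * math.log(t / c)
    return total


# Toy target: equal parts Met (ATG) and Trp (TGG).
target = {"M": 0.5, "W": 0.5}
library = [{"A": 0.5, "T": 0.5}, {"T": 0.5, "G": 0.5}, {"G": 1.0}]
calculated = amino_acid_distribution(library)
print(calculated)
print(relative_entropy(target, calculated))
```

In this toy case any mixture that produces both ATG and TGG also produces AGG (Arg) and TTG (Leu), so a single set of position-wise distributions cannot hit the target exactly. This is the kind of situation in which the abstract notes that an exact solution may not exist and a closest match has to be sought instead.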

13.
Leong MK, Chen HB, Shih YH. PLoS ONE 2012, 7(3): e33829

Background

P-glycoprotein (P-gp) is an ATP-dependent membrane transporter that plays a pivotal role in eliminating xenobiotics by active extrusion of xenobiotics from the cell. Multidrug resistance (MDR) is highly associated with the over-expression of P-gp by cells, resulting in increased efflux of chemotherapeutical agents and reduction of intracellular drug accumulation. It is of clinical importance to develop a P-gp inhibition predictive model in the process of drug discovery and development.

Methodology/Principal Findings

An in silico model was derived to predict the inhibition of P-gp using the newly invented pharmacophore ensemble/support vector machine (PhE/SVM) scheme based on the data compiled from the literature. The predictions by the PhE/SVM model were found to be in good agreement with the observed values for those structurally diverse molecules in the training set (n = 31, r² = 0.89, q² = 0.86, RMSE = 0.40, s = 0.28), the test set (n = 88, r² = 0.87, RMSE = 0.39, s = 0.25) and the outlier set (n = 11, r² = 0.96, RMSE = 0.10, s = 0.05). The generated PhE/SVM model also showed high accuracy when subjected to those validation criteria generally adopted to gauge the predictivity of a theoretical model.

Conclusions/Significance

This accurate, fast and robust PhE/SVM model, which can take into account the promiscuous nature of P-gp, can be applied to predict the P-gp inhibition of structurally diverse compounds in a high-throughput fashion, which otherwise cannot be done by any other method. It can thereby facilitate drug discovery and development by helping to design drug candidates with better metabolism profiles.

14.
15.
We have studied the adsorption of argon at 87 K in slit pores of finite length with a smooth graphitic potential, open at both ends or closed at one end. Simulations were carried out using conventional GCMC (grand canonical Monte Carlo) or kMC (kinetic Monte Carlo) in the canonical ensemble with extremely long Markov chains, of at least 2 × 10⁸ configurations; selected simulations with much longer Markov chains do not show any change in the results. When the pore width is in the micropore range (0.65 nm), type I isotherms are obtained for both pore models and for both simulation methods. However, wider pores (1, 2 and 3 nm in width) all exhibit hysteresis loops in the GCMC simulations, while in the canonical ensemble simulations, the isotherms pass through a sigmoid van der Waals type loop in the transition region. This loop locates the true equilibrium transition. For the pores with one closed end, this transition is close to, or coincides with, the adsorption branch of the GCMC hysteresis loop, but for the open-ended pores, it is more closely associated with the desorption branch. In a separate study of adsorption hysteresis in an infinitely long slit pore, using both simulation techniques, the van der Waals loop follows the adsorption branch of the GCMC isotherm to the transition, then reverts to a long vertical section that falls midway between the two hysteresis branches and finally moves to the desorption transition close to the evaporation pressure. An examination of molecular distributions inside the pores reveals two coexisting phases in the canonical simulations, whereas in the grand canonical simulations, the molecules are uniformly distributed along the length of the pores.

16.
Han LY, Cai CZ, Ji ZL, Cao ZW, Cui J, Chen YZ. Nucleic Acids Research 2004, 32(21): 6437-6444
The function of a protein that has no sequence homolog of known function is difficult to assign on the basis of sequence similarity. The same problem may arise for homologous proteins of different functions if one is newly discovered and the other is the only known protein of similar sequence. It is desirable to explore methods that are not based on sequence similarity. One approach is to assign the functional family of a protein to provide a useful hint about its function. Several groups have employed a statistical learning method, support vector machines (SVMs), for predicting protein functional family directly from sequence irrespective of sequence similarity. These studies showed that SVM prediction accuracy is at a level useful for functional family assignment. But its capability for assignment of distantly related proteins and homologous proteins of different functions has not been critically and adequately assessed. Here SVM is tested for functional family assignment of two groups of enzymes. One consists of 50 enzymes that have no homolog of known function from PSI-BLAST search of protein databases. The other contains eight pairs of homologous enzymes of different families. SVM correctly assigns 72% of the enzymes in the first group and 62% of the enzyme pairs in the second group, suggesting that it is potentially useful for facilitating functional study of novel proteins. A web version of our software, SVMProt, is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.

17.
The AKRs (aldo-keto reductases) are a superfamily of enzymes which mainly rely on NADPH to reversibly reduce various carbonyl-containing compounds to the corresponding alcohols. A small number have been found with dual NADPH/NADH specificity, usually preferring NADPH, but none are exclusive for NADH. Crystal structures of the dual-specificity enzyme xylose reductase (AKR2B5) indicate that NAD+ is bound via a key interaction with a glutamate that is able to change conformations to accommodate the 2'-phosphate of NADP+. Sequence comparisons suggest that analogous glutamate or aspartate residues may function in other AKRs to allow NADH utilization. Based on this, nine putative enzymes with potential NADH specificity were identified and seven genes were successfully expressed and purified from Drosophila melanogaster, Escherichia coli, Schizosaccharomyces pombe, Sulfolobus solfataricus, Sinorhizobium meliloti and Thermotoga maritima. Each was assayed for co-substrate dependence with conventional AKR substrates. Three were exclusive for NADPH (AKR2E3, AKR3F2 and AKR3F3), two were dual-specific (AKR3C2 and AKR3F1) and one was specific for NADH (AKR11B2), the first such activity in an AKR. Fluorescence measurements of the seventh protein indicated that it bound both NADPH and NADH but had no activity. Mutation of the aspartate into an alanine residue or a more mobile glutamate in the NADH-specific E. coli protein converted it into an enzyme with dual specificity. These results show that the presence of this carboxylate is an indication of NADH dependence. This should allow improved prediction of co-substrate specificity and provide a basis for engineering enzymes with altered co-substrate utilization for this class of enzymes.

18.
Background aims: Multiple cell-therapy products require density separation as a part of manufacturing. The traditional method for Ficoll separation, layering cell suspensions over Ficoll in tubes, followed by centrifugation and collection of cells from the interface, is too cumbersome and poses too high a risk of contamination for clinical-scale use. Recently, a system for clinical-scale Ficoll gradient applications has been introduced (Sepax) but this system has limited availability and is costly. Methods: For preparations of mononuclear cells (MNC) for dendritic cell (DC) production, we developed a Ficoll separation protocol that employs the Haemonetics Cell Saver5 surgical blood salvage and wash instrument. This system uses standard blood bags and tubing, has single-use components, and is effectively closed. We analyzed 37 recent separation processes using this instrument and protocol. We measured depletion of red blood cells (RBC) and polymorphonuclear leukocytes (PMN), and recovery of CD14+ monocytes and MNC. Results: Starting cell counts were 14.6 ± 8.0 (×10⁹). Total cell recovery was 49.2 ± 15.2%, RBC depletion was 88.4 ± 2.8%, PMN depletion was 86.9 ± 6.1%, MNC recovery was 63.6 ± 5.0% and CD14+ monocyte recovery was 75.3 ± 9.9%. Conclusions: The Cell Saver5 is relatively inexpensive to purchase and use. The instrument and its disposables are licensed by the United States Food and Drug Administration (FDA) for intra-operative blood salvage, and we have obtained approval for investigational use. Our method with this instrument has proven to be simple and efficient for clinical-scale Ficoll separations.

19.
20.

Background  

Recently, extensive studies have been carried out on arrhythmia classification algorithms using artificial intelligence pattern recognition methods such as neural network. To improve practicality, many studies have focused on learning speed and the accuracy of neural networks. However, algorithms based on neural networks still have some problems concerning practical application, such as slow learning speeds and unstable performance caused by local minima.
