Similar articles
 20 similar articles found; search took 15 ms
1.
Sliding-window averaging of amino acid properties is a standard method for predicting protein secondary structure. For example, transmembrane segments are predicted to occur near the peaks in a hydropathy plot of a membrane protein. Such a scheme (linear convolutional recognizer, LCR) assigns a number (weight) to each type of monomer, and then convolves some window function with the sequence of weights. The window has commonly been rectangular, and the weights derived from singlet amino acid frequencies in proteins of known secondary structure or from physical properties of amino acids. The accuracy of the windows and weights has remained unknown. We use linear optimization theory to develop a general method for approximating the optimal window and weights for an LCR. The method assumes that one knows the sequences of one or more chains and the locations of their "features", regions having the secondary structure of interest. We present formulae for quantifying the accuracy of predictors. We show why the optimal LCR is more accurate than methods based on the differences between singlet monomer frequencies inside and outside features. The advantage of an optimal LCR is that its weights inherently include correlations between nearby monomer positions. The optimal predictor is not perfect, though. We argue that its inaccuracy is an intrinsic limitation of linear predictors based on monomer weights. As a practical example, we study predictors for transbilayer segments of membrane proteins. We estimate the optimal weights and windows for the two bacterial photosynthetic reaction centers whose three-dimensional structures are known. The resultant LCR, which is more accurate than previous ones, is still inexact. We apply it to bacteriorhodopsin and halorhodopsin. Several non-linear generalizations are examined as possible improvements to the LCR method: non-linear combinations of linear predictors and windowed Fourier transforms of the weight sequences.
The former do not significantly increase the accuracy, while the latter reveal a weak negative correlation between the segments and periodic variations of the weights.
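
The sliding-window scheme described in this abstract can be illustrated with a minimal sketch. It assumes a rectangular window and a toy weight table; the function name `sliding_window_scores` and the values in `TOY_WEIGHTS` are hypothetical and are not the optimized window or weights derived in the paper.

```python
from typing import Dict, List


def sliding_window_scores(sequence: str, weights: Dict[str, float], window: int) -> List[float]:
    """Average per-residue weights over a rectangular window.

    Returns one score per full window position,
    i.e. len(sequence) - window + 1 values.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and len(sequence)")
    values = [weights[residue] for residue in sequence]
    running = sum(values[:window])          # sum over the first window
    scores = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]   # slide the window by one residue
        scores.append(running / window)
    return scores


# Toy weights, purely illustrative (assumption: not a published hydropathy scale).
TOY_WEIGHTS = {"L": 1.0, "K": -1.0, "G": 0.0}

scores = sliding_window_scores("KKLLLLKK", TOY_WEIGHTS, window=4)
# Index of the window with the highest average weight (the "peak" of the plot).
peak = max(range(len(scores)), key=scores.__getitem__)
```

In this toy sequence the peak falls on the window that covers the four central `L` residues, which is the kind of signal a hydropathy plot is read for. A non-rectangular window would replace the plain average with a weighted sum.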

2.
A simple method for estimating the transition/transversion ratio was developed. This method can be applied not only to two sequences but also to more than two sequences. The statistical properties of the method and some other methods were examined by numerical computation and computer simulation. The results obtained showed that, in terms of bias and variance, the new method gives a better estimate of the transition/transversion ratio than do the other examined methods. The new method was applied to human and chimpanzee mitochondrial control region sequences. Received: 22 September 1997 / Accepted: 1 November 1997
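
For orientation, the quantity being estimated can be illustrated by naively counting observed differences between two aligned sequences. This is only a sketch of the raw counts, not the estimator proposed in the abstract; the function name `count_ti_tv` is hypothetical, and raw counts ignore multiple substitutions at the same site.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq_a: str, seq_b: str):
    """Count observed transitions and transversions between two aligned DNA sequences.

    Sites with identical bases, gaps, or ambiguous symbols are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1    # purine <-> pyrimidine
    return transitions, transversions


ti, tv = count_ti_tv("AACGT", "GATGA")
```

The observed ratio `ti / tv` is a starting point only; it tends to be biased for divergent sequences, which is the motivation for estimators such as the one the abstract describes.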

3.
Green lacewings stop flying in response to ultrasound. The behavioural response begins with folding of the wings, which starts about 40 msec following stimulation. About 66 msec later potentials from the indirect flight muscles cease. Insects resume their stationary flight after a certain period of time, which is dependent on the stimulus duration. Consistent responses occur only during the insects' night. Stimuli eliciting the cessation of flight have the following parameters: frequencies from 15 to 140 kHz, intensities above 55 dB, single pulses from 1 to 100 msec in duration, and pulse sequences having repetition rates up to 70 or 80 pulses/sec. Pulse sequences from 0.1 to 1 sec produce response durations that last longer than the stimulus, whereas pulse sequences longer than 1 sec elicit responses that do not last as long as the stimulus. The duration of the response remains nearly constant when single ultrasonic pulses are given. This flight cessation behaviour provides a mechanism whereby green lacewings can avoid predation by bats. Responses seen in green lacewings are compared with similar responses in noctuid moths.

4.
Slowly varying activity in the striatum, the main Basal Ganglia input structure, is important for the learning and execution of movement sequences. Striatal medium spiny neurons (MSNs) form cell assemblies whose population firing rates vary coherently on slow behaviourally relevant timescales. It has been shown that such activity emerges in a model of a local MSN network but only at realistic connectivities of and only when MSN generated inhibitory post-synaptic potentials (IPSPs) are realistically sized. Here we suggest a reason for this. We investigate how MSN network generated population activity interacts with temporally varying cortical driving activity, as would occur in a behavioural task. We find that at unrealistically high connectivity a stable winners-take-all type regime is found where network activity separates into fixed stimulus dependent regularly firing and quiescent components. In this regime only a small number of population firing rate components interact with cortical stimulus variations. Around connectivity a transition to a more dynamically active regime occurs where all cells constantly switch between activity and quiescence. In this low connectivity regime, MSN population components wander randomly and here too are independent of variations in cortical driving. Only in the transition regime do weak changes in cortical driving interact with many population components so that sequential cell assemblies are reproducibly activated for many hundreds of milliseconds after stimulus onset and peri-stimulus time histograms display strong stimulus and temporal specificity. We show that, remarkably, this activity is maximized at striatally realistic connectivities and IPSP sizes. Thus, we suggest the local MSN network has optimal characteristics – it is neither too stable to respond in a dynamically complex temporally extended way to cortical variations, nor is it too unstable to respond in a consistent repeatable way. 
Rather, it is optimized to generate stimulus dependent activity patterns for long periods after variations in cortical excitation.

5.
The design of slice selective pulses for magnetic resonance imaging can be cast as an optimal control problem. The Fourier synthesis method is an existing approach to solving these optimal control problems. In this method the gradient field as well as the excitation field are switched rapidly and their amplitudes are calculated based on a Fourier series expansion. Here, we provide a novel insight into the Fourier synthesis method via representing the Bloch equation in spherical coordinates. Based on the spherical Bloch equation, we propose an alternative sequence of pulses that can be used for slice selection and that is more time-efficient than the original method. Simulation results demonstrate that while the performance of both methods is approximately the same, the required time for the proposed sequence of pulses is half that of the original sequence of pulses. Furthermore, the slice selectivity of both sequences of pulses changes with radio frequency field inhomogeneities in a similar way. We also introduce a measure, referred to as gradient complexity, to compare the performance of both sequences of pulses. This measure indicates that for a desired level of uniformity in the excited slice, the gradient complexity for the proposed sequence of pulses is less than that of the original sequence.

6.
MOTIVATION: Current genomic sequence assemblers assume that the input data is derived from a single, homogeneous source. However, recent whole-genome shotgun sequencing projects have violated this assumption, resulting in input fragments covering the same region of the genome whose sequences differ due to polymorphic variation in the population. While single-nucleotide polymorphisms (SNPs) do not pose a significant problem to state-of-the-art assembly methods, these methods do not handle insertion/deletion (indel) polymorphisms of more than a few bases. RESULTS: This paper describes an efficient method for detecting sequence discrepancies due to polymorphism that avoids resorting to global use of more costly, less stringent affine sequence alignments. Instead, the algorithm uses graph-based methods to determine the small set of fragments involved in each polymorphism and performs more sophisticated alignments only among fragments in that set. Results from the incorporation of this method into the Celera Assembler are reported for the D. melanogaster, H. sapiens, and M. musculus genomes.

7.
Slippage is an important sequencing problem that can occur in EST projects. However, very few studies have addressed this. We propose three new methods to detect slippage artifacts: arithmetic mean method, geometric mean method, and echo coverage method. Each method is simple and has two different strategies for processing sequences: suffix and subsequence. Using the 291,689 EST sequences produced in the SUCEST project, we performed comparative tests between our proposed methods and the SUCEST method. The subsequence strategy is better than the suffix strategy because it is not anchored at the end of the sequence, so it is more flexible in finding slippage at the beginning of the EST. In a comparison with the SUCEST method, the advantage of our methods is that they do not discard the majority of the sequences marked as slippage, but instead only remove the slipped artifact from the sequence. Based on our tests, the echo coverage method with subsequence strategy shows the best compromise between slippage detection and ease of calibration.

8.
9.
MOTIVATION: We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein sequences. The patterns can be of arbitrary length, and the input sequences do not need to be aligned, nor is delineation of domain boundaries required. The method is automatic, and can be applied, without assuming any preliminary biological information, with surprising success. Basic biological considerations such as amino acid background probabilities and amino acid substitution probabilities can be incorporated to improve performance. RESULTS: The PST can serve as a predictive tool for protein sequence classification, and for detecting conserved patterns (possibly functionally or structurally important) within protein sequences. The method was tested on the Pfam database of protein families with more than satisfactory performance. Exhaustive evaluations show that the PST model detects many more related sequences than pairwise methods such as Gapped-BLAST, and is almost as sensitive as a hidden Markov model that is trained from a multiple alignment of the input sequences, while being much faster.

10.
With the exponential growth of genomic sequences, there is an increasing demand to accurately identify protein coding regions (exons) from genomic sequences. Despite much progress in the identification of protein coding regions by computational methods during the last two decades, the performance and efficiency of the prediction methods still need to be improved. In addition, it is indispensable to develop different prediction methods, since combining different methods may greatly improve the prediction accuracy. A new method to predict protein coding regions is developed in this paper based on the fact that most exon sequences have a 3-base periodicity, while intron sequences do not have this unique feature. The method computes the 3-base periodicity and the background noise of the stepwise DNA segments of the target DNA sequences using nucleotide distributions in the three codon positions of the DNA sequences. Exon and intron sequences can be identified from trends of the ratio of the 3-base periodicity to the background noise in the DNA sequences. Case studies on genes from different organisms show that this method is an effective approach for exon prediction.
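
The idea of measuring 3-base periodicity from nucleotide distributions in the three codon positions can be sketched as follows. The abstract does not give its formula, so the score below (the spread of each base's counts across the three positions) is an assumed stand-in, and the names `position_counts` and `periodicity_score` are hypothetical. The background-noise term and the ratio used in the paper are not reproduced.

```python
def position_counts(segment: str):
    """Count each nucleotide separately at positions 0, 1, 2 modulo 3."""
    counts = [dict.fromkeys("ACGT", 0) for _ in range(3)]
    for i, base in enumerate(segment.upper()):
        if base in "ACGT":
            counts[i % 3][base] += 1
    return counts


def periodicity_score(segment: str) -> float:
    """Sum, over the four bases, of the squared deviation of the
    position-specific counts from their mean across the three positions.

    A segment whose base usage is the same at all three positions scores 0;
    strongly position-dependent usage gives a large score.
    """
    counts = position_counts(segment)
    score = 0.0
    for base in "ACGT":
        per_position = [counts[p][base] for p in range(3)]
        mean = sum(per_position) / 3
        score += sum((c - mean) ** 2 for c in per_position)
    return score


periodic = periodicity_score("ATG" * 4)    # base usage depends strongly on position
uniform = periodicity_score("A" * 12)      # no position dependence
```

Applied to successive overlapping segments of a genomic sequence, a score of this kind could be plotted along the sequence and inspected for trends, in the spirit of the stepwise analysis the abstract mentions.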

11.
Distance-based methods for phylogeny reconstruction are the fastest and easiest to use, and their popularity is accordingly high. They are also the only known methods that can cope with huge datasets of thousands of sequences. These methods rely on evolutionary distance estimation and are sensitive to errors in such estimations. In this study, a novel Bayesian method for estimation of evolutionary distances is developed. The proposed method enables the use of a sophisticated evolutionary model that better accounts for among-site rate variation (ASRV), thereby improving the accuracy of distance estimation. Rate variations are estimated within a Bayesian framework by extracting information from the entire dataset of sequences, unlike standard methods that can only use one pair of sequences at a time. We compare the accuracy of a cascade of distance estimation methods, starting from commonly used methods and moving towards the more sophisticated novel method. Simulation studies show significant improvements in the accuracy of distance estimation by the novel method over the commonly used ones. We demonstrate the effect of the improved accuracy on tree reconstruction using both real and simulated protein sequence alignments. An implementation of this method is available as part of the SEMPHY package.

12.
13.
Based on statistical analyses of song sequences, Bengalese finch (Lonchura striata var. domestica) songs do not show unvarying motif repetition as has been found in zebra finches (Taeniopygia guttata). Instead, there are variations of partially stereotyped sequences of song syllables. Although these stereotyped sequences consist of multiple syllables, in most cases these syllables occur together. To examine whether such structures really exist as a vocal production unit, we subjected singing birds to a light flash and determined when the stimulus stopped the songs. When light interruptions were presented within the statistically stereotyped sequences, the subsequent syllables tended to be produced, whereas interruptions presented during the statistically variable sequences tended to cause instantaneous song termination. This suggests that the associations among the song syllables that compose the statistically stereotyped sequences are more order dependent than those for the statistically variable sequences, and the tolerances of syllable pairs to visual interruptions are consistent with the statistical song structures. Additionally, following interruptions, several types of song sequence variations were observed that had not been previously reported. These phenomena might be caused by various effects of the visual stimulus on the hierarchical motor control program.

14.
15.
The process of inferring phylogenetic trees from molecular sequences almost always starts with a multiple alignment of these sequences but can also be based on methods that do not involve multiple sequence alignment. Very little is known about the accuracy with which such alignment-free methods recover the correct phylogeny or about the potential for increasing their accuracy. We conducted a large-scale comparison of ten alignment-free methods, among them one new approach that does not calculate distances and a faster variant of our pattern-based approach; all distance-based alignment-free methods are freely available from http://www.bioinformatics.org.au (as Python package decaf+py). We show that most methods exhibit a higher overall reconstruction accuracy in the presence of high among-site rate variation. Under all conditions that we considered, variants of the pattern-based approach were significantly better than the other alignment-free methods. The new pattern-based variant achieved a speed-up of an order of magnitude in the distance calculation step, accompanied by a small loss of tree reconstruction accuracy. A method of Bayesian inference from k-mers did not improve on classical alignment-free (and distance-based) methods but may still offer other advantages due to its Bayesian nature. We found the optimal word length k of word-based methods to be stable across various data sets, and we provide parameter ranges for two different alphabets. The influence of these alphabets was analyzed to reveal a trade-off in reconstruction accuracy between long and short branches. We have mapped the phylogenetic accuracy for many alignment-free methods, among them several recently introduced ones, and increased our understanding of their behavior in response to biologically important parameters. In all experiments, the pattern-based approach emerged as superior, at the expense of higher resource consumption. 
Nonetheless, no alignment-free method that we examined recovers the correct phylogeny as accurately as does an approach based on maximum-likelihood distance estimates of multiply aligned sequences.
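
As a rough illustration of what a word-based (k-mer) alignment-free distance looks like, the sketch below compares normalized k-mer frequency profiles with a Euclidean distance. This is a generic textbook-style measure chosen for illustration; it is not claimed to be one of the ten methods compared in the abstract, and it does not use the decaf+py package. The names `kmer_profile` and `kmer_distance` are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_profile(sequence: str, k: int) -> dict:
    """Relative frequency of every overlapping word of length k."""
    if k < 1 or len(sequence) < k:
        raise ValueError("sequence must contain at least one word of length k")
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def kmer_distance(seq_a: str, seq_b: str, k: int) -> float:
    """Euclidean distance between the two k-mer frequency profiles."""
    profile_a = kmer_profile(seq_a, k)
    profile_b = kmer_profile(seq_b, k)
    words = set(profile_a) | set(profile_b)
    return sqrt(sum((profile_a.get(w, 0.0) - profile_b.get(w, 0.0)) ** 2 for w in words))


same = kmer_distance("ACGTACGT", "ACGTACGT", k=2)
different = kmer_distance("AAAA", "CCCC", k=2)
```

A matrix of such pairwise distances could then be passed to a distance-based tree-building method. The word length `k` is the parameter whose optimal range the abstract reports as stable across data sets.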

16.
Conserved segments in DNA or protein sequences are strong candidates for functional elements and thus appropriate methods for computing them need to be developed and compared. We describe five methods and computer programs for finding highly conserved blocks within previously computed multiple alignments, primarily for DNA sequences. Two of the methods are already in common use; these are based on good column agreement and high information content. Three additional methods find blocks with minimal evolutionary change, blocks that differ in at most k positions per row from a known center sequence and blocks that differ in at most k positions per row from a center sequence that is unknown a priori. The center sequence in the latter two methods is a way to model potential binding sites for known or unknown proteins in DNA sequences. The efficacy of each method was evaluated by analysis of three extensively analyzed regulatory regions in mammalian beta-globin gene clusters and the control region of bacterial arabinose operons. Although all five methods have quite different theoretical underpinnings, they produce rather similar results on these data sets when their parameters are adjusted to best approximate the experimental data. The optimal parameters for the method based on information content varied little for different regulatory regions of the beta-globin gene cluster and hence may be extrapolated to many other regulatory regions. The programs based on maximum allowed mismatches per row have simple parameters whose values can be chosen a priori and thus they may be more useful than the other methods when calibration against known functional sites is not available.
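
One of the two commonly used criteria named in this abstract, high information content, can be sketched for DNA alignments as below. The sketch assumes the usual per-column definition (2 bits minus the Shannon entropy of the base frequencies), ignores gaps and ambiguous symbols, and reports runs of consecutive columns above a threshold. The function names, the threshold, and the minimum block length are hypothetical, and the programs described in the abstract may define blocks differently.

```python
from math import log2


def column_information(column: str) -> float:
    """Information content of one alignment column in bits (0 to 2 for DNA)."""
    bases = [b for b in column.upper() if b in "ACGT"]   # gaps/ambiguity ignored
    if not bases:
        return 0.0
    entropy = 0.0
    for base in "ACGT":
        p = bases.count(base) / len(bases)
        if p > 0:
            entropy -= p * log2(p)
    return 2.0 - entropy


def conserved_blocks(rows, threshold: float, min_len: int):
    """Return (start, end) column ranges, end exclusive, where every column
    has information content >= threshold and the run spans >= min_len columns."""
    width = len(rows[0])
    blocks = []
    start = None
    for col in range(width + 1):               # one extra step closes a trailing run
        conserved = col < width and column_information(
            "".join(row[col] for row in rows)) >= threshold
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_len:
                blocks.append((start, col))
            start = None
    return blocks


alignment = ["ACGTAC", "ACGTTG", "ACGTGA"]
blocks = conserved_blocks(alignment, threshold=1.5, min_len=3)
```

In the toy alignment the first four columns are identical in every row, so they form the single reported block.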

17.
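Optokinetic nystagmus (OKN) experiments were carried out with two methods, the traditional rotating-drum moving-stripe stimulus and a television moving-stripe pattern stimulus, and the OKN responses they evoked were compared quantitatively; the results show that the two stimuli are similarly effective. Using the television moving-image stimulus, stimulation experiments were performed separately in the central and peripheral visual fields, showing that OKN is evoked mainly by moving-image stimulation acting on the central region of the retina. The dynamic response of OKN was also analyzed experimentally: under sinusoidal velocity stimulation, OKN gain depends mainly on the acceleration of the stimulus motion rather than solely on its velocity or frequency, and this conclusion was further supported by the dynamic response times measured in OKN experiments with pulse velocity stimulation.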

18.
Given an amino acid sequence, we discuss how to find efficiently an optimal set of disjoint regions (substrings, domains, modules, etc.), each of which can be matched to some element of a predefined inventory containing, for example, consensus sequences, protosequences, or protein family profiles. A two-stage approach to sequence decomposition, consisting of the detection of all acceptable matches followed by the construction of an optimal subset of compatible matches, leads to computational difficulties. When the problem is reformulated in terms of network comparisons, it can be solved in time quadratic in the length of the sequence and linear with the number of templates in the inventory, by a single pass of a dynamic programming algorithm. This method has the advantage that the criterion for acceptable matches can be relaxed without materially affecting computing time. Except under special conditions it is more efficient than previous segmentation methods based on dynamic programming.
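
To make the notion of "an optimal subset of compatible matches" concrete, the sketch below solves the second stage of the two-stage approach mentioned in the abstract as a weighted interval selection problem. It assumes the candidate matches and their scores are already given and that the objective is the sum of scores; it is not the single-pass network algorithm the abstract proposes, and the name `best_disjoint_matches` is hypothetical.

```python
from bisect import bisect_right


def best_disjoint_matches(matches):
    """Pick non-overlapping matches with maximal total score.

    matches: list of (start, end, score) with end exclusive.
    Returns (total_score, chosen_matches_in_sequence_order).
    """
    ordered = sorted(matches, key=lambda m: m[1])      # sort by end position
    ends = [m[1] for m in ordered]
    best = [0.0] * (len(ordered) + 1)   # best[i]: optimum using the first i matches
    take = [False] * len(ordered)
    prev = [0] * len(ordered)
    for i, (start, _end, score) in enumerate(ordered):
        # Number of earlier matches that end at or before this match starts.
        prev[i] = bisect_right(ends, start, 0, i)
        with_match = best[prev[i]] + score
        if with_match > best[i]:
            best[i + 1] = with_match
            take[i] = True
        else:
            best[i + 1] = best[i]
    # Trace back through the table to recover the chosen matches.
    chosen = []
    i = len(ordered)
    while i > 0:
        if take[i - 1]:
            chosen.append(ordered[i - 1])
            i = prev[i - 1]
        else:
            i -= 1
    chosen.reverse()
    return best[-1], chosen


candidates = [(0, 5, 3.0), (3, 8, 4.0), (5, 10, 2.0), (8, 12, 3.0)]
total, chosen = best_disjoint_matches(candidates)
```

The selection step itself is cheap once the matches are listed; the cost of the two-stage approach lies largely in enumerating every acceptable match first, which is the step the reformulation in the abstract is designed to avoid.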

19.
Understanding the most appropriate workflow for biochemical human leukocyte antigen (HLA)-associated peptide enrichment prior to ligand sequencing is essential to achieve optimal sensitivity in immunopeptidomics experiments. The use of different detergents for HLA solubilization as well as complementary workflows to separate HLA-bound peptides from HLA protein complex components after their immunoprecipitation, including HPLC, C18 cartridge, and 5 kDa filter, are described. It is observed that all solubilization approaches tested led to similar peptide ligand identification rates; however, a higher number of peptides are identified in samples lysed with CHAPS compared with other methods. The HPLC method is superior in terms of HLA-I peptide recovery compared with 5 kDa filter and C18 cartridge peptide purification methods. Most importantly, it is observed that both the choice of detergent and the peptide purification strategy create a significant bias for the identified peptide sequences, and that allele-specific peptide repertoires are affected depending on the workflow of choice. The results highlight the importance of employing a suitable strategy for HLA peptide enrichment and show that the obtained peptide repertoires do not necessarily reflect the true distributions of peptide sequences in the sample.

20.
Reaction time (RT) and error rate that depend on stimulus duration were measured in a luminance-discrimination reaction time task. Two patches of light with different luminance were presented to participants for a ‘short’ (150 ms) or ‘long’ (1 s) period on each trial. When the stimulus duration was ‘short’, the participants responded more rapidly, with poorer discrimination performance, than they did with the longer duration. The results suggested that different sensory responses in the visual cortices were responsible for the dependence of response speed and accuracy on the stimulus duration during the luminance-discrimination reaction time task. It was shown that a simple winner-take-all-type neural network model receiving transient and sustained stimulus information from the primary visual cortex successfully reproduced RT distributions for correct responses and error rates. Moreover, temporal spike sequences obtained from the model network closely resembled the neural activity in the monkey prefrontal or parietal area during other visual decision tasks such as motion discrimination and oddball detection tasks.
