Similar articles
1.
Protein-protein interactions govern almost all biological processes and the underlying functions of proteins. The interaction sites of a protein depend on its 3D structure, which in turn depends on the amino acid sequence. Hence, prediction of protein function from its primary sequence is an important and challenging task in bioinformatics. Identifying the amino acids (hot spots) that give rise to the characteristic frequency signifying a particular biological function is a tedious job in proteomic signal processing. In this paper, we propose a promising new technique for identifying hot spots in proteins using an efficient time-frequency filtering approach known as S-transform filtering. The S-transform is a powerful linear time-frequency representation and is especially useful for filtering in the time-frequency domain. The potential of the new technique for identifying hot spots in proteins is analyzed, and the results are compared with those of existing methods. The results demonstrate that the proposed method is superior to its counterparts and is consistent with results based on biological methods for identification of the hot spots. The proposed method also reveals some new hot spots which need further investigation and validation by the biological community.

2.
An important topic in genomic sequence analysis is the identification of protein coding regions. In this context, several coding DNA model-independent methods, based on the occurrence of specific patterns of nucleotides at coding regions, have been proposed. Nonetheless, these methods have not been completely suitable due to their dependence on an empirically pre-defined window length required for a local analysis of a DNA region. We introduce a method, based on a modified Gabor-wavelet transform (MGWT), for the identification of protein coding regions. This novel transform is tuned to analyze periodic signal components and presents the advantage of being independent of the window length. We compared the performance of the MGWT with other methods using eukaryote datasets. The results show that the MGWT outperforms all assessed model-independent methods with respect to identification accuracy. These results indicate that the source of at least part of the identification errors produced by the previous methods is the fixed working scale. The new method not only avoids this source of errors, but also makes available a tool for detailed exploration of the nucleotide occurrence.

3.
An Evaluation of Measures of Synonymous Codon Usage Bias (cited by 14: 0 self-citations, 14 by others)
Synonymous codons are not generally used at equal frequencies, and this trend is observed for most genes and organisms. Several methods have been proposed and used to estimate the degree of the nonrandom use of the different synonymous codons. The estimates obtained by these methods, however, show different levels of both precision and dispersion when coding regions of a finite number of codons are under analysis. Here, we present a study, based on computer simulation, of how the different methods proposed to evaluate the nonrandom use of synonymous codons are affected by the length of the coding region analyzed. The results show that some of these methods are heavily influenced by the number of codons and that the comparison of codon usage bias between coding regions of different lengths shows a methodological bias under different conditions of nonrandom use of synonymous codons. The study of the dispersion of the estimates obtained by the different methods gives, on the other hand, an indication of the methods to be applied to compare values of codon usage bias among coding regions of equivalent length. Received: 10 September 1997 / Accepted: 23 March 1998

4.
Background: The success of collapsing methods which investigate the combined effect of rare variants on complex traits has so far been limited. The manner in which variants within a gene are selected prior to analysis has a crucial impact on this success, which has resulted in analyses conventionally filtering variants according to their consequence. This study investigates whether an alternative approach to filtering, using annotations from recently developed bioinformatics tools, can aid these types of analyses in comparison to conventional approaches. Conclusion: Incorporating variant annotations from non-coding bioinformatics tools should prove to be a valuable asset for rare variant analyses in the future. Filtering by variant consequence is only possible in coding regions of the genome, whereas utilising non-coding bioinformatics annotations provides an opportunity to discover unknown causal variants in non-coding regions as well. This should allow studies to uncover a greater number of causal variants for complex traits and help elucidate their functional role in disease.

5.
Time-frequency filtering of MEG signals with matching pursuit (cited by 4: 0 self-citations, 4 by others)
Time-frequency signal analysis based on various decomposition techniques is widely used in biomedical applications. Matching Pursuit is a new adaptive approach for time-frequency decomposition of such biomedical signals. Its advantage is that it creates a concise signal approximation with the help of a small set of Gabor atoms chosen iteratively from a large and redundant set. In this paper, the usage of Matching Pursuit for time-frequency filtering of biomagnetic signals is proposed. The technique was validated on artificial signals and its performance was tested for varying signal-to-noise ratios using both simulated and real MEG somatic evoked magnetic field data.
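The iterative atom selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the small dictionary, the atom parameters, and the helper names (`gabor_atom`, `matching_pursuit`, `reconstruct`, `DICTIONARY`) are all assumptions chosen for illustration, and a real application would use a far larger, redundant dictionary.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def gabor_atom(n, center, width, freq):
    """Unit-norm real Gabor atom: Gaussian envelope times a cosine (illustrative parameters)."""
    raw = [math.exp(-0.5 * ((t - center) / width) ** 2)
           * math.cos(2.0 * math.pi * freq * (t - center)) for t in range(n)]
    norm = math.sqrt(dot(raw, raw))
    return [x / norm for x in raw]

def matching_pursuit(signal, dictionary, n_iter):
    """Greedily pick the atom most correlated with the residual, n_iter times."""
    residual = list(signal)
    picks = []  # (atom index, coefficient) pairs in selection order
    for _ in range(n_iter):
        coeffs = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(coeffs[i]))
        picks.append((best, coeffs[best]))
        # Subtract the projection onto the chosen unit-norm atom.
        residual = [r - coeffs[best] * a for r, a in zip(residual, dictionary[best])]
    return picks, residual

def reconstruct(picks, dictionary, n):
    """Rebuild an approximation from the selected atoms only."""
    out = [0.0] * n
    for index, coeff in picks:
        out = [o + coeff * a for o, a in zip(out, dictionary[index])]
    return out

# Hypothetical toy dictionary: 3 time centers x 3 frequencies, 64 samples each.
DICTIONARY = [gabor_atom(64, c, 4.0, f) for c in (16, 32, 48) for f in (0.05, 0.1, 0.2)]
```

Under these assumptions, a time-frequency filter amounts to calling `reconstruct` with only those picks whose atoms fall in the time and frequency range of interest.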

6.
Computer-aided protein-coding gene prediction in uncharacterized genomic DNA sequences is one of the most important issues in biological signal processing. A modified filter method based on statistically optimal null filter (SONF) theory is proposed for recognizing protein-coding regions. The square deviation gain (SDG) between the input and output of the model is used to identify the coding regions. An effective SDG amplification model with Class I and Class II enhancement is designed to suppress the non-coding regions. In addition, an evaluation algorithm is used to compare the modified model with most currently available gene prediction methods in terms of sensitivity, specificity and precision. The performance for identification of protein-coding regions was evaluated at the nucleotide level using benchmark datasets, yielding 91.4%, 96% and 93.7% for sensitivity, specificity and precision, respectively. These results suggest that the proposed model is potentially useful in the gene-finding field and can help recognize protein-coding regions with higher precision and speed than present algorithms.

7.
MicroRNAs (miRNAs) are a class of non-protein-coding functional RNAs that are thought to regulate expression of target genes by direct interaction with mRNAs. miRNAs have been identified through both experimental and computational methods in a variety of eukaryotic organisms. Though these approaches have been partially successful, there is a need to develop more tools for detection of these RNAs, as they are also thought to be present in abundance in many genomes. In this report we describe a tool and a web server, named CID-miRNA, for identification of miRNA precursors in a given DNA sequence, utilising secondary structure-based filtering systems and an algorithm based on stochastic context-free grammar trained on human miRNAs. Through a web interface, CID-miRNA analyses a given sequence for the presence of putative miRNA precursors, and the generated output lists all the potential regions that can form miRNA-like structures. In its stand-alone form it can also scan large genomic sequences for the presence of potential miRNA precursors. The web server can be accessed at http://mirna.jnu.ac.in/cidmirna/.

8.
The identification of gene coding regions of DNA sequences through digital signal processing techniques based on the so-called 3-base periodicity has been an emerging problem in bioinformatics. The signal-to-noise ratio (SNR) of a DNA sequence is computed after mapping the DNA symbolic sequence into numerical sequences. Typical mapping schemes include the Voss, Z-curve and tetrahedron representations, which have been used to construct algorithms for detecting gene coding regions. In this paper, an extended definition of SNR is proposed, which has a lower computational cost and wider applicability than the original definition. Furthermore, we analyze the SNRs of different mapping schemes and derive the general relationship between the Voss-based SNR and that of its general affine transformations. We conclude that the SNRs of the Z-curve and tetrahedron maps are also linearly proportional to that of the Voss map. Not only is our conclusion instructive for the design of other affine transformations, but it is also of much significance in understanding the role of the symbolic-to-numerical mapping in the detection of gene coding regions.
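As a rough illustration of the Voss mapping and the 3-base periodicity SNR mentioned in this abstract, the sketch below uses a commonly quoted definition: total spectral power at frequency N/3 divided by the mean total power over the non-DC frequencies. That definition, the exclusion of the DC term, and the function names are assumptions; the paper's extended SNR definition is not reproduced here.

```python
import cmath

def voss_indicators(seq):
    """Voss mapping: one binary indicator sequence per nucleotide."""
    seq = seq.upper()
    return {base: [1.0 if s == base else 0.0 for s in seq] for base in "ACGT"}

def power_at(signal, k):
    """Squared magnitude of the k-th DFT coefficient (direct O(N) sum)."""
    n = len(signal)
    coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
    return abs(coeff) ** 2

def period3_snr(seq):
    """Power at k = N/3 over the mean power for k = 1..N-1 (assumed definition)."""
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    indicators = voss_indicators(seq)

    def total_power(k):
        # Sum the power spectra of the four indicator sequences.
        return sum(power_at(sig, k) for sig in indicators.values())

    peak = total_power(n // 3)
    mean = sum(total_power(k) for k in range(1, n)) / (n - 1)
    return peak / mean if mean > 0 else 0.0
```

The direct DFT sum keeps the sketch dependency-free but costs O(N²) overall; for long sequences an FFT would be the natural substitute.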

9.
A wealth of novel lipid loci have been identified through a variety of approaches focused on common and low-frequency variation and collaborative meta-analyses in multiethnic populations. Despite progress in identification of loci, the task of determining causal variants remains challenging. This work will undoubtedly be enhanced by improved understanding of regulatory DNA at a genome-wide level as well as new methodologies for interrogating the relationships between noncoding SNPs and regulatory regions. Equally challenging is the identification of causal genes at novel loci. Some progress has been made for a handful of genes, and comprehensive testing of candidate genes using multiple model systems is underway. Additional insights will be gleaned from focusing on low-frequency and rare coding variation at candidate loci in large populations. This article is part of a Special Issue entitled: From Genome to Function.

10.
Block-matching techniques have been widely used in the task of estimating displacement in medical images, and they represent the best approach in scenes with deformable structures such as tissues, fluids, and gels. In this article, a new iterative block-matching technique, based on successive deformation, search, fitting, filtering, and interpolation stages, is proposed to measure elastic displacements in two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) images. The proposed technique uses different deformation models in the task of correlating proteins in real 2D electrophoresis gel images, obtaining an accuracy of 96.6% and improving the results obtained with other techniques. This technique represents a general solution, being easy to adapt to different 2D deformable cases and providing an experimental reference for block-matching algorithms.
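The search stage that block-matching methods share can be sketched as an exhaustive search minimizing the sum of absolute differences (SAD). This covers only a basic rigid search, not the deformation, fitting, filtering, or interpolation stages of the proposed technique; the SAD cost, the function names, and the parameters are assumptions for illustration.

```python
def sad(reference, target, top_r, left_r, top_t, left_t, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(reference[top_r + r][left_r + c] - target[top_t + r][left_t + c])
               for r in range(size) for c in range(size))

def match_block(reference, target, top, left, size, radius):
    """Return the displacement (dy, dx) within +/-radius that minimizes SAD.

    The block is taken from `reference` at (top, left) and searched in `target`.
    Candidate positions that fall outside `target` are skipped.
    """
    rows, cols = len(target), len(target[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > rows or l + size > cols:
                continue
            cost = sad(reference, target, top, left, t, l, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    if best is None:
        raise ValueError("no candidate block fits inside the target image")
    return best[1], best[2]
```

Repeating `match_block` over a grid of blocks yields a coarse displacement field, which an iterative scheme could then refine.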

11.
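A discriminant analysis method for prokaryotic promoters based on feature selection (cited by 3: 3 self-citations, 0 by others)
Promoter recognition is an important step in the study of gene transcriptional regulation, but the recognition accuracy of current methods is low. Based on an in-depth analysis of the features of prokaryotic promoters, a discriminant analysis method for prokaryotic promoters based on feature selection is proposed. First, candidate features are chosen from the compositional, signal, and structural features of promoter sequences; an appropriate descriptive model is built for each feature, and a composite pattern model is used for the main conserved patterns. The candidate features are then screened stepwise through model computation to optimize the feature set, and each sequence is represented as a combined feature vector. Finally, recognition is performed by quadratic discriminant analysis. Jackknife tests on real promoter data from Escherichia coli and Bacillus subtilis confirm the effectiveness and generality of the method. For E. coli σ70 promoters in non-coding regions, the average recognition accuracy reaches 85.8%, better than several other typical recognition methods; for E. coli σ70 promoters inside coding regions and several other prokaryotic promoters, the average accuracy also exceeds 80%. The framework is also readily extensible and can easily accommodate new features, so recognition performance can keep improving.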

12.
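With the growing threat of bioterrorism and biological warfare, the concept of microbial forensics has emerged. The main task of microbial forensics is to use techniques from microbiology, immunology, molecular biology, analytical chemistry and other fields to trace the source of microorganisms in bioterrorist attacks or naturally occurring disease outbreaks, to infer the relationships among microorganisms, or to provide scientific evidence on routes of transmission. In recent years, microbial forensics has made considerable progress in the forensic identification of bioterrorism pathogens, the establishment of a national computer network, and the development and quality control of multiple identification methods. This article reviews these advances.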

13.
Disordered regions of proteins are abundant in various biological processes, including regulation and signaling, and are associated with cancer, cardiovascular and autoimmune diseases, and neurodegenerative disorders. Hence, recognizing disordered regions in proteins is a critical task. In this paper, we present a new feature encoding technique built from physicochemical properties of residues selected according to the chaotic structure of the related protein sequence. Our feature vector has been tested with various classification algorithms on an up-to-date data set and compared to other methods. The proposed method shows better classification performance than many methods in terms of accuracy, sensitivity and specificity. Our results suggest that the new method, which links the residues and their physicochemical properties using Lyapunov exponents, is highly effective in recognizing disordered regions.

14.
Accurate identification of cell nuclei and their tracking using three-dimensional (3D) microscopic images is a demanding task in many biological studies. Manual identification of nuclei centroids from images is an error-prone task, sometimes impossible to accomplish due to low contrast and the presence of noise. Nonetheless, only a few methods are available for 3D bioimaging applications, in sharp contrast with 2D analysis, where many methods already exist. In addition, most methods essentially adopt segmentation, for which a reliable solution is still unknown, especially for 3D bio-images having juxtaposed cells. In this work, we propose a new method that can directly extract nuclei centroids from fluorescence microscopy images. This method involves three steps: (i) pre-processing, (ii) local enhancement, and (iii) centroid extraction. The first step has two variants: Variant-1 uses the whole 3D pre-processed image, whereas Variant-2 reduces the pre-processed image to the candidate regions, or the candidate hybrid image, for further processing. In the second step, multiscale cube filtering is employed to locally enhance the pre-processed image. Centroid extraction in the third step consists of three stages. In Stage-1, we compute a local characteristic ratio at every voxel and extract local maxima regions as candidate centroids using a ratio threshold. Stage-2 removes spurious centroids from the Stage-1 results by analyzing the shapes of intensity profiles from the enhanced image. An iterative procedure based on the nearest-neighborhood principle is then proposed to merge fragmented nuclei, if any. Both qualitative and quantitative analyses are performed on a set of 100 3D images of mouse embryos. The technique achieves promising average sensitivity and precision (88.04% and 91.30% for Variant-1; 86.19% and 95.00% for Variant-2) when compared with an existing method (86.06% and 90.11%) originally developed for analyzing C. elegans images.

15.
Summary: Hybridization probes produced from DNA sequences have proven to be a powerful tool in the rapid and sensitive analysis of natural microbial communities. By using function-specific probes, such as those identifying genes coding for photosynthesis, the potential a microbial community has for performing a given function may be rapidly determined. Gene probes have also been used in the identification and isolation of a specific catabolic genotype in less than one-fourth the time required for the conventional culture enrichment technique. Species-specific probes constructed from portions of genes coding for ribosomal RNA have been used for the rapid identification and enumeration of bacterial species in environmental samples. The use of reassociation kinetics as a measure of community diversity and complexity is also discussed. The successful application of this technique to community analysis may reduce the time required from 1 year, for conventional analysis, to 2 weeks.

16.
The recently introduced wavelet transform is a member of the class of time-frequency representations which include the Gabor short-time Fourier transform and Wigner-Ville distribution. Such techniques are of significance because of their ability to display the spectral content of a signal as time elapses. The value of the wavelet transform as a signal analysis tool has been demonstrated by its successful application to the study of turbulence and processing of speech and music. Since, in common with these subjects, both the time and frequency content of physiological signals are often of interest (the ECG being an obvious example), the wavelet transform represents a particularly relevant means of analysis. Following a brief introduction to the wavelet transform and its implementation, this paper describes a preliminary investigation into its application to the study of both ECG and heart rate variability data. In addition, the wavelet transform can be used to perform multiresolution signal decomposition. Since this process can be considered as a sub-band coding technique, it offers the opportunity for data compression, which can be implemented using efficient pyramidal algorithms. Results of the compression and reconstruction of ECG data are given which suggest that the wavelet transform is well suited to this task.
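The multiresolution decomposition and compression idea in this abstract can be sketched with the Haar wavelet, the simplest choice. The abstract does not say which wavelet the paper uses, so the Haar filters, the hard-threshold compression rule, and the function names here are assumptions for illustration only.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """One pyramid level: pairwise averages (approximation) and differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Exactly undo haar_step."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def haar_decompose(signal, levels):
    """Return (coarsest approximation, detail lists ordered finest to coarsest)."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def haar_reconstruct(approx, details):
    """Rebuild the signal by inverting the levels from coarsest to finest."""
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx

def compress(signal, levels, threshold):
    """Zero detail coefficients smaller than threshold, then reconstruct."""
    approx, details = haar_decompose(signal, levels)
    kept = [[d if abs(d) >= threshold else 0.0 for d in level] for level in details]
    return haar_reconstruct(approx, kept)
```

With a threshold of zero the reconstruction is exact; raising the threshold discards fine detail first, which is the trade-off a compression scheme would tune.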

17.
The genes for the classical transplantation antigens are unique in that they belong to a multigene family of which each member is represented by a large number of alleles. Since all of these genes are highly related in sequence, it has been difficult to study the expression of individual members of this complex gene family. Based upon our initial suggestion that the 3' noncoding regions of these genes may be useful in identifying mRNA molecules transcribed from different loci, we have compared a large number of sequences from different inbred mouse strains and have been able to assign each of these sequences without ambiguity into distinct allelic series. Such accurate assignment has afforded the opportunity to compare the coding regions of these highly homologous genes and has led to the identification of sequences which are apparently unique to specific genes in the family. Synthetic oligonucleotides corresponding to each of the locus-specific unique regions have been used successfully to type a panel of cDNA sequences, as well as to quantitate the relative amounts of mRNA transcribed from distinct loci. The availability of these specific coding probes will allow the analysis of individual genes and their specific expression without interference from other highly homologous sequences in this multigene family.

18.
Identification of coding regions in DNA sequences remains challenging. Various methods have been proposed, but these are limited by species-dependence and the need for adequate training sets. The elements in DNA coding regions are known to be distributed in a quasi-random way, while those in non-coding regions have typical similar structures. For short sequences, these statistical characteristics cannot be extracted correctly and cannot even be detected. This paper introduces a new way to solve the problem: balanced estimation of diffusion entropy (BEDE).

19.
20.
Testing candidate plant barcode regions in the Myristicaceae (cited by 2: 0 self-citations, 2 by others)
The concept and practice of DNA barcoding have been designed as a system to facilitate species identification and recognition. The primary challenge for barcoding plants has been to identify a suitable region on which to focus the effort. The slow relative nucleotide substitution rates of plant mitochondria and the technical issues with the use of nuclear regions have focused attention on several proposed regions in the plastid genome. One of the challenges for barcoding is to discriminate closely related or recently evolved species. The Myristicaceae, or nutmeg family, is an older group within the angiosperms that contains some recently evolved species, providing a challenging test for barcoding plants. The goal of this study is to determine the relative utility of six coding (Universal Plastid Amplicon [UPA], rpoB, rpoC1, accD, rbcL, matK) and one noncoding (trnH-psbA) chloroplast loci for barcoding in the genus Compsoneura using both single-region and multiregion approaches. Five of the regions we tested were predominantly invariant across species (UPA, rpoB, rpoC1, accD, rbcL). Two of the regions (matK and trnH-psbA) had significant variation and show promise for barcoding in nutmegs. We demonstrate that a two-gene approach utilizing a moderately variable region (matK) and a more variable region (trnH-psbA) provides resolution among all the Compsoneura species we sampled, including the recently evolved C. sprucei and C. mexicana. Our classification analyses, based on nonmetric multidimensional scaling ordination, suggest that the use of two regions results in a decreased range of intraspecific variation relative to the distribution of interspecific divergence, with 95% of the samples correctly identified in a sequence identification analysis.
