Similar literature
 20 similar documents found (search time: 15 ms)
1.
With the rapid growth of DNA databases for human and other eukaryotic model organisms, a great number of genes need to be identified in these databases. Exact recognition of translation initiation sites (TISs) of eukaryotic genes is very important for understanding the translation initiation process, predicting the detailed structure of eukaryotic genes, and annotating uncharacterized sequences. The problem has not been solved satisfactorily, especially for recognizing TISs of eukaryotic genes with shorter first exons. Extracting new features and finding powerful new algorithms for recognizing TISs of eukaryotic genes is therefore an important task. In this paper, the important characteristics of shorter flanking fragments around TISs are extracted and an expectation-maximization (EM) algorithm based on incomplete data is used to recognize TISs of eukaryotic genes. The accuracy is up to 87.8% in a six-fold cross-validation test. The result shows that the identification variables are effectively extracted and that the EM algorithm is a powerful tool for predicting the TISs of eukaryotic genes. The algorithm can also be applied to other classification or clustering tasks in bioinformatics.
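The abstract does not give the model the authors fit, so the following is only a generic sketch of the expectation-maximization idea it mentions: a two-component one-dimensional Gaussian mixture fitted to unlabeled scores. The function name `em_two_gaussians`, the choice of Gaussians, and the toy data are all assumptions made for illustration, not the paper's method.

```python
import math

def em_two_gaussians(xs, iters=50):
    """Fit a two-component 1-D Gaussian mixture with EM (illustrative only)."""
    # Crude initialisation: place the two means at the data extremes.
    mu = [min(xs), max(xs)]
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [
                weight[k]
                * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component received no mass; leave it unchanged
            weight[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            var[k] = max(spread, 1e-6)  # floor keeps the variance positive
    return weight, mu, var

# Hypothetical scores: one group near 0, another near 5.
scores = [0.1, -0.2, 0.0, 0.3, -0.1, 5.1, 4.8, 5.0, 5.2, 4.9]
weight, mu, var = em_two_gaussians(scores)
```

The class labels are the "incomplete data" here: EM alternates between guessing them softly (E-step) and refitting the parameters given those guesses (M-step).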

2.
3.
4.
CAT vectors for analysis of eukaryotic promoters and enhancers   Total citations: 36, self-citations: 0, citations by others: 36
E Prost  D D Moore 《Gene》1986,45(1):107-111
We have constructed two sets of plasmids for analysis of factors affecting mammalian gene expression. The pOCAT series contains a bacterial chloramphenicol-resistance expression unit (cat) and no eukaryotic promoter. The pUTKAT series contains the same cat unit under the control of the thymidine-kinase promoter of Herpes simplex virus. These plasmids are designed for testing effects of inserted regulatory elements on cat expression after transient transfection of mammalian cells in culture. We demonstrate here that the pOCAT series is useful for studying activities of inserted eukaryotic promoters, and the pUTKAT series is useful for studying activities of inserted eukaryotic enhancers.

5.
Molecular genetic analysis of FNR-dependent promoters   Total citations: 21, self-citations: 17, citations by others: 21

6.
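Accurate annotation of operon structure in prokaryotes is of great significance for studying gene function and gene regulatory networks, and computational prediction by bioinformatics methods is currently the main source of operon annotations for genomes. Most current prediction algorithms require experimentally confirmed operons as a training set, but the scarcity of such data has long been a bottleneck for algorithm development. Based on an understanding of operon structure, and starting from features such as intergenic distance, regulatory signals related to transcription and translation, and COG functional annotation, we built a probabilistic model describing the complex structure of operons and proposed an iterative self-learning algorithm that does not depend on operon data from a specific species as a training set. Tests and comparisons on experimentally verified operon datasets show that the algorithm is very effective at predicting operon structure. Without relying on any known operon information, the algorithm exceeds the best current operon prediction methods in overall prediction performance, and this self-learning algorithm outperforms algorithms trained on a specific species. These characteristics make the algorithm applicable to newly sequenced species, which distinguishes it from the operon prediction methods in common use. A large-scale comparative analysis of bacterial and archaeal genomes further improved our understanding of the general features and species specificity of genomic operon structure.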

7.
Transcriptional repression of eukaryotic promoters   Total citations: 108, self-citations: 0, citations by others: 108
M Levine  J L Manley 《Cell》1989,59(3):405-408

8.
9.
Protein docking using a genetic algorithm   Total citations: 2, self-citations: 0, citations by others: 2
A genetic algorithm (GA) for protein-protein docking is described, in which the proteins are represented by dot surfaces calculated using the Connolly program. The GA is used to move the surface of one protein relative to the other to locate the area of greatest surface complementarity between the two. Surface dots are deemed complementary if their normals are opposed, their Connolly shape type is complementary, and their hydrogen bonding or hydrophobic potential is fulfilled. Overlap of the protein interiors is penalized. The GA is tested on 34 large protein-protein complexes where one or both proteins have been crystallized separately. Parameters are established for which 30 of the complexes have at least one near-native solution ranked in the top 100. We have also successfully reassembled a 1,400-residue heptamer based on the top-ranking GA solution obtained when docking two bound subunits.

10.
Metagenomics is an emerging field in which the power of genomic analysis is applied to an entire microbial community, bypassing the need to isolate and culture individual microbial species. Assembly of metagenomic DNA fragments is very much like the overlap-layout-consensus procedure for assembling isolated genomes, but is augmented by an additional binning step to differentiate scaffolds, contigs and unassembled reads into various taxonomic groups. In this paper, we employed n-mer oligonucleotide frequencies as the features and developed a hierarchical classifier (PCAHIER) for binning short (≤ 1,000 bps) metagenomic fragments. Principal component analysis was used to reduce the high dimensionality of the feature space. The hierarchical classifier consists of four layers of local classifiers that are implemented based on linear discriminant analysis. These local classifiers are responsible for binning prokaryotic DNA fragments into superkingdoms, of the same superkingdom into phyla, of the same phylum into genera, and of the same genus into species, respectively. We evaluated the performance of PCAHIER by using our own simulated data sets as well as the widely used simHC synthetic metagenome data set from the IMG/M system. The effectiveness of PCAHIER was demonstrated through comparisons against a non-hierarchical classifier and two existing binning algorithms (TETRA and Phylopythia).
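As a rough illustration of the n-mer oligonucleotide frequency features described above, the sketch below counts overlapping k-mers in a fragment and normalises them into a fixed-length frequency vector. The function name `kmer_frequencies`, the default `k=4`, and the decision to skip windows containing ambiguity codes are assumptions; the abstract does not specify these details, and the PCA and linear discriminant analysis stages are not shown.

```python
from itertools import product

def kmer_frequencies(seq, k=4):
    """Return the normalised frequency of every k-mer over the alphabet ACGT."""
    seq = seq.upper()
    # One entry per possible k-mer, so every fragment maps to the same 4**k features.
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing N or other ambiguity codes
            counts[kmer] += 1
            total += 1
    if total == 0:
        return {kmer: 0.0 for kmer in counts}
    return {kmer: c / total for kmer, c in counts.items()}

# Hypothetical fragment; a real input would be a read or contig of up to ~1,000 bp.
features = kmer_frequencies("ACGTACGTTTGACCA", k=2)
```

Vectors of this kind grow as 4^k in length, which is why a dimensionality-reduction step such as PCA is typically applied before classification.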

11.
The response of photosynthesis to carbon dioxide concentration can provide data on a number of important parameters related to leaf physiology. The genetic algorithm (GA), a robust stochastic evolutionary computational algorithm inspired by both natural selection and natural genetics, is proposed to simultaneously estimate the parameters [including the maximum carboxylation rate allowed by ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) (Vcmax), potential light-saturated electron transport rate (Jmax), triose-phosphate utilization (TPU), leaf dark respiration in the light (Rd) and mesophyll conductance (gm)] of the photosynthesis models presented by Farquhar, von Caemmerer and Berry, and Ethier and Livingston. The results show that, by properly constraining the parameter bounds, the GA-based estimation methods can effectively and efficiently obtain globally (or at least near-globally) optimal solutions, which are as good as or better than those obtained by the non-linear curve fitting methods used in previous studies. More complicated problems, such as taking the response of gm to CO2 into account, can be easily formulated and solved by using the GA. The influence of the crossover probability (Pc), mutation probability (Pm), population size and number of generations on the performance of the GA was also investigated.

12.
We present a simple method to detect pathogenicity islands and anomalous gene clusters in bacterial genomes. The method uses iterative discriminant analysis to define genomic regions that deviate most from the rest of the genome in three compositional criteria: G+C content, dinucleotide frequency and codon usage. Using this method, we identify many virulence-related gene islands, e.g. encoding protein secretion systems, adhesins, toxins, and other anomalous gene clusters, such as prophages. The program and the whole dataset, including the catalogs of genes in the detected anomalous segments, are publicly available at http://compbio.sibsnet.org/projects/pai-ida/. This program can be used in searching for virulence-related factors in newly sequenced bacterial genomes.
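To make the notion of a compositionally deviating region concrete, here is a deliberately simplified sketch that uses only one of the three criteria (G+C content) and a fixed threshold. It is not the iterative discriminant analysis of the paper; the function names, window size, step, and threshold are all assumptions chosen for illustration.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def deviating_windows(genome, window=1000, step=500, threshold=0.1):
    """Return (start, gc) for windows whose G+C content differs from the
    genome-wide value by more than `threshold`."""
    genome_gc = gc_content(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_content(genome[start:start + window])
        if abs(gc - genome_gc) > threshold:
            hits.append((start, gc))
    return hits

# Toy "genome": an AT-only background with a short G-only insert.
toy = "A" * 40 + "G" * 10 + "A" * 40
print(deviating_windows(toy, window=10, step=10, threshold=0.5))
# → [(40, 1.0)]
```

A fuller treatment would add dinucleotide frequency and codon usage as further features and combine them in a discriminant function rather than thresholding each window independently.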

13.
A new method is proposed for finding a low-dimensional subspace of high-dimensional microarray data. We developed a new criterion for constructing the weight matrix by using local neighborhood information to discover the intrinsic discriminant structure in the data. This approach also applies a regularized least-squares technique to extract relevant features. We assess the performance of the proposed methodology by applying it to four publicly available tumor datasets. In a low-dimensional subspace, the proposed method classified these tumors accurately and reliably. Also, through a comparison study, we verify the reliability of the dimensionality reduction and discrimination results.

14.
Tetracycline-reversible silencing of eukaryotic promoters.   Total citations: 12, self-citations: 1, citations by others: 11

15.
16.
Protein structure alignment using a genetic algorithm   Total citations: 3, self-citations: 0, citations by others: 3
Szustakowski JD  Weng Z 《Proteins》2000,38(4):428-440
We have developed a novel, fully automatic method for aligning the three-dimensional structures of two proteins. The basic approach is to first align the proteins' secondary structure elements and then extend the alignment to include any equivalent residues found in loops or turns. The initial secondary structure element alignment is determined by a genetic algorithm. After refinement of the secondary structure element alignment, the protein backbones are superposed and a search is performed to identify any additional equivalent residues in a convergent process. Alignments are evaluated using intramolecular distance matrices. Alignments can be performed with or without sequential connectivity constraints. We have applied the method to proteins from several well-studied families: globins, immunoglobulins, serine proteases, dihydrofolate reductases, and DNA methyltransferases. Agreement with manually curated alignments is excellent. A web-based server and additional supporting information are available at http://engpub1.bu.edu/-josephs.

17.
MOTIVATION: Rapid growth in the number of genes with known sequences calls for the development of automated tools for their classification and analysis. It has become clear that nucleosome packaging of eukaryotic DNA is very important for gene functioning. Automated computer tools for characterizing nucleosome packaging density could be useful for studying gene regulation and for genome annotation. RESULTS: A program for constructing nucleosome formation potential profiles of eukaryotic DNA sequences was developed. Nucleosome packaging density was analyzed for different functional types of human promoters. It was found that in promoters of tissue-specific genes, the nucleosome formation potential was substantially higher than in genes expressed in many tissues, or housekeeping genes. Hence, the capacity for nucleosome positioning in the promoter region may serve as a factor regulating gene expression. AVAILABILITY: The program for nucleosome site recognition is included in the GeneExpress system; section 'DNA Nucleosomal Organization', http://wwwmgs.bionet.nsc.ru/mgs/programs/recon/.

18.
Computer-assisted surgical interventions and research in joint kinematics rely heavily on the accurate registration of three-dimensional bone surface models reconstructed from various imaging technologies. Anomalous results were seen in a kinematic study of carpal bones using a principal axes alignment approach for the registration. The study was repeated using an iterative closest point algorithm, which is more accurate, but also more demanding to apply. The principal axes method showed errors between 0.35 mm and 0.49 mm for the scaphoid, and between 0.40 mm and 1.22 mm for the pisiform. The iterative closest point method produced errors of less than 0.4 mm. These results show that while the principal axes method approached the accuracy of the iterative closest point algorithm in asymmetrical bones, there were more pronounced errors in bones with some symmetry. Principal axes registration for carpal bones should be avoided.
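The sensitivity of principal axes alignment to symmetry can be seen even in a two-dimensional simplification. The sketch below computes the orientation of the major principal axis of a planar point cloud from its second moments; the function name `principal_axis_angle` and the toy point sets are assumptions, and the study itself worked with three-dimensional surface models.

```python
import math

def principal_axis_angle(points):
    """Angle (radians) of the major principal axis of 2-D points,
    taken from the covariance of the point cloud."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Closed-form orientation of the dominant eigenvector of [[sxx, sxy], [sxy, syy]].
    return 0.5 * math.atan2(2 * sxy, sxx - syy)

# Elongated shape: the axis is well defined (45 degrees here).
elongated = [(0, 0), (1, 1), (2, 2)]
# Symmetric shape: both atan2 arguments vanish, so the angle is arbitrary.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

print(math.degrees(principal_axis_angle(elongated)))
print(principal_axis_angle(square))
```

For the square, the two eigenvalues of the covariance matrix are equal, so any direction is a valid principal axis and small surface noise can swing the result widely. That is consistent with the larger errors the abstract reports for bones with some symmetry.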

19.
20.
A genetic algorithm has been devised and applied to the problems of molecular similarity, pharmacophore elucidation, and determination of molecular conformation. The algorithm is based on a binary representation of molecular position and conformation. Using the genetic operators of crossover, mutation, and selection, near-optimum conformations and orientations of molecules that best fit defined constraints may be determined. The constraints may be any useful function, for example, intermolecular or intramolecular distances, electrostatic potential on a surface, or volume overlap. Problems with up to 30 degrees of freedom have been tackled successfully.
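The three operators named in the abstract can be sketched on a bit-string encoding as follows. This is a minimal, generic GA rather than the authors' implementation: the function `run_ga`, binary tournament selection, single-point crossover, the parameter defaults, and the toy objective (counting 1-bits) are all assumptions. In the setting described above, the fitness function would instead decode the bits into a position and conformation and score how well the constraints are met.

```python
import random

def run_ga(fitness, n_bits, pop_size=30, generations=60,
           p_cross=0.8, p_mut=0.02, seed=0):
    """Maximise `fitness` over fixed-length bit strings (requires n_bits >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def select(population):
        # Selection: binary tournament, the fitter of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = select(pop), select(pop)
            # Crossover: splice the parents at a random cut point.
            if rng.random() < p_cross:
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit independently with probability p_mut.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
        gen_best = max(pop, key=fitness)
        if fitness(gen_best) > fitness(best):
            best = gen_best  # remember the best individual seen so far
    return best

# Toy objective standing in for a constraint-fit score: number of 1-bits.
best = run_ga(sum, n_bits=20)
```

Fixing the seed makes runs reproducible; in practice the crossover and mutation probabilities, population size, and number of generations all need tuning for the problem at hand.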
