Similar literature
 Found 20 similar documents; search time 0 ms
1.
2.
3.
4.
DNA regulatory sequences control gene expression by forming DNA-protein complexes with specific DNA-binding proteins. A major task in the study of gene regulation is to identify DNA regulatory sequences genome-wide. With the rapid pace of genome projects, the function of DNA regulatory sequences has become a focus of the functional genomics era. Approaches for screening and characterizing DNA regulatory sequences have emerged in succession, from early low-throughput methods to high-throughput strategies. Although bioinformatics tools now facilitate the screening of regulatory fragments, the most reliable results still come from experimental testing. This article highlights some experimental methods for identifying regulatory sequences. It briefly reviews the history and procedures of selection methods and discusses their trends, limitations, and extensions.

5.
Let A denote an alphabet consisting of n types of letters. Given a sequence S of length L over A with $v_i$ letters of type i, we propose a new complexity function, called the reciprocal complexity of S, to describe the compositional properties and combinatorial structure of S: $C(S) = \prod_{i=1}^{n} \left( \frac{L}{n v_i} \right)^{v_i}$. Based on this complexity measure, an efficient algorithm is developed for classifying and analyzing simple segments of protein and nucleotide sequence databases associated with scoring schemes. The running time of the algorithm is nearly proportional to the sequence length. The program DSR corresponding to the algorithm was written in C++, associated with two parameters (window length and cutoff value) and a scoring matrix. Some examples regarding protein sequences illustrate how the method can be used to find regions. The first application of DSR is the masking of simple sequences for searching databases. Queries masked by DSR returned a manageable set of hits below the E-value cutoff score, which contained all true positive homologues. The second application is to study simple regions detected by the DSR program corresponding to known structural features of proteins. An extensive computational analysis has been made of protein sequences with known, physicochemically defined nonglobular segments. For the SWISS-PROT amino acid sequence database (Release 40.2 of 02-Nov-2001), we determine that the best parameters and scoring matrix for automatic segmentation of amino acid sequences into nonglobular and globular regions by the DSR program are window length k = 35, cutoff value b = 0.46, and the BLOSUM 62.5 matrix. The average "agreement accuracy (sensitivity)" of DSR segmentation for the SWISS-PROT database is 97.3%.
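
The reciprocal complexity defined in this abstract, the product over all n letter types of (L / (n·v_i)) raised to the power v_i, can be evaluated directly from letter counts. The following is a minimal sketch, not the DSR program: the function name `reciprocal_complexity` is hypothetical, letters that do not occur (v_i = 0) are assumed to contribute a factor of 1, and the product is accumulated in the log domain to avoid underflow on long sequences. The scoring-matrix part of the method is not covered.

```python
import math
from collections import Counter


def reciprocal_complexity(seq, alphabet):
    """Return C(S) = prod_i (L / (n * v_i)) ** v_i for sequence `seq`.

    `alphabet` lists the n letter types; letters with v_i = 0 are
    assumed to contribute a factor of 1 (an assumption of this sketch).
    """
    n = len(alphabet)
    L = len(seq)
    if L == 0:
        return 1.0  # empty product
    counts = Counter(seq)
    unknown = set(counts) - set(alphabet)
    if unknown:
        raise ValueError(f"letters not in alphabet: {sorted(unknown)}")
    # Sum v_i * log(L / (n * v_i)) over the letters that actually occur.
    log_c = sum(v * math.log(L / (n * v)) for v in counts.values())
    return math.exp(log_c)


uniform = reciprocal_complexity("ACGT", "ACGT")  # every letter once
biased = reciprocal_complexity("AAAA", "ACGT")   # a single letter repeated
```

Because log C(S) equals minus L times the Kullback-Leibler divergence between the observed letter frequencies and the uniform distribution, C(S) is at most 1 and reaches 1 only for a perfectly uniform composition; lower values indicate a more biased, simpler segment.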

6.
7.
8.
Discovering and detecting transposable elements in genome sequences   Total citations: 2 (self-citations: 0, citations by others: 2)
The contribution of transposable elements (TEs) to genome structure and evolution as well as their impact on genome sequencing, assembly, annotation and alignment has generated increasing interest in developing new methods for their computational analysis. Here we review the diversity of innovative approaches to identify and annotate TEs in the post-genomic era, covering both the discovery of new TE families and the detection of individual TE copies in genome sequences. These approaches span a broad spectrum in computational biology including de novo, homology-based, structure-based and comparative genomic methods. We conclude that the integration and visualization of multiple approaches and the development of new conceptual representations for TE annotation will further advance the computational analysis of this dynamic component of the genome.

9.
10.
11.
MOTIVATION: In model organisms such as yeast, large databases of protein-protein and protein-DNA interactions have become an extremely important resource for the study of protein function, evolution, and gene regulatory dynamics. In this paper we demonstrate that by integrating these interactions with widely-available mRNA expression data, it is possible to generate concrete hypotheses for the underlying mechanisms governing the observed changes in gene expression. To perform this integration systematically and at large scale, we introduce an approach for screening a molecular interaction network to identify active subnetworks, i.e., connected regions of the network that show significant changes in expression over particular subsets of conditions. The method we present here combines a rigorous statistical measure for scoring subnetworks with a search algorithm for identifying subnetworks with high score. RESULTS: We evaluated our procedure on a small network of 332 genes and 362 interactions and a large network of 4160 genes containing all 7462 protein-protein and protein-DNA interactions in the yeast public databases. In the case of the small network, we identified five significant subnetworks that covered 41 out of 77 (53%) of all significant changes in expression. Both network analyses returned several top-scoring subnetworks with good correspondence to known regulatory mechanisms in the literature. These results demonstrate how large-scale genomic approaches may be used to uncover signalling and regulatory pathways in a systematic, integrative fashion.
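
The abstract describes pairing a subnetwork score with a search for high-scoring connected regions, but it gives neither the scoring formula nor the search algorithm. The sketch below is therefore only an illustration under stated assumptions: each gene is assumed to carry a z-score for its expression change, a subnetwork of k genes is scored by the sum of its z-scores divided by sqrt(k), and a simple greedy expansion from a seed gene stands in for the search. The names `aggregate_score` and `greedy_subnetwork` are hypothetical.

```python
import math


def aggregate_score(genes, z):
    """Assumed score: sum of per-gene z-scores divided by sqrt(k)."""
    return sum(z[g] for g in genes) / math.sqrt(len(genes))


def greedy_subnetwork(seed, neighbors, z):
    """Grow a connected subnetwork from `seed`, adding at each step the
    adjacent gene that raises the score most; stop when none does.

    `neighbors` maps each gene to the genes it interacts with.
    Greedy growth is a stand-in chosen for brevity, not the paper's search.
    """
    sub = {seed}
    score = aggregate_score(sub, z)
    while True:
        # Genes adjacent to the current subnetwork keep it connected.
        frontier = {n for g in sub for n in neighbors.get(g, ())} - sub
        best_gene, best_score = None, score
        for cand in sorted(frontier):
            s = aggregate_score(sub | {cand}, z)
            if s > best_score:
                best_gene, best_score = cand, s
        if best_gene is None:
            return sub, score
        sub.add(best_gene)
        score = best_score


# Toy interaction network and z-scores (made-up values for illustration).
neighbors = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
z = {"a": 2.0, "b": 1.5, "c": -1.0, "d": 3.0}
sub, score = greedy_subnetwork("a", neighbors, z)
```

Greedy growth can stall in a local optimum, and a raw score of this kind would still need to be compared against scores of random gene sets of the same size before a subnetwork could be called significant.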

12.
Discovering simple DNA sequences by the algorithmic significance method   Total citations: 6 (self-citations: 1, citations by others: 5)
A new method, ‘algorithmic significance’, is proposed as a tool for discovery of patterns in DNA sequences. The main idea is that patterns can be discovered by finding ways to encode the observed data concisely. In this sense, the method can be viewed as a formal version of the Occam's Razor principle. In this paper the method is applied to discover significantly simple DNA sequences. We define DNA sequences to be simple if they contain repeated occurrences of certain ‘words’ and thus can be encoded in a small number of bits. Such a definition includes minisatellites and microsatellites. A standard dynamic programming algorithm for data compression is applied to compute the minimal encoding lengths of sequences in linear time. An electronic mail server for identification of simple sequences based on the proposed method has been installed at the Internet address pythia@anl.gov.
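
The idea of scoring simplicity by encoding length can be illustrated with a small dynamic program. This is a sketch under assumptions, not the encoding scheme or the linear-time algorithm of the paper: a literal base is assumed to cost log2(5) bits (four nucleotides plus one escape symbol), a pointer to an earlier copy of a word is assumed to cost one escape symbol plus 2·log2(n) bits for position and length, and the null model is assumed to spend 2 bits per base. The bound used at the end is the standard coding argument that, under the null model, a sequence is encoded d bits shorter than its null encoding with probability at most 2^(-d). The function names and the `min_word` and `max_word` parameters are hypothetical, and the loop is quadratic in the worst case.

```python
import math


def min_encoding_bits(seq, min_word=4, max_word=32):
    """Minimal encoding length of `seq` in bits under the assumed cost model."""
    n = len(seq)
    literal = math.log2(5)                           # one of A, C, G, T or escape
    pointer = literal + 2 * math.log2(max(n, 2))     # escape + position + length
    cost = [0.0] * (n + 1)                           # cost[i]: best for seq[:i]
    for i in range(1, n + 1):
        best = cost[i - 1] + literal                 # encode seq[i-1] literally
        for length in range(min_word, min(max_word, i) + 1):
            start = i - length
            # The word must occur entirely within the already encoded prefix.
            if seq.find(seq[start:i], 0, start) != -1:
                best = min(best, cost[start] + pointer)
        cost[i] = best
    return cost[n]


def significance_bound(seq):
    """Upper bound 2 ** (-d) on the null probability, where d is the
    number of bits saved relative to 2 bits per base."""
    d = 2.0 * len(seq) - min_encoding_bits(seq)
    if d <= 0:
        return 1.0                                   # no compression, no evidence
    return 2.0 ** (-d)


repeat_bound = significance_bound("AC" * 50)         # microsatellite-like repeat
plain_bound = significance_bound("ACGTTGCA")         # no repeated words
```

A repetitive sequence compresses well below 2 bits per base and receives a very small bound, while a sequence without repeated words costs more than its null encoding and receives the trivial bound of 1.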

13.
We have characterized at the nucleotide level a 4.8-kilobase pair segment of the third chromosome of Drosophila melanogaster, which contains a cluster of three chorion genes, s18-1, s15-1 and s19-1. These genes are tandemly oriented and share the same basic organization: a small and a large exon separated by a short intron in the signal peptide region. In the coding region, limited similarities at the DNA and protein level suggest a common but distant evolutionary origin. The flanking sequences were searched for elements that might be involved in controlling the tissue-specific and temporally regulated expression and the selective amplification of the chorion genes. A good candidate for a cis-regulatory element is the hexamer, TCACGT, which is found in all three genes in a highly significant position, 23 to 27 nucleotides upstream of the TATA-box, accompanied by additional, less exact similarities. Palindromes and short inverted repeats that are found in the vicinity of their complement are non-uniformly distributed: they are most concentrated in the 3' flanking part of all three genes, in and near regions of unusually high A and T content. The highest number of dyad symmetries, reminiscent of sequences that function as viral replication origins, is found associated with the T- and A-rich regions between genes s18-1 and s15-1.

14.
Mitchison NA 《Genome biology》2001,2(1):comment2001.1-comment20016
The extensive polymorphism revealed in non-coding gene-regulatory sequences, particularly in the immune system, suggests that this type of genetic variation is functionally and evolutionarily far more important than has been suspected, and provides a lead to new therapeutic strategies.

15.
N A Mitchison 《Genome biology》2000,2(1):comment200
The extensive polymorphism revealed in non-coding gene-regulatory sequences, particularly in the immune system, suggests that this type of genetic variation is functionally and evolutionarily far more important than has been suspected, and provides a lead to new therapeutic strategies.

16.
17.
Molecular Biology Reports - The 35S and 5S ribosomal DNA (rDNA), organized in thousands of copies in genomes, have been widely used in numerous comparative cytogenetic studies. Nevertheless, several...

18.
19.
20.