Found 20 similar articles; search took 15 ms.
1.
2.
3.
4.
Assaying the regulatory potential of mammalian conserved non-coding sequences in human cells
Catia Attanasio Alexandre Reymond Richard Humbert Robert Lyle Michael S Kuehn Shane Neph Peter J Sabo Jeff Goldy Molly Weaver Andrew Haydock Kristin Lee Michael Dorschner Emmanouil T Dermitzakis Stylianos E Antonarakis John A Stamatoyannopoulos 《Genome biology》2008,9(12):R168-12
5.
The primary structure of thousands of genes is being determined in many laboratories worldwide. While it is relatively easy to analyse the coding region(s) of genes, it is usually hard to understand what is located in non-coding regions. A non-coding region may contain very valuable information about the mode of functioning of a given gene, e.g. promoters, enhancers, silencers, etc. The regulatory function of these sequences is determined by their interaction with certain sequence-specific proteins, i.e. the presence of a certain DNA sequence in a non-coding region of a gene may suggest that the gene is regulated by a specific protein factor. This minireview summarizes recent data on most known eukaryotic sequence-specific DNA-binding protein factors, including their origin, DNA consensus, and their role in expression of corresponding genes.
6.
《Cell cycle (Georgetown, Tex.)》2013,12(19):2216-2219
Cancer is a disease involving multi-step dynamic changes in the genome. However, studies of the cancer genome so far have focused most heavily on protein-coding genes, and knowledge of alterations of functional non-coding sequences in cancer is largely lacking. MicroRNAs (miRNAs) are ~22 nt non-coding RNAs that regulate gene expression in a sequence-specific manner via translational inhibition or mRNA degradation. Mounting evidence shows that miRNAs may play important roles in tumor development, and a better understanding of their alteration in the cancer genome and of their oncogenic properties should contribute to the diagnosis and treatment of cancer.
7.
8.
Chuanhua Xing Donald L. Bitzer Winser E. Alexander Mladen A. Vouk Anne-Marie Stomp 《Nucleic acids research》2009,37(2):591-601
We introduce a new approach in this article to distinguish protein-coding sequences from non-coding sequences utilizing a period-3, free energy signal that arises from the interactions of the 3′-terminal nucleotides of the 18S rRNA with mRNA. We extracted the special features of the amplitude and the phase of the period-3 signal in protein-coding regions, which are not found in non-coding regions, and used them to distinguish protein-coding sequences from non-coding sequences. We tested the method on all the experimental genes from Saccharomyces cerevisiae and Schizosaccharomyces pombe. The identification was consistent with the corresponding information from GenBank, and produced better performance compared to existing methods that use a period-3 signal. The primary tests on some fly, mouse and human genes suggest that our method is applicable to higher eukaryotic genes. The tests on pseudogenes indicated that most pseudogenes have no period-3 signal. Some exploration of the 3′-tail of 18S rRNA and pattern analysis of protein-coding sequences further supported our assumption that the 3′-tail of 18S rRNA has a role of synchronization throughout the translation elongation process. This, in turn, can be utilized for the identification of protein-coding sequences.
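As a rough illustration of what "amplitude and phase of a period-3 signal" means, the sketch below computes the Fourier component at one cycle per three positions of a numeric signal. It does not reproduce the abstract's free-energy signal from 18S rRNA–mRNA interactions; as a stand-in assumption it uses a simple 0/1 indicator of one nucleotide. The names `period3_component` and `indicator` are hypothetical.

```python
import cmath
import math


def indicator(seq, base):
    """Stand-in signal (an assumption): 1.0 where `base` occurs, else 0.0."""
    return [1.0 if c == base else 0.0 for c in seq.upper()]


def period3_component(signal):
    """Complex Fourier coefficient of `signal` at frequency 1/3 per position.

    The mean is removed first so a constant signal contributes nothing.
    abs() of the result is the amplitude; cmath.phase() gives the phase,
    which shifts when the periodic pattern shifts by one position.
    """
    n = len(signal)
    if n == 0:
        return 0j
    mean = sum(signal) / n
    return sum(
        (x - mean) * cmath.exp(-2j * math.pi * k / 3)
        for k, x in enumerate(signal)
    )


# Toy sequences, invented for illustration only.
in_frame = period3_component(indicator("GCA" * 10, "G"))
shifted = period3_component(indicator("AGC" * 10, "G"))
no_period3 = period3_component(indicator("GA" * 15, "G"))

amplitude = abs(in_frame)            # large for a 3-periodic pattern
phase_shift = cmath.phase(shifted)   # differs from the in-frame phase
```

In this toy setting the amplitude separates 3-periodic from non-3-periodic input, and the phase distinguishes the position of the repeating base within the triplet; a real classifier would need the signal and thresholds described in the paper itself.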
9.
Identifying non-coding RNA regions on the genome using computational methods is currently receiving a lot of attention. In general, it is essentially more difficult than the problem of detecting protein-coding genes because non-coding RNA regions have only weak statistical signals. On the other hand, most functional RNA families have conserved sequences and secondary structures which are characteristic of their molecular function in a cell. These are known as sequence motifs and consensus structures, respectively. In this paper, we propose an improved method which extends a pairwise structural alignment method for RNA sequences to handle position specific scoring matrices and hence to incorporate motifs into structural alignment of RNA sequences. To model sequence motifs, we employ position specific scoring matrices (PSSMs). Experimental results show that PSSMs enable us to find individual RNA families efficiently, especially if we have biological knowledge such as sequence motifs. K. Sato and K. Morita contributed equally to this work.
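For readers unfamiliar with PSSMs, here is a minimal sketch of how a log-odds matrix can be built from aligned motif instances and used to score windows of an RNA sequence. It covers only the PSSM part and not the structural alignment the abstract describes. The uniform background, the pseudocount, the function names and the toy sequences are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGU"


def build_pssm(instances, pseudocount=1.0, background=0.25):
    """Column-wise log2-odds scores from equal-length motif instances."""
    length = len(instances[0])
    pssm = []
    for i in range(length):
        # Start every base at the pseudocount so unseen bases stay finite.
        counts = {b: pseudocount for b in ALPHABET}
        for inst in instances:
            counts[inst[i]] += 1
        total = sum(counts.values())
        pssm.append(
            {b: math.log2(counts[b] / total / background) for b in ALPHABET}
        )
    return pssm


def best_hit(pssm, seq):
    """Return (score, start) of the highest-scoring window in `seq`."""
    width = len(pssm)
    scored = [
        (sum(col[seq[start + j]] for j, col in enumerate(pssm)), start)
        for start in range(len(seq) - width + 1)
    ]
    return max(scored)


# Toy data, invented for illustration only.
motif_instances = ["GGAC", "GGAC", "GGAU", "GGAC"]
pssm = build_pssm(motif_instances)
score, start = best_hit(pssm, "AUUGGACUUA")
```

A positive score means the window resembles the motif more than the assumed background; in the method summarized above such scores would feed into the alignment rather than be used alone.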
10.
11.
12.
MOTIVATION: Several pattern discovery methods have been proposed to detect over-represented motifs in upstream sequences of co-regulated genes, and are, for example, used to predict cis-acting elements from clusters of co-expressed genes. The clusters to be analyzed are often noisy, containing a mixture of co-regulated and non-co-regulated genes. We propose a method to discriminate co-regulated from non-co-regulated genes on the basis of counts of pattern occurrences in their non-coding sequences. METHODS: String-based pattern discovery is combined with discriminant analysis to classify genes on the basis of putative regulatory motifs. RESULTS: The approach is evaluated by comparing the significance of patterns detected in annotated regulons (positive control), random gene selections (negative control) and high-throughput regulons (noisy data) from the yeast Saccharomyces cerevisiae. The classification is evaluated on the annotated regulons, and the robustness and rejection power are assessed with mixtures of co-regulated and random genes.
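The feature-building step described above, counting pattern occurrences in each gene's non-coding sequence, might be sketched as follows. Only the counting is shown; the discriminant analysis and the authors' pattern discovery procedure are not. The gene names, sequences and pattern list are made up, and `count_occurrences` and `pattern_count_matrix` are hypothetical helper names.

```python
def count_occurrences(seq, pattern):
    """Count occurrences of `pattern` in `seq`, allowing overlaps."""
    count = 0
    start = 0
    while True:
        idx = seq.find(pattern, start)
        if idx == -1:
            return count
        count += 1
        start = idx + 1  # advance one position so overlaps are counted


def pattern_count_matrix(upstream, patterns):
    """Map each gene to a vector of pattern counts (one entry per pattern)."""
    return {
        gene: [count_occurrences(seq.upper(), p) for p in patterns]
        for gene, seq in upstream.items()
    }


# Toy input, invented for illustration only.
upstream = {
    "geneA": "ttCACGTGaaCACGTGtt",
    "geneB": "ttttaaaacccc",
}
patterns = ["CACGTG", "AAAA"]
matrix = pattern_count_matrix(upstream, patterns)
```

Each row of `matrix` could then serve as the feature vector for one gene in whatever classifier is chosen.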
13.
14.
Identification of a conserved sequence in the non-coding regions of many human genes. Total citations: 4 (self-citations: 3; citations by others: 1)
We have analyzed a sequence of approximately 70 base pairs (bp) that shows a high degree of similarity to sequences present in the non-coding regions of a number of human and other mammalian genes. The sequence was discovered in a fragment of human genomic DNA adjacent to an integrated hepatitis B virus genome in cells derived from human hepatocellular carcinoma tissue. When one of the viral flanking sequences was compared to nucleotide sequences in GenBank, more than thirty human genes were identified that contained a similar sequence in their non-coding regions. The sequence element was usually found once or twice in a gene, either in an intron or in the 5' or 3' flanking regions. It did not share any similarities with known short interspersed nucleotide elements (SINEs) or presently known gene regulatory elements. This element was highly conserved at the same position within the corresponding human and mouse genes for myoglobin and N-myc, indicating evolutionary conservation and possible functional importance. Preliminary DNase I footprinting data suggested that the element or its adjacent sequences may bind nuclear factors to generate specific DNase I hypersensitive sites. The size, structure, and evolutionary conservation of this sequence indicate that it is distinct from other types of short interspersed repetitive elements. It is possible that the element may have a cis-acting functional role in the genome.
15.
Computational identification of protein coding potential of conserved sequence tags through cross-species evolutionary analysis. Total citations: 4 (self-citations: 0; citations by others: 4)
The identification of conserved sequence tags (CSTs) through comparative genome analysis may reveal important regulatory elements involved in shaping the spatio-temporal expression of genetic information. It is well known that the most significant fraction of CSTs observed in human–mouse comparisons corresponds to protein coding exons, due to their strong evolutionary constraints. As we still do not know the complete gene inventory of the human and mouse genomes, it is of the utmost importance to establish whether detected conserved sequences are genes or not. We propose here a simple algorithm that, based on the observation of the specific evolutionary dynamics of coding sequences, efficiently discriminates between coding and non-coding CSTs. The application of this method may help the validation of predicted genes, the prediction of alternative splicing patterns in known and unknown genes and the definition of a dictionary of non-coding regulatory elements.
16.
17.
An analysis of the diversity of the aspartyl proteases of Plasmodium falciparum, known as plasmepsins (PMs), was completed in view of their possible role as drug targets. DNA sequence polymorphisms were identified in nine pm genes including their non-coding (introns and 5' flanking) sequences. All genes contained at least one single nucleotide polymorphism (SNP). Extensive microsatellite diversity was observed predominantly in non-coding sequences. All but one non-synonymous polymorphism (a conservative substitution) were mapped to the surface of the predicted protein, contradicting a possible role in enzymatic activity. The distribution of SNPs was found to be non-random among pm genes, with pm6 and pm10 having significantly higher SNP densities, suggesting they were under selection. For pm6 the majority of the SNPs were in introns and some of these may contribute to splice site variation. SNPs were found at a high density in both the coding and non-coding sequences of pm10. Recombination was important in generating additional diversity at this locus. Although direct selection for pm10 mutations could not be ruled out, the presence of balancing selection and a high density of SNPs in non-coding sequence led us to propose that another gene under selection may be influencing the diversity in the region. By sequencing short DNA tags in a 200 kb region flanking pm10 we show that a cluster of antigen genes, known to be under diversifying selection, may contribute to the observed diversity. We discuss the importance of diversity and local selection effects when choosing drug targets for intervention strategies.
18.
Sophie Marion de Procé Daniel L. Halligan Peter D. Keightley Brian Charlesworth 《Journal of molecular evolution》2009,69(6):601-611
Contrary to the classical view, a large amount of non-coding DNA seems to be selectively constrained in Drosophila and other species. Here, using Drosophila miranda BAC sequences and the Drosophila pseudoobscura genome sequence, we aligned coding and non-coding sequences between D. pseudoobscura and D. miranda, and investigated their patterns of evolution. We found two patterns that have previously been observed in comparisons between Drosophila melanogaster and its relatives. First, there is a negative correlation between intron divergence and intron length, suggesting that longer non-coding sequences may contain more regulatory elements than shorter sequences. Our other main finding is a negative correlation between the rate of non-synonymous substitutions (dN) and codon usage bias (Fop), showing that fast-evolving genes have a lower codon usage bias, consistent with strong positive selection interfering with weak selection for codon usage.
19.
20.
A new statistical approach using functions based on the circular code correctly classifies more than 93% of bases in protein (coding) genes and non-coding genes of human sequences. Based on this statistical study, research software called 'Analysis of Coding Genes' (ACG) has been developed for identifying protein genes in genomes and for determining their frame. Furthermore, the ACG software also allows an evaluation of the length of protein genes, their position in the genome, their position relative to one another, and the prediction of internal frames in protein genes.