Similar articles
 20 similar articles found
1.
In this paper, we use a statistical estimator developed in astrophysics to study the distribution and organization of features of the human genome. Using the human reference sequence, we quantify the global distribution of CpG islands (CGI) in each chromosome and demonstrate that the organization of the CGI across a chromosome is non-random, exhibits surprisingly long-range correlations (10 Mb) and varies significantly among chromosomes. These correlations of CGI summarize functional properties of the genome that are not captured when considering variation in any single, local feature. The demonstration of the proposed methods to quantify the organization of CGI in the human genome forms the basis of future studies. The most illuminating of these will assess the potential impact on phenotypic variation of inter-individual variation in the organization of the functional features of the genome within and among chromosomes, and among individuals for particular chromosomes.

2.
CpG islands as gene markers in the human genome.   Cited by: 65 (self-citations: 0, citations by others: 65)
F Larsen, G Gundersen, R Lopez, H Prydz. Genomics, 1992, 13(4): 1095-1107

3.
4.
5.
Chuang LY, Huang HC, Lin MC, Yang CH. PLoS ONE, 2011, 6(6): e21036

Background

Regions of a genome with abundant GC nucleotides, a high CpG number, and a length greater than 200 bp are often referred to as CpG islands. These islands are usually located at the 5′ end of genes. Recently, several algorithms for the prediction of CpG islands have been proposed.
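The definition above can be made concrete with a short sketch. The abstract only states the qualitative criteria (abundant GC, high CpG number, length greater than 200 bp); the numeric thresholds used here (GC content of at least 0.5 and an observed/expected CpG ratio of at least 0.6) are assumptions borrowed from commonly used CpG island criteria, and the function names are hypothetical.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC content, CpG count and observed/expected CpG ratio."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # number of CpG dinucleotides
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: (C * G) / N
    expected = c * g / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return {"length": n, "gc_content": gc_content,
            "cpg_count": cpg, "obs_exp": obs_exp}


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5,
                          min_obs_exp: float = 0.6) -> bool:
    """Apply the assumed thresholds; only the 200 bp length comes from the text."""
    st = cpg_stats(seq)
    return (st["length"] > min_len
            and st["gc_content"] >= min_gc
            and st["obs_exp"] >= min_obs_exp)


if __name__ == "__main__":
    print(cpg_stats("CG" * 150))
    print(looks_like_cpg_island("CG" * 150))  # long and CpG-rich
    print(looks_like_cpg_island("AT" * 150))  # long but no G or C
```

A real predictor would apply such a test to candidate regions of a chromosome rather than to a whole sequence; this sketch only illustrates the statistics involved.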

Methodology/Principal Findings

We propose here a new method called CPSORL, which combines a complement particle swarm optimization algorithm with reinforcement learning to predict CpG islands more reliably. Several CpG island prediction tools based on the sliding window technique have been developed previously. However, the quality of their results depends heavily on the choice of window size, and thus these methods leave room for improvement.
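The dependence on window size can be illustrated with a minimal sliding window scan. This is not CPSORL and not any of the named tools; it is a generic sketch in which the thresholds (GC content of at least 0.5, observed/expected CpG ratio of at least 0.6), the merging rule and all function names are assumptions made for illustration.

```python
def window_passes(sub: str, min_gc: float = 0.5,
                  min_obs_exp: float = 0.6) -> bool:
    """Check one window against the assumed GC and obs/exp CpG thresholds."""
    n = len(sub)
    c, g = sub.count("C"), sub.count("G")
    if n == 0 or c == 0 or g == 0:
        return False
    gc = (c + g) / n
    obs_exp = sub.count("CG") * n / (c * g)
    return gc >= min_gc and obs_exp >= min_obs_exp


def sliding_window_islands(seq: str, window: int = 200, step: int = 1):
    """Slide a fixed window along seq and merge passing windows into regions.

    Returns a list of (start, end) pairs, 0-based with end exclusive.
    """
    s = seq.upper()
    regions = []
    for start in range(0, len(s) - window + 1, step):
        if window_passes(s[start:start + window]):
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Synthetic sequence: a 200 bp CpG-rich block at positions 400..600
    seq = "AT" * 200 + "CG" * 100 + "AT" * 200
    for w in (100, 200):
        print(w, sliding_window_islands(seq, window=w))
```

On this synthetic sequence the reported boundaries shift with the window size even though the underlying CpG-rich block is unchanged, which is the sensitivity the abstract describes.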

Conclusions/Significance

Experimental results indicate that CPSORL achieves higher sensitivity and a higher correlation coefficient on all selected experimental contigs than the methods it was compared with (CpGIS, CpGcluster, CpGProd and CpGPlot). It identified more CpG islands in chromosomes 21 and 22 of the human genome than the other methods from the literature, and it achieved the highest coverage rate (3.4%). CPSORL can also be applied to identify promoter and TSS regions associated with CpG islands across the entire human genome. Compared with CpGcluster, the islands predicted by CPSORL covered a larger share of the TSS (12.2%) and promoter (26.1%) regions. If Alu sequences are considered, the islands predicted by CPSORL (Alu) covered a larger share of the TSS (40.5%) and promoter (67.8%) regions than CpGIS. Furthermore, CPSORL was used to verify that the average methylation density was 5.33% for CpG islands in the entire human genome.

6.
The human genome is revisited using exon and intron distribution profiles. The 26,564 annotated genes in the human genome (build October 2003) contain 233,785 exons and 207,344 introns. On average, there are 8.8 exons and 7.8 introns per gene. About 80% of the exons on each chromosome are < 200 bp in length. Fewer than 0.01% of the introns are < 20 bp in length, and fewer than 10% of introns are more than 11,000 bp in length. These results suggest constraints on the ability of the splicing machinery to splice out very long or very short introns and provide insight into optimal intron length selection. Interestingly, the total lengths of introns and of intergenic DNA on each chromosome are significantly correlated with the determined chromosome size, with correlation coefficients of r = 0.95 and r = 0.97, respectively. These results suggest their implication in genome design.
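The per-gene averages quoted in this abstract follow directly from the three counts it reports. The sketch below reproduces that arithmetic; the variable names are hypothetical, and the final consistency check rests on the assumption that a gene with n exons has n - 1 introns.

```python
# Counts as reported in the abstract (human genome build, October 2003)
genes = 26_564
exons = 233_785
introns = 207_344

exons_per_gene = exons / genes
introns_per_gene = introns / genes

# If every gene with n exons contributes n - 1 introns, then
# exons - introns should be close to the number of genes.
exon_intron_gap = exons - introns

if __name__ == "__main__":
    print(f"exons per gene:   {exons_per_gene:.1f}")
    print(f"introns per gene: {introns_per_gene:.1f}")
    print(f"exons - introns:  {exon_intron_gap} (genes: {genes})")
```

The difference between exon and intron counts (26,441) is close to, but not exactly, the gene count, so the reported totals are roughly consistent with one intron fewer than exons per gene.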

7.
In eukaryotes, neighboring genes can be packaged together in specific chromatin structures that ensure their coordinated expression. Examples of such multi-gene chromatin domains are well-documented, but a global view of the chromatin organization of eukaryotic genomes is lacking. To systematically identify multi-gene chromatin domains, we constructed a compendium of genome-scale binding maps for a broad panel of chromatin-associated proteins in Drosophila melanogaster. Next, we computationally analyzed this compendium for evidence of multi-gene chromatin domains using a novel statistical segmentation algorithm. We find that at least 50% of all fly genes are organized into chromatin domains, which often consist of dozens of genes. The domains are characterized by various known and novel combinations of chromatin proteins. The genes in many of the domains are coregulated during development and tend to have similar biological functions. Furthermore, during evolution fewer chromosomal rearrangements occur inside chromatin domains than outside domains. Our results indicate that a substantial portion of the Drosophila genome is packaged into functionally coherent, multi-gene chromatin domains. This has broad mechanistic implications for gene regulation and genome evolution.

8.
9.
10.
Analysis of the DNA sequences of human chromosomes 21 and 22, performed using the specially designed MegaGene software, yielded the following results. Purine and pyrimidine nucleotide residues are unevenly distributed along both chromosomes, displaying maxima and minima (Y waves phi) with a period of about 3 Mbp. The distribution of G + C along both chromosomes has no distinct maxima and minima; however, chromosome 21 contains considerably less G + C than chromosome 22. Both exons and Alu repeats are unevenly distributed along chromosome 21: they are scarce in its left part and abundant in the right part, while MIR elements are spread fairly evenly along this chromosome. The Alu repeats show a wave-like distribution pattern that is similar for both repeat orientations. The numbers of Alu repeats in the two opposite orientations were equal for both studied chromosomes, and this may be considered a new property of the human genome. The positive correlation between the exon and Alu distribution patterns along the chromosome, the concurrent distribution of Alu repeats in both orientations along the chromosome, and the equal copy numbers for Alu in direct and inverted orientations within an individual chromosome point to their important role in the human genome, and do not fit the notion that Alu repeats belong to parasitic (junk) DNA.

11.
The exon-intron organization of genes of all human chromosomes was studied in relation to the density of gene distribution on DNA strands and to the number of introns in genes. The lengths of exons, introns and genes were found to vary in a correlated manner, and this correlation depends on the density of genes in human chromosomes. It was established that genes with an exon-intron organization show similar trends in the variation of exon and intron lengths as the number of introns increases.

12.
13.
14.
Sequence organization of the human genome   Cited by: 1 (self-citations: 0, citations by others: 1)
The organization of three sequence classes (single copy, repetitive, and inverted repeated sequences) within the human genome has been studied by renaturation techniques, hydroxylapatite binding methods, and DNA hyperchromism. Repetitive sequence classes are distributed throughout 80% or more of the genome. Slightly more than half of the genome consists of short single copy sequences, with a length of about 2 kb, interspersed with repetitive sequences. The average length of the repetitive sequences is also small and approximates the length of these sequences found in other organisms. The sequence organization of the human genome therefore resembles the sequence organization found in Xenopus and sea urchin. The inverted repeats are essentially randomly positioned with respect to both sequence class and sequence arrangement, so that all three sequence classes are found to be mutually interspersed in a portion of the genome.

15.
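屠鞠传礼, 王建军. 《生物信息学》, 2010, 8(3): 254-257, 262
To investigate how CpG islands arise and disappear, and how well CpG islands located outside gene promoter regions are conserved, we used sequence alignment and evolutionary conservation analysis to examine CpG islands on genes conserved between human and mouse. The results show that mutations in existing conserved sequences, together with sequence insertions and deletions, are the main causes of CpG island gain and loss. Further analysis found that 52% of the CpG islands whose conserved sequence is completely absent from the mouse genome lie between two transposons, suggesting that transposon-mediated sequence insertion is an important cause of CpG island formation and loss. In the human genome, about 79% of the CpG islands outside promoter regions are newly arisen, significantly higher than the proportion of newly arisen CpG islands within promoter regions (41%). GO analysis shows that some of the genes associated with these CpG islands are significantly related to nervous system development, suggesting that newly arisen CpG islands take part in neural development.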

16.
CpG islands in vertebrate genomes   Cited by: 120 (self-citations: 0, citations by others: 120)

17.
18.
19.
20.