Similar literature
 Found 20 similar articles; search time: 15 ms
1.
MOTIVATION: GenBank data are at present lacking alpha satellite higher-order repeat (HOR) annotation. Furthermore, exact HOR consensus lengths have not been reported so far. Given the fast growth of sequence databases in the centromeric region, it is of increasing interest to have efficient tools for computational identification and analysis of HORs from known sequences. RESULTS: We develop a graphical user interface method, ColorHOR, for fast computational identification of HORs in a given genomic sequence, without requiring a priori information on the composition of the genomic sequence. ColorHOR is based on an extension of the key-string algorithm and provides a color representation of the order and orientation of HORs. For the key string, we use a robust 6 bp string from a consensus alpha satellite, and we test its representative nature. The ColorHOR algorithm provides a direct visual identification of HORs (direct and/or reverse complement). In more detail, we first illustrate the ColorHOR results for human chromosome 1. Using ColorHOR we determine for the first time the HOR annotation of the GenBank sequence of the whole human genome. In addition to some HORs, corresponding to those determined previously biochemically, we find new HORs in chromosomes 4, 8, 9, 10, 11 and 19. For the first time, we determine exact consensus lengths of HORs in 10 chromosomes. We propose that the HOR assignment obtained by using ColorHOR be included into the GenBank database.

2.
Tandemly arrayed non-coding sequences or satellite DNAs (satDNAs) are rapidly evolving segments of eukaryotic genomes, including the centromere, and may raise a genetic barrier that leads to speciation. However, determinants and mechanisms of satDNA sequence dynamics are only partially understood. Sequence analyses of a library of five satDNAs common to the root-knot nematodes Meloidogyne chitwoodi and M. fallax, together with a satDNA specific to M. chitwoodi, revealed low sequence identity (32–64%) among them. However, despite sequence differences, two conserved motifs were recovered. One of them turned out to be highly similar to the CENP-B box of human alpha satDNA, identical in 10–12 out of 17 nucleotides. In addition, organization of nematode satDNAs was comparable to that found in alpha satDNA of human and primates, characterized by monomers concurrently arranged in simple and higher-order repeat (HOR) arrays. In contrast to alpha satDNA, phylogenetic clustering of nematode satDNA monomers extracted either from simple or from HOR arrays indicated frequent shuffling between these two organizational forms. Comparison of homogeneous simple arrays and complex HORs composed of different satDNAs enabled, for the first time, the identification of conserved motifs as obligatory components of monomer junctions. This observation highlights the role of short motifs in rearrangements, even among highly divergent sequences. Two mechanisms are proposed to be involved in this process, i.e., putative transposition-related cut-and-paste insertions and/or illegitimate recombination. The possible involvement of the nematode CENP-B box-like sequence in the transposition-related mechanism, together with the previously established similarity of the human CENP-B protein and pogo-like transposases, implicates a novel role of the CENP-B box and related sequence motifs in addition to the known function in centromere protein binding.

3.
A new key-string segmentation algorithm for identification of alpha satellite DNAs and higher-order repeat (HOR) units was introduced and exemplified. Starting with an initial key string, we determine the dominant key string and HOR. Our key-string algorithm was used to scan the recent GenBank data for human alpha satellite DNA sequence AC017075.8 (193 277 bp) from the centromeric region of chromosome 7. The sequence was computationally segmented into one HOR domain (super-repeat domain) and two non-HOR domains. The dominant key string GTTTCT provided segmentation in terms of alpha monomers. The HOR is tandemly repeated in 54 copies in the super-repeat (HOR) domain. Five insertions and three deletions in the HOR structure associated with a dominant key string were identified. A consensus HOR was constructed. Divergence of individual HOR copies from the consensus amounts to 0.7% on average, while divergence between the 16 monomer variants within each HOR is on average 20%. In the front and back domains, 199 monomer variants were identified that are not organized in HORs and diverge by 20-40%.
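The segmentation step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it only cuts a sequence immediately before every exact occurrence of a key string (GTTTCT, the dominant key string named in the abstract) and reports the segment lengths. The function names `segment_by_key_string` and `segment_lengths` are hypothetical, and the sketch assumes exact, forward-strand matching, with no handling of reverse complements, insertions, or deletions.

```python
def segment_by_key_string(sequence: str, key: str = "GTTTCT") -> list[str]:
    """Cut `sequence` immediately before each exact occurrence of `key`.

    Hypothetical helper for illustration only; every segment except possibly
    the first starts with the key string.
    """
    seq = sequence.upper()
    starts = []
    pos = seq.find(key)
    while pos != -1:
        starts.append(pos)
        pos = seq.find(key, pos + 1)
    if not starts:
        return [seq]
    # Keep any leading text before the first key string as its own segment.
    bounds = ([0] if starts[0] != 0 else []) + starts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def segment_lengths(sequence: str, key: str = "GTTTCT") -> list[int]:
    """Lengths of the key-string segments, in order along the sequence."""
    return [len(s) for s in segment_by_key_string(sequence, key)]


if __name__ == "__main__":
    toy = "AAGTTTCTCCCCGTTTCTGGGTTTCTA"  # made-up toy sequence, not real data
    print(segment_by_key_string(toy))
    print(segment_lengths(toy))
```

On real alpha satellite data, one would expect many segment lengths near the monomer length if the key string is representative; a recurring pattern in the sequence of segments would then hint at a higher-order repeat. That interpretation step is not implemented here.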

4.
Much attention has been devoted to identifying genomic patterns underlying the evolution of the human brain and its emergent advanced cognitive capabilities, which lie at the heart of differences distinguishing humans from chimpanzees, our closest living relatives. Here, we identify two particular intragene repeat structures of noncoding human DNA, spanning as much as a hundred kilobases, that are present in the human genome but are absent from the chimpanzee genome and other nonhuman primates. Using our novel computational method Global Repeat Map, we examine tandem repeat structure in human and chimpanzee chromosome 1. In human chromosome 1, we find three higher order repeats (HORs), two of them novel, not reported previously, whereas in chimpanzee chromosome 1, we find only one HOR, a 2mer alphoid HOR instead of the human alphoid 11mer HOR. In human chromosome 1, we identify an HOR based on a 39-bp primary repeat unit, with secondary, tertiary, and quartic repeat units, fully embedded in the human hornerin gene, related to regenerating and psoriatic skin. Such an HOR is not found in chimpanzee chromosome 1. We find a remarkable human 3mer HOR organization based on the ~1.6-kb primary repeat unit, fully embedded within the neuroblastoma breakpoint family genes, which is related to the function of the human brain. Such HORs are not present in chimpanzees. In general, we find that human-chimpanzee differences are much larger for tandem repeats, in particular for HORs, than for gene sequences. This may be of great significance in light of recent studies that are beginning to reveal the large-scale regulatory architecture of the human genome, in particular the role of noncoding sequences. We hypothesize about the possible importance of human accelerated HOR patterns as components in the gene expression multilayered regulatory network.

5.
Efficient construction of BAC-based human artificial chromosomes (HACs) requires optimization of each key functional unit as well as development of techniques for the rapid and reliable manipulation of high-molecular weight BAC vectors. Here, we have created synthetic chromosome 17-derived alpha-satellite arrays, based on the 16-monomer repeat length typical of natural D17Z1 arrays, in which the consensus CENP-B box elements are either completely absent (0/16 monomers) or increased in density (16/16 monomers) compared to D17Z1 alpha-satellite (5/16 monomers). Using these vectors, we show that the presence of CENP-B box elements is a requirement for efficient de novo centromere formation and that increasing the density of CENP-B box elements may enhance the efficiency of de novo centromere formation. Furthermore, we have developed a novel, high-throughput methodology that permits the rapid conversion of any genomic BAC target into a HAC vector by transposon-mediated modification with synthetic alpha-satellite arrays and other key functional units. Taken together, these approaches offer the potential to significantly advance the utility of BAC-based HACs for functional annotation of the genome and for applications in gene transfer.

6.
We have investigated the organization and complexity of alpha satellite DNA on chromosomes 10 and 12 by restriction endonuclease mapping, in situ hybridization (ISH), and DNA-sequencing methods. Alpha satellite DNA on both chromosomes displays a basic dimeric organization, revealed as a 6- and an 8-mer higher-order repeat (HOR) unit on chromosome 10 and as an 8-mer HOR on chromosome 12. While these HORs show complete chromosome specificity under high-stringency ISH conditions, they recognize an identical set of chromosomes under lower stringencies. At the nucleotide sequence level, both chromosome 10 HORs are 50% identical to the HOR on chromosome 12 and to all other alpha satellite DNA sequences from the in situ cross-hybridizing chromosomes, with the exception of chromosome 6. An 80% identity between chromosome 6- and chromosome 10-derived alphoid sequences was observed. These data suggest that the alphoid DNA on chromosomes 6 and 10 may represent a distinct subclass of the dimeric subfamily. These sequences are proposed to be present, along with the more typical dimeric alpha satellite sequences, on a number of different human chromosomes.

7.
Understanding the folding of centromere DNA in the maximally condensed metaphase chromosome remains a basic challenge in cell biology. We propose here a set of structural models with a graphical presentation of alphoid higher order repeat (HOR) distribution in the centromere folding, based on the assumption of an encryption key for microtubule-centromere interaction, which arises from the chromosome-specific crystal-like structure of HORs. A specific HOR leads to a characteristic geometrical pattern, which may allow an individual microtubule to recognize the specific centromere structure of each chromosome.

8.
The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of a complete K-string ensemble, which enables a new method of direct mapping of a symbolic DNA sequence into the frequency domain, with straightforward identification of repeats as peaks in the GRM diagram. In this way, we obtain a very fast, efficient and highly automated repeat finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use, in order to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of the Period 3 pattern in exons, implementation of a ‘magnifying glass’ effect, identification of complex HOR patterns, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. The GRM algorithm is convenient for use, in particular, in cases of large repeat units, of highly mutated and/or complex repeats, and of global repeat maps for large genomic sequences (chromosomes and genomes).

9.
The human centromere protein B (CENP-B), a centromeric heterochromatin component, forms a homodimer that specifically binds to a distinct DNA sequence (the CENP-B box), which appears within every other alpha-satellite repeat. Previously, we determined the structure of the human CENP-B DNA-binding domain, CENP-B-(1-129), complexed with the CENP-B box DNA. In the present study, we determined the crystal structure of its dimerization domain (CENP-B-(540-599)), another functional domain of CENP-B, at 1.65-Å resolution. CENP-B-(540-599) contains two alpha-helices, which are folded into an antiparallel configuration. The CENP-B-(540-599) dimer formed a symmetrical, antiparallel, four-helix bundle structure with a large hydrophobic patch in which 23 residues of one monomer form van der Waals contacts with the other monomer. In the CENP-B-(540-599) dimer, the N-terminal ends of CENP-B-(540-599) are oriented on opposite sides of the dimer. This CENP-B dimer configuration may be suitable for capturing two distant CENP-B boxes during centromeric heterochromatin formation.

10.
Human centromeres are mainly composed of alpha satellite DNA hierarchically organized as higher-order repeats (HORs). Alpha satellite dynamics is shown by sequence homogenization in centromeric arrays and by its transfer to other centromeric locations, for example, during the maturation of new centromeres. We identified during prenatal aneuploidy diagnosis by fluorescent in situ hybridization a de novo insertion of alpha satellite DNA from the centromere of chromosome 18 (D18Z1) into cytoband 15q26. Although bound by CENP-B, this locus did not acquire centromeric functionality as demonstrated by the lack of constriction and the absence of CENP-A binding. The insertion was associated with a 2.8-kbp deletion and likely occurred in the paternal germline. The site was enriched in long terminal repeats and located ∼10 Mbp from the location where a centromere was ancestrally seeded and became inactive in the common ancestor of humans and apes 20–25 million years ago. Long-read mapping to the T2T-CHM13 human genome assembly revealed that the insertion derives from a specific region of chromosome 18 centromeric 12-mer HOR array in which the monomer size follows a regular pattern. The rearrangement did not directly disrupt any gene or predicted regulatory element and did not alter the methylation status of the surrounding region, consistent with the absence of phenotypic consequences in the carrier. This case demonstrates a likely rare but new class of structural variation that we name “alpha satellite insertion.” It also expands our knowledge on alphoid DNA dynamics and conveys the possibility that alphoid arrays can relocate near vestigial centromeric sites.

11.
Comparison of human and chimpanzee genomes has received much attention, because of its paramount role in understanding the evolutionary step distinguishing us from our closest living relative. In order to contribute to insight into Y chromosome evolutionary history, we study and compare tandems, higher order repeats (HORs), and regularly dispersed repeats in human and chimpanzee Y chromosome contigs, using the robust Global Repeat Map algorithm. We find a new type of long-range acceleration, human-accelerated HOR regions. In peripheral domains of 35mer human alphoid HORs, we find riddled features with ten additional repeat monomers. In chimpanzee, we identify a 30mer alphoid HOR. We construct alphoid HOR schemes showing significant human–chimpanzee difference, revealing rapid evolution after human–chimpanzee separation. We identify and analyze over 20 large repeat units, most of them reported here for the first time as: chimpanzee and human ~1.6 kb 3mer secondary repeat unit (SRU) and ~23.5 kb tertiary repeat unit (~0.55 kb primary repeat unit, PRU); human 10848, 15775, 20309, 60910, and 72140 bp PRUs; human 3mer SRU (~2.4 kb PRU); 715mer and 1123mer SRUs (5mer PRU); chimpanzee 5096, 10762, 10853, 60523 bp PRUs; and chimpanzee 64624 bp SRU (10853 bp PRU). We show that substantial human–chimpanzee differences are concentrated in large repeat structures, at the level of as much as ~70% divergence, sizably exceeding previous numerical estimates for some selected noncoding sequences. Smeared over the whole sequenced assembly (25 Mb), this gives ~14% human–chimpanzee divergence. This is a significantly higher estimate of divergence between human and chimpanzee than previous estimates.

12.
Tandemly repeated DNA families appear to undergo concerted evolution, such that repeat units within a species have a higher degree of sequence similarity than repeat units from even closely related species. While intraspecies homogenization of repeat units can be explained satisfactorily by repeated rounds of genetic exchange processes such as unequal crossing over and/or gene conversion, the parameters controlling these processes remain largely unknown. Alpha satellite DNA is a noncoding tandemly repeated DNA family found at the centromeres of all human and primate chromosomes. We have used sequence analysis to investigate the molecular basis of 13 variant alpha satellite repeat units, allowing comparison of multiple independent recombination events in closely related DNA sequences. The distribution of these events within the 171-bp monomer is nonrandom and clusters in a distinct 20- to 25-bp region, suggesting possible effects of primary sequence and/or chromatin structure. The position of these recombination events may be associated with the location within the higher-order repeat unit of the binding site for the centromere-specific protein CENP-B. These studies have implications for the molecular nature of genetic recombination, mechanisms of concerted evolution, and higher-order structure of centromeric heterochromatin.

13.
The human centromere proteins A (CENP-A) and B (CENP-B) are the fundamental centromere components of chromosomes. CENP-A is the centromere-specific histone H3 variant, and CENP-B specifically binds a 17-base pair sequence (the CENP-B box), which appears within every other alpha-satellite DNA repeat. In the present study, we demonstrated centromere-specific nucleosome formation in vitro with recombinant proteins, including histones H2A, H2B, H4, CENP-A, and the DNA-binding domain of CENP-B. The CENP-A nucleosome wraps 147 base pairs of the alpha-satellite sequence within its nucleosome core particle, like the canonical H3 nucleosome. Surprisingly, CENP-B binds to nucleosomal DNA when the CENP-B box is wrapped within the nucleosome core particle and induces translational positioning of the nucleosome without affecting its rotational setting. This CENP-B-induced translational positioning only occurs when the CENP-B box sequence is settled in the proper rotational setting with respect to the histone octamer surface. Therefore, CENP-B may be a determinant for translational positioning of the centromere-specific nucleosomes through its binding to the nucleosomal CENP-B box.

14.
15.
Minor satellite DNA, found at Mus musculus centromeres, is not present in the genome of the Asian mouse Mus caroli. This repetitive sequence family is speculated to have a role in centromere function by providing an array of binding sites for the centromere-associated protein CENP-B. The apparent absence of CENP-B binding sites in the M. caroli genome poses a major challenge to this hypothesis. Here we describe two abundant satellite DNA sequences present at M. caroli centromeres. These satellites are organized as tandem repeat arrays, over 1 Mb in size, of either 60- or 79-bp monomers. All autosomes carry both satellites and small amounts of a sequence related to the M. musculus major satellite. The Y chromosome contains small amounts of both major satellite and the 60-bp satellite, whereas the X chromosome carries only major satellite sequences. M. caroli chromosomes segregate in M. caroli x M. musculus interspecific hybrid cell lines, indicating that the two sets of chromosomes can interact with the same mitotic spindle. Using a polyclonal CENP-B antiserum, we demonstrate that M. caroli centromeres can bind murine CENP-B in such an interspecific cell line, despite the absence of canonical 17-bp CENP-B binding sites in the M. caroli genome. Sequence analysis of the 79-bp M. caroli satellite reveals a 17-bp motif that contains all nine bases previously shown to be necessary for in vitro binding of CENP-B. This M. caroli motif binds CENP-B from HeLa cell nuclear extract in vitro, as indicated by gel mobility shift analysis. We therefore suggest that this motif also causes CENP-B to associate with M. caroli centromeres in vivo. Despite the sequence differences, M. caroli presents a third, novel mammalian centromeric sequence producing an array of binding sites for CENP-B.
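A search for 17-bp motifs that carry the nine essential bases, as mentioned in this abstract, could be sketched in Python as below. The abstracts do not spell out the motif, so the pattern `NTTCGNNNNANNCGGGN` is an assumption taken from the commonly cited form of the CENP-B box core (N stands for any base). The function names are hypothetical, and this is only an exact pattern scan on both strands, not a binding prediction.

```python
import re

# Assumption, not stated in the abstracts: the nine essential bases of the
# 17-bp CENP-B box, written in the commonly cited form with N = any base.
ESSENTIAL_BOX = "NTTCGNNNNANNCGGGN"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string over the alphabet A, C, G, T, N."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def _to_regex(pattern: str) -> str:
    """Turn a pattern with N wildcards into a regular expression."""
    return "".join("[ACGT]" if base == "N" else base for base in pattern)


def find_box_like_motifs(sequence: str, pattern: str = ESSENTIAL_BOX):
    """Return sorted (start, strand, matched_text) tuples for both strands.

    A lookahead is used so that overlapping matches are all reported.
    """
    seq = sequence.upper()
    hits = []
    for strand, pat in (("+", pattern), ("-", reverse_complement(pattern))):
        regex = re.compile("(?=(" + _to_regex(pat) + "))")
        for match in regex.finditer(seq):
            hits.append((match.start(), strand, match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    toy_box = "CTTCGTTGGAAACGGGA"  # made-up 17-mer that fits the assumed pattern
    print(find_box_like_motifs("AAAA" + toy_box + "TTTT"))
```

Minus-strand hits are reported at their position on the given strand, with the matched text shown as it appears there.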

16.
A species-specific satellite DNA (Lb-MspISAT) was isolated from the North African rodent Lemniscomys barbarus. This DNA is highly homogeneous in the sequence of different repeats and shows no internal repetitions. Filter and in situ hybridizations demonstrated that it is tandemly repeated at the centromeres of all chromosomes of the complement. A 19-bp CENP-B-like motif was found in Lb-MspISAT which conserves 12 of the 17-bp of the human CENP-B box, but only 5 of the 9-bp of the canonical sequence that is necessary to bind the CENP-B protein. Compared with the human CENP-B box, nucleotide substitutions and insertions increase the palindromic structure of this motif. The possibilities that it may be involved in centromeric function or in homogenization of the Lb-MspISAT sequence are discussed.

17.
CENP-B, a highly conserved centromere-associated protein, binds to α-satellite DNA, the centromeric satellite of primate chromosomes, at a 17-bp sequence, the CENP-B box. By fluorescence in situ hybridization (FISH) with an oligomer specific for the CENP-B box sequence, we have demonstrated the abundance of CENP-B boxes on all chromosomes (except the Y) of humans, chimpanzee, pygmy chimpanzee, gorilla, and orangutan. This sequence motif was not detected in the genomes of other primates, including gibbons, Old and New World monkeys, and prosimians. Our results indicate that the CENP-B box containing subtype of α-satellite DNA may have emerged recently in the evolution of the large-bodied hominoids, after divergence of the phylogenetic lines leading to gibbons and apes; the box is thus on the order of 15–25 million years of age. The rapid process of dispersal and fixation of the CENP-B box sequence throughout the human and great ape genomes is thought to be a consequence of concerted evolution of α-satellite subsets on both homologous and nonhomologous chromosomes. Correspondence to: T. Haaf

18.
This paper presents the first report on the structure of a 14-kb centromere sequence in a cereal genome that includes 1.9-kb direct repeats. The cereal centromeric sequence (CCS1) conserved in some Gramineae species contains a 17-bp motif similar to the CENP-B box, which serves as the binding site for the centromere-specific protein CENP-B in human. To isolate centromeric units from rice (Oryza sativa L.), we performed PCR using the CENP-B box-like sequences (CBLS) as primers. A 264-bp clone was amplified by this method, and called RCS1516. It appeared to be a novel member of the CCS1 family, sharing about 60% identity with the CCS1 sequences of other cereals. Then, a 14-kb genomic clone, λRCB11, carrying the RCS1516 sequence was isolated and sequenced. It was found to contain three copies of a 1.9-kb direct repeat, RCE1, separated by 5.1- and 1.7-kb. A 300-bp sequence at the 3′ end of RCE1 is highly conserved in all three copies (>90%) and is almost identical to the RCS1516 sequence including the CBLS motif. The copy number of RCE1 was estimated to range from 10² to 10³ in the haploid genome of rice. Cloned RCE1 units were used for fluorescent in situ hybridization (FISH) analysis, and signals were observed on almost every primary constriction of rice chromosomes. Thus it was concluded that RCE1 is a significant component of the rice centromere. The λRCB11 clone contained at least four A/T-rich regions, which are candidates for matrix attachment regions (MARs), in the sequences between the RCE1 repeats. Other elements that are homologous to the short centromeric repetitive sequences pSau3A9 and pRG5, detected in both sorghum and rice, were also found in the clone. Received: 9 June 1998 / Accepted: 16 September 1998

19.
In eukaryotes, CpG methylation is an epigenetic DNA modification that is important for heterochromatin formation. Centromere protein B (CENP-B) specifically binds to the centromeric 17 base-pair CENP-B box DNA, which contains two CpG dinucleotides. In this study, we tested complex formation by the DNA-binding domain of CENP-B with methylated and unmethylated CENP-B box DNAs, and found that CENP-B preferentially binds to the unmethylated CENP-B box DNA. Competition analyses revealed that the affinity of CENP-B for the CENP-B box DNA is reduced nearly to the level of nonspecific DNA binding by CpG methylation.

20.
Here, a new satellite-DNA family is isolated and characterized from wedge sole, Dicologoglossa cuneata Moreau, 1881 (Pleuronectiformes), a fish having a small genome. This satellite-DNA family of sequences was isolated by conventional cloning after digestion of genomic DNA with the DraI restriction enzyme. Repeat units are 171 bp in length with a high AT content (63%). Several runs of consecutive adenines and thymines were found, and concomitantly computer analyses revealed that these regions are prone to acquire stable sequence-directed curvature. Especially remarkable is that the DraI sequences are composed almost entirely of the repetition of up to fourteen 9-bp motifs (T/C)GTC(A/C)AAAA similar to other vertebrate centromeric satellite-DNA sequences. In fact, we demonstrate the origin of this satellite through duplication of this motif plus the addition of a stretch of cytosines. The centromeric location and the presence in this satellite-DNA sequence of not only different vertebrate motifs (CENP-B box, pJalpha) but also others such as the CDEIII motif of Saccharomyces cerevisiae reveal a possible role in centromere function. All these characteristics provide important information on the origin, function, and the evolution of the centromeric satellite DNAs in wedge sole.
