Similar articles
 20 similar articles found (search time: 15 ms)
1.
A search for new members of the mammalian interspersed repeat (MIR) family was carried out over the coding regions of the human genome in GenBank-116. Only 254 nucleotide sequences contained MIRs in coding regions; 45 of these MIR copies were previously unknown, including 17 that occur in translated gene regions. The program developed by the authors is shown to surpass the CENSOR program in search power. The evolution of the MIR copies located in translated regions of the human genome is discussed.

2.
The mammalian interspersed repetitive (MIR) element was amplified in mammals 130 million years ago. The MIR element is at least 260 bp in length and is found in approximately 10^5 copies in the mammalian genome. We analyzed copies of the MIR element in the DNA of various mammals to determine its relationship to the structure and function of genes, in an attempt to identify specific uses of the MIR element within the mammalian genome. We found that alternative splicing within the acetylcholine receptor gene in humans takes place within the MIR element and results in the incorporation of part of the MIR element into the coding sequence of this gene. Furthermore, the polyadenylation signal (AATAAA) at the 3' end of four different mammalian genes is derived from the MIR element. These uses of the MIR element suggest that other regulatory sequences found within the mammalian genome originated from ancient transposable elements, many of which may no longer be recognizable.

3.
Two types of short interspersed repeated DNA sequences (SINEs), MIR and Alu, were used to analyze genetic relationships among higher primates and to detect polymorphism in human genomic DNA. The DNA regions located between neighboring copies of these SINEs were amplified by polymerase chain reaction with primers complementary to the MIR and Alu consensus sequences (inter-SINE PCR). Comparing the sets of amplified DNA fragments from different species or individuals allows the relationships among them to be evaluated. The inter-MIR PCR technique confirmed the relationships among the higher primates of the infraorder Catarrhini reported elsewhere, which points to the efficiency of the method for phylogenetic studies. No human DNA polymorphism was revealed by inter-MIR PCR. Polymorphism was, however, detected by inter-Alu PCR; it is probably associated with the continuing amplification of Alu elements in the human genome.

5.
The location and density of mammalian interspersed repeats (MIRs) have been determined in the complete nucleotide sequence of human chromosome 22. The approach we developed detected 9675 MIRs at a statistically significant level, 15% more than the number revealed by all previous approaches. A considerable number of the MIRs missed by earlier algorithms are shown to occur in known DNA sequences of the human genome. The study of MIR density revealed a substantially irregular distribution along the chromosome. The data on the MIRs found and the computer program that searches for diverged sequences are available by e-mail: katrin2@mail.ru or katrin22@mtu-net.ru.
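
As a rough illustration of what a density profile along a chromosome involves, the sketch below counts repeat start positions in fixed-size windows. It is not the authors' program: the function name, the window size and the coordinates in the usage line are all hypothetical, chosen only to show the bookkeeping.

```python
def repeat_density(starts, chromosome_length, window):
    """Count repeat start positions in consecutive fixed-size windows.

    starts: 0-based start coordinates of detected repeats (hypothetical input).
    Returns one count per window; the last window may be shorter than `window`.
    """
    n_windows = -(-chromosome_length // window)  # ceiling division
    counts = [0] * n_windows
    for s in starts:
        if 0 <= s < chromosome_length:  # ignore coordinates outside the sequence
            counts[s // window] += 1
    return counts


# Toy coordinates, not real chromosome 22 data.
profile = repeat_density([5, 12, 18, 95], chromosome_length=100, window=10)
print(profile)
```

An uneven profile, such as many windows with zero counts next to a few dense ones, is the kind of irregular distribution the abstract describes.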

7.
Ubiquitin coding sequences were isolated from a human genomic library and two cDNA libraries. One human ubiquitin gene consists of 2055 nucleotides and codes for a polyprotein of 685 amino acid residues. The polyprotein contains nine direct repeats of the ubiquitin amino acid sequence, and the last ubiquitin sequence is extended by an additional valyl residue at the C-terminal end. No spacer sequences separate the ubiquitin repeats, and the coding regions are not interrupted by intervening sequences. This particular gene is transcribed, since cDNAs corresponding to the genomic sequence have been isolated. At least two more types of ubiquitin genes are encoded in the human genome: one codes for a ubiquitin monomer, while another presumably codes for three or four direct repeats of the ubiquitin sequence. Human DNA contains many copies of the ubiquitin sequence. Ubiquitin is therefore encoded in the human genome as a multigene family.
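
The figures quoted in this abstract can be checked against each other with a few lines of arithmetic. The sketch assumes that one ubiquitin unit is 76 residues long (a commonly cited value that the abstract itself does not state) and that the 2055-nucleotide count excludes the stop codon.

```python
# Consistency check for the polyubiquitin figures quoted in the abstract.
UBIQUITIN_LENGTH = 76  # residues per ubiquitin unit; assumption, not stated in the abstract
REPEATS = 9            # direct repeats reported in the abstract
EXTRA_RESIDUES = 1     # the additional C-terminal valine


def polyprotein_length(repeats, unit_length, extra):
    """Total residues in a spacer-free polyprotein."""
    return repeats * unit_length + extra


def coding_nucleotides(residues):
    """Nucleotides needed to encode the residues, stop codon excluded."""
    return residues * 3


residues = polyprotein_length(REPEATS, UBIQUITIN_LENGTH, EXTRA_RESIDUES)
nucleotides = coding_nucleotides(residues)
print(residues, nucleotides)  # 685 2055
```

Under those assumptions, nine repeats plus one valine give the reported 685 residues, and three nucleotides per residue give the reported 2055 nucleotides.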

8.
9.
We analysed the distribution of transposable elements (TEs) in 100 aligned pairs of orthologous intergenic regions from the mouse and human genomes. Within these regions, conserved segments of high similarity between the two species alternate with segments of low similarity. Identifiable TEs comprise 40-60% of segments of low similarity. Within such segments, a particular copy of a TE found in one species has no orthologue in the other. Overall, TEs comprise only approximately 20% of conserved segments. However, TEs from two families, MIR and L2, are rather common within conserved segments. Statistical analysis of the distributions of TEs suggests that a majority of the MIR and L2 elements present in murine intergenic regions have human orthologues. These elements must have been present in the common ancestor of human and mouse and have remained under substantial negative selection that prevented their divergence beyond recognition. If so, recruitment of MIR- and L2-derived sequences to perform a function that increases host fitness is rather common, with at least two such events per host gene. The central part of the MIR consensus sequence is over-represented in conserved segments given its background frequency in the genome, suggesting that it is under the strongest selective constraint.

10.
The nuclear ribosomal locus coding for the large subunit is represented in tandem arrays in the plant genome. These consecutive gene blocks, consisting of several regions, are widely applied in plant phylogenetics. The regions coding for the subunits of the rRNA have the lowest rate of evolution. The spacer regions, such as the internal transcribed spacers (ITS) and external transcribed spacers (ETS), are also widely used in phylogenetics. The fact that these regions are present in many copies in the plant genome is an advantage for laboratory practice but can be a problem for phylogenetic analysis. Besides routine usage, the rDNA regions offer great potential for studying complex evolutionary mechanisms, such as reticulate events or array duplications. The understanding of these processes is based on the observation that the multiple copies of rDNA regions are homogenized through concerted evolution. This phenomenon results in paralogous copies, which can be misleading when incorporated in phylogenetic analyses. The fact that non-functional copies or pseudogenes can coexist with orthologues in a single individual also makes the analysis difficult. This article summarizes the information about the structure and utility of the phylogenetically informative spacer regions of the rDNA, namely the internal and external transcribed spacer regions as well as the intergenic spacer (IGS).

11.
Kasahara M. Immunogenetics, 1999, 50(3-4): 134-145
It has recently become apparent that the human genome contains at least three regions that are paralogous to the major histocompatibility complex (MHC). The number of gene families with copies in the MHC and these paralogous regions is increasing steadily as genome analysis progresses. This review presents the updated listing of the human gene families that constitute the MHC paralogous group. When genes with multiple copies within the MHC, such as class I and class II genes, are counted as single entities, nearly one-third of the genes residing in the HLA complex have paralogous copies in at least one of the three paralogous regions. The review also discusses the long-term genome dynamics of the MHC, taking into account the rapidly accumulating information on the genomic organizations of the MHCs in various model organisms.

12.
We propose a method for predicting human full-length cDNA by comparing human genome shotgun sequence data with mouse full-length cDNA. The human genomic sequence homologous to the mouse full-length cDNA is selected by a homology search program, and the predicted exons are connected at the exon-intron junction that gives the best homology score to the mouse full-length cDNA. The accuracy of the predicted human full-length coding region is 83.3%, and the false positive rate is 16.7%. Five human full-length proteins out of 20 proteins are correctly predicted.

13.
The long non-coding RNA MIR503 host gene (MIR503HG) is located on chromosome Xq26.3. It has been found to be deregulated in many types of human malignancy and to function as a tumour suppressor or promoter depending on the cancer type. The role of MIR503HG in breast cancer was previously unknown. In our study, we found that MIR503HG expression was significantly decreased in triple-negative breast cancer tissues and cell lines. Furthermore, low MIR503HG expression was correlated with late clinical stage, lymph node metastasis and distant metastasis. In the survival analysis, triple-negative breast cancer patients with low MIR503HG expression had a statistically significantly worse prognosis than those with high MIR503HG expression, and low MIR503HG expression was an independent poor prognostic factor for overall survival in these patients. In vitro experiments suggested that MIR503HG inhibits cell migration and invasion via the miR-103/OLFM4 axis in triple-negative breast cancer. In conclusion, MIR503HG functions as a tumour-suppressive long non-coding RNA in triple-negative breast cancer.

14.
Some algorithms are described for the search of regions in a nucleic acid sequence that, when translated into amino acids, are homologous to a given amino acid pattern. All algorithms are modifications of the dynamic programming method for sequence comparison such that the translation of codons is taken into account. One of the algorithms has been implemented as a FORTRAN 77 program. The program operates on files that follow the format of the EMBL Nucleotide Sequence Data Library.
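
The general idea of a codon-aware dynamic programming search can be sketched as follows. This is a simplified illustration, not the FORTRAN 77 program described in the abstract: the scoring values are arbitrary assumptions, gaps are whole codons only (no frameshifts), only the forward strand is searched, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def best_translated_match(dna, pattern, match=2, mismatch=-1, gap=-2):
    """Find where `pattern` (amino acids) best matches the translation of `dna`.

    The whole pattern must be aligned; the start and end in the DNA are free.
    Returns (best score, end position in dna, exclusive).
    """
    dna = dna.upper()
    n, m = len(dna), len(pattern)
    # score[i][j]: best score for pattern[:i] aligned to codons ending at dna[:j]
    score = [[float("-inf")] * (n + 1) for _ in range(m + 1)]
    score[0] = [0] * (n + 1)  # a match may start anywhere, in any frame
    for i in range(1, m + 1):
        for j in range(n + 1):
            best = score[i - 1][j] + gap  # pattern residue left unmatched
            if j >= 3:
                residue = CODON_TABLE.get(dna[j - 3:j], "X")  # "X" for ambiguous codons
                sub = match if residue == pattern[i - 1] else mismatch
                best = max(best,
                           score[i - 1][j - 3] + sub,  # codon aligned to residue
                           score[i][j - 3] + gap)      # extra codon in the DNA
            score[i][j] = best
    end = max(range(n + 1), key=lambda j: score[m][j])
    return score[m][end], end


# The pattern MKV is encoded by ATG AAA GTT, starting at offset 2.
result = best_translated_match("CCATGAAAGTTGA", "MKV")
print(result)  # (6, 11)
```

Because the recurrence steps through the DNA three nucleotides at a time while allowing every nucleotide position as an end point, all three forward reading frames are covered in a single pass.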

15.
16.
Poly-A+ mRNA from Xenopus laevis oocytes, partially enriched for r-protein coding capacity, was used as starting material for preparing a cDNA bank in plasmid pBR322. Clones containing sequences specific for r-proteins were selected by translation of the complementary mRNAs. Clones for six different r-proteins were identified and used as probes to study their genomic organization. Two gene copies per haploid genome were found for r-proteins L1, L14 and S19, and four to five for proteins S1, S8 and L32. Moreover, a population polymorphism was observed for the genomic regions containing sequences for r-proteins S1, S8 and L14.

17.
Grover D, Kannan K, Brahmachari SK, Mukerji M. Genetica, 2005, 124(2-3): 273-289
Elucidation of the complete nucleotide sequence of the human genome has revealed that coding sequences, which store the information needed to synthesize functional proteins, occupy only 2% of the genomic region. The remaining 98%, barring a few regulatory sequences, has been referred to as non-functional or junk DNA and consists of many kinds of repeat elements. In fact, the human genome is the most repeat-rich genome sequenced so far, with more than half of it occupied by such sequences. Determining the significance of these repeats in the human genome has become the focus of many studies all over the world, especially after genome sequencing did not reveal any significant difference in coding regions between lower eukaryotes and humans. In this article, we focus on Alu repeats, which are primate-specific elements with many interesting biological properties. Moreover, these are the repeats with the highest copy number in the human genome. We highlight different facets of their interaction with the genome and the changing paradigms regarding their role in genome organization.

18.
We sequenced across all of the gene boundaries in the mitochondrial genome of the cattle tick, Boophilus microplus, to determine the arrangement of its genes. The mtDNA of B. microplus has a coding region, composed of tRNA(Glu) and 60 bp of the 3' end of ND1, that is repeated five times. Boophilus microplus is the first coelomate animal known to have more than two copies of a coding sequence. The mitochondrial genome of B. microplus has other unusual features, including (1) reduced T arms in tRNAs, (2) an AT bias in codon use, (3) two control regions that have evolved in concert, (4) three gene rearrangements, and (5) a stem-loop between tRNA(Gln) and tRNA(Phe). The short T arms and small control regions (CRs) of B. microplus and other ticks suggest strong selection for small genomes. Imprecise termination of replication beyond its origin, which can account for the evolution of tandem repeats of coding regions in other mitochondrial genomes, cannot explain the evolution of the fivefold repeated sequence in the mitochondrial genome of B. microplus. Instead, slipped-strand mispairing or recombination are the most plausible explanations for the evolution of these tandem repeats.

19.
Repetitive elements may comprise over two-thirds of the human genome
Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo "clouds"). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%-69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (~25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed "element-specific" P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ~100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed.
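
The starting point of an abundance-based de novo approach, counting oligonucleotides and flagging the positions covered by the frequent ones, can be sketched as below. This is not the P-clouds implementation: the step that groups related oligonucleotides into "clouds" is omitted entirely, and the function names, the oligo length `k` and the threshold `min_count` are hypothetical values chosen for a toy sequence.

```python
from collections import Counter


def high_abundance_kmers(sequence, k=4, min_count=3):
    """Return the k-mers that occur at least `min_count` times, with their counts."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {kmer: c for kmer, c in counts.items() if c >= min_count}


def flag_repetitive_positions(sequence, k=4, min_count=3):
    """Mark every position covered by at least one high-abundance k-mer."""
    sequence = sequence.upper()
    abundant = high_abundance_kmers(sequence, k, min_count)
    flags = [False] * len(sequence)
    for i in range(len(sequence) - k + 1):
        if sequence[i:i + k] in abundant:
            for p in range(i, i + k):
                flags[p] = True
    return flags


# Toy sequence: ACGT occurs three times, every other 4-mer fewer than three.
toy = "ACGTACGTACGTTTGCA"
flags = flag_repetitive_positions(toy)
print(sum(flags), "of", len(toy), "positions flagged")
```

Exact counting of this kind misses diverged copies, which is why the abstract stresses clustering oligonucleotides that are related in sequence space rather than requiring identical ones.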

20.
Antibodies elicited against chromosomal protein HMG-17, purified from calf, were used to screen a human lambda gt11 cDNA expression library and isolate the full-length cDNA coding for this protein. Sequence analysis reveals that the nucleotide distribution along this cDNA is highly asymmetric. The amino acid sequence, deduced from the reading frame, reveals that human HMG-17 is 96% and 92% homologous with the calf and chicken proteins, respectively. The amino acid substitutions are conservative, suggesting evolutionary constraints on the conformation of the protein. The human genome contains 35-50 HMG-17 gene copies which, as revealed by Southern analysis, are distributed at several loci. Northern analysis of total RNA isolated from 3 human cell lines indicates that each cell contains a single-size mRNA coding for this protein. Nucleotide sequences that cross-hybridize, under stringent conditions, with the human HMG-17 cDNA are present in the genomes of rodents and absent from the genomes of sea urchin, Drosophila, and yeast. The availability of a probe for the HMG-17 gene may help elucidate the cellular role of this protein, which may confer specific conformations on transcribable regions in the genome.
