20 similar documents found; search time: 0 ms
1.
2.
Micro- and minisatellites constitute an essential part of low-complexity DNA and perform a number of important functions. The TandemSWAN program was used to search the human genome for tandem repeats with a repeat unit length of up to 70 bp, including repeats with a large number of nucleotide substitutions. For a significant fraction of the minisatellites found by the program with a repeat unit shorter than 25 bp, a shorter repeated motif can be discerned within the sequence, and this motif is often similar to microsatellite sequences that occur widely in the human genome. A model of the hierarchical origin of minisatellites in the human genome was proposed.
3.
We have examined conserved protein motifs in the non-coding, intergenic regions ("pseudomotif patterns") and surveyed their occurrence in the fly, worm, yeast and human genomes (chromosomes 21 and 22 only). To identify these patterns, we masked out annotated genes, pseudogenes and repeat regions from the raw genomic sequence and then compared the remaining sequence, in six-frame translation, against 1319 patterns from the PROSITE database. For each pseudomotif pattern, the absolute number of occurrences is not very informative unless compared against a statistical expectation; consequently, we calculated the expected occurrence of each pattern using a Poisson model and verified this with simulations. Using a p-value cut-off of 0.01, we found 67 pseudomotif patterns over-represented in fly intergenic regions, 34 in worm, 21 in human and six in yeast. These include the zinc finger, leucine zipper, nucleotide-binding motif and EGF domain. Many of the over-represented patterns were common to two or more organisms, but there were a few that were unique to specific ones. Furthermore, we found more over-represented patterns in the fly than in the worm, although the fly has fewer pseudogenes. This puzzling observation can be explained by a higher deletion rate in the fly genome. We also surveyed under-represented patterns, finding 23 in the fly, 12 in the worm, 18 in human and two in yeast. If intergenic sequences were truly random, we would expect an equal number of over and under-represented patterns. The fact that for each organism the number of over-represented patterns is greater than the number of under-represented ones implies that a fraction of the intergenic regions consist of ancient protein fragments that, due to accumulated disablements, have become unrecognizable by conventional techniques for gene and pseudogene identification. 
Moreover, we find that in aggregate the over-represented pseudomotif patterns occupy a substantial fraction of the intergenic regions. Further information is available at http://pseudogene.org
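The Poisson test described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`poisson_sf`, `poisson_cdf`, `classify`) are hypothetical, and the expected count `expected` is taken as a given input, whereas the study derives it from its own model. Only the 0.01 cut-off comes from the abstract.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        total += term
    return min(1.0, total)


def poisson_sf(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam)."""
    return max(0.0, 1.0 - poisson_cdf(k - 1, lam))


def classify(observed: int, expected: float, alpha: float = 0.01) -> str:
    """Label a pattern as over- or under-represented at cut-off alpha.

    `expected` is assumed to be the Poisson mean for the pattern
    (hypothetical input; how it is estimated is outside this sketch).
    """
    if poisson_sf(observed, expected) < alpha:
        return "over"
    if poisson_cdf(observed, expected) < alpha:
        return "under"
    return "neither"
```

For example, `classify(30, 10.0)` labels a pattern seen 30 times against an expectation of 10 as over-represented, while `classify(10, 10.0)` returns `"neither"`.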
4.
5.
The usefulness of the M-statistic in odontomorphometric distance analyses was evaluated against a battery of more traditional metrics, which included Mahalanobis' D², Penrose's shape metric, the Manhattan distance and Delta. Odontometric data used for the analyses were derived from 202 Paraguayan Lengua Indians and 125 contemporary caucasoids. Efron's bootstrap procedure was used to evaluate the statistical accuracy of the different metrics when each was applied to the same populations. Additionally, metric stability in the face of reduced sample size, statistical bias resulting from over- and underestimation, and the effects of standardization were investigated. Our results indicated that Penrose's shape metric, rather than the recently introduced M-statistic, was the most reliable metric evaluated. Penrose's shape metric remained the most reliable when sample size was artificially reduced and when raw data were used. Interestingly, Mahalanobis' generalized distance emerged as the least reliable statistic, especially when used on small sample sizes.
6.
7.
Unraveling the evolutionary forces responsible for variations of neutral substitution patterns among taxa or along genomes is a major issue in the identification of functional sequence features. Mammalian genomes show large-scale regional variations of GC-content (the isochores), but the substitution processes at the origin of this structure are poorly understood. We have analyzed the pattern of neutral substitutions in 14.3 Mb of primate noncoding regions. We show that the GC-content toward which sequences are evolving is strongly correlated (r² = 0.61, P = 2 × 10⁻¹⁶) with the rate of crossovers (notably in females). This demonstrates that recombination drives the evolution of base composition in human (probably via the process of biased gene conversion). The present substitution patterns are very different from what they had been in the past, resulting in a major modification of the isochore structure of our genome. This non-equilibrium situation suggests that changes of recombination rates occur relatively frequently during evolution, possibly as a consequence of karyotype rearrangements. These results have important implications for understanding the spatial and temporal variations of substitution processes in a broad range of sexual organisms, and for detecting the hallmarks of natural selection in DNA sequences.
8.
E. V. Pankratova 《Molecular Biology》2008,42(3):371-380
Many genes are known to have several promoters. The contribution of alternative promoters to the structural and functional diversity of protein isoforms in eukaryotic cells is considered, including their role in synthesis of identical proteins from different mRNAs, generation of protein isoforms with different and even opposite functions, expression of housekeeping genes, and the variation of the recognition domains of adhesion proteins and receptors. In some cases, alternative promoters allow one gene to produce mRNAs with different open reading frames and, consequently, proteins with no amino acid sequence homology.
9.
While genome-era technologies focused on complete genome sequencing in various organisms, post-genome technologies aim at the understanding of the mechanisms of genetic information processing and elucidation of within-species variation. Single nucleotide polymorphisms (SNPs) are the most common source of genome variation in the human population. Nonsynonymous SNPs that occur in coding gene regions and result in amino acid substitutions are of particular interest. It is thought that such SNPs are responsible for phenotypic variation, quantitative traits, and the etiology of common diseases. PolyPhen is a computational tool for the prediction of putatively functional nonsynonymous SNPs by combining information of various types. The application areas of PolyPhen and similar methods include the genetics of complex diseases and congenital defects, the identification of functional mutations in model organisms, and evolutionary genetics.
10.
11.
One theory formalised in 1970 proposes that the complexity of vertebrate genomes originated by means of genome duplication at the base of the vertebrate lineage. Since then, the theory has remained both popular and controversial. Here we review the theory, and present preliminary results from our analysis of duplications in the draft human genome sequence. We find evidence for extensive duplication of parts of the genome. We also question the validity of the 'parsimony test' that has been used in other analyses.
12.
13.
DNA repeats in the human genome (total citations: 5, self-citations: 1, citations by others: 5)
14.
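Genomic copy number variation, its mutational mechanisms, and human disease (total citations: 1, self-citations: 0, citations by others: 1)
Copy number variation (CNV) results from genomic rearrangement and generally refers to an increase or decrease in the copy number of large genomic segments longer than 1 kb, manifesting mainly as submicroscopic deletions and duplications. CNV is an important component of genomic structural variation (SV). The mutation rate at CNV loci is far higher than that of single nucleotide polymorphisms (SNPs), and CNV is one of the important causes of human disease. Current methods for genome-wide CNV studies include array-based comparative genomic hybridization (aCGH), SNP genotyping arrays, and next-generation sequencing. CNVs form through a variety of mechanisms, which fall into two broad classes: DNA recombination and DNA replication errors. CNVs can cause Mendelian monogenic disorders and rare diseases, and are also associated with complex diseases. Possible pathogenic mechanisms include gene dosage effects, gene disruption, gene fusion, and position effects. In-depth study of CNV can give us new insight into the composition of the human genome, genetic differences between individuals, and the genetic causes of disease.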
15.
LUO Chunqing LI Yan ZHANG Xiaowei ZHANG Yilin Zhang Haiqing CHEN Chong XU Zuyuan CUI Peng HU Songnian YANG Huanming DONG Wei 《中国科学C辑(英文版)》2005,48(1)
Most proterminal regions of human chromosomes are GC-rich and gene-rich. Chromosome 3p is an exception. Its proterminal region is GC-poor and prone to loss of heterozygosity, thus causing a number of fatal diseases. Except for one gap left in the telomeric position, the proterminal region of human chromosome 3p has been completely sequenced. The detailed sequence analysis showed: (i) the GC content of this region was 38.5%, the lowest among all the human proterminal regions; (ii) this region contained 20 known genes and 22 predicted genes, with an average gene size of 97.5 kb. The previously mapped gene Cntn3 was not found in this region, but was instead located at the 74 Mb position of human chromosome 3p; (iii) the interspersed repeats of this region were more active than the average level of the whole human genome, especially (TA)n, the content of which was twice the genome average; (iv) this region had a conserved synteny extending from 104.1 Mb to 112.4 Mb on mouse chromosome 6, which was 8% larger in size, not in accordance with the whole genome comparison, probably because the 3pter-p26 region was more likely to lose nucleotides and its mouse synteny had more active interspersed repeats.
16.
Microstructural changes such as insertions and deletions (indels) are a major driving force in the evolution of non-coding DNA sequences. To better understand the mechanisms by which indel mutations arise, as well as the molecular evolution of non-coding regions, the number and pattern of indels and nucleotide substitutions were compared in whole chloroplast genomes. Comparisons were made for a total of over 38 kb of non-coding DNA sequence from 126 intergenic regions in two data sets representing species with different divergence times: sugarcane and maize, and Oryza sativa var. indica and japonica. The main findings of this study are: (i) Approximately half of all indels are single-nucleotide indels. This observation agrees with previous studies in various organisms. (ii) The distribution and number of indels differed between the two data sets, and different patterns were observed for tandem repeat and non-repeat indels. (iii) The distribution pattern of tandem repeat indels showed a statistically significant bias towards A/T-rich sequences. (iv) The rate of indel mutation was estimated to be approximately 0.8 ± 0.04 × 10⁻⁹ per site per year, which was similar to previous estimates in other organisms. (v) The frequencies of nucleotide substitutions and indels were significantly lower in the inverted repeat (IR).
17.
Chiang CW Liu CT Lettre G Lange LA Jorgensen NW Keating BJ Vedantam S Nock NL Franceschini N Reiner AP Demerath EW Boerwinkle E Rotter JI Wilson JG North KE Papanicolaou GJ Cupples LA;Genetic Investigation of ANthropometric Traits 《Genetics》2012,192(1):253-266
Ultraconserved elements in the human genome likely harbor important biological functions as they are dosage sensitive and are able to direct tissue-specific expression. Because they are under purifying selection, variants in these elements may have a lower frequency in the population but a higher likelihood of association with complex traits. We tested a set of highly constrained SNPs (hcSNPs) distributed genome-wide among ultraconserved and nearly ultraconserved elements for association with seven traits related to reproductive (age at natural menopause, number of children, age at first child, and age at last child) and overall [longevity, body mass index (BMI), and height] fitness. Using up to 24,047 European-American samples from the National Heart, Lung, and Blood Institute Candidate Gene Association Resource (CARe), we observed an excess of associations with BMI and height. In an independent replication panel the most strongly associated SNPs showed an 8.4-fold enrichment of associations at the nominal level, including three variants in previously identified loci and one in a locus (DENND1A) previously shown to be associated with polycystic ovary syndrome. Finally, using 1430 family trios, we showed that the transmissions from heterozygous parents to offspring of the derived alleles of rare (frequency ≤0.5%) hcSNPs are not biased, particularly after adjusting for the rates of genotype missingness and error in the data. The lack of transmission bias ruled out an immediately and strongly deleterious effect due to the rare derived alleles, consistent with the observation that mice homozygous for the deletion of ultraconserved elements showed no overt phenotype. Our study also illustrated the importance of carefully modeling potential technical confounders when analyzing genotype data of rare variants.
18.
Insertions and deletions (indels) in chloroplast noncoding regions are common genetic markers to estimate population structure and gene flow, although relatively little is known about indel evolution among recently diverged lineages such as within plant families. Because indel events tend to occur nonrandomly along DNA sequences, recurrent mutations may generate homoplasy for indel haplotypes. This is a potential problem for population studies, because indel haplotypes may be shared among populations after recurrent mutation as well as gene flow. Furthermore, indel haplotypes may differ in fitness and therefore be subject to natural selection detectable as rate heterogeneity among lineages. Such selection could contribute to the spatial patterning of cpDNA haplotypes, greatly complicating the interpretation of cpDNA population structure. This study examined both nucleotide and indel cpDNA variation and divergence at six noncoding regions (psbB-psbH, atpB-rbcL, trnL-trnH, rpl20-5'rps12, trnS-trnG, and trnH-psbA) in 16 individuals from eight species in the Lecythidaceae and a Sapotaceae outgroup. We described patterns of cpDNA changes, assessed the level of indel homoplasy, and tested for rate heterogeneity among lineages and regions. Although regression analysis of branch lengths suggested some degree of indel homoplasy among the most divergent lineages, there was little evidence for indel homoplasy within the Lecythidaceae. Likelihood ratio tests applied to the entire phylogenetic tree revealed a consistent pattern rejecting a molecular clock. Tajima's 1D and 2D tests revealed two taxa with consistent rate heterogeneity, one showing relatively more and one relatively fewer changes than other taxa. In general, nucleotide changes showed more evidence of rate heterogeneity than did indel changes. 
The rate of evolution was highly variable among the six cpDNA regions examined, with the trnS-trnG and trnH-psbA regions showing as much as 10% and 15% divergence within the Lecythidaceae. Deviations from rate homogeneity in the two taxa were constant across cpDNA regions, consistent with lineage-specific rates of evolution rather than cpDNA region-specific natural selection. There is no evidence that indels are more likely than nucleotide changes to experience homoplasy within the Lecythidaceae. These results support a neutral interpretation of cpDNA indel and nucleotide variation in population studies within species such as Corythophora alta.
19.
Banihashemi K 《Indian journal of human genetics》2009,15(3):88-92
The Human Genome Project (HGP) is the international scientific research program, formally begun in October 1990 and completed in 2003, designed mainly to discover all human genes, analyze the structure of human DNA, determine the location of all human genes, and make them accessible for further biological and medical investigation. Following a similar rationale, a comparable study has been carried out in Iran. The study of the human genome among Iranian ethnicities (IHGP) was formally begun in 2000 as a detailed, fully programmed research effort covering all the major ethnic groups, with more than 1,900 samples from all over Iran, based on the main demographic and anthropological findings and the formally recognized criteria used for the international HGP. This paper gives an overview of the research process in terms of program goals, primary data collection, research design and methodology, as well as the practical aspects and primary findings of the Iranian genome project and its progress over a nearly 5-year period.
20.
Trevor L Hawkins 《Trends in biotechnology》1998,16(12):527-528
Automation Technologies for Genome Characterization
edited by Tony J. Beugelsdijk, John Wiley and Sons, 1997. UK£55.00 hbk (xvi +306 pages) ISBN 0 471 12806 6