Similar articles
 20 similar articles found (search time: 15 ms)
1.
Two classes of mRNA encoding the murine C4 protein were identified by sequence analysis of clones isolated from a liver complementary DNA library. The divergence found within a 357 base pair sequence available for comparison is limited to five nucleotide replacements located in the region corresponding to the carboxy-terminal end of the C4d peptide fragment. One of the nucleotide substitutions influences the presence of a site for the Hind III restriction endonuclease. That this restriction site indeed discriminates the two non-allelic genes encoding the mouse C4 and C4-Slp isoforms has been demonstrated by Southern blot analysis and nucleotide sequencing at the genomic level. Circumstantial evidence supports the identification of the gene lacking the Hind III site in the region corresponding to the carboxy-terminal end of the C4d fragment as the one encoding the C4-Slp isotype.
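The role of the Hind III site as a discriminating marker can be illustrated with a minimal sketch. It assumes only the standard Hind III recognition sequence (AAGCTT); the two short sequences are hypothetical and are not taken from the C4 or C4-Slp genes. They differ by a single nucleotide, which is enough to abolish the site.

```python
HINDIII_SITE = "AAGCTT"  # standard Hind III recognition sequence


def find_sites(seq, site=HINDIII_SITE):
    """Return the 0-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Hypothetical sequences for illustration only; they differ at one position (T -> C).
with_site = "GGCTAAGCTTCCGA"
without_site = "GGCTAAGCCTCCGA"

print(find_sites(with_site))     # [4]
print(find_sites(without_site))  # []
```

A fragment that carries the site is cut by the enzyme and one that lacks it is not, which is why a single substitution can show up as a band difference on a Southern blot.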

2.
We have isolated cDNA clones derived from three tadpole alpha-globin mRNAs of Xenopus laevis. The entire nucleotide sequence of the three mRNAs has been determined from the cDNA clones and is presented together with the deduced amino acid sequence of the encoded polypeptides. Two of the three polypeptide sequences are 96% homologous whilst the third sequence is highly diverged, with only a 72% homology. The three tadpole alpha-globin genes are all similarly diverged from the two X. laevis adult alpha-globin genes with which they display approximately 50% homology. Analysis of several independent clones from each class of tadpole alpha-globin sequence reveals a very high degree of coding region polymorphism for each of the three corresponding genes. Using the cloned DNA sequences as hybridisation probes, we have analysed the expression of the corresponding genes during larval development. We show that all three genes are activated simultaneously early in development and that thereafter all three are expressed at an approximately equivalent level. A fourth tadpole alpha-globin mRNA sequence, for which we do not have a cDNA clone, accumulates co-ordinately with the three major mRNA sequences but to a much lower concentration. This pattern of gene expression differs significantly from that of the tadpole beta-globin genes of X. laevis, despite the two classes of genes being closely linked in the genome.

3.
We have analyzed a sequence of approximately 70 base pairs (bp) that shows a high degree of similarity to sequences present in the non-coding regions of a number of human and other mammalian genes. The sequence was discovered in a fragment of human genomic DNA adjacent to an integrated hepatitis B virus genome in cells derived from human hepatocellular carcinoma tissue. When one of the viral flanking sequences was compared to nucleotide sequences in GenBank, more than thirty human genes were identified that contained a similar sequence in their non-coding regions. The sequence element was usually found once or twice in a gene, either in an intron or in the 5' or 3' flanking regions. It did not share any similarities with known short interspersed nucleotide elements (SINEs) or presently known gene regulatory elements. This element was highly conserved at the same position within the corresponding human and mouse genes for myoglobin and N-myc, indicating evolutionary conservation and possible functional importance. Preliminary DNase I footprinting data suggested that the element or its adjacent sequences may bind nuclear factors to generate specific DNase I hypersensitive sites. The size, structure, and evolutionary conservation of this sequence indicate that it is distinct from other types of short interspersed repetitive elements. It is possible that the element may have a cis-acting functional role in the genome.

4.
SNP (single nucleotide polymorphism) discovery using next-generation sequencing data remains difficult primarily because of redundant genomic regions, such as interspersed repetitive elements and paralogous genes, present in all eukaryotic genomes. To address this problem, we developed Sniper, a novel multi-locus Bayesian probabilistic model and a computationally efficient algorithm that explicitly incorporates sequence reads that map to multiple genomic loci. Our model fully accounts for sequencing error, template bias, and multi-locus SNP combinations, maintaining high sensitivity and specificity under a broad range of conditions. An implementation of Sniper is freely available at .

5.
Wu C, Wang S, Zhang HB. Genomics, 2006, 88(4): 394-406
The genome in a higher organism consists of a number of types of nucleotide sequence-specialized components, with each having tens of thousands of members or elements. It is crucial for our understanding of how a genome as an entity is organized, functions, and evolves to determine how these components are organized in the genome and how they relate to each other; however, no such knowledge is available. Here, we report a comprehensive analysis of the organization and interaction of all 40 components constituting the genome of the plant model species, Arabidopsis thaliana, at the whole-genome and chromosome levels. The 40 components include (i) 6 genome structural components consisting of GC%, genes, retrotransposons, DNA transposons, simple repeats, and low-complexity repeats; (ii) 3 evolutionarily critical features consisting of recombination rate, nucleotide substitutions, and nucleotide insertions/deletions; and (iii) 31 categories of genes with different functions and numbers of functions. We show that the distributions of 39 of the 40 components of the genome (excepting GC%) deviate significantly from the random distribution model and different types of the genome components are significantly correlated. These results remained true even when genomic regions where transposable and repeat elements are abundant, such as centromeric regions, were excluded from the analyses. These findings suggest that DNA molecules contained in the Arabidopsis genome are each organized and structured from their constituting components in an unambiguous manner and that different types of the components that constitute or characterize the genome interact. The analysis also showed that each chromosome consists of a similar set of the components at similar densities, suggesting that the unique organization and interaction pattern of the components in each chromosome may represent, at least in part, the identity of a chromosome or a genome at the genome level, thus partly accounting for the phenotypic variation among different species. The data also provide comprehensive and new insights into many phenomena significant in genome biology, among which we discuss in particular the variation of genetic recombination. The variation of genetic recombination rate along a chromosomal arm is shaped not only by the distribution of simple repeats, retrotransposons, DNA transposons, and nucleotide substitutions, but also by the functions of genes contained, especially those with multiple functions, suggesting that variation of genetic recombination along a chromosomal arm is the result of interactions among the components constituting local genome structure, function, and evolution.

6.
K. J. Hardeman, V. L. Chandler. Genetics, 1993, 135(4): 1141-1150
The Mutator transposable element system of maize has been used to isolate mutations at many different genes. Six different classes of Mu transposable elements have been identified. An important question is whether particular classes of Mu elements insert into different genes at equivalent frequencies. To begin to address this question, we used a small number of closely related Mutator plants to generate multiple independent mutations at two different genes. The overall mutation frequency was similar for the two genes. We then determined what types of Mu elements inserted into the genes. We found that each of the genes was preferentially targeted by a different class of Mu element, even when the two genes were mutated in the same plant. Possible explanations for these findings are discussed. These results have important implications for cloning Mu-tagged genes as other genes may also be resistant or susceptible to the insertion of particular classes of Mu elements.

7.
Histones are small basic proteins encoded by a multigene family and are responsible for the nucleosomal organization of chromatin in eukaryotes. Because of the high degree of protein sequence conservation, it is generally believed that histone genes are subject to concerted evolution. However, purifying selection can also generate a high degree of sequence homogeneity. In this study, we examined the long-term evolution of histone H4 genes to determine whether concerted evolution or purifying selection was the major factor for maintaining sequence homogeneity. We analyzed the proportion (p(S)) of synonymous nucleotide differences between the H4 genes from 59 species of fungi, plants, animals, and protists and found that p(S) is generally very high and often close to the saturation level (p(S) ranging from 0.3 to 0.6) even though protein sequences are virtually identical for all H4 genes. A small proportion of genes showed a low level of p(S) values, but this appeared to be caused by recent gene duplication. Our findings suggest that the members of this gene family evolve according to the birth-and-death model of evolution under strong purifying selection. Using histone-like genes in archaebacteria as outgroups, we also showed that H1, H2A, H2B, H3, and H4 histone genes in eukaryotes form separate clusters and that these classes of genes diverged nearly at the same time, before the eukaryotic kingdoms diverged.

8.
We propose and study the notion of dense regions for the analysis of categorized gene expression data and present some searching algorithms for discovering them. The algorithms can be applied to any categorical data matrices derived from gene expression level matrices. We demonstrate that dense regions are simple but useful and statistically significant patterns that can be used to 1) identify genes and/or samples of interest and 2) eliminate genes and/or samples corresponding to outliers, noise, or abnormalities. Some theoretical studies on the properties of the dense regions are presented which allow us to characterize dense regions into several classes and to derive tailor-made algorithms for different classes of regions. Moreover, an empirical simulation study on the distribution of the size of dense regions is carried out which is then used to assess the significance of dense regions and to derive effective pruning methods to speed up the searching algorithms. Real microarray data sets are employed to test our methods. Comparisons with six other well-known clustering algorithms using synthetic and real data are also conducted which confirm the superiority of our methods in discovering dense regions. The DRIFT code and a tutorial are available as supplemental material, which can be found on the Computer Society Digital Library at http://computer.org/tcbb/archives.htm.

9.
Recent awareness that most microorganisms in the environment are resistant to cultivation has prompted scientists to directly clone useful genes from environmental metagenomes. Two screening methods are currently available for the metagenome approach, namely, nucleotide sequence-based screening and enzyme activity-based screening. Here we have introduced and optimized a third option for the isolation of novel catabolic operons, that is, substrate-induced gene expression screening (SIGEX). This method is based on the knowledge that catabolic-gene expression is generally induced by relevant substrates and, in many cases, controlled by regulatory elements situated proximate to catabolic genes. For SIGEX to be high throughput, we constructed an operon-trap gfp-expression vector available for shotgun cloning that allows for the selection of positive clones in liquid cultures by fluorescence-activated cell sorting. The utility of SIGEX was demonstrated by the cloning of aromatic hydrocarbon-induced genes from a groundwater metagenome library and subsequent genome-informatics analysis.

10.
11.
12.
Understanding the co-variation of nucleotide diversity and local recombination rates is important both for the mapping of disease-associated loci and in understanding the causes of sequence evolution. It is known that single nucleotide polymorphisms (SNPs) around protein coding genes show higher diversity in regions of high recombination. Here, we find that this correlation holds for SNPs across the entire human genome, the great majority of which are not near exons or control elements. Contrasting with results from coding regions, we provide evidence that the higher nucleotide diversity in regions of high recombination is most likely due, at least in part, to a higher mutation rate. One possible explanation for this is that recombination is mutagenic.

13.
B Brenig. Animal Genetics, 1999, 30(2): 120-125
Interspersed elements are ubiquitous in the genomes of higher eukaryotes and account for over a third of the genomic DNA (Smit 1996). In swine the short interspersed elements, SINEs or PREs (porcine repetitive elements), have been found in a number of introns and 3' untranslated regions of different genes. However, compared to human Alu repeats the number of available PRE DNA sequences is still limited. In this study we have compared 85 PREs selected from DNA sequence database entries. The PREs were aligned and for each nucleotide position the relative frequencies of the four bases were calculated. A consensus sequence was derived from the first base usage. Similar to studies of SINEs in other species, the analysis showed that most mutations in PREs occur at CpG dinucleotide hot spots. The position variability for the two most frequent bases shows a bimodal distribution. The analysis suggests that the porcine SINEs can be divided into three major subfamilies sharing conserved nucleotide similarities.
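The per-position frequency and consensus steps described above can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: "first base usage" is read here as the most frequent base at each aligned position, gap characters are ignored, ties are broken in A, C, G, T order, and the toy alignment is hypothetical rather than real PRE data.

```python
from collections import Counter

BASES = "ACGT"


def base_frequencies(aligned):
    """Relative frequency of each base in every alignment column (gaps ignored)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    freqs = []
    for column in zip(*aligned):
        counts = Counter(b.upper() for b in column if b.upper() in BASES)
        total = sum(counts.values())
        freqs.append({b: (counts[b] / total if total else 0.0) for b in BASES})
    return freqs


def consensus(aligned):
    """Most frequent base per column; 'N' where a column holds only gaps."""
    result = []
    for col in base_frequencies(aligned):
        best = max(BASES, key=lambda b: col[b])  # ties resolve in A, C, G, T order
        result.append(best if col[best] > 0 else "N")
    return "".join(result)


# Hypothetical toy alignment ('-' marks a gap); not real PRE sequences.
toy_alignment = ["ACGTCG", "ACGTTG", "ACATCG", "AC-TCG"]

print(consensus(toy_alignment))               # ACGTCG
print(base_frequencies(toy_alignment)[4]["C"])  # 0.75
```

The same frequency table also gives the second most frequent base at each position, which is the quantity the abstract uses when describing position variability.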

14.
15.
The non-long-terminal repeat retrotransposable elements, R1 and R2, insert at unique locations in the 28S ribosomal RNA genes of insects. Based on the nucleotide sequences of these elements in the eight members of the melanogaster species subgroup of the genus Drosophila, they have been maintained by vertical germline transmission for the 17-20 million year history of this subgroup. The stable inheritance of R1 and R2 within these species has enabled a determination of their nucleotide substitution rates. The sequence of the R1 and R2 elements from D. ambigua, a member of the obscura species group, has also been determined to enable an extrapolation of this rate over an estimated 45-60 million years. The mean rate of substitutions at synonymous sites (K(s)) was 6.6 and 9.6 times the rate at replacement sites (K(a)) in the R1 and R2 elements, respectively. Both elements appear to have been under selective pressure to maintain their open reading frames and thus their ability to retrotranspose for most of their evolution in these lineages. Using the rate of change at synonymous sites (K(s)) as the best indicator of the nucleotide substitution rate, the mean K(s) values for R1 and R2 were 2.3 and 2.2 times that of the alcohol dehydrogenase (Adh) genes. However, this faster rate is a result of the lower codon usage bias of R1 and R2 compared with that of Adh. When the K(s) rates of R1 and R2 were compared with that of a larger number of nuclear genes available from at least two of the nine species under investigation, R1 and R2 were found to evolve in most lineages at rates similar to that of nuclear genes with low codon bias. The ability of R1 and R2 to maintain their presence in this species subgroup by retrotransposition while exhibiting rates of nucleotide evolution similar to nuclear genes suggests these transposition events are rare or not as error prone as that of retroviruses.

16.
This paper describes software (written in Pascal and running on Macintosh computers) allowing localization of unknown DNA fragments from the Escherichia coli chromosome on the restriction map established by Kohara et al. (1987). The program identifies the segment's map position using a restriction pattern analysis obtained with all, or some, of the eight enzymes used by Kohara et al. (1987). Therefore, the sequenced genes available in the EMBL library may be localized on the E. coli chromosome restriction map. This allowed correction of the map (mainly by introducing missing sites in the published maps) at the corresponding positions. Analysis of the data indicates that there is only a very low level of polymorphism, at the nucleotide level, between the E. coli K12 strains used by the various laboratories involved in DNA sequencing. The program is versatile enough to be used with other genomes.

17.
Kauermann G, Eilers P. Biometrics, 2004, 60(2): 376-387
An important goal of microarray studies is the detection of genes that show significant changes in expression when two classes of biological samples are being compared. We present an ANOVA-style mixed model with parameters for array normalization, overall level of gene expression, and change of expression between the classes. For the latter we assume a mixing distribution with a probability mass concentrated at zero, representing genes with no changes, and a normal distribution representing the level of change for the other genes. We estimate the parameters by optimizing the marginal likelihood. To make this practical, Laplace approximations and a backfitting algorithm are used. The performance of the model is studied by simulation and by application to publicly available data sets.
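One way to write down a model of the kind described above is sketched here. The notation is assumed for illustration and is not taken from the paper: $y_{gj}$ is the log expression of gene $g$ on array $j$, $\alpha_j$ an array normalization effect, $\beta_g$ the overall expression level of gene $g$, $x_j \in \{0, 1\}$ the class of array $j$, $\gamma_g$ the change between classes, and $\pi$ the proportion of genes that change.

```latex
\begin{align}
  y_{gj} &= \alpha_j + \beta_g + x_j \gamma_g + \varepsilon_{gj},
  & \varepsilon_{gj} &\sim N(0, \sigma^2), \\
  \gamma_g &\sim (1 - \pi)\,\delta_0 + \pi\, N(\mu, \tau^2),
\end{align}
```

Here $\delta_0$ denotes a point mass at zero, so a gene has no change with probability $1 - \pi$ and a normally distributed change otherwise. Under this sketch, the marginal likelihood mentioned in the abstract would be obtained by integrating $\gamma_g$ out of the joint density.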

18.
19.
The nucleotide sequences of nine genes corresponding to tRNA(Ser)4 or tRNA(Ser)7 of Drosophila melanogaster were determined. Eight of the genes compose the major tRNA(Ser)4,7 cluster at 12DE on the X chromosome, while the other is from 23E on the left arm of chromosome 2. Among the eight X-linked genes, five different, interrelated, classes of sequence were found. Four of the eight genes correspond to tRNA(Ser)4 and tRNA(Ser)7 (which are 96% homologous), two appear to result from single crossovers between tRNA(Ser)4 and tRNA(Ser)7 genes, one is an apparent double crossover product, and the last differs from a tRNA(Ser)4 gene by a single C to T transition at position 50. The single autosomal gene corresponds to tRNA(Ser)7. Comparison of a pair of genes corresponding to tRNA(Ser)4 from D. melanogaster and Drosophila simulans showed that, while gene flanking sequences may diverge considerably by accumulation of point changes, gene sequences are maintained intact. Our data indicate that recombination occurs between non-allelic tRNA(Ser) genes, and suggest that at least some recombinational events may be intergenic conversions.

20.