Similar literature
 20 similar articles found (search time: 31 ms)
1.
Differential rates of nucleotide substitution among different gene segments and between distinct evolutionary lineages are well documented among mitochondrial genes and are likely a consequence of locus-specific selective constraints that delimit mutational divergence over evolutionary time. We compared sequence variation of 18 homologous loci (15 coding genes and 3 parts of the control region) among 10 mammalian mitochondrial DNA genomes, which allowed us to describe different mitochondrial evolutionary patterns and to produce an estimate of the relative order of gene divergence. The relative rates of divergence of mitochondrial DNA genes in the family Felidae were estimated by comparing their divergence from homologous counterpart genes included in nuclear mitochondrial DNA (Numt, pronounced "new might"), a genomic fossil that represents an ancient transfer of 7.9 kb of mitochondrial DNA to the nuclear genome of an ancestral species of the domestic cat (Felis catus). Phylogenetic analyses of mitochondrial (mtDNA) sequences with multiple outgroup species were conducted to date the ancestral node common to the Numt and the cytoplasmic (Cymt) mtDNA genes and to calibrate the rate of sequence divergence of mitochondrial genes relative to nuclear homologous counterparts. By setting the fastest substitution rate as strictly mutational, an empirical "selective retardation index" is computed to quantify the sum of all constraints, selective and otherwise, that limit sequence divergence of mitochondrial gene sequences over time.

2.
We represent all DNA sequences as points in twelve-dimensional space in such a way that homologous DNA sequences are clustered together, from which a new genomic space is created for global DNA sequence comparison of millions of genes simultaneously. More specifically, based on the contents of the four nucleotides, their distances from the origin and their distribution along the sequences, a twelve-dimensional vector is assigned to any DNA sequence. The applicability of this analysis to global comparison of gene structures was tested on the myoglobin, beta-globin, histone-4, lysozyme, and rhodopsin families. Members of each family exhibit smaller vector distances relative to the distances between members of different families. The vector distance also distinguishes random sequences generated with the same base composition. Sequence comparisons showed consistency with the BLAST method. Once a new gene is discovered, we can compute its location in our genomic space. It is natural to predict that the properties of this new gene are similar to the properties of known genes that are located nearby. Biologists can do various experiments to test these properties.
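
The mapping described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the following is an assumption: for each of the four nucleotides it records the count, the mean position, and a normalised positional variance (3 values x 4 bases = 12 dimensions), and it compares sequences by Euclidean distance. The function names `natural_vector` and `vector_distance` are hypothetical and are not taken from the paper.

```python
from math import sqrt


def natural_vector(seq):
    """Map a DNA sequence to a 12-dimensional vector (illustrative assumption).

    For each base in A, C, G, T the vector holds:
      - the number of occurrences,
      - the mean 1-based position of those occurrences,
      - the positional variance divided by the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    vec = []
    for base in "ACGT":
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        count = len(positions)
        if count == 0:
            # Base absent: contribute zeros so the vector always has 12 entries.
            vec.extend([0.0, 0.0, 0.0])
            continue
        mean_pos = sum(positions) / count
        spread = sum((p - mean_pos) ** 2 for p in positions) / (count * n)
        vec.extend([float(count), mean_pos, spread])
    return vec


def vector_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Under these assumptions, sequences with similar composition and similar base layout get nearby vectors, so a newly found gene could be placed in the space by computing `natural_vector` once and ranking known genes by `vector_distance`.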

3.
We present the complete nucleotide sequence and the deduced amino acid sequence of the H-2Dp class I gene. This gene, which was cloned from a B10.P genomic DNA library, encodes an intact, functional H-2Dp molecule. Comparative analysis of the Dp sequence with other class I sequences reveals both similarities and differences. This analysis also shows that these genes exhibit D region-specific, locus-specific, as well as allele-specific sequences. The H-2Dp nucleotide sequence is greater than 90% homologous to the H-2Ld and H-2Db genes and only approximately 85% homologous to the H-2Dd gene. The K region and Qa region genes are less homologous. The 3' noncoding sequences appear to be region-specific. All of the previously described D region genes, Db, Ld, and Dd, possess the B2-SINE Alu-like repetitive sequence, as does Dp. Thus, this B2 repeat is a region-specific marker present in all D region genes studied so far. The additional polyadenylation site found in the H-2Dp gene starting at nucleotide 4671, which is homologous to non-D region sequences, as well as unique protein Dp coding sequences, make this gene an interesting model for studying the evolution of polymorphism and structure/function relationships in the class I gene family.

4.
We have shown that the mRNAs for apopolysialoglycoproteins (apoPSGP) of rainbow trout contain various numbers of a repetitive sequence of 39 base-pairs encoding mature apoPSGP, and that this sequence is bordered by highly homologous 5' and 3' regions encoding pre-, pro- and telopeptides. These mRNAs are thought to be transcribed from different genes that constitute a large multiple gene family (more than 100 members). Here, we have determined the structures of several members of the apoPSGP gene family. The results show that two of three genomic DNA fragments contain two independent apoPSGP genes in the same orientation with unrelated sequences intervening. Five characterized genes have essentially the same organization and sequence. Each gene has four exons, and CAAT and TATA sequences were found in the 5'-flanking regions. However, two noteworthy differences were observed among the five genes: a diversity in the number of the 39 base-pair repeats, also observed among the cDNA clones, and a one-base polymorphism in the 39 base-pair repeat, which causes an amino acid change. This polymorphism was not detected among the cDNA clones obtained. The boundary positions of the genes vary and contain no transposon-like structures. The variation in the number of repeats and the absence of a rule for bordering positions of the genes suggest that apoPSGP genes may have been amplified by gene duplications, unequal recombination, and selection of chromosomes having larger numbers of apoPSGP genes.

5.
6.
Expression patterns of gene products provide important insights into gene function. Reporter constructs are frequently used to analyze gene expression in Caenorhabditis elegans, but the sequence context of a given gene is inevitably altered in such constructs. As a result, these transgenes may lack regulatory elements required for proper gene expression. We developed Gene Catchr, a novel method of generating reporter constructs that exploits yeast homologous recombination (YHR) to subclone and tag worm genes while preserving their local sequence context. YHR facilitates the cloning of large genomic regions, allowing the isolation of regulatory sequences in promoters, introns, untranslated regions and flanking DNA. The endogenous regulatory context of a given gene is thus preserved, producing expression patterns that are as accurate as possible. Gene Catchr is flexible: any tag can be inserted at any position without introducing extra sequence. Each step is simple and can be adapted to process multiple genes in parallel. We show that expression patterns derived from Gene Catchr transgenes are consistent with previous reports and also describe novel expression data. Mutant rescue assays demonstrate that Gene Catchr-generated transgenes are functional. Our results validate the use of Gene Catchr as a valuable tool to study spatiotemporal gene expression.

7.
The Gibbs sampling method has been widely used for sequence analysis after it was successfully applied to the problem of identifying regulatory motif sequences upstream of genes. Since then, numerous variants of the original idea have emerged; however, in all cases the application has been to finding short motifs in collections of short sequences (typically less than 100 nucleotides long). In this paper, we introduce a Gibbs sampling approach for identifying genes in multiple large genomic sequences up to hundreds of kilobases long. This approach leverages the evolutionary relationships between the sequences to improve the gene predictions, without explicitly aligning the sequences. We have applied our method to the analysis of genomic sequence from 14 genomic regions, totaling roughly 1.8 Mb of sequence in each organism. We show that our approach compares favorably with existing ab initio approaches to gene finding, including pairwise comparison-based gene prediction methods which make explicit use of alignments. Furthermore, excellent performance can be obtained with as few as four organisms, and the method overcomes a number of difficulties of previous comparison-based gene finding approaches: it is robust with respect to genomic rearrangements, can work with draft sequence, and is fast (linear in the number and length of the sequences). It can also be seamlessly integrated with Gibbs sampling motif detection methods.

8.
9.
Phylogenomic studies of prokaryotic taxa often assume conserved marker genes are homologous across their length. However, processes such as horizontal gene transfer or gene duplication and loss may disrupt this homology by recombining only parts of genes, causing gene fission or fusion. We show using simulation that it is necessary to delineate homology groups in a set of bacterial genomes without relying on gene annotations to define the boundaries of homologous regions. To solve this problem, we have developed a graph-based algorithm to partition a set of bacterial genomes into Maximal Homologous Groups of sequences (MHGs) where each MHG is a maximal set of maximum-length sequences which are homologous across the entire sequence alignment. We applied our algorithm to a dataset of 19 Enterobacteriaceae species and found that MHGs cover much greater proportions of genomes than markers and, relatedly, are less biased in terms of the functions of the genes they cover. We zoomed in on the correlation between each individual marker and its overlapping MHGs, and show that few phylogenetic splits supported by the markers are supported by the MHGs while many marker-supported splits are contradicted by the MHGs. A comparison of the species tree inferred from marker genes with the species tree inferred from MHGs suggests that the increased bias and lack of genome coverage by markers cause incorrect inferences as to the overall relationship between bacterial taxa.

10.
11.
The development of the CRISPR–Cas9 system in recent years has made eukaryotic genome editing, and specifically gene knockout for reverse genetics, a simple and effective task. The system is directed to a genomic target site by a programmed single-guide RNA (sgRNA) that base-pairs with it, subsequently leading to site-specific modifications. However, many gene families in eukaryotic genomes exhibit partially overlapping functions, and thus, the knockout of one gene might be concealed by the function of the other. In such cases, the reduced specificity of the CRISPR–Cas9 system, which may lead to the modification of genomic sites that are not identical to the sgRNA, can be harnessed for the simultaneous knockout of multiple homologous genes. We introduce CRISPys, an algorithm for the optimal design of sgRNAs that would potentially target multiple members of a given gene family. CRISPys first clusters all the potential targets in the input sequences into a hierarchical tree structure that specifies the similarity among them. Then, sgRNAs are proposed in the internal nodes of the tree by embedding mismatches where needed, such that the efficiency to edit the induced targets is maximized. We suggest several approaches for designing the optimal individual sgRNA and an approach to compute the optimal set of sgRNAs for cases when the experimental platform allows for more than one. The latter may optionally account for the homologous relationships among gene-family members. We further show that CRISPys outperforms simpler alignment-based techniques by in silico examination over all gene families in the Solanum lycopersicum genome.

12.
Identifying genomic homology within and between genomes is essential when studying genome evolution. In the past years, different computational techniques have been developed to detect homology even when the actual similarity between homologous segments is low. Depending on the strategy used, these methods search for pairs of chromosomal segments between which either both gene content and order are conserved or gene content only. However, due to the fact that, after their divergence, homologous segments can lose a different set of genes, these methods still often fail to detect genomic homology. Recently, more advanced approaches have been developed that can combine gene order and content information of multiple genomic segments.

13.
14.
15.
Functional genomic approaches, such as proteomics, greatly enhance the value of genome sequences by providing a global level assessment of which genes are expressed, when genes are expressed and at what cellular levels gene products are synthesized. With over 1000 complete genome sequences of different microorganisms available, and DNA sequencing for environmental samples (metagenomes) producing vast amounts of gene sequence data, there is a real opportunity and a clear need to generate associated functional genomic data to learn about the source microorganisms. In contrast to the technological advances that have led to the accelerated rate and ease at which DNA sequence data can be generated, mass spectrometry based proteomics remains a technically sophisticated and exacting science. In recognition of the need to make proteomics more accessible to a growing number of environmental microbiologists so that the 'functional genomics gap' may be bridged, this review strives to demystify proteomic technologies and describe ways in which they have been applied, and more importantly, can be applied to study the physiology and ecology of extremophiles.

16.
The ocean pout (Macrozoarces americanus) produces a set of antifreeze proteins that depresses the freezing point of its blood by binding to, and inhibiting the growth of, ice crystals. The amino acid sequences of all the major components of the ocean pout antifreeze proteins, including the immunologically distinct QAE component, have been derived by Edman degradation. In addition, sequences of several minor components were deduced from DNA sequencing of cDNA and genomic clones. Fifty percent of the amino acids are perfectly conserved in all these proteins as well as in two homologous sequences from the distantly related wolffish. Several of the conserved residues are threonines and asparagines, amino acids that have been implicated in ice binding in the structurally unrelated antifreeze protein of the righteye flounders. Aside from minor differences in post-translational modifications, heterogeneity in antifreeze protein components stems from amino acid differences encoded by multiple genes. Based on genomic Southern blots and library cloning statistics there are 150 copies of the 0.7-kilobase-long antifreeze protein gene in the Newfoundland ocean pout, the majority of which are closely linked but irregularly spaced. A more southerly population of ocean pout from New Brunswick in which the circulating antifreeze protein levels are considerably lower has approximately one-quarter as many antifreeze protein genes. Thus, there appears to be a correlation between gene dosage and antifreeze protein levels, and hence the ability to survive in ice-laden seawater. Southern blot comparison of the two populations indicates that the differences in gene dosage were not generated by a simple set of deletions/duplications. They are more likely to be the result of differential amplification.

17.
18.
The entire bovine corticotropin/beta-lipotropin precursor gene has been isolated as a set of overlapping genomic DNA fragments which extend over a length of approximately 17000 base pairs. Restriction mapping of the cloned DNA fragments and nucleotide sequence analysis of the whole mRNA-coding segments and their surrounding regions have established that the corticotropin/beta-lipotropin precursor gene is approximately 7300-base-pairs long and contains two intervening sequences; one with an approximate length of 4000 base pairs is located within the segment encoding the 5'-untranslated region of the mRNA, and the other with an approximate length of 220 base pairs interrupts the protein-coding sequence near the signal peptide region. Sequence analysis of more than 200 base pairs preceding the proximal end of the corticotropin/beta-lipotropin precursor gene has revealed a 'Hogness box' and a variant of the model sequence d(G-G-TC-C-A-A-T-C-T) as well as palindrome structures as observed in other eukaryotic genes. Furthermore, some sequence similarities in the 5'-flanking region are found between the corticotropin/beta-lipotropin precursor gene and the mouse alpha-globin and beta-globin genes, all of which are negatively regulated by glucocorticoids. At least four homologous repetitive sequences are distributed at 3000-5000-base-pair distances in the corticotropin/beta-lipotropin precursor gene region; two such sequences are located in the 5'-flanking region, and one within each intervening sequence. Blot hybridization analysis of bovine pituitary nuclear RNA has indicated that the entire corticotropin/beta-lipotropin precursor gene is transcribed into a primary hnRNA product, which is then spliced to form the mature mRNA.

19.
Analysis of cloned human genomic loci homologous to the small nuclear RNA U1 established that such sequences are abundant and dispersed in the human genome and that only a fraction represent bona fide genes. The majority of genomic loci bear defective gene copies, or pseudogenes, which contain scattered base mismatches and in some cases lack the sequence corresponding to the 3' end of U1 RNA. Although all of the U1 genes examined to date are flanked by essentially identical sequences and therefore appear to comprise a single multigene family, we present evidence for the existence of at least three structurally distinct classes of U1 pseudogenes. Class I pseudogenes had considerable flanking sequence homology with the U1 gene family and were probably derived from it by a DNA-mediated event such as gene duplication. In contrast, the U1 sequence in class II and III U1 pseudogenes was flanked by single-copy genomic sequences completely unrelated to those flanking the U1 gene family; in addition, short direct repeats flanked the class III but not the class II pseudogenes. We therefore propose that both class II and III U1 pseudogenes were generated by an RNA-mediated mechanism involving the insertion of U1 sequence information into a new chromosomal locus. We also noted that two other types of repetitive DNA sequences in eucaryotes, the Alu family in vertebrates and the ribosomal DNA insertions in Drosophila, bore a striking structural resemblance to the classes of U1 pseudogenes described here and may have been created by an RNA-mediated insertion event.

20.
J B Cohen, D Givol. 《The EMBO Journal》 1983, 2(11): 2013-2018
The nucleotide sequences of two germline immunoglobulin heavy chain variable region (VH) genes of mouse BALB/c origin were determined. These two genes are highly homologous to each other. They both have the unusual codon CCT for proline at position 7, which so far has been found only in a specific set of VH genes, called the NPb family. We show that the two VH genes belong to this set. One of our BALB/c genes, VH124, is more homologous to a C57BL/6 NPb VH gene than to any BALB/c VH gene, and we propose that these two genes are alleles. A comparison of the substitutions between these two genes with published sequences of all other BALB/c and C57BL/6 NPb VH genes reveals evidence for past homologous recombination events between related germline VH genes. Homologous recombination may play an important role in the diversification of germline immunoglobulin VH genes.
