Similar articles
 20 similar articles found; search took 15 ms
1.
Bayes prediction quantifies uncertainty by assigning posterior probabilities. It was used to identify amino acids in a protein under recurrent diversifying selection indicated by higher nonsynonymous (d(N)) than synonymous (d(S)) substitution rates or by omega = d(N)/d(S) > 1. Parameters were estimated by maximum likelihood under a codon substitution model that assumed several classes of sites with different omega ratios. Bayes' theorem was used to calculate the posterior probabilities of each site falling into these site classes. Here, we evaluate the performance of Bayes prediction of amino acids under positive selection by computer simulation. We measured the accuracy by the proportion of predicted sites that were truly under selection and the power by the proportion of true positively selected sites that were predicted by the method. The accuracy was slightly better for longer sequences, whereas the power was largely unaffected by the increase in sequence length. Both accuracy and power were higher for medium or highly diverged sequences than for similar sequences. We found that accuracy and power were unacceptably low when data contained only a few highly similar sequences. However, sampling a large number of lineages improved the performance substantially. Even for very similar sequences, accuracy and power can be high if over 100 taxa are used in the analysis. We make the following recommendations: (1) prediction of positive selection sites is not feasible for a few closely related sequences; (2) using a large number of lineages is the best way to improve the accuracy and power of the prediction; and (3) multiple models of heterogeneous selective pressures among sites should be applied in real data analysis.
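
The calculation described in the abstract above can be illustrated with a minimal Python sketch. It assumes the class proportions (priors) and the per-class likelihoods of the data at one site are already available from a fitted codon model; the function names and the 0.95 cutoff are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Set, Tuple


def site_class_posteriors(priors: Sequence[float],
                          likelihoods: Sequence[float]) -> List[float]:
    """Posterior probability of each site class for one site.

    priors[k]      -- estimated proportion of sites in class k
    likelihoods[k] -- probability of the data at this site given class k
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # marginal probability of the data at this site
    return [j / total for j in joint]


def predict_positive_sites(site_posteriors: Sequence[Sequence[float]],
                           positive_class: int,
                           cutoff: float = 0.95) -> Set[int]:
    """Indices of sites whose posterior for the omega > 1 class exceeds the
    cutoff. The 0.95 default is an assumed, illustrative threshold."""
    return {i for i, post in enumerate(site_posteriors)
            if post[positive_class] > cutoff}


def accuracy_and_power(predicted: Set[int],
                       true_sites: Set[int]) -> Tuple[float, float]:
    """Accuracy: share of predicted sites that are truly under selection.
    Power: share of truly selected sites that were predicted."""
    hits = len(predicted & true_sites)
    accuracy = hits / len(predicted) if predicted else 0.0
    power = hits / len(true_sites) if true_sites else 0.0
    return accuracy, power
```

In a simulation the true sites are known, so `accuracy_and_power` can be applied directly to the set returned by `predict_positive_sites`.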

2.
The study of the proteins that bind to telomeric DNA in mammals has provided a deep understanding of the mechanisms involved in chromosome-end protection. However, very little is known about the binding of these proteins to nontelomeric DNA sequences. The TTAGGG DNA repeat proteins 1 and 2 (TRF1 and TRF2) bind to mammalian telomeres as part of the shelterin complex and are essential for maintaining chromosome end stability. In this study, we combined chromatin immunoprecipitation with high-throughput sequencing to map at high sensitivity and resolution the human chromosomal sites to which TRF1 and TRF2 bind. While most of the identified sequences correspond to telomeric regions, we showed that these two proteins also bind to extratelomeric sites. The vast majority of these extratelomeric sites contain interstitial telomeric sequences (ITSs). However, we also identified non-ITS sites, which correspond to centromeric and pericentromeric satellite DNA. Interestingly, the TRF-binding sites are often located in the proximity of genes or within introns. We propose that TRF1 and TRF2 couple the functional state of telomeres to the long-range organization of chromosomes and gene regulation networks by binding to extratelomeric sequences.

3.
The primary structures of the plasmids pECL18 (5571 bp) and pKPN2 (4196 bp) from Escherichia coli and Klebsiella pneumoniae, respectively, which carry genes for a Type II restriction-modification system (RMS2) with the specificity 5'-CCNGG-3', were determined in order to elucidate the structural relationship between them. The data suggest a possible role for recombination events at bom (basis of mobility) regions and the sites of resolution of multimer plasmid forms (so-called cer sequences) in the structural evolution of multicopy plasmids. Analysis of the sequences of pECL18 and pKPN2 showed that the genes for RM* Ecl18kI and RM* Kpn2kI, and the sequences of the rep (replication) regions in the two plasmids, are almost identical. In both plasmids, these regions are localized between the bom regions and the cer sites. The rest of the pECL18 sequence is almost identical to that of the mob (mobilization) region of ColE1, and the corresponding segment of pKPN2 is almost identical to part of pHS-2 from Shigella flexneri. The difference in primary structures results in different mobilization properties of pECL18 and pKPN2. The complete sequences of pECL18 and pKPN2, together with the pairwise comparison of the sequences of pECL18, pKPN2, ColE1 and pHS-2, suggest that plasmids may exchange DNA units via site-specific recombination events at bom and cer sites. In the course of BLASTN database searches using the cer sites of pECL18 and pKPN2 as queries, we found twenty cer sites of natural plasmids. Alignment of these sequences reveals that they fall into two classes. The plasmids in each class possess related segments between their cer and bom sites.
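
As an illustration of the 5'-CCNGG-3' specificity mentioned in the abstract above, the following Python sketch locates candidate recognition sites in a DNA string. It is not a method from the paper: the function name is hypothetical, and only the degenerate symbol N (any base) is expanded.

```python
import re
from typing import List


def find_recognition_sites(sequence: str, site: str = "CCNGG") -> List[int]:
    """Return 0-based start positions of every match to a recognition site.

    Only the IUPAC symbol N (any of A, C, G, T) is expanded; the other
    characters of the site must match literally. A lookahead is used so
    that overlapping sites are all reported.
    """
    pattern = "".join("[ACGT]" if base == "N" else re.escape(base)
                      for base in site.upper())
    regex = re.compile("(?=(" + pattern + "))")
    return [match.start() for match in regex.finditer(sequence.upper())]
```

Because CCNGG reads the same on the complementary strand, scanning one strand is enough for this particular site; an asymmetric site would also require a search of the reverse complement.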

4.
To determine whether the distribution of estuarine ammonia-oxidizing bacteria (AOB) was influenced by salinity, the community structure of betaproteobacterial AOB was characterized along a salinity gradient in sediments of the Ythan estuary, on the east coast of Scotland, UK, by denaturing gradient gel electrophoresis (DGGE), cloning and sequencing of 16S rRNA gene fragments. AOB communities at sampling sites with strongest marine influence were dominated by Nitrosospira cluster 1-like sequences and those with strongest freshwater influence were dominated by Nitrosomonas oligotropha-like sequences. Nitrosomonas sp. Nm143 was the prevailing sequence type in communities at intermediate brackish sites. Diversity indices of AOB communities were similar at marine- and freshwater-influenced sites and did not indicate lower species diversity at intermediate brackish sites. The presence of sequences highly similar to the halophilic Nitrosomonas marina and the freshwater strain Nitrosomonas oligotropha at identical sampling sites indicates that AOB communities in the estuary are adapted to a range of salinities, while individual strains may be active at different salinities. AOB communities that were dominated by Nitrosospira cluster 1 sequence types, for which no cultured representative exists, were subjected to stable isotope probing (SIP) with ¹³C-HCO₃⁻, to label the nucleic acids of active autotrophic nitrifiers. Analysis of ¹³C-associated 16S rRNA gene fragments, following CsCl density centrifugation, by cloning and DGGE indicated sequences highly similar to the AOB Nitrosomonas sp. Nm143 and Nitrosomonas cryotolerans and to the nitrite oxidizer Nitrospira marina. No sequence with similarity to the Nitrosospira cluster 1 clade was recovered during SIP analysis. The potential role of Nitrosospira cluster 1 in autotrophic ammonia oxidation therefore remains uncertain.

5.
6.
7.
The envelope (env) protein of human immunodeficiency virus type 1 (HIV-1) plays a crucial role in virus entry and is a central target for HIV vaccine design. Using the QUASI program, we analyzed the conserved regions of all currently available env sequences in the Los Alamos National Laboratory HIV Sequence Database and identified positive selection (PS) sites that are likely to be restricted by host immune responses. We found that PS sites are dispersed across conserved regions of the env sequence, and that the C3, C4, and C5 regions were the most targeted. Several regions were identified as being PS free and were mainly distributed in the C1 and C2 regions. When individual QUASI PS site frequencies across clades and geographical regions were compared with the overall frequency of the entire env database, the env sequences from North America showed a significantly lower PS site frequency, while those from Asia showed a significantly higher one (Student's t test). The QUASI PS site frequency of env proteins from viruses isolated in different years showed that the PS site frequencies of the env population increased over time. Our study provides an overview of PS sites across the conserved regions of HIV-1 env sequences.

8.
Telomeres are DNA-protein complexes that protect linear chromosomes from degradation and fusions. Telomeric DNA is repetitive and G-rich, and protrudes towards the end of the chromosomes as 3' G-overhangs. In Leishmania spp., sequences adjacent to telomeres comprise the Leishmania conserved telomere associated sequences (LCTAS) that are around 100 bp long and contain two conserved sequence elements (CSB1 and CSB2), in addition to non-conserved sequences. The aim of this work was to study the genomic organization of Leishmania (Leishmania) amazonensis telomeric/subtelomeric sequences. Leishmania amazonensis chromosomes were separated in a single Pulsed Field Gel Electrophoresis (PFGE) gel as 25 ethidium bromide-stained bands. All of the bands hybridized with the telomeric probe (5'-TTAGGG-3')₃ and with probes generated from the conserved subtelomeric elements (CSB1, CSB2). Terminal restriction fragments (TRF) of L. amazonensis chromosomes were analyzed by hybridizing restriction digested genomic DNA and chromosomal DNA separated in 2D-PFGE with the telomeric probe. The L. amazonensis TRF was estimated to be approximately 3.3 kb long and the telomeres were polymorphic and ranged in size from 0.2 to 1.0 kb. Afa I restriction sites within the conserved CSB1 elements released the telomeres from the rest of the chromosome. Bal 31 sensitivity analysis confirmed the presence of terminal Afa I restriction sites and served to differentiate telomeric fragments from interstitial internal sequences. The size of the L. amazonensis 3' G-overhang was estimated by non-denaturing Southern blotting to be approximately 12 nt long. Using similar approaches, the subtelomeric domains CSB1 and CSB2 were found to be present in a low copy number compared to telomeres and were organized in blocks of 0.3-1.5 kb flanked by Hinf I and Hae III restriction sites. A model for the organization of L. amazonensis chromosomal ends is provided.

9.
Hara T, Chida K. Gene 2002, 283(1-2):11-16
In the Chinese hamster, extended blocks of telomeric-like repeats were previously detected by in situ hybridization at the pericentromeric region of most chromosomes and short arrays were localized at several interstitial sites. In this work, we analyzed the molecular organization of internal telomeric sequences (ITs) in the Chinese hamster genome. In genomic transfers hybridized with a telomeric probe, multiple Bal31 insensitive fragments were detected. Most of the fragments ranged in size between less than 1 kb and more than 100 kb and some were polymorphic. Fluorescence in situ hybridization experiments on DNA fibers and on elongated chromosomes showed that the pericentromeric ITs are composed of extensive and essentially continuous arrays of telomeric-like sequences. We then isolated three genomic regions which contain short ITs. These ITs are localized at interstitial sites (3q13-15, 3q21-26, 1p26) and are composed of 29-126 bp of (TTAGGG)n repeats. A peculiar feature of all three ITs is the AT richness of the flanking sequences. Since AT-rich DNA is known to be unstable and characteristic of several mammalian fragile sites, we propose that the three ITs were inserted at these sites during the repair of double-strand breaks.
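
To make the notion of a (TTAGGG)n array concrete, here is a minimal Python sketch that reports runs of perfect tandem TTAGGG repeats in a sequence. It is an illustration only, not the authors' procedure: the function name and the `min_repeats` threshold are hypothetical, and degenerate or partial repeat units (which a 29 bp array must contain) are ignored.

```python
import re
from typing import List, Tuple


def find_telomeric_arrays(sequence: str,
                          min_repeats: int = 4) -> List[Tuple[int, int]]:
    """Return (start, length_bp) for each run of at least `min_repeats`
    perfect, uninterrupted TTAGGG units on the given strand.

    Assumptions of this sketch: only exact repeats are counted and only
    the strand passed in is scanned (CCCTAA runs on the complementary
    strand would need a second pass).
    """
    regex = re.compile("(?:TTAGGG){%d,}" % min_repeats)
    return [(match.start(), match.end() - match.start())
            for match in regex.finditer(sequence.upper())]
```

Flanking windows around each reported start and end position could then be extracted to examine base composition, such as the AT richness noted in the abstract.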

10.
11.
12.
13.
All exonic CG sequences in p53 are methylated; this epigenetic modification is correlated with frequent G:C→A:T transitions in p53. Recent reports reveal the presence in p53 of non-CG methylation in CC and CCC sequences, complementary to sites of selective guanosine adduct formation (GG and GGG), and the association of genetic instability with methylation at repetitive sequences. Here we investigated the distribution of methylation sites and repetitive elements in silent and nonsense p53 mutations (2051) among the IARC's TP53 somatic mutation database for exons 5-8. Silent mutations are nonrandom and mostly involve G:C→A:T transitions (62%); in particular, C→T mutations (39% of all silent mutations) are mostly correlated with CC and CCC sequences, while G→A mutations are mostly correlated with GG sequences. Sequence analysis of all non-G:C→A:T silent mutations reveals the frequent formation of new methylation sites (CG), new CCC and GGG sequences in the resulting sequence, refinement of symmetry elements at interrupted microsatellite-like sequences and formation of small repeats (55.3%). The G:C→A:T silent mutations characterize cancers associated with cigarette smoking (e.g. bladder or lung and bronchus cancer versus colorectal cancer); in contrast, non-G:C→A:T silent mutations have similar frequencies in most cancers. Nonsense mutations in exons 5-8, all resulting in mutants lacking amino acids 307-393, which are crucial for p53 activity, were also analyzed. The frequency of nonsense mutations is higher at methylated sites or repeats 1-2 nucleotides removed from methylation sites. Frameshift mutations are also more frequent at repeated sequences. The frequent G:C→A:T silent mutations could indicate that CC and CCC sequences of exons 5-8 are occasionally targets of non-CpG methylation of cytosine. This process of de novo methylation in the presence of microsatellite-like sequences and small repeats might influence the genetic stability of a variety of genes.

14.
The three-dimensional structures of globins are known, from crystallographic analyses, to be very similar. Their amino acid sequences, however, differ greatly. Only two residues are absolutely conserved in all sequences, and the residue identities of some pairs of sequences are only 16%. We have determined the nature and exact extent of the sequence variations and the extent to which the conserved features of the globin sequences are unique to this family. The 226 globin sequences now known were aligned and analysed. Because distantly related protein sequences cannot be aligned correctly without the use of structural data, we developed a method that incorporated structural information into the alignment procedure. Analysis of the aligned sequences shows that: (1) Although individual chains vary in size between 132 and 157 residues, deletions and insertions result in there being only 102 residue sites common to all globins. These sites form six separate regions. Insertions and deletions between these regions mean that their separations can vary in different sequences. (2) Within the conserved regions there are 32 sites that almost always contain hydrophobic residues. In the known structures, these sites are in the protein interior. We measured the variations in the size of the residues that occur in the 226 sequences at these sites. At six sites the residues differ in size by less than 40 Å³, at 11 sites they differ by 40 to 100 Å³, and at 15 sites they differ by more than 100 Å³. There are two other conserved buried sites: one contains the His linked to the haem iron and the other usually contains a His involved with the haem ligand. (3) Within the conserved regions there are another 32 sites that are almost always occupied by charged, polar or small non-polar (Gly or Ala) residues. In the known structures, these sites are on the protein surface.
To determine the extent to which the conserved features found for the globin sequences are unique to that protein family, the following procedure was used. The six conserved regions, and the residue restrictions that occur at the 66 sites within these regions, were encoded into two "templates". One was based only on the sequences so far determined; the other was extended to include as yet unobserved substitutions that seemed plausible on the basis of size, hydrophobicity and polarity. Each of the 3286 non-globin sequences in the data bank was then examined by a computer program to see how closely it could be matched to these templates. (ABSTRACT TRUNCATED AT 400 WORDS)

15.
16.
B A Roth, S A Goff, T M Klein, M E Fromm. The Plant Cell 1991, 3(3):317-325
Tissue-specific expression of the maize anthocyanin Bronze-1 (Bz1) gene is controlled by the products of several regulatory genes. These include C1 or Pl and R or B, which share homology with the myb proto-oncogenes and myc-like genes, respectively. Bz1 expression in embryo tissues is dependent on C1 and an R-sc allele of R. Transient expression from mutated and deleted versions of the Bz1 promoter fused to a luciferase reporter gene was measured in C1, Rscm2 embryos after gene transfer by microprojectiles. This analysis revealed that the sequences between -76 base pairs (bp) and -45 bp and a 9-bp AT-rich block between -88 bp and -80 bp were critical for Bz1 expression. The -76 bp to -45 bp region includes two short sequences that are homologous to the consensus binding sites of the myb- and myc-like proteins. Site-specific mutations of these "myb" and "myc" sequences reduced Bz1 expression to 10% and 1% of normal, respectively. Additionally, a trimer of a 38-bp oligonucleotide containing these myb and myc sites increased the expression of a cauliflower mosaic virus 35S minimal promoter by 26-fold. This enhancement was dependent on both C1 and R. Because the sites critical for Bz1 expression are homologous to the myb and myc consensus binding sequences and the C1 and R proteins share homology with the myb and myc products, respectively, we propose that C1 and R interact with the Bz1 promoter at these sites.

17.
A library of chromosomal DNA from Corynebacterium diphtheriae Belfanti 1030(-)tox- was cloned in the lambda phage vector EMBL4 and screened for sequences homologous to corynephage omega tox+ and the attB1-attB2 region of the C7(-)tox- chromosome. Two portions of the 1030(-)tox- chromosome, 35 and 30.5 kilobases long, which contain, respectively, the entire region homologous to corynephage omega tox+ and the attB1-attB2 sites, were mapped with the restriction endonucleases BamHI and EcoRI. Chromosomal DNA from 1030(-)tox- was shown to contain a 15.5-kilobase region that was homologous to ca. 42% of the corynephage omega tox+ genome. These sequences were found to hybridize to three regions of the phage genome and do not contain either the diphtheria tox operon or the attP site. These sequences are distant from the chromosomal region that contains the attB1-attB2 sites. Moreover, unlike other known defective prophages, the physical map of this prophage starts at the cos site and is colinear with the vegetative phage map. The 30.5-kilobase region of the 1030(-)tox- chromosome, which contains the attB1-attB2 sites, has a central core region that is almost identical to the corresponding region of the C7(-)tox- chromosome; however, the flanking sequences in these two strains of C. diphtheriae are different.

18.
The diversity of Cyanobacteria in water and sediment samples from four representative sites of the Salar de Huasco was examined using denaturing gradient gel electrophoresis and analysis of clone libraries of 16S rRNA gene PCR products. Salar de Huasco is a high-altitude (3800 m) saline wetland located in the Chilean Altiplano. We analyzed samples from a tributary stream (H0) and three shallow lagoons (H1, H4, H6) that contrasted in their physicochemical conditions and associated biota. Seventy-eight phylotypes were identified in a total of 268 clonal sequences deriving from seven clone libraries of water and sediment samples. Oscillatoriales were frequently found in water samples from sites H0, H1 and H4 and in sediment samples from sites H1 and H4. Pleurocapsales were found only at site H0, while Chroococcales were recovered from sediment samples of sites H0 and H1, and from water samples of site H1. Nostocales were found in sediment samples from sites H1 and H4 and in water samples from site H1, and were largely represented by sequences highly similar to Nodularia spumigena. We suggest that cyanobacterial communities from Salar de Huasco are unique: they include sequences related to others previously described from the Antarctic, along with others from diverse, but less extreme, environments.

19.
In addition to the recently characterized DraI (1), two new Type II restriction endonucleases, DraII and DraIII, with novel site-specificities were isolated and purified from Deinococcus radiophilus ATCC 27603. DraII and DraIII recognize the hepta- and nonanucleotide sequences (sequence in text). The cleavage sites within both strands are indicated by arrows. The recognition sequences were established by mapping of the cleavage sites on pBR322 (DraII) and fd109 RF DNA (DraIII). The sequence specificities were confirmed by computer-assisted restriction analyses of the generated fragment patterns of the sequenced DNAs of the bacteriophages lambda, phi X174 RF, M13mp8 RF and fd109 RF, the viruses Adeno2 and SV40, and the plasmids pBR322 and pBR328. The cleavage positions within the recognition sequences were determined by sequencing experiments.

20.
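Total RNA was extracted from the cardiac muscle of the Tibetan antelope (Pantholops hodgsonii) and of Tibetan sheep living at the same altitude. The coding-region cDNA of the peroxisome proliferator-activated receptor γ coactivator-1α (PGC-1α) gene was amplified by reverse transcription polymerase chain reaction (RT-PCR), ligated into a vector to construct recombinant plasmids, and sequenced after transformation, amplification culture and identification. Bioinformatic analysis showed that the PGC-1α coding region is 2,349 bp long in both the Tibetan antelope and the Tibetan sheep and encodes 797 amino acids (GenBank accession numbers JF449959 and JF449960, respectively). Its nucleotide and amino acid sequences are more than 90% similar to those of the PGC-1α genes of other vertebrates. It contains an RNA/DNA binding site, an RNA recognition motif (RRM), regions that interact with nuclear respiratory factor 1 (NRF-1) and myocyte enhancer factor 2C (MEF2C), a serine/arginine-rich domain, a negative regulatory domain, the LXXLL motif, and the two conserved sequences TPPTTPP and DHDYCQ; 14 variable amino acid sites lie within some of these functional domains. In addition, phosphorylation site prediction suggests that the Tibetan antelope may have a potential protein kinase G phosphorylation site (threonine at position 329). This study successfully cloned the coding sequence of the Tibetan antelope PGC-1α gene and offers a new approach to investigating, from the perspective of energy metabolism, the molecular mechanisms by which the Tibetan antelope adapts to the plateau.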
