Similar articles
20 similar articles found.
1.
Recent pangenome studies have revealed that a large fraction of the gene content within a species exhibits presence–absence variation (PAV). However, coding regions alone provide an incomplete assessment of functional genomic sequence variation at the species level. Little to no attention has been paid to noncoding regulatory regions in pangenome studies, though these sequences directly modulate gene expression and phenotype. To uncover regulatory genetic variation, we generated chromosome-scale genome assemblies for thirty Arabidopsis thaliana accessions from multiple distinct habitats and characterized species-level variation in Conserved Noncoding Sequences (CNS). Our analyses uncovered not only PAV and positional variation (PosV) but also that diversity in CNS is nonrandom, with variants shared across different accessions. Using evolutionary analyses and chromatin accessibility data, we provide further evidence supporting roles for conserved and variable CNS in gene regulation. Additionally, our data suggest that transposable elements contribute to CNS variation. Characterizing species-level diversity in all functional genomic sequences may later uncover previously unknown mechanistic links between genotype and phenotype.

2.
The role of pattern databases in sequence analysis (cited by: 2 total; 0 self-citations; 2 by others)
In the wake of the numerous now-fruitful genome projects, we are entering an era rich in biological data. The field of bioinformatics is poised to exploit this information in increasingly powerful ways, but the abundance and growing complexity both of the data and of the tools and resources required to analyse them are threatening to overwhelm us. Databases and their search tools are now an essential part of the research environment. However, the rate of sequence generation and the haphazard proliferation of databases have made it difficult to keep pace with developments. In an age of information overload, researchers want rapid, easy-to-use, reliable tools for functional characterisation of newly determined sequences. But what are those tools? How do we access them? Which should we use? This review focuses on a particular type of database that is increasingly used in the task of routine sequence analysis: the so-called pattern database. The paper aims to provide an overview of the current status of pattern databases in common use, outlining the methods behind them and giving pointers on their diagnostic strengths and weaknesses.

3.
Based on the recently determined X-ray structures of Torpedo californica acetylcholinesterase and Geotrichum candidum lipase and on their three-dimensional superposition, an improved alignment of a collection of 32 related amino acid sequences of other esterases, lipases, and related proteins was obtained. On the basis of this alignment, 24 residues are found to be invariant in 29 sequences of hydrolytic enzymes, and an additional 49 are well conserved. The conservation in the three remaining sequences is somewhat lower. The conserved residues include the active site, disulfide bridges, salt bridges, and residues in the core of the proteins. Most invariant residues are located at the edges of secondary structural elements. A clear structural basis for the preservation of many of these residues can be determined from comparison of the two X-ray structures.

4.
Proteins with similar structures are generally assumed to arise from similar sequences. However, there are more cases than not where this is not true. The dogma is that sequence determines structure; how, then, can very different sequences fold to the same structure? Here, we employ high-temperature unfolding simulations to probe the pathways and specific interactions that direct the folding and unfolding of the SH3 domain. The SH3 metafold in the Dynameomics Database consists of 753 proteins with the same structure, but varied sequences and functions. To investigate the relationship between sequence and structure, we selected 17 targets from the SH3 metafold with high sequence variability. Six unfolding simulations were performed for each target and transition states were identified, revealing two general folding/unfolding pathways at the transition state. Transition states were also expressed as mathematical graphs of connected chemical nodes, and it was found that three positions within the structure, independent of sequence, were consistently more connected within the graph than any other nearby positions in the sequence. These positions represent a hub connecting different portions of the structure. Multiple sequence alignment and covariation analyses also revealed certain positions that were more conserved due to packing constraints and stabilizing long‐range contacts. This study demonstrates that members of the SH3 domain with different sequences can unfold through two main pathways, but certain characteristics are conserved regardless of the sequence or unfolding pathway. While sequence determines structure, we show that disparate sequences can provide similar interactions that influence folding and lead to similar structures.

5.
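田靖, 赵志虎, 陈惠鹏. 遗传 (Yi Chuan), 2009, 31(11): 1067-1076
Comparative genomics studies have found that about 5% of the human genome sequence is under selective constraint, but coding sequences account for only a small part of it; about 3.5% consists of conserved noncoding sequences. These conserved noncoding elements have important functions. They may take part in regulating gene expression at several levels, including chromatin conformation (higher-order structure), DNA transcription, and RNA processing, and they are associated with mammalian morphogenesis and human disease. This article briefly reviews the identification, function and validation, and origin and evolution of conserved noncoding elements, as well as their relationship to human disease.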

6.
Syk expression and novel function in a wide variety of tissues (cited by: 6 total; 0 self-citations; 6 by others)
Syk protein-tyrosine kinase has been implicated in a variety of hematopoietic cell responses, in particular immunoreceptor signaling events that mediate diverse cellular responses including proliferation, differentiation, and phagocytosis. On the other hand, Syk exhibits a more widespread expression pattern in nonhematopoietic cells like fibroblasts, epithelial cells, breast tissue, hepatocytes, neuronal cells, and vascular endothelial cells and has been shown to be functionally important in these cell types. Thus, Syk appears to play a general physiological function in a wide variety of cells. In this article, we briefly review the current literature regarding the expression and novel function of Syk in various cells and tissues.

7.
Profile search methods based on protein domain alignments have proven to be useful tools in comparative sequence analysis. Domain alignments used by currently available search methods have been computed by sequence comparison. With the growth of the protein structure database, however, alignments of many domain pairs have also been computed by structure comparison. Here, we examine the extent to which information from these two sources agrees. We measure agreement with respect to identification of homologous regions in each protein, that is, with respect to the location of domain boundaries. We also measure agreement with respect to identification of homologous residue sites by comparing alignments and assessing the accuracy of the molecular models they predict. We find that domain alignments in publicly available collections based on sequence and structure comparison are largely consistent. However, the homologous regions identified by sequence comparison are often shorter than those identified by 3D structure comparison. In addition, when overall sequence similarity is low, alignments from sequence comparison produce less accurate molecular models, suggesting that they less accurately identify homologous sites. These observations suggest that structure comparison results might be used to improve the overall accuracy of domain alignment collections and the performance of profile search methods based on them.

8.
Members of a new molecular family of bacterial nonspecific acid phosphatases (NSAPs), indicated as class C, were found to share significant sequence similarities to bacterial class B NSAPs and to some plant acid phosphatases, representing the first example of a family of bacterial NSAPs that has a relatively close eukaryotic counterpart. Despite the lack of an overall similarity, conserved sequence motifs were also identified among the above enzyme families (class B and class C bacterial NSAPs, and related plant phosphatases) and several other families of phosphohydrolases, including bacterial phosphoglycolate phosphatases, histidinol-phosphatase domains of the bacterial bifunctional enzymes imidazole-glycerolphosphate dehydratases, and bacterial, eukaryotic, and archaeal phosphoserine phosphatases and trehalose-6-phosphatases. These conserved motifs are clustered within two domains, separated by a variable spacer region, according to the pattern [FILMAVT]-D-[ILFRMVY]-D-[GSNDE]-[TV]-[ILVAM]-[ATSVILMC]-X-{YFWHKR}-X-{YFWHNQ}-X(102,191)-{KRHNQ}-G-D-{FYWHILVMC}-{QNH}-{FWYGP}-D-{PSNQYW}. The dephosphorylating activity common to all these proteins supports the definition of this phosphatase motif and the inclusion of these enzymes into a superfamily of phosphohydrolases that we propose to indicate as "DDDD" after the presence of the four invariant aspartate residues. Database searches retrieved various hypothetical proteins of unknown function containing this or similar motifs, for which a phosphohydrolase activity could be hypothesized.
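
The motif in this abstract is written in PROSITE-style notation: square brackets list the residues allowed at a position, X is any residue, and X(m,n) is a spacer of variable length. As a rough illustration of how such a pattern could be used in a database search, the following minimal sketch translates the notation into a Python regular expression. It is not the authors' search procedure. The pattern string is written with curly braces for exclusion groups (residues not allowed at a position), which is an assumption about the original notation, and the function name `prosite_to_regex` and the toy sequence are hypothetical.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        # Separate the residue specification from an optional repeat
        # count such as (3) or (102,191).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core in ("x", "X"):
            token = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"   # excluded residues
        else:
            token = core                      # literal residue or [allowed set]
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)


# Pattern from the abstract, with exclusion groups assumed to be {...}.
DDDD_PATTERN = (
    "[FILMAVT]-D-[ILFRMVY]-D-[GSNDE]-[TV]-[ILVAM]-[ATSVILMC]-X-{YFWHKR}-X-"
    "{YFWHNQ}-X(102,191)-{KRHNQ}-G-D-{FYWHILVMC}-{QNH}-{FWYGP}-D-{PSNQYW}"
)
dddd_regex = re.compile(prosite_to_regex(DDDD_PATTERN))

# Synthetic toy sequence built only to satisfy the pattern; not a real protein.
toy_sequence = "FDIDGTIA" + "AAAA" + "A" * 102 + "AGDAAADA"
print(bool(dddd_regex.search(toy_sequence)))
```

A regular expression is used here only because it maps directly onto the bracket and spacer notation; dedicated pattern-scanning tools handle ambiguity codes and anchors that this sketch ignores.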

9.
10.
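Using Yarrowia lipolytica ATCC 30162, the strain with the highest lipid yield reported to date, the lipase-encoding genes Yllip1 and Yllip2 were amplified by reverse-transcription PCR; their products comprise 816 and 549 amino acids, respectively. Conserved domain prediction showed that Yllip1 contains a Patatin-like phospholipase domain and a DUF3336 domain of unknown function, whereas Yllip2 contains a lipase_3 class lipase domain, and both proteins have one to four transmembrane regions. Multiple sequence alignment with homologous lipases from different species showed that Yllip1 and Yllip2 contain eight and six conserved regions, respectively. These bioinformatic analyses suggest that the substrates of the two Y. lipolytica lipases may be intracellular membrane phospholipids and acylglycerols, respectively. Quantitative real-time PCR showed that adding oleic acid to the medium induced significant upregulation of both Yllip1 and Yllip2 within a short period (6 h), suggesting that they may take part in the biochemical processes by which the yeast breaks down and uses oleic acid.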

11.
12.
Wang J, Feng JA. Proteins, 2005, 58(3): 628-637
Sequence alignment has become one of the essential bioinformatics tools in biomedical research. Existing sequence alignment methods can produce reliable alignments for homologous proteins sharing a high percentage of sequence identity. The performance of these methods deteriorates sharply for sequence pairs sharing less than 25% sequence identity. We report here a new method, NdPASA, for pairwise sequence alignment. This method employs neighbor-dependent propensities of amino acids as a unique parameter for alignment. The values of neighbor-dependent propensity measure the preference of an amino acid pair adopting a particular secondary structure conformation. NdPASA optimizes alignment by evaluating the likelihood of a residue pair in the query sequence matching against a corresponding residue pair adopting a particular secondary structure in the template sequence. Using superpositions of homologous proteins derived from the PSI-BLAST analysis and the Structural Classification of Proteins (SCOP) classification of a nonredundant Protein Data Bank (PDB) database as a gold standard, we show that NdPASA has improved pairwise alignment. Statistical analyses of the performance of NdPASA indicate that the introduction of sequence patterns of secondary structure derived from neighbor-dependent sequence analysis clearly improves alignment performance for sequence pairs sharing less than 20% sequence identity. For sequence pairs sharing 13-21% sequence identity, NdPASA improves the accuracy of alignment over the conventional global alignment (GA) algorithm using the BLOSUM62 matrix by an average of 8.6%. NdPASA is most effective for aligning query sequences with template sequences whose structure is known. NdPASA can be accessed online at http://astro.temple.edu/feng/Servers/BioinformaticServers.htm.

13.
Sequenced fragments of genes coding for silicon transporters (SITs) were analyzed for diatoms of evolutionarily distant classes (centric Chaetoceros muelleri Lemmermann, pennate araphid Synedra acus Kützing, pennate raphid Phaeodactylum tricornutum Bohlin, and pennate Cylindrotheca fusiformis Reimann et Lewin with a keeled raphe system). SITs were found to contain a conserved motif, CMLD. Hydropathy profiles showed that the motif CMLD is between two transmembrane domains lacking Lys and Arg, and the domains were consequently assumed to play a role in the formation of a channel mediating silicic acid transport. The motif CMLD proved to be rare. Since Zn2+ is necessary for silica incorporation into diatom cells, a hypothesis was advanced that the motif CMLD acts as a Zn-binding site. Diatom growth suppression was observed in the presence of the alkylating agent N-iodoacetylamidoethyl-1-aminonaphthalene-5-sulfonic acid (AEDANS), which does not penetrate into the cell. Cys of the motif CMLD was assumed to act as a target for AEDANS. Zinc ions inhibited Cys alkylation in the synthetic peptide NCMLDY, supporting the above hypothesis.
Translated from Molekulyarnaya Biologiya, Vol. 39, No. 2, 2005, pp. 303–316. Original Russian Text Copyright © 2005 by Sherbakova, Masyukova, Safonova, Petrova, Vereshagin, Minaeva, Adelshin, Triboy, Stonik, Aizdaitcher, Kozlov, Likhoshway, Grachev.

14.
Polymorphic sequence in the D-loop region of equine mitochondrial DNA (cited by: 8 total; 0 self-citations; 8 by others)
The D-loop regions in equine mitochondrial DNA were cloned from three thoroughbred horses by polymerase chain reaction (PCR). The D-loop regions were 1114 bp, 1115 bp, and 1146 bp long. The equine D-loop region is A/T rich like many other mammalian D-loops. The large central conserved sequence block and small conserved sequence blocks 1, 2, and 3, which are common to other mammals, were observed. Between conserved sequence blocks 1 and 2 there were tandem repeats of an 8 bp equine-specific sequence, TGTGCACC, and the number of tandem repeats differed among individual horses. The base composition in the unit of these repeats is G/C rich, as are the short repeats in the D-loops of rabbit and pig. Comparing DNA sequences between horse and other mammals, the difference in the D-loop region length is mostly due to the difference in the number of DNA sequences at both extremities. The similarities of the DNA sequences are in the middle part of the D-loop. In a comparison of the sequences among the three thoroughbred horses, the region between tRNAPro and the large central conserved sequence block was found to be the richest in variation. PCR primers in the D-loop region were designed, and the expected maternal inheritance was confirmed by PCR-RFLP (restriction fragment length polymorphism).
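
Counting copies of the 8 bp unit TGTGCACC is the kind of measurement that distinguishes the individual horses described here. The following minimal sketch shows one way such a count could be made on a sequence string. It is an illustration only, not the method used in the study; the function name `longest_tandem_run` and the example sequence are hypothetical.

```python
import re


def longest_tandem_run(sequence: str, unit: str = "TGTGCACC") -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    # Each match is one uninterrupted run of the repeat unit.
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Made-up fragment: three copies of the unit flanked by arbitrary bases.
example = "AAA" + "TGTGCACC" * 3 + "GGG"
print(longest_tandem_run(example))
```

The sketch counts only perfect copies of the unit; repeats that carry point mutations would need an approximate-matching approach.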

15.
Methicillin-resistant Staphylococcus aureus (MRSA) is an increasing cause of serious infection, both in the community and hospital settings. Despite sophisticated strategies and efforts, the antibiotic options for treating MRSA infection are narrowing because of the limited number of newly developed antimicrobials. Here, four newly isolated MRSA-virulent phages, IME-SA1, IME-SA2, IME-SA118, and IME-SA119, were sequenced and analyzed. Their genome termini were identified using our previously proposed "termini analysis theory". We provide evidence that remarkably conserved terminus sequences are found in IME-SA1/2/118/119 and, moreover, are widespread throughout Twortlikevirus Staphylococcus phage G1 and K species. Results also suggested that each phage of the two species has a conserved 5′ terminus while the 3′ terminus is variable. More importantly, a variable region with a specific pattern was found to be present near the conserved terminus of Twortlikevirus S. phage G1 species. The clone with the longest variable region had variable terminus lengths in successive generations, while the clones with the shortest variable region and with the average-length variable region maintained the same terminal length during successive generations. IME-SA1 bacterial infection experiments showed that the variation is not derived from adaptation of the phage to different host strains. This is the first study of the conserved terminus and variable region of Twortlikevirus S. phages.

16.
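HCV isolates are mainly divided into four genotypes (HCV I–IV). Amino acid and nucleotide homology between genotypes is below 80%, and the amino acid variation rates are C 8%, E1 35%, E2/NS1 53%, NS3 27%, NS4 35%, and NS5 39%. Different HCV genotypes have different geographic distributions. Based on properties of the HCV-expressed polypeptides such as conservation, hydrophilicity, antigenicity, and spatial conformation, a number of highly conserved candidate B-cell and T-cell epitopes have been identified in HCV expression products; B-cell epitopes are generally 12–40 residues long and T-cell epitopes generally 7–9 residues long. Identifying these conserved B/T-cell epitopes will help advance HCV immunotherapy and vaccine research.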

17.
Vibrio cholerae O1 El Tor, the pathogen responsible for the current cholera pandemic, became pathogenic by acquiring virulence factors including Vibrio seventh pandemic islands (VSP)‐I and ‐II. Diversity of VSP‐II is well recognized; however, studies addressing attachment sequence left (attL) sequences of VSP‐II are few. In this report, a wide variety of V. cholerae strains were analyzed for the structure and distribution of VSP‐II in relation to their attachment sequences. Of 188 V. cholerae strains analyzed, 81% (153/188) carried VSP‐II; of these, typical VSP‐II and a short variant were found in 36% (55/153) and 63% (96/153), respectively. A novel VSP‐II was found in two V. cholerae non‐O1/non‐O139 strains. In addition to the typical 14‐bp attL, six new attL‐like sequences were identified. The 14‐bp attL was associated with VSP‐II in 91% (139/153), whereas the remaining six types were found in 9.2% (14/153) of V. cholerae strains. Of note, six distinct types of the attL‐like sequence were found in the seventh pandemic wave 1 strains; however, only one or two types were found in the wave 2 or 3 strains. Interestingly, 86% (24/28) of V. cholerae seventh pandemic strains harboring a 13‐bp attL‐like sequence were devoid of VSP‐II. Six novel genomic islands using two unique insertion sites to those of VSP‐II were identified in 11 V. cholerae strains in this study. Four of those shared similar gene clusters with VSP‐II, except for the integrase gene.

18.
A new revision of the sequence of plasmid pBR322 (cited by: 19 total; 0 self-citations; 19 by others)
Ned Watson. Gene, 1988, 70(2): 399-403
A revised sequence in the region immediately upstream from the rop gene of pBR322 is reported. Two base pairs in the accepted sequence do not exist in the plasmid DNA. Specifically, a TA base pair is missing at sequence coordinate 1893 [Sutcliffe, Cold Spring Harbor Symp. Quant. Biol. 43 (1979) 77–90] and an AT base pair is missing at position 1915, giving a total size for pBR322 of 4361 bp. These changes are in a potential translation initiation sequence and probably reflect errors in the original sequence rather than recent evolution of the plasmid.

19.
20.