Similar literature
1.
2.
Analysis of protein sequences is an important tool in studies of both native and recombinant proteins. Novel techniques and instrumentation that facilitate determination of protein primary structure have recently been developed.

3.
《Genomics》2021,113(3):1098-1113
Epigenetic inheritance occurs through different mechanisms such as chromatin and histone modifications, DNA methylation and processes mediated by non-coding RNAs. It leads to changes in gene expression and the emergence of new traits in different organisms and in many diseases such as cancer. Recent advances in experimental methods have led to the identification of epigenetic target sites in various organisms. Computational approaches have enabled the analysis of the large volumes of data produced by these methods. Next-generation sequencing (NGS) methods have been broadly used to identify these target sites and their patterns. By using these patterns, the emergence of diseases could be predicted. In this study, target site prediction tools for two major epigenetic mechanisms, histone modification and DNA methylation, are reviewed. Publicly accessible databases are reviewed as well. Some suggestions regarding the state-of-the-art methods and databases are made, including examining patterns of epigenetic changes that are important in epigenotype detection.

4.
With the completion of the human genome and those of a few model organisms, and with the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools capable of easily identifying structures in DNA sequences and extracting features from them. Among the more important structures in a DNA sequence are those related to repeats. Often they have to be masked before protein-coding regions along a DNA sequence are identified or redundant expressed sequence tags (ESTs) are sequenced. Here we report a novel recurrence time-based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features in a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics; its salient features are that it is largely species-independent and works well on very short sequences. Efficient codon indices are key elements of successful gene-finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Our method requires approximately 6N bytes of memory and a computational time of order N log N to extract all the repeat-related and periodic or quasi-periodic features from a sequence of length N without any prior knowledge of the consensus sequence of those features, and hence enables us to carry out sequence analysis on the whole-genome scale on a PC.
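
The recurrence-time idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes that a "recurrence time" is simply the distance between successive occurrences of the same word of length k, and the function name `recurrence_times` is hypothetical. A peak in the resulting histogram at a distance d would hint at a repeat or periodicity of period d.

```python
from collections import Counter


def recurrence_times(seq: str, k: int) -> Counter:
    """Histogram of distances between successive occurrences of each k-mer.

    Illustrative assumption: recurrence time = gap between one occurrence
    of a word and its next occurrence in the same sequence.
    """
    last_seen = {}    # k-mer -> start position of its most recent occurrence
    gaps = Counter()  # recurrence time -> number of times it was observed
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in last_seen:
            gaps[i - last_seen[word]] += 1
        last_seen[word] = i
    return gaps


# A tandem repeat of period 3 produces recurrence times of exactly 3.
hist = recurrence_times("ACGACGACGTTT", 3)
print(hist)
```

The dictionary lookup keeps the sketch at roughly linear expected time for fixed k; the memory and N log N figures quoted in the abstract refer to the authors' own method, not to this sketch.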

5.
6.
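With the increase in fungal infections, phenotypic methods alone are insufficient for identifying pathogenic fungi from environmental or clinical sources and thus for diagnosing fungal infections rapidly and accurately. In recent years, molecular biology methods have gradually come into use because they are fast and accurate, and DNA sequence analysis has become an important method for identifying pathogenic fungi to the species level. This article reviews the application of DNA sequence analysis to the classification, identification and genotyping of common pathogenic fungi.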

7.
This is a review of the methods based on counting oligomers in nucleotide and amino acid sequences. Such methods are analogous to the formal linguistic analysis of human texts. This review includes methods based on the calculation of observed occurrences (frequencies) of oligomers and their distribution, as well as those based on deviations between the observed and the expected occurrences (contrast words, genome signatures) in biological sequences. Both types of methods have a wide range of sensitivity and can identify homologous as well as functionally and taxonomically related sequences.
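
The observed-versus-expected comparison mentioned in the abstract above can be sketched as follows. The expectation model is an assumption chosen for simplicity (independent positions with the sequence's own letter frequencies); the review covers other models, and the function name `kmer_contrast` is hypothetical.

```python
from collections import Counter
from math import prod


def kmer_contrast(seq: str, k: int) -> dict:
    """Observed/expected ratio for every k-mer that occurs in seq.

    Assumed null model: letters are independent, so the expected count of
    a word is (number of windows) * product of its letter frequencies.
    """
    n_words = len(seq) - k + 1
    letter_freq = {b: c / len(seq) for b, c in Counter(seq).items()}
    observed = Counter(seq[i:i + k] for i in range(n_words))
    return {
        word: count / (n_words * prod(letter_freq[b] for b in word))
        for word, count in observed.items()
    }


# Ratios well above 1 mark over-represented words, below 1 under-represented ones.
ratios = kmer_contrast("ACACACGT", 2)
for word, ratio in sorted(ratios.items()):
    print(word, round(ratio, 2))
```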

8.
Sequence determination and analysis began on proteins in the 1950s, with RNA starting about a decade later and DNA a similar period later still. Hence many of the concepts for function prediction were first developed by looking at amino acid sequences. Over time these methods have become much more sophisticated, allowing better discrimination of only weak similarities. The most recent developments concern an examination of contextual information, such as operon structure, metabolic reconstruction or co-expression profiles.

9.
DNA sequence classification is the activity of determining whether or not an unlabeled sequence S belongs to an existing class C. This paper proposes two new techniques for DNA sequence classification. The first technique works by comparing the unlabeled sequence S with a group of active motifs discovered from the elements of C and by distinction with elements outside of C. The second technique generates and matches gapped fingerprints of S with elements of C. Experimental results obtained by running these algorithms on long and well conserved Alu sequences demonstrate the good performance of the presented methods compared with FASTA. When applied to less conserved and relatively short functional sites such as splice-junctions, a variation of the second technique combining fingerprinting with consensus sequence analysis gives better results than the current classifiers employing text compression and machine learning algorithms.

10.
Katokhin A. V., Efimov V. M., Badratinov M. Sh., Kamneva O. K., Mordvinov V. A. 《Biophysics》2008,51(1):100-109

The results of two independent DNA-microarray experiments concerning adipogenesis in the murine preadipocyte 3T3-L1 cell line, which covered the first two days after the induction of differentiation, were analyzed using the multidimensional scaling (MDS) method. In both data arrays, the first three scaling components accounted for 73.5–73.8% of the total dispersion. This result implies that both arrays of the gene expression profiles are in fact three-dimensional and each component reflects a definite principal process involved in one of the three early stages of adipogenesis: (i) determination of the fibroblast-like stem cells, (ii) clonal expansion of adipoblasts, and (iii) preadipocyte conversion into a mature adipocyte phenotype. Each gene expression profile is characterized by its coefficients of correlation with the first three scaling components. The functional annotation of the profiles in terms of the Gene Ontology database (sorted according to the correlations with each component) generally corresponds to a regular change of elementary biological processes during the three early stages of adipogenesis. Analysis of correlations with the principal scaling components for the genes previously classified as subject to differential expression in the course of adipogenesis in mice suggests a complicated role of these genes in early adipogenesis (in some cases, described in the literature). The MDS analysis of the gene expression profiles and the analysis of correlations between these profiles and the main scaling components provide a deeper insight into the fine role of these genes and make possible the search for new biomarkers of various differentiation stages.


11.
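This study used bioinformatics methods to predict the target genes of miR-21 and their functions, laying a foundation for subsequent research on the mechanisms by which miR-21 and its target genes act in the development of colon cancer. Sequences of miR-21 from multiple species were obtained from miRBase and their features analyzed. Four online tools (TargetScan, PicTar, miRanda and miRecords) were used to predict miR-21 target genes; together with experimentally confirmed targets, the predicted genes were subjected to functional annotation and signaling pathway enrichment analysis. The functions of miR-21 were reviewed from the literature and combined with the functional annotation and pathway enrichment results to provide a theoretical basis for further study of the role of miR-21 in the development of colon cancer.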

12.
Summary: A broad-spectrum mercury resistance locus (mer) from a spontaneous chloramphenicol-sensitive (Cms), arginine auxotrophic (Arg−) mutant of Streptomyces lividans 1326 was isolated on a 6 kb DNA fragment by shotgun cloning into the mercury-sensitive derivative S. lividans TK64 using the vector pIJ702. The mer genes form part of a very large amplifiable DNA sequence present in S. lividans 1326. This element was amplified to about 20 copies per chromosome in the Cms Arg− mutant and was missing from strains like S. lividans TK64, cured of the plasmid SLP3. DNA sequence analysis of a 5 kb region encompassing the whole region required for broad-spectrum mercury resistance revealed six open reading frames (ORFs) transcribed in opposite directions from a common intercistronic region. The protein sequences predicted from the two ORFs transcribed in one direction showed a high degree of similarity to mercuric reductase and organomercurial lyase from other gram-negative and gram-positive sources. Few, if any, similarities were found between the predicted polypeptide sequences of the other four ORFs and other known proteins.

13.
周学  杜宜兰  金萍  马飞 《遗传》2015,37(9):855-864
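MicroRNAs (miRNAs) are a class of endogenous non-coding RNAs about 22 nt long that regulate gene expression by binding complementarily to the transcripts of their target genes. Recent studies have found that miRNAs are closely associated with cancer and can act directly as oncogenes or tumor suppressor genes to affect tumor initiation and growth. To further reveal the characteristics of cancer-related miRNAs and the functions of their target genes, this study identified 475 cancer-related miRNAs in the human genome through database and literature searches, and systematically compared cancer-related miRNAs with non-cancer miRNAs, as well as intragenic with intergenic cancer-related miRNAs, in terms of conservation, SNP distribution, cancer spectrum and transcriptional regulation. Cancer-related miRNAs were found to be more conserved than non-cancer miRNAs and less likely to carry SNPs, and the number of cancers in which an miRNA is involved was positively correlated with its conservation. Genomic localization analysis showed that cancer-related miRNAs are more likely than non-cancer miRNAs to occur in clusters. Further association analysis of host genes, cancer-related miRNAs, their target genes and cancer occurrence showed that the host genes of some non-cancer miRNAs tend to be targeted by cancer miRNAs. These results provide a theoretical basis for a deeper understanding of the relationship between miRNAs and cancer and for the use of miRNAs as cancer diagnostic markers.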

14.
Benchmarking tools for the alignment of functional noncoding DNA

Background

Numerous tools have been developed to align genomic sequences. However, their relative performance in specific applications remains poorly characterized. Alignments of protein-coding sequences typically have been benchmarked against "correct" alignments inferred from structural data. For noncoding sequences, where such independent validation is lacking, simulation provides an effective means to generate "correct" alignments with which to benchmark alignment tools.

Results

Using rates of noncoding sequence evolution estimated from the genus Drosophila, we simulated alignments over a range of divergence times under varying models incorporating point substitution, insertion/deletion events, and short blocks of constrained sequences such as those found in cis-regulatory regions. We then compared "correct" alignments generated by a modified version of the ROSE simulation platform to alignments of the simulated derived sequences produced by eight pairwise alignment tools (Avid, BlastZ, Chaos, ClustalW, DiAlign, Lagan, Needle, and WABA) to determine the off-the-shelf performance of each tool. As expected, the ability to align noncoding sequences accurately decreases with increasing divergence for all tools, and declines faster in the presence of insertion/deletion evolution. Global alignment tools (Avid, ClustalW, Lagan, and Needle) typically have higher sensitivity over entire noncoding sequences as well as in constrained sequences. Local tools (BlastZ, Chaos, and WABA) have lower overall sensitivity as a consequence of incomplete coverage, but have high specificity to detect constrained sequences as well as high sensitivity within the subset of sequences they align. Tools such as DiAlign, which generate both local and global outputs, produce alignments of constrained sequences with both high sensitivity and specificity for divergence distances in the range of 1.25–3.0 substitutions per site.

Conclusion

For species with genomic properties similar to Drosophila, we conclude that a single pair of optimally diverged species analyzed with a high performance alignment tool can yield accurate and specific alignments of functionally constrained noncoding sequences. Further algorithm development, optimization of alignment parameters, and benchmarking studies will be necessary to extract the maximal biological information from alignments of functional noncoding DNA.

15.
The alkBFGHJKL and alkST operons encode enzymes that allow Pseudomonas putida (oleovorans) to metabolize alkanes. In this paper we report the nucleotide sequence of a 4592 bp region of the alkBFGHJKL operon encoding the AlkJ, AlkK and AlkL polypeptides. The alkJ gene encodes a protein of 59 kilodaltons. The predicted amino acid sequence shows significant homology with four flavin proteins: choline dehydrogenase, a glucose dehydrogenase and two oxidases. AlkJ is membrane-bound and converts aliphatic medium-chain-length alcohols into aldehydes. The properties of AlkJ suggest that it is linked to the electron transfer chain. AlkJ is necessary for growth on alkanes only in P. putida alcohol dehydrogenase (AlcA) mutants. AlkK is homologous to a range of proteins which act by an ATP-dependent covalent binding of AMP to their substrate. This list includes the acetate, coumarate and long-chain fatty acid CoA ligases. The alkK gene complements a fadD mutation in Escherichia coli, which shows that it indeed encodes an acyl-CoA synthetase. AlkK is a 60 kilodalton protein located in the cytoplasm. AlkL is homologous to OmpW, a Vibrio cholerae outer membrane protein of unknown function, and a hypothetical polypeptide encoded by ytt4 in E. coli. AlkL, OmpW and Ytt4 all have a signal peptide and end with a sequence characteristic of outer membrane proteins. The alkL gene product was found in the outer membrane of E. coli W3110 containing the alk-genes. The alkL gene can be deleted without a clear effect on growth rate. Its function remains unknown. The G+C content of the alkJKL genes is 45%, identical to that of the alkBFGH genes, and significantly lower than the G+C content of the OCT-plasmid and the P. putida chromosome.
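
The G+C content comparison at the end of the abstract above rests on a simple ratio, sketched here. The function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions made for illustration, not details taken from the paper.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("G") + seq.count("C")) / counted


# Two G/C bases out of four unambiguous bases.
print(gc_content("ATGC"))
```

Applied to a gene cluster and to the surrounding replicon, a clearly lower value for the cluster is the kind of difference the abstract reports for the alk genes.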

16.
X Zou  TK Pham  PC Wright  J Noirel 《Genomics》2012,100(4):240-244
Although protein expression and regulation have been intensively studied, a complete picture of their mechanisms is still to be drawn. Analysis of high-throughput quantitative proteomics data provides a way to better understand protein regulation. Here, we introduce a bioinformatic analysis method to correlate protein regulation with individual amino acid patterns. We compare the amino acid composition between groups of regulated and unregulated proteins and investigate the correlation between codon usage patterns and protein regulation levels in two Sulfolobus species in "biofilm vs planktonic" experiments. The identified amino acids can then be associated with the regulation of specific gene functions. Strikingly, our analysis shows that functional categories of regulated proteins with similar composition and codon usage pattern of specific amino acids behave similarly. This finding can contribute to a better understanding of protein and gene expression regulation and could find applications in gene optimisation.

17.
A systematic characterization of lens crystallins from five major classes of vertebrates was carried out by exclusion gel filtration, cation-exchange chromatography and N-terminal sequence determination. All crystallin fractions except that of γ-crystallin were found to be N-terminally blocked. γ-Crystallin is present in major classes of vertebrates except the bird, showing none, or decreased amounts, of this protein in chicken and duck lenses, respectively. N-Terminal sequence analysis of the purified γ-crystallin polypeptides showed extensive homology between different classes of vertebrates, supporting the close relatedness of this family of crystallin even from the evolutionarily distant species. Comparison of nucleotide sequences and their predicted amino acid sequences between γ-crystallins of carp and rat lenses and heat-shock proteins demonstrated partial sequence homology of the encoded polypeptides and striking homology at the gene level. The unexpected strong homology of complementary DNA (cDNA) lies in the regions coding for 40 N-terminal residues of carp γ-II, rat γ2-1, and the middle segments of 23,000- and 70,000-Mr heat-shock proteins. The optimal alignment of DNA sequences along these two segments shows about 50% homology. The percentage of protein sequence identity for the corresponding aligned segments is only 20%. The weak sequence homology at the protein level is also found between the invertebrate squid crystallin and rat γ-crystallin polypeptides. These results pointed to the possibility of unifying three major classes of vertebrate crystallins into one α/β/γ superfamily and corroborated the previous supposition that the existing crystallins in the animal kingdom are probably mutually interrelated, sharing a common ancestry.

18.
19.
20.
Fungal laccases have been used in various fields ranging from processes in wood and paper industries to environmental applications. Although a few bacterial laccases have been characterized in recent years, prokaryotes have largely been neglected as a source of novel enzymes, in part due to the lack of knowledge about the diversity and distribution of laccases within Bacteria. In this work genes for laccase-like enzymes were searched for in over 2,200 complete and draft bacterial genomes and four metagenomic datasets, using custom profile hidden Markov models for two- and three-domain laccases. More than 1,200 putative genes for laccase-like enzymes were retrieved from chromosomes and plasmids of diverse bacteria. In 76% of the genes, signal peptides were predicted, indicating that these bacterial laccases may be exported from the cytoplasm, which contrasts with the current belief. Moreover, several examples of putatively horizontally transferred bacterial laccase genes were described. Many metagenomic sequences encoding fragments of laccase-like enzymes could not be phylogenetically assigned, indicating considerable novelty. Laccase-like genes were also found in anaerobic bacteria, autotrophs and alkaliphiles, thus opening new hypotheses regarding their ecological functions. Bacteria identified as carrying laccase genes represent potential sources for future biotechnological applications.
