Similar articles
 Found 20 similar articles (search time: 0 ms)
1.

Background  

Pathogenicity islands (PAIs), distinct genomic segments of pathogens encoding virulence factors, represent a subgroup of genomic islands (GIs) that have been acquired by horizontal gene transfer events. Up to now, computational approaches for identifying PAIs have focused on detecting genomic regions that differ from the rest of the genome only in their base composition and codon usage. These approaches often identify genomic islands in general rather than PAIs specifically.
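The composition-based detection mentioned in this abstract can be illustrated with a minimal sketch. It is an assumption-laden example, not the method of any cited tool: the function names, the window and step sizes, and the z-score cutoff are all hypothetical, and only GC content is used (codon usage is not covered).

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Return the fraction of G or C bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def atypical_windows(genome, window=1000, step=500, z_cutoff=2.0):
    """Flag windows whose GC content deviates from the mean over all windows.

    Returns a list of (start, gc, z) tuples for windows with |z| > z_cutoff.
    The default parameters are illustrative assumptions only.
    """
    starts = range(0, max(len(genome) - window + 1, 1), step)
    per_window = [(s, gc_fraction(genome[s:s + window])) for s in starts]
    values = [gc for _, gc in per_window]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # perfectly uniform composition: nothing stands out
    return [
        (s, gc, (gc - mu) / sigma)
        for s, gc in per_window
        if abs(gc - mu) / sigma > z_cutoff
    ]
```

As the abstract notes, a scan like this flags compositionally atypical regions in general, so it cannot by itself distinguish a pathogenicity island from any other genomic island.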

2.
3.
LL Zheng  YX Li  J Ding  XK Guo  KY Feng  YJ Wang  LL Hu  YD Cai  P Hao  KC Chou 《PloS one》2012,7(8):e42517
Bacterial pathogens continue to threaten public health worldwide today. Identification of bacterial virulence factors can help to find novel drug/vaccine targets against pathogenicity. It can also help to reveal the mechanisms of the related diseases at the molecular level. With the explosive growth in protein sequences generated in the postgenomic age, it is highly desirable to develop computational methods for rapidly and effectively identifying virulence factors according to their sequence information alone. In this study, based on the protein-protein interaction networks from the STRING database, a novel network-based method was proposed for identifying the virulence factors in the proteomes of UPEC 536, UPEC CFT073, P. aeruginosa PAO1, L. pneumophila Philadelphia 1, C. jejuni NCTC 11168 and M. tuberculosis H37Rv. Evaluated on the same benchmark datasets derived from the aforementioned species, the identification accuracies achieved by the network-based method were around 0.9, significantly higher than those achieved by sequence-based methods such as BLAST, feature selection and VirulentPred. Further analysis showed that functional associations such as gene neighborhood and co-occurrence were the primary associations between these virulence factors in the STRING database. The high success rates indicate that the network-based method is quite promising. The novel approach holds high potential for identifying virulence factors in many other organisms as well, because it can easily be extended to other bacterial species as long as the relevant statistical data are available for them.

4.
5.
6.
7.
MOTIVATION: We consider the problem of identifying low-complexity regions (LCRs) in a protein sequence. LCRs are regions of biased composition, normally consisting of different kinds of repeats. RESULTS: We define new complexity measures to compute the complexity of a sequence based on a given scoring matrix, such as BLOSUM 62. Our complexity measures also consider the order of amino acids in the sequence and the sequence length. We develop a novel graph-based algorithm called GBA to identify LCRs in a protein sequence. In the graph constructed for the sequence, each vertex corresponds to a pair of similar amino acids. Each edge connects two pairs of amino acids that can be grouped together to form a longer repeat. GBA finds short subsequences as LCR candidates by traversing this graph. It then extends them to find longer subsequences that may contain full repeats with low complexities. Extended subsequences are then post-processed to refine repeats to LCRs. Our experiments on real data show that GBA has significantly higher recall compared to existing algorithms, including 0j.py, CARD, and SEG. AVAILABILITY: The program is available on request.
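To make the idea of a low-complexity region concrete, the sketch below scores sliding windows by plain Shannon entropy of amino-acid composition. This is deliberately a simpler, swapped-in technique: it is not GBA, it ignores the scoring matrix and residue order that the abstract's measures take into account, and the function names, window length, and entropy threshold are assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (bits) of the residue composition of segment."""
    n = len(segment)
    counts = Counter(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(protein, window=12, max_entropy=2.2):
    """Return (start, entropy) for every window at or below max_entropy.

    window and max_entropy are illustrative assumptions, not values
    taken from GBA, SEG, or any other published tool.
    """
    hits = []
    for start in range(len(protein) - window + 1):
        h = shannon_entropy(protein[start:start + window])
        if h <= max_entropy:
            hits.append((start, round(h, 3)))
    return hits
```

A composition-only score like this treats a shuffled window and a tandem repeat identically, which is one reason the abstract emphasises measures that also consider residue order.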

8.
Zang X  Komatsu S 《Phytochemistry》2007,68(4):426-437
Osmotic stress can endanger the survival of plants. To investigate the mechanisms by which plants respond to osmotic stress, rice protein profiles from mannitol-treated plants were monitored using a proteomics approach. Two-week-old rice seedlings were treated with 400 mM mannitol for 48 h. After separation of proteins from the basal part of leaf sheaths by two-dimensional polyacrylamide gel electrophoresis, 327 proteins were detected. The levels of 12 proteins increased and the levels of three proteins decreased with increasing concentration or duration of mannitol treatment. Levels of a heat shock protein and a dnaK-type molecular chaperone were reduced under osmotic, cold, salt and drought stresses, and ABA treatment, whereas a 26S proteasome regulatory subunit was found to be responsive only to osmotic stress. Furthermore, proteins whose accumulation was sensitive to osmotic stress are present in an osmotic-tolerant cultivar. These results indicate that specific proteins expressed in the basal part of rice leaf sheaths show a coordinated response to cope with osmotic stress.

9.
BACKGROUND: Many studies have demonstrated genetic and environmental factors that lead to renal cell carcinoma (RCC) and that occur during a protracted period of tumourigenesis. It appears suitable to identify and characterise potential molecular markers that appear during tumourigenesis and that might provide rapid and effective possibilities for the early detection of RCC. EGFR activation induces cell cycle progression, inhibition of apoptosis and angiogenesis, promotion of invasion/metastasis, and other tumour promoting activities. Over-expression of EGFR is thought to play an important role in tumour initiation and progression of RCC because up-regulation of EGFR has been associated with high grade cancers and a worse prognosis. METHODS: Characterisation of the protein profile interacting with EGFR was performed using the following: an immunohistochemical (IHC) study of EGFR, a comprehensive computational study of EGFR protein-protein interactions, an analysis correlating the expression levels of EGFR with other significant markers in the tumourigenicity of RCC, and finally, an analysis of the utility of EGFR for prognosis in a cohort of patients with renal cell carcinoma. RESULTS: The cases that showed a higher level of this protein fell within the clear cell histological subtype (p = 0.001). A statistically significant association was found between EGFR and a worse prognosis. In vivo, significant correlations were found with PDGFR-beta, Flk-1, Hif1-alpha, proteins related to differentiation (such as DLL3 and DLL4 ligands), and certain metabolic proteins such as Glut5. In silico, significant associations gave us a panel of 32 EGFR-interacting proteins (EIP) using the APID and STRING databases. CONCLUSIONS: This work summarises the multifaceted role of EGFR in the pathology of RCC, and it identifies EIPs that could help to provide mechanistic explanations for the different behaviours observed in tumours.

10.
Although poorly positioned nucleosomes are ubiquitous in the eukaryotic genome, they are difficult to identify with existing nucleosome identification methods. Recently available enhanced high-throughput chromatin conformation capture techniques such as Micro-C, DNase Hi-C, and Hi-CO characterize nucleosome-level chromatin proximity, probing the positions of mono-nucleosomes and the spacing between nucleosome pairs at the same time, enabling nucleosome profiling in poorly positioned regions. Here we develop a novel computational approach, NucleoMap, to identify nucleosome positioning from ultra-high resolution chromatin contact maps. By integrating nucleosome read density, contact distances, and binding preferences, NucleoMap precisely locates nucleosomes in both prokaryotic and eukaryotic genomes and outperforms existing nucleosome identification methods in both precision and recall. We rigorously characterize genome-wide association in eukaryotes between the spatial organization of mono-nucleosomes and their corresponding histone modifications, protein binding activities, and higher-order chromatin functions. We also find evidence of two tetra-nucleosome folding structures in human embryonic stem cells and analyze their association with multiple structural and functional regions. Based on the identified nucleosomes, nucleosome contact maps are constructed, reflecting the inter-nucleosome distances and preserving the contact distance profiles in original contact maps.

11.
Animals display characteristic behavioural patterns when performing a task, such as the spiraling of a soaring bird or the surge-and-cast of a male moth searching for a female. Identifying such recurring sequences occurring rarely in noisy behavioural data is key to understanding the behavioural response to a distributed stimulus in unrestrained animals. Existing models seek to describe the dynamics of behaviour or segment individual locomotor episodes rather than to identify the rare and transient sequences of locomotor episodes that make up the behavioural response. To fill this gap, we develop a lexical, hierarchical model of behaviour. We designed an unsupervised algorithm called “BASS” to efficiently identify and segment recurring behavioural action sequences transiently occurring in long behavioural recordings. When applied to navigating larval zebrafish, BASS extracts a dictionary of remarkably long, non-Markovian sequences consisting of repeats and mixtures of slow forward and turn bouts. Applied to a novel chemotaxis assay, BASS uncovers chemotactic strategies deployed by zebrafish to avoid aversive cues consisting of sequences of fast large-angle turns and burst swims. In a simulated dataset of soaring gliders climbing thermals, BASS finds the spiraling patterns characteristic of soaring behaviour. In both cases, BASS succeeds in identifying rare action sequences in the behaviour deployed by freely moving animals. BASS can be easily incorporated into the pipelines of existing behavioural analyses across diverse species, and even more broadly used as a generic algorithm for pattern recognition in low-dimensional sequential data.

12.
Chatterji S  Pachter L 《Genomics》2007,90(1):44-48
The exon-intron structure of eukaryotic genes allows for phenomena such as alternative splicing, nonsense-mediated decay, and regulation through untranslated regions. However, the evolution of the exon structure of genes is not well elucidated because of limited and phylogenetically sparse data sets. In this study, we use the phylogenetically diverse sequencing of the ENCODE regions to study gene structure evolution in mammalian genomes. This first phylogenetically diverse study of gene structure changes offers insights into the mode and tempo of mammalian gene structure evolution. The genes undergoing structure changes appear to be moderately to highly expressed in germline cells and show levels of selection similar to those of other ENCODE genes. Patterns of gene duplication of the affected genes are more complex than expected. The number of sampled genomes is sufficiently dense to infer that certain gene duplications happened after intron loss. Thus, although gene duplication is highly correlated with intron loss, we conclude that structural changes in genes are not necessarily due to a loss of constraint following gene duplication as previously suggested.

13.

Background  

We describe the distribution of indels in the 44 Encyclopedia of DNA Elements (ENCODE) regions (about 1% of the human genome) and evaluate the potential contributions of small insertion and deletion polymorphisms (indels) to human genetic variation. We relate indels to known genomic annotation features and measures of evolutionary constraint.

14.
15.
Protein tyrosine kinases (PTKs) play a central role in the modulation of a wide variety of cellular events such as differentiation, proliferation and metabolism, and their unregulated activation can lead to various diseases including cancer and diabetes. PTKs represent a diverse family of proteins including both receptor tyrosine kinases (RTKs) and non-receptor tyrosine kinases (NRTKs). Due to the diversity and important cellular roles of PTKs, accurate classification methods are required to better understand and differentiate different PTKs. In addition, PTKs have become important targets for drugs, providing a further need to develop novel methods to accurately classify this set of important biological molecules. Here, we introduce a novel statistical model for the classification of PTKs that is based on their structural features. The approach allows for both the recognition of PTKs and the classification of RTKs into their subfamilies. This novel approach had an overall accuracy of 98.5% for the identification of PTKs, and 99.3% for the classification of RTKs.

16.
Protein-protein interactions are important to understanding cell functions; however, our theoretical understanding is limited. There is a general discontinuity between the well-accepted physical and chemical forces that drive protein-protein interactions and the large collections of identified protein-protein interactions in various databases. Minimotifs are short functional peptide sequences that provide a basis to bridge this gap in knowledge. However, there is no systematic way to study minimotifs in the context of protein-protein interactions or vice versa. Here we have engineered a set of algorithms that can be used to identify minimotifs in known protein-protein interactions and implemented this for use by scientists in Minimotif Miner. By globally testing these algorithms on verified data and on 100 individual proteins as test cases, we demonstrate the utility of these new computation tools. This tool also can be used to reduce false-positive predictions in the discovery of novel minimotifs. The statistical significance of these algorithms is demonstrated by an ROC analysis (P = 0.001).

17.
We employ a structurally-motivated phenomenological formulation to identify biomechanical experiments which can be used to determine a vascular constitutive relation directly from data. Large deformations, nonlinear material behavior, load-dependent anisotropy, material heterogeneity and incompressibility are accounted for in the analysis. For purposes of illustration, we outline a procedure for studying elastic arteries wherein the behavior of the media and adventitia is considered separately. This general approach for identifying vascular constitutive relations can be applied to any vessel or airway, however, and should provide certain advantages over previous microstructural or purely phenomenological formulations.

18.
The diagnosis of cancer by examination of the urine has the potential to improve patient outcomes by means of earlier detection. Because the urine contains metabolic signatures of many biochemical pathways, this biofluid is ideally suited for metabolomic analysis, especially involving diseases of the kidney and urinary system. In this pilot study, we test three independent analytical techniques for suitability for detection of renal cell carcinoma (RCC) in urine of affected patients. Hydrophilic interaction chromatography (HILIC-LC-MS), reversed-phase ultra performance liquid chromatography (RP-UPLC-MS), and gas chromatography time-of-flight mass spectrometry (GC-TOF-MS) were all used as complementary separation techniques. The combination of these techniques is best suited to cover a very large part of the urine metabolome by enabling the detection of both lipophilic and hydrophilic metabolites present therein. In this study, it is demonstrated that sample pretreatment with urease dramatically alters the metabolome composition apart from removal of urea. Two new freely available peak alignment methods, MZmine and XCMS, are used for peak detection and retention time alignment. The results are analyzed by a feature selection algorithm with subsequent univariate analysis of variance (ANOVA) and a multivariate partial least squares (PLS) approach. From more than 2000 mass spectral features detected in the urine, we identify several significant components that lead to discrimination between RCC patients and controls despite the relatively small sample size. A feature selection process condensed the significant features to fewer than 30 components in each of the data sets. In future work, these potential biomarkers will be further validated with a larger patient cohort. Such investigation will likely lead to clinically applicable assays for earlier diagnosis of RCC, as well as other malignancies, and thereby improved patient prognosis.
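The univariate ANOVA step named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the function names, the dictionary layout (feature name mapped to a list of intensities), and the feature labels in any usage are hypothetical, and neither the feature selection algorithm nor the PLS step is reproduced.

```python
from statistics import mean


def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    k, n = len(groups), len(all_values)
    # Between-group and within-group sums of squares.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(cases, controls):
    """Rank features shared by both groups by decreasing F statistic.

    cases and controls map a feature name to a list of intensities;
    this layout is an assumption made for the example.
    """
    scores = {
        name: one_way_f([cases[name], controls[name]])
        for name in cases
        if name in controls
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With more than 2000 features and a small cohort, a ranking like this would normally be paired with a multiple-testing correction before any feature is called significant.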

19.
20.
Heme is a key cofactor in aerobic life, both in eukaryotes and prokaryotes. Because of the high reactivity of ferrous protoporphyrin IX, the reactions of heme in cells are often carried out through heme-protein complexes. Traditionally, studies of heme-binding proteins have been approached on a case-by-case basis; thus there is a limited global view of the distribution of heme-binding proteins in different cells or tissues. The procedure described here is aimed at profiling heme-binding proteins in mouse tissues sequentially by 1) purification of heme-binding proteins by heme-agarose, an affinity chromatographic resin; 2) isolation of heme-binding proteins by SDS-PAGE or two-dimensional electrophoresis; 3) identification of heme-binding proteins by mass spectrometry. In five mouse tissues, over 600 protein spots were visualized on 2DE gels stained with Coomassie blue, and 154 proteins were identified by MALDI-TOF, most of which are heme-related. This methodology makes it possible to globally c
