Similar literature
 Found 20 similar articles; search took 15 ms
1.
MOTIVATION: High-throughput technologies such as DNA sequencing and microarrays have created the need for automated annotation of large sets of genes, including whole genomes, and automated identification of pathways. Ontologies, such as the popular Gene Ontology (GO), provide a common controlled vocabulary for these types of automated analysis. Yet, while GO offers tremendous value, it also has certain limitations such as the lack of direct association with pathways. RESULTS: We demonstrated the use of the KEGG Orthology (KO), part of the KEGG suite of resources, as an alternative controlled vocabulary for automated annotation and pathway identification. We developed a KO-Based Annotation System (KOBAS) that can automatically annotate a set of sequences with KO terms and identify both the most frequent and the statistically significantly enriched pathways. Results from both whole genome and microarray gene cluster annotations with KOBAS are comparable and complementary to known annotations. KOBAS is a freely available stand-alone Python program that can contribute significantly to genome annotation and microarray analysis.
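
The idea of a "statistically significantly enriched" pathway in the abstract above can be illustrated with a one-sided hypergeometric test. This is only a minimal sketch: the abstract does not say which test KOBAS uses, and the function name and the gene counts in the usage note are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: genes in the background set (assumed known)
    K: background genes annotated to the pathway
    n: genes in the query set
    k: query genes annotated to the pathway
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of drawing k, k+1, ..., upper pathway genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

For example, `hypergeom_enrichment_p(8, 50, 40, 5000)` would give the probability of seeing at least 8 pathway genes among 50 query genes when 40 of 5000 background genes belong to the pathway. A real analysis would also correct for testing many pathways at once.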

2.
The automated sequence annotation pipeline (ASAP) is designed to ease routine investigation of new functional annotations on unknown sequences, such as expressed sequence tags (ESTs), through querying of web-accessible resources and maintenance of a local database. The system allows easy use of the output from one search as the input for a new search, as well as the filtering of results. The database is used to store formats and parameters and information for parsing data from web sites. The database permits easy updating of format information should a site modify the format of a query or of a returned web page.

3.
While glycoproteins are abundant in nature, and changes in glycosylation occur in cancer and other diseases, glycoprotein characterization remains a challenge due to the structural complexity of the biopolymers. This paper presents a general strategy, termed GlyDB, for glycan structure annotation of N-linked glycopeptides from tandem mass spectra in the LC-MS analysis of proteolytic digests of glycoproteins. The GlyDB approach takes advantage of low-energy collision-induced dissociation of N-linked glycopeptides that preferentially cleaves the glycosidic bonds while the peptide backbone remains intact. A theoretical glycan structure database derived from biosynthetic rules for N-linked glycans was constructed employing a novel representation of branched glycan structures consisting of multiple linear sequences. The commonly used peptide identification program, Sequest, could then be utilized to assign experimental tandem mass spectra to individual glycoforms. Analysis of synthetic glycopeptides and well-characterized glycoproteins demonstrates that the GlyDB approach can be a useful tool for annotation of glycan structures and for selection of a limited number of potential glycan structure candidates for targeted validation.

4.
Modifications for improving the prediction quality in a previously described adaptive algorithm of automated annotation (A 4) were considered. First, the direct use of the basis statistic η ensures a higher prediction quality than the use of a previously proposed statistic γ. Second, the quality is improved when only some of the found similar sequences, rather than all of them, are used for prediction, since this reduces the data noise.

5.
The scurfy (sf) murine mutation causes severe lymphoproliferation, which results in death of hemizygous males (sf/Y) by 22 to 26 days of age. The CD4+ T cells are crucial mediators of this disease. Recent publications have not only identified this mutation as the genetic equivalent of the human disease X-linked neonatal diabetes mellitus, enteropathy, and endocrinopathy syndrome, but also have indicated that the defective protein, scurfin, is a new forkhead/winged-helix protein with a frameshift mutation, resulting in a product without the functional forkhead. These results have led to speculation that the scurfy gene acts by disrupting the T-cell tolerance mechanism, resulting in hyperresponsiveness and lack of down-regulation. The Rag1KO/sf/Y OVA strain, with virtually 100% of its CD4+ T cells reactive strictly to ovalbumin (OVA) peptide 323-339, is an excellent model for determination of the sf mutation's ability to disrupt tolerance. We hypothesized that Rag1KO/sf/OVA mice would not be tolerant to antigen at a dose that tolerizes control animals. We found that splenic cells from Rag1KO/sf/Y OVA mice injected with the same dose of OVA peptide that induces tolerance in cells from control mice proliferate in vitro in response to OVA peptide. These results are consistent with a defect in the pathway responsible for peripheral T-cell tolerization.

6.
7.
8.
Reliable automated NOE assignment and structure calculation on the basis of a largely complete, assigned input chemical shift list and a list of unassigned NOESY cross peaks has recently become feasible for routine NMR protein structure calculation and has been shown to yield results that are equivalent to those of the conventional, manual approach. However, these algorithms rely on the availability of a virtually complete list of the chemical shifts. This paper investigates the influence of incomplete chemical shift assignments on the reliability of NMR structures obtained with automated NOESY cross peak assignment. The program CYANA was used for combined automated NOESY assignment with the CANDID algorithm and structure calculations with torsion angle dynamics at various degrees of completeness of the chemical shift assignment, which was simulated by random omission of entries in the experimental 1H chemical shift lists that had been used for the earlier, conventional structure determinations of two proteins. Sets of structure calculations were performed choosing the omitted chemical shifts randomly among all assigned hydrogen atoms, or among aromatic hydrogen atoms. For comparison, automated NOESY assignment and structure calculations were performed with the complete experimental chemical shift lists but under random omission of NOESY cross peaks. When heteronuclear-resolved three-dimensional NOESY spectra are available, the current CANDID algorithm yields, in the absence of up to about 10% of the experimental 1H chemical shifts, reliable NOE assignments and three-dimensional structures that deviate by less than 2 Å from the reference structure obtained using all experimental chemical shift assignments. In contrast, the algorithm can accommodate the omission of up to 50% of the cross peaks in heteronuclear-resolved NOESY spectra without producing structures with an RMSD of more than 2 Å to the reference structure. When only homonuclear NOESY spectra are available, the algorithm is slightly more susceptible to missing data and can tolerate the absence of up to about 7% of the experimental 1H chemical shifts or of up to 30% of the NOESY peaks.
Abbreviations: BmPBPA – Bombyx mori pheromone binding protein form A; CYANA – combined assignment and dynamics algorithm for NMR applications; NMR – nuclear magnetic resonance; NOE – nuclear Overhauser effect; NOESY – NOE spectroscopy; RMSD – root-mean-square deviation; WmKT – Williopsis mrakii killer toxin
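
The "random omission of entries" used in the abstract above to simulate incomplete chemical shift lists can be sketched as follows. This is an illustrative assumption about how such a simulation might look, not the authors' procedure or any part of CYANA; the dictionary layout, atom labels, and function name are hypothetical.

```python
import random


def omit_random_entries(shifts, fraction, seed=0):
    """Return a copy of `shifts` with roughly `fraction` of its entries removed.

    shifts: dict mapping a hypothetical atom label (e.g. "A12.HA")
            to its chemical shift in ppm
    fraction: share of entries to drop, between 0 and 1
    seed: fixes the random choice so a run can be repeated
    """
    rng = random.Random(seed)
    n_omit = round(len(shifts) * fraction)
    # Sort the keys so the same seed always selects the same atoms.
    omitted = set(rng.sample(sorted(shifts), n_omit))
    return {atom: ppm for atom, ppm in shifts.items() if atom not in omitted}
```

Repeating the call with different seeds and fractions would produce a family of reduced shift lists on which the structure calculation could be rerun and compared with the reference.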

9.
In the past few years, the field of metagenomics has been growing at an accelerated pace, particularly in response to advancements in new sequencing technologies. The large volume of sequence data from novel organisms generated by metagenomic projects has triggered the development of specialized databases and tools focused on particular groups of organisms or data types. Here we describe a pipeline for the functional annotation of viral metagenomic sequence data. The Viral MetaGenome Annotation Pipeline (VMGAP) takes advantage of a number of specialized databases, such as collections of mobile genetic elements and environmental metagenomes, to improve the classification and functional prediction of viral gene products. The pipeline assigns a functional term to each predicted protein sequence following a suite of comprehensive analyses whose results are ranked according to a priority rules hierarchy. Additional annotation is provided in the form of enzyme commission (EC) numbers, GO/MeGO terms and Hidden Markov Models together with supporting evidence.

10.
11.
Natale DA, Shankavaram UT, Galperin MY, Wolf YI, Aravind L, Koonin EV. 《Genome biology》 2000, 1(5): research0009.1-research0009.19

Background  

Standard archival sequence databases have not been designed as tools for genome annotation and are far from being optimal for this purpose. We used the database of Clusters of Orthologous Groups of proteins (COGs) to reannotate the genomes of two archaea, Aeropyrum pernix, the first member of the Crenarchaea to be sequenced, and Pyrococcus abyssi.

12.
Sequence similarity was used to predict the position of expressed sequence tags (ESTs) in the genome of the turkey (Meleagris gallopavo). Turkey EST sequences were compared with the draft assembly of the chicken whole-genome sequence and the chicken EST database by BLASTN. Among the 877 ESTs examined, 788 had significant matches in the chicken genome sequence. Position of orthologous sequences in the chicken genome and the predicted position of the EST loci in the turkey genome are presented. Genetic assignments suggest a high level of accuracy for the COMPASS predictions.

13.
Sequence similarity was used to predict the position of expressed sequence tags (ESTs) in the genome of the turkey (Meleagris gallopavo). Turkey EST sequences were compared with the draft assembly of the chicken whole-genome sequence and the chicken EST database by BLASTN. Among the 877 ESTs examined, 788 had significant matches in the chicken genome sequence. Position of orthologous sequences in the chicken genome and the predicted position of the EST loci in the turkey genome are presented. Genetic assignments suggest a high level of accuracy for the COMPASS predictions.

14.

Background  

In the recent past, the introduction of Classical Swine Fever Virus (CSFV) followed by between-herd spread has given rise to a number of large epidemics in The Netherlands and Belgium. Both these countries are pork-exporting countries. Particularly important in these epidemics has been the occurrence of substantial "neighborhood transmission" from herd to herd in the presence of base-line control measures prescribed by EU legislation. Here we propose a calculation procedure to map out "high-risk areas" for local between-herd spread of CSFV as a tool to support decision making on prevention and control of CSFV outbreaks. In this procedure the identification of such areas is based on an estimated probability of neighborhood transmission, or "local transmission", that depends on inter-herd distance. Using this distance-dependent probability, we derive a threshold value for the local density of herds. In areas with local herd density above the threshold, local transmission alone can already lead to epidemic spread, whereas in below-threshold areas this is not the case. The first type of area is termed 'high-risk' for spread of CSFV, while the latter type is termed 'low-risk'.
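
The classification step described above (compare local herd density with a threshold) can be illustrated with a deliberately simplified sketch. The abstract gives neither the transmission kernel nor the derived threshold, so the fixed radius, the density definition, and the threshold value used here are all hypothetical placeholders rather than the authors' procedure.

```python
from math import hypot, pi


def classify_herds(herds, radius_km, threshold_per_km2):
    """Label each herd 'high-risk' or 'low-risk' by local herd density.

    herds: list of (x, y) herd locations in km
    radius_km: assumed neighbourhood radius (hypothetical)
    threshold_per_km2: assumed density threshold (hypothetical)
    """
    labels = []
    for i, (xi, yi) in enumerate(herds):
        # Count the other herds that lie within the neighbourhood radius.
        neighbours = sum(
            1
            for j, (xj, yj) in enumerate(herds)
            if j != i and hypot(xi - xj, yi - yj) <= radius_km
        )
        density = neighbours / (pi * radius_km ** 2)
        labels.append("high-risk" if density > threshold_per_km2 else "low-risk")
    return labels
```

In the published procedure the threshold follows from the estimated distance-dependent transmission probability; here it is simply passed in as a number to show where it would enter the calculation.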

15.

Background

Lack of NDUFS4, a subunit of mitochondrial complex I (NADH:ubiquinone oxidoreductase), causes Leigh syndrome (LS), a progressive encephalomyopathy. Knocking out Ndufs4, either systemically or in brain only, elicits LS in mice. In patients as well as in KO mice, distinct regions of the brain degenerate while surrounding tissue survives despite systemic complex I dysfunction. For the understanding of disease etiology and ultimately for the development of rational treatments for LS, it appears important to uncover the mechanisms that govern focal neurodegeneration.

Results

Here we used the Ndufs4(KO) mouse to investigate whether regional and temporal differences in respiratory capacity of the brain could be correlated with neurodegeneration. In the KO, the respiratory capacity of synaptosomes from the degeneration-prone regions (olfactory bulb, brainstem and cerebellum) was significantly decreased. The difference was measurable even before the onset of neurological symptoms. Furthermore, neither compensating nor exacerbating changes in glycolytic capacity of the synaptosomes were found. By contrast, the KO retained near normal levels of synaptosomal respiration in the degeneration-resistant/resilient “rest” of the brain. We also investigated non-synaptic mitochondria. As expected, the KO had diminished capacity for oxidative phosphorylation (state 3 respiration) with the complex I dependent substrate combinations pyruvate/malate and glutamate/malate but, surprisingly, had normal activity with α-ketoglutarate/malate. No correlation between oxidative phosphorylation (pyruvate/malate driven state 3 respiration) and neurodegeneration was found. Notably, state 3 remained constant in the KO while in controls it tended to increase with time, leading to significant differences between the genotypes in older mice in both vulnerable and resilient brain regions. Neither regional ROS damage, measured as HNE-modified protein, nor regional complex I stability, assessed by blue native gels, could explain regional neurodegeneration.

Conclusion

Our data suggest that locally insufficient respiration capacity of the nerve terminals may drive focal neurodegeneration.

16.
17.
18.
Phytophthora megakarya, the causative agent of cacao black pod disease in West African countries, causes an extensive loss of yield. In this study we analyzed 4 libraries of ESTs derived from Phytophthora megakarya infected cocoa leaf and pod tissues. In total, 6379 redundant sequences were retrieved from the ESTtik database, and EST processing was performed using the seqclean tool. Clustering and assembly using CAP3 generated 3333 non-redundant sequences (907 contigs and 2426 singletons). The primary sequence analysis of the 3333 non-redundant sequences showed that the GC percentage was 42.7 and the sequence length ranged from 101 to 2576 nucleotides. Further functional analyses (Blast, Interproscan, Gene Ontology and KEGG search) were performed, and 1230 orthologous genes were annotated. In total, 272 enzymes corresponding to 114 metabolic pathways were identified. Functional annotation revealed that most of the sequences are related to molecular function, stress response and biological processes. The annotated enzymes include aldehyde dehydrogenase (E.C: 1.2.1.3), catalase (E.C: 1.11.1.6), acetyl-CoA C-acetyltransferase (E.C: 2.3.1.9), threonine ammonia-lyase (E.C: 4.3.1.19), acetolactate synthase (E.C: 2.2.1.6) and O-methyltransferase (E.C: 2.1.1.68), which play an important role in amino acid biosynthesis and phenylpropanoid biosynthesis. All this information was stored in a MySQL database management system for future use in reconstructing the biotic stress response pathway in cocoa.
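
The primary sequence statistics quoted above (overall GC percentage and the range of sequence lengths) are simple to compute. The following is a minimal sketch on plain sequence strings; the function names are hypothetical, and it does not reproduce the study's actual tooling.

```python
def gc_percent(seqs):
    """Overall GC percentage across all sequences, weighted by length."""
    total = sum(len(s) for s in seqs)
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in seqs)
    return 100.0 * gc / total


def length_range(seqs):
    """Shortest and longest sequence length as a (min, max) tuple."""
    lengths = [len(s) for s in seqs]
    return min(lengths), max(lengths)
```

Applied to the 3333 non-redundant sequences, the same two calls would yield the kind of summary reported in the abstract (a GC percentage and a minimum and maximum length).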

19.
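Expressed sequence tag (EST) analysis of the buccal gland of the Japanese lamprey (Lampetra japonica)   Total citations: 9, self-citations: 0, citations by others: 9
高琪, 逄越, 吴毓, 马飞, 李庆伟. 《遗传学报》 (Acta Genetica Sinica) 2005, 32(10): 1045-1052
A cDNA library with a capacity of 2.1 × 10^6 pfu/mL was constructed from the buccal gland of the Japanese lamprey. Sequencing of clones from the library followed by preliminary bioinformatic analysis yielded 1323 valid EST sequences. Homology searches with BlastX and BlastN found homologous sequences at the protein or nucleotide level for 653 ESTs (49.36%), of which 328 were homologous to sequences from species of the lamprey family. The homologous sequences fell into roughly 11 functional categories, with proteins related to protein synthesis making up the largest share. Contig analysis of the 1323 ESTs produced 162 contigs comprising 547 sequences and identified 8 full-length cDNAs. The successful construction of the buccal gland cDNA library and EST library lays a foundation for studying the functional genes and proteomics of the Japanese lamprey buccal gland.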

20.
In order to verify the reproducibility, precision, and robustness of the optical immunosensor River Analyser (RIANA), we investigated two common statistical methods to evaluate the limit of detection (LOD) and the limit of quantification (LOQ). To this end, we performed a simultaneous multi-analyte calibration with atrazine, bisphenol A, and estrone in Milli-Q water. Using an automated biosensor, it was possible for the first time to achieve a LOD below 0.020 µg L⁻¹ using a common statistically based method without sample pre-treatment and pre-concentration for each of the analytes in a simultaneous multi-analyte calibration. This biosensor setup shows values comparable to those obtained by more classical analytical methods. Based on this calibration, we measured spiked and un-spiked real water samples with complex matrices (samples from different water bodies, from ground water sources, and tap water samples). The comparison between our River Analyser and common analytical methods (like GC-MS and HPLC-DAD) shows overall comparable values for all three analytes. Furthermore, a calibration of isoproturon (IPU) in single-analyte mode resulted in a LOD of 0.016 µg L⁻¹ and a LOQ of 0.091 µg L⁻¹. Six out of nine recovery rates (recovery rate: measured concentration divided by real concentration, in percent) for three surface water samples with different matrices (spiked and un-spiked) fell between 70 and 120%, the range demanded by the guidelines of the Association of Analytical Communities International (AOAC). The reproducibility was checked by measuring replicates of each sample within independent repetitions. Robustness could be demonstrated by long-term stability tests of the biosensor surface. These studies show that the biosensor used offers the necessary reproducibility, precision, and robustness required for an analytical method.
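
The recovery rate defined in the abstract above (measured concentration divided by real concentration, in percent) and the 70 to 120% acceptance range can be written out directly. This is a minimal sketch; the function names and the concentrations in the usage note are hypothetical and are not data from the study.

```python
def recovery_rate(measured, real):
    """Recovery rate in percent: measured concentration / real concentration."""
    return 100.0 * measured / real


def within_accepted_range(rate, low=70.0, high=120.0):
    """True if the recovery rate lies in the 70 to 120% range cited in the abstract."""
    return low <= rate <= high
```

For instance, a sample spiked to 0.10 µg L⁻¹ and measured at 0.09 µg L⁻¹ gives a recovery rate of about 90%, which lies inside the accepted range, whereas a measurement of 0.13 µg L⁻¹ (about 130%) would not.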
