Similar literature
20 similar articles found (search time: 15 ms)
1.
Sfixem is a sequence feature series (SFS) visualization tool implemented in Java. It is designed to visualize data from sequence analysis programs, allowing the user to view multiple sets of computationally generated analyses to assist the analysis process. SFS is used as the data exchange format. AVAILABILITY: Sfixem is available for direct usage or download for local usage at http://sfixem.cgb.ki.se. A protein sequence analysis workbench using Sfixem is available at http://sfinx.cgb.ki.se.

2.
MEDUSA is a tool for automatic selection and visual assessment of PCR primer pairs, developed to assist large scale gene expression analysis projects. The system allows specification of constraints on the location of, and distances between, the primers in a pair. For instance, primers in coding, non-coding, or exon/intron-spanning regions can be selected. MEDUSA applies these constraints as a filter to primers predicted by three external programs, and displays the resulting primer pairs graphically in the Blixem (Sonnhammer and Durbin, Comput. Appl. Biosci. 10, 301-307, 1994; http://www.cgr.ki.se/cgr/groups/sonnhammer/Blixem.html) viewer. AVAILABILITY: The MEDUSA web server is available at http://www.cgr.ki.se/cgr/MEDUSA. The source code and user information are available at ftp://ftp.cgr.ki.se/pub/prog/medusa.

3.
WiGID, wireless genome information database, is a new application for mobile internet and can be reached through wireless application protocol (WAP). The main purpose of WiGID is to give easy access to information on completely sequenced genomes. Genome entries in WiGID can be queried by the number of open reading frames (ORFs), genus and species name and year published. Initial search results are linked to information on the full entry. AVAILABILITY: WiGID can be accessed through WAP at http://wigid.cgb.ki.se/index.wml and through the regular internet at http://wigid.cgb.ki.se.

4.
Bias in the estimation of false discovery rate in microarray studies   Total citations: 4 (self-citations: 0; citations by others: 4)
MOTIVATION: The false discovery rate (FDR) provides a key statistical assessment for microarray studies. Its value depends on the proportion π₀ of non-differentially expressed (non-DE) genes. In most microarray studies, many genes have small effects not easily separable from non-DE genes. As a result, current methods often overestimate π₀ and FDR, leading to unnecessary loss of power in the overall analysis. METHODS: For the common two-sample comparison we derive a natural mixture model of the test statistic and an explicit bias formula for the standard estimation of π₀. We suggest an improved estimation of π₀ based on the mixture model and describe a practical likelihood-based procedure for this purpose. RESULTS: The analysis shows that a large bias occurs when π₀ is far from 1 and when the non-centrality parameters of the distribution of the test statistic are near zero. The theoretical result also explains substantial discrepancies between non-parametric and model-based estimates of π₀. Simulation studies indicate mixture-model estimates are less biased than standard estimates. The method is applied to breast cancer and lymphoma data examples. AVAILABILITY: An R-package OCplus containing functions to compute π₀ based on the mixture model, the resulting FDR and other operating characteristics of microarray data, is freely available at http://www.meb.ki.se/~yudpaw. CONTACT: yudi.pawitan@meb.ki.se and alexander.ploner@meb.ki.se.
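
To make the role of π₀ concrete, the sketch below uses a simple tail-count estimate of π₀ (p-values above a cutoff are assumed to come mostly from non-DE genes) and plugs it into an FDR estimate. This is not the likelihood-based mixture-model procedure described in the abstract, nor the OCplus interface; the function names `estimate_pi0` and `estimate_fdr` and the cutoff `lam` are assumptions made for illustration only.

```python
def estimate_pi0(p_values, lam=0.5):
    """Tail-count estimate of the non-DE proportion (illustrative, not OCplus).

    P-values above `lam` are treated as coming from non-DE genes, and the
    count is rescaled by 1 / (1 - lam) to cover the whole unit interval.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("need at least one p-value")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")
    above = sum(1 for p in p_values if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))


def estimate_fdr(p_values, threshold, pi0):
    """Estimated FDR when every gene with p <= threshold is called DE."""
    m = len(p_values)
    rejected = sum(1 for p in p_values if p <= threshold)
    # Expected number of false positives divided by the number of calls.
    return min(1.0, pi0 * m * threshold / max(1, rejected))


if __name__ == "__main__":
    # Hypothetical data: 80 non-DE genes on an even grid, 20 clearly DE genes.
    null_p = [(i + 0.5) / 80 for i in range(80)]
    de_p = [1e-4] * 20
    p_values = null_p + de_p
    pi0 = estimate_pi0(p_values)
    print(pi0, estimate_fdr(p_values, 0.05, pi0))
```

In this toy setting the DE genes are strongly separated, so the tail count recovers the true proportion. Genes with small effects would often produce p-values above `lam` and be counted as non-DE, which is the overestimation the abstract describes.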

5.
Pfam is a collection of multiple alignments and profile hidden Markov models of protein domain families. Release 3.1 is a major update of the Pfam database and contains 1313 families which are available on the World Wide Web in Europe at http://www.sanger.ac.uk/Software/Pfam/ and http://www.cgr.ki.se/Pfam/, and in the US at http://pfam.wustl.edu/. Over 54% of proteins in SWISS-PROT-35 and SP-TrEMBL-5 match a Pfam family. The primary changes of Pfam since release 2.1 are that we now use the more advanced version 2 of the HMMER software, which is more sensitive and provides expectation values for matches, and that it now includes proteins from both SP-TrEMBL and SWISS-PROT.

6.
The Pfam protein families database   Total citations: 105 (self-citations: 12; citations by others: 93)
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the WWW in the UK at http://www.sanger.ac.uk/Software/Pfam/, in Sweden at http://www.cgr.ki.se/Pfam/ and in the US at http://pfam.wustl.edu/. The latest version (4.3) of Pfam contains 1815 families. These Pfam families match 63% of proteins in SWISS-PROT 37 and TrEMBL 9. For complete genomes Pfam currently matches up to half of the proteins. Genomic DNA can be directly searched against the Pfam library using the Wise2 package.

7.
SUMMARY: LogoBar is a Java application to display protein sequence logos. In our software gaps are accounted for when calculating the information content present at each residue position in a multiple alignment. The resulting logo is displayed as a graph consisting of bars, although traditional letter representation is also possible. Amino acids are displayed from the bottom up with decreasing frequencies, i.e. the most abundant residue is placed at the bottom of the logo. The bars can be color-coded according to user specifications. Gaps in the alignment are also displayed, either on top or at the bottom of the logo. Furthermore, residues can either be arranged according to their relative abundance or grouped according to user criteria to emphasize the conserved nature of particular positions. AVAILABILITY: LogoBar and further documentation are available at http://www.biosci.ki.se/groups/tbu/logobar/
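
The abstract does not say how LogoBar accounts for gaps, so the following is only a sketch of one common convention: compute the Shannon information content of the non-gap residues in a column, then scale it by the fraction of sequences that are not gapped at that position. The function name `column_information`, the gap symbol and the scaling rule are all assumptions, not LogoBar's documented behavior.

```python
from collections import Counter
from math import log2


def column_information(column, alphabet_size=20, gap="-"):
    """Information content (bits) of one alignment column, gap-weighted.

    Assumed convention: log2(alphabet_size) minus the entropy of the
    non-gap residues, multiplied by the non-gap fraction of the column.
    """
    n = len(column)
    residues = [c for c in column if c != gap]
    if n == 0 or not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((k / total) * log2(k / total) for k in counts.values())
    return (log2(alphabet_size) - entropy) * total / n


def stacking_order(column, gap="-"):
    """Residues from most to least abundant, i.e. bottom-up as in the abstract."""
    counts = Counter(c for c in column if c != gap)
    return [residue for residue, _ in counts.most_common()]


if __name__ == "__main__":
    print(column_information("AAAA"))   # fully conserved, no gaps
    print(column_information("AA--"))   # same residue, half the column gapped
    print(stacking_order("LLLIV-"))
```

Under this convention a conserved column that is half gaps carries half the information of a fully occupied conserved column.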

8.
The Pfam Protein Families Database   Total citations: 17 (self-citations: 0; citations by others: 17)
Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at http://www.sanger.ac.uk/Software/Pfam/, in Sweden at http://www.cgb.ki.se/Pfam/, in France at http://pfam.jouy.inra.fr/ and in the US at http://pfam.wustl.edu/. The latest version (6.6) of Pfam contains 3071 families, which match 69% of proteins in SWISS-PROT 39 and TrEMBL 14. Structural data, where available, have been utilised to ensure that Pfam families correspond with structural domains, and to improve domain-based annotation. Predictions of non-domain regions are now also included. In addition to secondary structure, Pfam multiple sequence alignments now contain active site residue mark-up. New search tools, including taxonomy search and domain query, greatly add to the functionality and usability of the Pfam resource.

9.
10.
SUMMARY: KIND (Karolinska Institutet Nonredundant Database) is a protein database where identical sequences, both full length and partial, have been removed. The database contains nearly 274 900 sequences, half of which originate from the protein sequence databases Swissprot and PIR, while the other half come from translated open reading frames in GenPept and TrEMBL. AVAILABILITY: KIND is downloadable from ftp://ftp.mbb.ki.se/pub/KIND.
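
One way to read "identical sequences, both full length and partial" is that a sequence is dropped when it equals, or is wholly contained in, a longer sequence already kept. The sketch below implements that reading with a plain quadratic scan. It is an interpretation for illustration, not KIND's actual build procedure, and the function name `remove_redundant` is hypothetical.

```python
def remove_redundant(sequences):
    """Drop exact duplicates and sequences contained in a longer kept sequence.

    Assumed interpretation of 'full length and partial' identity. The scan is
    O(n^2) substring checks, which is fine for a sketch but not for a
    database-scale run.
    """
    kept = []
    # Longest first, so every possible container is seen before its fragments.
    for seq in sorted(sequences, key=len, reverse=True):
        if not any(seq in longer for longer in kept):
            kept.append(seq)
    return kept


if __name__ == "__main__":
    entries = ["MKTAYIAK", "KTAY", "MKTAYIAK", "GGG"]
    print(remove_redundant(entries))
```

Sorting by length first means each candidate only needs to be compared against sequences at least as long as itself.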

11.
PfamAlyzer is a Java applet that enables exploration of Pfam domain architectures using a user-friendly graphical interface. It can search the UniProt protein database for a domain pattern. Domain patterns similar to the query are presented graphically by PfamAlyzer either in a ranked list or pinned to the tree of life. Such domain-centric homology search can assist identification of distant homologs with shared domain architecture. AVAILABILITY: PfamAlyzer has been integrated with the Pfam database and can be accessed at http://pfam.cgb.ki.se/pfamalyzer.

12.
13.
Multidimensional local false discovery rate for microarray studies   Total citations: 1 (self-citations: 0; citations by others: 1)
MOTIVATION: The false discovery rate (fdr) is a key tool for statistical assessment of differential expression (DE) in microarray studies. Overall control of the fdr alone, however, is not sufficient to address the problem of genes with small variance, which generally suffer from a disproportionally high rate of false positives. It is desirable to have an fdr-controlling procedure that automatically accounts for gene variability. METHODS: We generalize the local fdr as a function of multiple statistics, combining a common test statistic for assessing DE with its standard error information. We use a non-parametric mixture model for DE and non-DE genes to describe the observed multi-dimensional statistics, and estimate the distribution for non-DE genes via the permutation method. We demonstrate this fdr2d approach for simulated and real microarray data. RESULTS: The fdr2d allows objective assessment of DE as a function of gene variability. We also show that the fdr2d performs better than commonly used modified test statistics. AVAILABILITY: An R-package OCplus containing functions for computing fdr2d() and other operating characteristics of microarray data is available at http://www.meb.ki.se/~yudpaw.

14.
MOTIVATION: The complete sequencing of many genomes has made it possible to identify orthologous genes descending from a common ancestor. However, reconstruction of evolutionary history over long time periods faces many challenges due to gene duplications and losses. Identification of orthologous groups shared by multiple proteomes therefore becomes a clustering problem in which an optimal compromise between conflicting evidence needs to be found. RESULTS: Here we present a new proteome-scale analysis program called MultiParanoid that can automatically find orthology relationships between proteins in multiple proteomes. The software is an extension of the InParanoid program that identifies orthologs and inparalogs in pairwise proteome comparisons. MultiParanoid applies a clustering algorithm to merge multiple pairwise ortholog groups from InParanoid into multi-species ortholog groups. To avoid outparalogs in the same cluster, MultiParanoid only combines species that share the same last ancestor. To validate the clustering technique, we compared the results to a reference set obtained by manual phylogenetic analysis. We further compared the results to ortholog groups in KOGs and OrthoMCL, which revealed that MultiParanoid produces substantially fewer outparalogs than these resources. AVAILABILITY: MultiParanoid is a freely available standalone program that enables efficient orthology analysis much needed in the post-genomic era. A web-based service providing access to the original datasets, the resulting groups of orthologs, and the source code of the program can be found at http://multiparanoid.cgb.ki.se.
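
The abstract does not specify the clustering algorithm, so the sketch below shows only the simplest possible form of the merging step: pairwise ortholog groups that share a protein are joined into one multi-species group using a union-find structure (single linkage). It does not resolve conflicting evidence and does not implement the common-ancestor restriction against outparalogs. The function name and the `species:protein` identifiers are hypothetical.

```python
def merge_pairwise_groups(pairwise_groups):
    """Single-linkage merge of pairwise ortholog groups (illustrative only).

    Each input group is an iterable of protein identifiers; groups that share
    at least one identifier end up in the same output cluster.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for group in pairwise_groups:
        members = list(group)
        for protein in members:
            find(protein)  # register singletons as well
        for other in members[1:]:
            union(members[0], other)

    clusters = {}
    for protein in list(parent):
        clusters.setdefault(find(protein), set()).add(protein)
    # Sorted output so that results are reproducible.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


if __name__ == "__main__":
    groups = [
        ["hs:A", "mm:A"],   # human-mouse comparison
        ["mm:A", "dm:A"],   # mouse-fly comparison
        ["hs:B", "mm:B"],
    ]
    print(merge_pairwise_groups(groups))
```

Pure single linkage tends to chain groups together through a single shared protein, which is one reason a real tool needs additional rules for conflicting pairwise results.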

15.
We present a statistical graphical model to infer specific molecular function for unannotated protein sequences using homology. Based on phylogenomic principles, SIFTER (Statistical Inference of Function Through Evolutionary Relationships) accurately predicts molecular function for members of a protein family given a reconciled phylogeny and available function annotations, even when the data are sparse or noisy. Our method produced specific and consistent molecular function predictions across 100 Pfam families in comparison to the Gene Ontology annotation database, BLAST, GOtcha, and Orthostrapper. We performed a more detailed exploration of functional predictions on the adenosine-5′-monophosphate/adenosine deaminase family and the lactate/malate dehydrogenase family, in the former case comparing the predictions against a gold standard set of published functional characterizations. Given function annotations for 3% of the proteins in the deaminase family, SIFTER achieves 96% accuracy in predicting molecular function for experimentally characterized proteins as reported in the literature. The accuracy of SIFTER on this dataset is a significant improvement over other currently available methods such as BLAST (75%), GeneQuiz (64%), GOtcha (89%), and Orthostrapper (11%). We also experimentally characterized the adenosine deaminase from Plasmodium falciparum, confirming SIFTER's prediction. The results illustrate the predictive power of exploiting a statistical model of function evolution in phylogenomic problems. A software implementation of SIFTER is available from the authors.

16.
Finishing, i.e. gap closure and editing, is the most time-consuming part of genome sequencing. Repeated sequences together with sequencing errors complicate the assembly and often result in misassemblies that are difficult to correct. Repeat Discrepancy Tagger (ReDiT) is a tool designed to aid in the finishing step. This software processes assembly results produced by any fragment assembly program that outputs ace files. The input sequences are analyzed to determine possible differences between repeated sequences. The output is written as tags in an ace file that can be viewed by, e.g. the Consed sequence editor. AVAILABILITY: The ReDiT program is freely available at http://web.cgb.ki.se/redit

17.
The concerted and self-organizing behavior of spinal cord segments in generating locomotor patterns is modulated by afferent sensory information and controlled by descending pathways from the brainstem, cerebellum, or cortex. The purpose of this study was to define a minimal set of parameters that could control a similar self-organizing behavior in a two-dimensional neural network. When we implemented synaptic depression and active membrane repolarization as two properties of the neurons, the two-dimensional neural network generated traveling waves. Their wavelength and angle of propagation could be independently controlled by two parameters that modulated excitatory premotor neurons and inhibitory commissural neurons. It is further demonstrated that the selection of wave parameters corresponds to the selection of quadruped gaits. Received: 30 July 2001 / Accepted in revised form: 17 April 2002 Correspondence to: A. Kaske (e-mail: alexander.kaske@mtc.ki.se, alexander.kaske@vglab.com)

18.
Abhiman S, Sonnhammer EL. Proteins 2005, 60(4):758-768
Protein function shift can be predicted from sequence comparisons, either using positive selection signals or evolutionary rate estimation. None of the methods have been validated on large datasets, however. Here we investigate existing and novel methods for protein function shift prediction, and benchmark the accuracy against a large dataset of proteins with known enzymatic functions. Function change was predicted between subfamilies by identifying two kinds of sites in a multiple sequence alignment: Conservation-Shifting Sites (CSS), which are conserved in two subfamilies using two different amino acid types, and Rate-Shifting Sites (RSS), which have different evolutionary rates in two subfamilies. CSS were predicted by a new entropy-based method, and RSS using the Rate-Shift program. In principle, the more CSS and RSS between two subfamilies, the more likely a function shift between them. A test dataset was built by extracting subfamilies from Pfam with different EC numbers that belong to the same domain family. Subfamilies were generated automatically using a phylogenetic tree-based program, BETE. The dataset comprised 997 subfamily pairs with four or more members per subfamily. We observed a significant increase in CSS and RSS for subfamily comparisons with different EC numbers compared to cases with same EC numbers. The discrimination was better using RSS than CSS, and was more pronounced for larger families. Combining RSS and CSS by discriminant analysis improved classification accuracy to 71%. The method was applied to the Pfam database and the results are available at http://FunShift.cgb.ki.se. A closer examination of some superfamily comparisons showed that single EC numbers sometimes embody distinct functional classes. Hence, the measured accuracy of function shift is underestimated.
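
The definition of a Conservation-Shifting Site can be illustrated with a small entropy test: a column counts as a CSS when it is conserved (low Shannon entropy) within each subfamily but the dominant residue differs between the two. This is a simplified sketch and not the entropy-based method of the paper, whose details the abstract does not give; the threshold `max_entropy`, the function names and the toy alignments are assumptions, and gaps are not handled.

```python
from collections import Counter
from math import log2


def column_entropy(residues):
    """Shannon entropy (bits) of the residues in one alignment column."""
    total = len(residues)
    counts = Counter(residues)
    return -sum((k / total) * log2(k / total) for k in counts.values())


def conservation_shifting_sites(sub_a, sub_b, max_entropy=0.5):
    """Columns conserved within both subfamilies but with different residues.

    `sub_a` and `sub_b` are lists of equal-length aligned sequences taken
    from the same multiple alignment. `max_entropy` is a hypothetical cutoff.
    """
    sites = []
    for pos in range(len(sub_a[0])):
        col_a = [seq[pos] for seq in sub_a]
        col_b = [seq[pos] for seq in sub_b]
        if column_entropy(col_a) > max_entropy:
            continue
        if column_entropy(col_b) > max_entropy:
            continue
        top_a = Counter(col_a).most_common(1)[0][0]
        top_b = Counter(col_b).most_common(1)[0][0]
        if top_a != top_b:
            sites.append(pos)
    return sites


if __name__ == "__main__":
    subfamily_a = ["ACDK", "ACDR", "ACDK", "ACDH"]
    subfamily_b = ["ACEK", "ACER", "ACEL", "ACEK"]
    print(conservation_shifting_sites(subfamily_a, subfamily_b))
```

In the toy alignments only the third column (index 2) qualifies: it is invariant in each subfamily (D versus E), while the last column is too variable in both to count as conserved.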

19.
20.
This essay examines the portrayal of returning overseas Vietnamese (Việt kiều) in Charlie Nguyen’s romantic comedy Để Mai tính (Fool for Love, 2010). I call this conflictual relationship with Vietnamese nationals Việt kiều intimacy. This intimacy is marked by disgust and desire, past and future: they are far away and too close. The Việt kiều’s upward and outward mobility speaks to global capitalism’s enticing opportunities, a mobility that is explicitly linked to the Việt kiều characters’ non-normative sexual-gender expressions. At the same time, that the two film characters’ career aspirations are complicated and curtailed by sex and romance suggests that the Việt kiều’s mobility is compromised by their stubborn affective attachment to “Vietnam.” The rich affective, temporal and spatial dimensions of Việt kiều intimacy confirm that the intimate, far from being discretely tucked away in the private realm, non-commodified and physically bounded, is inevitably linked to the public sphere, economic aims and national interests.
