Similar articles
 20 similar articles found; search time 15 ms
1.
MOTIVATION: This paper gives a new and efficient algorithm for the sparse logistic regression problem. The proposed algorithm is based on the Gauss-Seidel method and is asymptotically convergent. It is simple and extremely easy to implement; it neither uses any sophisticated mathematical programming software nor needs any matrix operations. It can be applied to a variety of real-world problems like identifying marker genes and building a classifier in the context of cancer diagnosis using microarray data. RESULTS: The gene selection method suggested in this paper is demonstrated on two real-world data sets and the results were found to be consistent with the literature. AVAILABILITY: The implementation of this algorithm is available at the site http://guppy.mpe.nus.edu.sg/~mpessk/SparseLOGREG.shtml Supplementary Information: Supplementary material is available at the site http://guppy.mpe.nus.edu.sg/~mpessk/SparseLOGREG.shtml
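To make the Gauss-Seidel idea concrete, the following is a minimal sketch of cyclic coordinate descent for L1-regularised logistic regression. It is not the authors' implementation: the function names, the soft-thresholding update, the curvature bound and the `lam` penalty parameter are all assumptions chosen for illustration. It only shows how one weight at a time can be updated with plain loops and no matrix operations.

```python
import math


def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sparse_logreg(X, y, lam, sweeps=200):
    """Hypothetical helper: minimise sum_i log(1 + exp(-y_i * w.x_i)) + lam * ||w||_1.

    X is a list of feature rows, y holds labels in {-1, +1}.
    Weights are updated one coordinate at a time (Gauss-Seidel style).
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    scores = [0.0] * n  # cached values of w . x_i
    for _ in range(sweeps):
        for j in range(d):
            # Gradient of the logistic loss with respect to w[j].
            grad = -sum(y[i] * X[i][j] * sigmoid(-y[i] * scores[i]) for i in range(n))
            # Upper bound on the second derivative (logistic curvature <= 1/4).
            curv = 0.25 * sum(X[i][j] ** 2 for i in range(n))
            if curv == 0.0:
                continue
            new_wj = soft_threshold(w[j] - grad / curv, lam / curv)
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the cached scores consistent with the new weight.
                for i in range(n):
                    scores[i] += delta * X[i][j]
                w[j] = new_wj
    return w
```

Features whose weights end at exactly zero are dropped from the model, which is how a sparse fit doubles as a gene selection step.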

2.
G-PRIMER, a web-based primer design program, has been developed to compute a minimal primer set specifically annealed to all the open reading frames in a given microbial genome. This program has been successfully used in the microarray experiment for analyzing the expression of genes in the Xanthomonas campestris genome. AVAILABILITY: It is available at http://mammoth.bii.a-star.edu.sg/gprimer/. Its source code is available upon request.

3.
SUMMARY: Eukaryotes have both 'intron containing' and 'intron less' genes. Several databases are available for 'intron containing' genes in eukaryotes. In this note, we describe a database for 'intron less' genes from eukaryotes. 'Intron less' eukaryotic genes having prokaryotic architecture will help to understand gene evolution in a much simpler way unlike 'intron containing' genes. AVAILABILITY: SEGE is available at http://intron.bic.nus.edu.sg/seg/ CONTACT: mmeena@ntu.edu.sg

4.
MOTIVATION: One problem with discriminant analysis of DNA microarray data is that each sample is represented by quite a large number of genes, and many of them are irrelevant, insignificant or redundant to the discriminant problem at hand. Methods for selecting important genes are, therefore, of much significance in microarray data analysis. In the present study, a new criterion, called the LS Bound measure, is proposed to address the gene selection problem. The LS Bound measure is derived from the leave-one-out procedure of LS-SVMs (least squares support vector machines), and as the upper bound for leave-one-out classification results it reflects to some extent the generalization performance of gene subsets. RESULTS: We applied this LS Bound measure for gene selection on two benchmark microarray datasets: colon cancer and leukemia. We also compared the LS Bound measure with other evaluation criteria, including the well-known Fisher's ratio and Mahalanobis class separability measure, and other published gene selection algorithms, including Weighting factor and SVM Recursive Feature Elimination. The strength of the LS Bound measure is that it provides gene subsets leading to more accurate classification results than the filter method while its computational complexity is at the level of the filter method. AVAILABILITY: A companion website can be accessed at http://www.ntu.edu.sg/home5/pg02776030/lsbound/. The website contains: (1) the source code of the gene selection algorithm; (2) the complete set of tables and figures regarding the experimental study; (3) proof of the inequality (9). CONTACT: ekzmao@ntu.edu.sg.
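For orientation, here is a small sketch of a filter-style ranking using Fisher's ratio, one of the baseline criteria named in the abstract. It does not implement the LS Bound measure. The two-class form of the ratio used here (squared mean difference over summed sample variances) and the helper names are assumptions for illustration.

```python
from statistics import mean, variance


def fisher_ratio(a, b):
    """Two-class Fisher's ratio for one gene: (mean_a - mean_b)^2 / (var_a + var_b)."""
    denom = variance(a) + variance(b)
    if denom == 0.0:
        # No within-class spread: perfectly separated or identical classes.
        return float("inf") if mean(a) != mean(b) else 0.0
    return (mean(a) - mean(b)) ** 2 / denom


def rank_genes(samples, labels):
    """Hypothetical helper: return gene indices sorted from most to least discriminative.

    samples is a list of expression rows (one row per sample),
    labels holds the class of each sample as 0 or 1.
    """
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        class0 = [row[g] for row, lab in zip(samples, labels) if lab == 0]
        class1 = [row[g] for row, lab in zip(samples, labels) if lab == 1]
        scores.append(fisher_ratio(class0, class1))
    return sorted(range(n_genes), key=lambda g: -scores[g])
```

A filter of this kind scores each gene independently, which is why it is cheap but can miss genes that are only informative in combination.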

5.
SUMMARY: A high throughput Basic Local Alignment Search Tool (BLAST) system based on Web services is implemented. It provides an alternative BLAST service and allows users to perform multiple BLAST queries at one run in a distributed, parallel environment through the Internet. AVAILABILITY: It is available at http://mammoth.bii.a-star.edu.sg/webservices/htblast/index.html and at http://www.bii.a-star.edu.sg/jiren/download.html

6.
SUMMARY: DNAFSMiner (DNA Functional Sites Miner) is a web-based software toolbox to recognize functional sites in nucleic acid sequences. Currently in this toolbox, we provide two programs: TIS Miner and Poly(A) Signal Miner. The TIS Miner can be used to predict translation initiation sites in vertebrate DNA/mRNA/cDNA sequences, and the Poly(A) Signal Miner can be used to predict polyadenylation [poly(A)] signals in human DNA sequences. The prediction results are better than those by literature methods on two benchmark applications. This good performance is mainly attributable to our unique learning method. DNAFSMiner is available free of charge for academic and non-profit organizations. AVAILABILITY: http://research.i2r.a-star.edu.sg/DNAFSMiner/ CONTACT: huiqing@i2r.a-star.edu.sg.

7.
SUMMARY: Data processing, analysis and visualization (datPAV) is an exploratory tool that allows experimentalists to quickly assess the general characteristics of the data. This platform-independent software is designed as a generic tool to process and visualize data matrices. This tool explores organization of the data, detects errors and supports basic statistical analyses. Processed data can be reused whereby different step-by-step data processing/analysis workflows can be created to carry out detailed investigation. The visualization option provides publication-ready graphics. Applications of this tool are demonstrated at the web site for three cases of metabolomics, environmental and hydrodynamic data analysis. AVAILABILITY: datPAV is available free for academic use at http://www.sdwa.nus.edu.sg/datPAV/.

8.
MOTIVATION: Analysis of gene expression data can provide insights into the time-lagged co-regulation of genes/gene clusters. However, existing methods such as the Event Method and the Edge Detection Method are inefficient as they compare only two genes at a time. More importantly, they neglect some important information due to their scoring criteria. In this paper, we propose an efficient algorithm to identify time-lagged co-regulated gene clusters. The algorithm facilitates localized comparison and processes several genes simultaneously to generate detailed and complete time-lagged information for genes/gene clusters. RESULTS: We experimented with the time-series Yeast gene dataset and compared our algorithm with the Event Method. Our results show that our algorithm is not only efficient, but also delivers more reliable and detailed information on time-lagged co-regulation between genes/gene clusters. AVAILABILITY: The software is available upon request. CONTACT: jiliping@comp.nus.edu.sg SUPPLEMENTARY INFORMATION: Supplementary tables and figures for this paper can be found at http://www.comp.nus.edu.sg/~jiliping/p2.htm.

9.
ExInt: an Exon Intron Database   Total citations: 5, self-citations: 0, citations by others: 5
The Exon/Intron Database (ExInt) stores information of all GenBank eukaryotic entries containing an annotated intron sequence. Data are available through a retrieval system, as flat-files and as a MySQL dump file. In this report we discuss several implementations added to ExInt, which is accessible at http://intron.bic.nus.edu.sg/exint/newexint/exint.html.

10.
SUMMARY: The relationship between intron distribution in the eukaryotic gene and protein structural elements is essential for understanding the origin and evolution of genes. XdomView is a web-based viewer mapping protein structural domains and intron positions in eukaryotic homologues to its tertiary structure. The association of sequence signals to 3D structure in XdomView provides a valuable visualization environment for eukaryotic gene organization, gene evolution, protein folding and protein structure classification. AVAILABILITY: Freely available from http://surya.bic.nus.edu.sg/xdom.

11.
SUMMARY: Dragon Promoter Mapper (DPM) is a tool to model promoter structure of co-regulated genes using the methodology of Bayesian networks. DPM exploits an exhaustive set of motif features (such as motif, its strand, the order of motif occurrence and mutual distance between the adjacent motifs) and generates models from the target promoter sequences, which may be used (1) to detect regions in a genomic sequence which are similar to the target promoters or (2) to classify other promoters as similar or not to the target promoter group. DPM can also be used for modelling of enhancers and silencers. AVAILABILITY: http://defiant.i2r.a-star.edu.sg/projects/BayesPromoter/ CONTACT: vlad@sanbi.ac.za SUPPLEMENTARY INFORMATION: Manual for using DPM web server is provided at http://defiant.i2r.a-star.edu.sg/projects/BayesPromoter/html/manual/manual.htm.

12.
A modification to Phred and a program to detect heterogeneous positions are presented; the package is particularly useful in the identification of mutations and other abnormalities in Phred/Phrap genome assemblies. AVAILABILITY: The package is made available at http://glscompute.gis.a-star.edu.sg/~charlie/DHetero.html

13.
MOTIVATION: Feature selection approaches, such as filter and wrapper, have been applied to address the gene selection problem in the literature of microarray data analysis. In wrapper methods, the classification error is usually used as the evaluation criterion of feature subsets. Due to the nature of high dimensionality and small sample size of microarray data, however, counting-based error estimation may not necessarily be an ideal criterion for the gene selection problem. RESULTS: Our study reveals that evaluating genes in terms of counting-based error estimators such as resubstitution error, leave-one-out error, cross-validation error and bootstrap error may encounter a severe ties problem, i.e. two or more gene subsets score equally, and this in turn results in uncertainty in gene selection. Our analysis finds that the ties problem is caused by the discrete nature of counting-based error estimators and could be avoided by using continuous evaluation criteria instead. Experimental results show that continuous evaluation criteria such as the generalised |w|^2 measure for support vector machines and the modified Relief's measure for k-nearest neighbors produce improved gene selection compared with counting-based error estimators. AVAILABILITY: The companion website is at http://www.ntu.edu.sg/home5/pg02776030/wrappers/ The website contains (1) the source code of all the gene selection algorithms and (2) the complete set of tables and figures of experiments.
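The ties problem can be illustrated with a toy example. The sketch below compares a counting-based resubstitution error with a continuous score on two made-up genes. The threshold classifier, the `mean_margin` score and the data are all hypothetical stand-ins; they are not the SVM-based or Relief-based measures used in the paper.

```python
def counting_error(values, labels, threshold):
    """Fraction of samples misclassified by the rule 'predict class 1 if value > threshold'."""
    wrong = sum((v > threshold) != bool(lab) for v, lab in zip(values, labels))
    return wrong / len(values)


def mean_margin(values, labels, threshold):
    """Continuous score: average signed distance from the threshold (larger is better)."""
    signed = [(v - threshold) * (1 if lab else -1) for v, lab in zip(values, labels)]
    return sum(signed) / len(signed)


labels = [0, 0, 0, 1, 1, 1]
gene_a = [0.9, 1.0, 1.1, 2.9, 3.0, 3.1]    # classes far apart
gene_b = [1.8, 1.9, 1.95, 2.05, 2.1, 2.2]  # classes barely separated

# Both genes classify every sample correctly, so the counting-based errors tie.
tie = counting_error(gene_a, labels, 2.0) == counting_error(gene_b, labels, 2.0)

# The continuous score still separates them and prefers gene_a.
prefers_a = mean_margin(gene_a, labels, 2.0) > mean_margin(gene_b, labels, 2.0)
```

With n samples a counting-based error can take only n + 1 distinct values, so ties between gene subsets become likely when n is small and the number of candidate subsets is large.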

14.
Methods are presented for detecting differential expression using statistical hypothesis testing methods including analysis of variance (ANOVA). Practicalities of experimental design, power, and sample size are discussed. Methods for multiple testing correction and their application are described. Instructions for running typical analyses are given in the R programming environment. R code and the sample data set used to generate the examples are available at http://microarray.cpmc.columbia.edu/pavlidis/pub/aovmethods/.
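As an illustration of multiple testing correction, the sketch below computes Benjamini-Hochberg adjusted p-values. The abstract does not say which correction methods the chapter covers, so the choice of Benjamini-Hochberg is an assumption, and the code is written in Python rather than the R used by the authors.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Each p-value is scaled by m / rank (rank 1 = smallest), then a running
    minimum from the largest rank down enforces monotonicity.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Genes whose adjusted p-value falls below a chosen level, for example 0.05, would then be reported as differentially expressed at that false discovery rate.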

15.
BLAST++ is a tool that is integrated with NCBI BLAST, allowing multiple, say K, queries to be searched against a database concurrently. The results obtained by BLAST++ are identical to those obtained by executing BLAST on each of the K queries, but BLAST++ completes the processing in a much shorter time. AVAILABILITY: http://xena1.ddns.comp.nus.edu.sg/~genesis/blast++ Supplementary information: http://xena1.ddns.comp.nus.edu.sg/~genesis/blast++

16.
MOTIVATION: A serious limitation in microarray analysis is the unreliability of the data generated from low signal intensities. Such data may produce erroneous gene expression ratios and cause unnecessary validation or post-analysis follow-up tasks. Therefore, the elimination of unreliable signal intensities will enhance reproducibility and reliability of gene expression ratios produced from microarray data. In this study, we applied fuzzy c-means (FCM) and normal mixture modeling (NMM) based classification methods to separate microarray data into reliable and unreliable signal intensity populations. RESULTS: We compared the results of FCM classification with those of classification based on NMM. Both approaches were validated against reference sets of biological data consisting of only true positives and true negatives. We observed that both methods performed equally well in terms of sensitivity and specificity. Although a comparison of the computation times indicated that the fuzzy approach is computationally more efficient, other considerations support the use of NMM for the reliability analysis of microarray data. AVAILABILITY: The classification approaches described in this paper and sample microarray data are available as Matlab(TM) (The MathWorks Inc., Natick, MA) programs (mfiles) and text files, respectively, at http://rc.kfshrc.edu.sa/bssc/staff/MusaAsyali/Downloads.asp. The programs can be run/tested on many different computer platforms where Matlab is available. CONTACT: asyali@kfshrc.edu.sa.
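A rough sketch of fuzzy c-means on one-dimensional intensities is given below, to show how a soft split into two populations can be computed. It uses the standard FCM membership and centre updates, but the function name, the evenly spread initial centres, the fixed iteration count and the example intensities are assumptions, and it is written in Python rather than the authors' Matlab code.

```python
def fuzzy_c_means_1d(values, n_clusters=2, fuzziness=2.0, iterations=100):
    """Hypothetical helper: cluster scalar values with fuzzy c-means.

    Returns (centers, memberships), where memberships[i][k] is the degree
    to which values[i] belongs to cluster k; each row sums to 1.
    """
    lo, hi = min(values), max(values)
    # Assumed initialisation: centres spread evenly across the data range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    exponent = 2.0 / (fuzziness - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for v in values:
            dists = [abs(v - c) for c in centers]
            if 0.0 in dists:
                # The point sits exactly on a centre: give it full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((dk / dj) ** exponent for dj in dists) for dk in dists]
            memberships.append(row)
        # Each centre becomes the membership-weighted mean of the data.
        centers = [
            sum(u[k] ** fuzziness * v for u, v in zip(memberships, values))
            / sum(u[k] ** fuzziness for u in memberships)
            for k in range(n_clusters)
        ]
    return centers, memberships


# Made-up signal intensities: three low (unreliable) and three high (reliable) spots.
intensities = [0.1, 0.2, 0.15, 5.0, 5.2, 4.9]
centers, memberships = fuzzy_c_means_1d(intensities)
```

A spot could then be flagged as unreliable when its membership in the low-intensity cluster exceeds some cutoff; the choice of cutoff is left open here.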

17.
18.
Alternative splicing of mRNA allows many gene products with different functions to be produced from a single coding sequence. Exon skipping is the most commonly known alternative splicing mechanism. A comprehensive database of alternative splicing by exon skipping is made available for the human genome data. 1,229 human genes are identified to exhibit alternative splicing by exon skipping. Availability: http://sege.ntu.edu.sg/wester/ashes/.

19.
MOTIVATION: We present a study of antigen expression signals from a newly developed high-throughput protein microarray technique. These signals are a measure of antibody-antigen binding activity and provide a basis for understanding humoral immune responses to various infectious agents and supporting vaccine and diagnostic development. RESULTS: We investigate the characteristics of these expression profiles and show that noise models, normalization, variance estimation and differential expression analysis techniques developed in the context of DNA microarray analysis can be adapted and applied to these protein arrays. Using a high-dimensional dataset containing measurements of expression profiles of antibody reactivity against each protein (295 antigens and 9 controls) in 42 malaria (Plasmodium falciparum) protein arrays derived from 22 donors with various clinical presentations of malaria, we present a methodology for the analysis and identification of significantly expressed antigens targeted by immune responses for individual sera, groups of sera and across stages of infection. We also conduct a short study highlighting the top immunoreactive antigens where we identify three novel high priority antigens for future evaluation. AVAILABILITY: All software programs (in R) used for the analysis described in this paper are freely available for academic purposes at www.igb.uci.edu/servers/servers.html.

20.
MPID-T     
