Similar articles
 20 similar articles found; search time 15 ms
1.
Ding M, Rosner GL, Müller P. Biometrics 2008, 64(3): 886-894
Summary. Most phase II screening designs available in the literature consider one treatment at a time. Each study is considered in isolation. We propose a more systematic decision-making approach to the phase II screening process. The sequential design allows for more efficiency and greater learning about treatments. The approach incorporates a Bayesian hierarchical model that allows combining information across several related studies in a formal way and improves estimation in small data sets by borrowing strength from other treatments. The design incorporates a utility function that includes sampling costs and possible future payoff. Computer simulations show that this method has a high probability of discarding treatments with low success rates and moving treatments with high success rates to phase III trials.

2.
Structural variation (SV) has been reported to be associated with numerous diseases such as cancer. With the advent of next-generation sequencing (NGS) technologies, various types of SV can potentially be identified. We propose a model-based clustering approach utilizing a set of features defined for each type of SV event. Our method, termed SVMiner, not only provides a probability score for each candidate, but also predicts the heterozygosity of genomic deletions. Extensive experiments on genome-wide deep sequencing data have demonstrated that SVMiner is robust against the variability of a single cluster feature, and that it significantly outperforms several commonly used SV detection programs. SVMiner can be downloaded from http://cbc.case.edu/svminer/.

3.
4.
SUMMARY: A web-based application for analyzing protein amino acid conservation, Consensus Sequence (ConSSeq), is presented. ConSSeq graphically represents information about amino acid conservation based on sequence alignments reported in homology-derived structures of proteins. Beyond the relative entropy for each position in the alignment, ConSSeq also presents the consensus sequence and information about the amino acids that are predominant at each position of the alignment. ConSSeq is part of the STING Millennium Suite and is implemented as a Java Applet. AVAILABILITY: http://sms.cbi.cnptia.embrapa.br/SMS/STINGm/consseq/, http://trantor.bioc.columbia.edu/SMS/STINGm/consseq/, http://mirrors.rcsb.org//SMS/STINGm/consseq/, http://www.es.embnet.org/SMS/STINGm/consseq/ and http://www.ar.embnet.org/SMS/STINGm/consseq/
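
As a rough illustration of the per-position relative entropy mentioned in the abstract above, the following sketch scores each alignment column against a background distribution. The function names, the uniform background, and the decision to skip gap characters are assumptions made for this example; they are not taken from ConSSeq itself.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_relative_entropy(column, background=None):
    """Relative entropy (in bits) of one alignment column versus a background.

    Assumption: characters outside the 20 standard residues (gaps, X, ...) are
    ignored, and the background is uniform unless one is supplied.
    """
    residues = [r for r in column.upper() if r in AMINO_ACIDS]
    if not residues:
        return 0.0
    if background is None:
        background = {a: 1.0 / len(AMINO_ACIDS) for a in AMINO_ACIDS}
    counts = Counter(residues)
    total = len(residues)
    # Sum p * log2(p / q) over the residues observed in this column.
    return sum(
        (n / total) * math.log2((n / total) / background[a])
        for a, n in counts.items()
    )


def alignment_relative_entropy(sequences):
    """Score every column of an alignment given as equal-length strings."""
    return [column_relative_entropy("".join(col)) for col in zip(*sequences)]
```

With a uniform background, a fully conserved column receives the maximum score and a column containing all 20 residues equally often scores zero, so higher values indicate stronger conservation.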

5.
In many phase II clinical trials, it is essential to assess both efficacy and safety. Although several phase II designs that accommodate multiple outcomes have been proposed recently, none are derived using decision theory. This paper describes a Bayesian decision theoretic strategy for constructing phase II designs based on both efficacy and adverse events. The gain function includes utilities assigned to patient outcomes, a reward for declaring the new treatment promising, and costs associated with the conduct of the phase II trial and future phase III testing. A method for eliciting gain function parameters from medical collaborators and for evaluating the design's frequentist operating characteristics is described. The strategy is illustrated by application to a clinical trial of peripheral blood stem cell transplantation for multiple myeloma.

6.
How to characterize short protein sequences so as to connect them effectively to their functions is an unsolved problem. Here we propose to map the physicochemical properties of each amino acid onto unit spheres so that each protein sequence can be represented quantitatively. We demonstrate the usefulness of this representation by applying it to the prediction of cell-penetrating peptides. We show that its combination with traditional composition features yields the best performance across different datasets among the several methods compared. For the convenience of users, a web server has been established for automatic calculation of the proposed features at http://biophy.dzu.edu.cn/SNumD/.

7.
SUMMARY: Differential gene expression detection using microarrays has received considerable research interest recently. Many methods have been proposed, including variants of F-statistics, non-parametric approaches and empirical Bayesian methods. The SAM statistic has been shown to perform well in empirical studies. SAM is essentially an ad hoc shrinkage method. The idea is that for small-sample microarray data, it is often useful to pool information across genes to improve efficiency. Under a Bayesian framework, Smyth formally derived test statistics with shrinkage using hierarchical models. In this paper we cast differential gene expression detection in the familiar framework of the linear regression model. Commonly used test statistics correspond to using least squares to estimate the regression parameters. Based on the vast literature on linear models, we can naturally consider other alternatives. Here we explore penalized linear regression. We propose penalized t-/F-statistics for two-class microarray data based on the [Formula: see text] penalty. We show that the penalized test statistics make intuitive sense, and through applications we illustrate their good performance. AVAILABILITY: Supplementary information including program codes, more detailed analysis results and R functions for the proposed methods can be found at http://www.biostat.umn.edu/~baolin/research CONTACT: baolin@biostat.umn.edu SUPPLEMENTARY INFORMATION: http://www.biostat.umn.edu/~baolin/research.
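
The shrinkage idea that the abstract above attributes to SAM can be sketched for a single gene as an ordinary two-sample t-statistic with a small constant added to the denominator. This is only a SAM-style illustration: the function names and the constant `s0` are assumptions, and the authors' penalized statistics depend on a penalty whose form is elided in the abstract ("[Formula: see text]") and is not reproduced here.

```python
import math
from statistics import mean


def pooled_standard_error(x, y):
    """Pooled standard error of the difference in means for two classes."""
    n1, n2 = len(x), len(y)
    m1, m2 = mean(x), mean(y)
    ss = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
    return math.sqrt((1.0 / n1 + 1.0 / n2) * ss / (n1 + n2 - 2))


def shrunken_t(x, y, s0=0.0):
    """SAM-style statistic: mean difference over (standard error + s0).

    s0 is a hypothetical fudge constant; s0 = 0 gives the ordinary
    equal-variance two-sample t-statistic.
    """
    return (mean(x) - mean(y)) / (pooled_standard_error(x, y) + s0)
```

A positive `s0` pulls the statistic toward zero, most strongly for genes whose estimated standard error is very small, which is the situation where small-sample t-statistics tend to be unstable.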

8.
Englert S, Kieser M. Biometrics 2012, 68(3): 886-892
Summary. Phase II trials in oncology are usually conducted as single-arm two-stage designs with binary endpoints. Currently available adaptive design methods are tailored to comparative studies with continuous test statistics. Direct transfer of these methods to discrete test statistics results in conservative procedures and, therefore, in a loss of power. We propose a method based on the conditional error function principle that directly accounts for the discreteness of the outcome. It is shown how the method can be used to construct new phase II designs that are more efficient than currently applied designs and that allow flexible mid-course design modifications. The proposed method is illustrated with a variety of frequently used phase II designs.
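
To make the setting of the abstract above concrete, the sketch below computes exact operating characteristics of a conventional single-arm two-stage design with a binary endpoint and a futility stop after stage 1. It illustrates the kind of design the abstract takes as its starting point, not the authors' conditional error function method; the function names and the parameterization (`n1`, `r1`, `n2`, `r`) are assumptions chosen for this example.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k responses among n patients with response rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_stage_operating_characteristics(p, n1, r1, n2, r):
    """Exact properties of a two-stage single-arm design.

    Assumed rules: stop for futility if stage-1 responses <= r1; otherwise
    enroll n2 more patients and declare the treatment promising if the total
    number of responses exceeds r.
    Returns (probability of early termination, probability of declaring promising).
    """
    early_stop = sum(binom_pmf(x1, n1, p) for x1 in range(r1 + 1))
    promising = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        needed = max(r + 1 - x1, 0)  # stage-2 responses still required
        tail = sum(binom_pmf(x2, n2, p) for x2 in range(needed, n2 + 1))
        promising += binom_pmf(x1, n1, p) * tail
    return early_stop, promising
```

Evaluating the function at the null response rate gives the type I error rate, and evaluating it at the alternative gives the power. Because the test statistic is discrete, the attained type I error typically falls below the nominal level, which is the conservativeness the abstract refers to.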

9.
TileMap: create chromosomal map of tiling array hybridizations   Cited by: 12 (self-citations: 0; citations by others: 12)

10.
Protein attribute prediction from primary sequences is an important task, and how to extract discriminative features is one of its most crucial aspects. Because a single-view feature cannot reflect all the information of a protein, fusing multi-view features is considered a promising route to improving prediction accuracy. In this paper, we propose a novel framework for protein multi-view feature fusion: first, features from different views are combined in parallel to form complex feature vectors; then, we extend classic principal component analysis to generalized principal component analysis for further feature extraction from the parallel-combined complex features, which lie in a complex space. Finally, the extracted features are used for prediction. Experimental results on different benchmark datasets and machine learning algorithms demonstrate that the parallel strategy outperforms the traditional serial approach and is particularly helpful for extracting the core information buried among multi-view feature sets. A web server for protein structural class prediction based on the proposed method (COMSPA) is freely available for academic use at: http://www.csbio.sjtu.edu.cn/bioinf/COMSPA/.

11.
Functional annotation from predicted protein interaction networks   Cited by: 1 (self-citations: 0; citations by others: 1)
MOTIVATION: Progress in large-scale experimental determination of protein-protein interaction networks for several organisms has resulted in innovative methods of functional inference based on network connectivity. However, the amount of effort and resources required for the elucidation of experimental protein interaction networks is prohibitive. Previously we, and others, have developed techniques to predict protein interactions for novel genomes using computational methods and data generated from other genomes. RESULTS: We evaluated the performance of a network-based functional annotation method that makes use of our predicted protein interaction networks. We show that this approach performs equally well on experimentally derived and predicted interaction networks, for both manually and computationally assigned annotations. We applied the method to predicted protein interaction networks for over 50 organisms from all domains of life, providing annotations for many previously unannotated proteins and verifying existing low-confidence annotations. AVAILABILITY: Functional predictions for over 50 organisms are available at http://bioverse.compbio.washington.edu and datasets used for analysis at http://data.compbio.washington.edu/misc/downloads/nannotation_data/. SUPPLEMENTARY INFORMATION: A supplemental appendix gives additional details not in the main text (http://data.compbio.washington.edu/misc/downloads/nannotation_data/supplement.pdf).

12.
A Bayesian design is proposed for randomized phase II clinical trials that screen multiple experimental treatments compared to an active control based on ordinal categorical toxicity and response. The underlying model and design account for patient heterogeneity characterized by ordered prognostic subgroups. All decision criteria are subgroup specific, including interim rules for dropping unsafe or ineffective treatments, and criteria for selecting optimal treatments at the end of the trial. The design requires an elicited utility function of the two outcomes that varies with the subgroups. Final treatment selections are based on posterior mean utilities. The methodology is illustrated by a trial of targeted agents for metastatic renal cancer, which motivated the design methodology. In the context of this application, the design is evaluated by computer simulation, including comparison to three designs that conduct separate trials within subgroups, or conduct one trial while ignoring subgroups, or base treatment selection on estimated response rates while ignoring toxicity.

13.
MOTIVATION: We propose a general method for deriving amino acid substitution matrices from low-resolution force fields. Unlike current popular methods, the approach does not rely on evolutionary arguments or alignment of sequences or structures. Instead, residues are computationally mutated and their contribution to the total energy/score is collected. The average of these values over each position within a set of proteins results in a substitution matrix. RESULTS: Example substitution matrices have been calculated from force fields based on different philosophies and their performance compared with conventional substitution matrices. Although this can produce useful substitution matrices, the methodology highlights the virtues, deficiencies and biases of the source force fields. It also allows a rather direct comparison of sequence alignment methods with the score functions underlying protein sequence-to-structure threading. AVAILABILITY: Example substitution matrices are available from http://www.rsc.anu.edu.au/~zsuzsa/suppl/matrices.html. SUPPLEMENTARY INFORMATION: The list of proteins used for data collection and the optimized parameters for the alignment are given as supplementary material at http://www.rsc.anu.edu.au/~zsuzsa/suppl/matrices.html.

14.
Seamlessly expanding a randomized phase II trial to phase III   Cited by: 1 (self-citations: 0; citations by others: 1)
Inoue LY, Thall PF, Berry DA. Biometrics 2002, 58(4): 823-831
A sequential Bayesian phase II/III design is proposed for comparative clinical trials. The design is based on both survival time and discrete early events that may be related to survival and assumes a parametric mixture model. Phase II involves a small number of centers. Patients are randomized between treatments throughout, and sequential decisions are based on predictive probabilities of concluding superiority of the experimental treatment. Whether to stop early, continue, or shift into phase III is assessed repeatedly in phase II. Phase III begins when additional institutions are incorporated into the ongoing phase II trial. Simulation studies in the context of a non-small-cell lung cancer trial indicate that the proposed method maintains overall size and power while usually requiring substantially smaller sample size and shorter trial duration when compared with conventional group-sequential phase III designs.

15.
Nucleic acids are particularly amenable to structural characterization using chemical and enzymatic probes. Each individual structure mapping experiment reveals specific information about the structure and/or dynamics of the nucleic acid. Currently, there is no simple approach for making these data publicly available in a standardized format. We therefore developed a standard for reporting the results of single nucleotide resolution nucleic acid structure mapping experiments, or SNRNASMs. We propose a schema for sharing nucleic acid chemical probing data that uses generic public servers for storing, retrieving, and searching the data. We have also developed a consistent nomenclature (ontology) within the Ontology of Biomedical Investigations (OBI), which provides unique identifiers (termed persistent URLs, or PURLs) for classifying the data. Links to standardized data sets shared using our proposed format, along with a tutorial and links to templates, can be found at http://snrnasm.bio.unc.edu.

16.
Although there are several new designs for phase I cancer clinical trials, including the continual reassessment method and the accelerated titration design, the traditional algorithm-based designs, like the '3 + 3' design, are still widely used because of their practical simplicity. In this paper, we study some key statistical properties of the traditional algorithm-based designs in a general framework and derive the exact formulae for the corresponding statistical quantities. These quantities are important for the investigator to gain insights regarding the design of the trial, and are (i) the probability of a dose being chosen as the maximum tolerated dose (MTD); (ii) the expected number of patients treated at each dose level; (iii) the target toxicity level (i.e. the expected dose-limiting toxicity (DLT) incidences at the MTD); (iv) the expected DLT incidences at each dose level; and (v) the expected overall DLT incidences in the trial. Real examples of clinical trials are given, and a computer program to do the calculation can be found at the authors' website, http://www2.umdnj.edu/~linyo.
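
Quantities (i) and (ii) from the abstract above can be sketched for one common variant of the '3 + 3' design. The rules assumed here are: treat three patients; escalate on 0 DLTs; on exactly 1 DLT treat three more and escalate only if none of them has a DLT; otherwise stop and take the previous dose as the MTD, with no dose de-escalation. The function names and this particular rule set are assumptions for illustration; the paper's general framework and exact formulae may differ.

```python
def escalation_probability(p):
    """Probability of escalating past a dose whose DLT probability is p.

    Escalation happens with 0 DLTs in 3 patients, or with exactly 1 DLT in 3
    followed by 0 DLTs in 3 additional patients.
    """
    q = 1.0 - p
    return q**3 + 3 * p * q**2 * q**3


def three_plus_three(tox_probs):
    """Exact MTD distribution and expected patient counts (assumed variant).

    tox_probs[j] is the DLT probability at dose level j + 1.
    Returns (mtd_probs, expected_n), where mtd_probs[i] is the probability
    that dose level i is declared the MTD (i = 0 means even the lowest dose
    is too toxic; i = len(tox_probs) means every escalation step was passed),
    and expected_n[j] is the expected number of patients at dose level j + 1.
    """
    reach = 1.0  # probability that the trial reaches the current dose
    mtd_probs, expected_n = [], []
    for p in tox_probs:
        one_dlt = 3 * p * (1 - p) ** 2  # exactly 1 DLT in the first cohort
        expected_n.append(reach * (3 + 3 * one_dlt))
        esc = escalation_probability(p)
        mtd_probs.append(reach * (1 - esc))  # stop here: MTD is the dose below
        reach *= esc
    mtd_probs.append(reach)
    return mtd_probs, expected_n
```

Under these assumed rules the entries of `mtd_probs` sum to one, and the expected total sample size is the sum of `expected_n`.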

17.
SUMMARY: The Kinase Sequence Database (KSD), located at http://kinase.ucsf.edu/ksd, contains information on 290 protein kinase families derived by profile-based clustering of the non-redundant list of sequences obtained from a GenBank-wide search. Included in the database are a total of 5,041 protein kinases from over 100 organisms. Clustering into families is based on the extent of homology within the kinase catalytic domain (250-300 residues in length). Alignments of the families are viewed by interactive Excel-based sequence spreadsheets. In addition, KSD features evolutionary trees derived for each family and detailed information on each sequence as well as links to the corresponding GenBank entries. Sequence manipulation tools, such as evolutionary tree generation, novel sequence assignment, and statistical analysis, are also provided. AVAILABILITY: The Kinase Sequence Database is a web-based service accessible at http://kinase.ucsf.edu/ksd CONTACT: buzko@cmp.ucsf.edu; shokat@cmp.ucsf.edu

18.
19.
SUMMARY: We present SynView, a simple and generic approach to dynamically visualizing multi-species comparative genome data. It is a lightweight application based on the popular and configurable web-based GBrowse framework. It can be used with a variety of databases and provides the user with a high degree of interactivity. The tool is written in Perl and runs on top of the GBrowse framework. It is in use in the PlasmoDB (http://www.PlasmoDB.org) and CryptoDB (http://www.CryptoDB.org) projects and can be easily integrated into other cross-species comparative genome projects. AVAILABILITY: The program and instructions are freely available at http://www.ApiDB.org/apps/SynView/ CONTACT: jkissing@uga.edu.

20.
Background

Genomic islands (GIs) are clusters of alien genes that are present in some bacterial genomes but are not seen in the genomes of other strains within the same genus. The detection of GIs is extremely important to the medical and environmental communities. Despite the discovery of GI-associated features, accurate detection of GIs is still far from satisfactory.

Results

In this paper, we combined multiple GI-associated features, and applied and compared various machine learning approaches to evaluate the classification accuracy on GI datasets from three genera (Salmonella, Staphylococcus, and Streptococcus) and on a mixed dataset of all three genera. The experimental results show that, in general, the decision tree approach outperformed the other machine learning methods according to five performance evaluation metrics. Using J48 decision trees as base classifiers, we further applied four ensemble algorithms (AdaBoost, bagging, MultiBoost and random forest) to the same datasets. We found that, overall, these ensemble classifiers could improve classification accuracy.

Conclusions

We conclude that decision-tree-based ensemble algorithms can accurately classify GIs and non-GIs, and we recommend the use of these methods for future GI data analysis. The software package for detecting GIs can be accessed at http://www.esu.edu/cpsc/che_lab/software/GIDetector/.

