Similar literature
Found 20 similar articles (search time: 15 ms)
1.
A DSRPCL-SVM approach to informative gene analysis (total citations: 1; self-citations: 0; citations by others: 1)
Microarray-based tumor diagnosis is an active topic in bioinformatics. One of the key problems is the discovery and analysis of the informative genes of a tumor. Although there are many elaborate approaches to this problem, it is still difficult to select a reasonable set of informative genes for tumor diagnosis from microarray data alone. In this paper, we classify the genes expressed in microarray data into a number of clusters via the distance sensitive rival penalized competitive learning (DSRPCL) algorithm and then detect the informative gene cluster or set with the help of a support vector machine (SVM). Moreover, the critical or powerful informative genes can be found through further classification and detection on the obtained informative gene clusters. Experiments on the colon, leukemia, and breast cancer datasets demonstrate that our proposed DSRPCL-SVM approach leads to a reasonable selection of informative genes for tumor diagnosis.

2.
Despite benefits for precision, ecologists rarely use informative priors. One reason that ecologists may prefer vague priors is the perception that informative priors reduce accuracy. To date, no ecological study has empirically evaluated the effects of data-derived informative priors on precision and accuracy. To determine the impacts of priors, we evaluated mortality models for tree species using data from a forest dynamics plot in Thailand. Half the models used vague priors, and the remaining half had informative priors. We found precision was greater when using informative priors, but effects on accuracy were more variable. In some cases, prior information improved accuracy, while in others, it reduced accuracy. On average, models with informative priors were no more or less accurate than models without. Our analyses provide a detailed case study on the simultaneous effect of prior information on precision and accuracy and demonstrate that when priors are specified appropriately, they lead to greater precision without systematically reducing model accuracy.

3.
Bayesian inference of mixed models in quantitative genetics of crop species (total citations: 1; self-citations: 0; citations by others: 1)
The objectives of this study were to implement a Bayesian framework for mixed models analysis in crop species breeding and to explore alternatives for informative prior elicitation. Bayesian inference for genetic evaluation in annual crop breeding was illustrated with the first two half-sib selection cycles in a popcorn population. The Bayesian framework was based on the Just Another Gibbs Sampler software and the R2jags package. For the first cycle, a non-informative prior for the inverse of the variance components and an informative prior based on meta-analysis were used. For the second cycle, a non-informative prior and an informative prior defined as the posterior from the non-informative and informative analyses of the first cycle were used. Regarding the first cycle, the use of an informative prior from the meta-analysis provided clearly distinct results relative to the analysis with a non-informative prior only for the grain yield. Regarding the second cycle, the results for the expansion volume and grain yield showed differences among the three analyses. The differences between the non-informative and informative prior analyses were restricted to variance components and heritability. The correlations between the predicted breeding values from these analyses were almost perfect.

4.
5.

Longitudinal studies with binary outcomes characterized by informative right censoring are commonly encountered in clinical, basic, behavioral, and health sciences. Approaches developed to analyze data with binary outcomes were mainly tailored to clustered or longitudinal data that are missing completely at random or missing at random. Studies that focused on informative right censoring with binary outcomes are characterized by their embedded computational complexity and difficulty of implementation. Here we present a new maximum likelihood-based approach with repeated binary measures modeled in a generalized linear mixed model as a function of time and other covariates. The longitudinal binary outcome and the censoring process, determined by the number of times a subject is observed, share latent random variables (random intercept and slope), so that these subject-specific random effects are common to both models. A simulation study and sensitivity analysis were conducted to test the model under different assumptions and censoring settings. Our results showed that the estimates generated under this model were accurate when censoring was fully informative or partially informative with dependence on the slopes. The model was successfully applied to a cohort of renal transplant patients, with blood urea nitrogen as a binary outcome measured over time to indicate normal and abnormal kidney function until the onset of graft rejection, which resulted in informative right censoring. In addition to its novelty and accuracy, a key advantage of the proposed model is that it can be implemented with available analytical tools and applied to any other longitudinal dataset with informative censoring.

6.
Zhang J. PLoS ONE 2010, 5(11): e13734
Identification of a small panel of population structure informative markers can reduce genotyping cost and is useful in various applications, such as ancestry inference in association mapping, forensics and evolutionary theory in population genetics. Traditional methods to ascertain ancestral informative markers usually require prior knowledge of individual ancestry and have difficulty with admixed populations. Recently, Principal Components Analysis (PCA) has been employed with success to select SNPs that are highly correlated with the top significant principal components (PCs) without use of individual ancestral information. The approach is also applicable to admixed populations. Here we propose a novel approach based on our recent result on summarizing population structure by graph Laplacian eigenfunctions, which differs from PCA in that it is geometric and robust to outliers. Our approach also takes advantage of the a priori sparseness of informative markers in the genome. Through simulation of a ring population and the real global population sample HGDP of 650K SNPs genotyped in 940 unrelated individuals, we validate the proposed algorithm in selecting the most informative markers, a small fraction of which can efficiently recover a similar underlying population structure. Employing a standard Support Vector Machine (SVM) to predict individuals' continental memberships on the HGDP dataset of seven continents, we demonstrate that the SNPs selected by our method are more informative but less redundant than those selected by PCA. Our algorithm is a promising tool in genome-wide association studies and population genetics, facilitating the selection of structure informative markers, efficient detection of population substructure and ancestral inference.

7.
Allelotyping of human prostatic adenocarcinoma (total citations: 14; self-citations: 0; citations by others: 14)
Allelotyping (using at least one probe detecting a restriction fragment length polymorphism on each chromosomal arm, with the exception of the short arms of the acrocentric chromosomes) showed loss of genetic information in 11 of 18 prostate adenocarcinoma specimens analyzed (61%). Frequent allelic deletions were detected on the long arm of chromosome 16 (6 of 10 informative cases, 60%), on the short arm of chromosome 8 (3 of 6 informative cases, 50%), and on the short and/or the long arms of chromosome 10 (6 of 11 informative cases (10p), 55% and 4 of 13 informative cases (10q), 30%, respectively). No losses of alleles were detected in any case unless at least one of the chromosomes 8, 10, or 16 also showed deletions. The long arm of chromosome 18 also showed a high frequency of allelic deletions (3 of 7 informative cases, 43%). Allelic deletions on the following chromosomes were detected at lower frequencies: chromosomes 2, 3, 7, 12, 13, 17, 22, and XY. Tumors with allelic deletions on more than one chromosome had a higher histological malignancy grade. Tumors from patients with advanced disease all showed allelic deletions.

8.
The search for the association between complex diseases and single nucleotide polymorphisms (SNPs) or haplotypes has recently received great attention. For these studies, it is essential to use a small subset of informative SNPs accurately representing the rest of the SNPs. Informative SNP selection can achieve (1) considerable budget savings by genotyping only a limited number of SNPs and computationally inferring all other SNPs or (2) necessary reduction of the huge SNP sets (obtained, e.g., from Affymetrix) for further fine haplotype analysis. A novel informative SNP selection method for unphased genotype data based on multiple linear regression (MLR) is implemented in the software package MLR-tagging. This software can be used for informative SNP (tag) selection and genotype prediction. The stepwise tag selection algorithm (STSA) selects positions of the given number of informative SNPs based on a genotype sample population. The MLR SNP prediction algorithm predicts a complete genotype based on the values of its informative SNPs, their positions among all SNPs, and a sample of complete genotypes. An extensive experimental study on various datasets including 10 regions from HapMap shows that the MLR prediction combined with stepwise tag selection uses fewer tags than the state-of-the-art method of Halperin et al. (2005). Availability: The MLR-Tagging software package is publicly available at http://alla.cs.gsu.edu/~software/tagging/tagging.html

9.
Individual loci affecting economic traits can be located using genetic linkage. Application of either daughter or granddaughter designs requires determination of allele origin in the progeny. If only the sires and their progeny are genotyped, the paternal allele origin of progeny having the same genotype as the sire cannot be determined. The expected frequency of informative sons can be predicted for each sire and genetic marker from the allele frequencies in the population. The accuracy of a predictor of the frequency of informative progeny was tested on 103 grandsire × microsatellite combinations. The number of sons per grandsire varied from 24 to 129. Allele frequencies in the population were estimated by genotyping seven sires. The regression of the frequency of informative sons on the predicted frequency was 1.04 under a zero-intercept model. Thus, considering the large number of genetic markers available for analysis, predicted informative frequency is a useful criterion for selection of genetic markers.
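The prediction described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' published predictor: the sire is taken to be heterozygous for alleles i and j, dams are ungenotyped and transmit alleles at the population frequencies, and a son is uninformative only when he carries the sire's own genotype. The function name `informative_son_frequency` and the example frequencies are hypothetical.

```python
def informative_son_frequency(p_i: float, p_j: float) -> float:
    """Expected share of sons whose paternal allele origin can be determined.

    Assumptions (not taken from the abstract): the sire is heterozygous i/j,
    dams are not genotyped, and the maternal allele is drawn at the
    population frequencies p_i and p_j.
    """
    tol = 1e-12
    if p_i < 0.0 or p_j < 0.0 or p_i + p_j > 1.0 + tol:
        raise ValueError("allele frequencies must be non-negative and sum to at most 1")
    # A son shares the sire's genotype i/j in two ways:
    #   i from the sire (1/2) and j from the dam (p_j), or
    #   j from the sire (1/2) and i from the dam (p_i).
    uninformative = 0.5 * (p_i + p_j)
    return 1.0 - uninformative


# Hypothetical marker: the sire's two alleles have population frequencies 0.2 and 0.3.
predicted = informative_son_frequency(0.2, 0.3)
print(f"predicted frequency of informative sons: {predicted:.2f}")
```

Under these assumptions, rarer sire alleles give a higher predicted frequency, which is why such a prediction could serve as a marker-selection criterion.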

10.
Bf allele frequencies in a sample of 172 unrelated Norwegians are given. Bf/HLA linkage relations in 49 informative matings with 178 children, and Bf/HLA association data from a set of 212 Bf-HLA haplotypes are presented. Of 171 informative meioses, there were no Bf-HLA-B recombinations, while 3 out of 158 Bf-HLA-A informative meioses showed recombination. There is significant association between the BfF and the HLA-BW35 allele. It is concluded that the Bf locus is situated on the HLA-B side of HLA-A within the HLA region, in very close proximity to HLA-B.
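As a worked illustration of the counts quoted in the abstract above, the sketch below computes a simple recombination fraction (recombinants divided by informative meioses) and, for the zero-recombinant case, an exact one-sided 95% upper bound from the binomial distribution. Treating the meioses as independent binomial trials is an assumption made here for illustration; the abstract does not say how the authors analyzed the data, and the function names are hypothetical.

```python
def recombination_fraction(recombinants: int, informative_meioses: int) -> float:
    """Naive estimate: share of informative meioses that show recombination."""
    if informative_meioses <= 0 or not 0 <= recombinants <= informative_meioses:
        raise ValueError("need 0 <= recombinants <= informative_meioses, with meioses > 0")
    return recombinants / informative_meioses


def upper_bound_when_zero(informative_meioses: int, alpha: float = 0.05) -> float:
    """Exact one-sided upper bound on theta when no recombinants are seen.

    Solves (1 - theta) ** n = alpha for theta, assuming independent meioses.
    """
    if informative_meioses <= 0:
        raise ValueError("informative_meioses must be positive")
    return 1.0 - alpha ** (1.0 / informative_meioses)


# Counts quoted in the abstract: 3 of 158 (Bf vs HLA-A), 0 of 171 (Bf vs HLA-B).
theta_a = recombination_fraction(3, 158)
theta_b = recombination_fraction(0, 171)
bound_b = upper_bound_when_zero(171)
print(f"Bf-HLA-A estimate: {theta_a:.4f}")
print(f"Bf-HLA-B estimate: {theta_b:.4f}, 95% upper bound: {bound_b:.4f}")
```

The comparison shows why zero recombinants in 171 meioses supports very close proximity to HLA-B while still leaving a small non-zero recombination fraction compatible with the data.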

11.
The discovery of RFLPs and their utilization as genetic markers has revolutionized research in human molecular genetics. However, only a fraction of the DNA sequence polymorphisms in the human genome affect the length of a restriction fragment and hence result in an RFLP. Polymorphisms that are not detected as RFLPs are typically passed over in the screening process though they represent a potentially important source of informative genetic markers. We have used a rapid method for the detection of naturally occurring DNA sequence variations that is based on enzymatic amplification and direct sequencing of genomic DNA. This approach can detect essentially all useful sequence variations within the region screened. We demonstrate the feasibility of the technique by applying it to the human retinoblastoma susceptibility locus. We screened 3,712 bp of genomic DNA from each of nine individuals and found four DNA sequence polymorphisms. At least one of these DNA sequence polymorphisms was informative in each of three families with hereditary retinoblastoma that were not informative with any of the known RFLPs at this locus. We believe that direct sequencing is a reasonable alternative to other methods of screening for DNA sequence polymorphisms and that it represents a step forward for obtaining informative markers at well-characterized loci that have been minimally informative in the past.

12.
Informative proteins are the proteins that play critical functional roles inside cells. Knowledge of them is fundamental to translating bioinformatics into clinical practice. Many methods of identifying informative biomarkers have been developed, but they are heuristic and arbitrary and do not consider the dynamic characteristics of biological processes. In this paper, we present a generative model for identifying the informative proteins by systematically analyzing the topological variety of dynamic protein-protein interaction networks (PPINs). In this model, the common representation of multiple PPINs is learned using a deep feature generation model, based on which the original PPINs are rebuilt and the reconstruction errors are analyzed to locate the informative proteins. Experiments were performed on data from yeast cell cycles and different prostate cancer stages. We analyze the effectiveness of reconstruction by comparing different methods, and the ranking results of informative proteins were also compared with the results from the baseline methods. Our method is able to reveal the critical members in the dynamic processes, which can be studied further to assess their potential for biomarker research.

13.
This paper pursues three basic definitions of comparative information motivated by various theories of information. The first definition involves the ordering of experiments according to a qualitative relation “not more informative than”; the second is derived from measure-theoretic properties of information without probability, leading to the construction of a partially ordered algebra of information; the third is based on a particular aspect of qualitative semantic information involving the ordering of propositions according to their information content. This approach leads to a Boolean interpretation of informative propositions generating a qualitative probability structure. Some ways of representing informative propositions by compatible normed information measures are discussed, leading to a measure of probability in terms of information.

14.
The attentional field resulting from the presentation of peripheral cues, which were either informative or non-informative about the position of the imperative stimulus, was studied. Different time intervals between cue and stimulus (120, 300, 600 ms) were used. The results showed facilitation of the response with the informative cue and inhibition with the non-informative cue. This happened for the longest cue-stimulus intervals and when the position of the cue and the position of the stimulus were congruent. The order of cue presentation (i.e., either the informative cue followed by the non-informative cue or vice versa) also proved important in producing facilitatory and inhibitory effects.

15.
Fragment-based learning of visual object categories (total citations: 2; self-citations: 0; citations by others: 2)
When we perceive a visual object, we implicitly or explicitly associate it with a category we know. It is known that the visual system can use local, informative image fragments of a given object, rather than the whole object, to classify it into a familiar category. How we acquire informative fragments has remained unclear. Here, we show that human observers acquire informative fragments during the initial learning of categories. We created new, but naturalistic, classes of visual objects by using a novel "virtual phylogenesis" (VP) algorithm that simulates key aspects of how biological categories evolve. Subjects were trained to distinguish two of these classes by using whole exemplar objects, not fragments. We hypothesized that if the visual system learns informative object fragments during category learning, then subjects must be able to perform the newly learned categorization by using only the fragments as opposed to whole objects. We found that subjects were able to successfully perform the classification task by using each of the informative fragments by itself, but not by using any of the comparable, but uninformative, fragments. Our results not only reveal that novel categories can be learned by discovering informative fragments but also introduce and illustrate the use of VP as a versatile tool for category-learning research.

16.
S W Lagakos. Biometrics 1979, 35(1): 139-156
This paper concerns general right censoring and some of the difficulties it creates in the analysis of survival data. A general formulation of censored-survival processes leads to the partition of all models into those based on noninformative and informative censoring. Nearly all statistical methods for censored data assume that censoring is noninformative. Topics considered within this class include: the relationships between three models for noninformative censoring, the use of likelihood methods for inferences about the distribution of survival time, the effects of censoring on the K-sample problem, and the effects of censoring on model testing. Also considered are several topics which relate to informative censoring models. These include: problems of nonidentifiability that can be encountered when attempting to assess a set of data for the type of censoring in effect, the consequences of falsely assuming that censoring is noninformative, and classes of informative censoring models.

17.
IRBM 2021, 42(5): 334-344
Active learning is an effective solution to interactively select a limited number of informative examples and use them to train a learning algorithm that can achieve its optimal performance for specific tasks. It is suitable for medical image applications in which unlabeled data are abundant but manual annotation could be very time-consuming and expensive. However, designing an effective active learning strategy for informative example selection is a challenging task, due to the intrinsic presence of noise in medical images, the large number of images, and the variety of imaging modalities. In this study, a novel low-rank modeling-based multi-label active learning (LRMMAL) method is developed to address these challenges and select informative examples for training a classifier to achieve the optimal performance. The proposed method independently quantifies image noise and integrates it with other measures to guide a pool-based sampling process to determine the most informative examples for training a classifier. In addition, an automatic adaptive cross entropy-based parameter determination scheme is proposed for further optimizing the example sampling strategy. Experimental results on varied medical image datasets and comparisons with other state-of-the-art multi-label active learning methods illustrate the superior performance of the proposed method.

18.
The candidate region for the Huntington disease (HD) gene has been narrowed down to a 2.2-Mb region between D4S10 and D4S98 on the short arm of chromosome 4. To map the HD gene within this candidate region 65 Dutch HD families were studied. In total 338 informative meioses were analyzed and 11 multiple informative crossovers were detected. Assuming a minimum number of recombinations and no double recombinations, our multiple informative crossovers are consistent with one specific genetic order for 12 loci: D4S10-(D4S81, D4S126)-D4S125-(D4S127, D4S95)-D4S43-(D4S115, D4S96, D4S111, D4S90, D4S141). This is in agreement with the known data derived from similar and other methods. The loci between brackets could not be mapped relative to each other. In our family material, two informative three-point marker recombination events were detected in the proximal HD candidate region, which are also informative for HD. Both recombination events map the HD gene distal to D4S81 and most likely distal to D4S125, narrowing down the HD candidate region to a 1.7-Mb region between D4S125 and D4S98.

19.
Huang Y, Leroux B. Biometrics 2011, 67(3): 843-851
Summary: Williamson, Datta, and Satten's (2003, Biometrics 59, 36–42) cluster-weighted generalized estimating equations (CWGEEs) are effective in adjusting for bias due to informative cluster sizes for cluster-level covariates. We show that CWGEE may not perform well, however, for covariates that can take different values within a cluster if the numbers of observations at each covariate level are informative. On the other hand, inverse probability of treatment weighting accounts for informative treatment propensity but not for informative cluster size. Motivated by evaluating the effect of a binary exposure in the presence of such types of informativeness, we propose several weighted GEE estimators, with weights related to the size of a cluster as well as the distribution of the binary exposure within the cluster. Choice of the weights depends on the population of interest and the nature of the exposure. Through simulation studies, we demonstrate the superior performance of the new estimators compared to existing estimators such as those from GEE, CWGEE, and inverse probability of treatment-weighted GEE. We demonstrate the use of our method using an example examining covariate effects on the risk of dental caries among small children.
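The idea of weighting by cluster size, mentioned in the abstract above, can be illustrated with the simplest possible case. The sketch below is an assumption-laden toy, not the estimators proposed in the paper: it covers only an intercept-only model with an independence working correlation, where weighting each observation by the inverse of its cluster size reduces to averaging the cluster means. The data and function names are hypothetical.

```python
from statistics import mean


def pooled_mean(clusters: list[list[float]]) -> float:
    """Unweighted mean over all observations; large clusters dominate."""
    return mean([y for cluster in clusters for y in cluster])


def cluster_weighted_mean(clusters: list[list[float]]) -> float:
    """Mean with each observation weighted by 1 / (its cluster size).

    With equal total weight per cluster this is the mean of cluster means.
    """
    return mean([mean(cluster) for cluster in clusters])


# Hypothetical data: one list per child, one 0/1 outcome per tooth at risk.
# The child with the most observations also has the worst outcomes, so
# cluster size carries information about the outcome.
children = [[1, 1, 1, 1], [0], [0]]

print(f"pooled mean:           {pooled_mean(children):.3f}")
print(f"cluster-weighted mean: {cluster_weighted_mean(children):.3f}")
```

The two numbers answer different questions: the pooled mean describes a typical observation, while the cluster-weighted mean describes a typical cluster, which is the distinction that matters when cluster size is informative.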

20.
One of the first steps in analyzing high-dimensional functional genomics data is an exploratory analysis of such data. Cluster Analysis and Principal Component Analysis are then usually the methods of choice. Despite their versatility, they also have a severe drawback: they do not always generate simple and interpretable solutions. On the basis of the observation that functional genomics data often contain both informative and non-informative variation, we propose a method that finds sets of variables containing informative variation. This informative variation is subsequently expressed in easily interpretable simplivariate components. We present a new implementation of the recently introduced simplivariate models. In this implementation, the informative variation is described by multiplicative models that can adequately represent the relations between functional genomics data. A simulated data set and two real-life metabolomics data sets show good performance of the method.
