Similar literature
20 similar documents found
1.
2.
Ando  Tomohiro 《Biometrika》2007,94(2):443-458
The problem of evaluating the goodness of the predictive distributions of hierarchical Bayesian and empirical Bayes models is investigated. A Bayesian predictive information criterion is proposed as an estimator of the posterior mean of the expected loglikelihood of the predictive distribution when the specified family of probability distributions does not contain the true distribution. The proposed criterion is developed by correcting the asymptotic bias of the posterior mean of the loglikelihood as an estimator of its expected loglikelihood. In the evaluation of hierarchical Bayesian models with random effects, regardless of our parametric focus, the proposed criterion considers the bias correction of the posterior mean of the marginal loglikelihood because it requires a consistent parameter estimator. The use of the bootstrap in model evaluation is also discussed.

3.
The data from genome-wide association studies (GWAS) in humans are still predominantly analyzed using single-marker association methods. As an alternative to single-marker analysis (SMA), all or subsets of markers can be tested simultaneously. This approach requires a form of penalized regression (PR) as the number of SNPs is much larger than the sample size. Here we review PR methods in the context of GWAS, extend them to perform penalty parameter and SNP selection by false discovery rate (FDR) control, and assess their performance in comparison with SMA. PR methods were compared with SMA, using realistically simulated GWAS data with a continuous phenotype and real data. Based on these comparisons our analytic FDR criterion may currently be the best approach to SNP selection using PR for GWAS. We found that PR with FDR control provides substantially more power than SMA with genome-wide type-I error control but somewhat less power than SMA with Benjamini–Hochberg FDR control (SMA-BH). PR with FDR-based penalty parameter selection controlled the FDR somewhat conservatively while SMA-BH may not achieve FDR control in all situations. Differences among PR methods seem quite small when the focus is on SNP selection with FDR control. Incorporating linkage disequilibrium into the penalization by adapting penalties developed for covariates measured on graphs can improve power but also generate more false positives or wider regions for follow-up. We recommend the elastic net with a mixing weight for the Lasso penalty near 0.5 as the best method.
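
As an illustration of what a "mixing weight for the Lasso penalty near 0.5" means, the sketch below evaluates an elastic net penalty for a coefficient vector. The parameterization (a weight `alpha` on the L1 term and `(1 - alpha) / 2` on the squared L2 term, scaled by `lam`) is one common convention and is an assumption here; the abstract does not state which form the authors use, and the function name is hypothetical.

```python
def elastic_net_penalty(beta, lam, alpha=0.5):
    """Elastic net penalty under one common (assumed) parameterization.

    beta  : iterable of regression coefficients (e.g. one per SNP)
    lam   : overall penalty strength (non-negative)
    alpha : mixing weight on the Lasso (L1) term; 1.0 is pure Lasso,
            0.0 is pure ridge
    """
    l1 = sum(abs(b) for b in beta)   # Lasso part: sum of |beta_j|
    l2 = sum(b * b for b in beta)    # ridge part: sum of beta_j^2
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)


if __name__ == "__main__":
    coefficients = [1.0, -2.0]
    print(elastic_net_penalty(coefficients, lam=2.0, alpha=0.5))
```

With `alpha=0.5` the L1 and squared-L2 terms contribute with equal mixing weight, which is the regime the abstract recommends.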

4.
Establishing that a set of population‐splitting events occurred at the same time can be a potentially persuasive argument that a common process affected the populations. Recently, Oaks et al. (2013) assessed the ability of an approximate‐Bayesian model‐choice method (msBayes) to estimate such a pattern of simultaneous divergence across taxa, to which Hickerson et al. (2014) responded. Both papers agree that the primary inference enabled by the method is very sensitive to prior assumptions and often erroneously supports shared divergences across taxa when prior uncertainty about divergence times is represented by a uniform distribution. However, the papers differ about the best explanation and solution for this problem. Oaks et al. (2013) suggested the method's behavior was caused by the strong weight of uniformly distributed priors on divergence times leading to smaller marginal likelihoods (and thus smaller posterior probabilities) of models with more divergence‐time parameters (Hypothesis 1); they proposed alternative prior probability distributions to avoid such strongly weighted posteriors. Hickerson et al. (2014) suggested numerical‐approximation error causes msBayes analyses to be biased toward models of clustered divergences because the method's rejection algorithm is unable to adequately sample the parameter space of richer models within reasonable computational limits when using broad uniform priors on divergence times (Hypothesis 2). As a potential solution, they proposed a model‐averaging approach that uses narrow, empirically informed uniform priors. Here, we use analyses of simulated and empirical data to demonstrate that the approach of Hickerson et al. (2014) does not mitigate the method's tendency to erroneously support models of highly clustered divergences, and is dangerous in the sense that the empirically derived uniform priors often exclude from consideration the true values of the divergence‐time parameters.
Our results also show that the tendency of msBayes analyses to support models of shared divergences is primarily due to Hypothesis 1, whereas Hypothesis 2 is an untenable explanation for the bias. Overall, this series of papers demonstrates that if our prior assumptions place too much weight in unlikely regions of parameter space such that the exact posterior supports the wrong model of evolutionary history, no amount of computation can rescue our inference. Fortunately, as predicted by fundamental principles of Bayesian model choice, more flexible distributions that accommodate prior uncertainty about parameters without placing excessive weight in vast regions of parameter space with low likelihood increase the method's robustness and power to detect temporal variation in divergences.

5.
RNA-seq technologies are quickly revolutionizing genomic studies, and statistical methods for RNA-seq data are under continuous development. Timely review and comparison of the most recently proposed statistical methods will provide a useful guide for choosing among them for data analysis. Particular interest surrounds the ability to detect differential expression (DE) in genes. Here we compare four recently proposed statistical methods, edgeR, DESeq, baySeq, and a method with a two-stage Poisson model (TSPM), through a variety of simulations that were based on different distribution models or real data. We compared the ability of these methods to detect DE genes in terms of the significance ranking of genes and false discovery rate control. All methods compared are implemented in freely available software. We also discuss the availability and functions of the currently available versions of these software packages.

6.
In microarray-based case–control studies of a disease, people often attempt to identify a few diagnostic or prognostic markers amongst the most significant differentially expressed (DE) genes. However, the reproducibility of DE genes identified in different studies for a disease is typically very low. To tackle the problem, we could evaluate the reproducibility of DE genes across studies and define robust markers for disease diagnosis using a disease-associated protein–protein interaction (PPI) subnetwork. Using datasets for four cancer types, we found that the most significant DE genes in cancer exhibit consistent up- or down-regulation in different datasets. For each cancer type, the 5 (or 10) most significant DE genes separately extracted from different datasets tend to be significantly coexpressed and closely connected in the PPI subnetwork, thereby indicating that they are highly reproducible at the PPI level. Consequently, we were able to build robust subnetwork-based classifiers for cancer diagnosis.

7.

Background

Argonaute (Ago) proteins are essential for the biogenesis and function of ~ 20–30 nucleotide long RNAs such as microRNAs (miRNAs). Ago expression increases or decreases under various physiological conditions, although the functional consequences are unknown. In addition, while reduced global miRNA production was shown to enhance cellular transformation and tumorigenesis, how Ago proteins contribute to human diseases has not been reported.

Method

Ago2, an essential Ago isoform in mammals, was stably expressed in 293T, the human embryonic kidney cell line, and H1299, the human lung adenocarcinoma cell line. miRNA and mRNA expression was investigated by quantitative PCR and microarray profiling. Cell proliferation and migration were examined in the cell cultures by the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide assay and the scratch assay, respectively. How Ago2 affected cell growth in vivo was determined by H1299 xenograft tumor growth in mice. Changes in Ago2 expression in human lung cancer samples were investigated by quantitative PCR and immunohistochemistry.

Results

Stable Ago2 overexpression elicited specific changes in miRNA and mRNA expression in both 293T and H1299 cells. It also inhibited cell proliferation and migration in cell cultures as well as xenograft tumor growth in nude mice. Ago2 expression was lower in human lung adenocarcinomas than in the paired, non-cancerous tissues.

General significance

We concluded that changes in Ago2 expression might have significant physiological and pathological consequences in vivo.

8.
9.
10.
This article proposes resampling-based empirical Bayes multiple testing procedures for controlling a broad class of Type I error rates, defined as generalized tail probability (gTP) error rates, gTP(q,g) = Pr(g(V(n),S(n)) > q), and generalized expected value (gEV) error rates, gEV(g) = E[g(V(n),S(n))], for arbitrary functions g(V(n),S(n)) of the numbers of false positives V(n) and true positives S(n). Of particular interest are error rates based on the proportion g(V(n),S(n)) = V(n)/(V(n) + S(n)) of Type I errors among the rejected hypotheses, such as the false discovery rate (FDR), FDR = E[V(n)/(V(n) + S(n))]. The proposed procedures offer several advantages over existing methods. They provide Type I error control for general data generating distributions, with arbitrary dependence structures among variables. Gains in power are achieved by deriving rejection regions based on guessed sets of true null hypotheses and null test statistics randomly sampled from joint distributions that account for the dependence structure of the data. The Type I error and power properties of an FDR-controlling version of the resampling-based empirical Bayes approach are investigated and compared to those of widely-used FDR-controlling linear step-up procedures in a simulation study. The Type I error and power trade-off achieved by the empirical Bayes procedures under a variety of testing scenarios allows this approach to be competitive with or outperform the Storey and Tibshirani (2003) linear step-up procedure, as an alternative to the classical Benjamini and Hochberg (1995) procedure.
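
For orientation, the classical Benjamini and Hochberg (1995) linear step-up procedure that the abstract uses as a reference point can be sketched as follows. This is the standard textbook procedure, not the resampling-based empirical Bayes method proposed in the article; the function name and the nominal level `q=0.05` are illustrative assumptions.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Classical BH linear step-up procedure.

    Returns a list of booleans (same order as `pvalues`) marking the
    hypotheses rejected at nominal FDR level q.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Step-up: find the LARGEST rank k with p_(k) <= k * q / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5], q=0.05))
```

Because the procedure searches for the largest qualifying rank, a p-value that misses its own threshold can still be rejected when a larger p-value meets a later threshold; this is the "step-up" behaviour.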

11.
D Wang  Y Zhang  Y Huang  P Li  M Wang  R Wu  L Cheng  W Zhang  Y Zhang  B Li  C Wang  Z Guo 《Gene》2012,506(1):36-42
Some researchers normalize DNA methylation array data in order to remove the technical artifacts introduced by experimental differences in sample preparation, array processing and other factors. However, other researchers analyze DNA methylation arrays without performing data normalization, considering that current normalizations for methylation data may distort real differences between normal and cancer samples because cancer genomes may be extensively subject to hypomethylation and the total amount of CpG methylation might differ substantially among samples. In this study, using eight datasets from the Infinium HumanMethylation27 assay, we systematically analyzed the global distribution of DNA methylation changes in cancer compared to normal control and its effect on data normalization for selecting differentially methylated (DM) genes. We showed that more DM genes could be found in the Quantile/Lowess-normalized data than in the non-normalized data. We found the DM genes additionally selected in the Quantile/Lowess-normalized data showed significantly consistent methylation states in another independent dataset for the same cancer, indicating these extra DM genes were effective biological signals related to the disease. These results suggested normalization can increase the power of detecting DM genes in the context of diagnostic markers, which were usually characterized by relatively large effect sizes. Besides, we evaluated the reproducibility of DM discoveries for a particular cancer type, and we found most of the DM genes additionally detected in one dataset showed the same methylation directions in the other dataset for the same cancer type, indicating that these DM genes were effective biological signals in the other dataset. Furthermore, we showed that some DM genes detected from different studies for a particular cancer type were significantly reproducible at the functional level.

12.
Colorectal cancer (CRC) is the fourth most common cause of cancer-related death worldwide. Accurate non-invasive screening for CRC would greatly enhance a population’s health. Adenomatous polyposis coli (Apc) gene mutations commonly occur in human colorectal adenomas and carcinomas, leading to Wnt signalling pathway activation. Acute conditional transgenic deletion of Apc in murine intestinal epithelium (AhCre+Apcfl/fl) causes phenotypic changes similar to those found during colorectal tumourigenesis. This study comprised a proteomic analysis of murine small intestinal epithelial cells following acute Apc deletion to identify proteins that show altered expression during human colorectal carcinogenesis, thus identifying proteins that may prove clinically useful as blood/serum biomarkers of colorectal neoplasia. Eighty-one proteins showed significantly increased expression following iTRAQ analysis, and validation of nine of these by Ingenuity Pathway Analysis showed they could be detected in blood or serum. Expression was assessed in AhCre+Apcfl/fl small intestinal epithelium by immunohistochemistry, western blot and quantitative real-time PCR; increased nucleolin concentrations were also detected in the serum of AhCre+Apcfl/fl and ApcMin/+ mice by ELISA. Six proteins, heat shock 60 kDa protein 1, Nucleolin, Prohibitin, Cytokeratin 18, Ribosomal protein L6 and DEAD (Asp-Glu-Ala-Asp) box polypeptide 5, were selected for further investigation. Increased expression of 4 of these was confirmed in human CRC by qPCR. In conclusion, several novel candidate biomarkers have been identified from analysis of transgenic mice in which the Apc gene was deleted in the intestinal epithelium that also showed increased expression in human CRC. Some of these warrant further investigation as potential serum-based biomarkers of human CRC.

13.
The analysis of allele-specific gene expression (ASE) is essential for the mapping of genetic variants that affect gene regulation, and for the identification of alleles that modify disease risk. Although RNA sequencing offers the opportunity to measure expression at allele levels, the availability of powerful statistical methods for mapping ASE in single or multiple individuals is limited. We developed a maximum likelihood model to characterize ASE in the human genome. Approximately 17% of genes displayed an allele-specific effect on gene expression in a single individual. Simulations using our model gave a better performance and improved robustness when compared with the binomial test, with different coverage levels, allelic expression fractions and random noise. In addition, our method can identify ASE in multiple individuals, with enhanced performance. This is helpful in understanding the mechanism of genetic regulation leading to expression changes, alternative splicing variants and even disease susceptibility.
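
The binomial test that the abstract uses as its baseline can be sketched as below: under no allele-specific expression, reads at a heterozygous site are assumed to come from either allele with probability 0.5. This illustrates the baseline only, not the authors' maximum likelihood model; the function name, the read-count inputs and the null fraction `p=0.5` are assumptions made for the example.

```python
from math import comb


def binomial_two_sided_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial test for allelic imbalance at one site.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null allelic fraction p. Intended for
    modest read depths (plain floats are used for the probabilities).
    """
    n = ref_reads + alt_reads
    # Null probability of seeing k reference reads out of n.
    pmf = [comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so that ties survive rounding error.
    cutoff = observed * (1.0 + 1e-7)
    return min(1.0, sum(x for x in pmf if x <= cutoff))


if __name__ == "__main__":
    print(binomial_two_sided_p(10, 0))  # strongly imbalanced site
    print(binomial_two_sided_p(5, 5))   # perfectly balanced site
```

A small p-value flags a site whose read counts are unlikely under balanced expression; the abstract reports that the binomial test is less robust than the proposed model across coverage levels, allelic expression fractions and random noise.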

14.
We examined miRNA expression from RNA isolated from the frontal cortex (Brodmann area 9) of 9 alcoholics (6 males, 3 females, mean age 48 years) and 9 matched controls using both the Affymetrix GeneChip miRNA 2.0 and Human Exon 1.0 ST Arrays to further characterize genetic influences in alcoholism and the effects of alcohol consumption on predicted target mRNA expression. A total of 12 human miRNAs were significantly up-regulated in alcohol dependent subjects (fold change ≥ 1.5, false discovery rate (FDR) ≤ 0.3; p < 0.05) compared with controls including a cluster of 4 miRNAs (e.g., miR-377, miR-379) from the maternally expressed 14q32 chromosome region. The status of the up-regulated miRNAs was supported using the high-throughput method of exon microarrays showing decreased predicted mRNA gene target expression as anticipated from the same RNA aliquot. Predicted mRNA targets were involved in cellular adhesion (e.g., THBS2), tissue differentiation (e.g., CHN2), neuronal migration (e.g., NDE1), myelination (e.g., UGT8, CNP) and oligodendrocyte proliferation (e.g., ENPP2, SEMA4D1). Our data support an association of alcoholism with up-regulation of a cluster of miRNAs located in the genomic imprinted domain on chromosome 14q32 with their predicted gene targets involved with oligodendrocyte growth, differentiation and signaling.

15.
The capacity of inducing angiogenesis is a recognized hallmark of cancer cells. The cancer microenvironment, characterized by hypoxia and inflammatory signals, promotes proliferation, migration and activation of quiescent endothelial cells (EC) from surrounding vascular network. Current anti-angiogenic drugs present side effects, temporary efficacy, and issues of primary resistance, thereby calling for the identification of new therapeutic targets. MICALs are a unique family of redox enzymes that destabilize F-actin in cytoskeletal dynamics. MICAL2 mediates Semaphorin3A-NRP2 response to VEGFR1 in rat ECs. MICAL2 also enters the p130Cas interactome in response to VEGF in HUVEC. Previously, we showed that MICAL2 is overexpressed in metastatic cancer. A small-molecule inhibitor of MICAL2 exists (CCG-1423). Here we report that 1) MICAL2 is expressed in neo-angiogenic ECs in human solid tumors (kidney and breast carcinoma, glioblastoma and cardiac myxoma, n = 67, were analyzed with immunohistochemistry) and in animal models of ischemia/inflammation neo-angiogenesis, but not in normal capillary bed; 2) MICAL2 protein pharmacological inhibition (CCG-1423) or gene KD reduce EC viability and functional performance; 3) MICAL2 KD disables EC response to VEGF in vitro. Whole-genome gene expression profiling reveals MICAL2 involvement in angiogenesis and vascular development pathways. Based on these results, we propose that MICAL2 expression in ECs participates in inflammation-induced neo-angiogenesis and that MICAL2 inhibition should be tested in cancer- and noncancer-associated neo-angiogenesis, where chronic inflammation represents a relevant pathophysiological mechanism.

16.
The wide application of prostate-specific antigen (PSA) has contributed to the early diagnosis and improved management of prostate cancer (PCa). Accumulating evidence has suggested the involvement of genetic components in regulating serum PSA levels, and several single nucleotide polymorphisms (SNPs) have been identified by genome-wide association studies (GWASs). However, GWAS results have limited power to identify the causal variants and pathways. After the quality control filters, a total of 330,540 genotyped SNPs from one GWAS with 657 PCa-free Caucasian males were included for the identify candidate causal SNPs and pathways (ICSNPathway) analysis. In addition, the genotype–phenotype association analysis has been conducted with the data from HapMap database. Overall, a total of four SNPs in three genes and six pathways were identified by ICSNPathway analysis, which in total provided three hypothetical mechanisms. First, CYP26B1 rs2241057 polymorphism (nonsynonymous coding), which leads to a Leu-to-Ser amino acid shift at position 264, was implicated in the pathways including meiosis, proximal/distal pattern formation, and M phase of meiotic cell cycle. Second, CLIC5 rs3734207 and rs11752816 polymorphisms (regulatory region) were linked to the 2 iron, 2 sulfur cluster binding pathway through regulating expression levels of CLIC5 mRNA. Third, rs4819522 polymorphism (nonsynonymous coding) leads to a Thr-to-Met transition at position 350 of TBX1 and is involved in the pathways about gland and endocrine system development. In summary, our results demonstrated four candidate SNPs in three genes (CYP26B1 rs2241057, CISD1 rs2251039, rs2590370, and TBX1 rs4819522 polymorphisms), which were involved in six potential pathways to influence serum PSA levels.

17.
Estimating false discovery rates (FDRs) of protein identification continues to be an important topic in mass spectrometry–based proteomics, particularly when analyzing very large datasets. One performant method for this purpose is the Picked Protein FDR approach which is based on a target-decoy competition strategy on the protein level that ensures that FDRs scale to large datasets. Here, we present an extension to this method that can also deal with protein groups, that is, proteins that share common peptides such as protein isoforms of the same gene. To obtain well-calibrated FDR estimates that preserve protein identification sensitivity, we introduce two novel ideas. First, the picked group target-decoy and second, the rescued subset grouping strategies. Using entrapment searches and simulated data for validation, we demonstrate that the new Picked Protein Group FDR method produces accurate protein group-level FDR estimates regardless of the size of the data set. The validation analysis also uncovered that applying the commonly used Occam’s razor principle leads to anticonservative FDR estimates for large datasets. This is not the case for the Picked Protein Group FDR method. Reanalysis of deep proteomes of 29 human tissues showed that the new method identified up to 4% more protein groups than MaxQuant. Applying the method to the reanalysis of the entire human section of ProteomicsDB led to the identification of 18,000 protein groups at 1% protein group-level FDR. The analysis also showed that about 1250 genes were represented by ≥2 identified protein groups. To make the method accessible to the proteomics community, we provide a software tool including a graphical user interface that enables merging results from multiple MaxQuant searches into a single list of identified and quantified protein groups.
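
The protein-level target-decoy competition that the abstract builds on can be sketched roughly as follows: each target protein competes with its paired decoy, only the higher-scoring member of each pair is kept, and the FDR at a score cutoff is estimated as the ratio of surviving decoys to surviving targets. This is a simplified illustration of the general idea only; it does not implement the protein-group extension, the picked group target-decoy strategy or rescued subset grouping, and it is not the published tool's code. The function name, the input layout and the tie rule (ties go to the target) are assumptions.

```python
def picked_fdr(pairs):
    """Simplified picked target-decoy FDR estimate.

    pairs : list of (name, target_score, decoy_score), one entry per
            target protein and its paired decoy; higher scores are better.
    Returns a list of (name, is_target, score, estimated_fdr) sorted by
    decreasing score, where estimated_fdr = decoys / targets so far.
    """
    winners = []
    for name, target_score, decoy_score in pairs:
        # Competition: keep only the better-scoring member of the pair.
        # Assumed tie rule: the target wins.
        is_target = target_score >= decoy_score
        winners.append((max(target_score, decoy_score), is_target, name))

    winners.sort(key=lambda w: w[0], reverse=True)

    results = []
    targets = decoys = 0
    for score, is_target, name in winners:
        if is_target:
            targets += 1
        else:
            decoys += 1
        # Guard against division by zero before the first target.
        results.append((name, is_target, score, decoys / max(targets, 1)))
    return results


if __name__ == "__main__":
    example = [("A", 10, 2), ("B", 8, 9), ("C", 7, 1), ("D", 3, 4)]
    for row in picked_fdr(example):
        print(row)
```

The raw estimates are not monotone in the score; in practice they are usually converted to q-values by taking a running minimum from the bottom of the list, a step omitted here for brevity.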

18.
19.
20.
