Similar literature
20 similar documents found
1.
Domain-enhanced analysis of microarray data using GO annotations (total citations: 2; self-citations: 0; citations by others: 2)
MOTIVATION: New biological systems technologies give scientists the ability to measure thousands of bio-molecules including genes, proteins, lipids and metabolites. We use domain knowledge, e.g. the Gene Ontology, to guide analysis of such data. By focusing on domain-aggregated results at, say, the molecular function level, increased interpretability is available to biological scientists beyond what is possible if results are presented at the gene level. RESULTS: We use a 'top-down' approach to perform domain aggregation by first combining gene expressions before testing for differentially expressed patterns. This is in contrast to the more standard 'bottom-up' approach, where genes are first tested individually then aggregated by domain knowledge. The benefits are greater sensitivity for detecting signals. Our method, domain-enhanced analysis (DEA), is assessed and compared to other methods using simulation studies and analysis of two publicly available leukemia data sets. AVAILABILITY: Our DEA method uses functions available in R (http://www.r-project.org/) and SAS (http://www.sas.com/). The two experimental data sets used in our analysis are available in R as Bioconductor packages, 'ALL' and 'golubEsets' (http://www.bioconductor.org/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
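The 'top-down' order described in this abstract (combine first, then test) can be sketched in Python as follows. This is only an illustration and not the DEA implementation: the abstract does not say how expressions are combined or which test is applied, so the per-sample mean and Welch's t statistic used here are assumptions, as are the function names `welch_t` and `top_down_scores` and the toy gene and term labels.

```python
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic for two samples.

    Assumes each sample has at least two values and that the two
    samples are not both constant (otherwise the denominator is zero).
    """
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    return (mean(x) - mean(y)) / se


def top_down_scores(expr, labels, gene_sets):
    """Combine genes within each annotation term first, then test once per term.

    expr      : dict gene -> list of expression values, one per sample
    labels    : list of 0/1 group labels, one per sample
    gene_sets : dict term -> list of genes annotated to that term
    """
    scores = {}
    for term, genes in gene_sets.items():
        present = [g for g in genes if g in expr]
        if not present:
            continue  # no measured genes for this term
        # Step 1 (aggregate): per-sample mean over the genes in the term.
        combined = [mean(expr[g][s] for g in present) for s in range(len(labels))]
        # Step 2 (test): compare the aggregated profile between the two groups.
        group0 = [v for v, lab in zip(combined, labels) if lab == 0]
        group1 = [v for v, lab in zip(combined, labels) if lab == 1]
        scores[term] = welch_t(group0, group1)
    return scores


# Hypothetical toy data: two genes in one term, three samples per group.
expr = {"g1": [1, 2, 3, 5, 6, 7], "g2": [1, 2, 3, 5, 6, 7]}
labels = [0, 0, 0, 1, 1, 1]
gene_sets = {"term_A": ["g1", "g2"], "term_B": ["g_unmeasured"]}
scores = top_down_scores(expr, labels, gene_sets)
```

A 'bottom-up' variant would instead call the test once per gene and only afterwards summarize the per-gene results within each term.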

2.
adegenet: a R package for the multivariate analysis of genetic markers (total citations: 4; self-citations: 0; citations by others: 4)
The package adegenet for the R software is dedicated to the multivariate analysis of genetic markers. It extends the ade4 package of multivariate methods by implementing formal classes and functions to manipulate and analyse genetic markers. Data can be imported from common population genetics software and exported to other software and R packages. adegenet also implements standard population genetics tools along with more original approaches for spatial genetics and hybridization. AVAILABILITY: Stable version is available from CRAN: http://cran.r-project.org/mirrors.html. Development version is available from adegenet website: http://adegenet.r-forge.r-project.org/. Both versions can be installed directly from R. adegenet is distributed under the GNU General Public Licence (v.2).

3.
Three-dimensional (3D) culture models are critical tools for understanding tissue morphogenesis. A key requirement for their analysis is the ability to reconstruct the tissue into computational models that allow quantitative evaluation of the formed structures. Here, we present Software for Automated Morphological Analysis (SAMA), a method by which epithelial structures grown in 3D cultures can be imaged, reconstructed and analyzed with minimum human intervention. SAMA allows quantitative analysis of key features of epithelial morphogenesis such as ductal elongation, branching and lumen formation that distinguish different hormonal treatments. SAMA is a user-friendly set of customized macros operated via FIJI (http://fiji.sc/Fiji), an open-source image analysis platform in combination with a set of functions in R (http://www.r-project.org/), an open-source program for statistical analysis. SAMA enables a rapid, exhaustive and quantitative 3D analysis of the shape of a population of structures in a 3D image. SAMA is cross-platform, licensed under the GPLv3 and available at http://montevil.theobio.org/content/sama.

4.
SUMMARY: Differential Identification using Mixtures Ensemble (DIME) is a package for identification of biologically significant differential binding sites between two conditions using ChIP-seq data. It considers a collection of finite mixture models combined with a false discovery rate (FDR) criterion to find statistically significant regions. This leads to a more reliable assessment of differential binding sites based on a statistical approach. In addition to ChIP-seq, DIME is also applicable to data from other high-throughput platforms. Availability and implementation: DIME is implemented as an R-package, which is available at http://www.stat.osu.edu/~statgen/SOFTWARE/DIME. It may also be downloaded from http://cran.r-project.org/web/packages/DIME/.

5.
Individual multilocus heterozygosity estimates based on a limited number of loci are expected to correlate only weakly with the inbreeding level of an individual. Before using multilocus heterozygosity estimates in studies of inbreeding, their ability to capture information on inbreeding in the given setting should be tested. A convenient method for this is to compute the heterozygosity-heterozygosity correlation, i.e. the mean correlation between multilocus heterozygosity estimates calculated from random samples of loci, which should be positive if multilocus heterozygosity carries a signature of inbreeding. Rhh is an extension package for the statistical software R that estimates this correlation and calculates three measures of individual multilocus heterozygosity: homozygosity by loci, internal relatedness and standardized heterozygosity. The extension package is available through the CRAN (http://cran.r-project.org) and has a homepage at http://www.helsinki.fi/biosci/egru/research/software.
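The heterozygosity-heterozygosity correlation described in this abstract can be illustrated with a short Python sketch. It is not the Rhh code: the abstract only speaks of "random samples of loci", so drawing two disjoint random halves of the loci on each repetition is an assumption, as are the 0/1 coding of genotypes (1 = heterozygous), the use of the plain proportion of heterozygous loci as the multilocus estimate, and the names `pearson` and `het_het_correlation`.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / (sxx * syy) ** 0.5


def het_het_correlation(genotypes, n_reps=100, seed=0):
    """Mean correlation between heterozygosity estimates from two random
    halves of the loci.

    genotypes : list of individuals, each a list of 0/1 flags per locus
                (1 = heterozygous at that locus)
    """
    rng = random.Random(seed)
    loci = list(range(len(genotypes[0])))
    half = len(loci) // 2
    correlations = []
    for _ in range(n_reps):
        rng.shuffle(loci)
        first, second = loci[:half], loci[half:]
        # Proportion of heterozygous loci per individual in each half.
        het_first = [mean(ind[i] for i in first) for ind in genotypes]
        het_second = [mean(ind[i] for i in second) for ind in genotypes]
        correlations.append(pearson(het_first, het_second))
    return mean(correlations)
```

If individuals differ in inbreeding level, both halves of the loci reflect the same underlying tendency and the mean correlation is positive; if heterozygosity at each locus is independent of the individual, it should fluctuate around zero.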

6.
snp.plotter is a newly developed R package which produces high-quality plots of results from genetic association studies. The main features of the package include options to display a linkage disequilibrium (LD) plot below the P-value plot using either the r² or D' LD metric, to set the X-axis to equal spacing or to use the physical map of markers, and to specify plot labels, colors, symbols and LD heatmap color scheme. snp.plotter can plot single SNP and/or haplotype data and simultaneously plot multiple sets of results. R is a free software environment for statistical computing and graphics available for most platforms. The proposed package provides a simple way to convey both association and LD information in a single appealing graphic for genetic association studies. AVAILABILITY: Downloadable R package and example datasets are available at http://cbdb.nimh.nih.gov/~kristin/snp.plotter.html and http://www.r-project.org.

7.
The software tool P2BAT provides a massively parallel and user-friendly implementation of the PBAT-analysis tools for family-based association tests (FBATs) in large-scale studies, including genome-wide association studies with several thousand subjects. Built on the original PBAT-implementation of the Lange-Van Steen algorithm to bypass the multiple testing problem in family-based association studies, P2BAT integrates all PBAT-analysis tools for binary and complex traits into R and makes them accessible through a user-friendly GUI. The genome-wide analysis tools are fully automated and can be run massively in parallel directly through the GUI. P2BAT is fully documented and contains graphical output tools for time-to-onset analysis. P2BAT also features the ability to test for gene and environment/drug interaction. AVAILABILITY: The P2BAT package is available as the R package 'pbatR' which can be downloaded from http://cran.r-project.org/. The PBAT-software is available at http://www.biostat.harvard.edu/~clange/.

8.
Estimating pairwise correlation from replicated genome-scale (a.k.a. OMICS) data is fundamental to cluster functionally relevant biomolecules to a cellular pathway. The popular Pearson correlation coefficient estimates bivariate correlation by averaging over replicates. It is not completely satisfactory since it introduces strong bias while reducing variance. We propose a new multivariate correlation estimator that models all replicates as independent and identically distributed (i.i.d.) samples from the multivariate normal distribution. We derive the estimator by maximizing the likelihood function. For small sample data, we provide a resampling-based statistical inference procedure, and for moderate to large sample data, we provide an asymptotic statistical inference procedure based on the Likelihood Ratio Test (LRT). We demonstrate advantages of the new multivariate correlation estimator over Pearson bivariate correlation estimator using simulations and real-world data analysis examples. AVAILABILITY: The estimator and statistical inference procedures have been implemented in an R package 'CORREP' that is available from CRAN [http://cran.r-project.org] and Bioconductor [http://www.bioconductor.org/]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

9.
We have developed a simulation tool HapSim for the generation of haplotype data. The simulated haplotypes are such that their allele frequencies and linkage disequilibrium coefficients match exactly those estimated in a real sample. AVAILABILITY: The program is available as an R package and can be downloaded from http://cran.r-project.org/.

10.
In his book, Capital in the Twenty-First Century, Thomas Piketty demonstrates that capitalism produces income inequality. He shows that over several hundred years and across many nations the rate of return on capital exceeds the growth rate of market economies. Returns to most forms of labour do not keep up with economic growth. Piketty explains as well that when economic growth slows, the gap between capital's increase and the economy's growth expands. When this happens, inheritors of capital benefit relative to others.

Piketty's study makes use of income tax and other records that have been systematically collected since the French Revolution and especially in the early part of the twentieth century. His empirical and historical study covers much of Europe and the United States, as well as Japan and other countries. The cross-cultural and historical similarities in the rates of capital and economic growth are striking. They change only with major disruptions such as wars. Piketty's results contradict many theories of economic growth and development.

For anthropologists, Piketty's coupling of inheritance with capital accumulation has implications for the role of kinship and marriage arrangements in relation to status. Given his view, however, that economy only means markets, Piketty has difficulty justifying ways to counter capitalism's inherent inequality. Anthropologists who hold a broader vision of material life as composed of both impersonal exchange and mutuality may usefully enter this discussion to explain and justify ways to counteract market economy's inherent inequality and instability.


11.
Whole-genome sequencing of tumor tissue has the potential to provide comprehensive characterization of genomic alterations in tumor samples. We present Patchwork, a new bioinformatic tool for allele-specific copy number analysis using whole-genome sequencing data. Patchwork can be used to determine the copy number of homologous sequences throughout the genome, even in aneuploid samples with moderate sequence coverage and tumor cell content. No prior knowledge of average ploidy or tumor cell content is required. Patchwork is freely available as an R package, installable via R-Forge (http://patchwork.r-forge.r-project.org/).

12.
Ram B. Jain. Biomarkers, 2017, 22(5): 476-487
Context: Prevalence of smoking is needed to estimate the need for future public health resources.

Objective: To compute and compare smoking prevalence rates by using self-reported smoking statuses, two serum cotinine (SCOT) based biomarker methods, and one urinary 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol (NNAL) based biomarker method. These estimates were then used to develop correction factors to be applicable to self-reported prevalences to arrive at corrected smoking prevalence rates.

Materials and methods: Data from National Health and Nutrition Examination Survey (NHANES) for 2007–2012 for those aged ≥20 years (N = 16826) were used.

Results: Self-reported prevalence rate for the total population computed as the weighted number of self-reported smokers divided by weighted number of all participants was 21.6% and 24% when computed by weighted number of self-reported smokers divided by the weighted number of self-reported smokers and nonsmokers. The corrected prevalence rate was found to be 25.8%.

Discussion and conclusions: A 1% underestimate in smoking prevalence is equivalent to not being able to identify 2.2 million smokers in the US in a given year. This underestimation, if not corrected, could lead to a serious gap in the public health services available and needed to provide adequate preventive and corrective treatment to smokers.
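The two self-reported rates in the Results of this abstract (21.6% and 24%) differ only in their denominator: all participants versus participants who reported a smoking status. A minimal Python sketch of that distinction follows. The records, weights and the function name `weighted_prevalence` are hypothetical and are not NHANES data, and the sketch does not reproduce the study's biomarker methods or correction factors.

```python
def weighted_prevalence(records):
    """Weighted smoking prevalence under two denominators.

    records : iterable of (survey_weight, status) pairs, where status is
              "smoker", "nonsmoker", or None (no self-reported status)

    Returns (share_of_all, share_of_reported):
      share_of_all      = weighted smokers / weighted all participants
      share_of_reported = weighted smokers / weighted smokers + nonsmokers
    """
    records = list(records)
    smokers = sum(w for w, s in records if s == "smoker")
    reported = sum(w for w, s in records if s in ("smoker", "nonsmoker"))
    everyone = sum(w for w, _ in records)
    return smokers / everyone, smokers / reported


# Hypothetical toy records: (weight, self-reported status).
toy = [(2.0, "smoker"), (3.0, "nonsmoker"), (5.0, None)]
share_of_all, share_of_reported = weighted_prevalence(toy)
```

Because participants without a reported status are dropped from the second denominator, the second rate can never be lower than the first.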


13.
Zhao Jh. BMC genetics, 2005, 6(Z1): S127
The presence of disease is commonly used in genetic studies; however, the time to onset often provides additional information. To apply the popular Cox model for such data, it is desirable to consider the familial correlation, which involves kinship or identity by descent (IBD) information between family members. Recently, such a framework has been developed and implemented in a UNIX-based S-PLUS package called kinship, extending the Cox model with mixed effects and familial relationship. The model is of great potential in joint analysis of family data with genetic and environmental factors. We apply this framework to data from the Collaborative Study on the Genetics of Alcoholism as part of Genetic Analysis Workshop 14. We use the S-PLUS package, ported into the R environment (http://www.r-project.org), for the analysis of microsatellite data on chromosomes 4 and 7. In these analyses, IBD information at those markers is used in addition to the basic Cox model with mixed effects, which provides estimates of the relative contribution of specific genetic markers. D4S1645 had the largest variance and contribution to the log-likelihood on chromosome 4, but the significance of this finding requires further investigation.

14.
Aim of the study: Independent sitting requires the control of the involved body segments over the base of support using information obtained from the three sensory systems (visual, vestibular, and somatosensory). The contribution of somatosensory information in infant sitting has not been explored. To address this gap, we altered the context of the sitting support surface and examined the infants’ immediate postural responses.

Materials and methods: Ten 7-month-old typically developing infants sat on compliant and firm surfaces in one session. Spatial, frequency, and temporal measures of postural control were obtained using center of pressure data.

Results: Our results suggest that infants’ postural sway is not immediately affected by the different types of foam surface while sitting.

Conclusions: It seems that mature sitter infants are able to adapt to different environmental constraints by disregarding the distorted somatosensory information from the support surface and relying more on their remaining senses (visual and vestibular) to control their sitting posture.


15.
SpotWhatR is a user-friendly microarray data analysis tool that runs under the widely and freely available R statistical language (http://www.r-project.org) for Windows and Linux operating systems. The aim of SpotWhatR is to help the researcher to analyze microarray data by providing basic tools for data visualization, normalization, determination of differentially expressed genes, summarization by Gene Ontology terms, and clustering analysis. SpotWhatR allows researchers who are not familiar with computational programming to choose the most suitable analysis for their microarray dataset. Along with well-known procedures used in microarray data analysis, we have introduced a stand-alone implementation of the HTself method, especially designed to find differentially expressed genes in low-replication contexts. This approach is more compatible with our local reality than the usual statistical methods. We provide several examples derived from the Blastocladiella emersonii and Xylella fastidiosa Microarray Projects. SpotWhatR is freely available at http://blasto.iq.usp.br/~tkoide/SpotWhatR, in English and Portuguese versions. In addition, the user can choose between "single experiment" and "batch processing" versions.

16.
MOTIVATION: Many methods of identifying differential expression in genes depend on testing the null hypotheses of exactly equal means or distributions of expression levels for each gene across groups, even though a statistically significant difference in the expression level does not imply the occurrence of any difference of biological or clinical significance. This is because a mathematical definition of 'differential expression' as any non-zero difference does not correspond to the differential expression biologists seek. Furthermore, while some current methods account for multiple comparisons in hypothesis tests, they do not accordingly adjust estimates of the degrees to which genes are differentially expressed. Both problems lead to overstating the relevance of findings. RESULTS: Testing whether genes have relevant differential expression can be accomplished with customized null hypotheses, thereby redefining 'differential expression' in a way that is more biologically meaningful. When such tests control the false discovery rate, they effectively discover genes based on a desired quantile of differential gene expression. Estimation of the degree to which genes are differentially expressed has been corrected for multiple comparisons. AVAILABILITY: R code is freely available from http://www.davidbickel.com and may become available from www.r-project.org or www.bioconductor.org. SUPPLEMENTARY INFORMATION: Applications to cancer microarrays, an application in the absence of differential expression, pseudocode, and a guide to customizing the methods may be found at www.davidbickel.com and www.mathpreprints.com.

17.
18.
GEIGER: investigating evolutionary radiations (total citations: 2; self-citations: 0; citations by others: 2)
SUMMARY: GEIGER is a new software package, written in the R language, to describe evolutionary radiations. GEIGER can carry out simulations, parameter estimation and statistical hypothesis testing. Additionally, GEIGER's simulation algorithms can be used to analyze the statistical power of comparative approaches. AVAILABILITY: This open source software is written entirely in the R language and is freely available through the Comprehensive R Archive Network (CRAN) at http://cran.r-project.org/.

19.
Spider: SPecies IDentity and Evolution in R is a new R package implementing a number of useful analyses for DNA barcoding studies and associated research into species delimitation and speciation. Included are functions essential for generating important summary statistics from DNA barcode data, assessing specimen identification efficacy, and for testing and optimizing divergence threshold limits. In terms of investigating evolutionary and taxonomic questions, techniques for assessing diagnostic nucleotides and probability of reciprocal monophyly are also provided. Additionally, a sliding window function offers opportunities to analyse information across a gene, essential for marker design in degraded DNA studies. Spider capitalizes on R's extensible ethos and offers an integrated platform ideal for the analysis of both nucleotide and morphological data. The program can be obtained from the comprehensive R archive network (CRAN, http://cran.r-project.org) and from the R-Forge package development site (http://spider.r-forge.r-project.org/).

20.
SUMMARY: We present SynView, a simple and generic approach to dynamically visualize multi-species comparative genome data. It is a light-weight application based on the popular and configurable web-based GBrowse framework. It can be used with a variety of databases and provides the user with a high degree of interactivity. The tool is written in Perl and runs on top of the GBrowse framework. It is in use in the PlasmoDB (http://www.PlasmoDB.org) and the CryptoDB (http://www.CryptoDB.org) projects and can be easily integrated into other cross-species comparative genome projects. AVAILABILITY: The program and instructions are freely available at http://www.ApiDB.org/apps/SynView/. CONTACT: jkissing@uga.edu.
