Similar literature
1.
Hornberg JJ  de Haas RR  Dekker H  Lankelma J 《BioTechniques》2002,33(1):108, 110, 112-3 passim
Nylon membrane-based macroarrays form a widely available alternative to microarrays for the collection of large-scale gene expression data. To carry out repetitive hybridization experiments with nylon cDNA arrays, we used phosphorothioate 33P-cDNA, followed by stripping under relatively mild conditions. We were able to use the same membranes more than 10 times without a measurable reduction in their performance. Thus, our protocol allows for more comparative studies of multiple data sets obtained from sequential hybridizations of the same set of membranes. We demonstrate how to analyze repetitive macroarray experiments and to determine the reliability or statistical significance of the gene expression data obtained. Both the averaging of signals per gene and the reversal of nylon membranes had a favorable effect on accuracy. By self-self comparisons, we show that in a duplicate experiment with four membranes, a 2-fold change in the gene expression can be measured reliably.

2.
We propose a model-based approach to unify clustering and network modeling using time-course gene expression data. Specifically, our approach uses a mixture model to cluster genes. Genes within the same cluster share a similar expression profile. The network is built over cluster-specific expression profiles using state-space models. We discuss the application of our model to simulated data as well as to time-course gene expression data arising from animal models of prostate cancer progression. The latter application shows that with combined statistical/bioinformatics analyses, we are able to extract gene-to-gene relationships supported by the literature as well as new plausible relationships.

3.
Human microarrays are readily available, and it would be advantageous if they could be used to study gene expression in other species, such as pigs. The objectives of this research were to validate the use of human microarrays in the analysis of porcine gene expression, to assess the variability of the data generated, and to compare gene expression in boars with different levels of steroidogenesis. Cytochrome b5 (CYB5) expression was used to assess array detection sensitivity. Samples having high or low CYB5 RNA levels were hybridized to microarrays to determine if the known expression difference could be detected. Six hybridizations were conducted using human microarrays containing 3840 total spots representing 1718 characterized human ESTs. To analyze gene expression in boars with different levels of steroidogenesis, testis RNA from four boars with high levels of plasma estrone sulphate was hybridized to testis RNA from four boars with lower levels. Eight microarray hybridizations were conducted including fluor-flips. Self-self hybridizations were also conducted to assess the variability of array experiments. The Cy5 and Cy3 intensity values for each array were normalized using a locally weighted linear regression (LOESS). Statistical significance was assessed using a Student's t-test followed by the Benjamini and Hochberg multiple testing correction procedure. Quantitative real-time PCR (Q-RT-PCR) was used to verify select gene expression differences. The results show that CYB5 was significantly overexpressed in the high CYB5 sample by 1.8 fold (P < 0.05), verifying the known expression difference. The average log2 ratio of the majority of genes (1643) falls within one standard deviation of the mean, indicating the data were reproducible. In the high versus low steroidogenesis experiment, seven genes were significantly overexpressed in the high group (P < 0.05). 
Quantitative real-time PCR was used to validate five genes with the highest fold change, and the results corroborated those found by the microarray experiments. The results of the self-self hybridizations showed that no genes were significantly differentially expressed following the application of the Benjamini and Hochberg multiple testing correction procedure. The results presented in this report show that human arrays can be used for gene expression analysis in pigs.
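
As a rough illustration of the Benjamini and Hochberg multiple testing correction named in the abstract above, here is a minimal Python sketch that turns a list of raw p-values into adjusted ones. The function name `benjamini_hochberg` and the example p-values are assumptions made for illustration; they are not taken from the study, which does not describe its implementation.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values from a t-test.
example = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

A gene would then be called significant when its adjusted value falls below the chosen false discovery rate, for example 0.05.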


6.
GCHap quickly finds maximum likelihood estimates (MLEs) of frequencies of haplotypes given genotype information on a random sample of individuals. It uses the gene counting method, but by excluding haplotypes with zero MLE at an early stage, this implementation uses many orders of magnitude less space and time than naive implementations. A second program, ApproxGCHap, is provided to give alternate estimates for data sets with large numbers of loci or large amounts of missing genotypes. AVAILABILITY: The Java classes and Javadocs pages for GCHap can be obtained from bioinformatics.med.utah.edu/~alun
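
To make the gene counting method mentioned above concrete, here is a minimal Python sketch of the naive version (an expectation-maximization loop over all possible haplotypes). It is not GCHap's implementation: it enumerates every haplotype, which is exactly the cost the abstract says GCHap avoids. The function names, the 0/1/2 genotype coding for biallelic loci, and the fixed iteration count are assumptions made for illustration.

```python
from itertools import product


def compatible_pairs(genotype):
    """List ordered haplotype pairs consistent with an unphased genotype.

    A genotype is a tuple of allele counts (0, 1 or 2) per locus;
    a haplotype is a tuple of 0/1 alleles per locus.
    """
    pairs = []
    for h1 in product((0, 1), repeat=len(genotype)):
        # The second haplotype is forced once the first is chosen.
        h2 = tuple(g - a for g, a in zip(genotype, h1))
        if all(b in (0, 1) for b in h2):
            pairs.append((h1, h2))
    return pairs


def gene_counting(genotypes, iterations=100):
    """Estimate haplotype frequencies by naive gene counting (EM)."""
    haplotypes = list(product((0, 1), repeat=len(genotypes[0])))
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    for _ in range(iterations):
        counts = {h: 0.0 for h in haplotypes}
        for genotype in genotypes:
            pairs = compatible_pairs(genotype)
            # E-step: weight each phase resolution by its current probability.
            weights = [freq[a] * freq[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: each individual contributes two haplotypes.
        freq = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
    return freq
```

With L loci the sketch tracks 2^L haplotypes, so it is only practical for a handful of loci.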

7.
Chen HY  Xie H  Qian Y 《Biometrics》2011,67(3):799-809
Multiple imputation is a practically useful approach to handling incompletely observed data in statistical analysis. Parameter estimation and inference based on imputed full data have been made easy by Rubin's rule for result combination. However, creating proper imputations that accommodate flexible models for statistical analysis in practice can be very challenging. We propose an imputation framework that uses conditional semiparametric odds ratio models to impute the missing values. The proposed imputation framework is more flexible and robust than the imputation approach based on the normal model. In contrast to the approach based on fully conditionally specified models, it is a compatible framework. The proposed algorithms for multiple imputation through the Markov chain Monte Carlo sampling approach can be straightforwardly carried out. Simulation studies demonstrate that the proposed approach performs better than existing, commonly used imputation approaches. The proposed approach is applied to imputing missing values in bone fracture data.
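
Rubin's rule for result combination, referred to above, can be sketched in a few lines of Python. The sketch pools a scalar estimate across imputed data sets; the function name `rubin_combine` and the example numbers are assumptions for illustration and do not come from the paper.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Pool a scalar estimate over m >= 2 imputed data sets.

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical results from three imputed data sets.
pooled_estimate, total_variance = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the extra uncertainty due to the missing values into the final standard error.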

8.
Making sense of score statistics for sequence alignments
The search for similarity between two biological sequences lies at the core of many applications in bioinformatics. This paper aims to highlight a few of the principles that should be kept in mind when evaluating the statistical significance of alignments between sequences. The extreme value distribution is first introduced, which in most cases describes the distribution of alignment scores between a query and a database. The effects of the similarity matrix and gap penalty values on the score distribution are then examined, and it is shown that the alignment statistics can undergo an abrupt phase transition. A few types of random sequence databases used in the estimation of statistical significance are presented, and the statistics employed by the BLAST, FASTA and PRSS programs are compared. Finally, the different strategies used to assess the statistical significance of the matches produced by profiles and hidden Markov models are presented.
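
One widely used form of the extreme value statistics mentioned above is the Karlin-Altschul formula, E = K·m·n·exp(−λS), with P = 1 − exp(−E). The Python sketch below evaluates it under stated assumptions: the parameters `lam` and `k` depend on the scoring matrix and gap penalties and must be supplied by the caller, and the function names and example values are hypothetical rather than taken from the paper or from any particular program.

```python
import math


def alignment_evalue(score, query_len, db_len, lam, k):
    """Expected number of chance alignments scoring at least `score`."""
    return k * query_len * db_len * math.exp(-lam * score)


def alignment_pvalue(score, query_len, db_len, lam, k):
    """Probability of at least one chance alignment scoring >= `score`."""
    e = alignment_evalue(score, query_len, db_len, lam, k)
    # 1 - exp(-e), computed stably for very small e.
    return -math.expm1(-e)


# Hypothetical parameters chosen only to make the arithmetic easy to follow.
e_value = alignment_evalue(score=10, query_len=32, db_len=32, lam=math.log(2), k=1.0)
```

For small E the P-value and the E-value nearly coincide, which is why search programs often report only the E-value.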

9.
We propose a method for improving the quality of signal from DNA microarrays by using several scans at varying scanner sensitivities. A Bayesian latent intensity model is introduced for the analysis of such data. The method improves the accuracy at which expressions can be measured in all ranges and extends the dynamic range of measured gene expression at the high end. Our method is generic and can be applied to data from any organism, for imaging with any scanner that allows varying the laser power, and for extraction with any image analysis software. Results from a self-self hybridization data set illustrate an improved precision in the estimation of the expression of genes compared to what can be achieved by applying standard methods and using only a single scan.

10.
Dye-specific bias effects, commonly observed in the two-color microarray platform, are normally corrected using the dye swap design. This design, however, is relatively expensive and labor-intensive. We propose a self-self hybridization design as an alternative to the dye swap design. In this design, the treated and control samples are labeled with Cy5 and Cy3 (or Cy3 and Cy5), respectively, without dye swap, along with a set of self-self hybridizations on the control sample. We compare this design with the dye swap design through investigation of mouse primary hepatocytes treated with three peroxisome proliferator-activated receptor-alpha (PPARalpha) agonists at three dose levels. Using Agilent's Whole Mouse Genome microarray, differentially expressed genes (DEG) were determined for both the self-self hybridization and dye swap designs. The DEG concordance between the two designs was over 80% across each dose treatment and chemical. Furthermore, 90% of DEG-associated biological pathways were in common between the designs, indicating that biological interpretations would be consistent. The reduced labor and expense for the self-self hybridization design make it an efficient substitute for the dye swap design. For example, in larger toxicogenomic studies, only about half the chips are required for the self-self hybridization design compared to the number needed in the dye swap design.

11.
Phylogenies of organisms are essential to investigating a range of evolutionary questions of interest to researchers in the field of bioinformatics. Phylogenies not only help to define how to study many evolutionary questions, they must also be taken into account when conducting statistical analyses. Here it is shown how phylogenies can be used to investigate variability along the sites of a gene, reconstruct ancestral states of ancient genes and proteins, identify and characterise events of parallel and convergent evolution, find events of gene duplication, analyse predictions from molecular clocks, seek evidence for correlated changes among different parts of the same gene or genome, and test theories of molecular evolution. A table of statistical and phylogenetic methods is presented.

12.
The availability of hundreds of complete bacterial genomes has created new challenges and, simultaneously, opportunities for bioinformatics. In the area of statistical analysis of genomic sequences, the studies of nucleotide compositional bias and gene bias between strands and replichores paved the way for the development of tools for prediction of bacterial replication origins. Only a few (about 20) origin regions for eubacteria and archaea have been proven experimentally. One reason for that may be that this is now considered essentially a bioinformatics problem, where predictions are sufficiently reliable not to run labor-intensive experiments, unless specifically needed. Here we describe the main existing approaches to the identification of replication origin (oriC) and termination (terC) loci in prokaryotic chromosomes and characterize a number of computational tools based on various skew types and other types of evidence. We also classify the eubacterial and archaeal chromosomes by predictability of their replication origins using skew plots. Finally, we discuss possible combined approaches to the identification of the oriC sites that may be used to improve the prediction tools, in particular, the analysis of DnaA binding sites using the comparative genomic methods.
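
The skew-based approaches surveyed above can be illustrated with a minimal Python sketch of one common variant, cumulative GC skew, where the skew of a window is (G − C)/(G + C). The abstract does not specify any particular tool's algorithm, so the function names, the window size, and the convention that the minimum of the cumulative curve marks the likely origin are assumptions; that convention is a heuristic that fits many but not all genomes.

```python
def cumulative_gc_skew(sequence, window=1000):
    """Return the running sum of per-window GC skew along a sequence."""
    seq = sequence.upper()
    cumulative = []
    total = 0.0
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Windows without G or C contribute no skew.
        skew = (g - c) / (g + c) if g + c else 0.0
        total += skew
        cumulative.append(total)
    return cumulative


def predict_origin_window(sequence, window=1000):
    """Index of the window at which the cumulative skew is lowest."""
    cumulative = cumulative_gc_skew(sequence, window)
    return min(range(len(cumulative)), key=cumulative.__getitem__)


# Toy sequence: a C-rich stretch followed by a G-rich stretch.
toy = "C" * 20 + "G" * 20
origin_window = predict_origin_window(toy, window=10)
```

In the toy sequence the curve falls through the C-rich half and rises through the G-rich half, so the minimum sits at the boundary between them; the maximum of the same curve is the usual counterpart used to guess the terminus.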

13.
SpotWhatR is a user-friendly microarray data analysis tool that runs under the widely and freely available R statistical language (http://www.r-project.org) for Windows and Linux operating systems. The aim of SpotWhatR is to help the researcher to analyze microarray data by providing basic tools for data visualization, normalization, determination of differentially expressed genes, summarization by Gene Ontology terms, and clustering analysis. SpotWhatR allows researchers who are not familiar with computational programming to choose the most suitable analysis for their microarray dataset. Along with well-known procedures used in microarray data analysis, we have introduced a stand-alone implementation of the HTself method, especially designed to find differentially expressed genes in low-replication contexts. This approach is more compatible with our local reality than the usual statistical methods. We provide several examples derived from the Blastocladiella emersonii and Xylella fastidiosa Microarray Projects. SpotWhatR is freely available at http://blasto.iq.usp.br/~tkoide/SpotWhatR, in English and Portuguese versions. In addition, the user can choose between "single experiment" and "batch processing" versions.

14.
Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research—translating basic science results into new interventions—and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

What to Learn in This Chapter

Text mining is an established field, but its application to translational bioinformatics is quite new and it presents myriad research opportunities. It is made difficult by the fact that natural (human) language, unlike computer language, is characterized at all levels by rampant ambiguity and variability. Important sub-tasks include gene name recognition, or finding mentions of gene names in text; gene normalization, or mapping mentions of genes in text to standard database identifiers; phenotype recognition, or finding mentions of phenotypes in text; and phenotype normalization, or mapping mentions of phenotypes to concepts in ontologies. Text mining for translational bioinformatics can necessitate dealing with two widely varying genres of text—published journal articles, and prose fields in electronic medical records. Research into the latter has been impeded for years by lack of public availability of data sets, but this has very recently changed and the field is poised for rapid advances. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.
This article is part of the “Translational Bioinformatics” collection for PLOS Computational Biology.

15.
It has been argued that the missing heritability in common diseases may be in part due to rare variants and gene-gene effects. Haplotype analyses provide more power for rare variants and joint analyses across genes can address multi-gene effects. Currently, methods are lacking to perform joint multi-locus association analyses across more than one gene/region. Here, we present a haplotype-mining gene-gene analysis method, which considers multi-locus data for two genes/regions simultaneously. This approach extends our single region haplotype-mining algorithm, hapConstructor, to two genes/regions. It allows construction of multi-locus SNP sets at both genes and tests joint gene-gene effects and interactions between single variants or haplotype combinations. A Monte Carlo framework is used to provide statistical significance assessment of the joint and interaction statistics; thus, the method can also be used with related individuals. This tool provides a flexible data-mining approach to identifying gene-gene effects that is otherwise currently unavailable. AVAILABILITY: http://bioinformatics.med.utah.edu/Genie/hapConstructor.html.

16.
Simões I  Faro R  Bur D  Kay J  Faro C 《The FEBS journal》2011,278(17):3177-3186
The view has been widely held that pepsin-like aspartic proteinases are found only in eukaryotes, and not in bacteria. However, a recent bioinformatics search [Rawlings ND & Bateman A (2009) BMC Genomics 10, 437] revealed that, in seven of ~1000 completely sequenced bacterial genomes, genes were present encoding polypeptides that displayed the requisite hallmark sequence motifs of pepsin-like aspartic proteinases. The implications of this theoretical observation prompted us to generate biochemical data to validate this finding experimentally. The aspartic proteinase gene from one of the seven identified bacterial species, Shewanella amazonensis, was expressed in Escherichia coli. The recombinant protein, termed shewasin A, was produced in soluble form, purified to homogeneity, and shown to display properties remarkably similar to those of pepsin-like aspartic proteinases. Shewasin A was maximally active at acidic pH values, cleaving a substrate that has been widely used for assessment of the proteolytic activity of other aspartic proteinases, and displayed a clear preference for cleaving peptide bonds between hydrophobic residues in the P1*P1' positions of the substrate. It was completely inhibited by the general inhibitor of aspartic proteinases, pepstatin, and mutation of one of the catalytic Asp residues (in the Asp-Thr-Gly motif of the N-terminal domain) resulted in complete loss of enzymatic activity. It can thus be concluded unequivocally that this Shewanella gene encodes an active pepsin-like aspartic proteinase. It is now beyond doubt that pepsin-like aspartic proteinases are not confined to eukaryotes, but are encoded within some species of bacteria. The distinctions between the bacterial and eukaryotic polypeptides are discussed and their evolutionary relationships are outlined.

17.
Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implement different applications in bioinformatics analysis by combining eight kinds of models: (i) independent and identically distributed process; (ii) variable-length Markov chain; (iii) inhomogeneous Markov chain; (iv) hidden Markov model; (v) profile hidden Markov model; (vi) pair hidden Markov model; (vii) generalized hidden Markov model; and (viii) similarity based sequence weighting. The framework includes functionality for training, simulation and decoding of the models. Additionally, it provides two methods to help parameter setting: Akaike and Bayesian information criteria (AIC and BIC). The models can be used stand-alone, combined in Bayesian classifiers, or included in more complex, multi-model, probabilistic architectures using GHMMs. In particular the framework provides a novel, flexible, implementation of decoding in GHMMs that detects when the architecture can be traversed efficiently.
This is a PLOS Computational Biology Software Article.

18.
Pise is interface construction software for bioinformatics applications that run by command-line operations. It creates common, easy-to-use interfaces to these applications for the Web, or other uses. It is adaptable to new bioinformatics tools, and offers program chaining, Unix system batch and other controls, making it an attractive method for building and using your own bioinformatics web services.

19.
Bayesian inference on biopolymer models
