Similar literature
1.
As macromolecular crystal structures are determined and refined in an increasingly automated fashion, careful assessment of the reliability and quality of the resulting models becomes increasingly important. Here, we analyze various issues related to the reliability and quality of macromolecular crystal structures deposited between 1991 and 2000. We find that the average resolution at which these structures are determined is essentially constant. In line with this observation, the average quality as measured by Ramachandran analysis does not improve as a function of time. On the other hand, an observed decrease of the average discrepancy between free and conventional R values suggests that the fit of model and data is improving. Finally, we present a surprising correlation between the tendency of crystallographers to deposit their experimental data and the free R values of their models.

2.
3.
4.
Beckman KB, Lee KY, Golden T, Melov S. Mitochondrion, 2004, 4(5-6): 453-470
Mitochondrial diseases are a heterogeneous array of disorders with a complex etiology. Use of microarrays as a tool to investigate complex human disease is increasingly common; however, a principal drawback of microarrays is their limited dynamic range, due to the poor quantification of weak signals. Although it is generally understood that low-intensity microarray 'spots' may be unreliable, there exists little documentation of their accuracy. Quantitative PCR (Q-PCR) is frequently used to validate microarray data, yet few Q-PCR validation studies have focused on the accuracy of low-intensity microarray signals. Hence, we have used Q-PCR to systematically assess microarray accuracy as a function of signal strength in a mouse model of mitochondrial disease, the superoxide dismutase 2 (SOD2) nullizygous mouse. We have focused on a unique category of data--spots with only one weak signal in a two-dye comparative hybridization--and show that such 'high-low' signal intensities are common for differentially expressed genes. This category of differential expression may be more important in mitochondrial disease, in which there are often mosaic expression patterns due to the idiosyncratic distribution of mutant mtDNA in heteroplasmic individuals. Using RNA from the SOD2 mouse, we found that when spotted cDNA microarray data are filtered for quality (low variance between many technical replicates) and spot intensity (above a negative control threshold in both channels), there is an excellent quantitative concordance with Q-PCR (R² = 0.94). The accuracy of gene expression ratios from low-intensity spots (R² = 0.27) and 'high-low' spots (R² = 0.32) is considerably lower. Our results should serve as guidelines for microarray interpretation and the selection of genes for validation in mitochondrial disorders.

5.
6.
We study the effects of different normalization and pre-clustering techniques on clustering quality for a novel mixed-integer nonlinear optimization-based clustering algorithm, the Global Optimum Search with Enhanced Positioning (EP_GOS_Clust). These are important issues to be addressed. DNA microarray experiments are informative tools to elucidate gene regulatory networks. But in order for gene expression levels to be comparable across microarrays, normalization procedures have to be properly undertaken. The aim of pre-clustering is to use an adequate amount of discriminatory characteristics to form rough information profiles, so that data with similar features can be pre-grouped together and outliers deemed insignificant to the clustering process can be removed. Using experimental DNA microarray data from the yeast Saccharomyces cerevisiae, we study the merits of pre-clustering genes based on distance/correlation comparisons and symbolic representations such as {+, o, -}. As a performance metric, we look at the intra- and inter-cluster error sums, two generic but intuitive measures of clustering quality. We also use publicly available Gene Ontology resources to assess the clusters' level of biological coherence. Our analysis indicates a significant effect of normalization and pre-clustering methods on the clustering results. Hence, the outcome of this study has significance in fine-tuning the EP_GOS_Clust clustering approach.
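
The intra- and inter-cluster error sums mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: the intra-cluster sum is taken as the total squared Euclidean distance of each point to its cluster centroid, and the inter-cluster sum as the total squared distance between pairs of cluster centroids. All function names are hypothetical and are not part of EP_GOS_Clust.

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def intra_cluster_error(clusters):
    """Assumed definition: sum of squared distances of points to their own centroid."""
    return sum(sq_dist(p, centroid(c)) for c in clusters for p in c)


def inter_cluster_error(clusters):
    """Assumed definition: sum of squared distances between all pairs of centroids."""
    centers = [centroid(c) for c in clusters]
    return sum(sq_dist(a, b) for a, b in combinations(centers, 2))


# Two tight, well-separated toy clusters of expression profiles.
toy_clusters = [
    [(0.0, 0.0), (2.0, 0.0)],
    [(10.0, 0.0), (12.0, 0.0)],
]
intra = intra_cluster_error(toy_clusters)
inter = inter_cluster_error(toy_clusters)
```

Under these assumed definitions, a good clustering has a small intra-cluster sum (compact clusters) and a large inter-cluster sum (well-separated clusters).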

7.
We have developed theoretical models for analysis of X-ray diffuse scattering from protein crystals. A series of models are proposed to be used for experimental data with different degrees of precision. First, we propose the normal mode model, where conformational dynamics of a protein is assumed to occur mostly in a limited conformational subspace spanned by a small number of low-frequency normal modes in the protein. When high precision data are available, variances and covariances of the normal mode variables can be determined from experimental data using this model. For experimental data with lower degrees of precision, we introduce a series of simpler models. These models express the covariance matrix using relatively simple empirical correlation functions by assuming the correlation between a pair of atoms to be isotropic. As an application of these simpler models, we calculate diffuse-scattering patterns from a human lysozyme crystal to examine how each adjustable parameter in the models affects general features of the resulting patterns. The results of the calculation are summarized as follows. (1) The higher order scattering makes a significant contribution at high resolutions. (2) The resulting simulated patterns are sensitive to changes in correlation lengths of about 1 Å, as well as to changes of the functional form of the correlation function. (3) But only the “average” value of the intra- and intermolecular correlation lengths seems to determine the gross features of the pattern. (4) The effect of the atom-dependent amplitude of fluctuations is difficult to observe. © 1994 John Wiley & Sons, Inc.

8.
9.
10.
This research analyzes some aspects of the relationship between gene expression, gene function, and gene annotation. Many recent studies are implicitly based on the assumption that gene products that are biologically and functionally related would maintain this similarity both in their expression profiles and in their gene ontology (GO) annotation. We analyze how accurate this assumption proves to be using real publicly available data. We also aim to validate a measure of semantic similarity for GO annotation. We use the Pearson correlation coefficient and its absolute value as a measure of similarity between expression profiles of gene products. We explore a number of semantic similarity measures (Resnik, Jiang, and Lin) and compute the similarity between gene products annotated using the GO. Finally, we compute correlation coefficients to compare gene expression similarity against GO semantic similarity. Our results suggest that the Resnik similarity measure outperforms the others and seems better suited for use in gene ontology. We also deduce that there seems to be correlation between semantic similarity in the GO annotation and gene expression for the three GO ontologies. We show that this correlation is negligible up to a certain semantic similarity value; then, for higher similarity values, the relationship trend becomes almost linear. These results can be used to augment the knowledge provided by clustering algorithms and in the development of bioinformatic tools for finding and characterizing gene products.
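
As a rough illustration of the two kinds of similarity compared above, the sketch below computes a Pearson correlation between two expression profiles and a Resnik semantic similarity between two terms. Resnik similarity is the information content, -log p(t), of the most informative common ancestor of the two terms. The toy ontology, the term probabilities, and the function names are all hypothetical; this is not the authors' code or real GO data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical toy ontology (term -> parent terms), not real GO identifiers.
PARENTS = {"root": [], "A": ["root"], "B": ["root"], "A1": ["A"], "A2": ["A"]}
# Hypothetical probability that a gene product is annotated with each term
# or any of its descendants; the root therefore has probability 1.
PROB = {"root": 1.0, "A": 0.5, "B": 0.5, "A1": 0.25, "A2": 0.25}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def resnik(term1, term2):
    """Information content of the most informative common ancestor."""
    common = ancestors(term1) & ancestors(term2)
    return max(-math.log(PROB[t]) for t in common)


expression_similarity = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
semantic_similarity = resnik("A1", "A2")
```

Terms that share only the root score zero, because the root carries no information; sibling terms under a specific parent score higher the rarer that parent is.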

11.
12.
A scientific ontology is a formal representation of knowledge within a domain, typically including central concepts, their properties, and relations. With the rise of computers and high-throughput data collection, ontologies have become essential to data mining and sharing across communities in the biomedical sciences. Powerful approaches exist for testing the internal consistency of an ontology, but not for assessing the fidelity of its domain representation. We introduce a family of metrics that describe the breadth and depth with which an ontology represents its knowledge domain. We then test these metrics using (1) four of the most common medical ontologies with respect to a corpus of medical documents and (2) seven of the most popular English thesauri with respect to three corpora that sample language from medicine, news, and novels. Here we show that our approach captures the quality of ontological representation and guides efforts to narrow the breach between ontology and collective discourse within a domain. Our results also demonstrate key features of medical ontologies, English thesauri, and discourse from different domains. Medical ontologies have a small intersection, as do English thesauri. Moreover, dialects characteristic of distinct domains vary strikingly as many of the same words are used quite differently in medicine, news, and novels. As ontologies are intended to mirror the state of knowledge, our methods to tighten the fit between ontology and domain will increase their relevance for new areas of biomedical science and improve the accuracy and power of inferences computed across them.

13.
Single-cell RNA sequencing (scRNA-seq) technologies obtain gene expression at single-cell resolution and provide a tool for exploring cell heterogeneity and cell types. Owing to the low number of mRNA copies extracted per cell, scRNA-seq data exhibit a large number of dropouts, which hinders downstream analysis of the data. We propose a statistical method, SDImpute (Single-cell RNA-seq Dropout Imputation), to implement block imputation for dropout events in scRNA-seq data. SDImpute automatically identifies the dropout events based on the gene expression levels and the variations of gene expression across similar cells and similar genes, and it implements block imputation for dropouts by utilizing gene expression unaffected by dropouts from similar cells. In the experiments, the results on simulated and real datasets suggest that SDImpute is an effective tool to recover the data and preserve the heterogeneity of gene expression across cells. Compared with state-of-the-art imputation methods, SDImpute improves the accuracy of downstream analysis, including clustering, visualization, and differential expression analysis.

14.
15.
MOTIVATION: A major issue in computational biology is the reconstruction of pathways from several genomic datasets, such as expression data, protein interaction data and phylogenetic profiles. As a first step toward this goal, it is important to investigate the amount of correlation which exists between these data. RESULTS: These methods are successfully tested on their ability to recognize operons in the Escherichia coli genome, from the comparison of three datasets corresponding to functional relationships between genes in metabolic pathways, geometrical relationships along the chromosome, and co-expression relationships as observed by gene expression data.

16.
Quality control in molecular immunohistochemistry   Cited by: 2 (self-citations: 1; citations by others: 1)
Immunoperoxidase histochemistry is a widespread method of assessing expression of biomolecules in tissue samples. Accurate assessment of the expression levels of genes is critical for the management of disease, particularly as therapy targeted to specific molecules becomes more widespread. Determining the quality of preservation of macromolecules in tissue is important to avoid false negative and false positive results. In this review we discuss (1) issues of sensitivity (false negativity) and specificity (false positivity) of immunohistochemical stains, (2) approaches to better understanding differences in immunostains done by different laboratories (including the recently proposed MISFISHIE specification for tissue localization studies), and (3) approaches to assessing the quality of preservation of macromolecules in tissue, particularly in small biopsy samples.

17.
While it is universally recognised that environmental factors can cause phenotypic trait variation via phenotypic plasticity, the extent to which causal processes operate in the reverse direction has received less consideration. In fact individuals are often active agents in determining the environments, and hence the selective regimes, they experience. There are several important mechanisms by which this can occur, including habitat selection and niche construction, that are expected to result in phenotype–environment correlations (i.e. non-random assortment of phenotypes across heterogeneous environments). Here we highlight an additional mechanism – intraspecific competition for preferred environments – that may be widespread, and has implications for phenotypic evolution that are currently underappreciated. Under this mechanism, variation among individuals in traits determining their competitive ability leads to phenotype–environment correlation; more competitive phenotypes are able to acquire better patches. Based on a concise review of the empirical evidence we argue that competition-induced phenotype–environment correlations are likely to be common in natural populations before highlighting the major implications of this for studies of natural selection and microevolution. We focus particularly on two central issues. First, competition-induced phenotype–environment correlation leads to the expectation that positive feedback loops will amplify phenotypic and fitness variation among competing individuals. As a result of being able to acquire a better environment, winners gain more resources and even better phenotypes – at the expense of losers. The distinction between individual quality and environmental quality that is commonly made by researchers in evolutionary ecology thus becomes untenable. 
Second, if differences among individuals in competitive ability are underpinned by heritable traits, competition results in both genotype–environment correlations and an expectation of indirect genetic effects (IGEs) on resource-dependent life-history traits. Theory tells us that these IGEs will act as (partial) constraints, reducing the amount of genetic variance available to facilitate evolutionary adaptation. Failure to recognise this will lead to systematic overestimation of the adaptive potential of populations. To understand the importance of these issues for ecological and evolutionary processes in natural populations we therefore need to identify and quantify competition-induced phenotype–environment correlations in our study systems. We conclude that both fundamental and applied research will benefit from an improved understanding of when and how social competition causes non-random distribution of phenotypes, and genotypes, across heterogeneous environments.

18.
Habitat suitability models can be generated using methods requiring information on species presence or species presence and absence. Knowledge of the predictive performance of such methods becomes a critical issue to establish their optimal scope of application for mapping current species distributions under different constraints. Here, we use breeding bird atlas data in Catalonia as a working example and attempt to analyse the relative performance of two methods: the Ecological Niche factor Analysis (ENFA) using presence data only and Generalised Linear Models (GLM) using presence/absence data. Models were run on a set of forest species with similar habitat requirements, but with varying occurrence rates (prevalence) and niche positions (marginality). Our results support the idea that GLM predictions are more accurate than those obtained with ENFA. This was particularly true when species were using available habitats proportionally to their suitability, making absence data reliable and useful to enhance model calibration. Species marginality in niche space was also correlated to predictive accuracy, i.e. species with less restricted ecological requirements were modelled less accurately than species with more restricted requirements. This pattern was irrespective of the method employed. Models for wide‐ranging and tolerant species were more sensitive to absence data, suggesting that presence/absence methods may be particularly important for predicting distributions of this type of species. We conclude that modellers should consider that species ecological characteristics are critical in determining the accuracy of models and that it is difficult to predict generalist species distributions accurately and this is independent of the method used. 
Being based on distinct approaches regarding adjustment to data and data quality, habitat distribution modelling methods cover different application areas, making it difficult to identify one that should be universally applicable. Our results suggest, however, that if absence data are available, methods using this information should preferably be used in most situations.

19.
Reliability validation analysis of gene expression profiling microarrays   Cited by: 7 (self-citations: 0; citations by others: 7)
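The cDNA microarray is an emerging technology for assessing changes in mRNA expression levels on a genome-wide scale. The reproducibility of cDNA microarray experiments was tested through self-comparison experiments using RNA from the same tissue and differential-analysis experiments using RNA from different tissues. The reliability of the cDNA microarray data was analyzed using the correlation coefficient (R), the coefficient of variation (CV), and the false positive rate (FPR), giving an overall assessment of the experimental data. The results confirm that, for cDNA expression profile data obtained with this microarray system, the correlation coefficient is generally greater than 0.9, the mean coefficient of variation is about 15%, and the false positive rate is kept within 3%. The concept of a consistence rate (CR) is also proposed as a new parameter for measuring the reproducibility of a cDNA microarray system, and the features that make it preferable to the commonly used correlation coefficient and coefficient of variation are described. In addition, sources of systematic error in the microarray data were analyzed by comparing the effects of spotting concentration during array preparation, mRNA versus total RNA, different array batches, and different labeling procedures. It is further proposed that repeating the experiment twice can overcome the great majority of false positives introduced by the experimental system.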
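
Two of the reliability measures named in this abstract, the coefficient of variation (CV) and the false positive rate (FPR), can be sketched as follows. The abstract does not state how a false positive was called, so the two-fold ratio cutoff in a self-versus-self hybridization is an assumption, as are the function names and the toy numbers. The consistence rate (CR) is not defined in the abstract and is not attempted here.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of replicate signals."""
    return statistics.stdev(values) / statistics.mean(values)


def false_positive_rate(ratios, fold=2.0):
    """Fraction of self-vs-self expression ratios outside [1/fold, fold].

    In a self-comparison no gene is truly differentially expressed, so every
    ratio beyond the cutoff counts as a false positive. The default two-fold
    cutoff is an assumption, not a value taken from the abstract.
    """
    flagged = sum(1 for r in ratios if r >= fold or r <= 1.0 / fold)
    return flagged / len(ratios)


# Toy replicate signals for one gene, and toy self-vs-self ratios for four genes.
cv = coefficient_of_variation([8.0, 10.0, 12.0])
fpr = false_positive_rate([1.0, 0.9, 1.1, 2.5])
```

With real data, the CV would be computed per gene across replicate arrays and then averaged, and the FPR would be taken over all spots on the self-comparison array.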

20.
miRNAs regulate cell functions by inhibiting the expression of proteins. Research on miRNAs has usually focused on identifying targets by base pairing between miRNAs and their targets. Instead of identifying targets, this paper proposes an innovative approach, namely impact significance analysis, to study the correlation between the mature sequence of miRNAs, their expression across patient samples or time, and their global function in cell cycle signaling. With three distinct types of data (The Cancer Genome Atlas miRNA expression data for 354 human breast cancer specimens, a microarray of 266 miRNAs in mouse embryonic stem cells (ESCs), and Reverse Phase Protein Array (RPPA) data from the MDA-MB-231 cell line transfected with 776 miRNAs), we linked the expression and function of miRNAs by their mature sequence and discovered systematically that the similarity of miRNA expression enhances the similarity of miRNA function, which indicates that miRNA expression can be used as a supplementary factor to predict miRNA function. The results also show that both the seed region and the 3′ portion are associated with miRNA expression levels across human breast cancer specimens and in ESCs; miRNAs with similar seeds tend to have similar 3′ portions. We also argue that the impact of the 3′ portion, including nucleotides , is not significant for miRNA function. These results provide novel insights into the correlation between miRNA sequence, expression and function. They can be applied to improve prediction algorithms, and impact significance analysis can also be applied to similar analyses of other small RNAs such as siRNAs.
